Document Type

Journal Article

Publication Title

Sensors

Volume

20

Issue

23

PubMed ID

33291759

Publisher

MDPI

School

School of Science

RAS ID

32539

Funders

National Science Foundation of China; China Postdoctoral Science Foundation; China Scholarship Council; National Science Foundation of Shaan Xi Province

Comments

Yan, X., Gilani, S. Z., Feng, M., Zhang, L., Qin, H., & Mian, A. (2020). Self-supervised learning to detect key frames in videos. Sensors, 20(23), article 6941. https://doi.org/10.3390/s20236941

Abstract

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. Detecting key frames in videos is a common problem in many applications, such as video classification, action recognition and video summarization. These tasks can be performed more efficiently using only a handful of key frames rather than the full video. Existing key frame detection approaches are mostly designed for supervised learning and require manual labelling of key frames in a large corpus of training data. Labelling requires human annotators from different backgrounds to annotate key frames in videos, which is not only expensive and time-consuming but also prone to subjective errors and inconsistencies between labelers. To overcome these problems, we propose an automatic self-supervised method for detecting key frames in a video. Our method comprises a two-stream ConvNet and a novel automatic annotation architecture that reliably annotates key frames in a video for self-supervised learning of the ConvNet. The proposed ConvNet learns deep appearance and motion features to detect frames that are unique. The trained network is then able to detect key frames in test videos. Extensive experiments on the UCF101 human action dataset and the VSUMM video summarization dataset demonstrate the effectiveness of the proposed method.

DOI

10.3390/s20236941

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Included in

Broadcast and Video Studies Commons, Systems Architecture Commons

Research outputs 2014 to 2021

Self-supervised learning to detect key frames in videos
