MovieCuts: A New Dataset and Benchmark for Cut Type Recognition

Alejandro Pardo, Fabian Caba Heilbron, Juan León Alcázar, Ali Thabet, Bernard Ghanem

July 2022

Abstract

Understanding movies and their structural patterns is a crucial task to decode the craft of video editing. While previous works have developed tools for general analysis such as detecting characters or recognizing cinematography properties at the shot level, less effort has been devoted to understanding the most basic video edit, the Cut. This paper introduces the cut type recognition task, which requires modeling of multi-modal information. To ignite research in the new task, we construct a large-scale dataset called MovieCuts, which contains more than 170K video clips labeled among ten cut types. We benchmark a series of audio-visual approaches, including some that deal with the problem’s multi-modal and multi-label nature. Our best model achieves 45.7% mAP, which suggests that the task is challenging and that attaining highly accurate cut type recognition is an open research problem.

Type

Conference paper

Publication

Accepted at the European Conference on Computer Vision (ECCV) 2022

Alejandro Pardo

PhD Student in Computer Vision