In recent years, the task of automatically creating subtitles for audiovisual content in another language has gained a lot of attention, driven by a surge in the number of movies, series and user-generated videos that are streamed and distributed all over the world. The task of automatic subtitling is multi-faceted: starting from the speech, not only does the translation have to be generated, but it must also be segmented into subtitles compliant with constraints that ensure a high-quality user experience, such as a proper reading speed, synchrony with the voices, and a maximum number of subtitle lines and characters per line.
IWSLT 2024 proposes two tasks concerning subtitle processing:
- as in 2023, an automatic subtitling task, where participants are asked to generate, starting from English speech, subtitles in German and/or Spanish for different kinds of audiovisual documents featuring different levels of complexity
- in addition, a subtitle compression task, where participants are asked to rephrase subtitles that violate the reading speed constraint (at most 21 characters per second) so that they become compliant.
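As an illustration of the reading speed constraint, the following minimal sketch checks whether a single subtitle stays within 21 characters per second. The function names, the use of seconds for timestamps, and the choice to count every character except line breaks are assumptions made for this example; they are not the official evaluation procedure, which may count characters differently.

```python
MAX_CPS = 21.0  # reading speed limit stated in the task description


def reading_speed(text: str, start: float, end: float) -> float:
    """Return characters per second for one subtitle.

    Assumption: all characters except line breaks are counted,
    and start/end are timestamps in seconds.
    """
    duration = end - start
    if duration <= 0:
        raise ValueError("subtitle must have a positive duration")
    n_chars = len(text.replace("\n", ""))
    return n_chars / duration


def is_compliant(text: str, start: float, end: float,
                 max_cps: float = MAX_CPS) -> bool:
    """True if the subtitle does not exceed the reading speed limit."""
    return reading_speed(text, start, end) <= max_cps
```

Under these assumptions, a subtitle flagged as non-compliant by `is_compliant` would be a candidate for rephrasing in the compression task.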
More details on both tasks will be provided soon.
Data will be released in January.
- Mauro Cettolo, FBK
- Mattia Di Gangi, AppTek
- Evgeny Matusov, AppTek
- Matteo Negri, FBK
- Sara Papi, FBK, University of Trento
- Marco Turchi, Zoom Video Communications
- Patrick Wilken, AppTek
- Mauro Cettolo, FBK, Italy
- Evgeny Matusov, AppTek, Germany