Anna Matamala, who holds a BA in Translation (UAB) and a PhD in Applied Linguistics (UPF, Barcelona), has been a senior tenured lecturer at the Universitat Autònoma de Barcelona since 2009. She has worked as a certified audiovisual translator for the Catalan Television (1996-present). She is a former coordinator of the PhD in Translation and Intercultural Studies at UAB (2010-2014), where she led an active internationalization policy, and of the MA in Audiovisual Translation at UAB (2005-2012).

A member of the international research group TransMedia and of its local branch Transmedia Catalonia, Anna Matamala has participated in and led many funded projects on audiovisual translation and media accessibility.

She has taken an active role in the organisation of scientific events such as the Media for All conference and the Advanced Research Audio Description Seminar (ARSAD), and has published extensively in international refereed journals, including Meta, The Translator, Perspectives, Babel and Translation Studies. She is the author of a book on interjections and lexicography (IEC, 2005), co-author (with Eliana Franco and Pilar Orero) of a book on voice-over (Peter Lang, 2010), and co-editor of four volumes on audiovisual translation and media accessibility. She received the Joan Coromines Prize in 2005 and the APOSTA Award to Young Researchers in 2011.

Her research interests are audiovisual translation, media accessibility and applied linguistics. She is currently involved in standardisation work at ISO.

Anna Matamala

will present …

The ALST project: technologies for audiovisual translation



The ALST project (Language and sensorial accessibility: technologies for voice-over and audio description), funded by the Spanish Ministry of Economy (FFI2012-31024) for a three-year period (2013-2015), has researched the implementation of three technologies (speech recognition, machine translation and speech synthesis) in voice-over and audio description, two audiovisual transfer modes that are delivered orally. The ultimate aim of the project is to semi-automate the creation of audio descriptions and the translation of voice-over in order to increase accessibility.

Voice-over is generally used for the translation of non-fictional audiovisual genres in Western Europe and is addressed to those who do not understand the original language: a voice in the target language is heard on top of the voice in the original language. Audio description, on the other hand, is an access service addressed primarily to blind and visually impaired audiences, in which a voice narrates the visual elements on screen during the silent gaps in an audiovisual production. In a way, both modalities give access to audiovisual content that would otherwise be inaccessible to a significant portion of the population.

The analysis has been carried out at three levels.

First of all, ALST has investigated whether speech recognition, either automatic or via respeaking, could be used to generate transcripts in a faster and more efficient way. Two experiments have been carried out. In audio description, a speaker diarization and automatic speech recognition process has been implemented, and objective quality measures have been obtained. In voice-over, an experiment with professional transcribers has compared the time involved and the perceived effort in three scenarios: creating a transcript manually, generating a transcript via respeaking, and correcting an automatically generated one.

Secondly, ALST has researched whether machine translation could be efficiently applied to produce high-quality audiovisual translations. In voice-over, the tests have analysed the technical, temporal and cognitive effort involved in translating a documentary excerpt as compared to post-editing it. This analysis has been complemented with two additional tests: a blind subjective analysis by professional lecturers of the quality of the written output, and a blind end-user evaluation of recorded translated versus post-edited versions. As for audio description, the experiment has compared the temporal, technical and cognitive effort involved in three situations: creation of an AD, human translation of an AD, and post-editing of an AD.

Thirdly, ALST has investigated the use of text-to-speech technologies instead of human voices in both modalities, and has tested end users' reception, taking into account variables such as overall impression, listening effort, acceptance, punctuation, pronunciation, speech pauses, intonation, naturalness, and voice pleasantness. Results have been very promising in the field of audio description and are still under analysis in the field of voice-over.

Results of all tests will be summarized in the presentation, focusing specifically on machine translation and text-to-speech.

The presentation aims to give an overview of the whole project, which will finish in December 2015. The project is innovative within audiovisual translation because, until now, research on technologies in this field has been almost entirely limited to subtitling. ALST will hopefully be the first step in the application of other technologies to both voice-over and audio description, and will open new research horizons at the international level.