Presentations, their Presenters and Co-Authors
(in alphabetical order by title of presentation)
This page provides abstracts of presentations provided by authors. Texts are as at approx 2 weeks prior to the conference. For final titles (and slides) of all presentations, use the link above.
150 Million Words a Year and Counting – How the PCT is Using Technology to Handle a 62% Increase in Workload without Exponentially Increasing the Number of Translators or Budget Allocation
In this paper we shall present the PCT Translation Division of the World Intellectual Property Organization (WIPO), a specialized agency of the United Nations.
Responsible for the translation of documents related to the international patent filing process, we have witnessed significant growth, with the number of words requiring translation per year increasing from approximately 57 million in 2007 to 150 million in 2017, and there are no signs that this is going to slow down any time soon. We will look specifically at the impact of this year-on-year expansion and at how it has motivated and driven forward technology adoption within the PCT Translation Division, analysing how we have managed to get more work done without exponentially increasing the number of translators and the budget allocation.
We will share the successes and the failures and will conclude by discussing the future actions to be taken in order to keep abreast of the continuously changing technology and to ensure that we use it in the best way to achieve maximum efficiency.
Tracey Hay (WIPO)
Tracey Hay is Head of the English Translation Section, PCT Translation Division, at WIPO in Geneva, Switzerland.
After graduating in 1997 from the Translation and Interpreting Masters programme at the University of Bath, UK, she worked initially as a translator and then a reviser in WIPO’s patent translation department.
Since 2013 she has headed the English Translation Section, with responsibility for overseeing the translation of patent abstracts and patentability reports from Arabic, French, German, Portuguese, Russian and Spanish into English, amounting to 23 million words of translation in 2017.
About TeMpTations & Masks – Information Security and Privacy Aspects of Using Online Machine Translation in CAT Tools
Almost all translation memory (TM) tools nowadays offer integrations with online machine translation (MT) solutions. Better MT quality and self-learning capabilities thanks to neural and adaptive MT technologies as well as the availability of a large number of MT plugins for TM-based tools make the classic TM and ‘innovative’ MT combination more attractive for translators and LSPs. In my talk, I will not cover the typical aspects related to machine translation in the professional translation workflow (post-editing, quality, pricing, process impact, etc.), but rather focus on information security and data protection aspects. In addition to highlighting some of the critical sections of the terms of service of popular online MT offerings, I will take a closer look at the needs, technical and organisational options and implications for protecting personal data when using online MT solutions (GDPR compliance). The conclusions and discussion will focus on whether, when and how we can securely integrate and use online MT in the TM-tool-based translation process.
Christine Bruckner (Freelance consultant and trainer)
Christine Bruckner holds university degrees in translation and in computational linguistics; she was one of the early adopters of translation memory (TM) technology in her freelance translator’s life in the 1990s.
Since 2001, she has worked for several German corporate and government language services where she contributed to and led the introduction and continuous improvement of computer-aided translation (CAT), machine translation (MT), as well as translation management & terminology systems and processes.
As a freelance consultant and trainer, Christine is now sharing her experience in translation technology – and especially her passion for the TM+MT combination – with corporate customers, solution providers, LSPs and translators.
Approaches to reducing OOV's and specialising baseline engines in Neural Machine Translation
Two of the main issues that hamper the implementation of NMT solutions in production settings are the apparent inability to deal with tokens not contained in the model’s vocabulary (out-of-vocabulary tokens, or OOVs) and the problematic translations generated when the model is applied to translate “out of domain” texts, i.e. texts dealing with a specialised field on which the model has not been trained. Failure to resolve these issues makes NMT output unpalatable to professional translators.
We apply two strategies to deal with these issues. The first involves the implementation of an intermediate server between the client application and the REST server (Xavante) that delivers the predictions proposed by the NMT model. This intermediate server provides both pre-processing and post-processing modules. Both these modules allow any number of routines to be applied before and after inference (translation) by the NMT engine. The talk will present practical examples of what happens in these routines. We also explain how we customize a baseline NMT engine so that it can correctly translate specialist texts without going through the lengthy procedure of training the baseline engine from scratch.
Both these strategies make the output of NMT engines more useful in production settings.
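As a rough illustration of the first strategy, the sketch below chains pre-processing and post-processing routines around a translation call. It is not the presenter's implementation: the function names, the placeholder format, the assumed shape of an OOV-prone token (a part number such as AB-1234) and the toy word-lookup engine standing in for the NMT server are all hypothetical.

```python
import re
from typing import Callable, Dict, List

# A routine receives the text plus a shared state dict and returns new text.
Routine = Callable[[str, Dict[str, str]], str]

# Assumption: part numbers such as "AB-1234" are the tokens at risk of being OOV.
CODE_PATTERN = re.compile(r"\b[A-Z]{2}-\d+\b")


def mask_codes(text: str, state: Dict[str, str]) -> str:
    """Pre-processing: swap OOV-prone tokens for placeholders and remember them."""
    def repl(match: "re.Match[str]") -> str:
        key = f"__PH{len(state)}__"
        state[key] = match.group(0)
        return key
    return CODE_PATTERN.sub(repl, text)


def unmask_codes(text: str, state: Dict[str, str]) -> str:
    """Post-processing: put the original tokens back in place of the placeholders."""
    for key, original in state.items():
        text = text.replace(key, original)
    return text


# Hypothetical stand-in for the NMT engine: a word lookup that passes
# unknown tokens through unchanged.
TOY_LEXICON = {"vervang": "replace", "onderdeel": "part"}


def toy_engine(text: str) -> str:
    return " ".join(TOY_LEXICON.get(tok.lower(), tok) for tok in text.split())


def translate_with_hooks(
    text: str,
    engine: Callable[[str], str],
    pre: List[Routine],
    post: List[Routine],
) -> str:
    """Run every pre-routine, call the engine, then run every post-routine."""
    state: Dict[str, str] = {}
    for routine in pre:
        text = routine(text, state)
    text = engine(text)
    for routine in post:
        text = routine(text, state)
    return text


if __name__ == "__main__":
    print(translate_with_hooks(
        "Vervang onderdeel AB-1234", toy_engine, [mask_codes], [unmask_codes]
    ))  # prints "replace part AB-1234"
```

Because the routines are plain functions held in lists, any number of them can be added or reordered without touching the engine call, which mirrors the "any number of routines" idea in the abstract.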
Terence Lewis, MITI, entered the world of translation as a young brother in an Italian religious order, when he was entrusted with the task of translating some of the founder’s speeches into English. His religious studies also called for a knowledge of Latin, Greek and Hebrew. After some years in South Africa and Brazil, he severed his ties with the Catholic Church and returned to the UK where he worked as a translator, lexicographer (Harrap’s English-Brazilian Portuguese Business Dictionary) and playwright.
As an external translator for Unesco he translated texts ranging from Mongolian cultural legislation to a book by a minor French existentialist. At the age of 50 he taught himself to program and wrote a rule-based Dutch-English machine translation application which has been used to translate documentation for some of the largest engineering projects in Dutch history.
For the past 15 years he has devoted himself to the study and development of translation technology. Over the last 18 months he has familiarised himself with the OpenNMT toolkit and now uses Neural Machine Translation as the basis for the supply of translations to his clients through his company MyDutchPal Ltd.
Automating Terminology Management. Discussion of IATE and suggestions for enhancing its features
Terminology management is subject to automation in the framework of automating translation processes. This paper analyses selected aspects of this automation, taking the IATE terminology database as an example.
First, the author discusses the concept of managing terminology and identifies its automation potential, by taking into account such aspects as digitising and visualising knowledge or terminology extraction. AI-driven methods are also looked at in this context. Secondly, the paper presents a case study concerning IATE, the joint terminology database and a terminological source of reference for all European institutions.
Apart from general introductory remarks on IATE and on “automated terminology management”, the paper discusses terminology consolidation projects carried out at the Council of the European Union and provides recommendations for the current version of IATE by outlining difficulties related to performing a terminological search with the use of this database. The aim of the analysis is to give an overview of IATE’s features and to identify aspects of the database which could benefit from implementing efficient “intelligent technologies”.
Finally, this paper provides meaningful insights into the importance of terminology automation in national institutions or larger companies.
Anna Maria Władyka-Leittretter
Born in Warsaw (Poland), she has been a freelance translator (since 2013) and conference interpreter (since 2014) for Polish, English and German, and is currently based in Leipzig (Germany). She is a publicly appointed and certified translator and interpreter (certification from the Higher Regional Court of Dresden and the District Court of Berlin), a member of BDÜ (German Association of Translators and Interpreters) and a member of VKD (German Association of Conference Interpreters).
She graduated in Applied Linguistics (B.A. Translation and Language Teaching, M.A. Conference Interpreting) from the University of Warsaw.
Her areas of expertise and interest include but are not limited to: machine translation, automation of translation processes, simultaneous interpreting, European Union, journalism, e-commerce and art. As a student and a recent graduate, she completed traineeships in the European institutions (the European Commission and the Council of the European Union).
She is currently finalising her doctoral thesis on the automation of translation processes, which is supervised by Prof. Dr. Peter A. Schmitt and will be submitted at the University of Leipzig.
One of her great passions is literary translation.
Can interpreters’ booth notes tell us what really matters in terms of information and terminology management?
In the last two decades, several Computer Aided Interpreting (CAI) tools have been developed to satisfy the needs of simultaneous interpreters, and interpreters’ knowledge-, information- and terminology-related workflow has been studied. A case study (as opposed to personal experience, theoretical considerations or replies to questionnaires) looking at the booth notes of interpreters might shed some light on their actual information management behaviour and help to verify or improve existing theoretical models and software architectures. In this study the booth notes, i.e. hand-written sheets of paper, produced by conference interpreters of both the private market and the EU institutions will be collected and analysed. The following questions will be studied: What kind of information do interpreters consider crucial, i.e. write down to be used in the booth? How do the case study findings fit with theoretical considerations about terminology, information and knowledge management of interpreters? Can this information be modelled in conventional terminology management/CAI/generic software solutions like spreadsheets or databases? Might the notes of one interpreter be useful to the interpreters working at the next meeting of the same kind?
Anja Rütten (Freelance conference interpreter)
Dr Anja Rütten (Sprachmanagement.net) is a freelance conference interpreter for German A, Spanish B, English and French C based in Düsseldorf, Germany since 2001.
Apart from the private market, she works for the EU institutions and is a lecturer at the TH Cologne. She holds a degree in conference interpreting as well as a PhD from the University of Saarbrücken (doctoral thesis on Information and Knowledge Management in Conference Interpreting, 2006).
As a member of AIIC, the international conference interpreters’ association, she is actively involved in the German region’s working group on profitability.
She has specialised in knowledge management since the mid-1990s and shares her insights in her blog on www.dolmetscher-wissen-alles.de.
A Collaborative Approach of Computer-Assisted Communities of Practice. Official Release of the “Practice Mode” by Interpreters’ Help
This will be a short introduction to the author’s workshop, Getting Started with Interpreters’ Help.
Few computer programs specialized in the collaborative practice of conference interpreting are currently available, and they are either too limited or not developed to meet the requirements of our professional practice. To fulfil their need to practise, interpreters have often used software that is not tailored to their requirements. Although CAIT tools have helped interpreters in their need to practise, existing tools are limited and difficult to access. We’ll disclose the latest feature of Interpreters’ Help: the Practice Mode. It is aimed at facilitating the practice of conference interpreting. The Practice Mode is set in an ergonomic environment that allows for comfortable assessment. In the platform, interpreters will be able to access videos carefully selected by experts. They will be able to practise by setting goals and achieving high-quality results. The breakthrough consists of harmonizing several elements: a template based on academic assessment criteria, a system that allows practising interpreters to hear the original speech from one earpiece and the interpretation from the other for a synchronised assessment, the possibility of creating private speech pools and accessing a public speech repository, the opportunity to give and receive online feedback regarding interpreting performance, and a place to store interpreting recordings.
For more on this, please attend the author’s workshop.
Lourdes de la Torre Salceda (Interpreters' Help)
Lourdes de la Torre Salceda, born in Santander (Spain), is a professional conference interpreter and localizer. She joined Interpreters’ Help after creating Cleopatra, the first smartphone app developed for consecutive interpreting professionals.
In 2017, she graduated from the Pontifical University Comillas, Madrid, earning an Official Master’s Degree in Conference Interpreting. This MA is awarded by the EMCI (European Master in Conference Interpreting) Consortium in collaboration with the DG-SCIC of the European Commission and the DG-INTE of the European Parliament. Her second MA thesis, supervised by Lola Rodríguez Melchor at Pontifical University Comillas, focused on the academic framework of Cleopatra: the app for interpreters to automate their own symbols for consecutive interpreting note-taking.
Concurrent Translation - Reality or Hype?
Recent years have seen a growth in cloud-based technology platforms enabling multiple translators, editors and experts to work on the same translation concurrently. In such environments, the text production process is centralised – the PM is overseeing the project and the multiple agents (translators, editors, experts etc.) produce the target text collaboratively and concurrently, drawing on a central TM and TB updated in real time. The text can either be split amongst individual translators, or translators can take the next available segment in the text. This workflow potentially impacts the translation process as we know it – i.e. one that resembles writing. According to Carl et al. (2011), the translation process includes orientation, drafting and revision phases, and translators can display different styles of working in each phase. The concurrent scenario necessitates a more uniform text production style, mostly with regard to the self-revision phase. This paper will report on an explorative study examining translators working with SmartCAT in concurrent mode, to study the effects of the flexible workflow on the translation process and examine the level of adaptation required to comply with the new workflow. It will also report on the adoption of this technology in the translation industry.
Joanna Gough and Katerina Perdikaki (University of Surrey)
Joanna Gough is a lecturer in Translation Studies at the University of Surrey. Joanna’s research interests encompass a variety of language and technology related subjects, such as tools and resources for translators, process oriented translation research, the evolution of the Web and its impact on translation and many more.
Joanna is also interested in the business and industry aspects of the translation profession and is a keen advocate of cooperation between academia and industry. She is involved in the ITI Professional Development Committee.
Dr Katerina Perdikaki
I am an Associate Teaching Fellow at the University of Surrey, where I teach audiovisual translation and specialised translation from English into Greek. I have also taught modules on interpreting and translation theory and advertising.
My research interests lie in the area of audiovisual translation and film semiotics, as used in the analysis of multimodal and multimedial texts, and in intersemiotic acts of communication, such as that involved in the film adaptation process. Currently, I am also looking into the emotional impact that subtitlers experience when subtitling material of sensitive subject matter.
I am involved in the membership committee of the International Association for Translation and Intercultural Studies (IATIS) and I also work as a freelance subtitler.
Creating an Online Translation Platform to Build Target Language Resources for a Medical Phraselator
In emergency and immigrant health service departments, medical professionals frequently have no language in common with a patient. When no interpreter is available, especially in emergency situations, doctors need another means of collecting patient anamneses. Currently available solutions include machine translation or medical phraselators. The Geneva University Hospitals (HUG) have developed BabelDr, a speech-enabled phraselator, for the languages critical at HUG. BabelDr uses speech recognition to process doctors’ utterances and applies linguistic rules to map the recognition result to a canonical representation which, after approval by the doctor, is translated into the patient’s language. The system currently has a coverage of tens of millions of utterances, mapped to around 5,000 canonicals, translated into five languages. In this paper, we focus on the development of the target language resources for this system.
Due to the repetitive nature of the content, the source language grammar uses variables to make resources more compact, i.e. sentences can contain one or more variables which are replaced by different values at compile time. Target language resources should follow the same compositional scheme, with sentences and variables translated separately. The platform provides a functionality to preview sentences with variables replaced by values, exactly as they will be presented to patients. In cases where a sentence cannot be treated compositionally in the target language, due for example to word agreement issues or lexical gaps, the platform lets translators add specific non-compositional translations. Further features of the translation view include a translation memory and an annotation functionality, allowing translators to share their insights by appending comments to canonical sentences. For the second step, revision, we have chosen to present the reviser with expanded sentences, i.e. complete sentences with variables replaced, to make the task simpler and ensure that the final expanded content is correct. Revisers cannot edit the translations directly, but can add comments to individual sentences. Since this expanded format does not match the “real” compositional target language resource format, a third correction step is necessary, where the translator can make changes in a view where the commented expanded sentences are linked back to the editable compositional sentences and variables they were constructed from.
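The compositional scheme can be pictured with a short sketch. It assumes a simple data layout that the abstract does not describe: a target-language template with named slots, a table of translated variable values keyed by hypothetical value identifiers, and an optional table of non-compositional overrides. The Spanish sentences are illustrative and are not taken from the BabelDr resources.

```python
from itertools import product
from typing import Dict, Optional, Tuple

# Hypothetical layout: variable name -> {value id -> target-language text}.
Variables = Dict[str, Dict[str, str]]
# Non-compositional translations, keyed by the tuple of value ids they replace.
Overrides = Dict[Tuple[str, ...], str]


def expand_sentence(
    template: str,
    variables: Variables,
    overrides: Optional[Overrides] = None,
) -> Dict[Tuple[str, ...], str]:
    """Expand one translated template into every complete sentence.

    A combination listed in `overrides` uses the translator's full sentence;
    every other combination is built by filling the template slots.
    """
    overrides = overrides or {}
    names = list(variables)
    expanded: Dict[Tuple[str, ...], str] = {}
    for combo in product(*(variables[name].items() for name in names)):
        ids = tuple(value_id for value_id, _ in combo)
        if ids in overrides:
            expanded[ids] = overrides[ids]
        else:
            slots = {name: text for name, (_, text) in zip(names, combo)}
            expanded[ids] = template.format(**slots)
    return expanded


if __name__ == "__main__":
    sentences = expand_sentence(
        "¿Le duele {body_part}?",
        {"body_part": {"head": "la cabeza",
                       "stomach": "el estómago",
                       "teeth": "los dientes"}},
        # The plural subject needs a plural verb, so this one is not compositional.
        overrides={("teeth",): "¿Le duelen los dientes?"},
    )
    for ids, sentence in sentences.items():
        print(ids, sentence)
```

Keeping the value identifiers alongside each expanded sentence is one way a reviser's comment on an expanded sentence could be traced back to the template and variables it was built from.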
A first version of the platform is currently in use by multiple translators, completing translations from French into Albanian, Arabic, Farsi, Spanish and Tigrinya. An evaluation of the platform by these translators and revisers is ongoing, the results of which will be included in our final paper.
Johanna Gerlach, Hervé Spechbach, Pierrette Bouillon (University of Geneva)
Johanna Gerlach is a Research and Teaching Fellow (maître-assistante) at the Department of Translation Technology (referred to by its French acronym TIM). She currently works on the BabelDr project, a spoken language translation system for the medical domain.
Before that, she contributed to the MedSLT and CALL-SLT projects, developing linguistic resources for German. Johanna also holds a Master’s degree in translation from the Faculty of Translation and Interpreting of the University of Geneva.
Hervé Spechbach is Médecin adjoint responsable de l’unité d’urgences ambulatoires at Geneva University Hospitals (HUG).
Pierrette Bouillon has been Professor at the FTI, University of Geneva since 2007. She is currently Director of the Department of Translation Technology (referred to by its French acronym TIM) and Vice-Dean of the FTI.
She has numerous publications in computational linguistics and natural language processing, particularly within lexical semantics (Generative Lexicon theory), speech-to-speech machine translation for limited domains and, more recently, pre-editing/post-editing.
In the past, she participated in different EU projects (EAGLES/ISLE, MULTEXT, etc.) and led three Swiss projects in speech translation: MEDSLT 1 and 2 (offering a system for spoken language translation of dialogues in the medical domain) and REGULUS (a platform for controlled spoken dialog application), as well as two projects in computer assisted language learning: CALL-SLT 1 (a generic platform for CALL based on speech translation) and CALL-SLT 2 (designing and evaluating spoken dialogue based CALL systems).
Between 2012 and 2015, she coordinated the European ACCEPT project (Automated Community Content Editing PorTal). At present, she co-coordinates the new Swiss Research Center Barrier-free Communication with the Zurich University of Applied Sciences and the project BabelDr with the HUG (Geneva University Hospitals). She also takes part in the new COST network EnetCollect: European Network for Combining Language Learning with Crowdsourcing Techniques.
Foregrounding Accessibility Features in a Multimodal Translation Tool
This poster reports on the results of an accessibility study of a browser-based translation (and post-editing) tool that accepts multiple input modes. The tool was conceived to be used by professional translators and includes all the features that are deemed necessary for integrating translation memories (TM) and machine translation (MT). Its distinctive features include the possibility of using touch commands and voice input, in addition to the typical keyboard and mouse commands.
In order to assess whether the accessibility features included in the tool translate into improved performance and user experience, we carried out an experiment with three blind or partially-sighted translators, who translated using assistive technology. We analysed translation productivity using translation process research methods, and used qualitative methods to assess user satisfaction.
Results were very positive, with participants effusive about the potential of the tool and eager to participate in further development. By detailing the development cycle and results, we hope to encourage a focus on accessibility for translation tool developers.
Joss Moorkens, Carlos S.C. Teixeira, Daniel Turner, Joris Vreeke and Andy Way (Trinity College Dublin)
Joss Moorkens is an Assistant Professor of Translation Studies at the School of Applied Language and Intercultural Studies (SALIS) at Dublin City University, and a researcher in the ADAPT Centre and Centre for Translation and Textual Studies (CTTS). Within ADAPT, he has contributed to the development of translation tools for both desktop and mobile and supervised projects with prominent industry partners. He is co-editor of the book Translation Quality Assessment: From Principles to Practice (Springer), and has authored journal articles and book chapters on topics such as translation technology, post-editing of machine translation, human and automatic translation quality evaluation, and ethical issues in translation technology in relation to both machine learning and professional practice.
Carlos Teixeira is a post-doctoral researcher in the ADAPT Centre for Intelligent Digital Content Technology and a member of the Centre for Translation and Textual Studies (CTTS) at Dublin City University (DCU). He holds a PhD in Translation and Intercultural Studies and Bachelor’s degrees in Electrical Engineering and Linguistics. His research interests include Translation Technology, Translation Process Research, Translator-Computer Interaction, Localisation and Specialised Translation. He has vast experience in the use of eye tracking for assessing the usability of translation tools. His industry experience includes over 15 years working as a translator, localiser and language consultant.
Daniel Turner is a research engineer in the ADAPT Centre’s Design & Innovation Lab (dLab). Within ADAPT, he has contributed to projects with a strong focus on rapid prototyping of user interfaces. He is proficient in full stack development with experience using a variety of languages and tools.
Joris Vreeke is Scrum Master and Senior Software Engineer in the ADAPT Centre’s dLab. He has a background in software development and design with a preference for graphics, UI/UX and web application development.
Andy Way is a professor in the School of Computing at DCU and deputy director of the ADAPT Centre, supervising projects with prominent industry partners. He has published over 350 peer-reviewed papers and successfully graduated numerous PhD and MSc students. His research interests include all areas of machine translation such as statistical MT, example-based MT, neural MT, rule-based MT, hybrid models of MT, MT evaluation and MT teaching.
From a Discreet Role to a Co-Star: the Post-Editor Profile Becomes Key in the PEMT Workflow for an Optimal Outcome
Traditional machine translation workflows usually involve the post-editor only at the end. A lot of the time, post-editors are forced to rework unacceptable machine translation output. Even when the output is acceptable, they lack information about how the MT system works and what they can do to improve it, which causes frustration and unwillingness to post-edit again. Low-quality raw MT eventually leads to post-editors developing a negative attitude towards the MT system, which can result in poor quality post-edited texts which, in turn, contribute very little to the system training cycle, thus resulting in a static, never-improving process, and a tedious task for post-editors. This presentation aims both to encourage post-editors to embrace the often controversial MT processes, and to encourage companies to integrate post-editors into their MT workflows from the outset, even reshaping such processes to become linguist-focused rather than machine-centred, with the post-editor playing a central role. This change will enable higher quality and productivity and, therefore, a successful post-editing experience for everyone.
Lucía Guerrero (CPSL)
Lucía Guerrero is a senior Translation and Localization Project Manager at CPSL, specialising in machine translation and post-editing, and also part of the collaborative teaching staff at the Universitat Oberta de Catalunya.
Having worked in the translation industry since 1998, she has also managed localization projects for Apple Computer and translated children’s and art books.
Human-Computer Interaction in Translation: literary translators on technology and their roles
In the digital age, where everything is multiplied and transformed continuously, translatability is not relegated solely to the realm of language anymore. Instead, it becomes an inherent quality of culture and society, to the point that the present age could indeed be labelled ‘the translation age’ (Cronin, 2013: 3). The bombardment of information, the instantaneity of communication and knowledge, the ever-increasing automation of the profession and digitalisation of materials, the introduction of new translation tools, have all contributed to (1) the configuration of translation as a form of Human-Computer Interaction (HCI) (O’Brien, 2012) and (2) the emergence of the need for Translation Studies to focus on human issues arising from the problematic relationship between translators and digital technologies (Kenny, 2017). My research project takes this framework as the springboard to explore the dynamic, mutual and social construction of human-computer interaction in literary translation, defined by Toral and Way as ‘the last bastion of human translation’ (2014: 174). In particular, it aims at gaining a richer understanding of the human and technological factors at play in the field of literary translation by asking literary translators to share their perceptions of their role in an increasingly technology-dependent globalised society and their attitudes towards technology as part of their profession. The study adopts a socio-technological theoretical framework inspired by the Social Construction of Technology (SCOT) model (Pinch and Bijker, 1984; Olohan, 2014; Braun, Davitti and Dicerto, 2018). This poster presentation outlines the theoretical and methodological frameworks adopted, and accounts for the study’s latest findings on literary translators’ personal narratives of the translation profession’s shift from humanistic to technology-driven and trends related to perceptions of their role in society and attitudes towards technology.
Cronin, Michael (2013) Translation in the digital age, Oxon and New York: Routledge.
Paola Ruffo (Heriot-Watt University)
Paola Ruffo is currently a second year PhD student of the Centre for Translation and Interpreting Studies in Scotland at Heriot-Watt University, Edinburgh, UK. Her research interests lie at the intersection of literary translation and translation technologies. Before embarking on her research journey, she obtained an MA in Translation and Literature from the University of Essex and worked as a Project Manager for a translation company in London.
In addition to conducting research, she also practises translation as a freelancer.
Machine Translation Markers in Post-Edited Machine Translation Output
The author has conducted an experiment for two consecutive years with postgraduate university students in which half do an unaided human translation (HT) and the other half post-edit machine translation output (PEMT). Comparison of the texts produced shows – rather unsurprisingly – that post-editors faced with an acceptable solution tend not to edit it, even when often more than 60% of translators tackling the same text prefer an array of other different solutions. As a consequence, certain turns of phrase, expressions and choices of words occur with greater frequency in PEMT than in HT, making it theoretically possible to design tests to tell them apart. To verify this, the author successfully carried out one such test on a small group of professional translators. This implies that PEMT may lack the variety and inventiveness of HT, and consequently may not actually reach the same standard. It is evident that the additional post-editing effort required to eliminate what are effectively MT markers is likely to nullify a great deal, if not all, of the time and cost-saving advantages of PEMT. However, the author argues that failure to eradicate these markers may eventually lead to lexical impoverishment of the target language.
Michael Farrell (Traduzioni inglese)
Michael Farrell is an untenured lecturer in post-editing, machine translation, and computer tools for translators at the International University of Languages and Media (IULM), Milan, Italy, the developer of the terminology search tool IntelliWebSearch, a qualified member of the Italian Association of Translators and Interpreters (AITI), and member of the Mediterranean Editors and Translators association.
Besides this, he is also a freelance translator and transcreator. Over the years, he has acquired experience in the cultural tourism field and in transcreating advertising copy and press releases, chiefly for the promotion of technology products. Being a keen amateur cook, he also translates texts on Italian cuisine.
Measuring Comprehension and Acceptability of Neural Machine Translated Texts: a Pilot Study
In this paper we compare the results of reading comprehension tests on both human translated and raw (unedited) machine translated texts. We selected three texts from the English Machine Translation Evaluation version (CREG-MT-eval) of the Corpus of Reading Comprehension Exercises (CREG), which were translated manually into Dutch by a master’s student of translation at Ghent University and translated automatically by two neural machine translation engines, namely DeepL and Google Translate.
For each text we formulated five reading comprehension questions. The experiment was conducted via SurveyMonkey and the participants were asked to read the translation very carefully after which they had to answer the questions without having access to the translation. In total, 99 participants took part in the experiments.
All answers to the questions were given a score of 0 to 1 (1 for fully correct answers, 0.5 for partially correct answers and 0 for completely wrong answers). The scores were averaged over all participants. Preliminary results for the first text show that the human translation scores better (with an average score of 0.68) than the translation obtained by Google Translate (0.60) and the translation generated by DeepL (0.48).
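The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' actual analysis script: the function name `average_score` and the sample scores are assumptions made for the example.

```python
from statistics import mean

# Scores allowed by the scheme in the abstract:
# 1 = fully correct, 0.5 = partially correct, 0 = completely wrong.
ALLOWED_SCORES = {0.0, 0.5, 1.0}


def average_score(scores):
    """Average the per-participant scores for one translation (hypothetical helper)."""
    if not scores:
        raise ValueError("at least one score is required")
    for score in scores:
        if score not in ALLOWED_SCORES:
            raise ValueError(f"unexpected score: {score!r}")
    return mean(scores)


# Invented scores from five participants, for illustration only.
sample_scores = [1, 0.5, 0, 1, 0.5]
print(average_score(sample_scores))
```

Computing one such average per translation (human, Google Translate, DeepL) gives figures that can be compared in the way the abstract reports.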
Lieve Macken, Iris Ghyselen (Ghent University)
Lieve Macken is Assistant Professor at the Department of Translation, Interpreting and Communication at Ghent University (Belgium). She has strong expertise in multilingual natural language processing. Her main research focus is translation technology and more specifically the comparison of different methods of translation (human vs. post-editing, human vs. computer-aided translation), translation quality assessment, and quality estimation for machine translation.
She is the operational head of the language technology section of the department, where she also teaches Translation Technology, Machine Translation and Localisation.
Iris Ghyselen is a Master’s student of Translation at Ghent University, in the Department of Translation, Interpreting and Communication. The article presented is based on her thesis, which she wrote under the guidance of Prof. Dr. Lieve Macken.
Modification, Rendering in Context of a Comprehensive Standards-Based L10n Architecture
The Translation API Cases and Classes (TAPICC) initiative is a collaborative, community-driven, open-source project to advance API standards for multilingual content delivery. The overall purpose of this initiative is to provide a metadata and API framework on which users can base their integration, automation, and interoperability efforts. All industry stakeholders are encouraged to participate. The standard TAPICC relies on for bitext interchange is XLIFF version 2.
The presentation will explore the different approaches that “Modifiers”, i.e. roundtrip agents performing Modifications, can take to changing XLIFF documents, compare their pros and cons, and provide recommendations on selecting the most suitable approach.
We will also describe ways of rendering the information available in XLIFF documents in tools so as to provide the most value to the language specialists (i.e. translators and reviewers) who use them.
Most of the discussed concepts will be accompanied by examples of recommended and discouraged practices.
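As a rough illustration of what a Modification to an XLIFF 2 document can look like, the sketch below updates the target text of one unit and marks its segment as translated. It uses only the Python standard library and the XLIFF 2.0 core namespace; the sample document, the function name `modify_target` and the choice to touch only the first segment of a unit are assumptions made for this example, and it does not reflect the specific approaches compared in the presentation.

```python
import xml.etree.ElementTree as ET

# XLIFF 2.0 core namespace.
NS = "urn:oasis:names:tc:xliff:document:2.0"
ET.register_namespace("", NS)

# Invented sample document, for illustration only.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en" trgLang="fr">
  <file id="f1">
    <unit id="u1">
      <segment id="s1" state="initial">
        <source>Hello world</source>
        <target>Bonjour monde</target>
      </segment>
    </unit>
  </file>
</xliff>"""


def modify_target(xliff_text, unit_id, new_target):
    """Replace the target of the first segment in the given unit (hypothetical helper)."""
    root = ET.fromstring(xliff_text)
    for unit in root.iter(f"{{{NS}}}unit"):
        if unit.get("id") != unit_id:
            continue
        segment = unit.find(f"{{{NS}}}segment")
        if segment is None:
            continue
        target = segment.find(f"{{{NS}}}target")
        if target is None:
            target = ET.SubElement(segment, f"{{{NS}}}target")
        target.text = new_target
        segment.set("state", "translated")
    return ET.tostring(root, encoding="unicode")


print(modify_target(SAMPLE, "u1", "Bonjour le monde"))
```

A real roundtrip agent would also have to respect inline markup, modules and the processing requirements of the XLIFF specification, which this sketch ignores.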
Ján Husarčik and David Filip
Ján is Solutions Architect at Moravia where he focuses on assessing customers’ needs and mapping them to products and services. With a background in design and development, he contributes to process improvements and optimization and handles various activities around implementation.
David Filip is Chair (Convener) of OASIS XLIFF OMOS TC; Secretary, Editor and Liaison Officer of OASIS XLIFF TC; a former Co-Chair and Editor for the W3C ITS 2.0 Recommendation; Advisory Editorial Board member for the Multilingual magazine; co-moderator of the Interoperability and Standards WG at JIAMCATT.
David has also been appointed as NSAI expert to ISO TC37 SC3 and SC5, ISO/IEC JTC1 WG9, WG10 and SC38. His specialties include open standards and process metadata, workflow and metaworkflow automation. David works as a Research Fellow at the ADAPT Research Centre, Trinity College Dublin, Ireland. Before 2011, he oversaw key research and change projects for Moravia’s worldwide operations.
David held research scholarships at universities in Vienna, Hamburg and Geneva, and graduated in 2004 from Brno University with a PhD in Analytic Philosophy. David also holds master’s degrees in Philosophy, Art History, Theory of Art and German Philology.
Proposal for a Bilingual Brazilian Portuguese-French Glossary of Marriage Certificates: Assistance for Translators
Certified Translation (CT) is currently often required for various purposes. Among the most common requests are personal documents, such as certificates of marriage. This is due to the growing relationship between France and Brazil; consequently, the translation of documents in the French-Portuguese language pair is growing, and it is therefore important that researchers develop dictionaries and glossaries to help facilitate this process. Accordingly, our PhD research aims to compile a bilingual glossary of the recurring terms in Brazilian and French certificates of marriage. In this presentation, we present the study we have conducted on the search for French equivalents of the terms in the domain of Brazilian certificates of marriage. For this, we relied on the theoretical presuppositions of Terminology (CABRÉ, 1999; BARROS, 2004, 2007, among others), more specifically of Bilingual Terminology (AUBERT, 1996; DUBUC, 1992). As the methodology of our investigation, we first formed six corpora:
1) CCFCorpus, which consists of 102 French certificates of marriage issued between 1792 and 2012;
2) LFCorpus, which gathers laws and decrees on civil marriage throughout the history of French legislation;
3) Corpus de ApoioFR, composed of a bibliography specialized in French Law and History of France, as well as legal dictionaries;
4) CCBCorpus, consisting of 333 Brazilian certificates of marriage, issued between 1890 and 2015;
5) LBCorpus, which groups a set of laws, decrees and provisions on civil marriage in the history of Brazilian legislation;
6) Corpus de ApoioBR, composed of a bibliography specialized in Brazilian Law and History of Brazil, as well as monolingual legal dictionaries.
We stored the contents of CCBCorpus and CCFCorpus, which contain 85,115 words and 13,827 words respectively, in the textual database of WordSmith Tools (SCOTT, 2004). We used the WordList tool, which allowed us to select only nouns. Next, we used the Concord tool to produce a word list for each selected term candidate. In order to confirm the status of these candidates as terms, i.e. to establish whether they really are relevant terms in the domain of Brazilian and French certificates of marriage, we verified the occurrence of these lexical units in LBCorpus and LFCorpus, and in the monolingual dictionaries and glossaries specialized in Brazilian and French Law that form part of the Corpus de ApoioBR and Corpus de ApoioFR. This verification allowed us to accept each candidate as a term or discard it. At the end of this process, we reached a total of 307 terms in Portuguese and 107 terms in French. Then, we organized the terms in bilingual terminology sheets, which present information on grammatical class, domain of origin, morphosyntactic organization, definition, context and terminological variant (if one occurs). Finally, we searched for the French equivalents of the Portuguese terminological units based on Dubuc’s (1992) equivalence proposal (Support: São Paulo Research Foundation/Brazil).
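The verification step, in which a candidate is kept only if it also occurs in the reference material, can be sketched as follows. This is a simplified illustration in Python rather than a description of the WordSmith Tools workflow the authors used: the function name `confirm_terms`, the whole-word matching rule and the sample Portuguese sentences are all assumptions invented for the example.

```python
import re


def confirm_terms(candidates, reference_texts):
    """Keep a candidate only if it occurs as a whole word or phrase
    in at least one reference text (hypothetical helper)."""
    confirmed = []
    lowered = [text.lower() for text in reference_texts]
    for candidate in candidates:
        pattern = re.compile(r"\b" + re.escape(candidate.lower()) + r"\b")
        if any(pattern.search(text) for text in lowered):
            confirmed.append(candidate)
    return confirmed


# Invented candidates and reference sentences, for illustration only.
candidates = ["casamento", "regime de bens", "dia"]
reference_texts = [
    "O casamento civil é regulado por lei.",
    "O regime de bens é escolhido pelos nubentes.",
]
print(confirm_terms(candidates, reference_texts))
```

In the study itself, the decision to keep or discard a candidate also drew on specialized dictionaries and glossaries, which a simple occurrence check like this does not capture.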
Beatriz Curti-Contessoto and Lidia Barros (São Paulo State University)
I am a PhD student (2015-2019) in Linguistic Studies at São Paulo State University (UNESP) in Brazil, supported by the São Paulo Research Foundation (FAPESP). I am on a research stay at the University of Paris 3 – Sorbonne Nouvelle in France (2018-2019). I have worked as a translator since 2012. I graduated with a Bachelor’s Degree in Translation (French / Spanish) (2011-2014) from the same institution.
Since 2013, I have collaborated with the Centre de Ressources et Information en Français (CRIF), which was founded by UNESP and the French Embassy of São Paulo.
I have also worked as a French teacher in extension courses (2012-2016) and, since 2016, as a substitute professor on the Letters and Bachelor’s Degree in Translation programmes offered by UNESP in São José do Rio Preto/São Paulo.
As a researcher, I have experience in Linguistics, mainly working on the following topics: Diachronic Terminology, Mono / Bilingual Terminology, sociocultural and historical aspects in Terminology, legal terminology, Translation and Sworn Translation.
Lidia Almeida Barros
I am currently a professor at São Paulo State University (UNESP) and I have experience in Linguistics, with emphasis on Terminology, working mainly in the following subjects: Terminology, Translation and Sworn Translation.
I have strong international experience, having taught and conducted research at the University of Lyon 2 for 7 years (1990-1997), coordinated the cooperation agreement between UNESP and that university, and supervised joint PhD students at the University of Paris 3 – Sorbonne Nouvelle and the New University of Lisbon.
Statistical & Neural MT Systems in the Motorcycling Domain for Less Frequent Language Pairs - How do Professional Post-editors Perform?
As more language service providers include post-editing of machine translation in their workflows, studies on quality estimation of MT output are becoming more and more important. We report findings from a user study that evaluates three MT engines (two phrase-based and one neural) from French into Spanish and Italian. We describe results from two text types, product descriptions and blog posts, both from a motorcycling website that was translated by Datawords. We use task-based evaluation, automatic evaluation metrics (BLEU and edit distance) and human evaluation through ranking to establish which system requires the least PE effort, and we set the basis for a method to decide when an LSP could use MT and how to evaluate the output. Unfortunately, large parallel corpora are unavailable for some language pairs and domains. Motorcycling and the French language are low-resourced, and this represents the main limitation of this user study. It especially affects the performance of the neural model.
Keywords: NMT, post-editing, quality evaluation, machine translation
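Edit distance, one of the automatic metrics mentioned in the abstract, counts the minimum number of insertions, deletions and substitutions needed to turn the raw MT output into its post-edited version. The sketch below is a generic word-level Levenshtein implementation in Python; the function name `edit_distance`, the word-level tokenisation and the English sample sentences are assumptions for illustration and are not taken from the study.

```python
def edit_distance(source, target):
    """Levenshtein distance between two sequences (strings or token lists)."""
    previous = list(range(len(target) + 1))
    for i, source_item in enumerate(source, start=1):
        current = [i]
        for j, target_item in enumerate(target, start=1):
            cost = 0 if source_item == target_item else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


# Invented raw MT output and post-edited version, for illustration only.
raw_mt = "the bike has a engine powerful".split()
post_edited = "the bike has a powerful engine".split()

distance = edit_distance(raw_mt, post_edited)
print(distance, distance / max(len(raw_mt), len(post_edited)))
```

Dividing by the length of the longer sequence, as in the last line, is one common way to normalise the score so that segments of different lengths can be compared.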
Clara Ginovart (Universitat Pompeu Fabra)
Clara Ginovart Cid
After completing studies in Translation at Pompeu Fabra University (Bachelor) and the University of Geneva (Master), she became an Industrial PhD candidate at Pompeu Fabra University, in Barcelona, and Datawords Datasia, in Paris.
Thanks to her interest in translation technologies she has become proficient at using project management systems such as Projetex, XTRF, Plunet, or Everwin, as well as a long list of CAT tools (e.g. MemoQ, Trados SDL, DéjàVu, MultiTrans, Memsource), terminology databases, and MT systems. Other IT skills include subtitling and audiovisual tools (Subtitle Workshop, Audacity, etc.), as well as collaborative software (Lotus Notes, Confluence, Digital Core, etc.), and QA tools such as XBench.
She won the 1st Translation Contest of AETI (Asociación Española de Traductores e Intérpretes), and she presented a paper as a speaker at the 2nd International Young Researchers’ Conference on Translation and Interpreting in 2015. She has completed a long list of webinars and training courses, including all SDL certifications and the Quality Management and Post-Editing certifications of TAUS.
Many internships preceded the beginning of her career (as translation or terminology intern in Tortosa, Barcelona, and Geneva), and her three main contracts have been: translation project manager (one year, in Barcelona), translation teacher (3 months, in Barcelona), and CAT and MT tools consultant (current, soon 2 years, in Paris).
Statistical vs. Neural Machine Translation: a Comparison of MTH and DeepL at Swiss Post’s Language Service
This paper presents a study conducted in collaboration with Swiss Post’s Language Service. In order to integrate MT in its workflow, the Language Service asked us to perform an evaluation of different MT systems. We compared the customizable statistical MT system Microsoft Translator Hub (MTH) with the generic neural MT system DeepL for the language pair German>French. The aim of the study was to provide answers to the following two questions: Can a generic neural system (DeepL) compete with a specialized statistical commercial system (MTH)? And is BLEU a suitable metric for the evaluation of neural machine translation systems? In order to answer our first research question, we performed automatic evaluations using BLEU and post-editing human evaluations to compare MTH and DeepL. For the second research question, we looked at the correlation between human and automatic evaluations. Our results show that, in the context of Swiss Post’s Language Service, the non-customizable neural MT system DeepL can achieve much better quality than the customizable statistical system MTH. Furthermore, our study shows that BLEU tends to underestimate the quality of neural machine translation and therefore might not be a suitable metric for the evaluation of this kind of system.
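One simple way to look at the correlation between human and automatic evaluations is the Pearson correlation coefficient between two lists of per-segment scores. The abstract does not say which correlation measure the authors used, so the sketch below is only an assumption-laden illustration in Python: the function name `pearson` and the sample scores are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return covariance / sqrt(variance_x * variance_y)


# Invented per-segment scores, for illustration only.
automatic_scores = [0.21, 0.35, 0.48, 0.52, 0.70]
human_scores = [2.0, 3.0, 3.5, 3.0, 4.5]
print(round(pearson(automatic_scores, human_scores), 3))
```

A coefficient close to 1 would suggest that the automatic metric ranks segments much as human evaluators do, while a low value would support the kind of doubt about the metric that the abstract raises.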
Lise Volkart, Pierrette Bouillon, Sabrina Girletti (University of Geneva)
Lise Volkart has recently completed her master’s degree in Translation, with a specialization in Translation Technology, at the University of Geneva. For her master’s thesis, she worked on a project testing the implementation of machine translation at Swiss Post. She has also worked as a freelance and in-house translator in several translation departments in Switzerland. Lise holds a bachelor’s degree in Translation and Interpreting from the Institut Libre Marie Haps of Brussels.
Pierrette Bouillon has been Professor at the FTI, University of Geneva since 2007. She is currently Director of the Department of Translation Technology (referred to by its French acronym TIM) and Vice-Dean of the FTI. She has numerous publications in computational linguistics and natural language processing, particularly within lexical semantics (Generative lexicon theory), speech-to-speech machine translation for limited domains and more recently pre-edition/post-edition. In the past, she participated in different EU projects (EAGLES/ISLE, MULTEXT, etc.) and was lead for three Swiss projects in speech translation: MEDSLT 1 and 2 (offering a system for spoken language translation of dialogues in the medical domain) and REGULUS (a platform for controlled spoken dialog application) and two projects in computer assisted language learning: CALL-SLT 1 (a generic platform for CALL based on speech translation) and CALL-SLT 2 (designing and evaluating spoken dialogue based CALL systems). Between 2012 and 2015, she coordinated the European ACCEPT project (Automated Community Content Editing PorTal). At present, she co-coordinates the new Swiss Research Center for Barrier-free communication with the Zurich University of Applied Sciences, and the project BabelDr with the HUG (Geneva University Hospitals). She also takes part in the new COST network EnetCollect: European Network for Combining Language Learning with Crowdsourcing Techniques.
Sabrina Girletti is a research and teaching assistant in the Translation Technology Department of the Faculty of Translation and Interpreting (FTI) at the University of Geneva, where she contributes to postgraduate courses in machine translation and localisation. Her research interests include post-editing approaches and human factors in machine translation. As a young language technology consultant, she also works with Suissetra, the Swiss association for translation technology promotion. She is currently involved in a project testing the implementation of machine translation at Swiss Post. Sabrina holds a master’s degree in Translation, with a specialisation in Translation Technology, from the University of Geneva, and a bachelor’s degree in Linguistic and Cultural Mediation from the University of Naples L’Orientale.
Translation Quality Assurance - a risk avoidance approach
This will be a short introduction to the author’s workshop, Linguistic Quality Assurance (LQA) and Translation Quality Assessment (TQA).
Up to now, quality measurement of language translation has largely been subjective, if such measurement was undertaken at all. If a company did set up a quality process with its translation suppliers, the quality of translated service information would generally be reviewed by in-country validators designated by the company.
The main goals of the translation of technical documentation are regulatory compliance and the minimization of risks with regard to product liability and product safety. In order to cover different risks, risk-appropriate processes in translation project management are needed. ISO 17100, however, requires adherence to one single standard process and does not include risk aspects.
In our model and system “myproof”, the core processes of risk management according to ISO 31000 are applied to processes in translation projects. The main element of the model is a risk matrix which allows a risk analysis of the texts to be translated and consequently the development of risk-based processes for translation projects. The implementation of comprehensive risk management for translations will result in regulatory compliance as well as a higher quality of translations with the corresponding quality level.
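To make the idea of a risk matrix concrete, the sketch below classifies a text by combining a likelihood level with a severity level and then looks up a process for the resulting risk class. It is a generic Python illustration and not the “myproof” model: the three-level scales, the scoring rule, the function name `risk_class` and the process labels are all hypothetical.

```python
# Hypothetical three-level scales; the actual model may use different ones.
LEVELS = ("low", "medium", "high")

# Hypothetical process labels per risk class, for illustration only.
PROCESS_BY_RISK = {
    "low": "standard workflow",
    "medium": "standard workflow plus additional review",
    "high": "standard workflow plus additional review and final verification",
}


def risk_class(likelihood, severity):
    """Combine two levels into a risk class using a simple additive rule."""
    score = LEVELS.index(likelihood) + LEVELS.index(severity)  # ranges from 0 to 4
    if score >= 3:
        return "high"
    if score == 2:
        return "medium"
    return "low"


# Print the full matrix: one row per likelihood, one column per severity.
for likelihood in LEVELS:
    print(likelihood, [risk_class(likelihood, severity) for severity in LEVELS])

print(PROCESS_BY_RISK[risk_class("medium", "high")])
```

The point of such a lookup is that the process applied to a translation project follows from the assessed risk of the text rather than being the same for every job.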
For more on this, please attend the author’s workshop.
David Benotmane (Glossa Group GmbH)
David Benotmane is Solutions Architect & Product Director at Glossa Group GmbH, a language service provider specialized in linguistic quality assurance. He is also co-organiser of the annual linguistic symposium in Zurich, Switzerland.
He reorganised the translation services and implemented a fully automated translation management system with specific customized functions and various connectors to subsystems such as CMS, ERP, MRM, PIM, CRM…
He joined Glossa Group in 2015. In his first years there, he worked as a workflow and CAT tool specialist and also on the development of myproof, among the many other activities that a very rapidly growing company requires.
Trying to Standardize Translation Quality – What Were They Thinking?
Evaluating the quality of a translation work product (i.e. a target text), as opposed to evaluating a translation process (see ISO 17100), is a hot topic. And quite controversial. Is it ready for standardization? Recently (in the past few years), two projects have begun with the objective of producing a standard in the area of TQE (translation quality evaluation): First within ASTM (www.astm.org). Then within ISO (www.iso.org). Some would ask why anyone would even try to standardize in this area. This presentation will describe the two projects and attempt to justify why it is indeed time for such standards to be developed. A side note might be included about “QA”, a highly ambiguous acronym that is sometimes expanded as quality assessment and sometimes as quality assurance. This ambiguity and the varied uses of quality assurance in the translation industry are why I have chosen the acronym TQE in this abstract rather than TQA.
Alan Melby (FIT, LTAC)
Alan K. Melby
Alan K. Melby is a certified French-to-English translator. He has three additional titles as of July 2018:
Since 1980, he has added work on several translation-related standards, including TBX (www.tbxinfo.net). Since the 1990s, his interests have expanded to include evaluating/assessing translation quality.
When terminology work and semantic web meet: Ways to help to improve the discoverability of data and their re-use, to provide terminologists with new technological solutions and to contribute to the creation of linguistic assets for linguists
Information storage and retrieval is a long-researched problem in library and computer sciences. The growing need to retrieve information in multiple languages adds to the complexity of the task. Translation and management of multilingual terminological resources is a related task with its own complexities. This paper addresses ways to improve the discoverability of data and their re-use through new technological solutions which also contribute to the creation of linguistic assets for drafters, terminologists and translators. We will build our argument using the examples of the InterActive Terminology for Europe database (IATE), and of the EuroVoc thesaurus and controlled vocabularies managed by the Publications Office of the European Union.
Denis Dechandon, Eugeniu Costetchi, Anikó Gerencsér, Anne Waniart (European Commission)
Denis Dechandon has over 20 years’ experience in translation and linguistics, in office automation and in different management roles. After getting acquainted with the translation work and its requirements at European Union level, he fully committed himself to the definition and implementation of processes and workflows to provide structured and efficient support to linguists and to streamline the work of support teams.
Previously Denis was responsible for leading a service dedicated to the linguistic and technical support provided to translators, revisers, editors, captioners and subtitlers (Computer Assisted Translation, corpus management, formatting and layouting, machine translation and terminology).
He also supervised the maintenance and development of tools and linguistic resources at the Translation Centre for the Bodies of the European Union. Committed to further changes and evolutions in these fields, Denis took over the role of InterActive Terminology for Europe (IATE) Tool Manager from May 2015 to August 2017.
Currently, as Head of the Metadata sector, he is leading the activities in standardization (in particular: EuroVoc and registry of metadata) and is intensely involved in the field of linked open data at the Publications Office of the European Union.
Eugeniu Costetchi is a Semantic Architect at the Publications Office of the European Union in Luxembourg.
His expertise and research interests are Semantic Web technologies, Knowledge Representation and Computational Linguistics. His joint PhD between University of Bremen and Luxembourg Institute of Science and Technology (LIST) addressed the problem of parsing English text with Systemic Functional Grammars applicable to dialogue systems and chat bots.
Anikó Gerencsér has a PhD in Italian Culture and Literature, a Master’s Degree in Italian Language and Literature and a Master’s Degree in Library and Information Science from the University ELTE of Budapest.
Since joining the Publications Office of the European Union she has worked in the field of metadata standardisation and linked open data management. Her particular area of responsibility is the maintenance of the EuroVoc multilingual thesaurus and its alignment with other controlled vocabularies.
She is currently working on the optimisation of the thesaurus management tool VocBench, which involves analysing users’ needs and improving the customised forms and templates. She is an active participant in the VocBench user community, particularly with regard to the development of collaborative features. In addition, she is involved in an ongoing project that aims to achieve interoperability between controlled vocabularies by sharing common tools and formats for the creation, use and maintenance of vocabularies and taxonomies.
Since joining the European institutions, Anne Waniart has worked in the field of thesaurus management. She drafted the “Guidelines for the production of a multilingual version of the European Training Thesaurus”, published by the European Committee for Standardization in “Learning technologies workshop: controlled vocabularies for learning object metadata: typology, impact analysis, guidelines and a web-based vocabularies registry” (Brussels: CEN, 2004; CEN Workshop Agreement 15871).
Additionally, she has been working in the field of metadata standardisation and linked open data management from the time of her recruitment by the Publications Office of the European Union. Her particular area of responsibility is the content management of the EuroVoc multilingual thesaurus and its alignment with other controlled vocabularies.