Workshops – TC38

Listed in order of the surname of the (first) workshop moderator

David Calvert (TransForm) Born 1954. Diploma in Creative Photography, Trent Polytechnic, 1974. Worked as a photographer for a number of years before returning to study. BSc (Hons.) Chemistry and Geological Sciences, University of Leeds, 1983. Worked for science-teaching equipment suppliers in London and as a research assistant at the Welsh School of Architecture before moving to Germany in 1986. Taught English and started working in translation, initially as the PC specialist with a group of freelance translators. Set up TransForm GmbH with two other partners in 1994. Now the owner and Managing Director of the company, which holds ISO 17100 certification.	Lost for Words – Maximizing Terminological Quality and Value at an LSP Part 1: Theory. The quality and value of terminology are discussed from an LSP’s viewpoint and defined for an LSP using risk/reward matrices. The features of an optimal terminology process and the process’s relationship to the ISO 17100 translation process are identified. The interests of the other parties in the translation process are reviewed and best practices for terminology work are identified for the different parties involved in translation. Objectives for a terminology process are formulated and discussed. The features of two standard terminology modules are discussed and my choice of terminology server is explained. A standard terminological record structure for termbases is introduced. Part 2: Practice. An implementation of termbases using this term record structure is presented. This includes the ways in which TransForm is dealing with the strengths and weaknesses of QTerm and an iterative process for improving the value of terminological records. Different approaches to automatic term matching are evaluated, with particular attention paid to the problem of false positive results in QA checks. 
Although the economics of the business preclude large-scale investment in terminology, I believe that an iterative approach to collecting and improving terminological data can pay off for an LSP.
Annalisa De Petra and Daniele Cocozza (MateCat/Translated srl - Gold Sponsor) Annalisa De Petra is Director of Project Management at Translated. She is responsible for coordinating the team of sixteen project managers, as well as defining the management processes for projects for small and medium-sized enterprises, in particular for key clients such as Google, Uber and the European Commission. She is also responsible for defining and putting in place processes to improve the company’s efficiency, working with the development team on the creation of new technological tools to help project managers and translators optimize their time and productivity, and improving the quality of the service offered. Daniele Cocozza is a Project Manager at Translated. Daniele is responsible for managing translation projects for private customers, companies and key clients, such as Google and IBM. This involves constant communication with clients’ project leaders, managing a team of more than 500 translators, solving file engineering issues, meeting tight deadlines, taking care of invoicing for all projects and following up with clients. He also helps define processes for key clients in order to improve the quality of the service and make them scalable for project managers. He is also responsible for testing new MateCat features in collaboration with the development team.	MateCat, the Cloud-based Professional Translation Tool that Provides More Matches than any other CAT Tool (MateCat - Gold Sponsor) In this workshop, participants will have the opportunity to see how to work and improve productivity with MateCat. After a brief overview of the translation memory and machine translation engines available in the tool, participants will see how to create a project on MateCat and how to translate and review efficiently, benefiting from more matches than any other CAT tool. 
Participants will also see how to use the project management panel, how to measure the productivity of translators through the data collected during translation, and finally how to make money with MateCat. The workshop will include a practical session in which participants can use MateCat in real working conditions, from project creation through to revision. The workshop is intended for agencies, freelancers and students. Workshop participants will receive a certificate from MateCat.
Ayten Dersan (EU Translation Center) Ayten Dersan has more than 15 years’ experience in translation and linguistics, language technologies and workflow management. After gaining expertise in translation workflows and their requirements at EU level, she fully committed herself to the definition and implementation of various processes and workflows to provide better support to linguists and to streamline the working methods of stakeholders. Ayten Dersan is currently working as a Business Developer dedicated to innovative language technologies provided to translators, revisers, editors, captioners and subtitlers (CAT, corpus management, machine translation and terminology) and to new long-term added-value services, such as web localisation, for clients of the Translation Centre for the Bodies of the European Union. Ayten Dersan has defined, put in place and assessed an optimized subtitling workflow within the Translation Centre, taking into account all business processes and requirements.	The Art of Subtitling Within the European Institutions: Enhance EU agencies’ and institutions’ communication. Streaming video has become one of the most popular communication channels in the world. Educational and/or promotional videos are booming in the web content landscape. The video production of EU institutions and agencies has grown and so has the wish to make their activities accessible on web sites and social media such as YouTube, Facebook or LinkedIn. These videos should be understandable by as many citizens as possible and without any discrimination. With its specific subtitling workflow, the Translation Centre for the Bodies of the European Union has managed to provide all citizens with access to EU agencies’ and institutions’ audio-visual material in any language. The purpose of this presentation will be to put forward the challenges and constraints we experienced and how they have been motivating us to move forward in our everyday translation life.
Miloš Jakubíček (Lexical Computing - Gold Sponsor) Miloš Jakubíček is an NLP researcher and software engineer. His research interests are devoted mainly to two fields: effective processing of very large text corpora and parsing of morphologically rich languages. Since 2008 he has been involved in the development of the Sketch Engine corpus management suite on behalf of Lexical Computing, a small research company working at the intersection of corpus and computational linguistics. From 2011 he was director of the Czech branch of Lexical Computing, leading the local development team of Sketch Engine, and he became CEO of Lexical Computing in 2014. He is a fellow of the NLP Centre at Masaryk University, where his interests lie mainly in syntactic analysis and its practical applications.	Sketch Engine for Translation and Terminology: Interfacing Corpora with CAT Tools Present-day CAT tools mostly focus on handling imports and exports from and to various formats, typesetting of the translated text, use of translation memories and terminology glossaries, consistency checking, as well as project management and accounting. However, they usually lack deep linguistic processing of the texts using state-of-the-art natural language processing (NLP) tools, because these are often language dependent, not straightforward to install on end-user computers, and sometimes of unstable accuracy with regard to the text types being translated. For similar reasons, these tools also do not have access to large amounts of text in the form of annotated text corpora, which now often run to terabytes of data. All this leads to a growing gap between theoretical studies showing how translators can benefit from exploiting both NLP tools and large text corpora, and the actual usage of these tools and data in practice. 
We present an integration of one of the major CAT tools, SDL Trados, with Sketch Engine, an online platform for building, analyzing and managing text corpora in over 80 languages. These corpora are often annotated using state-of-the-art NLP tools, and for major languages they contain billions of words.
Ronan Martin (SAS) Terminology Manager at SAS European Localization Centre (12 yrs). 20 years’ experience working with terminology in connection with language course provision, later localization. Background in language learning/acquisition (M.A. Pedagogy from Univ. of Copenhagen). Responsible for architecture, design, implementation and maintenance of SAS’s current Terminology Management System for localization. Areas of activity: (linguistic) Term extraction, term validation, bridging cross-cultural divides re terms and concepts, interplay of TM and termbases, terminology workflow processes. (technical) Web programming, data analysis, SAS coding, distributed dataflow environment.	The Annotation System Two of the main challenges of translation are comprehension and terminology: understanding the text, and knowing what to call things in the target language. The focus of this topic is the comprehension component, not so much “how to help translators in understanding what is meant by a text string”, but more “how to deploy the solution to translation queries so that all translators have access to the solutions when they highlight a string in the translation (CAT) tool”. The CAT tools we use do have partial solutions, but we did not find these viable for different reasons. We needed a way of annotating source files once and for all. However, it was vital that we did not leave a footprint in our source files. Any footprint would lead to a breakdown at build (compilation) time. The challenge: how do you create an external annotation that will always find its target string in the source files? We opted for a methodology borrowed from the terminology paradigm. Our source string was like a term, and the annotation like a term comment. The termbank became a stringbank, and the term dictionary an annotation dictionary.
Ondřej Matuška (Lexical Computing - Gold Sponsor) Ondřej Matuška graduated in English at Masaryk University, Brno, Czech Republic. Until recently he worked as Sales Manager at Macmillan Education with an extended responsibility for the promotion of the dictionary products in the Central European region. Apart from the use of corpus tools for lexicography, his area of interest includes the use of corpora in English Language Teaching. He recently joined Sketch Engine, where his responsibilities focus on usability, the user interface, user experience and implementing new functionality based on user feedback.	Introduction to Sketch Engine for Translators and Terminologists The aim of the workshop is to introduce translators and terminologists to the benefits gained by using a corpus query and management system such as Sketch Engine and by exploiting the natural language processing expertise contained in such tools. The system has undergone thorough development from a mainly academically oriented tool to a technology which can be used outside the academic world. Nowadays, Sketch Engine can be used to look up information about how language is used in its widest sense. Evidence found in Sketch Engine’s multi-billion-word text corpora serves as a solid ground for various types of research, from purely linguistic or lexicographic to socio-linguistic and other areas. In the context of translation and terminology, the high-quality text corpora of impressive sizes serve as representative samples of language large enough to provide sufficient evidence of use even for domain-specific language. Recently, Sketch Engine added a sophisticated tool for term extraction which uses statistical methods on linguistically annotated texts to extract both single- and multi-word terminology. This is further aided by comparing the difference in the use of words in domain-specific texts and general texts to produce terminology candidates of unprecedented quality.
Anja Rütten Anja Rütten is a freelance conference interpreter for German (A), Spanish (B), English (C) and French (C), based in Düsseldorf, Germany since 2001. She is a member of AIIC as well as the German Conference Interpreters’ Association VKD im BDÜ e.V. Apart from the private market, she works for the EU institutions and as a lecturer at the TH Cologne. She holds a degree in conference interpreting (Diplom-Dolmetscherin) and completed her PhD on Information and Knowledge Management in Conference Interpreting at the University of Saarbrücken in 2006. She has specialised in knowledge management since the mid-1990s and has been blogging about this subject since 2014 (www.dolmetscher-wissen-alles.de).	Interpreters’ Workflows and Fees in the Digital Era In today’s digital and connected environment, interpreters being recruited and paid by the minute (or hour) does not seem as inconceivable as it used to be. There are indeed business cases of knowledge work on a micro level, i.e. the interpreter operating within a limited window of information, with no access to background knowledge. The usual scenario, however, involves interpreters working on a macro level, meaning that background knowledge plays an important role. Accordingly, recruitment and payment can occur on a micro or macro level, i.e. taking into account, or not, the secondary knowledge work involved in an assignment. Whereas it might sound logical to pay “informed” macro knowledge work in larger payment units (days, maybe hours) and micro knowledge work in smaller payment units (minutes, or words), this does not always seem to be the case. An interesting question to deal with in this context is how software and digital platforms can help to support interpreters’ workflow on the one hand and determine adequate amounts and units of payment on the other. Surely, some insight can be gained by looking at the differences and similarities between translation and interpreting.
Clémentine Tissier (SDL - Silver Sponsor) Clémentine has now been working at SDL, specifically with the SDL AppStore team, as the Marketing Executive for over 2 years. She is responsible for marketing the SDL AppStore and Developer Hub platform as a way for translation professionals to extend and personalise their software to tailor it to their own specific needs. Her role varies from promoting the latest free apps and APIs to SDL’s customer base through webinars, blogs and emails, to attending and presenting at both SDL and industry events. Her favourite part of the job is writing or filming content for the SDL AppStore website which includes informational blogs and ‘how-to videos’ for the apps, as well as teaching those who are new to the SDL AppStore in her monthly introductory webinar.	Building your Ideal Translation Environment with Apps and APIs from the SDL AppStore (SDL - Silver Sponsor) More than two-thirds (68%) of over 2500 respondents in a global research study say that it’s important to be able to add applications to CAT tools today, whilst almost four in five (78%) think they will be essential in five years. Join us for an introductory workshop on how you can extend and customise SDL’s translation software to create your own ideal translation environment using the SDL AppStore and Developer Hub platform. During this session we will cover and demonstrate: • An introduction to SDL AppStore and its capabilities • How to find and download apps on the app store • How to start developing your own apps with SDL’s APIs As well as this workshop, you can visit SDL’s exhibition stand for demos of the apps or example case studies of users who have developed their own solutions with SDL’s APIs.
Andrzej Zydroń (XTM) Andrzej Zydroń MBCS CITP, CTO at XTM International, is one of the leading IT experts on Localization and related Open Standards. Zydroń sits or has sat on the following Open Standard Technical Committees: 1. LISA OSCAR GMX 2. LISA OSCAR xml:tm 3. LISA OSCAR TBX 4. W3C ITS 5. OASIS XLIFF 6. OASIS Translation Web Services 7. OASIS DITA Translation 8. OASIS OAXAL 9. ETSI LIS 10. DITA Localization 11. Interoperability Now! 12. Linport Zydroń has been responsible for the architecture of the essential word and character count GMX-V (Global Information Management Metrics eXchange) standard, as well as the revolutionary xml:tm (XML-based text memory) standard which will change the way in which we view and use translation memory. Zydroń is also chair of the OASIS OAXAL (Open Architecture for XML Authoring and Localization) reference architecture technical committee, which provides an automated environment for authoring and localization based on Open Standards. Zydroń has worked in IT since 1976 and has been responsible for major successful projects at Xerox, SDL, Oxford University Press, Ford of Europe, DocZone and Lingo24 in the fields of document imaging, dictionary systems and localization. Zydroń is currently working on new advances in localization technology based on XML and linguistic methodology. Highlights of his career include: 1. The design and architecture of the European Patent Office patent data capture system for Xerox Business Services. 2. Writing a system for the automated optimal typographical formatting of generically encoded tables (1989). 3. The design and architecture of the Xerox Language Services XTM translation memory system. 4. Writing the XML and SGML filters for SDL International’s SDLX Translation Suite. 5. Assisting the Oxford University Press, the British Council and Oxford University in work on the New Dictionary of the National Biography. 
6. Design and architecture of Ford’s revolutionary CMS Localization system and workflow. 7. Technical Architect of XTM International’s revolutionary cloud-based CAT and translation workflow system: XTM. Specific areas of specialization: 1. Advanced automated localization workflow 2. Author memory 3. Controlled authoring 4. Advanced Translation memory systems 5. Terminology extraction 6. Terminology Management 7. Translation Related Web Services 8. XML-based systems 9. Web 2.0 Translation related technology	Calculating the Percentage Reduction in Translator Effort when Using Machine Translation Presently there is no indication for users of Machine Translation (MT) regarding how much productivity improvement, and therefore cost saving, they can expect from using machine translation for a given project. At best you will get an answer from MT practitioners of the type ‘well it depends’. The workshop will describe a relatively simple formula that will provide a precise indication of exactly how much productivity improvement to expect for a given language pair and domain-specific MT engine. This is an important new way of bringing clarity and precision to a previously undefined area in computational linguistics.