This volume provides a comprehensive introduction to the Translation Process Research Database (TPR-DB), which was compiled by the Center for Research and Innovation in Translation and Translation Technology (CRITT). The TPR-DB is a unique resource featuring more than 500 hours of recorded translation process data, augmented with over 200 different rich annotations. Twelve chapters describe the diverse research directions this data can support, including the computational, statistical and psycholinguistic modeling of human translation processes.
In the first chapters of this book, the reader is introduced to the CRITT TPR-DB. This is followed by two main parts, the first of which focuses on usability issues and details of implementing interactive machine translation. It also discusses the use of external resources and translator-information interaction. The second part addresses the cognitive and statistical modeling of human translation processes, including co-activation at the lexical, syntactic and discourse levels, translation literality, and various annotation schemata for the data.
About the Book Editors
Michael Carl is an Associate Professor for Human and Machine Translation at the Copenhagen Business School, Denmark. His current research interest is related to the investigation of human translation processes and interactive machine translation. Dr. Carl studied computer science and computational linguistics in Berlin, Paris and Hong Kong. He obtained his PhD from Saarland University in 2001.
Srinivas Bangalore is a Lead Inventive Scientist at Interactions LLC in New Jersey, USA. His current research interests include natural language processing, speech-to-speech translation, and machine learning. He has co-edited two books, published over 100 technical papers and has over 80 patents in these areas. He received a PhD in Computer Science from the University of Pennsylvania in 1997 and was awarded the AT&T Science and Technology Medal in 2009.
Moritz Schaeffer is a postdoctoral researcher at the Center for Research and Innovation in Translation and Translation Technology (CRITT) at the Copenhagen Business School. His primary research interests are cognitive modelling of the human translation process, human-computer interaction in the context of translation, and the psychology of reading. His previous research includes bilingual memory during translation, the role of shared semantics and syntax during translation, and error detection in reading for translation.
List of contributing authors
Adriana Pagano is Professor in Translation Studies at the Federal University of Minas Gerais, Brazil, where she directs MA dissertations and doctoral theses in the Graduate Programme in Linguistics and Applied Linguistics and conducts research at the Laboratory for Experimentation in Translation. Her research interests include domain knowledge in translation tasks; expertise and expert knowledge in translation; and quality assessment in translation from an end-user perspective.
Ana L. V. Leal is Assistant Professor at the Department of Portuguese, University of Macau. She holds a PhD in Computer Science from the University of Évora, Portugal, and a Master's degree in Applied Linguistics from Pontifícia Universidade Católica do Rio Grande do Sul, Brazil. She is the principal investigator in the AuTema-Dis II, AuTema-Syntree and AuTema-PostEd projects, which are funded by research grants from the University of Macau.
Annegret Sturm received an intermediate diploma in interpreting from the University of Leipzig and her MA in Translation Studies from the University of Geneva. Her main research interests are the role of social cognition in translation and the role of metacognitive proficiency in translation processes.
Arlene Koglin is a PhD candidate in Translation Studies at Universidade Federal de Minas Gerais (Brazil). She holds a Master’s degree in Translation Studies from the Universidade Federal de Santa Catarina (Brazil). Her current research and publications focus on post-editing, translation process, metaphor, cognitive effort and eye tracking.
Arndt Heilmann is a student of English Studies and Political Sciences in his final Master’s semester and a prospective doctoral student at the English Linguistics Department at the RWTH Aachen in Germany. Currently he is employed as a research assistant at the department’s eye-tracking laboratory helping to prepare and conduct experiments. Apart from politics, his field of interest is cognitive linguistics and, related to this, translation studies.
Arnt Lykke Jakobsen was professor of translation and translation technology at Copenhagen Business School (CBS) until his retirement at the end of 2013. A growing interest in translation processes and methods of exploring them led to his invention of the keylogging program Translog in 1995. In 2005 he established CRITT, the CBS Centre for Research and Innovation in Translation and Translation Technology, which he directed until his retirement. The main focus of research there has been on developing and exploiting a methodology for translation process research using keylogging and eyetracking.
Arthur de Melo Sá is currently pursuing his MA degree in Applied Linguistics (Translation Studies) at the Universidade Federal de Minas Gerais (UFMG), Brazil. He also has a Bachelor of Arts degree in Letters (English and Translation Studies). He is a researcher at the UFMG’s Laboratory for Experimentation in Translation (LETRA) and a member of research groups on Systemic-Functional Modelling of Translation and Multilingual Text Production.
Barbara Dragsted is associate professor at the Department of International Business Communication, Copenhagen Business School, where she teaches specialised communication and translation and is involved in various research projects under the Center for Research and Innovation in Translation and Translation Technology (CRITT). Her research interests include cognitive processes in translation, translation technology and LSP communication and translation.
Bartolomé Mesa-Lao holds a PhD in translation technologies and is a research affiliate at CRITT, Copenhagen Business School. He is also a visiting lecturer at the Universitat Autònoma de Barcelona (Spain) and the Università degli Studi di Genova (Italy). His current research interests are in translator-computer interaction, translation technologies, the impact of computer-aided translation and post-editing workflows, and the changes brought about by processes of globalization in translator training.
Bergljot Behrens is an Associate Professor of English linguistics and translation studies, Department of Literature, Area studies and European Languages, University of Oslo, Norway. She has taught semantics and pragmatics, translation and translation theory for many years. Her research has centered on contrastive linguistics, discourse representation, and translation, from a qualitative semantic/pragmatic viewpoint, and from a quantitative viewpoint.
Dagmara Płońska is a PhD student at the University of Social Sciences and Humanities in Warsaw. In June 2015 she submitted a PhD dissertation entitled “Strategies activated in the process of written translation: Factors of translation competence”. She holds a Master’s degree in applied linguistics from the University of Warsaw. She has spent six months as an intern at the Center for Research and Innovation in Translation and Translation Technology at Copenhagen Business School.
Daniel Ortiz-Martínez is a member of the PRHLT research center, and an assistant professor in the Statistics Department of the Technical University of Valencia. His research interests include pattern recognition and its application to statistical machine translation. He has worked on several research projects including the MIPRCV and the CASMACAT projects.
Derek F. Wong received his PhD in Automation from Tsinghua University in 2005. He is currently an Assistant Professor in the Department of Computer and Information Science at the University of Macau, with a secondary appointment as a project manager in the Instituto de Engenharia de Sistemas e Computadores de Macau during 2003-2013. His active and diverse research interests span areas of natural language processing and machine translation.
Fabio Alves is Full Professor in Translation Studies at Universidade Federal de Minas Gerais (UFMG), Brazil, where he carries out empirical-experimental research at the Laboratory for Experimentation in Translation (LETRA). His research interests encompass expertise and expert knowledge in translation; cognitive approaches to translation; translation and technology; and human-machine interaction in translation.
Francisco Casacuberta is a full professor of computer science at the Universitat Politècnica de València (Spain) and is a member of the Pattern Recognition and Human Language Technology research centre. His research interests include pattern recognition, machine learning and their application to statistical machine translation, computer-assisted translation, spoken language translation and speech recognition.
Germán Sanchis-Trilles is co-founder of Sciling, SL. Before that, he served for almost 10 years as a PhD and post-doc researcher at the PRHLT research centre, Universitat Politècnica de València, where he received a PhD in computer science, with a specialisation in machine translation. His research interests include multimodal interaction, natural language processing, statistical MT, model adaptation, online learning and software localisation.
Igor Antônio Lourenço da Silva is an assistant professor of translation studies at Universidade Federal de Uberlândia (UFU) in Brazil. He holds a PhD from Universidade Federal de Minas Gerais (UFMG), Brazil. His current research interests encompass translation process research, translation training, translation expertise, and human-machine interaction in translation. He has worked as a freelance translator and proofreader since 2005.
José Luiz Gonçalves is Associate Professor in Translation Studies and English at Universidade Federal de Ouro Preto (UFOP), Brazil, and a research associate at the Laboratory of Experimentation in Translation (LETRA/UFMG), where he carries out empirical-experimental research in translation. His main research interests are expertise and expert knowledge in translation; cognitive approaches to translation; translation competence and translator's training/education.
Jean Nitzke studied translation between 2006 and 2011 for the language pair English-German at the Department for Language, Culture and Translation Studies in Germersheim (FTSK), at the University of Mainz. She worked as a professional translator from 2011 until 2012. In 2012 she started working as a research assistant at the FTSK. Her research interests include translation process research, post-editing and eye tracking. She is currently writing her PhD thesis on "Problem-solving strategies in post-editing".
Jesús González-Rubio is an NLP scientist at Unbabel Lda. He received his B.Sc. in computer science in 2005, his Master in pattern recognition, artificial intelligence and digital image in 2008, and his PhD in computer science in 2014; all of them from the Universitat Politècnica de València. His main research interest is the application of pattern recognition techniques to the understanding of human languages.
Joke Daems is a research assistant at Ghent University, working for the Department of Translation, Interpreting and Communication as part of the Language and Translation Technology Team. She obtained a Master's degree in translation at Hogeschool-Universiteit Brussel in 2012, after which she started her PhD project (ROBOT) at Ghent University. Her research interests include translation, machine translation, post-editing, translatability, translation quality assessment, and human-computer interaction.
Julian Zapata holds a B.A. (Honours) in English-French-Spanish translation and an MA in translation studies from the University of Ottawa, and is currently a PhD candidate at the same university. His research interests include multimodal interaction, human-information interaction, speech technologies, translation dictation and translation technologies. He has also worked as an English-to-Spanish translation professor and as a teaching assistant for translation technology courses.
Karina Sarto Szpak is currently a PhD student at Universidade Federal de Minas Gerais (UFMG), Brazil, where she works on empirical-experimental research in translation at the Laboratory for Experimentation in Translation (LETRA). Her main research interests encompass expert knowledge in translation, cognitive approaches to translation and translation and technology.
Katharina Oster studied translation for the language pairs English-German and French-German at the Department for Language, Culture and Translation Studies in Germersheim (FTSK), at the University of Mainz between 2007 and 2012. In 2014, she started working as a research assistant at the FTSK. Her research interests cover translation process research, psycholinguistics and event-related potentials. She is currently writing her PhD thesis on "The reorganization of the mental lexicon during translation".
Kristian Tangsgaard Hvelplund is an assistant professor in the Department of English, Germanic and Romance Studies at the University of Copenhagen. He holds a PhD in translation from the Copenhagen Business School. His research interests include translation and cognition. His research has concerned in particular the cognitive processes involved in translation, reading and writing.
Kyoko Sekino is currently a PhD student at the Federal University of Minas Gerais (UFMG) in Brazil, where she works on empirical-experimental research in translation at the Laboratory for Experimentation in Translation (LETRA). Her research interests focus on translation processes between distant languages, such as Japanese-Portuguese translation.
Laura Winther Balling holds a PhD in psycholinguistics from the University of Aarhus and is now associate professor of experimental psycholinguistics at Copenhagen Business School. She is fascinated by the multitude of complex cognitive processes that happen during translation, and works on investigating them experimentally. In addition, she does experimental work on word and sentence processing in first and foreign languages.
Lidia S. Chao received a PhD degree in Software Engineering from the University of Macau in 2008. Since 1996, she has been with the Department of Computer and Information Science at the University of Macau, currently as an Assistant Professor. Her current research focuses on data mining and machine learning technology, and on knowledge acquisition in language and bioinformatics.
Lieve Macken is a senior researcher at the Department of Translation, Interpreting and Communication of Ghent University with over 20 years of experience in language technology. Her current research interests are computer-assisted translation, terminology extraction, human-computer interaction in translation and machine translation. She is the operational head of the language technology section of the department, where she also teaches Computer-assisted translation and Localisation.
Maheshwar Ghankot is a Hindi linguist with multilingual translation and writing capabilities and works as Hindi Officer for the Indian Space Research Organisation. He holds a Master's in Translation and an MPhil from the University of Hyderabad, India, and is pursuing a PhD on the theme "Translation Memory for Scientific and Technical Literature" at IGNOU, New Delhi. His research areas are Human & Machine Translation, Editing, NLP and Second Language Teaching.
Marceli Aquino is a PhD student at the Laboratory for Experimentation in Translation (LETRA/UFMG), where she conducts experimental research on the translation process of German modal particles in the language pair Portuguese/German. For two years she held a teaching assistant position for "Portuguese as a Foreign Language" at UFMG. Currently she is conducting part of her PhD studies at Ludwig-Maximilian University (LMU), Germany. Her main areas of research expertise are experimental translation studies and applied linguistics.
Mercedes García Martínez is a PhD student at the University of Maine. Her main research involves machine translation using deep learning. She studied Computer Science Engineering and received a Master's degree in Artificial Intelligence, Pattern Recognition and Digital Image at the Polytechnic University of Valencia. She has worked on machine translation and speech recognition projects, and as a research assistant on the CASMACAT European project at the Copenhagen Business School.
Michael Carl is a Professor mso. for Human and Machine Translation at the Copenhagen Business School, Denmark, and director of the Center for Research and Innovation in Translation and Translation Technology (CRITT). His current research interest is related to the investigation of human translation processes and interactive machine translation. He has been working on machine translation, terminology tools, and the implementation of natural language processing software.
Moritz Schaeffer received his PhD in translation studies from the University of Leicester and has since worked as a research assistant at the Center for Research and Innovation in Translation and Translation Technology (CRITT), Copenhagen Business School, and the Institute for Language, Cognition and Computation (University of Edinburgh).
Márcia Schmaltz has worked as an interpreter for the Brazilian President, ministers, governors and mayors. She teaches on the Master’s Program in Translation Studies at the University of Macau, where she received a PhD in Linguistics. Her research interests include translation process research, cognitive linguistics, machine translation, and translation historiography of Chinese – Portuguese speaking countries.
Norma Fonseca is currently a PhD candidate in Applied Linguistics on the Graduate Programme in Linguistics and Applied Linguistics at the Federal University of Minas Gerais (UFMG) in Brazil, where she develops empirical-experimental research in Translation Studies. She obtained a Master's degree in Applied Linguistics from the same Programme.
Paulo Quaresma is an Associate Professor in the Department of Informatics at the Universidade de Évora, Portugal and a member of the Spoken Language Systems Laboratory of INESC-ID, an Associated Laboratory of FCT, the Portuguese Foundation of Science and Technology. His research interests include natural language processing, automatic machine translation, question-answering systems, text information extraction, sentiment analysis, and ontologies.
Robert Hartsuiker’s research team studies the cognitive processes underlying language comprehension and language production. His most important research topics are language production, language self-monitoring, neurocognition of language, and in particular cognitive aspects of bilingualism.
Samuel Läubli holds a Bachelor’s degree in Computational Linguistics and Natural Language Processing from the University of Zurich and a Master’s degree in Artificial Intelligence from the University of Edinburgh. He conducted research on web‐based approaches to human‐aided machine translation as a research associate with the University of Edinburgh’s Machine Translation Group and, in 2014, joined Autodesk as a Computational Linguist.
Sonia Vandepitte is full professor of English and Translation Studies (TS) at Ghent University in the Department of Translation, Interpreting and Communication. Her research interests have covered causal and modal expressions in translation, anticipation in interpreting, methodology in TS, translation competences, (peer) feedback, international translation training and the translation process.
Srinivas Bangalore is a Lead Inventive Scientist at Interactions Corporation and has made significant contributions in the areas of spoken language translation, multimodal understanding, language generation and syntactic parsing. He has authored over 100 research publications and holds 80 patents in these areas. He has co-edited books on Supertagging and Natural Language Generation in Interactive Systems and has been awarded the AT&T Science and Technology Medal.
Ulrich Germann holds Master's degrees in Theoretical Linguistics from the Ruhr University Bochum (M.A., 1996) and in Computer Science from the University of Toronto (M.Sc., 2013). He is currently a Senior Researcher in the Machine Translation Group within the School of Informatics at the University of Edinburgh, and has been active in research on machine translation since 1995.
Vicent Alabau received his PhD in computer science from the UPV in 2014 on Multimodal Interactive Structured Prediction. Dr. Alabau has worked on several projects related to speech recognition and speech translation, MT, and multimodal interactive tools for pattern recognition and has experience in designing and building web system architectures, such as those used in the MIPRCV and CASMACAT projects. He is CEO at Sciling, SL.
Table of Contents
Foreword. A.L. Jakobsen
Introduction. M. Carl, S. Bangalore, M. Schaeffer
1. The CRITT Translation Process Research Database. M. Carl, M. Schaeffer and S. Bangalore
Part I. Post-editing with CASMACAT
2. Integrating Online and Active Learning in a Computer-Assisted Translation Workbench. D. Ortiz-Martínez, J. González-Rubio, V. Alabau, G. Sanchis-Trilles, F. Casacuberta
4. Analysing the Impact of Interactive Machine Translation on Post-editing Effort. F. Alves, A. Koglin, B. Mesa-Lao, M. García Martínez, N.B. de Lima Fonseca, A. de Melo Sá, J.L. Gonçalves, K. Sarto Szpak, K. Sekino, M. Aquino
5. Learning Advanced Post-editing. V. Alabau, M. Carl, F. Casacuberta, M. García Martínez, B. Mesa-Lao, D. Ortiz-Martínez, J. González-Rubio, G. Sanchis-Trilles, M. Schaeffer
6. The Effectiveness of Consulting External Resources During Translation and Post-editing of General Text Types. J. Daems, M. Carl, S. Vandepitte, R. Hartsuiker, L. Macken
7. Investigating Translator-information Interaction: A Case Study on the Use of the Prototype Biconcordancer Tool Integrated in CASMACAT. J. Zapata
Part II. Modelling Translation Behaviour
8. Statistical Modelling and Automatic Tagging of Human Translation Processes. S. Läubli, U. Germann
9. Word Translation Entropy: Evidence of Early Target Language Activation during Reading for Translation. M. Schaeffer, B. Dragsted, K. Tangsgaard Hvelplund, L. Winther Balling, M. Carl
10. Syntactic Variance and Priming Effects in Translation. S. Bangalore, B. Behrens, M. Carl, M. Ghankot, A. Heilmann, J. Nitzke, M. Schaeffer, A. Sturm
11. Cohesive Relations in Text Comprehension and Production: An Exploratory Study Comparing Translation and Post-editing. M. Schmaltz, A. Leal, D. Wong, L. Chao, I.A.L. Silva, F. Alves, A. Pagano, P. Quaresma
12. The Task of Structuring Information in Translation. B. Behrens
13. Problems of Literality in French-Polish Translations of a Press Article. D. Płońska
14. Comparing Translation and Post-editing: an Annotation Schema for Activity Units. J. Nitzke and K. Oster
List of Contributing Authors
Subject Index