Tuesday, August 02, 2022

MTPE of Poor Quality Source Texts: Some Practical Suggestions

To achieve the best MT results, you should first correct the source text when it comes from a scanned hard copy or an automatic transcription of recorded speech. Here are a few practical suggestions:

  • Choose the correct settings before running OCR. In particular, select the correct source language (you’ll see better suggestions during verification), select the correct graphic resolution for each page and the correct text direction for each piece of text, and de-skew and clean each page that requires it. Verification should be run by someone familiar with both the source language and the subject.
  • Correct misspelled or wrongly transcribed words.
  • Add “[sic]” after any word that you cannot identify and that you suspect is an artifact of the OCR process. This helps the post-editor focus on problem areas.
  • Capitalize proper nouns and acronyms.
  • Lowercase incorrectly capitalized words.
  • Reassemble sentences broken up by paragraph breaks (hard returns) or line breaks (soft returns).
  • Feed the source text to the MT engine only after completing such corrections; doing otherwise will yield substandard results and will take longer to post-edit.
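
Reassembling broken sentences is the most mechanical of these steps, so it can be partly automated. What follows is a minimal Python sketch under stated assumptions, not a tested pre-editing tool: the function name reassemble_sentences and the set of sentence-ending characters are hypothetical choices made for illustration. The heuristic will wrongly merge headings or list items that lack final punctuation, and it does not rejoin words hyphenated across lines, so the output still needs a human check.

```python
# Characters assumed to end a sentence; adjust for the source language.
SENTENCE_END = (".", "!", "?", ":", '"', "”")


def reassemble_sentences(text: str) -> str:
    """Join lines that were split mid-sentence by stray line or paragraph breaks.

    Each line of the result is one reassembled paragraph.
    """
    merged = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            # Skip blank lines left behind by stray hard returns.
            continue
        if merged and not merged[-1].endswith(SENTENCE_END):
            # The previous line stopped mid-sentence: glue this line onto it.
            merged[-1] = merged[-1] + " " + line
        else:
            merged.append(line)
    return "\n".join(merged)


if __name__ == "__main__":
    ocr_output = (
        "The contract shall\n"
        "remain in force\n"
        "\n"
        "until terminated.\n"
        "Either party may\n"
        "terminate it."
    )
    print(reassemble_sentences(ocr_output))
```

Run on the sample above, the sketch produces two lines, one per sentence. Correcting misspellings and flagging suspect words with “[sic]” still require a reader who knows the source language and the subject.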

When the source text is good, you can skip pre-editing, but, when it is questionable or poor, pre-editing enhances the quality of the resulting machine translation and helps the post-editor achieve the desired quality.

Tuesday, July 27, 2021

An AutoHotkey solution to a memoQ problem

The memoQ problem…

Some installations of memoQ suffer an annoying problem: cut and paste doesn’t work. The rest of the program works, as does copy and paste, but not the "paste" part of cut and paste: memoQ deletes your text at Ctrl+X all right, but it does not save it to the clipboard, so you have nothing to paste when you hit Ctrl+V.

If you try to perform the operation via the context menu, no luck, either: after Ctrl+X, paste (Ctrl+V) is greyed out.

Cut and Paste not working in memoQ

The problem appears unpredictably: you could have (as I do) two similar computers, with a similar panoply of software installed. On one, cut and paste works as expected in memoQ, while on the other it doesn't.

memoQ’s support staff are aware of this problem, but (since they have never been able to reproduce it) their developers are unable to fix it. memoQ's support offers several suggestions that sometimes work, from disabling other programs that might interfere with the clipboard, to deleting certain temporary files, and finally to that old favorite of all support organizations: reinstalling the program… but even reinstallation, for certain users, fails to correct the problem.

Fortunately, there is a simple workaround: instead of hitting Ctrl+X to cut and Ctrl+V to paste, you can add a step—copy (Ctrl+C), delete (Del), and finally paste (Ctrl+V)—but if you are accustomed to just using Ctrl+X/Ctrl+V in all other programs, you are likely to forget that you have to use different steps in memoQ.

...and the AutoHotkey solution

So we need a more permanent solution, and one is at hand (this also is thanks to a suggestion from memoQ support): using an AutoHotkey [1] script to replace the “cut” part of cut and paste.

Here is the script (complete with comments to explain what each step does):

#NoEnv                  ; Recommended for performance and compatibility with future
                        ; AutoHotkey releases.
SendMode Input          ; Recommended for new scripts due to its superior speed
                        ; and reliability.
SetWorkingDir %A_ScriptDir%  ; Ensures a consistent starting directory.
#IfWinActive memoQ      ; Hotkeys defined below this line work only while a window
                        ; whose title starts with "memoQ" is active, not in other programs
^x::                    ; Assigns to the Ctrl+X shortcut the following actions:
Send ^c                 ; Copy (as if "Ctrl+C" had been pressed)
Send {Del}              ; Delete (as if the "Delete" key had been pressed)
Return                  ; End of the Ctrl+X hotkey
#IfWinActive            ; End of the "works in memoQ only" part of the script

You can use this script "as is" if you already have AutoHotkey installed: just copy the above code to an empty text file and save it with an .ahk extension; then, whenever you need to work in memoQ, double-click the script file to launch it.

If you use other AutoHotkey scripts, you can also add the above code snippet to one of them (so long as none of them uses Ctrl+X as a hotkey). And you can even compile the script to an independent .exe file, to use on computers where AutoHotkey is not installed.


[1] AutoHotkey is a free, open-source scripting language for Windows that allows users to easily create small to complex scripts.

Monday, July 26, 2021

Guest Post: Launching CLEAR Global: Translators without Borders Is 10 and Ready for More!

by Jennifer Cajina-Grigsby

Translators without Borders began when a small group of people translated information for aid organizations helping people hit by the Haiti earthquake 10 years ago. Driven to close the language gaps hindering critical humanitarian and international development efforts worldwide, the rapidly growing Translators without Borders team has achieved a lot since our founding. For example, we:

  • Translated information following the earthquake in Haiti, tsunamis in Indonesia, cyclones in Mozambique, and hurricanes in the USA.
  • Translated 1,900 Wikimedia medical articles in 83 languages, which helped over 40 million people access free and accurate health information.
  • Created multilingual chatbots to give people in the Democratic Republic of Congo and Nigeria accurate and reliable COVID-19 and Ebola information (read about Translators without Borders’ COVID-19 global response and glossary).
  • Started creating language maps in over 60 languages for organizations to better prepare for multilingual communication.

This would have been impossible without the support of donors, partners, and our community of linguists, translators, and local experts: over 60,000 volunteers contributing time and skills for a world without language barriers. Some have been with us for almost 10 years. Their experience working with Translators without Borders allows them to polish their translation skills, learn about the language technology we’re building, and get a solid foundation for their careers in translation.

Translators without Borders recognizes the work of its volunteers by giving recommendations, references, and milestone certificates when the volunteers reach a certain word threshold. We foster a relationship with our community based on reciprocity, fairness, and shared values. Our community members work when they can, producing content for people who speak marginalized languages.

Thanks to this community, our impact has increased over the past years. Since our foundation, we have significantly grown as an organization, but much is yet to be done. Thus, with Translators without Borders at our core, we have evolved our brand to CLEAR Global, which includes CLEAR Insights and CLEAR Tech: entities working together for a more just society. Our goal is to make sure that the four billion people who speak a marginalized language are listened to, and can get the information they need, want, and understand. That’s what CLEAR is all about: Community, Language, Engagement, Accountability, and Reach.

ClearGlobal website

To go with this brand evolution, we’re running a campaign to lead us into the next 10 years. We want to raise funds for building diverse, scalable language technologies based on artificial intelligence and machine learning. Our teams and partners will use them to better communicate with local communities. We’ll implement solutions to help people who speak marginalized languages to ask for, receive, and share vital information.

As a nonprofit, we’re always seeking support. If our cause and growth resonate with you, you can join our community right now. If you want to do more to support our growth, you can also help us spread the word about our “10 years on. Ready for more” campaign or even make a donation. Every contribution counts and will help improve other people’s lives.

About the author:

Jennifer Cajina-Grigsby, before joining Translators without Borders, worked as a freelance translator for US nonprofits. Jennifer has always been passionate about language diversity and how to help bridge the global language divide. Her desire to do more led her to Translators without Borders, a nonprofit helping people get vital information, and be heard, whatever language they speak. 

Tuesday, July 13, 2021

Translations from Italian and into Italian

A couple of links for those interested in Italian translation:

Nota del Traduttore, a YouTube series from Italian publisher Gruppo Mondadori, in which several Italian translators speak about their work.

Backstories: Afro-Italian Women Writers, in the July/August 2021 issue of WORDS without BORDERS, The Online Magazine for International Literature: 

This issue presents writing by Afro-Italian women. In the face of xenophobic rhetoric and policies, Black Italians have pushed their country to confront its colonial past and engage with its present diversity.

Monday, July 05, 2021

Guest Post: The Studio Academy - Mastering File Types in Trados Studio

by Michael Widemann

Even though I have been using Trados products for nearly 20 years now, I only started digging deeper with the release of Studio in 2009. And this for a good reason.

As a project manager responsible for delivering multilingual translation projects to my clients, I am confronted with ever more different file formats, many of which are specific to only one client. This is especially true for XML. But there is so much more: different versions of Microsoft Office documents, FrameMaker, InDesign, CSV and text files, JSON or YAML. Every file type is based on a completely different concept and each new version comes with new features that make established processes redundant.

What I needed was a completely different approach to how I use Studio. The defaults were not good enough anymore.

Then, in 2009, I also started working as a Trados trainer, where I had the chance to work with freelance translators, project managers at agencies and localization specialists in small and large companies all over the world. And what I soon began to realize is that – even though everybody has their own workflows – most of them work with Studio’s default settings. They install it and go for it.

And it works. Even if you have never worked with such a tool, the fundamental concepts are easy to understand: translation memory, concordance, terminology integration. Saves time and money. Great. Plus 51 file types right out of the box. Studio handles them all.

After all, this is Studio’s concept: Whether you know how InDesign works, what an XML file is made up of or have mastered the intricacies of JSON files – Studio makes it possible for you to translate them. No questions asked. No job you need to turn down because you do not have the required software. Studio – even in its standard installation – extracts the text it deems translation-worthy and presents it to you in a uniform working environment.

Yet there seems to be a problem...

All these options might be overwhelming. How can you possibly decide whether to extract content from Master Pages in InDesign documents, choose the right parser settings, or judge whether it is necessary to insert a UTF-8 BOM, for example, when you have no idea what this is all about? And what's the deal with regular expressions and segmentation rules?

This is the problem I aim to solve with “The Studio Academy”: The complete guide to mastering file types in SDL Trados Studio:

- Detailed explanations on all available file type options, based on real-world examples.

- Everything you need to know about the concept behind file types in order to make the right decisions.

- Bonus information on embedded content, regular expressions, segmentation rules, XPath, ...

These modules are for you if...

- You don’t want a piece of software to make decisions for you. You want to be in control.

- You want to customize Studio to extract only the text you actually need. Not more, and certainly not less.

- You want to create your own file types to have the best solution for unknown file formats.

- You want to be able to handle files that do not follow any standard (e.g., HTML files copied to Excel) by using embedded content, regular expressions and customized segmentation rules like a pro.

Where and when to customize file type settings

About the author:

Michael Widemann is a project manager at a translation agency and an approved Trados trainer with 20 years’ experience in the industry. He also works as a translator and has published several books, mainly about music, some of them with Cosoc Grand Palace Publishing (his own publishing company). He is responsible for the German version of the Xbench manual, loves finding new ways to improve his workflow and hosts the podcast “Keine Zeit”, a weekly talk show about productivity, communication, motivation, goals, life and whatever else can go wrong.

Saturday, July 03, 2021

FeedBurner Goes Away, and unfortunately so does your email subscription

If you subscribed to receive updates from About Translation by email, please note that, since Google is going to remove or restrict FeedBurner, your subscription will stop working in July 2021.

I've already removed the link on this page that allowed readers to subscribe, since there is little point in accepting new subscriptions if the entire subscription service will stop working soon; I'm currently looking into alternatives for sending updates from this blog to the readers who still wish to receive them. In the meantime, please note that I also automatically announce new posts to this blog via my Twitter feed (@RSchiaffino).

Wednesday, June 30, 2021

memoQ Regex Assistant

Version 9.8 of memoQ includes the Regex Assistant, a new tool that helps with creating, validating and using regular expressions. I haven’t used the new feature extensively yet, but I look forward to exploring it more the next time I use memoQ for a translation project.
memoQ 9.8 - Regex Assistant

Friday, April 16, 2021

Trados Studio 2021 - The Manual

Mats Linder has just published a new edition of his excellent Trados Studio manual, now covering version 2021 of the tool.

Cover of Trados Studio 2021 - The Manual

As usual, Mats has done a thorough job of describing the details of the new version of the tool, with one important exception, which Mats explains at the beginning of the new manual:
The 2021 version [of the tool] is mainly about the introduction of SDL Trados Live [...] The online editor will require many pages of documentation before it is covered to the same depth here as Studio. Upcoming editions of the 2021 manual will provide such documentation
So, the new manual covers other important changes introduced by SDL (now RWS) in the new version of the tool, but doesn’t describe (yet) the details of Trados Live, the online version of the tool.

Still, while we wait for Mats to also cover the new online tool, the 2021 manual is essential reading for all translators who want to make the most of the new features in the tool, including, for example, improvements to the advanced display filter.

As usual, Mats also provides a version of the manual that highlights the changes made since the previous edition. I’ve always found the highlighted version to be particularly useful: the highlights help readers skip to the parts of the book that describe changes or new features.
You can buy the Manual (or upgrade to the new edition) from Mats’s web page: SDL Trados Studio - The Manual

Tuesday, December 01, 2020

Guest post: Translators’ Attitudes towards Machine Translation

By Irene Chamali

In my dissertation, I tackled the topic of Machine Translation vs. Translators, not only because I want to later become a translator myself, but also because I was always fascinated with technology and how it is used in different professions. My key question was: What are professional translators’ attitudes towards the technological tools created for their profession?

Word Cloud

1. Research Questions and Hypotheses

My first question was “Do professional translators believe that Machine Translation (MT) increases their productivity?” What I found (from the answers received and existing research) was that such software is easy to use, offers fast results and, according to professional translators, improves their productivity.

My second question, “Do translators view MT as a threat?” looked at how translators feel about automated programs which can translate entire texts automatically. I found that there is no fear that MT will replace translators, since, according to research participants, it is not quite advanced yet and there are aspects of language which MT software cannot yet cope with. So, translators do not view MT as a threat (yet). 

Moving on to the third question, “What are the requirements that MT software has to fulfill in order for translators to use it?” I originally believed that it would be difficult to pinpoint specific requirements. Previous research claimed that speed, usefulness and ease of use are the main factors driving MT software adoption, and my research confirmed this: I found that the requirements for MT software adoption are ease of use, fast results, a target text which requires only minor corrections, and the availability of training and support for MT software.

My last question was “Is experience one of the factors which lead translators to the acceptance of MT?”, and the answers showed that more experienced translators are more likely to use MT software.

2. Participants and Data Collection

The participants were 42 professional translators (freelancers, in-house, working in companies or in the EU) from all over the world, of different ages and experience. I collected data through online questionnaires and then examined it with the help of SPSS (a statistical tool).

3. Results

Not all the results were what I was expecting, but this didn’t discourage me, because unexpected findings can encourage further research. 

The results regarding perceived increase in productivity thanks to MT software showed that most participants recognize the advantage of using such software, since it can increase productivity. Most respondents, however, appear not to trust the quality of machine translation. Not all groups of translators (freelancers, in-house translators, etc.) have the same opinion regarding perceived productivity. For example, none of the in-house translators agreed that MT software can increase productivity, although most of the other groups thought otherwise. The reason may be that they are urged by their companies to use software which does not suit their needs.

Almost no participant feared that MT will replace human translators, since MT still needs to improve considerably. The younger the participants were, the less they believed that MT software can replace them. I think this is because younger translators are more used to using technology and seeing such tools complement their work instead of replacing it, so they are less intimidated by MT. Gender, on the other hand, did not seem to play any role in perceived threat. What played a role, according to the results, was nationality, as the answers to questions regarding perceived threat differed from one nationality group to the other. For example, Turkish, Spanish, Australian, Swedish, Bulgarian and Danish participants did not seem to agree that MT software can replace human translators. French participants, on the other hand, agreed, and Portuguese, Moldovan and Austrian ones were generally neutral.

Regarding the requirements for MT software, the participants’ ranking showed that the most important are usefulness, fast results and ease of use. It was interesting to see that the answers that MT software users gave did not differ from those of non-users, which could mean that non-users have a realistic view of what MT software can offer.

Finally, the outcome of my last research question about work experience as a determining factor for MT software use was that groups with different working experience gave similar answers. The small number of participants could explain the fact that my results differed from those of previous studies.

I think that conducting research surveys like the one I did for my university is not only important for academic purposes but is also useful to help software developers tailor MT software to the needs of their clients. I will be very glad if my paper makes a contribution, however small it may be, to the investigation and enhancement of the relationship between human and machine.

About the author:

Irene Chamali is a recent graduate from CITY College, International Faculty of the University of Sheffield, in Thessaloniki, Greece. She was accepted in 2017, studied in the English Studies Department for three years, and was awarded the BA (Honors) degree in English Language and Linguistics. After her BA studies, Irene was accepted for an MA in Translation and Interpreting from CITY College, which she is currently undertaking. Her article summarizes the research she completed for her dissertation.

Friday, October 23, 2020

Guest Post: Bohemicus - a multifaceted translator’s tool

by Jan Kapoun

Ever wanted to use machine translation or voice dictation in just about any CAT tool out there?

Well ... now, you can!

What is Bohemicus?

Bohemicus is a powerful translator’s tool. It integrates with your CAT tool (or any other application) to enhance its capabilities. It works like an interface. With Bohemicus, you can use machine translation, voice dictation (speech-to-text), your own translation memories, conveniently search in online/offline dictionaries, take notes, and much more… in CAT tools that do not actually provide such functionality by themselves. This way, your productivity and translation speed are greatly boosted.

Bohemicus enables you to work in professional software such as Across or Transit and to use machine translation or voice dictation, even if your software itself does not provide such functionality.

For a better understanding of what Bohemicus actually is all about, please watch the introductory video below:

Bohemicus: A program that’s actually on your side

Bohemicus has been created by a person who truly understands your needs: Jan Kapoun, a professional translator and IT developer with 13 years’ experience in the translation industry.

Machine translation in Bohemicus

Machine translation is provided by Google (paid service) or by MyMemory (free, but limited to 10K words/day).

To machine-translate a segment, simply press Ctrl+Space in your CAT tool. Bohemicus captures this command, translates your text behind the scenes and re-inserts the translation in the target language into your CAT tool.

Bohemicus works in several CAT tools: SDL Studio, Across, WordFast, memoQ, and DejaVu. In other tools, especially online tools like XTM or Coach, you just need to copy the source text into your target segment, select all the relevant text in this target segment and press Ctrl+Alt+Space. This will translate the selected portion of text.

Voice dictation

Voice dictation is based on the excellent Google speech-to-text engine, which functions even with minor languages, such as Czech, Slovak and Hungarian. To use this feature, it is necessary to download Bohemicus to your Android device (phone or tablet) and connect it through Bluetooth, with Bohemicus running in Windows 7/8/10. The Android and Windows instances of Bohemicus connect to each other automatically. Once you have established this connection, simply press the tilde key (~) on your PC keyboard (or tap the big blue B on your Android screen) to initiate the listening function. When you are done speaking, press the tilde key again to stop listening. Your speech will be almost instantly converted to text and inserted into your target CAT tool.

Offline/online dictionaries

To look up a specific word or term in your connected offline or online dictionary, simply select it in your CAT tool and press Ctrl+Alt+K and your offline/online dictionary will automatically appear on the screen, having looked up your word/term.

Bohemicus’ Concordance Tab

Your own translation memories

When working in Across or in online tools like XTM or Coach, you cannot use your own translation memories. This can really be a hassle, especially if you know that you have previously translated a similar text. With Bohemicus, you can connect your own translation memory and look up selected terms or even whole segments in it, by simply pressing Ctrl+Alt+K.

And more

Bohemicus also offers other useful editing functions, like a really neat note-taking feature, a clipboard manager for quickly inserting predefined strings... and much more.

About the author

Mgr. Jan Kapoun is a Czech linguist and programmer with a degree in Applied Information Technology (University of South Bohemia) and more than 13 years’ experience in the translation industry. He translates technical texts from English, German and French into Czech, and is continuously developing the Bohemicus software. You can try out his software by downloading it from his web page: Bohemicus Software