Thursday, June 02, 2016

A step back to the past: translating without a CAT

I'm working on a large translation project. Legal documents, scanned pdf files, not really suitable for OCR (too many stamps, signatures and handwritten text). The documents are repetitive, but without major blocks of identical text, although a few sentences recur almost unchanged in different parts of different files.

This is exactly the kind of project (minus the "scanned pdf, not really suitable for OCR" part) that CAT tools were invented for. I'm translating these documents with the pdf open on the left of the screen, and MS Word on the right. My fingers itch for the concordance and filter shortcuts, but that is not possible here: I cannot really search the source (although there is a way to do it... more about that later), and while I can search the files I have already translated, I cannot perform a real concordance search.

This is the way we all translated up to a little over twenty years ago, when CAT tools were first introduced. Even without CAT tools, though, I enjoy a much larger and clearer screen, a more modern word processor, and a fast Internet connection for looking up references. Still, it feels like going back almost to the days of pen and paper. I know that there are translators who still work this way, who refuse to use CAT tools, and who maintain that the only translation memory they need is the one they have between their ears. The only thing I can say is that everyone is entitled to their opinion, but that they should give CAT tools a try.

If you are accustomed, like me, to working on most projects with CAT tools, there are still a few things you can do when you are faced with a large project that has to be completed using just a word processor.
  • If you know there are words, phrases and sentences that repeat themselves throughout the project, you can speed things up using a text expander program. MS Word includes similar functionality, but I prefer to use an external tool to have more control over what I do. In my case I use AutoHotkey. This scripting program allows me to create pairs of triggers and sentences. For example, I can add to my triggers "<PBC", which then expands to "Provincia della Columbia Britannica." If you use a text expander, be careful not to choose as a trigger a combination of letters that could appear normally in your writing, otherwise you risk getting garbage words: if you use the trigger "PR" as a shortcut for "Provincia" but then try typing "professionista", you end up with the garbage word "Provinciaofessionista". That is the reason I always add the "<" character at the beginning of my triggers.
  • Even if I cannot use CAT tools on this kind of project, I can still use translation memories and glossaries: I load them in Xbench, and use it as a search engine. I can even use Xbench shortcuts to highlight text in MS Word and transfer it to the Search box in Xbench.
  • Scanned pdf files are not normally searchable... at least not with a free pdf reader. The Pro versions of modern pdf tools, however, include OCR. So if you have Nitro Pro, for example, it indexes your scanned pdf files if their quality is good enough; you can then search them. The results won't be perfect, but they are better than nothing. Nitro Pro is pricey ($160 for the desktop version), but for a one-off project you can download the free trial version: this is exactly what I've done for this project. If you find the Pro version of Nitro useful (besides OCR, it offers a bunch of other functions), you may well decide to pay for it: it depends on how often you have to deal with scanned pdf files.
  • If you can put all your translation in a single file, MS Word search is excellent, but what if you have to create a separate Word file for each of the many source files? What can you do, for example, if you want to know whether you have used a certain term in previous files? In a CAT tool you can do that easily, either using the search function or, better, using filters. You can do the same on MS Word files using specialized search tools. In my case, to search the .docx files I created for this project, I used FUNDUC's Replace Studio Pro, which works on many kinds of files, including .docx files. If you decide to give Replace Studio Pro a try, carefully read the section of the help file devoted to searching and replacing in .docx files. If you have to search in old-style .doc files, though, you need to use Word Search and Replace, a freeware utility also by FUNDUC. Be aware that searching in multiple MS Word files using an external tool is easy enough, but if you want to replace words you have to tread carefully, in order not to damage your files: if you damage them, MS Word might no longer be able to open them.
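
If you would rather not install a dedicated search tool, the read-only half of the last tip (checking whether a term already appears in earlier files) can also be sketched in a few lines of Python. This is only a minimal illustration, not how Replace Studio Pro works: it relies on the fact that a .docx file is a zip archive whose main text is stored in word/document.xml. The function names are my own, the search ignores headers, footers and footnotes, and nothing is ever written back to the files, so they cannot be damaged.

```python
import zipfile
from pathlib import Path
from xml.etree import ElementTree

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path):
    """Yield the plain text of each non-empty paragraph in the body of a .docx file."""
    with zipfile.ZipFile(path) as archive:
        root = ElementTree.fromstring(archive.read("word/document.xml"))
    for paragraph in root.iter(W_NS + "p"):
        # A paragraph's text is split across several <w:t> runs; join them.
        text = "".join(node.text or "" for node in paragraph.iter(W_NS + "t"))
        if text:
            yield text


def find_term(folder, term):
    """Return (file name, paragraph) pairs whose text contains term, ignoring case."""
    needle = term.casefold()
    hits = []
    for path in sorted(Path(folder).glob("*.docx")):
        if path.name.startswith("~$"):  # skip the lock files Word creates for open documents
            continue
        for text in docx_paragraphs(path):
            if needle in text.casefold():
                hits.append((path.name, text))
    return hits
```

Calling find_term("translated", "Columbia Britannica"), where "translated" is a hypothetical folder holding the finished files, returns every paragraph that contains the term together with the name of the file it came from.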

So, if you find yourself stuck with old-style files that cannot be translated easily in a CAT tool (some CAT tools try to give their users the ability to work even with scanned pdf files), you still have at your disposal a wealth of useful options to help you: no need to be stuck with the primitive techniques we used a quarter of a century ago.

After working for three days on this project, all I can say is that I'm amazed, in hindsight, that in the bad old days we were able to translate more than a thousand words a day. CAT tools are real time savers, and they do wonders for the consistency of our translations. They don't translate better for us, but they help us be better and more productive.


  1. Not to mention when Internet or computers didn't even exist and translators had to use typewriters or paper dictionaries and material.

  2. In fact, a computer-assisted translation tool is a modern comprehensive solution for translators and companies that are engaged in professional translations. And, you are right stating that CAT tools are real time savers.

