About Translation

Monday, May 08, 2006

TagEditor Sundry Annoyances

I don't mind working in Trados' TagEditor - at least it is much better than translating PowerPoint files in the dreaded T-Window application - but TagEditor sure has more than its fair share of annoying quirks:

Since this is basically a no-frills text editor, why does it attempt to display fonts in a half-assed WYSIWYG way, especially since it does so in such a buggy manner (text that changes size on screen for no understandable reason, or displays in bold and/or italic when it is neither)? Admittedly, these display defects do not affect the translation, but why have them at all, since the preview function is just a click away (and works reasonably well)?
Why are the source string, translated string, etc. all displayed in the same color, instead of using the colors one sets in Workbench?
Trying to use the MS Word spell-checker still doesn't always work, and if you use the supplied spell checker, the Check Spelling window comes up, by default, with the focus on the "Not in dictionary" field instead of the "Change to" field, as would be logical.
Why such a puny internal search function? You can only specify a search string, a replacement string, whether to match whole words only or not, whether to match case or not, and (for the search function only) whether to search up or down: no regular expressions, not even the scaled-down version one can find in MS Word's wildcard searches... and this when such functionality is readily available in text editors that sell for just a few bucks (such as TextPad).

Thursday, May 04, 2006

Googling Within a Site

One thing I'm frequently asked to do is to make sure the translation I'm doing or editing is consistent with the fairly large corpus of literature in the customer's web site.

Some of the consistency checking I can take care of by checking previous translation memories, or glossaries (if they exist), and some using the search functions within the site itself.

However, using the search box provided within a site is often not enough. A technique that I find very useful in these cases is to google within the site.

Say, for example, that I need to see whether, within the Italian portion of my customer's site, they have more often used "implementazione dell'applicazione" or "deployment dell'applicazione" in the past.

If my customer's Italian web site is, for example http://www.xyz.com/IT, I just need to enter in the google search box

"deployment dell'applicazione" site:http://www.xyz.com/IT

and then repeat the operation for

"implementazione dell'applicazione" site:http://www.xyz.com/IT

Both searches will be limited to the customer's Italian web site, and the google search results could give me a good idea of the relative frequency of the terms used.
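
For anyone who wants to prepare several such queries at once, here is a minimal Python sketch, for illustration only. The helper names `site_query` and `search_url` are made up, and the `https://www.google.com/search?q=` address is assumed to be the usual Google results URL; the phrases and the site are the examples used above.

```python
from urllib.parse import quote_plus


def site_query(phrase: str, site: str) -> str:
    """Build an exact-phrase query restricted to one site (hypothetical helper)."""
    return f'"{phrase}" site:{site}'


def search_url(query: str) -> str:
    """URL-encode the query for a browser (assumed Google results URL)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    site = "http://www.xyz.com/IT"  # example site from the text
    for phrase in ("deployment dell'applicazione",
                   "implementazione dell'applicazione"):
        query = site_query(phrase, site)
        print(query)              # what to type in the search box
        print(search_url(query))  # the same query as a link
```

The quotation marks around the phrase matter: without them the search would match the individual words rather than the exact term.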

Wednesday, May 03, 2006

New Version of Translator's Tool Box

The Translator's Tool Box is an e-book aimed at professional translators. It contains a wealth of information about software useful to translators: from tips on how to use various features of the operating system and of Office applications, to a discussion of CAT tools, to information on more specialized "little" utilities such as Search & Replace or ClipMate, and much more.

We have just received the fourth edition, which adds information on how to translate complex file formats (such as XML).

Highly recommended.

Tuesday, May 02, 2006

Another Useful Wildcard Search

(c) Riccardo Schiaffino 2006

When working with Trados and MS Word, I often take advantage of the fairly powerful wildcard search options of Word - which are really a scaled-down and non-standard version of regular expressions. In a previous article from last year (How to use wildcard and format searches in MSWord to make sure all your numbers are formatted correctly), I showed how wildcard searches could be used to make sure that numbers in a translation are formatted properly according to the target language rules. In this post we are going to see another way wildcard searches may be of use to translators when working with Trados.

One of the tasks I regularly use MS Word wildcard searches for is to make sure that index entries in FrameMaker's .mif files that I'm translating as rtf (after conversion with S-Tagger) are formatted correctly: according to the style guide I have to follow, index entries in Italian should normally start with a lower-case letter (unless they are the name of some program).

Problem is, index entries are, by their very nature, standalone segments (which normally start with an upper case letter), and also segments that are very likely to be used elsewhere: "Program Installation" may be a section title, and, at the same time, an index entry in English. In Italian, however, I need to have "Installazione programma" for the title, and "installazione programma" for the corresponding index entry.

Working in Trados with a large memory, with segments that come from other translators and other projects, it is often easy to have the various index entries already translated from perfect matches and, likely, with a mismatch of upper and lower case.

I thought that the best solution would be some search string able to find only index entries that, in Italian, begin with an upper case letter. At that point I could manually make them lower case by pressing F3, or leave them as is when they actually needed to be upper case.

The first part of the search string was going to be easier, as all index entries begin with either the <il> or <ie> markup.

So I knew that my search string needed to begin with

\<i[el]\>

This means:

\< - Find all the strings that begin with the "open markup" sign (the open angle bracket "<"); the backslash character "\" indicates that the character that follows needs to be taken literally, and is necessary because the angle bracket characters otherwise have special meaning within wildcard searches.
i - Followed by an "i"
[el] - Followed by either an "e" or an "l" (the square brackets surrounding "el" group the alternate valid characters; <ie> and <il> are two markups that precede index entries in .mif files)
\> - Followed by the "close markup" sign (the close angle bracket ">").

Now we need to search beyond the entire English source segment, whatever it contains, until we reach the first letter of the Italian one. In order to do this, we can take advantage of the Trados source segment delimiters "{0>" and "<}0{>".

Therefore the search string needs to continue with

\{0\>[A-Za-z,;:\-\*\!\?\\\/"'=.£%&+\@#°_ 0-9]{1,255}\<\}[0-9]{1,3}\{\>

This looks quite complicated and unreadable (fine-tuning this part of the search string took quite a long time, and it probably is still not perfect). It means:

\{0\> - Trados markup to indicate the beginning of the source language string (the first backslash character indicates that the open bracket "{" needs to be taken literally, since on its own it has other uses within the wildcard search, as we shall see presently)
[A-Za-z,;:\-\*\!\?\\\/"'=.£%&+\@#°_ 0-9] - All the characters that could be contained within the source language string. Again, backslashes precede characters that otherwise would have special meaning within the wildcard search. The square brackets are used again to group all the possible characters.

Now, let's explain a little further these "all possible characters":
- A-Za-z - All alphabetical characters
- ,;: - Comma, semi-colon and colon
- \-\*\!\? - Various punctuation and symbol marks (-*!?), each preceded by the backslash to indicate that it has to be taken literally
- \\ - The backslash "\" symbol itself (when it is doubled thus, the first backslash indicates that the second one is to be taken literally)
- \/ - The forward slash "/"
- "'=.£%&+\@#°_ - Various other punctuation marks and symbols (double quote, single quote, equal sign, full stop, etc., up to the underscore sign "_")
- (space) - The space " "
- 0-9 - All numerical characters
Some of these "special characters" might not be necessary: it depends on whether they could actually be present within an index entry. However, if I have forgotten to include any character that actually occurs within an index entry, my search will not work properly, as it will stop at the first unrecognized character.
{1,255} - Here is one of the special uses of the brackets within wildcard searches: they indicate how many characters (any combination of the previously listed ones, from "A-Z" through "0-9") can be contained in the previous part of the search. "1,255" means "from a single character through the maximum allowed" (which unfortunately is only 255).
\<\}[0-9]{1,3}\{\> - Trados markup to indicate the end of the source language string and the beginning of the target language.
- \<\} - Beginning of the markup used by Trados between SL and TL
- [0-9] - Indicates that the markup may contain here any number
- {1,3} - Indicates that the number contained in the markup may have between one and three digits (in fact, its value is between 0 and 100)
- \{\> - End of the markup used by Trados between SL and TL

Finally, we need to indicate that we are looking only for those index entries in which the target language string begins with an upper-case letter:

[A-Z] - That is "All upper case alphabetical characters between 'A' and 'Z'"

Our complete search string will therefore be:

\<i[el]\>\{0\>[A-Za-z,;:\-\*\!\?\\\/"'=.£%&+\@#°_ 0-9]{1,255}\<\}[0-9]{1,3}\{\>[A-Z]

This needs to be typed exactly as is in Word's search dialog.
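
For readers more at home with standard regular expressions, the same search can be approximated outside Word. What follows is a rough Python sketch, not a replacement for the Word search: the sample segments, the closing `<0}` marker at the end of each sample, and the function name `starts_with_capital` are assumptions made for illustration. Note that Python's `re` syntax differs from Word's wildcards (for instance, angle brackets need no backslash).

```python
import re

# Characters allowed in the source segment, mirroring the Word character list.
SOURCE_CHARS = r"""[A-Za-z,;:\-*!?\\/"'=.£%&+@#°_ 0-9]"""

INDEX_ENTRY = re.compile(
    "".join([
        r"<i[el]>",                  # <ie> or <il> index markup
        r"\{0>",                     # start of the source segment
        SOURCE_CHARS + r"{1,255}",   # the source text itself
        r"<\}[0-9]{1,3}\{>",         # delimiter between source and target
        r"[A-Z]",                    # target begins with an upper-case letter
    ])
)


def starts_with_capital(segment: str) -> bool:
    """Return True if an index entry's target text begins with A-Z."""
    return INDEX_ENTRY.search(segment) is not None


# Made-up sample segments, for illustration only.
samples = [
    "<ie>{0>Program Installation<}100{>Installazione programma<0}",
    "<ie>{0>Program Installation<}100{>installazione programma<0}",
    "{0>Program Installation<}100{>Installazione programma<0}",
]
for s in samples:
    print(starts_with_capital(s), s)
```

Only the first sample should be flagged: the second already starts with a lower-case letter, and the third is not an index entry at all.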

I keep a text file with all the wildcard search strings I know I'm going to use in the future, and when I need them I copy from the text file to Word's search dialog, and I suggest doing the same if you start using wildcard searches.

Wildcard searches are probably not for everybody: they look cryptic, may be very complicated, and usually take a fair amount of time to get right. On the other hand, as we have seen, they may help solve problems that would be difficult to solve any other way.

If you are interested in more information about wildcard searches, my previous post contained some references. In addition to those, I suggest a recently published book on regular expressions that contains an entire chapter devoted to wildcard searches in MS Word: Andrew Watt's Beginning Regular Expressions, published by Wrox.

Monday, May 01, 2006

Forecast Growth of the Translation Market

Yesterday the New York Times published an article (Speaking in (Many) Tongues Can Be Profitable) on the growing importance of translation and interpreting, with the federal Bureau of Labor Statistics forecasting 20% growth in the number of translators and interpreters between 2004 and 2014.

Thursday, April 20, 2006

Forthcoming Article on Translation Quality

We worked quite hard on this article, which will be titled "Translation Quality Measurement: Using the Translation Quality Index to assess the quality of translations". The article deals with such questions as:

Why measure translation quality?

Why are translations so difficult to evaluate?

What methods are available to help assess translation quality?

The article is scheduled to appear in the June 2006 issue of Multilingual, which should also contain other articles about translation quality.

UPDATE

(4/26/06)
The article will actually appear in the July/August issue, not in the June issue.

Wednesday, April 19, 2006

Advice to Beginning Translators (3) - Contacting Prospects

Today I received an unsolicited message with an attached résumé from a younger colleague.

The message was written in Italian (though our company is based in the United States), was not addressed to anybody in particular (in fact it just started with "I'm an interpreter and translator...", without any salutation), and was signed with the first name only (no surname) of the sender.

I think that when we approach our prospects we should put our best foot forward, and this clearly was not the way to do it. My advice to this younger colleague was:

Draft your résumé to conform to the format used in the country where it is sent. For instance, since our company is in the United States, adding such personal data as your date of birth is really inappropriate.

The text of the cover letter (or e-mail message) also should take into account the preferred format(s) for the target country; therefore, in this instance, the message should have been written in English, should have begun with a salutation, and should have been more formal (hence, no first name only as a signature).

Do not send unsolicited résumés as file attachments because, unfortunately, the attachment might be automatically deleted by the security settings of the e-mail client or antivirus program. On the other hand, a text-only version of the résumé could be appended at the end of the message.

Do not set the message to send an automatic read confirmation: in the case of unsolicited messages many people prefer not to reveal to the sender whether the message actually arrived or not (this helps prevent spam).

My previous posts on this subject are:

Advice to Beginning Translators (1) - Résumés

and

Advice to Beginning Translators (2) - Sending Out Your Résumé

Friday, April 14, 2006

If You Do Something, Do it Right

From the Daily News Record online.

This paper started to add local news in Spanish. According to this letter, the reader's first reaction was "this is great", swiftly followed by "I noticed that there were many grammatical errors. My first language is Spanish and I found it difficult to read."

I find this very typical: companies realizing they need their material translated, but then not bothering to have it done properly - whether because of a desire to spend as little as possible (with predictable consequences), or because they really don't know how or where to get a professionally done translation.

I think that our professional associations (but also we, as professional translators) should be much more active in communicating what professional translation is and why it is so important to get it done right.

New Azeri-English Translation Software Released

(From Trend)

According to this press release, new Azeri-English translation software, called Dilmanc, has been released, and it already claims 6500 users.

From what I know, most commercial MT programs have normally been aimed first at much more widespread languages, to take advantage of the larger translation market for those language combinations.

I wonder, though, whether MT isn't actually more useful for language combinations (such as Azeri-English, perhaps?) where the number of professional translators available is limited: in such cases it might be argued that the choice would not be between machine translation and (better) human translation, but between machine translation and no translation at all.

Saturday, April 01, 2006

Many mediocre translators earning high salaries (?)

"High", of course, is a relative concept: this article about the booming translation market in Shanghai, from the Shanghai Daily, says that "While many unqualified translators are over-paid, some real good translation and interpretation professionals are under-paid".

The article goes on to say that the officially suggested guidelines for the pay of non-professional Chinese-English translators indicate 120 yuan ($14.80) for 1,000 Chinese characters, but that actual rates range from 30 through 250 yuan.

Several of the complaints voiced in the article will sound awfully familiar to any experienced translator:

"Many employers don't have any idea about what a qualified translator is"

"...some employers don't care about the translation quality at all"

"...unqualified translators [...] take the position and [this] leads to unwarranted pricing"

Although the title of the article says that unqualified translators earn good money, what the article suggests, at least to me, is that, in China as elsewhere, unqualified translators willing to work for peanuts depress the market for the rest of us.
