Regular Expressions

Regular expressions have a reputation for being difficult, but, once you start using them, you’ll find that they can really help you do things with your translation tools that would otherwise be impossible.
In this presentation we’ll focus on using regular expressions to search, filter and replace text, but CAT tools also use them for creating segmentation and autotranslation rules, protecting tags, and other operations.

Search (and Replace) on Steroids (online PowerPoint presentation):


In addition to this presentation, you might find the following posts interesting:

Articles on special search techniques

How to use wildcard and format searches in MSWord to make sure all your numbers are formatted correctly
A known drawback of translating using Trados is that segments which contain only numbers cannot be opened in the translation memory tool.
This can be a problem when the document to translate contains tables of numbers: for example, you might be translating English into Italian, and you want to make sure that all numbers are formatted correctly, with a comma to separate decimals and a dot to separate thousands.
Click here to read the rest of the article
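As a rough illustration of the kind of number check described above, here is a minimal Python sketch. It assumes the text still contains English-style numbers (comma for thousands, dot for decimals) and converts them to the Italian convention. The pattern and the function names are hypothetical, not taken from the article, and the pattern cannot tell numbers apart from version numbers or dates that happen to look similar.

```python
import re

# Hypothetical pattern for English-style numbers:
#   - comma-grouped thousands with optional decimals (1,234 or 1,234.56), or
#   - plain decimals with a dot (0.5, 1234.56).
ENGLISH_NUMBER = re.compile(
    r"\b\d{1,3}(?:,\d{3})+(?:\.\d+)?\b|\b\d+\.\d+\b"
)

# Translation table that swaps the two separators in one pass.
SWAP_SEPARATORS = str.maketrans(",.", ".,")


def to_italian_format(number: str) -> str:
    """Swap separators: 1,234.56 becomes 1.234,56."""
    return number.translate(SWAP_SEPARATORS)


def convert_numbers(text: str) -> str:
    """Rewrite every English-style number found in text."""
    return ENGLISH_NUMBER.sub(lambda m: to_italian_format(m.group(0)), text)


if __name__ == "__main__":
    print(convert_numbers("Total: 1,234.56 and 0.5 units"))
```

In practice you would probably review each match before replacing it, since a string such as 1.234 could be an English decimal or an already-correct Italian thousands figure.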
Another useful wildcard search
One of the tasks I regularly use MS Word wildcard searches for is to make sure that index entries are formatted correctly in FrameMaker .mif files that I'm translating as RTF (after conversion with S-Tagger): according to the style guide I have to follow, index entries in Italian should normally start with a lower-case letter (unless they are the name of some program).
Click here to read the rest of the article.
Googling within a site
One thing I'm frequently asked to do is to make sure the translation I'm doing or editing is consistent with the fairly large corpus of literature in the customer's web site. [...]
Some of the consistency checking I can take care of by checking previous translation memories, or glossaries [...]
A technique that I find very useful in these cases is to google within the site.
Click here to read the rest of the article.
grepWin: a great help for complex search and replace operations
For complex search and replace operations, nothing really beats RegEx (regular expressions) searches, but regular expressions may be very difficult to create.
Click here to read the rest of the article.

8 comments:

  1. This is probably the clearest enunciation of regex syntax I have come across. Thank you for making a valuable but complex subject easy to grasp for punters like me.
    With kind regards,
    Adam Warren.

  2. Hi Adam,
    Thank you very much for your comment -- I really appreciate it!
    Riccardo

  3. Hello Riccardo,

    Tom Fennell here. I came up to you after your presentation at ATA.

    I am afraid I could not find my note on the site you recommended to get more information on Regex and VBA - could you advise?

    Thanks

  4. FYI this is what I found - was this what you were referring to?
    http://www.regular-expressions.info./vb.html

    Replies
    1. Hi Tom:
      Regular-Expressions.info is an excellent reference for all things related to regular expressions. As for using regular expressions with VBA (and VBA with MS Word "wildcards", i.e. MS Word's stripped-down version of regular expression search), when we talked at the ATA Conference I was thinking more of the MS Word MVP website (https://wordmvp.com/), where you can find information about VBA, wildcards, and a lot more.

  5. Hi Riccardo, thanks for explaining regex! This will save me a lot of time going forward.

  6. Hi Riccardo, thanks so much for sharing this presentation.
    Best regards,

