The Translator's Secret Weapon:
Regular Expression


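Regular expressions (regex) are a powerful tool for quickly searching for and extracting specific patterns or information from text. They are also useful for tidying up text or converting it to a particular format. Hansem Global Vietnam puts this technique to good use to improve its translation work and provides training to freelance translators. In the article below, Hansem Global Vietnam shares examples of how it has used regular expressions to automate text-processing tasks, increase accuracy, and improve efficiency. Read on for the details.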

Scripted by
Anh Le

UNLEASHING THE POWER OF REGULAR EXPRESSIONS: ELEVATE YOUR TRANSLATION GAME

Hello there, language enthusiasts, budding linguists, and those curious about the art of translation! Today, we’re delving into a subject that might seem like magic at first – regular expressions. Don’t worry, we’re here to make this topic easier to understand and demonstrate its potential to revolutionize the translation industry. So, let’s embark on this learning journey together, and by the end, you’ll have a new linguistic superpower up your sleeve!

Decoding Regular Expressions: What’s the Deal?

Imagine this scenario: You’re faced with a gigantic text document, and somewhere within its depths lies that one term you need to revise. The catch? This term might be hiding under different variations or formats. That’s where Regular Expressions, or Regex, come to the rescue.

In essence, regex is like a finely tuned search command. It’s a way to instruct your computer to find words or patterns that fit a certain rule. For example, the rule could be: a run of digits, possibly with thousands separators and a decimal point, followed by a space and then a unit of measurement. Matches could include any of these:

3.5 meters 2.3 GHz 8.5 km/L
256 GB 750 kg 150 Mbps
42,195 km 15.6 mm 215 km/h

Whether it’s a specific sequence of letters, a range of numbers, or even a combination of both, regex is your trusty compass in the vast sea of text.
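
The measurement rule described above can be sketched in a few lines of Python. This is only an illustration: the unit list and the names `UNITS`, `MEASUREMENT` and `find_measurements` are assumptions made for this example, not part of any CAT tool.

```python
import re

# Hypothetical unit list for illustration; extend it for your own content.
# Longer units come first so that "km/h" is tried before "km".
UNITS = r"(?:meters|km/L|km/h|Mbps|GHz|GB|kg|km|mm)"

# Digits with optional thousands separators and an optional decimal part,
# then one space, then one of the units above. The final lookahead stops
# the match from ending in the middle of a longer unit.
MEASUREMENT = re.compile(
    r"\b\d+(?:,\d{3})*(?:\.\d+)? " + UNITS + r"(?![\w/])"
)

def find_measurements(text):
    """Return every number-plus-unit match found in the text."""
    return MEASUREMENT.findall(text)

sample = ("The cable is 3.5 meters long, the drive holds 256 GB, "
          "and the race covers 42,195 km at up to 215 km/h.")
print(find_measurements(sample))
```

The same pattern, minus the Python wrapper, can be pasted into any search box that accepts regular expressions.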

Why You Should Care About Regular Expressions

We all know that time is of the essence, especially for freelancers juggling multiple tasks. This is where regex steps in as your efficient sidekick. It’s not just about locating words, but about doing it intelligently and effectively. Imagine being able to extract all the email addresses from a document with just a few clicks: that’s the kind of power regex brings to the table.
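
As a rough sketch of the email example, the Python snippet below pulls addresses out of a block of text. The pattern is deliberately simple and is an assumption made for illustration; it does not cover every address the email standards allow, and the function name `extract_emails` is hypothetical.

```python
import re

# Simplified pattern: a local part, "@", then a domain with at least one dot.
# Good enough for spotting typical addresses, not a full validator.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def extract_emails(text):
    """Return all email-like strings in the order they appear."""
    return EMAIL.findall(text)

note = "Contact pm@example.com or send files to vendor.team@example.org. Thanks!"
print(extract_emails(note))
```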

Putting Regex to Work in CAT Tools

Now, let’s talk about CAT Tools – those handy companions that make translation projects smoother. But what happens when the text’s a tad messy? This is where regex comes in to save the day. CAT Tools like Trados and MemoQ are compatible with regex, allowing you to streamline your work.

Struggling with unlocalized date formats? Tired of dealing with haphazard punctuation? Frustrated by sneaky leading and trailing spaces? Let’s use regex to tidy up the chaos, especially if you are a reviewer or an LQA (linguistic quality assurance) specialist trying to clean up a translation. It’s akin to teaching your CAT Tool a new language – the language of efficiency and precision. Now, let’s begin!

  1. Punctuation Predicaments
    Our first scavenger hunt: text enclosed in either standard (straight) or curly double quotation marks: ("|“)([^"”]*)("|”)
    ("|“) matches the opening double quotation mark, whether it is a classic straight one or a smart/curly one.
    ([^"”]*) grabs everything inside the quotes.
    ("|”) works like the first group, but for the closing quotation mark.
    And voilà, this is what you get when you put this spell in Filter.

    Now, if your client tells you to use smart quotes “…” instead, would you spend ages fixing each and every sentence? Nope, we’re all about working smart, not just hard. Replace everything you find with “$2”, and make sure you tick that “Use:” box for Regular Expressions.
    Original:
    Do:
    Result:
  2. Tricky Spaces
    Get ready for our next challenge: a translated document filled with sneaky leading and trailing spaces. Now, you could certainly rely on your keen eyes to spot and manually delete those extra spaces. But this repetitive task might not be the best recipe for a long and pain-free career. Here’s another trick: ^\s+|\s+$
    ^\s+ reveals all the leading whitespace (spaces, tabs and the like).
    \s+$ uncovers all the trailing whitespace.
    Original:
    Do:
    Result:
  3. Conquering Date Format Challenges
    The last one may make you a bit dizzy, but I am here to help you decode it. This time, we need to identify dates in “mm/dd/yy” format and convert them into “dd/mm/yy” automatically. Do this:
    Find what: \b(?<m>(0?[1-9]|1[012]))\/(?<d>(0?[1-9]|[12]\d|3[01]))\/(?<y>(\d{4}|\d{2}))\b
    Replace with: ${d}/${m}/${y}
    \b stands for a word boundary. The one at the start and the one at the end keep the pattern from matching in the middle of a longer run of digits or letters.
    (?<m>(0?[1-9]|1[012])) looks for a month, which is a number between 1 and 12, with or without a leading zero.
    \/ identifies the slash ‘/’ as our separator. Replace “/” with “-” or “.” as you wish (write the dot as \. so that it is matched literally).
    (?<d>(0?[1-9]|[12]\d|3[01])) finds a day, which is a number from 1 to 31, with or without a leading zero.
    (?<y>(\d{4}|\d{2})) searches for a year, which can be either a four-digit one like “2023” or a two-digit one like “23”.
    Original:
    Do:
    Result:
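
If you would like to try the three clean-ups outside a CAT tool, here is a minimal Python sketch of them. A few things are assumptions for this example: the function names are hypothetical, the quote pattern treats both straight and curly marks as delimiters, and the day group accepts only 1 to 31. Python also spells things slightly differently from the Find/Replace boxes: named groups are written (?P<name>...), and replacements use \2 and \g<name> instead of $2 and ${name}.

```python
import re

# 1. Straight or curly double quotes around a span -> curly quotes.
QUOTED = re.compile(r'("|“)([^"”]*)("|”)')

def to_smart_quotes(text):
    # \2 is the text captured between the quotation marks.
    return QUOTED.sub(r"“\2”", text)

# 2. Whitespace at the very start or very end of a segment.
EDGE_SPACES = re.compile(r"^\s+|\s+$")

def trim(segment):
    return EDGE_SPACES.sub("", segment)

# 3. mm/dd/yy or mm/dd/yyyy -> dd/mm/yy or dd/mm/yyyy.
DATE = re.compile(
    r"\b(?P<m>0?[1-9]|1[012])/(?P<d>0?[1-9]|[12]\d|3[01])/(?P<y>\d{4}|\d{2})\b"
)

def swap_month_day(text):
    return DATE.sub(r"\g<d>/\g<m>/\g<y>", text)

print(to_smart_quotes('Click "Save" and “Close”.'))
print(repr(trim("  Hello world \t")))
print(swap_month_day("Due 12/31/2023, sent 1/5/23."))
```

One caution on the date swap: a date such as 03/04/23 is valid in both formats, so the pattern cannot tell whether it has already been converted. Run the replacement once, on text you know is still in mm/dd/yy order.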

From Novice to Regex Expert: It’s Possible

Now, you might be wondering if mastering regex is within your grasp. Absolutely! Think of it like a new skill you’re learning – much like playing a musical instrument. Initially, you’re learning the basics, but with practice, you’re soon playing complex tunes. Becoming proficient in regex is like unlocking a new level of expertise. Start with a simple tutorial and a helper tool, such as an online regex tester, and you will be an expert in no time.

And here’s the exciting part: As a freelancer with Hansem Vietnam, you’re not only gaining access to paid projects but also to valuable training sessions. It’s like a golden ticket to upskilling in your translation journey. To all the translators with a passion for languages, and those who see patterns where others see chaos, it’s time to master Regex for your convenience and turn it into a superpower.

Are you ready to unlock the door to a world of linguistic magic? Grab your Regex wand and embark on the adventure with us, where learning, growth, and exciting opportunities await! 📚🌐
