What You Need to Know About Machine Translation and Post-editing

Google Translate can be a great help when you travel abroad and need to have a quick conversation with the locals. ChatGPT comes in handy when you encounter an article written in a language you do not speak. Many other similar “machine translators” help to break down the language barrier so that you can shop, chat, and join social media platforms internationally.

How convenient! But is the quality of those machine-generated translations trustworthy?

Machine translation (MT) is constantly improving, and sophisticated artificial intelligence (AI) now powers most machine translation engines. However, it’s still far from perfect. Until that changes, humans and machines will need to work together to produce a quality final translation at an optimal price and turnaround time. This type of collaboration is becoming increasingly popular among language service providers. It’s called “machine translation post-editing,” or MTPE for short.

So, what is MTPE?

MTPE is a process by which a machine translation engine automatically translates a source text, and human translators then review and edit the output. This process gives you the best of both worlds: the speed of machine translation combined with human editors’ deep linguistic and cultural knowledge.

Let’s take a look at an example of the MTPE process below.

It’s true that “blackouts” can be translated as “mất điện” (power outage) or “ngất xỉu” (faint) in Vietnamese. Still, in this medical context, the latter is clearly more suitable. Google Translate failed to notice this, and it’s the human editor’s job to fix this mistranslation.

When Should You Use MTPE?

Even the most passionate AI advocates can’t deny that humans have a deeper, more nuanced understanding of language than any AI algorithm ever will. Wouldn’t it be safer to rely on a human linguist for the translation step to ensure a top-notch translation requiring minimal revisions?

In some cases, the answer is unequivocally “yes.” However, for many types of translation work, the MTPE process has significant advantages:

  • Speed: Machine translation engines work much faster than any human translator. What once took days or weeks of work can now be completed in minutes (not including the time spent post-editing).
  • Human Productivity: The average translation capacity of human linguists is about 2,000 words/day, while the average post-editing capacity ranges from 3,500 to 7,000 words/day. That is a massive boost in capacity: up to 3.5 times the output of translating from scratch (350% of the baseline).
  • Cost: Since MTPE is faster than translating from scratch, it’s also often more cost-effective, especially when content doesn’t require full post-editing (such as user-generated content).
  • Grammatical Quality: Machines are more accurate than humans in the mechanical aspects of the translation process. Machine translation engines rarely make mechanical errors such as typos or punctuation, spacing, and capitalization mistakes. Skilled human translators, in turn, are better at understanding language and handling complex or tricky situations related to linguistics or culture. By working with AI translation engines, human linguists can focus their energy on what they do best.
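
The productivity figures in the list above can be checked with a few lines of arithmetic. The sketch below uses the quoted rates (2,000 words/day for translation, 3,500 to 7,000 words/day for post-editing); the 70,000-word project size and the function names are assumptions made up for illustration, and real throughput depends on the content, language pair, and post-editing level.

```python
# Daily capacity figures quoted in the list above (words per day).
TRANSLATION_RATE = 2_000
POST_EDITING_RATES = (3_500, 7_000)

# Hypothetical project size, chosen only to make the arithmetic concrete.
PROJECT_WORDS = 70_000


def capacity_ratio(post_editing_rate: int, translation_rate: int) -> float:
    """Return post-editing output as a multiple of from-scratch output."""
    return post_editing_rate / translation_rate


def working_days(word_count: int, words_per_day: int) -> float:
    """Return the working days one linguist needs at a given daily rate."""
    return word_count / words_per_day


if __name__ == "__main__":
    low, high = (capacity_ratio(r, TRANSLATION_RATE) for r in POST_EDITING_RATES)
    print(f"Post-editing capacity: {low:.2f}x to {high:.2f}x the baseline")

    print(f"Translation from scratch: {working_days(PROJECT_WORDS, TRANSLATION_RATE):.0f} days")
    for rate in POST_EDITING_RATES:
        print(f"Post-editing at {rate} words/day: {working_days(PROJECT_WORDS, rate):.0f} days")
```

Note that 7,000 words/day is 3.5 times the 2,000 words/day baseline, which corresponds to a 250% increase over translating from scratch.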

What content types are good for MT and MTPE?

MTPE can be an excellent choice for many different content types across many different sectors, especially in the following situations:

  • High-volume content that requires rapid turnaround times – especially when only a light post-edit is required to hit quality standards for that type of content
  • Straightforward technical, instructional, or informative texts
  • Language pairs that MT handles well. For example, English to French, Spanish, or Chinese works well because these language pairs have a wealth of translated texts that machine translation systems can learn from, leading to more accurate translations.
  • Content that would otherwise go untranslated.

When Should You Not Use MTPE?

On the flip side, there are also some situations where it simply doesn’t make sense to use MTPE (or MT at all):

  • Creative texts, like advertising slogans or literary translations, require creative translations to hit the same notes in the target language as they do in the source language. A human translator is still your best choice to make sure the desired impact carries through from one language to another.
  • Language pairs where there is limited data from both the source and target language to train the engine on. If this data is lacking, the quality of the output will suffer.
  • Texts with complicated syntax and grammar. The MT engine will struggle to translate this type of content accurately.

Lastly, MTPE is used successfully in the legal, medical, financial, and governmental domains. Because mistakes in these domains can be grave, however, we recommend a full post-edit of 100% of the machine-translated content.

How to improve the quality of machine translations

The biggest concern with machine translations is the potential loss of accuracy and quality. To get the benefits of MT while maintaining high-quality translations, it’s important to follow a few guidelines:

Use the most suitable machine translation engine

Not all MT engines are created equal. For example, DeepL Translator does not support a wide range of languages, but it excels at those it does support. Baidu Translate performs better for Chinese than it does for other languages. Bing Translate is the most effective choice for literary translation in some cases. Each MT engine has pros and cons that depend on the language pair you are working with and the translation domain you are in. It’s best to test as many MT engines as possible on your own content before choosing one. Some companies even develop their own customized machine translation engines by training the engine with industry terminology and company data.
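
One rough way to compare engines is to score each engine’s output for the same sample sentences against a trusted human translation. The sketch below is only an illustration of that idea: it uses the character-level similarity ratio from Python’s standard difflib module as a stand-in for a proper MT quality metric or human review, and the engine names and sample strings are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(candidate: str, reference: str) -> float:
    """Return a 0.0-1.0 similarity ratio between an MT output and a reference."""
    return SequenceMatcher(None, candidate, reference).ratio()


def rank_engines(outputs: Dict[str, str], reference: str) -> List[Tuple[str, float]]:
    """Rank engines from most to least similar to the human reference."""
    scores = {name: similarity(text, reference) for name, text in outputs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical reference translation and hypothetical engine outputs.
    reference = "The patient reported frequent fainting episodes."
    outputs = {
        "engine_a": "The patient reported frequent fainting episodes.",
        "engine_b": "The patient reported frequent power outages.",
    }
    for name, score in rank_engines(outputs, reference):
        print(f"{name}: {score:.2f}")
```

A string-similarity score like this only shows how close an output is to one reference. It says nothing about fluency or acceptable alternative wordings, so it is best treated as a first filter before a linguist reviews the samples.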

Simplify the source text

MT is far from having the language proficiency of human linguists. It can struggle with sentences with complex syntax, figures of speech, and plays on words.

To demonstrate how poorly MT performs in these situations, let’s analyze how Bing Translate transfers the lyrics to Demi Lovato’s “Give Your Heart a Break” into Vietnamese.

Clearly, it’s too complicated for Bing to understand that the second “break” in the lyric does not mean “to shatter” (làm tan nát) but “to comfort” or “to heal” (xoa dịu). This example shows that the source text used in MT projects should be simple in structure, unambiguous in meaning, and error-free if you want an adequate translation.

Choose the appropriate level of post-editing for your content

Not all MT output requires the same level of post-editing. Depending on your quality needs, you can request light post-editing or full post-editing. Light MTPE involves quickly editing machine translations for basic readability and coherence without worrying too much about making them perfect. Full MTPE means post-editing machine translations more thoroughly, fixing all the errors, and making sure the text reads well and matches the style and cultural expectations of the target audience, almost as if a human translated it from the start.

At Hansem, our MT programs include post-editing of all or part of the raw machine translation output to your specifications. It’s important to define the level of quality you require, determine the level of post-editing (light or full), and pinpoint the amount of content you will revise.

Choose knowledgeable post-editors

Post-editors have specific skills that help them excel at MTPE. While all post-editors are translators, not all translators are post-editors. In addition to translation skills, post-editors need to be adept at spotting, classifying, and correcting errors and working with MT systems. 

For example, MT post-editors should be able to recognize patterns in machine translation engines. Each translation engine has its own error pattern. For instance, it may produce translations that are too literal, have terminology issues, or use an unlocalized date format. The example below shows that Google Translate does not change the US date format (MM/DD/YYYY) to the Vietnamese date format (DD/MM/YYYY) when the day is 12 or lower, so the date could be read either way (when the day is above 12, Google Translate does make the adjustment).

Experienced MT post-editors are familiar with the mistakes that each engine they use frequently makes, so detecting and correcting them is easier.
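
A recurring pattern like the unlocalized date format can also be caught with a small automated check. The following is a minimal sketch, assuming the source text is known to use the US format (MM/DD/YYYY) and the target should use DD/MM/YYYY. The function names and sample sentences are hypothetical, and the check handles only numeric dates with slashes.

```python
import re
from datetime import date
from typing import List

# Matches numeric dates such as 03/04/2024 or 3/4/2024.
DATE_PATTERN = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def expected_target_dates(source_text: str) -> List[str]:
    """Convert each MM/DD/YYYY date in the source to DD/MM/YYYY."""
    expected = []
    for match in DATE_PATTERN.finditer(source_text):
        month, day, year = (int(part) for part in match.groups())
        date(year, month, day)  # raises ValueError if this is not a valid US-format date
        expected.append(f"{day:02d}/{month:02d}/{year}")
    return expected


def missing_localized_dates(source_text: str, target_text: str) -> List[str]:
    """Return expected DD/MM/YYYY dates that do not appear in the target."""
    found = {
        f"{int(a):02d}/{int(b):02d}/{year}"
        for a, b, year in DATE_PATTERN.findall(target_text)
    }
    return [d for d in expected_target_dates(source_text) if d not in found]


if __name__ == "__main__":
    source = "The appointment is on 03/04/2024."
    raw_mt = "Cuộc hẹn vào ngày 03/04/2024."  # date copied over without localization
    print(missing_localized_dates(source, raw_mt))
```

For the sample above, the check reports that the target is missing 04/03/2024, which flags the segment for the post-editor. A date whose day and month are identical (such as 05/05/2024) passes either way, since both formats produce the same string.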

Apply MTPE quality control measures

Human expertise is an invaluable part of the MTPE quality control process. However, there is also technology to help keep MTPE output free of errors. Most computer-assisted translation (CAT) tools offer built-in QA features that can assist in checking MT accuracy.  For example, memoQ’s QA feature alerts the post-editor if the terms used by the translation engine do not match the terms requested by the client, while QA Checker on Trados displays warning messages when the machine mistakenly translates any brand names that are not supposed to be translated.
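
To make the idea behind these QA checks concrete, here is a minimal sketch of a terminology check and a do-not-translate check in plain Python. It does not use or reproduce the memoQ or Trados features mentioned above. The function names, the glossary entry, and the brand name "Acme Care" are assumptions made up for illustration, and the matching is simple substring search with no handling of inflected forms.

```python
from typing import Dict, Iterable, List


def check_terminology(source: str, target: str, glossary: Dict[str, str]) -> List[str]:
    """Warn when a glossary term appears in the source but its required
    translation is missing from the target (case-insensitive)."""
    warnings = []
    for source_term, required_term in glossary.items():
        if source_term.lower() in source.lower() and required_term.lower() not in target.lower():
            warnings.append(f"'{source_term}' should be translated as '{required_term}'")
    return warnings


def check_do_not_translate(source: str, target: str, protected: Iterable[str]) -> List[str]:
    """Warn when a protected name from the source is not kept verbatim in the target."""
    return [
        f"'{name}' must stay untranslated"
        for name in protected
        if name in source and name not in target
    ]


if __name__ == "__main__":
    # Hypothetical client glossary and protected brand name.
    glossary = {"blackouts": "ngất xỉu"}
    protected = ["Acme Care"]

    source = "Acme Care patients reported blackouts."
    raw_mt = "Bệnh nhân Acme Care cho biết bị mất điện."

    for warning in check_terminology(source, raw_mt, glossary):
        print("TERM:", warning)
    for warning in check_do_not_translate(source, raw_mt, protected):
        print("DNT:", warning)
```

In this sample the terminology check fires because the raw output uses “mất điện” instead of the glossary term “ngất xỉu,” while the brand name passes because it was left unchanged.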

What about AI for post-editing?

Yes, you can have AI both do the translation and edit it. However, while you may consider some combination of AI translation and automated AI editing tools for less critical content, if you need the results to be accurate, it’s essential to involve a human linguist. AI recognizes and replicates patterns but does not process or experience language like humans do.

Get fast, accurate translations with MTPE

Why settle for slow translations when you can have lightning-fast results with Hansem Global’s MTPE services? Our team of expert post-editors works hand-in-hand with state-of-the-art translation engines to bring you the best of both worlds – speed and accuracy. With Hansem, you won’t have to sacrifice quality for efficiency. We will help select the engine, decide quality processes, assign and manage the post-editors, and deliver translations that meet your quality expectations. 

Our MT engineers train, evaluate, and advise clients on how best to use these systems. For example, in our work with Naver, Hansem’s engineers introduced our own quality score calculation method, which incorporated Python and Excel macros to efficiently score the quality of both the original MT results (over 27 million words) and the translated/revised content.

Contact us today to learn how we can help you bridge language barriers with ease, faster than ever before.