Acquiring translation memory and retraining machine translation engines are the keys to making this sci-fi notion a reality, but it would take an army of machine translators an estimated 1,000,000,000,000,000 finely crafted translations to statistically match the accuracy of a real human translator.
Seem impossible? Take heart. It is doable, but to achieve this goal we must change our mindset and produce these translations more efficiently, in a cloud-based, collaborative and accessible way.
Today there is no consistency in the way translation memory is generated and stored. Common approaches allow analysts to align previously translated documents in order to generate translation memories that might benefit a Computer Aided Translation (CAT) tool in the future. Unfortunately, these tactics waste time and provide very little gain down the road, unless all content is domain specific.
Machine translation usage today is also a sticking point when generating content for end-users. Machine translation engines do not get the message across to customers nearly as well as a degreed linguist trained in the culture and nuances of the target language. This is partly why machine translation alone should be used to convey corporate messages only for certain types of content, not for all of them.
How to Fill in the Missing 35 Percent
Think of computer translation as a child learning to write. Despite the fact that its capacity doubles every 18 months and it has access to the largest collection of information known to man, the Internet, it is still in its adolescence. It will present its best possible guess at how to translate something, based on a statistical data set large enough to make the average person’s head spin, but it needs professional and novice translators to help it learn.
Content entering the cloud-based system should be broken up into small, manageable parts called segments. These segments make it easier for the computer to make its best guess and to receive help along the way. Each segment is first populated with the computer’s estimated translation, which is typically close but still needs work to convey the proper message. Remember, the computer is akin to a child, and children rarely get it right the first time either.
But children do remember what they have learned, and so does the cloud-based translation management tool. Every segment is analyzed to see if it has been translated before. Sources can include a client’s aligned documents or thousands of previously completed translations. The system looks for exact matches and swaps them in, replacing the computer’s best guess. In most cases, the document is now 65 percent accurate and conveys the information intended for a client’s audience.
Translators are then needed to fill in the missing 35 percent. There is no substitute for bilingual speakers who have both a good handle on the target language and the company’s messaging in mind when they translate.
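To make the workflow concrete, here is a minimal Python sketch of the steps described above: split content into segments, pre-fill each one with a machine guess, swap in exact translation memory matches, and hand the rest to human translators. It is an illustration only, not how any particular product works. The names Segment, split_into_segments, machine_translate, pretranslate and needs_human_review are hypothetical, the sentence splitter is deliberately naive, and the machine translation step is a placeholder rather than a real engine.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    source: str
    target: str
    origin: str  # "mt" = machine guess, "tm" = exact translation memory match


def split_into_segments(text: str) -> List[str]:
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def machine_translate(source: str) -> str:
    """Hypothetical stand-in for a real machine translation engine."""
    return "[MT guess] " + source


def pretranslate(text: str, translation_memory: Dict[str, str]) -> List[Segment]:
    # Step 1: populate every segment with the machine's best guess.
    segments = [
        Segment(source, machine_translate(source), "mt")
        for source in split_into_segments(text)
    ]
    # Step 2: where the exact source text was translated before,
    # swap the stored translation in for the machine guess.
    for segment in segments:
        match = translation_memory.get(segment.source)
        if match is not None:
            segment.target = match
            segment.origin = "tm"
    return segments


def needs_human_review(segments: List[Segment]) -> List[Segment]:
    """Segments that still carry only a machine guess go to translators."""
    return [segment for segment in segments if segment.origin == "mt"]


# Assumed sample data, for illustration only.
memory = {"Hello.": "Hola.", "Thank you.": "Gracias."}
document = "Hello. How are you? Thank you."

segments = pretranslate(document, memory)
for segment in needs_human_review(segments):
    print("Needs a translator:", segment.source)
```

In this toy run, two of the three segments are filled from translation memory and only "How are you?" is left for a translator. A real system would also store each finished human translation back in the memory so that it can be matched the next time the same segment appears.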
With cloud-based translation, all of this is possible, bringing the idea of a Universal Translator closer to reality than most people think.
Title image courtesy of Neung Stocker Photography (Shutterstock)
Editor's Note: To read more of Rob's thoughts on translation: