A recent report shows how Google’s translation tool repeatedly mistranslates sensitive content, leading to serious misunderstandings of legal terms.
Errors found in Google Translate have contributed to major misunderstandings of legal terms with conflicting meanings, according to recent research.
The study showed Google’s translation software turning an English sentence about a court “enjoining” violence, or banning it, into one in the Indian language of Kannada that implies the “court ordered violence.”
Bias in Google Translate📢
The word "enjoin" in English means 2 opposite things, either:
– to prohibit something, or
– to command something
So, what groups prohibit violence vs. command violence, according to @GoogleTranslatr?
We test this by translating sentences EN->TR->EN: pic.twitter.com/0SqTggkFn0
— Abubakar Abid (@abidlabs) December 15, 2020
The word “enjoin” can refer to either promoting or restraining an action, depending on context. Mistranslations also occur with other contronyms (words whose meanings can be contradictory depending on context), such as “all over,” “eventual” and “garnish,” the report added.
Results of the research add to ever-growing criticism of automated translations generated by artificial intelligence software.
Previous studies also revealed how some programmes perpetuate historical gender biases, such as associating “doctor” with “he.”
The latest study, presented at the Africa-NLP Workshop at EACL 2021, also examines a common method companies use to broaden the vocabulary of their translation software: translating foreign text into English and then back into the foreign language, to teach the software to associate different ways of saying the same phrase, Reuters noted.
The method, known as back-translation, struggles with contronyms, said Vinay Prabhu, chief scientist at authentication startup UnifyID and one of the paper’s authors.
When the researchers translated a sentence about a court enjoining violence into the 109 languages supported by Google’s software, most results erred. When translated back into English, 88 of the back translations said the court called for violence, and only 10 correctly said the court prohibited it. The remainder produced other issues, Reuters explained.
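The round-trip check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors’ actual pipeline: `mock_translate` is a stub standing in for a real machine-translation system, and the word mapping it uses is an assumed example of how the contronym “enjoined” can collapse to a single sense when passing through a pivot language.

```python
# Sketch of a back-translation evaluation (EN -> pivot -> EN).
# `mock_translate` is a stand-in for a real MT system; its lossy word
# mapping is an assumption chosen to model the failure mode reported
# in the study, where "enjoined" comes back as "ordered".

def mock_translate(text: str, src: str, dst: str) -> str:
    # A real system would call a translation API here. The stub maps
    # the ambiguous "enjoined" to the single sense "ordered" on the
    # way out, so the "prohibit" reading cannot be recovered.
    lossy = {
        ("enjoined", "en", "xx"): "ordered",  # ambiguity lost going out
        ("ordered", "xx", "en"): "ordered",   # comes back unchanged
    }
    return " ".join(lossy.get((w, src, dst), w) for w in text.split())

def back_translate(sentence: str, pivot: str = "xx") -> str:
    """Translate English to the pivot language and back to English."""
    forward = mock_translate(sentence, "en", pivot)
    return mock_translate(forward, pivot, "en")

original = "the court enjoined the violence"
round_trip = back_translate(original)
print(round_trip)  # the prohibitive sense of "enjoined" has been lost
```

Comparing `original` and `round_trip` shows the core problem: the round trip looks fluent, but the sentence now asserts the opposite of what the court did, which is exactly the kind of error the researchers counted across Google’s 109 supported languages.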
In December, researcher Abubakar Abid tweeted that he found another example of bias in back-translation through the Turkish language. Using Google’s translation software, short phrases with the word “enjoin” translated to “people” and “Muslims” ordering violence but the “government” and “CIA” outlawing it.
Such mistranslations could have severe consequences, as many businesses depend on AI to generate or translate legal texts.
Another example provided in the paper is a news headline about nonlethal domestic violence, in which the AI translated “hit” as “killed”, a synonym that is valid in some contexts but a problematic association here.
In its support material, Google warns it may not have the best solution “for specialised translation in your own fields.”
The company says its translation tool is “still just a complement to specialised professional translation” and that it is “continually researching improvements, from better handling ambiguous language, to mitigating bias, to making large quality gains for under-resourced languages,” Reuters reported on Monday, citing Google.