Facebook has developed an artificial intelligence capable of accurately translating between any pair of 100 languages without relying on first translating to English, as many existing systems do.
The AI outperforms such systems by 10 points on a 100-point scale used by academics to automatically evaluate the quality of machine translations. Translations produced by the model were also assessed by humans, who scored it as around 90 per cent accurate.
Facebook’s system was trained on a data set of 7.5 billion sentence pairs gathered from the web across 100 languages, though not all the languages had an equal number of sentence pairs. “What I really was interested in was cutting out English as a middle man. Globally there are plenty of regions where they speak two languages that aren’t English,” says Angela Fan of Facebook AI, who led the work.
The model was trained by focusing on languages that are commonly translated to and from each other, grouping languages into 14 separate collections based on geography and cultural similarities. This was done to ensure high-quality translation between the more commonly used language pairs, and to train the model more accurately.
For some language pairs, the new system shows significant improvements over existing translation quality. For example, translating from Spanish to Portuguese is particularly strong because Spanish is the second-most spoken first language worldwide, meaning the researchers had access to a large amount of training data. Translation between English and Belarusian also improved over existing efforts because the AI learns from translating Russian, which shares similarities with Belarusian.
While the system isn’t yet in use on the social network site, Facebook plans to put it to work soon to handle the 20 billion translations made every day when people click “Translate” on posts written in more than 160 languages. Future work will be done on other languages, says Fan, “especially for languages where we don’t have a lot of data, like South-East Asian and African languages”.
The work “breaks away from the English-centric models and tries to build more diverse multilingual ones”, says Sheila Castilho of the ADAPT Centre at Dublin City University, Ireland. “That’s refreshing.” But, says Castilho, the human assessments only looked at a small fraction of examples, making it hard to know if this is an accurate judgement of how the AI performs.
She also worries that the evaluation was done by bilingual volunteers, rather than professional translators. “Non-professionals lack knowledge of translation and so might not notice subtle differences that make one translation better than another,” she says.
Her colleague at the ADAPT Centre, Andy Way, suggests Facebook isn’t making a fair comparison with state-of-the-art translation systems. “Their claim to have such a large improvement over ‘English-centric’ models is a bit empty, as most of the time, people don’t do this anymore,” he says. Facebook disagrees, saying translation through English is still commonplace.
Journal reference: Journal of Machine Learning Research, in press
Article amended on 20 October 2020
We corrected Sheila Castilho’s institution