During the Google I/O 2022 keynote today, CEO Sundar Pichai announced that the company is adding 24 languages to Google Translate.
Translate is already a fairly robust product, so all 24 languages added today are spoken by communities that are underrepresented in today’s tech landscape. Even so, the company says these languages are spoken every day by a combined total of about 300 million people.
To bring these languages to Google Translate, Google is using a relatively new technique called Zero-Shot Machine Translation.
The model is notable because it learned to translate these languages without ever being shown a translation example in any of them. Google says Zero-Shot Machine Translation saw only “monolingual text”, meaning text written in just one of these 24 languages with no paired translation, and that appears to be enough for it to handle translation.
Impressive! Still, Google cautions that while the technology is already delivering strong results, it is not yet perfect.
Finally, here’s the full list of all 24 new languages being added to Google Translate:
- Assamese, used by approximately 25 million people in Northeast India
- Aymara, used by approximately two million people in Bolivia, Chile and Peru
- Bambara, used by about 14 million people in Mali
- Bhojpuri, used by approximately 50 million people in Northern India, Nepal and Fiji
- Dhivehi, used by about 300,000 people in the Maldives
- Dogri, used by approximately three million people in Northern India
- Ewe, used by approximately seven million people in Ghana and Togo
- Guarani, used by approximately seven million people in Paraguay, Bolivia, Argentina and Brazil
- Ilocano, used by about 10 million people in the north of the Philippines
- Konkani, used by approximately two million people in Central India
- Krio, used by approximately four million people in Sierra Leone
- Kurdish (Sorani), used by about eight million people, mainly in Iraq
- Lingala, used by approximately 45 million people in the Democratic Republic of the Congo, the Republic of the Congo, the Central African Republic, Angola, and the Republic of South Sudan
- Luganda, used by approximately 20 million people in Uganda and Rwanda
- Maithili, used by approximately 34 million people in North India
- Meiteilon (Manipuri), used by approximately two million people in Northeast India
- Mizo, used by approximately 830,000 people in Northeast India
- Oromo, used by approximately 37 million people in Ethiopia and Kenya
- Quechua, used by approximately 10 million people in Peru, Bolivia, Ecuador and surrounding countries
- Sanskrit, used by about 20,000 people in India
- Sepedi, used by approximately 14 million people in South Africa
- Tigrinya, used by approximately eight million people in Eritrea and Ethiopia
- Tsonga, used by approximately seven million people in Eswatini, Mozambique, South Africa and Zimbabwe
- Twi, used by approximately 11 million people in Ghana