“Leveraging AI techniques, we’ve added more than 110 new languages to Google Translate over the last year, welcoming half a billion speakers worldwide,” announced Pichai, reading from his notes. “This brings our total to 249 languages, including 60 African languages, with more to come.”
His monotone delivery failed to stir much reaction from the summit attendees, a crowd of international leaders, researchers, NGOs, and tech executives.

For those advocating for linguistic diversity in AI, Pichai’s statement was a quiet triumph, the result of two years of intense negotiations in the realm of digital diplomacy.
“This demonstrates that our message is resonating with tech companies,” said Joseph Nkalwo Ngoula, a UN digital policy advisor for the International Organisation of La Francophonie headquartered in New York.
A Linguistic Divide
Pichai’s remarks stood in stark contrast to the early days of generative AI, when the tools often struggled with non-English languages.
OpenAI’s ChatGPT launch in 2022 quickly revealed its limitations for non-English speakers. A query in English would return a detailed response, while the same question in French might result in a brief, apologetic statement.
The root of this divide lies in how these AI tools work: they rely on large language models trained on vast amounts of text scraped from the internet, most of it in English.
Only about 20% of the world’s population speaks English at home, yet nearly half of the training data for major AI models is in English. That imbalance is why the linguistic gap in AI-generated content persists today.
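To make that imbalance concrete, here is a minimal illustrative sketch in Python. The tiny corpus and its proportions are invented for illustration; the only figure taken from the article is the rough claim that nearly half of major models’ training data is English while about 20% of people speak it at home. The sketch simply tallies each language’s share of a set of language-tagged documents:

```python
from collections import Counter

# Hypothetical, invented corpus: each document is tagged with its language.
# Proportions are illustrative only; they mirror the article's rough claim
# that English makes up nearly half of major models' training data.
documents = [
    ("en", "The quick brown fox..."), ("en", "Stock markets rose today..."),
    ("en", "How to bake bread..."),   ("en", "Annual report 2024..."),
    ("en", "Weather forecast..."),    ("fr", "Le rapport annuel..."),
    ("fr", "La meteo de demain..."),  ("es", "El informe anual..."),
    ("sw", "Habari za leo..."),       ("hi", "Aaj ka mausam..."),
]

# Count documents per language and print each language's share.
counts = Counter(lang for lang, _ in documents)
total = sum(counts.values())
for lang, n in counts.most_common():
    print(f"{lang}: {n / total:.0%} of training documents")
# English dominates (50% here), even though only ~20% of the world's
# population speaks it at home -- the mismatch the article describes.
```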
Shifting Priorities
“There’s more and more up-to-date information available in English,” explained Mr. Nkalwo Ngoula. The default for AI development, training, and deployment is English, leaving other languages playing catch-up.
Furthermore, AI can ‘hallucinate’, generating incorrect information when its training data in a given language is too thin. This can manifest as a model inventing facts or entire careers for historical figures.
A Black Box Issue
“It’s similar to a black box absorbing data,” said Mr. Nkalwo Ngoula. “The responses may be coherent and structured, but can be factually wrong.”
Language models also tend to overlook language variation, such as regional dialects and multilingual expressions, which can confuse AI systems.
La Francophonie’s Shadow Campaign
La Francophonie, which represents 93 states and governments promoting the French language, has made this digital divide a focus of its strategy. The organization lobbied for linguistic diversity as a central principle of the UN Global Digital Compact, working in particular through the Francophone Ambassadors’ Group at the UN.
Despite some progress, challenges remain. Francophone content is often buried by algorithms that favour popularity, and AI training data is still dominated by English. Mr. Nkalwo Ngoula argues that linguistic diversity must remain central to La Francophonie’s advocacy efforts.
Given the rapid development of AI, these concerns need urgent attention to ensure that technology serves all of humanity equitably.
Source: https://news.un.org/feed/view/en/story/2025/03/1161406