AI’s Dirty Secret: It Mostly Speaks English
By Véronique Özkaya, Forbes Councils Member, for Forbes Technology Council
COUNCIL POST | Membership (fee-based). Expertise from Forbes Councils members, operated under license. Opinions expressed are those of the author.
May 19, 2026, 10:00am EDT

Véronique Özkaya is CEO of DATAmundi.ai, delivering high-quality human data for leading global AI labs and enterprises.

At first glance, AI is viewed as a global technology. However, if you look at its linguistic foundations, AI remains far from global.

Of course, AI generates content and writes in dozens of languages, translates instantly and powers products used across continents. The trouble is that most AI systems still think in one language. You guessed it: English.

Despite the frequent claim that today’s models are “multilingual,” the reality is that modern AI has largely been built on English. As highlighted by the World Economic Forum, most AI systems are trained on only a small subset, roughly 100 languages, of the approximately 7,000 languages spoken worldwide.

Analyses of large public training datasets for large language models (LLMs) show a strong dominance of English. For example, studies such as Meta’s LLaMA 2 paper indicate that roughly 90% of training tokens are English, while broader web data suggests English still accounts for nearly half of online content. If AI models such as ChatGPT are primarily trained on internet data, this imbalance inevitably shapes and skews how they understand and represent the world.

How Did We Get Here?

Several structural forces have shaped AI’s English-centric trajectory. The early internet was largely built in the U.S., and much of its foundational infrastructure, from domain systems to major content platforms, was developed in English.

Today, many of the frontier AI labs remain U.S.-based, and widely used evaluation benchmarks such as the MMLU benchmark were originally developed in English. Data pipelines tend to follow the pa...





