The Colonial Mirror, Part 2: How Western Data Shapes Global AI
The most complete digitised archives, the most cited web crawls, and the most linked sites remain overwhelmingly English-language and Western European in origin. Even when new datasets broaden their linguistic range, the centre of mass stays Anglophone, because that is where the infrastructure, funding, and compute reside.