In traditional SEO, hreflang tags have long been the holy grail of internationalization. They told Google: “This page is for French speakers in Canada.” But in a world where AI models are inherently polyglot, does this tag still matter?
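As a refresher, hreflang is declared with alternate link tags in the page head. A minimal sketch for a page with English and Canadian-French variants (the URLs here are hypothetical):

```html
<!-- Hypothetical URLs; each variant should list all alternates, including itself -->
<link rel="alternate" hreflang="en" href="https://example.com/en/page" />
<link rel="alternate" hreflang="fr-CA" href="https://example.com/fr-ca/page" />
<link rel="alternate" hreflang="x-default" href="https://example.com/page" />
```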
The Polyglot LLM
Models like GPT-4 and Gemini are trained on multilingual datasets. They can seamlessly translate between English, Japanese, and Swahili. If a user asks a question in Spanish, the model can retrieve an English source, translate the facts, and generate a Spanish answer.
Cross-lingual retrieval is the frontier of international SEO. With vector embeddings, the language barrier is dissolving: a query in Spanish can match a document in English if their semantic vectors are close. This shift challenges a core assumption of global site architecture, that content must exist in a user's language to be discoverable in that language.
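The mechanics can be sketched with toy vectors. In the minimal Python illustration below, the Spanish query's embedding sits near the English document's, so similarity ranking crosses the language boundary. The vectors are invented stand-ins for real multilingual model output, not actual embeddings.

```python
from math import sqrt

# Toy 3-dimensional "embeddings" standing in for a real multilingual
# embedding model; these vectors are invented for illustration.
EMBEDDINGS = {
    "¿Cuál es la capital de Francia?": [0.90, 0.10, 0.20],   # Spanish query
    "Paris is the capital of France.": [0.88, 0.12, 0.18],   # English document
    "Best hiking trails in Colorado.": [0.10, 0.90, 0.30],   # unrelated document
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))
    return dot / norm

def retrieve(query, docs):
    """Rank documents by cosine similarity to the query, best first."""
    q = EMBEDDINGS[query]
    return sorted(docs, key=lambda d: cosine(q, EMBEDDINGS[d]), reverse=True)

docs = ["Best hiking trails in Colorado.", "Paris is the capital of France."]
ranked = retrieve("¿Cuál es la capital de Francia?", docs)
print(ranked[0])  # the English document about Paris ranks first
```

The language of the query never enters the ranking; only geometric proximity in the shared vector space does.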
How Vector Spaces Bridge Languages
In a high-dimensional vector space (like that created by text-embedding-ada-002 or cohere-multilingual), the concept of “Dog” (English), “Perro” (Spanish), and “Inu” (Japanese) cluster in the same geometric region. They are semantically identical, even if lexically distinct.
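That clustering can be made concrete with another toy sketch: invented low-dimensional vectors in which translations of the same concept are placed close together and an unrelated concept far away, mimicking what a real multilingual embedding model produces.

```python
from math import sqrt

# Invented vectors mimicking multilingual embeddings: "dog", "perro",
# and "inu" cluster tightly; "car" lives in a different region.
vectors = {
    "dog":   [0.80, 0.55, 0.10],
    "perro": [0.79, 0.57, 0.12],
    "inu":   [0.81, 0.54, 0.09],
    "car":   [0.10, 0.20, 0.95],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

same = cosine(vectors["dog"], vectors["perro"])
diff = cosine(vectors["dog"], vectors["car"])
print(same)  # near 1.0: same concept, different language
print(diff)  # far lower: different concept
```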