Date: 2023-08-02

For over two hundred years, scholars have debated where the Indo-European languages originated. Two main theories have dominated this debate so far. The steppe hypothesis locates the origin in the Pontic-Caspian steppe around 6,000 years ago, while the Anatolian or farming hypothesis holds that the family spread together with agriculture from Anatolia, beginning around 9,000 years ago. Phylogenetic analyses of the Indo-European languages have so far come to differing conclusions, especially about the age of the language family. These discrepancies can be attributed to inaccuracies and inconsistencies in the data sets used, as well as to limitations in how ancient languages were analyzed.

To solve these problems, researchers from the Department of Language and Cultural Evolution at the Max Planck Institute for Evolutionary Anthropology, together with an international team of more than 80 language specialists, have created a new data set of selected core vocabulary from 161 Indo-European languages, including 52 ancient or historical languages. This broader and more balanced sample of languages, coupled with strict protocols for encoding the lexical data, has resolved the problems found in the data sets of previous studies.

Proto-Indo-European estimated to be about 8,100 years old

The team used a novel Bayesian phylogenetic analysis to test whether ancient written languages, such as Classical Latin and Vedic Sanskrit, are the direct ancestors of the modern Romance and Indic languages, respectively. Russell Gray, director of the Department of Language and Cultural Evolution and senior author of the study, emphasizes: “Our chronology is robust across a variety of alternative phylogenetic models and sensitivity analyses.” The researchers estimate the age of Proto-Indo-European at around 8,100 years, with five main branches having already split off by about 7,000 years ago.

These results are not fully consistent with either the steppe or the farming hypothesis. The study’s first author, Paul Heggarty, states: “Recent DNA data indicate that the Anatolian branch of Indo-European is not from the Steppe, but from further south, in or near the northern arc of the Fertile Crescent, as the earliest source of the Indo-European family. The topology of our language family tree and its split dates point to other early branches that may also have spread directly from there and not across the steppe.”

New insights from linguistics and genetics

The authors of the study therefore propose a new hybrid hypothesis for the origin of the Indo-European languages, with an ultimate homeland south of the Caucasus and a subsequent northward branch onto the steppe, which served as a secondary homeland for some branches of Indo-European that entered Europe with the later Yamnaya and Corded Ware-associated expansions. “Ancient DNA and language phylogenetics thus suggest that the solution to the 200-year-old Indo-European puzzle lies in a hybrid of the farming and the steppe hypotheses,” says Gray.

Wolfgang Haak, group leader in the Department of Archaeogenetics at the Max Planck Institute for Evolutionary Anthropology, summarizes the significance of the new study: “Apart from a refined time estimate for the entire language tree, the tree topology and the branching order are crucial for agreement with key archaeological events and changing ancestry patterns as found in the genome data of people living at the time. This is a big step forward from the mutually exclusive earlier scenarios towards a more plausible model that integrates archaeological, anthropological and genetic knowledge.”

