Genes and Languages, with Special Reference to Europe and the Caucasus
Bernard Comrie
Max Planck Institute for Evolutionary Anthropology, Leipzig


One of the topics that engaged Joseph Greenberg towards the end of his career was the relationship between the genealogical classification of the world's languages as reconstructed by historical-comparative linguistics and the biological classification of the world's populations as reconstructed by genetics. Greenberg and his collaborators pointed to instances of agreement between the two classifications, but also noted instances of disagreement. Such disagreement provides prima facie evidence for language shift, i.e. a historical event whereby a community abandons its ancestral language in favor of some other language, with the result that the community's linguistic affiliation no longer matches its biological affiliation. The issue of how important a factor language shift has been in human linguistic history is one that has engaged scientists since the dawn of evolutionary studies, with Charles Darwin and T.H. Huxley taking opposite stances in the mid-nineteenth century.

In this paper I examine two test cases, one from the Caucasus, the other from Europe in the narrower sense, to show how interdisciplinary cooperation between linguistics and genetics, in the second case also including archeology, can serve to throw light on the general issue of language shift.

Recent work by my geneticist colleagues Ivane Nasidze and Mark Stoneking has shown that Armenian- and Azerbaijani-speaking populations are genetically very close to one another, although of course Armenian is an Indo-European language while Azerbaijani is a Turkic language. Historical evidence enables us to fill in some of the details. It is known that Turkic languages entered the Caucasus within the last 1000 years. The discrepancy between linguistic and genetic classifications suggests that part of an indigenous population, ancestral to both Armenian- and Azerbaijani-speaking communities, shifted its language to a Turkic language, the ancestor of present-day Azerbaijani. There was probably no wholesale population replacement, at least none that leaves any trace in the female line, given that so far most analysis has concentrated on mitochondrial DNA.

Indo-European languages are known to have entered Europe relatively recently, but the precise date, within a range of some 5,000-10,000 years BP, remains controversial. Genetic analysis shows that a considerable proportion of Europe's genepool, perhaps as much as 80%, continues Paleolithic populations, and thus necessarily predates the Indo-European incursion, again providing prima facie evidence for language shift whereby the population shifted from other languages to Indo-European languages. Around 10,000 years BP we find both genetic evidence for new geneflow entering Europe from the south-east and, at around 9000 BP, clear evidence for the arrival of agriculture in Europe, suggesting that these two innovations might be correlated, with new populations bringing new techniques. Can these be identified with the Indo-Europeans? While most detailed investigation of the reconstructed Indo-European agricultural vocabulary leaves the question by and large open, the existence of a Proto-Indo-European word for 'to plow', found not only in European languages but also in Armenian and Tocharian, and possibly in Hittite, proves crucial. The plow is attested in Europe only from the fourth millennium BC, so that if the word was common to the Indo-European languages before their dispersal, then this dispersal could not have predated the fourth millennium BC. This suggests that the spread of Indo-European languages to Europe is later than both the new geneflow of around 10,000 BP and the introduction of agriculture. Since geneflow that might be correlated with the later arrival date of Indo-European languages in Europe is at best small, this suggests even more widespread language shift to Indo-European.

I conclude with some reflections on the mechanism of language shift, in particular what I take to be the crucial step, namely when a population bilingual in A and B but dominant in A shifts to being bilingual in A and B but dominant in B.