The Indo-European Family Tree
The Indo-European language family is one of the largest (in terms of number of languages, population of speakers, and area covered) and most diverse in the world. Today, I’m going to attempt to give an overview of the Indo-European family tree.
First, a quick introduction to Indo-European languages. All Indo-European languages are descended from a single language, called Proto-Indo-European. It was most likely spoken in the region of the Black and Caspian Seas during the late Neolithic Era (the exact homeland is still debated), and it gradually spread from there to cover most of Europe and large parts of Central and South Asia, diversifying into different languages as it did so. There are no written records of Proto-Indo-European; everything linguists know about it comes from reconstructing it by comparing the living and dead Indo-European languages we have records of.
Without further ado, the following are all of the branches of the Indo-European family tree, roughly ordered by the date of the first written evidence of them. For some of these branches, that date is very late, because the languages were simply not being written down in anything that was preserved. This often happens when a language is spoken by the majority of a population but not by its ruling or religious classes: most of the people who speak the language can’t read or write, and whatever they do write down is rarely preserved because they are not part of the ruling class.
The Indo-Iranian family includes two main branches, the Indo-Aryan languages, which are mostly spoken in India and Pakistan (as well as Bangladesh and Sri Lanka), and the Iranian languages, which are spoken in and around Iran. Its members include Persian, Punjabi, Bengali, Pashto, and Kurdish. (Though Kurdish is often viewed as a group of closely related varieties, rather than a single language.) The most widely spoken Indo-Iranian language is Hindustani, which linguists generally treat as a single language despite its two standard forms having different writing systems. In India, it’s called Hindi and is written in Devanagari, a Brahmic script, while in Pakistan, it’s called Urdu and is written in a script adapted from the Persian alphabet. While some speakers of Hindi and Urdu will argue that they are separate languages, most linguists say that the major differences between them are sociolinguistic and that they are not linguistically distinct enough to count as separate languages.
The Anatolian languages have all been extinct for roughly the last two thousand years, and some of them for much longer. The best known of them is probably Hittite (remember those guys from history class and/or the Bible?). It is notable largely for being the earliest attested Indo-European language.
In modern times, the Hellenic language family is basically just Greek. In ancient times, Hellenic languages were spoken throughout the Balkans and Anatolia (modern Turkey), but those languages are largely extinct. The only other strong contender for a spot in the Hellenic language family is Tsakonian, which is spoken in the Peloponnese in Greece. It is not intelligible to speakers of modern Greek, which weakens the traditional argument that it’s a dialect of Greek. Unfortunately, Tsakonian is also rapidly dying, so Greek will soon go back to being the only living Hellenic language unless a significant restoration effort starts.
The living members of the Italic family are the Romance languages, which are the linguistic descendants of Latin (Spanish, Portuguese, Italian, French, and Romanian, as well as several smaller ones like Occitan and Romansh). However, the Italic family also includes Latin itself and the now-extinct languages spoken on the Italian peninsula before the rise of the Roman Empire and the subsequent Latinization of the entire peninsula. There is debate about how closely related the Italic languages were before Latin took over, and the family’s current definition is therefore somewhat awkward: it amounts to the “languages of the Italic peoples,” which may or may not have been linguistically related. (There is not a ton of written evidence of most of these languages. Thanks, Roman Empire.)
In ancient times, Celtic languages were widely spoken across Europe. In modern times, however, they are confined to the edges of Western Europe, having been driven out by Italic and Germanic empires. Currently, the Celtic family consists of Scottish Gaelic and Irish Gaelic (which are fairly mutually intelligible but are usually considered separate languages), Welsh, Breton, Cornish, and Manx. Cornish and Manx technically went extinct in 1777 and 1974 respectively, but recent revitalization efforts have led to a small but growing community of speakers for both languages, and a few children in Cornwall today are being raised as native speakers of Cornish alongside English. All Celtic languages other than Welsh are considered endangered, and most people who speak a Celtic language speak it as a second language or as a co-first language with English or French.
The language family that English belongs to, Germanic, consists of two living subfamilies, West Germanic and North Germanic. (The East Germanic languages have gone extinct.) West Germanic includes English, German, Dutch, Yiddish, and Afrikaans, while North Germanic includes Icelandic, Norwegian, Swedish, and Danish, but not, as is sometimes assumed, Finnish, which is not an Indo-European language at all.
I’ve already done a post about Armenian, which you can find here. Suffice it to say that Armenian constitutes its own branch of the Indo-European family, and is a fairly old language with a complicated history.
The Tocharian languages, like the Anatolian languages, are all extinct. We know of their existence only from a few fragmentary manuscripts found in what is now northwest China. They are distinct from all other Indo-European languages and very little is known about them.
The Balto-Slavic family is composed of two major subgroups: the Baltic languages, which include Latvian and Lithuanian as well as a number of extinct languages, and the Slavic languages. The Slavic languages are a much larger and more diverse group, including Russian, Ukrainian, Polish, and Bulgarian, to name a few. The Baltic and Slavic languages are grouped together because most linguists believe they came from a single ancestor, known as Proto-Balto-Slavic, and they all share several grammatical features and words. A small group of linguists think these similarities instead come from the prolonged period of contact between the two groups, rather than from a common ancestor, and that they should therefore not be grouped together.
The most recently recorded of the Indo-European branches, Albanian, like Armenian and Greek, is the sole language in its branch of the family. However, there is a significant dialect of Albanian, called Arbëresh, which is spoken by the Albanian communities in Italy and is fairly different from modern standard Albanian. While it probably does not count as its own language, it is interesting in that, much like Quebecois French, it preserves several now-archaic features of the language.
This has been a quick guide to all ten of the substantiated branches of the Indo-European language family. There are, of course, many now-extinct languages and language groups that may or may not have also been part of the Indo-European family, but I thought this was enough to be getting on with.