As a worldbuilder, one big thing I occasionally take part in is making protolanguages and conlangs. Worldbuilding a language can be quite difficult. What has helped me tremendously is a select few channels, communities, and resources.
Biblaridion
YouTuber number one being showcased today. His whole channel is all about conlanging, worldbuilding, and fun biology stuff. He has amazing tutorials on working with protolanguages and conlangs.
A repository of tutorials and discussions on conlanging, speculative biology, and other assorted worldbuilding topics, with the aim of showi
Etymology Nerd
I don't think this guy is human, but we're all going to pretend he is for now. If you want to know anything about languages, conlangs, linguistics, dialects, etymology, anything words and language, look to this guy. Not only is he incredibly entertaining, but I have learned so much from him in the short time since I discovered his content.
i have a harvard linguistics degree.
follow for more linguistics
Learning About Language First
A good place to start is learning about the nuances and funnies of language. One of my favorite videos in this category is this one. Go explore, look into languages you like or want to learn, etc.
r/conlangs
Some of My Favorites:
Tsevhu (Koilang) by Koallary
Language that is drawn through the lens of an inked out koi pond. Yes, you are looking at a poem. Yes. A whole poem right there.
A'e'i'a (Dolphin Language) by Etymology Nerd
I can't make this shit up. Don't worry, he has more animal languages, and has even started creating a creole between them.
A Proto-Nastowic text
(Proto-Nastowic see more here)
To see its evolution through more stages of the language, see parts 2, 3, 4, and 5.
*Igmǝl gigod, gʷiʔqi gi-bun. Kʷihebsi səxɯdiŋ, pimuqebsi dagɯl, niximasi pelkebsi kʷǝd, ʔalə ʔalə adlɯkɯg. Lɯs biqa kʷǝd altǝd, pihin ɯqi. Imbagʷ ǝʔqi, tagma tiqmu, igmǝl numpuna, dimin gʷandǝda. Palin qatɯg ǝʔqi qʷǝlaʔ mul oŋkʷǝn mibakɯdǝd. Pihi ʔakixaʔi sippu god saŋɯʔi ǝʔqi, qʷitiqmu teʔimna lu. Sobugʷ ǝʔqi, pihi alɯsǝglǝdkɯmuna, dimin mǝgtab niqib. An, χima pid numpuna, iqodǝn ǝgpod muqlɯ. Nana dlaʔ bǝtta tekigǝna, pikasin qaniŋina, qʷiqɯd qaspa tedǝkʷupuqɯda. Tata, ɯχa, alǝnkʷǝ miqʷinigǝna mitubsǝ. Gʷiqa miχʷǝlu silikanǝ. Nixɯdǝn hɯgmin mibǝhatǝte, temǝhatǝkɯgil. Qʷiqalɯ alǝkɯ teqʷǝgted: Qʷotadlǝ lu. Na, tesadlǝ lu! Qʷelǝnkʷǝ qʷoyexasǝte alǝkɯ tepa. Qʷodǝgtedkɯgilte tihɯgmin qʷodǝgipo dagtǝn ɯqisi. Tǝdǝn tagma, an mekɯn kɯʔan gad lu. Sobugʷǝn qʷǝlaʔǝn qʷǝlaʔ igɯlkɯ. Tigalǝ, mexǝna migini. Agʷudabi miqaspo igʷpi ǝha ixaʔilka ǝʔqi. Imbagʷ goʔimnokɯgǝb ina!
The sky is above us, the earth below. To the north is the Great Lake; to the east are mountains; to the south and west are more plains. There are the horses and antelopes, and the sun’s home. In summer, even at night the sky is white and the earth sweats. We gather fat black berries in our skin bags. When the sun, burning red, floats over the grass, I know that it watches me too. Through winter, the sun never climbs high, but slumbers near the horizon. Only the moon and snow are white; everything else is black. In blankets of furs, with numb fingers, mother starts a fire with flint and dry grass. Carrying spears, father, brother and I go outside. The wind cuts into our bones. We greet the spirit of the woods, and it answers. A deer speaks to my heart, “Here I am.” And there it was. I throw my spear and it falls. I thank the deer and send its soul back to the world of beasts. Tonight the moon is like the edge of a fingernail. Soon it will be the blackest black of winter. Then we will eat meat and sing. The remains we dry in the smoky hut. May we see the summer again!
If language began with gestures around a campfire and secret signals on hunts, why did speech come to dominate communication?
An article by Kensy Cooperrider in Aeon about gestural theories for the origin of language. Excerpt:
Proposals about the origins of language abound. And it’s no wonder: language is a marvel, our most distinctive capacity. A few slight movements of tongue, teeth and lips, and I can give you a new idea, whisk you somewhere else or give you goosebumps. Any thought a human can think, it would seem, can be shared on a puff of air. Explaining how this all started has been called the ‘hardest problem in science’ and it’s one that few can resist. Linguists, neuroscientists, philosophers and primatologists – not to mention novelists and historians – have all taken cracks at it.
Over this long and colourful history, one idea has proven particularly resilient: the notion that language began as gesture. What we now do with tongue, teeth and lips, the proposal goes, we originally did with arms, hands and fingers. For hundreds of thousands of years, maybe longer, our prehistoric forebears commanded a gestural ‘protolanguage’. This idea is evident in some of the earliest writings about language evolution, and is now as popular as ever. Yet even as the popularity of the ‘gesture-first’ theory has surged, its major weakness – a flaw some consider fatal – has become all the more glaring.
Transcript Episode 45: Tracing languages back before recorded history
This is a transcript for Lingthusiasm Episode 45: Tracing languages back before recorded history. It’s been lightly edited for readability. Listen to the episode here or wherever you get your podcasts. Links to studies mentioned and further reading can be found on the Episode 45 show notes page.
[Music]
Gretchen: Welcome to Lingthusiasm, a podcast that’s enthusiastic about linguistics! I’m Gretchen McCulloch.
Lauren: I’m Lauren Gawne. Today, we’re getting enthusiastic about language families and reconstructing proto-languages. First, we’re excited to announce that the LingComm grants have been granted. We have amazing winners that are listed in the show notes page or on the website lingcomm.org.
Gretchen: You can stay tuned for further news from our four grantees. As their projects start coming out, we’ll be telling you about them as well.
Lauren: Speaking of things that are out and in the world, I’m very excited that P. M. Freestone’s Crown of Smoke, which is the conclusion to the Shadowscent duology, is now available. The UK edition is available worldwide. I created the Aramteskan language for those books. We talked about that for Book 1 in our Episode 37 about language and smell because it’s set in a world where scent is really evocative and powerful. If you want to know how that book series ends, you can get Book 2 now.
Gretchen: This month’s Patreon bonus episode is about linguistics with kids – books and activities and observations that you can do with kids to learn more about how they’re learning language or to incite a joy of language with kids.
Lauren: You can get this as one of 40 bonus episodes that we have available at patreon.com/lingthusiasm.
Gretchen: We have over 40 bonus episodes. If you’ve listened to all of the main Lingthusiasm episodes and you’re like, “Oh no! I wish there were more Lingthusiasm,” good news – you can support the show and listen to so many bonus episodes that are right there waiting for you.
[Music]
Gretchen: While we’re talking about history and getting kids excited about linguistics, can we go back into really ancient history and talk about how I got into linguistics?
Lauren: Oh my gosh, yes.
Gretchen: I’ve told part of this story before. I first encountered linguistics when I was around 12 or 13, and I happened across a pop linguistics book on a bookshelf –
Lauren: Ah, so this was started by a specific book! How interesting.
Gretchen: This was started by a specific book that I still have. I came across this bookshelf, and I just sort of picked it up because it looked like pop science and I knew I liked pop science. Then, I got about halfway through it, and I was reading it and I was like, “This is so cool! This is the coolest thing! This book is never leaving my possession again!” Fortunately, it belonged to my grandparents and they were willing to let me steal it.
Lauren: Nice of them.
Gretchen: One of the things that really sparked my imagination when I was reading this pop linguistics book was that it had this chapter about proto-world.
Lauren: Proto-world?
Gretchen: I feel like you have to say it like that – PROTO-WOOOORLD.
Lauren: Ok, let me try. I was about PROTO-WOOOORLD!
Gretchen: Exactly. This is this tremendously exciting idea that maybe we could figure out what the oldest language in the world sounds like. To me, at the time, this seemed incredibly exciting. What an ambitious task. All of the 7,000 languages in the world, you could figure out what they might have in common. Unfortunately, as I discovered –
Lauren: I don’t want to disappoint 13-year-old Gretchen, but that’s a bit of a tall order.
Gretchen: Unfortunately, as I learned more about linguistics, I also learned that proto-world is not a thing.
Lauren: But, having said that, the ability to take what we know about languages now and work backwards is definitely a thing. That’s the field of historical linguistics.
Gretchen: Right! What we can do is we can go back at this time depth of a few thousand years – 2,000 years to maybe 5,000 years – and figure out what languages had in common there and figure out some larger language families. Fortunately, very interestingly, language is probably some hundreds of thousands of years old. There’s just no way of going back to that extreme time depth. But, for this couple thousand years that we can do, it’s super interesting.
Lauren: I think it’s worth explaining why things get too squishy to go back to PROTO-WOOOORLD by looking at how we start doing historical linguistics and why you can only go back so far.
Gretchen: Exactly. I think the language that a lot of people think of when we think about reconstructing historical languages is this language called Proto-Indo-European. You don’t have to say PROTO-INDO-EUROPEAN because this one’s pretty solid.
Lauren: Yeah.
Gretchen: This is the reconstructed ancestor language of many of the languages that are spoken in Europe and the Indian subcontinent. Not all of them. There are some that aren’t related to this, like Basque and Hungarian. But most of the languages spoken in Europe, and a large number of languages spoken in India, all have this relationship.
Lauren: I think it’s not super surprising to people that languages are related. As an English speaker, if you hear German or Dutch, you’re like, “I can recognize some of those words. They’re easy.”
Gretchen: I don’t think it really takes a whole reconstructing proto-language to hear the English word “apple” and the German word “Apfel” and be like, “Hm, I wonder if there’s a connection there.”
Lauren: “Woah! They’re related?!”
Gretchen: Or even better, the Dutch word which is literally also “appel” but spelled differently. I don’t think that’s rocket science.
Lauren: What I think is impressive though, and where historical linguistics really came into its own, is looking at similarities but also differences and figuring out how those differences were systematic, so you can work back in time. If you think about as languages go on the sounds change so, if you put that in reverse, you can reverse those sound changes to figure out what an earlier version of the language might’ve sounded like.
Gretchen: You can make fun of this apple/Apfel example, but there are connections that are a little bit less intuitive. One of my favourite connections is you get these very large and elaborate tables, which we cannot convey tables in an audio podcast form –
Lauren: Not pleasantly, anyway.
Gretchen: – that go through and look at particular words. You have a word like “pater” in Latin and “father” in English and “pitar” in Sanskrit and “Vater” in German. You can look at them and be like, “Huh, all of those languages except for English have that T in the middle.” English has a /θ/ sound, which isn’t that different from a T. You have “padre” in Spanish. You’re like, “Uh, D, that’s not that different from a T.” Maybe there was an ancestor language that had a word like this for “father” that also had a T in the middle. Or if you have “pater” from Latin with a P, and Sanskrit both have a P, and German has a V, and English has an F, you can be like, okay, well, we have two Ps – well, several Ps – and a couple Fs and Vs. Maybe there was one of these, especially the P because Latin and Sanskrit are spoken pretty far apart – German and English are spoken pretty close together, just geographically – so maybe the ancestor language had a P there as well. You can do that comparison at this very, very nitty gritty level for a whole bunch of individual words and figure out which sounds change. Again, if you see P changing to F from Latin to English in “father,” you should also see P changing to F in other words. The Latin word “pes, pedes,” meaning “foot,” it’s cognate with foot because, again, there’s that P to F change. Well, German is “Fuß,” which isn’t a V, so, again, you have to account for this. You can go through and make these very elaborate tables with cross comparisons.
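For anyone who wants to play with this idea while conlanging, here is a minimal Python sketch of the kind of comparison table described above. It assumes the words have already been split into aligned sound slots by hand, uses the simplified spellings from the transcript rather than proper phonetic transcription, and the function names are made up for illustration. The "majority vote" step is only a crude stand-in for the reasoning in the transcript; real comparative reconstruction also weighs geography, subgrouping, and which sound changes are plausible.

```python
from collections import Counter

# Hypothetical, hand-aligned cognate set for "father".
# Spellings follow the transcript; they are not phonetic transcriptions.
COGNATES = {
    "Latin": ["p", "a", "t", "e", "r"],
    "Sanskrit": ["p", "i", "t", "a", "r"],
    "English": ["f", "a", "th", "e", "r"],
    "German": ["v", "a", "t", "e", "r"],
}


def correspondence_sets(cognates):
    """Return one {language: sound} dict per aligned position."""
    length = len(next(iter(cognates.values())))
    if any(len(segments) != length for segments in cognates.values()):
        raise ValueError("all words must be pre-aligned to the same length")
    return [
        {language: segments[i] for language, segments in cognates.items()}
        for i in range(length)
    ]


def majority_guess(column):
    """Pick the most frequent sound in one column (a crude first guess only)."""
    return Counter(column.values()).most_common(1)[0][0]


if __name__ == "__main__":
    for position, column in enumerate(correspondence_sets(COGNATES)):
        print(position, column, "guess:", majority_guess(column))
```

Each printed row is one line of the "elaborate table": the first row shows P, P, F, V side by side, and the third shows T, T, TH, T. The same table built for "foot" would need to show the same P-to-F pattern for the correspondence to count as systematic.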
Lauren: My favourite historical sound change comparison, because I think it’s always good to have a favourite, is the – if you look across a lot of languages that are related to Latin, they have a /k/ sound where the German languages, which branched off at a slightly different time and in a slightly different way, have a /h/ sound, which means that the word “canine,” which is the Latin for “dog,” and the word “hound” are actually historically related. That /k/h/ is one example of the change.
Gretchen: Because they both have this N afterwards.
Lauren: Yeah. There’re other changes that happen as well. “Heart” and “cordial,” or “cordis,” which is where a lot of our medical words around “heart/coronary,” come from, that’s that Latin /k/ again.
Gretchen: My favourite example with English “heart” – which is, again, there’s your H – my favourite example of the specific /k/ to /h/ change is in “cornucopia” versus “horn of plenty.”
Lauren: It’s right there.
Gretchen: That “cornu-” at the beginning of “cornucopia,” that’s a horn! And the “-orn” part, “corn” and “horn,” you really see a lot of similarities because the rest of the sounds in those words haven’t changed as much.
Lauren: The nice thing about “cornucopia” is that we borrowed “copious” from Latin to mean “a lot.” Because it wasn’t just a word that stayed in English and had the systematic sound change, we didn’t borrow it as “hopious.” We just borrowed it with the Latin sound. That’s the difference between words that are borrowed and words that are related.
Gretchen: The neat thing about doing this comparative reconstruction is that you need to figure out, okay, first of all, what do we know about the history of this word? Because the word “copious” in English isn’t gonna give us evidence for what was going on in the ancestor language of the Germanic languages because we know that it was borrowed from Latin much more recently. First of all, figuring out where all these words came from and then doing the comparison only with that bit of core vocabulary that did have all of the sound changes happen to it. The fun thing is that, as an English speaker, because we’ve borrowed so many words from other European languages at various times, is that we can often see these sound changes happen even within English. Even if you don’t speak Latin, you can be like, “Oh, I know the word ‘cornucopia.’” You can see this happen within English as well because we’ve borrowed words at different time depths.
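The borrowed-versus-inherited distinction above can be sketched in a few lines of Python. This is a toy check under loud assumptions: it looks only at the first letter, treats Latin spelled "c" as the /k/ sound, and the Latin forms "canis" and "copiosus" are supplied here for illustration (the transcript mentions "canine" and "copious"). A real analysis would check many words and many sound positions.

```python
# (Latin form, English word) pairs; Latin forms partly assumed for illustration.
PAIRS = [
    ("cornu", "horn"),
    ("canis", "hound"),
    ("cordis", "heart"),
    ("copiosus", "copious"),
]


def classify(latin, english):
    """Label a pair by whether the initial /k/ to /h/ change shows up in English."""
    if latin[0] == "c" and english[0] == "h":
        # English shows the systematic change, as an inherited cognate would.
        return "consistent with inherited cognate"
    if latin[0] == english[0]:
        # English kept the Latin sound, as a later borrowing would.
        return "consistent with borrowing"
    return "unclear"


if __name__ == "__main__":
    for latin, english in PAIRS:
        print(f"{latin:10} {english:10} {classify(latin, english)}")
```

Words in the "borrowing" bucket, like "copious", are the ones to leave out of a reconstruction table, since they never went through the sound changes being compared.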
Lauren: One thing I’m always blown away with, with historical linguistics doing this comparison and then working backwards to reconstruct an older language, is just how many words it requires, how much knowledge of which words are a borrowing and which words are original to the language. For every sound rule there’s always these sub-rule exceptions because in front of some verb something doesn’t happen. It’s this meticulous work. As computers have become more advanced, the scale of the work has expanded so much because you can crunch more data. There’s very large spreadsheets and tables that happen to build these reconstructions. But even before that, people could go so far back to reconstruct what they think words were like before written records because writing is actually a relatively new invention, especially writing of sounds and not just images for words.
Gretchen: Writing is so new. It’s so interesting to look back at this part before written records exist. This is why people call them proto-languages. Any language that’s preceded by a “proto” is something that’s been reconstructed. Sometimes, you get Old English. We have records of that. We have books about it. But if it’s a proto-language, it’s specifically called that because we don’t have any sort of written records and it’s been figured out in that sort of way. I think what I really love about historical linguistics is the way that it pays attention – the people who do it – pay such close attention to what’s attested and what’s reconstructed and the sources of information. There’s this convention in historical linguistics to write an asterisk before proto-words. This confused me coming from the rest of linguistics because in the rest of linguistics, you put an asterisk before a word or sentence that’s not grammatical.
Lauren: Or like in my instant message chat where I frequently have to use an asterisk because I misspelt a word and need to correct it.
Gretchen: Well, okay, there’s that too. I think that is a bit later of a usage than the linguistic sense. Normally, in most of linguistics, an asterisk is used for “This is ungrammatical.” This is like “horn plenty of” or something like that, which you wouldn’t say those words in that order. Historical linguists use that asterisk as a reminder to themselves to be humble about it. It’s not that it’s ungrammatical; it’s that it’s unattested. “This one isn’t real” in the same sort of way as like, “This is ungrammatical. It’s negative evidence. This one isn’t real.” Except, for the historical linguists, it’s “We’ve put in a whole bunch of effort and we’ve had all these theoretical debates and we’ve come up with our best effort possible with a lot of sweat and tears to figure out this thing that we still want to acknowledge as not real.” The asterisk there is really a sort of intellectual humility.
Lauren: You begin to see why proto-world would be such a challenge because, by the time you get to Proto-Indo-European, which was probably spoken 4,000 years B.C.E. – so you’re looking at 5,000 to 6,000 years ago for Proto-Indo-European – you are making a lot of guesses to get to that point. Then, if you try and compare it to any other proto-language that’s been reconstructed, you’re now making guesses out of two guesses which becomes very slippery.
Gretchen: It’s like if you have one number that’s fairly uncertain that’s got a margin of error – you know, plus or minus 5% – and you have another number that also has a margin of error of plus or minus 5%, and then you multiply together, you suddenly have a number that has an error of, I think it’s plus or minus 25% because I think you also multiply the margins of errors. Don’t quote me on the statistics, but the point is you get a worse number – a less precise number that has a greater margin of error. The same thing happens when you try to do that. There are some attempts to say, “Okay, well, even if we can’t do PROTO-WOOOORLD, maybe we can find an ancestor language of Proto-Indo-European that is spoken in a slightly larger area,” but even that is still very speculative because of the time depth involved.
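A quick check of the arithmetic in the aside above, since the speaker flags it as uncertain: when two uncertain numbers are multiplied, their relative errors roughly add rather than multiply, so two inputs at plus or minus 5% give a product at roughly plus or minus 10%, not 25%. The point being made still holds, because the combined number is less precise than either input. The sketch below shows two common ways of combining the errors; the function names are made up for illustration.

```python
import math


def worst_case_relative_error(rel_a, rel_b):
    """Upper-end relative error of a product when both inputs are off in the same direction."""
    return (1 + rel_a) * (1 + rel_b) - 1


def quadrature_relative_error(rel_a, rel_b):
    """Standard rule for independent errors: combine relative errors in quadrature."""
    return math.hypot(rel_a, rel_b)


if __name__ == "__main__":
    print(f"worst case: {worst_case_relative_error(0.05, 0.05):.4f}")
    print(f"quadrature: {quadrature_relative_error(0.05, 0.05):.4f}")
```

Either way the uncertainty grows with each layer of guessing, which is why comparing two reconstructed proto-languages is so much shakier than reconstructing one.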
Lauren: Unless we discover some way that spoken and signed languages left fossils or we invent time travel, neither of which feel particularly plausible, we’re stuck with the –
Gretchen: I always say this is the first thing that I’m gonna do whenever I get my hands on a time machine is I’m gonna go back and retrace some language families. People just don’t seem to understand why this is such an important idea.
Lauren: So compelling. I would fund the heck out of that time travel research.
Gretchen: Thank you! I appreciate it. We need to write a grant. It’s still really neat. If you know a couple of Indo-European languages, you can use it to make nifty connections. I definitely managed to figure out some various Dutch vocabulary from my knowledge of English and German. I was able to be like, oh, I can probably triangulate on what this word means because I know what the sound changes have been. That’s really satisfying. I also think that when we talk about historical linguistics and comparative reconstruction, it’s easy to get down the Indo-European rabbit hole because it’s so satisfying to do this with languages where you already know potentially more than one of them or at least you have a bunch of borrowed cognate vocabulary to work with and to see these connections between languages that you’re maybe more familiar with. There’s a whole bunch of comparative reconstruction that’s also happened in language families in the rest of the world as well.
Lauren: Yes. I’d hate for people to think that this is the only language that this has happened for. It is an area where there has been a lot of written records, and I think that has helped drive a lot of that work and a lot of local interest from Europeans who are interested in tracing their own languages back.
Gretchen: I mean, the Indo-European languages have been written down longer than some languages, not as long as others. You could go pretty far back with Semitic languages too, I expect, because they’ve been written down for a long time. One thing that I find really interesting is – so Proto-Algonquian, which is the ancestor language of the Algonquian languages which are spoken in a large portion of Canada, especially – not in the north – but a large portion from British Columbia all the way to Nova Scotia and the Maritimes on the east and into the north-eastern United States in New England. There’s about 30 Algonquian languages. Proto-Algonquian is a really well attested, well reconstructed language family that is the ancestor language from which these languages are descended. The estimate’s that it was probably spoken around 2,000 to 3,000 years ago and, like with the apple/Apfel example, it’s very clear at a certain level to speakers that these languages are similar and that these languages feel like cousins. If you get some speakers of different Algonquian languages in a room together – and I’ve been to the Algonquian conference when I was in grad school and you had different speakers and different linguists who had researched these languages – sometimes they just sit around being like, “What’s your word for this? Oh, yeah, we have that.” It can be really fun.
Lauren: Okay. So, no one is like, surprised by Proto-Algonquian as a concept?
Gretchen: No one is surprised by Proto-Algonquian as a concept. I noticed when researching for this episode that there had actually been a linguist who had noticed this connection 10 years before one of the early famous speeches about Indo-European. Even if you want European intellectual tradition bragging rights, which are their own thing – it is definitely not where all knowledge comes from – but even if you want to participate in that tradition, this language family has been talked about for a long time.
Lauren: Cool!
Gretchen: I should probably mention a few of the names of the languages in Algonquian because many of them are languages that people have probably heard of – Ojibwe, Cree, Massachusett, Menominee, Wampanoag, Blackfoot, Mi’kmaq, Innu. There’re a whole bunch of languages that are in this family. A lot of words from them have been borrowed into English at this point. A lot of words for concepts that English speakers encountered in North America, things like “moose,” “chipmunk,” “moccasin,” “hickory,” “caribou,” “raccoon,” “skunk,” “succotash,” “toboggan,” “woodchuck.” Some of these were reshaped on the model of more English-looking words. “Woodchuck” was reshaped. But a lot of words for animals and concepts in North America were ultimately borrowed from an Algonquian language.
Lauren: That’s really cool. I assume that place names are probably also somewhere this pops up a lot as well.
Gretchen: Tons of place names – “Ottawa,” “Massachusetts,” “Connecticut.” I mean, and there are other indigenous languages spoken in this area. Iroquoian languages are also spoken in this area and they’re not related, but they were some of the first languages that European settlers had contact with. A lot of their words for things in the North American continent became the English words for the items as well.
Lauren: Once you have a word for “raccoon,” you can take that word with you across the country.
Gretchen: What’s also really interesting is that these languages had structural and grammatical similarities as well.
Lauren: That’s cool because so far, we’ve only talked about doing historical work by comparing words and sounds and working backwards from them. Of course, we also have grammar, and grammar between closely related languages can be very similar and tends to be more similar. If you’ve learnt a language close to English or you’ve learnt English from the basis of speaking another Germanic language and then learnt a grammar of a very different language family, I think that’s probably something you’ve encountered before.
Gretchen: Indo-European languages, for example, often have a grammatical gender distinction between masculine, feminine, and neuter. Some of them have collapsed some of those genders into just masculine and feminine, or neuter and common, with common as the collapsed masculine-plus-feminine gender. Or in some cases, like English, they only retained relics of that on the pronoun system. Algonquian languages have an animate/inanimate contrast, which is also a way of splitting things up. They also do particular things with the verb depending on whether they’re dealing with animate nouns or inanimate nouns. There’s a lot of ways that that animate/inanimate contrast shows up and goes all the way through the grammar.
Lauren: A bit like with sounds we look for that systematic change with the grammar. We can look at systematic ways that it pops up across each individual language and then compare them to see if we can find commonalities.
Gretchen: Sometimes, reconstructing sounds can help us find bits of grammar. One of the ways where animate versus inanimate shows up in Algonquian languages is that the way of making the plural is different. If we can reconstruct a plural suffix for the animate nouns and we can reconstruct a plural suffix for the inanimate ones, then that must mean that they had this distinction between the two suffixes because otherwise they wouldn’t both exist and they wouldn’t exist in all the daughter languages.
Lauren: This is one of the fun things about historical linguistics is that it lets you dabble in all the different parts. You have to know a little bit about how sounds might work and change and how grammatical structures work and how they might change across languages as well. You get to look at all the different parts of how language works.
Gretchen: Indo-European languages in general don’t distinguish between inclusive and exclusive “we.” You know, “You and me, we’re going to go to the movies” versus “Me and this other person, we’re gonna go to the movies, and we’re leaving you behind.”
Lauren: Oh, okay. But at least I know. At least I’m not waiting for my invitation over here.
Gretchen: At least you know. You’re not waiting. The Algonquian languages all do make this distinction. That’s something you can reconstruct because all these related languages make it. Whereas, Indo-European languages, none of them make this distinction. This is the kinds of fine-grained grammatical stuff that can last for thousands of years that this distinction sticks around, or it doesn’t.
Lauren: So great.
Gretchen: There’s this really great website the Algonquian Linguistics Atlas that has audio files of speakers saying various words and phrases in different Algonquian languages. They’re kind of mapped around. You can click on sound files and hear what they sound like and how they’re written in various different areas. We’ll link to that.
Lauren: In Australia, you also have this one language family reconstructed that has a really large geographic spread. That’s known as the Pama-Nyungan language family, which is made up of the Pama languages, which is a subgroup, and the Nyungan ones. They’re just talked about together as a single group. In fact, it’s such a large group – it’s around 300 languages, we think – that the dozen or so other language families that sit across the top of Australia are all just known together as the Non-Pama-Nyungan languages. Even though they’re not related to each other, they’re just lumped into being not this major language family.
Gretchen: I feel like maybe I should clarify that the Algonquian language family is definitely not the only language family in North America. It’s a very big one, but there are a lot of languages, especially on the west coast, that are in smaller groups. In towards the south into the US, there are various other groups as well. It’s not quite as cohesive as the Australian picture.
Lauren: Yeah. It’s very diverse at the top of Australia. There’s a lot of diversity within this Pama-Nyungan language family, but it definitely dominates in terms of the number of languages in Australia. One thing that I obviously don’t just find upsetting because of its relevance to historical linguistics, but you’re really working with fragmented records in Australia in a way that is obviously because of a direct and traumatic experience of colonization where you had an incredibly rich oral tradition, and you still have an incredibly rich oral tradition across millennia. Oral traditions have incredible ways of conveying rich time depth of information. It’s just a very different experience to the written traditions we’re used to with things like Latin and Old English. It means that, when those channels of passing on knowledge and passing on language were lost, we really lost this ability to tell the full story of Australia. A lot of the work that’s done is done with very fragmentary records of word lists or, sometimes, just a couple of words we know – maybe a few key words in a particular language. When I read about historical linguistics in Australia, I’m always just left with this really heavy-hearted feeling about the stories that we’ve lost in being able to tell this big time-depth story.
Gretchen: That’s huge.
Lauren: Because we have these really fragmentary lists of data, a lot of what we can say for certain about similarities and the things that make up the Proto-Pama-Nyungan language is around the sounds that are in the language. Like many of the Pama-Nyungan languages spoken today, it was a language without fricatives, so it doesn’t have sounds like /s/ or /z/ that we take for granted in a lot of the other language families in the world and can make it very striking as a language family.
Gretchen: Interesting.
Lauren: It had three vowels which, again, is definitely on the smaller side for human languages.
Gretchen: Which three vowels – or is that not agreed on?
Lauren: An /i/, /a/, and /u/. The cool thing is, when you have fewer vowels, you do more interesting things with the vowel space.
Gretchen: When I was learning Arabic, which also only has three vowels, a lot of things that you might think of as different vowels in English are considered just like versions of the same vowel in Arabic.
Lauren: Yeah. It’s really neat.
Gretchen: I think I noticed when I was visiting Australia that a lot of the place names that were based on local languages had /ŋ/ sounds in them. Is that something that’s true of Pama-Nyungan?
Lauren: Yes. The velar nasal gets a lot of use. It’s in Proto-Pama-Nyungan as well. Although, when it occurs at the start of a word, we often have changed that so it’s easier for English speakers to pronounce.
Gretchen: Ah, yes. That old story. I mean, there are some languages in North America that were wiped out very early. We don’t have a lot of records. One of them is Beothuk, which was spoken in Newfoundland. We know that, but we don’t know very much about the language or the speakers in general. It’s not even clear whether it was Proto-Algonquian or not – whether it was its own language family. Another one, Wampanoag, has relatively recently been part of a pretty successful language revitalisation movement because they did leave written records – quite a lot of them – so speakers were able to bring the language of their ancestors back. Overall, a lot of the Algonquian languages still have at least some speakers around and many of them have revitalisation movements and these kinds of things because there is still sufficient language transmission happening or being very actively worked on to continue happening in a lot of these areas. It takes effort but it’s not down to the situation where you only have a couple word lists.
Lauren: Sometimes, when you see these very neat historical trees – much like animal evolution trees, people will draw these trees of the evolution of languages like Proto-Indo-European as they split off into all the language families and then the languages we know today – sometimes I feel like you can get so into paying attention to words in tables and sound comparisons you can kind of forget that there’s a big story of human history that happens across thousands and thousands of years when you’re doing historical linguistics.
Gretchen: Yeah, and that languages have this history but a lot of this history, I think, to get back to this question what can we even know, is related to other types of activities that happen in the world, whether that’s conquest or war or people moving from one area to another. Those histories also show up in what we can know about language.
Lauren: So far, we’ve talked about sounds. We’ve talked about grammar. But we can also look at which words you end up being able to reconstruct all the way through to the proto-language. The word for “mobile phone” we’re not going to be able to reconstruct back to Old English whether we have records or not.
Gretchen: Oh, no! Really?
Lauren: The words that we can reconstruct can tell us potentially something about what people were talking about back in a language before we have written records of it.
Gretchen: Right. The idea being that if these languages all had a common ancestor word for “fish” or something, we know that people must have had fish because otherwise they wouldn’t have had a word for it.
Lauren: Yes. There’s been a lot of work on Proto-Indo-European for this just because there’s been a lot of work on Proto-Indo-European and there’s a lot of data for it. We know, for example, that words that we can reconstruct all the way back include domestic animals like “cow,” “sheep,” “goats,” and “pigs,” and words for dairy foods like “milk” and words for “wool.” That gives you an idea of the kind of agriculture they were performing.
Gretchen: The level of technology.
Lauren: Well, yeah, we can reconstruct a word for “wagon,” and there’s an indication that they had access to the technology of the wheel.
Gretchen: That’s some technology right there.
Lauren: It’s an exciting technology back then.
Gretchen: There are a couple hypotheses about where this means that Proto-Indo-European was spoken. I think a lot of people these days think that it was spoken in the steppe zone – the Pontic-Caspian steppes – in Eastern Europe. That’s around present-day Ukraine and southern Russia. There’s also a hypothesis that it was spoken in Anatolia, which is pretty much modern-day Turkey, somewhere in that general direction, but there’s still people doing archaeological research and various kinds of research to try to figure out exactly where. One thing you can do is say we’re quite sure that it wasn’t spoken on an ocean because there isn’t a Proto-Indo-European word for “ocean,” but there was for other smaller bodies of water. They had access to some water but not the big one. You can figure out things like that. For Proto-Algonquian, people think that it was spoken somewhere – I think the recent research suggests that it was spoken immediately west of Lake Superior, based on looking at the names of plants and animals that are across the different Algonquian languages. Those are found in that particular area.
Lauren: Once again, you have to add some knowledge of semantics and then some botany and agriculture to your list of things you need to know about when figuring out proto-languages.
Gretchen: My favourite recent example of this is that a lot of people have been very excited these days about the fact that many European languages use a verb that’s like “to hamster” to mean “hoard.” German has “hamstern.” Dutch has something very similar. There’s a bunch of different languages – I think some Slavic languages have it. People were like, why doesn’t English have this great word “to hamster”? I was asking this on Twitter the other day and somebody pointed out to me that the range of the European wild hamster doesn’t extend into Britain. It’s only on the European continent.
Lauren: That is excellent. Maybe we have a hypothesis there about the extent of linguistic and hamster spreads.
Gretchen: Well, English does have the word “to squirrel,” which means something quite similar because we do have squirrels in English-speaking areas more than we used to have hamsters. No claims about Proto-Indo-European there, but it’s an interesting example of how geography and climate and animals and plants and botany and so on, the flora and fauna, can influence what we know about languages as well.
Lauren: We’ve talked about three different proto-groups. We haven’t even talked about isolate languages for which you can’t find any related languages to do any comparison. There are a whole bunch of other language families that have been constructed and reconstructed back to proto-forms. I work on Tibetan languages which are, depending on how far back you’re willing to make educated conjectures, are either part of the Tibeto-Burman family, which includes the languages of Tibet and the languages of Burma and that whole region, that some people claim we can reconstruct even further back to include the Sinosphere, so the languages of China, which would be a much larger and older group. That’s a part of the world that I work in directly in terms of proto-language reconstruction.
Gretchen: Another proto-language that I’m just a huge fan of is Proto-Bantu.
Lauren: Yeah, that’s a good one.
Gretchen: I studied a very, very small amount of Kinyarwanda when I was in undergrad. Ever since then I’ve been like, Bantu languages are great! Proto-Bantu is really neat because one of the things that a lot of people know about the Bantu languages is that they have all of these noun classes. It turns out like, yeah, you can reconstruct all of these noun classes. Of course, there’s controversy about exactly how many can be reconstructed to Proto-Bantu. A lot of the – there are a certain set that are very well-established, and then there’s another area where there’s more controversy. We can also reconstruct where people lived and what technology they had. Agriculture, fishing, and the use of boats based on the vocabulary were already known to the Bantu people before they started expanding into different areas and the languages started diverging, but iron working was something that showed up later once they’d already expanded. You can place the date of expansion between 3,000 and 800 B.C.E. These things you can know about these people so long ago is really neat.
Lauren: Gretchen, if we wanted to get to proto-world, we would have to take all of these proto-families that people have spent centuries now meticulously working back to these incredibly tentative hypotheses about how these languages work and you would have to go back even further to get to proto-world. How does grown-up Gretchen feel about 13-year-old Gretchen’s aspirations?
Gretchen: Well, I went back – and I still have this book – I went back, and I looked at it again for the first time in many, many years.
Lauren: Oh, my gosh, amazing! What was it like to revisit?
Gretchen: I mean, fortunately, the chapter was only nine pages long, so there was only so much damage they could do in nine pages. I’m not naming the book. It was a great book for getting a 12-year-old interested in linguistics.
Lauren: It served its purpose.
Gretchen: I haven’t read anything else of it in many years, but I was very willing to be very credulous about a lot of these statements. One of the things that I noticed is that, so we’re able to go back a couple thousand years by comparing what we have of existent languages or sometimes records of languages from a few hundred years ago or about 1,000/2,000 years ago you can go back an additional hop step of maybe 2,000 to 3,000 years. That’s where we can go. The problem is time depth-wise. It’s not just like, okay, well, what if we take each of these languages and compare them. Even if we could do all of this methodology, which we can’t always, that only takes us back another potential hop step of another couple thousand years. We need to go back, like, 100,000 years. We need to do this step, like, 50 times.
Lauren: We’re already in incredibly tenuous ground.
Gretchen: We just don’t have data at that point. Maybe 50 times, maybe 70 or 80 times, because we also don’t know how far we have to go back because speech and sign doesn’t leave fossils. Not only do we not know how far we have to go back, we start multiplying very approximate numbers by each other and then we just keep doing so and the inaccuracies just keep compounding. I’m glad that this book got me into linguistics, but I’m also glad to leave it safely on the shelf as a memento.
[Music]
Lauren: For more Lingthusiasm and links to all the things mentioned in this episode, go to lingthusiasm.com. You can listen to us on Apple Podcasts, Google Podcasts, Spotify, SoundCloud, or wherever else you get your podcasts. You can follow @Lingthusiasm on Twitter, Facebook, Instagram, and Tumblr. You can get IPA scarves, IPA ties, and other Lingthusiasm merch at lingthusiasm.com/merch. I tweet and blog as Superlinguo.
Gretchen: I can be found as @GretchenAMcC on Twitter, my blog is AllThingsLinguistic.com, and my book about internet language is called Because Internet.
Have you listened to all of the Lingthusiasm episodes and wish there were more? You can get access to 40 bonus episodes right now to listen to at patreon.com/lingthusiasm or follow the links from our website. Patrons also get access to our Discord chatroom to talk with other linguistics fans and other rewards like helping to keep the show ad-free for everyone. Recent bonus topics include synaesthesia, numbers, and linguistics with kids. Can’t afford to pledge? That’s okay, too. We also really appreciate it if you can recommend Lingthusiasm to anyone who needs a little more linguistics in their life.
Lauren: Lingthusiasm is created and produced by Gretchen McCulloch and Lauren Gawne. Our senior producer is Claire Gawne, our editorial producer is Sarah Dopierala, our music is “Ancient City” by The Triangles.
Gretchen: Stay lingthusiastic!
[Music]
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Musical protolanguage
Delve even further, and we might begin to understand how those traits arose in pre-history. Music and language both involved extraordinary changes: refined articulation, the capacity to imitate others and the ability to think symbolically. But what set it all in motion?
One particularly elegant solution to this conundrum dates back to the father of evolutionary theory, Charles Darwin, who proposed that the two traits arose together as a kind of “musical protolanguage”. According to this view, humans first started singing before we could talk – perhaps as a kind of courtship ritual. Like the blackbird’s song, the musical protolanguage would have been a way to show off our virtuosity, forge social bonds, and scare off rivals, without carrying specific meanings. Over time, however, the practice would have pushed us to evolve a finer control of our vocal cords, which then laid the foundations for more meaningful utterances.
BBC - Future - The beautiful languages of the people who talk like birds
Let’s talk about kinship terminology in Swiric. (This has gotten long.)
Swira, mentioned occasionally, are a mostly nomadic nation living north of Tepat. They and many other steppe nomads belong to a larger language family I call Macro-Swiric. The protolanguage has been previously introduced here. Although society has evolved and gotten more complex, of course, the nomads, like their ancestors, are largely migratory and live in small groups, but have shifted from primarily hunting to herding.
The Proto-Swiric speakers most likely migrated in small bands, which became larger throughout history. Band size could be seasonal, with people gathering in larger groups at certain times. Within bands, people lived in small family groups, sharing one dwelling, the *igʷpi. This would include parents and any unmarried children. Married children often stayed nearby but moved into a separate dwelling.
Thus the nuclear family was prominent, as in modern industrial societies. But unlike American/European families there was an additional consideration for age that sometimes overrode characteristics such as gender, generation, and so on.
In Proto-Swiric, we can reconstruct words for parents and other family members - many have a reduplicated form which is otherwise unusual in Proto-Swiric: *nana ‘mother,’ *tata ‘father,’ *xaxa ‘grandmother,’ and *dladla (or *dlada or *dlala) ‘grandfather.’ Parents were also known as *qʷabtoʔi ‘one who sent me down’ i.e., gave birth to me. Common terms for siblings were divided by relative age. Older brother was *ɯχʷa, and older sister *ini, but younger siblings were *lig regardless of gender. Siblings in general could be called *tampid(-ʔi) ‘father [is] one’ or *nampid(-ʔi) ‘mother [is] one,’ regardless of age or gender. The whole nuclear family, parents and siblings, was *igʷǝmpitʔi ‘house [is] one.’
The reduplicated family terms were often possessed, and often occurred in non-reduplicated form when they were possessed: *qʷetata or *qʷeta ‘my father,’ *qʷena ‘my mother,’ *dlena ‘your mother,’ etc. The simple forms of the terms were also used as suffixes, for example on names: *-ta for men, *-na for women, and *-(d)la for old people. Depending on the overall word, they might have reduced forms *-tǝ, *-nǝ, and *-lǝ.
A son was *mu and a daughter *mag, but children overall could be called *kus, which also meant ‘small,’ and there were also compound terms, *mukus and *makkus.
Aunts and uncles could be specified by compound terms (notice the non-reduplicated form):
*tan ini ‘father’s older sister’
*tan ɯχʷa ‘father’s older brother’
*tan lig ‘father’s younger sibling’
*nan ini ‘mother’s older sister’
*nan ɯχʷa ‘mother’s older brother’
*nan lig ‘mother’s younger sibling’
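The compound terms above follow one simple pattern: a non-reduplicated parent stem followed by a sibling term. Here is a minimal Python sketch of that pattern, assuming nothing beyond the six forms listed. The function and variable names are hypothetical and only there for illustration.

```python
# Illustrative only: the dictionaries mirror the forms listed above;
# the function and variable names are hypothetical.
PARENT_STEMS = {"father": "tan", "mother": "nan"}
SIBLING_TERMS = {
    "older sister": "ini",
    "older brother": "ɯχʷa",
    "younger sibling": "lig",
}

def parent_sibling_term(parent, sibling):
    """Return (form, gloss) for a parent's sibling, e.g. father + older sister."""
    form = f"*{PARENT_STEMS[parent]} {SIBLING_TERMS[sibling]}"
    gloss = f"{parent}’s {sibling}"
    return form, gloss

# Print all six combinations in the same order as the list above.
for parent in PARENT_STEMS:
    for sibling in SIBLING_TERMS:
        form, gloss = parent_sibling_term(parent, sibling)
        print(f"{form} ‘{gloss}’")
```

Keeping the stems and sibling terms in separate tables makes it easy to swap in later sound-changed forms when deriving daughter languages.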
However, the words for older siblings could also be used generically to refer to older relatives of any generation. So a person could say *ɯχʷa to his older brother, but also to an older male cousin or an uncle, and *ini to an older sister, older female cousin, or aunt.
Conversely, the words for grandparents might also be used for older siblings of the parents, especially if the speaker was young. A term such as *dladla or *xaxa might have been used politely to address any old man or old woman.
Cousins - people in the same generation - were also distinguished by age, with older cousins being *iksi (female) and *utsu (male). All younger cousins, however, were lumped together with younger siblings as *lig. If necessary they could be specified as *iksinlig or *utsunlig, which evolved into terms of their own.
Beyond *dla ‘grandfather,’ there are words for further generations. The third generation above a person was known as *sab, and the fourth was *gʷin. The female counterparts were *xa sab/sab xa and *xa gʷin/gʷin xa. Further generations might be indicated by compounding or reduplication. However, the compound form *sab(ǝ) gʷin, as well as the plural forms *dlalod, *sabud, and *gʷinud, could all refer to ancestors collectively, or to the clan or tribe.
Going the other direction, *kus meaning both ‘small’ and ‘child’ generally could be appended to the grandparent terms to indicate grandchildren. Hence a grandchild is *dla kus, great-grandchild *sab kus, great-great-grandchild *gʷin kus etc.
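The descendant terms mirror the ancestor terms, so they can be generated from the same small table. This is a rough Python sketch under that assumption. The function name is hypothetical, and the limit to generations 2 through 4 is mine, since the text leaves further generations to compounding or reduplication without giving forms.

```python
# Ancestor terms from the paragraphs above, indexed by generations up.
# The function name and the 2-4 range limit are illustrative assumptions.
ANCESTOR_TERMS = {2: "dla", 3: "sab", 4: "gʷin"}

def descendant_term(generations_down):
    """Return (form, English gloss) for a descendant n generations down."""
    if generations_down not in ANCESTOR_TERMS:
        raise ValueError("only generations 2-4 are covered by this sketch")
    # The matching ancestor term plus *kus 'small, child'.
    form = f"*{ANCESTOR_TERMS[generations_down]} kus"
    gloss = "great-" * (generations_down - 2) + "grandchild"
    return form, gloss

for n in (2, 3, 4):
    form, gloss = descendant_term(n)
    print(f"{form} = {gloss}")
```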
Returning to the plural form, this plural is not strictly a plural in the English sense, but also referred to collective groups, typified by one member. So the plural form of ‘mother,’ *nanod or *nǝnod, did not mean that Heather had two mommies; it meant ‘mother and women of her generation, aunts,’ or even more broadly ‘maternal relatives.’ In parallel, *tatod/tǝtod was ‘uncles’ or ‘father’s relatives,’ or simply ‘clan, tribe,’ among groups with paternal descent. Paternal relatives / clan were also called *ahimud / *isod ‘bloods,’ and maternal relatives / clan were *χʷəlod ‘bones,’ reflecting the belief that one inherited one’s ‘bones’ from the mother, and ‘blood’ from the father, while either kin group might be *qalod ‘fleshes.’
The oldest male in a group was the *axʷǝn ‘chief’ or ‘master.’ Hence the *igʷ-axʷǝn or ‘home-master’ was ‘father’ or ‘husband’ - but also the *axʷǝn kus referred to the oldest son.
Upon marriage, the bride usually moved to the groom. The *akanisʔi or *akǝnesʔi was the bride or ‘one who is brought in [to the house].’ (Becoming enetsi.) The one who brought her in, her father, was *akanuʔi or *akǝnoʔi (> *eno’i). The mother-in-law was *akǝnoʔina, and the brother-in-law to the groom was *akǝnesʔin ɯχʷa or lig.
*Dibil referred to a variety of in-laws, including brothers-in-law. The basic meaning was related to ‘succeed,’ ‘inherit,’ ‘replace,’ or ‘exchange.’ It could refer to a youngest son - the youngest son was often the ‘heir’ of a family. In some historical Swiric families, older children would receive gifts on marriage and move away, while the last child would stay with the parents and receive whatever they had left. *Dibil could also mean the son-in-law of a family without sons. But it could also refer to other in-laws, such as a brother-in-law. Many tribes had a custom of levirate marriage, where a man would marry a deceased brother’s widow, as a second wife if need be, and among later tribes where chiefs had multiple wives, whoever succeeded the chief, including his son, might marry the former chief’s wives (excluding his own mother).
The word *qɯd ‘part, side, half’ plays two different, important roles. In western languages, it evolved into a prefix or modifier in front of kinship terms, referring to more distant, thinner relationships, such as distant cousins, or to in-laws. Hence, *qɯd ɯχʷa ‘brother in law,’ *qɯd ini ‘sister in law.’ (E.g. > Swiric kulini ‘sister in law.’) In Nasic languages, it referred to a side of one’s family, such as mother’s side, father’s side, becoming a suffix -χt forming words for groups of people.
Aside from the terms above, Proto-Swiric had several prefixes referring to particular categories of kinship, which could be added to some of the same terms to create several sets of other terms for kinds of relatives, such as in-laws, etc.
*O- paternal relatives (with vowel reduction)
*o-dlǝ paternal grandfather
*o-lig uncle (father’s younger brother)
*o-χʷǝ father’s older brother
*o-xǝ paternal grandmother
*o-ni father’s older sister
*NA- maternal relatives (with vowel reduction)
*na-dlǝ maternal grandfather
*na-lig uncle (mother’s younger brother)
*neni mother’s older sister
etc.
*IB- husband’s relatives
*ib-pa husband
*ib-lig husband’s younger sibling
*ib-ta husband’s father
etc.
*UGI- wife’s relatives
*ugi-pa wife
*ugi-ta father-in-law
*ugi-na (> wi:na) mother-in-law
*ugeni wife’s older sister
etc.
*SO- younger in-laws (with vowel reduction)
*so-mu son-in-law
*so-mǝg daughter-in-law
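For keeping track of which prefix set a term belongs to, a lookup table is probably the safest tool, since the vowel reduction is not fully predictable from the examples. Below is a small Python sketch that stores a subset of the forms above as whole words and looks them up. The table layout and the function name are hypothetical, and no forms are derived by rule.

```python
# A subset of the prefixed terms listed above, grouped by prefix.
# Forms are stored whole; this sketch does not try to model vowel reduction.
PREFIX_MEANINGS = {
    "o-": "paternal relatives",
    "ib-": "husband's relatives",
    "ugi-": "wife's relatives",
    "so-": "younger in-laws",
}

PREFIXED_TERMS = {
    "o-": {"*o-xǝ": "paternal grandmother", "*o-ni": "father’s older sister"},
    "ib-": {"*ib-pa": "husband", "*ib-ta": "husband’s father"},
    "ugi-": {"*ugi-pa": "wife", "*ugi-ta": "father-in-law"},
    "so-": {"*so-mu": "son-in-law", "*so-mǝg": "daughter-in-law"},
}

def find_term(form):
    """Return (gloss, kinship category) for a listed form."""
    for prefix, terms in PREFIXED_TERMS.items():
        if form in terms:
            return terms[form], PREFIX_MEANINGS[prefix]
    raise KeyError(form)

gloss, category = find_term("*so-mu")
print(f"*so-mu: {gloss} ({category})")
```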
So, overall, there are a lot of possible relationship terms, but that’s fine; I have a lot to choose from in constructing later Swira’s words & can forget the ones that don’t sound OK.
An example of Late Proto-Swiric. Previously, I posted a short text in Proto-Nastowic, then one in Early Proto-Swiric. Here is its evolution to the next stage, (late) Proto-Swiric. This is the ancestor of Swira, one of the languages of the "barbarians" north of Tepat.
It is followed by the early Proto-Swiric text, then the Proto-Nastowic text, to illustrate the development. At this stage, we start to see long (double) vowels and vowel clusters, and almost all open syllables.
For this one, I've bolded a couple words which underwent replacement or an irregular change.
Would it be better to show morpheme boundaries too, or not?
The sky is above us, the earth below. To the north is the Great Lake; to the east are mountains; to the south and west are more plains. There are the horses and antelopes, and the sun’s home. In summer, even at night the sky is white and the earth sweats. We gather fat black berries in our skin bags. When the sun, burning red, floats over the grass, I know that it watches me too. Through winter, the sun never climbs high, but slumbers near the horizon. Only the moon and snow are white; everything else is black. In blankets of furs, with numb fingers, mother starts a fire with flint and dry grass. Carrying spears, father, brother and I go outside. The wind cuts into our bones. We greet the spirit of the woods, and it answers. A deer speaks to my heart, “Here I am.” And there it was. I throw my spear and it falls. I thank the deer and send its soul back to the world of beasts. Tonight the moon is like the edge of a fingernail. Soon it will be the blackest black of winter. Then we will eat meat and sing. The remains we dry in the smoky hut. May we see the summer again!
Three abstracts of my working file on the Proto-Nastowic language.
The Swira language, spoken by nomadic herders on the northern frontier of Tepat, is one of the languages of the many “barbarian” tribes ringing Tepat, and belongs to the widespread Nastowic language family. This family is named for two of its branches, Nasic and Towic, whose names both mean “people.” Swiric comprises another branch, as do Para-Towic and the extinct historic language Wepum. Proto-Nastowic is a head-final, head-marking, compositionally polysynthetic language. Phonologically, Proto-Nastowic includes a series of labialized dorsal consonants, a distinction between lenis and fortis consonants, and a hierarchy of vowel strength that interplays with stress assignment and other processes. It has an extensive system of personal prefixes, noun incorporation and verb-root compounding, allowing very long verbal forms (compositional polysynthesis), and possessor-raising incorporation with several applicatives. Word order is flexible with a default of SOV. Nominal modifiers precede heads while verbal modifiers follow, because of basic SOV order combined with internally-headed relative clauses. Most of these features apply in daughter languages as well. However, the speakers’ well-known pastoral culture evolved later than the protolanguage itself.
The Proto-Nastowic language is the reconstructed ancestor of the Nastowic language family, which includes Swiric, Towic, Nasic, and the extinct Wepum language. It was a polysynthetic, head-final, and head-marking language with extensive verb morphology, including noun incorporation, personal prefixes, and applicatives. Its phonological system featured labialized dorsal consonants, a distinction between lenis and fortis stops, and a vowel hierarchy that influenced stress and phonotactic rules. Proto-Nastowic had a flexible SOV word order, with nominal modifiers preceding nouns and verbal modifiers following verbs. The language likely originated among semi-nomadic hunter-gatherers living north of modern Tepat, with vocabulary reflecting their environment and way of life. Over time, Proto-Nastowic diversified into its daughter languages, influenced by interactions with neighboring linguistic groups.
Proto-Nastowic is the reconstructed ancestor of the Nastowic language family, which includes Swiric, Towic, Nasic, and the extinct Wepum language. It was a polysynthetic, head-final, and head-marking language with an extensive system of verb morphology, including personal prefixes, noun incorporation, and multiple applicatives that allowed highly complex verbal constructions. The language had a flexible SOV word order, with nominal modifiers preceding nouns and verbal modifiers following verbs, likely due to the presence of internally-headed relative clauses. Phonologically, Proto-Nastowic featured a distinction between fortis and lenis consonants, labialized dorsal consonants, and a hierarchical vowel system influencing stress patterns. Its vocabulary reflected the semi-nomadic lifestyle of its speakers, who inhabited the northern frontier of Tepat, transitioning seasonally between mountain and grassland environments. Early Proto-Nastowic lacked terms for agriculture and horse domestication, though later borrowings suggest cultural shifts. As it evolved, Proto-Nastowic diversified into distinct branches, with Towic becoming the most widespread. The language also exhibited contact-induced changes due to interactions with neighboring Taknic and Yuktepatic languages. These influences, along with internal phonological and grammatical developments, shaped the emergence of its descendant languages, each preserving and innovating upon Proto-Nastowic structures.