OnWords @onwordsblog - Tumblr Blog

“A squawking man’s biggest regret” and other fake TED talks: How I made a title generator

As part of a project for a Natural Language Processing class, I decided to build a generator that invented fake titles of TED talks. The trick was to get titles that made at least some sense, sounded like titles, and were made of pieces found within a set of TED titles available online. I was determined to make the generator learn from preexisting data only; anyone can write a bunch of Mad-Lib-style templates like “10 things you never knew about ____” or “The surprising beauty of ______” or “The future of ________” and plug in random words. I wanted the generator to know on its own which syntactic structures were allowed and which weren’t, meaning it could invent a structure without being given a complete title minus one noun.

First attempt (abysmal)

My first attempt was really straightforward. I used the Stanford Part-of-Speech Tagger to tag all of the existing TED titles and store the possible tag sequences. For example, “Why I do theater” is tagged as:

Wh-determiner, personal pronoun, base-form verb, singular noun

I also stored all of the words of all the titles in a directory such that they could be accessed by their tags: separate arrays of singular nouns, prepositions, -ing verbs, etc. To generate a title, all I had to do was pick a random sequence of tags, and then for each tag in that sequence, pick a random word. Makes sense, right?
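For the curious, here's a minimal sketch of that first approach in Python. It is not my original code: the tag names (WH, PRON, DET, and so on) are simplified labels I made up for illustration, the two titles are tagged by hand rather than by the Stanford tagger, and build_model and generate are hypothetical helper names.

```python
import random
from collections import defaultdict

# Toy stand-in for the tagger's output: (word, tag) pairs, tagged by hand.
# The tag names are simplified labels, not the Stanford tagger's real tag set.
TAGGED_TITLES = [
    [("Why", "WH"), ("I", "PRON"), ("do", "VERB"), ("theater", "NOUN")],
    [("The", "DET"), ("beautiful", "ADJ"), ("math", "NOUN"),
     ("of", "PREP"), ("coral", "NOUN")],
]


def build_model(tagged_titles):
    """Collect every tag sequence, plus a lookup table of words by tag."""
    sequences = []
    words_by_tag = defaultdict(list)
    for title in tagged_titles:
        sequences.append([tag for _, tag in title])
        for word, tag in title:
            words_by_tag[tag].append(word)
    return sequences, words_by_tag


def generate(sequences, words_by_tag, rng=random):
    """Pick a random tag sequence, then a random word for each tag in it."""
    sequence = rng.choice(sequences)
    return " ".join(rng.choice(words_by_tag[tag]) for tag in sequence)


sequences, words_by_tag = build_model(TAGGED_TITLES)
print(generate(sequences, words_by_tag))
```

With only two titles there isn't much room for variety, but you can already see the weakness: nothing stops the generator from swapping any noun for any other noun.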

Unfortunately, it turned out not to be that simple. This model didn’t know enough about which words could follow one another. “The beautiful math of coral” has the same tag sequence as “Any wrong cancer upon secrets”, which is nonsense.

Fixing the tagger*

The problem with the random-sequence-random-words approach was the lack of features in the tags. Sure, past tense verbs were distinguished from present tense verbs and plural nouns were distinguished from singular nouns (and that’s a huge deal and very impressive), but the constraints just weren’t enough to generate anything that wasn’t a big bowl of word salad. I tried to identify what the recurring problems were in tagging and figure out ways around them. I noticed first that a lot of function words (things like prepositions, “the”, “a”, “what”, etc., that don’t carry most of the meaning in the sentence) are not interchangeable but are treated as such by the tagger. For example, all prepositions held the same status and all personal pronouns held the same status. However, “upon” just doesn’t work the same as “of” (as in “Any wrong cancer upon secrets”). For that matter, “any” does not work the same as “the”, even though they’re both technically determiners according to the tagger. “The wrong cancer of secrets” is much closer to making sense than “Any wrong cancer upon secrets”. This kind of thing happened a lot. To fix it, I took a list of “stop words”, which are very common words that show up all the time and are often removed when doing any kind of text analysis. I then had every stop word tagged as itself. “The” is now just a “the” and “any” is an “any”. Likewise, before this change I could have gotten a sequence like “I has” because “I” and “he” were indistinguishable as personal pronouns. No longer!
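Here's roughly what the stop-word trick looks like as code. Treat it as a sketch under assumptions: the stop-word list below is a tiny made-up sample (a real list is much longer), the tag names are simplified labels, and retag is a hypothetical helper name.

```python
# Hypothetical, tiny stop-word list for illustration only.
STOP_WORDS = {"the", "any", "of", "upon", "i", "he"}


def retag(tagged_title, stop_words=STOP_WORDS):
    """Give every stop word its own lowercased form as its tag."""
    return [
        (word, word.lower() if word.lower() in stop_words else tag)
        for word, tag in tagged_title
    ]


title = [("The", "DET"), ("beautiful", "ADJ"), ("math", "NOUN"),
         ("of", "PREP"), ("coral", "NOUN")]
print(retag(title))
```

After retagging, a slot that used to accept any preposition only accepts the exact word that appeared there in a real title, so “of” can no longer be swapped for “upon”.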

This made things better, but there were still some issues that the tagger couldn’t handle. The two big ones were verb transitivity and mass/count noun distinction. Verb transitivity has to do with whether a verb can take a direct object (whether you can put a noun directly after it to express that the verb is done to that noun). For example: you can kick a ball or pet a cat, but you can’t “arrive” anything. You can arrive AT a place, but that’s an indirect relationship, not a direct one. “Sit” and “lie” are other good examples. The Stanford tagger cannot tell transitive verbs from intransitive ones. This means the generator can produce a sequence of an intransitive verb followed by a noun: “How I arrived my piano” (pianos come up a lot in TED talks for some reason). My adhesive-bandage solution (look Ma, no buzz marketing!) was to create a new tag for transitive verbs. Any verb that was immediately followed by an adjective, a noun, or a determiner (e.g. “the” or “a”) was marked as Transitive.
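As a rough illustration of that heuristic, here's a sketch in Python. The tag names (VERB, VERB_TRANS, ADJ, NOUN, DET) and the function mark_transitive are made up for this example, and it works on plain tags, ignoring the stop-word retagging for simplicity.

```python
# Simplified, hypothetical tag names (not the Stanford tagger's real tag set).
VERB_TAGS = {"VERB"}
OBJECT_START_TAGS = {"ADJ", "NOUN", "DET"}


def mark_transitive(tagged_title):
    """Retag a verb as transitive if the very next word could start an object."""
    result = []
    for i, (word, tag) in enumerate(tagged_title):
        has_next = i + 1 < len(tagged_title)
        next_tag = tagged_title[i + 1][1] if has_next else None
        if tag in VERB_TAGS and next_tag in OBJECT_START_TAGS:
            tag = "VERB_TRANS"
        result.append((word, tag))
    return result


title = [("Why", "WH"), ("I", "PRON"), ("do", "VERB"), ("theater", "NOUN")]
print(mark_transitive(title))
```

It's crude: a verb that only ever shows up at the end of a title stays unmarked even if it could take an object, so the generator simply never gives it one.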

The next challenge was mass nouns and count nouns. English (like most languages) has different types of nouns. Count nouns are anything you can put after an indefinite article “a” or “an”. A book. A laptop. A squirrel. Mass nouns are more like substances that are harder to separate into discrete items that you can count. Stuff. Water. Pudding. Independence. We can probably find a scenario in which it’s ok to say “5 puddings”, but it would need to be understood that we mean “cups of pudding” or something.

If you don’t have this distinction in a language generator, you get phrases like “How much squirrel” or “a lot of laptop”. I worked around this a lot like how I dealt with the verbs. Any noun that came after a number or an indefinite article got marked as a count noun, and everything else was defaulted to a mass noun. A lot of nouns can go either way—“I have a fear of flying” and “I was paralyzed by fear” are both fine—so unless a noun shows up behaving as both a mass noun and a count noun, my generator won’t know that it could be either. But such is the problem with generators like this. They only see what’s in the training corpus.
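Here's a small sketch of the noun heuristic, with the usual caveats: the tag names (NOUN, NUM, NOUN_COUNT, NOUN_MASS) and the function mark_nouns are hypothetical, and I'm assuming “came after” means “came immediately after”.

```python
# Simplified, hypothetical tag names for illustration.
INDEFINITE_ARTICLES = {"a", "an"}
NUMBER_TAG = "NUM"


def mark_nouns(tagged_title):
    """Mark a noun as count if it directly follows a number or a/an, else mass."""
    result = []
    for i, (word, tag) in enumerate(tagged_title):
        if tag == "NOUN":
            prev_word, prev_tag = tagged_title[i - 1] if i > 0 else ("", None)
            is_count = (prev_word.lower() in INDEFINITE_ARTICLES
                        or prev_tag == NUMBER_TAG)
            tag = "NOUN_COUNT" if is_count else "NOUN_MASS"
        result.append((word, tag))
    return result


print(mark_nouns([("a", "DET"), ("fear", "NOUN")]))    # "fear" marked as count
print(mark_nouns([("by", "PREP"), ("fear", "NOUN")]))  # "fear" defaults to mass
```

If both phrases are in the corpus, “fear” ends up in both word lists, which is exactly the “behaving as both” case.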

Now that there were more types of tags than before, the generator started giving me much better results. The longer titles were still sometimes word salad, but the shorter ones were pretty reliably wonderful. Check out some of the results below.

Stay tuned! I’m going to post the actual generator to the web shortly so you can create your own fake TED titles!

*For those of you with some NLP knowledge, I should mention that I tried to use a trigram language model to fix the problems with the tagger, but it didn’t improve the results for a number of reasons I don’t want to get into here.

Generated TED talk titles (personal favorites in bold)

The high childhood Animations from tiny people The social truth of galaxies An electric government The crap between killing and taking The college of mental fractals child-driven cyborgs and new remix The innovation of kids The mobile and new legos of designers that can learn Sculpting the power of a law Hear your city Why we need new cats Why voices should protect the thinking The ecosystem to fission How dead africa matter can be new Complicated origins and the end of science Illusions from jeopardy business lessons An ugly chief keyboard What means more than your roots The time-lapse war cost Anatomy + democracy The true architectural gift Understanding’s global government Fashion, the democracy music and humanity on the secret tv How to learn a morality A true poem for light reason ... from a mother My 1000 games of species Refugees of life Building for climate How we can embrace our mistakes Beyond the underwater science A squawking man's biggest regret Carseats designed from capitalism Make grownups the next poverty! Food’s music Get your English piano Reinventing the innovative Missing the city Women through cognitive power That terrifying autism of law 3 scenes in the synthetic past A new bold salvation Green college billions and robots The lost fashion in wonder mathematics How instruments do global online The synthetic spectroscopy underneath wireless power A surprising web of the strange music Universal mesh The world by attitudes Endangered data, medical nonprofits Chocolate, inventory, and the ollie about tourism Found a consciousness? Domesticate inventing A broken time The reasons in my peace

#linguistics #nlp #computational linguistics #ted talks #why we need new cats

Solved: Turkeys are from India because Guinea fowl are from Ethiopia

The French word for “turkey hen” is dinde. This alone is not that interesting, I know, but give me a minute.

I remember learning in a French class once that dinde is a respelling of d’inde, or d’Inde, more precisely. D’Inde is a contraction of de + Inde, literally “From India”. So the French word for turkey is literally “from India”. That’s weird. Turkeys have nothing to do with India. Aren’t they a New World bird? Otherwise, wouldn’t the story of the First Thanksgiving be a total lie?! (It is, but not because of turkeys.)

So how did a bird that comes from the Americas come to be named after India? I had wondered this for years but it hadn’t bothered me enough to go looking for an answer. Fortunately, the answer found its way to me without any effort on my part (I love when that happens!). It arrived via Dan Jurafsky, a computational linguist and the author of this wonderful book. Jurafsky dedicates an entire chapter to turkey and how it came to be named after one country in English and another country in French (not to mention several others in languages around the world), even though it comes from neither Turkey nor India.

It all has to do with the fact that we have trouble keeping our birds straight. The Portuguese brought New World turkeys back to Europe, but around the same time, guinea fowl were being imported from Africa, primarily Ethiopia. Many of the traders importing these exotic birds were Turkish, and the birds became known as “hens of turkey” and similar descriptors. Guinea fowl bear a passing resemblance to turkeys, and before long these two different birds were conflated in the language.

So that answers the Turkey question, but what about India? Well, it turns out that in the 16th century, “India” could refer either to the area known today as India, or to Ethiopia. Oh, right, of course. This isn’t surprising considering the indigenous nations of the New World were also dubbed “Indians”. It seems like whenever Europeans bumped into a land mass they just assumed it was India. Keeps things simple, I guess, like calling every guy you meet “Kevin”.

#turkey #language #history #guineafowl

thisisnotharmless

“No problem” vs “you’re welcome”

Speaking of linguistics, there’s one particular linguistic tic that I think clearly separates Baby Boomers from Millennials: how we reply when someone says “thank you.”

You almost never hear a Millennial say “you’re welcome.” At least not when someone thanks them. It just isn’t done. Not because Millennials are ingrates lacking all manners, but because the polite response is “No problem.” Millennials only use “you’re welcome” sarcastically when they haven’t been thanked or when something has been taken from/done to them without their consent. It’s a phrase that’s used to point out someone else’s rudeness. A Millennial would typically be fairly uncomfortable saying “you’re welcome” as an acknowledgement of genuine thanks because the phrase is only ever used disingenuously.

Baby Boomers, however, get really miffed if someone says “no problem” in response to being thanked. From their perspective, saying “no problem” means that whatever they’re thanking someone for was in fact a problem, but the other person did it anyway as a personal favor. To them “You’re welcome” is the standard polite response.

“You’re welcome” means to Millennials what “no problem” means to Baby Boomers, and vice versa. The two phrases have converse meanings to the different age sets. I’m not sure exactly where this line gets drawn, but it’s somewhere in the middle of Gen X. This is a real pain in the ass if you work in customer service because everyone thinks that everyone else is being rude when they’re really being polite in their own language.

hinallie

Something interesting to note is also the more literal meaning behind these two phrases and how they themselves differ and oppose each other

‘No problem’, coming from a millennial’s mouth, within the context of helping someone – whether it be holding a door open/picking up something someone may have dropped/etc. – and, naturally, being thanked for it, implies that the kind gesture was indeed, not a problem, that it was just the thing to do, that they were happy to help and that no thanks was really necessary.

While a Baby Boomer’s ‘You’re welcome’ in contrast, says something miles different, it actually highlights the fact that the person went out of their way to help someone; almost brings attention to it in a way, saying 'Yeah, I helped you, I did you this favor I accept your thanks.’ which, malicious intent or not, is strikingly different than the millennial downplay of their act of kindness for the sake of helping someone.

allthingslinguistic

The broad class that these expressions belong to is that of phatic expressions, formulaic expressions that are used less for their literal meaning and more for their social contribution. Others include “hi/hello/good morning” and “how do you do?/how are you?/what’s up?/how’s it going?” (Fun experiment: next time someone asks you “what’s up?” try replying smoothly “good, how are you?” – or the inverse, replying “not much, what’s up with you?” to “how are you?” – and see if they even notice.)

The sub-class of phatic expressions involving “you’re welcome” and “no problem” is known as minimizers, and “no problem” is indeed newer (although it seems to date to the 1980s, so it’s not a perfect correlation with Millennials). Other minimizers are “no worries”, “sure”, “my pleasure”, “okay”, “anytime”, “don’t mention it” and others in this list. Minimizers often vary by age, geographic location, and other factors: “you’re welcome”, for example, isn’t so old either – the OED’s first citation for it is from 1907 (“my pleasure” and “don’t mention it” are older). Another which I’ve noticed Americans using over the past few years is “mmhm” or “uh-huh”, which I’ve never heard in Canada as a minimizer. That’s about all the research I can find on them at the moment – anyone know any more detailed studies?

onwordsblog

A related anecdote:

Once I sneezed, said “excuse me”, and my friend (a native Spanish speaker) sitting next to me absent-mindedly muttered “No problem”. I then erupted into unstoppable laughter. “No problem” is his default response to “sorry” or “excuse me” or “thanks”, which usually works fine, but the starkness of its infelicity in the context of a sneeze was hilarious to me.

Just goes to show that phatic expressions can sometimes suddenly regain their literal meaning if they’re used infelicitously. I recently had a conversation with some friends over whether it’s appropriate to say “no problem” or “don’t worry about it” in response to a customer’s “thank you” when working in retail. I strongly feel that it is not, because it suggests that it might have been a problem for me to do my job. “No problem” is totally phatic elsewhere, but in some cases it strikes the listener as very literal.

#linguistics #no problem #bless you

The 7 reasons you keep mishearing that Taylor Swift lyric

Everyone’s current favorite mondegreen shows up around the 58-second mark of the hit song “Blank Space”:

https://www.youtube.com/watch?v=e-ORhEE9VVg#t=53

The real lyric is “Got a long list of ex-lovers,” but I (and most of the Internet, apparently) hear “got along with Starbucks lovers.” This is a totally understandable mistake, and the reasons can be boiled down to 7 simple linguistic truths:

1. Taylor’s prosody (intonation and stress) is not what you would expect for that sentence.

Normally, short words that don’t have a lot of meaning like of are not stressed in a sentence. Imagine saying “I have a list of things to buy.” You probably won’t stress of in that sentence. The weird thing is that the Blank Space melody kind of requires of to be stressed in a way that isn’t that natural to native speakers. Using weird prosody is a pretty surefire way to screw with other people’s perception and comprehension of what you’re saying.

Since of is stressed, your brain expects it to be a more important word than of, which makes it more likely that the st in list will be heard as part of the following syllable, forming something like stuv, which is not a word, but your brain wants to turn it into one.

So far we have neutralized one difference between the real lyric and the misheard one by re-assigning the st at the end of list to the beginning of of, turning the original lyric into this:

Got a long lih stuv ex-lovers

2. Vowels tend to get closer to each other when you sing them.

This might need a visual aid. When linguists talk about vowels, we visualize them like this:

This diagram VERY ROUGHLY represents your mouth if you were facing to the left. The [i] vowel in the upper left-hand corner is pronounced like the vowel in keep, and is referred to as a “high-front” vowel, because your tongue is pushed forward and up toward the roof of your mouth. The [æ] near the bottom left corner (the vowel in cat) requires you to lower your jaw and push your tongue forward while keeping it low in your mouth. That’s why it’s called a “low-front” vowel.

When you sing, you probably pronounce vowels a little closer to the center of this space than you would when you’re speaking. Imagine singing a word like feet and exaggerating that ee sound by making it as high and front as you can. It would sound very silly and not very musical. Instead, you might naturally pronounce it a bit more like the vowel in fit.

So the vowels we’re working with here are [ɛ] as in ex and [ʌ] as in ucks, which are not that close to each other but they’re also not that far away. When you’re singing a word with one of these sounds, it won’t sound as distinctly like itself as it would if you spoke it. It’s totally believable that these vowels could meet in the middle and be mistaken for each other in a song, so let’s rewrite that line again:

Got a long lih stuv ucks-lovers.

3. R sounds aren’t as pronounced when you’re singing

This is related to the last issue. American English speakers tend to soften r sounds when they’re singing, probably because it sounds more melodic. Combined with the centralizing vowel problem discussed above, this accounts for the confusion between the syllables stuh and star. If you were actually singing about Starbucks, that r would be softened or maybe not there at all, so it’s not unreasonable to imagine it. That brings us here:

Got a long lih starv ucks-lovers

4. L and W are pretty similar.

No, really! Imagine a little kid who pronounces little kind of like widow. Or a Cockney British speaker who says fill sort of like fiww. It’s not a coincidence that l and w converge in more than one situation. They’re acoustically pretty similar.

So now we can fairly swap out that l in lih for a w:

Got a long wih starv ucks-lovers

5. V and B are pretty similar.

Say a v sound and hold it. Notice that your top teeth are making contact with your bottom lip. Now say bob. Notice that your top and bottom lips are making contact with each other. The fact that at least one lip is involved in both sounds makes them sound similar to us. Plus, they both require you to vibrate your vocal cords, so the acoustic signals produced by v and b are pretty alike. In fact, some languages don’t have a v sound and when speakers encounter one they’ll replace it with a b because it’s as close as they can get. If your name is Kevin and you meet a Japanese speaker, you might be called Kebin. So that takes care of the v/b problem:

Got a long wih starb ucks-lovers

6. We don’t always pronounce the end of “with” anyway.

Unless you’re really concentrating on your speech, you’ll probably shorten the word with at least a little. That might just mean shortening all the sounds, but that last th sound may also get slurred into whatever sound comes afterwards, basically erasing it, so hearing with if someone says wih is not unreasonable:

Got a long with starb ucks-lovers

7. Got a long = got along, and starb ucks = starbucks

Got along with Starbucks lovers

And we’re done. So if you fell victim to this too, don’t feel bad. They’re basically the same sentence.
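If you like seeing things laid out mechanically, here's a rough sketch of the whole chain in Python. It treats each of the seven steps as a plain string substitution, which is a huge simplification (your brain is not doing find-and-replace), and the names STEPS and mishear are made up for this illustration.

```python
# Each pair is (what was sung, what gets heard), in the order of the steps above.
STEPS = [
    ("list of", "lih stuv"),         # 1. stressed "of" steals the "st" from "list"
    ("ex-", "ucks-"),                # 2. sung vowels drift toward each other
    ("stuv", "starv"),               # 3. a softened r is easy to imagine
    ("lih", "wih"),                  # 4. l and w sound alike
    ("starv", "starb"),              # 5. v and b sound alike
    ("wih", "with"),                 # 6. the end of "with" is often dropped anyway
    ("a long", "along"),             # 7a. re-draw the word boundaries
    ("starb ucks-", "Starbucks "),   # 7b. same again
]


def mishear(line):
    """Apply each substitution in order and return every intermediate version."""
    stages = []
    for sung, heard in STEPS:
        line = line.replace(sung, heard)
        stages.append(line)
    return stages


for stage in mishear("Got a long list of ex-lovers"):
    print(stage)
```

The last line it prints is “Got along with Starbucks lovers”.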

Mass nouns and count nouns

In English, some nouns can be pluralized and some can't. In order for a word like "water" or "Jell-o" to be pluralized as "waters" or "Jell-os", we need to redefine them to mean something you can count, like bottles of water or boxes of Jell-o. Unless you're talking about a definite denomination of water, you can't really say "5 waters" and have it mean anything. That's because water is a mass noun as opposed to a count noun. The difference is pretty easy to remember: mass nouns can only have more or less mass, but they're not discrete entities you can point to. What is "a sand" or "a magnesium"? You have to invent something very context-specific to get those to mean something. Count nouns, on the other hand, can be counted. I know. Linguists are sometimes not very intuitive with their terminology, but this isn't one of those times.

What's really interesting here is that words can start as one kind of noun and gradually change their type. For example, fill in the blanks here:

The national news media ___ using scare tactics.

The data I just collected ___ going to change science forever.

Did you want to say the national news media IS or the national news media ARE? Probably is. But the more conservative prescriptive grammarians would tell you that's wrong. There was a time when "media" was simply the plural of "medium." The news media were TV news, newspapers, radio news, etc. If an artist worked in photography, painting, and sculpture, she didn't work in several mediums; she worked in several media (this use hasn't completely gone away, but it's not as common as it was).

The same is true of "data." It was once just the plural of "datum," which is now more commonly referred to as a "data point." In my experience, no one really says "datum" anymore, which is a shame because it's a really nice little word, and why use two words when you can use the one you already have?

Media and data were once plurals of count nouns, but they have almost completely turned into mass nouns that don't really have a commonly used singular form. I bet eventually the count noun use will be so archaic even the most pedantic grammar nerds won't hold their friends to that rule. In the meantime, we can still enjoy jokes like this:

http://xkcd.com/1429/

When people are picky about "less" and "fewer," it's the mass/count distinction they're sensitive to. You can say "less water" but someone decided you should use "fewer" with count nouns. Much and many follow a similar pattern:

Yesterday there were many peacocks and much rice pudding, but today there are fewer peacocks and less rice pudding.

(Do you think I was the first person in history to put that sentence on the Internet?)

And one more thing: while some nouns can't really be pluralized, some must be pluralized. These are called "pluralia tantum" nouns and they include things like pants and scissors and pliers. Pluralia tantum is only an obvious thing to call these nouns if you know Latin because it means "plural only." Not as clear as "mass" or "count."

Linguistic variation and the birth of sociolinguistics

In the early 60s, a young Columbia University graduate student named William Labov was investigating the differences in dialects in New York City. The study he conducted became one of the most famous studies in the history of linguistics and was instrumental in establishing sociolinguistics (the study of language as a social phenomenon) as an academic discipline. *

Here’s what Labov did: he went into 3 different department stores in Manhattan. He chose them carefully, with the intention of capturing the speech of distinct socio-economic strata. The chosen stores were Saks Fifth Avenue, bastion of the wealthy and educated; Macy’s, a department store targeting solidly middle-class consumers; and Klein’s, a discount department store that no longer exists. At each establishment, Labov attempted the same interaction with a sales clerk. He entered the store, studied the directory, located a clerk, and asked for the location of an item he suspected would be on the 4th floor of the building. The clerk would reliably provide a response that included the phrase “fourth floor.” Labov would then take note of whether each of the “r” sounds in “fourth floor” was pronounced. Then, pretending not to have heard, he would ask the clerk to repeat his or her answer, thus yielding another token of “fourth floor” pronunciation, only this time spoken more carefully.

The purpose of asking twice was to determine whether there are differences between casual speech and careful or “emphatic” speech. When repeating a phrase to someone who didn’t hear you the first time, you are very likely to enunciate very consciously, thereby amplifying the qualities of the sounds you’re producing. In a canonical New York City accent, “fourth floor” might sound something like “fawth flaw,” sometimes with each word taking up 2 discernable syllables (“Fo-ahth flo-ah”). If a clerk pronounced it this way in both spontaneous and careful speech, he or she is consistently “r-less.” Sometimes, however, a clerk would fairly predictably leave out the “r” in the spontaneous answer but pronounce it in the careful answer.

Who was most likely to do that, do you think? Macy’s. Why? The short answer is that middle class people have more to prove.

The Klein’s clerks were the least likely to pronounce the “r” at all, even in careful speech. The clerks at Saks tended to pronounce the “r” at all times, but when they didn’t, they weren’t consciously trying to correct themselves when they concentrated on their speech. The Macy’s clerks, however, betrayed their NYC dialects when they answered without thinking (“fawth flaw”), but when forced to concentrate on their enunciation, they produced or even emphasized the “r” (“fouRth flooR”).

Labov called this phenomenon “hypercorrection,” and it’s still very much a salient concept in language and cultural studies. Those in the middle classes tend to be more aware of their speech and behavior with respect to societal norms and are more likely to calibrate what they say and do accordingly, sometimes excessively (hence, hyper-correction).

For example, some people tend to use “I” when, in a strictly prescriptive grammatical sense, “me” is the appropriate pronoun (“Keep this between you and I.” “Join my husband and I for dinner.”). No grammar teacher taught those people to use “I” in that context, but it sounds more polite and educated, even though it violates the grammatical rules they are probably trying to obey. The most educated are less likely to commit this offense because they have actually been taught not to, and the least educated are generally less concerned with sounding educated at all, so they wouldn’t either. It’s those in the middle who feel the need to position themselves with respect to what they perceive as a set of rules.

Look out for other examples of hyper-correction. They’re everywhere!

*The resulting paper is available from Stanford:

http://web.stanford.edu/class/linguist62n/labov001.pdf

In my last post, I claimed that [tʃ]** (which we usually spell “ch”) is really just [t] and [ʃ] (“sh”) pronounced in rapid succession, and [dʒ] (spelled a lot of ways, but often “dge” or “j”) is just a [d] and a [ʒ] (“zh” like in “mirage”). First of all, it’s important to see that these are parallel sounds. [dʒ] is the voiced counterpart of [tʃ]. They are pronounced the same way, except when you say [dʒ], your vocal cords are engaged. Switch back and forth between saying “char” and “jar.” The two words should feel about the same in your mouth, but if you put your hand to your throat, you should feel your vocal cords start vibrating earlier in “jar.”

** IPA symbols are often written in brackets to show that you're transcribing speech.

But that’s not really what this is about. That’s just to emphasize the idea of a voicing contrast. [d] is the voiced counterpart of [t], [ʒ] is the voiced counterpart of [ʃ], and when you put them together into affricates (stop-fricative sequences produced as one sound) you get [dʒ], which is the voiced counterpart of [tʃ]. Got it?

Now, you can take my word for this, but you don’t have to. I’ll prove it. First, here’s a recording of me saying “the church”:

The IPA transcription of “church” is [tʃɝtʃ]. Don’t worry about that weird 3 with a tail in the middle of the word -- it’s just the “er” vowel’s IPA symbol. The important thing is that the word has both an initial and final voiceless alveo-palatal affricate [tʃ].

Now, I’m going to take that recording and reverse it. If “the church” is transcribed as [ðə tʃɝtʃ], then the reversed sound should be transcribed as [ʃtɝʃt əð]. If you were to sound out those IPA symbols, it might sound sort of like “shtershteth.” Listen for it:

It’s hard to listen to backwards speech and parse the sounds apart, but try again. Listen to it a few times and see what you hear.

I definitely hear “Shter-shteth.” The “eth” at the end is the reversed “the,” and the vowel in the middle is the same forwards and backwards. The important thing is the “shtersht.” That’s “church” backwards, but it’s much easier to hear the “ch” sounds as “t+sh” when they’re played in reverse.
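If you want to check the reversed transcription without trusting my eyes, here's a tiny sketch in Python. It leans on one assumption: every symbol in the transcription is a single character, with no tie bars or combining diacritics, so flipping the string flips the sounds. The function name reverse_transcription is made up for this example.

```python
def reverse_transcription(ipa):
    """Reverse an IPA string symbol by symbol.

    Only valid when each symbol is one character (no tie bars, no diacritics).
    """
    return ipa[::-1]


# "the church", with each affricate written as its two component sounds
print(reverse_transcription("ðə tʃɝtʃ"))  # ʃtɝʃt əð
```

Notice that each [tʃ] comes out as [ʃt], which is the “sht” you hear in the reversed recording.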

Here’s a similar experiment, but with “judge.” First, a normal recording of “a judge”, which is transcribed as [ə dʒʌdʒ]. The [ə] is the short vowel we often use for the article “a.” [ʌ] is the “uh” vowel in “judge”:

Now we’ll reverse this sound. Again, listen for the compound segments of “zh” and “d”:

Do you hear something like “zhduh-zhduh”? I do. If you don’t, I don’t know how else to prove this to you without covering your mouth in electrodes so you can see exactly how you pronounce things.

Actually, even if you do believe me, try to get some electropalatography done anyway, if you have the opportunity. You might discover you don’t say your t’s or s’s the same as everyone else, giving you a brand new thing to be insecure about!

#phonetics #IPA #acoustics

An Introduction to the IPA

No! Not beer. Sorry. Feel free to enjoy an IPA while learning about the IPA (I recommend this one), but today I’m only writing about the International Phonetic Alphabet—the system of symbols that linguists use to represent the sounds of spoken language.

Are you wondering why we need this if we already have an alphabet? I don't blame you.

A few reasons. First, some languages don’t have alphabets and we need a way to record how people speak those languages too. Second, alphabets are obviously not universal (Roman vs. Cyrillic vs. Hebrew, to cite some clear examples), and people who study language as a universal human phenomenon don’t usually want to learn their way around every language on the planet. On that subject, the IPA is useful for people who don’t care about language at all. Consider classical vocalists, for example, who don’t necessarily speak the language a choral or opera piece is written in but need a system to work through the sounds they’re expected to pronounce.

Most important, probably, is how spectacularly useless (and often counterproductive) a writing system can be for the study of pronunciation. If you didn’t know better, would you ever think that “laughter” and “daughter” didn’t rhyme? Or that “they’re,” “their,” and “there” were all pronounced the same way? How close would you get to the correct pronunciation of “enough” or “knew”? If you’re trying to learn about how people speak, you need a consistent and accurate method to record it in writing, and written language is not that method. That’s what the IPA does for us.

This post will only cover English consonants. I’ll cover vowels and symbols not used in English in subsequent posts.

Here’s a handy chart of the IPA symbols that represent all consonants in the English phoneme inventory. I gave examples of any sounds that aren’t easily guessed based on their symbols.

Yes, I know! This doesn’t quite make sense to you! And what’s a phoneme inventory?! We’ll get there. One thing at a time.

First: how to read this chart. To understand IPA, you need to keep in your mind the fact that all sounds we make are just manipulations of airflow made with certain parts of our vocal tract. The vocal tract is anything involved in producing speech sounds, which basically means your lungs and everything air passes on its way out of your mouth or nose. We use all of that stuff to block or encourage airflow as it passes through certain places.

The top row (“labial” through “glottal”) lists places of articulation in the mouth. These correspond to where airflow is constricted. Some of them are self-explanatory, but I’ll go through them all:

Labial: The place where your lips touch each other. These sounds are all made by closing lips either partially or completely to disrupt or block airflow out of the mouth.

*The w is a labial sound but it also requires constriction farther back in the mouth, at the velum. It’s articulated in two places at once, which is why it’s not as easily placed on the chart as some other sounds.

Labio-dental: Where your top teeth touch your bottom lip.

Dental: Your teeth. Duh. But this specifically refers to the tip of your tongue coming in contact with your top teeth.

Alveolar: The alveolar ridge is that strange bit of topography on the roof of your mouth directly behind your top teeth. If you run your tongue from your teeth back along the roof of your mouth, you’ll notice a steep ridge. Alveolar sounds are produced with the tongue there.

Palatal-alveolar: This is just behind the alveolar ridge. It’s at the front of your hard palate, which is the plate across the roof of your mouth.

Palatal: You can probably guess that palatal sounds are pronounced with the tongue at the hard palate. If your tongue can’t feel the slope of the alveolar ridge, you’re no longer in palatal-alveolar territory. The palate goes about as far back as your tongue can reach.

Velar: These sounds are made with the back of the tongue pressed against the velum, which is your soft palate. The velum is a little flap in the back of your mouth that is in either a raised or a lowered position. A lowered velum (resting position for most people) allows air to travel into the nasal cavity and out the nose. A raised velum seals off the nasal cavity and forces the air through the mouth. Here’s a creepy picture of a cross-section of someone’s head to illustrate that:

http://www.unil.ch/sli/page98921.html

Glottal: These sounds are made by constricting your glottis, which is the space between your vocal cords. When you make an “h” sound, for example, you’re pushing air through vocal cords that are constricted enough to create more turbulence (and therefore noise) than you would normally. Your tongue is not involved here because it can’t reach that far back into your throat.

Now that we’ve covered place of articulation, let’s talk about that vertical axis (“stop” through “glide”). These refer to manners of articulation, which means how air is manipulated. When you combine a place of articulation with a manner of articulation you get a unique sound that we can (usually) distinguish from other combinations of place and manner.

Stop: These stop all airflow at the given place of articulation. They’re also sometimes called “plosives” because they tend to release a burst of air when we let the airflow through again. This is the result of pressure building up behind the point where we form the stop.

Fricative: These sounds result when airflow is constricted but not entirely stopped. When we make fricatives, we create a very narrow passage through which air may pass, but it creates a lot of turbulent noise.

Affricate: An affricate is a stop followed immediately by a fricative at the same place of articulation. The two we have in English are tʃ (the first and last sound in “church”) and dʒ (the first and last sound in “judge”). Those are really two sets of sounds squished together, but speakers think of them as one sound each, so they are treated in the IPA as such. You may not believe me that they’re compound sounds, but my next post is going to prove that they are.

Nasal: Nasal sounds are made by keeping the velum lowered so air may be released through the nose. Have you ever had a cold and tried to call your mom but it ended up sounding sort of like “Bob”? That’s because your nose was stuffed and you couldn’t successfully get air out of it, so you made the non-nasal equivalent of “m,” which is pretty much a “b.”

Lateral: These are made by pushing air through two channels formed on either side of the tongue. The placement of the tongue tip may vary, but the sound is always the result of air flowing along the sides of the tongue and out of your mouth.

Glide: Glides are barely consonants at all. They’re formed by constricting the vocal tract just enough to change the sound coming through your vocal cords, but not quite enough to cause any noticeable frication or air bursts. This is why they’re sometimes called “semivowels.”

Voicing

The other thing you need to understand that chart is the voicing distinction. In a single box, there might be two symbols, for example f and v. If that’s the case, the symbol on the right is the voiced counterpart of the symbol on the left. That means your mouth is doing the same thing when you pronounce each sound, but in one the vocal cords are dormant and in the other they’re vibrating. You can test this by putting your hand to your throat (lightly!), and switching back and forth between a sustained “f” sound and a “v” sound. You shouldn’t change the position of your lips or tongue, but you should be engaging and disengaging your vocal cords depending on which sound you’re producing.
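If it helps to see the chart as data, here is a minimal sketch in Python of the idea that each consonant is just a place, a manner, and a voicing value. The table is my own partial reconstruction using the labels from this post (it leaves out w and the English r sound, which don't slot in as neatly), and the names Consonant, CONSONANTS, voicing_counterpart, and oral_counterpart are made up for illustration.

```python
from typing import NamedTuple, Optional


class Consonant(NamedTuple):
    place: str    # where airflow is constricted
    manner: str   # how airflow is manipulated
    voiced: bool  # whether the vocal cords vibrate


# Partial, illustrative table (an assumption, not a copy of the chart).
CONSONANTS = {
    "p": Consonant("labial", "stop", False),
    "b": Consonant("labial", "stop", True),
    "m": Consonant("labial", "nasal", True),
    "f": Consonant("labio-dental", "fricative", False),
    "v": Consonant("labio-dental", "fricative", True),
    "θ": Consonant("dental", "fricative", False),
    "ð": Consonant("dental", "fricative", True),
    "t": Consonant("alveolar", "stop", False),
    "d": Consonant("alveolar", "stop", True),
    "s": Consonant("alveolar", "fricative", False),
    "z": Consonant("alveolar", "fricative", True),
    "n": Consonant("alveolar", "nasal", True),
    "l": Consonant("alveolar", "lateral", True),
    "ʃ": Consonant("palatal-alveolar", "fricative", False),
    "ʒ": Consonant("palatal-alveolar", "fricative", True),
    "tʃ": Consonant("palatal-alveolar", "affricate", False),
    "dʒ": Consonant("palatal-alveolar", "affricate", True),
    "j": Consonant("palatal", "glide", True),
    "k": Consonant("velar", "stop", False),
    "g": Consonant("velar", "stop", True),
    "ŋ": Consonant("velar", "nasal", True),
    "h": Consonant("glottal", "fricative", False),
}


def find(place: str, manner: str, voiced: bool) -> Optional[str]:
    """Look up the symbol with exactly these three features, if any."""
    wanted = Consonant(place, manner, voiced)
    for symbol, features in CONSONANTS.items():
        if features == wanted:
            return symbol
    return None


def voicing_counterpart(symbol: str) -> Optional[str]:
    """Same place and manner, opposite voicing (f <-> v)."""
    c = CONSONANTS[symbol]
    return find(c.place, c.manner, not c.voiced)


def oral_counterpart(symbol: str) -> Optional[str]:
    """The voiced stop at the same place as a nasal (the stuffed-nose 'm' -> 'b')."""
    c = CONSONANTS[symbol]
    if c.manner != "nasal":
        return None
    return find(c.place, "stop", True)


print(voicing_counterpart("f"))  # v
print(oral_counterpart("m"))     # b
```

Sounds with no partner in the table, like m or h, simply come back as None from voicing_counterpart.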

This post got a little out of hand. But now you know how to read an IPA chart! You will be a hit at parties.

CONtrast and conTRAST

This one may be boring to non-wordnerds, but it kind of changed my world when it was pointed out to me.

Have you encountered a new word in writing and been unsure of how to pronounce it? The most difficult part of pronouncing English words, even for native speakers, is often the word’s stress pattern—what syllable receives stress, what gets secondary stress, if anything, and what syllables are unstressed. There is some variation in acceptable stress patterns in certain words (you may have heard both CON-troversy and con-TRO-versy, especially from British speakers, or both PAT-ina and pa-TEEN-a, also depending on dialect), but most often there is only one “correct” stress pattern for a given English word. When a speaker uses a different stress pattern from what we’re expecting, it might sound weird, but it usually doesn’t mean the wrong thing. For example, something like this. It’s certainly irritating and arguably incorrect, but the line in that movie clip is only a joke because we can identify the words and we know what the correct pronunciation is.

But have you noticed that stress can distinguish between two different words? Say the following pairs of sentences out loud and notice how you pronounce the bolded words:

I set up the photoshoot. The setup took hours to complete.

When the light turns green, you may proceed. The proceeds from the auction were donated to the hospital.

I will record the lecture. I have a large record collection.

The lady doth protest too much, methinks. There is a protest going on outside the office.

The students will construct a bridge. Gender is a social construct.

I think my car is about to break down. I blame it on the breakdown of society.

See a pattern? When a noun and a verb are made using the same sounds, the noun has stress on the first syllable and the verb has stress on the last syllable. This is called contrastive stress, because stress alone is what creates the contrast between two distinct words or phrases. Most people tend to think of stress or intonation as something that can affect meaning in sort of a supralinguistic way: it can show a lot about how you’re feeling or what you want to emphasize, but it doesn’t interfere with the literal meaning of the words you’re saying. Except when it does. Additional examples include progress/progress, and, appropriately enough, contrast/contrast. I predict you will notice more of these now that you’ve read this. Sorry. (Not sorry.)
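For the programmers in the audience, the pattern can be sketched in a few lines of Python. This is only an illustration of the noun/verb pairs discussed in this post, not a general rule for English stress, and the names STRESSED_SYLLABLE and mark_stress are hypothetical. The syllable splits are supplied by hand.

```python
# Which syllable carries stress in these noun/verb pairs:
# index 0 is the first syllable, index -1 is the last.
STRESSED_SYLLABLE = {"noun": 0, "verb": -1}


def mark_stress(syllables, part_of_speech):
    """Return the word with its stressed syllable in capitals."""
    position = STRESSED_SYLLABLE[part_of_speech]
    marked = [syllable.lower() for syllable in syllables]
    marked[position] = marked[position].upper()
    return "".join(marked)


print(mark_stress(["con", "trast"], "noun"))  # CONtrast
print(mark_stress(["con", "trast"], "verb"))  # conTRAST
```

The same two calls with ["pro", "test"] or ["pro", "gress"] give the other pairs from the list.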

Buffalo!

An often-cited example of the crazy things the English language can do is a sentence that goes like this:

Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.

It’s pretty safe to say no one would ever have any reason to utter this sentence in conversation, but it is technically grammatical, which is the point. Why would a language permit such a ludicrous combination of words? Well, it’s not that English permits a sentence to be one word pronounced 8 times consecutively. It permits a sentence to take the following form:

Adjective; noun; relative clause (of the form adjective; noun; verb); verb; adjective; noun.

It doesn’t seem so incomprehensible when you think of it that way. The problem is that English has an adjective (sort of), a noun, and a verb which all happen to be pronounced “buffalo.” The noun is the large hairy animal you kill on the Oregon Trail (but you can only carry 200 pounds of meat!). It doesn’t help that the singular and plural form are both “buffalo.” The verb means to “bewilder or baffle” (Merriam-Webster). The adjective (which is actually a noun adjunct, but it’s used to modify a noun so let’s treat it as an adjective here) is the city of Buffalo. Using Buffalo to modify a noun is like saying “Boston policemen” or “California surfers.”

So now that we know the difference between our buffaloes, the sentence is easier to parse. It might help to include optional parentheticals that underscore the grammatical function of certain parts of the sentence, like so:

Buffalo buffalo (that) Buffalo buffalo buffalo (in turn) buffalo Buffalo buffalo.

To make it even easier, let’s reassign the grammatical functions to new words that don’t happen to sound the same.

Instead of the city of Buffalo, let’s use another modifier. Just a regular adjective, like “disgruntled.”

Instead of the animal buffalo (plural in this sentence), let’s pick another creature. “Lemurs?” Great.

And finally, instead of having the lemurs buffalo each other, let’s make them poke each other.

We’re going to do this in stages. Here’s our starter sentence:

Buffalo buffalo (that) Buffalo buffalo buffalo (in turn) buffalo Buffalo buffalo.

If we swap out the verb, we get:

Buffalo buffalo (that) Buffalo buffalo poke (in turn) poke Buffalo buffalo.

Next, these buffalo are no longer from Buffalo; they’re just disgruntled:

Disgruntled buffalo (that) disgruntled buffalo poke (in turn) poke disgruntled buffalo.

And now they’re no longer buffalo at all:

Disgruntled lemurs (that) disgruntled lemurs poke (in turn) poke disgruntled lemurs.

There! No more confusing buffalo from Buffalo buffaloing each other. Now we’re left with a very comprehensible scenario involving disgruntled lemurs.
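The swapping game above is really just filling in a template, so here is a small Python sketch of it. The function name buffalo_sentence and its hints flag are my own inventions for illustration; the pattern itself is the one described earlier in this post (modifier, noun, relative clause, verb, modifier, noun).

```python
def buffalo_sentence(modifier, noun, verb, hints=False):
    """Fill the pattern: modifier noun (that) modifier noun verb (in turn) verb modifier noun."""
    noun_phrase = f"{modifier} {noun}"
    parts = [noun_phrase]
    if hints:
        parts.append("(that)")
    # The relative clause: another noun phrase plus its verb.
    parts += [noun_phrase, verb]
    if hints:
        parts.append("(in turn)")
    # The main verb and its object.
    parts += [verb, noun_phrase]
    sentence = " ".join(parts)
    # Capitalize only the first letter so "Buffalo" the city keeps its capital elsewhere.
    return sentence[0].upper() + sentence[1:] + "."


print(buffalo_sentence("Buffalo", "buffalo", "buffalo"))
# Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.

print(buffalo_sentence("disgruntled", "lemurs", "poke", hints=True))
# Disgruntled lemurs (that) disgruntled lemurs poke (in turn) poke disgruntled lemurs.
```

Any modifier, plural noun, and transitive verb will produce a sentence with the same structure, which is the whole point: the grammar was never the strange part.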

By the way, the buffalo sentence can be taken further, depending on how creative you want to be with language, as Professor William J. Rapaport demonstrates:

http://www.cse.buffalo.edu/~rapaport/BuffaloBuffalo/

Now that you’re an expert on buffalo, try to make sense of this thing (It is possible. I left out punctuation to make it harder.):

“James while John had had had had had had had had had had had a better effect on the teacher.”

Aksing the right questions about metathesis

You’ve probably noticed that some English speakers, most likely speakers of African American English* (AAE), pronounce “ask” such that it sounds like “aks.” You may have wondered why. As is the case with most linguistic phenomena, there is no definite answer, but there are a few theories floating around. First, know that this process is called metathesis (the switching of sounds, from the Greek meaning “put in a different order”). There are other common examples of metathesis in English, but I’ll get to those in a bit. Whenever a group of people is discussing the pronunciation of “ask/aks,” it’s pretty much guaranteed that someone will mention Chaucer. No, really. The passage that you’ll see cited in language blogs all over the internet comes from the Wife of Bath’s Prologue in the Canterbury Tales (1386). It reads:

“I axe, why the fyfte man Was nought housband to the Samaritan?”

So Chaucer said “aks.” Or at least he was aware that someone might say “aks” so he had a character in the Canterbury Tales say it. So speakers of AAE actually just lifted that feature from 14th century English? Well, maybe. “Aks” may have entered the speech of black slaves through white American southerners, whose dialects in turn descended from regions of England where “aks” was the standard pronunciation. As far as I can tell, this has by no means been proven, and whether “aks” appears in AAE as a survival of Middle English or not is unclear. “Aks” may have appeared at various times in the history of English because it is simply easier to pronounce. Metathesis is a common process because it generally enhances ease of pronunciation, and “ks” is indeed easier to articulate than “sk.”

Other examples of metathesis that we see in English support the idea that ease of articulation is a powerful force in sound change over time. For example, when was the last time you pronounced iron “i-ron” instead of “i-ern?” That’s what I thought. No one thinks you’re mispronouncing it when you say “i-ern.” Other examples do get labeled as incorrect (and I admit it bothers me to no end when I hear them), like “calvery” instead of “cavalry” or “asterix” instead of “asterisk.”
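At its simplest, metathesis is two neighboring sounds trading places. Here is a rough Python sketch of that one operation. It uses ordinary letters as stand-ins for sounds (an assumption that only works for words spelled roughly the way they're pronounced), and the function name metathesize is made up for this example.

```python
def metathesize(segments, i):
    """Return a copy of the sequence with segments i and i + 1 swapped."""
    if not 0 <= i < len(segments) - 1:
        raise IndexError("need a segment and a following neighbor to swap")
    swapped = list(segments)
    swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
    return swapped


# Swap the "s" and the "k" in "ask".
print("".join(metathesize("ask", 1)))       # aks

# The same swap at the end of "asterisk" gives the "asterix" pronunciation.
print("".join(metathesize("asterisk", 6)))  # asteriks
```

Not every example is a tidy swap of neighbors. In “calvery” for “cavalry,” the sounds that trade places aren't adjacent, so this sketch doesn't cover it.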

Notice that “i-ern” and “asterix” are not associated with any ethnicity, socioeconomic status, or education level as readily as “aks” is. Metathesis occurs all over the English language, but only “aks” is given such marked status. It is most common among Black American speakers (The Linguistic Atlas of the Gulf States (1986-1992) found it used by 23% of Black American speakers and only 1% of speakers of European descent), but the reasons for its markedness are much too complicated to get into here. 23% is hardly an overwhelming majority, and yet the “aks” pronunciation has become one of the most salient features of AAE to the ears of Standard American English speakers. Why do you suppose that is? I have a few ideas, but let me do some reading and get back to this in another post.

*I want to write a post at some point about what the hell we should actually call this dialect, if anything.

This is my blog. This is my blog on words. Any questions?

Welcome to OnWords. This blog will—I hope—serve as your source for word nerd indulgence. I'm talking about etymological oddities, nitpicky usage discussions, and other linguistic phenomena that language geeks love to swoon over. The goal, of course, is not to target language geeks exclusively, but rather to allow each of you to discover the language geek within. Don't scoff; you have one. It may not be as outspoken as mine, but it's in there. I promise that at least one entry posted here, at some point, will elicit a "wow!" or a "cool" or even just a "huh..." from you.

Now, before we dive into some of these linguistic nuggets, let's talk through the title of this blog. First, and most important, its literal meaning conveys valuable information about the content of this site. I am thinking and writing on words, and anyone interested in words would know right away that he* might want to read this. Second, it allowed me to make the terrible joke you see under the main title (It’s not just wordplay. If you don't get the reference, click here). Finally, I couldn't resist the pun on "onward(s),"** which I like to think captures the intrepid spirit of what this blog will become. This is a space for exploration before all else. I am not claiming total expertise in all (or any) areas of linguistic study. I'm just a graduate student who wants to spread some enthusiasm for words and what we do with them.

So let's talk language. Any questions?

* I prefer the masculine general "he" to "he or she" or "they." This is an issue for a later entry.

** I'll also address the "onward" vs. "onwards" issue another time.

The McGurk effect: Look at me while I’m talking to you!

This one still blows my mind every time I see it. The McGurk effect is basically the idea that much of speech perception comes from visual cues. This may not sound mind-blowing, and it’s not. Many of you have probably noticed that it’s easier to understand speech when you can watch someone pronouncing it. The extent to which this is true may surprise you, though. To demonstrate, I'll show you how visual cues can make speech harder to understand. First, click play on the video below, but then hide the window or close your eyes. Keep them closed until the speaker finishes talking. What did he say?

Click here and close your eyes!

You probably think he said “ba ba, ba ba, ba ba.” That’s because he did. Now, play the video again, but this time watch the speaker. Now what does it sound like?

I can’t speak for you, but to me it sounds decidedly NOT like “ba.” I hear something more like “da” or possibly even “ga.” I find the difference so striking that I need to remind myself that the audio didn’t change at all.

The video (made by Professor Arnt Maasø of the University of Oslo, who is also the speaker) was composed by syncing a bilabial audio stimulus (the sound of Maasø saying “ba ba”) with a velar visual stimulus (the video of Maasø saying “ga ga”). Without the visual stimulus, the audio is perceived with the bilabial stop (the b sound, made by closing and then opening the lips). However, the visual information tells your brain to expect a velar stop (a k or a g sound, made by pressing the back of your tongue to the soft palate and then releasing a burst of air). When audio and visual information conflict, what the hearer perceives is something like the result of splitting the difference. “Da” is pronounced at the alveolar ridge, farther back in the mouth than “ba” but in front of “ga.” So if you heard “da” on the second listen, it was your speech recognition mechanisms finding middle ground between the audio and the visual stimuli. You’re hearing “ba,” but seeing “ga,” so your brain settles on “da.”
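To make the “splitting the difference” picture concrete, here is a deliberately crude Python sketch. It lines up the places of articulation from the front of the mouth to the back and picks the one halfway between what you hear and what you see. This is a toy built on my own assumptions, not a model of how speech perception actually works, and the names PLACES, VOICED_STOPS, and split_the_difference are hypothetical.

```python
# Places of articulation, ordered from the front of the mouth to the back.
PLACES = [
    "labial",
    "labio-dental",
    "dental",
    "alveolar",
    "palatal-alveolar",
    "palatal",
    "velar",
    "glottal",
]

# The voiced stops English has at three of those places.
VOICED_STOPS = {"labial": "b", "alveolar": "d", "velar": "g"}


def split_the_difference(heard_place, seen_place):
    """Return the place halfway between the audio cue and the visual cue."""
    heard = PLACES.index(heard_place)
    seen = PLACES.index(seen_place)
    return PLACES[(heard + seen) // 2]


place = split_the_difference("labial", "velar")
print(place)                       # alveolar
print(VOICED_STOPS[place] + "a")   # da
```

The tidy arithmetic depends entirely on how I chose to list the places, so treat it as a memory aid for the ba/ga/da result rather than an explanation of it.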

So, this is all to say that visual signals can alter the perception of an audio signal to an alarming degree. If looking at a speaker can inhibit perception of an audio signal so dramatically, imagine how much it enhances your perception of sound every day.

References and links

McGurk, Henry and John MacDonald. (1976) “Hearing Lips and Seeing Voices.” Nature 264 (5588): 746–8

http://www.nature.com/nature/journal/v264/n5588/abs/264746a0.html

Arnt Maasø’s faculty webpage: http://www.hf.uio.no/imk/english/people/aca/arntm/

#Phonetics #Speech perception
