Arrant Pedantry


100,000 Words Whose Pronunciations Have Changed

We all know that language changes over time, and one of the major components of language change is sound change. Many of the words we use today are pronounced differently than they were in Shakespeare’s or Chaucer’s time. You may have seen articles like this one that list 10 or 15 words whose pronunciations have changed over time. But I can do one better. Here are 100,000 words that illustrate how words change.

  1. a: Before the Great Vowel Shift, the name of the first letter of the alphabet was pronounced /aː/, much like when the doctor asks you to open your mouth and say “ah” to look down your throat. In Old English, it was /ɑː/, which is pronounced slightly further back in the mouth. The name of the letter was borrowed from Latin, which introduced its alphabet to much of Europe. The Romans got their alphabet from the Greeks, probably by way of the Etruscans. But unlike the Greeks, the Romans simply called the letters by the sounds they made. The corresponding Greek letter, alpha, got its name from the Phoenician aleph, meaning ‘ox’, because the letter aleph represented the first sound in the word aleph. In Phoenician this was a glottal stop (which is not written in the Latin alphabet). The Greeks didn’t use this sound, so they borrowed it for the /a/ sound instead.
  2. a: This casual pronunciation of the preposition of goes back at least to the 1200s. It doesn’t appear in writing much, except in dialogue, where it’s usually attached to another word, as in kinda. But of itself comes from an unstressed form of the Old English preposition æf. Æf didn’t survive past Old English, but in time a new stressed form of of arose, giving us the preposition off. Of and off were more or less interchangeable until the 1600s, at which point they finally started to diverge into two distinct words. Æf is cognate with the German ab, and these ultimately come from the Proto-Indo-European *h₂epó ‘off, away, from’, which is also the source of the Greek apo (as in apostasy) and the Latin ab (as in abuse). So the initial laryngeal sound in *h₂epó disappeared after changing the following vowel to /a/, the final /o/ disappeared, the /p/ fricatized to /f/, the vowel moved back and reduced, the /f/ became voiced to /v/, and then the /v/ fell away, leaving only a schwa, the barest little wisp of a word.
  3. a: The indefinite article a comes from an unstressed version of the numeral one, which in Old English was ān, though it also inflected for gender, number, and case, meaning that it could look like āne, ānum, ānes, ānre, or ānra. By Middle English those inflections were gone, leaving only an. The /n/ began to disappear before consonants in the 1100s, giving us the a/an distinction we have today. But the Old English ān came from an earlier Proto-Germanic *ainaz. The -az ending had disappeared by Old English, and the diphthong /ai/ smoothed and became /ɑː/. In its use as an article, its vowel shortened and eventually reduced to a schwa. But in its use as a numeral, it retained a long vowel, which eventually rose to /oː/ and then broke into the diphthong /wʊ/ and then lowered to /wʌ/, giving us the modern word one. The Proto-Germanic *ainaz goes further back to the Proto-Indo-European *óynos, so between PIE and Proto-Germanic the vowels lowered and the final /s/ became voiced.
  4. aback: This adverb comes from the prefix a- and the noun back. The prefix a- comes from an unstressed form of the preposition on, which lost its final /n/ and reduced to a schwa. This prefix also appears in words like among, atop, awake, and asleep. On comes from the Proto-Germanic *ana, which in turn comes from the Proto-Indo-European *h₂en-, which is also the source of the Greek ana-, as in analog and analyze. As with *h₂epó, the initial laryngeal sound changed the vowel to /a/ and then disappeared. Back, on the other hand, has changed remarkably little in the last thousand years. It was spelled bæc in Old English and was pronounced just like the modern word. It comes from a Proto-Germanic word *baka, though its ultimate origin is unknown.

Hopefully by now you see where I’m going with this. It’s interesting to talk about how words have changed over the years, but listicles like “10 Words Whose Pronunciations Have Changed” can be misleading, because they imply that changes in pronunciation are both random and rare. Well, sound changes are random in a way, in that it’s hard to predict what will change in the future, but they’re not random in the sense that they affect random words. Sound changes are just that—changes to a sound in the language, like /r/ disappearing after vowels or /t/ turning into a flap in certain cases in the middle of words. Words can randomly change too, but that’s the exception rather than the rule.
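To make that concrete, here is a rough sketch of a sound change written as a rule over sounds rather than over words. Everything in it is an assumption for illustration: the toy lexicon, the crude one-letter-per-sound transcriptions, and the function name drop_postvocalic_r are all made up, and real r-dropping also affects the preceding vowel, which this ignores.

```python
import re

# Crude stand-in for the vowel sounds of the toy transcriptions below.
VOWELS = "aeiouə"

# The rule: /r/ is lost after a vowel unless another vowel follows.
R_DROP = re.compile(rf"(?<=[{VOWELS}])r(?![{VOWELS}])")

def drop_postvocalic_r(form):
    """Apply the rule to one transcription (hypothetical helper)."""
    return R_DROP.sub("", form)

# Invented, heavily simplified transcriptions; not real phonetic data.
toy_lexicon = {"car": "kar", "hard": "hard", "red": "red", "very": "veri"}

# The same rule runs over every word; no word is singled out.
changed = {word: drop_postvocalic_r(form) for word, form in toy_lexicon.items()}
```

The rule never mentions car or hard by name. Those words change only because they happen to contain the sound in the right environment, while red and very pass through untouched.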

And sound changes aren’t something that just happens from time to time, like the Great Vowel Shift. They’re happening continuously, and they have been happening since the beginning of language. If you like really deep dives (or if you need something to combat your insomnia), this Wikipedia article details the sound changes that have happened between late Proto-Germanic, spoken roughly 2,000 years ago, and the present day, when changes like th-fronting in England (saying fink for think) and the Northern Cities Shift in the US are still occurring.

So while it’s okay to talk about individual words whose pronunciations have changed, I think we shouldn’t miss the bigger picture: it’s language change all the way down.


Cognates, False and Otherwise

A few months ago, I was editing some online German courses, and I came across one of my biggest peeves in discussions of language: false cognates that aren’t.

If you’ve ever studied a foreign language, you’ve probably learned about false cognates at some point. According to most language teachers and even many language textbooks, false cognates are words that look like they should mean the same thing as their supposed English counterparts but don’t. But cognates don’t necessarily look the same or mean the same thing, and words that look the same and mean the same thing aren’t necessarily cognates.

In linguistics, cognate is a technical term meaning that words are etymologically related, that is, they have a common origin. The English one, two, three, German eins, zwei, drei, French un, deux, trois, and Welsh un, dau, tri are all cognate: they, along with the words for one, two, three in many other languages, all trace back to the Proto-Indo-European (PIE) *oino, *dwo, *trei.

These sets are all pretty obvious, but not all cognates are. Consider the English four, five, German vier, fünf, French quatre, cinq, and Welsh pedwar, pump. The English and German are still obviously related, but the others less so. Fünf and pump are actually pretty close, but it seems a pretty long way from four and vier to pedwar, and an even longer way from them to quatre and cinq.

And yet these words all go back to the PIE *kwetwer and *penkwe. Though the modern-day forms aren’t as obviously related, linguists can nevertheless establish their relationships by tracing them back through a series of sound changes to their conjectured historical forms.

And not all cognates share meaning. The English yoke, for instance, is related to the Latin jugular, the Greek zeugma, and the Hindi yoga, along with join, joust, conjugate, and many others. These words all trace back to the PIE *yeug ‘join’, and that sense can still be seen in some of its modern descendants, but if you’re learning Hindi, you can’t rely on the word yoke to tell you what yoga means.

Which brings us back to the German course that I was editing. Cognates are often presented as a way to learn vocabulary quickly, because the form and meaning are often similar enough to the form and meaning of the English word to make them easy to remember. But cognates often vary wildly in form (like four, quatre, and pedwar) and in meaning (like yoke, jugular, zeugma, and yoga). And many of the words presented as cognates are in fact not cognates but merely borrowings. Strictly speaking, cognates are words that have a common origin—that is, they were inherited from an ancestral language, just as the cognates above all descend from Proto-Indo-European. Cognates are like cousins—they may belong to different families, but they all trace back to a common ancestor.

But if cognates are like cousins, then borrowings are like clones, where a copy of a word is taken directly from one language to another. Most of the cognates that I learned in French class years ago are actually borrowings. The English and French forms may look a little different now, but the resemblance is unmistakable. Many of the cognates in the German course I was editing were also borrowings, and in many cases they were words that were borrowed into both German and English from French:


Of these, only gold, hand, land, sand, and wind are actually cognates. Maybe it’s nitpicking to point out that the English jaguar and the German Jaguar aren’t cognates but borrowings from Portuguese. For a language learner, the important thing is that these words are similar in both languages, making them easy to learn.

But it’s the list of supposed false cognates that really irks me:

karton/cardboard box
peperoni/chili pepper
beamer/video projector
argument/proof, reasons

The German word is on the left and the English word on the right. Once again, many of these words are borrowings, mostly from French and Latin. All of these borrowings are clearly related, though their senses may have developed in different directions. For example, chef generally means “boss” in French, but it acquired its specialized sense in English from the longer phrase chef de cuisine, “head of the kitchen”. The earlier borrowing chief still maintains the sense of “head” or “boss”.

(It’s interesting that billion and trillion are on the list, since this isn’t necessarily an English/German difference—it also used to be an American/British difference, but the UK has adopted the same system as the US. Some languages use billion to mean a thousand million, while other languages use it to mean a million million. There’s a whole Wikipedia article on it.)

But some of these words really are cognate with English words—they just don’t necessarily look like it. Bad, for example, is cognate with the English bath. You just need to know that the English sounds spelled as <th>—like the /θ/ in thin or the /ð/ in then—generally became /d/ in German.

And, surprisingly, the German Gift, “poison”, is indeed cognate with the English gift. Gift is derived from the word give, and it means “something given”. The German word is essentially just a highly narrowed sense of the word: poison is something you give someone. (Well, hopefully not something you give someone.)

On a related note, that most notorious of alleged false cognates, the Spanish embarazada, really is related to the English embarrassed. They both trace back to an earlier word meaning “to put someone in an awkward or difficult situation”.

Rather than call these words false cognates, it would be more accurate to call them false friends. This term is broad enough to encompass both words that are unrelated and words that are borrowings or cognates but that have different senses.

This isn’t to say that cognates aren’t useful in learning a language, of course, but sometimes it takes a little effort to see the connections. When I learned German, one of my professors gave us a handout of some common English–German sound correlations, like the th ~ d connection above. For example, if you know that the English /p/ often corresponds to a German /f/ and that the English sound spelled <ea> often corresponds to the German /au/, then the relation between leap and laufen “to run” becomes clearer.

Or if you know that the English sound spelled <ch> often corresponds with the German /k/ and that the English /p/ often corresponds with the German /f/, then the relation between cheap and kaufen “to buy” becomes a little clearer. (Incidentally, this means that the English surname Chapman is cognate with the German Kaufmann.) And knowing that the English <y> sometimes corresponds to the German /g/ might help you see the relationship between the verb yearn and the German adverb gern “gladly, willingly”.
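For the programmatically inclined, the correspondences mentioned above can be sketched as a toy lookup. This is only an illustration working on spellings, not a real model of the sound changes: the function name apply_correspondences is made up, the rules are tendencies rather than laws, and each word is given only the correspondences named for it above.

```python
def apply_correspondences(word, rules):
    """Rewrite an English spelling using (english, german) pairs.

    Hypothetical helper for illustration; it scans left to right and
    leaves any letter without a matching rule unchanged.
    """
    # Try longer English spellings first so that digraphs match as a unit.
    ordered = sorted(rules, key=lambda pair: len(pair[0]), reverse=True)
    out = []
    i = 0
    while i < len(word):
        for eng, ger in ordered:
            if word.startswith(eng, i):
                out.append(ger)
                i += len(eng)
                break
        else:
            # No rule applies at this position, so copy the letter.
            out.append(word[i])
            i += 1
    return "".join(out)

# Correspondences taken from the examples in the text.
leap_stem = apply_correspondences("leap", [("ea", "au"), ("p", "f")])
cheap_stem = apply_correspondences("cheap", [("ch", "k"), ("ea", "au"), ("p", "f")])
bath_form = apply_correspondences("bath", [("th", "d")])
```

The results line up with the stems of laufen and kaufen and with Bad. Feeding yearn through the same <ea> rule would not give gern, which is a good reminder that these are aids for spotting relationships, not a conversion table.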

You don’t have to teach a course in historical linguistics in order to teach a foreign language like German, but you’re doing a disservice if you teach that obviously related pairs like Bad and bath aren’t actually related. Rather than teach students that language is random and treacherous, you can teach them to find the patterns that are already there. A little bit of linguistic background can go a long way.

Plus, you know, real etymology is a lot more fun.

Edited to add: In response to this post, Christopher Bergmann created this great diagram of helpful cognates, unhelpful or less-helpful cognates, false cognates, and so on:
