Whence Did They Come?

In a recent episode of Slate’s Lexicon Valley podcast, John McWhorter discussed the history of English personal pronouns. Why don’t we use ye or thee and thou anymore? What’s the deal with using they as a gender-neutral singular pronoun? And where do they and she come from?

The first half, on the loss of ye and the original second-person singular pronoun thou, is interesting, but the second half, on the origins of she and they, missed the mark, in my opinion.

I recommend listening to the whole thing, but here’s the short version. The pronouns she and they/them/their(s) are new to the language, relatively speaking. This is what the personal pronoun paradigm looked like in Old English:

Case Masculine Neuter Feminine Plural
Nominative hē hit hēo hīe
Accusative hine hit hīe hīe
Dative him him hire him
Genitive his his hire heora

There was some variation in some forms in different dialects and sometimes even within a single dialect, but this table captures the basic forms. (Note that the vowels here basically have classical values, so hē would be pronounced somewhat like hey, hire would be something like hee-reh, and so on. A macron or acute accent just indicates that a vowel is longer.)

One thing that’s surprising is how recognizable many of them are. We can easily see he, him, and his in the singular masculine forms (though hine, along with all the other accusative forms, has been lost), it (which has lost its h) in the singular neuter forms, and her in the singular feminine forms. The real oddballs here are the singular feminine nominative form, hēo, and the third-person plural forms. They look nothing like their modern forms.

These changes began when the case system started to disappear at the end of the Old English period. Hē, hēo, and hīe began to merge, which would have led to a lot of confusion. But during the Middle English period (roughly 1100 to 1500 AD), some new pronouns appeared, and then things started settling down into the paradigms we know now: he/him/his, it/it/its, she/her/her, and they/them/their. (Note that the original dative and genitive forms for it were identical to those for he, and it wasn’t until Early Modern English that these were replaced by it and its, respectively.)

The origin of they/them/their is fairly uncontroversial: these were apparently borrowed from Old Norse–speaking settlers, who invaded during the Old English period and captured large parts of eastern and northern England, forming what is known as the Danelaw. These Old Norse speakers gave us quite a lot of words, including anger, bag, egg, get, leg, and sky.

The Old Norse words for they/them/their looked like this:

Case Masculine Neuter Feminine
Nominative þeir þau þær
Accusative þá þau þær
Dative þeim þeim þeim
Genitive þeirra þeirra þeirra

If you look at the masculine column, you’ll notice the similarity to the current they/them/their paradigm. (Note that the letter that looks like a cross between a b and a p is a thorn, which stood for the sounds now represented by th in English.)

Many Norse borrowings lost their final r, and unstressed final vowels began to be dropped in Middle English, which would yield þei/þeim/þeir. (As with the Old English pronouns, the accusative form was lost.) It seems like a pretty straightforward case of borrowing. The English third-person pronouns began to merge together as the result of some regular sound changes, but the influx of Norse speakers provided us an alternative for the plural forms.

But not so fast, McWhorter says. Borrowing nouns, verbs, and the like is pretty common, but borrowing pronouns, especially personal pronouns, is pretty rare. So he proposes an alternative origin for they/them/their: the Old English demonstrative pronouns—that is, words like this and these (though in Old English, the demonstratives functioned as definite articles too). Since hē/hēo/hīe were becoming ambiguous, McWhorter argues, English speakers turned to the next best thing: a set of words meaning essentially “that one” or “those ones”. Here’s what the plural demonstrative pronouns in Old English looked like:

Case Plural
Nominative þā
Accusative þā
Dative þǣm/þām
Genitive þāra/þǣra

(Old English had a common plural form rather than separate plural forms for the masculine, neuter, and feminine genders.)

There’s some basis for this kind of change from a demonstrative to a personal pronoun; third-person pronouns in many languages come from demonstratives, and the third-person plural pronouns in Old Norse actually come from demonstratives themselves, which explains why they look similar to the Old English demonstratives: they all start with þ, and the dative and genitive forms have the -m and -r on the end just like them/their and the Old Norse forms do.

But notice that the vowels are different. Instead of ei in the nominative, dative, and genitive forms, we have ā or ǣ. This may not seem like a big deal, but generally speaking, vowel changes don’t just randomly affect a few words at a time; they usually affect every word with that sound. There has to be some way to explain the change from ā to ei/ey.

And to make matters worse, we know that ā (/ɑː/ in the International Phonetic Alphabet) raised to /ɔː/ (the vowel in court or caught if you don’t rhyme it with cot) during Middle English and eventually raised to /oʊ/ (the vowel in coat) during the Great Vowel Shift. In a nutshell, if English speakers had started using þā as the third-person plural pronoun in the nominative case, we’d be saying tho rather than they today.
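To make the regularity argument concrete, here is a minimal Python sketch that applies the two vowel changes described above to a few Old English words with ā. The broad transcriptions, the collapsing of each change into a single replacement step, and the decision to ignore everything but the vowel (including the later voicing of the initial consonant in words like they) are simplifying assumptions of mine, not a full account of the history.

```python
# Illustrative only: the two regular vowel changes described in the text,
# applied in order to broad IPA transcriptions.
VOWEL_CHANGES = [
    ("ɑː", "ɔː"),  # Middle English raising of Old English long a
    ("ɔː", "oʊ"),  # later raising, collapsed here into one step
]

def apply_changes(ipa):
    """Apply every change, in order, to every matching vowel in the word."""
    for old, new in VOWEL_CHANGES:
        ipa = ipa.replace(old, new)
    return ipa

# Old English spellings mapped to simplified transcriptions (assumed values).
words = {"stān": "stɑːn", "hām": "hɑːm", "þā": "θɑː"}
modern = {word: apply_changes(ipa) for word, ipa in words.items()}

for word, ipa in modern.items():
    print(word, "->", ipa)
```

Stān and hām come out with the vowel of stone and home, which is what actually happened, and þā comes out with the same vowel, which is the tho we never got.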

But the biggest problem is that the historical evidence just doesn’t support the idea that they originates from þā. The first recorded instance of they, according to The Oxford English Dictionary, is in a twelfth-century manuscript known as the Ormulum, written by a monk known only as Orm. Orm is the Old Norse word for worm, serpent, or dragon, and the manuscript is written in an East Midlands dialect, which means that it came from the Danelaw, the area once controlled by Norse speakers.

In the Ormulum we find forms like þeȝȝ and þeȝȝre for they and their, respectively. (The letter ȝ, known as yogh, could represent a variety of sounds, but in this case it represents /i/ or /j/.) Other early forms of they include þei, þai, and thei.

The spread of these new forms was gradual, moving from areas of heaviest Old Norse influence throughout the rest of the English-speaking British Isles. The early-fifteenth-century Hengwrt Chaucer, a manuscript of The Canterbury Tales, usually has they as the subject but retains her for genitives (from the Old English plural genitive form hiera or heora) and em for objects (from the Old English plural dative him). The ’em that we use today as a reduced form of them probably traces back to this, making it the last vestige of the original Old English third-person plural pronouns.

So to make a long story short, we have new pronouns that look like Old Norse pronouns that arose in an Old Norse–influenced area and then spread out from there. McWhorter’s argument boils down to “borrowing personal pronouns is rare, so it must not have happened”, and then he ignores or hand-waves away any problems with this theory. The idea that these pronouns instead come from the Old English þā just doesn’t appear to be supported either phonologically or historically.

This isn’t even an area of controversy. When I tweeted about McWhorter’s podcast, Merriam-Webster lexicographer Kory Stamper was surprised, responding, “I…didn’t realize there was an argument about the ety of ‘they’? I mean, all the etymologists I know agree it’s Old Norse.” Borrowing pronouns may be rare, but in this case all the signs point to yes.

For a more controversial etymology, though, you’ll have to wait until a later date, when I wade into the murky etymology of she.


The Taxing Etymology of Ask

A couple of months back, I learned that task arose as a variant of tax, with the /s/ and /k/ metathesized. This change apparently happened in French before the word was borrowed into English. That is, French had the word taxa, which came from Latin, and then the variant form tasca arose and evolved into a separate word with an independent meaning.

I thought this was an interesting little bit of historical linguistics, and as a side note, I mentioned on Twitter that a similar phonological change gave us the word ask, which was originally ax (or acs or ahs—spelling was not standardized back then). Beowulf and Chaucer both use ax, and we didn’t settle on ask as the standard form until the time of Shakespeare.

But when I said that “it was ‘ax’ before it was ‘ask’”, that didn’t necessarily mean that ax was the original form—history is a little more complicated than that.

The Oxford English Dictionary says that ask originally meant “to call for, call upon (a person or thing personified) to come” and that it comes from the Old English áscian, which comes from the Proto-Germanic *aiskôjan. But most of the earliest recorded instances, like this one from Beowulf, are of the ax form:

syþðan hé for wlenco wéan áhsode

(after he sought misery from pride)

(A note on Old English orthography: spelling was not exactly standardized, but it was still fairly predictable and mostly phonetic, even though it didn’t follow the same conventions we follow today. In Old English, the letter h represented either the sound /h/ at the beginning of words or the sound /x/ [like the final consonant in the Scottish loch] in the middle of or at the end of words. And when followed by s, as in áhsode, it made the k sound, so hs was pronounced like modern-day x, or /ks/. But the /ks/ cluster could also be represented by cs or x. For simplicity’s sake, I’m going to use ask and ax rather than asc or ahs or whatever other variant spellings have been used over the years.)

We know that ask must have been the original form because that’s what we find in cognate languages like Old Saxon, Old Frisian, and Old High German. This means that at some point after Old English became differentiated from those other languages (around 500 AD), the /s/ and /k/ metathesized and produced ax.

Almost all of the OED’s citations from Old English (which lasted to about 1100 AD) use the ax form, as in this translation of Mark 12:34 from the West Saxon Gospels: “Hine ne dorste nan mann ahsian” (no man durst ask him). (As a bonus, this sentence also has a great double negative: it literally says “no man durst not ask him”.) Only a few of the citations from the Old English period are of the ask variety. I’ll discuss this variation between ask and ax later on.

The ax forms continued through Middle English (about 1100 to 1475 AD) and into Early Modern English. Chaucer’s Canterbury Tales (about 1386 AD) has ax: “I axe, why the fyfte man Was nought housbond to the Samaritan?” In Middle English, ask starts to become a little more common in written work, and we also occasionally see ash, though this form peters out by about 1500. (Again, I’ll discuss this variant more below.)

William Tyndale’s Bible, which was the first Early Modern English translation of the Bible, has ax: Matthew 7:7 reads, “Axe and it shalbe geven you.” The Coverdale Bible, published in 1535 and based on Tyndale’s work, also has ax, but the King James Bible, published in 1611, has the now-standard ask. So do Shakespeare’s plays (dating from the late 1500s to the early 1600s). After about 1600, ax forms become scarce, though one citation from 1803 records axe as a dialectal form used in London. And it’s in nonstandard dialects where ax survives today, especially in Southern US English and African American English. (I assume it also survives in other places besides the US, but I don’t know enough about its use or distribution in other countries.)

In a nutshell, ax arose as a metathesized form of ask at some point in the Old English period, and it was the dominant form in written Old English and an acceptable variant down to the 1500s, when it started to be supplanted by the resurgent ask. And at some point, ash also appeared, though it quietly disappeared a few centuries later. So why did ask disappear for so long? And why did it come back?

The simple answer to the first question is that the word metathesized in the dominant dialect of Old English, which was West Saxon. (Modern Standard English descends not from West Saxon but from the dialect around London.) These sorts of changes just happen sometimes. In West Saxon, /sk/ often became /ks/ in the middle or at the end of a word. Sound changes are usually regular—that is, they affect all words with a particular sound or set of sounds—but this particular change apparently wasn’t; metathesized and unmetathesized forms continued to exist side by side, and sometimes there’s variation even within a manuscript. King Alfred the Great’s translation of Boethius’s Consolation of Philosophy switches freely between the two: “Þæt is þæt ic þé ær ymb acsade. . . . Swa is ðisse spræce ðe ðu me æfter ascast.” This is pretty weird. When a change is beginning to happen, there may be some variation among words or among speakers, but variation between different forms of a word used by the same speaker is highly unusual.

As for the second question, it’s not entirely clear how or why ask came back. At first glance, it would seem that ask must have survived in other dialects and started to crop back up in written works during the Middle English period. Or perhaps ax simply remetathesized and became ask again. But it can’t be quite that simple, because /sk/ regularly palatalized to /ʃ/ (the “sh” sound) during the Old English period. You can see the effects of this change in cognate pairs like shirt (from Old English) and skirt (from Old Norse) or ship (from Old English) and skipper (from Middle Dutch).

It’s not entirely clear when this palatalization of /sk/ to /ʃ/ happened, but it must have been sometime after the Angles and Saxons left mainland Europe (starting in the 400s or 500s) but before the Viking invasions beginning in the 800s, because Old Norse words borrowed into English retain /sk/ where English words did not. If palatalization had occurred after the influx of words from Old Norse, we’d say shy and shill instead of sky and skill.
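The timing argument can be sketched in a few lines of Python. This is a toy model under loud assumptions: I use modernized spellings with sk as stand-ins for the older forms (skip and fisk for Old English scip and fisc), I treat palatalization as a blanket replacement of sk with sh, and the function and variable names are my own inventions for illustration.

```python
def palatalize(word):
    """Toy version of the /sk/ -> /ʃ/ change, written with sk and sh."""
    return word.replace("sk", "sh")

NATIVE = ["skip", "fisk"]    # stand-ins for Old English scip, fisc
BORROWED = ["sky", "skill"]  # Old Norse loans that arrived later

def lexicon(palatalize_after_borrowing):
    """Return the vocabulary after the sound change has run once."""
    if palatalize_after_borrowing:
        # Counterfactual: the loans are already present, so they change too.
        return [palatalize(w) for w in NATIVE + BORROWED]
    # Attested order: the change runs first, then the loans arrive untouched.
    return [palatalize(w) for w in NATIVE] + BORROWED

attested = lexicon(palatalize_after_borrowing=False)
counterfactual = lexicon(palatalize_after_borrowing=True)
print(attested)
print(counterfactual)
```

Only the first ordering leaves sky and skill intact alongside ship and fish, which is why the change has to predate the Norse borrowings.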

One thing that makes it hard to pin down the date of this change is that /sk/ was originally spelled sc, and the sc spelling continued to be used even after palatalization must have happened. That means that words like ship and fish were spelled like scip and fisc. Thus a form with sc is ambiguous—we don’t know for certain if it was pronounced /sk/ or /ʃ/, though we can infer from other evidence that by the time most Old English documents were being created, sc represented /ʃ/. (Interestingly, this means that in the quote from Alfred the Great, the two forms would have been pronounced ax-ade and ash-ast.) It wasn’t until Middle English that scribes began using spellings like sch, ssh, or sh to distinguish /ʃ/ from the /sk/ combination.

If ask had simply survived in some dialect of Old English without metathesizing, it should have undergone palatalization and resulted in the modern-day form ash. As I said above, we do occasionally see ash in Middle English, which means that this did happen in some dialects of Old English. But this was never even the dominant form—it just pops up every now and then in the South West and West Midlands regions of England from the 1200s down to about 1500, when it finally dies out.

One other option is that the original ask metathesized to ax, missed out on palatalization, and then somehow metathesized back to ask. There may be some evidence for this option, because some other words seem to have followed the same route. For instance, words like flask and tusk appear in Old English as both flasce/flaxe and tusc/tux. But flask didn’t survive Old English—the original word was lost, and it was reborrowed from Romance languages in the 1500s—so we don’t know for sure if it was pronounced with /sk/ or /ʃ/ or both. Tusk appears in some dialects as tush, so we have the same three-way /sk/–/ks/–/ʃ/ alternation as ask.

But while ash meaning the powdery residue shows the same three-way variation, ash meaning the kind of tree does not—it’s always /ʃ/. Ask, ash, and ash all would have had /sk/ in the early stages of Old English, so why did one of them simply palatalize while the other two showed a three-way variation before settling on different forms? If it was a case of remetathesis that turned /ks/ back into /sk/, then why weren’t other words that originally ended in /ks/ affected by this second round of metathesis? And if /ks/ had turned back into /sk/ at some point, then why didn’t ax ‘a tool for chopping’ thus become ask? Honestly, I have no idea.

If metathesis, palatalization, and then a second round of metathesis happened in that order, then we should expect to see /ask/ for the questioning word, the tree, and the tool. But there’s no way to reorder these rules to get the proper outputs for all three. Putting palatalization before metathesis gets us the proper output for the tree but also gives us ash for the questioning word, and putting a second round of metathesis at the end gets us the proper output for the questioning word but gives us ask for the chopping tool. And any way you rearrange them, you should never see multiple outputs for the same word, all apparently the products of different rules or at least different rule ordering, used in the same dialects or even by the same speakers.
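The rule-ordering problem can be checked mechanically. The sketch below is a deliberately crude model and rests on several assumptions of mine: the three changes are treated as exceptionless string replacements (metathesis, palatalization, and a second round of metathesis that I call remetathesis), the vowels are ignored so that the questioning word and the tree both start as /ask/ and the tool starts as /aks/, and every name in the code is hypothetical.

```python
from itertools import permutations

def metathesis(w):      # /sk/ -> /ks/
    return w.replace("sk", "ks")

def palatalization(w):  # /sk/ -> /ʃ/
    return w.replace("sk", "ʃ")

def remetathesis(w):    # /ks/ -> /sk/
    return w.replace("ks", "sk")

RULES = {
    "metathesis": metathesis,
    "palatalization": palatalization,
    "remetathesis": remetathesis,
}

# Simplified starting forms and the modern outcomes we want to derive.
INPUTS = {"question": "ask", "tree": "ask", "tool": "aks"}
TARGET = {"question": "ask", "tree": "aʃ", "tool": "aks"}

def derive(order):
    """Run every word through the rules in the given order."""
    out = {}
    for name, form in INPUTS.items():
        for rule in order:
            form = RULES[rule](form)
        out[name] = form
    return out

# Try all six orderings and keep any that produce the modern forms.
results = {order: derive(order) for order in permutations(RULES)}
working = [order for order, forms in results.items() if forms == TARGET]

for order, forms in results.items():
    print(" > ".join(order), forms)
print("orderings that work:", working)
```

No ordering works, and in this model the reason is simple: the questioning word and the tree start out identical, so exceptionless rules can never send them to different outcomes.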

So how do we explain this?


Maybe the sound changes happened in different orders in different parts of England, and those different dialects then borrowed forms from each other. Maybe some forms were borrowed from or influenced by the Vikings. Maybe there were several other intermediate rules that I’m missing, and those rules interacted in some strange ways. At any rate, the pronunciation ax for ask had a long and noble tradition before falling by the wayside as a dialectal form about four hundred years ago. But who knows—there’s always a chance it could become standard again in the future.


Why Is It “Woe Is Me”?

I recently received an email asking about the expression woe is me, namely what the plural would be and why it’s not woe am I. Though the phrase may strike modern speakers as bizarre if not downright ungrammatical, there’s actually a fairly straightforward explanation: it’s an archaic dative expression. Strange as it may seem, the correct form really is woe is me, not woe am I or woe is I, and the first-person plural would simply be woe is us. I’ll explain why.

Today English only has three cases—nominative (or subjective), objective, and genitive (or possessive)—and these cases only apply to personal pronouns and who. Old English, on the other hand, had four cases (and vestiges of a fifth), and they applied to all nouns, pronouns, and adjectives. Among these four were two different cases for objects: accusative and dative. (The forms that we now think of simply as object pronouns actually descend from the dative pronouns, though they now cover the functions of both the accusative and dative.) These correspond roughly to direct and indirect objects, respectively, though they could be used in other ways too.

For instance, some prepositions took accusative objects, and some took dative objects (and some took either depending on the meaning). Nouns and pronouns in the accusative and dative cases could also be used in ways that seem strange to modern speakers. The dative, for example, could be used in places where we would normally use to and a pronoun. In some constructions we still have the choice between a pronoun or to and a pronoun—think of how you can say either I gave her the ball or I gave the ball to her—but in Old English you could do this to a much greater degree.

In the phrase woe is me, woe is the subject and me is a dative object, something that isn’t allowed in English today. It really means woe is to me. Today the phrase woe is me is pretty fixed, but some past variations on the phrase make the meaning a little clearer. Sometimes it was used with a verb, and sometimes woe was simply followed by a noun or prepositional phrase. In the King James Bible, we find “If I be wicked, woe unto me” (Job 10:15). One example from Old English reads, “Wa biþ þonne þæm mannum” (woe be then [to] those men).

So “woe is me” is not simply a fancy or archaic way of saying “I am woe” and is thus not parallel to constructions like “it is I”, where the nominative form is usually prescribed and the objective form is proscribed. In “woe is me”, “me” is not a subject complement (also known as a predicative complement) but a type of dative construction.

Thus the singular is is always correct, because it agrees with the singular mass noun woe. And though we don’t have distinct dative pronouns anymore, you can still use any pronoun in the object case, so woe is us would also be correct.

Addendum: Arika Okrent, writing at Mental Floss, has also just posted a piece on this construction. She goes into a little more detail on related constructions in English, German, and Yiddish.

And here are a couple of articles by Jan Freeman from 2007, specifically addressing Patricia O’Conner’s Woe Is I and a column by William Safire on the phrase:

Woe Is Us, Part 1
Woe Is Us, Continued


Celtic and the History of the English Language

A little while ago a link to this list of 23 maps and charts on language went around on Twitter. It’s full of interesting stuff on linguistic diversity and the genetic relationships among languages, but there was one chart that bothered me: this one on the history of the English language by Sabio Lantz.

The Origins of English

The first and largest problem is that the timeline makes it look as though English began with the Celts and then received later contributions from the Romans, Anglo-Saxons, Vikings, and so on. While this is a decent account of the migrations and conquests that have occurred in the last two thousand years, it’s not an accurate account of the history of the English language. (To be fair, the bar on the bottom gets it right, but it leaves out all the contributions from other languages.)

English began with the Anglo-Saxons. They were a group of Germanic tribes originating in the area of the Netherlands, northern Germany, and Denmark, and they spoke dialects of what might be called common West Germanic. There was no distinct English language at the time, just a group of dialects that would later evolve into English, Dutch, German, Low German, and Frisian. (Frisian, for the record, is English’s closest relative on the continent, and it’s close enough that you can buy a cow in Friesland by speaking Old English.)

The inhabitants of Great Britain when the Anglo-Saxons arrived were mostly romanized Celts who spoke Latin and a Celtic language that was the ancestor of modern-day Welsh and Cornish. (In what is now Scotland, the inhabitants spoke a different Celtic language, Gaelic, and perhaps also Pictish, but not much is known about Pictish.) But while there were Latin- and Celtic-speaking people in Great Britain before the Anglo-Saxons arrived, those languages probably had very little influence on Old English and should not be considered ancestors of English. English began as a distinct language when the Anglo-Saxons split off from their Germanic cousins and left mainland Europe beginning around 450 AD.

For years it was assumed that the Anglo-Saxons wiped out most of the Celts and forced the survivors to the edges of the island—Cornwall, Wales, and Scotland. But archaeological and genetic evidence has shown that this isn’t exactly the case. The Anglo-Saxons more likely conquered the Celts and intermarried with them. Old English became the language of government and education, but Celtic languages may have survived in Anglo-Saxon–occupied areas for quite some time.

From Old to Middle English

Old English continued until about 1066, when the Normans invaded and conquered England. At that point, the language of government became Old French—or at least the version of it spoken by the Normans—or Medieval Latin. Though peasants still spoke English, nobody was writing much in the language anymore. And when English made a comeback in the 1300s, it had changed quite radically. The complex system of declensions and other inflections from Old English was gone, and the language had borrowed considerably from French and Latin. Though there isn’t a firm line, by the end of the eleventh century Old English is considered to have ended and Middle English to have begun.

The differences between Old English and Middle English are quite stark. Just compare the Lord’s Prayer in each language:

Old English:

Fæder ure þu þe eart on heofonum;
Si þin nama gehalgod
to becume þin rice
gewurþe ðin willa
on eorðan swa swa on heofonum.
urne gedæghwamlican hlaf syle us todæg
and forgyf us ure gyltas
swa swa we forgyfað urum gyltendum
and ne gelæd þu us on costnunge
ac alys us of yfele soþlice

(The character that looks like a p with an ascender is called a thorn, and it is pronounced like the modern th. It could be either voiceless or voiced depending on its position in a word. The character that looks like an uncial d with a stroke through it is also pronounced just like a thorn, and the two symbols were used interchangeably. Don’t ask me why.)

Middle English:

Oure fadir that art in heuenes,
halewid be thi name;
thi kyngdoom come to;
be thi wille don,
in erthe as in heuene.
Yyue to vs this dai oure breed ouer othir substaunce,
and foryyue to vs oure dettis,
as we foryyuen to oure dettouris;
and lede vs not in to temptacioun,
but delyuere vs fro yuel. Amen.

(Note that u and v could both represent either /u/ or /v/. V was used at the beginnings of words and u in the middle. Thus vs is “us” and yuel is “evil”.)

While you can probably muddle your way through some of the Lord’s Prayer in Old English, there are a lot of words that are unfamiliar, such as gewurþe and soþlice. And this is probably one of the easiest short passages to read in Old English. Not only is it a familiar text, but it dates to the late Old English period. Older Old English text can be much more difficult. The Middle English, on the other hand, is quite readable if you know a little bit about Middle English spelling conventions.

And even where the Old English is readable, it shows grammatical inflections that are stripped away in Middle English. For example, ure, urne, and urum are all forms of “our” based on their grammatical case. In Middle English, though, they’re all oure, much like Modern English. As I said above, the change from Old English to Middle English was quite radical, and it was also quite sudden. My professor of Old English and Middle English said that there are cases where town chronicles essentially change from Old to Middle English in a generation.

But here’s where things get a little murky. Some have argued that the vernacular language didn’t really change that quickly—it was only the codified written form that did. That is, people were taught to write a sort of standard Old English that didn’t match what they spoke, just as people continued to write Latin even as they were speaking the evolving Romance dialects such as Old French and Old Spanish.

So perhaps the complex inflectional system of Old English didn’t disappear suddenly when the Normans invaded; perhaps it was disappearing gradually throughout the Old English period, but those few who were literate learned the old forms and retained them in writing. Then, when the Normans invaded and people mostly stopped writing in English, they also stopped learning how to write standard Old English. When they started writing English again a couple of centuries later, they simply wrote the language as it was spoken, free of the grammatical forms that had been artificially retained in Old English for so long. This also explains why there was so much dialectal variation in Middle English; because there was no standard form, people wrote their own local variety. It wasn’t until the end of the Middle English period that a new standard started to coalesce and Early Modern English was born.

Supposed Celtic Syntax in English

And with that history established, I can finally get to my second problem with that graphic above: the supposed Celtic remnants in English. English may be a Germanic language, but it differs from its Germanic cousins in several notable ways. In addition to the glut of French, Latin, Greek, and other borrowings that occurred in the Middle and Early Modern English periods, English has some striking syntactic differences from other Germanic languages.

English has what is known as the continuous or progressive aspect, which is formed with a form of be and a present participle. So we usually say I’m going to the store rather than just I go to the store. It’s rather unusual to use a periphrastic—that is, wordy—construction as the default when there’s a shorter option available. Many languages do not have progressive forms at all, and if they do, they’re used to specifically emphasize that an action is happening right now or is ongoing. English, on the other hand, uses it as the default form for many types of verbs. But in German, for example, you simply say Ich gehe in den Laden (“I go to the store”), not Ich bin gehende in den Laden (“I am going to the store”).

English also makes extensive use of a feature known as do support, wherein we insert do into certain kinds of constructions, mostly questions and negatives. So while German would have Magst du Eis? (“Like you ice cream?”), English inserts a dummy do: Do you like ice cream? These constructions are rare cross-linguistically and are very un-Germanic.

And some people have come up with a very interesting explanation for this unusual syntax: it comes from a Celtic substrate. That is, they believe that the Celtic population of Britain adopted Old English from their Anglo-Saxon conquerors but remained bilingual for some time. As they learned Old English, they carried over some of their native syntax. The Celtic languages have some rather unusual syntax themselves, highly favoring periphrastic constructions over inflected ones. Some of these constructions are roughly analogous to the English use of do support and progressive forms. For instance, in Welsh you might say Dwi yn mynd i’r siop (“I am in going to the shop”). (Disclaimer: I took all of one semester in Welsh, so I’m relying on what little I remember plus some help from various websites on Welsh grammar and a smattering of Google Translate.)

While this isn’t exactly like the English equivalent, it looks close. Welsh doesn’t have present participial forms but instead uses something called a verbal noun, which is a sort of cross between an infinitive and gerund. Welsh also uses the particle yn (“in”) to connect the verbal noun to the rest of the sentence, which is actually quite similar to constructions from late Middle and Early Modern English such as He was a-going to the store, where a- is just a worn-down version of the preposition on.

But Welsh uses this construction in all kinds of places where English doesn’t. To say I speak Welsh, for example, you say Dw’i’n siarad Cymraeg, which literally translated means I am in speaking Welsh. In English the progressive stresses that you are doing something right now, while the simple present is used for things that are done habitually or that are generally true. In Welsh, though, it’s unmarked—it’s simply a wordier way of stating something without any special progressive meaning. Despite its superficial similarities to the English progressive, it’s quite far from English in both use and meaning. Additionally, the English construction may have much more mundane origins in the conflation of gerunds and present participles in late Middle English, but that’s a discussion for another time.

Welsh’s use of do support—or, I should say, gwneud support—even less closely parallels that of English. In English, do is used in interrogatives (Do you like ice cream?), negatives (I don’t like ice cream), and emphatic statements (I do like ice cream), and it also appears as a stand-in for whole verb phrases (He thinks I don’t like ice cream, but I do). In Welsh, however, gwneud is not obligatory, and it can be used in simple affirmative statements without any emphasis.

Nor is it always used where it would be in English. Many questions and negatives are formed with a form of the be verb, bod, rather than gwneud. For example, Do you speak Welsh? is Wyt ti’n siarad Cymraeg? (“Are you in speaking Welsh?”), and I don’t understand is Dw i ddim yn deall (“I am not in understanding”). (This is probably simply because Welsh uses the pseudo-progressive in the affirmative form, so it uses the same construction in interrogatives and negatives, much like how English would turn “He is going to the store” into “Is he going to the store?” or “He isn’t going to the store.” Do is only used when there isn’t another auxiliary verb that could be used.)

But there’s perhaps an even bigger problem with the theory that English borrowed these constructions from Celtic: time. Both the progressive and do support start to appear in late Middle English (the fourteenth and fifteenth centuries), but they don’t really take off until the sixteenth century and beyond, over a thousand years after the Anglo-Saxons began colonizing Great Britain. So if the Celtic inhabitants of Britain adopted English but carried over some Celtic syntax, and if the reason why that Celtic syntax never appeared in Old English is that the written language was a standardized form that didn’t match the vernacular, and if the reason why Middle English looks so different from Old English is that people were now writing the way they spoke, then why don’t we see these Celticisms until the end of the Middle English period, and then only rarely?

Proponents of the Celtic substrate theory argue that these features are so unusual that they could only have been borrowed into English from Celtic languages. They ask why English is the only Germanic language to develop them, but it’s easy to flip this sort of question around. Why did English wait for more than a thousand years to borrow these constructions? Why didn’t English borrow the verb-subject-object sentence order from the Celtic languages? Why didn’t it borrow the after-perfect, which uses after plus a gerund instead of have plus a past participle (She is after coming rather than She has come), or any number of other Celtic constructions? And maybe most importantly, why are there almost no lexical borrowings from Celtic languages into English? Words are the first things to be borrowed, while more structural grammatical features like syntax and morphology are among the last. And just to beat a dead horse, just because something developed in English doesn’t mean you should expect to see the same thing develop in related languages.

The best thing that the Celtic substrate theory has going for it, I think, is that it’s appealing. It neatly explains something that makes English unique and celebrates the Celtic heritage of the island. But there’s a danger whenever a theory is too attractive on an emotional level. You tend to overlook its weaknesses and play up its strengths, as John McWhorter does when he breathlessly explains the theory in Our Magnificent Bastard Tongue. He stresses again and again how unique English is, how odd these constructions are, and how therefore they must have come from the Celtic languages.

I’m not a historical linguist and certainly not an expert in Celtic languages, but alarm bells started going off in my head when I read McWhorter’s book. There were just too many things that didn’t add up, too many pieces that didn’t quite fit. I wanted to believe it because it sounded so cool, but wanting to believe something doesn’t make it so. Of course, none of this is to say that it isn’t so. Maybe it’s all true but there just isn’t enough evidence to prove it yet. Maybe I’m being overly skeptical for nothing.

But in linguistics, as in other sciences, a good dose of skepticism is healthy. A crazy theory requires some crazy-good proof, and right now, all I see is a theory with enough holes in it to sink a fleet of Viking longboats.


Lynne Truss and Chicken Little

Lynne Truss, author of the bestselling Eats, Shoots & Leaves: The Zero Tolerance Approach to Punctuation, is at it again, crying with her characteristic hyperbole and lack of perspective that the linguistic sky is falling because she got a minor bump on the head.

As usual, Truss hides behind the it’s-just-a-joke-but-no-seriously defense. She starts by claiming to have “an especially trivial linguistic point to make” but then claims that the English language is doomed, and it’s all linguists’ fault. According to Truss, linguists have sat back and watched while literacy levels have declined—and have profited from doing so.

What exactly is the problem this time? That some people mistakenly write some phrases as compound words when they’re not, such as maybe for may be or anyday for any day. (This isn’t even entirely true; anyday is almost nonexistent in print, even in American English, according to Google Ngram Viewer.) I guess from anyday it’s a short, slippery slope to complete language chaos, and then “we might as well all go off and kill ourselves.”

But it’s not clear what her complaint about erroneous compound words has to do with literacy levels. If the only problem with literacy is that some people write maybe when they mean may be, then it seems to be, as she originally says, an especially trivial point. Yes, some people deviate from standard orthography. While this may be irritating and may occasionally cause confusion, it’s not really an indication that people don’t know how to read or write. Even educated people make mistakes, and this has always been the case. It’s not a sign of impending doom.

But let’s consider the analogies she chose to illustrate linguists’ supposed negligence. She says that we’re like epidemiologists who simply catalog all the ways in which people die from diseases or like architects who make notes while buildings collapse. (Interestingly, she makes two remarks about how well paid linguists are. Of course, professors don’t actually make that much, especially those in the humanities or social sciences. And it smacks of hypocrisy from someone whose book has sold 3 million copies.)

Perhaps there is a minor crisis in literacy, at least in the UK. This article says that 16–24-year-olds in the UK are lagging behind many counterparts in other first-world countries. (The headline suggests that they’re trailing the entire world, but the study only looked at select countries from Europe and east Asia.) Wikipedia, however, says that the UK has a 99 percent literacy rate. Maybe young people are slipping a bit, and this is certainly something that educators should address, but it doesn’t appear that countless people are dying from an epidemic of slightly declining literacy rates or that our linguistic structures are collapsing. This is simply not the linguistic apocalypse that Truss makes it out to be.

Anyway, even if it were, why would it be linguists’ job to do something about it? Literacy is taught in primary and secondary school and is usually the responsibility of reading, language arts, or English teachers—not linguists. Why not criticize English professors for sitting back and collecting fat paychecks for writing about literary theory while our kids struggle to read? Because they’re not her ideological enemy, that’s why. Linguists often oppose language pedants like Truss, and so Truss finds some reason—contrived though it may be—to blame them. Though some applied linguists do in fact study things like language acquisition and literacy, most linguists hew to the more abstract and theoretical side of language—syntax, morphology, phonology, and so on. Blaming descriptive linguists for children’s illiteracy is like blaming physicists for children’s inability to ride bikes.

And maybe the real reason why linguists are unconcerned about the upcoming linguistic apocalypse is that there simply isn’t one. Maybe linguists are like meteorologists who observe that, contrary to the claims of some individuals, the sky is not actually falling. In studying the structure of other languages and the ways in which languages change, linguists have realized that language change is not decay. Consider the opening lines from Beowulf, an Old English epic poem over a thousand years old:

HWÆT, WE GAR-DEna in geardagum,
þeodcyninga þrym gefrunon,
hu ða æþelingas ellen fremedon!

Only two words are instantly recognizable to modern English speakers: we and in. The changes from Old English to modern English haven’t made the language better or worse—just different. Some people maintain that they understand that language changes but say that they still oppose certain changes that seem to come from ignorance or laziness. They fear that if we’re not vigilant in opposing such changes, we’ll lose our ability to communicate. But the truth is that most of those changes from Old English to modern English also came from ignorance or laziness, and we seem to communicate just fine today.

Languages can change very radically over time, but contrary to popular belief, they never devolve into caveman grunting. This is because we all have an interest in both understanding and being understood, and we’re flexible enough to adapt to changes that happen within our lifetime. And with language, as opposed to morality or ethics, there is no inherent right or wrong. Correct language is, in a nutshell, what its users consider to be correct for a given time, place, and audience. One generation’s ignorant change is sometimes the next generation’s proper grammar.

It’s no surprise that Truss fundamentally misunderstands what linguists and lexicographers do. She even admits that she was “seriously unqualified” for linguistic debate a few years back, and it seems that nothing has changed. But that probably won’t stop her from continuing to prophesy the imminent destruction of the English language. Maybe Truss is less like Chicken Little and more like the boy who cried wolf, proclaiming disaster not because she actually sees one coming, but rather because she likes the attention.


Relative What

A few months ago Braden asked in a comment about the history of what as a relative pronoun. (For my previous posts on relative pronouns, see here.) The history of relative pronouns in English is rather complicated, and the system as a whole is still in flux, partly because modern English essentially has two overlapping systems of relativization.

In Old English, there were a few different ways to create a relative pronoun, as this site explains. One way was to use the indeclinable particle þe, another was to use a form of the demonstrative pronoun (roughly equivalent to modern English that/those), and another was to use a demonstrative or personal pronoun followed by þe. Our modern relative that grew out of the use of demonstrative pronouns, though unlike the Old English demonstratives, that does not decline for gender, number, and case.

In the late Old English and Middle English periods, writers and speakers began to use interrogative pronouns as relative pronouns by analogy with French and Latin. This usage first appeared in texts that were translations from Latin around 1000 AD, but within a couple of centuries it had apparently been naturalized. Other interrogatives were pressed into service as relatives during this time, including who, which, where, when, why, and how. All of these are still in common use in Standard English except for what.

It’s important to note that what is still used as a nominal relative, which means that it does not modify another noun phrase but stands in for a noun phrase and a relative simultaneously, as in We fear what we don’t understand. This could be rephrased as We fear that which we don’t understand or We fear the things that we don’t understand, revealing the nominal and the relative.

But while all the other interrogatives have continued as relatives in Standard English, what as a simple relative pronoun is nonstandard today. Simple relative what is found in the works of Shakespeare and the King James Bible, but at some point in the last three or four centuries it fell out of use in the standard dialect. Unfortunately, I’m not really sure when this happened; the Oxford English Dictionary has citations up through 1740 and then one from 1920 that appears to be dialogue from a novel. Merriam-Webster’s Dictionary of English Usage says that in the US, it’s mainly found in rural areas in the Midland and South. As I told Braden in a response to his comment, I’ve heard it used myself. A couple of months ago I heard a man in church pray for “our leaders what guides and directs us”—not just a beautiful example of relative what, but also an interesting example of nonstandard verb agreement.

As for why simple relative what died out in Standard English, I really have no idea. Jonathan Hope noted that it’s rather unusual of Standard English to allow other interrogatives as relatives but not this one (Jonathan Hope, “Rats, Bats, Sparrows and Dogs: Biology, Linguistics and the Nature of Standard English,” in The Development of Standard English, 1300–1800, ed. Laura Wright [Cambridge: Cambridge University Press, 2000]). In some ways, relative what would make more sense than relative which, since what is historically part of the same paradigm as who; what comes from the neuter form of the interrogative or indefinite pronoun in Old English, while who comes from the combined masculine/feminine form, as shown here. And as I said in this post, whose was originally the genitive form for both who and what, so allowing simple relative what would make for a rather tidy paradigm.

Perhaps that’s the problem. Hope and others have argued that standardized languages (or perhaps speakers of standardized languages) tend to resist tidy paradigms. Irregularities creep in and are preserved, and they can be surprisingly resistant to change. Maybe someone reading this has a fuller explanation of just how this particular little wrinkle came to be.

