Arrant Pedantry

Two Space or Not Two Space

A friend of mine recently posted on Facebook that you could take the second space after a period away from him when you pry it from his cold, dead fingers. I responded with this image of Ben Wyatt from Parks and Recreation.

I don’t even have time to tell you how wrong you are. Actually, it’s gonna bug me if I don’t.

But I said I’d refrain from sharing my thoughts unless he really wanted to hear them. He said he did, so here goes.

Even though the extra space has its defenders, using two spaces between sentences is wrong by today’s standards, but nearly everybody is wrong about why.

The usual argument goes that it’s a holdover from the days of typewriters. Typewriters use monospaced fonts (meaning that each character takes up the same amount of horizontal space, whether it’s an i or a W), which look spacey compared to proportional fonts (where characters have different widths according to the size and shape of the actual character). Since monospaced text looks spacey already, it was decided that an extra space was needed between sentences to make things readable. But since we’re now all writing on computers with proportional fonts, we should all ditch the two-space habit. Case closed!

But not so fast.

You may have been taught in typing class to type two spaces at the end of a sentence, but the practice has nothing to do with typewriters. It’s actually just an attempt to replicate the look of typeset text of the era. There are other blog posts out there that give a much more thorough account of the history of sentence spacing than I’ll give here (and I’ll link to them at the end), but I’ll use some of the same sources.

But before we dive in, some definitions. Spacing in typography is usually based on the em, a relative unit of measurement that’s as wide as a line of type is tall. That is, if type is set at 12 points, then an em is also 12 points. The name derives from the fact that a capital M in many typefaces is about as wide as it is tall. The em dash (—) is so named because it’s 1 em wide. A space the width of an em is called an em space, an em quad, or just an em or a quad.

An en space or en quad is half the width of an em space, or about the width of a capital N. An en dash, as you might have guessed, is 1 en wide.

A three-em space is not three ems wide but one-third of an em (that is, it’s a three-to-an-em space). Also called a thick space, this is the standard space used between words. There are also smaller spaces like four-em and five-em spaces (known as thin spaces) and hair spaces, but we don’t need to worry about them.

Modern typesetting practice is to use a thick space everywhere, but professional practice even just a hundred years ago was surprisingly different. Just take a look at this guide to spacing from the first edition of what would later be known as The Chicago Manual of Style (published in 1906):

Space evenly. A standard line should have a 3-em space between all words not separated by other punctuation points than commas, and after commas; an en-quad after semicolons, and colons followed by a lower-case letter; two 3-em spaces after colons followed by a capital; an em-quad after periods, and exclamation and interrogation points, concluding a sentence.

In other words, the standard spacing was a thick space (one-third of an em) between words (the same as it is today), a little bit more than that (half an em) after semicolons or colons that were followed by a lowercase letter, two thick spaces after a colon followed by a capital, and the equivalent of three thick spaces between sentences. Typesetters weren’t just double-spacing between sentences—they were triple-spacing. You can see this extra spacing in the pages of the manual itself.
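To put rough numbers on those rules, here’s a minimal Python sketch (my own back-of-the-envelope illustration, not anything from the manual) of the 1906 spacing widths at 12-point type:

    # Spacing units at 12-point type: an em equals the point size.
    em = 12.0
    en = em / 2            # en quad: half an em
    thick = em / 3         # three-em (three-to-an-em) space: the normal word space

    word_space = thick                 # between words: 4 pt
    after_semicolon = en               # after semicolons: 6 pt
    after_colon_capital = 2 * thick    # after a colon followed by a capital: 8 pt
    sentence_space = em                # em quad between sentences: 12 pt

    print(sentence_space / word_space)  # 3.0 -- hence "triple spacing"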

Remember that typewriters were generally monospaced, meaning that the carriage advanced the same amount for every character, including spaces. On a typewriter, there’s no such thing as a thin space, en space, or em space. Consequently, the rules for spacing were simplified a bit: a single space between words and after semicolons or colons followed by a lowercase letter, and two spaces between sentences or after a colon followed by a capital letter.

At this point the two-spacers may be cheering. History is on your side! The extra space is good! But not so fast.

Around the middle of the last century, typesetting practice began to change. That complicated system of spacing takes extra time to implement, and financial and technological pressures eventually pushed typesetters to adopt the current practice of using a single thick space everywhere. But this wasn’t an innovation. English and American typesetters may have used extra space, but French typesetters did not—they used just one space between sentences. Clearly not everyone thought that the extra space was necessary.

And as someone who has done a fair amount of typesetting, I have to say that I’m thankful for the current standard. It’s easy to ensure that there’s only a single space everywhere, but trying to ensure that there’s extra space between sentences—and only between sentences—would be a nightmare even with the help of find-and-replace queries or regular expressions. (I’ve seen some suggestions that typesetting software automatically adds space between sentences, but this isn’t true of any of the typesetting software I’ve ever used, which includes FrameMaker, QuarkXPress, and InDesign. Maybe LaTeX does it, but I’d be curious to see how well it really does.)
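A rough Python sketch (my own toy example, not an actual typesetting workflow) shows the asymmetry: collapsing extra spaces is a one-liner, but reinserting them only between sentences trips over abbreviations:

    import re

    draft = "Dr. Smith arrived.  He sat down.  The Ph.D. students left."

    # Going to single spacing is trivial: collapse any run of spaces to one.
    single = re.sub(r" {2,}", " ", draft)

    # Going the other way is not. A naive rule (double the space after
    # sentence-ending punctuation) also fires after abbreviations:
    naive = re.sub(r"([.!?]) ", r"\1  ", single)
    print(naive)
    # Dr.  Smith arrived.  He sat down.  The Ph.D.  students left.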

My wife has done a fair amount of editing for doctoral students whose committees seem to think that the APA style requires two spaces between sentences, so she’s spent a lot of time putting all those extra spaces in. (Luckily for her, she charges by the hour.) In its section on spacing following punctuation, the Publication Manual of the American Psychological Association says that “spacing twice after punctuation marks at the end of a sentence aids readers of draft manuscripts.” (APA doesn’t require the extra space in published work, though, meaning that authors are asked to put the spaces in and then editors or typesetters take them right back out.)

Unfortunately, there’s no evidence to back up the claim that the extra space aids readability by providing a little more visual separation between sentences; what few studies have been done have been inconclusive. Inserting extra spacing means extra time for the editor, typesetter, and proofreader, and it’s extra time that doesn’t appear to add any value. (Conversely, there’s also no evidence that the extra space hurts.) I suspect that the readability argument is just a post hoc rationalization for a habit that some find hard to break.

After all, most people alive today grew up in the era of single spacing in professionally set text, so it’s what most people are familiar with. You never see the extra space unless you’re looking at an older text, a typewritten text, or a text that hasn’t been professionally edited and typeset. But most people who use the extra space do so not because of allegedly improved readability but because it’s simply what they were taught or because they say it’s impossible to break the habit of hitting the spacebar twice after a sentence.

And I’m skeptical when people claim that double-spacing is hardwired into their brains. Maybe I just have an easier time breaking bad habits than some people, but when I was taught to type in eighth grade (on typewriters, even—my school didn’t have enough money to stock the typing lab with computers), I was taught the two-space rule. And almost as soon as I was out of that class, I stopped. It took maybe two weeks to break the habit. But I already knew that it was an outdated practice, so I was motivated to abandon it as soon as my grade no longer depended on it.

If you’ve been typing this way for decades, though, or if you were never informed that the practice was outdated, you may be less motivated to try to change. Even if you write for publication, you can rely on your editor or typesetter to remove those extra spaces for you with a quick find-and-replace. You may not even be aware that they’re doing it.

Of course, even some people who should know better seem to be unaware that using two spaces is no longer the standard. When my oldest son was taught to type in school a couple of years ago, his teacher—who is probably younger than me—taught the class to use two spaces after a sentence. Even though typesetters switched to using a single space over fifty years ago, and typewriters have gone the way of the rotary phone, the two-space practice just won’t die.

So the real question is, what should you do? If you’re still using two spaces, either out of habit or because you like how it looks, should you make the switch? Or, put another way, is it really wrong to keep using two spaces half a century after the publishing world has moved on?

Part of me really wants to say that yes, it really is that wrong, and you need to get over yourself and just break the stupid habit already. But the truth is that unless you’re writing for publication, it doesn’t actually matter all that much. If your work is going to be edited and typeset, then you should know that the extra space is going to be taken out anyway, so you might as well save a step by not putting it in in the first place.

But if you’re just writing a text or posting on Facebook or something like that, it’s not that big a deal. At worst, you’re showing your age and maybe showing your inability or unwillingness to break a habit that annoys some people. But the fact that it annoys some people is on us, not you. After all, it’s not like you’re committing a serious crime, like not using a serial comma.

Sources and Further Reading

This post on a blog called Heraclitean River is very well researched, though the author uses entirely too many rage caps and “Wow. Just wows.” (He’s apparently pretty upset about typographers’ supposed lies about their own history. It’s still worth a read if you have the time.)

This post on Creative Pro is not quite as exhaustive but is much more even-handed, though it still concludes that using two spaces is the right thing to do when using monospaced fonts. But if the rationale behind using two spaces on a typewriter was to mimic the typeset text of the era, then there’s no reason to continue the practice now that typeset text uses single spacing.

On a blog called the World’s Greatest Book, Dave Bricker also has a very well-researched and even-handed post on the history of sentence spacing. He concludes, “Though writers are encouraged to unlearn the double-space typing habit, they may be heartened to learn that intellectual arguments against the old style are mostly contrived. At worst, the wide space after a period is a victim of fashion.”

Book Review: Word by Word

Word by Word: The Secret Life of Dictionaries, by Kory Stamper

Disclosure: I received a free advance review copy of this book from the publisher, Pantheon Books. I also consider Kory Stamper a friend.

A lot of work goes into making a book, from the initial writing and development to editing, copyediting, design and layout, proofreading, and printing. Orders of magnitude more work go into making a dictionary, yet few of us give much thought to how dictionaries actually come into being. Most people probably don’t think about the fact that there are multiple dictionaries. We always refer to it as the dictionary, as if it were a monolithic entity.

In Word by Word, Merriam-Webster editor Kory Stamper shows us the inner workings of dictionary making, from gathering citations to defining to writing pronunciations to researching etymologies. In doing so, she also takes us through the history of lexicography and the history of the English language itself.

If you’ve read other popular books on lexicography, like The Lexicographer’s Dilemma by Jack Lynch, you’re probably already familiar with some of the broad outlines of Word by Word—where dictionaries come from, how words get in them, and so on. But Stamper presents even familiar ideas in a fresh way and with wit and charm. If you’re familiar with her blog, Harmless Drudgery, you know she’s a gifted writer. (And if you’re not familiar with it, you should remedy that as soon as possible.)

In discussing the influence of French and Latin on English, for example, she writes, “Blending grammatical systems from two languages on different branches of the Indo-European language tree is a bit like mixing orange juice and milk: you can do it, but it’s going to be nasty.” And in describing the ability of lexicographers to focus on the same dry task day in and day out, she says that “project timelines in lexicography are traditionally so long that they could reasonably be measured in geologic epochs.”

Stamper also deftly teaches us about lexicography by taking us through her own experience of learning the craft, from the job interview in which she gushed about medieval Icelandic family sagas to the day-to-day grind of sifting through citations to the much more politically fraught side of dictionary writing, like changing the definitions for marriage or nude (one of the senses was defined as the color of white skin).

But the real joy of Stamper’s book isn’t the romp through the history of lexicography or the English language or even the self-deprecating jokes about lexicographers’ antisocial ways. It’s the way in which Stamper makes stories about words into stories about us.

In one chapter, she looks into the mind of peevers by examining the impulse to fix English and explaining why so many of the rules we cherish are wrong:

The fact is that many of the things that are presented to us as rules are really just the of-the-moment preferences of people who have had the opportunity to get their opinions published and whose opinions end up being reinforced and repeated down the ages as Truth.

Real language is messy, and it doesn’t fit neatly into the categories of right and wrong that we’re taught. Learning this “is a betrayal”, she says, but it’s one that lexicographers have to get over if they’re going to write good dictionaries.

In the chapter “Irregardless”, she explores some of the social factors that shape our speech—race and ethnicity, geography, social class—to explain how she became one of the world’s foremost irregardless apologists when she started answering emails from correspondents who want the word removed from the dictionary. Though she initially shared her correspondents’ hatred of the word, an objective look at its use helped her appreciate it in all its nuanced, nonstandard glory. But—just like anyone else—she still has her own hangups and peeves, like when her teenage daughter started saying “I’m done my homework.”

In another chapter, she relates how she discovered that the word bitch had no stylistic label warning dictionary users that the word is vulgar or offensive, and she dives not only into the word’s history but also into modern efforts to reclaim the slur and the effects the word can have on those who hear it—anger, shame, embarrassment—even when it’s not directed at them.

And in my favorite chapter, she takes a look at the arcane art of etymology. “If logophiles want to be lexicographers when they grow up,” she writes, “then lexicographers want to be etymologists.” (I’ve always wanted to be an etymologist, but I don’t know nearly enough dead languages. Plus, there are basically zero job openings for etymologists.) Stamper relates the time when she brought some Finnish candy into the office, and Merriam-Webster’s etymologist asked her—in Finnish—if she spoke Finnish. She said—also in Finnish—that she spoke a little and asked if he did too. He replied—again, in Finnish—that he didn’t speak Finnish. This is the sort of logophilia that I can only dream of.

Stamper explodes some common etymological myths—no, posh and golf and the f word don’t originate from acronyms—before turning a critical eye on Noah Webster himself. The man may have been the founder of American lexicography, but his etymologies were crap. Webster was motivated by the belief that all languages descend from Hebrew, and so he tried to connect every word to a Hebrew root. But tracing a word’s history requires poring over old documents (often in one of those aforementioned dead languages) and painstakingly following it through the twists and turns of sound changes and semantic shifts.

Stamper ends the book with some thoughts on the present state and future of lexicography. The internet has enabled dictionaries to expand far beyond the limitations of print books—you no longer have to worry about things like line breaks or page counts—but it also pushes lexicographers to work faster even as it completely upends the business side of things.

It’s not clear what the future holds for lexicography, but I’m glad that Kory Stamper has given us a peek behind the curtain. Word by Word is a heartfelt, funny, and ultimately human look at where words come from, how they’re defined, and what they say about us.

Word by Word: The Secret Life of Dictionaries is available now at Amazon and other booksellers.

Politeness and Pragmatics

On a forum I frequent, a few posters started talking about indirectness and how it can be annoying when a superior—whether a boss or a parent—asks you to do something in an indirect way. My response was popular enough that I thought I might repost it here. What follows is one of the original posts plus my edited and expanded response.

My kids used to get really pissed off when I asked them “Would you please unload the dishwasher”. They said it implied that they had a choice, when they really didn’t.

It’s time for some speech act theory.

The study of the meanings of words and utterances is called semantics, but the study of speech acts—how we intend those utterances to be received and how they’re received in context—is called pragmatics. And a look at pragmatics can reveal why parents say things like “Would you please unload the dishwasher?” when they really mean “Unload the dishwasher.”

Any speech act has three components: the locution (the meaning of the words themselves), the illocution (the intent of the speaker or writer), and the perlocution (the message that is received, or the effect of the speech act). Quite often, all three of these coincide. If I ask “What time is it?”, you can be pretty sure that my intent is to find out the time, so the message you receive is “Jonathon wants me to tell him the time.” We call this a direct speech act.

But sometimes the locution, illocution, and perlocution don’t exactly correspond. If I ask “Do you know what time it is?”, I’m not literally asking if you have knowledge of the current time and nothing more, so the appropriate response is not just “Yes” or “No” but “It’s 11:13” or whatever the time is. I’m still asking you to tell me the time, but I didn’t say it directly. We call this an indirect speech act.

And speech can often be much more indirect than this. If we’re on a road trip and I ask my wife, “Are you hungry?”, what I really mean is that I’m hungry and want to stop for food, and I’m checking to see if she wants to stop too. Or maybe we’re sitting at home and I ask, “Is it just me, or is it hot in here?” And what I really mean is “I’m hot—do you mind if I turn the AC up?”

Indirect speech acts are often used to be polite or to save face. In the case of asking a child or subordinate to do something when they really don’t have a choice, it’s a way of downplaying the power imbalance in the relationship. By pretending to give someone a choice, we acknowledge that we’re imposing our will on them, which can make them feel better about having to do it. So while it’s easy to get annoyed at someone for implying that you have a choice when you really don’t, this reaction deliberately misses the point of indirectness, which is to lubricate social interaction.

Of course, different speech communities and even different individuals within a community can have varying notions of how indirect one should be, which can actually cause additional friction. Some cultures rely much more on indirectness, and so it causes problems when people are too direct. On the flip side, others may be frustrated with what they perceive as passive-aggressiveness, while the offender is probably just trying to be polite or save face.

In other words, indirectness is generally a feature, not a bug, though it only works if both sides are playing the same game. Instead of getting annoyed at the mismatch between the locution and the illocution, ask yourself what the speaker is probably trying to accomplish. Indirectness isn’t a means of obscuring the message—it’s an important part of the message itself.

Cognates, False and Otherwise

A few months ago, I was editing some online German courses, and I came across one of my biggest peeves in discussions of language: false cognates that aren’t.

If you’ve ever studied a foreign language, you’ve probably learned about false cognates at some point. According to most language teachers and even many language textbooks, false cognates are words that look like they should mean the same thing as their supposed English counterparts but don’t. But cognates don’t necessarily look the same or mean the same thing, and words that look the same and mean the same thing aren’t necessarily cognates.

In linguistics, cognate is a technical term meaning that words are etymologically related—that is, they have a common origin. The English one, two, three, German eins, zwei, drei, French un, deux, trois, and Welsh un, dau, tri are all cognate—they and the words for one, two, three in many other languages all trace back to the Proto-Indo-European (PIE) *oino, *dwo, *trei.

These sets are all pretty obvious, but not all cognates are. Take, for example, the English four, five, German vier, fünf, French quatre, cinq, and Welsh pedwar, pump. The English and German are still obviously related, but the others less so. Fünf and pump are actually pretty close, but it seems a pretty long way from four and vier to pedwar, and an even longer way from them to quatre and cinq.

And yet these words all go back to the PIE *kwetwer and *penkwe. Though the modern-day forms aren’t as obviously related, linguists can nevertheless establish their relationships by tracing them back through a series of sound changes to their conjectured historical forms.

And not all cognates share meaning. The English yoke, for instance, is related to the Latin jugular, the Greek zeugma, and the Hindi yoga, along with join, joust, conjugate, and many others. These words all trace back to the PIE *yeug ‘join’, and that sense can still be seen in some of its modern descendants, but if you’re learning Hindi, you can’t rely on the word yoke to tell you what yoga means.

Which brings us back to the German course that I was editing. Cognates are often presented as a way to learn vocabulary quickly, because the form and meaning are often similar enough to the form and meaning of the English word to make them easy to remember. But cognates often vary wildly in form (like four, quatre, and pedwar) and in meaning (like yoke, jugular, zeugma, and yoga). And many of the words presented as cognates are in fact not cognates but merely borrowings. Strictly speaking, cognates are words that have a common origin—that is, they were inherited from an ancestral language, just as the cognates above all descend from Proto-Indo-European. Cognates are like cousins—they may belong to different families, but they all trace back to a common ancestor.

But if cognates are like cousins, then borrowings are like clones, where a copy of a word is taken directly from one language to another. Most of the cognates that I learned in French class years ago are actually borrowings. The English and French forms may look a little different now, but the resemblance is unmistakable. Many of the cognates in the German course I was editing were also borrowings, and in many cases they were words that were borrowed into both German and English from French:

bank
drama
form
gold
hand
jaguar
kredit
land
name
park
problem
sand
tempo
wind
zoo

Of these, only gold, hand, land, sand, and wind are actually cognates. Maybe it’s nitpicking to point out that the English jaguar and the German Jaguar aren’t cognates but borrowings from Portuguese. For a language learner, the important thing is that these words are similar in both languages, making them easy to learn.

But it’s the list of supposed false cognates that really irks me:

bad/bath
billion/trillion
karton/cardboard box
chef/boss
gift/venom
handy/cellphone
mode/fashion
peperoni/chili pepper
pickel/zit
rock/skirt
wand/wall
beamer/video projector
argument/proof, reasons

The German word is on the left and the English word on the right. Once again, many of these words are borrowings, mostly from French and Latin. All of these borrowings are clearly related, though their senses may have developed in different directions. For example, chef generally means “boss” in French, but it acquired its specialized sense in English from the longer phrase chef de cuisine, “head of the kitchen”. The earlier borrowing chief still maintains the sense of “head” or “boss”.

(It’s interesting that billion and trillion are on the list, since this isn’t necessarily an English/German difference—it also used to be an American/British difference, but the UK has adopted the same system as the US. Some languages use billion to mean a thousand million, while other languages use it to mean a million million. There’s a whole Wikipedia article on it.)
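To make the difference concrete, here’s a quick Python sketch (my own illustration) of the short-scale and long-scale values:

    # English "billion" (short scale) vs. German "Billion" (long scale)
    english_billion = 10**9     # a thousand million
    german_billion = 10**12     # a million million, i.e., an English trillion
    print(german_billion // english_billion)  # 1000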

But some of these words really are cognate with English words—they just don’t necessarily look like it. Bad, for example, is cognate with the English bath. You just need to know that the English sounds spelled as <th>—like the /θ/ in thin or the /ð/ in then—generally became /d/ in German.

And, surprisingly, the German Gift, “poison”, is indeed cognate with the English gift. Gift is derived from the word give, and it means “something given”. The German word is essentially just a highly narrowed sense of the word: poison is something you give someone. (Well, hopefully not something you give someone.)

On a related note, that most notorious of alleged false cognates, the Spanish embarazado, really is related to the English embarrassed. They both trace back to an earlier word meaning “to put someone in an awkward or difficult situation”.

Rather than call these words false cognates, it would be more accurate to call them false friends. This term is broad enough to encompass both words that are unrelated and words that are borrowings or cognates but that have different senses.

This isn’t to say that cognates aren’t useful in learning a language, of course, but sometimes it takes a little effort to see the connections. For example, when I learned German, one of my professors gave us a handout of some common English–German sound correspondences, like the th ~ d connection above. If you know, for instance, that the English /p/ often corresponds to a German /f/ and that the English sound spelled <ea> often corresponds to the German /au/, then the relation between leap and laufen “to run” becomes clearer.

Or if you know that the English sound spelled <ch> often corresponds with the German /k/ or that the English /p/ often corresponds with the German /f/, then the relation between cheap and kaufen “to buy” becomes a little clearer. (Incidentally, this means that the English surname Chapman is cognate with the German Kaufmann.) And knowing that the English <y> sometimes corresponds to the German /g/ might help you see the relationship between the verb yearn and the German adverb gern “gladly, willingly”.
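As a toy demonstration of how regular such correspondences can be, here’s a short Python sketch (my own example; only bath/Bad comes from this post, and the other pairs are well-known cognate sets) that applies the th ~ d substitution mechanically:

    # Apply the th ~ d correspondence between English and German mechanically.
    # (Real correspondences depend on position and history; this is just a toy.)
    for word in ["bath", "thing", "thank", "thorn"]:
        print(word, "->", word.replace("th", "d"))
    # bath -> bad, thing -> ding, thank -> dank, thorn -> dorn
    # (cf. German Bad, Ding, danken, Dorn)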

You don’t have to teach a course in historical linguistics in order to teach a foreign language like German, but you’re doing a disservice if you teach that obviously related pairs like Bad and bath aren’t actually related. Rather than teach students that language is random and treacherous, you can teach them to find the patterns that are already there. A little bit of linguistic background can go a long way.

Plus, you know, real etymology is a lot more fun.

Edited to add: In response to this post, Christopher Bergmann (www.isoglosse.de) created this great diagram of helpful cognates, unhelpful or less-helpful cognates, false cognates, and so on:


For Whomever the Bell Tolls

A couple of weeks ago, Ben Yagoda wrote a post on Lingua Franca in which he confessed to being a whomever scold. He took a few newspapers to task for messing up and using whomever where whoever was actually called for, and then he was taken to task himself by Jan Freeman. He said that “Whomever Mr. Trump nominates will inherit that investigation” should have the subject form, whoever, while she said that the object form, whomever, was indeed correct. So what’s so tricky about whoever that even experts disagree about how to use it?

To answer that, we need to back up and explain why who trips so many people up. Who is an interrogative and relative pronoun, which means that it’s used to ask questions and to form relative clauses. One feature of both questions and relative clauses is that they cause movement—that is, the pronoun moves from where it would normally be in a declarative sentence to a position at the beginning of the clause. For example, a sentence like You gave it to him becomes Who did you give it to? when made into a question, with the personal pronoun him being changed to who and moved to the front. Or a pair of sentences like I gave it to the woman. I met her at the conference becomes I gave it to the woman who I met at the conference. Again, the personal pronoun her is replaced with who and moved up.

Technically, both of these examples should use whom, because in both cases it’s replacing an object pronoun, and whom is the object form of who. But we often have trouble keeping track of the syntactic role of who(m) when it moves, so many people just use who regardless of whether it’s syntactically a subject or object. Sometimes people overcorrect and use whom where it’s syntactically a subject, as in Whom may I say is calling?

Whoever adds another layer of complexity. It’s what we call a fused relative pronoun—it functions as both the relative pronoun and its own antecedent. Let’s go back to our example above: I gave it to the woman who I met at the conference. The antecedent of who is the woman. But we can replace both with whoever: I gave it to whoever I met at the conference.

Because a fused relative functions as its own antecedent, it fills roles in two different clauses—the main clause and the relative clause. And whereas a simple relative like who is always just a subject or an object in the relative clause, whoever can be both a subject and an object simultaneously thanks to its dual roles. There are four possible combinations:

  1. Subject of main clause, subject of relative clause: Whoever ate the last cookie is in trouble.
  2. Object in main clause, subject of relative clause: I’ll give the last cookie to whoever wants it.
  3. Subject of main clause, object in relative clause: Whoever you gave the last cookie to is lucky.
  4. Object in main clause, object in relative clause: I’ll give the last cookie to whoever I like the most.

So if whoever can fill two different roles in two different clauses, how do we decide whether to use the subject or object form? Which role wins out?

The traditional rule is that the role in the relative clause wins. If it’s the subject of the relative clause, use whoever. If it’s the object of the relative clause, use whomever. This means that the prescribed forms in the sentences above would be (1) whoever, (2) whoever, (3) whomever, and (4) whomever.

The rationale for this rule is that the relative clause as a whole functions as the subject or as an object within the main clause. That is, the relative clause is treated as a separate syntactic unit, and that unit is then slotted into the main clause. Thus it doesn’t matter if whoever follows a verb or a preposition—the only thing that matters is its role in the relative clause.
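To spell the rule out, here’s a minimal Python sketch (my own toy formalization, not anything from Yagoda or Freeman) that applies the traditional prescription to the four combinations above:

    def traditional_whoever(role_in_main, role_in_relative):
        # The traditional rule deliberately ignores role_in_main: only the
        # pronoun's role inside the relative clause assigns its case.
        return "whoever" if role_in_relative == "subject" else "whomever"

    # The four combinations from the numbered list above:
    print(traditional_whoever("subject", "subject"))  # whoever
    print(traditional_whoever("object", "subject"))   # whoever
    print(traditional_whoever("subject", "object"))   # whomever
    print(traditional_whoever("object", "object"))    # whomever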

I think this is easier to understand with sentence diagrams. Note that in the diagram below, whoever occupies a place in two different structures—it’s simultaneously the complement of the preposition to and the subject of the relative clause. Syntax diagrams normally branch, but in this case they converge because whoever fuses those two roles together.

Grammatical case is governed by the word on which the pronoun is dependent, so we can think of case assignment as coming down from the verb or preposition to the pronoun. In the diagram above, the case assignment for whoever (represented by the arrows) comes from its role in the relative clause. Normally the preposition to would assign case to its complement, but in this situation it’s blocked, because case has already been assigned at the level of the relative clause.

Of course, case in English has been a mess ever since the Norman Conquest. English went from being a highly inflected language that marked case on all nouns, pronouns, and adjectives to a minimally inflected language that marks case only on a small handful of pronouns. Our internal rules governing pronoun case seem to have broken down to some extent, leading to a number of constructions where subject and object forms are in alternation, such as between you and I or me and her went to the store. The Oxford English Dictionary has examples of whomever being used for whoever going all the way back to John Wyclif in 1380 and examples of whoever being used for whomever going back to Shakespeare in 1599.

Which brings us back to Yagoda’s original post. The sentence that brought correction from Jan Freeman was “Whomever Mr. Trump nominates will inherit that investigation.” Yagoda said it should be whoever; Freeman said it was correct as is. Yagoda eventually conceded that he was wrong and that the Times sentence was right, but not before a side trip into let him who. Freeman linked to this post by James Harbeck, in which he explains that constructions like let him who is without sin don’t work quite the same way as let whoever is without sin.

A lot of people have learned that the clause with whoever essentially functions as the object of let, but many people then overextend that rule and say that the entire construction he who is without sin is the object of let. To understand why it’s not, let’s use another syntax diagram.

Note the differences between this diagram and the previous one. Him is the object of the verb let, and who is . . . is a relative clause that modifies him. But, crucially, him is not part of that clause; it’s merely the antecedent to the relative pronoun. Its case assignment comes from the verb let, while the case assignment of who comes from its role in the relative clause.

For he to be the correct form here, its case would have to be controlled by the verb in a relative clause that it’s not even a part of. Case assignment essentially flows downhill from the structure above the pronoun; it doesn’t flow uphill to a structure above it.

But apparently Harbeck’s post wasn’t enough to convince Yagoda. While admitting that he didn’t understand Harbeck’s argument, he nevertheless said he disagreed with it and declared that he was on team “Let he who is without sin . . .”

Some of the commenters on Yagoda’s post, though, had an elegant test to show that Yagoda was wrong without resorting to syntax trees or discussions of case assignment: simply remove the relative clause. In Let him cast the first stone, it’s clear that him is an object. The relative clause may add extra information about who exactly is casting the first stone, but it’s grammatically optional and thus shouldn’t affect the case of its antecedent.

In conclusion, case in English is a bit of a mess, and a syntactic analysis can help, but sometimes the simplest solutions are best.

Changes at the Arrant Pedantry Store, Plus 20% Off

If you’ve been to the Arrant Pedantry Store recently (and if you haven’t, then why not?), then you may have noticed a change that looks small but is actually pretty big: the ability to edit products. Now, instead of only being able to select from the products I’ve already created, you can make your own. Want to put Battlestar Grammatica on a hoodie, or Editing Is Awesome on a mug, or Grammar Is for Lovers on a thong?

It’s pretty easy. All you have to do is click on an existing product, then click “Do you want to edit the design?” at the bottom of the product picture. You’ll be taken to a screen where you can select the product, select and position the design, and choose the quantity and size. You can even change the color of the design itself, so you could get Stet Wars in red on a blue shirt or I Could Care Fewer in hot pink on a gray shirt or Eschew Obfuscation in black on a black shirt (though I don’t recommend the last one, unless you want to be extra funny).

When you’re done, you’ll go through the regular checkout process, and Spreadshirt will print your product and mail it out just like normal.

And if you order between February 3rd and 5th, you can use the coupon code GET20 to get 20 percent off when you order two or more products. And while you’re there, check out some of the new designs:

Prescriptivism and Language Change

Recently, John McIntyre posted a video in which he defended the unetymological use of decimate to the Baltimore Sun’s Facebook page. When he shared it to his own Facebook page, a lively discussion ensued, including this comment:

Putting aside all the straw men, the ad absurdums, the ad hominems and the just plain sillies, answer me two questions:
1. Why are we so determined that decimate, having once changed its meaning to a significant portion of the population, must be used to mean obliterate and must never be allowed to change again?
2. Is your defence of the status quo on the word not at odds with your determination that it is a living language?
3. If the word were to have been invented yesterday, do you really think “destroy” is the best meaning for it?
…three questions!

Putting aside all the straw men in these questions themselves, let’s get at what he’s really asking, which is, “If decimate changed once before from ‘reduce by one-tenth’ to ‘reduce drastically’, why can’t it change again to the better, more etymological meaning?”

I’ve seen variations on this question pop up multiple times over the last few years when traditional rules have been challenged or debunked. It seems that the notions that language changes and that such change is normal have become accepted by many people, but some of those people then turn around and ask, “So if language changes, why can’t we change it in the way I want?” For example, some may recognize that the that/which distinction is an invention that’s being forced on the language, but they may believe that this is a good change that increases clarity.

On the surface, this seems like a reasonable question. If language is arbitrary and changeable, why can’t we all just decide to change it in a positive way? After all, this is essentially the rationale behind the movements that advocate bias-free or plain language. But whereas those movements are motivated by social or cognitive science and have measurable benefits, this argument in favor of old prescriptive rules is just a case of motivated reasoning.

The bias-free and plain language movements are based on the premises that people deserve to be treated equally and that language should be accessible to its audience. Arguing that decimated really should mean “reduced by one-tenth” is based on a desire to hang on to rules that one was taught in one’s youth. It’s an entirely post hoc rationale, because it’s only employed to defend bad rules, not to determine the best meaning for or use of every word. For example, if we really thought that narrower etymological senses were always better, shouldn’t we insist that cupboard only be used to refer to a board on which one places cups?

This argument is based in part on a misunderstanding of what the descriptivist/prescriptivist debate is all about. Nobody is insisting that decimate must mean “obliterate”, only observing that it is used in the broader sense far more often than the narrower etymological sense. Likewise, no one is insisting that the word must never be allowed to change again, only noting that it is unlikely that the “destroy one-tenth” sense will ever be the dominant sense. Arguing against a particular prescription is not the same as making the opposite prescription.

But perhaps more importantly, this argument is based on a fundamental misunderstanding of how language change works. As Allan Metcalf said in a recent Lingua Franca post, “It seems a basic principle of language that if an expression is widely used, that must be because it is widely useful. People wouldn’t use a word if they didn’t find it useful.” And as Jan Freeman has said, “we don’t especially need a term that means ‘kill one in 10.’” That is, the “destroy one-tenth” sense is not dominant precisely because it is not useful.

The language changed when people began using the word in a more useful way, or to put it more accurately, people changed the language by using the word in a more useful way. You can try to persuade them to change back by arguing that the narrow meaning is better, but this argument hasn’t gotten much traction in the 250 years since people started complaining about the broader sense. (The broader sense, unsurprisingly, dates back to the mid-1600s, meaning that English speakers were using it for a full two centuries before someone decided to be bothered by it.)

But even if you succeed, all you’ll really accomplish is driving decimate out of use altogether. Just remember that death is also a kind of change.

15% Off Plus Free Shipping

I should have posted this sooner, but better late than never. Spreadshirt, the home of the Arrant Pedantry Store, currently has a promotion for 15% off plus free shipping, and it ends tonight. If you’ve been thinking of getting one of the new We Can Even! shirts for that special person in your life for Christmas, now would be the perfect time.


Just use the code 2016OMG at checkout.

Whence Did They Come?

In a recent episode of Slate’s Lexicon Valley podcast, John McWhorter discussed the history of English personal pronouns. Why don’t we use ye or thee and thou anymore? What’s the deal with using they as a gender-neutral singular pronoun? And where do they and she come from?

The first half, on the loss of ye and the original second-person singular pronoun thou, is interesting, but the second half, on the origins of she and they, missed the mark, in my opinion.

I recommend listening to the whole thing, but here’s the short version. The pronouns she and they/them/their(s) are new to the language, relatively speaking. This is what the personal pronoun paradigm looked like in Old English:

Case         Masculine   Neuter   Feminine   Plural
Nominative   hē          hit      hēo        hīe
Accusative   hine        hit      hīe        hīe
Dative       him         him      hire       him
Genitive     his         his      hire       heora

There was some variation in some forms in different dialects and sometimes even within a single dialect, but this table captures the basic forms. (Note that the vowels here basically have classical values, so hē would be pronounced somewhat like hey, hire would be something like hee-reh, and so on. A macron or acute accent just indicates that a vowel is longer.)

One thing that’s surprising is how recognizable many of them are. We can easily see he, him, and his in the singular masculine forms (though hine, like all the other accusative forms, has been lost), it (which has lost its h) in the singular neuter forms, and her in the singular feminine forms. The real oddballs here are the singular feminine nominative form, hēo, and the third-person plural forms. They look nothing like their modern forms.

These changes started when the case system started to disappear at the end of the Old English period. Hē, hēo, and hīe began to merge together, which would have led to a lot of confusion. But during the Middle English period (roughly 1100 to 1500 AD), some new pronouns appeared, and then things started settling down into the paradigms we know now: he/him/his, it/it/its, she/her/her, and they/them/their. (Note that the original dative and genitive forms for it were identical to those for he, but it wasn’t until Early Modern English that these were replaced by it and its, respectively.)

The origin of they/them/their is fairly uncontroversial: these were apparently borrowed from Old Norse–speaking settlers, who invaded during the Old English period and captured large parts of eastern and northern England, forming what is known as the Danelaw. These Old Norse speakers gave us quite a lot of words, including anger, bag, eye, get, leg, and sky.

The Old Norse words for they/them/their looked like this:

Case         Masculine   Neuter   Feminine
Nominative   þeir        þau      þær
Accusative   þá          þau      þær
Dative       þeim        þeim     þeim
Genitive     þeirra      þeirra   þeirra

If you look at the masculine column, you’ll notice the similarity to the current they/them/their paradigm. (Note that the letter that looks like a cross between a b and a p is a thorn, which stood for the sounds now represented by th in English.)

Many Norse borrowings lost their final r, and unstressed final vowels began to be dropped in Middle English, which would yield þei/þeim/þeir. (As with the Old English pronouns, the accusative form was lost.) It seems like a pretty straightforward case of borrowing. The English third-person pronouns began to merge together as the result of some regular sound changes, but the influx of Norse speakers provided us an alternative for the plural forms.

But not so fast, McWhorter says. Borrowing nouns, verbs, and the like is pretty common, but borrowing pronouns, especially personal pronouns, is pretty rare. So he proposes an alternative origin for they/them/their: the Old English demonstrative pronouns—that is, words like this and these (though in Old English, the demonstratives functioned as definite articles too). Since hē/hēo/hīe were becoming ambiguous, McWhorter argues, English speakers turned to the next best thing: a set of words meaning essentially “that one” or “those ones”. Here’s what the plural demonstrative pronouns in Old English looked like:

Case         Plural
Nominative   þā
Accusative   þā
Dative       þǣm/þām
Genitive     þāra/þǣra

(Old English had a common plural form rather than separate plural forms for the masculine, neuter, and feminine genders.)

There’s some basis for this kind of change from a demonstrative to a personal pronoun; third-person pronouns in many languages come from demonstratives, and the third-person plural pronouns in Old Norse actually come from demonstratives themselves, which explains why they look similar to the Old English demonstratives: they all start with þ, and the dative and genitive forms have the -m and -r on the end just like them/their and the Old Norse forms do.

But notice that the vowels are different. Instead of ei in the nominative, dative, and genitive forms, we have ā or ǣ. This may not seem like a big deal, but generally speaking, vowel changes don’t just randomly affect a few words at a time; they usually affect every word with that sound. There has to be some way to explain the change from ā to ei/ey.

And to make matters worse, we know that ā (/ɑː/ in the International Phonetic Alphabet) raised to /ɔː/ (the vowel in court or caught if you don’t rhyme it with cot) during Middle English and eventually raised to /oʊ/ (the vowel in coat) during the Great Vowel Shift. In a nutshell, if English speakers had started using þā as the third-person plural pronoun in the nominative case, we’d be saying tho rather than they today.

But the biggest problem is that the historical evidence just doesn’t support the idea that they originates from þā. The first recorded instance of they, according to The Oxford English Dictionary, is in a twelfth-century manuscript known as the Ormulum, written by a monk known only as Orm. Orm is the Old Norse word for worm, serpent, or dragon, and the manuscript is written in an East Midlands dialect, which means that it came from the Danelaw, the area once controlled by Norse speakers.

In the Ormulum we find forms like þeȝȝ and þeȝȝre for they and their, respectively. (The letter ȝ, known as yogh, could represent a variety of sounds, but in this case it represents /i/ or /j/.) Other early forms of they include þei, þai, and thei.

The spread of these new forms was gradual, moving from the areas of heaviest Old Norse influence throughout the rest of the English-speaking British Isles. The early-fifteenth-century Hengwrt Chaucer, a manuscript of The Canterbury Tales, usually has they as the subject but retains her for genitives (from the Old English plural genitive form hiera or heora) and em for objects (from the Old English plural dative him). The ’em that we use today as a reduced form of them probably traces back to this, making it the last vestige of the original Old English third-person plural pronouns.

So to make a long story short, we have new pronouns that look like Old Norse pronouns that arose in an Old Norse–influenced area and then spread out from there. McWhorter’s argument boils down to “borrowing personal pronouns is rare, so it must not have happened”, and then he ignores or hand-waves away any problems with this theory. The idea that these pronouns instead come from the Old English þā just doesn’t appear to be supported either phonologically or historically.

This isn’t even an area of controversy. When I tweeted about McWhorter’s podcast, Merriam-Webster lexicographer Kory Stamper was surprised, responding, “I…didn’t realize there was an argument about the ety of ‘they’? I mean, all the etymologists I know agree it’s Old Norse.” Borrowing pronouns may be rare, but in this case all the signs point to yes.

For a more controversial etymology, though, you’ll have to wait until a later date, when I wade into the murky etymology of she.

Stupidity on Singular They

A few weeks ago, the National Review published a singularly stupid article on singular they. It’s wrong from literally the first sentence, in which the author, Josh Gelernter, says that “this week, the 127-year-old American Dialect Society voted the plural pronoun ‘they,’ used as a singular pronoun, their Word of the Year.” The vote wasn’t from that week; it was a piece of old news that had recently gone viral again. The American Dialect Society announced its word of the year, as it typically does, at the beginning of the year. Unfortunately, this is a good indication of the quality of the author’s research throughout the rest of the article.

After calling those who use singular they stupid and criticizing the ADS for failing to correct them (which is a fairly serious misunderstanding of the purpose of the ADS and the entire field of linguistics in general), Gelernter says that we already have a gender-neutral third-person pronoun, and it’s he. He cites “the dictionary of record”, Webster’s Second International, for support. His choice of dictionary is telling. For those not familiar with it, Webster’s Second, or W2, was published in 1934 and has been out of print for decades.

The only reason someone would choose it over Webster’s Third, published in 1961, is as a reaction to the perception that W3 was overly permissive. When it was first published, it was widely criticized for its more descriptive stance, which did away with some of the more judgemental usage labels. Even W3 is out of date and has been replaced with the new online Unabridged; W2 is only the dictionary of record of someone who refuses to accept any of the linguistic change or social progress of the last century.

Gelernter notes that W2’s first definition for man is “a member of the human race”, while the “male human being” sense “is the second-given, secondary definition.” Here it would have helped Gelernter to read the front matter of his dictionary. Unlike some other dictionaries, Merriam-Webster arranges entries not in order of primary or central meanings to more peripheral meanings but in order of historical attestation. Man was most likely originally gender-neutral, while the original word for a male human being was wer (which survives only in the word werewolf). Over time, though, wer fell out of use, and man began pulling double duty.[1]

So just because an entry is listed first in a Merriam-Webster dictionary does not mean it’s the primary definition, and just because a word originally meant one thing (and still does mean that thing to some extent) does not mean we must continue to use it that way.

Interestingly, Gelernter admits that the language lost some precision when the plural you pushed out the singular thou as a second-person pronoun, though, bizarrely, he says that it was for good reason, because you had caught on as a more polite form of address. The use of you as a singular pronoun started as a way to be polite and evolved into an obsession with social status, in which thou was eventually relegated to inferiors before finally dropping out of use.

The resurgence of singular they in the twentieth century was driven by a different sort of social force: an acknowledgement that the so-called gender-neutral he is not really gender-neutral. Research has shown that gender-neutral uses of he and man cause readers to think primarily of males, even when context makes it clear that the person could be of either gender. (Here’s just one example.) They send the message that men are the default and women are other. Embracing gender-neutral language, whether it’s he or she or they or some other solution, is about correcting that imbalance by acknowledging that women are people too.

And in case you still think that singular they is just some sort of newfangled politically correct usage, you should know that it has been in use since the 1300s and has been used by literary greats from Chaucer to Shakespeare to Orwell.[2] For centuries, nobody batted an eye at singular they, until grammarians started to proscribe it in favor of generic he in the eighteenth and nineteenth centuries. Embracing singular they doesn’t break English grammar; it merely embraces something that’s been part of English grammar for seven centuries.

At the end, we get to the real heart of Gelernter’s article: ranting about new gender-neutral job titles in the armed forces. Gelernter seems to think that changing to gender-neutral titles will somehow make the members of our armed forces suddenly forget how to do their jobs. This isn’t really about grammar; it’s about imagining that it’s a burden to think about the ways in which language affects people, that it’s a burden to treat women with the same respect as men.

But ultimately, it doesn’t matter what Josh Gelernter thinks about singular they or about gender-neutral language in general. Society will continue to march on, just as language has continued to march on in the eight decades since his beloved Webster’s Second was published. But remember that we have a choice in deciding how language will march on. We can use our language to reflect outdated and harmful stereotypes, or we can use it to treat others with the respect they deserve. I know which one I choose.

Notes

1. The Online Etymology Dictionary notes that a similar thing happened with the Latin vir (cognate with wer) and homo. Vir fell out of use as homo took over the sense of “male human”.
2. I once wrote that Orwell didn’t actually use singular they; it turns out that the quote attributed to him in Merriam-Webster’s Dictionary of English Usage was wrong, but he really did use it.