Arrant Pedantry

By

You Are Not Dr. Seuss

A couple of weeks ago, Nancy Friedman tweeted a link to an article about Netflix’s forthcoming adaptation of Green Eggs and Ham. And sadly but predictably, whoever wrote the press release about the announcement felt compelled to write in Seussian verse, despite having no idea how to do so.

Here’s the official press release, and here’s the poem—and I use the term loosely—in all its terrible glory:

Issued from Netflix headquarters.
Delivered straight to all reporters.

We’d love to share some happy news
based on the rhymes of Dr. Seuss.
Green Eggs and Ham will become a show
and you’re among the first to know.

In this richly animated production,
a 13-episode introduction,
standoffish inventor (Guy, by name)
and Sam-I-Am of worldwide fame,
embark on a cross-country trip
that tests the limits of their friendship.
As they learn to try new things,
they find out what adventure brings.
Of course they also get to eat
that famous green and tasty treat!

Cindy Holland, VP of Original Content for Netflix
threw her quote into the mix:
“We think this will be a hit
Green Eggs and Ham is a perfect fit
for our growing slate of amazing stories
available exclusively in all Netflix territories.
You can stream it on a phone.
You can stream it on your own.
You can stream it on TV.
You can stream it globally.”

I have to admit that I initially didn’t make it past the beginning of the third verse, though I knew we were in trouble from the first line. The problem is that while it’s pretty easy to make a rhyme, it’s a lot harder to make lines that scan right. To scan in this sense means to show the metrical structure—the patterns of stressed and unstressed syllables in a line. If terms like iambic pentameter make your eyes glaze over, don’t worry about them for now—let’s just look at how you’d stress some of the lines. I’ll show the stresses with all caps.

ISSued from NETflix headQUARTers

In the first line we have a pattern of stress-unstress-unstress. Each group with a stressed syllable is called a foot, and we have three feet in this line. (The last foot is short one unstressed syllable, but this is fine.) The pattern for the first line is DA-da-da DA-da-da DA-da. But notice that the stress on “headquarters” has had to shift; normally you say HEADquarters, but to keep the rhythm even, you have to say headQUARTers instead. This is not terrible, but it’s not a great start.

But the second line doesn’t match up with the first:

deLIVered STRAIGHT to ALL rePORTers

This one starts unstressed rather than stressed and alternates stressed-unstressed for the rest of the line. Alternatively, you could read the first line with this kind of stress (ISSued FROM netFLIX headQUARTers), but that requires shifting the stress on “Netflix” too and stressing the preposition “from”, which is normally unstressed. And even then, the second line still has that extra unstressed syllable at the beginning. The pattern here is da-DA da-DA da-DA da-DA da.

The next stanza sticks more closely to the unstressed-stressed pattern, but again it requires the reader to put the stresses in unusual places. And then there’s an extra unstressed syllable in the third line (green EGGS and HAM will beCOME a SHOW—why not green EGGS and HAM will BE a SHOW?).

The third stanza is more of a wreck. Just say the first line out loud and try to figure out where the stresses are or what the pattern is:

in this RICHly ANimated proDUCtion

The worst line by far, though, has to be the beginning of the fourth stanza:

CINdy HOLland, V(EE)P(EE) of oRIginal CONtent for NETflix

In this line we have two feet with the DA-da pattern, then two back-to-back stresses, then a couple of unstressed syllables, and then a few feet with the DA-da-da pattern.

Surprisingly, though, the poem ends strong, with a metrical pattern straight out of Green Eggs and Ham itself. Compare:

YOU can STREAM it ON a PHONE.

WOULD you EAT them ON a PLANE?

This is obviously where they stopped trying to shoehorn in phrases like “Cindy Holland, VP of Original Content for Netflix” and started following the source material more closely.

The trouble with most people who try to imitate Seuss is that they think poetry is just about the rhyme. (And as a parent of young children, I can tell you that there are an awful lot of children’s book authors who apparently feel compelled to write in verse, despite being terrible at it.) Rhyme is an important part of verse, but rhyme isn’t worth much without the rhythm of well-written lines. Imagine trying to write a song by throwing together a bunch of notes but not paying any attention to rhythm. It would be a disaster, and no one would want to listen to it.

This isn’t to say that you can’t play around with meter, of course, but it should be deliberate, which means that you have to understand it first. Even Dr. Seuss fudged the meter on occasion, but this was the exception and not the rule. Rhythm shouldn’t be something you accidentally stumble upon from time to time.

You don’t necessarily have to know an anapestic tetrameter from an iambic pentameter to write good verse, but you need to have a good sense of rhythm. Try marking the stresses in each line to see if there’s a consistent pattern. And if you find yourself stumbling or awkwardly stressing certain words as you read the lines aloud, then that’s a good sign that something is wrong.
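If you like to tinker, the stress-marking exercise can be sketched in a few lines of Python. This is only an illustration: the function name `stress_pattern` is made up, and the syllable breaks are my own rough hand-splitting of the lines scanned above, not the output of any real scansion tool. Stressed syllables are written in all caps, just as in the examples earlier.

```python
def stress_pattern(syllables):
    """Return 'S' for each all-caps (stressed) syllable and 'u' for the rest."""
    return "".join("S" if s.isupper() else "u" for s in syllables)

# Hand-split syllables (an assumption), with stresses marked as in the post.
netflix_line_1 = ["ISS", "ued", "from", "NET", "flix", "head", "QUART", "ers"]
netflix_line_2 = ["de", "LIV", "ered", "STRAIGHT", "to", "ALL", "re", "PORT", "ers"]
netflix_stream = ["YOU", "can", "STREAM", "it", "ON", "a", "PHONE"]
seuss_plane = ["WOULD", "you", "EAT", "them", "ON", "a", "PLANE"]

# The opening couplet does not share a pattern ...
opening_matches = stress_pattern(netflix_line_1) == stress_pattern(netflix_line_2)

# ... but the closing lines match the Seuss original exactly.
closing_matches = stress_pattern(netflix_stream) == stress_pattern(seuss_plane)

print(stress_pattern(netflix_line_1), stress_pattern(netflix_line_2), opening_matches)
print(stress_pattern(netflix_stream), stress_pattern(seuss_plane), closing_matches)
```

A mismatch between lines that are supposed to pair up is the same warning sign as stumbling when you read them aloud.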

Dr. Seuss is so revered because he was so good, and it’s not easy to imitate him. So if you still can’t write good verse despite your best efforts, that’s nothing to be ashamed of. But maybe you should stick to writing press releases.


Language, Logic, and Correctness

In “Why Descriptivists Are Usage Liberals”, I said that there are some logical problems with declaring something to be right or wrong based on evidence. A while back I explored this problem in a piece titled “What Makes It Right?” over on Visual Thesaurus.

The terms prescriptive and descriptive were borrowed from philosophy, where they are used to talk about ethics, and the tension between these two approaches is reflected in language debates today. The questions we have today about correct usage are essentially the same questions philosophers have been debating since the days of Socrates and Plato: what is right, and how do we know?

As I said on Visual Thesaurus, all attempts to answer these questions run into a fundamental logical problem: just because something is doesn’t mean it ought to be. Most people are uncomfortable with the idea of moral relativism and believe at some level that there must be some kind of objective truth. Unfortunately, it’s not entirely clear just where we find this truth or how objective it really is, but we at least operate under the convenient assumption that it exists.

But things get even murkier when we try to apply this same assumption to language. While we may feel safe saying that murder is wrong and would still be wrong even if a significant portion of the population committed murder, we can’t safely make similar arguments about language. Consider the word bird. In Old English, the form of English spoken from about 500 AD to about 1100 AD, the word was brid. Bird began as a dialectal variant that spread and eventually supplanted brid as the standard form by about 1600. Have we all been saying this word wrong for the last four hundred years or so? Is saying bird just as wrong as saying nuclear as nucular?

No, of course not. Even if it had been considered an error once upon a time, it’s not an error anymore. Its widespread use in Standard English has made it standard, while brid would now be considered an error (if someone were to actually use it). There is no objectively correct form of the word that exists independent of its use. That is, there is no platonic form of the language, no linguistic Good to which a grammarian-king can look for guidance in guarding the city.

This is why linguistics is at its core an empirical endeavor. Linguists concern themselves with investigating linguistic facts, not with making value judgements about what should be considered correct or incorrect. As I’ve said before, there are no first principles from which we can determine what’s right and wrong. Take, for example, the argument that you should use the nominative form of pronouns after a copula verb. Thus you should say It is I rather than It is me. But this argument assumes as prior the premise that copula verbs work this way and then deduces that anything that doesn’t work this way is wrong. Where would such a putative rule come from, and how do we know it’s valid?

Linguists often try to highlight the problems with such assumptions by pointing out, for example, that French requires an object pronoun after the copula (in French you say c’est moi [it’s me], not c’est je [it’s I]) or that English speakers, including renowned writers, have long used object forms in this position. That is, there is no reason to suppose that this rule has to exist, because there are clear counterexamples. But then, as I said before, some linguists leave the realm of strict logic and argue that if everyone says it’s me, then it must be correct.

Some people then counter by calling this argument fallacious, and strictly speaking, it is. Mededitor has called this the Jane Austen fallacy (if Jane Austen or some other notable past writer has done it, then it must be okay), and one commenter named Kevin S. has made similar arguments in the comments on Kory Stamper’s blog, Harmless Drudgery.

There, Kevin S. attacked Ms. Stamper for noting that using lay in place of lie dates at least to the days of Chaucer, that it is very common, and that it “hasn’t managed to destroy civilization yet.” These are all objective facts, yet Kevin S. must have assumed that Ms. Stamper was arguing that if it’s old and common, it must be correct. In fact, she acknowledged that it is nonstandard and didn’t try to argue that it wasn’t or shouldn’t be. But Kevin S. pointed out a few fallacies in the argument that he assumed that Ms. Stamper was making: an appeal to authority (if Chaucer did it, it must be okay), the “OED fallacy” (if it has been used that way in the past, it must be correct), and the naturalistic fallacy, which is deriving an ought from an is (lay for lie is common; therefore it ought to be acceptable).

And as much as I hate to say it, technically, Kevin S. is right. Even though he was responding to an argument that hadn’t been made, linguists and lexicographers do frequently make such arguments, and they are in fact fallacies. (I’m sure I’ve made such arguments myself.) Technically, any argument that something should be considered correct or incorrect isn’t a logical argument but a persuasive one. Again, this goes back to the basic difference between descriptivism and prescriptivism. We can make statements about the way English appears to work, but making statements about the way English should work or the way we think people should feel about it is another matter.

It’s not really clear what Kevin S.’s point was, though, because he seemed to be most bothered by Ms. Stamper’s supposed support of some sort of flabby linguistic relativism. But his own implied argument collapses in a heap of fallacies itself. Just as we can’t necessarily call something correct just because it occurred in history or because it’s widespread, we can’t necessarily call something incorrect just because someone invented a rule saying so.

I could invent a rule saying that you shouldn’t ever use the word sofa because we already have the perfectly good word couch, but you would probably roll your eyes and say that’s stupid because there’s nothing wrong with the word sofa. Yet we give heed to a whole bunch of similarly arbitrary rules invented two or three hundred years ago. Why? Technically, they’re no more valid or logically sound than my rule.

So if there really is such a thing as correctness in language, and if any argument about what should be considered correct or incorrect is technically a logical fallacy, then how can we arrive at any sort of understanding of, let alone agreement on, what’s correct?

This fundamental inability to argue logically about language is a serious problem, and it’s one that nobody has managed to solve or, in my opinion, ever will completely solve. This is why the war of the scriptivists rages on with no end in sight. We see the logical fallacies in our opponents’ arguments and the flawed assumptions underlying them, but we don’t acknowledge—or sometimes even see—the problems with our own. Even if we did, what could we do about them?

My best attempt at an answer is that both sides simply have to learn from each other. Language is a democracy, true, but, just like the American government, it is not a pure democracy. Some people—including editors, writers, English teachers, and usage commentators—have a disproportionate amount of influence. Their opinions carry more weight because people care what they think.

This may be inherently elitist, but it is not necessarily a bad thing. We naturally trust the opinions of those who know the most about a subject. If your car won’t start, you take it to a mechanic. If your tooth hurts, you go to the dentist. If your writing has problems, you ask an editor.

Granted, using lay for lie is not bad in the same sense that a dead starter motor or an abscessed tooth is bad: it’s a problem only in the sense that some judge it to be wrong. Using lay for lie is perfectly comprehensible, and it doesn’t violate some basic rule of English grammar such as word order. Furthermore, it won’t destroy the language. Just as we have pairs like lay and lie or sit and set, we used to have two words for hang, but nobody claims that we’ve lost a valuable distinction here by having one word for both transitive and intransitive uses.

Prescriptivists want you to know that people will judge you for your words (and—let’s be honest—usually they’re the ones doing the judging), and descriptivists want you to soften those judgements or even negate them by injecting them with a healthy dose of facts. That is, there are two potential fixes for the problem of using words or constructions that will cause people to judge you: stop using that word or construction, or get people to stop judging you and others for that use.

In reality, we all use both approaches, and, more importantly, we need both approaches. Even most dyed-in-the-wool prescriptivists will tell you that the rule banning split infinitives is bogus, and even most liberal descriptivists will acknowledge that if you want to be taken seriously, you need to use Standard English and avoid major errors. Problems occur when you take a completely one-sided approach, insisting either that something is an error even if almost everyone does it or that something isn’t an error even though almost everyone rejects it. In other words, good usage advice has to consider not only the facts of usage but speakers’ opinions about usage.

For instance, you can recognize that irregardless is a word, and you can even argue that there’s nothing technically wrong with it because nobody cares that the verbs bone and debone mean the same thing, but it would be irresponsible not to mention that the word is widely considered an error in educated speech and writing. Remember that words and constructions are not inherently correct or incorrect and that mere use does not necessarily make something correct; correctness is a judgement made by speakers of the language. This means that, paradoxically, something can be in widespread use even among educated speakers and can still be considered an error.

This also means that on some disputed items, there may never be anything approaching consensus. While the facts of usage may be indisputable, opinions may still be divided. Thus it’s not always easy or even possible to label something as simply correct or incorrect. Even if language is a democracy, there is no simple majority rule, no up and down vote to determine whether something is correct. Something may be only marginally acceptable or correct only in certain situations or according to certain people.

But as in a democracy, it is important for people to be informed before metaphorically casting their vote. Bryan Garner argues in his Modern American Usage that what people want in language advice is authority, and he’s certainly willing to give it to you. But I think what people really need is information. For example, you can state authoritatively that regardless of past or present usage, singular they is a grammatical error and always will be, but this is really an argument, not a statement of fact. And like all arguments, it should be supported with evidence. An argument based solely or primarily on one author’s opinion—or even on many people’s opinions—will always be a weaker argument than one that considers both facts and opinion.

This doesn’t mean that you have to accept every usage that’s supported by evidence, nor does it mean that all evidence is created equal. We’re all human, we all still have opinions, and sometimes those opinions are in defiance of facts. For example, between you and I may be common even in educated speech, but I will probably never accept it, let alone like it. But I should not pretend that my opinion is fact, that my arguments are logically foolproof, or that I have any special authority to declare it wrong. I think the linguist Thomas Pyles said it best:

Too many of us . . . would seem to believe in an ideal English language, God-given instead of shaped and molded by man, somewhere off in a sort of linguistic stratosphere—a language which nobody actually speaks or writes but toward whose ineffable standards all should aspire. Some of us, however, have in our worst moments suspected that writers of handbooks of so-called “standard English usage” really know no more about what the English language ought to be than those who use it effectively and sometimes beautifully. In truth, I long ago arrived at such a conclusion: frankly, I do not believe that anyone knows what the language ought to be. What most of the authors of handbooks do know is what they want English to be, which does not interest me in the least except as an indication of the love of some professors for absolute and final authority.[1]

In usage, as in so many other things, you have to learn to live with uncertainty.

  1. [1] “Linguistics and Pedagogy: The Need for Conciliation,” in Selected Essays on English Usage, ed. John Algeo (Gainesville: University Presses of Florida, 1979), 169–70.


No, Online Grammar Errors Have Not Increased by 148%

Yesterday a post appeared on QuickandDirtyTips.com (home of Grammar Girl’s popular podcast) that appears to have been written by a company called Knowingly, which is promoting its Correctica grammar-checking tool. They claim that “online grammar errors have increased by 148% in nine years”. If true, it would be a pretty shocking claim, but the numbers immediately sent up some red flags.

They searched for seventeen different errors and compared the numbers from nine years ago to the numbers from today. From the description, I gather that the first set of numbers comes from a publicly available set of data that Google culled from public web pages. The data was released in 2006 and is hosted by the Linguistic Data Consortium. You can read more about the data here, but this part is the most relevant:

We processed 1,024,908,267,229 words of running text and are publishing the counts for all 1,176,470,663 five-word sequences that appear at least 40 times. There are 13,588,391 unique words, after discarding words that appear less than 200 times.

So the data is taken from over a trillion words of text, but some sequences were discarded if they didn’t appear frequently enough, and you can only search sequences up to five words long. Also note that while the data was released in 2006, it does not necessarily all come from 2006; some of it could have come from web pages that were older than that.

It sounds like the second set of numbers comes from a series of Google searches; it simply says “search result data today”. It isn’t explicitly stated, but it appears that the search terms were put in quotes to find exact strings. But we’re already comparing apples and oranges: though the first set of data came from a known sample size (just over a trillion words) and was cleaned up a bit by having outliers thrown out, we have no idea how big the second sample size is. How many words are you effectively searching when you do a search in Google?

This is why corpora usually present not just raw numbers but normalized numbers—that is, not just an overall count, but a count per thousand words or something similar. Knowing that you have 500 instances of something in data set A and 1000 instances in data set B doesn’t mean anything unless you know how big those sets are, and in this case we don’t.
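To make the point concrete, here is a minimal sketch of normalization in Python. The counts of 500 and 1000 are the ones from the paragraph above, but the two corpus sizes are invented purely for illustration, and `per_million` is a hypothetical helper name.

```python
def per_million(count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    return count * 1_000_000 / corpus_size

# Raw counts from the example above.
count_a, count_b = 500, 1000

# Hypothetical corpus sizes (assumptions, not real figures).
size_a = 1_000_000    # data set A: one million words
size_b = 10_000_000   # data set B: ten million words

rate_a = per_million(count_a, size_a)
rate_b = per_million(count_b, size_b)

print(f"A: {rate_a:.0f} per million words")
print(f"B: {rate_b:.0f} per million words")
```

With these made-up sizes, data set B has twice the raw count but only a fifth of the normalized rate, which is why raw counts alone can't be compared.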

This problem is ameliorated somewhat by looking not just at the raw numbers but at the error rates. That is, they searched for both the correct and incorrect forms of each item, calculated how frequent the erroneous form was, and compared the rates from 2006 to the rates from 2015. It would still be better to compare two similar datasets, because we have no idea how different the cleaned-up Google Ngrams data is from raw Google search data, but at least this allows us to make some rough comparisons. But notice the huge differences between the “then” and “now” numbers in the table below. Obviously the 2015 data represents a much larger set. (I’ve split their table into two pieces, one for the correct terms and one for the incorrect terms, to make them fit in my column here.)

Correct Term             Then          Now
jugular vein             56,409        794,000
bear in mind             931,235       35,500,000
head over heels          179,491       8,130,000
chocolate mousse         237,870       6,790,000
egg yolk                 152,458       5,420,000
without further ado      120,124       1,960,000
whet your appetite       52,850        533,000
heroin and morphine      3,220         112,000
reach across the aisle   2,707         117,000
herd mentality           19,444        411,000
weather vane             70,906        477,000
zombie horde             21,091        464,000
chili peppers            1,105,405     29,100,000
brake pedal              138,765       1,450,000
pique your interest      8,126         296,000
lessen the burden        14,926        389,000
bridal shower            852,371       16,500,000

Incorrect Term           Then          Now
juggler vein             693           4,150
bare in mind             18,492        477,000
head over heals          12,633        398,000
chocolate moose          14,504        364,000
egg yoke                 2,028         88,900
without further adieu    13,170        437,000
wet your appetite        8,930         216,000
heroine and morphine     45            3,860
reach across the isle    93            11,800
heard mentality          313           21,300
weather vein             698           16,100
zombie hoard             744           64,200
chilly peppers           2,532         155,000
brake petal              417           27,800
peek your interest       320           111,000
lesson the burden        212           91,400
bridle shower            182           157,000

But then the Correctica team commits a really major statistical goof—they average all those percentages together to calculate an overall percentage. Here’s their data again:

Incorrect Term           Then      Now       Increase
juggler vein             1.2%      0.5%      –57.2%
bare in mind             1.9%      1.3%      –31.9%
head over heals          6.6%      4.7%      –29.0%
chocolate moose          5.7%      5.1%      –11.5%
egg yoke                 1.3%      1.6%      22.9%
without further adieu    9.9%      18.2%     84.5%
wet your appetite        14.5%     28.8%     99.5%
heroine and morphine     1.4%      3.3%      141.7%
reach across the isle    3.3%      9.2%      175.8%
heard mentality          1.6%      4.9%      211.0%
weather vein             1.0%      3.3%      234.9%
zombie hoard             3.4%      12.2%     256.7%
chilly peppers           0.2%      0.5%      131.8%
brake petal              0.3%      1.9%      527.9%
peek your interest       3.8%      27.3%     619.8%
lesson the burden        1.4%      19.0%     1258.6%
bridle shower            0.0%      0.9%      4315.2%
(average)                3.4%      8.4%      148.2%

They simply add up all the percentages (1.2% + 1.9% + 6.6% + . . .) and divide by the number of percentages, 17. But this number is meaningless. Imagine that we were comparing two items: isn’t is used 9,900 times and ain’t 100 times, and regardless is used 999 times and irregardless 1 time. This means that when there’s a choice between isn’t and ain’t, ain’t is used 1% of the time (100/(9900+100)), and when there’s a choice between regardless and irregardless, irregardless is used .1% of the time (1/(999+1)). If you average 1% and .1%, you get .55%, but this isn’t the overall error rate.

To get an overall error rate, you need to calculate the percentage from the totals. We have to take the total number of errors and the total number of opportunities to use either the correct or the incorrect form. This gives us (1+100)/((9900+999)+(100+1)), or 101/11000, which works out to .92%, not .55%.
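Here is the same toy example as a short Python sketch, using only the made-up counts from the paragraphs above. The variable names are mine and purely illustrative.

```python
# (correct count, incorrect count) for each pair in the toy example
pairs = {
    "isn't / ain't": (9900, 100),
    "regardless / irregardless": (999, 1),
}

# Per-item error rate: errors divided by all opportunities for that item.
item_rates = [bad / (good + bad) for good, bad in pairs.values()]

# What the study did: average the per-item rates.
mean_of_rates = sum(item_rates) / len(item_rates)

# What you actually want: total errors over total opportunities.
total_bad = sum(bad for _, bad in pairs.values())
total_all = sum(good + bad for good, bad in pairs.values())
pooled_rate = total_bad / total_all

print(f"mean of rates: {mean_of_rates:.2%}")
print(f"pooled rate:   {pooled_rate:.2%}")
```

The two numbers differ because averaging the rates gives the rare pair the same weight as the common one, even though it accounts for only a tenth as many opportunities.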

When we count up the totals and calculate the overall rates, we get an error rate of 1.88% for then (not 3.4%) and 2.38% for now (not 8.4%). That means the increase from 2006 to 2015 is not 148.2%, but a much more modest 26.64%. (By the way, I’m not sure where they got 148.2%; by my calculations, it should be 147.1%, but I could have made a mistake somewhere.) This is still a rather impressive increase in errors from 2006 to today, but the problems with the data set make it impossible to say for sure if this number is accurate or meaningful. “Heroine and morphine” occurred 45 times out of over a trillion words. Even if the error rate jumped 141.73% from 2006 to 2015, and even if the two sample sets were comparable, this would still probably amount to nothing more than statistical noise.
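For anyone who wants to check the arithmetic, here is a rough Python sketch that recomputes both figures from the counts in the tables above. The counts are copied from those tables; the variable names and layout are my own, and this is not how Correctica ran its numbers.

```python
# (incorrect term, correct then, incorrect then, correct now, incorrect now)
rows = [
    ("juggler vein", 56_409, 693, 794_000, 4_150),
    ("bare in mind", 931_235, 18_492, 35_500_000, 477_000),
    ("head over heals", 179_491, 12_633, 8_130_000, 398_000),
    ("chocolate moose", 237_870, 14_504, 6_790_000, 364_000),
    ("egg yoke", 152_458, 2_028, 5_420_000, 88_900),
    ("without further adieu", 120_124, 13_170, 1_960_000, 437_000),
    ("wet your appetite", 52_850, 8_930, 533_000, 216_000),
    ("heroine and morphine", 3_220, 45, 112_000, 3_860),
    ("reach across the isle", 2_707, 93, 117_000, 11_800),
    ("heard mentality", 19_444, 313, 411_000, 21_300),
    ("weather vein", 70_906, 698, 477_000, 16_100),
    ("zombie hoard", 21_091, 744, 464_000, 64_200),
    ("chilly peppers", 1_105_405, 2_532, 29_100_000, 155_000),
    ("brake petal", 138_765, 417, 1_450_000, 27_800),
    ("peek your interest", 8_126, 320, 296_000, 111_000),
    ("lesson the burden", 14_926, 212, 389_000, 91_400),
    ("bridle shower", 852_371, 182, 16_500_000, 157_000),
]

def pooled_rate(pairs):
    """Total errors divided by total opportunities (correct + incorrect)."""
    pairs = list(pairs)
    errors = sum(bad for _, bad in pairs)
    opportunities = sum(good + bad for good, bad in pairs)
    return errors / opportunities

def mean_of_rates(pairs):
    """The study's approach: average the per-item error rates."""
    rates = [bad / (good + bad) for good, bad in pairs]
    return sum(rates) / len(rates)

then_pairs = [(r[1], r[2]) for r in rows]
now_pairs = [(r[3], r[4]) for r in rows]

rate_then = pooled_rate(then_pairs)
rate_now = pooled_rate(now_pairs)
increase = rate_now / rate_then - 1

print(f"pooled:        then {rate_then:.2%}, now {rate_now:.2%}, change {increase:.2%}")
print(f"mean of rates: then {mean_of_rates(then_pairs):.1%}, now {mean_of_rates(now_pairs):.1%}")
```

The pooled figures are dominated by the most frequent items, such as bear in mind and chili peppers, which is exactly why they differ so much from the simple average of seventeen percentages.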

And even if these numbers were accurate and meaningful, there’s still the question of research design. They claim that grammar errors have increased, but all of the items are spelling errors, and most of them are rather obscure ones at that. At best, this study only tells us that these errors have increased that much, not that grammar errors in general have increased that much. If you’re setting out to study grammar errors (using grammar in the broad sense), why would you assume that these items are representative of the phenomenon in general?

So in sum, the study is completely bogus, and it’s obviously nothing more than an attempt to sell yet another grammar-checking service. Is it important to check your writing for errors? Sure. Can Correctica help you do that? I have no idea. But I do know that this study doesn’t show an epidemic of grammar errors as it claims to.

(Here’s the data if anyone’s interested.)


Why Descriptivists Are Usage Liberals

Outside of linguistics, the people who care most about language tend to be prescriptivists—editors, writers, English teachers, and so on—while linguists and lexicographers are descriptivists. “Descriptive, not prescriptive!” is practically the linguist rallying cry. But we linguists have done a terrible job of explaining just what that means and why it matters. As I tried to explain in “What Descriptivism Is and Isn’t”, descriptivism is essentially just an interest in facts. That is, we make observations about what the language is rather than state opinions about how we’d like it to be.

Descriptivism is often cast as the opposite of prescriptivism, but they aren’t opposites at all. But no matter how many times we insist that “descriptivism isn’t ‘anything goes’”, people continue to believe that we’re all grammatical anarchists and linguistic relativists, declaring everything correct and saying that there’s no such thing as a grammatical error.

Part of the problem is that whenever you conceive of two approaches as opposing points of view, people will assume that they’re opposite in every regard. Prescriptivists generally believe that communication is important, that having a standard form of the language facilitates communication, and that we need to uphold the rules to maintain the standard. And what people often see is that linguists continually tear down the rules and say that they don’t really matter. The natural conclusion for many people is that linguists don’t care about maintaining the standard or supporting good communication—they want a linguistic free-for-all instead. Then descriptivists appear to be hypocrites for using the very standard they allegedly despise.

It’s true that many descriptivists oppose rules that they disagree with, but as I’ve said before, this isn’t really descriptivism—it’s anti-prescriptivism, for lack of a better term. (Not because it’s the opposite of prescriptivism, but because it often prescribes the opposite of what traditional linguistic prescriptivism does.) Just ask yourself how an anti-prescriptive sentiment like “There’s nothing wrong with singular they” is a description of linguistic fact.

So if that’s not descriptivism, then why do so many linguists have such liberal views on usage? What does being against traditional rules have to do with studying language? And how can linguists oppose rules and still be in favor of good communication and Standard English?

The answer, in a nutshell, is that we don’t think that the traditional rules have much to do with either good communication or Standard English. The reason why we think that is a little more complicated.

Linguists have had a hard time defining just what Standard English is, but there are several ideas that recur in attempts to define it. First, although Standard English can certainly be spoken, it is often conceived of as a written variety, especially in the minds of non-linguists. Second, it is generally more formal, making it appropriate for a wide range of serious topics. Third, it is educated, or rather, it is used by educated speakers. Fourth, it is supraregional, meaning that it is not tied to a specific region, as most dialects are, but that it can be used across an entire language area. And fifth, it is careful or edited. Notions of uniformity and prestige are often thrown into the mix as well.

Careful is a vague term, but it means that users of Standard English put some care into what they say or write. This is especially true of most published writing; the entire profession of editing is dedicated to putting care into the written word. So it’s tempting to say that following the rules is an important part of Standard English and that tearing down those rules tears down at least that part of Standard English.

But the more important point is that Standard English is ultimately rooted in the usage of actual speakers and writers. It’s not just that there is no legislative body declaring what’s standard, but that there are no first principles from which we can deduce what’s standard. All languages are different, and they change over time, so how can we know what’s right or wrong except by looking at the evidence? This is what descriptivists try to do when discussing usage: look at the evidence from historical and current usage and draw meaningful conclusions about what’s right or wrong. (There are some logical problems with this, but I’ll address those another time.)

Let’s take singular they, for example. The evidence shows that it’s been in use for centuries not just by common folk or educated speakers but by well-respected writers from Geoffrey Chaucer to Jane Austen. The evidence also shows that it’s used in fairly predictable ways, generally to refer to indefinite pronouns or to nouns that don’t specify gender. Its use has not caused the grammar of English to collapse, and it seems like a rather felicitous solution to the gender-neutral pronoun problem. So at least from a dispassionate linguistic point of view, there is no problem with it.

From another point of view, though, there is something wrong with it: some people don’t like it. This is a social rather than a linguistic fact, but it’s a fact nonetheless. But this social fact arose because at some point someone declared—contrary to the linguistic facts—that singular they is a grammatical error that should be avoided. Here’s where descriptivists depart from description and get into anti-prescription. If people have been taught to dislike this usage, it stands to reason that they could be taught to get over this dislike.

That is, linguists are engaging in anti-prescriptivism to counter the prescriptivism that isn’t rooted in linguistic fact. So when they debunk or tear down traditional rules, it’s not that they don’t value Standard English or good communication; it’s that they think that those particular rules have nothing to do with either.

To be fair, I think that many linguists think they’re still merely describing when they’re countering prescriptive attitudes. Saying that singular they has been used for centuries by respected writers, that it appears to follow fairly well-defined rules, and that the proscription against it is not based in linguistic fact is descriptive; saying that people need to get over their dislike and accept it is not.

And this is precisely why I think descriptivism and prescriptivism not only can but should coexist. It’s not wrong to have opinions on what’s right or wrong, but I think it’s better if those opinions have some basis in fact. Guidance on issues of usage can really only be relevant and valid if it takes all the evidence into account—who uses a certain word or construction, in what circumstances, and so on. These are all facts that can be investigated, and linguistics provides a solid methodological framework for doing so. Anything that ignores the facts reduces to one sort of ipse dixit or another, either a statement from an authority declaring something to be right or wrong or one’s own preferences or pet peeves.

Linguists value good communication, and we recognize the importance of Standard English. But our opinions on both are informed by our study of language and by our emphasis on facts and evidence. This isn’t “anything goes”, or at least no more so than language has always been. People have always worried about language change, but language has always turned out fine. Inventing new rules to try to regulate language will not save it from destruction, and tossing out the rules that have no basis in fact will not hasten the language’s demise. But recognizing that some rules don’t matter may alleviate some of those worries, and I think that’s a good thing for both camps.


Fifty Shades of Bad Grammar Advice

A few weeks ago, the folks at the grammar-checking website Grammarly wrote a piece about supposed grammar mistakes in Fifty Shades of Grey. Despite being a runaway hit, the book has frequently been criticized for its terrible prose, and Grammarly apparently saw an opportunity to fix some of the book’s problems (and probably sell its grammar-checking services along the way).

The first problem, of course, is that most of the errors Grammarly identified have nothing to do with grammar. The second is that most of their edits not only fail to fix the clunky prose but actually make it worse.

Mark Allen already took Grammarly to task in a post on the Copyediting blog, saying that their edits “lack restraint”, that “the list is full of style choices and non-errors”, and that “it fails to make a case for the value of proofreading, and, by association, . . . reflects poorly on the craft of copyediting.” I agreed and thought at the time that nothing more needed to be said.

But then Grammarly decided to go even further. In this infographic, they claim to have found “similar gaffes” in the works of authors ranging from Nicholas Sparks to Shakespeare.

The first edit suggests that Nicholas Sparks needs a comma in the sentence “I am a common man with common thoughts and I’ve led a common life.” It’s true that this is a compound sentence, and such sentences typically require a comma between the two independent clauses. But The Chicago Manual of Style says that the comma can be omitted when the clauses are short and closely related. This isn’t an error so much as a style choice.

Incidentally, Grammarly says that “E. L. James is not the first author to include a comma in her work when a semi-colon would be more appropriate, or vice versa.” But the supposed error here isn’t that James used a comma when she should have used a semicolon; it’s that she didn’t use a comma at all. (Also note that “semicolon” is not spelled with a hyphen and that the comma before “or vice versa” is not necessary.)

Error number 2 is comma misuse (which is somehow different from error number 1, which is also comma misuse). Grammarly says, “Many writers forget to include a comma when one is necessary, or include a comma when it is not necessary.” (By the way, the comma before “or include a comma when it is not necessary” is not necessary.) The supposed offender here is Hemingway, who wrote, “We would be together and have our books and at night be warm in bed together with the windows open and the stars bright.” Grammarly suggests putting a comma after “at night”, but that would be a mistake.

The sentence has a compound predicate with three verb phrases strung together with ands. Hemingway says that “We would (1) be together and (2) have our books and (3) at night be warm in bed together with the windows open and the stars bright.” You don’t need a comma between the parts of a compound predicate, and if you want to set off the phrase “at night”, then you need commas on both sides: “We would be together and have our books and, at night, be warm in bed together with the windows open and the stars bright.” But that destroys the rhythm of the sentence and interferes with Hemingway’s signature style.

Error number 3 is wordiness, and the offender is Edith Wharton, who wrote, “Each time you happen to me all over again.” Grammarly suggests axing “all over”, leaving “Each time you happen to me again”. But this edit doesn’t fix a wordy sentence so much as it kills its emphasis. This is dialogue; shouldn’t dialogue sound like the way people talk?

Error number 4, colloquialisms, is not even an error by Grammarly’s own admission—it’s a stylistic choice. And choosing to use colloquialisms—more particularly, contractions—is a perfectly valid stylistic choice in fiction, especially in dialogue. Changing “doesn’t sound very exciting” to “it does not sound very exciting” is probably fine if you’re editing dialogue for Data from Star Trek, but it just isn’t how normal people talk.

The next error, commonly confused words, is a bit of a head-scratcher. Here Grammarly fingers F. Scott Fitzgerald for writing “to-night” rather than “tonight”. But this has nothing to do with confused words, because they’re the same word. To-night was the more common spelling until the 1930s, when the unhyphenated tonight surpassed it. This is not an error at all, let alone an error involving commonly confused words.

The sixth error, sentence fragments, is again debatable, and Grammarly even acknowledges that using fragments “is one way to emphasize an idea.” Once again, Grammarly says that it’s a style choice that for some reason you should never make. The Chicago Manual of Style, on the other hand, rightly acknowledges that the proscription against sentence fragments has “no historical or grammatical foundation.”

Error number 7 is another puzzler. They say that determiners “help writers to be specific about what they are talking about.” Then they say that Boris Pasternak should have written “sent down to the earth” rather than “sent down to earth” in Doctor Zhivago. Where on the earth did they get that idea? Not only is “down to earth” far more common in writing, but there’s nothing unclear about it. Adding the “the” doesn’t solve any problem because there is no problem here. Incidentally, they say the error has to do with determiners, but they’re really talking about articles—a, an, and the. Articles are simply one type of determiner, which also includes possessive determiners, demonstratives, and quantifiers.

I’ll skip error number 8 for the moment and go to number 9, the passive voice. Again they note the passive voice is a stylistic choice and not a grammatical error, and then they edit it out anyway. In place of Mr. Darcy’s “My feelings will not be repressed” we now have “I will not repress my feelings.” Grammarly claims that the passive can cause “a lack of clarity in your writing”, but what is unclear about this line? Is anyone confused about it in the slightest? Instead of added clarity, we get a ham-fisted edit that shifts the focus from where it should be—the feelings—onto Mr. Darcy himself. This is exactly the sort of sentence that calls for the passive voice.

The eighth error is probably the most infuriating because it gets so many things wrong. Here they take Shakespeare himself to task over his supposed preposition misuse. They say that in The Tempest, Shakespeare should have written “such stuff on which dreams are made on” rather than “such stuff as dreams are made on”. The first problem with Grammarly’s correction is that it doubles the preposition “on”, creating a grammatical problem rather than fixing it.

The second problem with this correction is that which can’t be used as a relative pronoun referring to such—only as can do that. Their fix is not just awkward but doubly ungrammatical.

The third is that it simply ruins the meter of the line. Remember that Shakespeare often wrote in a meter called iambic pentameter, which means that each foot contains two syllables with stress on the second syllable and that there are five feet per line. Here’s the sentence from The Tempest:

We are such stuff
As dreams are made on, and our little life
Is rounded with a sleep.

(Note that these aren’t full lines because I’m omitting the text from surrounding sentences that make up part of the first and third lines.) Pay attention to the rhythm of those lines.

we ARE such STUFF
as DREAMS are MADE on AND our LITTle LIFE
is ROUNDed WITH a SLEEP

Now compare Grammarly’s fix:

we ARE such STUFF
on WHICH dreams ARE made ON and OUR littLE life
is ROUNDed WITH a SLEEP

The second line has too many syllables, and the stresses have all shifted. Shakespeare’s line puts most of the stresses on nouns and verbs, while Grammarly’s fix puts them mostly on function words—pronouns, prepositions, determiners—and, maybe worst of all, on the second syllable of “little”. They have taken lines from one of the greatest writers in all of English history and turned them into ungrammatical doggerel. It takes some nerve to edit the Bard; it apparently takes sheer blinkered idiocy to edit him so badly.

So, just to recap, that’s nine supposed grammatical errors that Grammarly says will ruin your prose, most of which are not errors and have nothing to do with grammar. Their suggested fixes, on the other hand, sometimes introduce grammatical errors and always worsen the writing. The takeaway from all of this is not, as Grammarly says, that love conquers all, but rather that Grammarly doesn’t know the first thing about grammar, let alone good writing.

Addendum: I decided to stop giving Grammarly such a bad time and help them out by editing their infographic pro bono.


New Shirts, New Old Posts

Good news, everyone! I have a new T-shirt design inspired by that one movie featuring the popular interlocking brick system.


Head over to the Arrant Pedantry Store to take a look.

I’ve also moved a couple of posts over here from a now-defunct site. When I finished grad school a couple of years ago, my wife and I launched a new site for our freelance editing endeavors, and shortly thereafter I got a full-time job. Though the site is gone, I wanted to keep our blog posts (all two of them) online, so you can now find them here.

Why You Need an Editor (by me)

Accepting and Rejecting Changes in Microsoft Word (by my wife, Ruth)


Why Is It “Woe Is Me”?

I recently received an email asking about the expression woe is me, namely what the plural would be and why it’s not woe am I. Though the phrase may strike modern speakers as bizarre if not downright ungrammatical, there’s actually a fairly straightforward explanation: it’s an archaic dative expression. Strange as it may seem, the correct form really is woe is me, not woe am I or woe is I, and the first-person plural would simply be woe is us. I’ll explain why.

Today English only has three cases—nominative (or subjective), objective, and genitive (or possessive)—and these cases only apply to personal pronouns and who. Old English, on the other hand, had four cases (and vestiges of a fifth), and they applied to all nouns, pronouns, and adjectives. Among these four were two different cases for objects: accusative and dative. (The forms that we now think of simply as object pronouns actually descend from the dative pronouns, though they now cover the functions of both the accusative and dative.) These correspond roughly to direct and indirect objects, respectively, though they could be used in other ways too.

For instance, some prepositions took accusative objects, and some took dative objects (and some took either depending on the meaning). Nouns and pronouns in the accusative and dative cases could also be used in ways that seem strange to modern speakers. The dative, for example, could be used in places where we would normally use to and a pronoun. In some constructions we still have the choice between a pronoun or to and a pronoun—think of how you can say either I gave her the ball or I gave the ball to her—but in Old English you could do this to a much greater degree.

In the phrase woe is me, woe is the subject and me is a dative object, something that isn’t allowed in English today. It really means woe is to me. Today the phrase woe is me is pretty fixed, but some past variations on the phrase make the meaning a little clearer. Sometimes it was used with a verb, and sometimes woe was simply followed by a noun or prepositional phrase. In the King James Bible, we find “If I be wicked, woe unto me” (Job 10:15). One example from Old English reads, “Wa biþ þonne þæm mannum” (woe be then [to] those men).

So “woe is me” is not simply a fancy or archaic way of saying “I am woe” and is thus not parallel to constructions like “it is I”, where the nominative form is usually prescribed and the objective form is proscribed. In “woe is me”, “me” is not a subject complement (also known as a predicative complement) but a type of dative construction.

Thus the singular is is always correct, because it agrees with the singular mass noun woe. And though we don’t have distinct dative pronouns anymore, you can still use any pronoun in the object case, so woe is us would also be correct.

Addendum: Arika Okrent, writing at Mental Floss, has also just posted a piece on this construction. She goes into a little more detail on related constructions in English, German, and Yiddish.

And here are a couple of articles by Jan Freeman from 2007, specifically addressing Patricia O’Conner’s Woe Is I and a column by William Safire on the phrase:

Woe Is Us, Part 1
Woe Is Us, Continued


On Visual Thesaurus: “Clear and/or Unclear”

And/or is a surprisingly contentious little conjunction. Some lawyers love it, but most editors hate it—and many judges hate it too. Find out what the problem is in my newest post on Visual Thesaurus, “Clear and/or Unclear”.


Get a Discount on Copyediting Newsletter

Attention, editors! Get a great deal on a subscription to Copyediting newsletter when you use the code COPY at checkout. It’s full of great information on style and usage, advice on getting your freelance business going, tech tips, and, of course, my column on grammar.

The code is good for any subscription, audio conference, or webinar, and it’s valid until January 31st. Check it out!


Another Day, Another Worthless Grammar Quiz

Yesterday I did something I regret: I clicked on and took one of those stupid quizzes that go around Facebook. It’s called How good is your grammar? and I clicked on it not to find out how good my grammar is, but because I wanted to know what the test-maker thought good grammar was.

I started seeing problems with the test right away, including questions that had two or three right answers or no right answers, and no matter what I did, I couldn’t score higher than 13, a score which provided me this questionable feedback:

You’ve definitely got our respect! 13 out of 15 is a really, really impressive score. Your grammar skills are so good, you’re probably the person that picks your friends up on their mistakes, right? We’ll happily admit that this test was pretty hard and we’re pretty sure that you’re friends can’t do better – why not test them and find out?

“Picks your friends up on their mistakes”? I get what they mean, but I’ve never heard that expression before. And “We’ll happily admit that this test was pretty hard and we’re pretty sure that you’re friends can’t do better”? That compound sentence needs a comma before “and”, and more importantly, it should be “your friends”, not “you’re friends”.

The most frustrating part is that this quiz doesn’t provide a key or any question-specific feedback, so it’s impossible to tell what you’ve gotten wrong. I had to ask someone who managed to get 15 what his answers were, and the correct answers were pretty eyebrow-raising. To make matters worse, they seem to have changed since I took it yesterday. (Edited to add: As several people have pointed out, the answers seem to be right now, but some people are still reporting that they’re getting different scores every time even though they’re giving the same answers. Some are also reporting that they’re getting a score of 15 even when they deliberately answer every question wrong, so it could be that the scoring is just random and the whole thing is a scam.) I’ll go through it question by question, highlighting the correct answer according to the quiz (at the time I took it) and explaining why it is or isn’t right.

  1. Let’s start quite easy: which of these sentences is grammatically correct?
    • There are seven girls in her class.
    • There’s seven girls in her class.
    • They’re seven girls in her class.

This one is fairly straightforward. Though there’s with a plural subject is quite common and is found even in edited writing, strict grammatical agreement requires there are. However, they’re seven girls is grammatical too, though with a very different meaning. Imagine that you were talking about seven different girls, and someone asked you who they were. You might respond, “They’re seven girls in her class.” It’s an unlikely conversation, but in that sense it’s not ungrammatical.

  2. Which of these is right?
    • The woman that works here
    • The woman who works here
    • The woman which works here

Many traditionalists insist that only who can be used to refer to people, but this isn’t true. That can also be used with people, as I’ve explained here and elsewhere. It has been in use since the days of Old English, over a thousand years ago, and great writers have been using it ever since. Even Bryan Garner, who is quite conservative in many regards, says it’s okay.

  3. What’s the subject in this sentence? ‘Today I went to the park’.
    • I
    • Today
    • Park

This is where things really start to get idiotic. The correct answer, according to the quiz, is park. In reality, the subject of the sentence is I. Park is the object of the preposition to.

  4. Should it be ‘there’, ‘they’re’ or ‘their’?
    • The students thought there homework was hard
    • The students thought their homework was hard
    • The students thought they’re homework was hard

This one’s easy: the correct answer is actually what the quiz says. (Though when I first took it, the options all had a superfluous comma after students. They’ve since been removed.)

  5. What’s a pronoun?
    • A word that stands in the place of a noun.
    • A ‘being’ word.
    • A particularly impressive noun.

It was at this point that I started wondering if the author of the quiz was just an idiot or if they were actually trolling everyone. A pronoun is not a particularly impressive noun; it’s a word that stands in the place of a noun or noun phrase.

  6. Which is right?
    • She could have done that.
    • She could of done that.
    • She could off done that.

Again, this one’s easy, and the quiz actually gets it right. Could’ve sounds just like could of, so people often incorrectly write the latter. (But no one writes could off. I don’t know why that’s even an option.)

  7. Now they get a little bit trickier: Which is right?
    • If I was you, I would…
    • If I am you, I would…
    • If I were you, I would…

This is another oversimplification. Traditionally, were is used with counterfactual statements, but was has been used for centuries and appears in edited prose. (I once saw an example in Old English, which shows that this rule has been waning for over a millennium.)

  8. Which of these adjectives is a superlative?
    • Happy
    • Happier
    • Happiest

This one is right. Happy is a positive adjective, and happier is a comparative adjective.

  9. What’s the object in this sentence? ‘Yesterday she hated me’
    • Yesterday
    • She
    • There is no object in this sentence
    • Me

Wrong, wrong, wrong. The object is me.

  10. Which is right?
    • The boy to whom she gave the toy was called Matt.
    • The boy, who she gave the toy to, was called Matt.
    • The boy whom she gave the toy was called Matt.

Actually, all of them could be right depending on context and register. I don’t know why the second option has commas around the relative clause, but they’re not necessarily wrong. They could be correct if the clause is nonrestrictive, but it’s impossible to tell without more context.

The second option is informal, but it’s hard to call it wrong since that’s how pretty much every native English speaker would say it. Whom is on the decline, and there’s nothing wrong with preposition stranding, though it’s sometimes avoided in more formal speech and writing.

The other options are both correct. You can say either She gave him the toy or She gave the toy to him. The first has him as an indirect verbal object, while the second has it as an oblique (prepositional) object. You can make a relative clause out of either one, yielding either whom she gave the toy or to whom she gave the toy.

  11. And now for the really difficult ones: Which is grammatically correct?
    • There were fewer people in the shop today.
    • There were less people in the shop today.
    • Both are right.

Many people frown on less with count nouns, but there’s nothing technically wrong with it. Like so many grammar rules, this is an eighteenth-century invention. Fewer is the safer choice in formal speech or writing, though.

  12. How are you supposed to use apostrophes correctly? Which is right?
    • The ice-cream parlor was called Joes Ice’s
    • The ice cream parlor was called Joe’s Ices
    • The ice-cream parlor was called Joes Ices

Correct. Again, though, don’t ask me why two options have a hyphen while the other doesn’t.

  13. How about in this one?
    • Its going to be cold tomorrow.
    • It’s going to be cold tomorrow.
    • It going to be cold tomorrow.

Correct. Many people confuse it’s and its, but in this case you want the contraction. (I don’t know if anyone would actually say or write it going to be cold tomorrow.)

  14. A comma, colon or semi-colon? Which is right?
    • He wasn’t very hungry; he had already eaten earlier that day.
    • He wasn’t very hungry, he had already eaten earlier that day.
    • He wasn’t very hungry: he had already eaten earlier that day.

This one’s arguable. A semicolon might be preferred, but a colon wouldn’t technically be wrong since the second clause is elaborating on the first. The second option contains the error commonly known as a comma splice or run-on sentence.

  15. In the pluperfect tense, what is the second person form of the verb ‘to go’?
    • You have gone
    • You had gone
    • You went

Wrong again. Have gone is the present perfect; had gone is the pluperfect, also known as the past perfect. Also, when I first took the quiz, it asked for the third-person form, but you is the second person. This has since been fixed.

The strange thing is that I can’t figure out the scoring of the quiz, especially since it gives no feedback. I answered all the questions correctly—according to what’s actually traditionally correct—and yet I scored 13, even though I should have scored 11 because four of the supposedly correct answers are wrong. Either something is buggy with the quiz, or the author has been revising the answers and sometimes introducing errors. Either way, the quiz is absolute garbage and shouldn’t be taken seriously.

Oh, and to cap things off, the author of the quiz obviously has no idea what linguists actually do. This is the feedback if you manage to score 15 out of 15:

Those weren’t even difficult for you, were they? Either you’re a professional linguistic researcher at the Institute for English Language or you had a little bit of luck with a couple of your answers… We congratulate you – when it comes to English grammar you really are the best!

Because linguistics is apparently about memorizing a bunch of normative, prescriptive rules about how to use language rather than actually, you know, researching how language works.