I meant to blog about this several weeks ago, when the topic came up in my corpus linguistics class from Mark Davies, but I didn’t have time then. And I know the that/which distinction has been done to death, but I thought this was an interesting look at the issue that I hadn’t seen before.
For one of our projects in the corpus class, we were instructed to choose a prescriptive rule and then examine it using corpus data, determining whether the rule was followed in actual usage and whether it varied over time, among genres, or between the American and British dialects. One of my classmates (and former coworkers) chose the that/which rule for her project, and I found the results enlightening.
She searched for the sequences “[noun] that [verb]” and “[noun] which [verb],” which aren’t perfect—they obviously won’t find every relative clause, and they’ll pull in a few non-relatives—but the results serve as a rough measurement of their relative frequencies. What she found is that before about the 1920s, the two were used with nearly equal frequency. That is, the distinction did not exist. After that, though, which takes a dive and that surges. The following chart shows the trends according to Mark Davies’ Corpus of Historical American English and his Google Books N-grams interface.
It’s interesting that although the two corpora show the same trend, Google Books lags a few decades behind. I think this is a result of the different style guides used in different genres. Perhaps style guides in certain genres picked up the rule first, from whence it disseminated to other style guides. And when we break out the genres in COHA, we see that newspapers and magazines lead the plunge, with fiction and nonfiction books following a few decades later, though use of which is apparently in a general decline the entire time. (NB: The data from the first decade or two in COHA often seems wonky; I think the word counts are low enough in those years that strange things can skew the numbers.)
The strange thing about this rule is that so many people not only take it so seriously but slander those who disagree, as I mentioned in this post. Bryan Garner, for instance, solemnly declares—without any evidence at all—that those who don’t follow the rule “probably don’t write very well,” while those who follow it “just might.”1Garner’s Modern American Usage, 3rd ed., s.v. “that. A. And which.” (This elicited an enormous eye roll from me.) But Garner later tacitly acknowledges that the rule is an invention—not by the Fowler brothers, as some claim, but by earlier grammarians. If the rule did not exist two hundred years ago and was not consistently enforced until the 1920s or later, how did anyone before that time ever manage to write well?
I do say enforced, because most writers do not consistently follow it. In my research for my thesis, I’ve found that changing “which” to “that” is the single most frequent usage change that copy editors make. If so many writers either don’t know the rule or can’t apply it consistently, it stands to reason that most readers don’t know it either and thus won’t notice the difference. Some editors and grammarians might take this as a challenge to better educate the populace on the alleged usefulness of the rule, but I take it as evidence that it’s just not useful. And anyway, as Stan Carey already noted, it’s the commas that do the real work here, not the relative pronouns. (If you’ve already read his post, you might want to go and check it out again. He’s added some updates and new links to the end.)
And as I noted in my previous post on relatives, we don’t observe a restrictive/nonrestrictive distinction with who(m) or, for that matter, with relative adverbs like where or when, so at the least we can say it’s not a very robust distinction in the language and certainly not necessary for comprehension. As with so many other useful distinctions, its usefulness is taken to be self-evident, but the evidence of its usefulness is less than compelling. It seems more likely that it’s one of those random things that sometimes gets grammaticalized, like gender or evidentiality. (Though it’s not fully grammaticalized, because it’s not obligatory and is not a part of the natural grammar of the language, but is a rule that has to be learned later.)
Even if we just look at that and which, we find a lot of exceptions to the rule. You can’t use that as the object of a preposition, even when it’s restrictive. You can’t use it after a demonstrative that, as in “Is there a clear distinction between that which comes naturally and that which is forced, even when what’s forced looks like the real thing?” (I saw this example in COCA and couldn’t resist.) And Garner even notes “the exceptional which”, which is often used restrictively when the relative clause is somewhat removed from its noun.2S.v. “Remote Relatives. B. The Exceptional which.” And furthermore, restrictive which is frequently used in conjoined relative clauses, such as “Eisner still has a huge chunk of stock options—about 8.7 million shares’ worth—that he can’t exercise yet and which still presumably increase in value over the next decade,” to borrow an example from Garner.3S.v. “which. D. And which; but which..”
Something that linguistics has taught me is that when your rule is riddled with exceptions and wrinkles, it’s usually a sign that you’ve missed something important in its formulation. I’ll explain what I think is going on with that and which in a later post.