"Damp Squid" Book And The 2-Billion-Word Database

The best part of "Damp Squid," a series of lexicographic essays by Jeremy Butterfield, isn't the oddball title but the computer that inspired it. The Oxford English Corpus, a database of more than 2 billion words culled from a range of global sources, works like a magic cauldron: it allows researchers to see how any word in circulation is actually being used. "Damp squib," for instance, is British slang for a firework that doesn't fire, or a party that fizzles. But the Corpus says most people mistakenly reference the sea creature. A small number also think that "hammer and tongs"—meaning to work vigorously—is actually "hammer and thongs."

Dictionary makers rely on such tidbits to keep pace with the masses. But some of the most compulsively quotable factoids from the Corpus are pure brain candy: the 22nd most common noun modified by "naked" is "pic." (No. 1: "eye.") The most common adjective used to describe "muscles" is "fabulous." And there are 214 derivatives of the word "blog," including "blogstipation," "bloggocks" and "blogospherical."

There's even more ticklish data on the Corpus Web site, AskOxford.com, where linguistics and sociology collide. What does it mean that "work" is one of the top 20 words in use, while "play" and "rest" don't crack the top 100? Or that "war" is more common than "peace," and "time" is the most common noun of all? Butterfield doesn't say—which is a shame. To use two all-star clichés, at the end of the day, it's not rocket science.
