To Catch A Sock Puppet

Detecting Phony Online Reviews

It was the crime writer, on Amazon, under an assumed name, stabbing his fellow novelists in the back. The plot was uncovered earlier this month by thriller writer Jeremy Duns, who revealed the poison penmanship in a series of tweets. “This is RJ Ellory writing about his own book. And he has done this for them all, and yes, I’m proving it in the next few minutes,” Duns tweeted, before exposing Ellory’s pseudonyms. Ellory confessed, and the ensuing scandal prompted hundreds of writers, from Laura Lippman to Jo Nesbø, to sign a pledge condemning sock puppetry, as the practice is called.

“The sleuthing is not that difficult,” says Duns, who has spent the last year interviewing CIA agents about a Cold War espionage operation. “It’s that they’re particularly inept.” Duns has caught phony reviewers when they accidentally sign in under the wrong account or link to an Amazon wish list under a real name. Often the first clue is in the reviews themselves, which read more like the work of a bitter writer than of an avid reader: Ellory called his own work a “modern masterpiece” while griping about all the advertising his rivals receive.

But human eyes can go only so far. Fake reviews are ubiquitous on any site that lets users create anonymous accounts, such as Amazon, TripAdvisor, and Yelp; the tech research company Gartner projects that by 2014, between 10 and 15 percent of social-media reviews will be fake. Since unmasking Ellory, Duns has been bombarded with requests to investigate other suspicious accounts; he began looking into one, a famous author’s, and gave up. “There were thousands of reviews. You’d need an algorithm to sort through them.”

Such an algorithm is in the works. Last year Cornell researchers developed a program to detect suspicious hotel reviews on TripAdvisor. The researchers commissioned hundreds of fake hotel reviews through Amazon’s crowdsourcing site, Mechanical Turk, and isolated the linguistic differences between genuine reviews and fake ones. They found that, among other giveaways, fake reviews use the first person frequently and pile on effusive adjectives and superlatives. In a matchup, Cornell’s human judges did little better than chance, and even then couldn’t agree on which reviews were fake; the algorithm, meanwhile, picked out fake reviews 90 percent of the time.
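
The article doesn’t spell out the Cornell model, but the basic idea — a text classifier trained on word n-gram features from labeled genuine and fake reviews — can be sketched in a few lines of Python. Everything below (the toy data, the bigram features, the logistic-regression classifier) is an illustrative assumption, not the researchers’ actual code:

```python
# A minimal sketch of an n-gram classifier for deceptive reviews,
# in the spirit of the Cornell study. The tiny inline "dataset" is a
# placeholder; the real work trained on hundreds of labeled genuine
# and Mechanical Turk-commissioned reviews.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder labeled data: 1 = fake (commissioned), 0 = genuine.
reviews = [
    "I loved every minute of my stay at the most amazing hotel ever!",
    "My husband and I had the most wonderful, luxurious experience!",
    "Room was clean, check-in took ten minutes, breakfast was decent.",
    "Good location near the station; walls are thin, but fine for a night.",
]
labels = [1, 1, 0, 0]

# Unigram and bigram counts pick up the telltale signals the study
# reported: heavy first-person use and piles of superlatives.
model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),
    LogisticRegression(),
)
model.fit(reviews, labels)

print(model.predict(["The most incredible, perfect hotel I have ever seen!"]))
```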

Computer science professor Yejin Choi of Stony Brook University is using the Cornell data to develop a way to detect products that benefit from phony review campaigns. Rather than flagging individual reviews, her program looks at the way fake praise distorts the statistical distribution of a product’s reviews. So far it has detected deceptive behavior with 72 percent accuracy, and last month Google began sponsoring her research. “This is one of the few areas in human language technology that computers are better than humans,” says Choi. “Humans have a truth bias. They tend to believe what they see.”
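
Choi’s actual method isn’t detailed here, but the underlying intuition — that a campaign of fake praise skews a product’s rating histogram away from what’s typical — can be illustrated with a toy divergence test. The baseline distribution, the sample ratings, and the threshold below are all invented for the example; this is a sketch of the general idea, not her model:

```python
# Rough illustration of distribution-based detection: flag a product
# whose star-rating histogram diverges sharply from a site-wide
# baseline toward five-star praise. The baseline, data, and threshold
# are made up for the example.
import math

def kl_divergence(p, q):
    """KL divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def rating_distribution(ratings, stars=5):
    """Normalized histogram of 1..stars ratings, smoothed to avoid zeros."""
    counts = [1e-6] * stars            # tiny smoothing constant
    for r in ratings:
        counts[r - 1] += 1
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical site-wide baseline over 1..5 stars.
baseline = [0.07, 0.08, 0.15, 0.30, 0.40]

# A product with suspiciously top-heavy ratings.
product_ratings = [5] * 40 + [4] * 3 + [1] * 2

score = kl_divergence(rating_distribution(product_ratings), baseline)
print(f"divergence from baseline: {score:.2f}")
if score > 0.5:                        # arbitrary illustrative threshold
    print("flag for review: rating distribution looks distorted")
```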
