Small Printer Speaks to Large Issues: Online Reviews and Research Epistemology
Are online reviews fair? Consider these reviews of a small printer, the Canon Pixma MG6320, on the Consumer Reports website. At the time I am writing, there are three reviews, and all three writers gave it one star out of a possible five—the worst possible rating. The review titles are:
- “Piece of junk”
- “Unreliable and unbelievably expensive”
- “The worst printer ever”
On the other hand, on Amazon.com the same printer currently has 464 reviews, and it gets an average of four out of five stars. Sample review titles include:
- “Amazing printer”
- “Made a great gift”
- “A very good buy”
There are also negative reviews of course (“I wish I could give it minus stars”), but the consensus is four-star positive.
What is going on here? You could speculate that it’s just a matter of randomness and numbers—three reviews are too small a sample to matter, and maybe the CR rating would drift towards something more consistent if more people reviewed the printer there. Sinan Aral has also shown that the initial review of an item biases subsequent reviews in ways that affect the final outcome. But I will argue that there’s more than small sample size involved. It’s quite often the case that CR reviews are dramatically more negative than those on Amazon. I selected this particular item at random—this printer was the first item I checked. You can find other items that don’t fit this pattern, and it would be worthwhile to do a systematic comparison to see to what extent the pattern holds. But I believe the printer is not an outlier.
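A quick simulation illustrates the sampling argument. This is a minimal sketch in Python; the rating proportions are invented and merely echo the four-star Amazon average above. It shows how wildly a three-review average can swing compared to a 464-review average drawn from the same population:

```python
import random

random.seed(0)

# Hypothetical population of star ratings (invented numbers, not real data),
# chosen so the true mean is roughly four stars.
population = [5] * 50 + [4] * 25 + [3] * 10 + [2] * 5 + [1] * 10

def mean_of_sample(n):
    """Average star rating from a random sample of n reviews."""
    return sum(random.choices(population, k=n)) / n

# Repeat each sampling experiment 1,000 times.
small = [mean_of_sample(3) for _ in range(1000)]    # like the 3 CR reviews
large = [mean_of_sample(464) for _ in range(1000)]  # like the 464 Amazon reviews

def spread(xs):
    """Range of observed averages: a crude measure of volatility."""
    return max(xs) - min(xs)

print(round(spread(small), 2), round(spread(large), 2))
```

The three-review averages range over several stars, while the 464-review averages barely move—which is exactly why small samples alone can’t be ruled out as an explanation without further evidence.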
My suspicion is that it has to do with who goes to each site and why. Perhaps people log on to CR to review a product mainly when they are annoyed. Gilbert and Karahalios studied why people write reviews on Amazon—particularly when a product has already been heavily reviewed and a new review says the same thing as previous ones. They found that some reviewers (“pros”) review for Amazon as a hobby and take pride in the quality of their reviews. Others (“amateurs”) describe their reviews as “spontaneous” and “heartfelt”—they want to express how they feel about the product. Gilbert writes that Amazon reviews by amateur-style writers have a bimodal distribution: people write because they love a product or hate it. CR gets only one peak—the folks who hate it. The interesting question then becomes: why does CR get only one side of the story? What is it about the site’s design and positioning that creates this effect? Further, is there any systematic way to understand the bias valence of different review sites? Which sites tend to be biased in what ways, and why? Can this help site designers create review infrastructures that are more useful to their customers? Can we help customers become better readers, who know which reviews to believe?
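The bimodal claim is easy to picture with a toy histogram. The counts below are invented purely for illustration: one distribution piles up at both one and five stars (love and hate), while the other has a single peak at one star (only the annoyed show up):

```python
from collections import Counter

# Invented rating counts, for illustration only.
amazon_style = [1] * 30 + [2] * 8 + [3] * 5 + [4] * 12 + [5] * 45
cr_style     = [1] * 40 + [2] * 7 + [3] * 3 + [4] * 2  + [5] * 1

def peaks(ratings):
    """Return the star values that are local maxima of the histogram."""
    h = Counter(ratings)
    counts = [h.get(star, 0) for star in range(1, 6)]
    result = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else -1
        right = counts[i + 1] if i < 4 else -1
        if c > left and c > right:
            result.append(i + 1)
    return result

print(peaks(amazon_style))  # → [1, 5]: two modes, love and hate
print(peaks(cr_style))      # → [1]: one mode, hate
```

A real study would of course fit the distributions properly rather than eyeball local maxima, but the shape difference is the point: the same product can generate a two-peaked conversation on one site and a one-peaked complaint board on another.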
My initial question was, “Are online reviews fair?” I want to argue that that’s not a well-formed question. Better versions might be, “Under what conditions are online reviews of products and services more or less useful to consumers?” and “In what ways do design features of online sites affect reviews that users write?”
In his essay “Thick Description: Toward an Interpretive Theory of Culture,” Clifford Geertz wrote that “the locus of study is not the object of study. Anthropologists don’t study villages (tribes, towns, neighborhoods …); they study in villages.” Some of our loci of study are interesting in themselves—Amazon is Amazonian in size, and worth understanding. But I wonder how much we can develop broadly relevant insights without comparing villages. It may be easier to understand Amazon when you have Consumer Reports for contrast—though the two sites differ in so many ways that systematic comparison is a daunting task.
There is a need for good research at all levels of specificity, from the absurdly general (“Are online reviews fair?”) to the absurdly specific (“In what ways do Amazon.com user reviews of inexpensive consumer printers help people to make good purchasing decisions?”). Researchers trying to build personal reputations tend to err towards overly general claims—there’s more glory and credit in answering the big questions. But there’s more substance in making claims at a level appropriate to the significance of your findings.
In the same essay, Geertz writes, “Small facts speak to large issues, winks to epistemology, or sheep raids to revolution, because they are made to.” Geertz is a poet, and that line resonates in my mind with my stores of T.S. Eliot and Billy Collins. But I still wonder what it actually means.
I started writing this post for a reason that will seem unrelated. A friend asked how I reconcile the fact that Sherry Turkle and danah boyd are studying similar phenomena—changes in teenage life and family relationships in the presence of mobile and social computing—and coming to quite different conclusions. To paraphrase unfairly: Sherry believes that we are “alone together,” and that the technology is changing human relationships for the worse. danah believes that “the kids are alright”—that teens use these technologies for much the same age-appropriate behaviors once enacted with previous technologies, and are negotiating their stage of life just fine. My answer is that they are both right, and that claims are being made for their work (by others more than by the authors themselves) that over-generalize the results. Metaphorically, one is studying Consumer Reports and the other is studying Amazon.com, and people are taking their results as being about online reviews in general. (See my post on smartphones and parenting for some examples of both good and bad changes catalyzed by this technology.)
The hard work still to be done is to integrate these two perspectives, and understand their relationship. The important work is to identify what key questions we haven’t yet asked—questions whose answers have actionable consequences. Whether we’re talking about kids and parents on cellphones at dinner or online reviews of which cellphones they should buy, researchers need to ask useful questions and draw conclusions at the right level of specificity.