I deliberately annoyed Internet users without their consent in the name of Science, and I would do it again.
How do people on social media feel about being studied by researchers? To shed some light on this topic, Jim Hudson and I conducted a study in which we entered chatrooms and told people we were studying them. Actually, what we were really studying was whether they kicked us out of the chatroom.
This is not something we did lightly—we got IRB approval for our study, and it went to full board review. For those of you keeping score, this was a full waiver of consent for a deceptive study (we said we were studying language online). In the study, if we weren’t kicked out of the chatroom after five minutes, we posted a debriefing message about the real purpose of the study with a URL for more information. If we were already kicked out, we did not disturb participants further. The study was approved because our IRB felt there was minimal risk to participants and the study could not practicably be carried out without the waiver.
We used the user name “Chat_Study.” We eliminated chatrooms with names that suggested sensitive topics (like health support groups), and we chose only groups with a channel operator present who had the power to kick us out. In the remaining groups, we entered, waited one minute, and posted one of four messages:
- No message: None
- Recording message: We are researchers recording this chatroom for a study on language use in online environments. For questions or comments, email firstname.lastname@example.org. Thank you!
- Opt out message: We are researchers recording this chatroom for a study on language use in online environments. If you don’t want to be recorded, please whisper “Chatstudy opt out” to us. For questions or comments, email email@example.com. Thank you!
- Opt in message: We are researchers and would like to record this conversation for a study on language use in online environments. If we may record you, please whisper “Chatstudy volunteer” to us. For questions or comments, email firstname.lastname@example.org. Thank you!
So what happened? In short, almost no one opted in or out, and we were kicked out of the chatroom in less than five minutes 63.3% of the time (in the conditions in which we posted a message). In the control condition in which we entered but posted no message, we were kicked out 29% of the time. The paper includes many of the “boot messages” we received. In addition to lots of messages like “go away” or “no spamming,” we received some more colorful ones. My favorite is “Yo momma so ugly she turned Medusa to stone!”
Intriguingly, for every additional 13 people in a chatroom, our chance of getting kicked out went down by half. Our hunch is that the smaller the group, the more intimate it feels, and the more our presence felt like an intrusion. Three friends discussing their favorite sports team are a quite different social context from 40 strangers playing a chatroom-based trivia game.
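As a rough sketch (my own illustration, not the actual model from our paper), that halving relationship means the odds of being booted fall off exponentially with group size; the baseline odds and baseline group size here are hypothetical placeholders:

```python
def kick_odds(group_size, baseline_odds=1.0, baseline_size=3, halving=13):
    """Odds of being kicked out, halving for every `halving` extra occupants.

    baseline_odds and baseline_size are hypothetical, for illustration only.
    """
    return baseline_odds * 0.5 ** ((group_size - baseline_size) / halving)

# A room 13 people larger has half the odds:
print(kick_odds(16))  # → 0.5, half of kick_odds(3) == 1.0
```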
I believe this study was ethical. What we learned from it—how people really feel about being studied, in that context—outweighs the annoyance we caused. This is a judgment call, and something we considered carefully. The irony of annoying people in order to show how annoying our presence was is not lost on me. But would I do similar studies? Yes, if and only if the benefit outweighed the harm.
This data was collected in 2003 and the results were published in 2004. I bring this up now because of the recent uproar about a study published by Facebook researchers in PNAS in 2014. Our chatroom study makes the points that:
- It is possible to do ethical research on Internet users without their consent.
- It is possible to do ethical research that annoys Internet users.
- The potential benefit of the study needs to outweigh the harm.
- Annoying people is doing some harm, and most users are annoyed by being studied without their consent.
In my opinion it was questionable judgment for Kramer et al. to see if filtering people’s newsfeeds to contain fewer positive emotional words led them to post more content with fewer positive words themselves. But here’s a harder question: If they had only done the happy case (filtering people’s newsfeeds to have fewer negative words), would you still have concerns about the study? This separates the issue of potential harm from the issue of lack of control. I believe we can have a more productive conversation about the issues if we separate these questions. I personally would have no objections to the study if it had been done only in the happy case.
I suspect that in part it’s the control issue that has provoked such strong reactions in people. But it’s not that users lost control in this one instance—it’s that they never had it in the first place. The Facebook newsfeed is curated by an algorithm that is not transparent. One interesting side effect of the controversy is that people are beginning to have serious conversations about the ways that we are giving up control to social media sites. I hope that conversation continues.
If you’re annoyed by my claiming the right to (sometimes, in a carefully considered fashion) annoy you, leave me a comment!
(Edited to more precisely describe the Facebook study–thanks to Mor Naaman for prodding me to fix it.)
My class CS 6470 Design of Online Communities is structured around having students do a qualitative study of an online community using participant observation and interviewing. Over the years since I first taught the class in 1998, the core assignment has evolved in a number of ways. One recent change that surprised me is the need to rethink the assignment’s focus on a single site.
In spring 2013, students Patrick Mize, Michael Roberts and Aditya Tirodkar chose to study the site Equestria Daily, a blog for bronies. (I am writing about their work with their permission.) As I’ve written, bronies are adult, male fans of the television show My Little Pony: Friendship is Magic. As the students began to study the site, they quickly learned that it was impossible to understand Equestria Daily without understanding a constellation of other sites including Equestria After Dark, PonyChan, Brony forums on Deviant Art and Meetup, and others. They made this diagram:
Taken together, brony online activity forms a kind of ecosystem. Too many people talking about ponies on 4chan led to the creation of PonyChan. A policy change in what sort of content is allowed on Equestria Daily changed what is posted on Equestria After Dark. Pony fans use all these different sites in a complementary fashion, and user behavior is not confined to one site. In fact it’s impossible to tell the story of Equestria Daily without explaining its relationship to this ecosystem of pony activity.
As you’ve probably noticed, much of this activity is oriented towards “adult” content. Any cultural content that inspires dedicated fans can be repurposed towards erotica. It’s not surprising that a cultural meme that tends to appeal to individuals in a more typically libidinous stage of life would be used in this fashion. And as these things go, pony erotica tends to be relatively tame. I certainly see less potential harm in original art about ponies compared to adult content made from photography of real people who may or may not have been exploited in the taking or eventual use of their images.
The need to think about ecosystems of online sites is not specific to bronies. I was meeting recently with my colleague Alex Orso to discuss his online software engineering class in our online master of computer science degree program (OMS), and he lamented that supporting his course involves using half a dozen different tools. There’s our institutional grading software, our vendor’s class delivery platform, our normal class support software, a third-party class discussion tool, a software repository…. Saying you would study “the software” or “the website” for OMS is an anachronism. There are many platforms and tools, and the challenges are all at the seams between them. Similarly, my student Joseph Gonzales is studying The Greatest International Scavenger Hunt the World Has Ever Seen (GISHWHES), and finding that teams of hunters all use a host of different tools and change their tool use through different phases of their project.
It doesn’t surprise me that people use multiple tools and online sites. It does surprise me that this happens to a degree that it can be hard to even discuss one single site in isolation. Time to rewrite that class assignment!
Thanks to Patrick, Michael, and Aditya for a great project!
Twitter research ethics are complicated, and deserve a more nuanced treatment than my short post from yesterday. I’ll take a stab here at saying a bit more:
Question 1: Is analyzing Twitter “human subjects research”?
I want to start by looking at US law. (Note that this is only applicable in the US and only applies to federally funded research, though some companies choose to voluntarily follow these rules and most universities apply the rules to all research whether it is federally funded or not.) The policy states that several categories of work are exempt from the rules, including:
(4) Research involving the collection or study of existing data, documents, records, pathological specimens, or diagnostic specimens, if these sources are publicly available or if the information is recorded by the investigator in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects.
It’s pretty clear that Twitter data (on open accounts) is existing data that is already publicly available. So legally speaking, I believe researchers are well within their rights to simply use it at will. It’s public, so you can use it. But should you?
Ethical is a higher standard than legal. As Jim Hudson and I found in our study of people chatting on Internet Relay Chat (IRC), people often misunderstand the public nature of online communications. This leads to my second question:
Question 2: If people have expectations of privacy that differ from expert opinion on what is “reasonable,” does that need to be taken into account?
I don’t think there’s a simple answer to that question. It probably has to be addressed on a case-by-case basis. And if people’s expectations are persistent and continue to differ from the written rules, maybe the rules need to evolve.
If you do consider research on Twitter to be human subjects research, then you need to apply for IRB clearance, and you probably have good grounds to request a waiver of consent. A waiver of consent is possible in these circumstances:
(d) An IRB may approve a consent procedure which does not include, or which alters,
some or all of the elements of informed consent set forth in this section, or waive the requirements to obtain informed consent provided the IRB finds and documents that:
(1) The research involves no more than minimal risk to the subjects;
(2) The waiver or alteration will not adversely affect the rights and welfare of the subjects;
(3) The research could not practicably be carried out without the waiver or alteration; and
(4) Whenever appropriate, the subjects will be provided with additional pertinent information after participation.
In such a case, an IRB might request that the tweets be anonymized, and this would contribute to making the case that the work presents minimal risk. This sounds like a great approach for research on sensitive topics, like epidemiology for example.
Because part of my research is about people’s creative accomplishments online, I am more likely to encounter situations where anonymizing people is unethical because it denies them credit for their work. We only name people in accounts at their written request, by marking that on a consent form. And our projects generally use mixed methods—with a combination of analyzing people’s online postings and interviewing them. I believe this mixed methods approach often gives better research results, and necessarily makes the work human subjects research rather than merely analysis of public information.
I personally prefer to view Twitter research as human subjects research and apply for a waiver of consent. Thinking through a formal IRB application and soliciting help from IRB members can help you to think through the details of how to treat your subjects in accordance with principles of beneficence, justice, and respect for persons. Ethical is after all a higher standard than merely legal.
That said, the public nature of Twitter data is hard to deny. Maybe the rule about pre-existing, public information needs to be rethought. Something more nuanced would serve us better.
In this article about tweets being made available to researchers, the authors quote two epidemiologists saying ethical use of Twitter should anonymize tweets:
Caitlin Rivers and Bryan Lewis, computational epidemiologists at Virginia Tech, published guidelines for the ethical use of Twitter data in February. Among other things, they suggest that scientists never reveal screen names and make research objectives publicly available. For example, although it is considered ethical to collect information from public spaces—and Twitter is a public space—it would be unethical to share identifying details about a single user without his or her consent. Rivers and Lewis argue that it is crucial for scientists to consider and protect users’ privacy.
I disagree. Of course it may be more often true for epidemiology, but it really depends on what kind of study you’re doing. As Kurt Luther, Casey Fiesler, and I have written, sometimes anonymizing users may be morally wrong because you are denying them credit for their work. (“That tweet was really funny–I want my name on it!”) Twitter is public, published material. The contents of private Twitter feeds are for followers only, but the contents of public feeds arguably are as public as a newspaper article. If you want to take extra precautions to anonymize people, that’s fine. But to say it’s always necessary is ridiculous. It depends on the type of study you’re doing.
Jim Hudson and I empirically studied how people often misunderstand how public their communications are. The complicated question that follows is: if user expectations are out of line with what experts would call “reasonable,” how should the scholarly community proceed? Dealing with things on a case-by-case basis is the best we can do for now.
Most fall semesters I teach CS 4001 “Computers, Society, and Professionalism.” I love the class–we cover ethics, argumentation, professionalism, and the social implications of technology. As part of the class, I always teach a lecture about the USA Patriot Act. It’s a labor of love–it takes me three or four times as long to prepare for that class as for any other class in the semester, because it’s so complicated and there’s always new news to sort through. Were the “gag order” provisions found unconstitutional or not? What’s the difference between the Protect America Act (which expired) and its new incarnation in the FISA Amendments Act? The details go on and on. I teach the class in a studiously neutral way: there are tradeoffs between security and privacy, and where to draw the line is complicated.
PBS Frontline has come out with a new three-hour documentary “United States of Secrets” which takes on a lot of these issues. I highly recommend watching it. What the US government has actually been recording goes well beyond what is authorized by the Patriot Act. But what I found most depressing about it was not that we are being spied on, but that some government officials apparently have been ignoring the rule of law. For example, the NSA constructed a tenuous theory to give them permission to record basically everything, and the US Attorney General signed off on it. OK, I don’t like the theory, but at least there was an attempt at legality. But later when the Attorney General changed his mind and decided the program was illegal, the NSA just asked the White House Counsel to sign off on it instead. Really? Mom said no so you ran and asked Dad? (More like Mom said no so you ran and asked your uncle.) And then there are the videos of the President and other officials flat-out lying to the public and to Congress. They didn’t say “I can’t discuss that”–they lied and said the surveillance wasn’t happening.
It was heartening to see the whistleblowers profiled in the film. There are plenty of good people who tried to speak up–going through every possible internal proper channel before finally going to the press. Our class covers ethical procedures for when and how to become a whistleblower, and the whistleblowers profiled followed those procedures impeccably. And these aren’t civil libertarian liberals–they are pro-defense conservatives who are appalled by what is going on. But in another depressing turn, the government then goes after the whistleblowers, turning their lives upside-down.
What’s the point of teaching students about a law if what the law says doesn’t change how the government actually operates?
The really fun part of teaching my graduate class Design of Online Communities is that I learn incredible things from the students’ empirical studies of online sites. In Spring 2013, one team of students (Patrick Mize, Michael Roberts, and Aditya Tirodkar) chose to study Equestria Daily, a site for bronies. Bronies are adult, male fans of the children’s television show My Little Pony: Friendship is Magic. This raises two questions: First, why would adult men become fans of a television show aimed at young girls? Second, what interesting issues does the design of Equestria Daily raise? I’ll tackle the second issue in another post. Here I want to talk about bronies.
When I first heard about bronies, I was fascinated, all the more so because I have a friend (a fellow CS faculty member) who is a brony. The question of why someone would be a brony has two parts: First, why does anyone join any group? Second, why are some people attracted to brony culture in particular?
After reading my students’ paper (I hope they’ll polish and publish it) and all their interview transcripts, I also watched the documentary Bronies, The Extremely Unexpected Adult Fans of My Little Pony (available on Netflix streaming). And the more I learn, the less surprising it all becomes.
Why do people join any group? Sociologist Ray Oldenburg writes about how people need a third place, neither work nor home. The full title of his book is “The Great Good Place: Cafés, Coffee Shops, Community Centers, Beauty Parlors, General Stores, Bars, Hangouts and How They Get You Through the Day.” Oldenburg bemoans the fact that the invention of suburbia has made it harder to find places to casually socialize. In a more quantitative vein, Robert Putnam notes that we are increasingly bowling alone—fewer people are joining voluntary associations where they can meet others.
My neighbors recently joined a fancy golf club. At the club, they meet others who share their interests, values and worldview. Adult club members have an opportunity to talk with friends while playing golf, and their kids meet one another while splashing in the pool. Then they all go to the clubhouse for lunch, and there are more opportunities to build and maintain social ties. Three factors help bring together people with things in common. First, the price of membership means members have a common socio-economic status. Second, a membership application is required, and current members choose to admit those who they feel will fit in. Finally, self selection is probably the dominant filter—most people have a sense of whether this is the sort of place for them.
I’m not a golf club sort of person. I wish there were a place like that for me. The golf club is a classic example of Oldenburg’s third place. Because it’s a physical place, members can drop by on a casual basis and meet others informally. But what do you do if there is no such place nearby that suits you?
Most people, like me, have few third places readily available. Could something like a third place be mediated electronically? Putnam dismisses that possibility, but he was writing a long time ago and didn’t have empirical data on online communities and the value they bring to their members. It’s important to note that most online communities are not solely online. As Barry Wellman and Milena Gulia point out, people who initially meet online often go to extraordinary lengths to meet in person, and face-to-face encounters help to strengthen ties.
All of this brings me back to the world of bronies. At brony conventions, pony fans get to meet other like-minded individuals. They meet electronically, enhance their ties in person, and then maintain them electronically until the next opportunity to meet in person arises. It’s not as nice as a golf club where you see your friends every weekend, but it serves a similar function in their lives. I’m sure if bronies had the financial means and geographic density to create a pony club, they’d love it. Like all fandoms I’ve observed, brony culture is creative. Bronies work in every possible creative medium, and especially make their own art and music inspired by the show. A pony club would be a sort of maker space with a sound studio, 3D printers, digital and analog art tools, and a space for parties and dancing. It’s a shame it’s not a more practical idea.
I hope I have given you some insights into why people might want to join a community of like-minded individuals. One mystery remains: Why My Little Pony? In my students’ interviews and in the documentary, bronies talk about embracing values of kindness and friendship. It is an explicit rejection of the cultural norm of competitive and aggressive masculinity. If NASCAR and American football repel you, what are you to embrace? Ponies are the opposite.
There are indeed female bronies (often called “pegasisters”), but they’re less common because the values of the show are more in line with traditional femininity. If joining a subculture is an act of identity construction that says “I am different,” being a fan of My Little Pony is a more defining statement for men.
Now that brony culture has emerged, it’s easy to see why it would appeal to a certain class of incredibly nice guys. An open question is: why this particular show? I would love to see a cultural history of the origins of bronydom, and how the subculture initially took off.
My next post will be about what my students learned from studying the design of Equestria Daily.
Are online reviews fair? Consider these reviews of a small printer, the Canon Pixma MG6320 on the Consumer Reports website. At the time I am writing, there are three reviews, and all three writers gave it one star out of a possible five—the worst possible rating. The review titles are:
- “Piece of junk”
- “Unreliable and unbelievably expensive”
- “The worst printer ever.”
On the other hand, on Amazon.com the same printer currently has 464 reviews, and it gets an average of four out of five stars. Sample review titles include:
- “Amazing printer”
- “Made a great gift”
- “A very good buy”
There are also negative reviews of course (“I wish I could give it minus stars”), but the consensus is four-star positive.
What is going on here? You could speculate that it’s just a matter of randomness and numbers—the three reviews are too small a sample to matter, and maybe the printer’s rating would drift towards something more consistent over time if more CR readers reviewed it. Sinan Aral has also shown that the initial review of an item biases subsequent reviews in ways that affect the final outcome. But I will argue that there’s more than just small sample size involved. It’s quite often the case that CR reviews are dramatically more negative than those on Amazon. I selected this particular item randomly, and this printer was the first item I checked. You can find other items that don’t fit this pattern, and it would be worthwhile to do a systematic comparison and see to what extent the pattern holds. But I believe this printer is not an outlier.
My suspicion is that it has to do with who goes to each site and why. Perhaps people log on to CR to review a product mainly when they are annoyed. Gilbert and Karahalios studied why people write reviews on Amazon—particularly when an object has already been heavily reviewed and their review says the same thing as previous reviews. They found that some reviewers (“pros”) review for Amazon as a hobby, and take pride in the quality of their reviews. Others (“amateurs”) describe their reviews as “spontaneous” and “heartfelt”—they want to express how they feel about the product. Gilbert writes that Amazon reviews by amateur-style writers have a bimodal distribution—people write because they love a product or hate it. CR gets only one peak—the folks who hate it. The interesting question then becomes, why does CR get only one side of the story? What is it about the site design and its positioning that creates this effect? Further, is there any systematic way we can understand the bias valence of different review sites? Which sites tend to be biased in what ways, and why? Can this help site designers to create review infrastructures that are more useful to their customers? Can we help customers to be better readers, knowing which reviews to believe?
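To make the self-selection hypothesis concrete, here is a toy simulation of my own (not from Gilbert’s study; all the numbers are made-up assumptions): the same underlying population of owner opinions can produce very different site averages if one site attracts both lovers and haters while the other attracts mostly the annoyed.

```python
import random

random.seed(0)

# Hypothetical population of owner opinions, skewed positive (weights are
# invented for illustration): 1-5 stars.
opinions = random.choices([1, 2, 3, 4, 5], weights=[1, 1, 2, 3, 5], k=10_000)

# "Amazon-style" self-selection: people who love (5) or hate (1) the product
# nearly always review; the lukewarm review only occasionally.
amazon = [o for o in opinions if o in (1, 5) or random.random() < 0.2]

# "CR-style" self-selection (my hypothesis above): mostly the annoyed bother.
cr = [o for o in opinions if o <= 2 and random.random() < 0.5]

avg = lambda xs: sum(xs) / len(xs)
print(f"Amazon-style average: {avg(amazon):.2f}")  # roughly four stars
print(f"CR-style average:     {avg(cr):.2f}")      # roughly one to two stars
```

The point of the sketch is simply that neither average is “unfair” in isolation; each faithfully summarizes the people who chose to write.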
My initial question was, “Are online reviews fair?” I want to argue that that’s not a well-formed question. Better versions might be, “Under what conditions are online reviews of products and services more or less useful to consumers?” and “In what ways do design features of online sites affect reviews that users write?”
In his essay “Thick Description: Toward an Interpretive Theory of Culture,” Clifford Geertz wrote that “The locus of study is not the object of study. Anthropologists don’t study villages (tribes, towns, neighborhoods …); they study in villages.” Some of our loci of study are interesting in themselves—Amazon is Amazonian in size, and worth understanding. But I wonder how much we can develop more broadly relevant insights without comparing villages. It may be easier to understand Amazon when you have Consumer Reports for contrast. Though the two sites differ in so many ways that systematic comparison is a daunting task.
There is a need for good research at all different levels of specificity, from the absurdly general (“Are online reviews fair?”) to the absurdly specific (“In what ways do Amazon.com user reviews of inexpensive consumer printers help people to make good purchasing decisions?”) Researchers trying to build personal reputations tend to err towards making overly general claims—there’s more glory/credit in answering the big questions. But there’s more substance in making an appropriate level of claim for the significance of your findings.
In the same essay, Geertz writes, “Small facts speak to large issues, winks to epistemology, or sheep raids to revolution, because they are made to.” Geertz is a poet, and that line resonates in my mind with my stores of T.S. Eliot and Billy Collins. But I still wonder what it actually means.
I started writing this post for a reason that will seem unrelated. A friend asked how I reconcile the fact that Sherry Turkle and danah boyd are studying similar phenomena—changes in teenage life and family relationships in the presence of mobile and social computing—and coming to quite different conclusions. To unfairly paraphrase, Sherry believes that we are “alone together,” and the technology is changing human relationships for the worse. danah believes that “the kids are alright”—that the kinds of things teens use these technologies for are quite similar to those same age-appropriate behaviors enacted with previous technologies, and teens are negotiating their stage of life just fine. My answer is that they are both right, and claims are being made for their work (by others more than by them) that over-generalize their results. Metaphorically, one is studying Consumer Reports and the other is studying Amazon.com, and people are taking their results as being about online reviews in general. (See my post on smart phones and parenting for some examples of both good and bad changes catalyzed by this technology.)
The hard work still to be done is to integrate these two perspectives, and understand their relationship. The important work is to identify what key questions we haven’t yet asked—questions whose answers have actionable consequences. Whether we’re talking about kids and parents on cellphones at dinner or online reviews of which cellphones they should buy, researchers need to ask useful questions and draw conclusions at the right level of specificity.