A group of Danish researchers, led by Aarhus University graduate student Emil O. W. Kirkegaard, recently publicly released a dataset of nearly 70,000 users of the online dating site OkCupid, including usernames, age, gender, location, what kind of relationship (or sex) they’re interested in, personality traits, and answers to thousands of profiling questions used by the site.
When asked whether the researchers attempted to anonymize the dataset, Kirkegaard replied bluntly: “No. Data is already public.” This sentiment is repeated in the accompanying draft paper, “The OKCupid dataset: A very large public dataset of dating site users,” posted to the online peer-review forums of Open Differential Psychology, an open-access online journal also run by Kirkegaard:
Some may object to the ethics of gathering and releasing this data. However, all the data found in the dataset are or were already publicly available, so releasing this dataset merely presents it in a more useful form.
To those concerned about privacy, research ethics, and the growing practice of publicly releasing large data sets, this logic of “but the data is already public” is an all-too-familiar refrain used to gloss over thorny ethical concerns.
In response to this problematic data release, CIPR director Michael Zimmer published an editorial in Wired: “OkCupid Study Reveals the Perils Of Big-Data Science” (Wired, May 14, 2016). He states, in part:
The OkCupid data release reminds us that the ethical, research, and regulatory communities must work together to find consensus and minimize harm. We must address the conceptual muddles present in big data research. We must reframe the inherent ethical dilemmas in these projects. We must expand educational and outreach efforts. And we must continue to develop policy guidance focused on the unique challenges of big data studies. That is the only way we can ensure innovative research—like the kind Kirkegaard hopes to pursue—can take place while protecting the rights of people and the ethical integrity of research broadly.
Zimmer also appeared on the WUWM Milwaukee Public Radio show Lake Effect to discuss “Big Data Research Creates Ethical Concerns”, noting that:
“So when a researcher like this says, ‘Well this stuff was already public,’ what he kind of really means is like, ‘This stuff was visible to other users who happen to also create a profile,’ and those aren’t the same thing,” says Zimmer. “Psychologically I think it’s important for users when they sign up for this thing to have this assumption, or these set of expectations, that I know this data is kind of public but it’s meant for this community… Doing this kind of research sometimes violates that assumption.”
CIPR is pleased to welcome Dr. Annette Markham, a renowned internet researcher who focuses on social media, ethics, and qualitative methods, to hold an informal workshop with SOIS PhD students on Remixed Methods for Qualitative Research.
We will be discussing Dr. Markham’s recent article, “Remix Cultures, Remix Methods: Reframing Qualitative Inquiry for Social Media Contexts” (PDF), in which she discusses some of the complications associated with studying internet-mediated contexts and offers a research-centered definition of remix. Dr. Markham describes particular elements of remix that have proven to be valuable pedagogical tools for helping disrupt traditional frames for conducting qualitative research in digital contexts: Generate, Play, Borrow, Move, and Interrogate.
Special thanks to Dr. Nadine Kozak for helping organize today’s workshop.
On Friday, September 30, 2011, please join us for a CIPR brown bag research lunch from 12:30-2:00 in Bolton 521 (bring your own lunch).
There will be two short presentations, both focusing on issues in Internet research ethics:
- “Oh the Ethics You’ll Know”, by Nick Proferes, SOIS PhD student. This short (and clever!) presentation shares on-going research into how issues of research ethics are discussed on the Association of Internet Researchers (AoIR) mailing list. This will be a preview of what Nick will present at the AoIR annual conference in October.
- “Research Ethics in the 2.0 Era: Conceptual Gaps for Ethicists, Researchers, IRBs”, by Michael Zimmer, Assistant Professor and Co-Director of the Center for Information Policy Research. This talk contributes to the growing discourse on Internet research ethics by describing conceptual gaps that have emerged in how researchers and IRBs think about privacy, anonymity, consent, and harm in the 2.0 era. This will be a preview of an invited presentation at the International Symposium on Digital Ethics hosted by Loyola’s Center for Digital Ethics & Policy.
We intend to hold informal research lunches (bring your own lunch) a few times each semester, to provide a space for faculty, students, staff, and friends interested in information policy and ethics (conceived of broadly) to share research — both finished and in progress.
If you’d like to schedule a time to present, please contact Michael Zimmer at firstname.lastname@example.org.
Michael Zimmer, a privacy scholar at the U. of Wisconsin at Milwaukee Center for Information Policy Research, says the methods of the Harvard project “should have triggered an ethical concern.”
The Chronicle of Higher Education has published an article featuring CIPR Co-Director Michael Zimmer’s research critiquing the privacy protections and research methods related to the “Taste, Ties, and Time” (T3) Facebook research study conducted by a set of Harvard sociologists.
The article, “Harvard Researchers Accused of Breaching Students’ Privacy”, discusses a variety of privacy and research ethics concerns raised by Zimmer, and also features insights by former CIPR director Elizabeth Buchanan.
Read the full article here, and additional commentary by Zimmer on his blog.