Tuesday, 24 May 2016

The ethics of (public?) data

On the dating site OKCupid, a lot of data is visible to registered users. This profile data is part of the service that OKCupid provides, allowing individuals to learn about and communicate with other site users. Users might assume that what happens on the web site stays on the web site (kind of like Vegas, right?) ... but is that necessarily the case? Nope. Check out this article from the Christian Science Monitor about the reaction to Danish students scraping the site and releasing the "public" data as a data set on a shared data site. Oh, and they took no steps to anonymize it because it was already "public."

http://www.csmonitor.com/Technology/2016/0514/Privacy-online-OKCupid-study-raises-new-questions-about-public-data

What do you think about this case ... as an individual who signs up for and uses various online services? As a professional who needs to access and analyze data?

3 comments:

  1. I have a very radical view of this particular issue. The argument is usually based around an expectation of privacy. My problem with that expectation is that people are attempting to treat social media like their living rooms, when it is really their front yards. For example, if I have a discussion about a political candidate in my living room, I have an expectation that my views are shared privately, only with those I choose to share them with. However, if I put a sign of political support in my front yard, I no longer have an expectation of privacy regarding my political views. An open social media site is not your living room, it's your front yard. The act of posting openly is an indication that you are making the information public knowledge. So while I know an ethics committee would frown on it, and because of that I would not do it, I do not really see any wrongdoing here.

  2. I tend to agree with Mark here because I have always been told that whatever is on the Internet is going to get out in some way. You cannot put things on websites and expect that others will not see them or that they will stay private.

  3. Reasonable or not, a lot of people have that sort of expectation.
    Scholars working in this area (like me) are exploring what leads to that expectation and whether or not it is a reasonable expectation, whether or not it is ethical (which is not the same as legal or permitted by an IRB) for researchers to use such information in their research, etc.

    One of the tricky parts to defining "public" data is considering access. Anyone who is online might access the OKCupid information -- but since OKCupid keeps the information behind a login it doesn't feel quite as public as posting to a blog.

    Defining virtual spaces and the information shared therein as public vs. private is tricky. So to complicate Mark's example -- a living room conversation = private. Posting a sign in the front yard = public. But how about having a conversation with friends in a restaurant that is overheard by people at the next table? A restaurant is a public place. As a researcher, would it be OK for me to collect data by listening to conversations at the next table?
