This study identifies and explores evolving concepts of trust and privacy in the context of user-generated health data. We define "user-generated health data" as data captured through devices or software (whether purpose built or commercially available) and used outside of traditional clinical settings for tracking personal health data. The investigators conducted qualitative research through semistructured interviews (n = 32) with researchers, health technology start-up companies, and members of the general public to inquire why and how they interact with and understand the value of user-generated health data. We found significant results concerning new attitudes toward trust, privacy, and sharing of health data outside of clinical settings that conflict with regulations governing health data within clinical settings. Members of the general public expressed little concern about sharing health data with the companies that sold the devices or apps they used, and indicated that they rarely read the "terms and conditions" detailing how their data may be exploited by the company or third-party affiliates before consenting to them. In contrast, interviews with researchers revealed significant resistance among potential research participants to sharing their user-generated health data for purposes of scientific study. The widespread rhetoric of personalization and social sharing in "user-generated culture" appears to facilitate an understanding of user-generated health data that deemphasizes the risk of exploitation in favor of loosely defined benefits to individual and social well-being. We recommend clarification and greater transparency of regulations governing data sharing related to health.

Keywords: Big data, health data, terms and conditions, trust, privacy, sharing

Rice University, USA. Corresponding author: Kirsten Ostherr, Department of English, Rice University, 6100 Main St. MS-30, Houston, TX 77005, USA. Email: kostherr@rice.edu

Introduction

As the field of medicine has begun to embrace big data, a problematic truism has taken hold: more data equals more knowledge equals better health outcomes. Ubiquitous environmental and lifestyle data from wearable technologies and mobile apps promises to uncover new indicators of health and illness from outside of traditional clinical settings (Steinhubl et al., 2015). While the often-cited "Four V's" of big data, "volume, variety, velocity, and veracity," all hint at the complexity of deriving straightforward insights from these new sources of data (Raghupathi and Raghupathi, 2014), the governing logic of many business and research enterprises holds that the unfettered flow of data will yield real value as soon as it is coupled with appropriate analytics. But what new kinds of knowledge might these insights reveal and for whom might they improve outcomes? The novel achievements of user-generated health data rely heavily on participants' willingness to share their data, even when doing so may not serve their own best interests (Leaf, 2015). The question of who benefits from big health data is therefore entangled with questions about data ownership, sharing, trust, and privacy. This study explores how concepts of trust and privacy are changing in the context of user-generated health data and analyzes how researchers, start-up companies, and members of the general public interact with and understand the value of user-generated health data as a key component of big health data.
Members of the general public, including patients, have begun to play a newly important role in collecting data about health and disease (Sarasohn-Kahn, 2014). With the rise of the mobile web and the growth of smartphone use (Rainie and Wellman, 2014), citizens' daily lives have become experiments "in the wild," whose digital traces offer new opportunities and challenges to researchers seeking to gather information about human behavior and exposures outside of the controlled settings of lab-based studies. This phenomenon has emerged with the rise of "user-generated content" (UGC), defined as content that "comes from regular people who voluntarily contribute data, information, or media that then appears before others in a useful or entertaining way, usually on the Web—for example, restaurant ratings, wikis, and videos" (Krumm et al., 2008; Van Dijck, 2009). As researchers and marketers began to mine UGC for insights and predictors of user behavior in the early 2000s, the relevance to health of what might be considered incidental data, such as global positioning system (GPS) or social media data, quickly became apparent. In addition, the growing popularity of wearable health and wellness trackers, such as the Fitbit, Jawbone UP, the Apple Watch, and others, has created an abundance of user-generated health data. Like the incidental health data derived from GPS or social media, user-generated health data is produced, shared, and exploited under poorly defined privacy and ownership policies (Lupton, 2016; Neff and Nafus, 2016).

In light of the growing importance of patients and consumers in the life cycle of big health data creation and exploitation, the need for clarity around the role of user-generated health data in commercial and scientific enterprises is pressing. When the Precision Medicine Initiative (PMI) was launched in the United States in 2015, it was described as "a new way of doing research that fosters open, responsible data sharing with the highest regard to participant privacy, and that puts engaged participants at the center of research efforts" (NIH, 2015). The premise of "open, responsible data sharing" rests upon the assumption that future uses of PMI datasets would not produce harmful unintended consequences for data donors, yet legal scholars and data scientists have shown that data privacy is virtually impossible to ensure (Ohm, 2010; Pasquale and Ragone, 2014). Moreover, little is known about how and why participants engage in data sharing, what privacy means to those participants, what individuals think researchers and businesses can and should do with their data, and what users think they might gain (or lose) by sharing their data beyond their personal social networks.

This study contributes to the growing body of research on the role of big data, personal data, and data sharing in healthcare by illuminating how members of the general public, health researchers, and health information technology start-up companies understand the meaning and value of user-generated health data. While this study has global implications, it is primarily focused on the effects of policies governing health and social data in the United States.
We began this study with the broad question: "how is user-generated health data transforming ideas about health, both within and beyond medical contexts?" However, our research quickly identified the concepts of trust and privacy as particularly critical for shaping the value of user-generated health data, so we narrowed the focus of our interviews to prioritize those terms. We define "user-generated health data" as data captured through devices or software (purpose built or commercially available) and used outside of traditional clinical settings for tracking personal health data (such as wearable heart rate monitors, step-counters, and sleep trackers). For the purposes of this paper, we define "medical contexts" (used interchangeably with "traditional clinical settings") as those sites where formal doctor–patient interaction is governed by health law such as the Health Insurance Portability and Accountability Act of 1996 (HIPAA) and the U.S. Food and Drug Administration (FDA) approval process governing the use of medical devices, including some digital health tools. Our research explores how distinctions between clinical and nonclinical spaces and practices are changing in the context of mobile health technologies and user-generated health data. Therefore, while the blurring of boundaries between the clinical and the nonclinical, or between medical and health/wellness domains, may seem to suggest that these distinctions are becoming less relevant (Fiore-Gartland and Neff, 2015), we nonetheless recognize that existing rules define regulatory boundaries between consumer-facing software applications and devices (which do not require FDA approval and are not governed by HIPAA) and clinical-facing apps and devices (which are regulated by FDA and HIPAA). When considering how user-generated health data travels through social and information networks, the boundaries between the nonclinical and the clinical remain quite relevant, with significant implications for our study.

After the description of our research methods, the first section of this paper describes how the mobile technologies that have facilitated the rise of "user-generated data" have enabled new forms of autonomy for patients and new processes of health datafication, raising important questions about the meaning of privacy and sharing in this new context. The remaining sections of the paper describe and analyze the key results of our interviews. In the second section, we discuss how the concept of trust shapes users' attitudes not only toward sharing their health data, but also toward their assessment of the significance of the data itself. The third section explains how concepts of privacy have become more flexible in relation to evolving attitudes about the value of user-generated health data, with significant consequences for users' willingness to agree to device and software "terms and conditions." In contrast, in the fourth section we analyze the growing unwillingness of individuals to participate as "human subjects" in health data science, despite their willingness to agree to corporate terms of use that entail commodification of their privacy, and the implications for models of data sharing, privacy, and trust.

Methods

We first conducted a literature review of academic, journalistic, and gray literature focused on key concepts in user-generated health data, such as quantification, big data, mobile technology, and digital health. The results highlighted the interconnections between industry and the academy, as health researchers are adapting consumer-facing wearable technologies in their work, while health technology companies are drawing on behavioral science to validate and promote the claims of their devices. Widespread consumer adoption of health-tracking devices has demonstrated the acceptance of these relatively new technologies outside of clinical settings. Our aim was to characterize and critically interrogate how different groups of stakeholders understood the concepts of trust and privacy through a methodological approach that bridged discourse analysis, ethics, and science and technology studies. The study protocol was approved by an Institutional Review Board.

On the basis of our literature review we identified three target populations to interview, according to the following inclusion criteria: participants must be healthy adults and either researchers who interact with user-generated data, employees of a business that interacts with user-generated data, or members of the general public who interact with user-generated data. Researchers included behavioral and computational scientists, businesses included health information technology start-up companies (including software and device developers), and members of the general public included individuals who use wearable devices or apps to capture their own health data. Together, these three groups cover the spectrum of actors who, through their professional and everyday activities, shape the ideas around and practices of using technologies that produce user-generated health data.

We initially conducted informal, unstructured interviews with three researchers and three start-up companies to help us further identify core issues for these groups. On the basis of those interviews, we developed semistructured interview scripts for each cohort.
The The third section explains how concepts of privacy interview questions for the researchers and start-ups have become more flexible in relation to evolving atti- were closely aligned and focused on what kinds of tudes about the value of user-generated health data, data our interlocutors used in their research or busi- with significant consequences for users’ willingness to ness; what role user-generated data played in their agree to device and software ‘‘terms and conditions.’’ In work, whether they had to develop novel consent pro- contrast, in the fourth section we analyze the growing cedures or terms of use for user-generated data; how unwillingness of individuals to participate as ‘‘human they saw this type of data as different from other forms subjects’’ in health data science, despite their willing- of data, whether there were new business or research ness to agree to corporate terms of use that entail com- challenges that arose from user-generated data, modification of their privacy, and the implications for whether new privacy and security issues emerged from models of data sharing, privacy, and trust. this type of data; and what they saw as the major bene- fits of working with this new kind of data. We recruited participants from September 2015 to January 2016 by Methods networking with local experts to identify 10 researchers We first conducted a literature review of academic, and 10 start-up companies to interview. We completed journalistic, and gray literature focused on key concepts nine researcher and six start-up interviews over four in user-generated health data, such as quantification, weeks from January to February 2016, mostly in big data, mobile technology, and digital health. The person at their offices, occasionally over the phone. results highlighted the interconnections between indus- The interview questions for end users (members of try and the academy, as health researchers are adapting the general public) were shaped by published literature consumer-facing wearable technologies in their work, reporting users’ attitudes toward health-tracking while health technology companies are drawing on devices, as well as informal ethnography with users in behavioral science to validate and promote the claims the local community. The questions asked were what of their devices. Widespread consumer adoption of kind of device was used for tracking health data, what health-tracking devices has demonstrated the accept- they used it for, when and why they started using it, ance of these relatively new technologies outside of clin- how they use it on a daily basis, whether they see this ical settings. Our aim was to characterize and critically kind of health data as different from data they might interrogate how different groups of stakeholders under- receive in a clinical setting, whether they share their stood the concepts of trust and privacy through a meth- data with anyone else, why or why not, whether they odological approach that bridged discourse analysis, think anyone else has access to their data, whether they ethics, and science and technology studies. The study read the terms and conditions for the device, and what protocol was approved by an Institutional Review they like or dislike about using the device. Through Board. 
Through convenience sampling over four weeks in February and March 2016, we conducted 17 interviews with an average length of 20 minutes each by approaching members of the general public in three highly trafficked urban parks. Our participation rate was approximately 80%. Upon completion of 32 interviews, the recordings were transcribed and the interviews were manually coded by six members of the research team, using an inductive approach to identify latent themes in the data.

Autonomy and health datafication in an age of user-generated culture

The emergence of user-generated health data—as distinct from clinical health data—is part of a larger zeitgeist of "user-generated culture" that has captured the attention of individuals, corporations, hospitals, and governments within the past decade (Füller, 2016). Entities like Uber, Airbnb, and Tinder are keystone examples of how devices and the data they produce have transformed their respective industries through new patterns of digital intermediation (Benghozi and Paris, 2016). While user-generated health data appears to be part of a larger cultural trend in mobile device integration, healthcare is a unique domain with a specific set of histories, demands, and stakes that do not necessarily apply to rideshare networks, real estate tourism, or romantic match-making services.

Mobile health technologies now enable users to accrue large volumes of real time and longitudinal health data, using methods not typically possible in traditional clinics or "analog" self-tracking journals (Cortez et al., 2014). These user-driven practices generate new types of health data that avoid many of the infrastructures and actors traditionally involved in healthcare and health decision-making. As improved methods for collecting, processing, and storing large datasets are developed, the big health data generated by individual patients may redefine our conceptions of health, disease, and what it means to be a patient (Fox and Duggan, 2013; Topol, 2015). The practices surrounding user-generated health data do not merely convey information; they mediate medical knowledge and help to construct meaning that bridges health and medical domains (Neff and Nafus, 2016; Ostherr, 2013). These practices of technomediation provide an important context for understanding what contemporary scholars have called "datafication," a process of "rendering into data aspects of the world not previously quantified" (Kennedy et al., 2015). Practices of datafication also involve the transformation of existing data into actionable forms that generate diverse and unevenly distributed forms of value for their producers and consumers (Van Dijck, 2014). Contemporary practices of health datafication occur both within and beyond clinical settings, posing challenges to traditional understandings of agency and ownership of medical data (Health Information and the Law Project, 2015).

Alongside and overlapping user-generated health data's relation to "user-generated culture" is the emerging phenomenon of patient-generated health data (PGHD). Like user-generated health data, PGHD often relies on mobile devices to generate health data. Unlike user-generated health data, PGHD is typically enfolded within traditional healthcare ecosystems that include existing privacy infrastructure governed by HIPAA, the Common Rule, and other federal and state regulations (Deering et al., 2013; Thorpe and Gray, 2015). With user-generated health data and the mobile devices that produce them, issues related to privacy and data sharing do not simply evolve within an existing healthcare ecosystem, but potentially formulate an entirely new type of healthcare.

Importantly, user-generated health data from commercial devices are not easily integrated into clinical settings (Chung and Basch, 2015; Luxton et al., 2012). Most patients cannot simply bring their Fitbit data to their cardiologist and expect to receive recommendations based on those data. While a provider could "prescribe" the use of a commercial tracking device for a patient to monitor her cardiovascular activity, incorporating the data from that device into the patient's electronic health record (EHR) would pose significant legal and regulatory challenges. With few exceptions, user-generated health data presently has no place within formal EHR-based medical documentation systems, rendering it invisible in the majority of doctor–patient encounters (Kish and Topol, 2015).
Conversely, health-related device and software companies operating outside of hospitals, clinics, and other HIPAA-protected zones face few restrictions on their exploitation of users' data, as consumers must agree to "terms and conditions" to activate and use the app. In many cases, those terms of use permit the parent company to sell users' health data to third parties, including marketers, advertisers, and other types of data brokers (Shklovski et al., 2014).

While the users who generate health data outside of clinical settings may be vulnerable to third-party exploitation, many see self-tracking tools that put health measurement and quantification into the hands of ordinary users as a democratizing force that challenges traditional doctor–patient knowledge hierarchies. Activists engaged in the Quantified Self and e-patient movements (Ferguson et al., 2007; Nafus and Sherman, 2014) seek to transform the process of health datafication into a process of health data making (Pybus et al., 2016) that generates value for the individuals whose bodies generate the data, rather than solely for the corporations who manufacture those devices or provide formal healthcare services to those bodies (Van Dijck and Poell, 2016). Ironically, concern for the need to protect patient health information through overly cautious adherence to HIPAA guidelines has constrained the expansion of patient autonomy into clinical domains, as new methods for sharing patient data, enabled by electronic communication technologies, have raised concerns regarding ownership, confidentiality, and control (Strauss, 2012; Wilkes, 2015). Paradoxically, some users are more willing to share their health data on an app than with their healthcare provider (Wortham, 2016).
This may result from the device's sociotechnical infrastructure: the social networking capacity that enables users to share their health data is often a key feature in product design and a central marketing component for many health-related apps on mobile devices. With Fitbit, for example, social media connectivity allows users to compare their data and "compete" within their social networks (Nakhasi et al., 2014). Thus, the barriers to sharing user-generated health data within formal healthcare settings are elided by the seeming openness of consumer-facing health apps designed to cultivate unrestricted data sharing (Kim, 2014) outside of clinical settings. The contradiction between the restrictive view of data sharing within medicine and the permissive view of data sharing outside of medicine cultivates a sense of uncertainty among users about the value of privacy and trust on one hand, and openness and sharing on the other. The asymmetry of opportunities for user-generated data to serve the goals of patients inside versus outside of clinical settings points to the conflicting conceptual models that characterize these ecosystems today. These contradictions are giving rise to new attitudes toward privacy and sharing as well as new understandings of the meaning and value users can derive from quantified health data.

Trusting and sharing numerical data

A core tenet of science and technology studies is that empirical evidence is social and situated, rather than objective and neutral (Latour and Woolgar, 1979). Numerical data, in particular, are not to be trusted absolutely but instead considered as contingent outcomes of the social practices that yield them (Porter, 1996). Thus, all data are "user generated." Notably, our diverse sets of interviewees seemed to share this viewpoint as they reflected upon the importance of situating data within networks in order to divine the significance of given numbers. Each group enacted distinct practices to materialize user-generated data as a social object.

Following our preliminary interviews, the topic of trust in numbers, and trust in data, became an animating concern that directed the course of our study. Researcher–interviewees highlighted the fact that user-generated data came from diverse sources "in the wild" and as a result, they were less secure as evidence than data gathered through controlled experiments. By collecting user-generated data from novel sites, researchers expanded the scope of their work; however, the new methodologies raised concerns about how these new streams of data were to be interpreted and trusted. As one researcher told us: "The data we have now has surpassed our conceptual model's abilities to tell us exactly what to do." New models are needed in order to put the numbers into scholarly narratives. Interdisciplinary alliances that brought together diverse genres of expertise facilitated this practice.
In interviews with end users, the topic of trust centered on the submission of their personal data into worlds beyond their technological hardware. Individuals who were less professionally trained in the interpretation of numerical data gained insights into the personal and social significance of their data by sharing it with others. Like science and technology studies scholars, our interviewees situated themselves and their devices within networks (Haraway, 1988). When we asked how they understood the meaning of the term "user-generated data," our interlocutors emphasized its emergence from multiple origins and its circulation through multiple domains. User-generated data, they said, is marked by its immediacy and ubiquity, its "bigness" and "speed," as well as its travels. Encounters with user-generated data are organized through relations across scales and domains, from personal to institutional collaboration and from behavioral strategies to epistemological maneuvers. The particularities of these distinct assemblages guided users' management and interpretation of their data.

Surprisingly, our interviewees expressed little concern about sharing their user-generated health data with corporate actors. They expressed much greater interest in the ways that their data was purposefully shared with known members of their social networks. Several Fitbit users described sharing their daily step counts with others, and emphasized that viewing others' data inspired them to walk more. Their network was composed of themselves, their devices, and the friends that they shared their data with. Numbers were relative. By relating one's personal number to the number of a friend who they could socially situate—as a person of a certain age, with a certain job, in a certain location—these end users measured the significance of their own data. By interpreting their daily step count within the context of this network, they drew motivation that propelled their physical body onward.

In contrast, the researchers we interviewed interpreted data with attention to the diverse genres of expertise that made their research agenda possible. Every investigator who was involved in projects concerning user-generated data was part of a collaborative and interdisciplinary team. As one researcher said:

I think modern science is all about teams now. It's like mapping the human genome happened because we threw really large, smart teams of people at that problem to be able map it. It's the same way now with a lot of the new stuff. [...] I think the old way of people toiling away solitarily in their lab are generally going away. I collaborate on—all my current grants have electrical engineers on them. They have computer scientists. They have computational scientists on them. They often have geneticists. I don't know what to do with any of those. [...] We work as a team and actually using big data, but there are specific people that actually do the computational models because they're far beyond me.
Bioengineers, software developers, psychologists, and others combined their complementary expertise in order to enact their research design. Each recognized the involvement of their collaborators as essential, often admitting that they were unqualified to perform that work themselves. In this way, the difficulty of trust in numbers materialized through user-generated data collection is resolved through trust in collaborators who together establish the viability of this type of data as evidence.

Businesses, on the other hand, situated data within networks composed of hospitals, physicians, patients, government regulators, and hackers—each with their own perspectives and capacities. They, too, entered relations with other genres of expertise to manage their enterprise. As an employee from one healthcare start-up said: "you have to get a consultant who's familiar with what you're collecting and familiar with that landscape in order to come and help you understand any regulation around it [...] there are social, moral, other considerations as well." Unlike other interviewees, the primary goal of start-ups was not to materialize numbers that could be put toward self-realization or scholarly argument; rather, their goals were financial. As such, they strategically managed the complexities of the networks they worked within. This task centered on the enclosure of data to ensure patient privacy and proprietary rights, and as a result the social voyages of data became less of an enactment of meaning and more of a threat to be managed. Despite the enhanced technological features of cloud computing, for example, an employee from another healthcare start-up warned that "pushing the data outside the hospital is a challenge, so if you wanted to store and process data in the cloud it's just not gonna happen right now because hospitals don't want to put their data outside the networks." Another employee noted that "instead of building things ourselves, we will use as much premade items as possible to reduce any of our risk." Indeed, while they stated that the emergence of large and comprehensive datasets could make healthcare more efficient and effective, they identified the barriers that privacy advocates and competing business enterprises placed upon the circulation of data as a hindrance.

Because of its immensity and immediacy, user-generated data offers unique possibilities to those who encounter it. However, these traits also make it difficult for any one individual to interpret this data in isolation from other individuals and other datasets. By locating user-generated data within networks, its significance came into sharper focus as points of contrast and genres of expertise were brought to bear on disconnected and incomplete numbers.

Privacy as flexible cultural artifact

Data privacy and security have become major topics of concern in the post-Snowden era (Pybus et al., 2016), with special emphasis on the vulnerability of health data (Dockery, 2016; Ornstein, 2015). As one interlocutor noted, health and financial data constitute sensitive objects that need careful management and protection. However, we found that concerns about privacy in healthcare differ in substance, depending on the actor's position and stakes in the chain of data collection, storage, and use. Thus, health data privacy is not a stable natural object that has value regardless of the subjects who enact it; rather, health data privacy is a multifaceted cultural artifact that becomes assembled and maintained within a complex ecology of alliances and disconnections.

A recent survey by the Pew Research Center found that most Americans "strongly agree" that maintaining privacy and confidentiality in their everyday activities is important (Madden and Rainie, 2015). Yet, we found that very few individuals we interviewed held these concerns. When asked whether they thought that their data was being used by anyone for purposes that they were unaware of, one respondent replied: "They might be. I don't really care if they do or not ..." Almost all our interviewees agreed that they might have shared their health data with third parties without being fully aware. Some assumed that corporations such as Apple collect their data automatically, with the purpose of producing more technologically sophisticated—and thus, "better"—services and devices. Most did not actively think about how their data was viewed by the companies that manufacture the devices and apps. When pressed, most felt that the manipulation of this data by other parties was innocuous, since, in their view, it was likely only valuable in the aggregate.
and process data in the cloud it’s just not gonna happen When pressed, most felt that the manipulation of this right now because hospitals don’t want to put their data by other parties was innocuous, since it was likely data outside the networks.’’ Another employee noted only valuable in the aggregate, in their view. that ‘‘instead of building things ourselves, we will use as While users were generally aware that consenting to much premade items as possible to reduce any of our a company’s terms of use constitutes a legal contract, risk.’’ Indeed, while they stated that the emergence of very few reported actually reading those agreements large and comprehensive datasets could make health- before consenting to them. One participant com- care more efficient and effective, they identified the bar- mented: ‘‘Do I ever read ‘terms of use’? Did I actually riers that privacy advocates and competing business read the consent form I just signed? No. I just agree to enterprises placed upon the circulation of data as a everything like I do for all of my Apple updates. Agree. hindrance. Agree. Done. So, no.’’ Attitudes like this one appear to Because of its immensity and immediacy, user-gen- be the norm, and they highlight the contrast between erated data offers unique possibilities to those who the widespread concern captured by the Pew survey encounter it. However, these traits also make it difficult described above and the casual attitudes associated for any one individual to interpret this data in isolation with informal, social settings for user-generated health from other individuals and other datasets. By locating data sharing. Ostherr et al. 7 Similarly, a survey of over 10,000 users in 20 differ- However, some users proactively reframe their con- ent countries (Internet Society, 2012) asked: ‘‘What are cerns about privacy by emphasizing their acquired cap- the main reasons you accept the terms and conditions acity to monitor their own health and fitness levels, as offered, without reading them?’’ A full 42% of practice preventative self-care, and thereby potentially respondents noted the length of the document, while avoid costly medical services. Although some expressed 19% of respondents indicated that the legal termin- laughingly that they probably should care more about ology was difficult to understand, and 11% selected third-party use of their health data, the majority of our ‘‘I don’t have a choice if I want to complete an activity interviewees said they did not think about it much. that I need to complete.’’ These responses indicate that Given the terms of use, interviewees indicated that users feel they have no agency in controlling access to they value using their health-tracking apps more than their data. Notably, the inclusion criteria for this study they value their data privacy. As several interviewees selected for interviewees who already use some form of suggested, attitudes toward privacy in healthcare are software or app to capture health data, and therefore, changing, under the conditions of rapid technological we could only include individuals who have already change and its impact on patterns of sociality. Another consented to the provider’s terms of use. 
But if personal privacy is as important as public debate and the experiences of researchers (described below) would suggest, further explanation of this behavior is needed. In his discussion of "digital market manipulation," legal scholar Ryan Calo describes the cognitive overload that users experience when faced with the prospect of reading through the multiple pages of "legalese" that constitute the average terms of use document:

... too much or extraneous information is said to underlie a host of departures from rational decision-making. For example, 'information overload' causes consumers to rely on heuristics or rules of thumb, shortcuts which are sometimes faulty. The phenomenon of 'wear out,' which suggests consumers tune out messages they see too often, renders product warnings less effective. (Calo, 2014: 1012)

The length, complexity, and ubiquity of these agreements may be leading end users to forego control over their health data because the "heuristics or rules of thumb" that they operate under lead them to believe that the entity who is collecting the data will not use it maliciously. It may surprise many users to know that the major wearable technology companies all reserve the right to share personal, identifiable data in the process of business deals (Fitbit, 2015; Garmin, 2014; Jawbone, 2014; Misfit, 2015). Furthermore, these companies have virtually no restrictions on the ways that they may use and sell aggregated, unidentifiable data that they collect on the users of their technologies. "[T]he law's always behind technology," one researcher commented, indicating that legal categories available today and newly developed technological solutions do not neatly map onto each other.

However, some users proactively reframe their concerns about privacy by emphasizing their acquired capacity to monitor their own health and fitness levels, practice preventative self-care, and thereby potentially avoid costly medical services. Although some expressed laughingly that they probably should care more about third-party use of their health data, the majority of our interviewees said they did not think about it much. Given the terms of use, interviewees indicated that they value using their health-tracking apps more than they value their data privacy. As several interviewees suggested, attitudes toward privacy in healthcare are changing under the conditions of rapid technological change and its impact on patterns of sociality. Another interviewee observed:

What we never thought we would post is being posted by the people who thought that they would never post ... the definition of privacy will completely change as we move forward. ... We're actually quite adaptive. It will change. Maybe that's actually the reason why as a species we are very successful, because we change with what we feel has value for us.

Our research suggests that the concept of privacy itself is undergoing change in the public consciousness, and the legal system has not kept pace.

Human subjects in data science

Participants in data science research appear as sources of data and as protected legal subjects. Researchers working with user-generated health data thus require sophisticated technical knowledge and skills to deidentify collected data, police access to those data, and ensure that any public appearance or use of the data is in full compliance with the law. Yet these assurances have done little to persuade participants that their privacy will be protected in research settings. Several researchers engaged in the development of new health technologies reported considerable difficulty engaging study participants:
When I would mention, hey, here is the type of data we collect. It immediately puts some people in a very alert, semi-panic mode. This is way too much information you are collecting about people. I think the reason is that we hear a lot of stories of how perhaps different companies know a lot about us. Or maybe government knows a lot about us. I try to tell them there's a difference between the two approaches. One is being, you are being tracked without you being told, and without you knowing who is likely looking at it ... However, in this situation the way we collect data actually is completely different. It's a fully informed situation where the participant or the patient is actually told what information will be collected, right. Even the method of collection. At the same time, they're also told who will likely look at it, and what they plan to do with it. Then, in fact, they are given guarantees that this data will be sandboxed to the point that only these two or three people can actually look at it.

The reticence to consent to research governed by Common Rule and HIPAA regulations contrasts sharply with our interviewees' reported comfort with data collection by for-profit companies who are not beholden to HIPAA guidelines at all (see Comstock, 2016). This contrast reflects larger contradictions around issues of privacy, sharing, and trust as health data crosses boundaries between clinical and social domains. As Metcalf and Crawford (2016) and Zwitter (2014) have argued, the field of big data research is rapidly outpacing institutional ethics regulations, leading to major disputes over the meaning of ethical human subjects research in data science. "[H]ow a particular patient feels versus how the general public feels about the same data" matters, as one researcher mentioned, gesturing at different emotional responses and conceptual vocabularies around privacy that he encounters in his work with user-generated health data.
The cultivated loyalty to privacy as a weapon in the cyberwar over personal data prompts broader, more systemic thinking about the conditions in which health becomes expressed as an individual responsibility and concern. As various thinkers (Foucault, 2009; Rabinow and Rose, 2006) have argued, health and biological vitalities are major analytics for governing individuals and populations in contemporary states. In neoliberal environments with receding welfare provisions and strong emphasis on personal autonomy and independence, resilient health and vitality become matters of individual responsibility and choice (Rose, 2006). Instead of addressing larger conditions that produce toxic environments, systemic poverty, and inadequate social resources, contemporary discourses of corporate and state care nurture the ideas of health and healthcare as primarily individual concerns and responsibilities (Jain, 2012). In this context, the imperative to track and take control of one's health, as a privacy that must be defended, solidifies the idea that individuals must take full credit for maintaining and investing in their own health and wellness.

At the same time, the connective capacities of digital technologies create novel opportunities for alternative "data-making" practices (Pybus et al., 2016) that challenge the idea of personal health data as private property that only provides individual value, security, and wealth (Agus, 2016). One example is PatientsLikeMe, an online platform that enables sharing of user-generated health data with a network of strangers committed to the idea that patient-generated knowledge would benefit the community by making health management more accessible, more supported, and less isolating. Engaging user-generated data as a tool to connect and relate to others, to offer encouragement, or to foster competitive spirit gives room to a different kind of sociality in which data is not a threat but a component of the very social fabric. However, as Van Dijck and Poell (2016) have argued, data use is loosely regulated on many online health platforms, allowing for commodification and exploitation by actors with less community-minded goals, raising once more the question of who truly benefits from big health data.

Conclusion

One of the surprising results of our interviews was that, despite the constant reporting of large-scale data breaches around the world (Comey, 2016), our interlocutors felt little concern about sharing their user-generated health data with corporations. Why? Some interviewees suggested that the transactional nature of their consent overrode any concerns about privacy; individuals had already decided that they wanted to use a device or piece of software, so they consented to the terms of use in exchange for access to the product they desired. Campbell and Carlson (2002) note that the commodification of privacy is often presented as a necessary feature of consumer access to popular platforms such as Facebook, and the social sharing features of technologies that produce user-generated health data further encourage users to see "health" as a commodified benefit of the exchange of personal data. Turow et al. (2015) have called this "the tradeoff fallacy," noting that most Americans feel it is impossible to limit access to their data, and instead see digital profiling as inevitable. Our research also suggests that the "black box" (Pasquale, 2015) surrounding these transactions may obscure the true nature of the exchange, with potentially harmful results, including the widespread perception that it is impossible for users to opt out of participation in surveillance practices (Elmer, 2003). As Wilbanks and Topol (2016) have argued, "undisclosed algorithmic decision-making" based on user-generated health data could lead to "discriminatory health actions" against the very users who willingly shared their own data.
Entangled with the emergent understanding of privacy as flexible and contextual, our research also identified a new concept of communities of data sharing. Many of our interviewees expressed a willingness to create and share personal health data with other users. The rhetoric of personalization and sharing appears to facilitate an understanding of user-generated health data that deemphasizes the risk of exploitation in favor of loosely defined benefits to individual and social well-being. In this model, the concept of personalization emerges from the purposeful creation of networks of family members or friends with whom individuals share data to motivate or make sociable their data-tracking activities. The sense that each of these networks is highly personal to its creator seems to override awareness of the other, less benevolent entities with whom the data is being shared.

An interesting corollary was the idea expressed by several researchers that data sharing by the general public would lead to greater improvements in health outcomes than previously possible through lab-based research. The rhetoric of openness and sharing has begun to frame data exchanges among researchers just as it has shaped the practices of casual users in online health platforms. The complex nature of user-generated health data research has also given rise to the formation of multidisciplinary research teams whose members must participate in "sharing" across traditional disciplinary boundaries to accomplish their research objectives. In this sense, the new model of communal data sharing can be seen as having a transformative effect on the conduct of scientific big health data research as well.

Importantly though, many of the data scientists we interviewed described significant challenges in recruiting research participants, despite the relaxed attitudes individuals expressed about consenting to terms and conditions that enable corporations to freely exploit their users' data. Our study suggests that when data privacy is explicitly foregrounded in the process of obtaining verbal consent, and participants are addressed as individuals rather than as anonymous consumers, the risk of malevolent data exploitation appears significantly more threatening. Ironically, researchers who are required to participate in ethics review procedures and follow explicit protocols for data privacy, security, and storage are subject to considerably more suspicion by members of the general public than are the corporations that overtly participate in data profiling with far less ethical supervision. Under these asymmetrical circumstances, researchers are penalized for raising public awareness of the procedures required to conduct health data research ethically, while businesses are free to benefit without restriction.
What emerges from these seemingly contradictory attitudes about sharing user-generated health data is a fluid, contextually specific, and social conception of privacy. A major implication of this finding is that there is a significant disconnection between the regulatory policies governing the sharing of health data for research and patient care, on one hand, and those policies governing corporate practices on the other. Moreover, there is an additional disconnection between public discourse on threats posed to personal privacy by data piracy, security leaks, and identity theft on one hand, and public interest in the terms and conditions that actually govern access to their user-generated data on the other. These contradictions suggest that the regulatory frameworks for managing the risks of sharing user-generated health data need an overhaul that brings them into closer conformity with the current attitudes of the general public. At the same time, the casual permissiveness that characterized many of our interlocutors' responses to our terms and conditions question suggests that, in addition to updating our legal frameworks for protecting and sharing user-generated health data, we also need to engage in a more robust public dialog about the potential benefits and harms of openly sharing health data. With recent studies demonstrating the harms embedded in artificial intelligence algorithms that replicate racial and other biases of their human programmers, as well as the growing intermediation of data sources that together might be capable of revealing sensitive personal data (Crawford and Calo, 2016), there is a clear need for regulations that offer consumers easily comprehensible terms of use, with opportunities to opt out of surveillance. Future research on user-generated health data that identifies and explains the effects of participating in this ecosystem—both outside and inside of clinical settings—will provide much needed guidance to policymakers and patients as regulations governing data sharing attempt to catch up with practices in the wild.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.

References
Agus DB (2016) Give up your data to cure disease. The New York Times, 6 February.
Benghozi P and Paris T (2016) The cultural economy in the digital age. City, Culture and Society 7(2): 75–80.
Calo R (2014) Digital market manipulation. The George Washington Law Review 82(4): 995–1051.
Campbell JE and Carlson M (2002) Panopticon.com: Online surveillance and the commodification of privacy. Journal of Broadcasting and Electronic Media 46(4): 586–606.
Chung AE and Basch EM (2015) Potential and challenges of patient-generated health data for high-quality cancer care. Journal of Oncology Practice 11(3): 195–197.
Comey J (2016) Humility, adaptability, and collaboration: The way forward in cyber security. Speech delivered at the FBI/Fordham University international cyber security conference, New York City, New York, 27 July 2016. Available at: https://www.fbi.gov/news/speeches/humility-adaptability-and-collaboration-the-way-forward-in-cyber-security (accessed 1 August 2016).
Comstock J (2016) How consumer health, fitness devices reveal HIPAA's blurry lines. Available at: http://mobihealthnews.com/content/how-consumer-health-fitness-devices-reveal-hipaas-blurry-lines (accessed 28 May 2016).
Cortez NG, Cohen IG and Kesselheim AS (2014) FDA regulation of mobile health technologies. New England Journal of Medicine 371(4): 372–379.
Crawford K and Calo R (2016) There is a blind spot in AI research. Nature 538(7625): 311–313.
Deering MJ, Siminerio E and Weinstein S (2013) Issue brief: Patient-generated health data and health IT. Washington, DC: Office of the National Coordinator for Health Information Technology, 20 December 2013, pp. 1–11.
Dockery S (2016) The morning risk report: Study shows deep flaws in health-care cybersecurity. The Wall Street Journal, 29 June.
Elmer G (2003) A diagram of panoptic surveillance. New Media and Society 5(2): 231–247.
Ferguson T, et al. (2007) e-Patients: How they can help us heal healthcare. Report for the Robert Wood Johnson Foundation. Available at: http://www.e-patients.net/e-Patients_White_Paper.pdf (accessed 1 August 2016).
Fiore-Gartland B and Neff G (2015) Communication, mediation, and the expectations of data. International Journal of Communication 9: 1466–1484.
Fitbit (2015) Terms of service. Available at: https://www.fitbit.com/legal/terms-of-service (accessed 28 May 2016).
Foucault M (2009) Security, Territory, Population. New York: Picador.
Fox S and Duggan M (2013) Tracking for health. Report for Pew Research Center. Available at: http://www.pewinternet.org/2013/01/28/tracking-for-health/ (accessed 1 August 2016).
Füller J (2016) The power of community brands. In: Harhoff D and Lakhani KR (eds) Revolutionizing Innovation. Cambridge: MIT Press, pp. 353–376.
Garmin (2014) Terms of use. Available at: http://www.garmin.com/en-US/legal/terms-of-use (accessed 28 May 2016).
Haraway D (1988) Situated knowledges: The science question in feminism and the privilege of partial perspective. Feminist Studies 14(3): 575–599.
Health Information and the Law Project (2015) Who owns medical records: 50 state comparison. Report for George Washington University, Hirsh Health Law and Policy Program. Available at: http://www.healthinfolaw.org/comparative-analysis/who-owns-medical-records-50-state-comparison (accessed 1 August 2016).
Internet Society (2012) Global internet user survey 2012. Available at: http://www.internetsociety.org/surveyexplorer/online-privacy-and-identity/what-are-the-main-reasons-you-accept-the-terms-and-conditions-as-offered-without-reading-them-14/ (accessed 28 May 2016).
Jain SL (2012) Cancer butch. Cultural Anthropology 22: 501–538.
Jawbone (2014) UP terms of use. Available at: https://jawbone.com/legal/up/terms (accessed 28 May 2016).
Kennedy H, Poell T and Van Dijck J (2015) Data and agency. Big Data & Society 2(2): 1–7.
Kim N (2014) Three's a crowd: Towards contextual integrity in third party data sharing. Harvard Journal of Law and Technology 28(1): 325–347.
Kish L and Topol E (2015) Unpatients: Why patients should own their medical data. Nature Biotechnology 33(9): 921–924.
Krumm J, Davies N and Narayanaswami C (2008) User-generated content. IEEE Pervasive Computing 7(4): 10–11.
Latour B and Woolgar S (1979) Laboratory Life. Beverly Hills, CA: Sage.
Leaf C (2015) The biggest share in the sharing economy. Fortune, 7 August. Available at: http://fortune.com/2015/08/07/digital-health-data/ (accessed 1 August 2016).
Lupton D (2016) The Quantified Self. London: Polity.
Luxton DD, Kayl RA and Mishkind MC (2012) mHealth data security. Telemedicine and e-Health 18(4): 284–288.
Madden M and Rainie L (2015) Americans' attitudes toward privacy, security, and surveillance. Report for Pew Research Center. Available at: http://www.pewinternet.org/2015/05/20/americans-attitudes-about-privacy-security-and-surveillance/ (accessed 2 May 2016).
Metcalf J and Crawford K (2016) Where are human subjects in big data research? Big Data & Society 3(1): 1–14.
Misfit (2015) Misfit terms of use. Available at: http://misfit.com/legal/terms_of_use (accessed 28 May 2016).
Nafus D and Sherman J (2014) This one does not go up to 11: The quantified self movement as an alternative big data practice. International Journal of Communication 8: 11.
Nakhasi A, Shen AX, Passarella RJ, et al. (2014) Online social networks that connect users to physical activity partners. Journal of Medical Internet Research 16(6): e153.
National Institutes of Health, Precision Medicine Initiative (2015) About the precision medicine initiative cohort program. Available at: https://www.nih.gov/precision-medicine-initiative-cohort-program (accessed 1 August 2016).
Neff G and Nafus D (2016) Self-tracking. Cambridge: MIT Press.
Ohm P (2010) Broken promises of privacy: Responding to the surprising failure of anonymization. UCLA Law Review 57: 1701–1777.
Ornstein C (2015) Your health records are supposed to be private. They aren't. The Washington Post, 30 December.
Ostherr K (2013) Medical Visions: Producing the Patient Through Film, Television, and Imaging Technologies. New York: Oxford University Press.
Pasquale F (2015) The Black Box Society. Cambridge, MA: Harvard University Press.
Pasquale F and Ragone TA (2014) Protecting health privacy in an era of big data processing and cloud computing. Stanford Technology Law Review 17: 595–653.
Porter TM (1996) Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Princeton, NJ: Princeton University Press.
Pybus J, Cote M and Blanke T (2016) Hacking the social life of big data. Big Data & Society 2(2): 1–10.
Rabinow P and Rose N (2006) Biopower today. BioSocieties 1: 195–217.
Raghupathi W and Raghupathi V (2014) Big data analytics in healthcare. Health Information Science and Systems 2(3): 1–10.
Rainie L and Wellman B (2014) Networked: The New Social Operating System. Cambridge: MIT Press.
Rose N (2006) The Politics of Life Itself: Biomedicine, Power, and Subjectivity in the Twenty-first Century. Princeton, NJ: Princeton University Press.
Sarasohn-Kahn J (2014) Here's looking at you: How personal health information is being tracked and used. Report, California Health Care Foundation, July 2014.
Shklovski I, Mainwaring SD, Skúladóttir HH, et al. (2014) Leakiness and creepiness in app space: Perceptions of privacy and mobile app use. In: Proceedings of the 32nd Annual ACM Conference on Human Factors in Computing Systems (CHI '14), Toronto, Canada, 26 April–1 May, pp. 2347–2356. New York, NY: Association for Computing Machinery.
Steinhubl SR, Muse ED and Topol EJ (2015) The emerging field of mobile health. Science Translational Medicine 7(283): 283rv3.
Strauss LJ (2012) Patient privacy—Then and now. Journal of Health Care Compliance 61: 19–61.
Thorpe JH and Gray EA (2015) Big data and public health: Navigating privacy laws to maximize potential. Public Health Reports 130(2): 171–175.
Topol E (2015) The Patient Will See You Now: The Future of Medicine Is in Your Hands. New York: Basic Books.
Turow J, Hennessy M and Draper N (2015) The tradeoff fallacy: How marketers are misrepresenting American consumers and opening them up to exploitation. Report, Annenberg School for Communication, University of Pennsylvania. Available at: https://www.asc.upenn.edu/sites/default/files/TradeoffFallacy_1.pdf (accessed 29 July 2016).
Van Dijck J (2009) Users like you? Theorizing agency in user-generated content. Media, Culture and Society 31(1): 41–58.
Van Dijck J (2014) Datafication, dataism and dataveillance: Big data between scientific paradigm and ideology. Surveillance and Society 12(2): 197–208.
Van Dijck J and Poell T (2016) Understanding the promises and premises of online health platforms. Big Data & Society 3(1): 1–11.
Wilbanks J and Topol E (2016) Stop the privatization of health data. Nature 535: 345–348.
Wilkes JJ (2015) The creation of HIPAA culture: Prioritizing privacy paranoia over patient care. BYU Law Review 5(7): 1213–1249.
Wortham J (2016) We're more honest with our phones than with our doctors. The New York Times Magazine, 23 March.
Zwitter A (2014) Big data ethics. Big Data & Society 1(2): 1–6.