Public perceptions of good data management: Findings from a UK-based survey

Low levels of public trust in data practices have led to growing calls for changes to data-driven systems, and in the EU, the General Data Protection Regulation provides a legal motivation for such changes. Data management is a vital component of data-driven systems, but what constitutes ‘good’ data management is not straightforward. Academic attention is turning to the question of what ‘good data’ might look like more generally, but public views are absent from these debates. This paper addresses this gap, reporting on a survey of the public on their views of data management approaches, undertaken by the authors and administered in the UK, where departure from the EU makes future data legislation uncertain. The survey found that respondents dislike the current approach in which commercial organizations control their personal data and prefer approaches that give them control over their data, that include oversight from regulatory bodies or that enable them to opt out of data gathering. Variations of data trusts – that is, structures that provide independent stewardship of data – were also preferable to the current approach, but not as widely preferred as control, oversight and opt out options. These features therefore constitute ‘good data management’ for survey respondents. These findings align only in part with principles of good data identified by policy experts and researchers. Our findings nuance understandings of good data as a concept and of good data management as a practice and point to where further research and policy action are needed.

Keywords
Good data, public perceptions, data management, data trust, personal data store, conjoint experiment

Big Data & Society (Hartman et al.)

Sheffield Methods Institute, University of Sheffield, Sheffield, UK
Sociological Studies, University of Sheffield, Sheffield, UK
Copenhagen Business School, Frederiksberg, Denmark
BBC R&D, Greater Manchester, UK

Corresponding author:
Helen Kennedy, University of Sheffield, Elmfield, Northumberland Road, Sheffield S10 2TU, UK.
Email: h.kennedy@sheffield.ac.uk

Creative Commons CC BY: This article is distributed under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

Introduction

Throughout the world, low levels of public trust in data practices have recently been identified (Edelman, 2018; Open Data Institute (ODI), 2018). There is a ‘data trust deficit’, it has been claimed (Royal Statistical Society (RSS), 2014), characterized by mounting concern about the potential negative consequences of the widespread use of data-driven platforms and services. Awareness of limited public trust in data practices (that is, organizational data collection, analysis and sharing and the uses to which the outcomes of these processes are put) brought about in part by high profile global failures to protect people’s personal data from misuse (Cadwalladr and Graham-Harrison, 2018), has led to growing calls for changes to current data-driven systems and the structures that enable them (for example by Doteveryone, 2019a, 2019b).

The General Data Protection Regulation (GDPR), which came into effect in 2018, provides a legal motivation to improve data practices in EU countries adopting this legislation. Under GDPR, individuals have rights with regard to access and portability of their personal data. Coupled with concern about the data trust deficit, this new legislation has led to growing experimentation with alternative approaches to the management of personal data, which some believe would be better for people and society (Hall and Pesenti, 2017; O’Hara, 2019). These include personal data stores (PDSs), in which individuals personally store and manage their data, and data trusts, defined by the ODI (2019a) as ‘a legal structure that provides independent stewardship of data’ for the benefit of all parties.

This context has led a range of policy stakeholders to advocate for responsible and ethical data developments. In the UK, where our research took place, advocates include government centres (such as the new Centre for Data Ethics and Innovation (CDEI)), think tanks (for example Doteveryone) and independent research and advocacy organizations (such as the Ada Lovelace Institute (Ada) and the ODI). In academic circles, attention is turning to what good, responsible and ethical data might look like (for example, Daly et al., 2019). Good data management approaches, like PDSs and data trusts, are a vital part of a responsible and ethical data ecosystem, but what constitutes good data management – that is, data storage, stewardship and decision-making about sharing – is not straightforward.

Policy stakeholders in the UK, like the CDEI and Ada, claim that understanding public views about data practices is essential to ensure that data works ‘for people and society’ (Ada’s mission) and is ‘a force for good’ (a CDEI aim). This also applies to data management: in order to determine what constitutes good data management, public views must be taken into account. Research into public views on good data management is therefore needed, so that these can be factored in to future data policy and practice. Yet to date, public attitudes to data management have rarely been examined, and when they have, research has focused narrowly on user feedback on specific models under development or on fictional scenarios (Sailaja et al., 2019). For this reason, our paper focuses specifically on data management, as opposed to other data practices such as data generation, collection, analysis and sharing.

The paper reports on a survey on public views on different approaches to managing personal data (that is, data related to an identified or identifiable person (GDPR, 2018)), which aimed to fill the gap identified above. The survey was administered in the UK in May 2019 to over 2000 adults. Although the GDPR was adopted in UK law after coming into force, the UK’s withdrawal from the EU is causing uncertainty about future data legislation in the UK. As the UK decides what its post-Brexit data laws will look like, understanding public perceptions of good data management in the UK is extremely timely. For this reason, our survey focused on the UK. In the survey, we found that respondents dislike approaches which give commercial organizations control of personal data in return for the digital services they provide. Respondents expressed a preference for approaches that give them control over data about them, that include oversight from regulatory bodies or that enable them to opt out of data gathering altogether. These approaches are not mutually exclusive, but we separated them out for the purpose of our analysis and comment on their relationship in our conclusion. Variations of data trusts (described in detail below) were also preferable to the status quo, but not as widely preferred as approaches involving personal control, regulatory oversight or the ability to opt out. Thus personal control, oversight and the ability to opt out constituted ‘good data management’ for respondents in our survey.

The paper proceeds to situate our research in the context of debates about ‘good data’ and alternative data management approaches. We then describe our methods and discuss our findings. We conclude with reflections on the significance of our findings for conceptualizing good data and for better data management policy and practice.

Good data and alternative approaches to data management

Good data

The emerging field of critical data studies has done a good job of making visible the many troubling consequences of datafication, including increased surveillance, threats to privacy, new forms of algorithmic control, and the expansion of new and old inequalities and forms of discrimination (Iliadis and Russo, 2016; Kennedy, 2018). More recently, and against this critical backdrop, scholars have begun to consider what ‘good data’ alternatives might look like. One example is Daly et al.’s (2019) edited collection, Good Data, which was motivated by a recognition that although scholars had extensively critiqued problematic data practices, they had not considered more positive alternatives. Devitt et al. (2019) describe Good Data as aiming to open up ‘a multifaceted conversation on the kinds of futures we want to see’ and presenting ‘concrete steps on how we can start realizing good data in practice’. They suggest that asking what constitutes good data is an essential step in advancing the critical scholarship which has exposed the harms and injustices that result from widespread ‘bad’ data practices.

Of course, ‘good’ is a complex concept: it could mean fair, ethical or just, or it could have other meanings, some of which acknowledge the power inequalities that shape datafication more readily than others. On the one hand, in the context of datafication, the concept of good has been used by initiatives which might be seen to depoliticize ‘data relations’ (Kennedy, 2016), such as charitable projects like Data For Good and DataKind. On the other hand, it is open to experience-based interpretation, something that Andrew Sayer (2011) says is important for understanding ‘why things matter to people’.

The term ‘data’ is not straightforward either. Daly et al. (2019) use data as a proxy for the whole DIKW model – that is, the hierarchical pyramid which has data at its base, information above that, knowledge above that and wisdom at the top. In Good Data and elsewhere, including in this paper, data is used as a proxy for the whole data ecology, incorporating structures, data management models, uses and consequences. As such, good data is a metaphor that extends beyond the ‘high quality evidence’ meaning of the term that might be more commonly used amongst data scientists and statisticians.

Good Data’s editors propose a set of principles for good data practices (Devitt et al., 2019), a number of which are relevant to our focus on public perceptions of approaches to data management. Some principles highlight the importance of individual control over what happens to personal data – for example, ‘data subjects must mediate data uses’ and ‘users must be able to understand and control their personal data’. Other principles emphasize collective needs, such as ‘communal data sharing assists community participation’, ‘access to data promotes sustainable communal living’ and ‘open data enables citizen activism and empowerment’. These principles form the foundations for some of the alternative approaches to data management that we discuss below and explored in our survey.

For Daly et al., the motivation to think about good data comes from a belief that data is political and data practices should be evaluated according to whether they are used to enhance social well-being, especially for disadvantaged groups. Good Data thus advocates ‘data methods to dismantle existing power structures through the empowerment of communities and citizens’ (Devitt et al., 2019). We follow Daly et al.’s argument that good data should enhance well-being, especially amongst disadvantaged groups, because it acknowledges the politics of datafication. In order to understand whether particular approaches to data management enhance well-being, we further argue that the views of those impacted by these approaches must be considered, yet neither public perceptions nor data management feature centrally in existing debate about good data. Understanding ‘bottom up’ perceptions of what constitutes good data management is needed, just as it is in relation to other aspects of datafication (Couldry and Powell, 2014). We aimed to fill these gaps with the research we discuss in this paper. Furthermore, as research has shown that inequalities influence perceptions of datafication (Kennedy et al., 2020), the views of diverse populations on what constitutes a good data management model need to be examined.

Approaches to data management

In debates about data management, a number of approaches have been put forward as good alternatives to current arrangements. One is the data trust, ‘a legal structure that provides independent stewardship of data’ for the benefit of all parties (ODI, 2019a; see also Hall and Pesenti, 2017). According to the ODI (2019a), the trustees of a data trust ‘take on responsibility to make decisions about what data to share and with whom’ in order to support the trust’s intended purposes and benefits. One focus of debate has been on the legal status of data trusts, as seen in the definition cited here. In the UK context, a trust is a particular legal structure which does not exist in the same form across all international jurisdictions. However, O’Hara argues that a data trust cannot be a trust in a legal sense. Rather, ‘it takes inspiration from the notion of a legal trust’ (O’Hara, 2019: 4). Our focus in this paper, therefore, is not on the legal dimensions of a data trust, but on its approach to stewardship. A data trust can take many forms, and data management approaches can combine features of a data trust with other features (ODI, 2019a). Table 1 below compares data trusts with other data stewardship models. A further difference relates to the type of data to be managed: some approaches are more appropriate for personal data (such as the PDS), others for open or public interest data (such as the data commons).

Table 1. Distinguishing features of data stewardship approaches according to ODI (2019a).

| Approach | Distinguishing feature |
|---|---|
| Data trusts | Takes what has been learned from the use of legal trusts. Trustees of a data trust will take on responsibility (with some liabilities) to steward data for an agreed purpose. |
| Data cooperatives | Takes what has been learned from cooperatives. A mutual organization owned and democratically controlled by members, who delegate control over data about them. |
| Data commons | Takes what has been learned from managing common pool resources – such as forests and fisheries – and applies the principles to data. |
| Personal data stores | Stores data provided by a single individual on their behalf and provides access to that data to third parties when directed to by the individual. |

Note: The final row of the original table, Research partnerships, has been removed because it is not relevant to our focus here.
Source: reproduced with permission from ODI (2019a).

There are also similarities across different data management approaches. Trusts, co-operatives and commons-based approaches all involve trusted parties overseeing, managing and stewarding data on behalf of individuals and communities. In this sense (rather than in a legal sense), they are all ‘trust-like’. For this reason, in our research, we explored all three of these models: (1) the data co-operative, which manages the collection and storage of its members’ data, is accountable to its members and is governed by a board of representatives constituted by its members; (2) the data commons, similarly collectively motivated, which enables online access to community data which can be used for various purposes and for the benefit of all (see the decode project for an example, https://decodeproject.eu/); and (3) data trusts. We differentiated between two types of trust, building on experimentation that was under way at the time of our survey (ODI, 2019a): (a) a trust governed by an independent responsible party, which makes decisions on behalf of data subjects about who accesses data, what they can do with it and under what circumstances, and (b) a trust governed by multiple independent responsible organizations which manage different types of data in different contexts (for example, one for health data, one for finance data and so on) and represent the interests of all parties involved. We consider these four models as ‘trust-like’ in our discussion below (see Table 2 for a full list of the data management approaches that we explored in our survey).

Table 2. Data management approaches as described to respondents.

| Name | Description |
|---|---|
| Personal data store | You are given a secure place to collect, store and manage the data about you which has been collected by other services. This is called a personal data store, or PDS. You have access to this data, and you can decide who else can access this data, how they can use it and under what circumstances. The purpose of the PDS is to give you personal control over your data, which you can manage in a secure way. |
| Responsible independent party | You are given a way to nominate a responsible independent party to oversee collection, storage and access of your personal data. They have legal responsibilities to look after your data. In line with your wishes, the nominated party can make decisions on your behalf about who accesses your data, what they can do with it and under what circumstances. You have a say over what happens to your data, but you are not personally responsible for looking after it. |
| Responsible independent organizations | Responsible independent organizations manage your data in different contexts (e.g. one for health data, one for finance data, etc.). These organizations make decisions about who can access your data, what they can do with it and under what circumstances. They have legal responsibilities to manage access to your data in ways that represent the interests of all parties involved. |
| Digital service (status quo) | You sign up to a new digital service (e.g. an online shop) that collects and uses your data. You are asked to agree to terms of use and a privacy policy beforehand. These describe how the service will collect, store and manage data about you. You are given settings you can alter, but you are not able to change or negotiate these terms or see how your data is used. This approach gives services control over your data (this is what usually happens now). |
| Data co-operative | You become a member of a data co-operative that manages the collection and storage of its members’ data and is accountable to its members. As a member, you can put yourself forward to sit on a board of representatives and make decisions about who has access to members’ data, how it is used and under what circumstances. Or you can vote for other co-operative members to do these things. The purpose of the data co-operative is that your data is managed collectively, by the people whose data is in the co-operative. |
| Public data commons | You access data online about your area and community using an open data platform that is accessible to all citizens under commons law. This is called a public data commons. The data commons collects, stores and manages access to open data which can be used for various purposes. Everyone can access and use this data, in line with the commons’ rules of engagement. The purpose of the public data commons is to make data accessible so everyone can benefit from it. |
| Regulatory public body | You have been given the details of a new regulatory public body that oversees how organizations access and use data, acting on behalf of UK citizens. This public body provides oversight over how organizations collect, store and use personal data. It can hold organizations accountable for misuse (e.g. fine organizations when they breach terms of use). The purpose of the regulatory body is to ensure that personal data are collected, stored and used in legal and fair ways. |
| Data ID card (opt out) | You have the ability to choose whether to opt out of online data collection, storage and use – this is called managing your data preferences. Your data preferences are stored on a data ID card. You can use this card to log onto online sites. The card automatically opts you out of data collection, storage and use according to your preferences and whenever this is possible. The purpose of the data ID card is to give people the option of opting out of having their data collected. |

As can be seen in Table 2, we also explored other solutions to the perceived data trust deficit in our survey. One is the PDS, also included in Table 1. The PDS is seen as a more trustworthy approach to managing personal data than current models (for example by Janssen et al., 2019), because it enables individuals to control the processing of, access to and transfer of their personal data. Personal control has been found to be important in UK research about public attitudes to data practices: 94% of participants in a Digital Catapult (2015) survey said they wanted more control over their data. The PDS has therefore received significant attention and financial investment in recent years: notable examples include Solid (https://solid.inrupt.com/) led by Tim Berners-Lee, Databox in the UK (https://www.databoxproject.uk/) and services such as digi.me (https://digi.me/). Advocates such as the international MyData movement believe that PDSs ‘empower individuals by improving their right to self-determination regarding their personal data’ and that with the PDS, ‘the sharing of personal data is based on trust’ (MyData, nd). In contrast, critics argue that the PDS represents an individualized solution. For example, Lehtiniemi and Ruckenstein state that ‘the MyData vision relies on the ethical principle of “human self-determination”, treating the individual as an autonomous subject with inalienable rights and liberties’ (Lehtiniemi and Ruckenstein, 2019: 6; see also Sharon and Lucivero, 2019).

Other more familiar approaches to data management also exist. Under the prevailing ‘notice and consent’ approach (described in our survey as the ‘digital service approach’ or ‘status quo’), the service provider is responsible for managing personal data with users’ consent. Under GDPR, data controllers are mandated to notify users about the collection of their personal information and associated data practices and obtain agreement in advance. This takes the form of a privacy notice which users must consent to before they can use a service. In some instances, controls may be integrated into privacy notices allowing users to opt in or out of certain data collection practices, but it is often difficult for people to negotiate terms of use, see the extent of data practices or easily change or revoke consent. The shortcomings of the privacy notice system have been well documented (for example by Cate, 2010; Cranor, 2012; Nissenbaum, 2009; Warner and Sloan, 2013). Few people read notices in full (Obar and Oeldorf-Hirsch, 2020) and when they do, they often find them difficult to comprehend. This undermines the premise of informed consent on which the legitimacy of this approach to data management relies (Bakos et al., 2014; Nissenbaum, 2009). This approach has been described as exploitative in light of asymmetries between organizations and end users (Edwards and Veale, 2017; Zuboff, 2018) in which people have little choice but to consent to the data collection practices of digital services, if they want to participate in digital society. Despite these criticisms, this approach to data management is widely adopted across the global digital economy.

Another way to address perceived data management deficits is through regulation. Current EU and UK regulatory frameworks for data have been characterized as contradictory and unclear in a dynamic policy environment (Hinz and Brand, nd). Previous research in the UK has found public support for better regulation of data management, such as a 2014 RSS survey which found ‘more support for the government preventing misuse of personal data than an appetite to have personal control over this’ (RSS, 2014: 3). GDPR has strengthened data protection regulation across EU countries that adopt it, but how the regulation is to be implemented is not entirely clear; L’Hoiry and Norris (2015) have found that data protection regulation does not easily translate from the ‘law in theory’ into the ‘law in practice’ (Galetta et al., 2016). Furthermore, as noted above, although EU laws on data protection apply to the UK during the Brexit transition period, post-Brexit data legislation in the UK is far from clear at the time of writing.

Enabling the possibility of opting in to or opting out of data collection represents another approach to data management. The PDS and variations of the data trust model enable opting in through different means, whereas opting out enables people to enact a desire not to have their data collected (Brunton and Nissenbaum, 2011). It is worth noting that although widespread adoption of either opting out or data trust models is unlikely in the current context of surveillance capitalism dominated by transnational corporations (Zuboff, 2018), these approaches play an important role in debate about future good data arrangements. For this reason, we included them in our survey. Furthermore, as noted above, approaches such as notice and consent, oversight by a regulatory body and opting out are not mutually exclusive. We separated them out in our survey to enable us to evaluate public views of them as components of good data management, and we return to a discussion of their relationship in our conclusion.

Understanding public perceptions of all of the approaches to data management discussed here is important, in order to address the data trust deficit and develop good future data practices. To date, there has been no independent and comparative research on this topic. As an active advocate for data trusts, the ODI carried out three short pilots, concluding that there is ‘huge appetite’ for data trusts within the organizations involved in the pilot (ODI, 2019b). The question of what members of the public, whose data is often at stake in such arrangements, think of alternative data management approaches, including data trusts, remains unanswered. In our research, we asked ‘what do members of the public think constitutes good data management?’ Research cited above suggested that we may find a preference for approaches premised on greater personal control (Digital Catapult, 2015), regulatory oversight (RSS, 2014) or ‘trust-like’ approaches (ODI, 2019b). We put the approaches discussed above to respondents in our survey to elicit their views. In the next sections, we describe our methods and findings.

Respondents’ existing knowledge and views about data practices

In May 2019, 2169 respondents living within the UK completed our online survey. The survey focused on what participants thought about the eight approaches to managing data listed in Table 2. We collected data from diverse respondents from across the UK (for a full demographic breakdown, see Table 3). Respondents were recruited by Qualtrics using opt-in methods, the sample demographics of which compare favourably with other reputable Internet panels such as the British Election Study conducted by YouGov (see column 2 in Table 3). Qualtrics partners with online sample providers to recruit diverse respondents for research purposes. Researchers have found that Qualtrics approximates probability-based samples reasonably well in terms of demographic characteristics and responses to other socio-political questions (Zack et al., 2019). It should be noted that surveys conducted online using an Internet panel like Qualtrics are likely to recruit respondents who are capable technology users. This was confirmed in answers to related questions: 94.6% indicated that they were confident using devices to do things online, 98.9% stated they used the Internet daily and only 8.5% of respondents indicated that they were not users of at least one of the major social media platforms.

Table 3. Respondent demographics compared to British Election Study (%).

| | This sample: Qualtrics May 2019 | Comparison: British Election Study March 2019 |
|---|---|---|
| Gender | | |
| Male | 47.40 | 45.92 |
| Female | 52.19 | 54.08 |
| Other (non-binary) | 0.41 | – |
| Age | | |
| 18–34 | 32.64 | 17.00 |
| 35–54 | 38.13 | 33.68 |
| 55 or older | 29.23 | 49.58 |
| Education | | |
| No formal qualification | 5.17 | 6.44 |
| Technical or other qualification | 18.74 | 22.42 |
| GCSE/A-Level (or equivalent) | 48.92 | 40.45 |
| University degree (or higher) | 27.18 | 30.41 |
| Employment status | | |
| Full time | 44.20 | 39.40 |
| Part time | 16.91 | 15.25 |
| Not working | 24.34 | 16.17 |
| Retired | 14.55 | 5.85 |
| Household income | | |
| < £15,000 | 21.20 | 14.06 |
| £15,000 to < £30,000 | 32.88 | 31.52 |
| £30,000 to < £50,000 | 26.16 | 27.74 |
| > £50,000 | 19.76 | 22.03 |
| Ethnicity | | |
| White | 90.63 | 95.74 |
| BAME | 9.37 | 4.26 |
| Disability | | |
| Disabled | 20.94 | 31.35 |
| Non-disabled | 79.06 | 68.65 |
| Total % | 100.00 | 100.00 |
| N | 2169 | 30,842 |

Note: Our data was collected from members of a self-selected Internet panel by Qualtrics in May 2019. British Election Study (BES) data was collected by YouGov in March 2019. Respondents who provided a ‘don’t know’ answer or refused to answer a question are not included in these totals. Not all percentages sum to 100 due to rounding.

Before rating the approaches, respondents completed knowledge questions to gauge their familiarity with and understanding of concepts relevant to the survey. We presented participants with a series of statements about personal data, open data and the GDPR and asked them to identify whether each statement was true or false. These statements were used to assess their knowledge about relevant issues and evaluate responses to later questions in light of these responses. Some of these statements were reverse worded to account for potential agreement bias. Respondents appeared most knowledgeable about the concept of personal data, with the vast majority correctly answering questions related to its definition: more than 7 out of 10 respondents answered these questions correctly. Respondents were least knowledgeable about open data: less than half were able to correctly answer two questions on this topic. Results were mixed concerning familiarity with and understanding of GDPR: 93% of the sample correctly answered a question about its main purpose and 53% provided correct answers to a question about data portability (see Table 4).

Table 4. Percentage of knowledge questions answered correctly.

| Question (correct response) | % Correct |
|---|---|
| The General Data Protection Regulation (GDPR) governs the processing of personal data (collection, storage and use). (True) | 93.1 |
| Any information that can be used to identify an individual is personal data. (True) | 92.2 |
| Location data collected by your mobile phone is not personal data. (False) | 73.4 |
| The General Data Protection Regulation (GDPR) does not give you the right to access the personal data organizations hold about you. (False) | 72.2 |
| There are still no financial penalties for companies that do not comply with the General Data Protection Regulation (GDPR). (False) | 69.0 |
| The General Data Protection Regulation (GDPR) allows for ‘data portability’ meaning that you can take your data from one organization and give it to another. (True) | 52.6 |
| Open data does not generally include personal data. (True) | 48.9 |
| Open data can only be used, modified and shared for non-commercial purposes. (False) | 48.2 |

Once completed, we provided respondents with the answers to these questions to ensure that everyone began subsequent sections with the same general information about the topic. We also included questions on attitudes towards how personal data is collected, stored, used and shared by organizations, to gauge respondents’ views on a broad range of related issues and enable us to analyse whether attitudes were indicators of preference. We asked participants to indicate on a five-point Likert scale whether they agreed or disagreed with a series of statements. Respondents were concerned about the privacy (84.6% agreement) and security (84.2%) of their personal data. They wanted to be able to exercise their rights (92.1%) and have more control over their data (89.0%). In particular, they were concerned about how their personal data is used by organizations (86.9%), and they wanted companies to be held accountable if it is misused (96.1%). Respondents were against commercial organizations using personal data to generate profit (78.3%). Only around half of the respondents supported sharing personal data for use in research in the public interest (52.7%). Around two in three wanted data to be used for the social good (68.8%). Most want data to be managed, analysed and gathered in ethical ways (84.0%). A full list of statements and responses can be seen in the Supplemental Appendix.

In another part of the survey, we asked participants about the types of data-driven apps and services that they would like to see developed in the future, inviting them to select services from a list or add their own. Types included those related to health, well-being, the environment and education. When we asked respondents who they would like to see provide these services, most said they preferred governmental or publicly-funded organizations – 46% and 40% of respondents selected these options, compared to 18% selecting commercial organizations in a question where respondents could select as many options as they wished.

The questions discussed thus far were asked to aid our analysis. Existing research has highlighted that knowledge levels influence public views about data practices (Digital Catapult, 2015; Doteveryone, 2018) and as such establishing existing knowledge levels was necessary. Standard demographic questions were asked to enable us to explore whether different groups of people have different views about good data management. Responses to questions about future data-driven apps and services indicate what might constitute good data management for respondents: personal control; the ability to exercise one’s rights; accountable, pro-social uses of data; and oversight by a public body.

Views on approaches to data management

Examining respondents’ views about data management approaches was at the heart of our survey, and we used three different methods to do this. Our first method asked respondents to rate four randomly selected data management approaches (presented one at a time) using a Likert scale ranging from 0 (poor) to 10 (excellent). This method is commonly used in surveys, yet assigning a numeric value on an 11-point scale can be difficult for some respondents. To address this issue, our second method of assessing preferences used an innovative approach called a conjoint experiment (Hainmueller et al., 2014). A conjoint experiment works by presenting respondents with options randomly generated from a list. The task involves comparing items side-by-side and then choosing the preferred option. This forced choice design simplifies the decision facing respondents (Hainmueller and Hopkins, 2015; Pelzer, 2019). We used a single-attribute conjoint experiment in which participants were presented with two randomly selected approaches from the list of eight (see Table 2 for the exact wording of each model) and asked to select the approach that they preferred from the pair. This paired selection task was repeated three times for each

Table 5. Example of the single-attribute conjoint experiment.

Option A: You are given a secure place to collect, store and manage the data about you which has been collected by other services. This is called a personal data store, or PDS. You have access to this data, and you can decide who else can access this data, how they can use it and under what circumstances.

Option B: You are given a way to nominate a responsible independent party to oversee collection, storage and access of your personal data. They have legal responsibilities to look after your data. In line with your wishes, the nominated party can
The make decisions on your behalf about who accesses your purpose of the PDS is to give you personal control over data, what they can do with it and under what circumstan- your data, which you can manage in a secure way. ces. You have a say over what happens to your data, but you are not personally responsible for looking after it. Based on these descriptions, which option for managing data would you prefer? ☐ Option A ☐ Option B respondent. Table 5 provides an example of the single- to data, knowing what data is held about them, by attribute conjoint experiment used in this study, which whom and what they do with it); allowed us to evaluate how respondents rated the � Use and beneficiaries of the data (for example, per- approaches in comparison to one another. sonal insights, generate profit, benefit society). Our third and related method for assessing respond- ents’ views of the data management approaches was to In addition, we included who has control (for exam- ask them to complete a multiple-attribute conjoint ple, individual, trustee, commercial organization) as a experiment. This differed from the second method we factor, as this is relevant to our focus on data manage- described above in allowing us to compare different ment. An example of our multiple-attribute conjoint factors that may affect the decision to select one data experiment is provided in Table 6 (the full survey and management approach over another. We accomplished stimulus materials are available in the Supplemental this by randomly combining multiple factors into data Appendix). management profiles to assess the relative effect of each specific factor on preferences. 
We asked respondents to Preferences in relation to approaches express preferences for scenarios generated from a Of the eight approaches to data management that we combination of factors identified as significant in pre- presented to respondents, three were consistently rated vious research (for example Kennedy et al., 2015): highly. The most preferred approach was the PDS, described in the survey as ‘a secure place to collect, � Type of data (for example, medical, financial, media consumption); store and manage the data about you which has been � What management arrangements mean for the indi- collected by other services’ which would give individu- vidual (for example, full control over what happens als control over their personal data (see Table 7 for Hartman et al. 9 Table 6. Example of the multiple-attribute conjoint experiment. Option A Option B In this scenario, the data is Medical data Financial data The data is controlled by You A trustee like a city council or the government You will be able to Have full control over what happens to Know what data is held about you, by it whom and what they do with it The data will be used for these reasons, So you can get insights and value from So an organization can use your data to and generate these benefits your personal data benefit the public Based on the descriptions, which of these options would you prefer? ☐ Option A ☐ Option B PDS. This finding was confirmed in responses to ques- Table 7. Mean ratings on a scale from 0 to 10 for each data tions about views on data uses, in which 96.1% of management model. respondents agreed with the statement ‘I want compa- Model Mean rating nies to be held accountable if they misuse my personal data’. Realizing this statement requires governance, Personal data store 7.7 which may explain respondents’ strong preference for Regulatory public body 7.6 data management to be overseen by a regulatory body. 
Data ID card (with clear opt-out options) 7.5 Responsible independent organizations 6.4 In contrast to the RSS (2014) survey cited above, which Public data commons 6.3 found more support for governance than personal con- Responsible independent party 6.2 trol, we found a strong preference for both. The high Data co-operative 5.9 ranking of both the PDS and oversight by a regulatory Status quo 4.9 public body suggests that both personal control and oversight are important principles of good data man- agement for respondents. mean ratings of each model). Responses to questions We described the approach that would allow people about views on data uses suggest that the possibility of to opt out of having their data collected as a ‘Data ID greater individual control may be why this approach Card’, to give material form to a means for opting out was highly rated: 86.9% of respondents agreed with the of data collection. This approach was ranked third statement ‘I want more control over how my personal overall. The relatively high ranking of this model rein- data is used by organizations’, and 89.0% agreed with forces the importance of individual control over data the statement ‘I want more control over my personal amongst our respondents. It also shows that respond- data’. As noted above, previous research by Digital ents would be willing to opt out of data gathering, Catapult (2015) also highlighted the importance of per- indicating strong dissatisfaction with current data sonal control. arrangements. After the PDS, the next highest rated approach We explored respondents’ views on data manage- involved a regulatory public body overseeing ‘how ment in multiple ways in the survey, to ensure reliabil- organizations access and use data, acting on behalf of ity of findings. We found that the results of the single- UK citizens’ in order to ‘ensure that personal data are attribute conjoint experiment corroborated the findings collected, stored and used in legal and fair ways’. 
As discussed above. This experiment asked respondents to noted above, elsewhere in the survey, we asked choose the option that they preferred from a randomly respondents who they would like to see provide new generated pair of approaches, the results of which are data-driven services ‘for the public good’ and most presented in Figure 1. The plotted points provide the selected governmental organizations (46% of respond- change in the probability of selecting an approach rel- ents), followed by publicly-funded organizations ative to the status quo (that is, digital services having (40%). This reinforces the finding that oversight of control over people’s data). The vertical dotted line data by a public regulatory body was a strong prefer- indicates the digital service/status quo baseline; points ence for our respondents. to the right of the dotted line indicate an increase in the The high rating of this model by respondents sug- probability of choosing that particular approach rela- gests a preference for legally enforceable safeguards tive to the baseline. The lines around each side of plot- alongside the personal control of data offered by the ted points are 95% error bars, indicating uncertainty 10 Big Data & Society Average Marginal Component Effects Model: (Baseline = Digital Ser vice (Status Quo)) A. Personal Data Store B. Independent Responsib le Party C. Responsible Independent Org D. Data Co−Operative ● E. Public Data Commons F. Regulatory Public Body G. Data ID Card (Opt−Out Option) 0.0 0.2 0.4 Change in Predicted Probability Figure 1. Results from the single-attribute conjoint analysis. around each value, which derives from the fact that our These approaches may have received lower ratings survey is based on a sample of the population. 
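As an illustration of how the quantities plotted in Figure 1 can be read, the following Python sketch estimates, for each approach, the change in the probability of being chosen relative to the status quo baseline, together with an approximate 95% interval. It is not the estimation code used for this paper: the function name `selection_effects`, the example tasks and the decision to treat every paired task as independent (ignoring the fact that each respondent completed several tasks) are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def selection_effects(tasks, baseline, z=1.96):
    """Estimate the change in selection probability relative to `baseline`.

    tasks: iterable of (option_a, option_b, chosen) tuples, one per paired task.
    Returns (share, effects), where share[name] is the proportion of
    appearances in which `name` was chosen and effects[name] is a tuple
    (difference from baseline, lower bound, upper bound).

    Assumption: tasks are treated as independent observations, so the
    normal-approximation interval ignores repeated tasks per respondent.
    """
    shown, picked = Counter(), Counter()
    for option_a, option_b, chosen in tasks:
        if chosen not in (option_a, option_b):
            raise ValueError("chosen must be one of the two options shown")
        shown[option_a] += 1
        shown[option_b] += 1
        picked[chosen] += 1

    if shown[baseline] == 0:
        raise ValueError("baseline never appears in the tasks")

    # Proportion of appearances in which each approach was chosen.
    share = {name: picked[name] / shown[name] for name in shown}
    p0, n0 = share[baseline], shown[baseline]

    effects = {}
    for name, p1 in share.items():
        if name == baseline:
            continue
        n1 = shown[name]
        diff = p1 - p0
        se = sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
        effects[name] = (diff, diff - z * se, diff + z * se)
    return share, effects


# Hypothetical tasks for illustration only (not survey data).
example_tasks = [
    ("Personal data store", "Status quo", "Personal data store"),
    ("Status quo", "Data co-operative", "Data co-operative"),
    ("Personal data store", "Data co-operative", "Personal data store"),
    ("Status quo", "Personal data store", "Status quo"),
]

share, effects = selection_effects(example_tasks, baseline="Status quo")
for name, (diff, low, high) in sorted(effects.items()):
    print(f"{name}: {diff:+.2f} (95% interval {low:+.2f} to {high:+.2f})")
```

With real survey data, `tasks` would hold three tuples per respondent. Intervals that account for repeated choices by the same respondent may differ from this simple approximation.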
As with the individual ratings task, this experiment revealed that the top three preferred approaches are the PDS, opting out and oversight by a regulatory public body, in that order of preference. There was at least a 30-percentage-point increase in the probability of selecting any of the top three data management approaches compared to the status quo/‘notice and consent’ approach. This is a significant number, both statistically and substantively. The approaches that did not offer personal control or regulatory oversight, which we describe above as ‘trust-like’, had lower mean scores than those that did offer such features, in both the rating task and the single-attribute conjoint experiment. These include approaches overseen by a public data commons, a data co-operative, multiple responsible independent organizations or a specific responsible independent party. Trust-like approaches were preferable to the status quo, but less preferable than those based on the concepts of personal choice, control and regulation.

These approaches may have received lower ratings because they were less familiar to respondents than approaches based on the more commonplace concepts of choice, control and regulation. As noted above, in the knowledge questions with which we opened the survey, respondents demonstrated limited knowledge of open data, the principles of which influence data trust approaches. In addition, elsewhere in the survey, only 39.3% of respondents agreed with the statement ‘I’m in favour of open data’. This relatively low level of support for open data could result from the low levels of understanding of open data that we also identified. Together, these findings may explain the lower mean scores for the ‘trust-like’ data management approaches that we presented to respondents.

It is striking that respondents preferred all other approaches to a ‘digital services model’ that ‘gives services control over what happens to your data’. With an average rating of just 4.9 out of 10, this suggests that respondents are unsatisfied with services and organizations controlling data. Combined with the high rating of the opt out model, and strong support for statements expressing concern about data management issues, these findings show that current arrangements require radical change in order to win public support.

Preferences in relation to data handling scenarios

We also used a multiple-attribute conjoint experiment, which compared the significance of a number of factors in data handling scenarios – including types of data, uses of data and related benefits (identified as significant in previous research; Kennedy et al., 2015) and control arrangements and what these enable – to assess preferences towards data management approaches. Figure 2 displays the results from this conjoint experiment. As with the single-attribute conjoint analysis, in the figure, we present results which show the change in the probability of selecting a profile with particular characteristics relative to a baseline, this time for each of the attributes we included in the scenarios.

Figure 2. Results from the multiple-attribute conjoint analysis. [Plot of average marginal component effects in four panels. A. The data is (baseline: online behavioural): financial, location, media, medical. B. The data is controlled by (baseline: commercial organisation): people’s collective, trustee (govt), trustee (public service), trustee (nominated), you. C. You will be able to (baseline: exercise your rights): access data yourself, have a say, have more control, know data is secure, know official is overseeing data, know what data is held. D. The data will be used for (baseline: for profit): for insights, to benefit society. Horizontal axis: change in predicted probability, from −0.1 to 0.4.]

Figure 2 demonstrates that the most important factor influencing responses to the multiple-attribute conjoint experiment was the locus of control over data – respondents want control to rest with them. The probability of respondents selecting a data management scenario that gives them control over their own data increased by 30 percentage points relative to the baseline (that is, a commercial organization controls the data). Thus, personal control played a key role in this experiment, just as it did in evaluations of data management approaches (as seen in Table 7) and in responses to statements about data use and management. As we discovered throughout the survey, respondents preferred scenarios in which anyone other than a commercial organization was responsible for controlling their data. In this experiment, there was little notable differentiation among the alternative controllers that we presented, apart from respondents themselves, for whom a significant preference was expressed.

The other significant factor in this experiment related to uses of data and beneficiaries. Respondents preferred scenarios in which data would be used for insights or to benefit society rather than for profit, which is consistent with findings from other surveys (e.g. Doteveryone, 2018). The effect sizes for these factors were in the medium range, with a change in the probability of selecting that profile of 0.15 or greater. In other words, there is a 15-percentage-point increase in the chance that a particular profile would be selected when it provided personal insights or benefits to society compared to profit. Other factors were not as impactful, as Figure 2 demonstrates. For instance, respondents did not significantly differentiate in relation to what management arrangements mean for the individual (for example, giving them control over what happens to data or enabling them to know what data is held about them), as seen in Figure 2(c). Finally, this experiment confirmed the finding from elsewhere in the survey that respondents do not like their personal data to be controlled by commercial organizations (Figure 2(b)) or used for profit (Figure 2(d)). As Figure 2 shows, all other scenarios were preferable to this one.

Differences amongst respondents

Recent research has demonstrated that people experience datafication differently. Ethnicity, gender, poverty and their intersections have been shown to impact people’s experiences of data practices (Eubanks, 2017; Noble, 2018). There is much less research into whether social inequalities influence perceptions of data practices (Kennedy et al., 2020 is one exception). Because of this, we analysed whether these and other characteristics, including existing knowledge of data-related matters, had an impact on respondents’ views of data management approaches. This latter variable, knowledge, was indeed a significant predictor of preferences in relation to some of the approaches (see the Supplemental Appendix for full results). In the ratings exercise, for example, knowledgeable respondents preferred approaches that offered more control and/or oversight over personal data by a regulatory public body than less knowledgeable respondents, who indicated a slightly higher preference for the status quo, which gives digital services control over their data. This effect was relatively small (about a half point difference on a 10-point scale). Age also had a significant impact on ratings of approaches: younger respondents rated the status quo model higher than those who were aged 35 years and over (about 1 point higher mean rating on a 10-point scale). Thus, differences relating to age and existing knowledge mattered, but not a great deal. Apart from these two findings, there were no other clear differences in evaluations by demographic subgroups within the sample. In other words, we did not find that gender, ethnicity, educational attainment, employment status or household income were significant predictors of preferences.

Similar subgroup differences were observed in the single-attribute conjoint experiment, presented in Figure 3 (the full set of comparisons is available in the Supplemental Appendix). This figure plots the average proportion of respondents selecting each data management model, also known as marginal means, by age and knowledge. By design, marginal means average 0.5. In other words, if responses were simply randomly chosen, there is a 50:50 chance that a given response is selected. Values above 0.5 tell us that respondents prefer a given approach, and values below 0.5 indicate that respondents do not like the approach. A value of 0 would tell us that the approach was never selected; a value of 1 means that it was always selected. As with previous figures, Figure 3 also includes error bars.

Figure 3. Subgroup responses to the single-attribute conjoint experiment by age group and existing knowledge.

While the plot points for various demographic subgroups were for the most part grouped closely together, indicating consistency in responses, there are some exceptions. One is age, which appears to have some influence on preference. Respondents in the 18–34 years age group were less swayed by the PDS, oversight by a regulatory public body and the opt out option than respondents aged 35 years and over, although younger respondents still preferred these approaches to the others presented to them. This is indicated in Figure 3 by the closer proximity to the 0.5 value for younger respondents. Less knowledgeable respondents, in general, were also less likely to differentiate among the approaches. Again, this is shown in the closer proximity of their marginal means to the 0.5 vertical line. The effects of both of these variables, however, are relatively small, as we observed with responses to other survey items.

Discussion and conclusions

Our research asked ‘what do members of the UK public think constitutes good data management?’ Our findings suggest that personal control over data, oversight from regulatory bodies and the choice to opt out of data gathering are the main components of good data management from the perspective of the UK public. Another important finding is that respondents dislike approaches in which commercial organizations control and profit from personal data in exchange for digital services. As noted above, these approaches to data management are not mutually exclusive. Under GDPR, the dominant ‘notice and consent’ model should include opt out options and oversight from regulatory bodies. In this context, we draw three conclusions from our findings.

First, our research suggests that organizations which handle personal data and policy-makers in this domain need to accept that current arrangements are not acceptable. People like the idea of choice, control and oversight, and they do not like commercial organizations controlling and profiting from their personal data. Second, given that some of the preferred features are provided for under GDPR, which continues to be implemented in the UK at the time of writing, our findings raise questions for future research about the relationship between the ‘law in theory’ and the ‘law in practice’ (Galetta et al., 2016). These include questions about whether people perceive the existing arrangements as ‘good’ but in need of better enforcement, or whether greater oversight by regulators and more stringent regulations would be preferred.

Third, we need to think carefully about what respondents’ preference for more control over their personal data might look like in practice. In previous qualitative research that we have undertaken, participants expressed concern about the burden of decision-making that a PDS approach might impose upon them as individuals (Steedman et al., 2020). Offloading the responsibility for good and informed data management decision-making onto citizens may therefore be problematic. Effective approaches to greater personal control need further research. Our research has identified what users want; further research into how to realise this in practice is needed.

A further finding from our survey is that not all alternative approaches to data management are rated equally by respondents. Although they preferred all alternatives to the status quo, they expressed a greater preference for some than for others. Data trust-like approaches – a public data commons, a data co-operative, oversight by a responsible independent party or organizations – were ranked below PDS, regulatory and opt out approaches. These findings were consistent across different methods used in the survey. We cannot therefore conclude that there is a ‘huge appetite’ for data trusts amongst the public, as the ODI suggests exists amongst organizational stakeholders, based on their pilot (ODI, 2019b). Further research is needed to explore the reasons for this, although some speculation is possible. Data trust-like approaches may have been rated lower than other approaches because they were less familiar to respondents than approaches based on the more commonplace concepts of control, opting out and regulation. Respondents’ limited knowledge of and support for open data, the principles of which inform data trusts, was evidenced in answers to diverse questions in the survey. This might explain respondents’ lesser preference for these approaches.

Existing knowledge and age had an impact on evaluations of approaches, but the effects of these factors were relatively small. The fact that less knowledgeable respondents were less likely to differentiate amongst approaches might suggest that with good information, more differentiation of approaches might result. But the relationship between information, understanding and perceptions of data practices is complex, and previous research has shown that information and understanding are not necessarily the solution to the data trust deficit (Steedman et al., 2020). Here again, further research is needed to understand the relationship between knowledge about and preference for data management approaches in greater depth.

Our research indicates that public views of good data management align only in part with the principles of good data identified by experts and commentators. Devitt et al.’s (2019) principles ‘users must be able to understand and control their personal data’ and ‘data subjects must mediate data uses’ were confirmed by our respondents’ strong preference for a PDS model or an opt out option to give them control over what happens to their data. However, collective principles such as ‘communal data sharing assists community participation’, ‘access to data promotes sustainable communal living’, and ‘open data enables citizen activism and empowerment’, represented in data co-operative and public data commons approaches, were not as widely preferred, although respondents did indicate support for pro-social uses of data. Respondents’ evaluations of what constitutes good data management did not align with those experts who argue that data trusts represent a model of good data either, given that the trust-like approaches that we presented to them were not the most preferred options. A major contribution of our research, then, is that it nuances understandings of good data as a concept and of good data management as a practice.

In some ways, the UK is in a unique position when it comes to data management futures, given current uncertainty about post-Brexit data regulation. This situation provides the UK government with an opportunity to heed what the public wants, which has been the main focus of our paper. We found a ‘huge appetite’ for alternatives to commercial control of personal data amongst our respondents, and a clear indication of what constitutes good data management for them. The UK government could choose to implement good data management approaches which have public support, but this would require investment of resources for technical development and for further public consultation. By contrast, disregard for public views about what constitutes good data management would perpetuate distrust, and this would likely have consequences both for government and for organizations that are trying to work with data in ways that are good, ethical and responsible. In many ways, these conclusions are not unique to the UK. Many countries face similar challenges relating to trust, and research on attitudes to data practices in general has found similar levels of concern across countries (for example Edelman, 2018; European Commission, 2019; ODI, 2018; PEGA, 2019). Further research is needed across the globe to explore why particular data management preferences exist, and global action is also needed, from data policy-makers and practitioners, to respond to public concerns.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a grant from the Arts and Humanities Research Council, award number AH/S012109/1, and BBC Research and Development.

ORCID iDs

Helen Kennedy https://orcid.org/0000-0003-0273-3825
Robin Steedman https://orcid.org/0000-0003-1033-9318

Supplemental Material

Supplemental material for this article is available online.

References

Bakos Y, Marotta-Wurgler F and Trossen DR (2014) Does anyone read the fine print? Consumer attention to standard-form contracts. The Journal of Legal Studies 43(1): 1–35.
Brunton F and Nissenbaum H (2011) Vernacular resistance to data collection and analysis: A political theory of obfuscation. First Monday 16(5).
Cadwalladr C and Graham-Harrison E (2018; March 17) Revealed: 50 million Facebook profiles harvested for Cambridge Analytica in major data breach. The Guardian. Available at: www.theguardian.com/news/2018/mar/17/cambridge-analytica-facebook-influence-us-election (accessed 18 March 2018).
Cate FH (2010) The limits of notice and choice. IEEE Security & Privacy Magazine 8(2): 59–62.
Couldry N and Powell A (2014) Big data from the bottom up. Big Data and Society 1(1): 1–5.
Cranor FL (2012) Necessary but not sufficient: Standardized mechanisms for privacy notice and choice. Journal on Telecommunications and High Technology Law 10(2): 273–307.
Daly A, Devitt SK and Mann M (2019) Good Data. Amsterdam, the Netherlands: Institute of Network Cultures.
Devitt SK, Mann M and Daly A (2019) The ‘Good Data’ Project. Available at: www.networkcultures.org/blog/2019/01/11/principles-of-good-data/ (accessed 5 June 2020).
Digital Catapult (2015) Trust in personal data: A UK review. Report by Digital Catapult, London, UK.
Doteveryone (2018) People, power and technology: The 2018 digital attitudes report. Available at: www.understanding.doteveryone.org.uk (accessed 5 June 2020).
Doteveryone (2019a) Engaging the public with responsible technology: Four principles and three requirements. Available at: www.doteveryone.org.uk/download/3225/ (accessed 5 June 2020).
Doteveryone (2019b) Better redress: Building accountability for the digital age: An evidence review from Doteveryone. Available at: www.doteveryone.org.uk/wp-content/uploads/2019/12/Better-redress-evidence-review.pdf (accessed 5 June 2020).
Edelman (2018) Edelman Trust Barometer 2018. Available at: www.edelman.co.uk/research/edelman-trust-barometer-2018-uk-findings (accessed 5 June 2020).
Edwards L and Veale M (2017) Slave to the algorithm? Why a ‘right to an explanation’ is probably not the remedy you are looking for. Duke Law and Technology Review 16(1): 18–84.
Eubanks V (2017) Automating Inequality: How High-Tech Tools Profile, Police and Punish the Poor. New York, NY: St Martins Press.
European Commission (2019) Special Eurobarometer 487a. Summary – The General Data Protection Regulation. Available at: www.ec.europa.eu/commfrontoffice/publicopinionmobile/index.cfm/Survey/getSurveyDetail/surveyKy/2222 (accessed 5 June 2020).
Galetta A, Fonio C and Ceresa A (2016) Nothing is as it seems. The exercise of access rights in Italy and Belgium: Dispelling fallacies in the legal reasoning from the ‘law in theory’ to the ‘law in Practice’. International Data Privacy Law 6(1): 16–27.
GDPR (2018) Personal data. Available at: www.gdpr-info.eu/issues/personal-data/ (accessed 5 June 2020).
Hainmueller J and Hopkins DJ (2015) The hidden American immigration consensus: A conjoint analysis of attitudes toward immigrants. American Journal of Political Science 59(3): 529–548.
Hainmueller J, Hopkins D and Yamamoto T (2014) Causal inference in conjoint analysis: Understanding multidimensional choices via stated preference experiments. Political Analysis 22(1): 1–30.
Hall W and Pesenti J (2017) Growing the Artificial Intelligence Industry in the UK. London, UK: DCMS.
Hinz A and Brand J (nd) Data policies: Regulatory approaches for data-driven platforms in the UK and EU. Available at: www.datajustice.files.wordpress.com/2020/01/data-policies-research-report-revised.pdf (accessed 5 June 2020).
Iliadis A and Russo F (2016) Critical data studies: An introduction. Big Data & Society 3(2): 1–7.
Janssen H, Cobbe J, Norval C, et al. (2019) Personal data stores and the GDPR’s lawful grounds for processing personal data. Zenodo. Epub ahead of print 29 May 2019. DOI: 10.5281/zenodo.3234902.
Kennedy H, Elgesem D and Miguel C (2015) On fairness: User perspectives on social media data mining. Convergence 8(6): 859–876.
Kennedy H (2016) Post, Mine, Repeat: Social Media Data Mining Becomes Ordinary. Basingstoke, UK: Palgrave Macmillan.
Kennedy H (2018) Living with data: Aligning data studies and data activism through a focus on everyday experiences of datafication. Krisis: Journal for Contemporary Philosophy. Available at: www.krisis.eu/living-with-data/ (accessed 5 June 2020).
Kennedy H, Steedman R and Jones R (2020) Approaching public perceptions of datafication through the lens of inequality: A case study in public service media. Information, Communication & Society. Epub ahead of print 4 March 2020. DOI: 10.1080/1369118X.2020.1736122.
Lehtiniemi T and Ruckenstein M (2019) The social imaginaries of data activism. Big Data & Society 6(1): 1–12.
L’Hoiry X and Norris C (2015) The honest data protection officer’s guide to subject access requests. International Data Privacy Law 5(3): 190–214.
MyData (nd) Homepage. Available at: www.mydata.org/ (accessed 5 June 2020).
Nissenbaum H (2009) Privacy in Context: Technology, Policy and the Integrity of Social Life. Palo Alto, CA: Stanford University Press.
Noble S (2018) Algorithms of Oppression: How Search Engines Reinforce Racism. New York, NY: New York University Press.
Obar AJ and Oeldorf-Hirsch A (2020) The biggest lie on the internet: Ignoring the privacy policies and terms of service policies of social networking services. Information, Communication & Society 23(1): 128–147.
ODI (2018) Who do we trust with personal data? Available at: www.theodi.org/article/who-do-we-trust-with-personal-data-odi-commissioned-survey-reveals-most-and-least-trusted-sectors-across-europe/ (accessed 5 June 2020).
ODI (2019a) Data trusts: Lessons from three pilots. Available at: www.docs.google.com/document/d/118RqyUAWP3WIyyCO4iLUT3oOobnYJGibEhspr2v87jg/edit# (accessed 5 June 2020).
ODI (2019b) Huge appetite for data trusts. Available at: www.theodi.org/article/huge-appetite-for-data-trusts-according-to-new-odi-research/ (accessed 15 April 2019).
O’Hara K (2019) Data trusts: Ethics, architecture and governance for trustworthy data stewardship. White Paper. Available at: www.eprints.soton.ac.uk/428276/ (accessed 5 June 2020).
PEGA (2019) GDPR: Show me the data survey reveals EU consumers poised to act on legislation. Available at: www.pega.com/system/files/resources/2019-07/GDPR-Show-Me-The-Data-eBook.pdf (accessed 15 April 2019).
Pelzer E (2019) The potential of conjoint analysis for communication research. Communication Research Reports 36(2): 136–147.
RSS (2014) Trust in data and attitudes toward data use/data sharing. Available at: www.statslife.org.uk/images/pdf/rss-data-trust-data-sharing-attitudes-research-note.pdf (accessed 24 February 2019).
Sailaja N, Colley J, Crabtree A, et al. (2019) The living room of the future. In: Proceedings of TVX 2019: The ACM conference on interactive experiences for television and online video. Salford, UK, 5 June 2019.
Sayer A (2011) Why Things Matter to People: Social Science, Values and Ethical Life. Cambridge, UK: Cambridge University Press.
Sharon T and Lucivero F (2019) Introduction to the special theme: The expansion of the health data ecosystem – Rethinking data ethics and governance. Big Data &
Warner R and Sloan R (2013) Beyond notice and choice: Privacy, norms, and consent. Journal of High Technology Law.
Available at: www.scholarship.kentlaw.iit.edu/fac_ Society 6(2): 1–5. schol/568 (accessed 5 June 2020). Steedman R, Kennedy H and Jones R (2020) Zack ES, Kennedy J and Long JS (2019) Can nonprobability Complex ecologies of trust in data practices and samples be used for social science research? A cautionary data-driven systems. Information, Communication and tale. Survey Research Methods 15(2): 215–227. Society. Epub ahead of print 8 April 2020. DOI: Zuboff S (2018) The Age of Surveillence Capitalism. London, 10.1080/1369118X.2020.1748090. UK: Profile Books Limited. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Big Data & Society SAGE

Public perceptions of good data management: Findings from a UK-based survey:


References (58)

Publisher
SAGE
Copyright
Copyright © 2022 by SAGE Publications Ltd, unless otherwise noted. Manuscript content on this site is licensed under Creative Commons Licenses.
ISSN
2053-9517
eISSN
2053-9517
DOI
10.1177/2053951720935616

Abstract

Low levels of public trust in data practices have led to growing calls for changes to data-driven systems, and in the EU, the General Data Protection Regulation provides a legal motivation for such changes. Data management is a vital component of data-driven systems, but what constitutes ‘good’ data management is not straightforward. Academic attention is turning to the question of what ‘good data’ might look like more generally, but public views are absent from these debates. This paper addresses this gap, reporting on a survey of the public on their views of data management approaches, undertaken by the authors and administered in the UK, where departure from the EU makes future data legislation uncertain. The survey found that respondents dislike the current approach in which commercial organizations control their personal data and prefer approaches that give them control over their data, that include oversight from regulatory bodies or that enable them to opt out of data gathering. Variations of data trusts – that is, structures that provide independent stewardship of data – were also preferable to the current approach, but not as widely preferred as control, oversight and opt out options. These features therefore constitute ‘good data management’ for survey respondents. These findings align only in part with principles of good data identified by policy experts and researchers. Our findings nuance understandings of good data as a concept and of good data management as a practice and point to where further research and policy action are needed.

Keywords
Good data, public perceptions, data management, data trust, personal data store, conjoint experiment

Sheffield Methods Institute, University of Sheffield, Sheffield, UK
Sociological Studies, University of Sheffield, Sheffield, UK
Copenhagen Business School, Frederiksberg, Denmark
BBC R&D, Greater Manchester, UK

Corresponding author:
Helen Kennedy, University of Sheffield, Elmfield, Northumberland Road, Sheffield S10 2TU, UK.
Email: h.kennedy@sheffield.ac.uk

Creative Commons CC BY: This article is distributed under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

Introduction

Throughout the world, low levels of public trust in data practices have recently been identified (Edelman, 2018; Open Data Institute (ODI), 2018). There is a ‘data trust deficit’, it has been claimed (Royal Statistical Society (RSS), 2014), characterized by mounting concern about the potential negative consequences of the widespread use of data-driven platforms and services. Awareness of limited public trust in data practices (that is, organizational data collection, analysis and sharing and the uses to which the outcomes of these processes are put), brought about in part by high profile global failures to protect people’s personal data from misuse (Cadwalladr and Graham-Harrison, 2018), has led to growing calls for changes to current data-driven systems and the structures that enable them (for example by Doteveryone, 2019a, 2019b). The General Data Protection Regulation (GDPR), which came into effect in 2018, provides a legal motivation to improve data practices in EU countries adopting this legislation. Under GDPR, individuals have rights with regard to access and portability of their personal data. Coupled with concern about the data trust deficit, this new legislation has led to growing experimentation with alternative approaches to the management of personal data, which some believe would be better for people and society (Hall and Pesenti, 2017; O’Hara, 2019). These include personal data stores (PDSs), in which individuals personally store and manage their data, and data trusts, defined by the ODI (2019a) as ‘a legal structure that provides independent stewardship of data’ for the benefit of all parties.

This context has led a range of policy stakeholders to advocate for responsible and ethical data developments. In the UK, where our research took place, advocates include government centres (such as the new Centre for Data Ethics and Innovation (CDEI)), think tanks (for example Doteveryone) and independent research and advocacy organizations (such as the Ada Lovelace Institute (Ada) and the ODI). In academic circles, attention is turning to what good, responsible and ethical data might look like (for example, Daly et al., 2019). Good data management approaches, like PDSs and data trusts, are a vital part of a responsible and ethical data ecosystem, but what constitutes good data management – that is, data storage, stewardship and decision-making about sharing – is not straightforward.

Policy stakeholders in the UK, like the CDEI and Ada, claim that understanding public views about data practices is essential to ensure that data works ‘for people and society’ (Ada’s mission) and is ‘a force for good’ (a CDEI aim). This also applies to data management: in order to determine what constitutes good data management, public views must be taken into account. Research into public views on good data management is therefore needed, so that these can be factored in to future data policy and practice. Yet to date, public attitudes to data management have rarely been examined, and when they have, research has focused narrowly on user feedback on specific models under development or on fictional scenarios (Sailaja et al., 2019). For this reason, our paper focuses specifically on data management, as opposed to other data practices such as data generation, collection, analysis and sharing.

The paper reports on a survey on public views on different approaches to managing personal data (that is, data related to an identified or identifiable person (GDPR, 2018)), which aimed to fill the gap identified above. The survey was administered in the UK in May 2019 to over 2000 adults. Although the GDPR was adopted in UK law after coming into force, the UK’s withdrawal from the EU is causing uncertainty about future data legislation in the UK. As the UK decides what its post-Brexit data laws will look like, understanding public perceptions of good data management in the UK is extremely timely. For this reason, our survey focused on the UK. In the survey, we found that respondents dislike approaches which give commercial organizations control of personal data in return for the digital services they provide. Respondents expressed a preference for approaches that give them control over data about them, that include oversight from regulatory bodies or that enable them to opt out of data gathering altogether. These approaches are not mutually exclusive, but we separated them out for the purpose of our analysis and comment on their relationship in our conclusion. Variations of data trusts (described in detail below) were also preferable to the status quo, but not as widely preferred as approaches involving personal control, regulatory oversight or the ability to opt out. Thus personal control, oversight and the ability to opt out constituted ‘good data management’ for respondents in our survey.

The paper proceeds to situate our research in the context of debates about ‘good data’ and alternative data management approaches. We then describe our methods and discuss our findings. We conclude with reflections on the significance of our findings for conceptualizing good data and for better data management policy and practice.

Good data and alternative approaches to data management

Good data

The emerging field of critical data studies has done a good job of making visible the many troubling consequences of datafication, including increased surveillance, threats to privacy, new forms of algorithmic control, and the expansion of new and old inequalities and forms of discrimination (Iliadis and Russo, 2016; Kennedy, 2018). More recently, and against this critical backdrop, scholars have begun to consider what ‘good data’ alternatives might look like. One example is Daly et al.’s (2019) edited collection, Good Data, which was motivated by a recognition that although scholars had extensively critiqued problematic data practices, they had not considered more positive alternatives. Devitt et al. (2019) describe Good Data as aiming to open up ‘a multifaceted conversation on the kinds of futures we want to see’ and presenting ‘concrete steps on how we can start realizing good data in practice’. They suggest that asking what constitutes good data is an essential step in advancing the critical scholarship which has exposed the harms and injustices that result from widespread ‘bad’ data practices.

Of course, ‘good’ is a complex concept: it could mean fair, ethical or just, or it could have other meanings, some of which acknowledge the power inequalities that shape datafication more readily than others. On the one hand, in the context of datafication, the concept of good has been used by initiatives which might be seen to depoliticize ‘data relations’ (Kennedy, 2016), such as charitable projects like Data For Good and DataKind. On the other hand, it is open to experience-based interpretation, something that Andrew Sayer (2011) says is important for understanding ‘why things matter to people’.

The term ‘data’ is not straightforward either. Daly et al. (2019) use data as a proxy for the whole DIKW model – that is, the hierarchical pyramid which has data at its base, information above that, knowledge above that and wisdom at the top. In Good Data and elsewhere, including in this paper, data is used as a proxy for the whole data ecology, incorporating structures, data management models, uses and consequences. As such, good data is a metaphor that extends beyond the ‘high quality evidence’ meaning of the term that might be more commonly used amongst data scientists and statisticians.

Good Data’s editors propose a set of principles for good data practices (Devitt et al., 2019), a number of which are relevant to our focus on public perceptions of approaches to data management. Some principles highlight the importance of individual control over what happens to personal data – for example, ‘data subjects must mediate data uses’ and ‘users must be able to understand and control their personal data’. Other principles emphasize collective needs, such as ‘communal data sharing assists community participation’, ‘access to data promotes sustainable communal living’ and ‘open data enables citizen activism and empowerment’. These principles form the foundations for some of the alternative approaches to data management that we discuss below and explored in our survey.

For Daly et al., the motivation to think about good data comes from a belief that data is political and data practices should be evaluated according to whether they are used to enhance social well-being, especially for disadvantaged groups. Good Data thus advocates ‘data methods to dismantle existing power structures through the empowerment of communities and citizens’ (Devitt et al., 2019). We follow Daly et al.’s argument that good data should enhance well-being, especially amongst disadvantaged groups, because it acknowledges the politics of datafication. In order to understand whether particular approaches to data management enhance well-being, we further argue that the views of those impacted by these approaches must be considered, yet neither public perceptions nor data management feature centrally in existing debate about good data. Understanding ‘bottom up’ perceptions of what constitutes good data management is needed, just as it is in relation to other aspects of datafication (Couldry and Powell, 2014). We aimed to fill these gaps with the research we discuss in this paper. Furthermore, as research has shown that inequalities influence perceptions of datafication (Kennedy et al., 2020), the views of diverse populations on what constitutes a good data management model need to be examined.

Approaches to data management

In debates about data management, a number of approaches have been put forward as good alternatives to current arrangements. One is the data trust, ‘a legal structure that provides independent stewardship of data’ for the benefit of all parties (ODI, 2019a; see also Hall and Pesenti, 2017). According to the ODI (2019a), the trustees of a data trust ‘take on responsibility to make decisions about what data to share and with whom’ in order to support the trust’s intended purposes and benefits. One focus of debate has been on the legal status of data trusts, as seen in the definition cited here. In the UK context, a trust is a particular legal structure which does not exist in the same form across all international jurisdictions. However, O’Hara argues that a data trust cannot be a trust in a legal sense. Rather, ‘it takes inspiration from the notion of a legal trust’ (O’Hara, 2019: 4). Our focus in this paper, therefore, is not on the legal dimensions of a data trust, but on its approach to stewardship. A data trust can take many forms, and data management approaches can combine features of a data trust with other features (ODI, 2019a). Table 1 below compares data trusts with other data stewardship models. A further difference relates to the type of data to be managed: some approaches are more appropriate for personal data (such as the PDS), others for open or public interest data (such as the data commons).

Table 1. Distinguishing features of data stewardship approaches according to ODI (2019a).

Approach | Distinguishing feature
Data trusts | Takes what has been learned from the use of legal trusts. Trustees of a data trust will take on responsibility (with some liabilities) to steward data for an agreed purpose.
Data cooperatives | Takes what has been learned from cooperatives. A mutual organization owned and democratically controlled by members, who delegate control over data about them.
Data commons | Takes what has been learned from managing common pool resources – such as forests and fisheries – and applies the principles to data.
Personal data stores | Stores data provided by a single individual on their behalf and provides access to that data to third-parties when directed to by the individual.

Note: The final row of the original table, Research partnerships, has been removed because it is not relevant to our focus here.
Source: reproduced with permission from ODI (2019a).

There are also similarities across different data management approaches. Trusts, co-operatives and commons-based approaches all involve trusted parties overseeing, managing and stewarding data on behalf of individuals and communities. In this sense (rather than in a legal sense), they are all ‘trust-like’. For this reason, in our research, we explored all three of these models: (1) the data co-operative, which manages the collection and storage of its members’ data, is accountable to its members and is governed by a board of representatives constituted by its members; (2) the data commons, similarly collectively motivated, which enables online access to community data which can be used for various purposes and for the benefit of all (see the decode project for an example, https://decodeproject.eu/); and (3) data trusts. We differentiated between two types of trust, building on experimentation that was under way at the time of our survey (ODI, 2019a): (a) a trust governed by an independent responsible party, which makes decisions on behalf of data subjects about who accesses data, what they can do with it and under what circumstances, and (b) a trust governed by multiple independent responsible organizations which manage different types of data in different contexts (for example, one for health data, one for finance data and so on) and represent the interests of all parties involved. We consider these four models as ‘trust-like’ in our discussion below (see Table 2 for a full list of the data management approaches that we explored in our survey).

As can be seen in Table 2, we also explored other solutions to the perceived data trust deficit in our survey. One is the PDS, also included in Table 1. The PDS is seen as a more trustworthy approach to managing personal data than current models (for example by Janssen et al., 2019), because it enables individuals to control the processing of, access to and transfer of their personal data. Personal control has been found to be important in UK research about public attitudes to data practices: 94% of participants in a Digital Catapult (2015) survey said they wanted more control over their data. The PDS has therefore received significant attention and financial investment in recent years: notable examples include Solid (https://solid.inrupt.com/) led by Tim Berners-Lee, Databox in the UK (https://www.databoxproject.uk/) and services such as digi.me (https://digi.me/). Advocates such as the international MyData movement believe that PDSs ‘empower individuals by improving their right to self-determination regarding their personal data’ and that with the PDS, ‘the sharing of personal data is based on trust’ (MyData, nd). In contrast, critics argue that the PDS represents an individualized solution. For example, Lehtiniemi and Ruckenstein state that ‘the MyData vision relies on the ethical principle of “human self-determination”, treating the individual as an autonomous subject with inalienable rights and liberties’ (Lehtiniemi and Ruckenstein, 2019: 6; see also Sharon and Lucivero, 2019).

Other more familiar approaches to data management also exist. Under the prevailing ‘notice and consent’ approach (described in our survey as the ‘digital service approach’ or ‘status quo’), the service provider is responsible for managing personal data with users’ consent. Under GDPR, data controllers are mandated to notify users about the collection of their personal information and associated data practices and obtain agreement in advance. This takes the form of a privacy notice which users must consent to before they can use a service. In some instances, controls may be integrated into privacy notices allowing users to opt in or out of certain data collection practices, but it is often difficult for people to negotiate terms of use, see the extent of data practices or easily change or revoke consent. The shortcomings of the privacy notice system have been well documented (for example by Cate, 2010; Cranor, 2012; Nissenbaum, 2009; Warner and Sloan, 2013). Few people read notices in full (Obar and Oeldorf-Hirsch, 2020) and when they do, they often find them difficult to comprehend. This undermines the premise of informed consent on which the legitimacy of this approach to data management relies (Bakos et al., 2014; Nissenbaum, 2009). This approach has been described as exploitative in light of asymmetries between organizations and end users (Edwards and Veale, 2017; Zuboff, 2018) in which people have little choice but to consent to the data collection practices of digital services, if they want to participate in digital society. Despite these criticisms, this approach to data management is widely adopted across the global digital economy.

Table 2. Data management approaches as described to respondents.

Name | Description
Personal data store | You are given a secure place to collect, store and manage the data about you which has been collected by other services. This is called a personal data store, or PDS. You have access to this data, and you can decide who else can access this data, how they can use it and under what circumstances. The purpose of the PDS is to give you personal control over your data, which you can manage in a secure way.
Responsible independent party | You are given a way to nominate a responsible independent party to oversee collection, storage and access of your personal data. They have legal responsibilities to look after your data. In line with your wishes, the nominated party can make decisions on your behalf about who accesses your data, what they can do with it and under what circumstances. You have a say over what happens to your data, but you are not personally responsible for looking after it.
Responsible independent organizations | Responsible independent organizations manage your data in different contexts (e.g. one for health data, one for finance data, etc.). These organizations make decisions about who can access your data, what they can do with it and under what circumstances. They have legal responsibilities to manage access to your data in ways that represent the interests of all parties involved.
Digital service (status quo) | You sign up to a new digital service (e.g. an online shop) that collects and uses your data. You are asked to agree to terms of use and a privacy policy beforehand. These describe how the service will collect, store and manage data about you. You are given settings you can alter, but you are not able to change or negotiate these terms or see how your data is used. This approach gives services control over your data (this is what usually happens now).
Data co-operative | You become a member of a data co-operative that manages the collection and storage of its members’ data and is accountable to its members. As a member, you can put yourself forward to sit on a board of representatives and make decisions about who has access to members’ data, how it is used and under what circumstances. Or you can vote for other co-operative members to do these things. The purpose of the data co-operative is that your data is managed collectively, by the people whose data is in the co-operative.
Public data commons | You access data online about your area and community using an open data platform that is accessible to all citizens under commons law. This is called a public data commons. The data commons collects, stores and manages access to open data which can be used for various purposes. Everyone can access and use this data, in line with the commons’ rules of engagement. The purpose of the public data commons is to make data accessible so everyone can benefit from it.
Regulatory public body | You have been given the details of a new regulatory public body that oversees how organizations access and use data, acting on behalf of UK citizens. This public body provides oversight over how organizations collect, store and use personal data. It can hold organizations accountable for misuse (e.g. fine organizations when they breach terms of use). The purpose of the regulatory body is to ensure that personal data are collected, stored and used in legal and fair ways.
Data ID card (opt out) | You have the ability to choose whether to opt out of online data collection, storage and use – this is called managing your data preferences. Your data preferences are stored on a data ID card. You can use this card to log onto online sites. The card automatically opts you out of data collection, storage and use according to your preferences and whenever this is possible. The purpose of the data ID card is to give people the option of opting out of having their data collected.

Another way to address perceived data management deficits is through regulation. Current EU and UK regulatory frameworks for data have been characterized as contradictory and unclear in a dynamic policy environment (Hinz and Brand, nd). Previous research in the UK has found public support for better regulation of data management, such as a 2014 RSS survey which found ‘more support for the government preventing misuse of personal data than an appetite to have personal control over this’ (RSS, 2014: 3). GDPR has strengthened data protection regulation across EU countries that adopt it, but how the regulation is to be implemented is not entirely clear; L’Hoiry and Norris (2015) have found that data protection regulation does not easily translate from the ‘law in theory’ into the ‘law in practice’ (Galetta et al., 2016). Furthermore, as noted above, although EU laws on data protection apply to the UK during the Brexit transition period, post-Brexit data legislation in the UK is far from clear at the time of writing.

Enabling the possibility of opting in to or opting out of data collection represents another approach to data management. The PDS and variations of the data trust model enable opting in through different means, whereas opting out enables people to enact a desire not to have their data collected (Brunton and Nissenbaum, 2011). It is worth noting that although widespread adoption of either opting out or data trust models is unlikely in the current context of surveillance capitalism dominated by transnational corporations (Zuboff, 2018), these approaches play an important role in debate about future good data arrangements. For this reason, we included them in our survey. Furthermore, as noted above, approaches such as notice and consent, oversight by a regulatory body and opting out are not mutually exclusive. We separated them out in our survey to enable us to evaluate public views of them as components of good data management, and we return to a discussion of their relationship in our conclusion.

Understanding public perceptions of all of the approaches to data management discussed here is important, in order to address the data trust deficit and develop good future data practices. To date, there has been no independent and comparative research on this topic. As an active advocate for data trusts, the ODI carried out three short pilots, concluding that there is ‘huge appetite’ for data trusts within the organizations involved in the pilot (ODI, 2019b). The question of what members of the public, whose data is often at stake in such arrangements, think of alternative data management approaches, including data trusts, remains unanswered. In our research, we asked ‘what do members of the public think constitutes good data management?’ Research cited above suggested that we may find a preference for approaches premised on greater personal control (Digital Catapult, 2015), regulatory oversight (RSS, 2014) or ‘trust-like’ approaches (ODI, 2019b). We put the approaches discussed above to respondents in our survey to elicit their views. In the next sections, we describe our methods and findings.

Respondents’ existing knowledge and views about data practices

In May 2019, 2169 respondents living within the UK completed our online survey. The survey focused on what participants thought about the eight approaches to managing data listed in Table 2. We collected data from diverse respondents from across the UK (for a full demographic breakdown, see Table 3). Respondents were recruited by Qualtrics using opt-in methods, the sample demographics of which compare favourably with other reputable Internet panels such as the British Election Study conducted by YouGov (see column 2 in Table 3). Qualtrics partners with online sample providers to recruit diverse respondents for research purposes. Researchers have found that Qualtrics approximates probability-based samples reasonably well in terms of demographic characteristics and responses to other socio-political questions (Zack et al., 2019). It should be noted that surveys conducted online using an Internet panel like Qualtrics are likely to recruit respondents who are capable technology users. This was confirmed in answers to related questions: 94.6% indicated that they were confident using devices to do things online, 98.9% stated they used the Internet daily and only 8.5% of respondents indicated that they were not users of at least one of the major social media platforms.

Table 3. Respondent demographics compared to British Election Study (%).

Characteristic | This sample: Qualtrics, May 2019 | Comparison: British Election Study, March 2019
Gender | |
Male | 47.40 | 45.92
Female | 52.19 | 54.08
Other (non-binary) | 0.41 | –
Age | |
18–34 | 32.64 | 17.00
35–54 | 38.13 | 33.68
55 or older | 29.23 | 49.58
Education | |
No formal qualification | 5.17 | 6.44
Technical or other qualification | 18.74 | 22.42
GCSE/A-Level (or equivalent) | 48.92 | 40.45
University degree (or higher) | 27.18 | 30.41
Employment status | |
Full time | 44.20 | 39.40
Part time | 16.91 | 15.25
Not working | 24.34 | 16.17
Retired | 14.55 | 5.85
Household income | |
< £15,000 | 21.20 | 14.06
£15,000 to < £30,000 | 32.88 | 31.52
£30,000 to < £50,000 | 26.16 | 27.74
> £50,000 | 19.76 | 22.03
Ethnicity | |
White | 90.63 | 95.74
BAME | 9.37 | 4.26
Disability | |
Disabled | 20.94 | 31.35
Non-disabled | 79.06 | 68.65
Total % | 100.00 | 100.00
N | 2169 | 30,842

Note: Our data was collected from members of a self-selected Internet panel by Qualtrics in May 2019. British Election Study (BES) data was collected by YouGov in March 2019. Respondents who provided a ‘don’t know’ answer or refused to answer a question are not included in these totals. Not all percentages sum to 100 due to rounding.

Before rating the approaches, respondents completed knowledge questions to gauge their familiarity with and understanding of concepts relevant to the survey. We presented participants with a series of statements about personal data, open data and the GDPR and asked them to identify whether each statement was true or false. These statements were used to assess their knowledge about relevant issues and evaluate responses to later questions in light of these responses. Some of these statements were reverse worded to account for potential agreement bias. Respondents appeared most knowledgeable about the concept of personal data, with the vast majority correctly answering questions related to its definition: more than 7 out of 10 respondents answered these questions correctly. Respondents were least knowledgeable about open data: less than half were able to correctly answer two questions on this topic. Results were mixed concerning familiarity with and understanding of GDPR: 93% of the sample correctly answered a question about its main purpose and 53% provided correct answers to a question about data portability (see Table 4).

Once completed, we provided respondents with the answers to these questions to ensure that everyone began subsequent sections with the same general information about the topic. We also included questions on attitudes towards how personal data is collected,

(84.0%). A full list of statements and responses can be seen in the Supplemental Appendix.

In another part of the survey, we asked participants about the types of data-driven apps and services that they would like to see developed in the future, inviting them to select services from a list or add their own. Types included those related to health, well-being, the environment and education. When we asked respondents who they would like to see provide these services, most said they preferred governmental or publicly-funded organizations – 46% and 40% of respondents selected these options, compared to 18% selecting commercial organizations in a question where respondents could select as many options as they wished.

The questions discussed thus far were asked to aid our analysis. Existing research has highlighted that knowledge levels influence public views about data practices (Digital Catapult, 2015; Doteveryone, 2018) and as such establishing existing knowledge levels was necessary. Standard demographic questions were asked to enable us to explore whether different groups of people have different views about good data management. Responses to questions about future data-driven apps and services indicate what might constitute good data management for respondents: personal control; the ability to exercise one’s rights; accountable, pro-social uses of data; and oversight by a public body.

Views on approaches to data management

Examining respondents’ views about data management approaches was at the heart of our survey, and we used three different methods to do this.
Existing research has highlighted that Some of these statements were reverse worded to knowledge levels influence public views about data account for potential agreement bias. Respondents practices (Digital Catapult, 2015; Doteveryone, 2018) appeared most knowledgeable about the concept of and as such establishing existing knowledge levels was personal data, with the vast majority correctly answer- necessary. Standard demographic questions were asked ing questions related to its definition: more than 7 out to enable us to explore whether different groups of of 10 respondents answered these questions correctly. people have different views about good data manage- Respondents were least knowledgeable about open ment. Responses to questions about future data-driven data: less than half were able to correctly answer two apps and services indicate what might constitute good questions on this topic. Results were mixed concerning data management for respondents: personal control; familiarity with and understanding of GDPR: 93% of the ability to exercise one’s rights; accountable, pro- the sample correctly answered a question about its social uses of data; and oversight by a public body. main purpose and 53% provided correct answers to a question about data portability (see Table 4). Views on approaches to data Once completed, we provided respondents with the management answers to these questions to ensure that everyone Examining respondents’ views about data management began subsequent sections with the same general infor- mation about the topic. We also included questions on approaches was at the heart of our survey, and we used attitudes towards how personal data is collected, three different methods to do this. 
Our first method stored, used and shared by organizations, to gauge asked respondents to rate four randomly selected data respondents’ views on a broad range of related issues management approaches (presented one at a time) using and enable us to analyse whether attitudes were indi- a Likert scale ranging from 0 (poor) to 10 (excellent). cators of preference. We asked participants to indicate This method is commonly used in surveys, yet assigning on a five-point Likert scale whether they agreed or dis- a numeric value on an 11-point scale can be difficult for agreed with a series of statements. Respondents were some respondents. To address this issue, our second concerned about the privacy (84.6% agreement) and method of assessing preferences used an innovative security (84.2%) of their personal data. They wanted approach called a conjoint experiment (Hainmueller to be able to exercise their rights (92.1%) and have et al., 2014). A conjoint experiment works by presenting more control over their data (89.0%). In particular, respondents with options randomly generated from a they were concerned about how their personal data is list. The task involves comparing items side-by-side used by organizations (86.9%), and they wanted com- and then choosing the preferred option. This forced panies to be held accountable if it is misused (96.1%). choice design simplifies the decision facing respondents Respondents were against commercial organizations (Hainmueller and Hopkins, 2015; Pelzer, 2019). We used using personal data to generate profit (78.3%). Only a single-attribute conjoint experiment in which partici- around half of the respondents supported sharing per- pants were presented with two randomly selected sonal data for use in research in the public interest approaches from the list of eight (see Table 2 for the (52.7%). Around two in three wanted data to be used exact wording of each model) and asked them to select for the social good (68.8%). 
Most want data to be the approach that they preferred from the pair. This managed, analysed and gathered in ethical ways paired selection task was repeated three times for each 8 Big Data & Society Table 4. Percentage of knowledge questions answered correctly. Question (correct response) % Correct The General Data Protection Regulation (GDPR) governs the processing of personal data (collection, storage 93.1 and use). (True) Any information that can be used to identify an individual is personal data. (True) 92.2 Location data collected by your mobile phone is not personal data. (False) 73.4 The General Data Protection Regulation (GDPR) does not give you the right to access the personal data 72.2 organizations hold about you. (False) There are still no financial penalties for companies that do not comply with the General Data Protection 69.0 Regulation (GDPR). (False) The General Data Protection Regulation (GDPR) allows for ‘data portability’ meaning that you can take your 52.6 data from one organization and give it to another. (True) Open data does not generally include personal data. (True) 48.9 Open data can only be used, modified and shared for non-commercial purposes. (False) 48.2 Table 5. Example of the single-attribute conjoint experiment. Option A Option B You are given a secure place to collect, store and manage the You are given a way to nominate a responsible independent data about you which has been collected by other services. party to oversee collection, storage and access of your This is called a personal data store, or PDS. You have access personal data. They have legal responsibilities to look after to this data, and you can decide who else can access this your data. In line with your wishes, the nominated party can data, how they can use it and under what circumstances. 
The make decisions on your behalf about who accesses your purpose of the PDS is to give you personal control over data, what they can do with it and under what circumstan- your data, which you can manage in a secure way. ces. You have a say over what happens to your data, but you are not personally responsible for looking after it. Based on these descriptions, which option for managing data would you prefer? ☐ Option A ☐ Option B respondent. Table 5 provides an example of the single- to data, knowing what data is held about them, by attribute conjoint experiment used in this study, which whom and what they do with it); allowed us to evaluate how respondents rated the � Use and beneficiaries of the data (for example, per- approaches in comparison to one another. sonal insights, generate profit, benefit society). Our third and related method for assessing respond- ents’ views of the data management approaches was to In addition, we included who has control (for exam- ask them to complete a multiple-attribute conjoint ple, individual, trustee, commercial organization) as a experiment. This differed from the second method we factor, as this is relevant to our focus on data manage- described above in allowing us to compare different ment. An example of our multiple-attribute conjoint factors that may affect the decision to select one data experiment is provided in Table 6 (the full survey and management approach over another. We accomplished stimulus materials are available in the Supplemental this by randomly combining multiple factors into data Appendix). management profiles to assess the relative effect of each specific factor on preferences. 
We asked respondents to Preferences in relation to approaches express preferences for scenarios generated from a Of the eight approaches to data management that we combination of factors identified as significant in pre- presented to respondents, three were consistently rated vious research (for example Kennedy et al., 2015): highly. The most preferred approach was the PDS, described in the survey as ‘a secure place to collect, � Type of data (for example, medical, financial, media consumption); store and manage the data about you which has been � What management arrangements mean for the indi- collected by other services’ which would give individu- vidual (for example, full control over what happens als control over their personal data (see Table 7 for Hartman et al. 9 Table 6. Example of the multiple-attribute conjoint experiment. Option A Option B In this scenario, the data is Medical data Financial data The data is controlled by You A trustee like a city council or the government You will be able to Have full control over what happens to Know what data is held about you, by it whom and what they do with it The data will be used for these reasons, So you can get insights and value from So an organization can use your data to and generate these benefits your personal data benefit the public Based on the descriptions, which of these options would you prefer? ☐ Option A ☐ Option B PDS. This finding was confirmed in responses to ques- Table 7. Mean ratings on a scale from 0 to 10 for each data tions about views on data uses, in which 96.1% of management model. respondents agreed with the statement ‘I want compa- Model Mean rating nies to be held accountable if they misuse my personal data’. Realizing this statement requires governance, Personal data store 7.7 which may explain respondents’ strong preference for Regulatory public body 7.6 data management to be overseen by a regulatory body. 
Data ID card (with clear opt-out options) 7.5 Responsible independent organizations 6.4 In contrast to the RSS (2014) survey cited above, which Public data commons 6.3 found more support for governance than personal con- Responsible independent party 6.2 trol, we found a strong preference for both. The high Data co-operative 5.9 ranking of both the PDS and oversight by a regulatory Status quo 4.9 public body suggests that both personal control and oversight are important principles of good data man- agement for respondents. mean ratings of each model). Responses to questions We described the approach that would allow people about views on data uses suggest that the possibility of to opt out of having their data collected as a ‘Data ID greater individual control may be why this approach Card’, to give material form to a means for opting out was highly rated: 86.9% of respondents agreed with the of data collection. This approach was ranked third statement ‘I want more control over how my personal overall. The relatively high ranking of this model rein- data is used by organizations’, and 89.0% agreed with forces the importance of individual control over data the statement ‘I want more control over my personal amongst our respondents. It also shows that respond- data’. As noted above, previous research by Digital ents would be willing to opt out of data gathering, Catapult (2015) also highlighted the importance of per- indicating strong dissatisfaction with current data sonal control. arrangements. After the PDS, the next highest rated approach We explored respondents’ views on data manage- involved a regulatory public body overseeing ‘how ment in multiple ways in the survey, to ensure reliabil- organizations access and use data, acting on behalf of ity of findings. We found that the results of the single- UK citizens’ in order to ‘ensure that personal data are attribute conjoint experiment corroborated the findings collected, stored and used in legal and fair ways’. 
As discussed above. This experiment asked respondents to noted above, elsewhere in the survey, we asked choose the option that they preferred from a randomly respondents who they would like to see provide new generated pair of approaches, the results of which are data-driven services ‘for the public good’ and most presented in Figure 1. The plotted points provide the selected governmental organizations (46% of respond- change in the probability of selecting an approach rel- ents), followed by publicly-funded organizations ative to the status quo (that is, digital services having (40%). This reinforces the finding that oversight of control over people’s data). The vertical dotted line data by a public regulatory body was a strong prefer- indicates the digital service/status quo baseline; points ence for our respondents. to the right of the dotted line indicate an increase in the The high rating of this model by respondents sug- probability of choosing that particular approach rela- gests a preference for legally enforceable safeguards tive to the baseline. The lines around each side of plot- alongside the personal control of data offered by the ted points are 95% error bars, indicating uncertainty 10 Big Data & Society Average Marginal Component Effects Model: (Baseline = Digital Ser vice (Status Quo)) A. Personal Data Store B. Independent Responsib le Party C. Responsible Independent Org D. Data Co−Operative ● E. Public Data Commons F. Regulatory Public Body G. Data ID Card (Opt−Out Option) 0.0 0.2 0.4 Change in Predicted Probability Figure 1. Results from the single-attribute conjoint analysis. around each value, which derives from the fact that our These approaches may have received lower ratings survey is based on a sample of the population. 
because they were less familiar to respondents than As with the individual ratings task, this experiment approaches based on the more commonplace concepts revealed that the top three preferred approaches are the of choice, control and regulation. As noted above, in PDS, opting out and oversight by a regulatory public the knowledge questions with which we opened the body, in that order of preference. There was at least a survey, respondents demonstrated limited knowledge 30% point increase in selecting any of the top three of open data, the principles of which influence data data management approaches compared to the status trust approaches. In addition, elsewhere in the survey, quo/‘notice and consent’ approach. This is a significant only 39.3% of respondents agreed with the statement number, both statistically and substantively. The ‘I’m in favour of open data’. This relatively low level of approaches that did not offer personal control or reg- support for open data could result from the low levels ulatory oversight, which we describe above as ‘trust- of understanding of open data that we also identified. like’, had lower mean scores than those that did offer Together, these findings may explain the lower mean such features, in both the rating task and the single- scores for the ‘trust-like’ data management approaches attribute conjoint experiment. These include that we presented to respondents. approaches overseen by a public data commons, a It is striking that respondents preferred all other data co-operative, multiple responsible independent approaches to a ‘digital services model’ that ‘gives serv- organizations or a specific responsible independent ices control over what happens to your data’. With an party. 
Trust-like approaches were preferable to the average rating of just 4.9 out of 10, this suggests that status quo, but less preferable than those based on respondents are unsatisfied with services and organiza- the concepts of personal choice, control and regulation. tions controlling data. Combined with the high rating Hartman et al. 11 Average Marginal Component Effects A. The data is: (Baseline = Online beha vioural) Financial Location ● Media Medical B. The data is controlled b y: (Baseline = Commercial organisation) People's collective Trustee (govt) Trustee (public service) Trustee (nominated) ● You C. You will be able to: (Baseline = Exercise your rights) Access data yourself Have a say Have more control Know data is secure Know official is overseeing data ● Know what data is held D. The data will be used f or: (Baseline = For profit) For insights To benefit society −0.1 0.0 0.1 0.2 0.3 0.4 Change in Predicted Probability Figure 2. Results from the multiple-attribute conjoint analysis. of the opt out model, and strong support for statements Figure 2 demonstrates that the most important expressing concern about data management issues, factor influencing responses to the multiple-attribute conjoint experiment was the locus of control over these findings show that current arrangements require radical change in order to win public support. data – respondents want control to rest with them. The probability of respondents selecting a data man- agement scenario that gives them control over their Preferences in relation to data handling scenarios own data increased by 30% points relative to the base- We also used a multiple-attribute conjoint experiment, line (that is, a commercial organization controls the which compared the significance of a number of factors data). 
Thus, personal control played a key role in this in data handling scenarios – including types of data, experiment, just as it did in evaluations of data man- uses of data and related benefits (identified as signifi- agement approaches (as seen in Table 7) and in cant in previous research (Kennedy et al., 2015)) and responses to statements about data use and manage- control arrangements and what these enable – to assess ment. As we discovered throughout the survey, preferences towards data management approaches. respondents preferred scenarios in which anyone Figure 2 displays the results from this conjoint experi- other than a commercial organization was responsible ment. As with the single-attribute conjoint analysis, in for controlling their data. In this experiment, there was the figure, we present results which show the change in little notable differentiation among the alternative con- the probability of selecting a profile with particular trollers that we presented, apart from respondents characteristics relative to a baseline, this time for themselves, for whom a significant preference was each of the attributes we included in the scenarios. expressed. 12 Big Data & Society Figure 3. Subgroup responses to the single-attribute conjoint experiment by age group and existing knowledge. The other significant factor in this experiment relat- social inequalities influence perceptions of data practi- ed to uses of data and beneficiaries. Respondents pre- ces (Kennedy et al., 2020 is one exception). Because of this, we analysed whether these and other character- ferred scenarios in which data would be used for istics, including existing knowledge of data-related insights or to benefit society rather than for profit, which is consistent with findings from other surveys matters, had an impact on respondents’ views of data management approaches. This latter variable, knowl- (e.g. Doteveryone, 2018). 
The effect sizes for these fac- edge, was indeed a significant predictor of preferences tors were in the medium range, with a change in the in relation to some of the approaches (see the probability of selecting that profile of 0.15 or greater. Supplemental Appendix for full results). In the ratings In other words, there is a 15-percentage point increase exercise, for example, knowledgeable respondents pre- in the chance that a particular profile would be selected ferred approaches that offered more control and/or when it provided personal insights or benefits to society oversight over personal data by a regulatory public compared to profit. Other factors were not as impact- body than less knowledgeable respondents, who indi- ful, as Figure 2 demonstrates. For instance, respond- cated a slightly higher preference for the status quo, ents did not significantly differentiate in relation to which gives digital services control over their data. what management arrangements mean for the individ- This effect was relatively small (about a half point dif- ual (for example, giving them control over what hap- ference on a 10-point scale). Age also had a significant pens to data or enabling them to know what data is impact on ratings of approaches: younger respondents held about them), as seen in Figure 2(c). Finally, this rated the status quo model higher than those who were experiment confirmed the finding from elsewhere in the aged 35 years and over (about 1 point higher mean survey that respondents do not like their personal data rating on a 10-point scale). Thus, differences relating to be controlled by commercial organizations (Figure 2 to age and existing knowledge mattered, but not a great (b)) or used for profit (Figure 2(d)). As Figure 2 shows, deal. Apart from these two findings, there were no all other scenarios were preferable to this one. other clear differences in evaluations by demographic subgroups within the sample. 
In other words, we did Differences amongst respondents not find that gender, ethnicity, educational attainment, Recent research has demonstrated that people experi- employment status or household income were signifi- ence datafication differently. Ethnicity, gender, poverty cant predictors of preferences. and their intersections have been shown to impact peo- Similar subgroup differences were observed in the ple’s experiences of data practices (Eubanks, 2017; single-attribute conjoint experiment, presented in Noble, 2018). There is much less research into whether Figure 3 (the full set of comparisons is available in Hartman et al. 13 the Supplemental Appendix). This figure plots the aver- are provided for under GDPR, which continues to be age proportion of respondents selecting each data man- implemented in the UK at the time of writing, our agement model, also known as marginal means, by age findings raise questions for future research about the and knowledge. By design, marginal means average relationship between the ‘law in theory’ and the ‘law in 0.5. In other words, if responses were simply randomly practice’ (Galetta et al., 2016). These include questions chosen, there is a 50:50 chance that a given response is about whether people perceive the existing arrange- selected. Values above 0.5 tell us that respondents ments as ‘good’ but in need of better enforcement, or prefer a given approach, and values below 0.5 indicate whether greater oversight by regulators and more strin- that respondents do not like the approach. A value of 0 gent regulations would be preferred. would tell us that the approach was never selected; a Third, we need to think carefully about what value of 1 means that it was always selected. As with respondents’ preference for more control over their previous figures, Figure 3 also includes error bars. personal data might look like in practice. 
In previous While the plot points for various demographic sub- qualitative research that we have undertaken, partici- groups were for the most part grouped closely together, pants expressed concern about the burden of decision- indicating consistency in responses, there are some making that a PDS approach might impose upon them exceptions. One is age, which appears to have some as individuals (Steedman et al., 2020). Offloading the influence on preference. Respondents in the 18–34 responsibility for good and informed data management years age group were less swayed by the PDS, oversight decision-making onto citizens may therefore be prob- by a regulatory public body and the opt out option lematic. Effective approaches to greater personal con- than respondents aged 35 years and over, although trol need further research. Our research has identified younger respondents still preferred these approaches what users want; further research into how to realise to the others presented to them. This is indicated in this in practice is needed. Figure 3 by the closer proximity to the 0.5 value for A further finding from our survey is that not all younger respondents. Less knowledgeable respondents, alternatives to data management are rated equally by in general, were also less likely to differentiate among respondents. Although they preferred all alternatives to the approaches. Again, this is shown in the closer prox- the status quo, they expressed a greater preference for imity of their marginal means to the 0.5 vertical line. some than for others. Data trust-like approaches – a The effects of both of these variables, however, are public data commons, a data co-operative, oversight by a responsible independent party or organizations – relatively small, as we observed with responses to were ranked below PDS, regulatory and opt out other survey items. approaches. These findings were consistent across dif- ferent methods used in the survey. 
We cannot therefore Discussion and conclusions conclude that there is a ‘huge appetite’ for data trusts Our research asked ‘what do members of the UK amongst the public, as the ODI suggests exists amongst public think constitutes good data management?’ Our organizational stakeholders, based on their pilot (ODI, findings suggest that personal data, oversight from reg- 2019b). Further research is needed to explore the rea- ulatory bodies and the choice to opt out of data gath- sons for this, although some speculation is possible. ering are the main components of good data Data trust-like approaches may have been rated management from the perspective of the UK public. lower than other approaches because they were less Another important finding is that respondents dislike familiar to respondents than approaches based on the approaches in which commercial organizations control more commonplace concepts of control, opting out and and profit from personal data in exchange for digital regulation. Respondents’ limited knowledge of and services. As noted above, these approaches to data support for open data, the principles of which inform management are not mutually exclusive. Under data trusts, was evidenced in answers to diverse ques- GDPR, the dominant ‘notice and consent’ model tions in the survey. This might explain respondents’ should include opt out options and oversight from reg- lesser preference for these approaches. ulatory bodies. In this context, we draw three conclu- Existing knowledge and age had an impact on eval- sions from our findings. uations of approaches, but the effects of these factors First, our research suggests that organizations which were relatively small. The fact that less knowledgeable handle personal data and policy-makers in this domain respondents were less likely to differentiate amongst need to accept that current arrangements are not approaches might suggest that with good information, acceptable. 
People like the idea of choice, control and more differentiation of approaches might result. But oversight, and they do not like commercial organiza- the relationship between information, understanding tions controlling and profiting from their personal and perceptions of data practices is complex, and pre- data. Second, given that some of preferred features vious research has shown that information and 14 Big Data & Society understanding are not necessarily the solution to the preferences exist, and global action is also needed, from data trust deficit (Steedman et al., 2020). Here again, data policy-makers and practitioners, to respond to further research is needed to understand the relation- public concerns. ship between knowledge about and preference for data management approaches in greater depth. Declaration of conflicting interests Our research indicates that public views of good The author(s) declared no potential conflicts of interest with data management align only in part with the principles respect to the research, authorship, and/or publication of this of good data identified by experts and commentators. article. Devitt et al.’s (2019) principles ‘users must be able to understand and control their personal data’ and ‘data Funding subjects must mediate data uses’ were confirmed by our respondents strong preference for a PDS model or an The author(s) disclosed receipt of the following financial sup- opt out option to give them control over what happens port for the research, authorship, and/or publication of this to their data. However, collective principles such as article: This work was supported by a grant from the Arts ‘communal data sharing assists community participa- and Humanities Research Council, award number AH/ tion’, ‘access to data promotes sustainable communal S012109/1, and BBC Research and Development. 
living’, and ‘open data enables citizen activism and empowerment’, represented in data co-operative and public data commons approaches, were not as widely preferred, although respondents did indicate support for pro-social uses of data. Respondents’ evaluations of what constitutes good data management did not align with those experts who argue that data trusts represent a model of good data either, given that the trust-like approaches that we presented to them were not the most preferred options. A major contribution of our research, then, is that it nuances understandings of good data as a concept and of good data management as a practice.

In some ways, the UK is in a unique position when it comes to data management futures, given current uncertainty about post-Brexit data regulation. This situation provides the UK government with an opportunity to heed what the public wants, which has been the main focus of our paper. We found a ‘huge appetite’ for alternatives to commercial control of personal data amongst our respondents, and a clear indication of what constitutes good data management for them.

The UK government could choose to implement good data management approaches which have public support, but this would require investment of resources for technical development and for further public consultation. By contrast, disregard for public views about what constitutes good data management would perpetuate distrust, and this would likely have consequences both for government and for organizations that are trying to work with data in ways that are good, ethical and responsible. In many ways, these conclusions are not unique to the UK. Many countries face similar challenges relating to trust, and research on attitudes to data practices in general has found similar levels of concern across countries (for example Edelman, 2018; European Commission, 2019; ODI, 2018; PEGA, 2019). Further research is needed across the globe to explore why particular data management

ORCID iDs

Helen Kennedy https://orcid.org/0000-0003-0273-3825
Robin Steedman https://orcid.org/0000-0003-1033-9318

Supplemental Material

Supplemental material for this article is available online.

References

Bakos Y, Marotta-Wurgler F and Trossen DR (2014) Does anyone read the fine print? Consumer attention to standard-form contracts. The Journal of Legal Studies 43(1): 1–35.
Brunton F and Nissenbaum H (2011) Vernacular resistance to data collection and analysis: A political theory of obfuscation. First Monday 16(5).
Cadwalladr C and Graham-Harrison E (2018; March 17) Revealed: 50 million Facebook profiles harvested for Cambridge Analytica in major data breach. The Guardian. Available at: www.theguardian.com/news/2018/mar/17/cambridge-analytica-facebook-influence-us-election (accessed 18 March 2018).
Cate FH (2010) The limits of notice and choice. IEEE Security & Privacy Magazine 8(2): 59–62.
Couldry N and Powell A (2014) Big data from the bottom up. Big Data and Society 1(1): 1–5.
Cranor FL (2012) Necessary but not sufficient: Standardized mechanisms for privacy notice and choice. Journal on Telecommunications and High Technology Law 10(2): 273–307.
Daly A, Devitt SK and Mann M (2019) Good Data. Amsterdam, the Netherlands: Institute of Network Cultures.
Devitt SK, Mann M and Daly A (2019) The ‘Good Data’ Project. Available at: www.networkcultures.org/blog/2019/01/11/principles-of-good-data/ (accessed 5 June 2020).
Digital Catapult (2015) Trust in personal data: A UK review. Report by Digital Catapult, London, UK.
Doteveryone (2018) People, power and technology: The 2018 digital attitudes report. Available at: www.understanding.doteveryone.org.uk (accessed 5 June 2020).
Doteveryone (2019a) Engaging the public with responsible technology: Four principles and three requirements. Available at: www.doteveryone.org.uk/download/3225/ (accessed 5 June 2020).
Doteveryone (2019b) Better redress: Building accountability for the digital age: An evidence review from Doteveryone. Available at: www.doteveryone.org.uk/wp-content/uploads/2019/12/Better-redress-evidence-review.pdf (accessed 5 June 2020).
Edelman (2018) Edelman Trust Barometer 2018. Available at: www.edelman.co.uk/research/edelman-trust-barometer-2018-uk-findings (accessed 5 June 2020).
Edwards L and Veale M (2017) Slave to the algorithm? Why a ‘right to an explanation’ is probably not the remedy you are looking for. Duke Law and Technology Review 16(1): 18–84.
Eubanks V (2017) Automating Inequality: How High-Tech Tools Profile, Police and Punish the Poor. New York, NY: St Martins Press.
European Commission (2019) Special Eurobarometer 487a. Summary – The General Data Protection Regulation. Available at: www.ec.europa.eu/commfrontoffice/publicopinionmobile/index.cfm/Survey/getSurveyDetail/surveyKy/2222 (accessed 5 June 2020).
Galetta A, Fonio C and Ceresa A (2016) Nothing is as it seems. The exercise of access rights in Italy and Belgium: Dispelling fallacies in the legal reasoning from the ‘law in theory’ to the ‘law in Practice’. International Data Privacy Law 6(1): 16–27.
GDPR (2018) Personal data. Available at: www.gdpr-info.eu/issues/personal-data/ (accessed 5 June 2020).
Hainmueller J and Hopkins DJ (2015) The hidden American immigration consensus: A conjoint analysis of attitudes toward immigrants. American Journal of Political Science 59(3): 529–548.
Hainmueller J, Hopkins D and Yamamoto T (2014) Causal inference in conjoint analysis: Understanding multidimensional choices via stated preference experiments. Political Analysis 22(1): 1–30.
Hall W and Pesenti J (2017) Growing the Artificial Intelligence Industry in the UK. London, UK: DCMS.
Hinz A and Brand J (nd) Data policies: Regulatory approaches for data-driven platforms in the UK and EU. Available at: www.datajustice.files.wordpress.com/2020/01/data-policies-research-report-revised.pdf (accessed 5 June 2020).
Iliadis A and Russo F (2016) Critical data studies: An introduction. Big Data & Society 3(2): 1–7.
Janssen H, Cobbe J, Norval C, et al. (2019) Personal data stores and the GDPR’s lawful grounds for processing personal data. Zenodo. Epub ahead of print 29 May 2019. DOI: 10.5281/zenodo.3234902.
Kennedy H, Elgesem D and Miguel C (2015) On fairness: User perspectives on social media data mining. Convergence 8(6): 859–876.
Kennedy H (2016) Post, Mine, Repeat: Social Media Data Mining Becomes Ordinary. Basingstoke, UK: Palgrave Macmillan.
Kennedy H (2018) Living with data: Aligning data studies and data activism through a focus on everyday experiences of datafication. Krisis: Journal for Contemporary Philosophy. Available at: www.krisis.eu/living-with-data/ (accessed 5 June 2020).
Kennedy H, Steedman R and Jones R (2020) Approaching public perceptions of datafication through the lens of inequality: A case study in public service media. Information, Communication and Society. Epub ahead of print 4 March 2020. DOI: 10.1080/1369118X.2020.1736122.
Lehtiniemi T and Ruckenstein M (2019) The social imaginaries of data activism. Big Data & Society 6(1): 1–12.
L’Hoiry X and Norris C (2015) The honest data protection officer’s guide to subject access requests. International Data Privacy Law 5(3): 190–214.
MyData (nd) Homepage. Available at: www.mydata.org/ (accessed 5 June 2020).
Nissenbaum H (2009) Privacy in Context: Technology, Policy and the Integrity of Social Life. Palo Alto, CA: Stanford University Press.
Noble S (2018) Algorithms of Oppression: How Search Engines Reinforce Racism. New York, NY: New York University Press.
Obar AJ and Oeldorf-Hirsch A (2020) The biggest lie on the internet: Ignoring the privacy policies and terms of service policies of social networking services. Information, Communication & Society 23(1): 128–147.
ODI (2018) Who do we trust with personal data? Available at: www.theodi.org/article/who-do-we-trust-with-personal-data-odi-commissioned-survey-reveals-most-and-least-trusted-sectors-across-europe/ (accessed 5 June 2020).
ODI (2019a) Data trusts: Lessons from three pilots. Available at: www.docs.google.com/document/d/118RqyUAWP3WIyyCO4iLUT3oOobnYJGibEhspr2v87jg/edit# (accessed 5 June 2020).
ODI (2019b) Huge appetite for data trusts. Available at: www.theodi.org/article/huge-appetite-for-data-trusts-according-to-new-odi-research/ (accessed 15 April 2019).
O’Hara K (2019) Data trusts: Ethics, architecture and governance for trustworthy data stewardship. White Paper. Available at: www.eprints.soton.ac.uk/428276/ (accessed 5 June 2020).
PEGA (2019) GDPR: Show me the data survey reveals EU consumers poised to act on legislation. Available at: www.pega.com/system/files/resources/2019-07/GDPR-Show-Me-The-Data-eBook.pdf (accessed 15 April 2019).
Pelzer E (2019) The potential of conjoint analysis for communication research. Communication Research Reports 36(2): 136–147.
RSS (2014) Trust in data and attitudes toward data use/data sharing. Available at: www.statslife.org.uk/images/pdf/rss-data-trust-data-sharing-attitudes-research-note.pdf (accessed 24 February 2019).
Sailaja N, Colley J, Crabtree A, et al. (2019) The living room of the future. In: Proceedings of TVX 2019: The ACM conference on interactive experiences for television and online video. Salford, UK, 5 June 2019.
Sayer A (2011) Why Things Matter to People: Social Science, Values and Ethical Life. Cambridge, UK: Cambridge University Press.
Sharon T and Lucivero F (2019) Introduction to the special theme: The expansion of the health data ecosystem – Rethinking data ethics and governance. Big Data & Society 6(2): 1–5.
Steedman R, Kennedy H and Jones R (2020) Complex ecologies of trust in data practices and data-driven systems. Information, Communication and Society. Epub ahead of print 8 April 2020. DOI: 10.1080/1369118X.2020.1748090.
Warner R and Sloan R (2013) Beyond notice and choice: Privacy, norms, and consent. Journal of High Technology Law. Available at: www.scholarship.kentlaw.iit.edu/fac_schol/568 (accessed 5 June 2020).
Zack ES, Kennedy J and Long JS (2019) Can nonprobability samples be used for social science research? A cautionary tale. Survey Research Methods 15(2): 215–227.
Zuboff S (2018) The Age of Surveillance Capitalism. London, UK: Profile Books Limited.

Journal

Big Data & Society, SAGE

Published: Jun 22, 2020

Keywords: Good data; public perceptions; data management; data trust; personal data store; conjoint experiment
