Making sense of big data in health research: Towards an EU action plan

Medicine and healthcare are undergoing profound changes. Whole-genome sequencing and high-resolution imaging technologies are key drivers of this rapid and crucial transformation. Technological innovation combined with automation and miniaturization has triggered an explosion in data production that will soon reach exabyte proportions. How are we going to deal with this exponential increase in data production? The potential of “big data” for improving health is enormous but, at the same time, we face a wide range of challenges to overcome urgently. Europe is very proud of its cultural diversity; however, exploitation of the data made available through advances in genomic medicine, imaging, and a wide range of mobile health applications or connected devices is hampered by numerous historical, technical, legal, and political barriers. European health systems and databases are diverse and fragmented. There is a lack of harmonization of data formats, processing, analysis, and data transfer, which leads to incompatibilities and lost opportunities. Legal frameworks for data sharing are evolving. Clinicians, researchers, and citizens need improved methods, tools, and training to generate, analyze, and query data effectively. Addressing these barriers will contribute to creating the European Single Market for health, which will improve health and healthcare for all Europeans.

* Correspondence: cauffray@eisbm.org; rudi.balling@uni.lu
European Institute for Systems Biology and Medicine, 1 avenue Claude Vellefaux, 75010 Paris, France
Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7 Avenue des Hauts Fourneaux, 4362 Esch-sur-Alzette, Luxembourg
Full list of author information is available at the end of the article

© 2016 The Author(s). Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Auffray et al. Genome Medicine (2016) 8:71

European healthcare systems and the potential for big data

Medicine has traditionally been a science of observation and experience. For thousands of years, clinicians have integrated the knowledge of preceding generations with their own life-long experiences to treat patients according to the oath of Hippocrates; mostly based on trial and error. Knowledge generation is changing dramatically. The digitalization of medicine allows the comparison of disease progression or treatment responses from patients worldwide. Whole-genome sequencing allows searching and comparing one’s own genome to millions and soon billions of other human genomes. Eventually, the entire world population could be used as a reference population in order to link genome information with many other types of physiological, clinical, environmental, and lifestyle data. For many, this is a vision full of opportunities, whereas for others it provides a wealth of technical challenges, unanticipated consequences, and loss of privacy and autonomy.

The quality of conclusions on the etiology of diseases follows a law of large numbers. Cross-sectional cohort studies of 30,000 to 50,000 or more cases are required to separate the signal from noise and to detect genomic regions associated with a given trait in which disease-related genes or susceptibility factors are located [1, 2]. Whole-genome sequencing studies often identify only a few genomic regions that contain elements with large effects on the penetrance or expressivity of gene products but hundreds of genomic regions that have small effects and are highly dependent on genetic background, environmental factors, or social and lifestyle determinants [3]. There is also a need to study disease pathogenesis on genome, epigenome, transcriptome, proteome, and metabolome levels and combine these dimensions through multi-omics research. Furthermore, individual variation responsible for normal and disease phenotypes is high as a result of somatic mutations or variation in transcription, splicing, or allele-specific gene expression between individuals [4–6].

Vast amounts of temporal and spatial parameter data are now available. But what are we going to do with the data? It takes hard work to condense useful information from big data and turn this information into knowledge and action. The challenge will be to make a smart choice between situations when less is more versus less is less but also when more is more versus more is less.

Here, we briefly describe the key challenges that result from making sense out of big data in health and using these data for the benefit of the patient and the healthcare system. We also highlight key technical, legal, and ethical issues that we face to develop evidence-based personalized medicine. Finally, we put forward five recommendations for the European Union (EU) and member states’ policy makers to serve as a framework for an EU action plan that could help to reach this ambitious goal.

Making sense of big data in health research

On 30 October 2015, the Health Directorate of the Directorate-General for Research and Innovation at the European Commission (EC), the executive body of the EU, organized in Luxembourg a workshop entitled “Big data in health research: an EU action plan” [7]. The aim of the workshop was to ask stakeholders in the “big data revolution” for their input on how European funding for health research should take into account the opportunities, limitations, and concerns of the anticipated developments in health and healthcare. Participants included bioinformaticians, computational biologists, genome scientists, drug developers, biobanking experts, experimental biologists, biostatisticians, information and communication technology (ICT) experts, public health researchers, clinicians, public policy experts, representatives of health services, patient advocacy groups, the pharmaceutical industry, and ICT companies.

What do we mean by “big data”?

“Big data” has a wide range of definitions in health research [8, 9] and to create a single definition for all uses (“one size fits all” approach) may be too abstract to be useful. However, a workable definition of what big data means for health research or at least a consensus of what this term means was proposed during the workshop in Luxembourg. “Big data in health” encompasses high volume, high diversity biological, clinical, environmental, and lifestyle information collected from single individuals to large cohorts, in relation to their health and wellness status, at one or several time points. Big data can only be dealt with by adopting a strong governance model and best practices of new technologies, e.g., in large-scale data production compliant with community-based quality standards, coupled with interoperable data storage, data integration, and advanced analytics solutions [10]. Another goal of the workshop was to develop an EU action plan for research funders towards the integration of big data into policy development, biomedical research, and clinical practice in health and wellness management.

Big data comes from a variety of sources, such as clinical trials, electronic health records (EHRs), patient registries and databases, multidimensional data from genomic, epigenomic, transcriptomic, proteomic, metabolomic, and microbiomic measurements, and medical imaging. More recently, data are being integrated from social media, socioeconomic or behavioral indicators, occupational information, mobile applications, or environmental monitoring [11]. Big data comes in a wide range of formats. Data streams have to be assessed and interpreted in a timely manner to benefit patients affected by diseases and to help citizens remain in good health [8, 12].

Importance of patient registries

Patient registries have for decades served as a key tool for assessing clinical outcome and clinical and health technology performance [13–15]. Rare disease registries pool data to achieve a sufficient sample size for epidemiological and/or clinical research [16, 17]. The European Organization for Research and Treatment of Cancer (EORTC) [18] opened a prospective registry for patients with melanoma in June 2015 [19]. The European Network of Cancer Registries (ENCR) [20], established within the framework of the Europe Against Cancer Programme of the EC, promotes collaboration between cancer registries, defines data collection standards, provides training for cancer registry personnel, and regularly disseminates information on incidence and mortality from cancer in the European Union and other European countries.
Patient registries provide significant potential for research and public health improvements in the EU, owing to the large volume of patients in each registry and the variety of quality medical information related to each patient. Patient registries are increasingly important to monitor patients’ treatments and for safety assessment and the identification of trends in translational medicine (e.g., registry-based clinical trials, personalized medicine) [21].

Patient registries allow informed policy decisions at the local, regional, national, and, in some cases, the international level. As a result, hundreds of registries have been set up that range from national to international rare disease initiatives, coupling clinical and genetic data and biobanks. However, for various reasons, including data protection and the fragmentation of regulatory frameworks, the combination of these disparate information sources to guide health research and decision-making in the clinic has so far lagged behind the use of large-scale, big data collections in other sectors. Other disciplines, such as electronic and mechanical engineering, and whole industries, such as building airplanes, weather forecasting, or robotics, have demonstrated computational modeling and simulation as an essential component that is based on data sharing, and their experience could help overcome the barriers experienced in health research [22–24].

The potential benefits of big data for healthcare

Big data in health can be used to improve the efficiency and effectiveness of prediction and prevention strategies or of medical interventions, health services, and health policies [25–27]. Access to well-curated and high quality health-related data will likely have a number of benefits in a diverse range of situations. In clinical practice, these data will improve outcomes for individual patients through personalization of predictions, earlier diagnosis, better treatments, and improved decision support for clinicians in cyclic processes. (Cyclic processes are usually composed of the definition of policy/decision options, the selection of the best alternative, and the subsequent implementation and validation of this option. Integrating feedback from a continuous evaluation on the process completes this cycle [28].) These improvements should eventually lead to lowered costs for the healthcare system.

Likewise, the integration of fragmented information systems into the clinical life cycle will allow the discovery of medically relevant associations, early signals, or changed disease trajectories and should, therefore, enable better patient management strategies and improved quality and safety of care. For clinical trials, more expansive, interoperable health records should make it much easier to find suitable participants and to design and assess the feasibility of new studies [29, 30]. Moreover, better management of big data would enable a more systematic identification of drug safety signals, such as earlier detection of adverse drug reactions [31], while allowing personalized medicine analyses via appropriate patient and/or population stratification methodologies. This in turn should lead to improved treatment responses for biologically or clinically defined patient subgroups, which will also avoid unnecessary rejection of potent drugs and devices. As a result, patient communities will benefit and the unsustainable trend of escalating costs in hospital and community care management as well as diagnostic and drug development costs by the biopharma industries will stop or slow down. Health economy specialists need to provide suitable metrics to monitor key performance indicators of success in big data pilot projects. Such metrics might include the change in response rates in stratified patient subpopulations or the number of adverse drug reactions after systems medicine-based companion diagnostics.

Big data also has many potential benefits for translational research into health and well-being. Integrated data sets should improve models of common disease to better understand the progression of rare diseases [32]. They may also enable the detection of population-level effects, such as the off-target and adverse effects of drugs or the occurrence of co-morbidity [33].

Biomarkers constitute a key building block of precision medicine, yet the development and clinical validation of new biomarkers is a lengthy process and relatively few such markers have yet reached routine clinical practice [34, 35]. However, a sizeable number of biomarkers are now widely used in routine clinical diagnostics, which include, but are not limited to, targeted cancer therapy [36–39]. Multidimensional signatures that take into account a wealth of prior information, both from the patient’s previous life history and state-of-the-art information from the literature and relevant databases, will hopefully deliver a much higher predictive power than the single biomarkers used today. There is also potential for research on the impact of healthcare interventions and monitoring trends in infectious diseases to inform public health policies [40].

Finally, there is an opportunity to engage with the individual patient more closely and import data from mobile health applications or connected devices. This interaction with the patient will result in the collection of more detailed clinical, environmental, and lifestyle information, such as heart frequency and body temperature, physical activity and nutrition habits, and sleep and stress management, which will prevent risk exposure and disease onset [41]. Personal monitoring over time should aid the early detection of deviations from a healthy state and trajectories should lead to actionable recommendations, making it possible for individuals to maintain themselves in good health [12].

The challenges ahead for the effective use of big data in healthcare

To refine the recommendations for an EU action plan, we identified the main challenges that exist for the use of big data in healthcare research and in the clinic. The challenges have been reported elsewhere [42, 43] and include clinical, technical, legal, and cultural hurdles. These challenges vary depending on whether the data are preclinical based on cellular or animal models or from patients in clinical settings, on the intended type of analysis and interpretation, on cross-cultural aspects of privacy, and on ethical and legal considerations. We are on the cusp of having access to vast personal data (for example, on physiological, behavioral, molecular, clinical, environmental exposure, medical imaging, disease management, medication prescription history, nutrition, or exercise parameters) that could potentially be used to track the health of individuals and populations in considerably more detail than ever before. The integration of structured and unstructured data, using natural language processing and other sophisticated machine learning tools, is being tested and it is hoped this will lead to a new level of integration of prior information with up-to-date clinical information [44].

Over a thousand Mendelian disorders are linked to genetic defects and, for many of these, genetic testing is performed to inform clinical practice. The most successful integration of basic and clinical data can be observed in oncology [45, 46] and in research on rare diseases [47–49]. However, the medical relevance of the large amount of genetic variation revealed by genomic sequencing is still unknown in most cases.

Data acquisition is undergoing rapid change. Wearable devices, integrated sensors, and continuous monitoring capabilities are available for all scales of measurements [50]. Several legal issues will have to be tackled, for example, when a consumer device becomes a diagnostic device and the quality assurance and regulatory approval are more stringent [51].

Data storage issues include security, accessibility, and sustainability. Should data be stored centrally or in a federated manner? There are concerns about entrusting health-related data to public clouds. As a result, there is a strong need to come up with alternatives. The decades of experience in big data management for the particle physics community at The European Organization for Nuclear Research (CERN) that led to the development of the World Wide Web [52] will be valuable. However, many aspects that are specific to big data in health research need to be taken into account, such as data heterogeneity, institutional and legal fragmentation, and strong data protection standards. There will be a massive increase in big data production in all areas of biomedical research, which includes studies at the preclinical levels, such as animal or cellular models and translational studies, but also clinical research that involves patients or public health research. To make the most use of the information produced, several technical challenges should be addressed, such as the combination of structured data, such as genotype, phenotype, and genomics data, with semi-structured and unstructured data, e.g., medical imaging, EHRs, lifestyle, environmental, and health economics data [53–56]. Recent successful examples show the feasibility of combining such data for translational and clinical research [57–59].

Technical challenges related to the management of electronic health records

Adoption of EHRs across Europe varies greatly. Estonia [60] and the Valencia Community in Spain (Josep Redón i Màs, personal communication) have moved entirely to EHRs. Integration is supported with auxiliary systems, for instance drug–drug interaction alert systems that warn physicians and pharmacists about potential prescription clashes, clinical risk groups calculation and costs (e.g., Valencia region, Spain), and drug–gene interaction alert systems that guide physicians to adjust the dose of a prescribed drug in aberrant drug metabolizers (e.g., The Netherlands). The USA has taken steps towards a “patient-driven economy” [61]. In such a scenario, the patient owns his/her data. This ownership requires the development of an appropriate health-record infrastructure but provides a wide range of new health service business opportunities with major economic potential. Empowering patients to take control of their data could be of particular importance for cross-border healthcare and health research activities in Europe where healthcare is highly fragmented and multinational. To transfer medical data from one country to another in the EU is very difficult. Ownership of data by patients could overcome these obstacles and unleash new ways to stimulate a competitive health-driven economy.

Furthermore, patient records can be computationally opaque, for example, in the form of free text, recorded speech, or medical images; translation into a format compatible with computational analyses will be necessary. Data in different languages and time-consuming searches and identification are other important barriers.

There are some best practices for the management of EHRs. For example, the International Rare Diseases Research Consortium (IRDiRC) [62] develops and implements standards and harmonized methodology across diseases and medical cases [63]. Several European collaborative projects, such as the European project p-medicine, have created IT infrastructures that will facilitate translational research and the development of personalized medicine [64]. ELIXIR, one of the European infrastructures for life sciences [65], has facilitated the collection, quality control, and archiving of large amounts of life science data such as translational medicine data [66].

Technical challenges related to data analysis and computing infrastructures

Basic as well as clinical researchers need new computational tools to improve data access and aid user-friendly data analysis for efficient decision making in the clinic. Clinicians need new tools that track, trace, and provide fast feedback for individual patient care. Researchers need tools that can be adapted for different data sets and analyses such as those used in a wide range of EU-funded projects through the Innovative Medicines Initiative (IMI)-funded eTRIKS consortium platform [67]. Accessing tool repositories to search for the best tool to answer specific research or clinical questions will be a prerequisite. Equally important are traceable computational environments that maintain data provenance information from patient to sample and from sample to clinically actionable results.

In December 2012, the UK announced the 100,000 Genomes Project [68], which aims to sequence 100,000 genomes, from around 70,000 people, with the focus on patients with rare diseases or cancer. The US and China have recently announced plans for similar studies on one million individuals. The goal of these projects is to yield further insights into human health and disease and to build a framework with which to integrate genomics into standard public healthcare programs in the near future. Data continue to increase at an exponential rate and the need for cross-border exchange of biomedical and healthcare data, cloud-storage, and cloud-computing is inevitable [69, 70]. Until many issues of data safety and security are solved, however, local solutions will be favored [71].

Data quality, acquisition, curation, and visualization

The quality and structure of health data available is inconsistent. A major challenge for preclinical and clinical research is to obtain and achieve access to sufficient high quality, informative data. Owing to a lack of harmonized methods, in most cases health data cannot be directly used for secondary purposes, such as quality of care, pharmacovigilance, safety and efficacy of treatments, health technology assessment, and public health policy. Efforts are underway, in both Europe and the US, to develop and implement standardized data collection, storage, and analysis [10, 72, 73]. The European Open Science Cloud, created by the EC, will offer Europe’s 1.7 million researchers and 70 million science and technology professionals a virtual environment to store, share, and re-use their data across disciplines and borders [74].

Data curation is often neglected but vitally important to warrant high-quality, informative data [75]. Research funders need to make sure that sufficient attention is paid to data quality at the experimental and study design stages, for example, by ensuring data management plans and appropriately reviewed data sharing procedures are in place for all funded research.

“Seeing is believing”. This phrase is relevant not only for high-resolution microscopy and imaging technologies but also for the presentation and visualization of health-related data. We need to progress from the current display of “hairballs”, incomprehensible comprehensive networks, or ranking tables that nobody has the time or motivation to look at. If we want to provide clinicians with updated, relevant information and clinical decision support systems, the devices have to be user-friendly and intuitive with an interoperable format. The concept of disease-specific maps, with a common computational framework, might be one way to make progress, as demonstrated in several EU-funded projects (Fig. 1).

Computational modeling and simulation

One of the pathways for exploitation of big data is its combination with predictive, mechanistic models [76] such as those provided by the European Molecular Biology Laboratory–European Bioinformatics Institute (EMBL-EBI) [77]. The Virtual Physiological Human (VPH) community has also endeavored to develop a descriptive, integrative, and predictive computational framework of human anatomy, physiology, and pathology with support from the EC Directorate General for Communications Networks, Content & Technology (DG CONNECT) [78, 79], following the path opened by the IUPS Physiome Project [80].

Predictive computational approaches are associated with infrastructural challenges, particularly for the integration of data with analytical tools and workflows. Online environments such as VPH-Share and projects such as p-medicine have appropriate infrastructures for these applications [81].

Another approach to make sense of big data is based on a systems-level understanding of health and disease [82]. Systems medicine integrative approaches are gradually gaining visibility and enable translation of complex and voluminous human biology data into a toolbox to demonstrate clinical impact [83]. However, a full appreciation of the power of systems biology and computational modeling for the upcoming changes in health research and healthcare is still missing. Currently, with the exception of oncology, there are still few highly convincing use cases where systems biology approaches have found applications in routine clinical care [45, 46, 84, 85]. Mathematical, computational disease models are unlikely to be routine in health research anytime soon. Achieving necessary changes will need strong support from funders to foster this paradigm shift in methodology.

Fig. 1 Making sense of complex data and overcoming the hairball syndrome using systems biology algorithms and visualization tools. (a) Visualization of the topology of clinical data from the U-BIOPRED consortium adult severe asthma cohorts (courtesy of Ratko Djukanovic, University of Southampton, UK and Peter Sterk, Amsterdam Medical Center, The Netherlands) [126] using Topology Data Analysis from Ayasdi [127, 128]. (b) Network obtained through integration of genome, transcriptome, and proteome data from the SysCLAD consortium lung transplantation cohorts (courtesy of Johann Pellet, EISBM, France) [129, 130] using Ingenuity® Variant Analysis [131]. (c) Typical static representation of a molecular pathway in Thomson Reuters GeneGo MetaCore™ [132]. (d) An example of a detailed representation of biochemical reactions in the LCSB Parkinson’s molecular map [133]. (e) A cellular-level representation of biological interactions in the EISBM AsthmaMap (courtesy of Alexander Mazein, EISBM, France) [134, 135]. (f) A network representation of data and statements developed as part of a biocentric knowledge base within the eTRIKS consortium (courtesy of Mansoor Saqi and Irina Balaur, EISBM, France) [67]

Legal and regulatory aspects

A crucial aspect to be addressed concerns the regulatory acceptance of big data for the evaluation of novel pharmacological or biological therapies to complement large randomized clinical trials [86]. Collaborative pilot projects that test the use of big data in observational and/or interventional large clinical trials with the contribution of regulatory agencies can bridge different methodological approaches and determine adapted quality standards. Universities and hospitals do not have the procedures in place to effectively capture and share data with other organizations and countries. We need to develop and adopt high quality standards for data generation and processing to ensure that meaningful and valid data with well-defined semantics are processed and shared. The quality of data generation as well as the processing and regulatory acceptance of big data are addressed at the international level. Research initiatives such as the International Cancer Genome Consortium (ICGC, 2016) [87], the International Human Epigenome Consortium (IHEC, 2016) [88], the Genomic Standards Consortium (GSC, 2016) [89], and the Clinical Data Interchange Standards Consortium (CDISC) [90], and the work of ISO standards committees (e.g., ISO TC276 WG5, 2016), provide some examples [91]. The recently published FAIR Data Principles of Findability, Accessibility, Interoperability and Reusability for scientific data management should help stakeholders from academia, industry, funding agencies, and non-commercial publishers support the reuse of scholarly data [92]. Given the complexity and high number of stakeholders involved in the implementation of data standards within hospital and university settings, the biggest chance for success comes with highly focused pilot projects. Key factors include flexibility, expansion through modular strategies, and the identification and involvement of key healthcare actors providing them with immediate benefits.

Linking existing initiatives and building new initiatives on clinical data interchange are also important. The Global Alliance for Genomics and Health (GA4GH, 2016) [93] initiative is working towards technical, ethical, and legal frameworks to address and resolve some of these issues. The Coordinated Research Infrastructures Building Enduring Life-science (CORBEL) Services, a recently launched European consortium, will also contribute to the above data-sharing challenges [94]. CORBEL is an initiative of 11 new Biological and Medical Science Research Infrastructures (BMS RIs), who together will create a platform for harmonized user access to biological and medical technologies, biological samples, and data services (e.g., BRIDGEHEALTH consortium [95]). The Genomics England policy [68] of storing all data within the National Health Service (NHS) with highly regulated restricted access to prevent abuse of private information and user protocols might be a way to go forward. This policy will need to be complemented by that of the UK Personal Genome Project allowing volunteers to donate their personal genome from Genomics England and other sources to the public domain [96].

The processes and legal agreements for data sharing across registries and European Member States are seldom established. The harmonization of regulatory frameworks is crucial while also ensuring personal data protection and compliance with current legal frameworks, which includes provisions on how to prevent, handle, and prosecute potential abuse of the system. For example, there is no consensus within international law on whether specific requirements should be applicable to genetic information. Several documents exist at the regional and international levels that include useful guidelines, such as the UNESCO International Declaration on Human Genetic Data (2003) [97] and the Organisation for Economic Co-operation and Development (OECD) Guidelines on Human Biobanks and Genetic Research Databases (2009) [98]. The GA4GH has developed the Framework for Responsible Sharing of Genomic and Health-Related Data [99].

Privacy protection and data sharing policies

There are broad differences within and across Europe with regard to privacy protection and data sharing policies [100]. The workshop in Luxembourg emphasized that the “one size fits all” approach will not be applicable in Europe. The EC proposal for the General Data Protection Regulation (2012/0011COD) [101] attempts to harmonize the fragmented situation that exists under the current Data Protection Directive (95/46/EC, European Parliament and Council, 1995). In the compromise text concluded in the trilogue negotiations between Parliament and the Council, a paragraph is included in the preamble of the new act which defines DNA and RNA as personal data [102]. A Code of Practice on Secondary Use of Medical Data in European Scientific Research Projects has been developed [103] and is being deployed in the IMI-funded project eTRIKS [67]. There is also a need to have a much higher level of security than is possible today. One suggestion was to explore block-chain technology, which makes use of a digital, distributed record of transactions (digital events), with identical copies maintained on multiple computer systems, shared between many different parties. Once entered, the block-chain contains a certain and verifiable record of every single transaction [104]. Block-chain was originally used as the technology underlying “Bitcoin”; its potential to make secure transactions of biomedical and healthcare data is being explored [105]. Another possibility would be to make use of differentiated privacy approaches as practiced in health information exchanges [106].

Research infrastructures

Similarly, research infrastructures are instrumental to support the harmonization of legal and ethical frameworks in European countries, as demonstrated by the Common Service on Ethical, Legal and Social Implications (CS ELSI) of Biobanking and BioMolecular resources Research Infrastructure Consortium (CS ELSI BBMRI-ERIC, 2016) [107]. The goal of ELSI BBMRI-ERIC is to facilitate and support cross-border exchanges of human biological resources and data attached for research uses, collaborations, and sharing of knowledge, experiences, and best practices.

Existing computational infrastructures are coping with storage of big data, but the challenge within the EU is the lack of a large-scale European infrastructure and methods of secure data distribution in a cross-border setting [108]. It is crucial to ensure that the infrastructures that exist and evolve are coordinated and sustainable. Initiatives such as ELIXIR [65] and the CS IT BBMRI-ERIC [109] have begun to address these issues but there is a need for coordination and significant strategic investments to ensure that organizations such as these are equipped to support the rapid growth and evolution of healthcare informatics over the next decade. Distribution of expertise and facilities, consistent operation, and federation throughout Europe are essential for scalability and long-term sustainability. This has become one of the key challenges of distributed infrastructures such as BBMRI-ERIC [110], ECRIN [111], and ELIXIR [65], which could benefit from the long experience of CERN in particle physics as discussed earlier [52].

Training and education: many health data, insufficient health data scientists

One of the biggest bottlenecks and challenges is the availability of healthcare professionals and clinical researchers that are able to use the latest information technologies developed in the big data analytics era [112, 113]. Data managers with good insight into the specificities of the health application domain are rare. An equally important bottleneck will be the lack of trained clinical scientists to deal with big data. The majority of university hospitals face a daily struggle to balance their budget. Clinical research rarely brings in money to pay the costs for clinical care. As a result, many university hospitals cease to maintain their culture of research as an essential basis for top-level healthcare. Once the chain of training the next generation of clinical scientists is broken by the retirement of the current trainers, the situation will change dramatically and result in a catastrophic shift. Therefore, there is a

Recommendations for an EU action plan

Launch pilot projects on the application of big data to inform health

The primary recommendation is for the launch of pilot projects on the application of big data that involve healthcare providers, health technology developers, policy-makers, and advisory bodies. Pilot translational research projects that involve healthcare workers and patients could bring big data closer to the clinic and prove the value of collecting and analyzing such information using the latest mathematical and computational tools.
The design principles for achieving integrated pressing need for programs that support the careers of healthcare information systems [114] might serve as guid- clinical scientists with state-of-the-art training in data ana- ance on how small pilot projects can be used for future lysis and management. expansion. There is a clear lack of cross-disciplinary education and training, which means that employees in the clinical Leverage the potential of open and citizen science for the environment often do not have the expertise to deal with exploitation of big data in health big data in clinical research and healthcare. Coordinating The concept of “open science” includes open access to Action Systems Medicine (CASyM) [83] has developed publications and raw data, transparency of tools and meth- modules of multidisciplinary training for the next gener- odologies, and networking of researchers across fields and ation of researchers and medical doctors. Furthermore, countries [115]. Open science provides significant added despite compulsory requirements of data transparency value in pilot studies and its broad implementation in the applicable to clinical trials data, researchers and clini- scientific community and society is under discussion. For cians often have little incentive to make data fully avail- example, a high-profile effort to switch all peer-reviewed able. Another challenge may be public skepticism about publishing to open access within the next years is envis- the security of an integrated healthcare system. However, aged [116–119]. several global initiatives have shown that individuals are The second recommendation is to encourage lever- ready to share their medical data for advancing science aging the complementarity between open and citizen (Personal Genome Project) [96], which highlights the science in the context of big data in health. 
It will be im- potential contribution of citizen science to big data in portant to inform and involve the public not only about health research. Data donor cards would provide an in- data collection but also about all aspects of health re- centive for people to make their data publicly available search [120]. Consumer genomics companies are already and would work in the same way as organ donor cards, successful at gathering metadata through engagement thereby reusing a system already understood by many with their customers. The field of rare diseases has also people. Legislative approaches should include opt-in and benefitted greatly from the involvement of parents of or opt-out solutions. For a successful transformation of children with such diseases, using non-traditional tech- healthcare, we need to push the boundaries of interdisci- niques such as social media to build a network of related plinarity, which comprises the natural sciences such as cases of a particular syndrome. “Citizen science” is also biology and medicine, engineering, the social sciences, becoming increasingly important because of the in- and the humanities. Projects fail more often because of creased uptake of mobile health devices, consumer elec- the underappreciation of the complexities of ethical, tronics, and household appliances and is well-aligned legal, and social factors than for technological reasons. with the EC focus on “responsible research and The workshop in Luxembourg brought together a innovation” that includes elements of open science in wide range of experts and stakeholders to discuss the its ongoing Horizon 2020 Framework Programme pol- key developments, challenges, and potential solutions icy [121]. that we face with using big data for the benefit of the patients, the health care industry, and Europe as a Catalyze the involvement of all relevant stakeholders in whole. The workshop resulted in specific recommenda- projects tions for European policy-makers. 
There was no doubt The third recommendation is to involve in projects all among the participants that big data and the revolution relevant stakeholders, which includes clinicians, patient in ICT will transform healthcare. There was also a organizations, researchers, software providers, healthcare sense of urgency to implement rapidly the possible and managers, ethical and legal experts, regulatory authorities, to tackle the yet impossible. policy-makers, pharmaceutical companies, and funding Auffray et al. Genome Medicine (2016) 8:71 Page 9 of 13 bodies. Multidisciplinary involvement is required to secure types of data and data formats. The development and an effective translation from basic research to applied use of interoperable data, technology standards, and har- healthcare and to bridge the organizational and cultural monized operating procedures for data collection and differences in data sharing practices across Europe and analysis are paramount to enable data integration and within the different health sectors in a worldwide context. to support data flow and federated access between pub- Clark and colleagues have laid out “a core set of lessons lic and private partners. Furthermore, applicable data that should become part of a basic training for researchers protection standards and maintaining public trust are interested in crafting usable knowledge for sustainable de- important to realize the full potential of big data in velopment” [122]. One of the most important lessons is to health research for European citizens and, by extension, understand that research is a social and political process, worldwide. In this regard, we need a definition of core not just a process of discovery, and that stakeholders are data sets that could serve as a common standard for diverse and need to be involved in the team building any individual health state. process at an early stage. 
Using the big data revolution to drive the transform- It is likely that bioinformaticians, biostatisticians, and ation of healthcare requires resources for state-of-the-art computational scientists will more often be included in the ICT infrastructure, training programs, and pilot projects near future as natural members of research and clinical that can serve as a role model. These costs, however, will teams and healthcare administration, as already carried be overcompensated by the gains that will come with out by global pharmaceutical organizations. Important to- the implementation of functioning digital workflows and wards this direction is cross-disciplinary training and to sophisticated health data analytics and the creation of a improve the dialogue between the information technology new health and wellness industry. experts, biologists, and clinicians, especially as these groups have the potential to affect greatly the practical Accelerate the harmonization of regulatory frameworks in outcome of research. Europe for health-related research and data sharing The final recommendation is to agree on the necessity for, Support a rapid transition to new computational, and the high priority of, accelerating the harmonization of statistical, and other mathematical methods of analysis the European policy and regulatory frameworks that affect The fourth recommendation is to foster the transition to health-related research and data sharing and the distribu- new computational, statistical, and other mathematical tion of biological material used for the generation of data methods of analysis that enable the integration of data necessary for research. 
There should be a balance between across the multiple scales of time and space typical of the protection of an individual’s privacy, while acknow- complex biological systems in their healthy and diseased ledging that many patients are much more open about states [123]: traditional methods of analysis are no lon- data sharing than current policies seem to assume, and ger scalable for such big data diversity. The roadmap de- the ability to proceed with research to ensure that Europe veloped by the Avicenna Coordination Support Action remains competitive in health research. EU and national provides a vision on how computer simulation will funding bodies should take stock of the existing best prac- transform the biomedical industry by developing “in tices and catalyze their adoption in transnational health silico clinical trials” [124]. research. The need for new methods spans a wide range of topics. We need effective methods for data integration, Conclusions and future perspectives collection, and data provenance management, for ex- The digital revolution is underway. A number of indus- ample, the integration of genomics information and pa- tries have already transformed their activities or have tient registries with EHRs and the integration of model now become inoperative. The driving forces are organism data into disease models. We also need im- miniaturization, automation, and now increasingly the proved methodologies and tools to support data entry by convergence of artificial intelligence, deep learning, and those recording data, such as visual and physiological robotics. Healthcare will not escape these developments. information. Innovative statistical methods, such as In fact, big data as a driving force will play an even more models for predictive analytics and computational important role than in most industries. 
In Europe, work- models tailored to big data, are required to enable hy- ing across borders is the only way to master the chal- pothesis generation, estimation of risk models, and study lenges of this scientific, technological, and industrial design. The Infrastructure, Design, Engineering, Archi- revolution. The single most important factor is the tecture, and Integration project (IDeAl) [125] is taking workforce. Countries that are ahead in ICT competence steps in this direction by developing new methods for and have an understanding of cultural differences and an gene selection to tailor the design for small population ability and willingness to work together have the best group trials. There may even be a requirement for new chance to succeed. Auffray et al. Genome Medicine (2016) 8:71 Page 10 of 13 Abbreviations (New Drugs 4 Bad Bugs, IMI-n°115525), PARENT (PAtient REgistries iNiTiative, BBMRI, Biobanking and Biomolecular Resources Research Infrastructure; BMS RI, CHAFEA Project Grant n°2011 23 02), p-medicine (From data sharing and inte- Biological and Medical Sciences Research Infrastructure; CASyM, Coordinating gration via VPH models to personalized medicine, FP7-n°270089), PREDEMICS Action Systems Medicine; CDISC, Clinical Data Interchange Standards Consortium; (Preparedness, Prediction and Prevention of Emerging Zoonotic Viruses with CORBEL, Coordinated Research Infrastructures Building Enduring Life-science Pandemic Potential using Multidisciplinary Approaches, FP7-n°278433), PREPARE Services; EBI, European Bioinformatics Institute; EC, European Commission; EHR, (Platform for European Preparedness Against (Re-)emerging Epidemics, FP7-n° electronic health record; EISBM, European Institute for Systems Biology and 602525), READNA (Revolutionary Approaches and Devices for Nucleic Acid Medicine; ELSI, ethical, legal, and social implications; EMBL, European Molecular analysis, FP7-n°201418), CHAARM (Combined Highly Active Anti-retroviral Biology Laboratory; 
ENCR, European Network of Cancer Registries; EORTC, Microbicides, FP7-n°242135), ProteomeXchange (International Data Exchange European Organisation for Research and Treatment of Cancer; ERIC, European and Data Representation Standards for Proteomics, FP7-n°260558), PSIMEx Research Infrastructure Consortium; EU, European Union; EURORDIS, Rare Diseases (Proteomics Standards International Molecular Exchange–Systematic Europe; GA4GH, Global Alliance for Genomics and Health; ICGC, International Capture of Published Molecular Interaction Data, FP7-n°223411), RADIANT Cancer Genome Consortium; IHEC, International Human Epigenome Consor- (Rapid Development and Distribution of Statistical Tools for High- tium; IMI, Innovative Medicines Initiative; ISO, International Organization for Throughput Sequencing Data FP7-n°305626), SEMCARE (Semantic Data Standardization; LCSB, Luxembourg Centre for Systems Biomedicine; NHS, Platform for Healthcare, FP7-n°611388), SPRINTT (Sarcopenia and Physical National Health Service; OECD, Organization for Economic Co-operation and fRailty IN older people: multi-componenT Treatment strategies, IMI-n° Development; UNESCO, United Nations Educational, Scientific and Cultural 115621), STATEGRA (User-driven Development of Statistical Methods for Organization; VPH, virtual physiological human Experimental Planning, Data Gathering, and Integrative Analysis of Next Generation Sequencing, proteomics and Metabolomics data FP7-n°306000), SysCLAD (Systems prediction of Chronic Lung Allograft Dysfunction, FP7-n° Acknowledgements 354457), SYSMEDIBD (Systems medicine of chronic inflammatory bowel We would like to thank the organizers of the EC workshop Anders Colver, disease, FP7-n°305564), U-BIOPRED (Unbiased BIOmarkers for the PREDiction of Tomasz Dylag, Christina Kyriakopoulou, and Sasa Jenko, who serve as respiratory disease outcomes, IMI-n°115010), VPH-share (Virtual Physiological scientific officers at the EC Health Directorate. 
Human: Sharing for Healthcare - A Research Environment, FP7-n°269978). Big data in health research: an EU action plan workshop was organized by the Genom Austria (member of the Global Network of Personal Genome Health Directorate of the Directorate-General for Research and Innovation at Projects). the European Commission with the contribution of the Innovative Medicines SJ, PF, and JAV acknowledge support from the European Molecular Biology Initiative office, Digital Society, Trust & Security Directorate of the Directorate Laboratory. General for Communications Networks, Content & Technology, Health systems IB acknowledges support from the Wellcome Trust (WT098051). and products Directorate of Directorate-General for Health and Food Safety, Joint Research Centre and EUROSTAT http://bigdata2015.uni.lu/eng/European- Authors’ contributions Commission-satellite-workshop. We thank chairs and panelists Ana Conesa, Haralampos Karanikas, Inês Barroso, Ivo http://ec.europa.eu/research/health/index.cfm# Gut, Jerry Lanfear, Niklas Blomberg, Norbert Graf, Pablo Villoslada, Paul Flicek, Rod We thank Alvar Agusti, Jacques Beckmann, Laurent Nicod, Andres Metspalu, Hose, Rudi Balling, Tim Hubbard, Yike Guo, Charles Auffray, Mikael Benson, Damjana Rozman, Philippe Sabatier, Ferran Sanz, Peter Sterk, Giulio Superti- Gianluigi Zanetti and Jeanine Houwing-Duistermaat for their contributions to the Furga, Jesper Tegnér, Olaf Wolkenhauer, and two anonymous reviewers, who manuscript preparation. All authors contributed to the content of the manuscript. provided insightful comments that helped to improve the manuscript. Figure 1 Sophie Janacek drafted the initial version of the manuscript, which was subse- was prepared by Bertrand De Meulder, Alexander Mazein, Johann Pellet, quently thoroughly edited by Rudi Balling, Charles Auffray, and Christoph Bock Mansoor Saqi, and Irina Balaur. with the support of Maria Manuela Nogueira. 
Diane Lefaudeux helped with the Workshop participants represent the following projects supported by the bibliography. All authors read and approved the final manuscript. European Union's Horizon 2020 and the Seventh Framework Programme: AETIONOMY (Organising mechanistic knowledge about neurodegenerative diseases for the improvement of drug development and therapy, IMI-n° Competing interests 115568), ASTERIX (New methodologies for clinical trials for small population CA, RB, RDH, CD, KP, EBD, WM, MK, JR, LV, JAV, NG and IB declare that they have groups, FP7-n°603160), BBMRI ERIC, BLUEPRINT (A Blueprint of Haematopoetic no competing interests. AK is employed by ITTM S.A. and is an expert at the Epigenomes, FP7-n°282510), BRIDGEHealth (BRidging Information and Data ISO/TC 76 WG 5; he does not have any competing interests. TK is employed by Generation for Evidence-based Health policy and research, H2020-n°664691), Vitromics Healthcare Holding, which is a member of EuropaBio. Pablo Villoslada CANCER-ID (Cancer treatment and monitoring through identification of circulating has received consultancy fees from Roche, Novartis, Araclon, and Health tumour cells and tumour related nucleic acids in blood FP7-n°115749), CASyM Engineering, is founder and hold stocks in Bionure Inc. and Spire Bioventures, (Coordinating Action Systems Medicine–Implementation of Systems Medicine and works as an academic editor for Neurology & Therapy, Current Treatment across Europe, FP7-n°305033), COMBIMS (A novel drug discovery method based Options in Neurology, Multiple Sclerosis & Demyelinating diseases,and PLoS One. on systems biology: combination therapy and biomarkers for Multiple Sclerosis, PF is a member of the Scientific Advisory Board for Omicia, Inc. 
FP7-n°305397), DECIPHER PCP (Distributed European Community Individual Patient Healthcare Electronic Record, FP7-n°288028), ECHO (European Author details Collaboration for Healthcare Optimization, FP7-n°242189), ELIXIR (European European Institute for Systems Biology and Medicine, 1 avenue Claude Life-science Infrastructure for Biological Information, FP7-n°211601), EMIF Vellefaux, 75010 Paris, France. CIRI-UMR5308, CNRS-ENS-INSERM-UCBL, (European Medical Information Framework, IMI-n°115372), EpiGeneSys Université de Lyon, 50 avenue Tony Garnier, 69007 Lyon, France. (Epigenetics towards systems biology, FP7- n°257082), ERA-IB (ERA-Net for Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7 Industrial Biotechnology 2, FP7-n°291814), ERASynBio (Development and Avenue des Hauts Fourneaux, 4362 Esch-sur-Alzette, Luxembourg. Coordination of Synthetic Biology in the European Research Area, FP7-n° Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, 291728), ERASysAPP (Systems Biology Applications, FP7-n°321567), ESGI Cambridge CB10 1SA, UK. Health Services Management Training Centre, (European Sequencing and Genotyping Infrastructure, FP7-n°262055), Faculty of Health and Public Services, Semmelweis University, Kútvölgyi út 2, eTRIKS (Delivering European Translational Information & Knowledge 1125 Budapest, Hungary. Centre for Personalised Medicine, Linköping Management Services, IMI-n°115446), EU-MASCARA, EUROBIOFORUM, University, 581 85 Linköping, Sweden. Translational & Bioinformatics, Pfizer European Lung Foundation, IDeAl (Integrated Design and Analysis of small Inc., 300 Technology Square, Cambridge, MA 02139, USA. Institute for Health population trials, FP7-n°602552), KConnect (H2020-n°644753), MedBioinformatics Sciences, IACS - IIS Aragon, San Juan Bosco 13, 50009 Zaragoza, Spain. (Creating medically-driven integrative bioinformatics applications focused on ELIXIR, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK. 
oncology, CNS disorders and their comorbidities, H2020-n°634143), MeDALL CeMM Research Center for Molecular Medicine of the Austrian Academy of (Mechanisms of the Development of ALLergy, FP7-n°261357), MIMOmics Sciences, Lazarettgasse 14, AKH BT25.2, 1090 Vienna, Austria. Department of (Methods for Integrated analysis of Multiple Omics datasets, FP7-n°305280), Laboratory Medicine, Medical University of Vienna, Lazarettgasse 14, AKH MULTIMOD (Multi-layer network modules to identify markers for personalized BT25.2, 1090 Vienna, Austria. Max Planck Institute for Informatics, Campus medication in complex diseases, FP7- n°223367), IMI ND4BB TRANSLOCATION E1 4, 66123 Saarbrücken, Germany. Príncipe Felipe Research Center, C/ Auffray et al. Genome Medicine (2016) 8:71 Page 11 of 13 Eduardo Primo Yúfera 3, 46012 Valencia, Spain. University of Florida, 2. Wood AR, Esko T, Yang J, Vedantam S, Pers TH, Gustafsson S, et al. Defining Institute of Food and Agricultural Sciences (IFAS), 2033 Mowry Road, the role of common variation in the genomic and biological architecture of Gainesville, FL 32610, USA. Bluecompanion Ltd, 6 London Street (second adult human height. Nat Genet. 2014;46:1173–86. floor), London W2 1HR, UK. Technology, Data & Analytics, KPMG 3. Cooper DN, Krawczak M, Polychronakos C, Tyler-Smith C, Kehrer-Sawatzki H. Luxembourg, Société Coopérative, 39 Avenue John F. Kennedy, 1855 Where genotype is not predictive of phenotype: towards an understanding Luxembourg, Luxembourg. Department of Human Genetics, Department of of the molecular basis of reduced penetrance in human inherited disease. Pathology, Leiden University Medical Centre, Einthovenweg 20, 2333 ZC Hum Genet. 2013;132:1077–130. Leiden, The Netherlands. Information Technology Department, European 4. Pickrell JK, Marioni JC, Pai AA, Degner JF, Engelhardt BE, Nkadori E, et al. 
Organization for Nuclear Research (CERN), 385 Route de Meyrin, 1211 Understanding mechanisms underlying human gene expression variation Geneva 23, Switzerland. Julius Center for Health Sciences and Primary Care, with RNA sequencing. Nature. 2010;464:768–72. University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The 5. Vernot B, Stergachis AB, Maurano MT, Vierstra J, Neph S, Thurman RE, et al. Netherlands. European Molecular Biology Laboratory, European Personal and population genomics of human regulatory variation. Genome Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Res. 2012;22:1689–97. Cambridge CB10 1SD, UK. Department of Pediatric Oncology/Hematology, 6. Piraino SW, Furney SJ. Beyond the exome: the role of non-coding somatic Saarland University, Campus Homburg, Building 9, 66421 Homburg, mutations in cancer. Ann Oncol Off J Eur Soc Med Oncol ESMO. 2016;27:240–8. Germany. Project Management Jülich, Forschungszentrum Jülich GmbH, 7. European Commission satellite workshop ‘Big data in health research: an EU Wilhelm-Johnen-Straße, 52428 Jülich, Germany. Department of Clinical action plan’. http://bigdata2015.uni.lu/eng/European-Commission-satellite- Pharmacy & Toxicology, Leiden University Medical Center, Albinusdreef 2, workshop. Accessed 20 May 2016. 2333 ZA Leiden, The Netherlands. Data Science Institute, Imperial College 8. Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and London, South Kensington, London SW7 2AZ, UK. CNAG-CRG, Center for potential. Health Inf Sci Syst. 2014;2:3. Genomic Regulation, Barcelona Institute for Science and Technology (BIST), 9. Baro E, Degoul S, Beuscart R, Chazard E. Toward a literature-driven definition C/Baldiri Reixac 4, 08029 Barcelona, Spain. Institute of Software Technology of big data in healthcare. BioMed Res Int. 2015;2015:639021. and Interactive Systems, TU Wien, Favoritenstrasse 9-11/188, 1040 Vienna, 10. 
Meldolesi E, van Soest J, Damiani A, Dekker A, Alitto AR, Campitelli M, et al. Austria. The Association of the British Pharmaceutical Industry, 7th Floor, Standardized data collection to build prediction models in oncology: a Southside, 105 Victoria Street, London SW1E 6QT, UK. Department of prototype for rectal cancer. Future Oncol Lond Engl. 2016;12:119–36. Medical Statistics, RWTH-Aachen University, Universitätsklinikum Aachen, 11. Fernández-Luque L, Bau T. Health and social media: perfect storm of Pauwelsstraße 30, 52074 Aachen, Germany. SYNAPSE Research information. Healthcare Inform Res. 2015;21:67–73. Management Partners, Diputació 237, Àtic 3ª, 08007 Barcelona, Spain. 12. Hood L, Price ND. Demystifying disease, democratizing health care. Sci Department of Infection, Immunity and Cardiovascular Disease and Transl Med. 2014;6:225ed5. Insigneo Institute for In-Silico Medicine, Medical School, University of 13. Wade TD. Traits and types of health data repositories. Health Inf Sci Syst. Sheffield, Beech Hill Road, Sheffield S10 2RX, UK. Department of Statistics, 2014;2:4. School of Mathematics, University of Leeds, Leeds LS2 9JT, UK. Department 14. Ludvigsson JF, Andersson E, Ekbom A, Feychting M, Kim J-L, Reuterwall C, of Medical & Molecular Genetics, King’s College London, London SE1 9RT, et al. External review and validation of the Swedish national inpatient 33 34 UK. Genomics England, London EC1M 6BQ, UK. National and Kapodistrian register. BMC Public Health. 2011;11:450. University of Athens, Medical School, Xristou Lada 6, 10561 Athens, Greece. 15. DiMarco G, Hill D, Feldman SR. Review of patient registries in dermatology. Vitromics Healthcare Holding B.V., Onderwijsboulevard 225, 5223 DE J Am Acad Dermatol. 2016. doi:10.1016/j.jaad.2016.03.020. ’s-Hertogenbosch, The Netherlands. Fraunhofer Institute for Molecular 16. Orphanet. Rare Disease Registries in Europe. 
http://www.orpha.net/ Biology and Applied Ecology ScreeningPort, Schnackenburgallee 114, 22525 orphacom/cahiers/docs/GB/Registries.pdf. Accessed 6 May 2016. Hamburg, Germany. ITTM S.A., 9 avenue des Hauts Fourneaux, 4362 17. 2013 EURORDIS policy fact sheet - Rare Disease Patient Registries. Esch-sur-Alzette, Luxembourg. Research Business Technology, Pfizer Ltd, http://www.eurordis.org/sites/default/files/publications/Factsheet_ GP4 Building, Granta Park, Cambridge CB21 6GP, UK. Health Economics & registries.pdf. Accessed 8 May 2016. Outcomes Research, Deloitte Belgium, Berkenlaan 8A, 1831 Diegem, Belgium. 18. EORTC: European Organisation for Research and Treatment of Cancer. Janssen Pharmaceutica N.V., R&D G3O, Turnhoutseweg 30, 2340 Beerse, http://www.eortc.org. Accessed 6 May 2016. Belgium. Faculty of Life Sciences, University of Manchester, AV Hill Building, 19. EORTC opens prospective registry for patients with Melanoma. http://www. Oxford Road, Manchester M13 9PT, UK. UMR3664 IC/CNRS, Institut Curie, eortc.org/news/eortc-opens-prospective-registry-for-patients-with- Section Recherche, Pavillon Pasteur, 26 rue d’Ulm, 75248 Paris cedex 05, melanoma. Accessed 8 May 2016. France. Linguamatics Ltd, 324 Cambridge Science Park Milton Rd, 20. ENCR: European Network of Cancer Registries. http://www.encr.eu. Accessed Cambridge CB4 0WG, UK. PwC Luxembourg, 2 rue Gerhard Mercator, 2182 6 May 2016. Luxembourg, Luxembourg. Philips, HighTechCampus 36, 5656AE 21. PARENT: PAtient REgistries iNiTiative. http://patientregistries.eu/deliverables. Eindhoven, The Netherlands. Department of Public Health and Primary Accessed 6 May 2016. Care, KU Leuven Kulak, Etienne Sabbelaan 53, 8500 Kortrijk, Belgium. 22. Kaplan G, Virginia Mason, Bo-Linn G, Gordon and Betty Moore Foundation, INCLIVA Health Research Institute, University of Valencia, CIBERobn ISCIII, Carayon P, University of Wisconsin, et al. 
Bringing a systems approach to Avenida Menéndez Pelayo 4 accesorio, 46010 Valencia, Spain. Swiss health. National Academy of Engineering of the National Academies and Institute of Bioinformatics (SIB) and University of Basel, Klingelbergstrasse 50/ Institute of Medicine of the National Academies; Jul 2013. https://www.nae. 70, 4056 Basel, Switzerland. Agency for Health Quality and Assessment of edu/File.aspx?id=86344. Accessed 6 May 2016 Catalonia (AQuAS), Carrer de Roc Boronat 81-95, 08005 Barcelona, Spain. 23. Bulger M, Taylor G, Schroeder R. Data-driven business models: challenges EuroBioForum Foundation, Chrysantstraat 10, 3135 HG Vlaardingen, The and opportunities of big data. Oxford Internet Institute. Research Councils Netherlands. Integrated BioBank of Luxembourg, 6 rue Nicolas-Ernest UK: NEMODE, New Economic Models in the Digital Economy; 2014. Barblé, 1210 Luxembourg, Luxembourg. Technopolis Group, 3 Pavilion http://www.nemode.ac.uk/wp-content/uploads/2014/09/nemode_ Buildings, Brighton BN1 1EE, UK. Hospital Clinic of Barcelona, Institute business_models_for_bigdata_2014_oxford.pdf. Accessed 20 May 2016. d’Investigacions Biomediques August Pi Sunyer (IDIBAPS), Rosello 149, 08036 24. Delfino A, Faure Ragani A, Telpis V, Tilley J, McKinsey & Company. Mature quality Barcelona, Spain. European Platform for Patients’ Organisations, Science systems: what pharma can learn from other industries. Pharm Manuf. 26 Feb and Industry (Epposi), De Meeûs Square 38-40, 1000 Brussels, Belgium. 2015; http://www.pharmamanufacturing.com/articles/2015/mature-quality- 55 56 CRS4, Ed.1 POLARIS, 09129 Pula, Italy. BBMRI-ERIC, Neue Stiftingtalstrasse systems-what-pharma-can-learn-from-other-industries/. Accessed 20 May 2016. 2/B/6, 8010 Graz, Austria. 25. Rumsfeld JS, Joynt KE, Maddox TM. Big data analytics to improve cardiovascular care: promise and challenges. Nat Rev Cardiol. 2016;13(6):350–9. 26. Monteith S, Glenn T, Geddes J, Whybrow PC, Bauer M. 
Big data for bipolar disorder. Int J Bipolar Disord. 2016;4:10. References 27. Janke AT, Overbeek DL, Kocher KE, Levy PD. Exploring the potential of 1. Ideker T, Dutkowski J, Hood L. Boosting signal-to-noise in complex biology: predictive analytics and big data in emergency care. Ann Emerg Med. 2016; prior knowledge is power. Cell. 2011;144:860–3. 67:227–36. Auffray et al. Genome Medicine (2016) 8:71 Page 12 of 13 28. Khandani S. Engineering design process: education transfer plan. 2005. 55. Hofmann-Apitius M, Ball G, Gebel S, Bagewadi S, de Bono B, Schneider R, et al. http://www.saylor.org/site/wp-content/uploads/2012/09/ME101-4.1- Bioinformatics mining and modeling methods for the identification of disease Engineering-Design-Process.pdf. Accessed 8 May 2016. mechanisms in neurodegenerative disorders. Int J Mol Sci. 2015;16:29179–206. 29. Abugessaisa I, Saevarsdottir S, Tsipras G, Lindblad S, Sandin C, Nikamo P, 56. Tenenbaum JD. Translational bioinformatics: past, present, and future. et al. Accelerating translational research by clinically driven development of Genomics Proteomics Bioinformatics. 2016;14:31–41. an informatics platform–a case study. PLoS One. 2014;9, e104382. 57. Denny JC, Bastarache L, Ritchie MD, Carroll RJ, Zink R, Mosley JD, et al. 30. Cano I, Lluch-Ariet M, Gomez-Cabrero D, Maier D, Kalko S, Cascante M, et al. Systematic comparison of phenome-wide association study of electronic Biomedical research in a digital health framework. J Transl Med. 2014;12 Suppl 2:S10. medical record data and genome-wide association study data. Nat Biotechnol. 2013;31:1102–10. 31. Koutkias VG, Jaulent M-C. Computational approaches for pharmacovigilance signal detection: toward integrated and semantically-enriched frameworks. 58. Gustafsson M, Gawel DR, Alfredsson L, Baranzini S, Björkander J, Blomgran R, Drug Saf. 2015;38:219–32. et al. A validated gene regulatory network and GWAS identifies early 32. 
Espay AJ, Bonato P, Nahab FB, Maetzler W, Dean JM, Klucken J, et al. regulators of T cell-associated diseases. Sci Transl Med. 2015;7:313ra178. Technology in Parkinson’s disease: challenges and opportunities. Mov 59. Landau DA, Carter SL, Stojanov P, McKenna A, Stevenson K, Lawrence MS, Disord Off J Mov Disord Soc. 2016. doi:10.1002/mds.26642. et al. Evolution and impact of subclonal mutations in chronic lymphocytic 33. Austin C, Kusumoto F. The application of Big Data in medicine: current leukemia. Cell. 2013;152:714–26. implications and future directions. J Interv Card Electrophysiol Int J Arrhythm 60. Leitsalu L, Alavere H, Tammesoo M-L, Leego E, Metspalu A. Linking a Pacing. 2016. doi:10.1007/s10840-016-0104-y. population biobank with national health registries-the estonian experience. 34. Poste G. Bring on the biomarkers. Nature. 2011;469:156–7. J Pers Med. 2015;5:96–106. 61. Mandl KD, Kohane IS. Time for a patient-driven health information 35. Sawyers CL. The cancer biomarker problem. Nature. 2008;452:548–52. economy? N Engl J Med. 2016;374:205–8. 36. Barlesi F, Mazieres J, Merlio J-P, Debieuvre D, Mosser J, Lena H, et al. Routine 62. IRDiRC: International Rare Diseases Research Consortium. http://www.irdirc. molecular profiling of patients with advanced non-small-cell lung cancer: org. Accessed 8 May 2016. results of a 1-year nationwide programme of the French Cooperative Thoracic Intergroup (IFCT). Lancet Lond Engl. 2016;387:1415–26. 63. RARE-Bestpractices. http://www.rarebestpractices.eu/home. Accessed 8 37. Holderfield M, Deuker MM, McCormick F, McMahon M. Targeting RAF May 2016. kinases for cancer therapy: BRAF-mutated melanoma and beyond. Nat Rev 64. p-medicine - from data sharing and integration via VPH models to Cancer. 2014;14:455–67. personalized medicine. http://www.p-medicine.eu. Accessed 8 May 2016. 38. Kalia M. Biomarkers for personalized oncology: recent advances and future 65. ELIXIR: A distributed infrastructure for life-science information. 
https://www. challenges. Metabolism. 2015;64:S16–21. elixir-europe.org. Accessed 6 May 2016. 39. Semrad TJ, Kim EJ. Molecular testing to optimize therapeutic decision 66. Ison J, Rapacki K, Ménager H, Kalaš M, Rydza E, Chmura P, et al. Tools and making in advanced colorectal cancer. J Gastrointest Oncol. 2016;7:S11–20. data services registry: a community effort to document bioinformatics resources. Nucleic Acids Res. 2016;44:D38–47. 40. Hay SI, George DB, Moyes CL, Brownstein JS. Big data opportunities for global infectious disease surveillance. PLoS Med. 2013;10, e1001413. 67. eTRIKS: European Translational Research Information and Knowledge 41. Zheng Y-L, Ding X-R, Poon CCY, Lo BPL, Zhang H, Zhou X-L, et al. Management Services. https://www.etriks.org. Accessed 6 May 2016. Unobtrusive sensing and wearable devices for health informatics. IEEE Trans 68. Genomics England 100,000 Genomes Project. http://www.genomicsengland. Biomed Eng. 2014;61:1538–54. co.uk. Accessed 6 May 2016. 42. OECD Publishing. Health data governance: privacy, monitoring and research 69. Rosenthal A, Mork P, Li MH, Stanford J, Koester D, Reynolds P. Cloud - policy brief. OECD; Oct 2015. https://www.oecd.org/health/health-systems/ computing: a new business paradigm for biomedical information sharing. Health-Data-Governance-Policy-Brief.pdf. Accessed 6 May 2016. J Biomed Inform. 2010;43:342–53. 43. Eisenstein M. Big data: the power of petabytes. Nature. 2015;527:S2–4. 70. Chen Y-C, Horng G, Lin Y-J, Chen K-C. Privacy preserving index for encrypted electronic medical records. J Med Syst. 2013;37:9992. 44. Doyle-Lindrud S. Watson will see you now: a supercomputer to help clinicians 71. Griebel L, Prokosch H-U, Köpcke F, Toddenroth D, Christoph J, Leb I, et al. make informed treatment decisions. Clin J Oncol Nurs. 2015;19:31–2. A scoping review of cloud computing in healthcare. BMC Med Inform Decis 45. Cesario A, Marcus F. Cancer systems biology, bioinformatics and medicine: Mak. 2015;15:17. 
research and clinical applications. 1st ed. Netherlands: Springer Science & Business Media; 2011. 72. IMI: Innovative Medicines Initiative - Ongoing projects. http://www.imi. 46. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills europa.eu/content/ongoing-projects. Accessed 8 May 2016. GB, Shaw KRM, Ozenberger BA, et al. The Cancer Genome Atlas Pan-Cancer 73. Hughes R, Beene M, Dykes. The significance of data harmonization for analysis project. Nat Genet. 2013;45:1113–20. credentialing research. Washington, DC: Institute of Medicine of the 47. Gahl WA, Wise AL, Ashley EA. The undiagnosed diseases network of the National Academies; 2014. http://nam.edu/wp-content/uploads/2015/06/ national institutes of health: a national extension. JAMA. 2015;314:1797–8. CredentialingDataHarmonization.pdf. Accessed 8 May 2016. 74. European Open Science Cloud. http://ec.europa.eu/research/openscience/ 48. Taruscio D, Groft SC, Cederroth H, Melegh B, Lasko P, Kosaki K, et al. index.cfm?pg=open-science-cloud. Accessed 9 May 2016. Undiagnosed Diseases Network International (UDNI): White paper for global 75. Howe D, Costanzo M, Fey P, Gojobori T, Hannick L, Hide W, et al. Big data: actions to meet patient needs. Mol Genet Metab. 2015;116:223–5. the future of biocuration. Nature. 2008;455:47–50. 49. Thompson R, Johnston L, Taruscio D, Monaco L, Béroud C, Gut IG, et al. 76. Liberles DA, Teufel AI, Liu L, Stadler T. On the need for mechanistic models in RD-Connect: an integrated platform connecting databases, registries, computational genomics and metagenomics. Genome Biol Evol. 2013;5:2008–18. biobanks and clinical bioinformatics for rare disease research. J Gen Intern Med. 2014;29:780–7. 77. EMBL-EBI: European Molecular Biology Laboratory – European 50. Yaman H, Yavuz E, Er A, Vural R, Albayrak Y, Yardimci A, et al. The use of Bioinformatics Institute. http://www.ebi.ac.uk/biomodels-main. Accessed 8 mobile smart devices and medical apps in the family practice setting. 
J Eval May 2016. Clin Pract. 2016;22:290–6. 78. Viceconti M, Hunter P, Hose R. Big data, big knowledge: big data for 51. American Bar Association, Health Law Section, ABA Section of Science & personalized healthcare. IEEE J Biomed Health Inform. 2015;19:1209–15. Technology Law and Center for Professional Development. Medical device 79. Virtual Physiological Human (VPH) Institute. http://www.vph-institute.org. law: compliance issues, best practices and trends. 2015. http://www. Accessed 6 May 2016. americanbar.org/content/dam/aba/events/cle/2015/10/ce1510mdm/ 80. IUPS Physiome Project. http://physiomeproject.org/software/fieldml. ce1510mdm_interactive.authcheckdam.pdf. Accessed 6 May 2016. Accessed 6 May 2016. 52. Di Meglio A. Big data management–from CERN/LHC to personalised 81. MarésJ,Shamardin L, Weiler G, Anguita A,Sfakianakis S, Neri E, et al.p-medicine: medicine. Ajaccio, France: MEDAMI; 2016. doi:10.5281/zenodo.50739. a medical informatics platform for integrated large scale heterogeneous patient 53. Murphy SN, Weber G, Mendis M, Gainer V, Chueh HC, Churchill S, et al. data. AMIA Annu Symp Proc. 2014;2014:872–81. Serving the enterprise and beyond with informatics for integrating biology 82. Schmitz U, Wolkenhauer O. Systems medicine. 1st ed. New York: Humana and the bedside (i2b2). J Am Med Inform Assoc. 2010;17:124–30. Press; 2016. 54. Chen J, Qian F, Yan W, Shen B. Translational biomedical informatics in the 83. CASyM: Coordinating Action Systems Medicine Europe. https://www.casym. cloud: present and future. BioMed Res Int. 2013;2013:658925. eu. Accessed 6 May 2016. Auffray et al. Genome Medicine (2016) 8:71 Page 13 of 13 84. Pemovska T, Kontro M, Yadav B, Edgren H, Eldfors S, Szwajda A, et al. 110. BBMRI-ERIC: Biobanking and BioMolecular resources Research Infrastructures. Individualized systems medicine strategy to tailor treatments for patients with http://bbmri-eric.eu. Accessed 8 May 2016. chemorefractory acute myeloid leukemia. Cancer Discov. 
2013;3:1416–29. 111. ECRIN: European Clinical Research Infrastructure Network. http://www.ecrin. 85. Roca J, Cano I, Gomez-Cabrero D, Tegnér J. From systems understanding to org. Accessed 6 May 2016. personalized medicine: lessons and recommendations based on a 112. Cascante M, de Atauri P, Gomez-Cabrero D, Wagner P, Centelles JJ, Marin S, multidisciplinary and translational analysis of COPD. Methods Mol Biol et al. Workforce preparation: the Biohealth computing model for Master Clifton NJ. 2016;1386:283–303. and PhD students. J Transl Med. 2014;12 Suppl 2:S11. 86. Kemp R. Legal aspects of managing big data white paper. 2014. Kemp IT Law, 113. Rozman D, Acimovic J, Schmeck B. Training in systems approaches for the http://www.kempitlaw.com/wp-content/uploads/2014/10/Legal-Aspects-of- next generation of life scientists and medical doctors. Systems Medicine. Big-Data-White-Paper-v2-1-October-2014.pdf. Accessed 6 May 2016. 1st ed. New York: Humana Press (Springer Protocols). Schmitz U and Wolkenhauer O; 2016. p.73–86. 87. ICGC: International Cancer Genome Consortium. https://icgc.org/. Accessed 6 114. Jensen TB. Design principles for achieving integrated healthcare information May 2016. systems. Health Informatics J. 2013;19:29–45. 88. IHEC: International Human Epigenome Consortium. http://ihec-epigenomes. 115. Open science definition. https://en.wikipedia.org/wiki/Open_science. org. Accessed 6 May 2016. Accessed 8 May 2016. 89. GSC: Genomic Standards Consortium. http://gensc.org. Accessed 6 May 2016. 116. Butler D. Dutch lead European push to flip journals to open access. Nature. 90. CDISC: Clinical Data Interchange Standards Consortium. http://www.cdisc. 2016;529:13–3. org. Accessed 8 May 2016. 117. Swedish Research Council. Proposal for National Guidelines for Open Access 91. ISO TC276 WG5: Technical Committee 276 on Biotechnology, Working to Scientific Information. Swedish Research Council; Feb 2015. https:// Group 5 on Data Processing and Integration. 
http://www.iso.org/iso/home/ publikationer.vr.se/en/product/proposal-for-national-guidelines-for-open- standards_development/list_of_iso_technical_committees/iso_technical_ access-to-scientific-information/. Accessed 8 May 2016. committee.htm?commid=4514241. Accessed 6 May 2016. 118. Bauer B, Blechl B, Bock C, Danowski P, Ferus A, Graschopf A, et al. 92. Wilkinson MD, Dumontier M, Aalbersberg IJJ, Appleton G, Axton M, Baak A, Recommendations for the transition to open access in Austria. Nov 2015. et al. The FAIR Guiding Principles for scientific data management and http://zenodo.org/record/34079#.Vy-njjY03q0. Accessed 8 May 2016 stewardship. Sci Data. 2016;3:160018. 119. Berlin declaration on open access to knowledge in the sciences and 93. GA4GH: Global Alliance for Genomics and Health. http://genomicsandhealth. humanities. 22 Oct 2003. https://openaccess.mpg.de/Berlin-Declaration. org. Accessed 6 May 2016. Accessed 8 May 2016. 94. CORBEL: Coordinated Research Infrastructures Building Enduring Life-science 120. Follett R, Strezov V. An analysis of citizen science based research: usage and Services. https://www.elixir-europe.org/about/eu-projects/corbel. Accessed 6 publication patterns. PLoS One. 2015;10, e0143687. May 2016. 121. Horizon 2020 Framework Programme policy on open science (open access). 95. BRIDGEHEALTH. http://www.bridge-health.eu/content/integrate-information- http://ec.europa.eu/programmes/horizon2020/en/h2020-section/open- injuries. Accessed 8 May 2016. science-open-access. Accessed 8 May 2016. 96. Personal Genome Project. http://www.personalgenomes.org. Accessed 8 May 2016. 122. Clark WC, van Kerkhoff L, Lebel L, Gallopin GC. Crafting usable 97. UNESCO. International Declaration on Human Genetic Data. Oct 2003. knowledge for sustainable development. Proc Natl Acad Sci U S A. http://portal.unesco.org/en/ev.php-URL_ID=17720&URL_DO=DO_ 2016;113:4570–8. TOPIC&URL_SECTION=201.html. Accessed 6 May 2016. 123. 
Wolkenhauer O, Auffray C, Brass O, Clairambault J, Deutsch A, Drasdo D, et al. 98. Publishing OECD. Guidelines for Human Biobanks and Genetic Research Enabling multiscale modeling in systems medicine. Genome Med. 2014;6:21. Databases (HBGRDs). 2009. http://www.oecd.org/sti/biotechnology/hbgrd. 124. Viceconti M, Henney A, Morley-Fletcher E. In silico clinical trials: how Accessed 6 May 2016. computer simulation will transform the biomedical industry. Brussels, 99. Knoppers BM. Framework for responsible sharing of genomic and health- Belgium: Avicenna Coordination Support Action; 2016. http://avicenna-isct. related data. HUGO J. 2014;8:3. org/wp-content/uploads/2016/01/AvicennaRoadmapPDF-27-01-16.pdf. 100. DLA Piper, Data protection laws of the world. https://www. Accessed 20 May 2016. dlapiperdataprotection.com/index.html#handbook/world-map-section. 125. IDeAl: Infrastructure, Design, Engineering, Architecture, and Integration. Accessed 6 May 2016. http://www.uspto.gov/about/vendor_info/current_acquisitions/ideaihom.jsp. 101. Proposal for a Regulation of the European parliament and of the Council on Accessed 8 May 2016. the protection of individuals with regard to the processing of personal data 126. Shaw DE, Sousa AR, Fowler SJ, Fleming LJ, Roberts G, Corfield J, et al. and on the free movement of such data (General Data Protection Directive) Clinical and inflammatory characteristics of the European U-BIOPRED adult 2012/0011 (COD). http://www.europarl.europa.eu/RegData/docs_autres_ severe asthma cohort. Eur Respir J. 2015;46:1308–21. institutions/commission_europeenne/com/2012/0011/COM_ 127. Ayasdi. http://www.ayasdi.com. Accessed 6 May 2016. COM(2012)0011_EN.pdf. Accessed 6 May 2016. 128. Lum PY, Singh G, Lehman A, Ishkanov T, Vejdemo-Johansson M, 102. General data protection regulation, compromise text concluded in the trilogue Alagappan M, et al. 
Extracting insights from the shape of complex data negotiations between the Parliament and the Council (17 December 2015). using topology. Sci Rep. 2013;3:1236. http://www.emeeting.europarl.europa.eu/committees/agenda/201512/LIBE/ 129. Pellet J, Lefaudeux D, Royer P-J, Koutsokera A, Bourgoin-Voillard S, Schmitt LIBE%282015%291217_1/sitt-1739884. Accessed 6 May 2016. M, et al. A multi-omics data integration approach to identify a predictive 103. Bahr A, Schlünder I. Code of practice on secondary use of medical data in molecular signature of CLAD. Eur Respir J. 2015;46, OA3271. European scientific research projects. Int Priv Law. 2015;5:279–91. 130. Pison C, Magnan A, Botturi K, Sève M, Brouard S, Marsland BJ, et al. 104. Why you should care about blockchains: the non-financial uses of blockchain Prediction of chronic lung allograft dysfunction: a systems medicine technology. Nesta. http://www.nesta.org.uk/blog/why-you-should-care-about- challenge. Eur Respir J. 2014;43:689–93. blockchains-non-financial-uses-blockchain-technology. Accessed 8 May 2016. 131. Ingenuity®. http://www.ingenuity.com. Accessed 6 May 2016. 105. Barnes R. Blockchain and digital health–first impressions. DNA Dig. 132. Thomson Reuters GeneGo MetaCore™. https://portal.genego.com. Accessed 8 May 2016. http://dnadigest.org/?s=block+chain+digital+health. Accessed 8 May 2016. 133. Fujita KA, Ostaszewski M, Matsuoka Y, Ghosh S, Glaab E, Trefois C, et al. 106. Tang Y, Liu L. Searching HIE with differentiated privacy preservation. San Diego, USA: Integrating pathways of Parkinson’s disease in a molecular interaction map. 2014 USENIX Summit on Health Information Technologies HealthTech ’14; 2014. Mol Neurobiol. 2014;49:88–102. 107. CS ELSI BBMRI-ERIC: Common Service on Ethical, Legal, and Social Issues 134. Mazein A, Auffray C. EISBM AsthmaMap. http://www.eisbm.org/projects/ of Biobanking and BioMolecular resources Research Infrastructure. http:// disease-maps. Accessed 6 May 2016. 
bbmri-eric.eu/common-services. Accessed 6 May 2016.
108. Georgatos F, Ballereau S, Pellet J, Ghanem M, Price N, Hood L, et al. Computational infrastructures for data and knowledge management in systems biology. In: Prokop A, Csukás B, editors. Systems Biology. Netherlands: Springer; 2013. p. 377–97.
109. CS IT BBMRI-ERIC: Common Service on Information Technology of Biobanking and BioMolecular resources Research Infrastructure. http://bbmri-eric.eu/common-service-it. Accessed 6 May 2016.
135. Mazein A, De Meulder B, Lefaudeux D, Knowles R, Wheelock C, Dahlen S, et al. The AsthmaMap: towards a community-driven reconstruction of asthma-relevant pathways and networks. Estoril, Portugal: The 14th ERS Lung Science Conference; 2016.

Genome Medicine
Springer Journals


References (136)

Publisher
Springer Journals
Copyright
Copyright © 2016 by The Author(s).
Subject
Biomedicine; Human Genetics; Metabolomics; Bioinformatics; Medicine/Public Health, general; Cancer Research; Systems Biology
eISSN
1756-994X
DOI
10.1186/s13073-016-0323-y
pmid
27338147

Abstract

Medicine and healthcare are undergoing profound changes. Whole-genome sequencing and high-resolution imaging technologies are key drivers of this rapid and crucial transformation. Technological innovation combined with automation and miniaturization has triggered an explosion in data production that will soon reach exabyte proportions. How are we going to deal with this exponential increase in data production? The potential of “big data” for improving health is enormous but, at the same time, we face a wide range of challenges to overcome urgently. Europe is very proud of its cultural diversity; however, exploitation of the data made available through advances in genomic medicine, imaging, and a wide range of mobile health applications or connected devices is hampered by numerous historical, technical, legal, and political barriers. European health systems and databases are diverse and fragmented. There is a lack of harmonization of data formats, processing, analysis, and data transfer, which leads to incompatibilities and lost opportunities. Legal frameworks for data sharing are evolving. Clinicians, researchers, and citizens need improved methods, tools, and training to generate, analyze, and query data effectively. Addressing these barriers will contribute to creating the European Single Market for health, which will improve health and healthcare for all Europeans.

* Correspondence: cauffray@eisbm.org; rudi.balling@uni.lu
European Institute for Systems Biology and Medicine, 1 avenue Claude Vellefaux, 75010 Paris, France
Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7 Avenue des Hauts Fourneaux, 4362 Esch-sur-Alzette, Luxembourg
Full list of author information is available at the end of the article

© 2016 The Author(s). Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Auffray et al. Genome Medicine (2016) 8:71

European healthcare systems and the potential for big data

Medicine has traditionally been a science of observation and experience. For thousands of years, clinicians have integrated the knowledge of preceding generations with their own life-long experiences to treat patients according to the oath of Hippocrates; mostly based on trial and error. Knowledge generation is changing dramatically. The digitalization of medicine allows the comparison of disease progression or treatment responses from patients worldwide. Whole-genome sequencing allows searching and comparing one’s own genome to millions and soon billions of other human genomes. Eventually, the entire world population could be used as a reference population in order to link genome information with many other types of physiological, clinical, environmental, and lifestyle data. For many, this is a vision full of opportunities, whereas for others it provides a wealth of technical challenges, unanticipated consequences, and loss of privacy and autonomy.

The quality of conclusions on the etiology of diseases follows a law of large numbers. Cross-sectional cohort studies of 30,000 to 50,000 or more cases are required to separate the signal from noise and to detect genomic regions associated with a given trait in which disease-related genes or susceptibility factors are located [1, 2]. Whole-genome sequencing studies often identify only a few genomic regions that contain elements with large effects on the penetrance or expressivity of gene products but hundreds of genomic regions that have small effects and are highly dependent on genetic background, environmental factors, or social and lifestyle determinants [3]. There is also a need to study disease pathogenesis on genome, epigenome, transcriptome, proteome, and metabolome levels and combine these dimensions through multi-omics research. Furthermore, individual variation responsible for normal and disease phenotypes is high as a result of somatic mutations or variation in transcription, splicing, or allele-specific gene expression between individuals [4–6].

Vast amounts of temporal and spatial parameter data are now available. But what are we going to do with the data? It takes hard work to condense useful information from big data and turn this information into knowledge and action. The challenge will be to make a smart choice between situations when less is more versus less is less but also when more is more versus more is less.

Here, we briefly describe the key challenges that result from making sense out of big data in health and using these data for the benefit of the patient and the healthcare system. We also highlight key technical, legal, and ethical issues that we face to develop evidence-based personalized medicine. Finally, we put forward five recommendations for the European Union (EU) and member states’ policy makers to serve as a framework for an EU action plan that could help to reach this ambitious goal.

Making sense of big data in health research

On 30 October 2015, the Health Directorate of the Directorate-General for Research and Innovation at the European Commission (EC), the executive body of the EU, organized in Luxembourg a workshop entitled “Big data in health research: an EU action plan” [7]. The aim of the workshop was to ask stakeholders in the “big data revolution” for their input on how European funding for health research should take into account the opportunities, limitations, and concerns of the anticipated developments in health and healthcare. Participants included bioinformaticians, computational biologists, genome scientists, drug developers, biobanking experts, experimental biologists, biostatisticians, information and communication technology (ICT) experts, public health researchers, clinicians, public policy experts, representatives of health services, patient advocacy groups, the pharmaceutical industry, and ICT companies.

What do we mean by “big data”?

“Big data” has a wide range of definitions in health research [8, 9] and to create a single definition for all uses (“one size fits all” approach) may be too abstract to be useful. However, a workable definition of what big data means for health research or at least a consensus of what this term means was proposed during the workshop in Luxembourg. “Big data in health” encompasses high volume, high diversity biological, clinical, environmental, and lifestyle information collected from single individuals to large cohorts, in relation to their health and wellness status, at one or several time points. Big data can only be dealt with by adopting a strong governance model and best practices of new technologies, e.g., in large-scale data production compliant with community-based quality standards, coupled with interoperable data storage, data integration, and advanced analytics solutions [10]. Another goal of the workshop was to develop an EU action plan for research funders towards the integration of big data into policy development, biomedical research, and clinical practice in health and wellness management.

Big data comes from a variety of sources, such as clinical trials, electronic health records (EHRs), patient registries and databases, multidimensional data from genomic, epigenomic, transcriptomic, proteomic, metabolomic, and microbiomic measurements, and medical imaging. More recently, data are being integrated from social media, socioeconomic or behavioral indicators, occupational information, mobile applications, or environmental monitoring [11]. Big data comes in a wide range of formats. Data streams have to be assessed and interpreted in a timely manner to benefit patients affected by diseases and to help citizens remain in good health [8, 12].

Importance of patient registries

Patient registries have for decades served as a key tool for assessing clinical outcome and clinical and health technology performance [13–15]. Rare disease registries pool data to achieve a sufficient sample size for epidemiological and/or clinical research [16, 17]. The European Organization for Research and Treatment of Cancer (EORTC) [18] opened a prospective registry for patients with melanoma in June 2015 [19]. The European Network of Cancer Registries (ENCR) [20], established within the framework of the Europe Against Cancer Programme of the EC, promotes collaboration between cancer registries, defines data collection standards, provides training for cancer registry personnel, and regularly disseminates information on incidence and mortality from cancer in the European Union and other European countries.
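As a rough illustration of why registries pool data and why the cohort sizes quoted earlier run into the tens of thousands, the sketch below applies the textbook normal-approximation formula for comparing two proportions. It is not the method used in the cited studies. The function name `cases_per_group`, the example frequencies, and the significance threshold of 5e-8 are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def cases_per_group(p_cases, p_controls, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a difference between
    two proportions (two-sided test, equal group sizes, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_cases * (1 - p_cases) + p_controls * (1 - p_controls)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_cases - p_controls) ** 2)


# Hypothetical variant frequencies in cases vs. controls (not real data):
# the smaller the difference, the faster the required sample size grows.
for p_cases, p_controls in [(0.30, 0.20), (0.21, 0.20), (0.205, 0.20)]:
    n = cases_per_group(p_cases, p_controls, alpha=5e-8)
    print(f"cases {p_cases:.3f} vs controls {p_controls:.3f}: about {n} per group")
```

The required number grows with the inverse square of the difference in frequency, so halving the effect roughly quadruples the cohort. This is one way to read the article's point that small-effect regions can only be detected in very large or pooled collections.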
Patient registries provide significant potential for research and public health improvements in the EU, owing to the large volume of patients in each registry and the variety of quality medical information related to each patient. Patient registries are increasingly important to monitor patients’ treatments and for safety assessment and the identification of trends in translational medicine (e.g., registry-based clinical trials, personalized medicine) [21].

Patient registries allow informed policy decisions at the local, regional, national, and, in some cases, the international level. As a result, hundreds of registries have been set up that range from national to international rare disease initiatives, coupling clinical and genetic data and biobanks. However, for various reasons, including data protection and the fragmentation of regulatory frameworks, the combination of these disparate information sources to guide health research and decision-making in the clinic has so far lagged behind the use of large-scale, big data collections in other sectors. Other disciplines, such as electronic and mechanical engineering, and whole industries, such as building airplanes, weather forecasting, or robotics, have demonstrated computational modeling and simulation as an essential component that is based on data sharing and their experience could help overcome the barriers experienced in health research [22–24].

The potential benefits of big data for healthcare

Big data in health can be used to improve the efficiency and effectiveness of prediction and prevention strategies or of medical interventions, health services, and health policies [25–27]. Access to well-curated and high quality health-related data will likely have a number of benefits in a diverse range of situations. In clinical practice, these data will improve outcomes for individual patients through personalization of predictions, earlier diagnosis, better treatments, and improved decision support for clinicians in cyclic processes. (Cyclic processes are usually composed of the definition of policy/decision options, the selection of the best alternative, and the subsequent implementation and validation of this option. Integrating feedback from a continuous evaluation on the process completes this cycle [28].) These improvements should eventually lead to lowered costs for the healthcare system.

Likewise, the integration of fragmented information systems into the clinical life cycle will allow the discovery of medically relevant associations, early signals, or changed disease trajectories and should, therefore, enable better patient management strategies and improved quality and safety of care. For clinical trials, more expansive, interoperable health records should make it much easier to find suitable participants and to design and assess the feasibility of new studies [29, 30]. Moreover, better management of big data would enable a more systematic identification of drug safety signals, such as earlier detection of adverse drug reactions [31], while allowing personalized medicine analyses via appropriate patient and/or population stratification methodologies. This in turn should lead to improved treatment responses for biologically or clinically defined patient subgroups, which will also avoid unnecessary rejection of potent drugs and devices. As a result, patient communities will benefit and the unsustainable trend of escalating costs in hospital and community care management as well as diagnostic and drug development costs by the biopharma industries will stop or slow down. Health economy specialists need to provide suitable metrics to monitor key performance indicators of success in big data pilot projects. Such metrics might include the change in response rates in stratified patient subpopulations or the number of adverse drug reactions after systems medicine-based companion diagnostics.

Big data also has many potential benefits for translational research into health and well-being. Integrated data sets should improve models of common disease to better understand the progression of rare diseases [32]. They may also enable the detection of population-level effects, such as the off-target and adverse effects of drugs or the occurrence of co-morbidity [33].

Biomarkers constitute a key building block of precision medicine, yet the development and clinical validation of new biomarkers is a lengthy process and relatively few such markers have yet reached routine clinical practice [34, 35]. However, a sizeable number of biomarkers are now widely used in routine clinical diagnostics, which include, but are not limited to, targeted cancer therapy [36–39]. Multidimensional signatures that take into account a wealth of prior information, both from the patient’s previous life history and state-of-the-art information from the literature and relevant databases, will hopefully deliver a much higher predictive power than the single biomarkers used today. There is also potential for research on the impact of healthcare interventions and monitoring trends in infectious diseases to inform public health policies [40].

Finally, there is an opportunity to engage with the individual patient more closely and import data from mobile health applications or connected devices. This interaction with the patient will result in the collection of more detailed clinical, environmental, and lifestyle information, such as heart frequency and body temperature, physical activity and nutrition habits, and sleep and stress management, which will prevent risk exposure and disease onset [41]. Personal monitoring over time should aid the early detection of deviations from a healthy state and trajectories should lead to actionable recommendations, making it possible for individuals to maintain themselves in good health [12].

The challenges ahead for the effective use of big data in healthcare

To refine the recommendations for an EU action plan, we identified the main challenges that exist for the use of big data in healthcare research and in the clinic. The challenges have been reported elsewhere [42, 43] and include clinical, technical, legal, and cultural hurdles. These challenges vary depending on whether the data are preclinical based on cellular or animal models or from patients in clinical settings, on the intended type of analysis and interpretation, on cross-cultural aspects of privacy, and on ethical and legal considerations. We are on the cusp of having access to vast personal data (for example, on physiological, behavioral, molecular, clinical, environmental exposure, medical imaging, disease management, medication prescription history, nutrition, or exercise parameters) that could potentially be used to track the health of individuals and populations in considerably more detail than ever before. The integration of structured and unstructured data, using natural language processing and other sophisticated machine learning tools, is being tested and it is hoped this will lead to a new level of integration of prior information with up-to-date clinical information [44].

Over a thousand Mendelian disorders are linked to genetic defects and, for many of these, genetic testing is performed to inform clinical practice. The most successful integration of basic and clinical data can be observed in oncology [45, 46] and in research on rare diseases [47–49]. However, the medical relevance of the large amount of genetic variation revealed by genomic sequencing is still unknown in most cases.

Data acquisition is undergoing rapid change.

levels, such as animal or cellular models and translational studies, but also clinical research that involves patients or public health research. To make the most use of the information produced, several technical challenges should be addressed, such as the combination of structured data, such as genotype, phenotype, and genomics data, with semi-structured and unstructured data, e.g., medical imaging, EHRs, lifestyle, environmental, and health economics data [53–56]. Recent successful examples show the feasibility of combining such data for translational and clinical research [57–59].

Technical challenges related to the management of electronic health records

Adoption of EHRs across Europe varies greatly. Estonia [60] and the Valencia Community in Spain (Josep Redón i Màs, personal communication) have moved entirely to EHRs. Integration is supported with auxiliary systems, for instance drug–drug interaction alert systems that warn physicians and pharmacists about potential prescription clashes, clinical risk groups calculation and costs (e.g., Valencia region, Spain), and drug–gene interaction alert systems that guide physicians to adjust the dose of a prescribed drug in aberrant drug metabolizers (e.g., The Netherlands). The USA have taken steps towards a “patient-driven economy” [61]. In such a scenario, the patient owns his/her data. This ownership requires the development of an appropriate health-record infrastructure but provides a wide range of new health service business opportunities with major economic potential. Empowering patients to take control of their data could be of particular importance for cross-border healthcare and health re-
Wearable search activities in Europe where healthcare is highly frag- devices, integrated sensors, and continuous monitoring mented and multinational. To transfer medical data from capabilities are available for all scales of measurements one country to another in the EU is very difficult. Owner- [50]. Several legal issues will have to be tackled, for ex- ship of data by patients could overcome these obstacles ample, when a consumer device becomes a diagnostic and unleash new ways to stimulate a competitive health- device and the quality assurance and regulatory approval driven economy. are more stringent [51]. Furthermore, patient records can be computationally Data storage issues include security, accessibility, and opaque, for example, in the form of free text, recorded sustainability. Should data be stored centrally or in a fed- speech, or medical images; translation into a format com- erated manner? There are concerns about entrusting patible with computational analyses will be necessary. health-related data to public clouds. As a result, there is Data in different languages and time-consuming searches a strong need to come up with alternatives. The decades and identification are other important barriers. of experience in big data management for the particle There are some best practices for the management of physics community at The European Organization for EHRs. For example, the International Rare Diseases Re- Nuclear Research (CERN) that led to the development search Consortium (IRDiRC) [62] develops and implements of the World Wide Web [52] will be valuable. However, standards and harmonized methodology across diseases many aspects that are specific to big data in health re- and medical cases [63]. 
Several European collaborative search need to be taken into account, such as data het- projects, such as the European project p-medicine, have erogeneity, institutional and legal fragmentation, and created IT infrastructures that will facilitate translational re- strong data protection standards. There will be a massive search and the development of personalized medicine [64]. increase in big data production in all areas of biomed- ELIXIR, one of the European infrastructures for life sci- ical research, which includes studies at the preclinical ences [65], has facilitated the collection, quality control, and Auffray et al. Genome Medicine (2016) 8:71 Page 5 of 13 archiving of large amounts of life science data such as funders need to make sure that sufficient attention is translational medicine data [66]. paid to data quality at the experimental and study design stages, for example, by ensuring data management plans Technical challenges related to data analysis and and appropriately reviewed data sharing procedures are computing infrastructures in place for all funded research. Basic as well as clinical researchers need new computa- “Seeing is believing”. This phrase is relevant not only for tional tools to improve data access and aid user-friendly high-resolution microscopy and imaging technologies but data analysis for efficient decision making in the clinic. Cli- also for the presentation and visualization of health- nicians need new tools that track, trace, and provide fast related data. We need to progress from the current display feedback for individual patient care. Researchers need tools of “hairballs”, incomprehensible comprehensive networks, that can be adapted for different data sets and analyses or ranking tables that nobody has the time or motivation such as those used in a wide range of EU-funded projects to look at. 
If we want to provide clinicians with updated, through the Innovative Medicines Initiative (IMI)-funded relevant information and clinical decision support sys- eTRIKS consortium platform [67]. Accessing tool reposi- tems, the devices have to be user-friendly and intuitive tories to search for the best tool to answer specific research with an interoperable format. The concept of disease- or clinical questions will be a prerequisite. Equally import- specific maps, with a common computational framework, ant are traceable computational environments that might be one way to make progress, as demonstrated in maintain data provenance information from patient to several EU-funded projects (Fig. 1). sample and from sample to clinically actionable results. In December 2012, the UK announced the 100,000 Ge- Computational modeling and simulation nomes Project [68], which aims to sequence 100,000 One of the pathways for exploitation of big data is its com- genomes, from around 70,000 people, with the focus on bination with predictive, mechanistic models [76] such as patients with rare diseases or cancer. The US and China those provided by the European Molecular Biology La- have recently announced plans for similar studies on boratory–European Bioinformatics Institute (EMBL-EBI) one million individuals. The goal of these projects is to [77]. The Virtual Physiological Human (VPH) community yield further insights into human health and disease has also endeavored to develop a descriptive, integrative, and to build a framework with which to integrate gen- and predictive computational framework of human anat- omics into standard public healthcare programs in the omy, physiology, and pathology with support from the EC near future. 
Data continue to increase at an exponential Directorate General for Communications Networks, Con- rate and the need for cross-border exchange of biomed- tent & Technology (DG CONNECT) [78, 79], following ical and healthcare data, cloud-storage, and cloud- the path opened by the IUPS Physiome Project [80]. computing is inevitable [69, 70]. Until many issues of Predictive computational approaches are associated data safety and security are solved, however, local solu- with infrastructural challenges, particularly for the inte- tions will be favored [71]. gration of data with analytical tools and workflows. On- line environments such as VPH-Share and projects Data quality, acquisition, curation, and such as p-medicine have appropriate infrastructures for visualization these applications [81]. The quality and structure of health data available is incon- Another approach to make sense of big data is based on sistent. A major challenge for preclinical and clinical re- a systems-level understanding of health and disease [82]. search is to obtain and achieve access to sufficient high Systems medicine integrative approaches are gradually quality, informative data. Owing to a lack of harmonized gaining visibility and enable translation of the human biol- methods, in most cases health data cannot be directly ogy complex and voluminous data into a toolbox to dem- used for secondary purposes, such as quality of care, phar- onstrate clinical impact [83]. However, a full appreciation macovigilance, safety and efficacy of treatments, health of the power of systems biology and computational technology assessment, and public health policy. Efforts modeling for the upcoming changes in health re- are underway, in both Europe and the US, to develop and search and healthcare is still missing. Currently, with implement standardized data collection, storage, and ana- the exception of oncology, there are still few highly lysis [10, 72, 73]. 
The European Open Science Cloud, cre- convincing use cases where systems biology ap- ated by the EC, will offer Europe’s 1.7 million researchers proaches have found applications in routine clinical and 70 million science and technology professionals a vir- care [45, 46, 84, 85]. Mathematical, computational dis- tual environment to store, share, and re-use their data ease models are unlikely to be routine in health re- across disciplines and borders [74]. search anytime soon. Achieving necessary changes will Data curation is often neglected but vitally important need strong support from funders to foster this para- to warrant high-quality, informative data [75]. Research digm shift in methodology. Auffray et al. Genome Medicine (2016) 8:71 Page 6 of 13 (a) (b) (c) (d) (e) (f) Fig. 1 Making sense of complex data and overcoming the hairball syndrome using systems biology algorithms and visualization tools. a Visualization of the topology of clinical data from the U-BIOPRED consortium adult severe asthma cohorts (courtesy of Ratko Djukanovic, University of Southampton, UK and Peter Sterk, Amsterdam Medical Center, The Netherlands) [126] using Topology Data Analysis from Ayasdi [127, 128]. b Network obtained though integration of genome, transcriptome, and proteome data from the SysCLAD consortium lung transplantation cohorts (courtesy of Johann Pellet, EISBM, France) [129, 130] using Ingenuity® Variant Analysis [131]. c Typical static representation of a molecular pathway in Thomson Reuters GeneGo MetaCore™ [132]. d An example of a detailed representation of biochemical reactions in the LCSB Parkinson’s molecular map [133]. e A cellular-level representation of biological interactions in the EISBM AsthmaMap (courtesy of Alexander Mazein, EISBM, France) [134, 135]. 
f A network representation of data and statements developed as part of a biocentric knowledge base within the eTRIKS consortium (courtesy of Mansoor Saqi and Irina Balaur, EISBM, France) [67] Legal and regulatory aspects the International Cancer Genome Consortium (ICGC, A crucial aspect to be addressed concerns the regula- 2016) [87], the International Human Epigenome Con- tory acceptance of big data for the evaluation of novel sortium (IHEC, 2016) [88], the Genomic Standards pharmacological or biological therapies to comple- Consortium (GSC, 2016) [89], and the Clinical Data ment large randomized clinical trials [86]. Collabora- Interchange Standards Consortium (CDISC) [90] and tive pilotprojects thattest the useof big data in by ISO standards committees (e.g., ISO TC276 WG5, observational and/or interventional large clinical trials 2016) provide some examples [91]. The recently pub- with the contribution of regulatory agencies can lished FAIR Data Principles of Findability, Accessibil- bridge different methodological approaches and deter- ity, Interoperability and Reusability for scientific data mine adapted quality standards. Universities and hos- management should help stakeholders from academia, pitals do not have the procedures in place to industry, funding agencies, and non-commercial pub- effectively capture and share data with other organi- lishers support the reuse of scholarly data [92]. Given zations and countries. We need to develop and adopt the complexity and high number of stakeholders in- high quality standards for data generation and pro- volved in the implementation of data standards within cessing to ensure that meaningful and valid data with hospital and university settings, the biggest chance for well-defined semantics are processed and shared. The success comes with highly focused pilot projects. 
Key fac- quality of data generation as well as the processing tors include flexibility, expansion through modular strat- and regulatory acceptance of big data are addressed egies, and the identification and involvement of key at the international level. Research initiatives such as healthcare actors providing them with immediate benefits. Auffray et al. Genome Medicine (2016) 8:71 Page 7 of 13 Linking existing initiatives and building new initiatives European Scientific Research Projects has been developed on clinical data interchange are also important. The Glo- [103] and is being deployed in the IMI-funded project bal Alliance for Genomics and Health (GA4GH, 2016) eTRIKS [67]. There is also a need to have a much higher [93] initiative is working towards technical, ethical, and level of security than is possible today. One suggestion was legal frameworks to address and resolve some of these to explore block-chain technology, which makes use of a issues. The Coordinated Research Infrastructures Build- digital, distributed transaction record, digital events, with ing Enduring Life-science (CORBEL) Services, a recently identical copies maintained on multiple computer systems, launched European consortium, will also contribute to shared between many different parties. Once entered, the the above data-sharing challenges [94]. CORBEL is an block-chain contains a certain and verifiable record of initiative of 11 new Biological and Medical Science Re- every single transaction [104]. Originally used as the tech- search Infrastructures (BMS RIs), who together will cre- nology underlying “Bitcoin”, the potential to make secure ate a platform for harmonized user access to biological transactions of biomedical and healthcare data is being ex- and medical technologies, biological samples, and data plored [105]. Another possibility would be to make use of services (e.g., BRIDGEHEALTH consortium [95]). 
The differentiated privacy approaches as practiced in health in- Genomics England policy [68] of storing all data within formation exchanges [106]. the National Health Service (NHS) with highly regulated restricted access to prevent abuse of private information Research infrastructures and user protocols might be a way to go forward. This Similarly, research infrastructures are instrumental to policy will need to be complemented by that of the UK support the harmonization of legal and ethical frame- Personal Genome Project allowing volunteers to donate works in European countries, as demonstrated by the their personal genome from Genomics England and Common Service on Ethical, Legal and Social Implica- other sources to the public domain [96]. tions (CS ELSI) of Biobanking and BioMolecular re- The processes and legal agreements for data sharing sources Research Infrastructure Consortium (CS ELSI across registries and European Member States are seldom BBMRI-ERIC, 2016) [107]. The goal of ELSI BBMRI-ERIC established. The harmonization of regulatory frameworks is to facilitate and support cross-border exchanges of hu- is crucial while also ensuring personal data protection and man biological resources and data attached for research compliance with current legal frameworks, which includes uses, collaborations, and sharing of knowledge, experi- provisions on how to prevent, handle, and prosecute ences, and best practices. potential abuse of the system. For example, there is no Existing computational infrastructures are coping with consensus within international law on whether specific re- storage of big data, but the challenge within the EU is the quirements should be applicable to genetic information. lack of a large-scale European infrastructure and methods Several documents exist at the regional and international of secure data distribution in a cross-border setting [108]. 
levels that include useful guidelines, such as the UNESCO It is crucial to ensure that the infrastructures that exist International Declaration on Human Genetic Data (2003) and evolve are coordinated and sustainable. Initiatives [97] and the Organisation for Economic Co-operation and such as ELIXIR [65] and the CS IT BBMRI-ERIC [109] Development (OECD) Guidelines on Human Biobanks have begun to address these issues but there is a need for and Genetic Research Databases (2009) [98]. The GA4GH coordination and significant strategic investments to has developed the Framework for Responsible Sharing of ensure that organizations such as these are equipped to Genomic and Health-Related Data [99]. support the rapid growth and evolution of healthcare informatics over the next decade. Distribution of ex- Privacy protection and data sharing policies pertise and facilities, consistent operation, and federation There are broad differences within and across Europe with throughout Europe are essential for scalability and long- regards to privacy protection and data sharing polices term sustainability. This has become one of the key chal- [100]. The workshop in Luxembourg emphasized that the lenges of distributed infrastructures such as BBMRI-ERIC “onesizefits all” approach will not be applicable in Europe. [110], ECRIN [111], and ELIXIR [65], which could benefit The EC proposal for the General Data Protection Regula- from the long experience of CERN in particle physics as tion (2012/0011COD) [101] attempts to harmonize the discussed earlier [52]. fragmented situation that exists under the current Data Protection Directive (95/46/EC, European Parliament and Training and education: many health data, Council, 1995). 
In the compromise text concluded in the insufficient health data scientists trilogue negotiations between Parliament and the Council, One of the biggest bottlenecks and challenges is the avail- a paragraph is included in the preamble of the new act ability of healthcare professionals and clinical researchers which defines DNA and RNA as personal data [102]. A that are able to use the latest information technologies de- Code of Practice on Secondary Use of Medical Data in veloped in the big data analytics era [112, 113]. Data Auffray et al. Genome Medicine (2016) 8:71 Page 8 of 13 managers with good insight into the specificities of the Recommendations for an EU action plan health application domain are rare. An equally important Launch pilot projects on the application of big data to bottleneck will be the lack of trained clinical scientists to inform health deal with big data. The majority of university hospitals The primary recommendation is for the launch of pilot face a daily struggle to balance their budget. Clinical re- projects on the application of big data that involve search rarely brings in money to pay the costs for clinical healthcare providers, health technology developers, care. As a result, many university hospitals cease to main- policy-makers, and advisory bodies. Pilot translational tain their culture of research as an essential basis for top- research projects that involve healthcare workers and level healthcare. Once the chain of training the next gen- patients could bring big data closer to the clinic and eration of clinical scientists is broken by the retirement of prove the value of collecting and analyzing such infor- the current trainers, the situation will change dramatically mation using the latest mathematical and computational and result in a catastrophic shift. Therefore, there is a tools. 
The design principles for achieving integrated pressing need for programs that support the careers of healthcare information systems [114] might serve as guid- clinical scientists with state-of-the-art training in data ana- ance on how small pilot projects can be used for future lysis and management. expansion. There is a clear lack of cross-disciplinary education and training, which means that employees in the clinical Leverage the potential of open and citizen science for the environment often do not have the expertise to deal with exploitation of big data in health big data in clinical research and healthcare. Coordinating The concept of “open science” includes open access to Action Systems Medicine (CASyM) [83] has developed publications and raw data, transparency of tools and meth- modules of multidisciplinary training for the next gener- odologies, and networking of researchers across fields and ation of researchers and medical doctors. Furthermore, countries [115]. Open science provides significant added despite compulsory requirements of data transparency value in pilot studies and its broad implementation in the applicable to clinical trials data, researchers and clini- scientific community and society is under discussion. For cians often have little incentive to make data fully avail- example, a high-profile effort to switch all peer-reviewed able. Another challenge may be public skepticism about publishing to open access within the next years is envis- the security of an integrated healthcare system. However, aged [116–119]. several global initiatives have shown that individuals are The second recommendation is to encourage lever- ready to share their medical data for advancing science aging the complementarity between open and citizen (Personal Genome Project) [96], which highlights the science in the context of big data in health. 
It will be im- potential contribution of citizen science to big data in portant to inform and involve the public not only about health research. Data donor cards would provide an in- data collection but also about all aspects of health re- centive for people to make their data publicly available search [120]. Consumer genomics companies are already and would work in the same way as organ donor cards, successful at gathering metadata through engagement thereby reusing a system already understood by many with their customers. The field of rare diseases has also people. Legislative approaches should include opt-in and benefitted greatly from the involvement of parents of or opt-out solutions. For a successful transformation of children with such diseases, using non-traditional tech- healthcare, we need to push the boundaries of interdisci- niques such as social media to build a network of related plinarity, which comprises the natural sciences such as cases of a particular syndrome. “Citizen science” is also biology and medicine, engineering, the social sciences, becoming increasingly important because of the in- and the humanities. Projects fail more often because of creased uptake of mobile health devices, consumer elec- the underappreciation of the complexities of ethical, tronics, and household appliances and is well-aligned legal, and social factors than for technological reasons. with the EC focus on “responsible research and The workshop in Luxembourg brought together a innovation” that includes elements of open science in wide range of experts and stakeholders to discuss the its ongoing Horizon 2020 Framework Programme pol- key developments, challenges, and potential solutions icy [121]. that we face with using big data for the benefit of the patients, the health care industry, and Europe as a Catalyze the involvement of all relevant stakeholders in whole. The workshop resulted in specific recommenda- projects tions for European policy-makers. 
There was no doubt The third recommendation is to involve in projects all among the participants that big data and the revolution relevant stakeholders, which includes clinicians, patient in ICT will transform healthcare. There was also a organizations, researchers, software providers, healthcare sense of urgency to implement rapidly the possible and managers, ethical and legal experts, regulatory authorities, to tackle the yet impossible. policy-makers, pharmaceutical companies, and funding Auffray et al. Genome Medicine (2016) 8:71 Page 9 of 13 bodies. Multidisciplinary involvement is required to secure types of data and data formats. The development and an effective translation from basic research to applied use of interoperable data, technology standards, and har- healthcare and to bridge the organizational and cultural monized operating procedures for data collection and differences in data sharing practices across Europe and analysis are paramount to enable data integration and within the different health sectors in a worldwide context. to support data flow and federated access between pub- Clark and colleagues have laid out “a core set of lessons lic and private partners. Furthermore, applicable data that should become part of a basic training for researchers protection standards and maintaining public trust are interested in crafting usable knowledge for sustainable de- important to realize the full potential of big data in velopment” [122]. One of the most important lessons is to health research for European citizens and, by extension, understand that research is a social and political process, worldwide. In this regard, we need a definition of core not just a process of discovery, and that stakeholders are data sets that could serve as a common standard for diverse and need to be involved in the team building any individual health state. process at an early stage. 
Using the big data revolution to drive the transform- It is likely that bioinformaticians, biostatisticians, and ation of healthcare requires resources for state-of-the-art computational scientists will more often be included in the ICT infrastructure, training programs, and pilot projects near future as natural members of research and clinical that can serve as a role model. These costs, however, will teams and healthcare administration, as already carried be overcompensated by the gains that will come with out by global pharmaceutical organizations. Important to- the implementation of functioning digital workflows and wards this direction is cross-disciplinary training and to sophisticated health data analytics and the creation of a improve the dialogue between the information technology new health and wellness industry. experts, biologists, and clinicians, especially as these groups have the potential to affect greatly the practical Accelerate the harmonization of regulatory frameworks in outcome of research. Europe for health-related research and data sharing The final recommendation is to agree on the necessity for, Support a rapid transition to new computational, and the high priority of, accelerating the harmonization of statistical, and other mathematical methods of analysis the European policy and regulatory frameworks that affect The fourth recommendation is to foster the transition to health-related research and data sharing and the distribu- new computational, statistical, and other mathematical tion of biological material used for the generation of data methods of analysis that enable the integration of data necessary for research. 
across the multiple scales of time and space typical of complex biological systems in their healthy and diseased states [123]: traditional methods of analysis are no longer scalable for such big data diversity. The roadmap developed by the Avicenna Coordination Support Action provides a vision on how computer simulation will transform the biomedical industry by developing “in silico clinical trials” [124].

The need for new methods spans a wide range of topics. We need effective methods for data integration, collection, and data provenance management, for example, the integration of genomics information and patient registries with EHRs and the integration of model organism data into disease models. We also need improved methodologies and tools to support data entry by those recording data, such as visual and physiological information. Innovative statistical methods, such as models for predictive analytics and computational models tailored to big data, are required to enable hypothesis generation, estimation of risk models, and study design. The Infrastructure, Design, Engineering, Architecture, and Integration project (IDeAl) [125] is taking steps in this direction by developing new methods for gene selection to tailor the design for small population group trials. There may even be a requirement for new

There should be a balance between protecting an individual’s privacy (while acknowledging that many patients are much more open about data sharing than current policies seem to assume) and the ability to proceed with research, to ensure that Europe remains competitive in health research. EU and national funding bodies should take stock of the existing best practices and catalyze their adoption in transnational health research.

Conclusions and future perspectives

The digital revolution is underway. A number of industries have already transformed their activities or have now become inoperative. The driving forces are miniaturization, automation, and now increasingly the convergence of artificial intelligence, deep learning, and robotics. Healthcare will not escape these developments. In fact, big data as a driving force will play an even more important role than in most industries. In Europe, working across borders is the only way to master the challenges of this scientific, technological, and industrial revolution. The single most important factor is the workforce. Countries that are ahead in ICT competence and have an understanding of cultural differences and an ability and willingness to work together have the best chance to succeed.

Abbreviations

BBMRI, Biobanking and Biomolecular Resources Research Infrastructure; BMS RI, Biological and Medical Sciences Research Infrastructure; CASyM, Coordinating Action Systems Medicine; CDISC, Clinical Data Interchange Standards Consortium; CORBEL, Coordinated Research Infrastructures Building Enduring Life-science Services; EBI, European Bioinformatics Institute; EC, European Commission; EHR, electronic health record; EISBM, European Institute for Systems Biology and Medicine; ELSI, ethical, legal, and social implications; EMBL, European Molecular Biology Laboratory; ENCR, European Network of Cancer Registries; EORTC, European Organisation for Research and Treatment of Cancer; ERIC, European Research Infrastructure Consortium; EU, European Union; EURORDIS, Rare Diseases Europe; GA4GH, Global Alliance for Genomics and Health; ICGC, International Cancer Genome Consortium; IHEC, International Human Epigenome Consortium; IMI, Innovative Medicines Initiative; ISO, International Organization for Standardization; LCSB, Luxembourg Centre for Systems Biomedicine; NHS, National Health Service; OECD, Organization for Economic Co-operation and Development; UNESCO, United Nations Educational, Scientific and Cultural Organization; VPH, virtual physiological human

Acknowledgements

We would like to thank the organizers of the EC workshop Anders Colver, Tomasz Dylag, Christina Kyriakopoulou, and Sasa Jenko, who serve as scientific officers at the EC Health Directorate. The “Big data in health research: an EU action plan” workshop was organized by the Health Directorate of the Directorate-General for Research and Innovation at the European Commission with the contribution of the Innovative Medicines Initiative office, Digital Society, Trust & Security Directorate of the Directorate General for Communications Networks, Content & Technology, Health systems and products Directorate of Directorate-General for Health and Food Safety, Joint Research Centre and EUROSTAT (http://bigdata2015.uni.lu/eng/European-Commission-satellite-workshop; http://ec.europa.eu/research/health/index.cfm#).

We thank Alvar Agusti, Jacques Beckmann, Laurent Nicod, Andres Metspalu, Damjana Rozman, Philippe Sabatier, Ferran Sanz, Peter Sterk, Giulio Superti-Furga, Jesper Tegnér, Olaf Wolkenhauer, and two anonymous reviewers, who provided insightful comments that helped to improve the manuscript. Figure 1 was prepared by Bertrand De Meulder, Alexander Mazein, Johann Pellet, Mansoor Saqi, and Irina Balaur.

Workshop participants represent the following projects supported by the European Union's Horizon 2020 and the Seventh Framework Programme: AETIONOMY (Organising mechanistic knowledge about neurodegenerative diseases for the improvement of drug development and therapy, IMI-n°115568), ASTERIX (New methodologies for clinical trials for small population groups, FP7-n°603160), BBMRI ERIC, BLUEPRINT (A Blueprint of Haematopoetic Epigenomes, FP7-n°282510), BRIDGEHealth (BRidging Information and Data Generation for Evidence-based Health policy and research, H2020-n°664691), CANCER-ID (Cancer treatment and monitoring through identification of circulating tumour cells and tumour related nucleic acids in blood, FP7-n°115749), CASyM (Coordinating Action Systems Medicine–Implementation of Systems Medicine across Europe, FP7-n°305033), COMBIMS (A novel drug discovery method based on systems biology: combination therapy and biomarkers for Multiple Sclerosis, FP7-n°305397), DECIPHER PCP (Distributed European Community Individual Patient Healthcare Electronic Record, FP7-n°288028), ECHO (European Collaboration for Healthcare Optimization, FP7-n°242189), ELIXIR (European Life-science Infrastructure for Biological Information, FP7-n°211601), EMIF (European Medical Information Framework, IMI-n°115372), EpiGeneSys (Epigenetics towards systems biology, FP7-n°257082), ERA-IB (ERA-Net for Industrial Biotechnology 2, FP7-n°291814), ERASynBio (Development and Coordination of Synthetic Biology in the European Research Area, FP7-n°291728), ERASysAPP (Systems Biology Applications, FP7-n°321567), ESGI (European Sequencing and Genotyping Infrastructure, FP7-n°262055), eTRIKS (Delivering European Translational Information & Knowledge Management Services, IMI-n°115446), EU-MASCARA, EUROBIOFORUM, European Lung Foundation, IDeAl (Integrated Design and Analysis of small population trials, FP7-n°602552), KConnect (H2020-n°644753), MedBioinformatics (Creating medically-driven integrative bioinformatics applications focused on oncology, CNS disorders and their comorbidities, H2020-n°634143), MeDALL (Mechanisms of the Development of ALLergy, FP7-n°261357), MIMOmics (Methods for Integrated analysis of Multiple Omics datasets, FP7-n°305280), MULTIMOD (Multi-layer network modules to identify markers for personalized medication in complex diseases, FP7-n°223367), IMI ND4BB TRANSLOCATION (New Drugs 4 Bad Bugs, IMI-n°115525), PARENT (PAtient REgistries iNiTiative, CHAFEA Project Grant n°2011 23 02), p-medicine (From data sharing and integration via VPH models to personalized medicine, FP7-n°270089), PREDEMICS (Preparedness, Prediction and Prevention of Emerging Zoonotic Viruses with Pandemic Potential using Multidisciplinary Approaches, FP7-n°278433), PREPARE (Platform for European Preparedness Against (Re-)emerging Epidemics, FP7-n°602525), READNA (Revolutionary Approaches and Devices for Nucleic Acid analysis, FP7-n°201418), CHAARM (Combined Highly Active Anti-retroviral Microbicides, FP7-n°242135), ProteomeXchange (International Data Exchange and Data Representation Standards for Proteomics, FP7-n°260558), PSIMEx (Proteomics Standards International Molecular Exchange–Systematic Capture of Published Molecular Interaction Data, FP7-n°223411), RADIANT (Rapid Development and Distribution of Statistical Tools for High-Throughput Sequencing Data, FP7-n°305626), SEMCARE (Semantic Data Platform for Healthcare, FP7-n°611388), SPRINTT (Sarcopenia and Physical fRailty IN older people: multi-componenT Treatment strategies, IMI-n°115621), STATEGRA (User-driven Development of Statistical Methods for Experimental Planning, Data Gathering, and Integrative Analysis of Next Generation Sequencing, proteomics and Metabolomics data, FP7-n°306000), SysCLAD (Systems prediction of Chronic Lung Allograft Dysfunction, FP7-n°354457), SYSMEDIBD (Systems medicine of chronic inflammatory bowel disease, FP7-n°305564), U-BIOPRED (Unbiased BIOmarkers for the PREDiction of respiratory disease outcomes, IMI-n°115010), VPH-share (Virtual Physiological Human: Sharing for Healthcare - A Research Environment, FP7-n°269978). Genom Austria (member of the Global Network of Personal Genome Projects).

SJ, PF, and JAV acknowledge support from the European Molecular Biology Laboratory.

IB acknowledges support from the Wellcome Trust (WT098051).

Authors’ contributions

We thank chairs and panelists Ana Conesa, Haralampos Karanikas, Inês Barroso, Ivo Gut, Jerry Lanfear, Niklas Blomberg, Norbert Graf, Pablo Villoslada, Paul Flicek, Rod Hose, Rudi Balling, Tim Hubbard, Yike Guo, Charles Auffray, Mikael Benson, Gianluigi Zanetti and Jeanine Houwing-Duistermaat for their contributions to the manuscript preparation. All authors contributed to the content of the manuscript. Sophie Janacek drafted the initial version of the manuscript, which was subsequently thoroughly edited by Rudi Balling, Charles Auffray, and Christoph Bock with the support of Maria Manuela Nogueira. Diane Lefaudeux helped with the bibliography. All authors read and approved the final manuscript.

Competing interests

CA, RB, RDH, CD, KP, EBD, WM, MK, JR, LV, JAV, NG and IB declare that they have no competing interests. AK is employed by ITTM S.A. and is an expert at the ISO/TC 276 WG 5; he does not have any competing interests. TK is employed by Vitromics Healthcare Holding, which is a member of EuropaBio. Pablo Villoslada has received consultancy fees from Roche, Novartis, Araclon, and Health Engineering, is founder of and holds stock in Bionure Inc. and Spire Bioventures, and works as an academic editor for Neurology & Therapy, Current Treatment Options in Neurology, Multiple Sclerosis & Demyelinating diseases, and PLoS One. PF is a member of the Scientific Advisory Board for Omicia, Inc.

Author details

European Institute for Systems Biology and Medicine, 1 avenue Claude Vellefaux, 75010 Paris, France. CIRI-UMR5308, CNRS-ENS-INSERM-UCBL, Université de Lyon, 50 avenue Tony Garnier, 69007 Lyon, France. Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7 Avenue des Hauts Fourneaux, 4362 Esch-sur-Alzette, Luxembourg. Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK. Health Services Management Training Centre, Faculty of Health and Public Services, Semmelweis University, Kútvölgyi út 2, 1125 Budapest, Hungary. Centre for Personalised Medicine, Linköping University, 581 85 Linköping, Sweden. Translational & Bioinformatics, Pfizer Inc., 300 Technology Square, Cambridge, MA 02139, USA. Institute for Health Sciences, IACS - IIS Aragon, San Juan Bosco 13, 50009 Zaragoza, Spain. ELIXIR, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK. CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT25.2, 1090 Vienna, Austria. Department of Laboratory Medicine, Medical University of Vienna, Lazarettgasse 14, AKH BT25.2, 1090 Vienna, Austria. Max Planck Institute for Informatics, Campus E1 4, 66123 Saarbrücken, Germany. Príncipe Felipe Research Center, C/ Eduardo Primo Yúfera 3, 46012 Valencia, Spain. University of Florida, Institute of Food and Agricultural Sciences (IFAS), 2033 Mowry Road, Gainesville, FL 32610, USA. Bluecompanion Ltd, 6 London Street (second floor), London W2 1HR, UK. Technology, Data & Analytics, KPMG Luxembourg, Société Coopérative, 39 Avenue John F. Kennedy, 1855 Luxembourg, Luxembourg. Department of Human Genetics, Department of Pathology, Leiden University Medical Centre, Einthovenweg 20, 2333 ZC Leiden, The Netherlands. Information Technology Department, European Organization for Nuclear Research (CERN), 385 Route de Meyrin, 1211 Geneva 23, Switzerland. Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The Netherlands. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK. Department of Pediatric Oncology/Hematology, Saarland University, Campus Homburg, Building 9, 66421 Homburg, Germany. Project Management Jülich, Forschungszentrum Jülich GmbH, Wilhelm-Johnen-Straße, 52428 Jülich, Germany. Department of Clinical Pharmacy & Toxicology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, The Netherlands. Data Science Institute, Imperial College London, South Kensington, London SW7 2AZ, UK. CNAG-CRG, Center for Genomic Regulation, Barcelona Institute for Science and Technology (BIST), C/Baldiri Reixac 4, 08029 Barcelona, Spain. Institute of Software Technology and Interactive Systems, TU Wien, Favoritenstrasse 9-11/188, 1040 Vienna, Austria. The Association of the British Pharmaceutical Industry, 7th Floor, Southside, 105 Victoria Street, London SW1E 6QT, UK. Department of Medical Statistics, RWTH-Aachen University, Universitätsklinikum Aachen, Pauwelsstraße 30, 52074 Aachen, Germany. SYNAPSE Research Management Partners, Diputació 237, Àtic 3ª, 08007 Barcelona, Spain. Department of Infection, Immunity and Cardiovascular Disease and Insigneo Institute for In-Silico Medicine, Medical School, University of Sheffield, Beech Hill Road, Sheffield S10 2RX, UK. Department of Statistics, School of Mathematics, University of Leeds, Leeds LS2 9JT, UK. Department of Medical & Molecular Genetics, King’s College London, London SE1 9RT, UK. Genomics England, London EC1M 6BQ, UK. National and Kapodistrian University of Athens, Medical School, Xristou Lada 6, 10561 Athens, Greece. Vitromics Healthcare Holding B.V., Onderwijsboulevard 225, 5223 DE ’s-Hertogenbosch, The Netherlands. Fraunhofer Institute for Molecular Biology and Applied Ecology ScreeningPort, Schnackenburgallee 114, 22525 Hamburg, Germany. ITTM S.A., 9 avenue des Hauts Fourneaux, 4362 Esch-sur-Alzette, Luxembourg. Research Business Technology, Pfizer Ltd, GP4 Building, Granta Park, Cambridge CB21 6GP, UK. Health Economics & Outcomes Research, Deloitte Belgium, Berkenlaan 8A, 1831 Diegem, Belgium. Janssen Pharmaceutica N.V., R&D G3O, Turnhoutseweg 30, 2340 Beerse, Belgium. Faculty of Life Sciences, University of Manchester, AV Hill Building, Oxford Road, Manchester M13 9PT, UK. UMR3664 IC/CNRS, Institut Curie, Section Recherche, Pavillon Pasteur, 26 rue d’Ulm, 75248 Paris cedex 05, France. Linguamatics Ltd, 324 Cambridge Science Park Milton Rd, Cambridge CB4 0WG, UK. PwC Luxembourg, 2 rue Gerhard Mercator, 2182 Luxembourg, Luxembourg. Philips, HighTechCampus 36, 5656AE Eindhoven, The Netherlands. Department of Public Health and Primary Care, KU Leuven Kulak, Etienne Sabbelaan 53, 8500 Kortrijk, Belgium. INCLIVA Health Research Institute, University of Valencia, CIBERobn ISCIII, Avenida Menéndez Pelayo 4 accesorio, 46010 Valencia, Spain. Swiss Institute of Bioinformatics (SIB) and University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland. Agency for Health Quality and Assessment of Catalonia (AQuAS), Carrer de Roc Boronat 81-95, 08005 Barcelona, Spain. EuroBioForum Foundation, Chrysantstraat 10, 3135 HG Vlaardingen, The Netherlands. Integrated BioBank of Luxembourg, 6 rue Nicolas-Ernest Barblé, 1210 Luxembourg, Luxembourg. Technopolis Group, 3 Pavilion Buildings, Brighton BN1 1EE, UK. Hospital Clinic of Barcelona, Institute d’Investigacions Biomediques August Pi Sunyer (IDIBAPS), Rosello 149, 08036 Barcelona, Spain. European Platform for Patients’ Organisations, Science and Industry (Epposi), De Meeûs Square 38-40, 1000 Brussels, Belgium. CRS4, Ed.1 POLARIS, 09129 Pula, Italy. BBMRI-ERIC, Neue Stiftingtalstrasse 2/B/6, 8010 Graz, Austria.

References

1. Ideker T, Dutkowski J, Hood L. Boosting signal-to-noise in complex biology: prior knowledge is power. Cell. 2011;144:860–3.
2. Wood AR, Esko T, Yang J, Vedantam S, Pers TH, Gustafsson S, et al. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat Genet. 2014;46:1173–86.
3. Cooper DN, Krawczak M, Polychronakos C, Tyler-Smith C, Kehrer-Sawatzki H. Where genotype is not predictive of phenotype: towards an understanding of the molecular basis of reduced penetrance in human inherited disease. Hum Genet. 2013;132:1077–130.
4. Pickrell JK, Marioni JC, Pai AA, Degner JF, Engelhardt BE, Nkadori E, et al. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature. 2010;464:768–72.
5. Vernot B, Stergachis AB, Maurano MT, Vierstra J, Neph S, Thurman RE, et al. Personal and population genomics of human regulatory variation. Genome Res. 2012;22:1689–97.
6. Piraino SW, Furney SJ. Beyond the exome: the role of non-coding somatic mutations in cancer. Ann Oncol Off J Eur Soc Med Oncol ESMO. 2016;27:240–8.
7. European Commission satellite workshop ‘Big data in health research: an EU action plan’. http://bigdata2015.uni.lu/eng/European-Commission-satellite-workshop. Accessed 20 May 2016.
8. Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health Inf Sci Syst. 2014;2:3.
9. Baro E, Degoul S, Beuscart R, Chazard E. Toward a literature-driven definition of big data in healthcare. BioMed Res Int. 2015;2015:639021.
10. Meldolesi E, van Soest J, Damiani A, Dekker A, Alitto AR, Campitelli M, et al. Standardized data collection to build prediction models in oncology: a prototype for rectal cancer. Future Oncol Lond Engl. 2016;12:119–36.
11. Fernández-Luque L, Bau T. Health and social media: perfect storm of information. Healthcare Inform Res. 2015;21:67–73.
12. Hood L, Price ND. Demystifying disease, democratizing health care. Sci Transl Med. 2014;6:225ed5.
13. Wade TD. Traits and types of health data repositories. Health Inf Sci Syst. 2014;2:4.
14. Ludvigsson JF, Andersson E, Ekbom A, Feychting M, Kim J-L, Reuterwall C, et al. External review and validation of the Swedish national inpatient register. BMC Public Health. 2011;11:450.
15. DiMarco G, Hill D, Feldman SR. Review of patient registries in dermatology. J Am Acad Dermatol. 2016. doi:10.1016/j.jaad.2016.03.020.
16. Orphanet. Rare Disease Registries in Europe. http://www.orpha.net/orphacom/cahiers/docs/GB/Registries.pdf. Accessed 6 May 2016.
17. 2013 EURORDIS policy fact sheet - Rare Disease Patient Registries. http://www.eurordis.org/sites/default/files/publications/Factsheet_registries.pdf. Accessed 8 May 2016.
18. EORTC: European Organisation for Research and Treatment of Cancer. http://www.eortc.org. Accessed 6 May 2016.
19. EORTC opens prospective registry for patients with Melanoma. http://www.eortc.org/news/eortc-opens-prospective-registry-for-patients-with-melanoma. Accessed 8 May 2016.
20. ENCR: European Network of Cancer Registries. http://www.encr.eu. Accessed 6 May 2016.
21. PARENT: PAtient REgistries iNiTiative. http://patientregistries.eu/deliverables. Accessed 6 May 2016.
22. Kaplan G, Virginia Mason, Bo-Linn G, Gordon and Betty Moore Foundation, Carayon P, University of Wisconsin, et al. Bringing a systems approach to health. National Academy of Engineering of the National Academies and Institute of Medicine of the National Academies; Jul 2013. https://www.nae.edu/File.aspx?id=86344. Accessed 6 May 2016.
23. Bulger M, Taylor G, Schroeder R. Data-driven business models: challenges and opportunities of big data. Oxford Internet Institute. Research Councils UK: NEMODE, New Economic Models in the Digital Economy; 2014. http://www.nemode.ac.uk/wp-content/uploads/2014/09/nemode_business_models_for_bigdata_2014_oxford.pdf. Accessed 20 May 2016.
24. Delfino A, Faure Ragani A, Telpis V, Tilley J, McKinsey & Company. Mature quality systems: what pharma can learn from other industries. Pharm Manuf. 26 Feb 2015; http://www.pharmamanufacturing.com/articles/2015/mature-quality-systems-what-pharma-can-learn-from-other-industries/. Accessed 20 May 2016.
25. Rumsfeld JS, Joynt KE, Maddox TM. Big data analytics to improve cardiovascular care: promise and challenges. Nat Rev Cardiol. 2016;13(6):350–9.
26. Monteith S, Glenn T, Geddes J, Whybrow PC, Bauer M. Big data for bipolar disorder. Int J Bipolar Disord. 2016;4:10.
27. Janke AT, Overbeek DL, Kocher KE, Levy PD. Exploring the potential of predictive analytics and big data in emergency care. Ann Emerg Med. 2016;67:227–36.
28. Khandani S. Engineering design process: education transfer plan. 2005. http://www.saylor.org/site/wp-content/uploads/2012/09/ME101-4.1-Engineering-Design-Process.pdf. Accessed 8 May 2016.
29. Abugessaisa I, Saevarsdottir S, Tsipras G, Lindblad S, Sandin C, Nikamo P, et al. Accelerating translational research by clinically driven development of an informatics platform–a case study. PLoS One. 2014;9, e104382.
30. Cano I, Lluch-Ariet M, Gomez-Cabrero D, Maier D, Kalko S, Cascante M, et al. Biomedical research in a digital health framework. J Transl Med. 2014;12 Suppl 2:S10.
31. Koutkias VG, Jaulent M-C. Computational approaches for pharmacovigilance signal detection: toward integrated and semantically-enriched frameworks. Drug Saf. 2015;38:219–32.
32. Espay AJ, Bonato P, Nahab FB, Maetzler W, Dean JM, Klucken J, et al. Technology in Parkinson’s disease: challenges and opportunities. Mov Disord Off J Mov Disord Soc. 2016. doi:10.1002/mds.26642.
33. Austin C, Kusumoto F. The application of Big Data in medicine: current implications and future directions. J Interv Card Electrophysiol Int J Arrhythm Pacing. 2016. doi:10.1007/s10840-016-0104-y.
34. Poste G. Bring on the biomarkers. Nature. 2011;469:156–7.
35. Sawyers CL. The cancer biomarker problem. Nature. 2008;452:548–52.
36. Barlesi F, Mazieres J, Merlio J-P, Debieuvre D, Mosser J, Lena H, et al. Routine molecular profiling of patients with advanced non-small-cell lung cancer: results of a 1-year nationwide programme of the French Cooperative Thoracic Intergroup (IFCT). Lancet Lond Engl. 2016;387:1415–26.
37. Holderfield M, Deuker MM, McCormick F, McMahon M. Targeting RAF kinases for cancer therapy: BRAF-mutated melanoma and beyond. Nat Rev Cancer. 2014;14:455–67.
38. Kalia M. Biomarkers for personalized oncology: recent advances and future challenges. Metabolism. 2015;64:S16–21.
39. Semrad TJ, Kim EJ. Molecular testing to optimize therapeutic decision making in advanced colorectal cancer. J Gastrointest Oncol. 2016;7:S11–20.
40. Hay SI, George DB, Moyes CL, Brownstein JS. Big data opportunities for global infectious disease surveillance. PLoS Med. 2013;10, e1001413.
41. Zheng Y-L, Ding X-R, Poon CCY, Lo BPL, Zhang H, Zhou X-L, et al. Unobtrusive sensing and wearable devices for health informatics. IEEE Trans Biomed Eng. 2014;61:1538–54.
42. OECD Publishing. Health data governance: privacy, monitoring and research - policy brief. OECD; Oct 2015. https://www.oecd.org/health/health-systems/Health-Data-Governance-Policy-Brief.pdf. Accessed 6 May 2016.
43. Eisenstein M. Big data: the power of petabytes. Nature. 2015;527:S2–4.
44. Doyle-Lindrud S. Watson will see you now: a supercomputer to help clinicians make informed treatment decisions. Clin J Oncol Nurs. 2015;19:31–2.
45. Cesario A, Marcus F. Cancer systems biology, bioinformatics and medicine: research and clinical applications. 1st ed. Netherlands: Springer Science & Business Media; 2011.
46. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. 2013;45:1113–20.
47. Gahl WA, Wise AL, Ashley EA. The undiagnosed diseases network of the national institutes of health: a national extension. JAMA. 2015;314:1797–8.
48. Taruscio D, Groft SC, Cederroth H, Melegh B, Lasko P, Kosaki K, et al. Undiagnosed Diseases Network International (UDNI): White paper for global actions to meet patient needs. Mol Genet Metab. 2015;116:223–5.
49. Thompson R, Johnston L, Taruscio D, Monaco L, Béroud C, Gut IG, et al. RD-Connect: an integrated platform connecting databases, registries, biobanks and clinical bioinformatics for rare disease research. J Gen Intern Med. 2014;29:780–7.
50. Yaman H, Yavuz E, Er A, Vural R, Albayrak Y, Yardimci A, et al. The use of mobile smart devices and medical apps in the family practice setting. J Eval Clin Pract. 2016;22:290–6.
51. American Bar Association, Health Law Section, ABA Section of Science & Technology Law and Center for Professional Development. Medical device law: compliance issues, best practices and trends. 2015. http://www.americanbar.org/content/dam/aba/events/cle/2015/10/ce1510mdm/ce1510mdm_interactive.authcheckdam.pdf. Accessed 6 May 2016.
52. Di Meglio A. Big data management–from CERN/LHC to personalised medicine. Ajaccio, France: MEDAMI; 2016. doi:10.5281/zenodo.50739.
53. Murphy SN, Weber G, Mendis M, Gainer V, Chueh HC, Churchill S, et al. Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). J Am Med Inform Assoc. 2010;17:124–30.
54. Chen J, Qian F, Yan W, Shen B. Translational biomedical informatics in the cloud: present and future. BioMed Res Int. 2013;2013:658925.
55. Hofmann-Apitius M, Ball G, Gebel S, Bagewadi S, de Bono B, Schneider R, et al. Bioinformatics mining and modeling methods for the identification of disease mechanisms in neurodegenerative disorders. Int J Mol Sci. 2015;16:29179–206.
56. Tenenbaum JD. Translational bioinformatics: past, present, and future. Genomics Proteomics Bioinformatics. 2016;14:31–41.
57. Denny JC, Bastarache L, Ritchie MD, Carroll RJ, Zink R, Mosley JD, et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat Biotechnol. 2013;31:1102–10.
58. Gustafsson M, Gawel DR, Alfredsson L, Baranzini S, Björkander J, Blomgran R, et al. A validated gene regulatory network and GWAS identifies early regulators of T cell-associated diseases. Sci Transl Med. 2015;7:313ra178.
59. Landau DA, Carter SL, Stojanov P, McKenna A, Stevenson K, Lawrence MS, et al. Evolution and impact of subclonal mutations in chronic lymphocytic leukemia. Cell. 2013;152:714–26.
60. Leitsalu L, Alavere H, Tammesoo M-L, Leego E, Metspalu A. Linking a population biobank with national health registries-the estonian experience. J Pers Med. 2015;5:96–106.
61. Mandl KD, Kohane IS. Time for a patient-driven health information economy? N Engl J Med. 2016;374:205–8.
62. IRDiRC: International Rare Diseases Research Consortium. http://www.irdirc.org. Accessed 8 May 2016.
63. RARE-Bestpractices. http://www.rarebestpractices.eu/home. Accessed 8 May 2016.
64. p-medicine - from data sharing and integration via VPH models to personalized medicine. http://www.p-medicine.eu. Accessed 8 May 2016.
65. ELIXIR: A distributed infrastructure for life-science information. https://www.elixir-europe.org. Accessed 6 May 2016.
66. Ison J, Rapacki K, Ménager H, Kalaš M, Rydza E, Chmura P, et al. Tools and data services registry: a community effort to document bioinformatics resources. Nucleic Acids Res. 2016;44:D38–47.
67. eTRIKS: European Translational Research Information and Knowledge Management Services. https://www.etriks.org. Accessed 6 May 2016.
68. Genomics England 100,000 Genomes Project. http://www.genomicsengland.co.uk. Accessed 6 May 2016.
69. Rosenthal A, Mork P, Li MH, Stanford J, Koester D, Reynolds P. Cloud computing: a new business paradigm for biomedical information sharing. J Biomed Inform. 2010;43:342–53.
70. Chen Y-C, Horng G, Lin Y-J, Chen K-C. Privacy preserving index for encrypted electronic medical records. J Med Syst. 2013;37:9992.
71. Griebel L, Prokosch H-U, Köpcke F, Toddenroth D, Christoph J, Leb I, et al. A scoping review of cloud computing in healthcare. BMC Med Inform Decis Mak. 2015;15:17.
72. IMI: Innovative Medicines Initiative - Ongoing projects. http://www.imi.europa.eu/content/ongoing-projects. Accessed 8 May 2016.
73. Hughes R, Beene M, Dykes. The significance of data harmonization for credentialing research. Washington, DC: Institute of Medicine of the National Academies; 2014. http://nam.edu/wp-content/uploads/2015/06/CredentialingDataHarmonization.pdf. Accessed 8 May 2016.
74. European Open Science Cloud. http://ec.europa.eu/research/openscience/index.cfm?pg=open-science-cloud. Accessed 9 May 2016.
75. Howe D, Costanzo M, Fey P, Gojobori T, Hannick L, Hide W, et al. Big data: the future of biocuration. Nature. 2008;455:47–50.
76. Liberles DA, Teufel AI, Liu L, Stadler T. On the need for mechanistic models in computational genomics and metagenomics. Genome Biol Evol. 2013;5:2008–18.
77. EMBL-EBI: European Molecular Biology Laboratory – European Bioinformatics Institute. http://www.ebi.ac.uk/biomodels-main. Accessed 8 May 2016.
78. Viceconti M, Hunter P, Hose R. Big data, big knowledge: big data for personalized healthcare. IEEE J Biomed Health Inform. 2015;19:1209–15.
79. Virtual Physiological Human (VPH) Institute. http://www.vph-institute.org. Accessed 6 May 2016.
80. IUPS Physiome Project. http://physiomeproject.org/software/fieldml. Accessed 6 May 2016.
81. Marés J, Shamardin L, Weiler G, Anguita A, Sfakianakis S, Neri E, et al. p-medicine: a medical informatics platform for integrated large scale heterogeneous patient data. AMIA Annu Symp Proc. 2014;2014:872–81.
82. Schmitz U, Wolkenhauer O. Systems medicine. 1st ed. New York: Humana Press; 2016.
83. CASyM: Coordinating Action Systems Medicine Europe. https://www.casym.eu. Accessed 6 May 2016.
84. Pemovska T, Kontro M, Yadav B, Edgren H, Eldfors S, Szwajda A, et al. Individualized systems medicine strategy to tailor treatments for patients with chemorefractory acute myeloid leukemia. Cancer Discov. 2013;3:1416–29.
85. Roca J, Cano I, Gomez-Cabrero D, Tegnér J. From systems understanding to personalized medicine: lessons and recommendations based on a multidisciplinary and translational analysis of COPD. Methods Mol Biol Clifton NJ. 2016;1386:283–303.
86. Kemp R. Legal aspects of managing big data white paper. 2014. Kemp IT Law, http://www.kempitlaw.com/wp-content/uploads/2014/10/Legal-Aspects-of-Big-Data-White-Paper-v2-1-October-2014.pdf. Accessed 6 May 2016.
87. ICGC: International Cancer Genome Consortium. https://icgc.org/. Accessed 6 May 2016.
88. IHEC: International Human Epigenome Consortium. http://ihec-epigenomes.org. Accessed 6 May 2016.
89. GSC: Genomic Standards Consortium. http://gensc.org. Accessed 6 May 2016.
90. CDISC: Clinical Data Interchange Standards Consortium. http://www.cdisc.org. Accessed 8 May 2016.
91. ISO TC276 WG5: Technical Committee 276 on Biotechnology, Working Group 5 on Data Processing and Integration. http://www.iso.org/iso/home/standards_development/list_of_iso_technical_committees/iso_technical_committee.htm?commid=4514241. Accessed 6 May 2016.
92. Wilkinson MD, Dumontier M, Aalbersberg IJJ, Appleton G, Axton M, Baak A, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016;3:160018.
93. GA4GH: Global Alliance for Genomics and Health. http://genomicsandhealth.org. Accessed 6 May 2016.
94. CORBEL: Coordinated Research Infrastructures Building Enduring Life-science Services. https://www.elixir-europe.org/about/eu-projects/corbel. Accessed 6 May 2016.
95. BRIDGEHEALTH. http://www.bridge-health.eu/content/integrate-information-injuries. Accessed 8 May 2016.
96. Personal Genome Project. http://www.personalgenomes.org. Accessed 8 May 2016.
97. UNESCO. International Declaration on Human Genetic Data. Oct 2003. http://portal.unesco.org/en/ev.php-URL_ID=17720&URL_DO=DO_TOPIC&URL_SECTION=201.html. Accessed 6 May 2016.
98. Publishing OECD. Guidelines for Human Biobanks and Genetic Research Databases (HBGRDs). 2009. http://www.oecd.org/sti/biotechnology/hbgrd. Accessed 6 May 2016.
99. Knoppers BM. Framework for responsible sharing of genomic and health-related data. HUGO J. 2014;8:3.
100. DLA Piper, Data protection laws of the world. https://www.dlapiperdataprotection.com/index.html#handbook/world-map-section. Accessed 6 May 2016.
101. Proposal for a Regulation of the European parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Directive) 2012/0011 (COD). http://www.europarl.europa.eu/RegData/docs_autres_institutions/commission_europeenne/com/2012/0011/COM_COM(2012)0011_EN.pdf. Accessed 6 May 2016.
102. General data protection regulation, compromise text concluded in the trilogue negotiations between the Parliament and the Council (17 December 2015). http://www.emeeting.europarl.europa.eu/committees/agenda/201512/LIBE/LIBE%282015%291217_1/sitt-1739884. Accessed 6 May 2016.
103. Bahr A, Schlünder I. Code of practice on secondary use of medical data in European scientific research projects. Int Priv Law. 2015;5:279–91.
104. Why you should care about blockchains: the non-financial uses of blockchain technology. Nesta. http://www.nesta.org.uk/blog/why-you-should-care-about-blockchains-non-financial-uses-blockchain-technology. Accessed 8 May 2016.
105. Barnes R. Blockchain and digital health–first impressions. DNA Dig. http://dnadigest.org/?s=block+chain+digital+health. Accessed 8 May 2016.
106. Tang Y, Liu L. Searching HIE with differentiated privacy preservation. San Diego, USA: 2014 USENIX Summit on Health Information Technologies HealthTech ’14; 2014.
107. CS ELSI BBMRI-ERIC: Common Service on Ethical, Legal, and Social Issues of Biobanking and BioMolecular resources Research Infrastructure. http://bbmri-eric.eu/common-services. Accessed 6 May 2016.
108. Georgatos F, Ballereau S, Pellet J, Ghanem M, Price N, Hood L, et al. Computational infrastructures for data and knowledge management in systems biology. In: Prokop A, Csukás B, editors. Systems Biology. Netherlands: Springer; 2013. p. 377–97.
109. CS IT BBMRI-ERIC: Common Service on Information Technology of Biobanking and BioMolecular resources Research Infrastructure. http://bbmri-eric.eu/common-service-it. Accessed 6 May 2016.
110. BBMRI-ERIC: Biobanking and BioMolecular resources Research Infrastructures. http://bbmri-eric.eu. Accessed 8 May 2016.
111. ECRIN: European Clinical Research Infrastructure Network. http://www.ecrin.org. Accessed 6 May 2016.
112. Cascante M, de Atauri P, Gomez-Cabrero D, Wagner P, Centelles JJ, Marin S, et al. Workforce preparation: the Biohealth computing model for Master and PhD students. J Transl Med. 2014;12 Suppl 2:S11.
113. Rozman D, Acimovic J, Schmeck B. Training in systems approaches for the next generation of life scientists and medical doctors. Systems Medicine. 1st ed. New York: Humana Press (Springer Protocols). Schmitz U and Wolkenhauer O; 2016. p.73–86.
114. Jensen TB. Design principles for achieving integrated healthcare information systems. Health Informatics J. 2013;19:29–45.
115. Open science definition. https://en.wikipedia.org/wiki/Open_science. Accessed 8 May 2016.
116. Butler D. Dutch lead European push to flip journals to open access. Nature. 2016;529:13–3.
117. Swedish Research Council. Proposal for National Guidelines for Open Access to Scientific Information. Swedish Research Council; Feb 2015. https://publikationer.vr.se/en/product/proposal-for-national-guidelines-for-open-access-to-scientific-information/. Accessed 8 May 2016.
118. Bauer B, Blechl B, Bock C, Danowski P, Ferus A, Graschopf A, et al. Recommendations for the transition to open access in Austria. Nov 2015. http://zenodo.org/record/34079#.Vy-njjY03q0. Accessed 8 May 2016.
119. Berlin declaration on open access to knowledge in the sciences and humanities. 22 Oct 2003. https://openaccess.mpg.de/Berlin-Declaration. Accessed 8 May 2016.
120. Follett R, Strezov V. An analysis of citizen science based research: usage and publication patterns. PLoS One. 2015;10, e0143687.
121. Horizon 2020 Framework Programme policy on open science (open access). http://ec.europa.eu/programmes/horizon2020/en/h2020-section/open-science-open-access. Accessed 8 May 2016.
122. Clark WC, van Kerkhoff L, Lebel L, Gallopin GC. Crafting usable knowledge for sustainable development. Proc Natl Acad Sci U S A. 2016;113:4570–8.
123. Wolkenhauer O, Auffray C, Brass O, Clairambault J, Deutsch A, Drasdo D, et al. Enabling multiscale modeling in systems medicine. Genome Med. 2014;6:21.
124. Viceconti M, Henney A, Morley-Fletcher E. In silico clinical trials: how computer simulation will transform the biomedical industry. Brussels, Belgium: Avicenna Coordination Support Action; 2016. http://avicenna-isct.org/wp-content/uploads/2016/01/AvicennaRoadmapPDF-27-01-16.pdf. Accessed 20 May 2016.
125. IDeAl: Infrastructure, Design, Engineering, Architecture, and Integration. http://www.uspto.gov/about/vendor_info/current_acquisitions/ideaihom.jsp. Accessed 8 May 2016.
126. Shaw DE, Sousa AR, Fowler SJ, Fleming LJ, Roberts G, Corfield J, et al. Clinical and inflammatory characteristics of the European U-BIOPRED adult severe asthma cohort. Eur Respir J. 2015;46:1308–21.
127. Ayasdi. http://www.ayasdi.com. Accessed 6 May 2016.
128. Lum PY, Singh G, Lehman A, Ishkanov T, Vejdemo-Johansson M, Alagappan M, et al. Extracting insights from the shape of complex data using topology. Sci Rep. 2013;3:1236.
129. Pellet J, Lefaudeux D, Royer P-J, Koutsokera A, Bourgoin-Voillard S, Schmitt M, et al. A multi-omics data integration approach to identify a predictive molecular signature of CLAD. Eur Respir J. 2015;46, OA3271.
130. Pison C, Magnan A, Botturi K, Sève M, Brouard S, Marsland BJ, et al. Prediction of chronic lung allograft dysfunction: a systems medicine challenge. Eur Respir J. 2014;43:689–93.
131. Ingenuity®. http://www.ingenuity.com. Accessed 6 May 2016.
132. Thomson Reuters GeneGo MetaCore™. https://portal.genego.com. Accessed 8 May 2016.
133. Fujita KA, Ostaszewski M, Matsuoka Y, Ghosh S, Glaab E, Trefois C, et al. Integrating pathways of Parkinson’s disease in a molecular interaction map. Mol Neurobiol. 2014;49:88–102.
134. Mazein A, Auffray C. EISBM AsthmaMap. http://www.eisbm.org/projects/disease-maps. Accessed 6 May 2016.
135. Mazein A, De Meulder B, Lefaudeux D, Knowles R, Wheelock C, Dahlen S, et al. The AsthmaMap: towards a community-driven reconstruction of asthma-relevant pathways and networks. Estoril, Portugal: The 14th ERS Lung Science Conference; 2016.

Journal

Genome Medicine, Springer Journals

Published: Jun 23, 2016
