Statistical Approaches for Assessing Health Effects of Environmental Chemical Mixtures in Epidemiology: Lessons from an Innovative Workshop

Perspectives | Brief Communication

A Section 508–conformant HTML version of this article is available at http://dx.doi.org/10.1289/EHP547.

Summary: Quantifying the impact of exposure to environmental chemical mixtures is important for identifying risk factors for diseases and developing more targeted public health interventions. The National Institute of Environmental Health Sciences (NIEHS) held a workshop in July 2015 to address the need to develop novel statistical approaches for multi-pollutant epidemiology studies. The primary objective of the workshop was to identify and compare different statistical approaches and methods for analyzing complex chemical mixtures data in both simulated and real-world data sets. At the workshop, participants compared approaches and results and speculated as to why they may have differed. Several themes emerged: a) no one statistical approach appeared to outperform the others, b) many methods included some form of variable reduction or summation of the data before statistical analysis, c) the statistical approach should be selected based upon a specific hypothesis or scientific question, and d) related mixtures data should be shared among researchers to more comprehensively and accurately address methodological questions and statistical approaches. Future efforts should continue to design and optimize statistical approaches to address questions about chemical mixtures in epidemiological studies.

Background

There is great interest in quantifying the impact of exposure to environmental chemical mixtures on human health. As shown in biomonitoring studies, children and adults are exposed to a large number of environmental chemicals across the life span (Aylward et al. 2013; CDC 2012; Exley et al. 2015; Frederiksen et al. 2014). Many are potentially toxic, but little is known about health effects from exposure to complex mixtures (Carlin et al. 2013; Claus Henn et al. 2014; Goodson et al. 2015; Grandjean and Landrigan 2014; Johns et al. 2012). By examining chemical mixtures, instead of one chemical at a time, it may be possible to more accurately identify risk factors for diseases with environmental origins and develop more targeted public health interventions.

In 2011, the National Institute of Environmental Health Sciences (NIEHS) hosted a workshop on chemical mixtures entitled “Advancing Research on Mixtures: New Perspectives and Approaches for Predicting Adverse Human Health Effects.” This workshop brought together experts from epidemiology, toxicology, exposure science, risk assessment, and statistics to identify key challenges in mixtures research and to suggest approaches for addressing those challenges (Carlin et al. 2013). An important theme that emerged was the need for further collaboration between experts that would help bridge the gap between toxicological and epidemiological studies that involve chemical mixtures. This cross-disciplinary collaboration is a necessary step in understanding exposure to real-world mixtures and the associated health effects. Another key concept that came from the workshop was the need to develop novel statistical approaches that would predict and evaluate effects associated with exposure to mixtures. In addition, the NIEHS has incorporated into its 2012–2017 Strategic Plan (Goal 4) the need for further study of the health effects associated with combined exposures (NIEHS 2012; see http://www.niehs.nih.gov/about/strategicplan/). This goal includes the assessment of joint action of multiple environmental exposures, including chemicals, nonchemical stressors (e.g., socioeconomic, behavioral factors), infectious agents, the microbiome, and nutritional components on toxicity and disease. Moreover, there is a need to identify interactions resulting from combined exposures, determine how the combined exposures affect human health outcomes, and identify preventive measures to mitigate the potential impact of these exposures.

Objectives

To follow up on the themes from the 2011 workshop, and in an effort to focus on statistical approaches for multi-pollutant (i.e., mixtures) epidemiology studies, the NIEHS convened another workshop in July 2015. This workshop, “Statistical Approaches for Assessing Health Effects of Environmental Chemical Mixtures in Epidemiology Studies,” was designed to bring together experts from the fields of environmental epidemiology and biostatistics (NIEHS 2015). The primary objective of the workshop was to identify and compare different approaches and methods for analyzing chemical mixture data in epidemiological studies.

An innovative approach was used to attract and engage potential workshop participants and conduct a working meeting. This approach involved having participants apply various statistical methods to two simulated data sets and one real-world data set before the workshop. Each data set included a single continuous health outcome (Y), multiple chemical exposures, and additional non-exposure variables (e.g., potential confounders). Experts were offered an opportunity to test their statistical methods of choice on the data sets and later exhibit their findings at the workshop.

The first step in the process was to make the simulated data sets available to potential participants approximately six months before the workshop. These data sets have since been made publicly available on the NIEHS web site (http://www.niehs.nih.gov/about/events/pastmtg/2015/statistical/). Participants were asked to analyze the data sets using their specific statistical approach(es) and to submit an abstract describing their approach(es). Second, the real-world data set was made available to those who submitted abstracts based on their analyses of the simulated data sets. The methods used to create the two simulated data sets were known only by the workshop organizers and were revealed to those participants who had submitted their analyses to the workshop organizers prior to the workshop. This allowed potential participants to compare their results to the known (i.e., “truth”) results and to reflect on why their results may have differed.

The planning committee received 33 abstracts from academia, government, and industry. Based on these abstracts, subsets of individuals were invited to present their approach and statistical model(s) at the meeting.

The Data Sets

Simulated data set 1 (n = 500) was designed to represent a prospective cohort epidemiologic study with seven continuous, log-normal exposures and one binary variable stipulated to be a confounder that required adjustment. Assumptions were built into the data set and included no loss to follow-up, missing or censored data, mismeasurement of the variables, or other potential biases. It was also assumed that the seven exposure variables and the binary variable were neither intermediate variables nor colliders. The data sets were designed such that there were high correlations between exposures, the binary variable was a strong confounder, and directions of effect for the exposures differed. Random, normally distributed noise was added to the outcome variable, and only part of the variation in the outcome was explained by the independent variables. In addition, this data set had fewer exposure variables and smaller amounts of unexplained variation (e.g., random noise) than the second simulated data set, and it included non-linear exposure–response functions and interactions between exposures.

Simulated data set 2 (n = 500) represented data from a cross-sectional study of 14 exposure variables. This data set included three potential confounders (two continuous and one binary), a strong correlation between exposures, and strong effect measure modification by a binary confounder (e.g., sex). The exposure variables had complex correlations based on real-world biomarker data from the National Health and Nutrition Examination Survey (NHANES). The second simulated data set featured more exposure variables and more unexplained variation than the first simulated data set, but contained linear exposure–response functions and no interactions between exposures. To understand the nature of the challenges presented to the workshop participants, the reader can find additional information regarding the complexity of the simulated data sets and the assumptions that were built into the data sets on the workshop web site (http://www.niehs.nih.gov/about/visiting/events/pastmtg/2015/statistical/).

The third data set was a modified real-world data set (n = 270) that came from a prospective pregnancy and birth cohort study of mothers and children where the results (i.e., truth) were unknown (Braun et al. 2016b). This data set included 22 exposure variables: 14 polychlorinated biphenyls (PCBs), 4 polybrominated diphenyl ethers (PBDEs), and 4 organochlorine pesticides. The outcome consisted of scores on the Mental Development Index (MDI; a measurement of cognition) (Bayley 1969) at ages 1–3 years; covariates included child’s sex and mother’s age, education, race, and smoking status during pregnancy.

For each analysis, workshop participants were encouraged to work in multidisciplinary teams including epidemiologists, statisticians, and toxicologists. They were asked to address the following qualitative and quantitative questions in their analyses:
• Which exposures potentially contributed to the outcome? Are there any that did not? (qualitative).
• How much did the exposures potentially contribute to the outcome? (quantitative).
• Was there evidence of “interaction?” Be explicit with your definition of interaction [toxicologists, epidemiologists, and biostatisticians tend to think about this quite differently (Howard and Webster 2013)].
• What was the effect of joint and cumulative exposure to the mixture? (qualitative).
• What is the estimate of the function Y = f(X_1, …, X_p)? (quantitative).

Workshop participants were also asked to provide specific details about their methodologies and how their assumptions may have influenced the results. These included providing a basic overview of the method(s) used, the rationale for using their approach(es), any transformation or preparation of the data necessary for using the approach(es), and assumptions inherent to the approach(es) and built into the model (e.g., departures from linearity, dose–response shapes, interactions, modifiers, and different potencies for exposures). Participants were also asked to include information about the statistical software they used and to provide the statistical code in their analysis (e.g., R, SAS). They were encouraged to state whether or not they used an existing package or procedure and identify if they had to significantly modify an existing package or procedure, or develop completely new code. The statistical code submitted by participants is available on the workshop’s web site (http://www.niehs.nih.gov/about/visiting/events/pastmtg/2015/statistical/).

Finally, participants were asked to compare the outcomes of their analyses to the correct answers (i.e., truth) associated with the two simulated data sets. If they did not achieve the correct answers for either data set, they were asked to speculate as to why this might have occurred and if changing assumptions would have enabled them to reach the correct result. In addition, the participants were requested to summarize the main strengths and weaknesses of their approach, note any particular challenges they encountered during their analysis (e.g., lack of toxicity data information, limitations in number of exposures that could be evaluated at one time), and recommend next steps.

Discussion

Numerous statistical approaches were proposed at this workshop and can be categorized as classification and prediction, exposure–response surface estimation, variable selection, and variable shrinkage strategies (Table 1). In general, most of these techniques involved reduction or summation of the exposures in some way. For comparison purposes, some investigators evaluated the commonly implemented linear regression (ordinary least squares) approach. All methods were applied to both simulated data sets 1 and 2, and some were applied to the real-world data set.

Table 1. Examples of approaches presented at the NIEHS workshop “Statistical Approaches for Assessing Health Effects of Environmental Chemical Mixtures in Epidemiology Studies.”

Method | Category
Single chemical analysis | Classic linear regression (ordinary least squares)
Multiple regression | Classic linear regression (ordinary least squares)
Visualization, structural equation modeling (SEM), and principal component analysis (PCA) | Classification and prediction
Informed sparse PCA and segmented regression | Classification and prediction
Bayesian g-formula | Classification and prediction
PCA | Classification and prediction
Classification and regression trees (CART) | Classification and prediction
Bayesian profile regression | Classification and prediction
Random forest | Classification and prediction
Multivariate adaptive regression splines (MARS) | Classification and prediction
Bayesian non-parametric regression | Classification and prediction
Bayesian additive regression trees (BART) and negative sparse PCA (NSPCA) | Classification and prediction
Conformal predictions | Classification and prediction
Bayesian kernel machine regression (BKMR) | Exposure–response surface estimation
Building Bayesian networks | Exposure–response surface estimation
Exposure surface smoothing (ESS) | Exposure–response surface estimation
Modes of action (results presented for Z = 0 strata) | Other
Feasible solution algorithm (FSA) | Other
Exploratory data analysis (EDA) | Other
Novel approach and least-angle regression (LARS) | Variable selection
Machine learning | Variable selection
Two-step variable selection and least absolute shrinkage and selection operator (LASSO) | Variable selection
Two-step shrinkage-based regression | Variable selection
Factor mixture models | Variable selection
Subset and bootstrap | Variable selection
Variable selection regression (VSR) | Variable selection
Bayesian estimation of weighted sum | Variable shrinkage strategies
Shrinkage methods (LASSO/LARS) | Variable shrinkage strategies
Weighted quantile sum regression (WQS) | Variable shrinkage strategies
LASSO | Variable shrinkage strategies

Several general observations emerged from the discussion of these approaches. First, participants agreed that no one statistical approach seemed to perform better than another at the qualitative level for the simulated data sets. Rather, there was extensive variability across the methods and less alignment with the correct answers (i.e., truth) with increasing data complexity (i.e., simulated data set 1 was less complex than simulated data set 2). The various approaches also differed in their ability to address collinearity or correlated variables, interaction between exposures, and model assumptions. Second, many methods included some form of variable and data reduction or transformation, either prior to or while conducting the analysis. Third, there is a need to define specific types of scientific questions and hypotheses related to chemical mixtures that can be addressed by epidemiologic studies (Braun et al. 2016a). More specifically, a statistical method should be chosen based upon a specific scientific question, and the use of complementary methods should be considered when exploring scientific hypotheses. The fact that no one statistical approach appeared to perform better than another may be related to the fact that the organizers did not initially pose specific study questions for the analysis. In addition, the way in which the outcomes of interest were conceptualized and analyzed varied among the participants. The organizers designed the data sets with the assumption of prediction, such that the correct and incorrect answers could be easily determined, and it appears that most of the workshop participants also assumed a predictive model approach because no specific hypotheses were identified. Fourth, the limitations of the data analyzed in this workshop must be recognized and addressed to the extent possible. The simulated data sets contained continuous exposure variables, but real-world data often include categorical data. The data sets also contained a restricted number of observations (small sample size), limiting statistical power for some methods, as well as other issues inherent in many epidemiologic studies (e.g., co-pollutant correlation, paucity of relevant toxicology data, and insufficient information on potential confounding variables). These issues may be addressed with larger detailed data sets (e.g., larger and more complex simulated data sets and consortium-based or pooled data); existing statistical strategies (e.g., imputation); more collaboration between epidemiologists, toxicologists, and statisticians; or the development of novel methods.

Conclusions and Future Directions

This workshop and its format were unique and novel in the field of environmental chemical mixtures and epidemiology, because participants were asked to conduct statistical analyses of specific model data sets and to compare their results to other statistical approaches. Based on the attendance, number of abstracts submitted, and enthusiastic discussion at the workshop, this format was successful in bringing statisticians and epidemiologists together to work on a common problem. The questions participants were asked to address in their analyses helped focus the discussion on the desired outcome, specifically, “Which exposures contributed to the outcome?” and “Are there any that did not?” Based on the results from the presentations and abstracts, a significant amount of variability between the methods was evident. Therefore, a useful future activity would be to systematically characterize the variation in results across methods that are sufficiently comparable with respect to effect estimation and statistical significance. Based on the workshop discussions and comparisons across currently used methods, further development of methods is needed to adequately determine the health effects of mixtures and combined exposures. We encourage the ongoing use of the posted simulated data sets, to facilitate collaboration across environmental health disciplines and improve our understanding of the health impacts of chemical mixtures.

Kyla W. Taylor,1 Bonnie R. Joubert,1 Joe M. Braun,2 Caroline Dilworth,1 Chris Gennings,3 Russ Hauser,4 Jerry J. Heindel,1 Cynthia V. Rider,1 Thomas F. Webster,5 and Danielle J. Carlin1

1 National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, Research Triangle Park, North Carolina, USA
2 Department of Epidemiology, Brown University, Providence, Rhode Island, USA
3 Department of Preventive Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
4 Departments of Environmental Health and Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
5 Department of Environmental Health, Boston University School of Public Health, Boston, Massachusetts, USA

Address correspondence to K.W. Taylor, National Institute of Environmental Health Sciences, P.O. Box 12233, MD K2-04, Research Triangle Park, North Carolina 27709 USA. Telephone: (919) 316-4707. E-mail: kyla.taylor@niehs.nih.gov

This work was supported by the following NIEHS grants: R00 ES020346 and R01 ES024381.

J.M.B. was financially compensated for conducting a re-analysis of a study of child lead exposure for the plaintiffs in a public nuisance case related to childhood lead poisoning. None of these activities were directly related to the present study. The other authors declare they have no actual or potential competing financial interests.

Note to readers with disabilities: EHP strives to ensure that all journal content is accessible to all readers. However, some figures and Supplemental Material published in EHP articles may not conform to 508 standards due to the complexity of the information being presented. If you need assistance accessing journal content, please contact ehponline@niehs.nih.gov. Our staff will work with you to assess and meet your accessibility needs within 3 working days.

References

Aylward LL, Kirman CR, Schoeny R, Portier CJ, Hays SM. 2013. Evaluation of biomonitoring data from the CDC National Exposure Report in a risk assessment context: perspectives across chemicals. Environ Health Perspect 121(3):287–294, doi: 10.1289/ehp.1205740.
Bayley N. 1969. Bayley Scales of Infant Development. 1st Edition. San Antonio, TX:Psychological Corporation.
Braun JM, Gennings C, Hauser R, Webster TF. 2016a. What can epidemiological studies tell us about the impact of chemical mixtures on human health? Environ Health Perspect 124(1):A6–A9, doi: 10.1289/ehp.1510569.
Braun JM, Kalloo G, Chen A, Dietrich KN, Liddy-Hicks S, Morgan YX, et al. 2016b. Cohort profile: Health Outcomes and Measures of the Environment (HOME) study. Int J Epidemiol, doi: 10.1093/ije/dyw006.
Carlin DJ, Rider CV, Woychik R, Birnbaum LS. 2013. Unraveling the health effects of environmental mixtures: an NIEHS priority. Environ Health Perspect 121(1):A6–A8, doi: 10.1289/ehp.1206182.
CDC (Centers for Disease Control and Prevention). 2012. Fourth National Report on Human Exposure to Environmental Chemicals, Updated Tables, September 2012. http://www.cdc.gov/exposurereport/pdf/FourthReport_UpdatedTables_Sep2012.pdf [accessed 1 December 2014].
Claus Henn B, Coull BA, Wright RO. 2014. Chemical mixtures and children’s health. Curr Opin Pediatr 26(2):223–229.
Exley K, Aerts D, Biot P, Casteleyn L, Kolossa-Gehring M, Schwedler G, et al. 2015. Pilot study testing a European human biomonitoring framework for biomarkers of chemical exposure in children and their mothers: experiences in the UK. Environ Sci Pollut Res Int 22(20):15821–15834.
Frederiksen H, Jensen TK, Jorgensen N, Kyhl HB, Husby S, Skakkebaek NE, et al. 2014. Human urinary excretion of non-persistent environmental chemicals: an overview of Danish data collected between 2006 and 2012. Reproduction 147(4):555–565.
Goodson WH III, Lowe L, Carpenter DO, Gilbertson M, Manaf Ali A, Lopez de Cerain Salsamendi A, et al. 2015. Assessing the carcinogenic potential of low-dose exposures to chemical mixtures in the environment: the challenge ahead. Carcinogenesis 36(suppl 1):S254–S296.
Grandjean P, Landrigan PJ. 2014. Neurobehavioural effects of developmental toxicity. Lancet Neurol 13(3):330–338.
Howard GJ, Webster TF. 2013. Contrasting theories of interaction in epidemiology and toxicology. Environ Health Perspect 121(1):1–6, doi: 10.1289/ehp.1205889.
Johns DO, Stanek LW, Walker K, Benromdhane S, Hubbell B, Ross M, et al. 2012. Practical advancement of multipollutant scientific and risk assessment approaches for ambient air pollution. Environ Health Perspect 120(9):1238–1242.
NIEHS (National Institute of Environmental Health Sciences). 2012. 2012–2017 Strategic Plan. Advancing Science, Improving Health: A Plan for Environmental Health Research. Publication No. 12-7935. http://www.niehs.nih.gov/about/strategicplan/index.cfm [accessed 11 October 2016].
NIEHS. 2015. Statistical Approaches for Assessing Health Effects of Environmental Chemical Mixtures in Epidemiology Studies [Workshop], 13–14 July 2015, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina. http://www.niehs.nih.gov/about/events/pastmtg/2015/statistical/ [accessed 11 October 2016].

Environmental Health Perspectives, volume 124, number 12, December 2016, A227–A229
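As an illustration of the design described under The Data Sets, the following is a minimal sketch of how a data set with the stated features of simulated data set 1 might be generated: n = 500, seven correlated log-normal exposures, one binary confounder, differing directions of effect, a non-linear term, an interaction, and normally distributed noise. The workshop's actual generating model is not given in the text, so every coefficient, the shared-factor correlation structure, and the names simulate_mixture_data and pearson are assumptions made for illustration only.

```python
import math
import random


def simulate_mixture_data(n=500, n_exposures=7, rho=0.7, seed=0):
    """Simulate a toy cohort loosely shaped like simulated data set 1.

    All coefficients are hypothetical; the workshop's true
    data-generating model is not described in the article.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0   # binary confounder
        shared = rng.gauss(0.0, 1.0)         # latent factor inducing correlation
        # Log-scale exposures: shifted by the confounder, correlated via `shared`.
        log_x = [
            0.5 * z
            + math.sqrt(rho) * shared
            + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(n_exposures)
        ]
        x = [math.exp(v) for v in log_x]     # log-normal exposures
        y = (
            0.4 * log_x[0]                   # positive effect
            - 0.3 * log_x[1]                 # effect in the opposite direction
            + 0.2 * log_x[2] ** 2            # non-linear exposure-response
            + 0.25 * log_x[0] * log_x[3]     # interaction between two exposures
            + 1.0 * z                        # confounder also affects the outcome
            + rng.gauss(0.0, 1.0)            # unexplained (random) variation
        )
        rows.append({"y": y, "z": z, "x": x})
    return rows


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


data = simulate_mixture_data()
log_x1 = [math.log(row["x"][0]) for row in data]
log_x2 = [math.log(row["x"][1]) for row in data]
print("correlation of log-exposures 1 and 2:", round(pearson(log_x1, log_x2), 2))
```

Because the confounder shifts every exposure and also enters the outcome, an analysis that ignores it would attribute part of its effect to the exposures, which is the kind of challenge the workshop data sets were designed to pose.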


Publisher
Pubmed Central
ISSN
0091-6765
eISSN
1552-9924
DOI
10.1289/EHP547
They were asked to address the following qualitative and (ordinary least squares) quantitative questions in their analyses: Visualization, structural equation modeling (SEM), Classification and prediction • Which exposures potentially contributed to the outcome? Are and principal component analysis (PCA) there any that did not? (qualitative). Informed sparse PCA and segmented regression Classification and prediction • How much did the exposures potentially contribute to the Bayesian g-formula Classification and prediction outcome? (quantitative). PCA Classification and prediction Classification and regression trees (CART) Classification and prediction • Was there evidence of “interaction?” Be explicit with your Bayesian profile regression Classification and prediction definition of interaction [toxicologists, epidemiologists, and Random forest Classification and prediction biostatisticians tend to think about this quite differently Multivariate adaptive regression splines (MARS) Classification and prediction (Howard and Webster 2013)]. Bayesian non-parametric regression Classification and prediction • What was the effect of joint and cumulative exposure to the Bayesian additive regression trees (BART) and Classification and prediction mixture? (qualitative). negative sparse PCA (NSPCA) • What is the estimate of the function Y = f(X ,…,X )? Conformal predictions Classification and prediction 1 p (quantitative). Bayesian kernel machine regression (BKMR) Exposure–response surface estimation Workshop participants were also asked to provide specific details Building Bayesian networks Exposure–response surface about their methodologies and how their assumptions may have estimation influenced the results. 
These included providing a basic overview Exposure surface smoothing (ESS) Exposure–response surface of the method(s) used, the rationale for using their approach(es), estimation any transformation or preparation of the data necessary to using the Modes of action (results presented for Z = 0 strata) Other approach(es), and assumptions inherent to the approach(es) and Feasible solution algorithm (FSA) Other built into the model (e.g., departures from linearity, dose–response Exploratory data analysis (EDA) Other Novel approach and least-angle regression (LARS) Variable selection shapes, interactions, modifiers, and different potencies for exposures). Machine learning Variable selection Participants were also asked to include information about the statis- Two-step variable selection and least absolute Variable selection tical software they used and to provide the statistical code in their shrinkage and selection operator (LASSO) analysis (e.g., R, SAS). They were encouraged to state whether or not Two-step shrinkage-based regression Variable selection they used an existing package or procedure and identify if they had Factor mixture models Variable selection to signica fi ntly modify an existing package or procedure, or develop Subset and bootstrap Variable selection completely new code. The statistical code submitted by participants Variable selection regression (VSR) Variable selection Bayesian estimation of weighted sum Variable shrinkage strategies is available on the workshop’s web site (http://www.niehs.nih.gov/ Shrinkage methods (LASSO/LARS) Variable shrinkage strategies about/visiting/events/pastmtg/2015/statistical/). 
Weighted quantile sum regression (WQS) Variable shrinkage strategies Finally, participants were asked to compare the outcomes of their LASSO Variable shrinkage strategies analyses to the correct answers (i.e., truth) associated with the two | | A 228 volume 124 number 12 December 2016 • Environmental Health Perspectives Brief Communication than simulated data set 2). The various approaches also differed in National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, Research Triangle Park, North Carolina, their ability to address collinearity or correlated variables, interaction USA; Department of Epidemiology, Brown University, Providence, Rhode Island, between exposures, and model assumptions. Second, many methods USA; Department of Preventive Medicine, Icahn School of Medicine at Mount included some form of variable and data reduction or transformation, 4 Sinai, New York, New York, USA; Departments of Environmental Health and either prior to or while conducting the analysis. Third, there is a need Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA; Department of Environmental Health, Boston University School of Public to define specific types of scientific questions and hypotheses related Health, Boston, Massachusetts, USA to chemical mixtures that can be addressed by epidemiologic studies Address correspondence to K.W. Taylor, National Institute of Environmental (Braun et al. 2016a). More specifically, a statistical method should Health Sciences, P.O. Box 12233, MD K2-04, Research Triangle Park, North be chosen based upon a specific scientific question, and the use of Carolina 27709 USA. Telephone: (919) 316-4707. E-mail: kyla.taylor@niehs.nih.gov complementary methods should be considered when exploring scien- This work was supported by the following NIEHS grants: R00 ES020346 and R01 ES024381. tific hypotheses. 
The fact that no one statistical approach appeared to J.M.B. was financially compensated for conducting a re-analysis of a study of perform better than another may be related the fact that the organizers child lead exposure for the plaintiffs in a public nuisance case related to childhood did not initially pose specific study questions for the analysis. In lead poisoning. None of these activities were directly related to the present study. addition, the way in which the outcomes of interest were conceptu- The other authors declare they have no actual or potential competing financial alized and analyzed varied among the participants. The organizers interests. designed the data sets with the assumption of prediction, such that Note to readers with disabilities: EHP strives to ensure that all journal content is accessible to all readers. However, some figures and Supplemental Material the correct and incorrect answers could be easily determined, and it published in EHP articles may not conform to 508 standards due to the complexity appears that most of the workshop participants also assumed a predic- of the information being presented. If you need assistance accessing journal content, tive model approach because no specific hypotheses were identified. please contact ehponline@niehs.nih.gov. Our staff will work with you to assess and Fourth, the limitations of the data analyzed in this workshop must meet your accessibility needs within 3 working days. be recognized and addressed to the extent possible. The simulated data sets contained continuous exposure variables, but real-world data often include categorical data. The data sets also contained a restricted RefeRences number of observations (small sample size) limiting statistical power Aylward LL, Kirman CR, Schoeny R, Portier CJ, Hays SM. 2013. 
Evaluation of biomonitoring data for some methods as well as other issues inherent in many epide- from the CDC National Exposure Report in a risk assessment context: perspectives across miologic studies (e.g., co-pollutant correlation, paucity of relevant chemicals. Environ Health Perspect 121(3):287–294, doi: 10.1289/ehp.1205740. toxicology data, and insufficient information on potential confounding Bayley N. 1969. Bayley Scales of Infant Development. 1st Edition. San Antonio, TX:Psychological Corporation. variables). These issues may be addressed with larger detailed data sets Braun JM, Gennings C, Hauser R, Webster TF. 2016a. What can epidemiological studies tell (e.g., larger and more complex simulated data sets and consortium- us about the impact of chemical mixtures on human health? Environ Health Perspect based or pooled data); existing statistical strategies (e.g., imputa- 124(1):A6–A9, doi: 10.1289/ehp.1510569. Braun JM, Kalloo G, Chen A, Dietrich KN, Liddy-Hicks S, Morgan YX, et al. 2016b. Cohort profile: tion); more collaboration between epidemiologists, toxicologists, and Health 0utcomes and Measures of the Environment (HOME) study. Int J Epidemiol, doi: statisticians; or the development of novel methods. 10.1093/ije/dyw006. Carlin DJ, Rider CV, Woychik R, Birnbaum LS. 2013. Unraveling the health effects of environ- Conclusions and Future Directions mental mixtures: an NIEHS priority. Environ Health Perspect 121(1):A6–A8, doi: 10.1289/ ehp.1206182. This workshop and its format were unique and novel in the field of CDC (Centers for Disease Control and Prevention). 2012. Fourth National Report on Human environmental chemical mixtures and epidemiology, because partici- Exposure to Environmental Chemicals, Updated Tables, September 2012. http://www.cdc. gov/exposurereport/pdf/FourthReport_UpdatedTables_Sep2012.pdf [accessed 1 December pants were asked to conduct statistical analyses of specific model data 2014]. 
sets and to compare their results to other statistical approaches. Based Claus Henn B, Coull BA, Wright RO. 2014. Chemical mixtures and children’s health. Curr Opin on the attendance, number of abstracts submitted, and enthusiastic Pediatr 26(2):223–229. discussion at the workshop, this format was successful in bringing stat- Exley K, Aerts D, Biot P, Casteleyn L, Kolossa-Gehring M, Schwedler G, et  al. 2015. Pilot study testing a European human biomonitoring framework for biomarkers of chemical isticians and epidemiologists together to work on a common problem. exposure in children and their mothers: experiences in the UK. Environ Sci Pollut Res Int The questions participants were asked to address in their analyses 22(20):15821–15834. helped focus the discussion on the desired outcome— specifically, Frederiksen H, Jensen TK, Jorgensen N, Kyhl HB, Husby S, Skakkebaek NE, et al. 2014. Human urinary excretion of non-persistent environmental chemicals: an overview of Danish data “Which exposures contributed to the outcome?” and “Are there collected between 2006 and 2012. Reproduction 147(4):555–565. any that did not?” Based on the results from the presentations and Goodson WH III, Lowe L, Carpenter DO, Gilbertson M, Manaf Ali A, Lopez de Cerain Salsamendi abstracts, a significant amount of variability between the methods was A, et  al. 2015. Assessing the carcinogenic potential of low-dose exposures to chemical mixtures in the environment: the challenge ahead. Carcinogenesis 36(suppl 1):S254–S296. evident. Therefore, a useful future activity would be to systematically Grandjean P, Landrigan PJ. 2014. Neurobehavioural effects of developmental toxicity. Lancet characterize the variation in results across methods that are sufficiently Neurol 13(3):330–338. comparable to effect estimation and statistical significance. Based Howard GJ, Webster TF. 2013. Contrasting theories of interaction in epidemiology and toxi- cology. 
Environ Health Perspect 121(1):1–6, doi: 10.1289/ehp.1205889. on the workshop discussions and comparisons across currently used Johns DO, Stanek LW, Walker K, Benromdhane S, Hubbell B, Ross M, et  al. 2012. Practical methods, further development of methods is needed to adequately advancement of multipollutant scientific and risk assessment approaches for ambient air determine the health effects of mixtures and combined exposures. We pollution. Environ Health Perspect 120(9):1238–1242. NIEHS (National Institute of Environmental Health Sciences). 2012. 2012–2017 Strategic encourage the ongoing use of the posted simulated data sets, to facili- Plan. Advancing Science, Improving Health: A Plan for Environmental Health Research. tate collaboration across environmental health disciplines and improve Publication No. 12-7935. http://www.niehs.nih.gov/about/strategicplan/index.cfm [accessed our understanding of the health impacts of chemical mixtures. 11 October 2016]. NIEHS. 2015. Statistical Approaches for Assessing Health Effects of Environmental Chemical 1 1 2 1 Kyla W. Taylor, Bonnie R. Joubert, Joe M. Braun, Caroline Dilworth, Mixtures in Epidemiology Studies [Workshop], 13–14 July 2015, National Research of 3 4 1 1 Chris Gennings, Russ Hauser, Jerry J. Heindel, Cynthia V. Rider, Environmental Health Sciences, Research Triangle Park, North Carolina. http://www.niehs. 5 1 Thomas F. Webster, and Danielle J. Carlin nih.gov/about/events/pastmtg/2015/statistical/ [accessed 11 October 2016]. | | Environmental Health Perspectives • volume 124 number 12 December 2016 A 229
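The general structure described for simulated data set 1 under "The Data Sets" (correlated log-normal exposures, a binary confounder, effects in differing directions, a non-linear term, an interaction, and normally distributed noise) can be illustrated with a small sketch. This is not the organizers' generating mechanism, which is documented on the workshop web site and is not reproduced here. The function name `simulate_mixture_data`, the correlation value `rho`, the coefficients in `beta`, and the particular non-linear and interaction terms are all hypothetical choices made only for illustration.

```python
import math
import random


def simulate_mixture_data(n=500, rho=0.7, seed=0,
                          beta=(0.5, -0.3, 0.0, 0.0, 0.4, 0.0, -0.2)):
    """Return (exposures, confounder, outcome) for one synthetic cohort.

    All numeric values are illustrative assumptions, not workshop values.
    """
    rng = random.Random(seed)
    exposures, confounder, outcome = [], [], []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0      # binary confounder
        shared = rng.gauss(0.0, 1.0)            # latent factor that correlates exposures
        # Log-scale exposures: shifted by z (so z is associated with exposure)
        # and sharing a common factor (pairwise correlation near rho within z strata).
        log_x = [
            0.5 * z
            + math.sqrt(rho) * shared
            + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in beta
        ]
        signal = sum(b * lx for b, lx in zip(beta, log_x))  # effects with mixed signs
        signal += 0.2 * log_x[0] ** 2                       # assumed non-linear term
        signal += 0.3 * log_x[0] * log_x[1]                 # assumed interaction term
        y = 1.0 * z + signal + rng.gauss(0.0, 1.0)          # confounder effect plus noise
        exposures.append([math.exp(lx) for lx in log_x])    # log-normal exposures
        confounder.append(z)
        outcome.append(y)
    return exposures, confounder, outcome


exposures, confounder, outcome = simulate_mixture_data()
```

Because the exposures share a latent factor and are also shifted by the confounder, a single-chemical regression that omits the confounder or the other exposures would be expected to give biased coefficients on data like these, which is the kind of difficulty the workshop questions were designed to expose.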

Journal

Environmental Health Perspectives (PubMed Central)

Published: Dec 1, 2016
