Commentary: The seven plagues of epigenetic epidemiology

Bastiaan T Heijmans (1,2)* and Jonathan Mill (3)

(1) Molecular Epidemiology, Leiden University Medical Center, Leiden, The Netherlands, (2) Netherlands Consortium for Healthy Ageing, Leiden, The Netherlands and (3) Institute of Psychiatry, King’s College London, London, UK

*Corresponding author. Molecular Epidemiology, Leiden University Medical Center, Postal Zone S-5-P, PO Box 9600, 2300 RC, Leiden, The Netherlands. E-mail: bas.heijmans@lumc.nl

Accepted 1 December 2011

International Journal of Epidemiology 2012;41:74–78. doi:10.1093/ije/dyr225. Advance Access publication 23 January 2012. Published by Oxford University Press on behalf of the International Epidemiological Association. The Author 2012; all rights reserved.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Epigenetics is being increasingly combined with epidemiology to add mechanistic understanding to associations observed between environmental, genetic and stochastic factors and human disease phenotypes. Currently, epigenetic epidemiological studies primarily focus on exploring if and where the epigenome (i.e. the overall epigenetic state of a cell) is influenced by specific environmental exposures like prenatal nutrition,[1] sun exposure[2] and smoking.[3] In this issue of IJE, Nada Borghol et al. report an association between childhood socio-economic status (SES) and differential DNA methylation in adulthood. Low SES may integrate diverse and heterogeneous environmental influences, and knowing which epigenetic changes are associated with low SES may provide clues about the biological processes underlying its health consequences. The authors stress that their study is preliminary. This statement is, in fact, to a greater or lesser extent applicable to the entire first wave of studies currently being published that likewise aim to discover associations between epigenetic variation measured on a genome-wide scale and environmental exposures or disease phenotypes. When executing such epigenome-wide association studies (EWASs), every epigenetic epidemiologist is struggling with the same biological, technical and methodological issues. It is important to take these into consideration when designing a study and interpreting the results. Let us consider seven of those issues, taking the current study on SES as a starting point.

We do not really know where to look, or what to look for

Most epigenetic epidemiological studies focus on DNA methylation for various practical and biological reasons, neglecting other layers of the epigenome, like histone modifications, that are also likely to be important in influencing disease phenotypes. Our basic understanding of the methylome (i.e. the whole of DNA methylation marks on the genome) is in its infancy, and we are still learning about the specific localization of the features that, when differentially methylated, regulate gene expression and are thus relevant for epigenetic epidemiologists to study. The current study, like many others, evaluated promoter regions, in this case defined as 1000 bp upstream to 250 bp downstream of transcription start sites. Although these features are often enriched for DNA methylation marks influencing the expression of genes, recent work suggests that other regions of the methylome outside of promoters, including inter-genic CpG island shores and intra-genic CpG islands, may ultimately be more important for regulating phenotypic variation.

For any differentially methylated region identified in EWASs it will be important to demonstrate functionality. Promoter methylation in the current study was integrated with public gene expression data and, as expected, highly expressed genes were more commonly flanked by less methylated promoters and vice versa. A limitation is that this observation is for groups of promoters, whereas information is needed about this relationship for individual promoters. Mining the reference epigenomes and transcriptomes that are being generated for different cell types under the umbrella of initiatives such as the National Institutes of Health (NIH) Epigenomics Roadmap and the International Human Epigenome Consortium may contribute to such information. Additional in vitro experiments will be required to evaluate the transcriptional effects of differential DNA methylation at a specific locus independent of its genomic context.

We have to rely on imperfect technology

The good news is that recent advances in genomic technology mean that genome-scale studies of DNA methylation across multiple samples are now feasible. In practice, however, one has to compromise between coverage and precision in epidemiological studies, which likely incorporate a large number of samples. A large (and growing) number of methods exist for assessing DNA methylation both genome wide and at specific CpG sites, and one problem relates to our inability to compare results across studies that have used different platforms. On the one hand there are methods such as that used in the current study in which the methylated portion of the genome is captured using antibodies against methylated DNA and subsequently quantified using microarrays or next-generation sequencing. These approaches can provide coverage across most of the genome and may be optimally suited to discriminate low from high methylation, but have lower reliability for smaller differences and are biased by factors such as CG density.[12,13] On the other hand, there are methods based on the bisulphite conversion of DNA combined with next-generation sequencing that provide higher accuracy and single nucleotide resolution. Although whole-genome bisulphite sequencing is currently unfeasible to use across large epidemiological cohorts, the method can be adapted to target a reduced representation of the genome (approximately 3 million out of approximately 28 million CG dinucleotides in the human genome).[12,13] The recently launched Illumina 450k Methylation BeadChip may offer a balance between coverage and precision, which will be attractive for epidemiological EWASs executed during the next few years. It interrogates DNA methylation at over 480 000 CG dinucleotides, is high-throughput and relatively affordable. The precision of this platform appears to compare well with some of the other platforms,[12,13] but these results should be interpreted with caution. Although correlation coefficients reported across the various platform comparisons are high, they are mainly driven by the fact that the large majority of the genome is either unmethylated or fully methylated, and substantial discrepancies between platforms may exist for intermediate level methylation.[12,14] Therefore, the technological validation of findings using an independent method remains important. This will be feasible for a small number of ‘top hits’, like the three procadherin promoters assessed in the current study. However, validating the outcomes of the complex pathway analyses performed to implicate either entire biological processes (such as extra- and intra-cellular signalling in the current study) or genomic features with a specific function in gene regulation [e.g. promoters, enhancers, inter/intragenic CG island (shores), etc.] is more demanding and currently not realized. Validating the results of such gene-set testing methods will entail the re-assessment of DNA methylation across large sets of loci.

We may be limited by available sample sizes that are optimal for epigenetic epidemiology

The current study investigated only 40 individuals. Investigators will be able to secure budgets for larger studies as empirical data increasingly highlight the value of epigenetic epidemiology, and high-throughput, economical laboratory approaches become more widely adopted. Nevertheless, it is unlikely that the simple brute-force approach that has been used relatively successfully in genome-wide association studies (GWASs) is valid for EWASs. In genetics, many of the epidemiological principles about designing studies with respect to selection biases, confounding, batch effects and appropriateness of controls could largely be replaced by the simple rule ‘bigger-is-better’. This is not true for epigenetic epidemiology, because the epigenome is not a static entity like the genome, which necessitates the use of more conventional epidemiological approaches. Further complicating matters is the fact that, for the most powerful study designs in epigenetic epidemiology (including studies of discordant monozygotic twins,[16] particularly when longitudinally sampled,[17] early exposure studies with long-term follow-up, and studies of specific cell types), the number of eligible individuals for whom relevant biological materials were stored in existing epidemiological cohorts is often limited, and it will be difficult to scale up analyses to include the thousands of samples that may be required for establishing robust associations with disease phenotypes. Moving forward, it will be important to establish cause and effect in epigenetic epidemiology; disease-associated differentially methylated regions may arise prior to illness and contribute to the disease phenotype, or could be a secondary effect of the disease process or of the medications used in treatment. Furthermore, maximum information will be obtained from epidemiological studies that are able to integrate epigenomic information with genomic, transcriptomic and proteomic data obtained from the same samples.

Whatever we do, it may never be enough to fully account for epigenetic differences between tissues and cells

In many respects, large comprehensively phenotyped and longitudinally sampled epidemiological studies, like the 1958 British birth cohort used in the current study, are an ideal resource for epigenetic epidemiology. In nearly all of these studies, however, whole blood is the only biological material that has been archived. Blood is a heterogeneous tissue and any DNA methylation difference between groups could be confounded by differences in the cellular composition of whole blood samples, for example, resulting from the immune response to sub-clinical infection. The good news is that fewer DNA methylation differences than perhaps expected exist between leucocyte types, and controlling for cellular heterogeneity may be possible in biobanks with a simple blood cell count.[20] Whether the latter is sufficient (and under which circumstances it is not), however, remains to be established. Epigenomic studies of separate cell types such as those being undertaken by the NIH Epigenomic Roadmap Initiative and the European Union Blueprint consortium are currently generating reference epigenomes of haematopoietic cells that will be of great utility in this regard. When moving beyond associations with environmental exposures to epigenetic associations with phenotypes, a key question for epigenetic epidemiology concerns the extent to which easily accessible peripheral tissues (such as blood) can be used to ask questions about inter-individual phenotypic variation manifest in inaccessible tissues such as the brain, visceral fat and other internal organs and tissues. Cross-tissue comparisons of the methylome within the same individual are currently underway to establish the relationship between epigenetic patterns in blood and those in other tissues. Although these analyses are crucial, the results may not be generally applicable; higher inter-tissue concordance may be present for DNA methylation changes induced early in development (and potentially propagated soma-wide) than for changes occurring during ageing that are more likely to remain tissue specific.[19,21] Efforts to obtain biopsies (subcutaneous fat, muscle, etc.) and post-mortem material in subsets of longitudinal biobanks will greatly increase their value for epigenetic studies, despite the problems associated with cellular heterogeneity that also hold for such samples.

We may be trying to detect inherently small effect sizes using these sub-optimal methods and sample cohorts

The main findings in the current study concerned DNA methylation differences at three procadherin promoters. The extent of the difference at these promoters was similar to those commonly observed in other recent studies, namely 5%, and was most apparent for a single, nominally statistically significant CG dinucleotide in each region. The biological implications of such small alterations in DNA methylation in terms of gene expression and function are unknown. Although DNA methylation is recognized as one of the most stable epigenetic marks, it is still relatively dynamic and this has important implications for epigenetic epidemiology. The randomness of maintaining and mitotically transmitting DNA methylation patterns may potentially dilute the putative epigenetic signatures of an adverse exposure early in life (e.g. to low SES in childhood) observed decades later. Of note, recent studies indicate that DNA methylation patterns in leucocytes undergo considerable changes during the first years of life. Thus, on top of the previously discussed question of whether DNA methylation at a specific locus actually influences transcriptional activity, researchers should also aim to establish whether the small DNA methylation differences often observed between groups (either expressed as absolute difference, relative difference or relative to the variation in the population) translate into differences in gene expression in the relevant tissue. It will be of particular interest to see whether the effects of such modest differences, while perhaps of little consequence individually, may shift transcription of a biological process or functional network when they co-occur with other changes to the methylome. Little is known about the actual scale and extent of between-individual variation in DNA methylation across the genome. In this regard, public genome-scale resources need to be created that document inter-individual differences in DNA methylation and gene expression, in addition to the reference epigenomes that are currently being generated.

We lack a framework for the analysis of genome-wide epigenetic data

The results of GWASs are relatively easy to judge. Quality-control steps are well-defined and reported, individually testing every genetic variant [i.e. single nucleotide polymorphism (SNP)] is straightforward, and levels of genome-wide statistical significance are clear. For EWASs, the analytical methodology is very much under construction. For example, in the current study it was not possible to attain genome-wide levels of significance, which is acceptable for an exploratory study, but makes it difficult to fully interpret the reported differences. Because of the vast range of methods currently being used to assess DNA methylation, meta-analyses across different studies are difficult. The adoption of a common technology platform, such as the new Illumina 450k Methylation BeadChip, across multiple studies would provide an excellent opportunity to converge on widely accepted guidelines for the analysis and integration of EWAS data. Apart from pre-processing procedures (quality control, normalization, handling different probe types, accounting for genetic variation, etc.), elements of these guidelines should deal with the analysis of individual CG dinucleotides vs groups of (correlated) adjacent CGs, the use of genome annotations in the analysis (histone states, promoter types, CG content, etc.), and levels of epigenome-wide significance for various analyses. An important aspect will be the exploration of the previously mentioned gene-set testing methods in the context of DNA methylation, since they will be vital to obtain meaningful interpretations of genome-wide data in terms of underlying biological processes or genomic functions [e.g. promoters, enhancers, inter/intragenic CG island (shores), etc.]. For example, commonly used enrichment methods assume independence within a gene set and, apart from consistency in biological signal in a gene set, statistical significance may reflect consistency in other characteristics such as GC content, coverage or other sequence features. Alternative implementations of gene-set testing methods include global testing approaches. Finally, it will be important to adopt an integrative paradigm based on the combination of genetic and epigenetic epidemiological data. Of particular relevance in this respect is evidence for the widespread occurrence of allele-specific DNA methylation (ASM) across the genome. Recent studies have shown that there are considerable inter-individual differences in ASM, which are frequently associated with genetic variation but can also be mediated by genomic imprinting (i.e. the parent-of-origin dependent silencing of expression by epigenetic mechanisms), environmental influences and apparently stochastic factors in the cell.[27,28] ASM can mask the effect of risk alleles by silencing their expression, and also provides a potential mechanism underlying gene–environment interactions. Furthermore, ASM may contribute towards the apparent ‘missing heritability’ of many complex diseases and the low penetrance often reported for SNPs identified by GWASs.

We have to manage high expectations

There is considerable interest in epigenetic research in the popular press. The current study is a vivid illustration: even though the authors deem it preliminary, it was widely covered by the media. Epigenetics should avoid some of the hype that surrounded the early days of genetic epidemiology. After the draft human genome sequence was announced in 2001, it was widely perceived that we would soon understand the causes of most common diseases and how to treat them. This expectation was not realistic, but was not always renounced by geneticists. Currently, many scientists outside the field are disappointed by the results of human genetics, and in particular GWASs, despite their overall considerable success. Genetic epidemiology has proven to be harder than expected despite the favourable starting point of thousands of Mendelian diseases and the high heritabilities associated with most traits to be explained. Very much like genetics, epigenetics will not be able to deliver the miracles it is sometimes claimed it will.

In conclusion, epigenetic epidemiology is early in its development and susceptible to new ideas and approaches. Only a few years ago empirical papers were greatly outnumbered by reviews. Now, reference epigenomes are produced at great pace (see http://epigenomeatlas.org).[8,9] Moreover, furthered by pilot studies like the one from Nada Borghol et al., the outline of the infrastructure required for EWASs is emerging. Crucial elements include optimal study designs, benchmarking technology and data analysis approaches that are statistically and biologically sound. An additional key aspect to the successful design and interpretation of epigenetic epidemiological studies will be the creation of public genome-scale resources focusing on inter-individual variation incorporating epigenomic, DNA sequence and transcriptomic data. Education, hard work and a certain degree of luck will get us there, not very different to the remedy against low SES.

Funding

NGI/NWO (#93518027, to B.T.H.); NGI/NWO-funded Netherlands Consortium for Healthy Ageing (NCHA) (#05060810, B.T.H.); NIH grant (AG036039, to J.M.).

Acknowledgement

We thank Elmar Tobi for his comments.

Conflict of interest: None declared.

References

1. Tobi EW, Lumey LH, Talens RP et al. DNA methylation differences after exposure to prenatal famine are common and timing- and sex-specific. Hum Mol Genet 2009;18:4046–53.
2. Gronniger E, Weber B, Heil O et al. Aging and chronic sun exposure cause distinct epigenetic changes in human skin. PLoS Genet 2010;6:e1000971.
3. Breitling LP, Yang R, Korn B, Burwinkel B, Brenner H. Tobacco-smoking-related differential DNA methylation: 27K discovery and replication. Am J Hum Genet 2011;88:450–57.
4. Borghol N, Suderman M, McArdle W et al. Associations with early-life socio-economic position in adult DNA methylation. Int J Epidemiol 2012;41:62–74.
5. Rakyan VK, Down TA, Balding DJ, Beck S. Epigenome-wide association studies for common human diseases. Nat Rev Genet 2011;12:529–41.
6. Irizarry RA, Ladd-Acosta C, Wen B et al. The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat Genet 2009;41:178–86.
7. Deaton AM, Webb S, Kerr AR et al. Cell type-specific DNA methylation at intragenic CpG islands in the immune system. Genome Res 2011;21:1074–86.
8. Bernstein BE, Stamatoyannopoulos JA, Costello JF et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat Biotechnol 2010;28:1045–8.
9. Anonymous. Moving AHEAD with an international human epigenome project. Nature 2008;454:711–5.
10. Klug M, Rehli M. Functional analysis of promoter CpG methylation using a CpG-free luciferase reporter vector. Epigenetics 2006;1:127–30.
11. Laird PW. Principles and challenges of genomewide DNA methylation analysis. Nat Rev Genet 2010;11:191–203.
12. Bock C, Tomazou EM, Brinkman AB et al. Quantitative comparison of genome-wide DNA methylation mapping technologies. Nat Biotechnol 2010;28:1106–14.
13. Harris RA, Wang T, Coarfa C et al. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications. Nat Biotechnol 2010;28:1097–105.
14. Sandoval J, Heyn HA, Moran S et al. Validation of a DNA methylation microarray for 450,000 CpG sites in the human genome. Epigenetics 2011;6:692–702.
15. Relton CL, Davey Smith G. Epigenetic epidemiology of common complex disease: prospects for prediction, prevention, and treatment. PLoS Med 2010;7:e1000356.
16. Dempster EL, Pidsley R, Schalkwyk LC et al. Disease-associated epigenetic changes in monozygotic twins discordant for schizophrenia and bipolar disorder. Hum Mol Genet 2011. doi:10.1093/hmg/ddr416 [Epub 9 September 2011].
17. Wong CC, Caspi A, Williams B et al. A longitudinal study of epigenetic variation in twins. Epigenetics 2010;5:516–26.
18. Ollikainen M, Smith KR, Joo EJ et al. DNA methylation analysis of multiple tissues from newborn twins reveals both genetic and intrauterine components to variation in the human neonatal epigenome. Hum Mol Genet 2010;19:4176–88.
19. Heijmans BT, Tobi EW, Lumey LH, Slagboom PE. The epigenome: archive of the prenatal environment. Epigenetics 2009;4:526–31.
20. Talens RP, Boomsma DI, Tobi EW et al. Variation, patterns, and temporal stability of DNA methylation: considerations for epigenetic epidemiology. FASEB J 2010;24:3135–44.
21. Thompson RF, Atzmon G, Gheorghe C et al. Tissue-specific dysregulation of DNA methylation in aging. Aging Cell 2010;9:506–18.
22. Martino DJ, Tulic MK, Gordon L et al. Evidence for age-related and individual-specific changes in DNA methylation profile of mononuclear cells during early immune development in humans. Epigenetics 2011;6:1085–94.
23. Stoger R. The thrifty epigenotype: an acquired and heritable predisposition for obesity and diabetes? Bioessays 2008;30:156–66.
24. Goeman JJ, Buhlmann P. Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics 2007;23:980–87.
25. Goeman JJ, van de Geer SA, de Kort F, van Houwelingen HC. A global test for groups of genes: testing association with a clinical outcome. Bioinformatics 2004;20:93–99.
26. Meaburn EL, Schalkwyk LC, Mill J. Allele-specific methylation in the human genome: implications for genetic studies of complex disease. Epigenetics 2010;5:578–82.
27. Shoemaker R, Deng J, Wang W, Zhang K. Allele-specific methylation is prevalent and is contributed by CpG-SNPs in the human genome. Genome Res 2010;20:883–89.
28. Schalkwyk LC, Meaburn EL, Smith R et al. Allelic skewing of DNA methylation is widespread across the genome. Am J Hum Genet 2010;86:196–212.
29. Kong A, Steinthorsdottir V, Masson G et al. Parental origin of sequence variants associated with complex diseases. Nature 2009;462:868–74.
30. Coghlan A. Childhood poverty leaves its marks on adult genetics. New Scientist 2011. [Epub 26 October 2011]; http://www.newscientist.com/article/dn20255-childhood-poverty-leaves-its-mark-on-adult-genetics.html (15 November 2011, date last accessed).

International Journal of Epidemiology, Volume 41 (1) – Jan 23, 2012

Publisher
Pubmed Central
Copyright
Published by Oxford University Press on behalf of the International Epidemiological Association © The Author 2012; all rights reserved.
ISSN
0300-5771
eISSN
1464-3685
DOI
10.1093/ije/dyr225
On the other hand, there are methods entity like the genome, which necessitates the use based on the bisulphite conversion of DNA combined of more conventional epidemiological approaches. with next-generation sequencing that provide higher Further complicating matters is the fact that, for the accuracy and single nucleotide resolution. Although most powerful study designs in epigenetic epidemi- whole-genome bisulphite sequencing is currently un- ology (including studies of discordant monozygotic 16 17 feasible to use across large epidemiological cohorts, twins particularly when longitudinally sampled, the method can be adapted to target a reduced repre- early exposure studies with long-term follow-up, sentation of the genome (approximately 3 million out of and studies of specific cell types ), the number of approximately 28 million CG dinucleotides in the eligible individuals for whom relevant biological 76 INTERNATIONAL JOURNAL OF EPIDEMIOLOGY materials were stored in existing epidemiological co- changes induced early in development (and potential- horts were often limited, and it will be difficult to ly propagated soma-wide) than for changes occur scale-up analyses to include the thousands of samples during ageing that are more likely to remain tissue 19,21 that may be required for establishing robust associ- specific. Efforts to obtain biopsies (subcutaneous ations with disease phenotypes. Moving forward, it fat, muscle, etc.) and post-mortem material in subsets will be important to establish cause and effect in epi- of longitudinal biobanks will greatly increase their genetic epidemiology; disease-associated differentially value for epigenetic studies, despite the problems methylated regions may arise prior to illness and con- associated with cellular heterogeneity that also hold tribute to the disease phenotype or could be a second- for such samples. ary effect of the disease process, or the medications used in treatment. 
Furthermore, maximum infor- mation will be obtained from epidemiological studies that are able to integrate epigenomic information We may be trying to detect with genomic, transcriptomic and proteomic data ob- tained from the same samples. inherently small effect sizes using these sub-optimal methods and sample cohorts Whatever we do, it may never The main findings in the current study concerned be enough to fully account for DNA methylation differences at three procadherin promoters. The extent of the difference at these pro- epigenetic differences between moters was similar to those commonly observed in tissues and cells other recent studies, namely 5%, and was most ap- In many respects, large comprehensively phenotyped parent for a single, nominally statistically significant and longitudinally sampled epidemiological studies, CG dinucleotide in each region. The biological impli- like the 1958 British birth cohort used in the current cations of such small alterations in DNA methylation study, are an ideal resource for epigenetic epidemi- in terms of gene expression and function are un- ology. In nearly all of these studies, however, whole known. Although DNA methylation is recognized as blood is the only biological material that has been one of the most stable epigenetic marks, it is still archived. Blood is a heterogeneous tissue and any relatively dynamic and this has important implica- DNA methylation difference between groups could tions for epigenetic epidemiology. The randomness be confounded by differences in the cellular compos- of maintaining and mitotically transmitting DNA ition of whole blood samples, for example, resulting methylation patterns may potentially dilute the puta- from the immune response to sub-clinical infection. tive epigenetic signatures of an adverse exposure early The good news is that fewer than perhaps expected in life (e.g. to low SES in childhood) observed dec- DNA methylation differences exist between leucocyte ades later. 
Of note, recent studies indicate that DNA types, and controlling for cellular heterogeneity may methylation patterns in leucocytes undergo consider- be possible in biobanks with a simple blood cell 20 able changes during the first years of life. Thus on count. Whether the latter is sufficient (and under top of the previously discussed question of whether which circumstances it is not), however, remains to DNA methylation at a specific locus actually influ- be established. Epigenomic studies of separate cell ences transcriptional activity, researchers should also types such as those being undertaken by the NIH aim to establish whether the small DNA methylation Epigenomic Roadmap Initiative and the European differences often observed between groups—either ex- Union Blueprint consortium are currently generating pressed as absolute difference, relative difference or reference epigenomes of haematopoietic cells that will relative to the variation in the population—translate be of great utility in this regard. When moving into differences in gene expression in the relevant beyond associations with environmental exposures tissue. It will be of particular interest to see whether to epigenetic associations with phenotypes, a key the effects of such modest differences, while perhaps question for epigenetic epidemiology concerns the of little consequence individually, may shift transcrip- extent to which easily accessible peripheral tissues tion of a biological process or functional network (such as blood) can be used to ask questions about when they co-occur with other changes to the inter-individual phenotypic variation manifest in in- methylome. Little is known about the actual scale accessible tissues such as the brain, visceral fat and and extent of between-individual variation in DNA other internal organs and tissues. Cross-tissue com- methylation across the genome. 
In this regard, parisons of the methylome within the same individual public genome-scale resources need to be created are currently underway to establish the relationship that document inter-individual differences in DNA between epigenetic patterns in blood with other tis- methylation and gene expression, in addition to the sues. Although these analyses are crucial, the results may not be generally applicable; higher inter-tissue reference epigenomes that are currently being concordance may be present for DNA methylation generated. THE SEVEN PLAGUES OF EPIGENETIC EPIDEMIOLOGY 77 interactions. Furthermore, ASM may contribute to- We lack a framework for the wards the apparent ‘missing heritability’ of many analysis of genome-wide epigenetic complex diseases and the low penetrance often re- ported for SNPs identified by GWASs. data The results of GWASs are relatively easy to judge. Quality-control steps are well-defined and reported, We have to manage high individually testing every genetic variant [i.e. single nucleotide polymorphism (SNP)] is straightforward, expectations and levels of genome-wide statistical significance are There is a considerable interest in epigenetic research clear. For EWASs, the analytical methodology is very in the popular press. The current study is a vivid much under construction. For example, in the current illustration: even though the authors deem it prelim- study it was not possible to attain genome-wide levels inary, it was widely covered by the media. of significance, which is acceptable for an exploratory Epigenetics should avoid some of the hype that sur- study, but makes it difficult to fully interpret the re- rounded the early days of genetic epidemiology. After ported differences. Because of the vast range of meth- the draft human genome sequence was announced in ods currently being used to assess DNA methylation, 2001, it was widely perceived that we would soon meta-analyses across different studies are difficult. 
understand the causes of most common diseases The adoption of a common technology platform, and how to treat them. This expectation was not real- such as the new Illumina 450 k Methylation istic, but not always renounced by geneticists. Beadchip, across multiple studies would provide an Currently, many scientists outside the field are disap- excellent opportunity to converge on widely accepted pointed by results of human genetics, and in particu- guidelines for the analysis and integration of EWAS lar GWASs, despite their overall considerable success. data. Apart from pre-processing procedures (quality Genetic epidemiology has proven to be harder than control, normalization, handling different probe expected despite the favourable starting point of thou- types, accounting for genetic variation, etc.), elements sands of Mendelian diseases and the high heritabil- of these guidelines should deal with the analysis of ities associated with most traits to be explained. Very individual CG dinucleotides vs groups of (correlated) much like genetics, epigenetics will not be able to adjacent CGs, the use of genome annotations in the deliver the miracles it is sometimes claimed it will. analysis (histone states, promoter types, CG content, In conclusion, epigenetic epidemiology is early in its etc.), and levels of epigenome-wide significance for development and susceptible to new ideas and appro- various analyses. An important aspect will be the ex- aches. Only a few years ago empirical papers were ploration of the previously mentioned gene-set testing greatly outnumbered by reviews. Now, reference epi- methods in the context of DNA methylation since genomes are produced at great pace (see http://epi- they will be vital to obtain meaningful interpretations 8,9 genomeatlas.org). Moreover, furthered by pilot of genome-wide data in terms of underlying biological studies like the one from Nada Borghol et al., the processes or genomic functions [e.g. 
promoters, en- outline of the infrastructure required for EWASs is hancers, inter/intragenic CG island (shores), etc.]. emerging. Crucial elements include optimal study de- For example, commonly used enrichment methods signs, benchmarking technology and data analysis assume independence within a gene set and, apart approaches that are statistically and biologically from consistency in biological signal in a gene set, sound. An additional key aspect to the successful statistical significance may reflect consistency in design and interpretation of epigenetic epidemiologic- other characteristics such as GC content, coverage or al studies will be the creation of public genome-scale other sequence features. Alternative implementa- resources focusing on inter-individual variation incor- tions of gene-set testing methods include global test- porating epigenomic, DNA sequence and transcrip- ing approaches. Finally, it will be important to adopt tomic data. Education, hard work and a certain an integrative paradigm based on the combination of degree of luck will get us there—not very different genetic and epigenetic epidemiological data. Of par- to the remedy against low SES. ticular relevance in this respect is evidence for the widespread occurrence of allele-specific DNA methy- lation (ASM) across the genome. Recent studies have Funding shown that there are considerable inter-individual dif- ferences in ASM, which are frequently associated with NGI/NWO (#93518027, to B.T.H.); NGI/NWO-funded genetic variation but can also be mediated by genomic Netherlands Consortium for Healthy Ageing (NCHA) imprinting (i.e. the parent-of-origin dependent silen- (#05060810, B.T.H.); NIH grant (AG036039, to J.M.). cing of expression by epigenetic mechanisms), envir- onmental influences and apparently stochastic factors 27,28 in the cell. 
ASM can mask the effect of risk alleles Acknowledgement by silencing their expression, and also provides a po- tential mechanism underlying gene–environment We thank Elmar Tobi for his comments. 78 INTERNATIONAL JOURNAL OF EPIDEMIOLOGY twins discordant for schizophrenia and bipolar disorder. Conflict of interest: None declared. Hum Mol Genet 2011. doi:10.1093/hmg/ddr416 [Epub 9 September 2011]. Wong CC, Caspi A, Williams B et al. A longitudinal References study of epigenetic variation in twins. Epigenetics 2010;5: 516–26. Tobi EW, Lumey LH, Talens RP et al. DNA methylation Ollikainen M, Smith KR, Joo EJ et al. DNA methylation differences after exposure to prenatal famine are common analysis of multiple tissues from newborn twins reveals and timing- and sex-specific. Hum Mol Genet 2009;18: both genetic and intrauterine components to variation in 4046–53. the human neonatal epigenome. Hum Mol Genet 2010;19: Gronniger E, Weber B, Heil O et al. Aging and chronic sun 4176–88. exposure cause distinct epigenetic changes in human Heijmans BT, Tobi EW, Lumey LH, Slagboom PE. The skin. PLoS Genet 2010;6:e1000971. epigenome: archive of the prenatal environment. Breitling LP, Yang R, Korn B, Burwinkel B, Brenner H. Epigenetics 2009;4:526–31. Tobacco-smoking-related differential DNA methylation: Talens RP, Boomsma DI, Tobi EW et al. Variation, pat- 27K discovery and replication. Am J Hum Genet 2011;88: terns, and temporal stability of DNA methylation: consid- 450–57. erations for epigenetic epidemiology. FASEB J 2010;24: Borghol N, Suderman M, McArdle W et al. Associations 3135–44. with early-life socio-economic position in adult DNA Thompson RF, Atzmon G, Gheorghe C et al. methylation. Int J Epidemiol 2012;41:62–74. Tissue-specific dysregulation of DNA methylation in Rakyan VK, Down TA, Balding DJ, Beck S. Epigenome- aging. Aging Cell 2010;9:506–18. wide association studies for common human diseases. Martino DJ, Tulic MK, Gordon L et al. 
Evidence for Nat Rev Genet 2011;12:529–41. age-related and individual-specific changes in DNA Irizarry RA, Ladd-Acosta C, Wen B et al. The human colon methylation profile of mononuclear cells during early cancer methylome shows similar hypo- and hypermethy- immune development in humans. Epigenetics 2011;6: lation at conserved tissue-specific CpG island shores. Nat 1085–94. Genet 2009;41:178–86. Stoger R. The thrifty epigenotype: an acquired and herit- Deaton AM, Webb S, Kerr AR et al. Cell type-specific DNA able predisposition for obesity and diabetes? Bioessays methylation at intragenic CpG islands in the immune 2008;30:156–66. system. Genome Res 2011;21:1074–86. Goeman JJ, Buhlmann P. Analyzing gene expression data Bernstein BE, Stamatoyannopoulos JA, Costello JF et al. in terms of gene sets: methodological issues. Bioinformatics The NIH Roadmap Epigenomics Mapping Consortium. 2007;23:980–87. Nat Biotechnol 2010;28:1045–8. Goeman JJ, van de Geer SA, de Kort F, van Anonymous. Moving AHEAD with an international Houwelingen HC. A global test for groups of genes: test- human epigenome project. Nature 2008;454:711–5. ing association with a clinical outcome. Bioinformatics Klug M, Rehli M. Functional analysis of promoter CpG 2004;20:93–99. methylation using a CpG-free luciferase reporter vector. Meaburn EL, Schalkwyk LC, Mill J. Allele-specific Epigenetics 2006;1:127–30. methylation in the human genome: implications for Laird PW. Principles and challenges of genomewide DNA genetic studies of complex disease. Epigenetics 2010;5: methylation analysis. Nat Rev Genet 2010;11:191–203. 578–82. Bock C, Tomazou EM, Brinkman AB et al. Quantitative Shoemaker R, Deng J, Wang W, Zhang K. Allele-specific comparison of genome-wide DNA methylation mapping methylation is prevalent and is contributed by CpG-SNPs technologies. Nat Biotechnol 2010;28:1106–14. in the human genome. Genome Res 2010;20:883–89. Harris RA, Wang T, Coarfa C et al. 
Comparison of Schalkwyk LC, Meaburn EL, Smith R et al. Allelic skewing sequencing-based methods to profile DNA methylation of DNA methylation is widespread across the genome. Am and identification of monoallelic epigenetic modifications. J Hum Genet 2010;86:196–212. Nat Biotechnol 2010;28:1097–105. Kong A, Steinthorsdottir V, Masson G et al. Parental Sandoval J, Heyn HA, Moran S et al. Validation of a DNA origin of sequence variants associated with complex dis- methylation microarray for 450,000 CpG sites in the eases. Nature 2009;462:868–74. human genome. Epigenetics 2011;6:692–702. Coghlan A. Childhood poverty leaves its marks on Relton CL, Davey Smith G. Epigenetic epidemiology of adult genetics. New Scientist 2011. [Epub 26 common complex disease: prospects for prediction, pre- October 2011]; http://www.newscientist.com/article/ vention, and treatment. PLoS Med 2010;7:e1000356. Dempster EL, Pidsley R, Schalkwyk LC et al. dn20255-childhood-poverty-leaves-its-mark-on-adult- Disease-associated epigenetic changes in monozygotic genetics.html (15 November 2011, date last accessed).

Journal

International Journal of Epidemiology, PubMed Central

Published: Jan 23, 2012

References