
Genetic Power Calculator: design of linkage and association genetic mapping studies of complex traits

Bioinformatics, Vol. 19 no. 1 2003, Pages 149–150. Applications Note.

S. Purcell (1,*), S.S. Cherny (2) and P.C. Sham (1)

(1) Social, Genetics and Developmental Psychiatry Research Centre, Institute of Psychiatry, King’s College London, De Crespigny Park, London SE5 8AF, UK
(2) Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK
(*) To whom correspondence should be addressed.

Received on May 16, 2002; revised on July 11, 2002; accepted on July 15, 2002

ABSTRACT

Summary: A website for performing power calculations for the design of linkage and association genetic mapping studies of complex traits.
Availability: The package is made available at http://statgen.iop.kcl.ac.uk/gpc/
Contact: s.purcell@iop.kcl.ac.uk

An essential first step in the planning of any scientific study is to assess how many samples must be collected in order to achieve sufficient power to detect the hypothesized effect. In human genetics this requirement is particularly salient, given the costs involved in phenotyping and genotyping individuals. Furthermore, it is virtually impossible to submit a grant proposal to study a particular disease or trait without inclusion of detailed power calculations to show that the proposed research is likely to succeed, provided there is indeed a gene to be found. After conducting a study, power analysis can also shed light on negative results by indicating whether the study was underpowered, or what the smallest detectable effect size would be given the actual sample size. The power of a study is the probability of successfully detecting an effect of a particular size: if β is the probability of a false-negative (type II) error, then power is 1 − β. Power depends on several factors: magnitude of effect, sample size, N, and required level of statistical significance, α (the false-positive, or type I, error rate). Although N and α are determined by the experimenter, many of the factors that contribute to the effect size are typically unknown. In order to compute power, we are therefore required to make assumptions regarding what we expect to find. For mapping loci, such factors include the proportion of variance explained by the trait locus, gene action, and marker heterozygosity and density. Although there is no shortage of statistical genetic literature to aid the researcher in performing such calculations (Cardon and Fulker, 1994; Carey and Williamson, 1991; Nance and Neale, 1989; Neale et al., 1994; Schmitz et al., 1998; Sham et al., 2000; Suarez et al., 1982) there are few software tools to make the task practical. The present paper describes an easy-to-use website which allows the researcher to quickly perform such necessary calculations.

Methods to map loci influencing complex traits fall into two broad classes: linkage and association. Linkage relies on correlating sharing of chromosomal segments among relatives with their similarity on a trait whereas association directly relates genotype to phenotype. Variance components models provide a powerful framework for both linkage and association mapping (Almasy and Blangero, 1998; Fulker and Cherny, 1996; Pratt et al., 2000; Fulker et al., 1999). It has been shown that maximum-likelihood variance components approaches to linkage mapping of quantitative trait loci (QTL), which utilize the full familial covariance structure, are more powerful than simple regression-based methods (Fulker and Cherny, 1996). Additionally, for association mapping, a powerful variance components approach has been presented (Fulker et al., 1999) which allows simultaneous modelling of linkage and association while controlling for population stratification effects in sibship data, by considering both between-sibship and within-sibship variation. This method has been made accessible by the release of QTDT (Abecasis et al., 2000a) and has been generalized to deal with extended families rather than just sibships (Abecasis et al., 2000b).

The computationally intensive approach to power calculation is to simulate hundreds or thousands of replicate samples under a specified set of population parameters. The proportion of replicates in which an effect is detected (the test statistic falling above a specified threshold) provides an estimate of power. Recently, however, closed-form analytic power equations have been presented for variance components methods of linkage and association mapping (Sham et al., 2000). The use of such equations greatly speeds up power calculation and allows a more comprehensive exploration of the parameter space (e.g. different models and sample types). The Genetic Power Calculator (GPC) implements these power equations and others, for both linkage and association methods using either qualitative (e.g. the presence or absence of a disease) or quantitative (e.g. a score on a personality inventory) traits.

Power for the variance components linkage test can be calculated for sibships of arbitrary size, under user-definable levels of the proportion of variance explained by the trait locus, acting additively and/or via dominance. The background residual sibling correlation can be varied, as can the polymorphism information content at the locus of interest, allowing accommodation of either twopoint or multipoint linkage. The output includes a table of power for various common α levels as well as a user-selected α level, and required sample size to achieve the user-selected desired level of power, suitable for direct inclusion in a grant proposal. In addition the noncentrality parameter is provided to facilitate power calculation for samples of variable-sized sibships. For tests of association, GPC uses the variance components test described by Fulker et al. (1999). The user is presented with a similar set of options as for the linkage test, along with association-specific options such as the extent of trait locus-marker locus linkage disequilibrium and allele frequencies. Output is presented for between-sibship, within-sibship and combined tests of association.

For testing association in discrete (disease) traits, tools are available for both the TDT test, which employs parents and a single affected offspring (Spielman et al., 1993) and the case-control design (Sham, 1998). The user is able to explore power under conditions of varying disease allele frequency, disease prevalence, and genotype relative risk. Again, output is similar to that described above. In addition, calculation of power for the quantitative TDT and quantitative case-control designs is also available. These study designs assume that cases and controls are defined as scoring above or below specific thresholds. Additional utilities made available on the website include two-locus linkage power calculations and a facility for calculating the potential informativeness of sibships for linkage, conditional on observed trait values. This index of informativeness provides a basis for efficient selective genotyping (Purcell et al., 2001). Presently, there is no software widely available to employ such tests, but the situation is likely to improve in the near future. We will attempt to add additional tools for estimating power to this website as additional methods of analysis are developed and software distributed for their implementation.

ACKNOWLEDGEMENTS

Supported in part by National Institutes of Health (USA) grant EY-12562, MRC components grant G9700821 (UK) and the Wellcome Trust.

REFERENCES

Abecasis,G.R., Cardon,L.R. and Cookson,W.O. (2000a) A general test of association for quantitative traits in nuclear families. Am. J. Hum. Genet., 66, 279–292.
Abecasis,G.R., Cookson,W.O. and Cardon,L.R. (2000b) Pedigree tests of transmission disequilibrium. Eur. J. Hum. Genet., 8, 545–
Almasy,L. and Blangero,J. (1998) Multipoint quantitative-trait linkage analysis in general pedigrees. Am. J. Hum. Genet., 62, 1198–1211.
Cardon,L.R. and Fulker,D.W. (1994) The power of interval mapping of quantitative trait loci, using selected sib pairs. Am. J. Hum. Genet., 55, 825–833.
Carey,G. and Williamson,J. (1991) Linkage analysis of quantitative traits: increased power by using selected samples. Am. J. Hum. Genet., 49, 786–796.
Fulker,D.W. and Cherny,S.S. (1996) An improved multipoint sib-pair analysis of quantitative traits. Behavior Genet., 26, 527–532.
Fulker,D.W., Cherny,S.S., Sham,P.C. and Hewitt,J.K. (1999) Combined linkage and association sib-pair analysis for quantitative traits. Am. J. Hum. Genet., 64, 259–267.
Nance,W.E. and Neale,M.C. (1989) Partitioned twin analysis: a power study. Behavior Genet., 19, 143–150.
Neale,M.C., Eaves,L.J. and Kendler,K.S. (1994) The power of the classical twin method to resolve variation in threshold traits. Behavior Genet., 24, 239–258.
Pratt,S.C., Daly,M.J. and Kruglyak,L. (2000) Exact multipoint quantitative-trait linkage analysis in pedigrees by variance components. Am. J. Hum. Genet., 66, 1153–1157.
Purcell,S., Cherny,S., Hewitt,J. and Sham,P. (2001) Optimal sibship selection for genotyping in quantitative trait locus linkage analysis. Human Heredity, 52, 1–13.
Schmitz,S., Cherny,S.S. and Fulker,D.W. (1998) Increase in power through multivariate analyses. Behavior Genet., 28, 357–363.
Sham,P. (1998) Statistics in Human Genetics, 1st edn, Arnold, London.
Sham,P.C., Cherny,S.S., Purcell,S. and Hewitt,J.K. (2000) Power of linkage versus association analysis of quantitative traits, by use of variance-components models, for sibship data. Am. J. Hum. Genet., 66, 1616–1630.
Spielman,R.S., McGinnis,R.E. and Ewens,W.J. (1993) Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am. J. Hum. Genet., 52, 506–516.
Suarez,B., O’Rourke,D. and Van Eerdewegh,P. (1982) Power of the affected-sib-pair method to detect disease susceptibility loci of small effect: an application to multiple sclerosis. Am. J. Med. Genet., 12, 309–326.
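The article contrasts two routes to power, simulation over many replicates and closed-form equations, and notes that a noncentrality parameter (NCP) can be reused for other sample compositions. The sketch below illustrates both routes for a generic two-sided 1-df chi-square test. It is not the GPC implementation and does not reproduce the equations of Sham et al. (2000); the function names, and the assumption that the NCP grows linearly with the number of sampled units, are illustrative choices made here.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_chi2_1df(ncp, alpha):
    """Analytic power of a 1-df chi-square test with noncentrality ncp.

    A noncentral 1-df chi-square variate equals (Z + sqrt(ncp))**2 for a
    standard normal Z, so power reduces to two normal tail areas.
    """
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = ncp ** 0.5
    upper = 1.0 - _STD_NORMAL.cdf(z_crit - shift)
    lower = _STD_NORMAL.cdf(-z_crit - shift)
    return upper + lower


def simulated_power(ncp, alpha, replicates=20000, seed=1):
    """Estimate the same power as the fraction of simulated replicates
    whose test statistic exceeds the critical threshold."""
    rng = random.Random(seed)
    threshold = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) ** 2
    shift = ncp ** 0.5
    hits = sum(
        (rng.gauss(0.0, 1.0) + shift) ** 2 > threshold
        for _ in range(replicates)
    )
    return hits / replicates


def required_sample_size(ncp_per_unit, alpha, target_power):
    """Smallest N reaching target_power, ASSUMING total NCP = N * ncp_per_unit
    (ncp_per_unit is a hypothetical per-family or per-individual NCP)."""
    hi = 1
    while power_chi2_1df(hi * ncp_per_unit, alpha) < target_power:
        hi *= 2
    lo = hi // 2  # power at lo is below target (or lo is 0)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if power_chi2_1df(mid * ncp_per_unit, alpha) >= target_power:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    alpha = 0.05
    for ncp in (2.0, 5.0, 10.0):
        print(ncp, round(power_chi2_1df(ncp, alpha), 3),
              round(simulated_power(ncp, alpha), 3))
    print(required_sample_size(0.01, alpha, 0.80))
```

The analytic function returns immediately, while the simulated estimate needs thousands of draws and still carries sampling error, which is the practical motivation the article gives for closed-form equations. With an NCP of zero the analytic power equals α, as expected for a test with no true effect.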


Bioinformatics, Volume 19 (1): 2 – Jan 1, 2003



Publisher
Oxford University Press
Copyright
© Oxford University Press 2003
ISSN
1367-4803
eISSN
1460-2059
DOI
10.1093/bioinformatics/19.1.149


Journal

Bioinformatics, Oxford University Press

Published: Jan 1, 2003
