The RDP-II (Ribosomal Database Project)

© 2001 Oxford University Press. Nucleic Acids Research, 2001, Vol. 29, No. 1, 173–174

Bonnie L. Maidak, James R. Cole, Timothy G. Lilburn*, Charles T. Parker Jr, Paul R. Saxman, Ryan J. Farris, George M. Garrity, Gary J. Olsen, Thomas M. Schmidt and James M. Tiedje

Center for Microbial Ecology, 540 Plant and Soil Sciences Building, Michigan State University, East Lansing, MI 48824-1325, USA; Department of Microbiology, University of Illinois, B-103 C&LSL Building, 601 South Goodwin Avenue, Urbana, IL 61801-3714, USA; and Department of Microbiology and Molecular Genetics, Michigan State University, 294 Giltner Hall, East Lansing, MI 48824-1101, USA

Received October 2, 2000; Accepted October 4, 2000

*To whom correspondence should be addressed. Tel: +1 517 432 4998; Fax: +1 517 353 8957; Email: rdpstaff@msu.edu

ABSTRACT

The Ribosomal Database Project (RDP-II), previously described by Maidak et al. [Nucleic Acids Res. (2000), 28, 173–174], continued during the past year to add new rRNA sequences to the aligned data and to improve the analysis commands. Release 8.0 (June 1, 2000) consisted of 16 277 aligned prokaryotic small subunit (SSU) rRNA sequences while the number of eukaryotic and mitochondrial SSU rRNA sequences in aligned form remained at 2055 and 1503, respectively. The number of prokaryotic SSU rRNA sequences more than doubled from the previous release 14 months earlier, and ~75% are longer than 899 bp. An RDP-II mirror site in Japan is now available (http://wdcm.nig.ac.jp/RDP/html/index.html). RDP-II provides aligned and annotated rRNA sequences, derived phylogenetic trees and taxonomic hierarchies, and analysis services through its WWW server (http://rdp.cme.msu.edu/). Analysis services include rRNA probe checking, approximate phylogenetic placement of user sequences, screening user sequences for possible chimeric rRNA sequences, automated alignment, production of similarity matrices and services to plan and analyze terminal restriction fragment polymorphism experiments. The RDP-II email address for questions and comments has been changed from curator@cme.msu.edu to rdpstaff@msu.edu.

DESCRIPTION

The Ribosomal Database Project (RDP-II) provides data, programs and services related to ribosomal RNA sequences. This paper describes changes since the 2000 description (1). Details about specific analysis functions, data and available programs can be found at the WWW site (http://rdp.cme.msu.edu/).

Data

The ribosomal RNA sequences in the RDP-II alignments are mainly drawn from the major sequence repositories [GenBank (2), EMBL Data Library (3) and DDBJ (4)]. Release 8.0, June 1, 2000, contained 16 277 prokaryotic small subunit (SSU) rRNA sequences in aligned form with ~75% longer than 899 bp. Type strain status is marked for a sequence if it is determinable. The number of eukaryotic and mitochondrial SSU rRNA sequences in aligned form remains at 2055 and 1503. Besides the sequences from the aligned data, more than 10 000 additional sequences were added to create the unaligned data, bringing the total number to more than 30 000. The unaligned data are available for downloading and for analyses that do not require alignment.

The all-inclusive RDP phylogenetic tree has not been updated for Release 8.0 because its size precludes any utility and because it has become inaccurate. Instead, we have decided to build a hierarchical set of trees, with a single tree that encompasses the breadth of the prokaryotic sequence diversity at the top of the hierarchy (a so-called backbone tree) and subordinate trees that encompass less and less of the diversity as one moves down the hierarchy. The sequences represented in the subordinate trees are selected according to their position in the RDP Release 8.0 hierarchy. The backbone tree and 13 of these subordinate trees were calculated using the WEIGHBOR algorithm (5) for Release 8.0 and eventually all sequences in the RDP-II prokaryotic SSU rRNA alignment will be in one or more subordinate trees.

A new backbone phylogenetic tree for 217 prokaryotic SSU rRNA sequences was calculated using the WEIGHBOR algorithm (5). Additional trees using this approach for 13 smaller groups were also prepared for Release 8.0. Eventually, all sequences in the RDP-II prokaryotic SSU rRNA alignment will be in one or more of these smaller grouped trees.

To facilitate scientific research, RDP-II serves as a repository for alignments and masks used by authors in the preparation of phylogenetic trees. The availability of these alignments and masks supports the recalculation of published rRNA phylogenetic trees. These data are available for download from the RDP-II WWW (http://rdp.cme.msu.edu/) server.

Analysis services

A brief description of each analysis command available on the WWW server can be found in Table 1 from the Maidak et al. (1) description of the RDP-II or from the Documentation section of the RDP-II WWW server (http://www.cme.msu.edu/RDP/docs/documentation.html).

Visualization of large sets of sequence data

For some applications (e.g. the detection of sequencing or annotation errors, the definition of taxonomic boundaries and visualization of outliers) it is necessary to build models with a complete set of aligned sequences, rather than a small subset of sequences, drawn either at random or deliberately. However, current methods for constructing phylogenetic trees are inherently limited. Such methods are computationally too intensive and the output is too complex to permit accurate interpretation. To that end, in collaboration with the Bergey's Manual Trust, work on alternative means of visualizing extremely large sets of sequences using Principal Component Analysis (PCA) was initiated during 2000. Two-dimensional scatter plots using PCA are available in the Supplementary Material links.

New auxiliary WWW sites

The Center for Microbial Ecology WWW server now supports two additional WWW sites that contain data related to the RDP-II. The Biodegradative Strain Database (http://bsd.cme.msu.edu) provides corresponding microbiological data to complement and integrate the phylogenetic data of the RDP-II with the chemical and metabolic data of the University of Minnesota Biocatalysis/Biodegradation Database (http://www.labmed.umn.edu/umbbd/index.html) (6). The second auxiliary WWW site is rrndb (http://rrndb.cme.msu.edu), which provides information pertaining to the number of rRNA operons contained on prokaryotic genomes (7).

RDP-II CITATION AND ACCESS

Research assisted by any RDP-II service should cite: the Ribosomal Database Project (RDP-II) at the Michigan State University in East Lansing, Michigan; the release number; and this article. Please state which data, programs and services were used. The RDP-II data and analysis services can be found at URL: http://rdp.cme.msu.edu/. A mirror site is available at the Laboratory for Molecular Classification in the Center for Information Biology at the National Institute of Genetics (NIG), Japan (http://wdcm.nig.ac.jp/RDP/html/index.html). This new mirror site should provide better access to RDP-II for researchers in that part of the world.

The address for email correspondence with RDP-II staff is now rdpstaff@msu.edu. Those without access to email may contact the RDP-II staff via telephone (+1 517 432 4998), fax (+1 517 353 8957) or regular mail.

FUTURE CHANGES AND ADDITIONS

Several upgrades to the WWW analysis programs are planned for release in the near future. An improved sequence selection tool will allow searching and provide a graphical display of sequence completeness. A new analysis program will allow users to create phylogenetic trees incorporating RDP sequences along with their own data. In addition, Version 2.0 of the terminal restriction fragment polymorphism (T-RFLP) program (8) is under development. To keep abreast of the increasing volume of rRNA sequence data, we are evaluating changes in workflow, additional automation of annotation and more robust automated alignment procedures. These back-end changes should enable the RDP to provide timely release of rRNA data.

SUPPLEMENTARY MATERIAL

Additional material related to the RDP-II and described in the Supplementary Data section of this article at NAR Online consists of the following:
(i) a PDF file of a poster from the American Society for Microbiology (ASM) May 2000 meeting describing the RDP-II and some historical aspects of the RDP and RDP-II rRNA sequence data;
(ii) a PDF file of the new backbone phylogenetic tree of 217 SSU rRNA prokaryotic sequences;
(iii) a PDF file detailing the diversity found in RDP releases;
(iv) a PDF file of PCA two-dimensional scatter plots for prokaryotic SSU rRNA sequences (figure 5 of the ASM May 2000 poster, above).

ACKNOWLEDGEMENTS

We thank several individuals for their past contributions: Robin Gutell (and his colleagues), Niels Larsen, Tom Macke, Michael J. McCaughey, Ross Overbeek, Sakti Pramanik, Mitch L. Sogin and Carl R. Woese. The National Science Foundation's Science and Technology Center Program, the US Department of Energy Office of Science and the State of Michigan currently support RDP-II.

REFERENCES

1. Maidak,B.L., Cole,J.R., Lilburn,T.G., Parker,C.T.,Jr, Saxman,P.R., Stredwick,J.M., Garrity,G.M., Li,B., Olsen,G.J., Pramanik,S., Schmidt,T.M. and Tiedje,J.M. (2000) The RDP (Ribosomal Database Project) continues. Nucleic Acids Res., 28, 173–174.
2. Benson,D.A., Karsch-Mizrachi,I., Lipman,D.J., Ostell,J., Rapp,B.A. and Wheeler,D.L. (2000) GenBank. Nucleic Acids Res., 28, 15–18.
3. Baker,W., van den Broek,A., Camon,E., Hingamp,P., Sterk,P., Stoesser,G. and Tuli,M.A. (2000) The EMBL Nucleotide Sequence Database. Nucleic Acids Res., 28, 19–23.
4. Tateno,Y., Miyazaki,S., Ota,M., Sugawara,H. and Gojobori,T. (2000) DNA Data Bank of Japan (DDBJ) in collaboration with mass sequencing teams. Nucleic Acids Res., 28, 24–26.
5. Bruno,W.J., Socci,N.D. and Halpern,A.L. (2000) Weighted Neighbor Joining: a likelihood-based approach to distance-based phylogeny reconstruction. Mol. Biol. Evol., 17, 189–197.
6. Ellis,L.B.M., Hershberger,C.D. and Wackett,L.P. (2000) The University of Minnesota Biocatalysis/Biodegradation Database: microorganisms, genomics and prediction. Nucleic Acids Res., 28, 377–379.
7. Klappenbach,J.A., Saxman,P.R., Cole,J.R. and Schmidt,T.M. (2001) rrndb: the ribosomal RNA operon copy number database. Nucleic Acids Res., 29, 181–184.
8. Marsh,T.L., Saxman,P., Cole,J. and Tiedje,J. (2000) Terminal restriction fragment length polymorphism analysis program, a web-based research tool for microbial community analysis. Appl. Environ. Microbiol., 66, 3616–3620.
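
Illustrative note. The analysis services listed in the abstract include the production of similarity matrices from aligned sequences. The article does not say which similarity measure the RDP-II server uses, so the following is only a minimal sketch under stated assumptions: similarity is taken to be percent identity over alignment columns in which both sequences have a base, gaps are written as "-" or ".", and the names `percent_identity`, `similarity_matrix` and `toy_alignment` are hypothetical and not part of any RDP-II program.

```python
from itertools import combinations

# Assumption: these characters mark alignment gaps or missing data.
GAP_CHARS = {"-", "."}


def percent_identity(a, b):
    """Percent identity over columns where both aligned sequences have a base."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in GAP_CHARS or y in GAP_CHARS:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    # Return 0.0 when no column can be compared (an arbitrary choice here).
    return 100.0 * matches / compared if compared else 0.0


def similarity_matrix(alignment):
    """Build a symmetric name -> name -> percent identity mapping."""
    names = list(alignment)
    # The diagonal is set to 100.0 by convention.
    matrix = {name: {name: 100.0} for name in names}
    for p, q in combinations(names, 2):
        score = percent_identity(alignment[p], alignment[q])
        matrix[p][q] = score
        matrix[q][p] = score
    return matrix


# Toy aligned rRNA fragments (invented for illustration, not RDP-II data).
toy_alignment = {
    "seqA": "ACGU-ACGUA",
    "seqB": "ACGUUACGAA",
    "seqC": "AC-UUACGUA",
}
matrix = similarity_matrix(toy_alignment)
for name in toy_alignment:
    print(name, [round(matrix[name][other], 1) for other in toy_alignment])
```

Skipping gapped columns keeps partial sequences from being penalized for missing regions, which matters for a collection in which roughly a quarter of the prokaryotic sequences are 899 bp or shorter; other treatments of gaps are equally defensible.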

Publisher
Oxford University Press
ISSN
0305-1048
eISSN
1362-4962
DOI
10.1093/nar/29.1.173
Journal

Nucleic Acids Research, Oxford University Press

Published: Jan 1, 2001
