  • Review Article

Strategies for the systematic sequencing of complex genomes

Abstract

Recent spectacular advances in the technologies and strategies for DNA sequencing have profoundly accelerated the detailed analysis of genomes from myriad organisms. The past few years alone have seen the publication of near-complete or draft versions of the genome sequence of several well-studied, multicellular organisms — most notably, the human. As well as providing data of fundamental biological significance, these landmark accomplishments have yielded important strategic insights that are guiding current and future genome-sequencing projects.

Key Points

  • The genome sequences of several eukaryotic organisms have been reported in recent years, including a yeast (Saccharomyces cerevisiae), a nematode (Caenorhabditis elegans), an insect (Drosophila melanogaster), a plant (Arabidopsis thaliana) and human (Homo sapiens).

  • These spectacular achievements have been associated with a range of technical advances in basic sequencing methodology, the automation of many of the key steps in the sequencing pipeline, the adoption of industrial-scale experimental protocols and the development of improved computational tools for sequence analysis.

  • The two main strategies used for sequencing large, complex genomes are clone-by-clone shotgun sequencing and whole-genome shotgun sequencing. Both approaches were used to generate the recently reported working draft human sequences.

  • In clone-by-clone sequencing, individual clones are selected from a contig map (a type of physical map) and each is then sequenced by a shotgun-sequencing strategy. In turn, the genome sequence is assembled by pasting together the sequences of the individual clones.

  • In whole-genome shotgun sequencing, the genome is broken into fragments of defined size classes, which are then cloned and used to generate sequence reads. In turn, the genome sequence is assembled from the entire collection of sequence reads.

  • Each of the two main strategies has strengths and weaknesses, and a hybrid strategy that involves both whole-genome and clone-by-clone shotgun-sequencing components is being adopted in many current projects. However, it remains to be determined how much sequencing should be done by each strategy when implementing a hybrid approach.

  • For new sequencing projects, it is also important to consider whether the genome needs to be sequenced to high accuracy or whether a draft-level sequence can provide the information that is required. This consideration will probably influence the choice and implementation of a particular sequencing strategy.

  • Sequencing the genome of a complex, multicellular eukaryote still poses massive technological challenges and requires substantial funding. Choosing the appropriate sequencing strategy is therefore a crucial step in any genome-sequencing project.
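
The assembly step described in the points above, in which overlapping sequence reads are merged into a contiguous sequence, can be illustrated with a toy sketch. The Python below is not the algorithm used by Phrap or any production assembler. It is a minimal greedy overlap merge that assumes error-free reads from a single strand, no repeats and no read wholly contained in another. The function names (`overlap`, `greedy_assemble`), the 20-base example sequence and the `min_len` threshold are hypothetical choices made for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if the longest such match is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the longest overlap.

    Toy assumptions: error-free reads, one strand, no repeats,
    no read contained inside another.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                k = overlap(a, b, min_len)
                if k > best_k:
                    best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: the contigs stay separate
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical 20-base "genome" and four overlapping reads sliced from it.
GENOME = "ATGCGTACGTTAGCCTAGGA"
READS = [GENOME[0:8], GENOME[5:13], GENOME[10:18], GENOME[14:20]]

if __name__ == "__main__":
    print(greedy_assemble(READS))
```

For this small example the four reads collapse into a single contig identical to the starting sequence. Real genomes violate each of the stated assumptions, and repeated sequences in particular can produce false overlaps that a greedy merge of this kind cannot resolve.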

Figure 1: Two main shotgun-sequencing strategies.
Figure 2: Sequence-ready BAC contig map.
Figure 3: Main steps in clone-by-clone shotgun sequencing.
Figure 4: Shotgun-sequence assembly.
Figure 5: Long-range sequence assembly in whole-genome shotgun sequencing.
Figure 6: Hybrid shotgun-sequencing approach.

References

  1. Green, E. D. in The Metabolic and Molecular Bases of Inherited Disease (eds Scriver, C. R. et al.) 259–298 (McGraw–Hill, New York, 2001).

  2. Sanger, F., Nicklen, S. & Coulson, A. R. DNA sequencing with chain-terminating inhibitors. Proc. Natl Acad. Sci. USA 74, 5463–5467 (1977). Reports the Nobel prize-winning method developed by Fred Sanger and colleagues for sequencing DNA, called dideoxy chain termination sequencing. Roughly 25 years later, this continues to be the state-of-the-art technique for large-scale DNA sequencing.

  3. Smith, L. M. et al. Fluorescence detection in automated DNA sequence analysis. Nature 321, 674–679 (1986). First notable description of fluorescence-based DNA sequencing and the use of automated instrumentation for detecting the sequencing reaction products.

  4. Hunkapiller, T., Kaiser, R. J., Koop, B. F. & Hood, L. Large-scale and automated DNA sequence determination. Science 254, 59–67 (1991).

  5. Mullikin, J. C. & McMurray, A. A. Sequencing the genome, fast. Science 283, 1867–1868 (1999).

  6. Meldrum, D. R. Sequencing genomes and beyond. Science 292, 515–516 (2001).

  7. Tabor, S. & Richardson, C. C. A single residue in DNA polymerases of the Escherichia coli DNA polymerase I family is critical for distinguishing between deoxy- and dideoxyribonucleotides. Proc. Natl Acad. Sci. USA 92, 6339–6343 (1995).

  8. Prober, J. M. et al. A system for rapid DNA sequencing with fluorescent chain-terminating dideoxynucleotides. Science 238, 336–341 (1987).

  9. Ju, J., Ruan, C., Fuller, C. W., Glazer, A. N. & Mathies, R. A. Fluorescence energy transfer dye-labeled primers for DNA sequencing and analysis. Proc. Natl Acad. Sci. USA 92, 4347–4351 (1995).

  10. Rosenblum, B. B. et al. New dye-labeled terminators for improved DNA sequencing patterns. Nucleic Acids Res. 25, 4500–4504 (1997).

  11. Metzker, M. L., Lu, J. & Gibbs, R. A. Electrophoretically uniform fluorescent dyes for automated DNA sequencing. Science 271, 1420–1422 (1996).

  12. Lee, L. G. et al. New energy transfer dyes for DNA sequencing. Nucleic Acids Res. 25, 2816–2822 (1997).

  13. Meldrum, D. Automation for genomics. I. Preparation for sequencing. Genome Res. 10, 1081–1092 (2000).

  14. Meldrum, D. Automation for genomics. II. Sequencers, microarrays, and future trends. Genome Res. 10, 1288–1303 (2000).

  15. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001). Landmark paper about the initial sequence of the human genome generated by the public Human Genome Project using a clone-by-clone shotgun-sequencing strategy.

  16. Ewing, B., Hillier, L., Wendl, M. C. & Green, P. Base-calling of automated sequencer traces using Phred. I. Accuracy assessment. Genome Res. 8, 175–185 (1998).

  17. Ewing, B. & Green, P. Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8, 186–194 (1998).

  18. Gordon, D., Abajian, C. & Green, P. Consed: a graphical tool for sequence finishing. Genome Res. 8, 195–202 (1998). The most commonly used computer programs for carrying out base calling, sequence assembly and viewing of sequence assemblies are Phred (references 16 and 17), Phrap and Consed (reference 18), respectively. Reference 20 describes an important extension of Consed (a program called Autofinish) that automates some of the key steps in sequence finishing.

  19. Bonfield, J. K., Smith, K. F. & Staden, R. A new DNA sequence assembly program. Nucleic Acids Res. 23, 4992–4999 (1995).

  20. Gordon, D., Desmarais, C. & Green, P. Automated finishing with Autofinish. Genome Res. 11, 614–625 (2001).

  21. Olson, M. & Green, P. A 'quality-first' credo for the Human Genome Project. Genome Res. 8, 414–415 (1998).

  22. Felsenfeld, A., Peterson, J., Schloss, J. & Guyer, M. Assessing the quality of the DNA sequence from The Human Genome Project. Genome Res. 9, 1–4 (1999).

  23. Huang, G. M. High-throughput DNA sequencing: a genomic data manufacturing process. DNA Seq. 10, 149–153 (1999).

  24. Wendl, M. C., Dear, S., Hodgson, D. & Hillier, L. Automated sequence preprocessing in a large-scale sequencing environment. Genome Res. 8, 975–984 (1998).

  25. Dedhia, N. N. & McCombie, W. R. Kaleidaseq: a web-based tool to monitor data flow in a high throughput sequencing facility. Genome Res. 8, 313–318 (1998).

  26. Lawrence, C. B. et al. The Genome Reconstruction Manager: a software environment for supporting high-throughput DNA sequencing. Genomics 23, 192–201 (1994).

  27. Kimmel, B. E., Palazzolo, M. J., Martin, C. H., Boeke, J. D. & Devine, S. E. in Genome Analysis: A Laboratory Manual. 1. Analyzing DNA (eds Birren, B. et al.) 455–532 (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1997).

  28. Church, G. M. & Kieffer-Higgins, S. Multiplex DNA sequencing. Science 240, 185–188 (1988).

  29. Cherry, J. L. et al. Enzyme-linked fluorescent detection for automated multiplex DNA sequencing. Genomics 20, 68–74 (1994).

  30. Smith, D. R. et al. Multiplex sequencing of 1.5 Mb of the Mycobacterium leprae genome. Genome Res. 7, 802–819 (1997).

  31. Gardner, R. C. et al. The complete nucleotide sequence of an infectious clone of cauliflower mosaic virus by M13mp7 shotgun sequencing. Nucleic Acids Res. 9, 2871–2888 (1981).

  32. Anderson, S. Shotgun DNA sequencing using cloned DNase I-generated fragments. Nucleic Acids Res. 10, 3015–3027 (1981).

  33. Sanger, F., Coulson, A. R., Hong, G. F., Hill, D. F. & Petersen, G. B. Nucleotide sequence of the bacteriophage lambda DNA. J. Mol. Biol. 162, 729–773 (1982). References 31–33 represent some of the earliest papers that reported the use of shotgun sequencing as a strategy for establishing the sequence of large pieces of DNA.

  34. Deininger, P. L. Random subcloning of sonicated DNA: application to shotgun DNA sequence analysis. Anal. Biochem. 129, 216–223 (1983).

  35. Messing, J. The universal primers and the shotgun DNA sequencing method. Methods Mol. Biol. 167, 13–31 (2001).

  36. Ansorge, W., Voss, H. & Zimmermann, J. (eds) DNA Sequencing Strategies (Wiley & Sons, Inc., New York, 1997).

  37. Adams, M. D., Fields, C. & Venter, J. C. (eds) Automated DNA Sequencing and Analysis (Academic, Inc., San Diego, 1994).

  38. Spurr, N. K., Young, B. D. & Bryant, S. P. (eds) ICRF Handbook of Genome Analysis (Blackwell Science Ltd, Oxford, 1998).

  39. Green, E. D., Birren, B., Klapholz, S., Myers, R. M. & Hieter, P. (eds) Genome Analysis: A Laboratory Manual Vols 1–4 (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1997).

  40. Goffeau, A. et al. The yeast genome directory. Nature 387, S1–S105 (1997). Describes the genome sequence of the first eukaryotic organism, the yeast Saccharomyces cerevisiae, by a collection of numerous sequencing groups (large and small) around the world.

  41. The C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282, 2012–2018 (1998). Reports the genome sequence of the first multicellular organism, the nematode worm Caenorhabditis elegans, by the sequencing groups at Washington University and the Sanger Centre.

  42. Wilson, R. K. & Mardis, E. R. in Genome Analysis: A Laboratory Manual. 1. Analyzing DNA (eds Birren, B. et al.) 397–454 (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1997).

  43. Olson, M., Hood, L., Cantor, C. & Botstein, D. A common language for physical mapping of the human genome. Science 245, 1434–1435 (1989).

  44. Vollrath, D. in Genome Analysis: A Laboratory Manual. 4. Mapping Genomes (eds Birren, B. et al.) 187–215 (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1999).

  45. Burke, D. T., Carle, G. F. & Olson, M. V. Cloning of large segments of exogenous DNA into yeast by means of artificial chromosome vectors. Science 236, 806–812 (1987).

  46. Green, E. D., Hieter, P. & Spencer, F. A. in Genome Analysis: A Laboratory Manual. 3. Cloning Systems (eds Birren, B. et al.) 297–565 (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1998).

  47. Shizuya, H. et al. Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia coli using an F-factor-based vector. Proc. Natl Acad. Sci. USA 89, 8794–8797 (1992).

  48. Ioannou, P. A. et al. A new bacteriophage P1-derived vector for the propagation of large human DNA fragments. Nature Genet. 6, 84–89 (1994).

  49. Hudson, T. J. et al. An STS-based map of the human genome. Science 270, 1945–1954 (1995).

  50. Chumakov, I. M. et al. A YAC contig map of the human genome. Nature 377, 175–297 (1995).

  51. Bouffard, G. G. et al. A physical map of human chromosome 7: an integrated YAC contig map with average STS spacing of 79 kb. Genome Res. 7, 673–692 (1997).

  52. Nagaraja, R. et al. X chromosome map at 75-kb STS resolution, revealing extremes of recombination and GC content. Genome Res. 7, 210–222 (1997).

  53. Nusbaum, C. et al. A YAC-based physical map of the mouse genome. Nature Genet. 22, 388–393 (1999).

  54. Marra, M. A. et al. High throughput fingerprint analysis of large-insert clones. Genome Res. 7, 1072–1084 (1997). Approach for constructing sequence-ready BAC contig maps by restriction enzyme digest-based fingerprint analysis. This general method, which essentially represents an extension of earlier mapping techniques (for example, see references 57–59), has been used to generate BAC contig maps of the human, mouse, Arabidopsis thaliana and other genomes.

  55. Gregory, S. G., Howell, G. R. & Bentley, D. R. Genome mapping by fluorescent fingerprinting. Genome Res. 7, 1162–1168 (1997).

  56. Kohara, Y., Akiyama, K. & Isono, K. The physical map of the whole E. coli chromosome: application of a new strategy for rapid analysis and sorting of a large genomic library. Cell 50, 495–508 (1987).

  57. Olson, M. V. et al. Random-clone strategy for genomic restriction mapping in yeast. Proc. Natl Acad. Sci. USA 83, 7826–7830 (1986).

  58. Riles, L. et al. Physical maps of the six smallest chromosomes of Saccharomyces cerevisiae at a resolution of 2.6 kilobase pairs. Genetics 134, 81–150 (1993).

  59. Coulson, A., Sulston, J., Brenner, S. & Karn, J. Toward a physical map of the genome of the nematode Caenorhabditis elegans. Proc. Natl Acad. Sci. USA 83, 7821–7825 (1986). References 57–59 represent classic descriptions of restriction enzyme digest-based fingerprint analysis, as used to construct physical maps of the Saccharomyces cerevisiae and Caenorhabditis elegans genomes. In both cases, the resulting maps paved the way towards the sequencing of these genomes and provided key insight into the strategies required for mapping and sequencing the human genome.

  60. Marra, M. et al. A map for sequence analysis of the Arabidopsis thaliana genome. Nature Genet. 22, 265–270 (1999).

  61. Mozo, T. et al. A complete BAC-based physical map of the Arabidopsis thaliana genome. Nature Genet. 22, 271–275 (1999).

  62. The International Human Genome Mapping Consortium. A physical map of the human genome. Nature 409, 934–941 (2001). Paper reporting the BAC-based physical map of the human genome constructed by the Human Genome Project.

  63. Green, E. D. & Olson, M. V. Chromosomal region of the cystic fibrosis gene in yeast artificial chromosomes: a model for human genome mapping. Science 250, 94–98 (1990).

  64. McPherson, J. D. Sequence ready — or not? Genome Res. 7, 1111–1113 (1997).

  65. Edwards, A. et al. Automated DNA sequencing of the human HPRT locus. Genomics 6, 593–608 (1990).

  66. Chissoe, S. L. et al. Representation of cloned genomic sequences in two sequencing vectors: correlation of DNA sequence and subclone distribution. Nucleic Acids Res. 25, 2960–2966 (1997).

  67. Bouck, J., Miller, W., Gorrell, J. H., Muzny, D. & Gibbs, R. A. Analysis of the quality and utility of random shotgun sequencing at low redundancies. Genome Res. 8, 1074–1084 (1998).

  68. The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000). Describes the genome sequence of the first plant, Arabidopsis thaliana, by an international consortium of sequencing groups.

  69. Coulson, A., Waterston, R., Kiff, J., Sulston, J. & Kohara, Y. Genome linking with yeast artificial chromosomes. Nature 335, 184–186 (1988).

  70. Bentley, D. R. Decoding the human genome sequence. Hum. Mol. Genet. 9, 2353–2358 (2000).

  71. Waterston, R. & Sulston, J. E. The Human Genome Project: reaching the finish line. Science 282, 53–54 (1998).

  72. The Sanger Centre & The Washington University Genome Sequencing Center. Toward a complete human genome sequence. Genome Res. 8, 1097–1108 (1998).

  73. Bentley, D. R., Pruitt, K. D., Deloukas, P., Schuler, G. D. & Ostell, J. Coordination of human genome sequencing via a consensus framework map. Trends Genet. 14, 381–384 (1998).

  74. Dunham, I. et al. The DNA sequence of human chromosome 22. Nature 402, 489–495 (1999).

  75. The Chromosome 21 Mapping and Sequencing Consortium. The DNA sequence of human chromosome 21. Nature 405, 311–319 (2000). References 74 and 75 announce the completion of finished sequence for the first two human chromosomes, 22 and 21, respectively.

  76. The BAC Resource Consortium. Integration of cytogenetic landmarks into the draft sequence of the human genome. Nature 409, 953–958 (2001).

  77. Yu, A. et al. Comparison of human genetic and sequence-based physical maps. Nature 409, 951–953 (2001).

  78. Deloukas, P. et al. A physical map of 30,000 human genes. Science 282, 744–746 (1998).

  79. Olivier, M. et al. A high-resolution radiation hybrid map of the human genome draft sequence. Science 291, 1298–1302 (2001).

  80. Venter, J. C., Smith, H. O. & Hood, L. A new strategy for genome sequencing. Nature 381, 364–366 (1996).

  81. Mahairas, G. G. et al. Sequence-tagged connectors: a sequence approach to mapping and scanning the human genome. Proc. Natl Acad. Sci. USA 96, 9739–9744 (1999).

  82. Wendl, M. C. et al. Theories and applications for sequencing randomly selected clones. Genome Res. 11, 274–280 (2001).

  83. Fraser, C. M. & Fleischmann, R. D. Strategies for whole microbial genome sequencing and analysis. Electrophoresis 18, 1207–1216 (1997).

  84. Fleischmann, R. D. et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496–512 (1995). Publication reporting the genome sequence of the first prokaryotic organism, the bacterium Haemophilus influenzae. This effort was the first report of using a whole-genome shotgun-sequencing strategy to sequence the genome of a free-living organism.

  85. Fraser, C. M., Eisen, J. A. & Salzberg, S. L. Microbial genome sequencing. Nature 406, 799–803 (2000).

  86. Adams, M. D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185–2195 (2000).

  87. Myers, E. W. et al. A whole-genome assembly of Drosophila. Science 287, 2196–2204 (2000). References 86 and 87 report the initial genome sequence of Drosophila melanogaster generated by a hybrid strategy that involved both whole-genome shotgun sequencing and clone-by-clone shotgun sequencing. This project reflected a collaboration between the public Human Genome Project and Celera Genomics.

  88. Hoskins, R. A. et al. A BAC-based physical map of the major autosomes of Drosophila melanogaster. Science 287, 2271–2274 (2000).

  89. Weber, J. L. & Myers, E. W. Human whole-genome shotgun sequencing. Genome Res. 7, 401–409 (1997).

  90. Green, P. Against a whole-genome shotgun. Genome Res. 7, 410–417 (1997). References 89 and 90 provide point/counter-point perspectives that detail the opposing views on the use of a whole-genome shotgun-sequencing strategy for sequencing the human genome.

  91. Venter, J. C. et al. Shotgun sequencing of the human genome. Science 280, 1540–1542 (1998).

  92. Venter, J. C. et al. The sequence of the human genome. Science 291, 1304–1351 (2001). Landmark paper reporting the initial sequence of the human genome generated by Celera Genomics using a whole-genome shotgun-sequencing strategy in conjunction with available clone-by-clone data provided by the Human Genome Project.

  93. Bouck, J. B., Metzker, M. L. & Gibbs, R. A. Shotgun sample sequence comparisons between mouse and human genomes. Nature Genet. 25, 31–33 (2000).

  94. The International SNP Map Working Group. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409, 928–933 (2001).

  95. Crollius, H. R. et al. Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence. Nature Genet. 25, 235–238 (2000).

  96. Brenner, S. et al. Characterization of the pufferfish (Fugu) genome as a compact model vertebrate genome. Nature 366, 265–268 (1993).

  97. McConkey, E. H. & Varki, A. A primate genome project deserves high priority. Science 289, 1295–1296 (2000).

  98. Varki, A. A chimpanzee genome project is a biomedical imperative. Genome Res. 10, 1065–1070 (2000).

  99. VandeBerg, J. L., Williams-Blangero, S., Dyke, B. & Rogers, J. Examining priorities for a primate genome project. Science 290, 1504–1505 (2000).

  100. Soderlund, C., Longden, I. & Mott, R. FPC: a system for building contigs from restriction fingerprinted clones. Comput. Appl. Biosci. 13, 523–535 (1997).

  101. Soderlund, C., Humphray, S., Dunham, A. & French, L. Contigs built with fingerprints, markers, and FPC V4.7. Genome Res. 10, 1772–1787 (2000).

  102. Sasaki, T. & Burr, B. International Rice Genome Sequencing Project: the effort to completely sequence the rice genome. Curr. Opin. Plant Biol. 3, 138–141 (2000).

Acknowledgements

I thank F. Collins, J. Touchman and R. Wilson for critical reading of this manuscript.

Related links

FURTHER INFORMATION

Human Genome Project

Saccharomyces cerevisiae

Caenorhabditis elegans

Drosophila melanogaster

Arabidopsis thaliana

Homo sapiens

Escherichia coli

Phred

Phrap

Consed

GAP

mouse

BAC fingerprint map of the mouse genome

rat

zebrafish

TIGR comprehensive microbial resource

Celera Genomics

Tetraodon nigroviridis

Fugu rubripes

rice

Glossary

FINISHED SEQUENCE

Complete sequence of a clone or genome, with a defined level of accuracy and contiguity.

SEQUENCE-TAGGED SITE

(STS). Short (for example, <1,000 bp), unique sequence associated with a PCR assay that can be used to detect that site in the genome.

CONTIG

Overlapping series of clones or sequence reads (for a clone contig or sequence contig, respectively) that corresponds to a contiguous segment of the source genome.

MINIMAL TILING PATH

A minimal set of overlapping clones that together provides complete coverage across a genomic region.

COVERAGE

The average number of times a genomic segment is represented in a collection of clones or sequence reads (synonymous with redundancy).

SEQUENCE-READY MAP

Typically considered an overlapping bacterial clone map (for example, a BAC contig map) with sufficiently redundant clone coverage to allow for the rational selection of clones for sequencing.

UNIVERSAL PRIMING SITE

A short sequence (for example, 16–24 bases) in a cloning vector, immediately adjacent to the vector–insert junction to which a common (that is, universal) sequencing primer can anneal.

FULL-SHOTGUN SEQUENCE

A type of prefinished sequence, in this case with sufficient coverage to make it ready for sequence finishing (typically on the order of 8–10-fold coverage).

WORKING DRAFT SEQUENCE

A type of prefinished sequence, often taken to mean sequence whose coverage places it roughly halfway towards full-shotgun sequence.

PREFINISHED SEQUENCE

Sequence derived from a preliminary assembly during a shotgun-sequencing project (at this stage, the sequence is often neither contiguous nor highly accurate).

RADIATION HYBRID MAP

Physical map of markers (typically STSs) positioned on the basis of the frequency with which they are separated by radiation-induced breaks (map construction involves the PCR analysis of rodent cell lines, each containing different fragments of the source genome).
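
The coverage figures quoted in the glossary entries above can be turned into rough read counts. The sketch below assumes the usual definition of average coverage as total bases sequenced divided by target length, together with the idealized assumption that reads start uniformly at random along the target, under which the expected fraction of bases left unsampled at c-fold coverage is approximately e^(-c). The 150-kb clone size and 500-base read length are illustrative assumptions rather than values taken from this article, and the function names are hypothetical.

```python
import math


def coverage(n_reads: int, read_len: int, target_len: int) -> float:
    """Average fold coverage: total bases read divided by target length."""
    return n_reads * read_len / target_len


def reads_needed(fold: float, read_len: int, target_len: int) -> int:
    """Smallest number of reads that reaches the requested fold coverage."""
    return math.ceil(fold * target_len / read_len)


def expected_uncovered_fraction(fold: float) -> float:
    """Idealized expectation (random read starts, edge effects ignored)."""
    return math.exp(-fold)


if __name__ == "__main__":
    clone_len = 150_000  # assumed clone insert size, in bases
    read_len = 500       # assumed usable read length, in bases
    for fold in (4, 8, 10):
        n = reads_needed(fold, read_len, clone_len)
        missed = expected_uncovered_fraction(fold)
        print(f"{fold}-fold: {n} reads, expected uncovered fraction {missed:.5f}")
```

Under these assumptions, 8-fold coverage of the hypothetical clone corresponds to 2,400 reads. Cloning bias and repeats make real projects deviate from the idealized estimate of unsampled sequence.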

Cite this article

Green, E. Strategies for the systematic sequencing of complex genomes. Nat Rev Genet 2, 573–583 (2001). https://doi.org/10.1038/35084503
