
Bacterial gene numbers range over an order of magnitude

Key Terms
  • Genome sequences show that there are 500-1200 genes in parasitic bacteria, 1500-7500 genes in free-living bacteria, and 1500-2700 genes in archaea.


Figure 3.9  
Genome sizes and gene numbers are known from complete sequences for several organisms. Lethal loci are estimated from genetic data.

Large-scale efforts have now led to the sequencing of many genomes. A range is summarized in Figure 3.9. They extend from the 0.6 × 10⁶ bp of a mycoplasma to the 3.3 × 10⁹ bp of the human genome, and include several important experimental organisms, such as yeasts, the fruit fly, and a nematode worm. (Web sites with summaries of genome sequences are listed at the end of this section.)

Figure 3.10  
The minimum gene number required for any type of organism increases with its complexity. Photograph of mycoplasma kindly provided by A. Albay, K. Frantz, and K. Bott. Photograph of bacterium kindly provided by Jonathan King.

Figure 3.10 summarizes the minimum number of genes found in each class of organism; of course, many species may have more than the minimum number required for their type.

Figure 3.11  
The number of genes in bacterial and archaeal genomes is proportional to genome size.

The sequences of the genomes of bacteria and archaea show that virtually all of the DNA (typically 85-90%) codes for RNA or protein. Figure 3.11 shows that the range of genome sizes is about an order of magnitude, and that the genome size is proportional to the number of genes. The typical gene is about 1000 bp in length.
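The proportionality can be checked roughly against genome sizes and gene counts quoted elsewhere in this section. The following is a minimal sketch, not a method from the text: the names `GENOMES`, `genes_per_mb`, and `estimate_genes` are illustrative, and the default of 1000 bp per gene is the approximate figure given above.

```python
# Genome size (Mb) and gene count, as quoted in this section.
GENOMES = {
    "Aquifex aeolicus": (1.5, 1512),
    "E. coli (smallest strain)": (4.6, 4249),
    "E. coli (largest strain)": (5.5, 5361),
}


def genes_per_mb(size_mb, genes):
    """Gene density in genes per megabase."""
    return genes / size_mb


def estimate_genes(size_mb, bp_per_gene=1000):
    """Rough gene count, assuming each gene occupies about bp_per_gene bp."""
    return round(size_mb * 1_000_000 / bp_per_gene)


for name, (size_mb, genes) in GENOMES.items():
    density = genes_per_mb(size_mb, genes)
    print(f"{name}: {density:.0f} genes/Mb, "
          f"estimated {estimate_genes(size_mb)} vs. reported {genes}")
```

For these three genomes the density stays between roughly 900 and 1000 genes per Mb, which is what a proportional relationship with ~1000 bp per gene would predict.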

All of the bacteria with genome sizes below 1.5 Mb are obligate intracellular parasites — they live within a eukaryotic host that provides them with small molecules. Their genomes identify the minimum number of functions required to construct a cell. All classes of genes are reduced in number compared with bacteria with larger genomes, but the most significant reduction is in loci coding for enzymes concerned with metabolic functions (which are largely provided by the host cell) and with regulation of gene expression. Mycoplasma genitalium has the smallest genome, ~470 genes.

The archaea have biological properties that are intermediate between those of the prokaryotes and eukaryotes, but their genome sizes and gene numbers fall in the same range as bacteria. Their genome sizes vary from 1.5 to 3 Mb, corresponding to 1500-2700 genes. M. jannaschii is a methane-producing species that lives under high pressure and temperature. Its total gene number is similar to that of H. influenzae, but fewer of its genes can be identified on the basis of comparison with genes known in other organisms. Its apparatus for gene expression resembles that of eukaryotes more than that of prokaryotes, but its apparatus for cell division more closely resembles that of prokaryotes.

The archaea and the smallest free-living bacteria identify the minimum number of genes required to make a cell able to function independently in the environment. The smallest archaeal genome has ~1500 genes. The free-living bacterium with the smallest known genome is the thermophile Aquifex aeolicus, with 1.5 Mb and 1512 genes (2373). A "typical" gram-negative bacterium, H. influenzae, has 1,743 genes, each of ~900 bp. So we can conclude that ~1500 genes are required to make a free-living organism.

Bacterial genome sizes extend over about an order of magnitude, from 0.6 Mb to <8 Mb (for review see 5863). The larger genomes have more genes. The bacteria with the largest genomes, S. meliloti and M. loti, are nitrogen-fixing bacteria that live on plant roots. Their genome sizes (~7 Mb) and total gene numbers (>7500) are similar to those of yeasts (2031).

The size of the genome of E. coli is in the middle of the range. The common laboratory strain has 4,288 genes, with an average length of ~950 bp and an average separation between genes of 118 bp (406). But there can be quite significant differences between strains. The known extremes of E. coli range from the smallest strain, which has 4.6 Mb with 4249 genes, to the largest strain, which has 5.5 Mb with 5361 genes.
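The figures for the laboratory strain are consistent with one another, as a short calculation shows. This sketch treats the quoted averages as exact and assumes one intergenic gap per gene; the function names are illustrative and not from the source.

```python
GENES = 4288        # genes in the common laboratory strain (from the text)
AVG_GENE_BP = 950   # approximate average gene length, bp
AVG_GAP_BP = 118    # average separation between genes, bp


def implied_genome_bp(genes, gene_bp, gap_bp):
    """Genome size implied by gene count, gene length, and spacing."""
    return genes * (gene_bp + gap_bp)


def coding_fraction(gene_bp, gap_bp):
    """Fraction of the genome occupied by genes under the same assumptions."""
    return gene_bp / (gene_bp + gap_bp)


size_bp = implied_genome_bp(GENES, AVG_GENE_BP, AVG_GAP_BP)
print(f"Implied genome size: {size_bp / 1e6:.2f} Mb")   # 4.58 Mb
print(f"Fraction in genes: {coding_fraction(AVG_GENE_BP, AVG_GAP_BP):.0%}")  # 89%
```

The implied size of about 4.58 Mb is close to the 4.6 Mb quoted for the smallest strain, and the fraction of about 89% falls within the 85-90% of DNA said to code for RNA or protein.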

We still do not know the functions of all the genes. In most of these genomes, ~60% of the genes can be identified on the basis of homology with known genes in other species. These genes fall approximately equally into classes whose products are concerned with metabolism, cell structure or transport of components, and gene expression and its regulation. In virtually every genome, >25% of the genes cannot be ascribed any function. Many of these genes can be found in related organisms, which implies that they have a conserved function.

There has been some emphasis on sequencing the genomes of pathogenic bacteria, given their medical importance. An important insight into the nature of pathogenicity has been provided by the demonstration that "pathogenicity islands" are a characteristic feature of their genomes (for review see 2491). These are large regions, ~10-200 kb, that are present in the genome of a pathogenic species, but absent from the genomes of nonpathogenic variants of the same or related species. Their G-C content often differs from that of the rest of the genome, and it is likely that they migrate between bacteria by a process of horizontal transfer. For example, the bacterium that causes anthrax (B. anthracis) has two large plasmids (extrachromosomal DNA), one of which has a pathogenicity island that includes the gene coding for the anthrax toxin.

  • 2491 Hacker, J. and Kaper, J. B. (2000). Pathogenicity islands and the evolution of microbes. Annu. Rev. Microbiol. 54, 641-679.
  • 5863 Bentley, S. D. and Parkhill, J. (2004). Comparative genomic structure of prokaryotes. Annu. Rev. Genet. 38, 771-792.
  • 406 Blattner, F. R. et al. (1997). The complete genome sequence of Escherichia coli K-12. Science 277, 1453-1474.
  • 2031 Galibert, F. et al. (2001). The composite genome of the legume symbiont Sinorhizobium meliloti. Science 293, 668-672.
  • 2373 Deckert, G. et al. (1998). The complete genome of the hyperthermophilic bacterium Aquifex aeolicus. Nature 392, 353-358.

© Jones and Bartlett Publishers (2007)