Each difference between the genome sequences of Escherichia coli B strains REL606 and BL21(DE3) can be interpreted in light of known laboratory manipulations plus a gene conversion between ribosomal RNA operons. Two treatments with 1-methyl-3-nitro-1-nitrosoguanidine in the REL606 lineage produced at least 93 single-base-pair mutations (∼90% GC-to-AT transitions) and 3 single-base-pair GC deletions. Two UV treatments in the BL21(DE3) lineage produced only 4 single-base-pair mutations but 16 large deletions. P1 transductions from K-12 into the two B lineages produced 317 single-base-pair differences and 9 insertions or deletions, reflecting differences between B DNA in BL21(DE3) and integrated restriction fragments of K-12 DNA inherited by REL606. Two sites showed selective enrichment of spontaneous mutations. No unselected spontaneous single-base-pair mutations were evident. The genome sequences revealed that a progenitor of REL606 had been misidentified, explaining initially perplexing differences. Limited sequencing of other B strains defined characteristic properties of B and allowed assembly of the inferred genome of the ancestral B of Delbrück and Luria. Comparison of the B and K-12 genomes shows that more than half of the 3793 proteins of their basic genomes are predicted to be identical, although ∼310 appear to be functional in either B or K-12 but not in both. The ancestral basic genome appears to have had ∼4039 coding sequences occupying ∼4.0 Mbp. Repeated horizontal transfer from diverged Escherichia coli genomes and homologous recombination may explain the observed variable distribution of single-base-pair differences. Fifteen sites are occupied by phage-related elements, but only six by comparable elements at the same site. More than 50 sites are occupied by IS elements in both B and K, 16 in common, and likely founding IS elements are identified. A signature of widespread cryptic phage P4-type mobile elements was identified. Complex deletions (dense clusters of small deletions and substitutions) apparently removed nonessential genes from ∼30 sites in the basic genomes.
Bibliographical note
Funding Information:
We thank Eileen Matz and Mike Blewitt for technical assistance and the sequencing of different regions of the B and Escherich strains reported here, Chris Borland for first locating the araA mutation in REL606, and Haeyoung Jeong for preparation of Fig. 1. This work was supported by the GTL Program of the Office of Biological and Environmental Sciences of the U.S. Department of Energy and internal research funding from Brookhaven National Laboratory (F.W.S.); Consortium National de Recherche en Génomique (P.D.); the U.S. National Science Foundation and DARPA ‘FunBio’ Program (R.E.L.); contract DE-AC02-98CH10886, Division of Materials Science, U.S. Department of Energy (S.M.); and the 21C Frontier Microbial Genomics and Applications Center Program of the Korean Ministry of Education, Science and Technology, and the KRIBB Research Initiative Program (J.F.K.).
All Science Journal Classification (ASJC) codes
- Structural Biology
- Molecular Biology