The Keio collection (Baba et al, 2006) was established as a set of single‐gene deletion mutants of Escherichia coli K‐12. Each mutant has a precisely designed deletion that extends from the second codon through the seventh‐to‐last codon of the predicted ORF. Further information is available at http://sal.cs.purdue.edu:8097/GB7/index.jsp or http://ecoli.naist.jp/. Distribution is now handled by the National Institute of Genetics of Japan (http://www.shigen.nig.ac.jp/ecoli/pec/index.jsp). To date, more than 4 million samples have been distributed worldwide. As we described earlier (Baba et al, 2006), gene amplification during construction is likely to have led to a small number of mutants with genetic duplications.
The design of the Keio deletions was based on annotations that are now outdated. Of 4288 ORFs targeted, mutants were obtained for 3985 (Baba et al, 2006). Re‐annotation based on highly accurate sequencing of E. coli K‐12 (Hayashi et al, 2006) changed many coding regions and brought the total number of ORFs to 4296, including pseudogenes (Riley et al, 2006) (Supplementary Table I). The recent E. coli K‐12 MG1655 GenBank record (U00096, released in December 2008) has an additional 97 ORFs (exclusive of the ORFs in IS elements, Supplementary Table II) that were not targeted. Of these 4214 annotated ORFs, 4186 were targeted for deletion and 28 were not (Supplementary Table III). Two independent mutants were isolated for 3864 of the targeted ORFs. No deletion was found for 299 ORFs, which are candidates for essential genes. Deletions were also isolated for 23 other ORFs; however, re‐annotation led to re‐classification of these ORFs as ‘split ORFs’, because their coding regions are interrupted by an IS element or some other mutation (Supplementary Table IV).
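The ORF accounting in this paragraph can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the counts are those quoted in the text, whereas the variable and function names are our own and do not correspond to any published script.

```python
# Counts quoted in the text; all names are illustrative assumptions.
annotated_orfs = 4214            # ORFs considered in the re-evaluation
targeted = 4186                  # ORFs targeted for deletion
not_targeted = 28                # ORFs not targeted (Supplementary Table III)
two_independent_mutants = 3864   # ORFs with two independent mutants
no_deletion_found = 299          # candidates for essential genes
split_orfs = 23                  # deletions isolated, later re-classified as split ORFs


def accounting_is_consistent() -> bool:
    """Return True if the quoted categories add up to the quoted totals."""
    targets_add_up = targeted + not_targeted == annotated_orfs
    outcomes_add_up = (
        two_independent_mutants + no_deletion_found + split_orfs == targeted
    )
    return targets_add_up and outcomes_add_up


print(accounting_is_consistent())  # prints True
```

The three outcome classes (two mutants isolated, no deletion found, split ORF) therefore partition the 4186 targeted ORFs without overlap.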
To identify mutants with partial duplications, we performed two sets of PCR reactions on both representatives of all 3864 mutants. In the first set, we tested for the presence of the targeted gene by using a pair of internal gene‐specific primers (Figure 1A and B). With the parental strain E. coli K‐12 BW25113, we were able to amplify 3803 ORFs, as indicated by the presence of PCR products of the expected sizes. For the other 61 ORFs, we used a pair of external primers that flanked the targeted gene, either because the initial PCR product was too short or because the internal primer pair failed to amplify fragments of the predicted sizes from the parental control strain. Results from testing 7728 strains (3864 ORFs) showed that the vast majority (96.1%, 7428/7728) are correct; results in Supplementary Table V show that one or both isolates are correct for 98.5% (3806/3864) of the Keio mutants (Figure 1C). As one isolate is correct for 177 ORFs for which the other isolate is ambiguous, no further tests were done with the other isolate of these mutants.
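The strain counts for the first PCR set follow directly from the numbers above. A minimal sketch, using only figures quoted in the text (the variable names are hypothetical):

```python
# Figures quoted in the text; names are illustrative assumptions.
orfs_internal_primers = 3803   # ORFs tested with internal gene-specific primers
orfs_external_primers = 61     # ORFs tested with external flanking primers
isolates_per_orf = 2           # two independent isolates per mutant
correct_isolates = 7428        # isolates scored as correct in the first PCR set

total_orfs = orfs_internal_primers + orfs_external_primers
total_isolates = total_orfs * isolates_per_orf
percent_correct = round(100 * correct_isolates / total_isolates, 1)

print(total_orfs, total_isolates, percent_correct)  # prints 3864 7728 96.1
```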
Mutants of the remaining 58 ORFs (33 with mixtures and 25 with duplications; Figure 1C) were tested in a second set of PCR reactions, which was carried out using external primers flanking the targeted gene (Figure 1A and B). A positive result in the first PCR test can arise not only from mutants with a partial duplication but also from ones that have been cross‐contaminated from a nearby microplate well. Therefore, the second set of PCR tests was performed on three colonies after colony purification. In the second PCR test, a colony with the correct deletion was expected to yield a single PCR product of the length predicted for the respective single‐gene mutant, and a colony from a cross‐contaminant mutant was expected to yield a single product of the length predicted for the intact targeted gene. In contrast, mutants with both the respective single‐gene deletion and a genetic duplication were expected to yield both PCR products. In cases where the predicted PCR products for the deletion and wild‐type structures were indistinguishable in size, the PCR products were digested with XbaI, which cuts within the kan (kanamycin resistance) replacement gene, before size separation by electrophoresis.
For 33 of the 58 ORFs, one or more colonies yielded a single PCR product of the size corresponding to the single‐gene deletion, indicating that the wells for these mutants were cross‐contaminated (Supplementary Table V). For the 25 other mutants, purified colonies consistently produced PCR fragments corresponding to the structures of both the single‐gene deletion and the targeted gene, indicating that these mutants have partial duplications (Figure 1C and Table I). As mentioned above, our PCR tests also revealed 177 mutants for which only one isolate was shown to be correct. Because the ambiguous isolates of these mutants were not tested further, the second PCR test would be needed to determine whether any of them carry a partial duplication.
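The decision rules of the second PCR test can be summarized as a small classifier. This is a sketch of the logic as described in the text, not the procedure actually used to score gels; the band labels, category strings and function names are hypothetical, and the "uninterpretable" and "unresolved" outcomes are our own additions for inputs the text does not cover.

```python
from typing import Iterable, List

DELETION = "deletion"    # band of the size expected for the kan replacement
WILD_TYPE = "wild_type"  # band of the size expected for the intact targeted gene


def classify_colony(bands: Iterable[str]) -> str:
    """Classify one purified colony from its external-primer PCR bands."""
    observed = set(bands)
    if observed == {DELETION}:
        return "correct deletion"
    if observed == {WILD_TYPE}:
        return "cross-contaminant"
    if observed == {DELETION, WILD_TYPE}:
        return "deletion plus duplication"
    return "uninterpretable"  # assumption: no band or an unexpected band


def classify_orf(colonies: List[Iterable[str]]) -> str:
    """Classify a mutant from its purified colonies (three in the text)."""
    calls = [classify_colony(bands) for bands in colonies]
    # One or more colonies with only the deletion band: the well was a mixture.
    if "correct deletion" in calls:
        return "mixture (cross-contaminated well)"
    # Every colony shows both bands: the mutant carries a partial duplication.
    if calls and all(call == "deletion plus duplication" for call in calls):
        return "partial duplication"
    return "unresolved"


print(classify_orf([[DELETION], [WILD_TYPE], [DELETION]]))
print(classify_orf([[DELETION, WILD_TYPE]] * 3))
```

The first call represents a well in which colony purification recovers the correct deletion alongside a contaminant; the second represents a mutant in which every purified colony retains both structures.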
The 25 ORFs for which both isolates have duplications are candidates for essential genes (Table I). Fourteen of these have been reported to be essential in the PEC (Profiling of E. coli Chromosome) database (http://www.shigen.nig.ac.jp/ecoli/pec/index.jsp; Table IA). Thus, it is likely that these 14 genes are essential. The other 11 with partial duplications have been designated as nonessential genes in the PEC database (Table IB). Further tests are required to validate their essentiality. We also carefully evaluated all single‐gene deletion mutants in the Keio collection for genes that are classified as essential in the PEC database. None provided evidence of a partial duplication. Thus, some ORFs reported as essential in the PEC database are nonessential, at least in the genetic background of our host E. coli K‐12 BW25113 during aerobic growth at 37°C on LB agar. It should be noted that no evidence exists that the Red system that we used to generate the Keio collection is responsible for causing duplications. Moreover, other authors have shown that genetic duplications can occur during DNA replication (Anderson and Roth, 1981). As a cautionary note, partial duplications can occur not only during the construction of a single‐gene deletion but also upon transfer of the deletion into a new host, e.g., by PCR or transduction, as reported previously (Zhou et al, 2003).
This work was supported by a Grant‐in‐Aid for Scientific Research (A) and KAKENHI (Grant‐in‐Aid for Scientific Research) on Priority Areas ‘System Genomics’ from the Ministry of Education, Culture, Sports, Science and Technology of Japan to NAIST and by funds from the Yamagata Prefectural Government and Tsuruoka City to Keio University. BLW was supported by NIH GM62662.
Conflict of Interest
The authors declare that they have no conflict of interest.
Supplementary Table I
Construction and evaluation of Keio collection. [msb200992-sup-0001.doc]
Supplementary Table II
Genes to which ECK numbers had not been assigned before Keio construction [msb200992-sup-0002.xls]
Supplementary Table III
Genes that were not targets of Keio construction [msb200992-sup-0003.xls]
Supplementary Table IV
Mutants for split ORFs of ancestral genes [msb200992-sup-0004.xls]
Supplementary Table V
Information on Keio collection deletion mutants [msb200992-sup-0005.xls]
This is an open‐access article distributed under the terms of the Creative Commons Attribution License, which permits distribution and reproduction in any medium, provided the original author and source are credited. This license does not permit commercial exploitation without specific permission.
Copyright © 2009 EMBO and Nature Publishing Group