Competent cells are routinely used in molecular cloning to propagate and maintain cloned DNA in plasmids. General considerations for choosing competent cells for everyday cloning applications are discussed in a previous section. This section covers the essentials of choosing competent cells to enable success in the following molecular biology applications:
Escherichia coli is one of the most popular hosts for overexpression of cloned proteins because of its fast and efficient growth, and easy cultivation and induction. The most widely used E. coli strains in expression of cloned proteins are BL21 and its derivatives. This is mainly because BL21 strains are deficient in a cytoplasmic protease, Lon, and an outer membrane protease, OmpT. Both of these mutations (lon and ompT) allow intracellular accumulation of heterologous proteins (i.e., proteins derived from species or cell types different from the host) at a high rate while minimizing protein degradation during purification.
The host cell’s endogenous RNA polymerases express genes on the transformed plasmid, including the cloned gene, which may be driven by a promoter such as lac or tac. However, these promoters are relatively weak and thus not ideal for strong expression of the cloned gene. Therefore, BL21 strains have been modified to support the bacteriophage T7 expression system, which enhances expression of cloned recombinant proteins.
In the T7 expression system, the BL21 strain is engineered to contain a λDE3 lysogen to express the T7 RNA polymerase under the control of a mutant lac promoter, lacUV5. The lacUV5 promoter is positively regulated by IPTG (isopropylthio-β-galactoside), which is commonly used to induce transcription by T7 RNA polymerase inside the cells. As such, genes to be expressed can be cloned downstream of the T7 RNA polymerase promoter on the vector and induced from DE3 via IPTG (Figure 1A). This modified BL21 strain is commonly known as BL21(DE3).
Figure 1. Modified BL21 strains for the T7 expression system.
To further enhance expression from the T7 promoter, the BL21(DE3) strain has been genetically modified for increased mRNA stability. One example is a mutation of the rne gene, which encodes an essential endonuclease, RNase E. RNase E is the main component of the E. coli RNA degradosome; its functions are mRNA degradation and RNA transcript processing. RNase E cuts single-stranded RNA rich in AU bases, thereby contributing to the maturation of mRNA and rRNA. The rne131 mutation results in a truncated RNase E that lacks the C-terminal domain required for mRNA degradation [3-5]. Thus, the BL21(DE3) strain with rne131 increases stability of mRNA transcripts, in turn increasing expression of proteins (Figures 1B, 2).
Figure 2. Protein expression from T7 expression vectors in BL21 strains with or without the rne131 mutation. (U = uninduced, I = induced with IPTG).
The T7 promoter system of the BL21(DE3) strain results in higher basal expression levels of heterologous genes than in the original BL21 strain. This may become problematic if the cloned protein is toxic to the host cell. A controllable promoter such as lacUV5 allows the cells to be cultured to a suitable density before expression of the cloned gene is induced. However, the promoter may be “leaky,” meaning there may be a considerable level of gene expression without induction. To reduce the basal expression level, some BL21(DE3) strains are designed to carry a pLys plasmid (commonly pLysS or pLysE; Table 1) that expresses T7 lysozyme. T7 lysozyme inhibits T7 RNA polymerase activity, thereby reducing basal expression of the toxic gene from the vector (Figure 1C) [6,7]. The inhibition by T7 lysozyme is overcome upon induction with IPTG. For tighter regulation of protein expression, BL21 strains containing lacIq can be used. lacIq strains overproduce the lac repressor, ensuring that genes are not expressed unless specifically induced (see Figure 3 for review of the lac operon).
|Characteristic||pLysS||pLysE|
|Expression of T7 lysozyme||Moderate||High|
|Growth of host cells||Little or no effect||May decrease|
|Induction time for cloned gene expression||Relatively short||Relatively long|
|Levels of cloned protein induction||Little or no effect||May reduce|
|Basal expression of cloned protein||Low||Lower|
Figure 3. The lac operon. When the endogenous level of lactose is low, the lac repressor (expressed by lacI) is able to bind to the lacO sequence and inhibit transcription of the lac genes (lacZ, lacY, and lacA) from the lac promoter (lacP). If lactose (more precisely, a metabolite called allolactose) or its analog IPTG is present, these bind to the lac repressor and change its conformation. This prevents the lac repressor binding to lacO, thereby derepressing expression of the operon.
As an alternative to lacIq, some strains carry an arabinose-inducible promoter, araBAD, upstream of the gene encoding T7 RNA polymerase, in their genome. The araBAD promoter offers high-level protein expression with one of the tightest regulation mechanisms available for a T7 expression system, making it suitable for toxic protein expression (Figures 4, 5).
Figure 4. Protein expression from lacUV5 promoter (with basal regulation by LysS) vs. araBAD promoter. (A) The araBAD promoter offers lower basal (uninduced) and higher induced levels of T7 RNA polymerase–mediated protein expression than does the lacUV5 promoter with pLysS. (B) Expression of a toxic protein, elastin-like peptide-TEV protease fusion (ELP-TEV), was enhanced using the araBAD promoter.
Therefore, by varying the concentrations of L-arabinose and glucose, expression of the cloned protein can be modulated appropriately.
Figure 5. (A) A modified BL21 strain that carries a chromosomal insertion of a cassette containing the T7 RNA polymerase (T7 RNAP) gene in the araB locus, allowing regulated expression of T7 RNA polymerase from the araBAD promoter. (B) AraC is a transcriptional regulator that forms a complex with L-arabinose. In the absence of L-arabinose, the C-terminal domains of the AraC dimer bind to the araO2 and araI1 sites. For transcriptional activation, L-arabinose recognizes the N-terminal domains of the AraC dimer and switches AraC binding from araO2 to araI2, allowing transcription to begin. Contact of AraC with the araI1 and araI2 sites is also stimulated by the cAMP activator protein (CAP)-cAMP complex binding to the DNA when the glucose level is low [9,10].
The majority of laboratory E. coli competent cells are derived from the wild type strains K-12 and B, which possess the EcoKI and EcoBI restriction-modification systems, respectively. EcoKI recognizes the sequence 5′ AAC(N)6GTGC 3′, and EcoBI recognizes the sequence 5′ TGA(N)8TGCT 3′ [10,11]. Their Type I restriction-modification systems are encoded by the hsdR, hsdM, and hsdS genes, which form a multi-unit enzyme with restriction, modification, and specificity activities (hence the abbreviation RMS). When their recognition site is unmodified (i.e., unmethylated), the enzyme cleaves (“restricts”) the sequence. When only one strand of the recognition sequence is methylated (i.e., hemimethylated), the enzyme methylates the unmodified strand and prevents restriction.
Therefore, the competent cell’s genotype with respect to hsdRMS must be considered when propagating unmethylated (foreign) DNA containing EcoKI and EcoBI sites, such as PCR inserts.
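As an illustration of the check described above, the following minimal Python sketch scans an insert for the EcoKI and EcoBI recognition sequences given in the text. The function names and the example sequence are hypothetical and chosen only for demonstration. Because both sites are asymmetric, the sketch searches both strands.

```python
import re

# Recognition sequences from the text, written as regular expressions.
# N (any base) is expanded to the character class [ACGT].
SITES = {
    "EcoKI": "AAC[ACGT]{6}GTGC",
    "EcoBI": "TGA[ACGT]{8}TGCT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_type_i_sites(seq):
    """Return (enzyme, strand, start) tuples, sorted by start position.

    start is the 0-based index on the given (top) strand where the
    site begins, regardless of which strand carries the motif.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits = []
    for name, pattern in SITES.items():
        # A lookahead lets the scan report overlapping sites as well.
        regex = re.compile("(?=(" + pattern + "))")
        for m in regex.finditer(seq):
            hits.append((name, "+", m.start()))
        for m in regex.finditer(rc):
            site_len = len(m.group(1))
            hits.append((name, "-", len(seq) - m.start() - site_len))
    return sorted(hits, key=lambda h: h[2])


# Hypothetical insert containing one EcoKI and one EcoBI site.
insert = "GGGAACATGCATGTGCCCCTGAACGTACGTTGCTAAA"
print(find_type_i_sites(insert))
```

An insert that returns hits is a candidate for propagation in a strain whose hsdRMS genotype does not restrict unmethylated DNA.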
Some restriction enzymes are known to be methylation-sensitive: they are unable to cleave methylated DNA sequences. The dam and dcm genes encode two of the most common methylases found in E. coli, DNA adenine methylase and DNA cytosine methylase, respectively. One general practice to keep DNA unmethylated is to use competent cells deficient in dam and dcm for transformation. The dam methylase modifies the adenine in the sequence 5′ GATC 3′, while the dcm methylase methylates the second cytosine in the sequence 5′ CC(A/T)GG 3′. Therefore, DNA containing restriction sites that include the dam and dcm sequences should be propagated in a dam–/dcm– strain. While providing a unique strategy for controlling cleavage, dam–/dcm– strains should not be used in routine transformation because undesirable mutations may be introduced into the propagated DNA.
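A short Python sketch can flag restriction sites that overlap a dam or dcm methylation sequence. The function names and example sequence below are hypothetical. The sketch reports overlap only; whether an overlap actually blocks cleavage depends on the enzyme and should be confirmed in the enzyme supplier's documentation.

```python
import re

# Methylation sequences from the text. Both are palindromic, so
# scanning one strand is sufficient.
DAM = re.compile(r"(?=GATC)")
DCM = re.compile(r"(?=CC[AT]GG)")


def methylation_sites(seq):
    """Return 0-based start positions of dam and dcm sequences."""
    seq = seq.upper()
    return {
        "dam": [m.start() for m in DAM.finditer(seq)],
        "dcm": [m.start() for m in DCM.finditer(seq)],
    }


def sites_overlapping_methylation(seq, recognition_site):
    """Return (start, methylase) for each occurrence of recognition_site
    that shares at least one base with a dam or dcm sequence.

    Only the given strand is searched for recognition_site, which is
    adequate for palindromic restriction sites.
    """
    seq = seq.upper()
    site = recognition_site.upper()
    meth = methylation_sites(seq)
    spans = [(p, p + 4, "dam") for p in meth["dam"]]
    spans += [(p, p + 5, "dcm") for p in meth["dcm"]]
    flagged = []
    start = seq.find(site)
    while start != -1:
        end = start + len(site)
        for m_start, m_end, name in spans:
            if m_start < end and start < m_end:
                flagged.append((start, name))
        start = seq.find(site, start + 1)
    return flagged


# Hypothetical example: a TCTAGA site immediately followed by TC,
# which creates an overlapping GATC.
print(sites_overlapping_methylation("AATCTAGATCGG", "TCTAGA"))
```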
For cloning of eukaryotic genomic DNA, strains of competent cells that carry mutations in the methylation-dependent restriction systems (MDRS) should be selected for transformation. Genomic DNA from animals, plants, and lower eukaryotes contains methylated cytosines that are involved in epigenetic regulation of gene expression in many cellular processes [15,16]. DNA sequences containing these methylated cytosines can become the targets of methylation-dependent endonucleases that are part of the McrA, McrBC, and Mrr E. coli restriction systems (Table 2).
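|System||Description||Gene||Recognition sequence||Modification recognized||References|
|McrA||Modified cytosine restriction||mcrA||5′-Cm5CGG-3′||5-hydroxymethylcytosine or 5-methylcytosine||17, 18|
|McrBC||Modified cytosine restriction||mcrBC||Two 5′-RmC-3′ sites (R = A or G) spaced approximately 55–103 bp apart||5-hydroxymethylcytosine, N4-methylcytosine, or 5-methylcytosine||19, 20|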
|Mrr||Methyladenine recognition and restriction||mrr||Consensus sequence unclear; requires either modified adenine or cytosine||DNA with both N6-methyladenine and 5-methylcytosine||21, 22|
Therefore, an E. coli strain that carries mutated mcrA, mcrBC, and mrr is critical for propagation of DNA of eukaryotic origin that may contain methylated cytosines and/or adenines. It also ensures better representation for construction of genomic libraries of eukaryotic origin.
Certain DNA constructs are prone to DNA recombination and therefore are unstable in the bacterial cells after transformation. Examples of such plasmid constructs include retroviral and lentiviral vectors, which contain sequences with long terminal repeats (LTRs), inverted repeats, and tandem repeats.
Competent cells with recA1 and/or recA13 mutations are widely employed for stable propagation of constructs with retroviral and lentiviral vector backbones. Mutations of the host recA gene render the enzyme involved in DNA repair inactive, thereby reducing recombination and allowing maintenance of unstable constructs. The recA mutation also prevents homologous recombination between the host chromosome and vector sequences. Nevertheless, recA deficiency alone is not sufficient to prevent deletion and recombination of DNA sequences with repeats, so competent cells engineered to maintain the stability of DNA repeats must be considered (Figure 6) [23-26].
Figure 6. Stability of a plasmid with direct repeat sequences. (A) Three strains of chemically competent cells were transformed with pH30, a 7.3 kb plasmid that contains ~100 repeats of a 32-bp sequence. Plasmid DNA recovered from randomly selected colonies of the three strains was analyzed for stability by agarose gel electrophoresis. (B) pH30, electroporated into Invitrogen Stbl4 cells and isolated from 5 separate transformations, was analyzed for stability by agarose gel electrophoresis.
Stability of DNA constructs may be further enhanced by performing the following:
Large plasmids (>10 kb) are known to pose issues in bacterial transformation, since transformation efficiency tends to decrease as plasmid size increases [28,29]. As illustrated in Figure 7, recommendations to improve the success of transformation of large plasmids include:
Figure 7. Transformation efficiencies with large plasmids up to 200 kb. (A) Chemically competent (C) or electrocompetent (E) cells of different transformation efficiencies were transformed with plasmids ranging from 8 kb to 28 kb, and efficiencies of DNA uptake were measured (n = 4). (B) Large plasmids of 60 kb, 100 kb, and 200 kb were transformed into two strains of electrocompetent cells with high transformation efficiencies, and efficiencies of DNA uptake were measured (n = 4). (C) Large plasmids of 60–200 kb were transformed into three strains of electrocompetent cells with a transformation efficiency of 1 x 10^10 CFU/μg for determination of DNA uptake (n = 6).
One of the methods to screen colonies carrying desired plasmids with an insert is positive selection. In this approach, the vector harbors a lethal gene within the multiple cloning site (MCS). Successful cloning of the insert into the vector disrupts the expression of the lethal gene and permits colony formation.
The ccdB gene, derived from the F′ episome, is one of the lethal genes used for positive selection and serves as a selectable marker in many cloning vectors [32,33]. The CcdB protein is a toxin that interferes with the rejoining step of DNA gyrase, leaving the host chromosome fragmented and inhibiting cell growth. In order to propagate cloning vectors with an active ccdB gene, host cells must be resistant to the toxic effects of CcdB.
Most CcdB-resistant strains carry gyrA462, a mutation in DNA gyrase that suppresses double-stranded DNA breakage mediated by the CcdB protein. Note that bacterial strains carrying an F′ episome possess the ccdA gene, which negatively regulates ccdB. However, use of F′ competent cells to propagate ccdB vectors is not recommended, since endogenous expression of short-lived CcdA protein may not be able to negate CcdB’s lethality. Conversely, when using a ccdB vector for positive selection, F– strains should be used for transformation to avoid ccdA expression and prevent survival of colonies without the cloned DNA.
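The host requirements for ccdB vectors can be summarized as a small decision function. This is a hypothetical Python sketch of the logic in the text, not a vendor tool. The rule that a gyrA462 host is unsuitable for positive selection is an inference from the text (a CcdB-resistant host would let insert-free colonies survive) rather than an explicit statement.

```python
def assess_ccdb_host(purpose, has_gyrA462, has_f_episome):
    """Return (suitable, reason) for a host used with a ccdB vector.

    purpose: "propagate" to maintain a vector with an intact ccdB gene,
             or "select" for positive selection of insert-containing clones.
    """
    if purpose == "propagate":
        if has_gyrA462:
            return True, "gyrA462 confers resistance to CcdB"
        if has_f_episome:
            return False, "ccdA from the F' episome may not fully negate CcdB"
        return False, "host is sensitive to CcdB"
    if purpose == "select":
        if has_gyrA462:
            # Inferred: resistance defeats the selection.
            return False, "CcdB-resistant host allows insert-free colonies"
        if has_f_episome:
            return False, "ccdA expression may allow insert-free colonies"
        return True, "F- CcdB-sensitive host eliminates insert-free clones"
    raise ValueError("purpose must be 'propagate' or 'select'")


print(assess_ccdb_host("propagate", has_gyrA462=True, has_f_episome=False))
print(assess_ccdb_host("select", has_gyrA462=False, has_f_episome=True))
```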
Some plasmids contain a specific origin of replication called R6Kγ. In order to replicate these plasmids in bacteria, competent cells must express the protein pi (π) to bind to the R6Kγ sequence and initiate replication of the plasmids. The protein π is encoded by the gene pir.
Two types of competent cells with the pir gene are commonly available: pir+ (wild type) and pir-116 (mutant).
Certain R6Kγ plasmid cloning systems, such as the Univector system (Figure 8), employ Cre-lox site-specific recombination. The Cre recombinase protein, when supplied, binds to loxP sites on the Univectors. This effects recombination of the donor and acceptor plasmids, creating a recombinant version that carries the gene of interest.
Figure 8. The Univector cloning system. The loxP sites on the donor and acceptor plasmids recombine in the presence of Cre recombinase to produce a recombinant plasmid.
Single-stranded DNA (ssDNA) is useful for dideoxy DNA sequencing, preparation of strand-specific probes, in vitro mutagenesis, and construction of subtracted cDNA libraries. A popular cloning-based approach for generating ssDNA involves taking advantage of the replication cycle of the filamentous bacteriophage M13 (Figure 9) [39-45]. For the M13 bacteriophage to attach to and enter bacterial cells, the E. coli strain must express the F pilus, denoted by the genetic marker F′ or F+.
Figure 9. M13 phage life cycle. (1) The M13 phage infects the bacterial cell via attachment to the F pilus, and injects its circular ssDNA genome. (2) The host’s cellular machinery uses the ssDNA (the + strand) as a template to synthesize a complementary strand, resulting in a dsDNA circle called the replicative form, or RF, which then multiplies. (3) The (+) strand of RF is nicked and extended using the (–) strand as a template, in a process called rolling circle replication. (4) After one round of extension, the (+) strand is nicked again and recircularized to form the ssDNA genome of the progeny phage. (5) The progeny genome is packaged into a phage particle after a series of processes and is released from the host.
To generate ssDNA, the dsDNA of interest may be subcloned into (1) an M13 vector, or (2) a phagemid (Figures 10, 11).
Figure 11. Phagemid and the helper phage M13KO7.
As such, any E. coli strain with an F′ episome can be used to generate ssDNA via the M13 replication pathway. However, contaminating bacterial and double-stranded plasmid/phagemid DNA can interfere with preparation of high-quality ssDNA from these strains. Therefore, it is desirable to use strains that are endA+ (in addition to F′) to remove dsDNA during ssDNA purification (Figure 12).
Figure 12. Single-stranded DNA purified from strains containing the F′ episome. Clones carrying cDNA inserts of 0.3, 0.6, 0.9, 1.5, 2 kb, and no insert were analyzed on an agarose gel. Strain 1, which expresses wild type endA, shows the least dsDNA contamination.
Protein expression in insect cells provides certain advantages, such as posttranslational modifications similar to those of mammalian cells and ease of scaling up. Baculoviruses, which infect insect cells, are used in one approach to introduce recombinant genes into insect cells. One method to generate a baculovirus is by transfecting insect cells with a bacmid, which is a shuttle plasmid between E. coli and insect cells, carrying the gene of interest. A recombinant bacmid may be constructed in E. coli by transforming competent cells harboring a parent bacmid and a helper transposon plasmid with a donor plasmid specially designed for carrying the gene of interest (Figure 13).
Figure 13. Creation of a recombinant bacmid. In this system, the transforming donor plasmid carries the gene of interest and the site-specific transposon Tn7, which recognizes the lacZ-mini-attTn7 fusion sequence on the bacmid of the competent cell. Transposition from the plasmid to the bacmid occurs in the presence of the transposition proteins (transABCD) provided by a helper plasmid. Successful transposition disrupts the lacZ gene on the bacmid, resulting in white bacterial colonies.
Recombinant DNA is routinely introduced into plant cells with ease and simplicity by an Agrobacterium-mediated process. In this approach, Agrobacterium tumefaciens cells are transformed with plant DNA that has been cloned into a tumor-inducing plasmid, or Ti plasmid, modified for gene transfer into plant cells.
The transformation of plant DNA constructs into electrocompetent A. tumefaciens with a disarmed Ti plasmid offers some advantages:
Figure 14. Ti plasmid and the binary vector system.
Electroporated cells were cultured at 30°C for 3 hours in different expression media for recovery and plated on either a YM or LB plate.
Transformation efficiency and host genotype are especially important for DNA library construction in order to create a representative library covering genes of varying abundance and sizes. In the preparation of a library, saturating amounts of pooled DNA inserts (gDNA or cDNA fragments) that have been ligated into vectors are used to transform competent cells. Saturating amounts of DNA maximize the library size with representative DNA from a minimum number of transformation reactions, but lower the efficiency of DNA uptake by the cells. Therefore, when selecting competent cells for library construction, the highest transformation efficiency available and electrocompetency are recommended to obtain the greatest number of transformants from a single transformation.
Library construction: How critical is transformation efficiency?
When cloning cDNA prepared from low-abundance mRNA sequences, maximizing the number of clones generated from a given amount of double-stranded cDNA increases the probability that any desired clone will be represented in the library.
The following Clarke-Carbon formula may be used to calculate the number of colonies needed to ensure that a sequence has a 99% probability of occurring in a cDNA library.
N = ln (1 – P) / ln (1 – 1/n), where P = probability, N = number of colonies required, 1/n = proportion of the total mRNA population for a single low-abundance mRNA species.
A typical eukaryotic cell contains approximately 11,000 low-abundance mRNA species comprising approximately 30% of the mRNA population. The proportion, 1/n, of the total mRNA population for this typical low-abundance message is 1 in 11,000/0.30, or ~1 in 37,000.
Therefore, N = ln (1 – 0.99) / ln (1 – 1/37,000) ≈ 170,000, which is the number of colonies needed to ensure that a clone from a low-abundance mRNA is present in the population.
This estimate is based on the presence of ~14 molecules of each transcript per cell. However, low-abundance mRNAs can be present at 1 molecule per cell [62,63]. Therefore, to isolate cDNA clones representing these transcripts, it is necessary to generate and screen several million independent clones, using competent cells with the highest transformation efficiency available.
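The Clarke-Carbon calculation above can be reproduced with a few lines of Python. The function name is hypothetical. The second call assumes, purely for illustration, that a transcript present at 1 molecule per cell is 14 times rarer than one present at ~14 molecules per cell.

```python
import math


def colonies_required(probability, fraction):
    """Clarke-Carbon estimate: N = ln(1 - P) / ln(1 - 1/n).

    probability: desired probability P that the clone is in the library.
    fraction:    1/n, the proportion of the mRNA population made up by
                 the transcript of interest.
    """
    # log1p(-x) computes ln(1 - x) accurately for very small x.
    return math.ceil(math.log1p(-probability) / math.log1p(-fraction))


# Worked example from the text: P = 0.99, 1/n = 1/37,000.
typical = colonies_required(0.99, 1 / 37_000)
print(typical)  # roughly 170,000

# Illustrative assumption: 1 copy per cell instead of ~14 copies per cell.
rare = colonies_required(0.99, 1 / (37_000 * 14))
print(rare)  # on the order of a few million
```

Under that scaling assumption the required library size rises into the millions, consistent with the recommendation to screen several million independent clones.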
In addition to transformation efficiency, the genotype of the selected competent cells is an important aspect in library preparation, and the following genetic markers should be noted (Table 4).
|Genetic marker||Benefit|
|deoR||Maintains large inserts and improves the chance of cloning full-length cDNA sequences|
|F+ and endA+ (for ssDNA production)||Allow filamentous phage infection and removal of dsDNA|
|mcrA, mcrBC, and mrr||Permit cloning of methylated DNA|
|recA||Increases library representation when unstable sequences are present|
|supE (also known as glnV)||Enables growth of lambda phage and phage display applications [57-59]|
|tonA||Protects clones from lysis by T1 and T5 bacteriophages|
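A strain's published genotype can be checked against the markers in Table 4 with a simple text search. The sketch below is a hypothetical helper, and the genotype strings are invented for illustration rather than taken from any real product. It uses naive substring matching, so it cannot distinguish a mutant allele from a wild type designation such as recA+, and it omits F+ and endA+ for that reason.

```python
# Markers from Table 4, each with the names it may appear under.
# supE is also known as glnV, as noted in the table.
LIBRARY_MARKERS = {
    "deoR": ("deoR",),
    "mcrA": ("mcrA",),
    "mcrBC": ("mcrBC",),
    "mrr": ("mrr",),
    "recA": ("recA",),
    "supE": ("supE", "glnV"),
    "tonA": ("tonA",),
}


def missing_markers(genotype, required=LIBRARY_MARKERS):
    """Return the markers from `required` not mentioned in `genotype`.

    Matching is by substring, so "recA1" satisfies "recA" and
    "Δ(mrr-hsdRMS-mcrBC)" satisfies both "mrr" and "mcrBC".
    """
    return [
        marker
        for marker, names in required.items()
        if not any(name in genotype for name in names)
    ]


# Hypothetical genotype strings, for illustration only.
strain_a = "F- mcrA Δ(mrr-hsdRMS-mcrBC) recA1 endA1 deoR tonA glnV44"
strain_b = "F- recA1 endA1"

print(missing_markers(strain_a))
print(missing_markers(strain_b))
```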
Library preparation: Tips for transformation, plating, and growth of cells
If chemical transformation is performed for library construction, the transformation volume may be scaled up from 100 μL rather than performing multiple reactions with smaller volumes. With larger volumes, scale up DNA input proportionally (e.g., 15 ng of DNA for every 100 μL of cells used) and increase heat shock times accordingly (Table 5).
Table 5. Large-volume transformation. Cells were transformed with 15 ng of cDNA ligation per 100 μL of cells in 50 mL polypropylene tubes. Total colony yield is listed as the mean from 4 transformations.
|Cell volume per reaction||Heat shock time||Total volume before plating||Yield per reaction|
|100 µL||60 sec||1 mL||4.6 x 10^5 CFU|
|250 µL||90 sec||2.5 mL||1.0 x 10^6 CFU|
|500 µL||90 sec||5 mL||1.8 x 10^6 CFU|
|1,000 µL||120 sec||10 mL||3.4 x 10^6 CFU|
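To compare the reaction sizes in Table 5, the following Python sketch computes the DNA input at 15 ng per 100 µL of cells and the colony yield per microliter of cells. The variable and function names are hypothetical, and the yields are the mean values listed in the table.

```python
# (cell volume in µL, heat shock in seconds, yield in CFU), from Table 5.
TABLE_5 = [
    (100, 60, 4.6e5),
    (250, 90, 1.0e6),
    (500, 90, 1.8e6),
    (1000, 120, 3.4e6),
]


def dna_input_ng(cell_volume_ul, ng_per_100_ul=15):
    """Scale DNA input in proportion to the volume of cells."""
    return cell_volume_ul * ng_per_100_ul / 100


def yield_per_ul(cell_volume_ul, total_cfu):
    """Colonies obtained per microliter of competent cells."""
    return total_cfu / cell_volume_ul


for volume, heat_shock_s, cfu in TABLE_5:
    print(
        f"{volume:>5} µL cells: {dna_input_ng(volume):6.1f} ng DNA, "
        f"{heat_shock_s:>3} s heat shock, "
        f"{yield_per_ul(volume, cfu):,.0f} CFU per µL"
    )
```

In these data the total yield rises with reaction volume while the yield per microliter of cells declines modestly, so a single large reaction trades some per-volume efficiency for fewer handling steps.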
To amplify the library clones, growing the colonies on selective agar plates or semi-solid agar in bottles is recommended. Liquid cultures often result in skewed representation of clones due to differential growth characteristics of individual clones. Libraries containing expression vectors should be cultured without induction (i.e., no lactose or IPTG) to prevent expression of gene products that may be toxic to the E. coli host.
In summary, molecular cloning experiments often require competent cells with certain properties for transformation. To save time and effort, it is crucial to select competent cells that are appropriately designed for the desired applications.