Recombinant protein expression technology enables analysis of gene regulation and protein structure and function. Utilization of recombinant protein expression varies widely—from investigation of function in vivo to large-scale production for structural studies and biotherapeutic drug discovery. This handbook will cover the fundamentals of protein expression, from selecting a host system to creating your protein expression vector, as well as highlighting key tips and products that can be used to optimize recombinant protein production and purification.
If you've found this chapter—Protein Expression Overview—useful, you may be interested in getting your own copy of the entire 118-page Protein Expression Handbook in convenient PDF format.
Proteins are synthesized and regulated depending upon the functional need in the cell. The blueprints for proteins are stored in DNA, which is used as a template by highly regulated transcriptional processes to produce messenger RNA (mRNA). The message coded by an mRNA is then translated into defined sequences of amino acids that form a protein (Figure 1.1). Transcription is the transfer of information from DNA to mRNA, and translation is the synthesis of protein based on an amino acid sequence specified by mRNA.
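The flow of information described above can be sketched in a few lines of Python. This is only an illustration, not a model of the cellular machinery: it assumes the input is the coding (sense) strand written 5' to 3', uses the standard genetic code, and starts translation at the first AUG it finds. The function names and the demo sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA matching the coding strand (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the message."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # X = unrecognized codon
        if amino_acid == "*":  # stop codon ends the polypeptide
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical demo sequence: 2 leading bases, a short open reading frame, 2 trailing bases.
dna = "GGATGGCTAAAGAACTGTGGTAACC"
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)
print(protein)
```

Real transcripts also carry untranslated regions, and eukaryotic genes usually contain introns that are spliced out before translation; the sketch ignores both.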
In prokaryotes, the processes of transcription and translation occur simultaneously. The translation of mRNA starts even before a mature mRNA transcript is fully synthesized. This simultaneous transcription and translation of a gene is termed coupled transcription and translation. In eukaryotes, the processes are spatially separated and occur sequentially, with transcription happening in the nucleus and translation occurring in the cytoplasm. After translation, polypeptides are modified in various ways to complete their structure, designate their location, or regulate their activity within the cell. Posttranslational modifications (PTMs) are various additions or alterations to the chemical structure of the newly synthesized protein and are critical features of the overall cell biology.
In general, proteomics research involves investigating any aspect of a protein, such as structure, function, modifications, localization, or interactions with proteins or other molecules. To investigate how particular proteins regulate biology, researchers usually require a means of producing functional proteins of interest. Given the size and complexity of proteins, de novo chemical synthesis is not a viable option for this endeavor. Instead, living cells or their cellular machinery can be harnessed as factories to build proteins based on supplied genetic templates. Unlike proteins, DNA is simple to construct synthetically or in vitro using well-established recombinant DNA techniques. Therefore, DNA sequences of specific genes can be constructed as templates for subsequent protein expression (Figure 1.2). Proteins produced from such DNA templates are called recombinant proteins.
Cloning refers to the process of transferring a DNA fragment, or gene of interest, from one organism to a self-replicating genetic element such as an expression vector (Figure 1.3).
A typical expression vector includes at least 4 key elements:
Additional elements may include:
Most vectors contain a promoter for expression in a specific host system; however, some offer the option to add your own promoter. Table 1.1 lists common constitutive and inducible promoters.
|CMV (cytomegalovirus); EF-1 alpha (human elongation factor 1 alpha); UbC (human ubiquitin C); SV40 (simian virus 40)
|Promoter with TetO2 (tetracycline operator); promoter with GAL4-UAS (yeast GAL4 upstream activating sequence)
|Tetracycline or doxycycline; mifepristone
|Cell free (rabbit reticulocyte)
|Cell free (HeLa or CHO)
|Requires T7 promoter and T7 RNA polymerase for transcription
|Ac5 (actin); OpIE1 & 2; PH (polyhedrin); p10
|GAP (glyceraldehyde-3-phosphate dehydrogenase)
|AOX1 (alcohol oxidase); GAL1 (galactokinase)
|Not commonly available
|Lac (lactose operon); araBAD (L-arabinose operon)
|Requires T7 promoter and T7 RNA polymerase for transcription
Depending on the host system, another important factor to consider is the inclusion of a Shine-Dalgarno ribosome-binding sequence (for prokaryote systems) or Kozak consensus sequence (for eukaryote systems).
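As a rough illustration of what checking for these elements could look like, the Python sketch below inspects the sequence context around a start codon. It rests on stated assumptions: the Kozak check looks only at the two most conserved positions (a purine at -3 and a G at +4, counting the A of ATG as +1), the Shine-Dalgarno check searches for the consensus motif AGGAGG, and the 12-nucleotide search window is an arbitrary illustrative value. Function names and test sequences are hypothetical.

```python
def has_strong_kozak(dna: str, atg_index: int) -> bool:
    """True if the ATG at atg_index has a purine at -3 and a G at +4."""
    seq = dna.upper()
    if atg_index < 3 or atg_index + 4 > len(seq):
        return False  # not enough flanking sequence to judge
    if seq[atg_index:atg_index + 3] != "ATG":
        return False
    return seq[atg_index - 3] in "AG" and seq[atg_index + 3] == "G"


def shine_dalgarno_spacer(dna: str, atg_index: int,
                          motif: str = "AGGAGG", max_spacer: int = 12):
    """Return the number of bases between the motif and the ATG, or None.

    Only motifs ending within max_spacer bases upstream of the ATG count.
    """
    seq = dna.upper()
    window_start = max(0, atg_index - max_spacer - len(motif))
    pos = seq.rfind(motif, window_start, atg_index)
    if pos == -1:
        return None
    return atg_index - (pos + len(motif))


# Hypothetical example sequences.
eukaryotic = "GCCACCATGGCT"            # Kozak-like context around the ATG
weak = "TTTTTTATGCCC"                  # pyrimidine at -3, C at +4
prokaryotic = "TTAGGAGGTAACATATGGCT"   # AGGAGG upstream of the ATG

print(has_strong_kozak(eukaryotic, eukaryotic.find("ATG")))
print(has_strong_kozak(weak, weak.find("ATG")))
print(shine_dalgarno_spacer(prokaryotic, prokaryotic.find("ATG")))
```

A check like this can only flag an obviously missing element; actual translation efficiency depends on more than these few positions.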
Epitope tags are commonly used for easy detection or rapid purification of your protein of interest. The sequence coding for the tag is fused in frame with your gene, so the tag can sit at either the N-terminus or the C-terminus of your recombinant protein. Table 1.2 offers some basic guidelines to help select an epitope tag.
|Examples of tag
|Well-characterized antibody available against the tag; easily visualized
|V5, Xpress, myc, 6XHis, GST, BioEase tag, capTEV tag, GFP, Lumio tag, HA tag, FLAG tag
|Resins available to facilitate purification
|6XHis, GST, BioEase tag, capTEV tag
|Protease recognition site (TEV, EK, HRV3C, Factor Xa) to remove tag after expression to get native protein
|Any tag with a protease recognition site following the tag (only on N-terminus)
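The note that a cleavable tag belongs on the N-terminus can be made concrete with a small Python sketch. It assumes a 6XHis tag and the canonical TEV protease recognition sequence ENLYFQG, which TEV cleaves between Q and G; the helper names and the six-residue demo protein are hypothetical, and the protein's own start methionine is simply left in place.

```python
HIS_TAG = "HHHHHH"    # 6XHis affinity tag
TEV_SITE = "ENLYFQG"  # TEV protease cuts between Q and G


def n_terminal_fusion(protein: str) -> str:
    """Start Met, then tag, then protease site, then the protein of interest."""
    return "M" + HIS_TAG + TEV_SITE + protein


def c_terminal_fusion(protein: str) -> str:
    """Protein of interest, then protease site, then tag."""
    return protein + TEV_SITE + HIS_TAG


def cleave_tev(fusion: str) -> list[str]:
    """Split the fusion at the first TEV site; return it whole if no site is present."""
    idx = fusion.find(TEV_SITE)
    if idx == -1:
        return [fusion]
    cut = idx + len(TEV_SITE) - 1  # after Q, before the final G
    return [fusion[:cut], fusion[cut:]]


protein = "MAKELW"  # hypothetical protein of interest

n_fragments = cleave_tev(n_terminal_fusion(protein))
c_fragments = cleave_tev(c_terminal_fusion(protein))
print(n_fragments)  # tag fragment, then protein with one extra N-terminal G
print(c_fragments)  # protein still carrying ENLYFQ, then G plus the tag
```

Because the cut falls near the C-terminal end of the recognition sequence, an N-terminal tag leaves a single extra residue on the protein, while a C-terminal tag leaves most of the site attached.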
Genes and their variants can be prepared via PCR, isolated as a cDNA clone, or synthesized as Invitrogen GeneArt Strings DNA Fragments or Libraries. Alternatively, genes can be synthesized by Invitrogen GeneArt Gene Synthesis or Directed Evolution custom services. Read more about building your gene in our Gene to Protein Handbook.
Thermo Fisher Scientific offers a variety of unique cloning technologies to shuttle your gene of interest into the right vector, to simplify cloning procedures, and help accelerate protein expression.
Once cloning is completed, plasmids are taken up into competent cells (chemically competent or electrocompetent E. coli) for propagation and storage by a process called transformation. Chemically competent cells have been treated with salts that make the membrane and cell wall permeable to DNA. Plasmid DNA is added to the cells, and a mild heat shock then allows the plasmid to enter. In contrast, DNA enters electrocompetent cells through transient pores that form in the E. coli membrane and cell wall when short electrical pulses are delivered to the mixture of cells and plasmid DNA. When choosing a competent cell strain to work with, it is important to consider the following factors:
After taking advantage of E. coli's molecular machinery to replicate the plasmid DNA, a plasmid purification kit can be used to purify the plasmid DNA (containing your gene of interest). We offer two main technologies for plasmid purification:
For purification of a cloned plasmid that will be used to transfect a cell line for protein expression, we recommend anion exchange purification for its higher purity and lower endotoxin levels. Silica-based purification is appropriate for cloning-related workflows but is not optimal for plasmids used for transfection, because it leaves higher levels of endotoxins and impurities. Anion exchange columns also produce better results with large plasmids. The Invitrogen PureLink HiPure Expi Plasmid Kits have been developed to give higher yields from large-scale plasmid isolation, in less than half the time of typical plasmid DNA isolation methods.
Using the right expression system for your specific application is the key to success. Protein solubility, functionality, purification speed, and yield are often crucial factors to consider when choosing an expression system, and each system has its own strengths and challenges. We offer 6 unique expression systems: mammalian, insect, yeast, bacterial, algal, and cell-free systems. Table 1.3 summarizes the main characteristics of these expression systems, including the most common applications, advantages, and challenges of each system.
Once a system is selected, the method of gene delivery will need to be considered for protein expression. The main methods for gene delivery include transfection and transduction.
Transfection is the process by which nucleic acids are introduced into mammalian and insect cells. Protocols and techniques vary widely and include lipid-mediated transfection, chemical methods, and physical methods such as electroporation.
For cell types not amenable to lipid-mediated transfection, viral vectors are often employed. Virus-mediated transfection, also known as transduction, offers a means to reach hard-to-transfect cell types for protein overexpression or knockdown, and it is the most commonly used method in clinical research. Adenoviral, oncoretroviral, lentiviral, and baculoviral vectors have been used extensively for gene delivery to mammalian cells, both in cell culture and in vivo.
The next step following protein expression is often to isolate and purify the protein of interest. Protein yield and activity can be maximized by selecting the right lysis reagents and an appropriate purification resin. We offer cell lysis formulations that have been optimized for specific host systems, including cultured mammalian, yeast, baculovirus-infected insect, and bacterial cells. Most recombinant proteins are expressed as fusion proteins with affinity tags, such as polyhistidine or glutathione S-transferase (GST), which allow for selective purification of the protein of interest. Recombinant His-tagged proteins are purified using immobilized metal affinity chromatography (IMAC) resins, and GST-tagged proteins are purified using a reduced glutathione resin.
For Research Use Only. Not for use in diagnostic procedures.