Protein expression refers to the way in which proteins are synthesized, modified and regulated in living organisms. In protein research, the term can apply to either the object of study or the laboratory techniques required to manufacture proteins. This article focuses on the latter meaning of protein expression. However, in practical terms, recombinant protein production depends on using cellular machinery.

Introduction to protein expression

Proteins are synthesized and regulated depending upon the functional need in the cell. The blueprints for proteins are stored in DNA and decoded by highly regulated transcriptional processes to produce messenger RNA (mRNA). The message coded by an mRNA is then translated into a protein. Transcription is the transfer of information from DNA to mRNA, and translation is the synthesis of protein based on a sequence specified by mRNA.

Simple diagram of transcription and translation. This describes the general flow of information from DNA base-pair sequence (gene) to amino acid polypeptide sequence (protein).
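
The flow of information described above can be illustrated with a minimal sketch. It assumes the input is the DNA coding (sense) strand and uses the standard genetic code; the function names `transcribe` and `translate` are illustrative, not part of any library.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA that matches a DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate an mRNA read in frame from its first base, up to a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the polypeptide
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGGCCAAATGA"      # toy gene: Met-Ala-Lys followed by a stop codon
mrna = transcribe(gene)
print(mrna)                # AUGGCCAAAUGA
print(translate(mrna))     # MAK
```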


In prokaryotes, transcription and translation occur simultaneously: translation of an mRNA starts even before the mature transcript is fully synthesized. This is termed coupled transcription and translation. In eukaryotes, the processes are spatially separated and occur sequentially, with transcription happening in the nucleus and translation, or protein synthesis, occurring in the cytoplasm.

Comparison of transcription and translation in prokaryotes vs. eukaryotes.


Transcription and translation

Transcription occurs in three steps in both prokaryotes and eukaryotes: initiation, elongation and termination. Transcription begins when the double-stranded DNA is unwound to allow the binding of RNA polymerase. The polymerase then moves along the template strand, synthesizing RNA during elongation, and is released from the DNA once transcription is terminated. Transcription is regulated at various levels by activators and repressors and also, in eukaryotes, by chromatin structure. In prokaryotes, no special modification of mRNA is required, and translation of the message starts even before transcription is complete. In eukaryotes, however, the mRNA is further processed: introns are removed (splicing), a cap is added at the 5′ end, and multiple adenines are added at the 3′ end to generate a poly(A) tail. The modified mRNA is then exported to the cytoplasm, where it is translated.
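
The eukaryotic processing steps can be sketched as simple string operations. This is only a toy model: the intron coordinates, the symbolic `m7G-` cap label, the default tail length and the function name `process_pre_mrna` are all assumptions made for illustration.

```python
def process_pre_mrna(pre_mrna: str, introns: list[tuple[int, int]],
                     poly_a_length: int = 20) -> str:
    """Splice out introns, then add a symbolic 5' cap and a 3' poly(A) tail.

    `introns` holds half-open (start, end) positions within `pre_mrna`.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron itself
    exons.append(pre_mrna[position:])           # keep the final exon
    spliced = "".join(exons)
    # The cap is a modified nucleotide, so it is shown here only as a label.
    return "m7G-" + spliced + "A" * poly_a_length


# Toy transcript: exon 1, a 12-nt intron (GU...AG), exon 2.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "AAAUGA"
print(process_pre_mrna(pre_mrna, introns=[(6, 18)], poly_a_length=5))
# m7G-AUGGCCAAAUGAAAAAA
```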

Translation or protein synthesis is a multi-step process that requires macromolecules like ribosomes, transfer RNAs (tRNA), mRNA and protein factors as well as small molecules like amino acids, ATP, GTP and other cofactors. There are specific protein factors for each step of translation (see table below). The overall process is similar in both prokaryotes and eukaryotes, although particular differences exist.

During initiation, the small subunit of the ribosome, bound to initiator tRNA, scans the mRNA starting at the 5′ end to identify and bind the initiation codon (AUG). The large subunit of the ribosome then joins the small ribosomal subunit to generate the initiation complex at the initiation codon. Protein factors as well as sequences in the mRNA are involved in the recognition of the initiation codon and formation of the initiation complex. During elongation, tRNAs that have been loaded with their designated amino acids (a step known as tRNA charging) shuttle them to the ribosome, where they are polymerized to form a peptide. The sequence of amino acids added to the growing peptide is determined by the mRNA sequence of the transcript. Finally, the nascent polypeptide is released in the termination step when the ribosome reaches the termination codon. At this point, the ribosome is released from the mRNA and is ready to initiate another round of translation.
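
As a rough illustration of initiation and termination, the sketch below scans an mRNA from the 5′ end for the first AUG and then reads codon by codon until an in-frame stop codon. It is a simplification that ignores the sequence context and protein factors; the function name `find_open_reading_frame` is hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_open_reading_frame(mrna: str):
    """Return (start, end) of the first AUG-to-stop reading frame, or None.

    `end` is exclusive and includes the stop codon.
    """
    start = mrna.find("AUG")           # scan from the 5' end for the first AUG
    if start == -1:
        return None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return start, i + 3        # termination codon reached
    return None                        # no in-frame stop codon found


mrna = "GGCACCAUGGCCAAAUGACCC"
frame = find_open_reading_frame(mrna)
print(frame)                        # (6, 18)
print(mrna[frame[0]:frame[1]])      # AUGGCCAAAUGA
```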


Protein synthesis machinery

Summary of the primary components and features of prokaryotic and eukaryotic translational apparatus.

Ribosomes
  • Prokaryotes: 30S and 50S subunits
  • Eukaryotes: 40S and 60S subunits

Template or mRNA
  • Prokaryotes: No further processing of the mRNA transcript occurs after transcription. The mRNA is polycistronic and contains multiple initiation sites.
  • Eukaryotes: After transcription, the mRNA transcript is spliced to remove the noncoding regions (introns), and a cap structure (7-methylguanosine) and a polyadenosine sequence are added at the 5′ and 3′ ends of the message, respectively. The cap structure and the poly(A) tail are important for export of mRNA to the cytoplasm, proper initiation of translation and stability of mRNA, among other functions. The mRNA is usually monocistronic.

Features of translation
  • Prokaryotes: The Shine-Dalgarno sequence is present on the mRNA transcript, and a complementary sequence is present in the 16S rRNA of the small ribosomal subunit. This facilitates binding and alignment of the ribosome on the mRNA at the translation initiation site (AUG). The first amino acid of the nascent polypeptide is formylated methionine.
  • Eukaryotes: Translation initiation occurs in two ways:
    • Cap-dependent translation: The cap structure and the cap-binding proteins are responsible for proper ribosome binding to mRNA and recognition of the correct initiation codon. The first AUG codon from the 5′ end of the mRNA usually functions as the initiation codon. A Kozak sequence may be present around the initiation codon.
    • Cap-independent translation: Ribosome binding to mRNA occurs through an internal ribosome entry site (IRES) on the mRNA.

Initiation factors
  • Prokaryotes: Three initiation factors are known: IF1, IF2 and IF3.
  • Eukaryotes: More than three initiation factors, which are regulated by phosphorylation. The initiation step is the rate-limiting step in eukaryotic translation.

Elongation factors
  • Prokaryotes: EF-Tu, EF-Ts and EF-G
  • Eukaryotes: EF1 (α, β, γ) and EF2

Termination or release factors
  • Prokaryotes: RF1 and RF2
  • Eukaryotes: eRF1
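
The prokaryotic Shine-Dalgarno mechanism summarized in the table can be sketched as a motif search. The example assumes the consensus motif AGGAGG and a spacer window of 4 to 10 nucleotides between the motif and the AUG; both the window limits and the function names are illustrative choices, and real ribosome binding sites vary.

```python
SHINE_DALGARNO = "AGGAGG"  # consensus motif, assumed for this sketch


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(str.maketrans("ACGU", "UGCA"))[::-1]


def find_sd_start_sites(mrna: str, min_spacer: int = 4, max_spacer: int = 10):
    """Return (motif_start, aug_start) pairs where AGGAGG sits upstream of an AUG."""
    sites = []
    motif_length = len(SHINE_DALGARNO)
    for aug_start in range(len(mrna) - 2):
        if mrna[aug_start:aug_start + 3] != "AUG":
            continue
        for spacer in range(min_spacer, max_spacer + 1):
            motif_start = aug_start - spacer - motif_length
            if motif_start >= 0 and mrna[motif_start:aug_start - spacer] == SHINE_DALGARNO:
                sites.append((motif_start, aug_start))
    return sites


# The rRNA sequence that pairs with the motif is its reverse complement.
print(reverse_complement(SHINE_DALGARNO))            # CCUCCU
print(find_sd_start_sites("UUAGGAGGUAACAUAUGGCC"))   # [(2, 14)]
```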


Post-translational modification

After translation, polypeptides are modified in various ways to complete their structure, designate their location or regulate their activity within the cell. Post-translational modifications (PTMs) are additions or alterations to the chemical structure of a protein and are critical to how cells control protein function.

Types of post-translational modifications include:

  • Polypeptide folding into a globular protein with the help of chaperone proteins to arrive at the lowest energy state
  • Modifications of the amino acids present, such as removal of the first methionine residue
  • Disulfide bridge formation or reduction
  • Protein modifications that facilitate binding functions:
    • Glycosylation
    • Prenylation of proteins for membrane localization
    • Acetylation of histones to modify DNA–histone interactions
  • Addition of functional groups that regulate protein activity:
    • Phosphorylation
    • Nitrosylation
    • GTP binding
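
Because several of these modifications change the chemical composition of a protein, they also shift its mass. The sketch below adds up such shifts for a few of the modifications listed above, using standard monoisotopic values rounded to five decimals. The dictionary keys, the function name and the example protein mass are assumptions for illustration.

```python
# Monoisotopic mass changes in daltons for a few common modifications.
MASS_SHIFT_DA = {
    "phosphorylation": 79.96633,   # addition of HPO3
    "acetylation": 42.01057,       # addition of C2H2O
    "disulfide_bond": -2.01565,    # loss of two hydrogens per bridge formed
}


def modified_mass(unmodified_mass: float, modifications: dict[str, int]) -> float:
    """Return the protein mass after applying counted modifications."""
    shift = sum(MASS_SHIFT_DA[name] * count for name, count in modifications.items())
    return unmodified_mass + shift


# Hypothetical 10 kDa protein carrying two phosphate groups.
print(round(modified_mass(10000.0, {"phosphorylation": 2}), 3))  # 10159.933
```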

Recombinant protein expression methods

In general, proteomics research involves investigating any aspect of a protein such as structure, function, modifications, localization or protein interactions. To investigate how particular proteins regulate biology, researchers usually require a means of producing (manufacturing) functional proteins of interest.

Given the size and complexity of proteins, chemical synthesis is rarely a viable option for this endeavor. Instead, living cells and their cellular machinery are usually harnessed as factories to build proteins based on supplied genetic templates.

Unlike proteins, DNA is simple to construct synthetically or in vitro using well-established recombinant DNA techniques. Therefore, DNA templates of specific genes, with or without add-on reporter or affinity tag sequences, can be constructed as templates for protein expression. Proteins produced from such DNA templates are called recombinant proteins.
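
As an illustration of adding an affinity tag to a template, the sketch below inserts six histidine codons directly after the start codon of a coding sequence. The choice of an N-terminal polyhistidine tag, the specific codons and the function name `add_n_terminal_tag` are assumptions for this example; real constructs usually also include linker or protease cleavage sequences.

```python
HIS6_TAG = "CATCACCATCACCATCAC"  # six histidine codons (CAT and CAC both encode His)


def add_n_terminal_tag(coding_sequence: str, tag: str = HIS6_TAG) -> str:
    """Insert a tag right after the ATG start codon, keeping the reading frame."""
    cds = coding_sequence.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must begin with an ATG start codon")
    if len(cds) % 3 != 0 or len(tag) % 3 != 0:
        raise ValueError("sequence and tag lengths must be multiples of three")
    return "ATG" + tag + cds[3:]


print(add_n_terminal_tag("ATGGCCAAATGA"))
# ATGCATCACCATCACCATCACGCCAAATGA
```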

Traditional strategies for recombinant protein expression involve introducing a DNA vector that contains the template into cells (by transfection or transformation) and then culturing the cells so that they transcribe and translate the desired protein. Typically, the cells are then lysed to extract the expressed protein for subsequent purification. Both prokaryotic and eukaryotic in vivo protein expression systems are widely used. The selection of the system depends on the type of protein, the requirements for functional activity and the desired yield. The available expression systems include mammalian, insect, yeast, bacterial, algal and cell-free, and are summarized in the table below. Each system has advantages and challenges, and choosing the right system for the specific application is important for successful recombinant protein expression.

Recombinant protein expression systems.


Mammalian protein expression

Mammalian expression systems can be used to produce mammalian proteins that have the most native structure and activity because they provide a physiologically relevant environment. This results in high levels of post-translational processing and functional activity. Mammalian expression systems are the preferred system for the expression of mammalian proteins and can be used for the production of antibodies, complex proteins and proteins for use in functional cell-based assays. However, these benefits are coupled with more demanding culture conditions.

Mammalian expression systems can be used to produce proteins transiently or through stable cell lines, where the expression construct is integrated into the host genome. While stable cell lines can be used over several experiments, transient production can generate large amounts of protein in one to two weeks. These transient, high-yield mammalian expression systems utilize suspension cultures and can produce gram-per-liter yields. Furthermore, these proteins have more native folding and post-translational modifications, such as glycosylation, than proteins from other expression systems. In the example that follows, three different mammalian expression systems were used to express recombinant proteins.

Recombinant protein yield. The Gibco FreeStyle CHO, Expi293 and ExpiCHO Expression Systems were used to transiently express human IgG, rabbit IgG and EPO (erythropoietin) using the pcDNA 3.4 expression vector. The Max Titer protocol was used for ExpiCHO, and proteins were harvested at day 10–12. For FreeStyle CHO and Expi293, the proteins were harvested at day 6 or 7. All proteins were quantitated by ForteBio Octet or ELISA. Use of ExpiCHO results in higher protein titers as compared to FreeStyle CHO and Expi293.


Insect protein expression

Insect cells can be used for high-level protein expression with modifications similar to those of mammalian systems. There are several systems that can be used to produce recombinant baculovirus, which can then be utilized to express the protein of interest in insect cells. These systems can be easily scaled up and adapted to high-density suspension culture for large-scale expression of protein that is more functionally similar to native mammalian protein. Though yields can be up to 500 mg/L, recombinant baculovirus production can be time-consuming, and culture conditions are more challenging than for prokaryotic systems.
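
Yield figures such as the one above translate directly into a minimum culture volume. The sketch below shows the arithmetic; the `recovery_fraction` parameter (the share of expressed protein that survives purification) and the example target of 50 mg are assumptions, not values from this article.

```python
def culture_volume_liters(target_mg: float, titer_mg_per_l: float,
                          recovery_fraction: float = 1.0) -> float:
    """Return the culture volume needed to obtain `target_mg` of protein."""
    if titer_mg_per_l <= 0:
        raise ValueError("titer must be positive")
    if not 0 < recovery_fraction <= 1:
        raise ValueError("recovery_fraction must be in the range (0, 1]")
    return target_mg / (titer_mg_per_l * recovery_fraction)


# 50 mg of protein at the 500 mg/L upper bound, with no purification losses.
print(culture_volume_liters(50, 500))                          # 0.1
# The same target if only half of the expressed protein is recovered.
print(culture_volume_liters(50, 500, recovery_fraction=0.5))   # 0.2
```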

Baculovirus Expression System protocol summary. The Invitrogen BaculoDirect Baculovirus Expression System utilizes Invitrogen Gateway technology for cloning. After a 1-hour recombinase reaction and transfection in insect cells, baculovirus containing the gene of interest is produced. A quick expression test can then be performed before amplifying the viral stock and scaling up expression. Use of this system allows for baculovirus expression in insect cells.


Bacterial protein expression

Bacterial protein expression systems are popular because bacteria are easy to culture, grow fast and produce high yields of recombinant protein. However, multi-domain eukaryotic proteins expressed in bacteria are often non-functional because the cells are not equipped to accomplish the required post-translational modifications or molecular folding. Also, many proteins become insoluble and accumulate as inclusion bodies, which are very difficult to recover without harsh denaturants and subsequent cumbersome protein-refolding procedures. In the example that follows, a bacterial cell-based system was used to express eight different recombinant proteins.

Protein expression in bacterial cells. Gateway cloning was used to clone eight human proteins into the Invitrogen Champion pET300/NT-DEST vector. BL21(DE3) E. coli were utilized to express positive clones in either LB + IPTG (1), ready-to-use Invitrogen MagicMedia medium (2), or MagicMedia medium prepared from powder (3). Samples were lysed and analyzed on a Coomassie blue dye–stained Invitrogen NuPAGE 4–12% Bis-Tris Protein Gel. M = Invitrogen SeeBlue Protein Standard. Use of MagicMedia E. coli medium results in higher protein yield across different samples.


Cell-free protein expression

Cell-free protein expression is the in vitro synthesis of a protein using translation-compatible extracts of whole cells. In principle, whole cell extracts contain all the macromolecules and components needed for transcription, translation and even post-translational modification. These components include RNA polymerase, regulatory protein factors, transcription factors, ribosomes and tRNA. When supplemented with cofactors, nucleotides and the specific gene template, these extracts can synthesize proteins of interest in a few hours.

Although not sustainable for large-scale production, cell-free, or in vitro translation (IVT), protein expression systems have several advantages over traditional in vivo systems. Cell-free expression allows for fast synthesis of recombinant proteins without the hassle of cell culture. Cell-free systems enable protein labeling with modified amino acids, as well as expression of proteins that undergo rapid proteolytic degradation by intracellular proteases. Also, with the cell-free method, it is simpler to express many different proteins simultaneously (e.g., testing protein mutations by expression on a small scale from many different recombinant DNA templates). In this representative experiment, an IVT system was used to express human caspase-3 protein.

Caspase-3 expression in a human IVT system. Caspase-3 was expressed using the Thermo Scientific 1-Step Human High-Yield IVT Kit (Human IVT) and in E. coli (Recombinant). Caspase-3 activity was assayed using equal amounts of protein. Caspase-3 protein expressed using the IVT system was more active than the protein expressed in bacteria.


Chemical protein synthesis

Chemical synthesis of proteins can be used for applications requiring proteins labeled with unnatural amino acids, proteins labeled at specific sites or proteins that are toxic to biological expression systems. Chemical synthesis produces highly pure protein but works well only for small proteins and peptides. Yield is often quite low with chemical synthesis, and the method is prohibitively expensive for longer polypeptides. 

