Epitope tagging was first described by Munro & Pelham in 1984 as a technique where a known epitope is fused to a recombinant protein, often on the N- or C-terminus, using genetic engineering. The fusion gene is cloned in an expression vector appropriate for the desired host cell type and transfected. Thereafter, the epitope-tagged fusion protein can be either detected or purified using an antibody specific for the epitope tag.

Proteins encoded by known genes can be fused to epitope tags and then easily detected or purified using well-characterized tag antibodies. Epitope tagging is an efficient way to study the following:

  1. Proteins for which antibodies are not available
  2. Novel or newly discovered proteins, without the need to generate antibodies
  3. Poorly immunogenic proteins
  4. Proteins that have low abundance in endogenous conditions
  5. The topology of proteins and protein complexes, and the identity of associated proteins

Epitope tags, together with the antibodies against them, have proven to be an efficient way to apply immunochemical and immunocytochemical methods to target proteins. Epitope tag antibodies can be used for a variety of applications including protein purification, western blot, immunoprecipitation, flow cytometry, and immunofluorescence. Below is a schematic of the epitope tagging of a protein and its detection using a tag antibody.

Types of epitope tags

There is a wide range of epitope tags that are used to study proteins of interest. Generally, the type of epitope tag selected for genetic fusion depends on the intended downstream application. The table below summarizes features and applications of the most used epitope tags.

Tag Name | Type of Tag | Sequence/Size | Applications
Hemagglutinin (HA) | Peptide | YPYDVPDYA | Protein detection, immunoprecipitation, protein-protein interaction studies
c-Myc | Peptide | EQKLISEEDL | Protein detection, immunoprecipitation, protein-protein interaction studies
V5 | Peptide | GKPIPNPLLGLDST | Protein detection, immunoprecipitation, protein-protein interaction studies
DYKDDDDK | Peptide | DYKDDDDK | Protein detection, immunoprecipitation, protein-protein interaction studies
6X-His | Affinity | HHHHHH | Affinity purification, protein detection
Glutathione S-Transferase (GST) | Affinity | 26 kDa | Affinity purification, protein detection, protein-protein interaction studies
Maltose Binding Protein (MBP) | Affinity | 43 kDa | Affinity purification, protein detection, protein-protein interaction studies
Green Fluorescent Protein (GFP) | Fluorescent protein | 27 kDa | Protein detection & visualization, FRET, live cell imaging, cell sorting
Red Fluorescent Protein (RFP) | Fluorescent protein | 26 kDa | Protein detection & visualization, FRET, live cell imaging, cell sorting
mCherry | Fluorescent protein | 28 kDa | Protein detection & visualization, FRET, live cell imaging, cell sorting

Based on their characteristics and applications, epitope tags are divided into three classes: peptide tags, affinity tags, and fluorescent tags. The sections below describe each class and the individual tags in more detail.
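To put the peptide tag sequences in the table above in perspective next to the protein-sized tags, the mass of a short tag can be estimated from its sequence. The following is a minimal sketch, not a validated tool: the function name `peptide_mass_da` is hypothetical, the values are standard average residue masses, and no modifications are considered.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS_DA = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_DA = 18.02  # one water is added back for the free peptide termini


def peptide_mass_da(sequence: str) -> float:
    """Estimate the average mass of an unmodified free peptide."""
    return sum(RESIDUE_MASS_DA[aa] for aa in sequence.upper()) + WATER_DA


# Tag sequences as listed in the table.
PEPTIDE_TAGS = {
    "HA": "YPYDVPDYA",
    "c-Myc": "EQKLISEEDL",
    "V5": "GKPIPNPLLGLDST",
    "DYKDDDDK": "DYKDDDDK",
    "6X-His": "HHHHHH",
}

for name, seq in PEPTIDE_TAGS.items():
    print(f"{name}: {len(seq)} residues, ~{peptide_mass_da(seq) / 1000:.2f} kDa")
```

Under these assumptions each short tag comes out on the order of 1 kDa, compared with tens of kilodaltons for GST, MBP, or the fluorescent proteins.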

Peptide tags

Peptide tags are short amino acid sequences offering a distinct epitope for the detection of the fused protein. Peptide tags can be used in tandem to improve their detection sensitivity or in combination with another tag. Due to their small size, peptide tags generally do not disturb protein function. They can be fused to the C-terminus or the N-terminus of a protein or even inserted within a protein. Commonly used peptide tags are HA (derived from the hemagglutinin protein of influenza virus), Myc, V5, and DYKDDDDK. Even though they are most routinely used for detection, some peptide tags, such as DYKDDDDK, are also used for purification of the fused protein.
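At the sequence level, terminal and tandem tagging can be sketched as simple string concatenation. The example below is illustrative only: `fuse_tag` and the toy protein sequence are hypothetical, and it does not model linkers, the initiator methionine for N-terminal tags, or internal insertion.

```python
def fuse_tag(protein: str, tag: str, terminus: str = "C", copies: int = 1) -> str:
    """Return the protein sequence with `copies` tandem tags at one terminus."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    tandem = tag * copies
    if terminus == "N":
        return tandem + protein
    if terminus == "C":
        return protein + tandem
    raise ValueError("terminus must be 'N' or 'C'")


HA = "YPYDVPDYA"
toy_protein = "MKTAYIAKQR"  # made-up sequence, for illustration only

# A C-terminal 3xHA fusion of the toy protein.
print(fuse_tag(toy_protein, HA, terminus="C", copies=3))
```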

Short descriptions of the various peptide tags are listed below:

Hemagglutinin (HA) tag
The HA tag is derived from a surface glycoprotein, hemagglutinin of human influenza, corresponding to amino acids 98-106 (YPYDVPDYA). This segment was selected due to its high immunogenicity and immuno-inaccessibility in the native hemagglutinin conformation. It is commonly fused to proteins to aid in the identification and purification of the tagged protein for functional analysis. HA tag antibodies can be used to facilitate protein detection or co-immunoprecipitation of tagged proteins. Below is data showing an HA tag monoclonal antibody used for immunoprecipitation (IP) and western blot (WB) detection.

Western blot analysis of Histone H3 tagged with HA and expressed in HEK-293 cells

HA tag monoclonal antibody in immunoprecipitation and western blot. Immunoprecipitation (IP) of HA tag was performed with 5 µg of the HA Tag Monoclonal Antibody (2-2.2.14) (Cat. No. 26183) on HEK-293 total cell extract (b) that was previously transfected with an HA-tagged Histone H3 vector. Untransfected total cell extract was used as a control (a). Normal mouse IgG was used as a negative IP control. HA-tagged histone H3 was detected at ~17 kDa using Histone H3 Recombinant Rabbit Monoclonal Antibody (17H2L9) (Cat. No. 702023) at 2 µg/mL. Goat anti-Mouse IgG (H+L) Superclonal Recombinant Secondary Antibody, HRP (Cat. No. A28177) was used for detection at 1:4000 dilution. Lanes 1 and 2 represent 10% of the total cell extract (input), lanes 3 and 4 represent mouse IgG as an isotype control, and lanes 5 and 6 represent IP done with the HA Tag Monoclonal Antibody.

c-Myc tag
The c-Myc (Myc) tag originated from c-myc, a proto-oncogene which encodes a nuclear phosphoprotein that plays a role in cell cycle progression, apoptosis, and cellular transformation. The Myc tag is only 10 amino acids long (EQKLISEEDL). It is used extensively for western blotting, immunoprecipitation, and flow cytometry.

V5 tag
The V5 epitope tag is derived from a small epitope (Pk) present on the P and V proteins of the paramyxovirus simian virus 5 (SV5). The V5 tag is usually used with all 14 amino acids (GKPIPNPLLGLDST), although it has also been used with a shorter 9 amino acid sequence (IPNPLLGLD). It can be used to detect expression of recombinant proteins in bacterial, yeast, insect, and mammalian systems.

DYKDDDDK tag
The DYKDDDDK tag was the second fully functional epitope tag to be published. The DYKDDDDK tag is especially compatible with the fused protein because it is more hydrophilic than other common epitope tags and, therefore, less likely to denature or inactivate the protein to which it is appended. Unlike other epitope tags, the DYKDDDDK tag is an idealized, artificial design to which monoclonal antibodies were later raised. The enterokinase cleavage site in the DYKDDDDK tag allows it to be completely removed from the purified fusion protein. The data below show detection of a DYKDDDDK-tagged protein in immunocytochemistry (ICC).

DYKDDDDK tag antibody in immunofluorescence. DYKDDDDK-tagged BNIP3 was transfected in HEK293 cells. The DYKDDDDK-tagged protein was probed with DYKDDDDK Tag Recombinant Rabbit Polyclonal Antibody (Cat. No. 740001, 1 µg/mL). Goat anti-Rabbit IgG (H+L) Superclonal Recombinant Secondary Antibody, Alexa Fluor 488 (Cat. No. A27034, 1:2000 dilution) was used for detection. Panel a shows cells stained for detection and localization of DYKDDDDK-tag protein (green) and panel b is stained for nuclei (blue). Panel c represents cytoskeletal F-actin staining using Rhodamine Phalloidin (Cat. No. R415, 1:300 dilution). Panel d is a composite image of panels a, b, and c demonstrating cytoplasmic localization of DYKDDDDK - BNIP3. Panel e shows no signal with untransfected cells and panel f represents control cells with no primary antibody.
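The enterokinase-based removal of the DYKDDDDK tag described above can be illustrated at the sequence level. Enterokinase recognizes DDDDK and cuts on the C-terminal side of the lysine, so an N-terminal DYKDDDDK tag is released in full. The sketch below is a simplification: `cleave_n_terminal_tag` and the toy protein are hypothetical, and real cleavage efficiency depends on the residue following the site and on reaction conditions.

```python
ENTEROKINASE_SITE = "DDDDK"  # enterokinase cuts on the C-terminal side of this lysine


def cleave_n_terminal_tag(fusion: str) -> tuple[str, str]:
    """Split a fusion at the first enterokinase site; return (released_tag, protein)."""
    start = fusion.find(ENTEROKINASE_SITE)
    if start == -1:
        raise ValueError("no enterokinase site found")
    cut = start + len(ENTEROKINASE_SITE)
    return fusion[:cut], fusion[cut:]


toy_protein = "MKTAYIAKQR"  # made-up sequence, for illustration only
tag, protein = cleave_n_terminal_tag("DYKDDDDK" + toy_protein)
print(tag, protein)  # prints: DYKDDDDK MKTAYIAKQR
```

Note that the same cut applied to a C-terminal DYKDDDDK tag would leave the tag residues attached to the protein, so complete removal in this simple model applies to N-terminal fusions.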

Affinity tags

Affinity tags, as the name suggests, are often used in the affinity purification of recombinant proteins. This makes affinity tags one of the most efficient tools in protein purification. Virtually any protein (when fused with affinity tags) can be purified, even without prior knowledge of its biochemical properties. First, the protein of interest is fused with an affinity tag at either the N- or C-terminus. Then, the protein is purified using chemical or physical interactions to an immobilized substrate. The type of affinity chromatography used for purification depends on the type of tag.

Short descriptions of commonly used affinity tags for purification are listed below:

6X-His tag
The 6X-His tag is a synthetic oligopeptide consisting of 6 consecutive histidine residues (HHHHHH). The His tag is commonly placed at the N- or C-terminus of a recombinant protein. This allows isolation or purification by immobilized metal affinity chromatography. The affinity of the poly-histidine tag to Ni2+, by chelation, is strong and selective enough to enable purification of the protein by affinity chromatography on a Ni2+-NTA resin. Imidazole, which competes with histidine for nickel binding, is used to elute bound His-tagged protein. His tag-specific antibodies are used to facilitate detection or co-immunoprecipitation of tagged proteins. Data below show affinity purification of a His-tagged protein using a Ni2+-NTA agarose column.

Affinity purification of His-tagged proteins

Affinity purification of His-tagged proteins. Overexpressed 6X-His-GFP lysate was loaded on columns containing HisPur Ni-NTA Superflow Agarose. After lysate binding, the column was washed with wash buffer containing 30 mM imidazole. Bound protein was eluted with buffer containing 300 mM imidazole. Fractions containing purified 6X-His-GFP were pooled and quantitated. Load, flow-through, wash, and elute fractions were separated by SDS-PAGE and stained with Imperial Protein Stain (Cat. No. 24615) to determine purity. L: loaded lysate, FT: flow-through, W: wash, E: eluate, and M: loading marker.

Glutathione S-Transferase (GST) tag
Glutathione S-transferase (GST) is a 26 kDa protein present in eukaryotes and prokaryotes, where it catalyzes a variety of reactions. Glutathione S-transferase is used to create the GST gene fusion system. GST fusion proteins can be purified from cells via their high affinity for glutathione. Once bound, GST fusion proteins are eluted in mild, non-denaturing conditions using reduced glutathione as shown in the figure below.

The GST tag can be placed on the N- or C-terminus and can enhance the solubility of expressed proteins. GST fusion proteins offer an important biological assay for direct protein-to-protein interactions. The GST tag is relatively large compared with other common peptide-based affinity epitope tags and it may interfere with some protein functions. In these cases, it can be easily removed by protease cleavage.

Maltose Binding Protein (MBP) tag
MBP is a ~43 kDa monomeric protein encoded by the malE gene of E. coli K12. When expressed as a fusion with a protein of interest, MBP can increase the solubility of the overexpressed fusion protein in bacteria. An MBP fusion protein can be rapidly purified via its high affinity for amylose, and the fusion protein bound to amylose resin can then be eluted with maltose. The MBP tag can be detected using immunoblotting and immunoprecipitation. The MBP tag can be fused at the N- or C-terminus of the protein and, when a protease recognition site is included between the tag and the protein, can be easily cleaved using 3C protease.

Fluorescent protein reporters

Large polypeptide sequences from this class are generally used as fusion tags with target proteins for their innate fluorescence. Fluorescent tags are used in various cell biology studies to determine localization of target proteins, protein dynamics and tracking, live cell imaging, and cell sorting. Use of Green Fluorescent Protein (GFP) and its family members as fluorescent tags has become routine for tracking cellular proteins. A fluorescent tag is cloned in-frame with a target protein, which allows the protein to be visualized in time and space in specific cells, tissues, and sub-cellular compartments. Over the last two decades, applications of fluorescent protein reporters have expanded with improvements to the tags.
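What "in-frame" means for a C-terminal fusion can be checked with a few lines of code: the target coding sequence must be a whole number of codons and must not carry a stop codon ahead of the tag. The sketch below is a simplified illustration with hypothetical function names and toy DNA sequences; it does not model linkers, introns, or vector context.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(dna: str) -> list[str]:
    """Split a DNA string into complete codons, reading from the first base."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def c_terminal_fusion_in_frame(target_cds: str, tag_cds: str) -> bool:
    """True if target_cds followed by tag_cds reads through as one open reading frame."""
    target_cds, tag_cds = target_cds.upper(), tag_cds.upper()
    if len(target_cds) % 3 or len(tag_cds) % 3:
        return False  # a partial codon breaks the frame or truncates the tag
    fused = split_codons(target_cds + tag_cds)
    # A stop codon before the final position would end translation early.
    return not any(codon in STOP_CODONS for codon in fused[:-1])


# Toy sequences, not real genes.
print(c_terminal_fusion_in_frame("ATGAAAGCT", "GGTTCTTAA"))     # True
print(c_terminal_fusion_in_frame("ATGAAAGCTTAA", "GGTTCTTAA"))  # False: target stop codon retained
print(c_terminal_fusion_in_frame("ATGAAAGC", "GGTTCTTAA"))      # False: frameshift at the junction
```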

Short descriptions of commonly used fluorescent tags are listed below:

Green Fluorescent Protein (GFP) tag
The jellyfish Aequorea victoria contains Green Fluorescent Protein (GFP), which emits light in the bioluminescence reaction of the animal. GFP has been used widely as a reporter protein for gene expression in eukaryotic and prokaryotic organisms, and as a protein tag in cell culture and in multicellular organisms. GFP is a 27 kDa monomeric protein, which autocatalytically forms a fluorescent chromophore. The wild type protein absorbs ultraviolet-to-blue light (maximally at 395 nm, with a minor peak near 475 nm) and emits green light (peak emission at 508 nm) in the absence of additional proteins, substrates, or co-factors. GFP fluorescence is stable, species-independent, and suitable for a variety of applications, most commonly monitoring gene expression and protein localization.

Red Fluorescent Protein (RFP) tag
Red Fluorescent Protein (RFP) is isolated from Discosoma sp. and fluoresces red-orange when excited. The original protein isolated from Discosoma is known as DsRed. DsRed is similar in size and properties to GFP but produces a red rather than a green fluorophore. Several variants have been developed using directed mutagenesis. RFP is approximately 25.9 kDa. Its excitation maximum is 558 nm, and the emission maximum is 583 nm. RFP is commonly used as a molecular tag to study protein localization, reporter gene expression, and protein-protein interactions.

mCherry tag
The mCherry protein is a second-generation fluorescent protein derived from DsRed, a red fluorescent protein isolated from Discosoma. Several cycles of directed mutation and evolutionary selection of DsRed produced mCherry, which has an excitation maximum at 587 nm and emission maximum at 610 nm. mCherry is resistant to photobleaching, is quite stable, and has an extremely rapid maturation rate.
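The excitation and emission maxima quoted for the three fluorescent tags can be compared side by side, for example when checking how far apart two tags sit spectrally. The short sketch below only restates the values given in the descriptions above and computes the difference between emission and excitation maxima (the Stokes shift); the names used are illustrative.

```python
# Excitation and emission maxima (nm) as quoted in the tag descriptions.
SPECTRA_NM = {
    "GFP (wild type)": (395, 508),
    "RFP": (558, 583),
    "mCherry": (587, 610),
}


def stokes_shift_nm(excitation_nm: int, emission_nm: int) -> int:
    """Difference between emission and excitation maxima, in nm."""
    return emission_nm - excitation_nm


for name, (ex, em) in SPECTRA_NM.items():
    print(f"{name}: ex {ex} nm, em {em} nm, Stokes shift {stokes_shift_nm(ex, em)} nm")
```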

Applications of epitope tags

Over the years, epitope tagging has been used for various applications by biochemists and cell biologists. Advancements in the utilities of each tag are increasing all the time. The following are the commonly used applications for epitope tagged proteins:

Affinity purification
From basic biological research to structural and functional proteomics, affinity tags have become powerful tools in molecular biology, biochemistry, and cell biology studies. They are widely used to facilitate the purification and detection of proteins of interest as well as the separation of protein complexes. Affinity tags include enzymes, protein domains, or small polypeptides that bind to a range of substrates with high specificity. Substrates range from carbohydrates, small biomolecules, and metal chelates to antibodies, allowing rapid and efficient purification of proteins. Some affinity tags also increase the solubility of the recombinant protein to which they are fused. Examples of affinity tags commonly used for purification are 6X-His, GST, and MBP.

Protein detection for in vitro overexpression studies
First, an epitope-tagged fusion protein is cloned in a desired plasmid construct controlled by either a constitutive or an inducible promoter and is transfected into an appropriate host cell for expression. Overexpressed epitope-tagged protein can be detected using various techniques like western blotting, flow cytometry, and immunofluorescence along with well-characterized epitope tag antibodies. The figure below shows western blot detection of an overexpressed N-terminally V5-tagged His-LacZ protein. In vitro overexpression studies make use of epitope tags for studying target proteins, their function, expression patterns, and cellular localization. This is a useful tool for characterization of novel or newly discovered proteins and proteins of low abundance or poor immunogenicity.

Western blot analysis of V5-His-LacZ in HEK293 cells

V5 tag antibody in western blot. Western blot analysis of V5 tag was performed by loading 20 µg of whole cell extracts of untransfected HEK-293 and HEK-293 transiently overexpressing V5-His-LacZ. V5-His-LacZ was detected at ~117 kDa using V5 Tag Monoclonal Antibody (Cat. No. R960-25) at a 1:1000 dilution. Goat Anti-Mouse IgG (H+L) Secondary Antibody, HRP (Cat. No. 62-6520) at 1:4000 dilution was used, and chemiluminescent detection was performed using Pierce ECL Western Blotting Substrate (Cat. No. 32106).

Immunoprecipitation studies
Immunoprecipitation (IP) was first developed as an adaptation of traditional column affinity chromatography. It is a technique based on the principle of antigen-antibody interaction used to isolate a protein from biological samples to study its identity, expression, post-translational modifications, and interacting partners. Immunoprecipitated samples are usually analyzed using western blotting, mass spectrometry, and other assays. In IP, an antibody is added to a cell lysate or biological sample containing an antigen and incubated to allow antigen-antibody complexes to form. Then, antigen-antibody complexes are incubated with protein A/G-coated beads to allow them to bind. The beads are thoroughly washed, and the antigen is eluted from complexes using an acidic solution or a detergent like SDS. The figure below shows a schematic representation of a standard immunoprecipitation assay.

There are two variations of IP that are used for studying target protein interactions. For interactions with other proteins, co-immunoprecipitation (co-IP) is used. For interactions with nucleic acids, chromatin immunoprecipitation (ChIP) or RNA immunoprecipitation (RIP) is used.

Epitope tag-based immunoprecipitation

Immunoprecipitation of an epitope-tagged protein using tag antibodies is generally used when antibodies against the target protein are unavailable. For co-IP, a bait protein fused to an epitope tag is expressed in host cells and immunoprecipitated with a tag-specific antibody. Proteins directly interacting with the bait protein are co-immunoprecipitated and the protein complex is then analyzed by western blot. Co-IP has contributed immensely to the understanding of protein interaction networks, which is why it is considered the gold standard for protein-protein interaction studies.

Likewise, ChIP can be used to identify regions of the genome where an epitope-tagged DNA-binding protein associates, using tag antibodies for immunoprecipitation. RIP is similar to ChIP except that RNA-binding proteins that are epitope-tagged, instead of DNA-binding proteins, are immunoprecipitated. 

The epitope tags most commonly used in IP applications are V5, HA, c-Myc, and DYKDDDDK. The figure below shows an immunoprecipitation done with a DYKDDDDK tag antibody.

Immunoprecipitation of DYKDDDDK-tagged BNIP3 in HEK293 cells

DYKDDDDK tag antibody in IP. Immunoprecipitation of DYKDDDDK-tagged BNIP3 was performed with total cell extract from DYKDDDDK-BNIP3 transfected HEK-293 cells using the Dynabeads Protein A Immunoprecipitation Kit (Cat. No. 10006D). Normal Rabbit IgG was used as a negative IP control. Subsequently, western blot analysis was performed. DYKDDDDK-BNIP3 was detected at ~35 kDa as homodimer and ~18 kDa as monomer using DYKDDDDK Tag Recombinant Rabbit Polyclonal Antibody (Cat. No. 740001) at 1 µg/mL. Goat anti-Rabbit IgG (H+L), Superclonal Recombinant Secondary Antibody, HRP (Cat. No. A27036, 0.4 µg/mL, 1:2500 dilution) was used for detection. Lane 1 shows 10% of the total cell extract (input), Lane 2 is the IP performed with Rabbit IgG, and Lane 3 is the IP performed with DYKDDDDK Tag Recombinant Rabbit Polyclonal Antibody.
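The figure captions quote antibody amounts both as working concentrations and as dilution ratios, and the two are related by simple arithmetic. As a worked example, the secondary antibody above is used at 0.4 µg/mL, stated as a 1:2500 dilution, which implies a stock of about 1 mg/mL. The helper names and the 10 mL working volume below are assumptions for illustration, not part of the protocol.

```python
import math


def stock_concentration_ug_per_ml(working_ug_per_ml: float, dilution_factor: float) -> float:
    """Stock concentration implied by a working concentration and a 1:N dilution."""
    return working_ug_per_ml * dilution_factor


def stock_volume_ul(final_volume_ml: float, dilution_factor: float) -> float:
    """Volume of stock (in µL) needed to prepare final_volume_ml at a 1:N dilution."""
    return final_volume_ml * 1000.0 / dilution_factor


stock = stock_concentration_ug_per_ml(0.4, 2500)
print(f"Implied stock concentration: {stock:.0f} µg/mL")

# Hypothetical working volume of 10 mL of antibody solution.
print(f"Stock needed for 10 mL at 1:2500: {stock_volume_ul(10, 2500):.1f} µL")
```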

Visualization of fluorescent reporters
American biologist Martin Chalfie’s 1994 article in the journal Science reported wild-type GFP expression in E. coli and C. elegans. His findings suggested that this marker could provide in vivo fluorescence in a wide variety of cells and organisms. Fluorescent reporter proteins like GFP, RFP, and mCherry are expressed as fusions with a target protein to observe its localization and fate in an appropriate host cell. Ideally, the fusion protein retains its normal function while acquiring the fluorescent property of GFP (or another reporter). All major cellular compartments, such as the plasma membrane, Golgi apparatus, nucleus, and endoplasmic reticulum, can be targeted with GFP and other fluorescent proteins.

Fluorescent reporters have multiple applications besides visualization of protein localization, as detailed below:

  • Transcription reporter: GFP (or another fluorescent protein) is placed under the control of a promoter of interest to read out transcriptional activity. GFP antibodies can be used to detect its expression by western blotting.
  • Live cell imaging: tracking the movements of proteins inside a cell helps biologists to understand cell signaling and to verify proteins involved in a variety of pathways. Fluorescent tags like GFP, RFP, and mCherry have become handy tools for this purpose because they allow protein dynamics to be followed in live cells.
  • Fluorescence Resonance Energy Transfer (FRET): FRET is used to study interactions between two proteins, each fused to a fluorescent protein, and conformational changes that bring the two fluorophores into close proximity.
  • Fluorescence Activated Cell Sorting (FACS): FACS can be used to separate cells expressing GFP from cells that do not.
  • In vivo studies or transgenic tracking: fluorescent tags are powerful tools for monitoring gene expression in different kinds of cells and model systems. They offer a promising way to monitor the efficiency of gene transfer in transgenic model systems.
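FRET reports on proximity because the efficiency of energy transfer falls off with the sixth power of the donor-acceptor distance, following the standard Förster relation E = 1 / (1 + (r/R0)^6). The sketch below illustrates that relation only; the function name is hypothetical and the Förster radius of 5 nm is an assumed round number, not a measured value for any particular fluorescent protein pair.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Förster relation: transfer efficiency for a donor-acceptor distance r and radius R0."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


R0_NM = 5.0  # assumed Förster radius, for illustration only

for r_nm in (2.5, 5.0, 7.5, 10.0):
    print(f"r = {r_nm:4.1f} nm -> E = {fret_efficiency(r_nm, R0_NM):.3f}")
```

At r equal to R0 the efficiency is 0.5 by definition, and it drops to a few percent or less once the distance is about twice R0, which is why a FRET signal is taken as evidence that the two tagged proteins are very close together.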

Advanced applications: endogenous tagging

With the advent of new technologies in the post-genomic era, advancements in applications of epitope tagging have also broadened.

ChIP-seq is a variation of ChIP in which immunoprecipitation is followed by sequencing; it is used for genome-wide profiling of Transcription Factor (TF)-DNA interactions. However, this application is limited by the availability of ChIP-grade antibodies against transcription factors. When TF antibodies are unavailable, researchers have used epitope tagging to insert a tag into the endogenous TF gene. The tag antibody is then used for chromatin immunoprecipitation followed by sequencing to identify DNA binding sites and motifs in endogenous conditions.

Similarly, endogenous epitope tagging can also be used to study expression and function of a target gene in native conditions without having to worry about deleterious effects of protein overexpression.

Advantages and limitations of using epitope tags

Tagging a protein with an existing epitope is a simple procedure that allows researchers to detect or purify proteins promptly. The technique has many other advantages over the use of antibodies generated directly against the protein of interest, which are listed below:

  • The most obvious advantage is the saving of time and expense that would otherwise go into generating antibodies against novel or poorly immunogenic proteins.
  • Multiple well-characterized and validated antibodies, especially monoclonal antibodies, are available for epitope tags. These antibodies are proven to be specific to an epitope tag, thereby minimizing cross-reactivity.
  • More than one tag can be added to a protein of interest, which enhances the detection limit or sensitivity of the assay.
  • Epitope tagging can be advantageous in distinguishing between otherwise similar, untagged proteins without any concern of anomalous results from cross-reactive antibodies.
  • Protein-protein interaction studies can be done using antibodies against epitope tags.
  • The type of tag and the insertion site can be selected by the researcher to suit the experiment and to avoid interfering with functional sites of the protein.

While epitope tagging is a widely used technique to study protein-protein interactions, functions, and topology, it has its own limitations, which are listed below:

  • The gene sequence of the protein of interest must be known before the protein can be tagged.
  • An epitope tag may interfere with protein structure and function.
  • There may be abnormal levels of expression due to heterologous promoters that can lead to deleterious effects.


Recommended reading

  1. Brizzard, B., Chubet, R. (1997) Epitope Tagging of Recombinant Proteins. Current Protocols in Neuroscience. 5.8.1-5.8.10.
  2. Brizzard, B. (25th Anniversary Issue, April 2008) Epitope tagging. BioTechniques. 44: 693-695.
  3. Kimple, M.E., Brill, A.L., Pasker, R.L. Overview of Affinity Tags for Protein Purification. Curr Protoc Protein Sci. 73: Unit–9.9.
  4. Jarvik, J.W., Telmer, C.A. (1998) Epitope Tagging. Annu. Rev. Genet. 32:601–18.
  5. Zhao, X., Li, G., Liang, S. (2013) Several Affinity Tags Commonly Used in Chromatographic Purification. Journal of Analytical Methods in Chemistry, Volume 2013, Article ID 581093.
  6. Corthell, J.T. (2014) Immunoprecipitation. Basic Molecular Protocols in Neuroscience: Tips, Tricks, and Pitfalls. Chapter 8: 77-81.
  7. Chalfie, M. (2001) Cell Markers: Green Fluorescent Protein (GFP). Encyclopedia of Genetics. 311-313.
  8. Zimmer, M. (2002) Green Fluorescent Protein (GFP): Applications, Structure, and Related Photophysical Behavior. Chem. Rev. 102, 3, 759–782.
  9. Xiong, X., Zhang, Y., Yan, J., et al. (2017) A Scalable Epitope Tagging Approach for High Throughput ChIP-Seq Analysis. ACS Synth Biol. 6(6): 1034–1042.
  10. Wang, Z. (2009) Epitope tagging of endogenous proteins for genome wide Chromatin immunoprecipitation analysis. Methods Mol Biol. 567: 87–98.