Recently the NIH Roadmap Epigenomics Consortium published a set of 24 papers, the primary one examining over a hundred reference epigenomes from a broad range of cell lines and tissues. What is meant by epigenomics?
What are Epigenomic Factors?
Epigenomic factors include DNA methylation, histone modification (both methylation and acetylation), and non-coding RNA expression, all three of which affect gene expression. While genes are encoded in DNA, the epigenome comprises the tissue-, time-, condition- and environment-dependent variables that determine which gene is expressed in which tissue, at what time, and at what stage of development. As you can imagine, this effort involved a massive number of experiments, generating large amounts of these differing types of data across over a hundred tissue types.
The first epigenetic mark (addressed here) is 5-methylcytosine, a mark on the native genomic DNA. The second, histone modification, comprises a set of post-translational covalent modifications, the most prominent of which is H3K4me3 (histone 3 lysine 4 trimethylated), which is associated with transcriptional up-regulation. Histone modifications that involve methylation or acetylation are typically interrogated genome-wide via Chromatin ImmunoPrecipitation Sequencing, or ChIP-Seq. A previous post on Behind the Bench covered ChIP-Seq here, and a poster presented at the 2014 American Society of Human Genetics meeting is available here. The journal Nature Reviews Molecular Biology also put together a nice poster about histone marks and their associated ‘reader’ molecules, available here.
How is 5-mC Studied?
The study of 5-methylcytosine has historically been performed by pyrosequencing of bisulfite-treated DNA. Genomic DNA is first treated with bisulfite, which converts unmodified cytosine residues to uracil but leaves 5-methylcytosines unchanged. When sequenced, the converted uracil residues are read as thymine, so a C that has changed to T was unmethylated, and a C that remains unchanged was methylated.
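To make the C-versus-T logic concrete, here is a minimal Python sketch. It is a toy illustration only: the function names, the example sequence, and the assumption of a single, perfectly aligned, error-free read on one strand are all ours, not part of any real pipeline.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment plus sequencing on one strand.

    Unmethylated C is read out as T; methylated C stays C.
    `methylated_positions` is a set of 0-based indices (hypothetical input).
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, read):
    """Compare a read to the reference at every reference C.

    Returns {position: True} for methylated (C kept) and
    {position: False} for unmethylated (C read as T).
    Any other base at a reference C is skipped as uninformative.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True
        elif read_base == "T":
            calls[i] = False
    return calls


# Toy example: only the C at index 1 is methylated.
reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1})
calls = call_methylation(reference, read)
print(read)
print(calls)
```

Real data adds complications this sketch ignores, such as reads from the opposite strand, sequencing errors, and incomplete bisulfite conversion.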
For over two decades cytosine methylation has been an active area of study in cancer research, where hypomethylation (a low level of methylation) has been observed in oncogenes and hypermethylation (a high level of methylation) in tumor suppressor genes. In 2009, the description of what have been termed CpG island shores demonstrated that regions up to 2 kb away from CpG islands strongly affect expression of genes related to tumorigenesis. (If you’d like to read more about Andy Feinberg’s work in epigenetics, we wrote about him here.)
While pyrosequencing of bisulfite-treated DNA is the historical method, the advent of massively parallel sequencing has brought additional methods for analyzing hyper- and hypo-methylated regions across the genome. Also in 2009, the first human DNA methylome at single-base resolution was published. This effort required the development of new analysis tools and methodologies, because once the unmodified cytosines are converted to uracil (and then sequenced as thymine), a four-base genome code becomes a mainly three-base one. “Mainly” is used here because individual sites are not uniformly 100% methylated or 100% unmethylated. Aligning the converted sequence is a challenge, and calling percent methylation is not a trivial task.
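As a rough illustration of both points, the Python sketch below shows an in silico C-to-T conversion (one common way of reducing reference and reads to the same three-letter alphabet before alignment) and a per-site percent methylation calculated from read counts. The function names and the example counts are hypothetical, and real tools also handle the opposite strand, base quality, and conversion efficiency.

```python
def in_silico_convert(seq):
    """Collapse a sequence to a three-letter alphabet by turning every C into T.

    Applying this to both the reference and the reads lets converted reads
    be aligned without C/T mismatches; the original bases are consulted
    afterwards to make the methylation call.
    """
    return seq.upper().replace("C", "T")


def percent_methylation(c_count, t_count):
    """Percent methylation at one cytosine from reads covering it.

    c_count: reads showing C (methylated, protected from conversion)
    t_count: reads showing T (unmethylated, converted)
    Returns None when no informative reads cover the site.
    """
    total = c_count + t_count
    if total == 0:
        return None
    return 100.0 * c_count / total


# Hypothetical site covered by 20 reads: 15 show C, 5 show T.
print(in_silico_convert("ACGTCCGA"))
print(percent_methylation(15, 5))
```

A site that comes out partially methylated, as in this example, is exactly why the converted genome is only “mainly” three-base.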
One approach to reduce complexity
One approach is to use an antibody against 5-mC, a method called Methylated DNA ImmunoPrecipitation Sequencing, or Me-DIP Seq. Thermo Fisher Scientific developed the MethylMiner™ Methylated DNA Enrichment Kit as a flexible method for enrichment, using a methyl binding protein instead of an antibody. It has been successfully coupled with the SOLiD® next-generation sequencing platform, and an application note is available (PDF, “Enrichment of differentially methylated regions with MethylMiner™ fractionation and deep sequencing with the SOLiD™ System”). A set of Frequently Asked Questions on methylation enrichment is also available here.
A recent publication demonstrated a Me-DIP Seq protocol compatible with both the Ion PGM™ and Ion Proton™ Sequencers, confirmed methylation status using orthogonal microarray technology, and also examined the role of DNA methylation in alternative splicing. In addition, while 5-mC is the most abundant form of cytosine methylation, a variant, 5-hydroxymethylcytosine (5-hmC), is gaining attention for its presence in brain tissue and embryonic stem cells, where it is considered important for gene expression. After bisulfite conversion, however, 5-hmC is indistinguishable from 5-mC. By using different antibodies specific for 5-mC and 5-hmC, the authors were also able to demonstrate differences between the two epigenetic modifications.
MJ Corley, M Zhang, X Zheng, A Lum-Jones, AK Maunakea. “Semiconductor-based sequencing of genome-wide DNA methylation states”. Epigenetics 10(2):153-166, 2015. [doi: 10.1080/15592294.2014.1003747]