Biochemistry of DNA methylation

CpG islands are regions where CpG sites are unusually dense. The common sequence-based definition is a stretch of at least 200 base pairs with a GC content above 50% and an observed-to-expected CpG ratio above 0.6. In normal cells most CpG islands are unmethylated. In this diagram note that in CpG methylation there is an adjacent “G” in addition to the G that the methylated C is paired with. In these examples the adjacent G is paired with a C that is also methylated.

The strand at the top, drawn 5′ to 3′, is the sense strand. The complementary strand at the bottom, drawn 3′ to 5′, is the anti-sense strand.

In our “Non CpG methylation” example we see four variations:

  • An meC is between two As. The meCs that occur before As will become important in our story.
  • An meC is between a C and a T on the bottom, anti-sense strand.
  • An meC is in front of a G on the anti-sense strand. Its C partner is methylated on the sense strand, but is behind the G on the sense strand.
  • An meC on the sense strand is surrounded by two Ts.
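The contexts in the list above can be tallied on any sequence. Here is a minimal Python sketch that labels each cytosine by the base that follows it, on both strands. The sequence and the function names are made up for illustration and are not taken from the diagram, and the code only reports where a C sits, not whether it is actually methylated.

```python
# Sketch: label every cytosine by the base that follows it (read 5'->3'),
# on both strands of a made-up duplex. Assumes the sequence contains only
# A, C, G and T.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the anti-sense strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def cytosine_contexts(strand: str) -> dict:
    """Count CpG, CpA, CpC and CpT dinucleotides on one strand."""
    counts = {"CpG": 0, "CpA": 0, "CpC": 0, "CpT": 0}
    for first, second in zip(strand, strand[1:]):
        if first == "C":
            counts["Cp" + second] += 1
    return counts


sense = "ACATTCGATCT"  # hypothetical sequence, not from the diagram
antisense = reverse_complement(sense)

print("sense     ", sense, cytosine_contexts(sense))
print("anti-sense", antisense, cytosine_contexts(antisense))
```

A CpG on one strand always sits opposite a CpG on the other strand, which is why CpG methylation can be symmetric. CpA, CpT and CpC have no such mirror image, so non-CpG methylation marks only one strand.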

From Wikipedia “Distribution of CpG sites (left: in red) and GpC sites (right: in green) in the human APRT gene. CpG are more abundant in the upstream region of the gene, where they form a CpG island, whereas GpC are more evenly distributed.

The 5 exons of the APRT gene are indicated (blue), and the start (ATG) and stop (TGA) codons are emphasized (bold blue).” Note the large number of shaded regions (red and green) before the first bold blue that is the start codon. Some CpG sites, and so some potential DNA methylation, occur in the protein-coding exons of this APRT gene. The consequence of this was not discussed on the Wikipedia page. Relevant to this post is the large number of potential methylation sites in non-protein-coding introns. As this post will discuss, some mutations in the DNMT3A gene associated with autism and schizophrenia occur in non-protein-coding regions like these.
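For readers who like to check such things on a sequence, here is a minimal Python sketch of the usual sequence-based CpG island criteria (at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6), along with the CpG versus GpC counts that the Wikipedia figure contrasts. The function names and both test sequences are my own inventions, not APRT sequence. Real island finders slide a window along the gene rather than scoring one fixed stretch.

```python
# Sketch of the common sequence-based CpG island test.
# obs/exp CpG = (number of CpG * length) / (number of C * number of G)


def cpg_island_metrics(seq: str) -> dict:
    """Return length, CpG and GpC counts, GC content and obs/exp CpG."""
    seq = seq.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg = seq.count("CG")  # C followed by G on the same strand
    gpc = seq.count("GC")  # G followed by C on the same strand
    gc_content = (c_count + g_count) / length if length else 0.0
    if c_count and g_count:
        obs_exp = cpg * length / (c_count * g_count)
    else:
        obs_exp = 0.0
    return {
        "length": length,
        "cpg": cpg,
        "gpc": gpc,
        "gc_content": gc_content,
        "obs_exp_cpg": obs_exp,
    }


def looks_like_cpg_island(seq, min_length=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the three thresholds to one stretch of sequence."""
    m = cpg_island_metrics(seq)
    return (
        m["length"] >= min_length
        and m["gc_content"] > min_gc
        and m["obs_exp_cpg"] > min_obs_exp
    )


# Two invented 240-base test sequences.
toy_island = "CGCGTACGGCGC" * 20
toy_desert = "ATTCATGAATCA" * 20

print(cpg_island_metrics(toy_island), looks_like_cpg_island(toy_island))
print(cpg_island_metrics(toy_desert), looks_like_cpg_island(toy_desert))
```

Note that none of these criteria mention methylation. They only describe where methylation could occur.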

DNMT isoforms, a tour of the UniProt database

This is simply a tour through the UniProt database to get an understanding of the players. It seems increasingly likely that the UniProt text is written by AI that skims through PubMed abstracts faster than we can. The take-home message seems to be way too complicated for us to fully understand.


DNMT1

Associates with the promoter regions of tumor suppressor genes leading to their silencing. Mediates transcriptional repression by direct binding to HDAC2. In association with DNMT3B and via the recruitment of CTCFL/BORIS, involved in activation of BAG1 gene expression by modulating dimethylation of promoter histone H3 at H3K4 and H3K9. Also from UniProt: it is responsible for maintaining methylation patterns established in development. DNA methylation is coordinated with methylation of histones.

DNMT2 will not get much space in this post because it methylates transfer RNA.


DNMT3A

This enzyme is responsible for non-inherited (de novo) promoter methylation. May preferentially methylate DNA linker between 2 nucleosomal cores and is inhibited by histone H1 (By similarity). Plays a role in paternal and maternal imprinting (By similarity). Required for methylation of most imprinted loci in germ cells (By similarity). Acts as a transcriptional corepressor for the myogenesis transcriptional repressor ZBTB18 (By similarity, probably to rodent proteins). Recruited to trimethylated ‘Lys-36’ of histone H3 (H3K36me3) sites (By similarity to rodent proteins).


DNMT3B

Required for genome-wide de novo methylation and is essential for the establishment of DNA methylation patterns during development. DNA methylation is coordinated with methylation of histones. May preferentially methylate nucleosomal DNA within the nucleosome core region. May function as transcriptional co-repressor by associating with CBX4 and independently of DNA methylation. CBX4 binds to histone H3 trimethylated at ‘Lys-9’. DNMT3B seems to be involved in gene silencing (By similarity). In association with DNMT1 and via the recruitment of CTCFL/BORIS, involved in activation of BAG1 gene expression by modulating dimethylation of promoter histone H3 at H3K4 and H3K9. Isoforms 4 and 5 are probably not functional due to the deletion of two conserved methyltransferase motifs. Functions as a transcriptional corepressor by associating with the transcription repression factor ZHX1. Required for DUX4 silencing in somatic cells. DUX4 interacts with many other proteins that are involved in chromatin remodeling.


DNMT3L

Catalytically inactive regulatory factor of DNA methyltransferases that can either promote or inhibit DNA methylation depending on the context (By similarity).
Essential for the function of DNMT3A and DNMT3B: activates DNMT3A and DNMT3B by binding to their catalytic domain (PubMed:17687327). Acts by accelerating the binding of DNA and S-adenosyl-L-methionine (AdoMet) to the methyltransferases and dissociates from the complex after DNA binding to the methyltransferases (PubMed:17687327). Recognizes unmethylated histone H3 lysine 4 (H3K4me0) and induces de novo DNA methylation by recruitment or activation of DNMT3 (PubMed:17687327). Plays a key role in embryonic stem cells and germ cells (By similarity). In germ cells, required for the methylation of imprinted loci together with DNMT3A (By similarity). In male germ cells, specifically required to methylate retrotransposons, preventing their mobilization (By similarity).
Plays a key role in embryonic stem cells (ESCs) by acting both as a positive and negative regulator of DNA methylation (By similarity).
While it promotes DNA methylation of housekeeping genes together with DNMT3A and DNMT3B, it also acts as an inhibitor of DNA methylation at the promoter of bivalent genes (By similarity).
Interacts with the EZH2 component of the PRC2/EED-EZH2 complex, preventing interaction of DNMT3A and DNMT3B with the PRC2/EED-EZH2 complex, leading to maintain low methylation levels at the promoters of bivalent genes (By similarity).
Promotes differentiation of ESCs into primordial germ cells by inhibiting DNA methylation at the promoter of RHOX5, thereby activating its expression (By similarity).

CTCFL (BORIS)

Testis-specific DNA binding protein responsible for insulator function, nuclear architecture and transcriptional control, which probably acts by recruiting epigenetic chromatin modifiers. Plays a key role in gene imprinting in male germline, by participating in the establishment of differential methylation at the IGF2/H19 imprinted control region (ICR). Directly binds the unmethylated H19 ICR and recruits the PRMT7 methyltransferase, leading to methylate histone H4 ‘Arg-3’ to form H4R3sme2. This probably leads to recruit de novo DNA methyltransferases at these sites (By similarity).
Seems to act as tumor suppressor. In association with DNMT1 and DNMT3B, involved in activation of BAG1 gene expression by binding to its promoter. Required for dimethylation of H3 lysine 4 (H3K4me2) of MYC and BRCA1 promoters.

BAG1

Co-chaperone for HSP70 and HSC70 chaperone proteins. Acts as a nucleotide-exchange factor (NEF) promoting the release of ADP from the HSP70 and HSC70 proteins thereby triggering client/substrate protein release. Nucleotide release is mediated via its binding to the nucleotide-binding domain (NBD) of HSPA8/HSC70 whereas the substrate release is mediated via its binding to the substrate-binding domain (SBD) of HSPA8/HSC70 (PubMed:27474739, PubMed:9873016, PubMed:24318877). Inhibits the pro-apoptotic function of PPP1R15A, and has anti-apoptotic activity (PubMed:12724406). Markedly increases the anti-cell death function of BCL2 induced by various stimuli (PubMed:9305631).

SNPs associated with autism, going backward [1]

This is a direct quote from the abstract of a paper that is not open access:

“Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder with impairments in social communication, restricted, repetitive and stereotyped behaviors. Both genetic and environmental factors are known to contribute toward pathophysiology of Autism. Environmental influences on gene expression can be mediated by methylation patterns which are established and maintained by DNA methyltransferases. Several studies in the past have investigated the role of global methylations in Autism. The present study is aimed to investigate the role of genetic variations in the DNA methyltransferase which might be critical in defining the threshold for environmental factors toward susceptibility to autism. Polymorphisms in DNA methyltransferases, DNMT1, DNMT3A, DNMT3B, and DNMT3L were screened for association with ASD in 180 autistic patients and 260 healthy controls from a south Indian population. DNMT1 rs10418707 and rs10423341, and DNMT3A rs2289195 were found to be significantly associated at genotypic and allelic level with ASD. Functional prediction indicates that these SNPs have a role in transcriptional regulation and increased expression, indicating that hypermethylation might be induced by its genotype status. The study might reflect the role of genetics variants in DNMTs in defining the threshold of environmental impact in influencing the disease or phenotype variations in ASD.”

Unfortunately, it was impossible to determine the consequences of these SNPs, such as any amino acid change. A PMC search was performed on the “rs” numbers. All three also appear in a study of schizophrenia. [2]

Gene | SNP ID | Alleles | Genotyping Method | Possible Functional Effects
DNMT1 | rs10418707 | G/A | KASPar | Intronic enhancer
DNMT1 | rs2114724 | C/T | KASPar | Intronic enhancer
DNMT1 | rs2228611 | G/A | KASPar | Splicing regulation
DNMT1 | rs10423341 | C/A | KASPar | Intronic enhancer
DNMT1 | rs2228612 | A/G | KASPar | Missense (conservative)
DNMT1 | rs2162560 | G/A | KASPar | Intronic enhancer
DNMT1 | rs759920 | G/A | KASPar | Intronic enhancer
DNMT1 | rs16999593 | T/C | KASPar | Missense (non-conservative); Splicing regulation
DNMT1 | rs2304429 | G/A | KASPar | Intronic enhancer
DNMT1 | rs734693 | C/T | KASPar | Intronic enhancer
Table from ref [2].

Among de novo methyltransferases, the DNMT3A rs2289195 was found to be associated at genotypic level (P = 0.013) with schizophrenia. None of the selected SNPs in DNMT3B and DNMT3L were found to be associated with schizophrenia. [2] This publication gave no discussion of what an intronic enhancer is. [2] An intron is a non-protein-coding part of a gene that occurs between the protein-coding parts (exons) of the gene.
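To make the exon versus intron distinction concrete, here is a small Python sketch that decides whether a position in a gene falls in an exon or in the intron between two exons. The exon coordinates are invented for a toy gene. They are not DNMT1 or DNMT3A coordinates, and the function name is my own.

```python
# Sketch: classify a position as exonic or intronic.
# TOY_EXONS is hypothetical (1-based, inclusive); it is NOT a real gene model.

TOY_EXONS = [(100, 250), (400, 520), (900, 1100)]


def classify_position(pos: int, exons: list) -> str:
    """Return 'exon N', 'intron N' or 'outside gene' for one position."""
    exons = sorted(exons)
    if pos < exons[0][0] or pos > exons[-1][1]:
        return "outside gene"
    for number, (start, end) in enumerate(exons, start=1):
        if start <= pos <= end:
            return f"exon {number}"
        if pos < start:
            # The position lies before this exon but after the previous one.
            return f"intron {number - 1}"
    return "outside gene"  # not reached for sorted, non-overlapping exons


for position in (50, 180, 300, 450, 700, 1100):
    print(position, classify_position(position, TOY_EXONS))
```

A variant at position 300 of this toy gene would change no amino acid, since it sits in intron 1. Any effect would have to come through regulation, such as splicing.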

“Most human genes produce multiple isoforms through alternative splicing, which is tightly controlled in different tissues and developmental stages. The splicing specificity is mainly determined by 5′ splice site (5′SS), 3′ splice site (3′SS) and branch point sequences, as well as by multiple cis-acting splicing regulatory elements (SRE) that are conveniently classified as exonic splicing enhancers (ESEs) or silencers (ESSs), and intronic splicing enhancers (ISEs) or silencers (ISSs). These SREs generally function by recruiting trans-factors to control splicing through diverse mechanisms ” [3]

Now that we’ve established there might be some splicing issues with DNMT1 and we’ve read through enough AI congealed tidbits, let’s look at the domains of this protein from a publication written by a group of humans. [4] These are some bullet point summaries along with a cartoon of the domains in question.

  • Each of our nucleated cells contains a blueprint for every other cell type in our bodies. Which proteins get expressed is determined in part by CpG methylation. Approximately half of the CpG in our genomes is methylated. This particular review stresses that DNMT1, DNMT3a, and DNMT3b are the methyl transferases that are most active.
  • DNMT1 is most active in the DNA synthesis (S) phase of the cell cycle and maintains the promoter methylation status from cells many generations past. According to new ways of thinking, DNMT3a and b can maintain methylation of sites missed by DNMT1.
  • The SRA domain of UHRF1 exhibits binding affinity for hemimethylated CpG sites and loads DNMT1 onto these sites.
  • DMAP (amino acids 12-105) inhibits transcription by interacting with histone deacetylase 2 (HDAC2).
  • RFTS may be involved in targeting during S phase and dimerization.
  • CxxC domain (621-698) interacts with unmethylated CpGs.
  • Bromo-adjacent homology domains seem to lack a firm definition but appear to be involved in protein/protein interactions that link methylation to regulation of transcription
  • The interesting thing about the catalytic C-terminus is that it involves an active site cysteine.
  • DNMT1b contains an extra 48 nucleotides at the usual splice junction between exons 4 and 5. DNMT1b is expressed in minor amounts in normal tissue but more in tumors.
  • DNMT1o lacks the first 118 residues of the N-terminal domain, and hence the DMAP domain (amino acids 12-105) that interacts with HDAC2. It is also found in myotubes.
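The numbers in the bullets above allow a quick sanity check. This Python sketch uses only the residue ranges quoted in the list (DMAP 12-105, CxxC 621-698), the 118 residues missing from DNMT1o, and the 48 extra nucleotides in DNMT1b. The function names are mine, and domains without quoted coordinates are left out.

```python
# Sketch: what the isoform bullets imply, using only numbers quoted above.
# Residue ranges are in full-length DNMT1 numbering (1-based, inclusive).

DNMT1_DOMAINS = {"DMAP": (12, 105), "CxxC": (621, 698)}


def domain_status_after_n_truncation(domains: dict, residues_removed: int) -> dict:
    """Say whether each domain is lost, partial or retained after an
    N-terminal truncation of the given length."""
    status = {}
    for name, (start, end) in domains.items():
        if end <= residues_removed:
            status[name] = "lost"
        elif start > residues_removed:
            status[name] = "retained"
        else:
            status[name] = "partial"
    return status


def extra_residues(extra_nucleotides: int) -> int:
    """Convert an in-frame nucleotide insertion to amino acid residues."""
    if extra_nucleotides % 3 != 0:
        raise ValueError("insertion is not a multiple of 3, so it shifts the frame")
    return extra_nucleotides // 3


print(domain_status_after_n_truncation(DNMT1_DOMAINS, 118))
print(extra_residues(48), "extra residues in DNMT1b")
```

So DNMT1o loses the whole DMAP domain but keeps CxxC, and the 48 nucleotides in DNMT1b add 16 residues without shifting the reading frame.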

Does rs10418707, the DNMT1 SNP from the autism and schizophrenia studies, favor one of these splice variants, or does it do something different?

This review compares and contrasts the domains of four isoforms of methyl transferases. [5]

  • The N-terminal regions of Dnmt3a and b are different and have their own targeting mechanisms.
  • The PWWP domain contains lysines and arginines for binding of DNA. Additionally, the PWWP domain interacts with tri-methylated Lys 36 of histone H3 (H3K36me3).
  • The ADD domain is cysteine rich. This domain may restrict access to the catalytic domain unless bound to histone H3.
  • The red bars in the catalytic domains appear to be motifs. … “the PCQ loop in catalytic domain motif IV, of which the Cys residue covalently binds to the target cytosine at the sixth carbon.”

This paper was a bit tricky to follow. I think the mice are heterozygous for these mutations. Many behavior tests were also performed. A continuing trend is that less than fully functional Dnmt3a seems to impact methylation at CpA more than at CpG. [6]

Dnmt3a mutant | brain vol | locomotor | digging | mCA enhances gene bodies | mCG enhancers
P900L | less | less | same | 50% less | less
R878H | less | 75% less | less
Compiled from ref [6].

Another group tested the catalytic activity of various mutants of Dnmt3a. These panels came from figures 1 and 2 of this public access publication. [7]

The relative activities in this cell culture system were normalized to co-expressed green fluorescent protein (GFP). The authors moved on to a mouse model in which one of the copies of Dnmt3a had been deleted.

Behavior studies will not be given too much space in this biochemistry post. It would be interesting if future work elucidates neurotransmitters and related enzyme expression.

The DNMT3a+/- mice spent less time in the center of the test arena (A) and in the open arm (B). The DNMT3a+/- mice spent more time freezing during training (C), and so on.

DNMT3a+/- mice Fig 4

Global DNA Methylation Levels upon Heterozygous Loss of DNMT3A. (A and B) Global mCG (A) and mCA levels (B) in DNA isolated from brain regions of 8-week-old mice (left) (unpaired t test with Bonferroni correction) and developmental time course of global mCG in the cerebral cortex (right), as measured by sparse whole-genome bisulfite sequencing (WGBS) (p < 0.0001 effect by genotype, F(1,27) = 1024; p < 0.0001 effect by age F(5,27) = 884.6; n = 3–4; two-way ANOVA). Bonferroni corrected p values are indicated on the x axis. *p < 0.05, ***p < 0.001. Line graphs indicate mean and SEM.

Note that there are no changes in mCpG: not in the percent methylated, and not in how it changes with the age of the mice. The CpA sites were significantly different between the wild type and DNMT3a+/- mice in all parts of the brain but not in the liver.
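It may help to see how a “global” methylation level comes out of bisulfite sequencing. Bisulfite treatment converts unmethylated C so that it is read as T, while methylated C is still read as C. The level for a context is then the fraction of reads at those sites that still show C. The Python sketch below uses read counts that I made up to mimic the pattern described above. They are not data from ref 7, and the function name is my own.

```python
# Sketch: global methylation level per context from bisulfite read counts.
# level = C reads / (C reads + T reads), summed over all sites in a context.
# All counts below are INVENTED for illustration; they are not from ref 7.


def methylation_level(c_reads: int, t_reads: int) -> float:
    """Fraction of reads that stayed C (methylated) after conversion."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no reads cover this context")
    return c_reads / total


# (genotype, context) -> (reads still C, reads converted to T)
toy_counts = {
    ("WT", "CG"): (800, 200),
    ("Dnmt3a+/-", "CG"): (790, 210),
    ("WT", "CA"): (15, 985),
    ("Dnmt3a+/-", "CA"): (10, 990),
}

for (genotype, context), (c_reads, t_reads) in toy_counts.items():
    level = methylation_level(c_reads, t_reads)
    print(f"{genotype:10s} m{context}: {100 * level:.1f}%")
```

Because mCA starts out so much lower than mCG, a change of half a percentage point in mCA is a large relative loss, while a full percentage point in mCG barely registers.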

This post started off by defining cytosine methylation and where it can occur. Next came a UniProt.org tour, through likely AI-generated text, of our methyl transferase enzymes. The conclusion is that things are very, very complicated. These enzymes have multiple interactions with other proteins. We explored the domains of the methyl transferases with images that reinforced the notion of just how complicated things are. We then explored some mutations in non-protein-coding parts of the genes, some mutations in the catalytic domain, and the consequences of having only one copy. One thing that is emerging is that methylation of cytosines in front (5′-) of adenines (CpA) may be particularly important in our brains.

  1. Alex AM, Saradalekshmi KR, Shilen N, Suresh PA, Banerjee M. Genetic association of DNMT variants can play a critical role in defining the methylation patterns in autism. IUBMB Life. 2019 Jul;71(7):901-907.
  2. Saradalekshmi KR, Neetha NV, Sathyan S, Nair IV, Nair CM, Banerjee M. DNA methyl transferase (DNMT) gene polymorphisms could be a primary event in epigenetic susceptibility to schizophrenia. PLoS One. 2014 May 23;9(5):e98182. PMC free article
  3. Wang Y, Ma M, Xiao X, Wang Z. Intronic splicing enhancers, cognate splicing factors and context-dependent regulation rules. Nat Struct Mol Biol. 2012 Oct;19(10):1044-52. PMC free article
  4. Dhe-Paganon S, Syeda F, Park L. DNA methyl transferase 1: regulatory mechanisms and implications in health and disease. Int J Biochem Mol Biol. 2011;2(1):58-66. PMC free article
  5. Tajima S, Suetake I, Takeshita K, Nakagawa A, Kimura H. Domain Structure of the Dnmt1, Dnmt3a, and Dnmt3b DNA Methyltransferases. Adv Exp Med Biol. 2016;945:63-86. free at Sci-Hub
  6. Beard DC, Zhang X, Wu DY, Martin JR, Hamagami N, Swift RG, McCullough KB, Ge X, Bell-Hensley A, Zheng H, Lawrence AB, Hill CA, Papouin T, McAlinden A, Garbow JR, Dougherty JD, Maloney SE, Gabel HW. Distinct disease mutations in DNMT3A result in a spectrum of behavioral, epigenetic, and transcriptional deficits. bioRxiv [Preprint]. 2023 Feb 27:2023.02.27.530041 PMC free article
  7. Christian DL, Wu DY, Martin JR, Moore JR, Liu YR, Clemens AW, Nettles SA, Kirkland NM, Papouin T, Hill CA, Wozniak DF, Dougherty JD, Gabel HW. DNMT3A Haploinsufficiency Results in Behavioral Deficits and Global Epigenomic Dysregulation Shared across Neurodevelopmental Disorders. Cell Rep. 2020 Nov 24;33(8):108416. PMC free article
