Draft:Original research/Epigenetics

Epigenetics is the study of genome or epigenome changes resulting from external rather than genetic influences.

"Epigenetic mechanisms are affected by several factors and processes including development in utero and in childhood, environmental chemicals, drugs and pharmaceuticals, aging, and diet. DNA methylation is what occurs when methyl groups, an epigenetic factor found in some dietary sources, can tag DNA and activate or repress genes. Histones are proteins around which DNA can wind for compaction and gene regulation. Histone modification occurs when the binding of epigenetic factors to histone "tails" alters the extent to which DNA is wrapped around histones and the availability of genes in the DNA to be activated. All of these factors and processes can have an effect on people's health and influence their health possibly resulting in cancer, autoimmune disease, mental disorders, or diabetes among other illnesses."

Epigenomes
Inside each eukaryote nucleus is genetic material (DNA) surrounded by protective and regulatory proteins. These protective and regulatory proteins and the dynamic changes to them that occur during the course of a eukaryote's existence are the epigenome.

Changes to the epigenome can result in changes to the structure of chromatin and changes to the function of the genome.

There are "nearly 50,000 acetylated sites [punctate sites of modified histones] in the human genome that correlate with active transcription start sites and CpG islands and tend to cluster within gene-rich loci."

Nucleosomes
DNA packaging in eukaryotes consists of "DNA wound in sequence around four histone protein cores."

Nucleosomes form the fundamental repeating units of eukaryotic chromatin.

The nucleosome core particle consists of approximately 147 base pairs of DNA wrapped in 1.67 left-handed superhelical turns around a histone octamer consisting of 2 copies each of the core histones H2A, H2B, H3, and H4.

Core particles are connected by stretches of "linker DNA", which can be up to about 80 bp long.
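The figures above (about 147 bp per core particle, linkers of up to about 80 bp) allow a rough estimate of how many nucleosomes fit on a stretch of DNA. The following is a minimal sketch under simplifying assumptions: nucleosomes are evenly spaced, every linker has the same length, and the function names are illustrative rather than taken from any library.

```python
CORE_BP = 147        # DNA wrapped around one histone octamer (from the text)
MAX_LINKER_BP = 80   # approximate upper bound on linker length (from the text)


def nucleosome_count(dna_length_bp: int, linker_bp: int) -> int:
    """Estimate how many core particles fit on a linear DNA of given length.

    Assumes evenly spaced nucleosomes separated by linkers of `linker_bp`;
    the last core particle needs no trailing linker.
    """
    if not 0 <= linker_bp <= MAX_LINKER_BP:
        raise ValueError("linker length outside the range given in the text")
    if dna_length_bp < CORE_BP:
        return 0
    repeat = CORE_BP + linker_bp
    # One core, then as many (linker + core) repeats as still fit.
    return 1 + (dna_length_bp - CORE_BP) // repeat


def wrapped_fraction(linker_bp: int) -> float:
    """Fraction of each repeat unit that is wrapped around histones."""
    return CORE_BP / (CORE_BP + linker_bp)


print(nucleosome_count(1000, 53))  # 5
```

With a 53 bp linker the repeat length is 200 bp, so a 1,000 bp fragment holds five core particles under these assumptions.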

Histones
Histone deacetylases (HDAC), Enzyme Commission number (EC number) 3.5.1, are a class of enzymes that remove acetyl groups (O=C-CH3) from an ε-N-acetyl lysine amino acid on a histone, allowing the histones to wrap the DNA more tightly.

Histone deacetylase action is opposite to that of histone acetyltransferase.

Facultative heterochromatin
Facultative heterochromatin forms where genes are silenced through a mechanism such as histone methylation or siRNA-directed RNA interference (RNAi).

The regions of DNA packaged in facultative heterochromatin are not consistent between the cell types within a species, and thus a sequence in one cell that is packaged in facultative heterochromatin (and the genes within poorly expressed) may be packaged in euchromatin in another cell (and the genes within no longer silenced).

An example of facultative heterochromatin is X-chromosome inactivation in female mammals: one X chromosome is packaged as facultative heterochromatin and silenced, while the other X chromosome is packaged as euchromatin and expressed.

Lamarckism
Interest in Lamarckism has recently increased, as several studies in the field of epigenetics have highlighted the possible inheritance of behavioral traits acquired by the previous generation.

Lamarckism (or Lamarckian inheritance) is the idea that an organism can pass on characteristics that it acquired during its lifetime to its offspring (also known as heritability of acquired characteristics or soft inheritance). It is named after the French biologist Jean-Baptiste Lamarck (1744–1829), who incorporated the action of soft inheritance into his evolutionary theories.

After Erasmus Darwin wrote Zoonomia suggesting "that all warm-blooded animals have arisen ... with the power of acquiring new parts" in response to stimuli, with each round of "improvements" being inherited by successive generations, Jean-Baptiste Lamarck repeated in his Philosophie Zoologique of 1809 the folk wisdom that characteristics which were "needed" were acquired (or diminished) during the lifetime of an organism and then passed on to the offspring.

Neo-Lamarckism is a theory of inheritance based on a modification and extension of Lamarckism, essentially maintaining the principle that genetic changes can be influenced and directed by environmental factors.

"First, there is no reason to think that epigenetic variations are rare: when actively sought, they have usually been found. Second, epigenetic variations can be transmitted very stably, certainly in cell lines. ... Third, epigenetic inheritance is not limited to multicellular organisms: it is found in unicellular organisms too. ... Fourth, several different models have shown how, in certain conditions, transmitting some (not all) epigenetic variations from one generation to the next is a selective advantage, even if they are stable for only a few generations. ... Fifth, epigenetic variations may influence the site and nature of genetic changes and affect evolution in this way. ... [T]here is a Lamarckian component in evolution, with the environment being an inducer as well as a selector of variation."

Epigenomic theory
Def. a chemical entity anterior to, after, at, besides, near to, on, outer to, over, related to, or upon another chemical is called an epi (or epi-) chemical.

Theoretical epigenetics
Def. the "study of the processes involved in the genetic development of an organism, especially the activation and deactivation of genes" or the "study of heritable changes caused by the activation and deactivation of genes without any change in DNA sequence" is called epigenetics.

Methylation
"In many higher eukaryotes, cytosine is methylated at carbon 5 by DNA methylase enzyme. In mammals, the methylated sequence is usually C*pG. Inactive genes are preferentially methylated. Active genes are hypomethylated."

"This methylation does not change the nucleotide sequence but can be propagated at DNA replication. Methylated DNA can be detected by two restriction enzymes that recognize CCGG: MspI which is methylation insensitive and HpaII which is methylation sensitive."

Cytosines in CpG dinucleotides can be methylated to form 5-methylcytosine. In mammals, methylating the cytosine within a gene can turn the gene off, a mechanism of gene regulation that is studied within epigenetics. Enzymes that add a methyl group are called DNA methyltransferases.
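The MspI/HpaII comparison quoted above can be illustrated with a small in-silico digest. This is a sketch under stated assumptions: both enzymes are taken to cut after the first C of CCGG, HpaII is treated as blocked whenever the inner cytosine (the CpG cytosine) is methylated, and only one strand is modeled. The function names and the example sequence are hypothetical.

```python
SITE = "CCGG"  # recognition sequence shared by MspI and HpaII


def find_sites(seq: str) -> list[int]:
    """Return the 0-based start position of every CCGG in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(SITE) + 1)
            if seq[i:i + len(SITE)] == SITE]


def digest(seq: str, methylated: set[int], enzyme: str) -> list[str]:
    """Cut seq at CCGG sites and return the resulting fragments.

    `methylated` holds 0-based positions of methylated cytosines.
    Assumption: HpaII skips a site whose inner C is methylated;
    MspI cuts every site regardless of that methylation.
    """
    if enzyme not in ("MspI", "HpaII"):
        raise ValueError(f"unknown enzyme: {enzyme}")
    cuts = []
    for start in find_sites(seq):
        inner_c = start + 1
        if enzyme == "HpaII" and inner_c in methylated:
            continue  # methylation-sensitive enzyme: site is protected
        cuts.append(start + 1)  # cut between the two cytosines
    fragments, prev = [], 0
    for cut in cuts:
        fragments.append(seq[prev:cut])
        prev = cut
    fragments.append(seq[prev:])
    return fragments


seq = "AACCGGTTCCGGAA"
methylated = {9}  # inner C of the second CCGG site
print(digest(seq, methylated, "MspI"))   # ['AAC', 'CGGTTC', 'CGGAA']
print(digest(seq, methylated, "HpaII"))  # ['AAC', 'CGGTTCCGGAA']
```

A site that MspI cuts but HpaII leaves intact is therefore a candidate methylated CpG, which is the logic of the assay described in the quotation.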

In mammals, 70% to 80% of CpG cytosines are methylated.

CpG dinucleotides have long been observed to occur with a much lower frequency in the sequence of vertebrate genomes than would be expected due to random chance. For example, in the human genome, which has a 42% GC content (so cytosine and guanine each account for about 21% of bases), a pair of nucleotides consisting of cytosine followed by guanine would be expected to occur 0.21 × 0.21 = 4.41% of the time. The frequency of CpG dinucleotides in human genomes is 1%, less than one-quarter of the expected frequency.
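The arithmetic behind this comparison can be checked directly. The sketch below assumes that bases occur independently and that cytosine and guanine are equally frequent, so each is half of the GC content; the function name is illustrative.

```python
def expected_cpg_frequency(gc_content: float) -> float:
    """Expected CpG dinucleotide frequency if bases were independent.

    Assumes C and G are equally frequent, so each is gc_content / 2.
    """
    c = g = gc_content / 2
    return c * g


expected = expected_cpg_frequency(0.42)  # 0.21 * 0.21
observed = 0.01                          # figure quoted in the text
ratio = observed / expected

print(f"expected: {expected:.2%}")        # expected: 4.41%
print(f"observed/expected: {ratio:.2f}")  # observed/expected: 0.23
```

The observed-to-expected ratio of about 0.23 is what "less than one-quarter" refers to.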

Unmethylated CpG sites can be detected by Toll-Like Receptor 9 (TLR 9) on plasmacytoid dendritic cells and B cells in humans. This is used to detect intracellular viral, fungal, and bacterial pathogen DNA.

Methylation is central to imprinting, along with histone modifications. Most of the methylation occurs a short distance from the CpG islands (at "CpG island shores") rather than in the islands themselves.

Methylation of CpG sites within the promoters of genes can lead to their silencing, a feature found in a number of human cancers (for example the silencing of tumor suppressor genes). In contrast, the hypomethylation of CpG sites has been associated with the over-expression of oncogenes within cancer cells.

Methylation is the addition of a methyl group replacing a hydrogen atom.

DNA methylation in vertebrates typically occurs at CpG sites (cytosine-phosphate-guanine sites, that is, where a cytosine is directly followed by a guanine in the DNA sequence). This methylation results in the conversion of the cytosine to 5-methylcytosine. The formation of Me-CpG is catalyzed by the enzyme DNA methyltransferase. Human DNA has about 80%-90% of CpG sites methylated, but there are certain areas, known as CpG islands, that are GC-rich (made up of about 65% CG residues), wherein none are methylated. These are associated with the promoters of 56% of mammalian genes, including all ubiquitously expressed genes. One to two percent of the human genome consists of CpG clusters, and there is an inverse relationship between CpG methylation and transcriptional activity.
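The quantities used in this paragraph (CpG sites, GC content, and the fraction of CpG sites that are methylated) can be computed for any sequence. The following is a minimal sketch; the function names, the example sequence, and the set of methylated positions are illustrative assumptions, not data from the text.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def cpg_positions(seq: str) -> list[int]:
    """0-based positions of the cytosine in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def methylation_level(seq: str, methylated: set[int]) -> float:
    """Fraction of CpG cytosines in seq that appear in `methylated`."""
    sites = cpg_positions(seq)
    if not sites:
        return 0.0
    return sum(pos in methylated for pos in sites) / len(sites)


seq = "ATCGCGTTACGA"                       # hypothetical example sequence
print(gc_fraction(seq))                    # 0.5
print(cpg_positions(seq))                  # [2, 4, 9]
print(methylation_level(seq, {2, 9}))      # two of three CpG sites methylated
```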

"Non-CpG methylation (CNG and CNN) ... has been observed at a low frequency in the early mouse embryo"

Protein methylation typically takes place on arginine or lysine amino acid residues in the protein sequence. Arginine can be methylated once (monomethylated arginine) or twice, with either both methyl groups on one terminal nitrogen (asymmetric dimethylated arginine) or one on both nitrogens (symmetric dimethylated arginine) by peptidylarginine methyltransferases (PRMTs). Lysine can be methylated once, twice or three times by lysine methyltransferases. Protein methylation has been most-studied in the histones. The transfer of methyl groups from S-adenosyl methionine to histones is catalyzed by enzymes known as histone methyltransferases. Histones that are methylated on certain residues can act epigenetically to repress or activate gene expression.

DNA methylation plays an important role for epigenetic gene regulation in development and cancer. The picture on the right shows the crystal structure of a short DNA helix with sequence "accgcCGgcgcc", which is methylated on both strands at the center cytosine. The structure was taken from the Protein Data Bank (accession number 329D), rendering was performed with VMD (Visual Molecular Dynamics rendering program) and post-processing was done in Photoshop.

Phosphorylation
Phosphorylation is the addition of a phosphate (PO₄³⁻) group to a protein or other organic molecule.

Kinases phosphorylate proteins and phosphatases dephosphorylate proteins.

Reversible phosphorylation of proteins is an important regulatory mechanism that occurs in both prokaryotic and eukaryotic organisms.

Phosphoryl groups attach to histones at serine and threonine sites.

Ubiquitylation
"The core histones that make up the nucleosome are subject to ... modifications, including ubiquitination [that occurs] primarily at specific positions within the amino-terminal histone tails."

Deamination
The CpG deficiency is due to an increased vulnerability of methylcytosines to spontaneously deaminate to thymine in genomes with CpG cytosine methylation.
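The effect of this deamination on CpG content can be shown on a toy sequence. This is a sketch only: it models a single strand, treats every listed methylcytosine as having deaminated, and uses a hypothetical function name and example sequence.

```python
def deaminate(seq: str, methylated: set[int]) -> str:
    """Return seq after each methylated cytosine has deaminated to thymine.

    `methylated` holds 0-based positions that must be cytosines.
    """
    bases = list(seq.upper())
    for pos in methylated:
        if bases[pos] != "C":
            raise ValueError(f"position {pos} is not a cytosine")
        bases[pos] = "T"  # 5-methylcytosine loses its amino group, giving thymine
    return "".join(bases)


before = "ATCGCGTT"
after = deaminate(before, {2})  # the first CpG cytosine is methylated

print(after)                                  # ATTGCGTT
print(before.count("CG"), after.count("CG"))  # 2 1
```

Each such event converts a CpG into a TpG, so repeated over evolutionary time it depletes CpG dinucleotides in methylated regions.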

Mutations
Alu elements are a common source of mutation in humans, but such mutations are often confined to non-coding regions where they have little discernible impact on the bearer.

The mutagenic effect of Alu and retrotransposons in general has played a major role in the recent evolution of the human genome.

The first report of Alu-mediated recombination causing a prevalent inherited predisposition to cancer was a 1995 report about hereditary nonpolyposis colorectal cancer.

"The human diseases caused by Alu insertions include":
 * Breast cancer
 * Ewing's sarcoma
 * Familial hypercholesterolemia
 * Hemophilia
 * Neurofibromatosis
 * Diabetes mellitus type II.

The following diseases have been associated with single-nucleotide DNA variations in Alu elements impacting transcription levels:
 * Alzheimer's disease
 * Lung cancer
 * Gastric cancer.

The ACE gene, encoding angiotensin-converting enzyme, has 2 common variants, one with an Alu insertion (ACE-I) and one with the Alu deleted (ACE-D). This variation has been linked to changes in sporting ability: the presence of the Alu element is associated with better performance in endurance-oriented events (e.g. triathlons), whereas its absence is associated with strength- and power-oriented performance.

"The opsin gene duplication which resulted in the re-gaining of trichromacy in Old World primates (including humans) is flanked by an Alu element, implicating the role of Alu in the evolution of three colour vision."

Cancers
"[B]iochemical markers [may be used] for early proof of mechanism"

On the left is a chart of common DNA damaging agents, examples of lesions they cause in DNA, and pathways used to repair these lesions. Also shown are many of the genes in these pathways, an indication of which genes are epigenetically regulated, and which of those epigenetically regulated genes have reduced expression in various cancers. It also shows genes in the error prone microhomology-mediated end joining pathway with increased expression in various cancers.

CArG boxes
CArG boxes are present in the promoters of smooth muscle cell genes.

"The SRF-CArG association is required for transcriptional activation of SMC genes [...] the SMC genes examined in this study display SMC-specific histone modifications at the 5′-CArG boxes. [...] enrichment of H4 and H3 acetylation [...] were relatively low from positions –2,800 to –1,600 in the 5′ region. However, at position –1,600 to –1,200, there was a sharp rise in these modifications, which was increased even further at +400 in the coding region. We observed similar patterns for H3K4dMe and H3 Lys79 di-methylation [...]. SRF, TFIID, and RNA polymerase II displayed enrichments that were consistent with the positions of the CArG boxes, TATA box, and coding region, respectively". "SMC-restricted binding of SRF to murine SMC gene CArG box chromatin is associated with patterns of posttranslational histone modifications within this chromatin that are specific to the SMC lineage in culture and in vivo, including methylation and acetylation to histone H3 and H4 residues."

The "promyogenic SRF [SRF GeneID: 6722] coactivator myocardin [MYOCD GeneID: 93649] increased SRF association with methylated histones and CArG box chromatin during activation of SMC gene expression. [...] myocardin/SRF complexes physically interact with H3K4dMe and that the interaction of SRF with CArG box chromatin and H3K4dMe is sensitive to expression levels of myocardin."

The "myogenic repressor Kruppel-like factor 4 recruited histone H4 deacetylase activity to SMC genes and blocked SRF association with methylated histones and CArG box chromatin during repression of SMC gene expression. [...] deacetylation of histone H4 coupled with loss of SRF binding during suppression of SMC differentiation in response to vascular injury. [...] KLF4 can bind to evolutionarily conserved TGF-β [control element] (TCE) DNA sequences adjacent to CArG boxes of SM gene promoters"

Acetylation
Histone modifications in SMCs include H3K4dMe, H3 Lys79 di-methylation, H3 Lys9 acetylation, H4Ac, and SRF binding.

Epigenetic regulations
"Epigenetic regulations are usually due to chemical modification of DNA bases or protein complexes stably bound to DNA. These changes are heritable and are not due to changes in the DNA sequence itself."

Imprinting
"In mammals a few clustered genes are turned off in the germline of one parent. Once imprinted, a gene stays off throughout embryogenesis and adult life in somatic cells. The imprint is removed early in germline development and then re-established in sex-specific patterns."

"The individuals with dots in them represent individuals who carried the gene but did not show mutant phenotype."

Prader-Willi and Angelman Syndromes
Three "imprinted genes control Prader-Willi and Angelman Syndromes: SNRPN, necdin, and Ube3. For females, SNRPN and necdin are off while Ube3 is on. For males, SNRPN and necdin are on while Ube3 is off. Children inheriting a deletion from their father will have no active SNRPN or necdin genes and will show Prader-Willi syndrome. Children inheriting a deletion from their mother will have no active Ube3 gene and will show Angelman syndrome."
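The parent-of-origin logic in this passage can be written out as a small lookup. The sketch below encodes only the three genes and the on/off states given in the quotation; the function names are illustrative, and the model ignores every other cause of these syndromes.

```python
from typing import Optional

# Expression state of each gene by the parent the chromosome came from,
# as stated in the quoted passage.
IMPRINT = {
    "maternal": {"SNRPN": False, "necdin": False, "Ube3": True},
    "paternal": {"SNRPN": True, "necdin": True, "Ube3": False},
}


def active_genes(deleted_from: Optional[str]) -> set[str]:
    """Genes with at least one active copy, given which parental
    chromosome (if any) carries the deletion."""
    if deleted_from not in (None, "maternal", "paternal"):
        raise ValueError(f"unknown parent: {deleted_from}")
    active = set()
    for parent, genes in IMPRINT.items():
        if parent == deleted_from:
            continue  # the deleted copy contributes nothing
        active |= {gene for gene, on in genes.items() if on}
    return active


def syndrome(deleted_from: Optional[str]) -> str:
    """Predicted outcome under the simplified model in the quotation."""
    genes = active_genes(deleted_from)
    if not {"SNRPN", "necdin"} & genes:
        return "Prader-Willi syndrome"
    if "Ube3" not in genes:
        return "Angelman syndrome"
    return "unaffected"


print(syndrome("paternal"))  # Prader-Willi syndrome
print(syndrome("maternal"))  # Angelman syndrome
```

Because each gene is active on only one parental copy, the same deletion gives a different syndrome depending on which parent transmitted it.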

Programmings
In the figure on the right are illustrated "the various possible development paths of a cell (ball) [...] represented as a landscape [where] the reliefs are controlled by genes and their epigenetic systems. This is the same principle that Waddington proposed for the development of an organism of a given species: starting as an embryo (departure), it develops under the influence of environment and genes into an adult (arrival). The degree of stability of the trajectory is given by the depth of the valley, that is, by the degree of canalization. Canalization is a measure of the ability to produce a consistent phenotype despite environmental and/or genetic disturbances, and it highlights the robustness of developmental systems. When the disturbance is too large, another valley is taken and the developmental pattern changes (repatterning). Waddington also originated the concept of genetic assimilation: an environmental stimulus that causes a particular phenotype can be replaced by a genetic factor." Translated using Google Translate and edited for readability.

Adipogeneses
During differentiation a myriad of adipogenic loci are under the influence of chromatin modifying complexes. The figure on the right outlines two major points of regulation of adipogenesis. PPARγ acts on a variety of genes implicated in adipocyte differentiation and function. Rb, a factor that binds PPARγ, is known to have a negative effect on PPARγ action in adipogenesis. This is due to its binding deacetylases in its unphosphorylated form that then act on PPARγ's target promoters. In the event that Rb is phosphorylated PPARγ binds acetyltransferases, which then acetylate histones at PPARγ target promoters, allowing transcription to occur.

Stem cells
"Induced stem cells (iSC) are stem cells artificially derived from somatic, reproductive, pluripotent or other cell types by deliberate epigenetic reprogramming. They are classified as either totipotent (iTC), pluripotent (iPSC), progenitor (multipotent, iMSC; also called an induced multipotent progenitor cell, iMPC) or unipotent (iUSC) according to their developmental potential and degree of dedifferentiation. Progenitors are obtained by so-called direct reprogramming or directed differentiation and are also called induced somatic stem cells."

Three techniques are widely recognized:

 * Transplantation of nuclei taken from somatic cells into a fertilized egg or oocyte from which the nucleus has previously been removed
 * Fusion of somatic cells with pluripotent stem cells
 * Modification of somatic cells, inducing their transformation into stem cells, using the genetic material encoding reprogramming protein factors, recombinant proteins, microRNA, a synthetic, self-replicating polycistronic RNA, or low-molecular-weight biologically active substances.

X chromosome epigenetics
A model for the evolution of mammalian X-chromosome inactivation. Arrows above the phylogeny show epigenetic features underlying X chromosome inactivation (solid arrows indicate stable components of XCI within the clades, whereas dashed lines indicate unstable repressive modification).