Draft:Original research/Human RNA

Ribonucleic acid (RNA), including human RNA, is a long chain of components called nucleotides. Each nucleotide consists of a nucleobase, a ribose sugar, and a phosphate group. The sequence of nucleotides allows RNA to encode genetic information. All cellular organisms use messenger RNA (mRNA) to carry the genetic information that directs the synthesis of proteins.

Ribonucleic acids
"[A]t least three-quarters of the [human] genome is involved in making RNA".

RNA transcription
In the transcription of DNA genes, ribonucleic acid serves as the messenger (mRNA) that carries the information for the gene products. RNA is also an end product of both DNA gene transcription and RNA gene transcription.
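
As a minimal sketch of the base-pairing rule behind transcription, the following Python snippet builds an mRNA sequence from a DNA strand. The function names and the example sequence are hypothetical illustrations, not taken from any source cited here, and the sketch ignores promoters, splicing, capping, and polyadenylation.

```python
# Minimal sketch: DNA -> mRNA by base pairing only (assumed, simplified model).
# The template strand is given 3'->5', so the returned mRNA reads 5'->3'.
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe_template(template_3to5: str) -> str:
    """Return the mRNA complementary to a template strand written 3'->5'."""
    return template_3to5.upper().translate(DNA_TO_RNA)


def transcribe_coding(coding_5to3: str) -> str:
    """Return the mRNA for a coding (sense) strand: same sequence, T replaced by U."""
    return coding_5to3.upper().replace("T", "U")


if __name__ == "__main__":
    print(transcribe_template("TACCGGATT"))
    print(transcribe_coding("ATGGCCTAA"))
```

Both calls describe the same hypothetical gene fragment, once from the template strand and once from the coding strand, so they yield the same mRNA.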

Structure
Each nucleotide in RNA contains a ribose sugar, with carbons numbered 1' through 5'. A base is attached to the 1' position: in general, adenine (A), cytosine (C), guanine (G), or uracil (U). Adenine and guanine are purines; cytosine and uracil are pyrimidines. A phosphate group links the 3' position of one ribose to the 5' position of the next. Each phosphate group carries a negative charge at physiological pH, making RNA a charged molecule (a polyanion).
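
The base classes and the polyanion character described above can be illustrated with a short Python sketch. It assumes one negative charge per internal phosphodiester linkage and ignores any terminal phosphate groups, modified bases, and counter-ions; the function names are hypothetical.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def base_classes(rna: str) -> dict:
    """Count purines and pyrimidines in an RNA sequence of A, C, G, U."""
    counts = Counter()
    for base in rna.upper():
        if base in PURINES:
            counts["purine"] += 1
        elif base in PYRIMIDINES:
            counts["pyrimidine"] += 1
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return dict(counts)


def approx_backbone_charge(rna: str) -> int:
    """Approximate net backbone charge at physiological pH.

    Assumption: -1 per phosphodiester linkage, so a chain of n
    nucleotides has n - 1 charged linkages; terminal groups are ignored.
    """
    return -max(len(rna) - 1, 0)


if __name__ == "__main__":
    print(base_classes("AUGGCC"))
    print(approx_backbone_charge("AUGGCC"))
```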

Introns
Some introns themselves encode specific proteins or can be further processed after splicing to generate noncoding RNA molecules. Alternative splicing is widely used to generate multiple proteins from a single gene. Furthermore, some introns represent mobile genetic elements and may be regarded as examples of selfish DNA.

Messenger RNAs
Messenger RNA (mRNA) is the RNA that carries information from DNA to the ribosome, the site of protein synthesis (translation) in the cell. The coding sequence of the mRNA determines the amino acid sequence of the protein that is produced. Many RNAs do not code for protein, however: about 97% of the transcriptional output in eukaryotes is non-protein-coding.
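
How a coding sequence determines an amino acid sequence can be sketched in Python with the standard genetic code. This is an illustration only: it assumes the standard nuclear code (the human mitochondrial code differs at a few codons), starts at the first AUG it finds, and stops at the first in-frame stop codon. The function name and test sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest); "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("GGAUGGCCUAAGG"))
```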

Non-coding RNAs
"RNA actively functions as a regulator, a catalyser and a controller of several vital processes in the cell. These are functions that previously were attributed solely to proteins, but during recent years evidence for the role of RNA in these activities has emerged (Goodrich, Nat Rev Mol Cell Biol, 2006). The way the non-coding RNA (i.e. the type of RNA that does not encode proteins) functions can be summarised in three different ways: 1) binding through base pairing to target sequence, 2) folding on itself and catalysing a reaction (i.e. functioning as an enzyme), or 3) binding to a protein and modulating its activity."

Long non-coding RNAs
A "steady stream of transcribed regions with no apparent purpose [...] long noncoding RNAs (lncRNAs) came from genome regions that were known to lack protein genes. The transcripts also lacked open reading frames and other properties necessary for them to be translated into proteins."

A long "noncoding RNA they named HOTAIR [1]. This 2.2-kilobase spliced RNA transcript interacts with the protein complex polycomb to modify chromatin and repress transcription of the human HOX genes, which regulate development."

"HOTAIR is just one of thousands of lncRNAs."

"HOTAIR serves as a 'modular scaffold', assembling a molecular cargo of specific combinations of enzymes that are equipped to regulate target genes [2]."

Hundreds "of lncRNAs are physically associated with polycomb and other chromatin-modifying complexes [3]."

"Noncoding transcripts are traditionally classified as long at around 200 nucleotides, an arbitrary distinction based on RNA purification technologies. Most are thousands of nucleotides long."

"It is difficult to discriminate functional transcripts from those that may be byproducts of other processes, but many transcripts that come from intergenic regions are starting to look like real signals. They show up relatively consistently in different experiments, contain splice junctions and are present in high numbers. Higher abundance presumably increases the likelihood that a transcript is functional, but it’s not really proof. Ultimately we have to go in and do experiments to demonstrate that things have function."

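The length convention quoted above (noncoding transcripts are classed as long at around 200 nucleotides) can be written as a one-line rule. The sketch below is only an illustration: the exact cutoff and the choice of "greater than or equal to" are assumptions, since the source itself calls the distinction arbitrary, and the function name is hypothetical.

```python
# Assumed cutoff; the source describes it as "around 200 nucleotides".
LONG_NCRNA_MIN_NT = 200


def classify_noncoding_by_length(length_nt: int) -> str:
    """Label a noncoding transcript as 'long' or 'small' by length alone."""
    if length_nt < 0:
        raise ValueError("length must be non-negative")
    return "long" if length_nt >= LONG_NCRNA_MIN_NT else "small"


if __name__ == "__main__":
    print(classify_noncoding_by_length(2200))  # a 2.2-kilobase transcript
    print(classify_noncoding_by_length(22))    # a miRNA-sized transcript
```

Length alone says nothing about function, which, as the quotation above notes, has to be demonstrated experimentally.
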
Long intergenic noncoding RNAs
"Gain- and loss-of-function experiments showed that at least one of [the long noncoding RNAs], called lincRNA-RoR, for long intergenic noncoding RNA and regulator of reprogramming, was essential for a variety of functions, including reprogramming as well as modulating genes known to respond to oxidative stress, DNA damage and p53, a protein that regulates the cell cycle and is implicated in about half of all human cancers [12]."

Transfer RNAs
Transfer "RNA (tRNA) are small (~80 bases in length), heavily modified RNA molecules that each carry one single amino acid to the ribosome. tRNAs are highly abundant in a cell, for example during every yeast generation approximately 3-6 million tRNAs are produced. Each tRNA molecule contains four regions of intramolecular double helices formed by Watson-Crick base pairing and three loops (D-, anticodon- and T-loop). The solving of the tRNA crystal structure in 1974 (Kim, Science, 1974; Robertus, Nature, 1974) showed that non-canonical base pairing, mediated by the hydroxyl group at the 2’ carbon in the ribose, participates in creating the unique three-dimensional structure (Noller, Science, 2005). tRNAs are extensively modified before becoming fully mature: their 5’ leader sequence is removed, the 3’ trailer sequence is trimmed, the nucleotide CCA trimer is added to the 3’ end, a large number of the bases are edited, and introns spliced. This processing requires more than 60 different proteins and includes several quality control steps. Recent work has also shown that several quality control steps ensure that only fully processed tRNAs are available to the ribosome and protein synthesis (Kadaba, Genes Dev, 2004), and that – surprisingly - retrograde transport of tRNA back into the nucleus takes place (Shaheen, Proc Natl Acad Sci U S A, 2005; Takano, Science, 2005)."

Ribosomal RNA
Ribosomal RNA (rRNA) is the catalytic component of the ribosomes. Eukaryotic ribosomes contain four different rRNA molecules: 18S, 5.8S, 28S and 5S rRNA. Three of the rRNA molecules are synthesized in the nucleolus, and one is synthesized elsewhere. In the cytoplasm, ribosomal RNA and protein combine to form a nucleoprotein called a ribosome. The ribosome binds mRNA and carries out protein synthesis. Several ribosomes may be attached to a single mRNA at any time. Most of the RNA found in a typical eukaryotic cell (commonly estimated at about 80%) is rRNA.

MicroRNAs
MicroRNAs (miRNAs; 21-22 nucleotides [nt]) are found in eukaryotes and act through RNA interference (RNAi): an effector complex of miRNA and enzymes can cleave complementary mRNA, block the mRNA from being translated, or accelerate its degradation.

"MicroRNA (miRNA) is a group of small single-stranded non-coding RNAs of 19–22 nucleotides (nt) in size, and regulates gene expression, as do other non-coding small RNAs (smRNAs) [1–3]. More than 800 human miRNAs have been found and alteration of miRNA expression has been seen in human malignancies [4–13]."

The miR-194 microRNA precursor is a small non-coding RNA gene that regulates gene expression. Its expression has been verified in mouse (MI0000236, MI0000733) and in human (MI0000488, MI0000732). mir-194 appears to be a vertebrate-specific miRNA and has now been predicted or experimentally confirmed in a range of vertebrate species (MIPF0000055).

Small RNAs
"In practice, most miRNAs have been identified through the use of Sanger sequencing and, later, high-throughput small RNA sequencing (sRNA-seq). miRNAs can be picked out in the large background of cellular sRNAs by their biogenesis: when sequenced miRNA strands are mapped to the precursor hairpin, they will fall in positions characteristic of Drosha and Dicer processing [18, 19]. Specifically, sequenced sRNAs should map to positions corresponding to miRNA strands or to the loop, and if both strands are identified, they should form a duplex with overhangs, as is typical of Dicer processing [18]."

Small non-coding RNAs
"With limitations in test specificity and the ability to detect novel miRNA and other small non-coding RNAs (smRNAs), microarray and RT–PCR techniques are being replaced by the evolving deep-sequencing technologies, at least in the discovery phase."

Small interfering RNAs
There are also endogenous sources of small interfering RNAs (siRNAs). siRNAs act through RNA interference in a fashion similar to miRNAs. Some miRNAs and siRNAs can cause the genes they target to be methylated, thereby decreasing or increasing transcription of those genes.

Piwi-interacting RNAs
Animals have Piwi-interacting RNAs (piRNA; 29-30 nt) that are active in germline cells and are thought to be a defense against transposons and play a role in gametogenesis.

Small nuclear RNAs
Small nuclear ribonucleic acids (snRNAs) are a class of small RNA molecules found within the nucleus of eukaryotic cells. They are transcribed by RNA polymerase II or RNA polymerase III and are involved in a variety of important processes, such as RNA splicing (removal of introns from hnRNA), regulation of transcription factors (7SK RNA) or RNA polymerase II (B2 RNA), and maintenance of the telomeres. They are always associated with specific proteins, and the complexes are referred to as small nuclear ribonucleoproteins (snRNPs), often pronounced "snurps". These elements are rich in uridine.

Small nucleolar RNAs
"In eukaryotes, dozens of posttranscriptional modifications are directed to specific nucleotides in ribosomal RNAs (rRNAs) by small nucleolar RNAs (snoRNAs)."

"Ribosome biogenesis in Eukarya occurs in the nucleolus. Several nucleolar proteins (NOPs), including fibrillarin, Nop56, and Nop58, and dozens of snoRNAs are involved in this process (1). The snoRNAs fall into two major classes: C/D box and H/ACA box RNAs. The C/D box snoRNAs are efficiently precipitated with antibodies against fibrillarin. Most C/D box snoRNAs target specific ribose methylations within rRNA, whereas most H/ACA box RNAs target specific conversions of uridine to pseudouridine within rRNA (2)."

"The general mechanism of C/D box snoRNA-targeted ribose methylation[: ] Each snoRNA contains a 9- to 21-nucleotide (nt)–long sequence, located 5' to the D or D' box motif, that is complementary to an rRNA target sequence. Methylation is directed to the rRNA nucleotide that participates in the base pair 5 nt upstream from the start of the D or D' box. It is likely that most, if not all, eukaryotic rRNA ribose methylations are guided by snoRNAs."

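The targeting rule quoted above for C/D box snoRNAs (the methylated rRNA nucleotide is the one paired 5 nt upstream of the D or D' box) can be sketched in Python. This is a simplified illustration under stated assumptions: the guide is taken to lie immediately upstream of the box, only Watson-Crick pairs are allowed (no G-U wobble), the nucleotide directly upstream of the box is counted as position 1, and all sequences and names are hypothetical.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGU", "UGCA")
OFFSET_FROM_D_BOX = 5  # from the quoted rule: 5 nt upstream of the box


def predict_methylation_site(
    snorna: str, d_box_start: int, guide_len: int, rrna: str
) -> Optional[int]:
    """Return the 0-based rRNA index predicted to be 2'-O-methylated.

    snorna and rrna are written 5'->3'. d_box_start is the 0-based index
    of the first nucleotide of the D (or D') box in snorna. Returns None
    if the rRNA has no site perfectly complementary to the guide.
    """
    if not 9 <= guide_len <= 21:
        raise ValueError("guide length should be 9 to 21 nt")
    if d_box_start < guide_len:
        raise ValueError("guide would start before the snoRNA 5' end")
    guide = snorna[d_box_start - guide_len:d_box_start].upper()
    # The rRNA target is the reverse complement of the guide, so the
    # first target nucleotide pairs with the guide nucleotide that sits
    # 1 nt upstream of the box.
    target = guide.translate(COMPLEMENT)[::-1]
    hit = rrna.upper().find(target)
    if hit == -1:
        return None
    return hit + OFFSET_FROM_D_BOX - 1


if __name__ == "__main__":
    snorna = "GGAUGCCGUAGCCUGACC"  # hypothetical; the D box "CUGA" starts at index 12
    rrna = "AAAAGCUACGGCAUUUUU"    # hypothetical target region
    site = predict_methylation_site(snorna, 12, 10, rrna)
    print(site, rrna[site])
```
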
Mitochondrial RNAs
There are "genetically encoded RNA probes for characterizing localization and dynamics of mitochondrial RNA (mtRNA) in single living cells."

Mitochondrial "RNA includes a component containing a poly (adenylic acid) segment."

"The mitochondrial poly(A) sequence is about 50-80 bases long. This sequence is considerably smaller than the poly(A) segment of cytoplasmic messenger RNA, but about the size found in some viral RNAs. The mitochondrial RNA to which the poly(A) is attached is apparently heterogeneous in molecular weight."

Mitochondrial transfer RNAs
"Sequence information from an increasing number of complete mitochondrial genomes indicates that a large number of evolutionary distinct organisms import nucleus-encoded tRNAs."

"Translation requires rRNAs and a complete set of tRNAs, which, according to most textbooks, are encoded by the mitochondrial genome."

"In all organisms, for any given tRNA that is imported, most of the total tRNA synthesized in the nucleus remains in the cytosol and functions in cytosolic translation. The specificity and the extent to which individual tRNAs are imported, however, differs greatly between organisms and might reflect fundamental differences in the mechanisms underlying tRNA import."

Mitochondrial endoribonuclease RNAs
"Mitochondrial RNA-processing endoribonuclease (RNAase MRP) has the capacity to cleave mitochondrial RNA complementary to the light strand of the displacement loop at a unique site. The enzyme is a ribonucleoprotein whose RNA component is a nuclear gene product. The 5′ flanking region of the primary transcript has control elements characteristic of RNA polymerase II transcription, and the coding region has features of RNA polymerase III transcription signals. The RNA associated with RNAase MRP is the first known RNA encoded by a single-copy gene in the nucleus and believed to be imported into mitochondria."

"The gene (RMRP) for this RNA component of RNAase MRP was assigned to human chromosome 9 [specifically] at 9p21-p12."

Small cytoplasmic RNAs
"Small RNA deep-sequencing (smRNA-seq) can detect almost all smRNAs present in the samples, including novel and under-expressed miRNAs, small nucleolar RNAs (snoRNAs), small cytoplasmic RNAs (scRNAs) and small nuclear RNAs (snRNAs) [17–19]."

"For each sample, approximately 9.9 – 12.3 million sequence tags (reads) aligned to the human genome sequence dataset (hg18) were obtained, which included miRNA (43.27–58.42%), snoRNA (12.85–23.78%), scRNA (0.14–0.26%), snRNA (0.04–0.10%), tRNA (2.78–12.97%), rRNA (2.13–3.93%), miscellaneous RNA (misc-RNA) (0.02 – 0.04%), introns (7.28 – 9.85%), exons (0.99 – 1.44%), mitochondrial tRNA (Mt-tRNA) (0.20 – 4.14%) and unknown nucleotide sequences (10.09–15.18%) (Table 1; see also Supporting information, Figure S1a). Among the smRNA population, up to 598 distinct types of miRNAs, 367 types of snoRNAs, 11 types of scRNAs and 29 types of snRNAs were detected for each sample (see Supporting information, Table S1)."

Telomerase RNAs
"To examine the role of telomerase in normal and neoplastic growth, the telomerase RNA component (mTR) was deleted from the mouse germline. mTR−/− mice lacked detectable telomerase activity yet were viable for the six generations analyzed."

"Telomeres were shown to shorten at a rate of 4.8 ± 2.4 kb per mTR−/− generation. Cells from the fourth mTR−/− generation onward possessed chromosome ends lacking detectable telomere repeats, aneuploidy, and chromosomal abnormalities, including end-to-end fusions."

Miscellaneous RNAs
"To process subsequences (exon or intron) in genes downloaded from NCBI Gene, Mojo Hand [a web-based program] requires at least one mRNA, [coding DNA sequence] CDS, misc RNA, or exon feature."

"Control sequences [for the biogenesis of microRNA] comprise transfer RNAs (tRNAs), small nucleolar RNAs (snoRNAs) and miscellaneous RNAs (miscRNAs) (in grey)."

RNA-induced silencing complexes
The RNA-induced silencing complex (RISC) is a multiprotein complex, specifically a ribonucleoprotein, which incorporates one strand of a small RNA, such as a microRNA (miRNA) or a double-stranded small interfering RNA (siRNA). The single strand acts as a template for RISC to recognize a complementary messenger RNA (mRNA) transcript. Once the target is found, the Argonaute protein within RISC is activated and cleaves the mRNA. This process, called RNA interference (RNAi), is found in many eukaryotes and is key to gene silencing and defense against viral infections.
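
Template-based target recognition can be illustrated with a short Python sketch that searches an mRNA for sites perfectly complementary to a guide strand. It is a deliberate simplification: it assumes full Watson-Crick complementarity over the whole guide, whereas real targeting can tolerate mismatches, and the sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U)."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_target_sites(guide: str, mrna: str) -> list:
    """Return 0-based start positions in mrna that pair perfectly with guide.

    Both sequences are written 5'->3'; overlapping sites are reported.
    """
    site = reverse_complement(guide)
    mrna = mrna.upper()
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    guide = "AUGCCGUAGC"          # hypothetical guide strand
    mrna = "AAAAGCUACGGCAUUUUU"   # hypothetical transcript fragment
    print(find_target_sites(guide, mrna))
```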

Hypotheses

 * 1) Human RNA probably makes up less than 50 % of the RNA produced by the human genome.