Gene transcriptions/Boxes/TATAs

"The TATA box (also called Goldberg-Hogness box) is a DNA sequence (cis-regulatory element) found in the promoter region of genes in archaea and eukaryotes; approximately 24% of human genes contain a TATA box within the core promoter.

The TATA box is a binding site for either general transcription factors or histones.

Consensus sequences
In the direction of transcription, the TATA box has the core DNA sequence 5'-TATAAA-3' or a variant, which is usually followed by three or more adenine (A) bases, i.e., 5'-TATAAA(A)AAA-3' on the coding (non-template) strand.

"[M]ost of the diversity within metazoan core promoters appears to involve the variable occurrence of consensus or near-consensus TATA, Inr, and DPE elements."

The TATA box can be an AT-rich sequence "located at a fixed distance upstream of the transcription start site".

Histones
The binding of a transcription factor blocks the binding of a histone and vice versa.

Gene expressions
Although it is harder to regulate the transcription of genes with multiple transcription start sites, "variations in the expression of a constitutive gene would be minimized by the use of multiple start sites."

Earlier "studies led to the design of a super core promoter (SCP) that contains a TATA, Inr, MTE, and DPE in a single promoter (Juven-Gershon et al., 2006b). The SCP is the strongest core promoter observed in vitro and in cultured cells and yields high levels of transcription in conjunction with transcriptional enhancers. These findings indicate that gene expression levels can be modulated via the core promoter."

Human genes
"Nine elements were tested, representing a sampling of elements present in the two gene deserts and DACH introns, spread over a 1530-kb region surrounding the human DACH's TATA box."

Gene ID: 1602 is the human gene DACH1 (dachshund homolog 1), also known as DACH. DACH1 has three isoforms: a, b, and c.

"[T]he human ... prostaglandin-endoperoxide-synthase-2 [gene contains] a canonical TATA box (nucleotide residues at positions -31 to -25 for the human gene)." This is Gene ID: 5743.

The Drosophila hsp70 gene has a TATA box-containing promoter. This suggests that GeneID: 3308, HSPA4 heat shock 70kDa protein 4 [Homo sapiens], also known as hsp70, may likewise have a TATA box in its core promoter.

Gene transcriptions
"From a teleological standpoint, this arrangement [of focused promoters] is consistent with the notion that it would be easier to regulate the transcription of a gene with a single transcription start site than one with multiple start sites."

The TATA box is involved in the process of transcription by RNA polymerase.

Approximately “76% of human core promoters lack TATA-like elements, have a high GC content, and are enriched in Sp1 binding sites.”

"[T]wo motifs - M3 (SCGGAAGY) and M22 (TGCGCANK) - ... occur preferentially in human TATA-less core promoters."

"About 24% of human genes have a TATA-like element and their promoters are generally AT-rich; however, only ~10% of these TATA-containing promoters have the canonical TATA box (TATAWAWR). In contrast, ~46% of human core promoters contain the consensus INR (YYANWYY) and ~30% are INR-containing TATA-less genes." W = A or T, Y = C or T, N = G, A, T, or C, and R = A or G.

Since ~76% of human core promoters are TATA-less and ~30% are INR-containing and TATA-less, the remaining ~46% of human promoters apparently lack both TATA-like and consensus INR elements.
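
The bookkeeping behind these percentages can be checked with simple arithmetic. The sketch below assumes that the quoted figures all refer to the same set of human core promoters (the source mixes "genes" and "promoters") and treats the approximate values as exact. The function name promoter_classes is hypothetical.

```python
def promoter_classes(tata_like=24.0, inr_total=46.0, inr_tata_less=30.0,
                     canonical_fraction=0.10):
    """Derive promoter classes (in % of all promoters) from the quoted figures.

    Assumption: every percentage is measured against the same promoter set.
    """
    tata_less = 100.0 - tata_like
    return {
        "TATA-less": tata_less,
        "INR with TATA-like": inr_total - inr_tata_less,
        "INR, TATA-less": inr_tata_less,
        "neither TATA-like nor INR": tata_less - inr_tata_less,
        # ~10% of the TATA-containing promoters carry the canonical box.
        "canonical TATA box": canonical_fraction * tata_like,
    }


for name, percent in promoter_classes().items():
    print(f"{name}: ~{percent:.1f}%")
```

Under these assumptions, the promoters lacking both elements are the TATA-less share minus the INR-containing TATA-less share.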

Transcription start sites
The consensus sequence is usually located about 25 base pairs (bp), or nucleotides (nt), upstream (-) of the transcription start site (TSS).
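
Positions such as -25 or -31 are counted relative to the TSS, which is numbered +1; the base immediately upstream is -1, and there is no position 0. A minimal sketch of that conversion follows, assuming a sequence written 5' to 3' in the direction of transcription with a known TSS index. The function names and the toy sequence are hypothetical.

```python
def tss_relative(index, tss_index):
    """Convert a 0-based string index to a TSS-relative coordinate.

    The TSS itself is +1 and the base just upstream is -1 (no position 0).
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


def locate_upstream(sequence, tss_index, motif="TATAAA"):
    """Return the TSS-relative coordinate of the first base of each motif
    occurrence that starts upstream of the TSS."""
    hits = []
    start = sequence.find(motif)
    while start != -1 and start < tss_index:
        hits.append(tss_relative(start, tss_index))
        start = sequence.find(motif, start + 1)
    return hits


# Hypothetical toy promoter: 40 upstream bases, then the TSS ('A' at index 40).
upstream = "G" * 9 + "TATAAAA" + "C" * 24
toy = upstream + "ATCGGATC"
print(locate_upstream(toy, tss_index=len(upstream)))
```

In this toy sequence the seven-base TATAAAA stretch spans positions -31 to -25, the range quoted for the human prostaglandin-endoperoxide synthase 2 gene.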

Focused promoters
"In focused transcription, there is either a single major transcription start site or several start sites within a narrow region of several nucleotides. Focused transcription is the predominant mode of transcription in simpler organisms."

"Focused transcription initiation occurs in all organisms, and appears to be the predominant or exclusive mode of transcription in simpler organisms."

"In vertebrates, focused transcription tends to be associated with regulated promoters".

"The analysis of focused core promoters has led to the discovery of sequence motifs such as the TATA box, BREu (upstream TFIIBrecognition element), Inr (initiator), MTE (motif ten element), DPE (downstream promoter element), DCE (downstream core element), and XCPE1 (Xcore promoter element 1) [...]."

Dispersed promoters
"In dispersed transcription, there are several weak transcription start sites over a broad region of about 50 to 100 nucleotides. Dispersed transcription is the most common mode of transcription in vertebrates. For instance, dispersed transcription is observed in about two-thirds of human genes."

In vertebrates, "dispersed transcription is typically observed in constitutive promoters in CpG islands."
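
The distinction between focused and dispersed initiation can be illustrated by the width of the region that a promoter's start sites cover. The sketch below is only an illustration: the cutoffs (at most 10 nt for "several nucleotides", at least 50 nt for "about 50 to 100 nucleotides") are assumptions chosen to match the quoted descriptions, and the function name classify_promoter is hypothetical. It ignores how strongly each site is used.

```python
def classify_promoter(tss_positions, focused_max_span=10, dispersed_min_span=50):
    """Label a promoter by the span (in nt) covered by its start sites.

    Both thresholds are illustrative assumptions, not published cutoffs.
    """
    if not tss_positions:
        raise ValueError("at least one transcription start site is required")
    span = max(tss_positions) - min(tss_positions) + 1
    if span <= focused_max_span:
        return "focused"
    if span >= dispersed_min_span:
        return "dispersed"
    return "intermediate"


# Hypothetical start-site coordinates on an arbitrary reference.
print(classify_promoter([100]))            # single start site
print(classify_promoter([100, 102, 104]))  # cluster within a few nucleotides
print(classify_promoter([100, 130, 175]))  # sites spread over a broad region
```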

Core promoters
"Focused transcription typically initiates within the Inr, and the A nucleotide in the Inr consensus is usually designed as the “+ 1” position, whether or not transcription actually initiates at that particular nucleotide. This convention is useful because other core promoter motifs, such as the MTE and DPE, function with the Inr in a manner that exhibits a strict spacing dependence with the Inr consensus sequence (and hence, the A + 1 nucleotide) rather than the actual transcription start site (Burke and Kadonaga, 1997, Kutach and Kadonaga, 2000 and Lim et al., 2004)."

"With TATA-driven core promoters, transcription can be achieved in vitro with purified RNA polymerase II, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH."

"NC2 (negative cofactor 2; also known as Dr1-Drap1) [...] was identified as repressor of TATA-dependent transcription [...]."

"TBP (TATA box-binding protein) activates TATA transcription [...] The TBP subunit binds to the TATA box [...] TFIIA appears to promote the binding of TBP to the TATA box."

TATA boxes
"The TATA box is the first core promoter motif that was discovered (Goldberg, 1979) as well as the best known core promoter element. The metazoan TATA box consensus is TATAWAAR, where the upstream T is usually located at − 31 or − 30 relative to the A + 1 (or G + 1) position in the Inr (Carninci et al., 2006 and Ponjavic et al., 2006). [The] TATA box is recognized and bound by the TBP subunit of the TFIID complex. Both the TATA box and TBP are conserved from archaebacteria to humans (Reeve, 2003). The TATA box is also present in plants (Molina and Grotewold, 2005, Yamamoto et al., 2007a and Yamamoto et al., 2007b). Although the TATA box is a well known core promoter motif, it is present in only about 10%–15% of mammalian core promoters (Carninci et al., 2006, Kim et al., 2005 and Cooper et al., 2006)."

"The BRE (TFIIBrecognition element) was initially identified as a TFIIB binding sequence that is immediately upstream of a subset (∼ 10%–30%) of TATA box elements (Lagrange et al., 1998). In addition, a second TFIIB recognition site, the BREd (downstream TFIIB recognition element), was found immediately downstream of the TATA box (Deng and Roberts, 2005). The discovery of the BREd led to the renaming of the original BRE as BREu for upstream BRE (reviewed in Deng and Roberts, 2007). Both the BREu and BREd function in conjunction with a TATA box and have been found to increase as well as to decrease the levels of basal transcription ( Lagrange et al., 1998, Evans et al., 2001 and Deng and Roberts, 2005). More recent studies suggest a distinct role for the BREu in transcriptional regulation (Juven-Gershon et al., 2008a; [...])."

"TRF3 (also known as TBP2 and TBPL2) appears to be present only in vertebrates and is the TRF that is most closely related to TBP. TRF3 can bind to TATA boxes and support TATA-dependent transcription (Bártfai et al., 2004 and Jallow et al., 2004). TRF3 was found to be important for embryonic development (Bártfai et al., 2004 and Jallow et al., 2004). In addition, zebrafish embryos that are depleted of TRF3 exhibit multiple developmental defects and fail to undergo hematopoiesis (Hart et al., 2007)."

Hypotheses

 * 1) A1BG is not transcribed using a TATA box.