Gene transcriptions/TATA binding proteins/Associated factors



When the core promoter of a gene, say A1BG of the human genome, contains no TATA box nucleotide sequence, a TATA binding protein associated factor (TAF) will bind the DNA sequence-specifically and force the TATA binding protein to bind non-sequence-specifically to the DNA in the core promoter.

Notations
Notation: let the symbol TAF stand for TATA binding protein associated factor.

Notation: let the symbol TBP stand for TATA binding protein.

Genetics
Genetics involves the expression, transmission, and variation of inherited characteristics.

Def. a "branch of biology that deals with the transmission and variation of inherited characteristics, in particular chromosomes and DNA" is called genetics.

Gene transcriptions
DNA is a double helix of interlinked nucleotides surrounded by an epigenome. On the basis of biochemical signals, an enzyme, specifically a ribonucleic acid (RNA) polymerase, is chemically bonded to one of the strands (the template strand) of this double helix. The polymerase, once phosphorylated, begins to catalyze the formation of RNA using the template strand. Although the catalysis may have more than one beginning nucleotide (a start site) and more than one ending nucleotide (a stop site) along the DNA, each nucleotide sequence catalyzed that ultimately produces approximately the same RNA is part of a gene. The catalysis of each RNA representation from the template DNA is a transcription, specifically a gene transcription. The overall process is also referred to as gene transcription.

Theoretical TBP associated factors
Here's a theoretical definition:

Def. any factor associating with the TATA binding protein (TBP) is called a TBP associated factor (TAF).

TATA boxes
A TATA box is a short DNA sequence that is a common type of core promoter sequence in eukaryotes.

The TATA box (also called Goldberg-Hogness box) is a DNA sequence (cis-regulatory element) found in the promoter region of genes in archaea and eukaryotes; approximately 24% of human genes contain a TATA box within the core promoter.

The TATA box is a binding site of either general transcription factors or histones.

In the direction of transcription, the TATA box has the core DNA sequence 5'-TATAAA-3' or a variant on the coding (non-template) strand, which is usually followed by three or more adenine (A) bases, specifically 5'-TATAAA(A)AAA-3'. The template strand carries the complementary sequence.

"[M]ost of the diversity within metazoan core promoters appears to involve the variable occurrence of consensus or near-consensus TATA, Inr, and DPE elements."

The TATA box can be an AT-rich sequence "located at a fixed distance upstream of the transcription start site".

TATA binding proteins
The TATA-binding protein (TBP) is a general transcription factor that binds specifically to a DNA sequence called the TATA box. This DNA sequence is found about 30 base pairs upstream of the transcription start site in some eukaryotic gene promoters. TBP, along with a variety of TBP-associated factors, makes up TFIID, a general transcription factor that in turn makes up part of the RNA polymerase II preinitiation complex. As one of the few proteins in the preinitiation complex that binds DNA in a sequence-specific manner, TBP helps position RNA polymerase II over the transcription start site of the gene.
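As a rough illustration of the positional relationship described above, the following Python sketch looks for the TATA box core sequence in a window upstream of a known transcription start site. It assumes that the sequence is supplied 5' to 3' on the coding (non-template) strand, where the core reads TATAAA, and that a window of -40 to -20 around the quoted position of about 30 base pairs upstream is a reasonable but arbitrary choice. The function name find_tata_boxes and the example sequence are hypothetical.

```python
import re

# Core TATA box sequence, read 5'->3' on the coding strand (assumption of this sketch).
TATA_CORE = "TATAAA"


def find_tata_boxes(coding_strand, tss_index, window=(-40, -20)):
    """Return positions, relative to the TSS, of TATAAA matches in the window.

    coding_strand: DNA sequence written 5'->3'.
    tss_index:     0-based index of the +1 nucleotide in coding_strand.
    window:        inclusive range of relative positions to accept (arbitrary default).
    """
    seq = coding_strand.upper()
    hits = []
    for match in re.finditer(TATA_CORE, seq):
        offset = match.start() - tss_index
        # There is no position 0: the TSS is +1 and the base before it is -1.
        relative = offset + 1 if offset >= 0 else offset
        if window[0] <= relative <= window[1]:
            hits.append(relative)
    return hits


# Made-up sequence: 20 filler bases, TATAAA, 24 filler bases, then the TSS at index 50.
example = "GC" * 10 + "TATAAA" + "G" * 24 + "A" + "CGT" * 5
print(find_tata_boxes(example, tss_index=50))  # [-30]
```

An empty result for a real promoter would only mean that no exact TATAAA core lies in the chosen window; variants of the core sequence are not detected by this sketch.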

Initiator elements
Notation: let the symbol Inr denote an initiator element.

Notation: let the symbol +1 designate the nucleotide that is the transcription start site (TSS).

Most human genes lack a TATA box and use an Inr or downstream promoter element instead. As in other metazoans, for genes lacking a TATA box, the Inr is functionally analogous to the TATA box: it directs transcription initiation and has the consensus 5'-YYA+1NWYY-3', where Y is C or T, W is A or T, and N is any nucleotide. On the template strand (used as a template for RNA synthesis), the complementary consensus sequence is 3'-RRT+1NWRR-5', where R is A or G.

The Inr is the only element in metazoan protein-encoding genes known to be a functional analog of the TATA box, in that it is sufficient for directing accurate transcription initiation in genes that lack TATA boxes. An Inr for mammalian RNA polymerase II can be defined as a DNA sequence element that overlaps a TSS and is sufficient for
 * 1) determining the start site location in a promoter that lacks a TATA box and
 * 2) enhancing the strength of a promoter that contains a TATA box.
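The Inr consensus can be turned into a pattern search. The Python sketch below is a minimal illustration, assuming the sequence is written 5' to 3' on the coding strand and using the standard IUPAC meanings of Y, W, and N. The names consensus_to_regex and inr_start_sites and the test sequence are hypothetical. A match only marks a candidate start site; it does not show that the element is functional.

```python
import re

# IUPAC nucleotide codes needed for the Inr consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]",    # pyrimidine
    "W": "[AT]",    # weak pair
    "N": "[ACGT]",  # any nucleotide
}

INR_CONSENSUS = "YYANWYY"  # 5'->3'; the A is the +1 nucleotide
PLUS_ONE_OFFSET = 2        # 0-based offset of that A within the motif


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[code] for code in consensus)


def inr_start_sites(coding_strand):
    """Return 0-based indices of candidate +1 nucleotides in the sequence."""
    # A lookahead is used so that overlapping matches are all reported.
    pattern = re.compile("(?=%s)" % consensus_to_regex(INR_CONSENSUS))
    seq = coding_strand.upper()
    return [m.start() + PLUS_ONE_OFFSET for m in pattern.finditer(seq)]


print(consensus_to_regex(INR_CONSENSUS))  # [CT][CT]A[ACGT][AT][CT][CT]
print(inr_start_sites("GGGTCAGTCTGGG"))   # [5]
```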

"Although any isolated TAF may not exhibit sequence-specific interactions at the Inr element in the absence of a TATA-box, a combination of TAFs may bind sequence specifically to the Inr element regardless of the TATA-box and/or DPE (Chalkley and Verrijzer, 1999)." Bold added.

TAF1
GeneID: 6872, TAF1 "binds to core promoter sequences encompassing the transcription start site. It also binds to activators and other transcriptional regulators, and these interactions affect the rate of transcription initiation." TAF1 "is part of a complex transcriptional unit (TAF1/DYT3)".

"Yeast TAF1 can be divided into four regions including a putative histone acetyltransferase domain and TBP, TAF, and promoter binding domains."

"TAF1 [has] been systematically dissected into ... functional domains: an N-terminal TBP-binding domain termed TAND, a TAF-TAF interaction domain, a putative histone acetyltransferase (HAT) domain, ... a promoter recognition domain [(PB1), and] a ... domain that interacts with TAF7". The promoter recognition domain is approximately at one end of the gene for TAF1.

On human chromosome X (number 23, NC_000023), specifically NC_000023.10, TAF1 is located 3'-70586113[-70685855]-5', 99,742 nt. The 3'-UTR begins at 70586114.
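The quoted length can be checked from the two coordinates. The Python sketch below assumes that both numbers are 1-based positions on NC_000023.10; the function name span_length is hypothetical. The plain difference of the coordinates reproduces the 99,742 figure, whereas counting both end positions gives one nucleotide more.

```python
def span_length(start, end, inclusive=True):
    """Length of a coordinate interval; inclusive counts both end positions."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + (1 if inclusive else 0)


TAF1_START = 70586113  # coordinates as quoted in the text
TAF1_END = 70685855

print(span_length(TAF1_START, TAF1_END, inclusive=False))  # 99742
print(span_length(TAF1_START, TAF1_END))                   # 99743
```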

TAF1 isoform 1
Isoform 1 (variant 1) "represents the longer transcript and encodes the longer isoform".

TAF1 isoform 2
Isoform 2 (variant 2) "uses an alternate in-frame splice site, compared to variant 1, resulting in a shorter isoform (2) that lacks an internal 21 aa segment, compared to isoform 1."

TAF1/DYT3
TAF1/DYT3 is "a complex transcript system that is composed of at least 43 exons. Thirty-eight exons code for [TAF1] ... Five downstream exons (d1-d5) ... can either form transcripts with TAF1 exons or be transcribed independently." "Transcripts including exons d1, d3, d4, d5, plus TAF1 exons make up transcript "variant 1." Major "variant 2" is composed of various TAF1 exons plus exons d3 and d4. Alternately, d exons can generate transcripts independent of TAF1 exons 1-38. Exons d2, d3, and d4 make up "variant 3" and exons d3 and d4 constitute "variant 4."" The "additional five exons are located 3' to exon 38 ... ("downstream" exons 1-5)".
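The exon composition quoted above can be kept as a small lookup table. The Python sketch below is one possible representation, not an established data format: the dictionary layout and the names VARIANTS, total_exons, and independent_variants are hypothetical. Because the quotation does not say which TAF1 exons each variant uses, TAF1 exon usage is recorded only as true or false.

```python
TAF1_EXON_COUNT = 38                                # exons coding for TAF1
DOWNSTREAM_EXONS = ("d1", "d2", "d3", "d4", "d5")   # the five downstream exons

# Which downstream exons each variant contains, and whether it also
# includes TAF1 exons, as described in the quoted passage.
VARIANTS = {
    "variant 1": {"taf1_exons": True, "d_exons": ("d1", "d3", "d4", "d5")},
    "variant 2": {"taf1_exons": True, "d_exons": ("d3", "d4")},
    "variant 3": {"taf1_exons": False, "d_exons": ("d2", "d3", "d4")},
    "variant 4": {"taf1_exons": False, "d_exons": ("d3", "d4")},
}


def total_exons():
    """Minimum exon count of the TAF1/DYT3 transcript system."""
    return TAF1_EXON_COUNT + len(DOWNSTREAM_EXONS)


def independent_variants():
    """Variants transcribed independently of the TAF1 exons."""
    return [name for name, v in VARIANTS.items() if not v["taf1_exons"]]


print(total_exons())           # 43
print(independent_variants())  # ['variant 3', 'variant 4']
```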

TAF1/TAF2
"[A] [TAF1]-[TAF2] complex selects sequences that match the Initiator (Inr) consensus."

TAF1A
GeneID: 9015, TAF1A is a TATA box binding protein (TBP)-associated factor for RNA polymerase I. It has two isoforms.

TAF1B
GeneID: 9014, TAF1B is a TATA box binding protein (TBP)-associated factor for RNA polymerase I.

TAF1C
GeneID: 9013, TAF1C is a TATA box binding protein (TBP)-associated factor for RNA polymerase I. It has two isoforms.

TAF1L
GeneID: 138474, TAF1L "is expressed in male germ cells, and the product has been shown to function interchangeably with the TAF1 product."

TAF2
GeneID: 6873, TAF2 "is stably associated with the TFIID complex. It contributes to interactions at and downstream of the transcription initiation site, interactions that help determine transcription complex response to activators."

TAF3
GeneID: 83860, TAF3 is part of the "set of TBP-associated factors (TAFs) [which] contribute to promoter recognition and selectivity and act as antiapoptotic factors".

TAF4
GeneID: 6874, TAF4 "has been shown to potentiate transcriptional activation by retinoic acid, thyroid hormone and vitamin D3 receptors. In addition, [it] interacts with the transcription factor CREB, which has a glutamine-rich activation domain, and binds to other proteins containing glutamine-rich regions."

TAF4B
GeneID: 6875, TAF4B is "a cell type-specific TAF that may be responsible for mediating transcription by a subset of activators in B cells."

TAF5
GeneID: 6877, TAF5 is "an integral subunit of TFIID associated with all transcriptionally competent forms of that complex. [It] interacts strongly with two TFIID subunits that show similarity to histones H3 and H4, and it may participate in forming a nucleosome-like core in the TFIID complex."

TAF6
GeneID: 6878, TAF6 "binds weakly to TBP but strongly to TAF1". It has four isoforms.

TAF7
GeneID: 6879, TAF7 "interacts with the largest TFIID subunit, as well as multiple transcription activators. [It] is required for transcription by promoters targeted by RNA polymerase II."

TAF7L
GeneID: 54457, TAF7L "could be a spermatogenesis-specific component of the DNA-binding general transcription factor complex TFIID." It has two isoforms.

TAF8
GeneID: 129685, TAF8 "contains an H4-like histone fold domain, and interacts with several subunits of TFIID including TBP and the histone-fold protein TAF10."

TAF9
GeneID: 6880, TAF9 "binds to the basal transcription factor GTF2B as well as to several transcriptional activators such as p53 and VP16." It has four isoforms.

TAF10
GeneID: 6881, TAF10 "is associated with a subset of TFIID complexes. Studies with human and mammalian cells have shown that this subunit is required for transcriptional activation by the estrogen receptor, for progression through the cell cycle, and may also be required for certain cellular differentiation programs."

TAF11
GeneID: 6882, TAF11 "is present in all TFIID complexes and interacts with TBP. This subunit also interacts with another small subunit, TAF13, to form a heterodimer with a structure similar to the histone core structure."

TAF12
GeneID: 6883, "TAF12 interacts directly with TBP as well as with TAF2I [(TAF11)]."

TAF13
GeneID: 6884, TAF13 "interacts with TBP and with two other small subunits of TFIID, TAF10 and TAF11."

TAF15
GeneID: 8148, TAF15 encodes "a subunit of TFIID present in a subset of TFIID complexes. Translocations involving chromosome 17 and chromosome 9, where the gene for the nuclear receptor CSMF is located, result in a gene fusion product that is an RNA binding protein associated with a subset of extraskeletal myxoid chondrosarcomas."

TAF15 has two isoforms.

General transcription factor II Ds
Before the start of transcription, the transcription factor II D (TFIID) complex binds to the core promoter of the gene.

Hypotheses

 * 1) TAFs are not involved in the transcription of A1BG.