Gene transcriptions/Boxes/Enhancers/Laboratory

A laboratory is a specialized activity where a student, teacher, or researcher can have hands-on, or as close to hands-on as possible, experience actively analyzing an entity, source, or object of interest.

Usually, expensive equipment, instruments, and/or machinery are available for taking the entity apart to see and accurately record how it works, what it's made of, and where it came from. This may involve simple experiments to test reality, collect data, and try to make some sense out of it.

Expensive equipment can be replaced with more readily available tools.

Notations
You are free to create your own notation or use that already presented. A method to statistically assess your locator is also needed.

Laboratory control group
A laboratory control group of some large number of laboratory test subjects or results may be used to define normal limits for the presence of an effect.

Instructions
This laboratory is an activity for you to explore the universe for, to create a method for, or to examine. While it is part of the, it is also independent.

Some suggested entities to consider are
 * 1) available classification,
 * 2) human genes,
 * 3) eukaryotes,
 * 4) nucleotides,
 * 5) classical physics quantities, or
 * 6) geometry.

More importantly, there are your entities.

You may choose to define your entities or use those already available.

Usually, research follows someone else's ideas of how to do something. But, in this laboratory you can create these too.

This is a gene project laboratory, but you may create what a laboratory, or a  is.

Yes, this laboratory is structured.

I will provide an example. The rest is up to you.

Questions, if any, are best placed on the Discuss page.

To include your participation in each of these laboratories, create a subpage of your user page once you register at Wikiversity, and use this subpage, for example, your online name/laboratory effort.

Enjoy learning by doing!

Hypotheses

 * 1) A1BG is not transcribed by an enhancer box.
 * 2) Existence of an enhancer box on either side of A1BG does not prove that it is actively used to transcribe A1BG.
 * 3) A1BG is not transcribed by a downstream enhancer box.

Introduction
An enhancer is a short region of DNA that can be bound with proteins (namely, the trans-acting factors, much like a set of transcription factors) to enhance transcription levels of genes (hence the name) in a gene cluster.

Some "eukaryotic genes located on separate chromosomes [associate] physically in the nucleus via interactions that may have a function in coordinating gene expression."

"Transcriptional regulatory elements such as locus control regions, enhancers or insulators act by repositioning specific genetic loci to regions with active or silent transcription [6]. Furthermore, sequence-specific DNA-binding proteins may confer their action by directly repositioning these loci to relevant chromatin compartments [7–12]."

In eukaryotic cells the chromatin complex of DNA is folded in such a way that, although the enhancer DNA is far from the gene in terms of the number of nucleotides, it is geometrically close to the promoter and gene.

An enhancer may be located upstream or downstream of the gene it regulates.

Enhancers do not act on the promoter region itself, but are bound by activator proteins. These activator proteins interact with the mediator complex, which recruits polymerase II and the general transcription factors which then begin transcribing the genes. Enhancers can also be found within introns. An enhancer's orientation may even be reversed without affecting its function. Additionally, an enhancer may be excised and inserted elsewhere in the chromosome, and still affect gene transcription.

The E-box is a control element in immunoglobulin heavy-chain promoters.

The consensus sequence for the E-box element is CANNTG, with a palindromic canonical sequence of CACGTG.
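The consensus above can be turned into a simple pattern search. The following is a minimal sketch, not the laboratory's own Basic program: the function name `find_e_boxes` and the demo sequence are hypothetical, and the search reads the string left to right without regard to strand or direction.

```python
import re

# N in the consensus CANNTG stands for any of A, C, G, or T.
# The lookahead lets overlapping hexamers be reported as well.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_e_boxes(sequence):
    """Return (1-based position, hexamer) for every CANNTG in the sequence."""
    sequence = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in E_BOX.finditer(sequence)]

# Hypothetical demo sequence, not taken from the A1BG data.
demo = "TTCACGTGAACAGCTGTT"
for position, hexamer in find_e_boxes(demo):
    print(position, hexamer)
```

For the demo string this reports the canonical CACGTG at position 3 and CAGCTG at position 11.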

Proximal promoters
The proximal sequence upstream of the gene that tends to contain primary regulatory elements is a proximal promoter.

It is approximately 250 base pairs, or nucleotides (nts), upstream of the transcription start site.

There may be an E box in the proximal promoter of some genes.

Distal promoters
An E-box usually lies within the distal promoter (starting at or near -300 nts), within the proximal promoter, or in both.

Samplings
Once you've decided on an entity, source, or object, compose a method, way, or procedure to explore it.

One way is to perceive (see, feel, hear, taste, or touch, for example) whether there is more than one of them.

Ask some questions about it.

Does it appear to have a spatial extent?

Is there any change over time?

Can it be profiled with a kind of spectrum, for example, by emitted radiation? Sample by plotting two or more apparent variables against each other, like intensity versus wavelength.

Is there some location, time, or intensity where there isn't one?

Regarding hypothesis 1:

A1BG has four possible transcription directions:
 * 1) on the negative strand from ZSCAN22 to A1BG,
 * 2) on the positive strand from ZSCAN22 to A1BG,
 * 3) on the negative strand from ZNF497 to A1BG, and
 * 4) on the positive strand from ZNF497 to A1BG.

For each transcription promoter that interacts directly with RNA polymerase II holoenzyme, the four possible consensus sequences need to be tested on the four possible transcription directions, even though some genes may only be transcribed from the negative strand in the 3'-direction on the transcribed strand.
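The combinations to be tested can be enumerated mechanically. The sketch below assumes the naming pattern visible in this laboratory's program listing (SuccessablesE, then a variant suffix, then a strand sign and a direction sign); the variable names are hypothetical.

```python
from itertools import product

# Variant suffixes: "" = sequence as written, "c" = complement,
# "ci" = inverse complement, "i" = inverse.
VARIANTS = ["", "c", "ci", "i"]
STRANDS = ["-", "+"]      # template (-) or coding (+) strand
DIRECTIONS = ["-", "+"]   # negative or positive direction

program_names = [
    f"SuccessablesE{variant}{strand}{direction}.bas"
    for variant, strand, direction in product(VARIANTS, STRANDS, DIRECTIONS)
]

for number, name in enumerate(program_names, start=1):
    print(number, name)
```

Four sequence variants times two strands times two directions gives sixteen searches in all.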

The Basic programs (starting with SuccessablesE.bas) compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each program, the sequence it looks for, the number of occurrences found, and their positions are:
 * 1) negative strand in the negative direction is SuccessablesE--.bas, looking for 3'-C-A-(A/C/G/T)-(A/C/G/T)-T-G-5', found 4: 3'-CACATG-5' at 324, 797, 2213, and 2342,
 * 2) negative strand in the positive direction is SuccessablesE-+.bas, looking for 3'-C-A-(A/C/G/T)-(A/C/G/T)-T-G-5', found 21: 3'-CACATG-5' at 1186, 1238, 1871, 1933, 2031, 2140, 2153, 2266, 2473, 3140, 3335, 3580, 3707, 3742, 3827, 3900, 3956, 4153, 4221, 4364, and 4370,
 * 3) positive strand in the negative direction is SuccessablesE+-.bas, looking for 3'-C-A-(A/C/G/T)-(A/C/G/T)-T-G-5', found 17: 3'-CACATG-5' at 123, 200, 952, 1206, 1849, 1952, 2151, 2276, 2322, 2533, 2613, 2667, 2751, 2783, 4106, 4116, and 4247,
 * 4) positive strand in the positive direction is SuccessablesE++.bas, looking for 3'-C-A-(A/C/G/T)-(A/C/G/T)-T-G-5', found 4: 3'-CACATG-5' at 126, 565, 2596, and 3114,
 * 5) complement, negative strand, negative direction is SuccessablesEc--.bas, looking for 3'-G-T-(A/C/G/T)-(A/C/G/T)-A-C-5', found 17: 3'-GTGTAC-5' at 123, 200, 952, 1206, 1849, 1952, 2151, 2276, 2322, 2533, 2613, 2667, 2751, 2783, 4106, 4116, and 4247,
 * 6) complement, negative strand, positive direction is SuccessablesEc-+.bas, looking for 3'-G-T-(A/C/G/T)-(A/C/G/T)-A-C-5', found 0,
 * 7) complement, positive strand, negative direction is SuccessablesEc+-.bas, looking for 3'-G-T-(A/C/G/T)-(A/C/G/T)-A-C-5', found 4: 3'-GTGTAC-5' at 324, 797, 2213, and 2342,
 * 8) complement, positive strand, positive direction is SuccessablesEc++.bas, looking for 3'-G-T-(A/C/G/T)-(A/C/G/T)-A-C-5', found 21: 3'-GTGTAC-5' at 1186, 1238, 1871, 1933, 2031, 2140, 2153, 2266, 2473, 3140, 3335, 3580, 3707, 3742, 3827, 3900, 3956, 4153, 4221, 4364, and 4370,
 * 9) inverse complement, negative strand, negative direction is SuccessablesEci--.bas, looking for 3'-C-A-(A/C/G/T)-(A/C/G/T)-T-G-5', found 4: 3'-CACATG-5' at 324, 797, 2213, and 2342,
 * 10) inverse complement, negative strand, positive direction is SuccessablesEci-+.bas, looking for 3'-C-A-(A/C/G/T)-(A/C/G/T)-T-G-5', found 21: 3'-CACATG-5' at 1186, 1238, 1871, 1933, 2031, 2140, 2153, 2266, 2473, 3140, 3335, 3580, 3707, 3742, 3827, 3900, 3956, 4153, 4221, 4364, and 4370,
 * 11) inverse complement, positive strand, negative direction is SuccessablesEci+-.bas, looking for 3'-C-A-(A/C/G/T)-(A/C/G/T)-T-G-5', found 17: 3'-CACATG-5' at 123, 200, 952, 1206, 1849, 1952, 2151, 2276, 2322, 2533, 2613, 2667, 2751, 2783, 4106, 4116, and 4247,
 * 12) inverse complement, positive strand, positive direction is SuccessablesEci++.bas, looking for 3'-C-A-(A/C/G/T)-(A/C/G/T)-T-G-5', found 4: 3'-CACATG-5' at 126, 565, 2596, and 3114,
 * 13) inverse, negative strand, negative direction is SuccessablesEi--.bas, looking for 3'-G-T-(A/C/G/T)-(A/C/G/T)-A-C-5', found 17: 3'-GTGTAC-5' at 123, 200, 952, 1206, 1849, 1952, 2151, 2276, 2322, 2533, 2613, 2667, 2751, 2783, 4106, 4116, and 4247,
 * 14) inverse, negative strand, positive direction is SuccessablesEi-+.bas, looking for 3'-G-T-(A/C/G/T)-(A/C/G/T)-A-C-5', found 4: 3'-GTGTAC-5' at 126, 565, 2596, and 3114,
 * 15) inverse, positive strand, negative direction is SuccessablesEi+-.bas, looking for 3'-G-T-(A/C/G/T)-(A/C/G/T)-A-C-5', found 4: 3'-GTGTAC-5' at 324, 797, 2213, and 2342,
 * 16) inverse, positive strand, positive direction is SuccessablesEi++.bas, looking for 3'-G-T-(A/C/G/T)-(A/C/G/T)-A-C-5', found 21: 3'-GTGTAC-5' at 1186, 1238, 1871, 1933, 2031, 2140, 2153, 2266, 2473, 3140, 3335, 3580, 3707, 3742, 3827, 3900, 3956, 4153, 4221, 4364, and 4370.
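The complement, inverse, and inverse-complement search patterns in the list above can be derived from the consensus itself. This is a small sketch with hypothetical helper names; it treats each motif as a plain string and leaves the 3'/5' labelling to the reader.

```python
# Base-pairing table; N (any nucleotide) pairs with N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def complement(motif):
    """Swap each base for its pairing partner, keeping the order."""
    return motif.translate(COMPLEMENT)

def inverse(motif):
    """Read the motif in the opposite order."""
    return motif[::-1]

def inverse_complement(motif):
    """Complement the motif, then read it in the opposite order."""
    return complement(motif)[::-1]

consensus = "CANNTG"
variants = {
    "as written": consensus,
    "complement": complement(consensus),
    "inverse": inverse(consensus),
    "inverse complement": inverse_complement(consensus),
}
for name, motif in variants.items():
    print(f"{name:>18}: {motif}")
```

For this consensus the complement and the inverse coincide (GTNNAC) and the inverse complement equals the original (CANNTG), which agrees with the "looking for" patterns in the list.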

Verifications
To verify that your sampling has explored something, you may need a control group. Perhaps where, when, or without your entity, source, or object may serve.
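One possible control is the number of E boxes expected by chance. The sketch below makes a strong simplifying assumption that is not part of this laboratory: every base is independent and equally likely. It also offers a shuffled-sequence control that keeps the real base composition. All function names are hypothetical.

```python
import random
import re

E_BOX = re.compile(r"(?=CA[ACGT]{2}TG)")

def count_e_boxes(sequence):
    """Count CANNTG occurrences, overlapping ones included."""
    return sum(1 for _ in E_BOX.finditer(sequence.upper()))

def expected_hits(length, fixed_bases=4, motif_length=6):
    """Expected count if all four bases were independent and equally likely.

    CANNTG fixes four of its six positions, so each window matches
    with probability (1/4) ** 4.
    """
    windows = max(length - motif_length + 1, 0)
    return windows * 0.25 ** fixed_bases

def shuffled_control_counts(sequence, trials=5, seed=0):
    """Count E boxes in shuffled copies that keep the base composition."""
    rng = random.Random(seed)
    bases = list(sequence)
    counts = []
    for _ in range(trials):
        rng.shuffle(bases)
        counts.append(count_e_boxes("".join(bases)))
    return counts

print(expected_hits(4460))
```

Under that uniform assumption a 4460-nt region would be expected to contain roughly 17.4 matches by chance; a shuffled control built from the actual sequence is the fairer comparison, because real base composition is rarely uniform.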

Another verifier is reproducibility. Can you replicate something about your entity in your laboratory more than three times? Five times is usually a beginning number to provide statistics (data) about it.

For an apparent one-time or perception event, document or record as much coincident information as possible. Was there a butterfly nearby?

Has anyone else perceived the entity and recorded something about it?

Gene ID: 1 includes the nucleotides between neighboring genes and A1BG. These nucleotides can be loaded into files from either gene toward A1BG, and from the template and coding strands. These nucleotide sequences can be found in Gene transcriptions/A1BG. Copying the E boxes discovered above and pasting the sequences into the browser's find function ("⌘F") locates these two sequences at the same nucleotide positions as found by the computer programs.

Core promoters


The core promoter begins approximately 34 nts upstream of the TSS (at about -34).

From the first nucleotide just after ZSCAN22 to the first nucleotide just before A1BG are 4460 nucleotides. The core promoter on this side of A1BG extends from approximately 4425 to the possible transcription start site at nucleotide number 4460.

To extend the analysis from inside and just on the other side of ZNF497 some 3340 nts have been added to the data. This would place the core promoter some 3340 nts further away from the other side of ZNF497. The TSS would be at about 4300 nts with the core promoter starting at 4266.

Def. "the factors, including RNA polymerase II itself, that are minimally essential for transcription in vitro from an isolated core promoter" is called the basal machinery, or basal transcription machinery.

There is no E (enhancer) box in the basal transcription machinery for A1BG in the core promoter for A1BG between ZSCAN22 and A1BG.

There is no E (enhancer) box in the basal transcription machinery for A1BG in the core promoter for A1BG between ZNF497 and A1BG.

Proximal promoter enhancers
Def. a "promoter region [juxtaposed to the core promoter that] binds transcription factors that modify the affinity of the core promoter for RNA polymerase.[12][13]" is called a proximal promoter.

The proximal sequence upstream of the gene that tends to contain primary regulatory elements is a proximal promoter.

It is approximately 250 base pairs, or nucleotides (nts), upstream of the transcription start site.

The proximal promoter begins about nucleotide number 4210 in the negative direction. As such there is an E box within the proximal promoter of A1BG in the negative direction at 4247.

The proximal promoter begins about nucleotide number 708 in the positive direction. There are enhancers at 782, 925, and 931.

Distal promoter enhancers
The "upstream regions of the human CYP11A and bovine CYP11B genes [have] a distal promoter in each gene. The distal promoters are located at −1.8 to −1.5 kb in the upstream region of the CYP11A gene and −1.5 to −1.1 kb in the upstream region of the CYP11B gene."

"Using cloned chicken βA-globin genes, either individually or within the natural chromosomal locus, enhancer-dependent transcription is achieved in vitro at a distance of 2 kb with developmentally staged erythroid extracts. This occurs by promoter derepression and is critically dependent upon DNA topology. In the presence of the enhancer, genes must exist in a supercoiled conformation to be actively transcribed, whereas relaxed or linear templates are inactive. Distal protein–protein interactions in vitro may be favored on supercoiled DNA because of topological constraints."

Distal promoter regions may be a relatively small number of nucleotides, fairly close to the TSS such as (-253 to -54) or several regions of different lengths, many nucleotides away, such as (-2732 to -2600) and (-2830 to -2800).

The "[d]istal promoter is not a spacer element."

Using an estimate of 2 knts, a distal promoter to A1BG would be expected after nucleotide number 2460. The E boxes discovered may be in the distal promoter of ZSCAN22 at 2533, 2613, 2667, 2751, 2783, 4106, and 4116.
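The region assignments on the ZSCAN22 side can be checked with a short script. This sketch takes the boundaries stated in the text (TSS at 4460, core promoter from about 4425, proximal promoter from about 4210, distal estimate from 2460) and the positions reported by SuccessablesE+-.bas; treating each boundary as inclusive is an assumption, and the function name is hypothetical.

```python
TSS = 4460             # possible transcription start site, counted from ZSCAN22
CORE_START = 4425      # approximate start of the core promoter
PROXIMAL_START = 4210  # about 250 nts upstream of the TSS
DISTAL_START = 2460    # estimate of 2 knts upstream of the TSS

# Positions reported by SuccessablesE+-.bas (positive strand, negative direction).
HITS = [123, 200, 952, 1206, 1849, 1952, 2151, 2276, 2322, 2533, 2613,
        2667, 2751, 2783, 4106, 4116, 4247]

def region(position):
    """Name the promoter region that a nucleotide position falls in."""
    if position > TSS:
        return "downstream"
    if position >= CORE_START:
        return "core"
    if position >= PROXIMAL_START:
        return "proximal"
    if position >= DISTAL_START:
        return "distal"
    return "beyond distal estimate"

by_region = {}
for pos in HITS:
    by_region.setdefault(region(pos), []).append(pos)

for name, positions in by_region.items():
    print(name, positions)
```

With these boundaries no hit falls in the core promoter, 4247 falls in the proximal promoter, and the seven positions from 2533 to 4116 fall in the distal estimate.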

If there are any distal enhancers between ZNF497 and A1BG, they are inside the gene for ZNF497, as there are only 858 nts between them. The enhancer at 782 could be considered as in the distal promoter.

Transcribed enhancer boxes
"MYC is a basic helix-loop-helix transcription factor, evolutionarily conserved in all vertebrates with a considerable amount of sequence similarity (Atchley & Fitch, 1995). It binds to thousands of promoters in mammalian cells as MYC-MAX heterodimer (Blackwood & Eisenman, 1991; C. Y. Lin et al., 2012). In particular it binds the motif CACGTG of the enhancer box (E-box) in the core promoter of active genes. Depending on the target gene, MYC can act as transcriptional activator or repressor, and, can affect transcription at both initiation and elongation steps (Rahl et al., 2010)."

"MYC mediates the transcriptional response of growth-factors stimulation. Importantly, MYC does not only regulate the expression of mRNA(s), it also regulates ribosomal and tRNA genes, transcribed by the RNA Pol I and RNA Pol III respectively (Campbell & White, 2014; Dai, Sun, & Lu, 2010; Mitchell et al., 2015). Amongst the major gene ontology categories of protein-coding genes under the control of MYC there are: ribosome biogenesis, apoptosis, cell adhesion, cell size, angiogenesis and metabolic pathways (Nieminen, Partanen, & Klefstrom, 2007; Peterson & Ayer, 2011; A. M. Singh & Dalton, 2009; Uslu et al., 2014; van Riggelen, Yetil, & Felsher, 2010)."

Laboratory reports
Below is an outline for sections of a report, paper, manuscript, log book entry, or lab book entry. You may create your own, of course.

Enhancer-box transcription of A1BG

by Marshallsumter, 02:34, 8 September 2017 (UTC)

Abstract
By combining a literature search with computer analysis of the promoters between ZSCAN22 and A1BG and between ZNF497 and A1BG, enhancer boxes have been found. To show that these enhancer boxes can be used during or for transcription of A1BG, at least one transcription factor has been found.

Introduction
According to one source, A1BG is transcribed from the direction of ZNF497: 3' - 58864890: CGAGCCACCCCACCGCCCTCCCTTGG+1GGCCTCATTGCTGCAGACGCTCACCCCAGACACTCACTGCACCGGAGTGAGCGCGACCATCATG : 58866601-5', where the second 'G' at left of four Gs in a row is the TSS. Transcription was triggered in cell cultures and the transcription start site was found using reverse transcriptase. But, the mechanism for transcription is unknown.

Controlling the transcription of A1BG may have significant immune function against snake envenomation. A1BG forms a complex that is similar to those formed between toxins from snake venom and A1BG-like plasma proteins. These inhibit the toxic effect of snake venom metalloproteinases or myotoxins and protect the animal from envenomation.

Many transcription factors (TFs) occur upstream and occasionally downstream of the transcription start site (TSS), in a gene's promoter. It isn't known which, if any, assist in locating and affixing the transcription mechanism for A1BG. This examination is the first to test one such DNA-occurring TF: the E box.

Experiment
The first hypothesis required at least one computer experiment to look for enhancer boxes on either side of A1BG.

The computer program search of the nucleotides between ZSCAN22 and A1BG located enhancer boxes at -2118, -2247, -3663, and -4136 nts upstream from the TSS at 4460 from ZSCAN22.

The search between ZNF497 and A1BG found ten potential enhancer boxes at -717, -590, -555, -470, -397, -341, -144, -76, +67, and +73 nts (eight upstream and two downstream) from the TSS at 858 from ZNF497.
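The relation between TSS-relative offsets and nucleotide positions on the ZNF497 side is simple arithmetic. This is a sketch with hypothetical function names, using the TSS at 858 and the ten offsets reported above.

```python
TSS = 858  # transcription start site, counted from ZNF497

# Offsets reported for the ZNF497 side (negative = upstream of the TSS).
OFFSETS = [-717, -590, -555, -470, -397, -341, -144, -76, 67, 73]

def to_position(offset, tss=TSS):
    """Convert a TSS-relative offset to a nucleotide position."""
    return tss + offset

def to_offset(position, tss=TSS):
    """Convert a nucleotide position to a TSS-relative offset."""
    return position - tss

positions = [to_position(offset) for offset in OFFSETS]
upstream = [offset for offset in OFFSETS if offset < 0]
downstream = [offset for offset in OFFSETS if offset > 0]

print(positions)
print(len(upstream), "upstream,", len(downstream), "downstream")
```

The last three offsets (-76, +67, +73) correspond to positions 782, 925, and 931, the enhancers named in the proximal promoter section.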

The extended nucleotide data contain enhancer boxes downstream from the TSS (4300): 3'-CACATG-5' at 4364 and 3'-CACATG-5' at 4370, but no enhancers in the core promoter on the ZNF497 side of A1BG.

Whether A1BG is or can be transcribed by an enhancer box first requires the presence of at least one in its promoter regions. The computer programs systematically went through both the template and coding strands on both sides of A1BG, using the nucleotide sequences stored in the NCBI Gene database, and demonstrated that enhancer boxes exist. But, per the second hypothesis, are any actually used? A literature search found MYC, which "binds the motif CACGTG of the enhancer box (E-box) in the core promoter of active genes" and "can act as transcriptional activator or repressor, and, can affect transcription at both initiation and elongation steps".

Results
The presence of many enhancer boxes on both sides of A1BG demonstrates that hypothesis one, "A1BG is not transcribed by an enhancer box", is false.

The literature search found evidence that at least one transcription factor can enhance or inhibit the transcription of A1BG using one or more enhancer boxes, which disproves hypothesis two: "Existence of an enhancer box on either side of A1BG does not prove that it is actively used to transcribe A1BG".

Regarding hypothesis three, "A1BG is not transcribed by a downstream enhancer box": the finding of E boxes downstream of the TSS suggests that A1BG can be transcribed by a downstream E box, so hypothesis three is likely false.

Discussion
These results show that the presence of enhancer boxes on either side of A1BG implies their use when transcribing A1BG.

No experimental efforts to force transcription of A1BG from either side were performed, nor were the enhancer boxes demonstrated to be used.

A complete description of all the transcription factors that can use an enhancer box to enhance or inhibit transcription is needed.

Conclusion
Enhancer boxes do occur in the proximal and distal promoters of A1BG. And, it is likely that an enhancer box is involved in some way with the transcription of A1BG.

Laboratory evaluations
To assess your example, including your justification, analysis and discussion, I will provide such an assessment of my example for comparison and consideration.

Evaluation

No wet chemistry experiments were performed to confirm that Gene ID: 1 is transcribed from either side using enhancer boxes. The NCBI database is generalized, whereas individual human genome testing could demonstrate that A1BG is transcribed from either side.