Gene transcriptions/Elements/Metal responsives/Laboratory

A laboratory is a specialized activity where a student, teacher, or researcher can have hands-on, or as close to hands-on as possible, experience actively analyzing an entity, source, or object of interest.

Usually, expensive equipment, instruments, and/or machinery are available for taking the entity apart to see and accurately record how it works, what it's made of, and where it came from. This may involve simple experiments to test reality, collect data, and try to make some sense out of it.

Expensive equipment can be replaced with more readily available tools.

Notations
You are free to create your own notation or use that already presented. A method to statistically assess your locator is also needed.

Laboratory control group
A laboratory control group of some large number of laboratory test subjects or results may be used to define normal limits for the presence of an effect.

Instructions
This laboratory is an activity for you to explore the universe for, to create a method for, or to examine an entity of your choosing. While it is part of a larger resource, it is also independent.

Some suggested entities to consider are
 * 1) available classification,
 * 2) human genes,
 * 3) eukaryotes,
 * 4) nucleotides,
 * 5) classical physics quantities,
 * 6) known gene expressions, or
 * 7) geometry.

More importantly, there are your entities.

You may choose to define your entities or use those already available.

Usually, research follows someone else's ideas of how to do something. But, in this laboratory you can create these too.

This is a gene project laboratory, but you may create your own idea of what a laboratory is.

This laboratory is structured.

I will provide an example. The rest is up to you.

Questions, if any, are best placed on the Discuss page.

To include your participation in each of these laboratories, create a subpage of your user page once you register at Wikiversity, and use this subpage (for example, your online name/laboratory effort).

Enjoy learning by doing!

Hypotheses

 * 1) At least two A1BG gene isoforms have their transcription initiated by an MRE.

Introduction
Notation: let the symbol MT stand for metallothionein.

"The metallothionein (MT) genes provide a good example of eucaryotic promoter architecture. MT genes specify the synthesis of low-molecular-weight metal-binding proteins. They are transcriptionally regulated by the metal ions cadmium and zinc (11), glucocorticoid hormones (18), interferon (14), interleukin-1 (22), and tumor promoters (2). The metal ion regulation of MTs is conferred by a short sequence element called the metal-responsive element (MRE [21]) or TGC box (31, 34), which functions as a metal ion-dependent enhancer."

"[T]hree potential metal response elements (MREs) [overlap] the E-boxes in the repeats, (TGCACGT with TGCRCNC being the consensus sequence; 17,18)."

The reproducible consensus sequence seems to be 3'-TGCRCNC-5', specifically 3'-TGC-(A/G)-C-(A/C/G/T)-C-5'.
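The consensus can be written as a pattern for a computer search. The following is a minimal Python sketch, not one of the laboratory's Basic programs. The names IUPAC, consensus_to_regex, and is_mre are hypothetical, and sequences are compared simply as they are written, left to right.

```python
import re

# IUPAC nucleotide codes: R is a purine (A or G), Y a pyrimidine (C or T), N any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))

MRE = consensus_to_regex("TGCRCNC")

def is_mre(heptamer):
    """Return True if the 7-nt string matches the TGCRCNC consensus exactly."""
    return MRE.fullmatch(heptamer) is not None

print(is_mre("TGCACTC"), is_mre("TGCGCCC"), is_mre("TGCTCAC"))
```

The first two heptamers satisfy the consensus; the third does not, because its fourth nucleotide is T rather than A or G.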

Core promoters


The core promoter begins approximately 34 nts upstream of the transcription start site (TSS), at about position -34.

From the first nucleotide just after ZSCAN22 to the first nucleotide just before A1BG are 4460 nucleotides. The core promoter on this side of A1BG extends from approximately 4425 to the possible transcription start site at nucleotide number 4460.

From the first nucleotide just before ZNF497 to the first nucleotide just before A1BG are 4300 nucleotides. The core promoter on this side of A1BG extends from approximately 4265 to the possible transcription start site at nucleotide number 4300.

Def. "the factors, including RNA polymerase II itself, that are minimally essential for transcription in vitro from an isolated core promoter" is called the basal machinery, or basal transcription machinery.

Proximal promoters
Def. a "promoter region [juxtaposed to the core promoter that] binds transcription factors that modify the affinity of the core promoter for RNA polymerase.[12][13]" is called a proximal promoter.

The proximal sequence upstream of the gene that tends to contain primary regulatory elements is a proximal promoter.

It extends approximately 250 base pairs (bp), or nucleotides (nts), upstream of the transcription start site.

The proximal promoter begins about nucleotide number 4210 in the negative direction.

The proximal promoter begins about nucleotide number 4050 in the positive direction.

Distal promoters
The "upstream regions of the human CYP11A and bovine CYP11B genes [have] a distal promoter in each gene. The distal promoters are located at −1.8 to −1.5 kb in the upstream region of the CYP11A gene and −1.5 to −1.1 kb in the upstream region of the CYP11B gene."

"Using cloned chicken βA-globin genes, either individually or within the natural chromosomal locus, enhancer-dependent transcription is achieved in vitro at a distance of 2 kb with developmentally staged erythroid extracts. This occurs by promoter derepression and is critically dependent upon DNA topology. In the presence of the enhancer, genes must exist in a supercoiled conformation to be actively transcribed, whereas relaxed or linear templates are inactive. Distal protein–protein interactions in vitro may be favored on supercoiled DNA because of topological constraints."

Distal promoter regions may be a relatively small number of nucleotides, fairly close to the TSS such as (-253 to -54) or several regions of different lengths, many nucleotides away, such as (-2732 to -2600) and (-2830 to -2800).

The "[d]istal promoter is not a spacer element."

Using an estimate of 2 knts, a distal promoter to A1BG would be expected after nucleotide number 2460 from ZSCAN22.

If there are any transcription factors between ZNF497 and A1BG, they are inside the gene for ZNF497, as there are only 858 nts between them. The data set has been expanded to just beyond ZNF497 and now contains 4300 nts to A1BG, putting the distal promoter between 2300 and 4300.
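The nucleotide numbers quoted for the core, proximal, and distal promoters follow from subtracting a window length from the position of the possible TSS. A minimal Python sketch of that arithmetic is given here. The function name promoter_starts is hypothetical, and the core length of 35 nts is an assumption chosen to reproduce the 4425 and 4265 figures; the 250 nt and 2 knt lengths are the estimates stated in the text.

```python
def promoter_starts(tss, core_len=35, proximal_len=250, distal_len=2000):
    """Return the nucleotide number at which each promoter region begins.

    All lengths are estimates; core_len=35 is an assumed value.
    """
    return {
        "core": tss - core_len,
        "proximal": tss - proximal_len,
        "distal": tss - distal_len,
    }

negative_direction = promoter_starts(4460)  # from ZSCAN22 toward A1BG
positive_direction = promoter_starts(4300)  # from ZNF497 toward A1BG
print(negative_direction)
print(positive_direction)
```

With these inputs the sketch returns 4425, 4210, and 2460 for the ZSCAN22 side and 4265, 4050, and 2300 for the ZNF497 side.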

Samplings
Once you've decided on an entity, source, or object, compose a method, way, or procedure to explore it.

One way is to perceive (see, feel, hear, taste, or touch, for example) if there is more than one of them.

Ask some questions about it.

Does it appear to have a spatial extent?

Is there any change over time?

Can it be profiled with a kind of spectrum for example, by emitted radiation? Sample by plotting two or more apparent variables against each other, like intensity versus wavelength.

Is there some location, time, intensity, where there isn't one?

Regarding hypothesis 1:

This hypothesis has two parts: (a) there are at least two isoforms and (b) two isoforms are transcribed by an MRE.

There are at least two isoforms
Are there at least two A1BG gene isoforms? A1BG in NCBI Gene lists only one isoform, the gene locus itself, but the encoded protein is a precursor subject to translational or, more likely, post-translational modifications.

Mention has been made of "new genetic variants of A1BG."

"Proteomic analysis revealed that [a circulating] set of plasma proteins was α 1 B-glycoprotein (A1BG) and its post-translationally modified isoforms."

There are A1BG genotypes.

The A1BG variant rs893184 is included in a genetic risk score.

"A genetic risk score, including rs16982743, rs893184, and rs4525 in F5, was significantly associated with treatment-related adverse cardiovascular outcomes in whites and Hispanics from the INVEST study and in the Nordic Diltiazem study (meta-analysis interaction P=2.39×10−5)."

"rs893184 causes a histidine (His) to arginine (Arg) [nonsynonymous single nucleotide polymorphism (nsSNP), A (minor) for G (major)] substitution at amino acid position 52 in A1BG."

Two isoforms are transcribed by an MRE
A1BG has four possible transcription directions:
 * 1) on the negative strand from ZSCAN22 to A1BG,
 * 2) on the positive strand from ZSCAN22 to A1BG,
 * 3) on the negative strand from ZNF497 to A1BG, and
 * 4) on the positive strand from ZNF497 to A1BG.

For each transcription promoter that interacts directly with RNA polymerase II holoenzyme, the four possible consensus sequences need to be tested on the four possible transcription directions, even though some genes may only be transcribed from the negative strand in the 3'-direction on the transcribed strand.

Basic programs (starting with SuccessablesMRE.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below names the program, the consensus sequence it looks for, the number of matches found, and each match followed by its nucleotide position:
 * 1) negative strand in the negative direction is SuccessablesMRE--.bas, looking for 3'-T-G-C-(A/G)-C-(A/C/G/T)-C-5', 0,
 * 2) negative strand in the positive direction is SuccessablesMRE-+.bas, looking for 3'-T-G-C-(A/G)-C-(A/C/G/T)-C-5', 11, 3'-TGCGCCC-5', 453, 3'-TGCACAC-5', 549, 3'-TGCACAC-5', 1221, 3'-TGCGCCC-5', 1247, 3'-TGCACTC-5', 1373, 3'-TGCGCCC-5', 1399, 3'-TGCACTC-5', 1473, 3'-TGCGCCC-5', 1499, 3'-TGCGCCC-5', 1657, 3'-TGCACAC-5', 2963, 3'-TGCACCC-5', 3323,
 * 3) positive strand in the negative direction is SuccessablesMRE+-.bas, looking for 3'-T-G-C-(A/G)-C-(A/C/G/T)-C-5', 7, 3'-TGCGCTC-5', 891, 3'-TGCACTC-5', 1348, 3'-TGCACTC-5', 2001, 3'-TGCACTC-5', 2427, 3'-TGCACCC-5', 2762, 3'-TGCACTC-5', 3290, 3'-TGCACTC-5', 4341,
 * 4) positive strand in the positive direction is SuccessablesMRE++.bas, looking for 3'-T-G-C-(A/G)-C-(A/C/G/T)-C-5', 2, 3'-TGCGCCC-5', 872, 3'-TGCGCCC-5', 972,
 * 5) complement, negative strand, negative direction is SuccessablesMREc--.bas, looking for 3'-A-C-G-(T/C)-G-(A/C/G/T)-G-5', 7, 3'-ACGCGAG-5', 891, 3'-ACGTGAG-5', 1348, 3'-ACGTGAG-5', 2001, 3'-ACGTGAG-5', 2427, 3'-ACGTGGG-5', 2762, 3'-ACGTGAG-5', 3290, 3'-ACGTGAG-5', 4341,
 * 6) complement, negative strand, positive direction is SuccessablesMREc-+.bas, looking for 3'-A-C-G-(T/C)-G-(A/C/G/T)-G-5', 2, 3'-ACGCGGG-5', 872, 3'-ACGCGGG-5', 972,
 * 7) complement, positive strand, negative direction is SuccessablesMREc+-.bas, looking for 3'-A-C-G-(T/C)-G-(A/C/G/T)-G-5', 0,
 * 8) complement, positive strand, positive direction is SuccessablesMREc++.bas, looking for 3'-A-C-G-(T/C)-G-(A/C/G/T)-G-5', 11, 3'-ACGCGGG-5', 453, 3'-ACGTGTG-5', 549, 3'-ACGTGTG-5', 1221, 3'-ACGCGGG-5', 1247, 3'-ACGTGAG-5', 1373, 3'-ACGCGGG-5', 1399, 3'-ACGTGAG-5', 1473, 3'-ACGCGGG-5', 1499, 3'-ACGCGGG-5', 1657, 3'-ACGTGTG-5', 2963, 3'-ACGTGGG-5', 3323,
 * 9) inverse complement, negative strand, negative direction is SuccessablesMREci--.bas, looking for 3'-G-(A/C/G/T)-G-(T/C)-G-C-A-5', 2, 3'-GTGTGCA-5', 531, 3'-GAGTGCA-5', 1772,
 * 10) inverse complement, negative strand, positive direction is SuccessablesMREci-+.bas, looking for 3'-G-(A/C/G/T)-G-(T/C)-G-C-A-5', 10, 3'-GCGTGCA-5', 546, 3'-GCGCGCA-5', 684, 3'-GGGCGCA-5', 876, 3'-GGGCGCA-5', 976, 3'-GCGTGCA-5', 1218, 3'-GTGCGCA-5', 1523, 3'-GAGTGCA-5', 1786, 3'-GAGTGCA-5', 2326, 3'-GGGTGCA-5', 2800, 3'-GGGTGCA-5', 3883,
 * 11) inverse complement, positive strand, negative direction is SuccessablesMREci+-.bas, looking for 3'-G-(A/C/G/T)-G-(T/C)-G-C-A-5', 2, 3'-GAGTGCA-5', 1470, 3'-GTGTGCA-5', 2863,
 * 12) inverse complement, positive strand, positive direction is SuccessablesMREci++.bas, looking for 3'-G-(A/C/G/T)-G-(T/C)-G-C-A-5', 0,
 * 13) inverse, negative strand, negative direction, is SuccessablesMREi--.bas, looking for 3'-C-(A/C/G/T)-C-(A/G)-C-G-T-5', 2, 3'-CTCACGT-5', 1470, 3'-CACACGT-5', 2863,
 * 14) inverse, negative strand, positive direction, is SuccessablesMREi-+.bas, looking for 3'-C-(A/C/G/T)-C-(A/G)-C-G-T-5', 0,
 * 15) inverse, positive strand, negative direction, is SuccessablesMREi+-.bas, looking for 3'-C-(A/C/G/T)-C-(A/G)-C-G-T-5', 2, 3'-CACACGT-5', 531, 3'-CTCACGT-5', 1772,
 * 16) inverse, positive strand, positive direction, is SuccessablesMREi++.bas, looking for 3'-C-(A/C/G/T)-C-(A/G)-C-G-T-5', 10, 3'-CGCACGT-5', 546, 3'-CGCGCGT-5', 684, 3'-CCCGCGT-5', 876, 3'-CCCGCGT-5', 976, 3'-CGCACGT-5', 1218, 3'-CACGCGT-5', 1523, 3'-CTCACGT-5', 1786, 3'-CTCACGT-5', 2326, 3'-CCCACGT-5', 2800, 3'-CCCACGT-5', 3883.
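The four forms of the consensus searched for in the list above (direct, complement, inverse, and inverse complement) can be derived mechanically from TGCRCNC. The following Python sketch is an illustration only and is not a translation of the Successables Basic programs. The names variants and scan and the toy sequence are hypothetical. Reporting the position of the last nucleotide of each match is an assumption, suggested by the overlapping matches listed at 546 and 549.

```python
import re

# IUPAC codes: R is A or G, Y is C or T, N is any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
# Complementing swaps A/T, C/G, and the ambiguity codes R/Y; N stays N.
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")

def variants(consensus):
    """Return the four forms of a consensus that are searched for."""
    comp = consensus.translate(COMPLEMENT)
    return {
        "direct": consensus,
        "complement": comp,
        "inverse": consensus[::-1],
        "inverse complement": comp[::-1],
    }

def scan(sequence, consensus):
    """Return (match, position) pairs, position being the 1-based
    number of the last nucleotide of the match (assumed convention).
    A lookahead is used so that overlapping matches are all reported."""
    body = "".join(IUPAC[base] for base in consensus)
    pattern = re.compile("(?=(" + body + "))")
    return [(m.group(1), m.start() + len(consensus))
            for m in pattern.finditer(sequence)]

TOY = "AAGCGTGCACACTT"  # invented sequence, not from the A1BG data
for name, consensus in variants("TGCRCNC").items():
    print(name, consensus, scan(TOY, consensus))
```

In the toy sequence the inverse complement GCGTGCA ends at nucleotide 9 and the direct match TGCACAC ends at nucleotide 12, three nucleotides apart, as with the matches listed at 546 and 549.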

Verifications
To verify that your sampling has explored something, you may need a control group. Perhaps where, when, or without your entity, source, or object may serve.
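One possible control is the number of matches expected by chance alone. The Python sketch below estimates this for a sequence of a given length, assuming that each position is independently A, C, G, or T with equal probability. That assumption is a simplification that real promoter sequence is unlikely to satisfy. The function names are hypothetical.

```python
from fractions import Fraction

# Number of the four bases that each IUPAC code accepts.
WEIGHT = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "N": 4}

def match_probability(consensus):
    """Probability that a random window matches, under uniform base composition."""
    p = Fraction(1)
    for base in consensus:
        p *= Fraction(WEIGHT[base], 4)
    return p

def expected_hits(length, consensus):
    """Expected number of chance matches in a sequence of the given length."""
    windows = length - len(consensus) + 1
    return windows * match_probability(consensus)

print(match_probability("TGCRCNC"))           # 1/2048
print(float(expected_hits(4460, "TGCRCNC")))  # about 2.17
print(float(expected_hits(4300, "TGCRCNC")))  # about 2.10
```

Under this simplified model about two matches per search are expected by chance, which gives a baseline against which the match counts of 0 to 11 reported above can be compared.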

Another verifier is reproducibility. Can you replicate something about your entity in your laboratory more than 3 times? Five times is usually a beginning number to provide statistics (data) about it.

For an apparent one time or perception event, document or record as much information coincident as possible. Was there a butterfly nearby?

Has anyone else perceived the entity and recorded something about it?

Gene ID: 1 includes the nucleotides between neighboring genes and A1BG. These nucleotides can be loaded into files from either gene toward A1BG, and from template and coding strands. These nucleotide sequences can be found in Gene transcriptions/A1BG. Copying the MRE boxes discovered above into a "⌘F" search locates these sequences at the same nucleotide positions as found by the computer programs.

Core promoter MREs
From the first nucleotide just after ZSCAN22 to the first nucleotide just before A1BG are 4460 nucleotides. The core promoter on this side of A1BG extends from approximately 4425 to the possible transcription start site at nucleotide number 4460.

There are no MREs in the core promoter between 4425 and 4460.

From the first nucleotide just before ZNF497 to the first nucleotide just before A1BG are 4300 nucleotides. The core promoter on this side of A1BG extends from approximately 4265 to the possible transcription start site at nucleotide number 4300.

There are no MREs in the core promoter between 4265 and 4300.

Proximal promoter MREs
The proximal promoter begins about nucleotide number 4210 in the negative direction.

On the positive strand in the negative direction there is an MRE 3'-TGCACTC-5' at 4341. Its complement 3'-ACGTGAG-5' occurs on the negative strand in the negative direction at 4341.

None of the MREs overlap any enhancer boxes per Enhancer box laboratory.

The proximal promoter begins about nucleotide number 4050 in the positive direction.

There is no MRE between 4050 and 4300 nts in the positive direction of the proximal promoter. But, there is 3'-GGGTGCA-5' at 3883. It is on the negative strand in the positive direction and an inverse toward A1BG suggesting it is an MRE toward ZNF497.

Distal promoter MREs
Using an estimate of 2 knts, a distal promoter to A1BG would be expected after nucleotide number 2460.

There are the following MREs on the positive strand, negative direction: 3'-TGCACCC-5', 2762, 3'-TGCACTC-5', 3290. And, their complements on the negative strand, negative direction: 3'-ACGTGGG-5', 2762, 3'-ACGTGAG-5', 3290.

There are inverse MREs 3'-GTGTGCA-5', 2863, and 3'-CACACGT-5', 2863.

Distal MREs in the positive direction, if they exist, would be inside ZNF497 or beyond. The data set has been expanded to just beyond ZNF497 and now contains 4300 nts to A1BG putting the distal promoter between 2300 and 4300.

On the negative strand in the positive direction there are 3'-TGCACAC-5' at 2963 and 3'-TGCACCC-5' at 3323.

There are inverses 3'-GAGTGCA-5' at 2326, 3'-GGGTGCA-5' at 2800 and 3'-GGGTGCA-5' at 3883.

None of the MREs overlap any enhancer boxes per Enhancer box laboratory.
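The assignment of each match to a core, proximal, or distal promoter can be sketched in Python as a comparison of the distance from the match to the possible TSS against the window lengths. The function name classify is hypothetical. The 35 nt core length is an assumption, and the 250 nt and 2 knt lengths are the estimates used above. The positions tested are those reported in the text.

```python
def classify(position, tss, core_len=35, proximal_len=250, distal_len=2000):
    """Name the promoter region containing a match, by distance upstream of the TSS."""
    distance = tss - position
    if distance < 0:
        return "beyond TSS"
    if distance <= core_len:
        return "core"
    if distance <= proximal_len:
        return "proximal"
    if distance <= distal_len:
        return "distal"
    return "upstream of distal window"

# Negative direction (ZSCAN22 side), possible TSS at 4460.
for position in (2762, 2863, 3290, 4341):
    print(position, classify(position, 4460))

# Positive direction (ZNF497 side), possible TSS at 4300.
for position in (2326, 2800, 2963, 3323, 3883):
    print(position, classify(position, 4300))
```

With these window lengths only the match at 4341 falls in a proximal promoter; the other positions listed fall in the distal windows.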

Transcribed MRE boxes
Previous transcriptions have used MRE boxes in the distal promoters.

Laboratory reports
Below is an outline for sections of a report, paper, manuscript, log book entry, or lab book entry. You may create your own, of course.

Metal-responsive element transcription of A1BG

by --Marshallsumter (discuss • contribs) 01:40, 17 October 2017 (UTC)

Abstract
By combining a literature search with computer analysis of the promoter between ZSCAN22 and A1BG and ZNF497 and A1BG, metal responsive elements have been found. Literature search has also discovered at least three post-translational isoforms including the unaltered precursor. Although no metal responsive elements overlap any enhancer boxes in the distal promoter, there are elements in the distal promoter.

Introduction
According to one source, A1BG is transcribed from the direction of ZNF497: 3' - 58864890: CGAGCCACCCCACCGCCCTCCCTTGG+1GGCCTCATTGCTGCAGACGCTCACCCCAGACACTCACTGCACCGGAGTGAGCGCGACCATCATG : 58866601-5', where the second 'G' at left of four Gs in a row is the TSS. Transcription was triggered in cell cultures and the transcription start site was found using reverse transcriptase. But, the mechanism for transcription is unknown.

Controlling the transcription of A1BG may have significant immune function against snake envenomation. A1BG forms a complex that is similar to those formed between toxins from snake venom and A1BG-like plasma proteins. These inhibit the toxic effect of snake venom metalloproteinases or myotoxins and protect the animal from envenomation.

Many transcription factor (TF) binding elements occur upstream and occasionally downstream of the transcription start site (TSS), in a gene's promoter. It isn't known which, if any, assist in locating and affixing the transcription mechanism for A1BG. This examination is the first to test one such DNA element: the MRE box.

Experiment
Literature searches were conducted to look for additional isoforms, effects of cadmium or zinc on A1BG, and any known pharmacogenomic effects on A1BG.

Computer programs were written and run on the positive and negative strands between ZSCAN22 or ZNF497 and A1BG.

Results
"The human genome is estimated to contain 700 zinc-finger genes, which perform many key functions, including regulating transcription. [Four] clusters of zinc-finger genes [occur] on human chromosome 19".

Nearby zinc-fingers on chromosome 19 include ZNF497 (GeneID: 162968), ZNF837 (GeneID: 116412), and ZNF8 (GeneID: 7554).

"In rodents and in humans, about one third of the zinc-finger genes carry the Krüppel-associated box (KRAB), a potent repressor of transcription (Margolin et al. 1994), [...]. There are more than 200 KRAB-containing zinc-finger genes in the human genome, about 40% of which reside on chromosome 19 and show a clustered organization suggesting an evolutionary history of duplication events (Dehal et al. 2001)."

ZNF8 is in cluster V along with A1BG.

"In contrast to the four clusters considered [I through IV], one that occurs at the telomere of chromosome 19, which we will call cluster V, has been very stable [over mouse, rat, and human]."

"Apart from the somewhat unexpected location of Zfp35 on mouse chromosome 18 and of the AIBG orthologs on mouse chromosome 15 and rat chromosome 7, there has been little rearrangement."

So far no article has reported any linkage between zinc, including various zinc fingers, and A1BG.

Regarding additional isoforms, mention has been made of "new genetic variants of A1BG."

"Proteomic analysis revealed that [a circulating] set of plasma proteins was α 1 B-glycoprotein (A1BG) and its post-translationally modified isoforms."

Computer program sampling of the NCBI database for A1BG produced:
 * 1) On the positive strand in the negative direction there is an MRE in the proximal promoter of 3'-TGCACTC-5' at 4341. Its complement 3'-ACGTGAG-5' occurs on the negative strand in the negative direction at 4341.
 * 2) There are the following MREs in the distal promoter on the positive strand, negative direction: 3'-TGCACCC-5', 2762, 3'-TGCACTC-5', 3290. And, their complements on the negative strand, negative direction: 3'-ACGTGGG-5', 2762, 3'-ACGTGAG-5', 3290.
 * 3) There are inverse MREs in the distal promoter of 3'-GTGTGCA-5', 2863, and 3'-CACACGT-5', 2863.
 * 4) On the negative strand in the positive direction there are MREs 3'-TGCACAC-5' at 2963 and 3'-TGCACCC-5' at 3323. And, there are inverses 3'-GAGTGCA-5' at 2326, 3'-GGGTGCA-5' at 2800 and 3'-GGGTGCA-5' at 3883.

Pharmacogenomic variants have been reported. There are A1BG genotypes.

The A1BG variant rs893184 is included in a genetic risk score.

"A genetic risk score, including rs16982743, rs893184, and rs4525 in F5, was significantly associated with treatment-related adverse cardiovascular outcomes in whites and Hispanics from the INVEST study and in the Nordic Diltiazem study (meta-analysis interaction P=2.39×10−5)."

"rs893184 causes a histidine (His) to arginine (Arg) [nonsynonymous single nucleotide polymorphism (nsSNP), A (minor) for G (major)] substitution at amino acid position 52 in A1BG."

Discussion
These results show that the presence of an MRE on the ZSCAN22 side of A1BG implies its use when transcribing A1BG, although it may be pointing toward ZSCAN22. These suggest that hypothesis (1) at "least two A1BG gene isoforms have their transcription initiated by an MRE" may be false.

Mention has been made of "new genetic variants of A1BG."

For example, GeneID: 9 has isoforms: a, b, X1, and X2. Each of these (a and b) has variants. "Variants 1-6 and 9 all encode the same isoform (a)."

"Variants 7, 8 and 10 all encode isoform b." Isoforms X1 and X2 are predicted.

Variants can differ in promoters, untranslated regions, or exons. For GeneID: 9: "This variant (1) represents the longest transcript but encodes the shorter isoform (a). This variant is transcribed from a promoter known as P1, promoter 2, or NATb promoter."

"This variant (2, also known as Type IID) lacks an alternate exon in the 5' UTR, compared to variant 1. This variant is transcribed from a promoter known as P1, promoter 2, or NATb promoter."

"This variant (9, also known as Type IA) has a distinct 5' UTR and represents use of an alternate promoter known as the NATa or P3 promoter, compared to variant 1."

But, A1BG in NCBI Gene lists only one isoform, the gene locus itself, and the encoded protein is a precursor subject to translational or, more likely, post-translational modifications.

"Proteomic analysis revealed that [a circulating] set of plasma proteins was α 1 B-glycoprotein (A1BG) and its post-translationally modified isoforms."

This confirms that A1BG has more than one isoform. These must include transcriptional variations, various enhancers or inhibitors, or various promoters that result in post-translationally modified isoforms.

No experimental efforts to force transcription of A1BG from either side were performed, nor were the MREs demonstrated to be used.

A complete description of all the transcription factors that can use an MRE to enhance, inhibit or activate transcription is needed.

A quick literature search on Google Scholar with ZNF497 or "zinc finger protein 497" and "metal responsive element" produced no results. But, a full web search produced several references including a GeneCard for "zinc finger protein 497", including "May be involved in transcriptional regulation." No transcriptional regulation was stated for A1BG.

These results show that the presence of MREs on the ZSCAN22 side of A1BG implies their use when transcribing A1BG.

Either A1BG can be transcribed by MREs in the proximal or distal promoter, or A1BG is not transcribed by MREs. As a Google Scholar advanced search found no literature confirming possible transcription from proximal or distal promoters, wet chemistry experiments are needed to test the possibility.

So far no article has reported any linkage between zinc or cadmium, including various zinc fingers, and A1BG.

"In contrast to the four clusters considered [I through IV], one that occurs at the telomere of chromosome 19, which we will call cluster V, has been very stable [over mouse, rat, and human]."

The presence of multiple MREs tends to reinforce the presence of transcriptional variants.

Conclusion
The presence of multiple MREs coupled with experimental results from the literature indicating post-translational isoforms tends to confirm the existence of two or more isoforms for A1BG and likely transcription from either side.

Laboratory evaluations
To assess your example, including your justification, analysis and discussion, I will provide such an assessment of my example for comparison and consideration.

Evaluation

No wet chemistry experiments were performed to confirm that Gene ID: 1 is transcribed from either side using MREs, especially in the distal promoters. The NCBI database is generalized, whereas individual human genome testing could demonstrate that A1BG is transcribed from either side.