Gene transcriptions/Boxes/CGCGs

Arabidopsis thaliana (A. thaliana) is a popular model organism in plant biology and genetics. For a complex multicellular eukaryote, it has a relatively small genome of approximately 135 megabase pairs (Mbp).

"The minimum DNA-binding elements are 6-bp CGCG box, (A/C/G)CGCG(C/G/T)."

"AtSR1 [Arabidopsis thaliana signal-responsive genes] targets the nucleus and specifically recognizes a novel 6-bp CGCG box (A/C/G)CGCG(G/T/C). The multiple CGCG cis-elements are found in promoters of genes such as those involved in ethylene signaling, abscisic acid signaling, and light signal perception. The DNA-binding domain in AtSR1 is located on the N-terminal 146 bp where all AtSR1-related proteins share high similarity but have no similarity to other known DNA-binding proteins. The calmodulin-binding nuclear proteins isolated from wounded leaves exhibit specific CGCG box DNA binding activities. These results suggest that the AtSR gene family encodes a family of calmodulin-binding/DNA-binding proteins involved in multiple signal transduction pathways in plants."

"Ca2+-mediated signaling is involved in the transduction of physical signals such as temperature, wind, touch, light, and gravity; oxidative signals such as those arising from pathogen attacks; and hormone signals such as ethylene, abscisic acid (ABA),1 gibberellins, and auxin (2-7). All these signals have been shown to trigger changes in amplitude or oscillation in cytosolic free Ca2+ level. Recently, the signal-induced nuclear free calcium changes were also observed (8). Free Ca2+ changes are sensed by a number of Ca2+-binding proteins that usually contain a common structural motif, the “EF-hand,” a helix-loop-helix structure (9). One of the best characterized Ca2+-binding proteins is calmodulin (CaM), a highly conserved and multifunctional regulatory protein in eukaryotes. Its regulatory activities are triggered by its ability to modulate the activity of a certain set of CaM-binding proteins after binding to Ca2+, and thereby generating physiological responses to various stimuli (10-15)."

"The CaM-regulated basic helix-loop-helix family of transcription factors was reported in mammals, where CaM inhibits the protein-DNA interaction by competing with the DNA-binding domain in certain proteins (16)."

"cis-acting elements ACGCGG/CCGCGT were present in the promoter regions of about 130 genes (more than two copies) in Arabidopsis genome."

"The promoter regions are assumed to be within ∼1 kb upstream of the starting transcription site (for the known genes) or the first ATG (for the predicted genes). These genes are related to ethylene signaling (EIN3) and ABA signaling (a putative ABA responsive protein), light perception (phytochrome A, phyA), stress responsive such as the DNA repairing protein, heat shock protein, touch protein (TCH 4), and CaM-regulated ion channel. CaM genes (CaM2 andCaM3) and AtSR6 also contains CGCGcis-elements in their promoter regions."

Samplings
Basic programs (starting with SuccessablesCGCG.bas) were written to compare nucleotide sequences with the sequences on either the template strand (-) or the coding strand (+) of the DNA, in the negative direction (-) or the positive direction (+). Each entry below gives the program name, the pattern it looks for, the number of matches found, and each matched sequence followed by its reported position:


 * 1) negative strand in the negative direction (from ZSCAN22 to A1BG) is SuccessablesCGCG--.bas, looking for 3'-(A/C/G)CGCG(C/G/T)-5', 2, 3'-GCGCGT-5', 161, 3'-CCGCGC-5', 1761,
 * 2) negative strand in the positive direction (from ZNF497 to A1BG) is SuccessablesCGCG-+.bas, looking for 3'-(A/C/G)CGCG(C/G/T)-5', 8, 3'-GCGCGT-5', 543, 3'-CCGCGC-5', 681, 3'-GCGCGC-5', 683, 3'-ACGCGG-5', 871, 3'-ACGCGG-5', 971, 3'-CCGCGG-5', 1337, 3'-CCGCGG-5', 1437, 3'-CCGCGC-5', 1650,
 * 3) positive strand in the negative direction is SuccessablesCGCG+-.bas, looking for 3'-(A/C/G)CGCG(C/G/T)-5', 1, 3'-GCGCGG-5', 1762,
 * 4) positive strand in the positive direction is SuccessablesCGCG++.bas, looking for 3'-(A/C/G)CGCG(C/G/T)-5', 22, 3'-CCGCGC-5', 161, 3'-ACGCGG-5', 452, 3'-CCGCGC-5', 542, 3'-GCGCGC-5', 682, 3'-GCGCGT-5', 684, 3'-CCGCGT-5', 876, 3'-CCGCGT-5', 976, 3'-CCGCGT-5', 1046, 3'-ACGCGG-5', 1078, 3'-ACGCGG-5', 1162, 3'-CCGCGC-5', 1214, 3'-ACGCGG-5', 1246, 3'-CCGCGT-5', 1298, 3'-ACGCGT-5', 1314, 3'-ACGCGG-5', 1354, 3'-ACGCGG-5', 1398, 3'-ACGCGT-5', 1414, 3'-ACGCGG-5', 1454, 3'-ACGCGG-5', 1498, 3'-ACGCGT-5', 1523, 3'-CCGCGT-5', 1550, 3'-CCGCGG-5', 1769,
 * 5) complement, negative strand, negative direction is SuccessablesCGCGc--.bas, looking for 3'-(C/G/T)GCGC(A/C/G)-5', 1, 3'-CGCGCC-5', 1762,
 * 6) complement, negative strand, positive direction is SuccessablesCGCGc-+.bas, looking for 3'-(C/G/T)GCGC(A/C/G)-5', 22, 3'-GGCGCG-5', 161, 3'-TGCGCC-5', 452, 3'-GGCGCG-5', 542, 3'-CGCGCG-5', 682, 3'-CGCGCA-5', 684, 3'-GGCGCA-5', 876, 3'-GGCGCA-5', 976, 3'-GGCGCA-5', 1046, 3'-TGCGCC-5', 1078, 3'-TGCGCC-5', 1162, 3'-GGCGCG-5', 1214, 3'-TGCGCC-5', 1246, 3'-GGCGCA-5', 1298, 3'-TGCGCA-5', 1314, 3'-TGCGCC-5', 1354, 3'-TGCGCC-5', 1398, 3'-TGCGCA-5', 1414, 3'-TGCGCC-5', 1454, 3'-TGCGCC-5', 1498, 3'-TGCGCA-5', 1523, 3'-GGCGCA-5', 1550, 3'-GGCGCC-5', 1769,
 * 7) complement, positive strand, negative direction is SuccessablesCGCGc+-.bas, looking for 3'-(C/G/T)GCGC(A/C/G)-5', 2, 3'-CGCGCA-5', 161, 3'-GGCGCG-5', 1761,
 * 8) complement, positive strand, positive direction is SuccessablesCGCGc++.bas, looking for 3'-(C/G/T)GCGC(A/C/G)-5', 8, 3'-CGCGCA-5', 543, 3'-GGCGCG-5', 681, 3'-CGCGCG-5', 683, 3'-TGCGCC-5', 871, 3'-TGCGCC-5', 971, 3'-GGCGCC-5', 1337, 3'-GGCGCC-5', 1437, 3'-GGCGCG-5', 1650,
 * 9) inverse complement, negative strand, negative direction is SuccessablesCGCGci--.bas, looking for 3'-(A/C/G)CGCG(C/G/T)-5', 2, 3'-GCGCGT-5', 161, 3'-CCGCGC-5', 1761,
 * 10) inverse complement, negative strand, positive direction is SuccessablesCGCGci-+.bas, looking for 3'-(A/C/G)CGCG(C/G/T)-5', 8, 3'-GCGCGT-5', 543, 3'-CCGCGC-5', 681, 3'-GCGCGC-5', 683, 3'-ACGCGG-5', 871, 3'-ACGCGG-5', 971, 3'-CCGCGG-5', 1337, 3'-CCGCGG-5', 1437, 3'-CCGCGC-5', 1650,
 * 11) inverse complement, positive strand, negative direction is SuccessablesCGCGci+-.bas, looking for 3'-(A/C/G)CGCG(C/G/T)-5', 1, 3'-GCGCGG-5', 1762,
 * 12) inverse complement, positive strand, positive direction is SuccessablesCGCGci++.bas, looking for 3'-(A/C/G)CGCG(C/G/T)-5', 22, 3'-CCGCGC-5', 161, 3'-ACGCGG-5', 452, 3'-CCGCGC-5', 542, 3'-GCGCGC-5', 682, 3'-GCGCGT-5', 684, 3'-CCGCGT-5', 876, 3'-CCGCGT-5', 976, 3'-CCGCGT-5', 1046, 3'-ACGCGG-5', 1078, 3'-ACGCGG-5', 1162, 3'-CCGCGC-5', 1214, 3'-ACGCGG-5', 1246, 3'-CCGCGT-5', 1298, 3'-ACGCGT-5', 1314, 3'-ACGCGG-5', 1354, 3'-ACGCGG-5', 1398, 3'-ACGCGT-5', 1414, 3'-ACGCGG-5', 1454, 3'-ACGCGG-5', 1498, 3'-ACGCGT-5', 1523, 3'-CCGCGT-5', 1550, 3'-CCGCGG-5', 1769,
 * 13) inverse, negative strand, negative direction, is SuccessablesCGCGi--.bas, looking for 3'-(C/G/T)GCGC(A/C/G)-5', 1, 3'-CGCGCC-5', 1762,
 * 14) inverse, negative strand, positive direction, is SuccessablesCGCGi-+.bas, looking for 3'-(C/G/T)GCGC(A/C/G)-5', 22, 3'-GGCGCG-5', 161, 3'-TGCGCC-5', 452, 3'-GGCGCG-5', 542, 3'-CGCGCG-5', 682, 3'-CGCGCA-5', 684, 3'-GGCGCA-5', 876, 3'-GGCGCA-5', 976, 3'-GGCGCA-5', 1046, 3'-TGCGCC-5', 1078, 3'-TGCGCC-5', 1162, 3'-GGCGCG-5', 1214, 3'-TGCGCC-5', 1246, 3'-GGCGCA-5', 1298, 3'-TGCGCA-5', 1314, 3'-TGCGCC-5', 1354, 3'-TGCGCC-5', 1398, 3'-TGCGCA-5', 1414, 3'-TGCGCC-5', 1454, 3'-TGCGCC-5', 1498, 3'-TGCGCA-5', 1523, 3'-GGCGCA-5', 1550, 3'-GGCGCC-5', 1769,
 * 15) inverse, positive strand, negative direction, is SuccessablesCGCGi+-.bas, looking for 3'-(C/G/T)GCGC(A/C/G)-5', 2, 3'-CGCGCA-5', 161, 3'-GGCGCG-5', 1761,
 * 16) inverse, positive strand, positive direction, is SuccessablesCGCGi++.bas, looking for 3'-(C/G/T)GCGC(A/C/G)-5', 8, 3'-CGCGCA-5', 543, 3'-GGCGCG-5', 681, 3'-CGCGCG-5', 683, 3'-TGCGCC-5', 871, 3'-TGCGCC-5', 971, 3'-GGCGCC-5', 1337, 3'-GGCGCC-5', 1437, 3'-GGCGCG-5', 1650.
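
The sixteen searches above differ only in how the consensus is transformed before matching. The following minimal Python sketch (not the Basic programs themselves; all names are hypothetical) derives the complement, inverse, and inverse-complement patterns from (A/C/G)CGCG(C/G/T), representing the consensus as one set of allowed bases per position.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# One string of allowed bases per position of (A/C/G)CGCG(C/G/T).
CONSENSUS = ["ACG", "C", "G", "C", "G", "CGT"]

def complement(pattern):
    """Complement every allowed base; sort so equal sets compare equal."""
    return ["".join(sorted(p.translate(COMPLEMENT))) for p in pattern]

def inverse(pattern):
    """Reverse the order of the positions."""
    return pattern[::-1]

def show(pattern):
    """Format a pattern in the (A/C/G)CGCG(C/G/T) notation used on this page."""
    return "".join(p if len(p) == 1 else "(" + "/".join(p) + ")" for p in pattern)

variants = {
    "direct": CONSENSUS,
    "complement": complement(CONSENSUS),
    "inverse": inverse(CONSENSUS),
    "inverse complement": inverse(complement(CONSENSUS)),
}
for name, pattern in variants.items():
    print(name, show(pattern))
```

Under these definitions the complement and the inverse give the same pattern, (C/G/T)GCGC(A/C/G), and the inverse complement returns the original consensus. This is consistent with entries 9 to 12 repeating the hits of entries 1 to 4 and entries 13 to 16 repeating those of entries 5 to 8.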