
Repeating elements in the human genome: Role in evolution

by Subhankar Chakraborty

Simple sequence repeats (SSR)
SSRs are tandemly repeated short motifs and comprise two groups of repeat elements, microsatellites and macro/minisatellites. Microsatellites typically have 1 to 6 bases in each repeat unit, while macro/minisatellites have at least 5-6 bases per unit, ranging up to several dozen. These repeat elements are present in both coding and non-coding regions of the genome, but reside predominantly in the non-coding portion. There are two types of repeats, pure and impure. Pure or perfect repeats are those in which all the repeating units are identical, while impure repeats have substitutions in one or more of their units. SSRs were previously regarded as "junk DNA", but recent evidence suggests that they regulate gene expression as a function of the number of times they are repeated (and in turn their length), and are thus a mechanism for genetic change in response to environmental factors.
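As a rough illustration of the definition above, the following Python sketch scans a DNA string for pure microsatellites (motifs of 1 to 6 bases repeated in tandem) and checks whether a given tract is pure. The function names, the example sequence and the threshold of four copies are assumptions made for illustration only; they are not taken from the cited work, and published studies use varying cut-offs.

```python
import re

def find_perfect_ssrs(seq, max_motif=6, min_repeats=4):
    """Return (start, motif, copies) for each pure tandem repeat in seq.

    min_repeats is an arbitrary illustrative threshold.
    """
    # Lazy {1,n}? tries the shortest motif first, so "ACACACAC" is
    # reported as AC x 4 rather than ACAC x 2.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        copies = len(m.group(0)) // len(motif)
        hits.append((m.start(), motif, copies))
    return hits

def is_pure(tract, motif):
    """True if tract consists only of exact copies of motif."""
    copies, remainder = divmod(len(tract), len(motif))
    return remainder == 0 and tract == motif * copies

# Made-up sequence containing (AC)5 and (A)6
print(find_perfect_ssrs("GGTACACACACACTTGAAAAAAGC"))
# [(3, 'AC', 5), (16, 'A', 6)]
print(is_pure("ACACAC", "AC"), is_pure("ACATAC", "AC"))
# True False
```

A regular expression of this kind only finds pure repeats; an impure tract such as ACATAC is either missed or split at the substitution, so detecting impure repeats would need a method that tolerates mismatches.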
Some examples that illustrate the role of SSRs in regulating gene expression are:
In Drosophila, the gene regulating the circadian clock (period, or per) has two common variants of a hexanucleotide SSR that encodes a Thr-Gly dipeptide. The longer variant encodes (Thr-Gly)20, and its role is to reduce variation in the circadian rhythm with temperature changes. The shorter variant encodes (Thr-Gly)17, and its function is to maintain the length of the circadian rhythm at 24 hours in warm temperatures. In the temperate climates of Europe, the longer allele is found more often. Similarly, in the Mount Carmel area of Israel there is a canyon where one side is in constant sunshine while the other side remains shaded and hence cooler. Fruit flies on the cooler side of the canyon predominantly carry the longer allele, which maintains their circadian clocks in spite of the lower temperature.
Wheat growing in the same canyon also showed variation in its SSRs when plants on the cooler slope were compared with those on the sunny slope. This suggests that SSRs are a tool used by organisms to adapt to their immediate environment.
The difference in social behavior between prairie and meadow voles is another example of the effect of SSR length on gene expression. Higher expression of the vasopressin receptor in the ventral forebrain of prairie voles is associated with social behavior and care for their mates, while lower expression in meadow voles is associated with asocial and non-monogamous behavior. The two species differ in the length of the SSR region upstream of the vasopressin receptor gene, with the more social species having the longer SSR. When the two SSRs were transfected into rat cells, they produced different levels of vasopressin receptor expression.
In dogs, the ratio of the lengths of two adjacent SSRs in the gene coding for the transcription factor Runx-2 was found to differ between breeds and to correlate with the length of their facial bones. A similar SSR is present in the homologous human gene responsible for the craniofacial skeleton, CBFA-1, and expansions of the repeat are found in some families with cleidocranial dysostosis.
Mononucleotide repeats constitute the largest group of SSRs in the genome of any organism, including yeasts. A study comparing the lengths of the adenine repeat associated with the RAS2 gene, a homologue of the mammalian RAS oncogene, showed that sporulation efficiency (which is controlled by RAS2) falls as the length of the poly-A repeat increases: strains with the A9 repeat sporulated more efficiently than those with the A10 repeat. Further, when the RAS2 allele with 9 A residues in the SSR was replaced with one having 10 A residues, the sporulation rate decreased more than 18-fold. This is further evidence that SSRs can control gene expression.
Pathological expansion of SSRs is associated with several human diseases. For instance, expansion of a CAG repeat in the ATXN2 (SCA2) gene is seen in spinocerebellar ataxia type 2.

Simple sequence repeats occur not only in the non-coding regions of genes but also in the coding region itself. Repeats in the coding region are typically triplets and encode a string of identical amino acids, so a change in the repeat number changes the number of amino acids, which can in turn alter protein characteristics such as flexibility and thus affect function. Non-triplet repeats also exist, and a change in their repeat number results in a frameshift, so that either no protein is translated or a short or non-functional protein is produced. Because such mutations are readily reversed during subsequent cell divisions, they might represent an on-off switch by which cells can control gene expression. Two adjacent SSRs may also interact, and the expression of the gene may vary depending on which of the two predominates. An example is the effect of the adjacent SSRs in the dog Runx gene discussed earlier.
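The arithmetic behind the triplet versus non-triplet distinction can be sketched in a few lines of Python. The function name coding_effect is hypothetical, and the sketch only counts bases: it says nothing about where a shifted reading frame would meet a stop codon or how the altered protein would behave.

```python
def coding_effect(motif_len, unit_change):
    """Classify a gain (+) or loss (-) of repeat units in a coding-region SSR.

    Returns ("in-frame", change in amino acid count) when the number of
    bases added or removed is a multiple of 3, else ("frameshift", None).
    """
    delta_bases = motif_len * unit_change
    if delta_bases % 3 != 0:
        return ("frameshift", None)
    return ("in-frame", delta_bases // 3)

print(coding_effect(3, 2))   # triplet repeat gains 2 units
# ('in-frame', 2)
print(coding_effect(6, -3))  # hexanucleotide repeat, 20 -> 17 units
# ('in-frame', -6)
print(coding_effect(2, 1))   # dinucleotide repeat gains 1 unit
# ('frameshift', None)
print(coding_effect(2, 3))   # dinucleotide repeat gains 3 units
# ('in-frame', 2)
```

The second call corresponds to the difference between the (Thr-Gly)20 and (Thr-Gly)17 variants of the per gene, which differ by six amino acids with the reading frame intact. The last two calls show why a non-triplet repeat can behave like a switch: one slippage step breaks the frame, and further steps in either direction can restore it.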

Properties of SSRs
They have a high rate of reversible mutations that alter their copy number; these arise by slippage during replication and by unequal crossing over.
The mutation rate in SSRs depends on the sequence length, nucleotide sequence, number of repeats and purity. Mutations that alter the sequence of a repeat unit stabilize SSRs.
SSRs are highly polymorphic with variations in repeat number.
Small variations in repeat number can cause small variations in phenotype.
They occur in coding as well as non-coding (regulatory) regions of genes.
Their distribution pattern is not random. For instance, triplet repeats are more common in the coding regions of genes. Further, each species has a particular abundant repeat, e.g. AC in humans.1
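The first two properties, reversible changes in copy number and a mutation rate that depends on the number of repeats, can be pictured with a toy simulation. The Python sketch below is only an illustration under stated assumptions: one unit is gained or lost per slippage event with equal probability, the chance of an event is proportional to the current copy number, and the rate value is arbitrary. None of these choices comes from the cited reference.

```python
import random

def simulate_slippage(start_copies, generations, rate_per_unit=0.001,
                      min_copies=1, seed=0):
    """Toy model of copy-number change in one SSR lineage.

    Assumptions (illustrative only): at most one slippage event per
    generation, +1 or -1 unit with equal probability, and an event
    probability proportional to the current number of repeat units.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    copies = start_copies
    history = [copies]
    for _ in range(generations):
        if rng.random() < min(1.0, rate_per_unit * copies):
            step = rng.choice((-1, 1))
            copies = max(min_copies, copies + step)
        history.append(copies)
    return history

history = simulate_slippage(start_copies=10, generations=5000)
print("range of copy numbers visited:", min(history), "to", max(history))
print("times the lineage was back at 10 copies:", history.count(10))
```

Because each step can be undone by a later step in the opposite direction, a lineage in this model can drift away from its starting copy number and return to it, which is the sense in which such mutations are reversible.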






Reference List

1 Y. Kashi and D. G. King, "Simple sequence repeats as advantageous mutators in evolution," Trends Genet. 22(5), 253 (2006).



Helium, Inc.
200 Brickstone Square Andover, MA 01810 USA