Although initiation of transcription (see Fig. 4-4, step 2) is the most frequently regulated step in gene expression, for certain genes subsequent steps are more important for determining the overall level of expression. These processes are generally classified as post-transcriptional regulation. The mechanisms for regulating these steps are less well understood than are those for regulating transcription initiation, but some information comes from the study of model genes. Post-transcriptional processes that we review here are pre-mRNA splicing (step 5) and transcript degradation (step 8).
Alternative splicing generates diversity from single genes
Eukaryotic genes contain introns (see p. 74) that must be removed from the primary transcript to create mature mRNA; this process is called pre-mRNA splicing. Splicing involves the joining of two sites on the RNA transcript, the 5′ splice-donor site and the 3′ splice-acceptor site, and removal of the intervening intron. The first step involves cleavage of the pre-mRNA at the 5′ splice-donor site. Second, joining of the 5′ end of the intron to an adenosine residue located within the intron forms a “lariat” structure. Third, ligation of the 5′ and 3′ splice sites releases the lariat intron. The splicing reaction occurs in the nucleus, mediated by small nuclear ribonucleoprotein particles (snRNPs) that are composed of proteins and small nuclear RNA (snRNA). Together, the assembly of pre-mRNA and snRNPs forms a large complex called the spliceosome. N4-14
Mechanism of Pre-mRNA Splicing
Contributed by Peter Igarashi
The location of the 5′ and 3′ splice sites is based, at least in part, on the sequences at the ends of the introns. The 5′ splice-donor site has the consensus sequence 5′-(C/A)AG ↓GU(G/A)AGU-3′; the vertical arrow represents the boundary between the exon and the intron. The 3′ splice-acceptor site has the consensus sequence 5′-YnNCAG ↓G-3′; Yn represents a polypyrimidine tract (i.e., a long sequence of only C and U), and N represents any nucleotide. An intronic site located >17 nucleotides upstream from the 3′ acceptor site (5′-YNCUGAC-3′), called the branch point, is also present and contains the adenosine (red background in eFig 4-7) that contributes to formation of the lariat structure.
EFIGURE 4-7 Mechanism of pre-mRNA splicing. This example illustrates how a 5′ splice-donor site at one end of exon 1 can link to the 3′ splice-acceptor site at the end of exon 2 and thereby splice out the intervening intron. The process can be divided into three steps: (1) cleavage of the pre-mRNA at the 5′ splice-donor site; (2) joining of the 5′ end of the intron to an adenosine residue that is located within the intron, forming a lariat structure; and (3) ligation of the 5′ and 3′ splice sites and release of the lariat intron.
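The consensus sequences given above can be turned into a simple pattern search. The following Python sketch is only an illustration: it accepts perfect matches to the consensus, whereas real splice sites often deviate from it, and the minimum length of the polypyrimidine tract (`MIN_TRACT`) is an assumed value that the text does not specify. The function names and the toy transcript are hypothetical.

```python
import re

# Donor consensus from the text: (C/A)AG | GU(G/A)AGU, where | marks the
# exon-intron boundary. The pattern matches the zero-width boundary itself.
DONOR = re.compile(r"(?<=[CA]AG)(?=GU[GA]AGU)")

# Acceptor consensus from the text: Yn N CAG | G. The minimum tract length
# is an assumption made for this sketch.
MIN_TRACT = 6
ACCEPTOR = re.compile(r"[CU]{%d,}[ACGU]CAG(?=G)" % MIN_TRACT)


def donor_boundaries(rna):
    """Return the index of the first intron base for each perfect donor match."""
    return [m.start() for m in DONOR.finditer(rna)]


def acceptor_boundaries(rna):
    """Return the index of the first downstream exon base for each acceptor match."""
    return [m.end() for m in ACCEPTOR.finditer(rna)]


def splice(rna):
    """Remove the segment between the first donor and the first later acceptor."""
    donors = donor_boundaries(rna)
    if not donors:
        return rna
    acceptors = [a for a in acceptor_boundaries(rna) if a > donors[0]]
    if not acceptors:
        return rna
    return rna[:donors[0]] + rna[acceptors[0]:]


# Toy pre-mRNA: exon 1, a short intron, and the start of exon 2.
toy = "GGCAG" + "GUAAGU" + "A" + "CCUUUUUUUU" + "A" + "CAG" + "GCC"
print(donor_boundaries(toy), acceptor_boundaries(toy), splice(toy))
```

The sketch ignores the branch point and the lariat intermediate; it shows only how the two boundary sequences delimit the segment that is removed.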
Many genes undergo alternative splicing, which refers to differential splicing of the same primary transcript to produce mature transcripts that contain different combinations of exons. If the coding region is affected, the resulting splicing variants will encode proteins with distinct primary structures that may have different physiological functions. Thus, alternative splicing is a mechanism for increasing the diversity of proteins that a single gene can produce. Figure 4-19 summarizes seven patterns of alternative splicing.
FIGURE 4-19 Types of alternative splicing. In F, the two red arrows represent alternative transcription initiation sites. Poly-A, polyadenylic acid.
Retained Intron
In some cases, the cell may choose whether to splice out a segment of RNA. For example, the γA isoform of rat γ-fibrinogen lacks the seventh intron, whereas the γB isoform retains the intron, which encodes a unique C terminus of 12 amino acids (see Fig. 4-19A).
Alternative 3′ Splice Sites
In this case, the length of an intron is variable because the downstream boundary of the intron can be at either of two or more different 3′ splice-acceptor sites (see Fig. 4-19B). For example, in rat fibronectin, a single donor site may be spliced to any of three acceptor sites. The presence or absence of the amino acids encoded by the sequence between the different splice-acceptor sites results in fibronectin isoforms with different cell adhesion properties.
Alternative 5′ Splice Sites
Here also, the length of the intron is variable. However, in this case, it is the upstream boundary of the intron that can be at either of two or more different 5′ splice-donor sites (see Fig. 4-19C). For example, cells can generate mRNA encoding 3-hydroxy-3-methylglutaryl–coenzyme A (HMG-CoA) reductase (see p. 968) with different 5′ UTRs by splicing from multiple donor sites for the first intron to a single acceptor site.
Cassette Exons
In some cases, the cell may choose either to splice in an exon or group of exons (cassette exons) or not to splice them in (see Fig. 4-19D). An example is the α-tropomyosin gene, which contains 12 exons. All α-tropomyosin transcripts contain the invariant exons 1, 4 to 6, 8, and 9. All muscle-like cells splice in exon 7, but hepatoma (i.e., liver tumor) cells do not splice in exon 7; they directly link exon 6 to exon 8.
Mutually Exclusive Exons
In yet other cases, the cell may splice in mutually exclusive exons (see Fig. 4-19E). One of the Na/K/Cl cotransporter genes (NKCC2) is an example. Isoforms containing distinct 96-bp exons are differentially expressed in the kidney cortex and medulla. Because the encoded amino-acid sequence is predicted to reside in the membrane, the isoforms may have different kinetic properties. The α-tropomyosin gene is another example. Smooth-muscle cells splice in exon 2 but not exon 3. Striated-muscle cells and myoblasts splice in exon 3 but not exon 2. Fibroblasts and hepatoma cells do not splice in either of these two exons.
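The α-tropomyosin exon choices described for exons 1 through 9 can be summarized as a small lookup. The Python sketch below is a bookkeeping aid under stated assumptions, not a model of the splicing machinery: it encodes only the statements in the text about the invariant exons and exons 2, 3, and 7, it leaves out exons 10 to 12, and the names `INVARIANT`, `CELL_SPECIFIC`, and `exons_1_to_9` are hypothetical.

```python
# Invariant exons stated in the text: 1, 4 to 6, 8, and 9.
INVARIANT = {1, 4, 5, 6, 8, 9}

# Cell-type-specific choices among exons 2, 3, and 7, as stated in the text.
# Smooth and striated muscle are treated as "muscle-like" cells (exon 7 in).
CELL_SPECIFIC = {
    "smooth muscle": {2, 7},    # exon 2 but not exon 3
    "striated muscle": {3, 7},  # exon 3 but not exon 2
    "hepatoma": set(),          # neither exon 2 nor 3, and exon 6 joins exon 8
}


def exons_1_to_9(cell_type):
    """Return the sorted exons (among exons 1 to 9) spliced in by a cell type."""
    return sorted(INVARIANT | CELL_SPECIFIC[cell_type])


for cell in CELL_SPECIFIC:
    print(cell, exons_1_to_9(cell))
```

Because exons 2 and 3 never appear together in any entry, the table makes the mutually exclusive pattern explicit, whereas exon 7 behaves as a cassette exon.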
Alternative 5′ Ends
Cells may select among different alternative promoters, creating alternative 5′ ends, and then splice the selected end to a downstream portion of the pre-mRNA (see Fig. 4-19F). For example, the SLC4A4 gene that encodes the electrogenic Na/HCO3 cotransporter 1 (see p. 122) has 26 exons. The transcript that encodes the NBCe1-A variant expressed heavily in the renal proximal tubule (see p. 829) initiates from a promoter located upstream from exon 4 and then continues with exons 5 through 26. The transcript that encodes the NBCe1-B variant expressed heavily in pancreatic ducts (see pp. 885–886) initiates from a promoter located upstream from exon 1 and then continues with exons 2 through 26. This use of alternative promoters permits differential regulation of gene expression in kidney versus pancreas. The genes encoding myosin light chain (see pp. 233–234) and α-amylase (see p. 916) are additional examples. N4-15
Myosin Light Chain Alternative Splicing
Contributed by Peter Igarashi
In the case of the myosin light chain (MLC) gene (see pp. 233–234), which consists of nine exons, one transcript is initiated from a promoter that is located upstream from exon 1, skips exons 2 and 3, and includes exons 4 to 9. The other transcript is initiated instead at a promoter located in the first intron and consists of exons 2, 3, and 5 to 9. Because the coding region is affected, the two transcripts encode proteins that differ at their N-terminal ends. These splice variants are found in different cells or different developmental stages.
Alternative 3′ Ends
Finally, cells may differentially splice the transcript near the 3′ end of the gene (see Fig. 4-19G) and thereby alter the site of cleavage and polyadenylation. Such splicing may also affect the coding region. Again, α-tropomyosin is an example. Striated-muscle cells splice in exon 11, which contains one alternative 3′ UTR. Smooth-muscle cells splice in exon 12 instead of exon 11. Another example is the calcitonin gene, which encodes both the hormone calcitonin (see pp. 1067–1068) and calcitonin gene–related peptide-α (CGRPα). Thyroid C cells produce one splice variant that includes exons 1 to 4 and encodes calcitonin. Sensory neurons, on the other hand, produce another splice variant that excludes exon 4 but includes exons 5 and 6. It encodes a different protein, CGRPα.
These examples illustrate that some splicing variants are expressed only in certain cell types and not in others. Clearly, control of alternative splicing must involve steps other than initiation of transcription because many splice variants have identical 5′ ends. In some genes, the control elements that are required for alternative splicing have been identified, largely on the basis of deletion mutations that result in aberrant splicing. These control elements can reside in either introns or exons and are located within or near the splice sites. The proteins that interact with such elements remain largely unknown, although some RNA-binding proteins that may be involved in regulation of splicing have been identified.
Regulatory elements in the 3′ untranslated region control mRNA stability
Degradation of mRNA is mediated by enzymes called ribonucleases. These enzymes include 3′-5′ exonucleases, which digest RNA from the 3′ end; 5′-3′ exonucleases, which digest from the 5′ end; and endonucleases, which digest at internal sites. The stability of mRNA in cytoplasm varies widely for different transcripts. Transcripts that encode cytokines and immediate-early genes are frequently short-lived, with half-lives measured in minutes. Other transcripts are much more stable, with half-lives that exceed 24 hours. Moreover, cells can modulate the stability of individual transcripts and thus use this mechanism to affect the overall level of expression of the gene. N4-16
Degradation of mRNA by Ribonucleases
Contributed by Peter Igarashi
A structural feature of typical mRNA that contributes to its stability in cytoplasm is the 5′ methyl cap, in which the presence of the 5′-5′ triphosphate linkage makes it resistant to digestion by 5′-3′ exonucleases. Similarly, the poly(A) tail at the 3′ end of the transcript often protects messages from degradation. Deadenylation (i.e., removal of the tail) is often a prerequisite for mRNA degradation. Accordingly, transcripts with long poly(A) tails may be more stable in cytoplasm than are transcripts with short poly(A) tails.
Regulatory elements that stabilize mRNA, as well as elements that accelerate its degradation, are frequently located in the 3′ UTR of the transcripts. A well-characterized example of a gene that is primarily regulated by transcript stability is the transferrin receptor (Fig. 4-20). The transferrin receptor is required for uptake of iron into most of the cells of the body (see p. 42). During states of iron deprivation, transferrin receptor mRNA levels increase, whereas transcript levels decrease when iron is plentiful. Regulation of expression of the gene encoding the transferrin receptor is primarily post-transcriptional: changes in the half-life (i.e., stability) of the message lead to alterations in the level of the mRNA.
FIGURE 4-20 The role of iron in regulating the stability of the mRNA for the transferrin receptor. The mRNA that encodes the transferrin receptor has a series of IREs in its 3′ untranslated region.
Regulation of transferrin receptor mRNA stability depends on elements that are located in the 3′ UTR called iron response elements (IREs). An IRE is a stem-loop structure that is created by intramolecular formation of hydrogen bonds. The human transferrin receptor transcript contains five IREs in the 3′ UTR. The IRE binds a cellular protein called IRE-binding protein (IRE-BP), which stabilizes transferrin receptor mRNA in the cytoplasm. When IRE-BP dissociates, the transcript is rapidly degraded. IRE-BP can also bind to iron, and the presence of iron decreases its affinity for the IRE. During states of iron deficiency, less iron binds to IRE-BP, and thus more IRE-BP binds to the IRE on the mRNA. The increased stability of the transcript allows the cell to produce more transferrin receptors. Conversely, when iron is plentiful and binds to IRE-BP, IRE-BP dissociates from the IRE, and the transferrin receptor transcript is rapidly degraded. This design prevents cellular iron overload.
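The statement that a change in half-life alters the level of the message can be made concrete with a simple kinetic sketch. The Python example below assumes first-order decay and a constant rate of transcript synthesis; neither assumption comes from the text, and the function names and numerical values are hypothetical and chosen only for illustration.

```python
import math


def fraction_remaining(minutes_elapsed, half_life_minutes):
    """Fraction of an mRNA pool left after a given time (first-order decay assumed)."""
    return 0.5 ** (minutes_elapsed / half_life_minutes)


def steady_state_level(synthesis_rate, half_life_minutes):
    """Steady-state mRNA level when constant synthesis balances first-order decay.

    With a decay constant k = ln(2) / half-life, the level settles at
    synthesis_rate / k, so it scales in proportion to the half-life.
    """
    decay_constant = math.log(2) / half_life_minutes
    return synthesis_rate / decay_constant


# Hypothetical numbers: same synthesis rate, short-lived versus stabilized transcript.
short_lived = steady_state_level(synthesis_rate=1.0, half_life_minutes=30)
stabilized = steady_state_level(synthesis_rate=1.0, half_life_minutes=120)
print(fraction_remaining(60, 30), stabilized / short_lived)
```

Under these assumptions, a fourfold increase in half-life, such as might follow binding of a stabilizing protein, yields a fourfold higher steady-state mRNA level without any change in transcription.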
MicroRNAs regulate mRNA abundance and translation
A major form of post-transcriptional gene regulation occurs via small RNA molecules called microRNAs (miRNAs). miRNAs are typically ~22 nucleotides in length, too short to encode proteins. Instead, miRNAs serve regulatory functions in many physiological and pathophysiological processes. They bind to specific mRNA targets and regulate mRNA abundance and translation. Because this regulation occurs after the transcription of mRNA, it is post-transcriptional in nature.
Figure 4-21 shows the biogenesis and function of miRNAs. Transcription of genomic DNA initially gives rise to a longer primary miRNA (pri-miRNA) transcript, which the endonuclease Drosha then cleaves to produce a precursor miRNA (pre-miRNA), typically 50 to 70 nucleotides in length. The pre-miRNA forms a hairpin structure because of intramolecular base pairing. The pre-miRNA, complexed with exportin-5 and Ran-GTP, then exits the nucleus via nuclear pores. In the cytoplasm, a second endonuclease called Dicer cleaves the pre-miRNA to produce the mature, single-stranded miRNA.
FIGURE 4-21 miRNA biogenesis and function. ORF, open reading frame.
In the cytoplasm, the newly synthesized miRNA associates with a protein complex called the RNA-induced silencing complex (RISC). Once complexed with RISC, the miRNA binds via base pairing to an mRNA target. Usually, miRNAs bind to sites located in the 3′ UTR of mRNA, although occasionally they bind to sites in the 5′ UTR (see Fig. 4-2). Recall that the 5′ and 3′ UTRs flank the coding region of the mRNA. Binding of the miRNA and the associated RISC results in degradation of the mRNA target or inhibits translation of the mRNA. In some cases, the targeted mRNA may be sequestered in subcellular organelles, called processing bodies (P bodies), where it is no longer available for translation. Regardless of the exact mechanism, the net result is that the expression of the protein encoded by the mRNA is inhibited. N4-17
Small Interfering RNA
Contributed by Peter Igarashi
Small interfering RNAs (siRNAs) may modulate gene expression both at the post-transcriptional level and at the level of chromatin structure. These siRNAs are short (~22 bp), double-stranded RNA molecules, one strand of which is complementary in sequence to a target mRNA. The process in which siRNAs silence the expression of specific genes is called RNA interference (RNAi).
Recall that RNA is usually single stranded. However, certain non–protein-coding sequences in the genome may yield RNA transcripts that contain inverted repeats, which allows double-stranded hairpins to form via intramolecular hydrogen bonds (eFig. 4-8). Cleavage of the hairpin structure by an endonuclease called Dicer produces the mature siRNA.
Mature siRNA can assemble into a ribonucleoprotein complex called RISC, which specifically cleaves a target mRNA that is complementary in sequence to one of the strands of the siRNA (see eFig. 4-8). In addition, the binding of an siRNA to a complementary mRNA can inhibit translation of the mRNA into protein (see eFig. 4-8). Finally, siRNAs can assemble into another ribonucleoprotein complex called RNA-induced transcriptional silencing (RITS) complex, which promotes DNA and histone methylation and thus the formation of heterochromatin (see eFig. 4-8).
Hundreds of genes that are potentially regulated by RNAi have been identified, and it is likely that this number will continue to grow. Because the expression of siRNAs is often tissue specific and developmentally regulated, RNAi may be an important mechanism for silencing gene expression during cell differentiation.
EFIGURE 4-8 Regulation of gene expression by RNA interference. The siRNA is produced from hairpin RNA by Dicer. 1, Assembly of siRNA in the RISC results in cleavage of the target mRNA. 2, The siRNA can also inhibit mRNA translation. 3, Assembly of siRNA in the RITS complex promotes DNA methylation and gene silencing.
The inhibition of gene expression by miRNAs represents a form of RNA interference. RNA interference refers to a process, found in most animal and plant species, in which short RNA molecules silence the expression of specific genes. Andrew Fire and Craig Mello—who shared the 2006 Nobel Prize in Physiology or Medicine N4-18—discovered RNA interference in 1998 in the roundworm Caenorhabditis elegans.
Andrew Fire and Craig Mello
For more information about Andrew Fire and Craig Mello and the work that led to their Nobel Prize, visit http://www.nobelprize.org/nobel_prizes/medicine/laureates/2006/ (accessed October 2014).
Since the discovery of the first mammalian miRNA in 2000, hundreds of miRNAs have been identified in humans and other species. The sequences of miRNAs are often evolutionarily conserved, consistent with their important functions. The binding of a particular miRNA to its mRNA targets is primarily dependent on a short seven-nucleotide sequence, the seed sequence, at the 5′ end of the miRNA. Different miRNAs contain different seed sequences and thereby recognize and inhibit distinct mRNA targets. Because the seed sequence is relatively short, an individual miRNA can bind many, perhaps hundreds, of different mRNA targets. miRNAs are thought to fine-tune gene expression, and the magnitude of their effects on individual mRNAs is relatively modest, typically less than 2-fold. However, the cumulative effect produced by targeting multiple mRNA targets in the same pathway may produce physiologically significant effects. miRNAs regulate hundreds of genes and, according to some estimates, may regulate more than one third of human genes.
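Seed-based target recognition can be sketched as a string search. The Python example below is a minimal illustration under stated assumptions: it takes the seed to be nucleotides 2 through 8 of the miRNA (a common convention that the text does not specify), it requires perfect Watson-Crick complementarity, and the miRNA and 3′ UTR sequences are toy inputs. Actual target prediction considers additional features that are not modeled here.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=1, length=7):
    """Return the mRNA sequence (5' to 3') that pairs with the miRNA seed.

    start=1 and length=7 select nucleotides 2 through 8 of the miRNA,
    which is an assumption of this sketch.
    """
    seed = mirna[start:start + length]
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def find_seed_matches(mirna, utr):
    """Return every position in the UTR where a perfect seed match begins."""
    site = seed_site(mirna)
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)
    return positions


# Toy inputs: a 22-nucleotide miRNA and a short 3' UTR carrying two seed sites.
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAA" + "CUACCUC" + "AGGG" + "CUACCUC" + "AA"
print(seed_site(toy_mirna), find_seed_matches(toy_mirna, toy_utr))
```

Because the search uses only seven nucleotides, many unrelated UTRs would be expected to contain a match by chance, which is consistent with the observation that one miRNA can bind many different mRNA targets.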
miRNAs play important roles in embryonic development, stem-cell differentiation, cell proliferation, and cell death. Other physiological processes under the control of miRNAs include insulin secretion, the stress response, renin secretion, and lipid metabolism. By fine-tuning mRNA abundance and translation, miRNAs work in concert with gene transcription to maintain the levels of proteins in cells within an optimal range. Dysregulation of miRNAs commonly plays a role in pathological conditions such as cancer, viral infection, diabetes, and Alzheimer disease.
Roles of miRNAs in Cardiac Stress Responses
Elevated blood pressure imposes stress on the heart that, if sustained, results in pathological remodeling, leading to cardiac hypertrophy, fibrosis, heart failure, and increased arrhythmias. miRNAs play an important role in the cardiac response to stress, in which cardiac myocytes switch their expression of myosin heavy chains from the normally dominant adult αMHC to βMHC, which is normally weakly expressed (see Table 9-1). Embedded in one of the introns of the αMHC gene is the miRNA called miR-208. Therefore, expression of αMHC results in coexpression of miR-208, which targets a transcriptional corepressor called thyroid hormone receptor–associated protein 1 (THRAP1). Normally, THRAP1 interacts with the thyroid hormone receptor (see pp. 1010–1011) to inhibit transcription of the MYH7 gene, which encodes βMHC. Thus, by reducing THRAP1 expression, miR-208 disinhibits βMHC transcription. Indeed, miR-208 is necessary for expression of the βMHC and cardiac remodeling. Mice lacking miR-208 cannot upregulate βMHC and are protected from pathological remodeling in response to cardiac stress.
Pathological cardiac remodeling also results in increased cardiac fibrosis. The miRNA miR-29 inhibits the expression of collagen and other components of the extracellular matrix. Cardiac stress inhibits miR-29 expression, which results in increased production of extracellular matrix and fibrosis.