Retrovirus Structure and Life Cycle
Retroviruses are distinct from other types of viruses due to their unique flow of genetic information. Once a retrovirus has gained entry into a host cell, the genetic material is converted via reverse transcription from RNA to DNA, a process that is essentially backwards from normal transcription in which a cell converts DNA into RNA. This "backwards" flow of genetic information is where retroviruses derive their name.
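The direction of information flow described above can be illustrated with a minimal sketch. It treats transcription and reverse transcription as simple base-pairing copies of a template strand. The function names and the example sequence are hypothetical and chosen only for illustration; real reverse transcription involves primers, strand transfers, and RNase H activity that are not modeled here.

```python
# Base-pairing rules for copying RNA into DNA and DNA into RNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """Return the complementary DNA strand (written 5'->3') for an RNA given 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))


def transcribe(template_dna):
    """Return the RNA (written 5'->3') copied from a DNA template strand given 5'->3'."""
    return "".join(DNA_TO_RNA[base] for base in reversed(template_dna.upper()))


viral_rna = "AUGGCUAAC"               # hypothetical sequence, not from a real virus
cdna = reverse_transcribe(viral_rna)  # RNA -> DNA, the "backwards" step
round_trip = transcribe(cdna)         # DNA -> RNA, ordinary transcription

print(cdna)        # GTTAGCCAT
print(round_trip)  # AUGGCUAAC
```

Copying the RNA into DNA and then transcribing that DNA returns the original RNA sequence, which is why an integrated DNA copy can later serve as a template for new viral RNA.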
Retroviruses are important tools for gene therapy because of their ability to integrate into the host cell's genome and achieve long-term gene expression. In fact, retroviruses are currently the second leading vector of choice in clinical trials. Before understanding how retroviral gene therapy works, one needs a solid understanding of the structure and life cycle of a retrovirus.
RETROVIRAL STRUCTURE AND GENOME ORGANIZATION
Retroviruses belong to the family Retroviridae, which consists of a large and diverse group of viruses classified into seven genera. They are enveloped viruses with virions that typically measure roughly 80 to 100 nanometers in diameter. Their RNA genome is approximately 7-12 kilobases, linear, single stranded, and of positive polarity. The genome consists of two identical copies and is condensed by association with the nucleocapsid protein (NC). This association forms what is called a ribonucleoprotein (RNP) complex, which is surrounded by a protein core formed mainly by capsid (CA) proteins. Both NC and CA are products of the viral gag gene. Also enclosed by the capsid are the viral enzymes needed for replication: integrase, reverse transcriptase, and protease. Protease is encoded by the viral pro gene, and integrase and reverse transcriptase are encoded by the pol gene. The viral capsid is in turn surrounded by a shell composed mainly of the matrix (MA) protein, which is also encoded by the gag gene. The MA protein forms a layer around the viral core and interacts with the viral envelope. Finally, the viral envelope forms the outermost layer and originates from the host cell’s lipid bilayer. The envelope contains viral glycoproteins that recognize and bind the host cell receptors that mediate viral entry. Viral glycoproteins are composed of two subunits: a surface (SU) protein that binds a cellular receptor and a transmembrane (TM) protein that anchors the entire structure in the membrane. Viral glycoproteins are encoded by the env gene.
The viral genome is composed of a dimer formed by two identical copies of (+) sense RNA and thus is essentially diploid. The two RNAs are held together by interactions between their 5' ends, in a region referred to as the dimer linkage structure (DLS). Each monomer is approximately 7 to 13 kb in size and is found within the capsid complexed with NC proteins. The genomic RNA is produced by normal host transcription and thus resembles a processed cellular RNA, including a 5' cap and a 3' poly(A) tail. The viral genome is flanked by long terminal repeat sequences (LTRs) at both the 5' and 3' ends. These LTRs contain the signals required for viral gene expression, including the enhancer, promoter, 5' capping, transcription terminator, and poly(A) signal. The 5' LTR acts as an RNA polymerase II promoter and contains transcription regulatory signals such as the TATA box near the R (repeated) sequence. The 3' LTR functions as a transcription terminator and polyadenylation signal, which leads to the production of a mature viral transcript. Both the 5' and 3' LTRs also contain the att (attachment) sites required for proviral integration. The retroviral genome also includes a primer binding site (pbs), which binds the primer tRNA to begin reverse transcription. Downstream from the pbs is the packaging signal sequence (Ψ or psi) that allows completed RNA transcripts to be packaged into budding viral cores. The genome also contains a polypurine tract (PPT), which is a short sequence of A/G residues responsible for initiating (+) strand synthesis during reverse transcription.
The basic translated region consists of four main genes: gag, pro, pol, and env. Each gene encodes unique proteins that facilitate the viral life cycle. The gag gene encodes three main structural proteins: NC, CA, and MA. The pro gene encodes a protease that is responsible for the cleavage of gag-pol precursors during virus maturation. The pol gene encodes the enzymes reverse transcriptase, which converts the viral RNA genome into a DNA intermediate, and integrase, which allows the DNA intermediate to be incorporated into the host cell genome. Finally, the env gene encodes the SU and TM subunits of the glycoproteins displayed on the viral surface that are used in recognition and binding of host cell receptors (Fig 1).
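As a compact summary of the genome organization described above, the following sketch lists the named elements in 5' to 3' order and maps each gene to its protein products. The data structure and helper names are hypothetical. One detail is an assumption not stated in the text: the PPT is placed just upstream of the 3' LTR.

```python
# Ordered 5'->3' layout of the genome elements named in the text.
# Assumption: the PPT sits just upstream of the 3' LTR; the text above
# does not state its position.
GENOME_LAYOUT = [
    ("5' LTR", "promoter, enhancer, att site"),
    ("pbs", "binds the tRNA primer for reverse transcription"),
    ("psi", "packaging signal"),
    ("gag", "structural proteins"),
    ("pro", "protease"),
    ("pol", "reverse transcriptase and integrase"),
    ("env", "envelope glycoprotein subunits"),
    ("PPT", "initiates (+) strand DNA synthesis"),
    ("3' LTR", "terminator, poly(A) signal, att site"),
]

# Protein products of each gene, using the abbreviations from the text.
GENE_PRODUCTS = {
    "gag": ("MA", "CA", "NC"),
    "pro": ("PR",),
    "pol": ("RT", "IN"),
    "env": ("SU", "TM"),
}


def gene_for(protein):
    """Return the gene that encodes the given protein abbreviation."""
    for gene, products in GENE_PRODUCTS.items():
        if protein in products:
            return gene
    raise KeyError(protein)


def elements_between(start, end):
    """Return the names of the elements lying strictly between two named elements."""
    names = [name for name, _ in GENOME_LAYOUT]
    return names[names.index(start) + 1:names.index(end)]


print(gene_for("IN"))                  # pol
print(elements_between("psi", "PPT"))  # ['gag', 'pro', 'pol', 'env']
```

An ordered list is used for the layout because position matters for elements such as the pbs and psi, while a dictionary suits the gene-to-protein lookup.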
RETROVIRUS LIFE CYCLE
The retroviral life cycle begins when viral glycoproteins embedded in the lipid envelope recognize receptors displayed on the host cell plasma membrane and mediate viral attachment. Fusion of the viral and host cell membranes follows and allows viral entry. For most retroviruses, fusion and entry are thought to be pH independent, meaning they do not require entry via the endosomal pathway. Retroviral glycoproteins undergo major structural rearrangement during fusion; however, this process is incompletely understood.
After gaining entry into the host cell, the virus is uncoated by a process that requires the mature Gag protein. Once the genetic material of the virus has been uncoated, reverse transcription is carried out by the viral enzyme reverse transcriptase, which converts the RNA genome into a double stranded DNA intermediate. Reverse transcription takes place in the cytoplasm within a large complex that includes NC, reverse transcriptase (RT), integrase (IN), and the viral RNA. This distinct reverse flow of genetic information from RNA to DNA, and the establishment of DNA in an integrated form in the host genome, are distinguishing features of retroviruses. The process of reverse transcription is complex and involves the initiation of DNA synthesis at precise locations and the translocation of DNA intermediates. This highly ordered process has been reviewed elsewhere. The DNA intermediate generated by reverse transcription, termed a provirus, consists of a 5' LTR, the intervening viral genome, and a 3' LTR. It is carried to the nucleus and integrated into the host cell’s genome by the enzyme integrase, in complex with a variety of other proteins that form what is called the pre-integration complex (PIC). The viral integrase enzyme uses the terminal att sites of the viral genome to begin end processing and integration. Integration accounts for the ability of the virus to persist in the infected cell indefinitely. It also accounts for the virus’s oncogenic activity, because integration is essentially “random” and can therefore create a mutation within any gene. The mechanism of translocation into the nucleus is poorly understood; however, for most retroviruses this process requires the host cell to undergo mitosis. Presumably, the breakdown of the nuclear envelope during mitosis allows the provirus access to the host genome and subsequent integration.
However, in contrast to the majority of retroviruses, lenti- and spumaviruses can successfully infect non-dividing cells, suggesting an alternate mechanism of transport for their proviral DNA. After integration, the provirus consists of a 5' LTR, the viral genomic sequences, and a 3' LTR.
Once the provirus is established, the DNA becomes a permanent addition to the infected cell’s genome. Here it is used as a template for viral RNA production and will be passed on to daughter cells during mitosis. The U3 region of the proviral 5' LTR contains a promoter that includes a GC-rich domain and the TATA box, which is recognized by cellular RNA polymerase II. Also contained within the U3 region is an enhancer, which binds transcription factors that positively regulate transcription. Together the promoter and enhancer regions recruit the transcriptional machinery and are important in the initiation of transcription. After transcription, the 5' end of the transcript is capped with 7-methylguanosine and the 3' end is polyadenylated, generating a mature viral transcript. Depending on the virus, the transcript can be spliced or remain full length before being exported from the nucleus into the cytoplasm for translation of viral proteins. Retroviruses contain open reading frames, designated by the gag, pro, pol, and env genes, that are translated into precursor proteins, which are then processed during and after virus assembly. This allows many proteins to be made from one open reading frame and ensures they are made at the correct ratio. As the Gag, Pro, Pol, and Env proteins are synthesized, they come together to assemble progeny virions at the plasma membrane. The Gag precursor protein consists of MA, CA, and NC and is targeted to the plasma membrane via hydrophobic post-translational modifications, such as a myristic acid attachment. The Gag precursor proteins have a central role in virion assembly and recruit other viral proteins, such as Env, by displaying binding sites for these proteins. The Gag precursor is also thought to be responsible for packaging the viral RNA by binding the packaging (ψ) sequence near the 5' end of the RNA.
As viral proteins are sequestered near the plasma membrane, the particle forms and a curvature is introduced into the membrane. As the complex increases in size, it applies pressure to the membrane, causing the virus to bud outward until finally the virion is pinched off and released into the extracellular space. During and after the release of the virion from the cell, the Gag precursor is cleaved by the viral protease (PR) enzyme. The mechanism behind the activation of PR is currently unclear. The enzyme is inactive prior to budding, so the precursors are not cleaved until after virion assembly. Gag precursor cleavage releases the viral proteins MA, CA, and NC. Cleavage of the Gag-Pro-Pol precursor occurs simultaneously with cleavage of the Gag precursor and releases the PR, RT, and IN protein products. Thus, after budding from the cell, PR cleaves both the Gag and Gag-Pro-Pol inactive precursors into active proteins. This renders the viral particle mature and infectious, and the viral life cycle can continue.
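The maturation step described above can be sketched as a toy model in which precursors stay intact until the particle has budded and are then cleaved into their products. The function names are hypothetical, the cleavage products are taken as listed in the text, and the model is a deliberate simplification: it ignores Env, the timing of PR activation, and the order of cleavage events.

```python
# Cleavage products of each precursor, as listed in the text above.
PRECURSOR_PRODUCTS = {
    "Gag": ("MA", "CA", "NC"),
    "Gag-Pro-Pol": ("PR", "RT", "IN"),
}


def virion_contents(budded):
    """List the Gag-derived protein species in a particle before or after budding."""
    if not budded:
        # PR is inactive before budding, so the precursors remain uncleaved.
        return list(PRECURSOR_PRODUCTS)
    # After budding, PR cleaves each precursor into its mature products.
    return [p for products in PRECURSOR_PRODUCTS.values() for p in products]


def is_mature(budded):
    """Treat a particle as mature once PR, RT, and IN have all been released."""
    return {"PR", "RT", "IN"} <= set(virion_contents(budded))


print(virion_contents(False))  # ['Gag', 'Gag-Pro-Pol']
print(virion_contents(True))   # ['MA', 'CA', 'NC', 'PR', 'RT', 'IN']
print(is_mature(True))         # True
```

Gating cleavage on a single budded flag mirrors the point made in the text that the precursors are not processed until after virion assembly.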