Molecular Themes in DNA Replication
By Lynne S. Cox
The Royal Society of Chemistry
Copyright © 2009 Royal Society of Chemistry
All rights reserved.
Conserved Steps in Eukaryotic DNA Replication
XIN QUAN GE AND J. JULIAN BLOW
Wellcome Trust Centre for Gene Regulation and Expression, University of Dundee, DD1 5EH, UK
1.1 Overview: the Biochemistry of DNA Synthesis
The genome of mammals comprises ~6 × 10⁹ nucleotides arranged in extremely long linear polymers, the chromosomes. Accurate copying of this amount of genetic information in a biologically relevant time frame (often a few hours, though sometimes as little as a few minutes) requires highly accurate enzyme machines together with complex molecular coordination and feedback.
The DNA template is a long polymer of four types of deoxyribonucleotide arranged as a double-stranded anti-parallel helix (Figure 1.1A). The backbone of each strand consists of phosphodiester linkages between the 3' and 5' carbons of deoxyribose. The 1' carbon of the deoxyribose is linked to one of four different bases: the purines (adenine and guanine) or the pyrimidines (thymine and cytosine). The two strands are held together by hydrogen bonds between complementary bases. The two-ringed heterocyclic purines always base pair with single-ringed pyrimidines, maintaining the linear axis of the helix and avoiding backbone distortion; specifically, guanine forms three hydrogen (H) bonds with cytosine and adenine makes two hydrogen bonds with thymine (Figure 1.1B). Whilst each H bond is relatively weak, the huge number of H bonds in an average mammalian chromosome (> 10⁹) ensures that the duplex is extremely stable. As noted by Watson and Crick in 1953, each of the two single strands of DNA contains all the information necessary to produce new second strands through complementary base pairing.
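The pairing rules described above can be illustrated with a minimal Python sketch. It is only an illustration of the rule set, not a model of any enzyme: the function names and the example sequence are hypothetical, and the sketch assumes an unambiguous sequence containing only A, T, G and C.

```python
# Watson-Crick pairing: A pairs with T (two H bonds), G pairs with C (three H bonds).
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complementary_strand(template: str) -> str:
    """Return the strand complementary to `template`, written 5'->3'.

    The template is read 5'->3'. Because the two strands are anti-parallel,
    the complement is reversed so that it is also written 5'->3'.
    """
    return "".join(PAIR[base] for base in reversed(template.upper()))


def hydrogen_bonds(template: str) -> int:
    """Count the H bonds in the duplex formed by `template` and its complement."""
    return sum(H_BONDS[base] for base in template.upper())


if __name__ == "__main__":
    strand = "ATGCCG"  # hypothetical example sequence
    print(complementary_strand(strand))  # CGGCAT
    print(hydrogen_bonds(strand))        # 16
```

Taking the complement of the complement returns the original sequence, which is the sense in which each single strand carries all the information needed to regenerate its partner.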
During DNA replication, the two strands are opened up and a nascent strand, complementary to the template strand, is synthesised by a complex of many proteins called the replication fork or replisome. Unwinding of the two strands of DNA to expose bases during template-directed DNA synthesis requires the input of chemical energy to break the hydrogen bonds. This energy is derived from hydrolysis of ATP by helicases, which act as DNA-dependent ATPases that run ahead of the replication fork (Figure 1.2). In eukaryotes, the major replicative helicase is almost certainly the hexameric Mcm2-7 complex (Figure 1.3, see also Chapter 3). At the same time, torsional stress (positive supercoiling) is caused by unwinding the DNA and this is relieved by topoisomerases (nicking-closing enzymes) (Figure 1.2). Then DNA is synthesised by enzyme-mediated polymerisation of deoxyribonucleotide triphosphates (dNTPs) complementary to the sequence of the exposed bases on the template strand (see Chapter 4). New daughter duplexes thus consist of one parental template strand base paired to a complementary daughter strand. This is semi-conservative replication and was first demonstrated experimentally by Meselson and Stahl, who showed that newly synthesised DNA is composed of one template strand plus one nascent strand.
During polymerisation, nucleophilic attack by the lone pair of electrons on the 3' hydroxyl (OH) of deoxyribose onto the 5' phosphate of an incoming dNTP results in the formation of a phosphodiester bond with the elimination of pyrophosphate (Figure 1.1C). The subsequent, and very rapid, hydrolysis of pyrophosphate to two inorganic phosphates releases energy, and it is this that drives the polymerisation reaction forwards. An important consequence of this reaction is that DNA must always be synthesised in a 5' to 3' direction; all known DNA polymerases act 5'–3' with respect to the newly synthesised (nascent) DNA molecule. However, while the double-stranded DNA template exists as an anti-parallel double helix, replication of both template strands is coordinated at the replication fork, which moves in a net direction away from the start point (replication origin). To overcome this conflict of directionality, on only one strand (the 'leading' strand) can DNA be polymerised in the same direction as the fork is moving. On the other strand (the 'lagging' strand), nascent DNA is synthesised in short sections called Okazaki fragments, typically ~150 nucleotides long in eukaryotes (Figure 1.3). Okazaki fragments are started by short RNA primers which are subsequently removed before the fragments are joined together. Thus the fork can move away from the start site while coordinating synthesis of both nascent strands and without contravening the energy requirements of 5'–3' synthesis.
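To give a feel for how discontinuous lagging-strand synthesis scales, the following Python sketch divides a stretch of lagging-strand template into fragments of the typical ~150 nucleotide length quoted above. It is a deliberately simplified illustration: real fragment lengths vary, and the function name, the fixed fragment size and the coordinate convention are assumptions made for the example.

```python
from typing import List, Tuple

OKAZAKI_NT = 150  # typical eukaryotic Okazaki fragment length quoted in the text


def okazaki_fragments(lagging_nt: int,
                      fragment_nt: int = OKAZAKI_NT) -> List[Tuple[int, int]]:
    """Split a lagging-strand stretch into (start, end) fragment intervals.

    Coordinates are 0-based and half-open, measured from the origin in the
    direction of fork movement. Each fragment is itself synthesised 5'->3',
    i.e. back towards the origin, which is why synthesis is discontinuous.
    """
    return [
        (start, min(start + fragment_nt, lagging_nt))
        for start in range(0, lagging_nt, fragment_nt)
    ]


if __name__ == "__main__":
    print(okazaki_fragments(400))          # [(0, 150), (150, 300), (300, 400)]
    print(len(okazaki_fragments(50_000)))  # 334 fragments for 50 kb of lagging strand
```

Under these assumptions, a single fork travelling 50 kb requires several hundred separate priming, synthesis and joining events on the lagging strand, while the leading strand is extended continuously.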
In eukaryotes, the chromosomal DNA is located within the cell nucleus where it is associated with proteins to form a DNA-protein complex called chromatin (see Chapter 10). The basic building block of chromatin is the nucleosome core particle, which contains 147 base pairs of double-stranded DNA wrapped in 1.65 left-handed superhelical turns around the surface of a histone octamer comprising two central H3–H4 dimers flanked on either side by two H2A–H2B dimers (Figure 1.4). A variety of other proteins also bind to DNA and regulate its activity. For replication to occur, pre-existing nucleosomes and other DNA-bound proteins that are located ahead of replication forks need to be transiently disrupted. After fork passage, those proteins are deposited back on parental as well as nascent DNA so that the chromatin status is reproduced in daughter strands (see also Chapter 10).
1.2 Where and When Does DNA Replication Take Place?
1.2.1 Cell Cycle Control
In eukaryotes, DNA replication takes place during a distinct phase of the cell cycle called S phase, during which time the entire genome is precisely duplicated (Figure 1.5). The replicated DNA molecules are segregated to the two daughter cells during a subsequent cell cycle phase called mitosis (see Chapter 9). S phase and mitosis are separated by two 'gap' phases, G1 and G2. Progression through each stage of the cell cycle is very tightly regulated by a complex interplay of kinases (enzymes that phosphorylate proteins), phosphatases (enzymes that remove phosphate groups from proteins) and proteases (which degrade proteins into shorter polypeptides or constituent amino acids).
During S phase, pairs of replication forks are initiated bidirectionally from chromosomal loci called replication origins. The large size of eukaryotic chromosomes (each of which can be tens or hundreds of megabases long) means that in order for them to be replicated in a reasonable period of time, a large number of replication origins are needed. Although the initiation of a pair of replication forks at a replication origin is a tightly controlled process, each fork will typically then move along the DNA ('elongate') until it encounters a fork moving in the opposite direction, at which stage both forks will disassemble ('terminate'). When DNA is visualised during S phase, replicated DNA can be observed as a series of 'bubbles' with replication origins near their centres (arrowheads in Figure 1.6). The stretch of DNA replicated by forks emanating from a single origin is referred to as a replicon. Replicon sizes can vary significantly, both among different organisms and among different cell types in the same organism. Rapidly dividing cells typically have small replicon sizes (for example, cells in the early Xenopus embryo have an average replicon size of ~10 kb, whilst mammalian somatic cells typically have replicon sizes of 50–150 kb).
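The need for a large number of origins follows from simple arithmetic, sketched below in Python. The sketch assumes a mammalian genome of ~6 × 10⁹ nucleotides and uses the 50–150 kb range of somatic replicon sizes given above; the function names are hypothetical, and the calculation ignores variation in replicon size along the chromosome.

```python
GENOME_NT = 6_000_000_000  # assumed mammalian genome size (~6 x 10^9 nucleotides)


def origins_needed(genome_nt: int, replicon_kb: int) -> int:
    """Number of replicons (one origin each) needed to cover the genome."""
    replicon_nt = replicon_kb * 1_000
    return -(-genome_nt // replicon_nt)  # ceiling division


def forks_launched(origins: int) -> int:
    """Each origin initiates a pair of forks moving in opposite directions."""
    return 2 * origins


if __name__ == "__main__":
    for replicon_kb in (50, 150):
        n = origins_needed(GENOME_NT, replicon_kb)
        print(f"{replicon_kb} kb replicons: {n:,} origins, {forks_launched(n):,} forks")
```

With these inputs the estimate is 40,000 origins for 150 kb replicons and 120,000 for 50 kb replicons, so a mammalian somatic cell would be expected to use on the order of tens of thousands to about a hundred thousand origins per S phase.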
1.2.2 Origin Clusters and Replication Foci
In metazoans, adjacent origins (typically 2–5) are organised into clusters which initiate synchronously, while different origin clusters are activated at different stages of S phase. One or more clusters of origins are organised into a discrete replication focal site, which has been estimated to comprise about 1 Mb of DNA and 6–12 replicons. Each focus is thought of as a factory for DNA replication and contains a range of replication fork proteins (forming so-called replisomes). It is possible that replisomes are anchored to a fibrous network within the nucleus (the 'nuclear matrix' or 'nuclear scaffold') through which multiple replication forks are spooled; alternatively, the physical organisation of the chromosomal DNA into higher order chromatin structures could provide the framework on which replication foci are built. DNA replication is typically completed in each focus within 30–120 minutes, and during this time, live cell imaging of the replication fork protein PCNA (see Chapters 3 and 7) has shown that replication foci do not merge, divide or have directional movement, thus arguing that replication foci are achieved by the coordinated assembly and disassembly of replisomal proteins at sites that are more or less fixed.
1.2.3 The Replication Timing Programme
Eukaryotes replicate their genomic DNA according to a specific temporal programme, with different clusters of origins firing at different times during an S phase that lasts from minutes in yeast to hours in metazoans. Several pieces of evidence have suggested that chromatin context is a critical determinant of origin initiation time. The replication timing programme is re-established in each cell cycle shortly after mitosis. Transcriptionally active regions tend to have open chromatin structure and replicate early, whereas gene-poor regions and the more condensed heterochromatin replicate late. Transcriptional silencing can reprogramme an origin from initiating early to initiating late, as well as promoting a more compact chromatin structure around the region.
The timing decision point in early G1 (Figure 1.5) is the time when specific regions of the chromosome become programmed to replicate at specific stages of S phase. This takes place coincidentally with the repositioning of chromosomes in the nucleus and the formation of immobile structures in the nucleus that restrict chromosome mobility. It has been proposed that chromatin regulators might be concentrated into subnuclear compartments by a clustering of related chromosomal domains, which may influence the timing of origin firing within a chromatin domain. For example, in yeast, late replicating origins reside close to the nuclear periphery in G1, whereas early replicating origins are apparently randomly localised within the nucleus throughout the cell cycle.
Many other factors could also contribute to determining the timing of origin firing. For example, in Saccharomyces cerevisiae, the timing of replication at certain origins has been shown to be affected by the origin sequence. Precise levels of cyclin-dependent kinase (CDK) activity present at various stages of S phase are important for executing the temporal programme. In budding yeast, two S phase cyclins have differential roles in activation of early and late origins: Clb5 activates both early and late origins, while Clb6 activates only early origins.
The replication timing programme determines the differential firing time of large sequence blocks containing replication origin clusters, but why has the cell evolved such a sophisticated programme for DNA replication? The grouping of replication forks into factories that are activated at different times might provide an environment whereby newly replicated DNA could be assembled into specific chromatin states, thus maintaining the epigenetic information that is important for regulation of other nuclear activities (such as transcription). It may also allow for tight regulatory feedback, for example blocking firing of late origins when replication from early origins is halted.
1.3 Origins of DNA Replication
The number of origins ranges from a few hundred in a yeast cell to tens of thousands in a human cell. The extent to which conserved DNA sequence elements determine origins differs significantly among eukaryotic species. Replication origins in the budding yeast Saccharomyces cerevisiae contain highly conserved sequence elements called A, B1, B2 and B3 boxes of the autonomously replicating sequence (ARS). These conserved DNA sequences are required for binding of the initiator protein ORC (origin recognition complex, see Chapter 2). However, not all DNA segments containing the conserved sequence elements are recognised by ORC in vivo. Other sequences distributed over 100 bp also contribute to replication origin function, possibly by providing binding sites for proteins that can enhance the recruitment of ORC to DNA or by providing DNA sequences that can be easily unwound.
The origins in most other eukaryotes are much less stringent in terms of sequence requirement. In the fission yeast, Schizosaccharomyces pombe, the required origin sequences are distributed over large DNA segments (500–1000 bp) and are AT rich. It appears that it is the number of AT tracts in a given segment of DNA that determines its probability of binding ORC and functioning as an origin of DNA replication.
The nature of origins in metazoans is even less well defined than in yeasts and the origins appear not to contain any consensus sequence. Replication origins occur at frequent and nearly random intervals along metazoan chromosomal DNA, and only a fraction of them are utilised in each cell cycle with a wide variation of efficiency. A typical pattern of origin initiation in metazoans is broad zones containing many relatively inefficient origins, one or a few of which are selected stochastically and the rest are suppressed. However, at some origins, such as lamin B2 and β-globin origins, replication starts from tightly-defined sites.
Several interacting components may influence the location and efficiency of initiation in any given cell cycle, such as:
(1) DNA sequences. Sequences rich in AT could facilitate ORC binding or DNA unwinding.
(2) Local chromatin structure. It has been shown that the positions of nucleosomes near origins are important for origin function. Whilst histone acetylation has been shown to affect origin specification in Xenopus and Drosophila, in mammalian cells, an ATP-dependent chromatin remodelling complex is required for efficient replication of heterochromatin.
(3) Transcription. Transcription has been shown to interfere with origin activity and indeed, replication origins are almost never found within actively transcribed DNA.
(4) Protein–protein interactions. The presence of other proteins could help recruit ORC and enhance origin efficiency. For example, Abf1 and the Myb protein complex bind to origins and can affect the efficiency of origin utilisation in yeast and Drosophila.
(5) Origin interference. It has been observed that in an initiation zone, firing of one replication origin appears to inhibit initiation at nearby origins, but is coordinated with neighbouring origins at more distant sites. This may suggest some sort of long range interaction between origins.
1.4 Licensing of DNA for Replication
It is essential for a cell to replicate its genome only once per cell cycle and this is regulated by the ability of cells to load the Mcm2-7 protein complex onto the origins (see Chapters 2 and 3). Mcm2-7 form a clamp around DNA and provide helicase activity to separate the double helix ahead of replication forks (see Chapter 3). During late M and G1 phases of the cell cycle, Mcm2-7 are loaded onto the DNA, which probably involves the clamping of the proteins around origin DNA without activation of their helicase activity (Figure 1.7). This 'licenses' the origin for use in the subsequent S phase. Mcm2-7 loading requires the recognition of the origin DNA by the origin recognition complex (ORC) (Figure 1.8). ORC in turn recruits proteins Cdc6 and Cdt1, which load Mcm2-7 onto DNA by hydrolysing ATP (see Chapter 2). The complex of ORC, Cdc6, Cdt1 and Mcm2-7 at replication origins is termed the pre-replicative complex or pre-RC. It is not clear whether ORC, Cdc6 and Cdt1 open the Mcm2-7 ring and load it around DNA, or whether they facilitate the assembly of the Mcm2-7 hexamer on DNA from different Mcm subcomplexes present in the nucleoplasm.
As a licensed origin initiates during S phase, the Mcm2-7 complex becomes activated as a helicase, possibly by binding other replication fork proteins including the GINS complex. Since Mcm2-7 proteins travel with the replication fork, this means that an origin becomes unlicensed after it initiates. To prevent DNA being replicated a second time in a single cell cycle, it is therefore important to prevent re-licensing of replicated origins during S and G2 phases of the cell cycle. The mechanisms for achieving this vary in different eukaryotes.
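The licensing logic described in this section can be summarised as a small state model. The Python sketch below is a conceptual illustration only: the class and method names are hypothetical, the cell cycle is reduced to four labelled phases, and the molecular mechanisms that block re-licensing (which differ between eukaryotes) are collapsed into a single phase check.

```python
from enum import Enum


class Phase(Enum):
    LATE_M = "late M"
    G1 = "G1"
    S = "S"
    G2 = "G2"


# Mcm2-7 loading (licensing) is permitted only in late M and G1.
LICENSING_PHASES = {Phase.LATE_M, Phase.G1}


class Origin:
    """A single replication origin with a licensed/unlicensed state."""

    def __init__(self) -> None:
        self.licensed = False

    def load_mcm(self, phase: Phase) -> bool:
        """Attempt to load Mcm2-7; succeeds only in a licensing phase."""
        if phase in LICENSING_PHASES:
            self.licensed = True
            return True
        return False

    def fire(self, phase: Phase) -> bool:
        """Initiate replication; requires S phase and a licensed origin."""
        if phase is Phase.S and self.licensed:
            # Mcm2-7 travels away with the forks, so the origin is now unlicensed.
            self.licensed = False
            return True
        return False


if __name__ == "__main__":
    origin = Origin()
    print(origin.load_mcm(Phase.G1))  # True: licensed in G1
    print(origin.fire(Phase.S))       # True: initiates once
    print(origin.load_mcm(Phase.S))   # False: re-licensing blocked in S phase
    print(origin.fire(Phase.S))       # False: cannot initiate a second time
```

Separating the phases in which loading is allowed from the phase in which firing is allowed is what guarantees, in this simplified model, that each origin initiates at most once per cell cycle.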
Excerpted from Molecular Themes in DNA Replication by Lynne S. Cox. Copyright © 2009 Royal Society of Chemistry. Excerpted by permission of The Royal Society of Chemistry.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.