Unravelling Single Cell Genomics: Micro and Nanotools
Hardcover, 336 pages
Unravelling Single Cell Genomics
Micro and Nanotools
By Nathalie Bontoux, Luce Dauphinot, Marie-Claude Potier
The Royal Society of Chemistry
Copyright © 2010 Royal Society of Chemistry. All rights reserved.
ISBN: 978-1-84755-911-1
CHAPTER 1
An Introduction to Molecular Biology
LUCE DAUPHINOT
CRICM, CNRS UMR7225, INSERM UMRS975, UPMC, Hôpital de la Pitié-Salpêtrière, Paris, France
Abstract
The cell is the basic structural unit of all living organisms (cellula in Latin means small chamber). A typical cell has a diameter of 10–100 micrometers (µm), a volume of around 10 picoliters (pl) and a mass of around 1 nanogram (ng).
Cells fall into two main groups. Prokaryotic cells, such as bacteria, lack a nucleus; they are unicellular organisms with a relatively simple organization, a single compartment containing a circular DNA molecule. Eukaryotic cells are characterized by a nucleus and a cytoplasm containing many sub-cellular compartments. The nucleus is surrounded by a nuclear envelope with nuclear pores that allow the transport of macromolecules between the nucleus and the cytoplasm. The DNA is localized inside the nucleus and organized in chromosomes. Some eukaryotic organisms, such as yeasts, are unicellular, but most are multicellular, with the most complex being humans, whose bodies contain more than 10 000 billion (10¹³) cells.
1.1 DNA Structure and Gene Expression
Eukaryotic cells contain a nucleus while prokaryotic cells do not (Figures 1.1 and 1.2). The genetic information of the cell is stored inside the nucleus as a double-helix DNA molecule (the double-helix model was proposed by Watson and Crick in 1953). The DNA molecule is made up of four different units called nucleotides, each consisting of a sugar (deoxyribose) and a phosphate group linked to one of four bases (Figure 1.3A): adenine (A), cytosine (C), thymine (T) or guanine (G). The double helix is made of two complementary strands held together by hydrogen bonds between A–T and C–G base pairs (Figure 1.3B). Each DNA strand has a 5' end carrying a free phosphate group and a 3' end carrying a free hydroxyl (OH) group. In a double helix, the two strands run in opposite directions. The DNA constitutes the genome of the cell and is the same in all the cells of a given organism; it serves as the store of genetic information. The human genome consists of 3 × 10⁹ base pairs of DNA divided among 23 pairs of chromosomes. When the cell divides, this genetic information is transmitted to the daughter cells: the genome is duplicated during the DNA replication step. The double helix is replicated by a semi-conservative process in which each strand is used as a template for the synthesis of a new strand by a DNA polymerase (Figure 1.4).
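The pairing rules and the semi-conservative scheme described above can be illustrated with a short Python sketch. The function names and the six-base sequence are hypothetical, chosen only for illustration; the model ignores everything except base pairing and strand orientation.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Return the complementary strand, written 5' to 3'.

    The two strands of a double helix are antiparallel, so the
    complement is read in the reverse order of the input strand.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def replicate(duplex):
    """Semi-conservative replication of a (top, bottom) duplex.

    Each daughter duplex keeps one parental strand and pairs it with
    a newly synthesized complementary strand.
    """
    top, bottom = duplex
    daughter_1 = (top, reverse_complement(top))        # old top, new bottom
    daughter_2 = (reverse_complement(bottom), bottom)  # new top, old bottom
    return [daughter_1, daughter_2]

top_strand = "ATGGCA"  # hypothetical sequence, written 5' to 3'
parent = (top_strand, reverse_complement(top_strand))
daughters = replicate(parent)
print(parent)     # ('ATGGCA', 'TGCCAT')
print(daughters)  # two duplexes identical to the parent
```

Both daughter duplexes carry the same sequence as the parent, which is the sense in which the genetic information is transmitted unchanged to the daughter cells.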
During its lifetime, the cell divides, interacts and communicates with other cells, and responds to various external stimuli. During all these processes, the information stored in the DNA is transmitted by two other kinds of molecules: RNA and proteins. RNA is only a carrier of the information, whereas proteins are the key effectors of the cell. The transmission of genetic information starts inside the nucleus with the transcription of DNA into RNA. During this step, specific regions of the genome, called genes, are used as templates for the synthesis of RNA transcripts (Figure 1.5). RNA transcripts are quite similar to DNA, with only a few differences (Figure 1.6): they are single-stranded, contain a different sugar (ribose), and thymine (T) is replaced by uracil (U). These RNA molecules, called messenger RNA (mRNA), are the intermediates for protein synthesis. Their sizes range from about 100 nucleotides (nt) to more than 6000 nt, depending on the size of the gene. After synthesis, mRNA molecules migrate through the nuclear pores to the cytoplasm, where they guide protein synthesis during the translation step (Figure 1.5). During translation, the sequence of the mRNA is read and converted into a sequence of amino acids. Despite their crucial role, mRNAs are a minority species in the cell and represent only 1–5% of the total RNA. Transfer RNA (tRNA) and ribosomal RNA (rRNA), which are involved in the translation process (Figure 1.5), represent more than 90% of the total RNA of a cell. In humans, the RNA content of a cell is around 6–10 picograms (pg), which means that a cell contains less than 1 pg of mRNA.
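The "less than 1 pg" figure follows directly from the two ranges quoted above. The following Python sketch simply multiplies them out; the constant and function names are hypothetical.

```python
# Figures quoted in the text for a human cell.
TOTAL_RNA_PG = (6.0, 10.0)    # total RNA per cell, in picograms
MRNA_FRACTION = (0.01, 0.05)  # mRNA share of total RNA (1-5%)

def mrna_mass_range(total_pg=TOTAL_RNA_PG, fraction=MRNA_FRACTION):
    """Return the (lowest, highest) mRNA mass in pg implied by the two ranges."""
    lowest = total_pg[0] * fraction[0]
    highest = total_pg[1] * fraction[1]
    return lowest, highest

low, high = mrna_mass_range()
print(f"mRNA per cell: {low:.2f} to {high:.2f} pg")  # mRNA per cell: 0.06 to 0.50 pg
```

Even in the most favourable case (10 pg of total RNA, 5% of it mRNA), a single cell holds only about half a picogram of mRNA.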
The simplest cells have fewer than 500 genes whereas the most complex ones have more than 25 000 different genes. A gene is said to be "expressed" when its DNA sequence is transcribed into mRNA. All the genes that are expressed at a given time constitute the transcriptome of a cell, i.e. all the mRNAs present in the cell. At any given time, only around 5000 genes are expressed in a cell, each represented by one to several thousand mRNA copies depending on its level of expression. The transcriptome is thus a kind of instantaneous picture of the cell. Gene expression is highly regulated so that mRNA and protein levels can be adjusted on demand.
1.2 Molecular Biology Tools for Nucleic Acid Studies
1.2.1 DNA Engineering
Until 1970, studying DNA proved to be very difficult, mostly because of its size. With the emergence of DNA cloning technologies in 1972, followed by Southern blotting and DNA sequencing, this difficulty was rapidly overcome.
Southern blotting is based on the ability of two complementary DNA sequences to hybridize with high specificity. The technique allows the detection of a specific single-stranded DNA sequence within a mixture of millions of different ones, thus providing a powerful tool to identify and characterize genes. The DNA is first cut into smaller fragments by restriction enzymes; the fragments are then separated according to size by electrophoresis and transferred onto a nylon membrane. Once the DNA is bound to the membrane, the membrane is placed in a solution containing a specific radiolabeled probe that hybridizes to its complementary strand. After washing, the specific hybridization is revealed by autoradiography (Figure 1.7). Similar techniques have been developed for RNA (northern blotting) and protein (western blotting) analysis: RNA molecules or protein extracts are separated by electrophoresis and then revealed by a specific DNA probe or a specific antibody, respectively. These approaches have been adapted to characterize genes or mRNAs directly inside the cell, which is called in situ hybridization. Fluorescence in situ hybridization (FISH) consists of hybridizing a fluorescent probe (corresponding to a specific DNA fragment) directly to the chromosomes of a cell, allowing a gene to be localized on a chromosome (Figure 1.8). This technique has proved to be a very powerful tool in clinical research, especially in cancer, to look for and characterize chromosomal abnormalities such as deletions, duplications or translocations. The same approach can be used to quantify mRNA in cells on fixed tissues.
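The Southern blotting workflow (digest, separate by size, hybridize, reveal) can be mimicked on strings. The sketch below is a toy model under stated assumptions: it uses the EcoRI recognition site GAATTC, cut after the G, as an example enzyme; it follows a single strand only; and the sample sequence, probe and function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def digest(dna, site="GAATTC", cut_offset=1):
    """Cut one strand at every occurrence of a restriction site.

    cut_offset is the position of the cut inside the site
    (1 means the cut falls right after the first base).
    """
    fragments, start = [], 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[start:pos + cut_offset])
        start = pos + cut_offset
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments

def southern_blot(dna, probe):
    """Return (fragment length, probe hybridizes?) for each band."""
    fragments = sorted(digest(dna), key=len)  # electrophoresis: order by size
    target = reverse_complement(probe)        # sequence the probe pairs with
    return [(len(f), target in f) for f in fragments]

sample = "AAGAATTCCCGGGAATTCTT"  # hypothetical 20-base sample with two sites
bands = southern_blot(sample, probe="CCGGGA")
print(bands)  # [(3, False), (7, False), (10, True)]
```

Only the 10-base fragment contains the sequence complementary to the probe, so it is the only band that would appear on the autoradiograph in this toy example.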
1.2.2 Polymerase Chain Reaction
In 1985, fundamental progress was made in DNA analysis with the invention of the polymerase chain reaction (PCR), for which Mullis was awarded the Nobel Prize in Chemistry in 1993. PCR allows the exponential amplification of a given DNA sequence using specific primers that match both DNA strands (Figure 1.9). The reaction starts with an initial denaturation step of 1–2 min at 95 °C to obtain single-stranded DNA molecules, followed by a variable number (N) of amplification cycles. Each cycle consists of three steps: (1) denaturation to obtain single-stranded molecules, (2) annealing of the primers to their complementary sequences, and (3) elongation of new DNA strands from the primers by a DNA polymerase. Once elongation is complete, the cycle starts again at the denaturation step and each DNA strand is used as a new template. This approach can also be used to study RNA, but the RNA must first be converted into complementary DNA (cDNA) by reverse transcription (RT) prior to PCR amplification (Figure 1.10).
PCR is an extremely sensitive technique and can generate millions of copies of a DNA fragment starting from very small amounts. This made it easy to produce DNA probes for cloning, sequencing and genetic engineering, and the method is also widely used for genetic diagnosis and forensic analysis. However, the main limitation of the technique was that it could not be used for quantification, since it produces nearly the same final amount of DNA regardless of the initial quantity (Figure 1.11). This was solved in 1992 with the development of real-time quantitative PCR (qPCR). The protocol is the same as for PCR, but DNA is quantified during the reaction by monitoring a fluorescence signal that is proportional to the amount of DNA synthesized. The initial amount of DNA in the sample can then be deduced from the number of amplification cycles required to reach a given level of fluorescence.
There are two common methods for qPCR quantification (Figure 1.12). First, a fluorescent dye such as SYBR Green, which binds to double-stranded DNA, can be used. In this case, the fluorescence emitted increases with the number of DNA molecules produced (Figure 1.12A). However, the dye is not specific for the amplified sequence and also binds to non-specific products. It is therefore essential to use primers that give rise to a single amplification product. Second, a double-labeled probe specific for the amplified DNA fragment, called a TaqMan probe, can be used. The probe, which binds between the two primers, carries a fluorescent dye at its 5' end and, at its 3' end, a quencher that suppresses the dye's fluorescence. During the annealing step, the probe and the primers hybridize to their complementary sequences. During the elongation step, the probe is degraded by the 5'–3' exonuclease activity of the DNA polymerase, which frees the dye from the quencher (Figure 1.12B) and allows it to fluoresce. Today, qPCR is widely used in diagnosis, single-nucleotide polymorphism (SNP) analysis, genetic analysis, and pathogen detection. Its most important application is gene expression analysis by RT-qPCR, currently the most accurate and most sensitive method to detect and quantify mRNA.
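Because each strand serves as a template in the next cycle, the number of copies ideally doubles at every cycle. The Python sketch below expresses this; the `efficiency` parameter is an added assumption (1.0 means perfect doubling) and the function name is hypothetical.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized number of copies after a given number of PCR cycles.

    efficiency is the fraction of templates copied per cycle:
    1.0 corresponds to perfect doubling, lower values to a less
    efficient reaction. No plateau is modeled.
    """
    return initial_copies * (1.0 + efficiency) ** cycles

# Starting from 10 template molecules, 30 ideal cycles give 10 * 2**30 copies.
print(pcr_copies(10, 30))  # 10737418240.0
```

This idealized model has no plateau; in a real reaction the reagents become limiting and amplification levels off in late cycles.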
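The link between the cycle at which a fluorescence threshold is reached and the starting quantity can be sketched as follows, assuming ideal exponential amplification up to the threshold. The function names, the `efficiency` parameter and the example values are hypothetical.

```python
def initial_copies(threshold_copies, threshold_cycle, efficiency=1.0):
    """Back-calculate the starting copy number from the threshold cycle.

    Assumes copies grow by a factor (1 + efficiency) per cycle until
    the fluorescence threshold (equivalent to threshold_copies) is reached.
    """
    return threshold_copies / (1.0 + efficiency) ** threshold_cycle

def fold_difference(cycle_reference, cycle_sample, efficiency=1.0):
    """How many times more template the sample held than the reference."""
    return (1.0 + efficiency) ** (cycle_reference - cycle_sample)

# A sample crossing the threshold 3 cycles earlier than the reference
# started with 2**3 = 8 times more template (perfect doubling assumed).
print(fold_difference(25, 22))       # 8.0
print(initial_copies(2 ** 30, 20))   # 1024.0
```

The fewer cycles a sample needs to reach the threshold, the more template it contained at the start.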
1.2.3 DNA Microarrays
In 1995, another technological breakthrough was achieved in gene expression analysis with the development of DNA microarrays by Pat Brown and his colleagues at Stanford University. This new method was based on an old concept of molecular biology (the specific hybridization of two complementary strands of DNA) but opened the way to high-throughput gene profiling. The first microarrays were made of a hundred features and were used to monitor the expression of a few genes. Since the publication of the human genome sequence in 2001, DNA microarrays allowing the analysis of the whole transcriptome (more than 25 000 genes for human or mouse) have become available.
DNA microarrays consist of a solid support (nylon membrane, glass slide or silicon wafer) to which DNA oligonucleotides or PCR products, called probes, are covalently linked. Each probe is specific for one DNA sequence of interest: the probes are designed in silico with high stringency to prevent any cross-hybridization, and millions of copies of each probe are then spotted on the chip. Usually, the probes are designed, synthesized and then printed onto the chip, but some companies have developed technologies that allow in situ synthesis of the probes (www.affymetrix.com, www.nimblegen.com, www.agilent.com). When the goal is gene expression profiling, the mRNAs isolated from the samples to be compared are used as targets. They recognize and hybridize specifically to the complementary sequences of the probes.
There are two kinds of microarray experiments: one-color and dual-color. In one-color experiments, all mRNA samples are labeled with the same dye and each is hybridized on a separate microarray. The probe–target hybridization signals are then analyzed and compared across arrays. In dual-color experiments, by contrast, RNA is extracted from two tissues or cell types (e.g. cancer cells versus normal cells), labeled independently with two different dyes (cyanine 5 and cyanine 3) and loaded together on the same microarray (Figure 1.13). As the targets hybridize to their complementary probes in a quantitative way, measuring the fluorescence of each dye allows the relative level of expression of each gene in the two samples to be determined.
DNA microarrays constitute an extremely powerful tool to study differential gene expression, although many problems remain, such as the standardization of experiments and the interpretation of results. Data analysis represents a real challenge for bioinformatics. First, the raw data must be collected through image processing; then a normalisation step is needed to remove systematic bias; and, finally, statistical tests have to be carried out to determine which differences of expression between two or more samples are significant. Many software tools are available for each of these steps.
Today, the use of microarrays is expanding rapidly. DNA microarrays are widely used for gene expression profiling, pathogen detection, resequencing and genotyping. Related array formats serve other applications: protein arrays for protein expression analysis and comparative genomic hybridization (CGH) arrays for the analysis of chromosomal copy-number changes.
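In a dual-color experiment, relative expression is commonly summarized as the base-2 logarithm of the ratio between the two dye intensities. The sketch below assumes hypothetical gene names and intensity values, with cyanine 5 labeling the test sample and cyanine 3 the reference; it is an illustration, not a complete analysis.

```python
import math

def log2_ratios(cy5, cy3):
    """Per-gene log2(Cy5 / Cy3) from two intensity dictionaries.

    Positive values: higher signal in the Cy5-labeled sample.
    Negative values: higher signal in the Cy3-labeled sample.
    """
    return {gene: math.log2(cy5[gene] / cy3[gene]) for gene in cy5}

# Hypothetical background-corrected intensities for three probes.
cy5 = {"geneA": 4000.0, "geneB": 500.0, "geneC": 1200.0}  # e.g. cancer cells
cy3 = {"geneA": 1000.0, "geneB": 2000.0, "geneC": 1200.0}  # e.g. normal cells

ratios = log2_ratios(cy5, cy3)
print(ratios)  # {'geneA': 2.0, 'geneB': -2.0, 'geneC': 0.0}
```

The logarithm makes the scale symmetric: a four-fold increase (+2) and a four-fold decrease (-2) lie at the same distance from zero.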
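As one possible illustration of the normalisation step, the sketch below median-centers a set of log2 ratios. Median centering is only one of several methods in use, chosen here for brevity; it assumes that most genes are not differentially expressed, so a non-zero median reflects systematic bias such as unequal dye efficiency. All names and values are hypothetical.

```python
from statistics import median

def median_center(log_ratios):
    """Subtract the median log ratio from every gene.

    Assumption: most genes are unchanged, so the median of the
    log ratios should be zero once systematic bias is removed.
    """
    shift = median(log_ratios.values())
    return {gene: value - shift for gene, value in log_ratios.items()}

# Hypothetical raw log2 ratios, all shifted upward by a dye bias of 0.5.
raw = {"g1": 1.5, "g2": 0.5, "g3": 0.5, "g4": -0.5, "g5": 2.5}

normalized = median_center(raw)
print(normalized)  # {'g1': 1.0, 'g2': 0.0, 'g3': 0.0, 'g4': -1.0, 'g5': 2.0}
```

Normalisation alone does not establish significance; replicated measurements and a statistical test are still needed to decide which differences are real.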
CHAPTER 2
The Central Dogma in Molecular Biology
LAILI MAHMOUDIAN
CRICM, UPMC, INSERM UMRS975, CNRS UMR7225, Hôpital de la Pitié-Salpêtrière, Paris, France
Abstract
The central dogma states that the genetic information encoded in DNA molecules is first transferred to RNA molecules, which act as intermediates, and then to protein molecules. In other words, genetic information is stored as nucleic acid sequences, but its function is expressed in the form of proteins. Three processes are involved in the flow of genetic information to proteins: replication, transcription and translation. Figure 2.1 shows the principles of the central dogma.
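The flow from DNA to RNA to protein can be sketched in a few lines of Python. The codon table below is deliberately partial (a handful of standard codons, enough for the example), and the sequence and function names are hypothetical. The mRNA is derived from the coding strand, which has the same sequence as the transcript with T in place of U.

```python
# Partial standard codon table: just enough codons for the example below.
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "AAA": "K",  # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}

def transcribe(coding_strand):
    """DNA coding strand -> mRNA: same sequence with U instead of T."""
    return coding_strand.replace("T", "U")

def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

gene = "ATGTTTGGCAAATAA"  # hypothetical coding sequence
mrna = transcribe(gene)
print(mrna)               # AUGUUUGGCAAAUAA
print(translate(mrna))    # MFGK
```

A codon absent from this partial table raises a `KeyError`; a complete implementation would carry all 64 codons.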
2.1 Replication
Replication is the process that copies the double-stranded DNA of a cell into two identical molecules (Figure 2.1). DNA replication is semi-conservative: the two original strands of the DNA molecule separate and each acts as a template for a new complementary strand, so that two identical DNA molecules are produced from a single double-stranded molecule. In a cell, DNA replication starts at specific locations called origins. First, an enzyme called helicase unwinds the double helix by breaking the hydrogen bonds between the two strands. The resulting structure, two single strands branching from the duplex, is called a replication fork. DNA polymerase then generates the two complementary strands by adding nucleotides (the units of DNA) to the 3' end of each growing strand, so that synthesis always proceeds in the 5' to 3' direction.
Figure 2.2 shows a simple scheme of a replication fork. At each replication fork, two new DNA strands are synthesized, called the leading and lagging strands. The leading strand is the new strand whose 5' to 3' direction of synthesis follows the movement of the replication fork. DNA polymerase α is therefore able to synthesize the leading strand continuously as the fork moves along the DNA.
The lagging strand, on the other hand, is the new strand whose 5' to 3' direction runs opposite to the movement of the replication fork. Since DNA polymerase cannot synthesize in the 3' to 5' direction, the lagging strand is replicated in short segments called Okazaki fragments. An enzyme called primase makes short RNA primers, which serve as starting points for DNA synthesis. DNA polymerase δ can use the free 3'–OH of the RNA primer to synthesize DNA in the 5' to 3' direction. The RNA primers are then removed from the lagging strand and new deoxyribonucleotides are added to fill the gaps. Finally, an enzyme called DNA ligase joins the fragments together and completes the synthesis of the lagging strand.
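The discontinuous synthesis of the lagging strand can be mimicked on strings. In this toy model, which ignores RNA primers and all enzymes, the template is exposed in chunks as the fork advances, each chunk is copied as a separate fragment, and the fragments are joined at the end. The fragment size, sequence and function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def okazaki_fragments(template, size):
    """Copy the lagging-strand template chunk by chunk.

    The template is written 5' to 3' in the direction of fork movement,
    so chunks become available from left to right as the fork opens.
    Each fragment is itself synthesized 5' to 3'.
    """
    chunks = [template[i:i + size] for i in range(0, len(template), size)]
    return [reverse_complement(chunk) for chunk in chunks]

def ligate(fragments):
    """Join the fragments into one strand, written 5' to 3'.

    The most recently made fragment lies at the 5' end of the new strand,
    so the fragments are joined in reverse order of synthesis.
    """
    return "".join(reversed(fragments))

template = "ATGGCATTC"  # hypothetical lagging-strand template
fragments = okazaki_fragments(template, size=3)
print(fragments)          # ['CAT', 'TGC', 'GAA']
print(ligate(fragments))  # GAATGCCAT
```

The ligated product equals the reverse complement of the whole template, that is, the same strand a continuous synthesis would have produced.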
(Continues...)
Excerpted from Unravelling Single Cell Genomics by Nathalie Bontoux, Luce Dauphinot, Marie-Claude Potier. Copyright © 2010 Royal Society of Chemistry. Excerpted by permission of The Royal Society of Chemistry.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.