E-book, English, Volume 41, 348 pages
Series: Methods in Microbiology
Goodfellow / Sutcliffe / Chun New Approaches to Prokaryotic Systematics
1st edition 2014
ISBN: 978-0-12-800443-2
Publisher: Elsevier Science & Techn.
Format: EPUB
Copy protection: 6 - ePub Watermark
Volume 41 of Methods in Microbiology is a methods book designed to highlight procedures that will revitalize the purposes and practices of prokaryotic systematics. The volume shows that genomics and computational biology are pivotal to the new direction of travel, and emphasises that new developments need to be built upon historical good practices, notably the continued use of the nomenclatural type concept and the requirement to deposit type strains in at least two service culture collections in different countries.
- Detailed protocols on cutting edge methods
- Prepared by leading international experts in the relevant fields
Further information & material
1;Front Cover;1
2;New Approaches to Prokaryotic Systematics;4
3;Copyright;5
4;Dedication;6
5;Contents;8
6;Contributors;16
7;Preface;20
8;Chapter 1: The Need for Change: Embracing the Genome;22
8.1;1. A brief history of genomic sequencing of prokaryotes;22
8.2;2. Why Sequence the Genomes of Prokaryotes?;24
8.3;3. The State-of-the-Art;27
8.4;4. Where We Are Going;30
8.5;Acknowledgement;32
8.6;References;32
9;Chapter 2: An Introduction to Phylogenetics and the Tree of Life;34
9.1;1. Introduction;34
9.2;2. Step 1: Posing a question;37
9.3;3. Step 2: Choosing Relevant Sequences;37
9.3.1;3.1. Obtaining 16S rRNA Sequences for Bacteria, Archaea and Eukarya;38
9.3.2;3.2. A note on the availability and use of data and methods;39
9.4;4. Step 3: Aligning Sequences and Editing the Alignment;40
9.5;5. Step 4: The theory of Fitting and Selecting a Phylogenetic Model;43
9.5.1;5.1. Markov nucleotide substitution models;43
9.5.2;5.2. Inferring phylogenies under Markov substitution models;46
9.5.3;5.3. Frequentist inference;47
9.5.4;5.4. Bayesian inference;48
9.5.5;5.5. Model comparison and assessment;49
9.5.6;5.6. Frequentist methods;50
9.5.7;5.7. Bayesian model choice;51
9.6;6. Step 5: Inferring Trees-Practical Guidelines for fitting and comparing Markov substitution models;52
9.6.1;6.1. Alignment formats for phylogeny programs;52
9.6.2;6.2. Inferring maximum likelihood phylogenies using RAxML;53
9.6.3;6.3. Bayesian analyses with PhyloBayes;53
9.6.4;6.4. Posterior predictive checks;55
9.7;7. Step 6: Interpreting the Phylogenetic Tree;56
9.8;Conclusions;61
9.9;Acknowledgements;62
9.10;References;62
10;Chapter 3: The All-Species Living Tree Project;66
10.1;1. Introduction;66
10.2;2. Sources of Information;67
10.2.1;2.1. Classification of microbial databases;67
10.2.1.1;2.1.1. Taxonomy (LPSN and Bergey's Manual);69
10.2.1.2;2.1.2. Type-strain information (StrainInfo database);69
10.2.1.3;2.1.3. Sequences and alignments (ARB and SILVA);70
10.3;3. Database creation and updating;71
10.4;4. Features of the Database;72
10.4.1;4.1. Optimized SSU and LSU alignments;72
10.4.2;4.2. Curated hierarchical classification;72
10.4.3;4.3. Risk-group classification;74
10.4.4;4.4. Taxonomic thresholds;75
10.5;5. Phylogenetic trees;75
10.6;6. LTP as a Taxonomic Tool;76
10.7;Acknowledgements;78
10.8;References;78
11;Chapter 4: 16S rRNA Gene-Based Identification of Bacteria and Archaea using the EzTaxon Server;82
11.1;1. Introduction;82
11.2;2. Use of 16S rRNA gene sequences in prokaryotic systematics;83
11.2.1;2.1. Sequencing of 16S rRNA genes;83
11.2.2;2.2. Calculation of nucleotide sequence similarity values of 16S rRNA gene sequences;85
11.3;3. Identification of Bacteria Using the EzTaxon Database;85
11.3.1;3.1. EzTaxon database;85
11.3.2;3.2. Algorithm for "EzTaxon search";86
11.3.3;3.3. Overall workflow from Sanger DNA sequence data;88
11.3.4;3.4. Assembly and trimming of sequences;88
11.3.5;3.5. Manual editing of sequences using the secondary structure information;90
11.3.5.1;3.5.1. Manual editing of sequences with EzEditor;90
11.3.6;3.6. Identification of strains using the EzTaxon server;92
11.3.7;3.7. Phylogenetic analysis;93
11.4;Concluding remarks;93
11.5;Acknowledgement;93
11.6;References;94
12;Chapter 5: Revolutionizing Prokaryotic Systematics Through Next-Generation Sequencing;96
12.1;1. Introduction;96
12.2;2. Sequencing Approaches;97
12.3;3. Bioinformatic Analyses;99
12.3.1;3.1. De novo assembly and mapping;99
12.3.2;3.2. Annotation;102
12.3.3;3.3. Comparative genomic analysis;104
12.3.4;3.4. SNP extraction and functional characteristics;108
12.3.5;3.5. Phylogenetic analyses;108
12.4;4. Applications of Next-Generation Sequencing Technology;109
12.4.1;4.1. Prokaryotic systematics;109
12.4.2;4.2. Pathogen evolution, transmission and adaptation;110
12.4.3;4.3. Genetic basis of phenotypic characteristics;111
12.4.4;4.4. Metagenomics;111
12.4.5;4.5. Target resequencing;112
12.4.6;4.6. RNA-Seq and transcriptomics;112
12.5;Conclusions;114
12.6;Acknowledgements;114
12.7;References;114
13;Chapter 6: Whole-Genome Analyses: Average Nucleotide Identity;124
13.1;1. Introduction;124
13.1.1;1.1. Calculation of average nucleotide identity;124
13.1.2;1.2. Theoretical background: BLAST/MUMmer software;126
13.2;2. Preparation and DNA Sequencing;127
13.2.1;2.1. Strain cultivation;127
13.2.2;2.2. DNA extraction and quantification;128
13.2.3;2.3. Whole-genome sequencing;129
13.3;3. ANI Calculations Using JSpecies;130
13.3.1;3.1. Installation;130
13.3.2;3.2. Operation;131
13.3.3;3.3. Calculations on-line;135
13.4;4. Interpretation and publication of results;135
13.5;5. Application to Prokaryotic Classification: Case Studies;135
13.6;Concluding remarks;138
13.7;Acknowledgements;138
13.8;References;138
14;Chapter 7: Whole-Genome Sequencing for Rapid and Accurate Identification of Bacterial Transmission Pathways;144
14.1;1. Introduction;144
14.2;2. The Sequencing Revolution;145
14.2.1;2.1. Second-generation sequencing technologies;146
14.2.1.1;2.1.1. 454 pyrosequencing;146
14.2.1.2;2.1.2. Illumina sequencing technology;147
14.2.1.3;2.1.3. Ion Torrent;148
14.3;3. Bacterial typing with next-generation sequencing;149
14.4;4. Identifying transmission pathways using whole-genome sequencing - The toolkit;150
14.4.1;4.1. Mapping and alignment of whole genomes;153
14.4.1.1;4.1.1. Indexing;153
14.4.1.1.1;4.1.1.1. Hash tables;153
14.4.1.1.2;4.1.1.2. The Burrows-Wheeler transform;154
14.4.1.2;4.1.2. Realigning indels;154
14.4.1.3;4.1.3. The SAM format;154
14.4.1.4;4.1.4. Identifying variation from mapped reads;155
14.4.2;4.2. De novo assembly and genome alignment;155
14.4.2.1;4.2.1. Read correction;155
14.4.2.2;4.2.2. Assemblers;156
14.4.2.2.1;4.2.2.1. Overlap-layout-consensus;156
14.4.2.2.2;4.2.2.2. de Bruijn graphs;156
14.4.2.2.3;4.2.2.3. Platform-specific assemblers;157
14.4.2.3;4.2.3. Scaffolding and gap filling;157
14.4.2.4;4.2.4. Identifying variation using co-assembly;158
14.4.3;4.3. Identifying variation from whole-genome assemblies;158
14.4.3.1;4.3.1. Whole-genome alignment;158
14.4.3.1.1;4.3.1.1. BLAT;159
14.4.3.1.2;4.3.1.2. MUMmer;159
14.4.3.1.3;4.3.1.3. Mauve;160
14.4.3.1.4;4.3.1.4. Mugsy;160
14.4.4;4.4. Identifying transmissions using whole-genome variation;161
14.4.4.1;4.4.1. SNP distances - Defining a cutoff;161
14.4.4.2;4.4.2. Phylogenetic evidence;164
14.4.4.3;4.4.3. Phylodynamics;165
14.5;5. Combining genomic and epidemiological evidence;165
14.6;6. Future Directions;166
14.7;References;168
15;Chapter 8: Identification of Conserved Indels that are Useful for Classification and Evolutionary Studies;174
15.1;1. Limitations of the phylogenetic trees for understanding microbial classification;174
15.2;2. Characteristics that are well-suited for classification;175
15.3;3. Conserved signature indels and Their usefulness for classification and evolutionary studies;176
15.4;4. Identification of Conserved Signature Indels in Protein Sequences;178
15.4.1;4.1. Creation of multiple sequence alignments;179
15.4.2;4.2. Identification of potential conserved indels in the sequence alignments;180
15.4.3;4.3. Blast searches on potential conserved indels to identify useful conserved indels;184
15.4.4;4.4. Formatting of the conserved indels;185
15.5;5. Interpreting the Significance of Conserved Indels;188
15.6;6. Correspondence of the Results Obtained from CSIs with rRNA and Other Phylogenetic Approaches;191
15.7;7. Importance of the discovered CSIs for understanding microbial classification and phylogeny;194
15.8;Acknowledgements;196
15.9;References;196
16;Chapter 9: Reconciliation Approaches to Determining HGT, Duplications, and Losses in Gene Trees;204
16.1;1. Introduction;204
16.2;2. Bacterial species tree;205
16.3;3. Gene Family;206
16.4;4. Evolution of Genes in Bacterial Genomes;207
16.5;5. Gene Tree/Species Tree Reconciliation;209
16.5.1;5.1. Protocol for running AnGST;212
16.5.2;5.2. Interpreting the results of AnGST analyses;213
16.6;6. Analysis at the Genome Scale;214
16.6.1;6.1. Protocol for running COUNT;214
16.7;Concluding Remarks;216
16.8;Acknowledgements;218
16.9;References;219
17;Chapter 10: Multi-Locus Sequence Typing and the Gene-by-Gene Approach to Bacterial Classification and Analysis of Population ;222
17.1;1. Introduction;222
17.1.1;1.1. Historical perspective;222
17.1.2;1.2. Multi-locus population analyses;222
17.2;2. Multi-locus sequence typing;223
17.3;3. Whole-Genome Data Analyses;224
17.3.1;3.1. Gene-by-gene analysis of WGS data;225
17.3.2;3.2. The Bacterial Isolate Genome Sequence Database;225
17.3.3;3.3. Isolate and sequence databases;225
17.3.4;3.4. Reference sequence and profile definitions database;228
17.3.5;3.5. Database Integrity;228
17.3.6;3.6. Gene nomenclature;230
17.3.7;3.7. Typing and analysis schemes;230
17.4;4. Examples of Gene-by-Gene Analysis: Neisseria;230
17.4.1;4.1. Ribosomal multi-locus sequence typing;231
17.4.2;4.2. Neisseria rplF assay;232
17.5;5. Examples of Gene-by-Gene Analysis: Campylobacter;234
17.5.1;5.1. Core genome multi-locus sequence typing;234
17.5.2;5.2. Whole-genome multi-locus sequence typing;234
17.6;Conclusions;235
17.7;References;237
18;Chapter 11: Multi-locus Sequence Analysis: Taking Prokaryotic Systematics to the Next Level;242
18.1;1. Introduction;242
18.2;2. Multi-Locus Sequence Analysis;243
18.2.1;2.1. Underlying concepts;243
18.2.2;2.2. Selection of gene loci;244
18.2.3;2.3. Generating sequences;244
18.2.4;2.4. Data analysis;246
18.2.4.1;2.4.1. Properties of loci;246
18.2.4.1.1;2.4.1.1. Sequence alignments;246
18.2.4.1.2;2.4.1.2. Loci statistics;246
18.2.4.1.3;2.4.1.3. Establishing STs;247
18.2.4.2;2.4.2. Phylogenetic analysis;247
18.2.4.2.1;2.4.2.1. Models of evolution;247
18.2.4.2.2;2.4.2.2. Evaluating phylogenetic congruence;247
18.2.4.2.3;2.4.2.3. Construction of phylogenetic trees;248
18.2.5;2.5. Comparison with other taxonomic methods;248
18.2.6;2.6. MLSAs: Advantages and disadvantages;249
18.2.7;2.7. MLSA databases;250
18.3;3. Application of MLSAs in Prokaryotic Systematics;250
18.3.1;3.1. The genus Streptomyces;250
18.3.1.1;3.1.1. The Streptomyces MLSA scheme;251
18.3.1.2;3.1.2. DNA:DNA hybridization and MLSAs;253
18.3.2;3.2. Classification of the S. pratensis phylogroup;260
18.3.3;3.3. MLSA of phytopathogenic Streptomyces species;263
18.3.4;3.4. MLSA: Actinobacteria;263
18.4;4. Detection of Ecotypes Based on MLSAs;264
18.5;5. MLSA Based on Whole Genome Sequences;266
18.6;References;267
19;Chapter 12: Bacterial Typing and Identification By Genomic Analysis of 16S-23S rRNA Intergenic Transcribed Spacer (ITS) Seque;274
19.1;1. Introduction;275
19.2;2. Methods;280
19.2.1;2.1. Search and download bacterial whole-genome sequences;280
19.2.2;2.2. Annotation of rrn alleles;281
19.2.3;2.3. Extraction of the gene components (16S, 23S and 5S) and extra genic regions (ITS1, ITS2, pre-16S and post-5S) that make ;281
19.2.4;2.4. Alignment of rrn gene components and extra genic regions;284
19.2.5;2.5. Editing of rrn gene components and extra genic region alignment files;284
19.2.6;2.6. Exporting annotations and associated data from "Geneious" to "Excel";287
19.2.7;2.7. Exporting and drawing genome, restriction and alignment maps in Illustrator;287
19.2.8;2.8. Design of Excel tables for FileMaker database construction;288
19.2.9;2.9. Construction of FileMaker database from Excel tables;288
19.2.10;2.10. Design of FileMaker statistical and graphical reports;288
19.2.11;2.11. A new way of reporting large amounts of statistical and graphical information in FileMaker Go (see Appendix B4 for inst;289
19.3;3. Results;289
19.3.1;3.1. Sequence acquisition and preparation;289
19.3.2;3.2. Extraction of rrn operon genes and sequence alignment;289
19.3.3;3.3. Annotation of sequence differences in the rrn operons;290
19.3.4;3.4. Graphical presentation of alignments, operon restriction maps and whole-genome maps;290
19.3.5;3.5. Statistical presentation of annotation and metadata, linking alignments, whole genomes, rrn operon alleles, strains and ;291
19.4;4. Discussion;291
19.5;References;292
20;Chapter 13: MALDI-TOF Mass Spectrometry Applied to Classification and Identification of Bacteria;296
20.1;1. Introduction;296
20.2;2. Sample Preparation;297
20.2.1;2.1. Cultivation of bacteria;297
20.2.2;2.2. Selection of the matrix and composition of the matrix solution;298
20.2.2.1;Protocol 1: Preparation of the HCCA matrix solution in 50% (v/v) aqueous ACN containing 2.5% (v/v) TFA;301
20.2.3;2.3. Direct colony transfer;302
20.2.3.1;Protocol 2a: Direct transfer of bacterial biomass to stainless steel targets;302
20.2.3.2;Protocol 2b: Extended direct transfer of bacterial biomass to stainless steel targets;303
20.2.4;2.4. Extraction methods;303
20.2.4.1;Protocol 3: Sample preparation by ethanol-formic acid extraction for MALDI-TOF MS-based identification of bacteria;304
20.2.5;2.5. Organism-specific sample preparation;305
20.2.5.1;Protocol 4: Bead preparation protocol for MALDI-TOF MS sample preparation of mycobacteria;306
20.2.6;2.6. MALDI-TOF MS-based identification of bacteria in complex biological matrices;307
20.2.6.1;2.6.1. Identification of bacteria in positive blood cultures;307
20.2.6.1.1;Protocol 5: Application of the Bruker MALDI Sepsityper Kit (Schubert et al., 2011);307
20.2.6.1.2;Protocol 6: Differential centrifugation-;308
20.2.6.1.3;Protocol 7: Processing of positive BC containing charcoal (Wüppenhorst et al., 2012);308
20.2.6.1.4;Protocol 8: Lysis-filtration method for VITEK MS systems (Fothergill et al., 2013);309
20.2.6.2;2.6.2. Examples of MALDI-TOF MS applications for direct bacterial identification in complex matrices;309
20.3;3. Optimisation of Measurement Conditions;310
20.3.1;3.1. MALDI-TOF MS library creation;311
20.4;4. Application of MALDI-TOF MS for Classification and Identification;312
20.4.1;4.1. Software used for taxonomic evaluation of MALDI-TOF mass spectra;312
20.4.2;4.2. Classification and taxonomic resolution;313
20.4.2.1;4.2.1. Classification of genera within a family;313
20.4.2.2;4.2.2. Classification of species within a genus;313
20.4.2.3;4.2.3. Classification of strains within a species;316
20.4.2.4;4.2.4. Resolution at the level of subspecies;316
20.4.2.5;4.2.5. Differentiation of bacterial strains;317
20.4.2.6;4.2.6. Optimal field of application of MALDI-TOF MS in bacterial systematics;318
20.4.3;4.3. Identification;318
20.4.4;4.4. MALDI-TOF MS and discovery of novel organisms;319
20.5;Conclusions and Outlook;320
20.6;Acknowledgments;321
20.7;References;321
21;Chapter 14: Continuing Importance of the "Phenotype" in the Genomic Era;328
21.1;1. Phylogeny and Genotype;328
21.2;2. The Phenotype;333
21.3;3. The Ongoing Importance of the Phenotype in an Organism Based Taxonomy;334
21.4;Conclusions and challenges;336
21.5;References;336
22;Index;342
23;Color Plate;349
Chapter 2 An Introduction to Phylogenetics and the Tree of Life
Tom A. Williams*,1; Sarah E. Heaps*,†
* Institute for Cell and Molecular Biosciences, The Medical School, Newcastle University, Newcastle upon Tyne, United Kingdom
† School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne, United Kingdom
1 Corresponding author: email address: tom.williams2@ncl.ac.uk
Abstract
We present an introduction to experimental design and hypothesis testing in phylogenetics, guiding the reader through the series of steps that are generally involved in making phylogenetic trees. These include choosing a phylogenetic question, selecting the appropriate sequences to analyse, aligning sequences and assessing the quality of sequence alignments, choosing the appropriate phylogenetic methods and interpreting the results. We provide a brief introduction to the freely available, open source software and Web services that can be used to perform these tasks and discuss the difficulties most frequently encountered when building phylogenies, along with some possible solutions. To provide biological context, we will frame our discussion in terms of a very interesting (and still controversial) problem: the reconstruction of the tree of life from ribosomal RNA sequences.
Keywords: Phylogenetics; Tree of life; Statistical inference; Model choice
1 Introduction
Phylogenetic trees are fundamental to organising, understanding and testing hypotheses about the evolution of biological diversity. Early phylogenies were based on morphology: useful for multicellular eukaryotes, but much less so when inferring relationships among prokaryotes or among the different branches of the tree of life, most of which is microbial. Although comparisons of biochemical properties provided some insight into bacterial relationships, they proved unreliable at deeper taxonomic levels, and by 1960, it seemed that a universal phylogeny was out of reach, with the only unambiguous division in the microbial world separating the eukaryotes from the structurally simpler prokaryotes (Stanier & van Niel, 1962). This situation changed completely with the advent of molecular sequencing, which provided biologists with a rich new source of information about evolutionary history (Zuckerkandl & Pauling, 1965) that was just as relevant for prokaryotes and microbial eukaryotes as for animals, plants and fungi. The greatest early success of the sequencing era came when Carl Woese and colleagues showed that the ribosomal RNA (rRNA) sequences of prokaryotes clustered into two groups that were at least as divergent from each other as they were from the rRNA genes of eukaryotes, demonstrating that the prokaryotes comprised two distantly related lineages, the Bacteria and Archaea (Woese & Fox, 1977; Woese, Kandler, & Wheelis, 1990). The discovery of the Archaea demonstrated the power of sequence data for investigating relationships among prokaryotes, and in the intervening years, analyses of rRNA and, more recently, whole genome sequences have become standard approaches in molecular evolution and systematics. The advantages of sequences over other types of data—such as morphology, physiology and biochemistry—for inferring phylogenies are clear, particularly in the case of prokaryotes, microbial eukaryotes and viruses. 
Sequence data are highly informative, and today, millions of characters can be analysed simultaneously in single-gene or concatenated multiple sequence alignments. With contemporary (i.e. “next-generation”) sequencing technologies, new sequences are cheap and relatively easy to obtain, and the number of sequences in public databases is so large that the data needed to address many unanswered or new evolutionary questions is already available. From the biological point of view, one of the greatest strengths of sequence-based phylogenies is the capability they provide for inferring relationships among organisms for which other meaningful points of comparison do not really exist. For example, all cellular organisms synthesise proteins on a ribosome, so a tree based on rRNA can include the bacterium Escherichia coli, the archaeon Sulfolobus solfataricus and the eukaryotes Saccharomyces cerevisiae and Homo sapiens, organisms which would otherwise be difficult or impossible to fit into a single, meaningful classification. Much of the early excitement around sequence-based phylogenies was due to their potential use in constructing a universal tree of life that would include all cellular organisms (Woese et al., 1990). In fact, much progress has been made on this issue in the sequencing era (Embley & Martin, 2006), although the relationships among the major lineages of cellular life remain actively debated (Ciccarelli et al., 2006; Cox, Foster, Hirt, Harris, & Embley, 2008; Foster, Cox, & Embley, 2009; Gribaldo, Poole, Daubin, Forterre, & Brochier-Armanet, 2010; Williams, Foster, Cox, & Embley, 2013; Williams, Foster, Nye, Cox, & Embley, 2012), as will be discussed in more detail below. Another major advantage of sequence data is that it is unambiguously categorical: there are 4 possible states (A, C, G and T) for each nucleotide position, and 20 for each amino acid. 
As a result, sequences are considerably more amenable to rigorous statistical analysis than phenotypic characters, whose states must often be encoded in a somewhat arbitrary way (Stevens, 1991). This categorical character of sequence data is important: sequences may represent the richest source of information about prokaryotic evolution currently available, but as with other kinds of data, they can be positively misleading (Felsenstein, 1978) if analysed using inappropriate methods. Thus, while obtaining sequences is easier than ever before, careful phylogenetic analysis using the best available methods remains a time-consuming and potentially challenging task. With the right tools in hand, the process of building phylogenies can be relatively straightforward, but it is not automatic: each step (Figure 1), from collecting and aligning sequences to choosing the most appropriate phylogenetic model and building the trees, involves making decisions that may change the outcome. The aim of this chapter is to provide a practical guide to each of these steps and to introduce some of the best and most frequently used software for phylogenetic analysis. In order to make our discussion more concrete, we will work through an attempt to resolve one of the most interesting and controversial questions in phylogenetics: the relationship between Bacteria, Archaea and Eukarya, the three major lineages of cellular life.
Figure 1 A workflow for phylogenetic analysis. The outline of a generic approach that can be used to address many questions in phylogenetics. In this chapter, we decided to investigate the relationship between Archaea and eukaryotes. This decision motivated our selection of SSU ribosomal RNA sequences for analysis, and the properties of that dataset suggested a particular approach to alignment and phylogenetic modelling. The resulting trees were then interpreted in the light of the original question, helping to focus discussion on their most relevant features.
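The categorical nature of nucleotide data described above can be illustrated with a small sketch. It is not taken from the chapter: the function names, the ordering of the four states and the choice of a simple proportion-of-differences measure are all assumptions made for illustration, and the measure is far cruder than the substitution models the chapter goes on to discuss.

```python
# Illustrative sketch only; names and state ordering are assumptions.
NUCLEOTIDES = "ACGT"  # the four categorical states at each nucleotide position


def encode(seq):
    """Map each nucleotide to an integer state 0..3.

    Raises KeyError for any character outside A, C, G, T (gaps and
    ambiguity codes are deliberately not handled in this sketch).
    """
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    return [index[base] for base in seq.upper()]


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    a, b = encode(seq_a), encode(seq_b)
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)
```

Because every site takes one of a small, fixed set of states, comparisons such as `p_distance("ACGT", "ACGA")` need no subjective character coding, which is the point the text makes in contrast with phenotypic characters.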
Following Woese's discovery of the Archaea, the question naturally arose as to which of the prokaryotic groups (Bacteria or Archaea), if either, was more closely related to the eukaryotes. This question is complex because of the symbiogenic origins of eukaryotic cells (Sagan, 1967): all eukaryotes have a mitochondrion or mitochondria-related organelle that descends from a free-living alphaproteobacterium (Andersson et al., 1998; Esser et al., 2004), and many also possess a plastid descended from cyanobacteria (Martin et al., 2002). Thus, different compartments of eukaryotic cells have different phylogenetic origins. However, the genetic and ultrastructural similarities between mitochondria and plastids and their bacterial relatives are sufficiently strong that a broad consensus now exists on the origins of these organelles. Instead, contemporary debate focuses on the phylogenetic affinity of the eukaryotic nucleocytoplasmic lineage, which is often taken to represent the original host cell for these bacterial partners (Embley & Martin, 2006). Early analyses of rRNA led by Woese and coworkers (Woese, 1987; Woese & Fox, 1977) suggested that each of the three “domains” of life (Bacteria, Archaea and Eukarya) was monophyletic; in other words, that all Archaea, for example, are more closely related to each other than any of them are to Bacteria or eukaryotes. Combined with evidence from analyses of ancient gene duplications which suggested that the root of this “universal tree” lay on the branch leading to Bacteria (Gogarten et al., 1989; Iwabe, Kuma, Hasegawa, Osawa, & Miyata, 1989), these results led to the now-famous rooted three-domains tree (Woese et al., 1990), in which the Eukarya and Archaea form monophyletic sister groups to the exclusion of Bacteria. This tree represents the dominant hypothesis for the deepest branches of the tree of life and as such plays an important role in modern evolutionary biology. In this chapter, our goal will...
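The notion of monophyly used above can be made concrete with a minimal sketch. It is not from the chapter: the nested-tuple tree representation, the function names and the taxon labels are all hypothetical, and both toy trees are invented purely to show how the test behaves, not to represent any published result.

```python
# Illustrative sketch only; a rooted tree is a nested tuple whose leaves are strings.

def leaves(tree):
    """Return the set of leaf labels below a node."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some node in the rooted tree has exactly `group` as its leaves."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Toy rooted three-domains topology: Archaea and Eukarya are sister clades.
three_domains = (
    ("bacterium_A", "bacterium_B"),
    (("archaeon_A", "archaeon_B"), ("eukaryote_A", "eukaryote_B")),
)

# Hypothetical alternative in which the eukaryotes branch from within the Archaea.
alternative = (
    ("bacterium_A", "bacterium_B"),
    ("archaeon_A", ("archaeon_B", ("eukaryote_A", "eukaryote_B"))),
)

archaea = {"archaeon_A", "archaeon_B"}
```

Under these assumptions, `is_monophyletic(three_domains, archaea)` holds, whereas in the alternative toy tree the two archaeal leaves no longer form a clade of their own, so the same call on `alternative` does not.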