Changing perspectives in yeast research nearly a decade after the genome sequence

Kara Dolinski and David Botstein1

Lewis-Sigler Institute for Integrative Genomics, Department of Molecular Biology, Princeton University, Princeton, New Jersey, 08544 USA

Abstract

Research with budding yeast (Saccharomyces cerevisiae) has been transformed by the publication, nearly a decade ago, of the entire genome DNA sequence. The introduction of this first eukaryotic genomic sequence changed the yeast research environment significantly, not just because of dramatic progress in technical means but also because the sequence made accessible a new class of scientific questions. A central goal of yeast research remains the determination of the biological role of every sequence feature in the yeast genome. The most remarkable change has been the shift in perspective from focus on individual genes and functionalities to a more global view of how the cellular networks and systems interact and function together to produce the highly evolved organism we see today.

The first complete nucleotide sequence of a eukaryotic genome, that of budding yeast (Saccharomyces cerevisiae), was published in 1996 (Goffeau et al. 1996). It was the result of a broad international effort, stimulated by a consensus reached in the United States, nearly a decade earlier, that there should be an extraordinary 15-year effort to sequence the human genome, supported by funding of the order of $3 billion. A particularly significant feature of the National Academy of Sciences report (Alberts et al. 1988) that announced this consensus was the recommendation that the genomic sequences of a few other eukaryotes should be determined first. The eukaryotic genomes chosen were those of the leading “model organisms,” because their genomes are significantly smaller than that of the human, and because substantial and successful molecular genetics research communities had already been developed to study them. Largely because of the efforts of these communities, it was already known that many of the proteins carrying out basic cellular functions are highly conserved among all the eukaryotes, suggesting that knowing the sequences of both the model genomes and the human genome would be an important path to understanding them both. Explicitly named were yeast (Saccharomyces cerevisiae), a nematode worm (Caenorhabditis elegans), and a fruit fly (Drosophila melanogaster). The yeast genome, containing ∼12 million bp, is only 0.4% the length of the 3-billion-bp human genome, and the worm and fly genomes are ∼3.3% and 5.5% the length of the human genome, respectively (numbers from Saccharomyces Genome Database [SGD; http://www.yeastgenome.org], UCSC Golden Path [http://genome.brc.mcw.edu/goldenPath/stats.html], GSC at Washington University [http://www.genome.wustl.edu/projects/celegans/], and FlyBase [http://www.flybase.org/annot/release.html#releases], respectively).

Thus, the yeast genome became the pioneer eukaryotic genome, and the yeast research community was the first to profit from knowledge of the complete genome sequence. A dramatic transformation of yeast research ensued that presaged similar transformations of research in the other model organisms, the mouse and the human, as each of these genome sequences became available. The transformation began with technical improvements that greatly accelerated research, especially any research involving identification of pieces of DNA cloned, for example, after a biological selection from clone libraries. Whereas before the sequence, yeast researchers identified clones by mainly genetic and/or physical mapping methods, now a single sequence run sufficed. Technologies unimaginable before (e.g., DNA microarrays containing each and every yeast gene) became commonplace. The same European-led consortium that initiated the sequencing effort undertook to produce the deletion of every yeast open reading frame (ORF) (Winzeler et al. 1999; Giaever et al. 2002), and development of a whole class of genome-scale genetic methods began, a development that is still far from complete.

Today we are seeing the beginning of an even more profound transformation of yeast research, one that is more than technical. The availability of the entire genome sequence has made it possible to ask new kinds of research questions, questions that can be answered only when one has truly comprehensive information about an organism. For example, once the entire genome sequence became known, it became possible, for the first time, to study expression of all the genes at once, where before one could study genes only a few at a time. The very idea of what constitutes “specificity” has been changed by the ability to study expression of all the genes without exception. It is now routine to enumerate, in a single experiment, all the genes in an organism that respond to a specific stimulus or stress.

Comparative analyses of the complete genome sequences of the yeast, worm, fly, mouse, and human genomes forcefully validated the expectation of extreme conservation of sequence and function over evolutionary time (see Chervitz et al. 1998; Rubin et al. 2000; Lander et al. 2001; Venter et al. 2001). There has been, as a result, a “grand unification” in molecular biology, as it became clear that, at least for proteins, sequence similarity more often than not leads to an unambiguous assessment of functional similarity. Thus, through comparative genomics much, if not quite all, of the experimental work elucidating gene or protein function done in one organism illuminates them all, and transfer of annotation from the organism where the experimental data were collected to other organisms has become routine. Since yeast is still, for many basic cellular functions, the most tractable experimental system, much of the annotation of basic cellular functions in all eukaryotes, including especially the human, can be traced back to experiments done in yeast. As described below, the genome databases, especially Gene Ontology (GO), are continuing to facilitate the transfer of annotation so that a newly discovered gene function in yeast soon appears as an annotation for the orthologs in the other eukaryotes.

From the parts list to the system level: Goals of post-genome-sequence yeast research

The most obvious goal of genomic science arose directly from knowing complete genomic sequences: to decipher, annotate, and understand the role of each and every feature of the DNA sequence in the life of the organism. Put another way, we want to understand the reasons that each genomic DNA sequence feature was selected over evolutionary time.

The knowledge of the entire genomic sequences of organisms has motivated a new kind of biological analysis that looks beyond individual genes to the ensemble of all the genes. A second, more ambitious goal of genomic science has thereby emerged: to understand not only what every gene and gene product does for the organism but also how all of these genes, gene products, and functions and their regulation interact together to produce the properties of the organism. The ultimate aim of this new “system-level” biological research (Hartwell et al. 1999; Ideker et al. 2001a), not really practical before genomic sequences, is to account for and to model, in a fully quantitative way, not just how each of the genes participates in the biology of the organism but also how their interactions are controlled to maintain homeostasis over the entire life cycle and environmental range experienced by the organism.

Saccharomyces cerevisiae, by dint of its small genome and its experimental tractability, has become the pioneer in a new era of biology, where all the individual “parts” of the organism (conveniently encoded in the genome) are specified, and where research is aimed at understanding fully and quantitatively how the ensemble of the parts, subassemblies, and regulatory networks work together to produce the robust living organisms we see.

Genes and their biological roles

In 1995, the total number of yeast genes about whose function something was understood was of the order of two thousand (Hughes et al. 2004); virtually all that was known about these genes derived directly from experiments by inference from mutant phenotypes or direct biochemical assays. The information about these genes began to be collected into the SGD (www.yeastgenome.org) (Balakrishnan et al. 2005) at about this time; as the genomic sequence became available, other databases also came into existence (see Table 1; Mewes et al. 1997; Costanzo et al. 2000; Guldener et al. 2005). SGD remains the primary source for updates to the genome sequence, primary annotation, gene names, and nomenclature, as well as the basic functional information about each gene, which is continually culled from the experimental literature. In what follows we refer the reader to SGD for details about individual genes and global statistics about genes.

Table 1.

Some sources of functional genomics data collections for S. cerevisiae

Currently, the best estimate for the number of yeast ORFs that actually encode proteins is 5773, of which 1474 (25%) are listed by SGD as “uncharacterized.” This means that something biological is known about 4299 yeast genes, approximately a twofold increase since the genome sequence became available. However, for many genes the biological information available is still very limited, and quite a bit of the new information about biological function derives from sequence comparisons; a gene never studied in yeast but apparently orthologous to a well-characterized gene in another organism acquires functional annotation by inference.

Development of diverse genome-scale experimental technologies raised considerable expectations of an acceleration in the discovery rate for the functional roles of individual poorly characterized genes. Surprisingly, this promise still remains largely unfulfilled. Although each of the technologies (i.e., two-hybrid analyses, synthetic lethality methods, and gene coexpression using DNA microarrays) has had many significant successes, most of the new functional annotations of genes continue to derive from research papers describing experiments focused on just a few genes at a time. A major symptom of the problem has been the startlingly limited overlap in the predictions of different genome-scale methods. The reader is referred to an excellent recent summary of these issues (Hughes et al. 2004). Thus the most elementary goal of post-genome research, to annotate the biological role of every gene, remains a challenge.

Gene expression technology and the emergence of system-level biology

In the mid to late 1990s, technology was developed that allowed one, for the first time, to assess gene expression across the entire genome simultaneously. DNA microarrays, on which each ORF or other sequence feature is represented (DeRisi et al. 1997; Brown and Botstein 1999), have been the dominant approach used in yeast research; at the same time, Serial Analysis of Gene Expression (SAGE), an alternative sequence-based method, was also developed (Velculescu et al. 1997). Although gene expression technology has yet to make a big impact on the rate of biological annotation of yeast genes, it has nevertheless transformed yeast research by making it possible to study the gene expression associated with relatively simple biological experiments in a fully comprehensive way. This comprehensiveness has stimulated both experimental and theoretical approaches to understanding regulatory networks and other features of yeast biology at the system level (Hartwell et al. 1999; Ideker et al. 2001a). Rather than attempt a review of all this research (SGD lists, at the time of writing, 299 published genome-scale expression studies), we will limit ourselves to a few paradigm examples, often from our own experience.

Defining functional or regulatory subsystems, or “modules”

The gene expression technology provided a direct and simple way to enumerate genes whose expression responds to individual stimuli or stresses. Early examples follow: Gasch and colleagues (2000, 2001) studied all the genes responding to a number of diverse stresses, and they defined a generic “environmental stress response” that underlies all stress responses, including temperature change, starvation, oxidative stress, and radiation; Roberts and colleagues (2000) studied all the genes that respond to α-factor; and Ideker et al. (2001b) studied nutritional perturbations to cells growing on galactose. Similarly, genes with characteristic expression changes during sporulation (Chu et al. 1998) or the cell division cycle (Cho et al. 1998; Spellman et al. 1998) were systematically determined. Such studies have associated genes with each other and with particular biological activities, although further work has been required to provide detailed understanding.

However, studies of this kind have provided important new clues to which sets of genes might act together, in concert or in sequence, in a common biological process. For instance, the ribosomal genes comprise one of the most tightly coregulated groups of functionally related genes. In experiments that examine processes as diverse as sporulation (Chu et al. 1998) and the response to arsenic (Haugen et al. 2004), the ribosomal genes (both mitochondrial and nuclear) are often the most significantly coregulated genes in the data set. Genome-wide expression technology has confirmed existing components, and discovered novel components, of other major transcriptional networks in yeast, including clusters of genes involved in amino acid metabolism, energy pathways, DNA replication, and the stress response (Eisen et al. 1998; Chua et al. 2004), all of which are, as are the ribosomal proteins, frequently the most significant groups of coregulated genes in a data set. We have also learned that it is quite rare for genes to have unchanging expression levels across different experiments; for example, expression of the yeast actin (ACT1) gene, which was traditionally used as a control in Northern blots to ensure that equivalent levels of RNA were loaded in each well, changes significantly in several diverse types of microarray experiments (Fig. 1).

In many cases, later studies have not only confirmed the functional associations among genes based on coexpression but also found the regulators that are responsible; the study by Zhu and colleagues (2000) is one example. These studies not only have resulted in further experimental work, but also, as we summarize below, have stimulated many successful theoretical and analytical efforts in defining and understanding regulatory network behavior in yeast, making yeast once again the leading organism in an emerging field of biology.

Analysis and display of genome-scale data

The comprehensiveness of these methods led to a new problem: How can such a vast amount of data be analyzed, managed, and presented? Biologists were no longer able to examine these results individually; each of the aforementioned studies comprised hundreds of thousands of individual gene expression measurements. A key development in the field was applying clustering algorithms and data visualization tools to allow for analysis and presentation of the large volume of microarray results. An early approach, which is still probably one of the most popular, is the application of hierarchical clustering to group similarly expressed genes together, representing their relative expression levels graphically with colored boxes (Fig. 2A; Weinstein et al. 1997; Eisen et al. 1998; Wen et al. 1998). Several other methods of analyzing gene expression have since been applied, including self-organizing maps (SOM) (Tamayo et al. 1999), simulated annealing clustering (Lukashin and Fuchs 2001), graph-theoretic approaches (Sharan and Shamir 2000), biclustering (Cheng and Church 2000; Tanay et al. 2002; Kluger et al. 2003), and other sophisticated approaches (see Ihmels et al. 2002).
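To make the idea of hierarchical clustering concrete, the following is a minimal sketch in Python, not the implementation used in any of the cited studies. It assumes average linkage and a distance of one minus the Pearson correlation; the gene names and expression values are invented for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def hierarchical_cluster(profiles):
    """Average-linkage agglomerative clustering on 1 - Pearson correlation.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of gene names.
    """
    clusters = [(gene,) for gene in profiles]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(1.0 - pearson(profiles[g], profiles[h]) for g in a for h in b)
        return total / (len(a) * len(b))

    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters and merge it.
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical log-ratio profiles over four conditions (illustrative values only).
PROFILES = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 4.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [3.8, 3.1, 2.2, 0.9],
}

merges = hierarchical_cluster(PROFILES)
for a, b, d in merges:
    print(a, b, round(d, 4))
```

In this toy data set the two induced genes join first, the two repressed genes join next, and the two anticorrelated groups are merged last at a large distance. The order of the merges is what a dendrogram beside the colored-box display represents.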

Figure 1.

Fold change of ACT1 gene expression in microarray experiments available at SGD. This figure was generated by the SGD Expression Connection tool at http://db.yeastgenome.org/cgi-bin/expression/expressionConnection.pl. The outlying values (–10, +fivefold) are experiments from the Mnaimneh et al. (2004) data set that profiled expression in strains with essential genes under control of titratable promoters. Other conditions that led to at least fourfold change in ACT1 expression include expression during sporulation (less than fourfold) (Chu et al. 1998) and prolonged stationary phase (–4.3 fold) (Gasch et al. 2000).

Gene ontology

As genome-scale data accumulated, it quickly became clear that interpretation would depend critically on high-quality functional annotation. Without reliable underlying annotation, even the best clustering algorithm will not allow one to make sense of the data. Not only was some description needed for as many genes as possible, but that description needed to be written by using a controlled vocabulary, so that it was easy to search and find, for example, all transcription factors known in the genome. For this task, the GO was developed. GO is a structured, controlled vocabulary that describes the biological processes, functions, and locations of gene products (Ashburner et al. 2000). The GO is structured such that broad, general terms are parents of more specific terms, child terms can have more than one parent term, and these parent-child relationships are captured in the ontology (Fig. 3).
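The parent-child structure described here can be sketched as a directed acyclic graph. The fragment below is illustrative only: the term relationships are a simplified, assumed excerpt and not the actual GO, and the gene names are hypothetical. It shows why a child term with several parents matters: a gene annotated to a specific term is found by a query on any of that term's ancestors.

```python
# Assumed, simplified fragment of a process ontology: child term -> parent terms.
PARENTS = {
    "biological process": [],
    "metabolism": ["biological process"],
    "cell cycle": ["biological process"],
    "DNA metabolism": ["metabolism"],
    "DNA replication": ["DNA metabolism", "cell cycle"],  # two parents
}

# Hypothetical annotations: gene -> terms it is directly annotated to.
ANNOTATIONS = {
    "geneX": {"DNA replication"},
    "geneY": {"cell cycle"},
    "geneZ": {"metabolism"},
}

def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links from `term`."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

def genes_annotated_to(term, annotations=ANNOTATIONS):
    """Genes annotated to `term` directly or to any of its descendants."""
    return {
        gene
        for gene, terms in annotations.items()
        if any(t == term or term in ancestors(t) for t in terms)
    }

print(sorted(ancestors("DNA replication")))
print(sorted(genes_annotated_to("cell cycle")))
print(sorted(genes_annotated_to("metabolism")))
```

Because the hypothetical geneX is annotated to a term with two parents, it is returned by queries on both the "cell cycle" and the "metabolism" branches.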

The initial goal of GO was to improve queries of orthologous genes across different organism databases without having to deal with nomenclature differences. For example, with GO, one would not need to know that “CDC25” is the yeast homolog of “son of sevenless” to link from the yeast CDC25 entry in SGD to the son of sevenless entry in FlyBase; instead one could bridge the databases by using the GO-controlled vocabulary term “Ras guanyl-nucleotide exchange factor.” Soon after its inception, GO was also applied to provide succinct descriptions of genes in expression clusters, facilitating visualization of the biological importance of the data during analysis by, for instance, the Eisen TreeView program (Eisen et al. 1998; Saldanha 2004).

Computing and validating inferences from experiments

The real power of GO emerged when computational biologists began to use it to validate inferences made from analysis of expression data. GO Term Finder is an early tool that facilitates automated analysis of the biological roles of groups of genes. It searches for significant shared GO terms used to describe any group of genes, such as genes that are coexpressed in a microarray experiment (Boyle et al. 2004). For example, a tight cluster of coexpressed genes from the aforementioned study of stress responses (Gasch et al. 2000) is shown in Figure 2A. When these genes are analyzed by using the GO Term Finder, it is clear that this cluster consists of genes involved in proteolysis (Fig. 2B). There are several other tools that similarly utilize GO for analyzing groups of genes (see the GO home page at http://www.geneontology.org). With these tools in hand, yeast researchers began to make several new biological insights.
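The kind of significance calculation involved can be sketched with a hypergeometric test, which is commonly used for term enrichment. The code below is an assumption-laden illustration rather than the GO Term Finder implementation: the function name and all counts are hypothetical, and no correction for testing many terms is applied.

```python
from math import comb

def hypergeom_pvalue(genome_size, annotated_in_genome, cluster_size, annotated_in_cluster):
    """Probability of seeing at least `annotated_in_cluster` genes carrying a term
    in a cluster of `cluster_size` genes drawn at random, without replacement,
    from a genome in which `annotated_in_genome` genes carry that term."""
    upper = min(annotated_in_genome, cluster_size)
    favorable = sum(
        comb(annotated_in_genome, i)
        * comb(genome_size - annotated_in_genome, cluster_size - i)
        for i in range(annotated_in_cluster, upper + 1)
    )
    return favorable / comb(genome_size, cluster_size)

# Hypothetical counts: 6000 genes, 100 annotated to a term,
# a cluster of 20 genes of which 10 carry the term.
p = hypergeom_pvalue(6000, 100, 20, 10)
print(p)
```

With these invented counts only about one annotated gene would be expected in every three such clusters by chance, so ten out of twenty yields a very small P-value. In practice the P-value would also need adjustment for the number of GO terms tested.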

Figure 2.

(A) Display of a group of genes that exhibit similar expression during the DNA damage response as described in Gasch et al. (2001). Red indicates increased expression, while green indicates decreased expression levels. Each gene's expression is represented by a single row of colored boxes, while each sample is represented by a single column. (B) GO Term Finder results with the cluster from A as input; the most significant enriched GO Term is “proteolysis and peptidolysis,” with a P-value of 1.26 × 10^-28.

Insights into the global transcriptional network

These advances stimulated the use of DNA microarray technology to address questions of global gene regulation, using what has turned out to be a powerful combination of computation and experiment (for a recent review, see Chua et al. 2004). Computational methods were developed that go well beyond clustering coregulated genes to identify short DNA sequences that might be cis-acting regulatory elements shared among the coregulated genes and the transcription factors that may bind them; an early example is that of Bussemaker et al. (2001). In later studies, probabilistic methods were used to identify regulatory “modules” by using combinations of upstream regulatory sequence and expression data (Wang et al. 2002, 2005; Segal et al. 2003).

Figure 3.

The Gene Ontology: a structured, controlled vocabulary to describe gene products. The diagram above is a small part of the biological process ontology. Parent terms are yellow boxes, child terms are blue boxes, and a sampling of genes associated with each GO term are in italics. Note that child terms can have multiple parents, allowing for more accurate representation of complex relationships among different biological processes.

Comparative genomics has also been used successfully to examine transcriptional networks. In two independent comparative genomics studies, conserved regulatory elements were identified by aligning the intergenic regions of closely related Saccharomyces species and then searching within them for conserved sequence motifs (Cliften et al. 2003; Kellis et al. 2003). Pritsker et al. (2004) identified putative transcription factor binding motifs by using Gibbs sampling to search for significant regulatory elements within promoters of orthologous genes from 13 hemiascomycetous yeasts.

Transcription factor binding sites are now also being globally identified via experimental means by using a combination of chromatin-immunoprecipitation (chIP) and microarray technology. In these experiments, DNA bound to a given transcription factor is isolated by chIP of the epitope-tagged transcription factor; these DNA fragments are then hybridized to a microarray of intergenic regions (chip). These chIP-chip experiments have now identified binding sites for many transcription factors involved in a variety of biological processes (see Khodursky et al. 2000; Iyer et al. 2001; Lieb et al. 2001; Raghuraman et al. 2001; Kurdistani et al. 2002; Lee et al. 2002; Ng et al. 2003).

With traditional microarrays, chIP-chip experiments, and genomic sequence available, several methods have been developed to elucidate transcriptional networks by integrating these different data sources. Lee et al. (2002), Bar Joseph et al. (2003), and Gao et al. (2004) are all prominent examples where chIP-chip and expression data were combined to generate regulatory modules. In the most comprehensive study to date, Harbison and colleagues (2004) used a combination of experimental (chIP-chip), comparative genomics, and motif discovery methods to identify putative DNA binding sites for >200 transcription factors in yeast.

Most impressively, Beer and Tavazoie (2004) recently applied a probabilistic framework to predict gene expression based on sequence information. In their elegant approach, a Bayesian network takes as input different properties of sequence elements upstream of a gene and outputs the likelihood of that gene exhibiting a particular expression pattern. Their combinatorial rules correctly predicted patterns of gene expression for 73% of the yeast genes studied (1898 of 2587 genes in five test sets), with 27% predicted to be in an expression pattern different from their actual expression pattern; the P-value for the prediction of 73% is <10^-127. They were then able to use their method to predict regulatory elements in the worm.

Interaction networks

With the completed genome also came the opportunity to generate comprehensive genetic and physical interaction maps. Synthetic Genetic Array (SGA) analysis (Tong et al. 2001) uses the comprehensive ORF-deletion collection (Winzeler et al. 1999; Giaever et al. 2002) and a clever genetic selection as the basis for systematically generating double mutants. Assessment of the growth properties of the double mutants generates large-scale genetic interaction maps based on the concept of “synthetic lethality.” Synthetic lethality, described first in Drosophila by Dobzhansky (1946) and Sturtevant (1956) and in yeast by Novick et al. (1989), occurs when the combination of two mutations causes lethality, while neither mutation by itself is lethal. The SGA technique has provided a means to perform genetic interaction analysis on a large scale, yielding a genetic interaction network containing ∼1000 genes and ∼4000 interactions (Tong et al. 2004; see also Pan et al. 2004). Large-scale protein-protein interaction networks have also been generated by using the two-hybrid system and mass spectrometry (Ito et al. 2000, 2001; Uetz et al. 2000; Gavin et al. 2002; Ho et al. 2002; Hazbun et al. 2003).
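The definition of synthetic lethality given above translates directly into a small screen over viability data. The sketch below uses invented gene names and viability calls; it illustrates only the logic of the definition, not the SGA procedure itself.

```python
from itertools import combinations

def synthetic_lethal_pairs(single_viable, double_viable):
    """Return gene pairs whose double mutant is inviable although each
    single mutant is viable. Untested double mutants are assumed viable."""
    pairs = []
    for a, b in combinations(sorted(single_viable), 2):
        both_singles_viable = single_viable[a] and single_viable[b]
        double_ok = double_viable.get(frozenset((a, b)), True)
        if both_singles_viable and not double_ok:
            pairs.append((a, b))
    return pairs

# Hypothetical viability calls: g4 is an essential gene (single deletion is lethal).
SINGLE_VIABLE = {"g1": True, "g2": True, "g3": True, "g4": False}
DOUBLE_VIABLE = {
    frozenset(("g1", "g2")): False,
    frozenset(("g1", "g3")): True,
    frozenset(("g2", "g3")): False,
    frozenset(("g3", "g4")): False,  # not synthetic: g4 alone is already lethal
}

interactions = synthetic_lethal_pairs(SINGLE_VIABLE, DOUBLE_VIABLE)
print(interactions)
```

The pair involving the essential gene is excluded because its lethality is already explained by a single mutation. This is also why deletion-based screens of this kind are restricted to nonessential genes.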

These results have generated increasing theoretical efforts aimed at characterizing regulatory and functional interaction networks. For example, Yeger-Lotem and colleagues (2004) adapted and extended methods that Shen-Orr et al. (2002) applied in Escherichia coli to discover significant network motifs in a combined network of regulatory and physical interactions. Shen-Orr et al. (2002) examined interactions between E. coli transcription factors and the operons that they regulate to discover “network motifs,” or patterns of connections among genes in the network that occur significantly more frequently than in randomized networks. Yeger-Lotem and colleagues (2004) extended this concept to analyze a network comprising both transcription factor-target interactions and protein-protein physical interactions. They found a few two- and three-protein motifs (e.g., two transcription factors interacting to regulate a third gene) and many (63) four-protein motifs, which in almost all cases were combinations of the smaller motifs. These results suggest that smaller motifs serve as building blocks to construct the larger cellular network.
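The comparison against randomized networks can be sketched as follows. This is an illustrative simplification, not the published procedure: it counts a single motif type, the feed-forward loop, in a small invented regulatory network, and it builds the null model by swapping edge targets so that each node keeps its number of incoming and outgoing edges.

```python
import random
from collections import Counter

def count_ffl(edges):
    """Count feed-forward loops: X->Y, Y->Z, and X->Z with X, Y, Z distinct."""
    edge_set = set(edges)
    targets = {}
    for x, y in edge_set:
        targets.setdefault(x, set()).add(y)
    return sum(
        1
        for x, y in edge_set
        for z in targets.get(y, ())
        if len({x, y, z}) == 3 and (x, z) in edge_set
    )

def randomize(edges, swaps, rng):
    """Degree-preserving randomization: repeatedly pick two edges a->b and c->d
    and rewire them to a->d and c->b unless that would create a self-loop
    or a duplicate edge."""
    edges = list(edges)
    present = set(edges)
    for _ in range(swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        if a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges

# Hypothetical regulator -> target edges (names are invented).
EDGES = [
    ("TF1", "TF2"), ("TF1", "G1"), ("TF2", "G1"), ("TF1", "G2"),
    ("TF2", "G2"), ("TF3", "G3"), ("TF3", "G4"), ("TF4", "G3"),
]

observed = count_ffl(EDGES)
rng = random.Random(0)  # fixed seed so the run is repeatable
null_counts = [count_ffl(randomize(EDGES, 100, rng)) for _ in range(200)]
# Fraction of randomized networks with at least as many loops as observed.
empirical_p = sum(c >= observed for c in null_counts) / len(null_counts)
print(observed, empirical_p)
```

A motif is considered significant when the observed count is rarely matched in the randomized ensemble. A network this small is only a demonstration of the procedure, and the same scheme would have to be repeated for every motif type and for mixed edge types to approach the analyses described above.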

While already having generated much useful biological data, the large-scale methods for profiling genetic and physical interactions for the entire genome (i.e., all 6000 genes by all 6000 genes) are still labor intensive and, as with all high-throughput methods, generate both false-positive and false-negative results. Also, as noted above, there is distressingly poor concordance among results of the several large-scale studies (Hughes et al. 2004). In the studies where authors systematically compared their high-throughput results with those from individual experiments, the amount of overlap between the two is surprisingly low (Ito et al. 2001; Ho et al. 2002). This type of benchmarking, while extremely useful, has been rare thus far because of the lack of a comprehensive collection of individual experimental results culled from the literature. Computational methods that integrate multiple types of experimental evidence to verify results, associate interactions with probability scores, and predict novel interactions or gene functions based on these combined interactions can address some of the limitations inherent in these high-throughput methods. Tirosh and Barkai (2005) described a method to verify protein-protein interactions by examining whether orthologs of the interaction partners are coexpressed. Similarly, Yu et al. (2004) assessed whether protein-protein or DNA-protein interactions can be confidently transferred from one organism to another by examining “joint” sequence conservation of the interacting proteins. In another approach, Bayesian frameworks have been applied to integrate different types of functional genomics data (e.g., genetic and physical interactions and correlated expression) to generate the probability of a functional link for all possible gene pairs (Jansen et al. 2003; Troyanskaya et al. 2003; Lee et al. 2004). These pair-wise correlations can then be used to cluster functionally related genes together and thus can predict functions for previously uncharacterized genes. A different computational approach uses probabilistic decision trees to integrate different types of data in order to predict phenotypes (King et al. 2003) or synthetic lethal interactions (Wong et al. 2004), also meant to lead to functional predictions for uncharacterized genes. For all of these methods, the number of confirmed successful predictions for as-yet-uncharacterized genes is still too small to constitute a robust test of their efficacy.
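One simple way to combine evidence types in a Bayesian manner is a naive Bayes calculation, in which each observed data type multiplies the prior odds of a functional link by a likelihood ratio. The sketch below is an assumed simplification rather than the method of any cited study; the prior and the likelihood ratios are invented, and the evidence types are treated as conditionally independent.

```python
def posterior_probability(prior, likelihood_ratios):
    """Combine a prior probability of a functional link with likelihood ratios
    for each observed evidence type, assuming the evidence types are
    conditionally independent (naive Bayes)."""
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)

# Hypothetical likelihood ratios:
# P(evidence | functionally linked) / P(evidence | not linked).
LIKELIHOOD_RATIOS = {
    "two_hybrid": 5.0,
    "coexpression": 3.0,
    "synthetic_lethal": 8.0,
}
PRIOR = 0.01  # assumed fraction of gene pairs that are functionally linked

one_source = posterior_probability(PRIOR, [LIKELIHOOD_RATIOS["two_hybrid"]])
all_sources = posterior_probability(PRIOR, LIKELIHOOD_RATIOS.values())
print(round(one_source, 3), round(all_sources, 3))
```

A single noisy observation moves the probability only slightly above the prior, whereas agreement among several independent data types can raise it substantially. This is the motivation for integrating data sets that individually show poor concordance. In a real application the likelihood ratios would be estimated from a trusted reference set of linked and unlinked gene pairs.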

Public resources for yeast genomics

We are confident that the genome-scale experimentation and the integrative analytical approaches sketched above will provide increasing insights into the biology of yeast and, as we have indicated, other eukaryotes. However, if the data are not publicly available in forms that are machine parsable, these studies will not reach their full potential in terms of generating useful biological knowledge. Toward that end, standards for several types of functional genomics data have been created. The Open Biological Ontologies (OBO; http://obo.sourceforge.net/) site is a Web page that provides links to various standards and controlled vocabulary projects, including the Microarray Gene Expression Data (MGED) Society (Spellman et al. 2002; Causton and Game 2003), the GO project (Ashburner et al. 2000), the Proteomics Standards Initiative (standards for protein-protein interactions and mass spectrometry data) (Hermjakob et al. 2004a), and BioPAX (a common exchange format for pathways data).

In addition to data format standards, public databases that provide data in these formats for bulk download are also needed. Table 1 lists some of the public databases that provide genomics data sets for bulk download by various means. The Generic Model Organism Database (GMOD) project is a collaboration among the model organism databases to develop reusable software components suitable for sharing across different database groups; many useful, freely available software components are available at the GMOD Web site (http://www.gmod.org).

Conclusion and some thoughts about the future

In the decade since the release of the yeast genome DNA sequence, there has been the expected change in the technology of yeast research as well as a rather surprising change in its goals. Indeed, as we have outlined, most of the new understanding of individual yeast gene functions has come from comparative genomics and relatively little from the high-throughput genomic technologies. The latter have, however, fueled the changes in goals, from a focus on individual genes and their interactions to a focus on the system-level transactions that make the robustly functioning organisms we find in nature.

The future of genome-scale technologies is, nevertheless, very promising. It is not clear whether the slow rate at which new annotations are verified is caused by problems in the data, by problems in data analysis and representation, or simply by a lack of focus on the need for such verification. Some methods, now in early stages of development, will no doubt help: Among these are methods based on natural variation (examples include Brem et al. 2002; Steinmetz et al. 2002; Fay et al. 2004), methods that are not limited to nonessential genes (e.g., synthetic lethality with conditional alleles) (see Tong et al. 2004) or titratable promoter alleles (Mnaimneh et al. 2004), methods that study the locations and movements of intracellular molecules (Ghaemmaghami et al. 2003; Huh et al. 2003), and methods that use more biological information from other species (for example, Harbison et al. 2004).

We seem to be just at the dawn of the ability to construct truly quantitative, let alone comprehensive, models of functional and regulatory network interactions at the system level. The apparently simplest case might well be understanding metabolism at this level (a nascent field already being called “metabolomics”; see Famili et al. 2003; Forster et al. 2003; for a review, see Smedsgaard and Nielsen 2005). To this end, it is clear that we lack most of the required basic measurements, such as the concentrations of metabolites in real time after perturbations in the style of Gasch et al. (2000) and Ideker et al. (2001b). Fortunately, in the post-genome-sequence era, it is much easier to acquire this kind of information on a comprehensive scale, and we believe that this will be the path forward. Another challenge of this nature is to understand the basis on which selection acts on the ensemble of genes, proteins, networks, and systems to produce organisms capable of surviving in new environments.

Finally, there remains the eternal issue of verification. We expect that the need for tests of hypotheses generated by genome-scale experiments and quantitative models will persist for a very long time. As has always been the case, every model (and the data used to generate it) must be tested, and to be tested, it must be specified in full and available to the public. The yeast community has an excellent record in this regard, one that we believe is a major reason that yeast continues to be the very model of a model organism.

Footnotes

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3727505.

  • 1 Corresponding author. E-mail botstein{at}princeton.edu; fax (609) 258-7070.

References
