Synthetic biology projects in vitro

Anthony C. Forster(1,3) and George M. Church(2,3)

(1) Department of Pharmacology and Vanderbilt Institute of Chemical Biology, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
(2) Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA

Abstract

Advances in the in vitro synthesis and evolution of DNA, RNA, and polypeptides are accelerating the construction of biopolymers, pathways, and organisms with novel functions. Known functions are being integrated and debugged with the aim of synthesizing life-like systems. The goals are knowledge, tools, smart materials, and therapies.

Synthetic biology projects (SBPs)

The basic elements of chemistry and biology are few, but the synthetic combinations are unlimited and awe-inspiring. The first international conference on synthetic biology charted its goals as understanding and utilizing life’s diverse solutions to process information, materials, and energy (Silver and Way 2004) (http://syntheticbiology.org). As a bonus, genetic systems are biocompatible, renewable, and can be optimized by Darwinian selections. SBPs entail the complex manipulation of replicating systems, ranging from the sophisticated genetic engineering of organisms (Zimmer 2003; Ferber 2004; Gibbs 2004) to the chemical synthesis of unnatural replication (Rawls 2000; Szostak et al. 2001; Benner and Sismour 2005). It thus seems appropriate to divide synthetic biology into two classes, in vivo and in vitro. In vitro SBPs have received considerably less attention, so we focus on them here after introducing in vivo SBPs to illustrate the differences between the two classes.

In vivo SBPs

In vivo SBPs mostly involve bacterial engineering, have diverse goals, and are generally more suited than in vitro SBPs for large-scale production/conversion of materials. Neobiotic constructions with new functions now encompass redesigned metabolic pathways for pollution remediation (Pieper and Reineke 2000) and for synthesis of drugs (Pfeifer et al. 2001) and plastics (http://www.metabolix.com). Multiplex regulatory circuits have been pieced together to test theories of pathway control, alter phenotypes, and generate biosensors (Guet et al. 2002). Significant reduction (15%) of a bacterial genome has proved possible by leveraging a very small number of deletion oligodeoxyribonucleotides (oligos) to reduce recombination and increase electroporation efficiency (Posfai et al. 2006). System designs can be combinatorial or modular, inspired by electronic circuits (http://parts.mit.edu/), although not intended to replace them. Even the parts themselves can be redesigned: e.g., Escherichia coli and yeast have been endowed with expanded genetic codes by engineering an orthogonal suppressor tRNA/aminoacyl-tRNA synthetase/unnatural amino acid triple (Chin et al. 2003); this will lead to new pharmaceuticals (http://www.ambrx.com).

In contrast to in vitro SBPs, some in vivo SBPs require strict safety regulations. The synthesis of poliovirus (Cello et al. 2002) from oligos heralded not only designer viruses/cells for vaccines and gene/cell therapies, but also new dangers. This stimulated an action plan to regulate world suppliers of DNA synthesizers, DNA precursors, and oligos (Church 2004). Ethical and safety issues have been, and must continue to be, regulated (Cho et al. 1999).

Several groups have proposed to create bacteria with chromosomes synthesized entirely from synthetic oligos. This might be done stepwise (Posfai et al. 2006) or by inactivating the endogenous bacterial chromosome and then somehow transforming and rebooting the bacterium with an entire in vitro-synthesized genome. One goal is to test theories of the minimum number of genes required for bacterial replication (C. Venter, H. Smith, and C. Hutchison, unpubl.; but, for review, see Cho et al. 1999; Zimmer 2003). This exploits lists of indispensable genes of a minimal Mycoplasma bacterium (Hutchison et al. 1999; Glass et al. 2006). A major hurdle is that dispensability of single genes does not assure viability of combinations of deletions (e.g., pairwise synthetic lethals) (Zimmer 2003). A second goal is to synthesize bacteria with genomes that are globally mutated rather than severely truncated. Several specific applications can be envisioned. For example, freeing up of certain codons would facilitate extending the code with unnatural amino acids (Anderson et al. 2004). Replacing all 378 UAG stop codons of E. coli with UAA stop codons might allow deletion of release factor 1 (RF1), thereby enabling insertion of an unnatural amino acid at a plasmid-encoded UAG codon without competition from RF1 or chromosomal UAG codons. Global switching of nonsynonymous codons should prevent functional exchanges of genetic material with natural species, decreasing the chance of environmental contamination.
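The stop-codon recoding idea can be illustrated with a minimal sketch. It assumes each gene is supplied as an in-frame DNA coding sequence running from start codon to stop codon; the function name and the toy sequences are hypothetical and are not taken from any genome.

```python
def recode_amber_stops(coding_sequences):
    """Swap a terminal TAG (UAG in mRNA) stop codon for TAA (UAA).

    Hypothetical helper for illustration only. Each input must be an
    in-frame DNA coding sequence ending with its stop codon.
    Returns the recoded sequences and the number of genes changed.
    """
    recoded = []
    changed = 0
    for cds in coding_sequences:
        seq = cds.upper()
        if len(seq) % 3 != 0:
            raise ValueError("coding sequence length must be a multiple of 3")
        if seq.endswith("TAG"):
            # Only the final codon is touched; the protein sequence is unchanged.
            seq = seq[:-3] + "TAA"
            changed += 1
        recoded.append(seq)
    return recoded, changed


# Toy sequences (made up): two end in TAG, one already ends in TAA.
toy_genes = ["ATGGCTTAG", "ATGAAATAA", "ATGCCCGGGTAG"]
new_genes, n_changed = recode_amber_stops(toy_genes)
print(new_genes, n_changed)
```

A real recoding project would also have to handle overlapping genes and regulatory sequences that span a stop codon, which this sketch ignores.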

In vitro SBPs

Time and again, decreasing the dependence on cells has increased engineering flexibility with biopolymers and self-copying systems. For example, evolution has been engineered in vitro using very alien designs that accelerate selection of almost any imaginable molecular function (Brody and Gold 2000; Griffiths and Tawfik 2000; Takahashi et al. 2003; Halpin and Harbury 2004). Decreasing dependence on cells also facilitates increased understanding. The increased engineering flexibility and understanding that will emerge if replication can be completely weaned from cells is enticing. Though this may seem inevitable, it is sobering to remember that polymerase synthesis requires a translation apparatus that is dependent for its construction and function on more than 100 macromolecules, many of which are post-transcriptionally modified.

Progress in synthesizing biopolymers and replicating systems with decreasing dependence on cells will now be discussed in detail. Such a “synthesis” of current in vitro SBPs showcases many exciting achievements and goals. It also reveals steps for generalization and integration of methods (e.g., adaptation to physiological conditions) that may ultimately allow cell-free self-replication from small molecule substrates.

DNA synthesis, replication, and evolution

The fundamental tool of synthetic biology is undoubtedly gene synthesis. Gene syntheses by recombinant DNA cloning and polymerase chain reaction (PCR) are now being rivaled by tour-de-force raw synthesis from oligos. Established polymerase and ligation methods have enabled synthesis of a 7.5-kb virus (Cello et al. 2002), a 5.4-kb virus in 2 wk using biological selection (Smith et al. 2003), a 32-kb operon (Kodumal et al. 2004), and e-mail-order genes. New approaches overcome scalability and cost limitations by avoiding column-based synthesis and gel purification altogether (Fig. 1) (Richmond et al. 2004; Tian et al. 2004; Zhou et al. 2004). Thousands of oligos are synthesized on a photo-programmable chip, released, amplified by PCR, enriched for unmutated oligos by hybridization to complementary oligos, and assembled by PCR. Mutations can be further culled by binding to the DNA mismatch-binding protein MutS, enabling gene syntheses with an error rate of about one per 10,000 bases (Carr et al. 2004). Stitching together DNA constructs up to 100 Mb long is feasible via homologous recombination. Chemical synthesis of genome segments without templates (or their hosts) is far more flexible than older approaches. Codons can be globally altered to maximize translation (Tian et al. 2004), epigenetic base-modification patterns can be tailored, and totally new protein designs can be tested.
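To get a feel for what an error rate of about one per 10,000 bases means for a whole construct, the following sketch estimates the fraction of error-free molecules. It assumes errors occur independently and uniformly at every base, which is a simplification and not a claim from the cited work. The function name is hypothetical, and the construct lengths are borrowed from the examples above only for scale.

```python
def error_free_fraction(length_bases, error_rate_per_base):
    """Expected fraction of molecules with no errors at all.

    Assumes independent, uniform per-base errors (a simplifying
    assumption made for this illustration).
    """
    if not 0.0 <= error_rate_per_base <= 1.0:
        raise ValueError("error rate must be between 0 and 1")
    return (1.0 - error_rate_per_base) ** length_bases


# Lengths in bases; the 1e-4 rate is the figure quoted for MutS-based culling.
for length in (1_000, 5_400, 7_500, 32_000):
    p = error_free_fraction(length, 1e-4)
    print(f"{length:>6} bases: {p:.1%} error-free, "
          f"about {1 / p:.1f} clones screened per correct clone")
```

Under this model the error-free fraction falls off exponentially with length, which is one reason long constructs are usually built from sequence-verified shorter pieces.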

Figure 1.

Semi-automated, inexpensive synthesis of genes from oligos made on chips. Hundreds of thousands of different oligos are synthesized on a chip, released, and amplified by PCR (PCR primers are shown by the thickest lines). Construction oligos are cut (thinnest lines) with type IIS restriction endonucleases to release internal double-stranded hybrids, some of which contain nucleotide base errors (red stars). Only the correct construction oligos hybridize well with complementary selection oligos (purple) immobilized via biotin (B) binding to avidin. Eluted oligos are assembled by PCR. Base errors (yellow stars) are further reduced by sequestration with MutS protein (green).

Cell-free assembly and replication of large DNA structures is challenging, but rewarding. PCR is broadly used, but it is error-prone, confined to products <40 kb, and difficult to integrate with temperature-sensitive biological reactions. More physiological and accurate is strand-displacement amplification of circular DNA into concatemeric DNAs, with a restriction enzyme and DNA ligase used to regenerate monomeric circles (Dahl et al. 2004). For complete integration with biological systems, an alternative processing scheme that does not require chemical synthesis of oligos is needed. For example, DNA synthesis could be primed by RNA and processing achieved by homologous recombination.

Though DNA is purposely nonreactive in life, synthetic biologists have different ideas: They use in vitro-directed evolution of DNA to evolve efficient and useful DNA catalysts (Santoro and Joyce 1997). Diagnostics is also being engineered radically: Signal amplification can be made more specific using unnatural base pairs (Benner and Sismour 2005).

RNA synthesis, replication, and evolution

In contrast to methods for synthesizing other biopolymers, RNA is usually synthesized without cells or cell extracts. Typically, research needs can be met by run-off transcription of synthetic oligos, PCR products, or linearized plasmids using coliphage T7 RNA polymerase, but there are limitations. Inflexibility of the 5′-terminal first few nucleotides and heterogeneity of the 3′ terminus can be addressed for very short RNAs by chemical synthesis. Alternatively, these restrictions and the need for one linear template per RNA might be overcome in a generalizable manner by using short class II T7 terminators (Lyakhov et al. 1998) in tandem, and trimming off extraneous sequences by ribozymes (Forster and Church 2006). A much larger obstacle to in vitro synthesis for research studies is incorporation of the modified nucleosides found mainly in tRNAs and rRNAs. Herculean total chemical synthesis has only reconstructed one tRNA (Wang 1984), is still impractical for anything but short RNAs containing the simpler of the natural modifications, and the modifications are not replicable. A generalizable solution would be provision of purified RNA modification enzymes. Although only about half of the estimated 60–70 E. coli genes for RNA modification (Bjork 1995) have been identified, new technologies will speed identification of the rest (Ikeuchi et al. 2006). Essentially, all unknown modification enzymes might be found by testing the effect of conditional mutations in every E. coli gene on the modification patterns of total rRNAs and tRNAs, as measured by mass spectrometry.

In vitro selection and evolution of RNAs has been more useful for isolating new ligands and catalysts than might have been expected from libraries limited to anionic polymers containing only four different, somewhat inert nucleotides. Its success testifies to the solubility, stability, and conformational diversity of three-dimensional RNA structures and the power of readily screening up to 10^15 molecules. Cell-free replication and evolution was first achieved using coliphage Qβ RNA replicase and its single-stranded RNA substrates (Mills et al. 1967). But artificial RNA replication is severely limited by replicative mutations and by formation of inhibitory double-stranded RNA helices. This was circumvented by replacement of Qβ replicase with reverse transcriptase/PCR/T7 RNA polymerase and temporal separation of selection and replication. The result is an astonishing range of ligands (aptamers) and catalysts made of nucleic acid (Santoro and Joyce 1997; Brody and Gold 2000; Szostak et al. 2001; Murakami et al. 2006). Nuclease-resistant aptamers can also be evolved using nucleotide triphosphate analogs, providing very different alternatives to antibodies for possible sensor and therapeutic applications (Brody and Gold 2000; Benner and Sismour 2005).
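A short calculation puts a library of up to 10^15 molecules in context. With four nucleotides there are 4^N possible sequences of length N, so the sketch below finds the longest fully random region whose every sequence could, in principle, be present once. The function name is hypothetical, and real libraries sample sequence space unevenly, so this is only an upper bound.

```python
def max_fully_covered_length(library_size, alphabet_size=4):
    """Longest random region N such that alphabet_size**N <= library_size.

    Illustrative upper bound only: it assumes every molecule in the
    library is a distinct sequence.
    """
    if library_size < 1:
        raise ValueError("library size must be at least 1")
    n = 0
    while alphabet_size ** (n + 1) <= library_size:
        n += 1
    return n


n = max_fully_covered_length(10**15)
print(f"Sequences of length {n}: {4**n:.3e}")
print(f"Sequences of length {n + 1}: {4**(n + 1):.3e}")
```

Selections with longer random regions therefore sample only a small fraction of the possible sequences, yet still yield functional molecules.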

Protein synthesis and evolution

Proteins are generally too long for chemical synthesis and are prepared almost exclusively in vivo due to optimal post-translational folding, modification, and yield. Nevertheless, toxicity, solubility, and purification issues encourage improvement of in vitro translation systems. The latter are also more versatile for incorporation of amino acid analogs: More than 100 have been incorporated at a single suppression site per protein (even carbohydrate linkages) (Cornish et al. 1995; Zhang et al. 2004). Crude E. coli systems with extended expression times have been enabled by continuous dialysis, but the efficiencies of translation of even homologous genes are unpredictable (Tian et al. 2004). Purified systems may be even more versatile (see below), particularly the E. coli system, because it is well studied and contains only three initiation-factor protein subunits versus some 30 for eukaryotes. The E. coli ribosome has been reconstituted from its purified components (although the protocol is nonphysiological and aided by chaperones in vivo) (El Hage et al. 2001). Efficient translation has been reconstituted from purified E. coli ribosomes, translation factors, aminoacyl-tRNA synthetases, and total tRNA (Kung et al. 1978). Of the numerous E. coli translation components, most have been prepared recombinantly, the glaring exceptions being nearly all of the tRNAs. Though the activities of the recombinant 50S proteins are unknown, it is encouraging that the only recombinant component with very low activity is the 23S rRNA transcript lacking its 23 nucleoside modifications (this effect resides in one to six modifications) (Semrad and Green 2002). Cell-free synthesis of a translation apparatus, important for understanding self-replication and for new applications (see below) is now a conceivable goal (Forster and Church 2006).

For directed evolution of a peptide or protein, it must be linked to its mRNA either directly (ribosome display [Mattheakis et al. 1994] and mRNA display [Takahashi et al. 2003]) or indirectly (cells, viruses, or emulsions) (Griffiths and Tawfik 2000). Selections with direct linkages have the advantage of being able to screen as many as 10^15 different library members, while indirect linkages allow selections for enzymatic activity under multiple turnover conditions with untethered substrates. For the latter, the need to eventually link the product to the enzyme-coding gene has recently been overcome using double emulsions (Fig. 2) (Bernath et al. 2004). Directed evolution of polypeptides has already produced a plethora of diagnostics and therapeutics, though almost all products to date, monoclonal antibodies, come from in vivo systems. “Pure translation display” (Forster et al. 2004) can lack all 20 synthetases (and their inherent aminoacyl-tRNA proofreading and tRNA recharging activities) and release factors, freeing up all 64 codons for reassignment. In this way, ribosomes can be programmed to incorporate multiple adjacent unnatural amino acids with high fidelity (Forster et al. 2003). One goal is the directed evolution of peptide analog ligands that are heavily substituted for protease resistance using aminoacyl-tRNA substrates prepared by total or partial (Merryman and Green 2004) synthesis. The scope of analog incorporation could be increased by directed evolution of ribosomes, again facilitated by totally in vitro systems. For example, 23S rRNA mutations enable ribosomal incorporation of D-amino acids (Dedkova et al. 2003), and may eventually permit synthesis of mirror-image proteins to provide a panoply of enantiomers and diastereomers. An alternative possible route to mirror-image proteins would be by translating mirror-image mRNA with a chemo-enzymatically synthesized, self-replicating, mirror-image translation apparatus.

Figure 2.

Genetic selections using double emulsion compartments and a fluorescence-activated cell sorter (FACS). The water-in-oil emulsion typically contains only one gene variant per microdroplet, enabling noncovalent coupling of genotype to phenotype. Each gene (purple) is transcribed into an mRNA (blue) that is translated into an enzyme (red). Genes encoding enzymes that convert substrate into fluorescent product are selected by FACS through multiple cycles.
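Why a dilute gene library gives mostly single-gene microdroplets can be sketched with Poisson statistics. This assumes genes partition randomly and independently into droplets, which is a standard modeling assumption and not a statement from the cited work. The function names and the example loadings are hypothetical.

```python
import math


def empty_fraction(mean_genes_per_droplet):
    """Fraction of droplets with no gene, under Poisson loading."""
    return math.exp(-mean_genes_per_droplet)


def single_gene_fraction(mean_genes_per_droplet):
    """Among droplets that contain at least one gene, the fraction
    that contain exactly one (Poisson loading assumed)."""
    lam = mean_genes_per_droplet
    if lam <= 0:
        raise ValueError("mean genes per droplet must be positive")
    p_exactly_one = lam * math.exp(-lam)
    p_occupied = 1.0 - math.exp(-lam)
    return p_exactly_one / p_occupied


# Example loadings (assumed values, genes per droplet on average).
for lam in (0.1, 0.3, 1.0):
    print(f"mean {lam}: {empty_fraction(lam):.1%} empty, "
          f"{single_gene_fraction(lam):.1%} of occupied droplets hold one gene")
```

The trade-off is visible in the numbers: lower loading makes genotype-phenotype coupling cleaner but leaves more droplets empty.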

Small molecule synthesis and evolution

The directed evolution of small molecule ligands would be a bonanza for target validation and drug discovery, but is a substantially more complicated goal than directed evolution of nucleic acids and polypeptides. Though cellular and multicellular organisms have been evolving small molecule drugs for eons, this process is prohibitively slow. Now, a radically different in vitro approach called “DNA display” (Halpin and Harbury 2004) promises a solution (Fig. 3). Enormous libraries of small molecule ligands potentially could be evolved using a combination of features of cell-free DNA-templated organic synthesis (Gartner et al. 2004) and split-and-pool combinatorial synthesis. The proof-of-principle DNA display “translations” required hundreds of manual steps to enable screening a library of 1,000,000 protease-sensitive peptide products (Halpin and Harbury 2004). However, adaptation to much larger libraries of more drug-like compounds is feasible using microarrays, automation, and other chemistries that do not modify DNA.

Figure 3.

The encoding mechanism of DNA display. A DNA “gene” (left) is chemically “translated” in three cycles into a small molecule (D-alanine-D-leucine-norleucine) that remains covalently attached by a linker (purple) to the gene that encoded its synthesis. The system is simplified for illustrative purposes, with only three coding positions (codons) drawn. Each codon is defined by a unique 20-mer DNA sequence. Firstly, the DNA is passed through columns that, together, have oligos complementary to all possible first codon sequences, with the DNA hybridizing to the single column that has a perfect complementary sequence. A different monomer is then reacted with the DNA on each column. This cycle is then repeated for the second and third coding positions. Another example using the nine columns shown is that a DNA with codons B-D-G encodes D-Leu-D-Ala-D-Ala. In contrast to biological translation, the same codon cannot be repeated in a different codon position. Note that the number of small molecules encoded by a random DNA library increases exponentially as the number of columns increases linearly (e.g., 3^3 = 27 small molecules are encoded by 3 × 3 = 9 columns). Once the library is “translated,” it is then subjected to a binding selection, PCR amplification, and potentially DNA shuffling (not shown). All of these steps are then reiterated until DNA sequencing reveals consensus sequences for encoded small molecule binders.
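The routing logic described in this legend can be sketched in a few lines. Only three assignments come from the legend (codon B gives D-Leu, D gives D-Ala, G gives D-Ala); the remaining column-to-monomer assignments, the single-letter codon labels, and the function name are assumptions made for illustration.

```python
from itertools import product

# One lookup table per coding position; each key is one hybridization column.
# B -> D-Leu, D -> D-Ala and G -> D-Ala follow the legend; the rest are assumed.
COLUMNS = [
    {"A": "D-Ala", "B": "D-Leu", "C": "Nle"},  # first coding position
    {"D": "D-Ala", "E": "D-Leu", "F": "Nle"},  # second coding position
    {"G": "D-Ala", "H": "D-Leu", "I": "Nle"},  # third coding position
]


def translate(gene):
    """Chemically 'translate' a gene given as one codon label per position.

    Each codon is valid at only one position, mirroring the rule that a
    codon cannot be reused at a different coding position.
    """
    if len(gene) != len(COLUMNS):
        raise ValueError("gene must have one codon per coding position")
    monomers = []
    for position, codon in enumerate(gene):
        table = COLUMNS[position]
        if codon not in table:
            raise ValueError(f"codon {codon!r} has no column at position {position + 1}")
        monomers.append(table[codon])
    return "-".join(monomers)


# Enumerate every gene the nine columns can encode.
library = {"".join(gene): translate(gene) for gene in product(*COLUMNS)}
n_columns = sum(len(table) for table in COLUMNS)

print(translate("BDG"))
print(f"{n_columns} columns encode {len(library)} small molecules")
```

Adding a fourth position with three more columns would raise the column count to 12 and the library to 3^4 = 81 members, which is the linear-versus-exponential scaling the legend points out.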

Compartmental synthesis and division

Membrane encapsulation has the advantages of allowing system evolution without serial transfers, purifications, or splitting, and extension of replicating systems to new environments. Remarkably, synthesis and replication of compartments does not require any macromolecular catalysts: Aqueous solutions of pure lipids can yield spontaneously dividing membrane vesicles that allow passage of small molecules while retaining macromolecules (Szostak et al. 2001). But membrane replication is poorly understood and difficult to control (Luisi 2002; Gitai 2005). Fortunately, in vitro SBPs extend more broadly than lipid bilayers, as membranous boundaries are neither necessary nor sufficient to ensure homogeneous distribution of macromolecular contents. For example, segregation and selection like that found in cells (Wang et al. 2004) can be achieved with viruses, protoplasm (Kim et al. 2001), macromolecular complexes (Dower and Mattheakis 2002), emulsions (Griffiths and Tawfik 2000; Noireaux and Libchaber 2004), aerosols (Donaldson et al. 2004), biofilms, particles (Dressman et al. 2003), and arrays. The ability to spatially pattern DNA is another advantage of unnatural and in vitro systems. Stochastic crises could be avoided by dividing up large numbers of complexes, analogous to replication of multicellular biofilms. Satellites, defective interfering particles, and selfish genes will inevitably emerge and could be readily studied in purified systems. They could be kept in check by refining selection and counter-selection schemes (Mills et al. 1967; Breaker 2004).

Synthesizing self-replication and life

What is life? Living systems display inheritance, adaptation, growth, and repair by exchanging components with, and responding to, their environment. Replication and evolution are a requirement at the level of the population, not the individual. Even dead organisms can be recognized as once living by virtue of replicated complexity (replexity) not found in systems with merely high complexity. Replexity and other quantitative metrics (e.g., fidelity, evolvability) may help SBPs and alter our concept of life. For example, life-as-we-know-it requires membranous cellular compartments, but it can passage through an unencapsulated protoplast form (Kim et al. 2001), and any process for splitting and pooling the “soup” would suffice theoretically (e.g., rock cavities [Robinson 2005], tidal pools, and billabongs). Several in vitro SBPs aim to synthesize a life-like system fitting the definition “sustainable autocatalytic replication and evolution from small molecules.” The goals are understanding life and its origins and development of new molecular biology tools to aid production of biomaterials and the discovery of therapeutics (see above).

The first and simplest quest to synthesize self-replication aimed for nucleic acid-templated replication from activated nucleotide analogs without enzymes. Disappointingly, short oligomers proved impotent (Orgel 1995).

A second plan is to evolve an RNA replicase made of RNA from natural catalytic RNAs and/or libraries of random RNAs (Szostak et al. 2001). Impressively, ribozymes that polymerize nucleoside triphosphates have been constructed, but improving the fidelity and efficiency to the levels needed for replication remains a significant hurdle (Szostak et al. 2001). Another hurdle is construction of a membrane transporter made of RNA despite its membrane impermeability, yet this has recently been proved feasible (Janas et al. 2004). A benefit of this approach is experimental insight into the extinct “RNA world.”

A third completely in vitro plan is synthetic life that includes DNA and translation, which has the benefit of tying in more closely with existing biology. The plan is to assemble a biochemically derived list of some 151 genes from E. coli proposed to encode a near-minimal, self-replicating system dependent only on small molecule substrates (Fig. 4) (Forster and Church 2006). This list would extend previous computational modeling of Mycoplasma (Tomita et al. 1999) by dropping enzymes for synthesizing small molecules and by adding DNA replication, RNA processing, RNA modification, extra tRNAs to decode the whole genetic code, some additional essential translation components, and chaperones. Though all of the genes have known functions, there are some difficult choices in tRNA modification. There are seven unidentified genes, all for tRNA or rRNA modification. Other challenges include replication of monomer DNA circles over by-products, ribosome assembly under physiological conditions, efficiency of translation, control and integration of subsystems, and cosegregation of genes with their products. Nevertheless, this biochemical approach has advantages over the genetic approach (Hutchison et al. 1999) of enabling system-by-system debugging to attain self-replication, being more flexible (e.g., membranes are optional), and of starting closer to a fully understood system (the functions of about a fifth of the essential genes are unknown [Glass et al. 2006]). For example, from the viewpoint of structural biology, there is already significant three-dimensional information for a remarkable 97% of the identified gene products (see figure on front cover). Thus, finishing this SBP would yield essentially complete functional and structural understanding of a useful replicating system.

Figure 4.

A theoretical scheme for self-replication in vitro of a minimal genome dependent only on small molecule substrates. A nicked double-stranded DNA circle (blue strands) encoding all of the macromolecular replication components undergoes rolling-circle DNA synthesis by a DNA polymerase (blue ball) to give an oligomeric single-stranded DNA. Lagging strand DNA synthesis is primed by RNA, then recombination duplicates the original circular template (not shown). RNA (red strand) is synthesized by an RNA polymerase (red ball), cleaved (not shown), and translated into protein (green strand) by encoded ribosomes (green) and other translation factors (not shown).

Conclusions

Of the current SBPs, the in vivo ones have received more coverage in the literature, perhaps because of safety concerns and the obviousness of scalability. But the benefits of in vitro SBPs should not be underestimated. Many biopolymer syntheses are already better scaled up in cell-free systems, such as linear DNAs by oligo synthesis and PCR, unmodified RNAs by in vitro transcription, and peptide libraries by in vitro transcription/translation. And engineering flexibility is much greater in vitro, unshackled from cellular viability, complexity, and walls.

One promise of in vitro SBPs is applications. Current in vitro methods for synthesizing proteins and evolving protein, nucleic acid, and small-molecule ligands will be improved to accelerate production of new reagents, diagnostics, and drugs. New methods will be developed for synthesizing circular DNAs, modified RNAs, proteins containing unnatural amino acids, and liposomes.

The other promise of in vitro SBPs is basic knowledge. Until we can assemble a form of life in vitro from defined, functionally understood macromolecules and small-molecule substrates, how can we say that we understand the secret of life?

Acknowledgments

We thank many colleagues for discussions and comments on the manuscript. This work was supported by an NIH K08 grant (to A.C.F.) and a DOE GTL Center grant (to G.M.C.).
