The proteome folding project: Proteome-scale prediction of structure and function
- Kevin Drew1,
- Patrick Winters1,
- Glenn L. Butterfoss1,
- Viktors Berstis2,
- Keith Uplinger2,
- Jonathan Armstrong2,
- Michael Riffle3,
- Erik Schweighofer4,
- Bill Bovermann2,
- David R. Goodlett3,
- Trisha N. Davis3,
- Dennis Shasha1,
- Lars Malmström5 and
- Richard Bonneau1,6
- 1 New York University;
- 2 IBM;
- 3 University of Washington;
- 4 Institute for Systems Biology;
- 5 ETH Zurich
- * Corresponding author; email: bonneau{at}nyu.edu
Abstract
The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits the interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, in which function and structure annotations are generated by integrating sequence comparison, fold recognition and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and 3D structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, E. coli and worm). De novo structure predictions were distributed on a grid of over 1.5 million CPUs worldwide (World Community Grid). We generate a significant number of new, confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions.
- Received January 26, 2011.
- Accepted July 28, 2011.
- Copyright © 2011, Cold Spring Harbor Laboratory Press
This manuscript is Open Access.











