From Bioinformatics to Computational Biology
Structural and Genetic Information Laboratory, CNRS-AVENTIS UMR 1889, Marseille cedex 20, France
It is quite ironic that the uncertainty about the number of human genes (28,000–120,000) (Ewing and Green 2000; Liang et al. 2000; Roest Crollius et al. 2000) appears to increase as the determination of the human genome sequence is nearing completion. I shall contend here that this paradox reveals deep epistemological problems, and that “bioinformatics”—a term coined in 1990 to define the use of computers in sequence analysis—is no longer developing in directions relevant to biology.
After the pioneers who established the basic concepts of molecular sequence analysis (Fitch and Margoliash 1967; Needleman and Wunsch 1970; Chou and Fasman 1974), most computational biologists of my generation (the second one) embarked on their journey into the emerging discipline with the ambition to turn it into the bona fide theoretical branch of molecular biology. Having a physicist's background, I suspect that many of us had the vision of establishing bioinformatics in a leadership role over experimental biology, similar to the supremacy that theoretical physics enjoys over experimental physics. Somewhere along the line, it seems that bioinformatics lost this ambition and became sidetracked onto what physicists would call a “phenomenological” pathway.
Let us follow the example of particle physics for a little longer. There, theoretical research has two phases (which, in fact, run in parallel). In the first phase (so-called phenomenological), a large number of physical events are recorded in huge raw databases, classified into separate groups based on statistical regularities, and then utilized to identify the most recurrent objects. Optimal database design, fast classification/clustering algorithms, and data mining software are the main areas of development here. The level of knowledge gained from this phase is, for instance, that objects A and B often appear together except when C is around, or when parameter X is lower than a …











