Prediction of Cell Type-Specific Gene Modules: Identification and Initial Characterization of a Core Set of Smooth Muscle-Specific Genes
Abstract
Genes that are expressed in the same subset of cells potentially constitute a module regulated by shared cis-regulatory elements and a distinct set of transcription factors. Identifying such units is an important entry point to the molecular study of cell differentiation. We developed a general method to classify cell type-specific genes from expressed sequence tag (EST) data, and we optimized it for identification of smooth muscle cell (SMC)-specific genes. Expression profiles were derived from the quantitative distribution of EST data in mouse, and genes were classified by the similarity of their profiles to those of known reference genes, in this case smooth muscle myosin heavy chain. A large majority (>90%) of known SMC-specific genes were identified, together with novel candidates. Extensive experimental validation confirmed SMC-specific expression of candidates including lipoma preferred partner (LPP) and a novel SMC-specific putative monoamine oxidase, SMAO. Our method performed considerably better than other computational methods in an objective cross validation comparison. The total number of SMC-specific genes is estimated to be ∼50.
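The idea of classifying genes by profile similarity to a reference gene can be illustrated with a minimal sketch. It is not the QRISP implementation: the abstract does not specify the data transformation or probability estimation used, so the sketch below substitutes a simple per-library frequency normalization and Pearson correlation. The library names, library sizes, EST counts, and all gene names other than Myh11 (the mouse gene symbol for smooth muscle myosin heavy chain) are invented for illustration.

```python
import math

def normalize_profile(counts, library_sizes):
    """Convert raw EST counts per library into per-library frequencies."""
    return [c / n for c, n in zip(counts, library_sizes)]

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is flat)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def rank_by_similarity(est_counts, library_sizes, reference):
    """Rank all non-reference genes by profile correlation with the reference."""
    ref = normalize_profile(est_counts[reference], library_sizes)
    scores = {
        gene: pearson(normalize_profile(counts, library_sizes), ref)
        for gene, counts in est_counts.items()
        if gene != reference
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data: five tissue libraries and their total EST counts.
LIBRARIES = ["aorta", "bladder", "uterus", "brain", "liver"]
LIBRARY_SIZES = [10000, 8000, 12000, 20000, 15000]

# Hypothetical EST counts per gene, one value per library above.
EST_COUNTS = {
    "Myh11": [30, 24, 30, 0, 1],           # reference gene
    "candidate_A": [12, 10, 14, 1, 0],     # SMC-like pattern
    "housekeeping_B": [20, 17, 23, 41, 29],  # roughly uniform expression
    "brain_C": [0, 0, 1, 50, 0],           # brain-enriched pattern
}

ranking = rank_by_similarity(EST_COUNTS, LIBRARY_SIZES, "Myh11")
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

In this toy setting the gene with an SMC-like distribution ranks first and the brain-enriched gene ranks last. A plain correlation ranking gives no probability of class membership and ignores sampling noise at low EST counts, which the probability estimation mentioned for QRISP presumably addresses.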
Footnotes
- [Supplemental material is available online at www.genome.org. A program package, uni_extract, for extraction of data and data preparation, and a MATLAB program package, QRISP, for data transformation, probability estimation, cross validation, and visualization of data, are available at http://cbz.gu.se/Lindahl/QRISP. A gene expression pattern prediction server will be available at www.qrisp.com.]
- Article published online before print in July 2003.
- Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1197303.
- 3 Corresponding author. E-MAIL Per.Lindahl@medkem.gu.se; FAX 46-31-416108.
- Accepted May 27, 2003.
- Received January 25, 2003.
- Cold Spring Harbor Laboratory Press











