RT Journal
A1 Khattra, Jaswinder
A1 Delaney, Allen D.
A1 Zhao, Yongjun
A1 Siddiqui, Asim
A1 Asano, Jennifer
A1 McDonald, Helen
A1 Pandoh, Pawan
A1 Dhalla, Noreen
A1 Prabhu, Anna-liisa
A1 Ma, Kevin
A1 Lee, Stephanie
A1 Ally, Adrian
A1 Tam, Angela
A1 Sa, Danne
A1 Rogers, Sean
A1 Charest, David
A1 Stott, Jeff
A1 Zuyderduyn, Scott
A1 Varhol, Richard
A1 Eaves, Connie
A1 Jones, Steven
A1 Holt, Robert
A1 Hirst, Martin
A1 Hoodless, Pamela A.
A1 Marra, Marco A.
T1 Large-scale production of SAGE libraries from microdissected tissues, flow-sorted cells, and cell lines
JF Genome Research
JO Genome Research
YR 2007
FD January 01
VO 17
IS 1
SP 108
OP 116
DO 10.1101/gr.5488207
UL http://genome.cshlp.org/content/17/1/108.abstract
AB We describe the details of a serial analysis of gene expression (SAGE) library construction and analysis platform that has enabled the generation of >298 high-quality SAGE libraries and >30 million SAGE tags primarily from sub-microgram amounts of total RNA purified from samples acquired by microdissection. Several RNA isolation methods were used to handle the diversity of samples processed, and various measures were applied to minimize ditag PCR carryover contamination. Modifications in the SAGE protocol resulted in improved cloning and DNA sequencing efficiencies. Bioinformatic measures to automatically assess DNA sequencing results were implemented to analyze the integrity of ditag structure, linker or cross-species ditag contamination, and yield of high-quality tags per sequence read. Our analysis of singleton tag errors resulted in a method for correcting such errors to statistically determine tag accuracy. From the libraries generated, we produced an essentially complete mapping of reliable 21-base-pair tags to the mouse reference genome sequence for a meta-library of ∼5 million tags. Our analyses led us to reject the commonly held notion that duplicate ditags are artifacts. Rather than the usual practice of discarding such tags, we conclude that they should be retained to avoid introducing bias into the results and thereby maintain the quantitative nature of the data, which is a major theoretical advantage of SAGE as a tool for global transcriptional profiling.