Exhaustive T-cell repertoire sequencing of human peripheral blood samples reveals signatures of antigen selection and a directly measured repertoire size of at least 1 million clonotypes
- René L. Warren¹,
- J. Douglas Freeman¹,
- Thomas Zeng¹,
- Gina Choe¹,
- Sarah Munro¹,
- Richard Moore¹,
- John R. Webb² and
- Robert A. Holt¹,³,⁴
- ¹BC Cancer Agency, Michael Smith Genome Sciences Centre, Vancouver, British Columbia V5Z 1L3, Canada;
- ²BC Cancer Agency, Deeley Research Centre, Victoria, British Columbia V8R 6V5, Canada;
- ³Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
Abstract
Massively parallel sequencing offers a useful approach to characterizing TCR diversity. However, immune receptors are extraordinarily difficult sequencing targets because any given receptor variant may be present in very low abundance and may differ legitimately by only a single nucleotide. As such, the sensitivity of sequence-based repertoire profiling is limited by both sequencing depth and sequencing accuracy. We obtained peripheral blood TCRB mRNA from a healthy donor, at two timepoints 1 wk apart, and generated from multiple libraries a total of 1.7 billion paired sequence reads. The sequencing error rate was determined empirically, by analyzing the portion of each read encoding one of a small number of possible J segments, and used to inform a high-stringency data-filtering procedure. From the error-filtered data, we obtained 1,061,522 distinct TCRB nucleotide sequences. This figure establishes a new, directly measured lower limit on individual T-cell repertoire size and provides a useful reference set of sequences for repertoire analysis. Analysis of naïve (CD45RA+/CD45RO−) and memory (CD45RA−/CD45RO+) T-cell fractions highlighted T-cell plasticity, whereby sequences that were highly represented in both subsets at week 1 had transitioned preferentially to the CD45RA−/CD45RO+ subset at week 2. TCRB nucleotide sequences obtained from two additional donors were compared with those from the first donor and revealed highly similar V and J gene usage frequencies among individuals, but only a very small proportion (<1.1%) of shared nucleotide sequences. Analysis of in silico translated sequences indicated, however, that at the amino acid level as many as 14.2% of distinct sequences from one donor were shared with those from another donor. For each donor, shared amino acid sequences were encoded by a much larger diversity of nucleotide sequences than were unshared amino acid sequences. We also observed a highly statistically significant association between numbers of shared sequences and shared HLA class I alleles.
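The comparison between nucleotide-level and amino-acid-level sharing described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`translate`, `shared_fraction`, `nt_variants_per_aa`) and the short toy sequences are hypothetical, and the sketch assumes each sequence is already in frame and error filtered.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt):
    """Translate an in-frame nucleotide sequence; a trailing partial codon is ignored."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def shared_fraction(seqs_a, seqs_b):
    """Fraction of distinct sequences in A that also occur in B."""
    a, b = set(seqs_a), set(seqs_b)
    return len(a & b) / len(a) if a else 0.0


def nt_variants_per_aa(nt_seqs):
    """Count the distinct nucleotide sequences encoding each amino acid sequence."""
    variants = defaultdict(set)
    for nt in set(nt_seqs):
        variants[translate(nt)].add(nt)
    return {aa: len(nts) for aa, nts in variants.items()}


# Toy data (hypothetical, far shorter than real TCRB sequences).
donor_a = ["TGTGCCAGC", "TGCGCCAGC", "TGTGCCAGT", "TGTGCCTGG"]
donor_b = ["TGTGCGAGC", "TGTGCCAGC", "TGTGAAAGC"]

nt_shared = shared_fraction(donor_a, donor_b)
aa_shared = shared_fraction(
    [translate(s) for s in donor_a],
    [translate(s) for s in donor_b],
)
variants_a = nt_variants_per_aa(donor_a)
```

In this toy example, sharing is higher at the amino acid level than at the nucleotide level because several synonymous nucleotide sequences collapse onto one translated sequence, and the shared amino acid sequence is the one encoded by the most nucleotide variants in `donor_a`.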
Footnotes
- 4 Corresponding author. E-mail rholt{at}bcgsc.ca
- [Supplemental material is available for this article. The sequencing data from this study have been submitted to the NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi) under accession no. SRA020989. A file containing all distinct TCRB sequences observed after all quality filtering is available at ftp://ftp.bcgsc.ca/supplementary/TCRb2010/.]
- Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.115428.110.
- Received September 16, 2010.
- Accepted December 28, 2010.
- Copyright © 2011 by Cold Spring Harbor Laboratory Press
Freely available online through the Genome Research Open Access option.