Marker-free characterization of full-length transcriptomes of single live circulating tumor cells

  1. Debarka Sengupta (affiliations 1, 2, 9)
  1. Department of Computational Biology, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India;
  2. Department of Computer Science and Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India;
  3. Department of Computer Science and Engineering, Delhi Technological University, New Delhi 110042, India;
  4. Biolidics Limited, Singapore 118257, Singapore;
  5. National Cancer Centre Singapore, Singapore 169610, Singapore;
  6. Fluidigm Corporation, South San Francisco, California 94080, USA;
  7. Department of Research, Rajiv Gandhi Cancer Institute and Research Centre-Delhi (RGCIRC-Delhi), New Delhi 110085, India;
  8. Department of Laboratory Services and Molecular Diagnostics, Rajiv Gandhi Cancer Institute and Research Centre-Delhi (RGCIRC-Delhi), New Delhi 110085, India;
  9. Centre for Artificial Intelligence, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India;
  10. Department of Electronics & Communications Engineering, Indraprastha Institute of Information Technology-Delhi (IIIT-Delhi), New Delhi 110020, India
  • Present addresses: (11) Thermo Fisher Scientific, Singapore 739256, Singapore; (12) BioSkryb Corporation, Durham, NC 27701, USA; (13) Department of Biomedical Engineering, Faculty of Engineering, National University of Singapore, Singapore 117575, Singapore; (14) Institute for Health Innovation and Technology (iHealthtech), National University of Singapore, Singapore 117599, Singapore

  • Corresponding authors: debarka{at}iiitd.ac.in, naveen.ramalingam{at}fluidigm.com, angshul{at}iiitd.ac.in
  • Abstract

    The identification and characterization of circulating tumor cells (CTCs) are important for gaining insights into the biology of metastatic cancers, monitoring disease progression, and medical management of the disease. The limiting factors in the enrichment of purified CTC populations are their sparse availability, heterogeneity, and altered phenotypes relative to the primary tumor. Intensive research on both the technical and molecular fronts has led to the development of assays that ease CTC detection and identification from peripheral blood. Most CTC detection methods based on single-cell RNA sequencing (scRNA-seq) use a mix of size selection, marker-based white blood cell (WBC) depletion, and antibodies targeting tumor-associated antigens. However, the majority of these methods either miss atypical CTCs or suffer from WBC contamination. We present unCTC, an R package for unbiased identification and characterization of CTCs from single-cell transcriptomic data. unCTC features many standard and novel computational and statistical modules for various analyses. These include a novel scRNA-seq clustering method named deep dictionary learning using k-means clustering cost (DDLK); expression-based copy number variation (CNV) inference; and combinatorial, marker-based verification of malignant phenotypes. DDLK enables robust segregation of CTCs and WBCs in the pathway space, as opposed to the gene expression space. We validated the utility of unCTC on scRNA-seq profiles of breast CTCs from six patients, captured and profiled using an integrated ClearCell FX and Polaris workflow that relies on size-based separation of CTCs and marker-based WBC depletion.
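
The contrast between clustering in pathway space and clustering in gene expression space can be illustrated with a minimal sketch. This is not DDLK and does not use the unCTC package: it substitutes a plain mean-of-gene-set score for the pathway representation and ordinary k-means for the clustering step. The function names, the two toy gene sets, and every expression value are hypothetical and chosen only for illustration; none of them come from the study.

```python
from statistics import mean

# Hypothetical toy gene sets standing in for curated pathways (assumption).
TOY_GENE_SETS = {
    "epithelial": ["EPCAM", "KRT8", "KRT18"],
    "leukocyte": ["PTPRC", "CD3E", "CD14"],
}

# Made-up expression values for four cells; not data from the study.
TOY_EXPRESSION = {
    "ctc_like_1": {"EPCAM": 8.0, "KRT8": 7.5, "KRT18": 7.0, "PTPRC": 0.2, "CD3E": 0.0, "CD14": 0.1},
    "ctc_like_2": {"EPCAM": 0.0, "KRT8": 8.2, "KRT18": 6.9, "PTPRC": 0.0, "CD3E": 0.3, "CD14": 0.0},
    "wbc_like_1": {"EPCAM": 0.0, "KRT8": 0.1, "KRT18": 0.0, "PTPRC": 9.0, "CD3E": 6.5, "CD14": 0.2},
    "wbc_like_2": {"EPCAM": 0.1, "KRT8": 0.0, "KRT18": 0.2, "PTPRC": 8.4, "CD3E": 0.0, "CD14": 7.1},
}


def pathway_scores(expression, gene_sets):
    """Map each cell to a vector of per-pathway mean expression values.

    Pathways are ordered alphabetically so the vectors are comparable.
    Genes missing from a cell's profile are treated as zero.
    """
    pathways = sorted(gene_sets)
    return {
        cell: [mean(profile.get(g, 0.0) for g in gene_sets[p]) for p in pathways]
        for cell, profile in expression.items()
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100):
    """Plain Lloyd's k-means with deterministic farthest-point initialization.

    Returns one integer cluster label per input point.
    """
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))

    labels = None
    for _ in range(iterations):
        # Assignment step: nearest center for every point.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [mean(col) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    scores = pathway_scores(TOY_EXPRESSION, TOY_GENE_SETS)
    cells = sorted(scores)
    for cell, label in zip(cells, kmeans([scores[c] for c in cells], k=2)):
        print(cell, scores[cell], "cluster", label)
```

In this toy setup, aggregating over a gene set dampens the effect of a single missing marker (for example, the zero EPCAM value in `ctc_like_2`), which is one intuition for why a pathway-level representation can be more forgiving than individual genes. How DDLK itself learns its representation is not specified in this abstract.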

    Footnotes

    • Received January 16, 2022.
    • Accepted November 10, 2022.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
