Gene duplication is associated with gene diversification and potential neofunctionalization in lung cancer evolution

  1. Christine A. Orengo1
  1. Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, United Kingdom;
  2. Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London NW1 1AT, United Kingdom;
  3. Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London WC1E 6DD, United Kingdom;
  4. University College London Cancer Institute, University College London, London WC1E 6DD, United Kingdom;
  5. Cancer Metastasis Laboratory, University College London Cancer Institute, London WC1E 6DD, United Kingdom;
  6. Department of Oncology, University College London Hospitals, London NW1 2BU, United Kingdom;
  7. Cancer Genome Evolution Research Group, Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London WC1E 6DD, United Kingdom
  • Corresponding author: c.orengo{at}ucl.ac.uk
  • Abstract

    Tumors evolve through a process of selection on somatic mutations, driving cell division and tissue growth through aberrations in cell-cycle control. In non-small-cell lung cancer (NSCLC), genome instability occurs early in tumor growth, resulting in pronounced intratumor heterogeneity, including changes in gene copy number, and whole-genome doubling (WGD) in ∼75% of tumors. Gene duplication, genetic drift, and selection mediate functional diversification during evolution. In this study, we seek to identify gene diversification and potential neofunctionalization in lung tumors from the TRACERx cohort. We develop a novel computational protocol to identify preduplication and postduplication mutations predicted to affect protein function. Mutations are analyzed using paralogs grouped into functional families with highly similar functions, identifying 355 functional impact events (FIEs) through their proximity to, and clustering near, functional sites. The use of functional family paralogs to map mutations to protein structures from the PDB helps predict putative rare driver events in lung tumors. By extending the analysis with high-quality structural models from AlphaFold using The Encyclopedia of Domains (TED), we find a significant increase in the diversity of both genes and functional families with postduplication FIEs in lung adenocarcinomas, including some metabolic enzymes with the potential to be neofunctional. The postduplication diversification of driver genes and functions may indicate selection for somatic copy number changes in lung tumors and an increased scope for tumor adaptations.
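
    The idea of flagging mutations by their proximity to functional sites can be illustrated with a minimal sketch. This is not the authors' protocol: the function name, the coordinate dictionary, and the 8 Å cutoff are all hypothetical choices made for illustration, and the abstract does not state the actual distance criterion or clustering method used.

```python
import math

def near_functional_site(coords, mutated, site_residues, cutoff=8.0):
    """Return mutated residues lying within `cutoff` angstroms of a functional site.

    coords        -- dict mapping residue number to an (x, y, z) tuple
                     (e.g. C-alpha positions from a structure or model)
    mutated       -- iterable of residue numbers carrying a mutation
    site_residues -- iterable of residue numbers annotated as functional
    cutoff        -- distance threshold; 8.0 is an illustrative assumption,
                     not a value taken from the article
    """
    site_residues = list(site_residues)
    hits = {}
    for res in mutated:
        # Distance from the mutated residue to the closest functional residue.
        nearest = min(math.dist(coords[res], coords[s]) for s in site_residues)
        if nearest <= cutoff:
            hits[res] = nearest
    return hits

# Toy coordinates (hypothetical): residue 10 is the functional site.
coords = {
    10: (0.0, 0.0, 0.0),
    11: (3.8, 0.0, 0.0),
    50: (30.0, 0.0, 0.0),
}
hits = near_functional_site(coords, mutated=[11, 50], site_residues=[10])
```

    In this toy input, residue 11 falls within the cutoff and residue 50 does not. A real analysis would also need to test whether mutations cluster more than expected by chance, which this sketch does not attempt.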

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.278663.123.

    • Freely available online through the Genome Research Open Access option.

    • Received November 7, 2023.
    • Accepted November 21, 2025.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.


    Genome Res. © 2026 Ashford et al.; Published by Cold Spring Harbor Laboratory Press
