Gavin Ha; Andrew Roth; Jaswinder Khattra; Julie Ho; Damian Yap; Leah M. Prentice; Nataliya Melnyk; Andrew McPherson; Ali Bashashati; Emma Laks; Justina Biele; Jiarui Ding; Alan Le; Jamie Rosner; Karey Shumansky; Marco A. Marra; C. Blake Gilks; David G. Huntsman; Jessica N. McAlpine; Samuel Aparicio; Sohrab P. Shah

Figure 2.

Description of the TITAN probabilistic framework. (A) Representation of the aggregate copy number signal from mixed populations in a heterogeneous tumor sample. c is the aggregate signal that is composed of three components: normal population (white circles), tumor populations with the deletion (green decagons) and without the event (blue decagons). n is the normal proportion; s_z is the tumor proportion for the z^th clonal cluster that does not contain the event; c_norm and c_DEL are normal and tumor copy numbers. Therefore, (1 − s_z) corresponds to the proportion of tumor harboring the event, also defined as the tumor cellular prevalence of the z^th clonal cluster. (B) Analysis workflow for TITAN. Three inputs are required: (1) heterozygous positions identified in the normal DNA, predicted by genotyping tools such as SAMtools mpileup (Li et al. 2009); (2) reference counts a and read depth N, extracted at these positions from aligned reads in the tumor DNA sequence data; and (3) the tumor and normal read depths, N and N_N, normalized independently to correct for GC content and mappability biases, from which log ratios l = log(N/N_N) of the corrected read counts are computed. The output is the optimal sequence of CNA/LOH genotypes and clonal cluster memberships at each position. Model parameters for normal contamination n, tumor cellular prevalence s_z, and tumor ploidy φ are estimated. (C) Probabilistic graphical model of TITAN. Shaded nodes are known or observed quantities; open nodes are random variables of unknown quantities. Arrows represent conditional dependence between random variables. Full details and definitions are in Methods and Supplemental Table 13. (D) Parameter trace of ω_g,z and μ_g,z when cellular prevalence varies. s₁ and s₂ are shown as the tumor cellular prevalence (i.e., transformed using 1 − s_z). n is normal proportion and φ is average tumor ploidy. Each CNA/LOH genotype is shown (Supplemental Table 14) with the associated integer copy number in parentheses.
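The mixture in panel A and the log ratio in panel B can be illustrated with a short Python sketch. It is not taken from the TITAN software. It assumes the three populations are weighted by n, (1 − n)·s_z, and (1 − n)·(1 − s_z), that tumor cells without the event carry the normal copy number c_norm, and that the logarithm is natural (the caption does not state a base). The function names and the default copy numbers are hypothetical.

```python
import math


def aggregate_copy_number(n, s_z, c_norm=2.0, c_event=1.0):
    """Expected copy number signal c from three mixed cell populations.

    n       -- normal proportion of the sample
    s_z     -- proportion of tumor cells in cluster z WITHOUT the event
    c_norm  -- copy number of cells lacking the event (assumed default: 2)
    c_event -- copy number of tumor cells with the event, c_DEL in the
               caption (assumed default: 1, a single-copy deletion)
    """
    if not (0.0 <= n <= 1.0 and 0.0 <= s_z <= 1.0):
        raise ValueError("n and s_z must lie in [0, 1]")
    normal_part = n * c_norm                          # normal cells
    tumor_without_event = (1.0 - n) * s_z * c_norm    # tumor, no event
    tumor_with_event = (1.0 - n) * (1.0 - s_z) * c_event  # tumor, event
    return normal_part + tumor_without_event + tumor_with_event


def log_ratio(tumor_depth, normal_depth):
    """Log ratio l = log(N / N_N) of bias-corrected read depths.

    Natural log is an assumption; inputs are taken to be already corrected.
    """
    if tumor_depth <= 0 or normal_depth <= 0:
        raise ValueError("read depths must be positive")
    return math.log(tumor_depth / normal_depth)
```

For example, with 20% normal contamination and a deletion present in 75% of tumor cells (s_z = 0.25), `aggregate_copy_number(0.2, 0.25)` gives approximately 1.4, between the deleted state (1) and the normal state (2). This is why the tumor cellular prevalence, 1 − s_z, has to be estimated jointly with n.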

TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data
