Inferring ongoing cancer evolution from single tumour biopsies using synthetic supervised learning

Tom W. Ouellette and Philip Awadalla

To explore the scenarios in which positive selection and subclonality are detected with high confidence, we generated synthetically sampled VAF distributions containing one subclone, varying the number of subclonal mutations, the subclone frequency (cellular fraction / 2), and the mean sequencing depth. Each synthetic VAF distribution had 2000 mutations assigned to the neutral tail (defined by a Pareto, i.e. power-law, distribution) and 1000 mutations assigned to the heterozygous clonal peak. For sampling neutral mutations, the shape parameter of the Pareto distribution was fixed at 1.4 and the scale parameter was set to the lowest observed mutation frequency (which depends on depth). The sequencing overdispersion parameter rho used in the beta-binomial sequencing noise model was fixed at 0.003.
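
The sampling scheme described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code. The function names, the fixed per-site depth (held at the mean for brevity), the `min_alt_reads` rule used to derive the Pareto scale, the truncation of neutral frequencies below the clonal peak at 0.5, and the parameterisation rho = 1 / (a + b + 1) are all assumptions made for the example.

```python
import random


def beta_binomial_vaf(rng, true_freq, depth, rho):
    """Draw one observed VAF under beta-binomial sequencing noise."""
    # Assumed overdispersion parameterisation: rho = 1 / (a + b + 1).
    total = (1.0 - rho) / rho
    p = rng.betavariate(true_freq * total, (1.0 - true_freq) * total)
    # Binomial draw of alternate reads at this site.
    alt_reads = sum(rng.random() < p for _ in range(depth))
    return alt_reads / depth


def synthetic_vaf(subclone_freq, n_subclonal, depth, seed=0,
                  n_neutral=2000, n_clonal=1000, shape=1.4, rho=0.003,
                  min_alt_reads=2):
    """Return observed VAFs: neutral tail, then subclone, then clonal peak."""
    rng = random.Random(seed)
    # Assumption: lowest observable frequency is min_alt_reads / depth.
    scale = min_alt_reads / depth

    true_freqs = []
    while len(true_freqs) < n_neutral:
        # paretovariate has minimum 1, so multiplying by scale sets the minimum.
        f = scale * rng.paretovariate(shape)
        if f < 0.5:  # assumption: keep the tail below the clonal peak
            true_freqs.append(f)

    true_freqs += [subclone_freq] * n_subclonal  # subclonal cluster
    true_freqs += [0.5] * n_clonal               # heterozygous clonal peak

    # Depth is fixed per site here; the source varies the *mean* depth.
    return [beta_binomial_vaf(rng, f, depth, rho) for f in true_freqs]
```

For instance, `synthetic_vaf(subclone_freq=0.25, n_subclonal=500, depth=100)` would yield 3500 observed VAFs, which could then be swept over a grid of subclone frequencies, mutation counts, and depths.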


Supplementary Figure 6. Accurate detection of positive selection and subclonality depends on the sequencing depth, the number of subclonal mutations, and the subclone frequency at the time of biopsy. For each VAF distribution, we computed the mean probability estimate across 25 stochastic passes through our trained neural network for both (top row) the evolutionary mode classification, P(Selection), and (bottom row) the number-of-subclones classification, P(N subclones). The interactive plots below show the mean probability estimates for both tasks at increasing subclone frequency (x-axis) and increasing number of subclonal mutations (y-axis) for the top 25 trained deep learning models (dropdown menu). Hovering your cursor over a plot shows the mean probability estimate (z) at the given mutation-frequency combination. For P(Selection), we also provide the upper and lower bounds of the 89% equal-tailed interval. In practice, to mitigate model overconfidence, we only call positive selection if the lower bound of the 89% interval is greater than 0.5. For reference, we use model TASYG7N3IJR1DLN in downstream inference tasks.
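
The calling rule in the caption can be illustrated with a short sketch. It assumes the 89% equal-tailed interval is taken over the per-pass P(Selection) values (bounds at the 5.5th and 94.5th percentiles) and uses linear interpolation between order statistics. The function names and the interpolation scheme are assumptions for the example, not taken from the source.

```python
from statistics import fmean


def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def call_selection(pass_probs, mass=0.89, threshold=0.5):
    """Summarise stochastic passes and apply the lower-bound calling rule."""
    vals = sorted(pass_probs)
    tail = (1.0 - mass) / 2.0  # 0.055 in each tail for an 89% interval
    lower = quantile(vals, tail)
    upper = quantile(vals, 1.0 - tail)
    return {
        "mean": fmean(vals),
        "lower": lower,
        "upper": upper,
        # Call selection only if the interval's lower bound clears 0.5.
        "selected": lower > threshold,
    }
```

Thresholding the lower bound rather than the mean is the conservative choice: a set of passes whose mean exceeds 0.5 but whose lower tail dips below it would not be called as positively selected.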