Fig. 1 | Molecular Cancer

Fig. 1

From: Uncovering cancer vulnerabilities by machine learning prediction of synthetic lethality


Workflow of PARIS (Pan cAnceR Inferred Synthetic lethalities). a The PARIS pipeline uses data retrieved from the DepMap consortium. Dependency scores from the CRISPR-Cas9 screens were used as response variables, and mutation and expression data as independent variables. Only damaging mutations, TCGA hotspots and predicted pathogenic mutations (FATHMM coding score > 0.7) were considered. The random forest (RF) feature selection step assigns an importance score to each feature (mutations and expression independently) to describe the dependency scores of a particular gene. The significantly selected pairs are optionally filtered by the direction of the relationship: positive for mutation/dependency pairs and negative for expression/dependency pairs. Candidate SL gene pairs are ranked by their importance scores. b The RF feature selection is based on the Boruta algorithm, which selects as significant those features whose importance scores exceed the maximum importance score obtained by random probes during the iteration process (shadowMax). In the example, WRN dependency is explained by multiple genes belonging to the mismatch repair pathway that have significantly higher importance scores than the random probes. c Examples of correlations between dependencies and selected features. Scatterplots show the negative correlation between WRN dependency and MLH1 expression and the positive correlation between ARID1B dependency and ARID1A mutation status.
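The shadowMax comparison in panel b and the optional direction filter in panel a can be illustrated with a minimal, dependency-free sketch. This is not the PARIS implementation. It assumes absolute Pearson correlation as a stand-in for random-forest importance, and a simple hit-rate cutoff in place of the statistical test Boruta applies across iterations. The function names (`shadow_select`, `passes_direction`), the parameters `n_iter` and `min_hit_rate`, and the synthetic data are all hypothetical.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = (sum(a * a for a in dx) * sum(b * b for b in dy)) ** 0.5
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0


def shadow_select(features, response, n_iter=50, min_hit_rate=0.9, seed=0):
    """Keep features whose importance beats shadowMax in most iterations.

    features: dict mapping feature name -> list of values (one per cell line).
    response: list of dependency scores for the target gene.
    Importance is |Pearson r| (an assumption; Boruta itself uses RF importance,
    refitting the forest on real plus shadow features in every iteration).
    """
    rng = random.Random(seed)
    importance = {name: abs(pearson(col, response)) for name, col in features.items()}
    hits = dict.fromkeys(features, 0)
    for _ in range(n_iter):
        # Shadow probes: each real column is shuffled to break its link
        # to the response; shadowMax is the best score among the probes.
        shadow_max = 0.0
        for col in features.values():
            shuffled = col[:]
            rng.shuffle(shuffled)
            shadow_max = max(shadow_max, abs(pearson(shuffled, response)))
        for name in features:
            if importance[name] > shadow_max:
                hits[name] += 1
    return {n: importance[n] for n in features if hits[n] / n_iter >= min_hit_rate}


def passes_direction(kind, values, response):
    """Direction filter from the caption: positive for mutation/dependency
    pairs, negative for expression/dependency pairs."""
    r = pearson(values, response)
    return r > 0 if kind == "mutation" else r < 0


# Synthetic data: one expression feature that drives the response, plus noise.
rng = random.Random(1)
n_lines = 60
expr_a = [rng.gauss(0, 1) for _ in range(n_lines)]
response = [-x + 0.1 * rng.gauss(0, 1) for x in expr_a]
features = {"expr_A": expr_a}
for i in range(3):
    features[f"noise_{i}"] = [rng.gauss(0, 1) for _ in range(n_lines)]

selected = shadow_select(features, response)
# Rank the surviving candidates by importance, highest first.
ranked = sorted(selected, key=selected.get, reverse=True)
```

Shuffling each column preserves its marginal distribution while removing any relationship to the response, so a real feature has to outperform the best of these probes to be retained.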
