Comparative Proteogenomic Analysis of Right-sided Colon Cancer, Left-sided Colon Cancer and Rectal Cancers Reveal Distinct Mutational Profiles

To understand the molecular differences between right-sided colon cancer (RCC), left-sided colon cancer (LCC) and rectal cancer, we analyzed colorectal tumors at the DNA, RNA, miRNA and protein levels using previously sequenced data from The Cancer Genome Atlas and Memorial Sloan Kettering Cancer Center. Clonal evolution analysis identified the same tumor-initiating events involving the APC, KRAS and TP53 genes in RCC, LCC and rectal cancers. However, the individual role played by each event, their order in tumor dynamics and the selection of downstream mutations were distinct in all three anatomical locations, with some similarities noted between LCC and rectal cancer. We found a potentially targetable alteration, APC R1450*, specific to RCC that has not been previously described. Differential gene expression analysis revealed that multiple genes within the homeobox, G-protein coupled receptor binding and transcription regulation families were dysregulated in RCC, LCC and rectal cancers and may have a pathological role in these cancers. Further, using a novel in silico proteomic analytic tool developed by our research group, we found distinct central or hub proteins with unique interactomes in each location. Protein expression signatures were not necessarily concordant with the tumor profiles obtained at the DNA and RNA levels, underscoring the relevance of post-transcriptional events in defining the biology of these cancers beyond molecular changes at the DNA and/or RNA level. Ultimately, not only tumor location and the respective genomic profile but also protein-protein interactions will need to be taken into account to improve treatment outcomes of colorectal cancers. Further studies that take into account the alterations found in this study may help in developing more tailored, and perhaps more effective, treatment strategies.

Author summary

Patients with right-sided colon cancer (RCC) have a worse prognosis than those with left-sided colon cancer (LCC). Recent data have also shown that wild-type RAS metastatic RCCs have poor outcomes when treated with the combination of chemotherapy and anti-EGFR therapy compared to LCC and rectal cancers. Therefore, there is an urgent unmet need to understand the molecular differences between RCC, LCC and rectal cancers. In this study, we demonstrate that the clonal evolutionary trajectory and the order of mutations in RCC, LCC and rectal cancers are distinct, with some similarities between LCC and rectal cancers. The order of the mutations that lead to the acquisition of crucial driver alterations may have prognostic and therapeutic implications. We also discovered a novel targetable alteration, APC R1450*, to be significantly enriched in early, late and metastatic RCC but not in LCC and rectal cancers. Notably, proteomic signatures were discordant with DNA and RNA levels. These distinct differences in DNA, RNA and post-transcriptional events may contribute to their unique clinicopathological features.

Conflict of Interest Statement

Ashiq Masood: Advisory board and Speakers Bureau - Bristol-Myers Squibb and Boehringer Ingelheim.
Janakiraman Subramanian: Advisory board - Astra Zeneca, Pfizer, Boehringer Ingelheim, Alexion, Paradigm, Bristol-Myers Squibb; Speakers Bureau - Astra Zeneca, Boehringer Ingelheim, Lilly; Research Support - Biocept and Paradigm.
Arif Hussain: Advisory board - Novartis, Bayer, Astra Zeneca; Consultant - Bristol-Myers-Squibb.
All other authors have no conflict of interest.


1. Department of Medicine, University of Missouri-Kansas City School of Medicine, Kansas City, Missouri 64108, USA

Unlike APC R1450*, the frequency of other mutations within this region is relatively similar among the TCGA and MSKCC data sets. The relative frequencies of non-R1450* mutations within the MCR domain of APC for RCC were 63% and 64% in the TCGA and MSKCC data sets, respectively, for LCC 52% and 51%, respectively, and for rectal cancers 64% vs 58% (which did not meet statistical significance, p=0.35). Another

Our somatic copy number analysis identified 13 unique amplified regions in LCC MAP2K4. No candidate genes were identified in the deleted regions in RCC (S8 Table).

were differentially expressed between RCC and rectal cancers (p < 0.05, log2 ratio > 1.5); among these 84 genes were upregulated and 62 downregulated (S10a and S10b

and rectal cancers, several differences at the gene expression level were also noted between the two. In particular, 38 genes were differentially expressed between LCC and rectal cancer, including 22 genes that were upregulated and 16 that were downregulated (S11a and S11b Tables). 69 differentially expressed genes were shared

In an effort to better understand the potential relevance of the above changes in gene expression, we performed Functional Annotation Clustering using DAVID [36,37].

The results are shown in S12-14 Tables. The majority

peptide hormone pathway, lipid metabolism and keratinization pathway (Fig 6a).

Seven miRNAs (4 upregulated and 3 downregulated) were dysregulated when comparing RCC to both LCC and rectal cancer (S9a Fig, S19 Table), while 3 were

interactomes, were found to be unique to each of the locations (Fig 7).

(cell proliferation and ATM-dependent DNA damage response) and PDPK1 (growth regulation)

Among the potentially significant hub proteins in LCC were the following:

RCC and LCC. BAP1 is a BRCA1 associated protein that acts as a nuclear-localized

cancer. It is a ubiquitous cytoplasmic protein involved in multiple cellular processes

proteins were found between LCC and rectal cancer.

processes that may contribute to the respective clinical behavior of these cancers. To

same study. In addition to using the PiCnIc algorithm to identify clonal gene associations, we carried out analyses to identify somatic driver mutations, somatic copy number changes, mutation hotspots, differential RNA and miRNA expression, and protein-protein interaction networks.

We show that despite sharing the same critical initial events involving APC,

associated with the interplay and cross-talk between critical oncogenic pathways.

are not just due to unique genomic events but rather are also due to differences in how the key (common) initial genomic events interact and select subsequent downstream

behavior, including patterns of metastasis, response to treatment and clinical outcome.

To our knowledge, this is the first study to identify APC R1450* and AMER1 (A1309*)

similar to APC R1450* suggests that this mutation may be biologically

Significant differences between RCC and LCC and RCC and rectal cancers were

reprogramming is evolving as a hallmark of cancer, and lipid metabolism dysregulation

A unique aspect of this study is that it also evaluated protein expression patterns in RCC, LCC and rectal cancers using the TCPA dataset. We identified dysregulation in several key hub proteins, their interactomes and known pathways that have been implicated in oncogenesis (Fig 7). A somewhat surprising observation from this analysis is that the protein hubs and their interactomes are distinct for each of the anatomically

behavior of tumors. In conclusion, despite the limitation that this study is primarily computational and needs to be confirmed experimentally, it demonstrates that RCC and LCC have different genomic and proteomic profiles, whereas LCC and rectal cancers are related but distinct. Ultimately, optimal management of colorectal cancers will

signatures with respect to anatomical location and clinical behavior.

Table).

Patients were divided into non-hypermutated, hypermutated and POLE mutations were also excluded from further analysis because of their low numbers to allow

associated clinical data were excluded from analysis due to their small sample size.

= 69), stage II 32.5% (n = 126), stage III 29.7% (n = 115) and stage IV 17.5% (n = 68). In the MSKCC dataset, we only analyzed patients with stage IV disease (n = 703). The

Since the number of rectal cancer samples was significantly lower than that of RCC and LCC, they

was performed by using the GISTIC algorithm on GenePattern to identify focal

each overlapping was smaller than 10 megabases and had < 25 genes then 2 peaks were considered the same. Candidate genes related to significant amplifications and deletions were identified using pan-cancer patterns of somatic copy-number alteration,

were removed from the data. Again, hypermutated samples were excluded from the analyses due to small sample size, especially LCC where only 5 samples were

In our analysis, LCC and rectal cancer were set as controls and were compared to RCC. In addition, LCC was set as control when differential gene expression was compared to rectal cancer. Therefore, differentially expressed genes with positive log2

Gene quantification values are htseq-count data and were obtained from the TCGA

respect to the entire RNA repertoire which may vary drastically from sample to sample.
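One common way to correct raw counts for this kind of sample-to-sample variation is median-of-ratios size-factor estimation, the approach implemented by the estimateSizeFactors function of DESeq2. The Python sketch below only illustrates the idea on a toy matrix; it is not the authors' pipeline, and the function names `size_factors` and `normalize` and the toy counts are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of per-gene rows, each row holding raw counts per sample.
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    # Keep genes with a nonzero count in every sample so all logs are finite.
    expressed = [row for row in counts if all(c > 0 for c in row)]
    if not expressed:
        raise ValueError("no gene has nonzero counts in every sample")
    log_ratios = [[] for _ in range(n_samples)]
    for row in expressed:
        logs = [math.log(c) for c in row]
        # Log of the per-gene geometric mean (the pseudo-reference sample).
        log_geo_mean = sum(logs) / n_samples
        for j, value in enumerate(logs):
            log_ratios[j].append(value - log_geo_mean)
    # Per-sample median log ratio, converted back to the linear scale.
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts):
    """Divide each sample's counts by that sample's size factor."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Hypothetical toy data: sample 2 was sequenced about twice as deeply.
toy_counts = [[10, 20], [100, 200], [50, 100], [0, 7]]
toy_factors = size_factors(toy_counts)
toy_normalized = normalize(toy_counts)
```

After normalization, the genes expressed in both toy samples have equal values across the two samples, which is the property that makes downstream fold-change estimates comparable.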

We conducted normalization of the samples using estimateSizeFactors of the DESeq2

We also employed a hierarchical (agglomerative) clustering tool, REVIGO [38], using gene ontology (GO) terms for biological processes enriched in our data from DAVID to identify differences and similarities between RCC, LCC and rectal cancers. A p-value of < 0.05 was considered significant.

(excluding shared micro RNAs between RCC/LCC and RCC/rectal cancers).

(highlighted green).

(RCC vs LCC log2FC greater than 1.5 and padj less than 0.05) using DAVID _ high

with their transcription factors, target genes, pathological process and diseases.
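The thresholds quoted in this study (padj less than 0.05 and log2 fold change greater than 1.5) can be applied to a differential expression table as in the Python sketch below. Two points are assumptions rather than statements from the text: that the cutoff is symmetric (log2FC below -1.5 counts as downregulated), and that a positive log2FC means higher expression in the tested group than in the control. The function name `classify_genes` and the gene labels are hypothetical placeholders.

```python
def classify_genes(results, padj_cutoff=0.05, lfc_cutoff=1.5):
    """Split differential-expression results into up- and downregulated genes.

    results: iterable of (gene, log2_fold_change, adjusted_p) tuples.
    An adjusted p-value of None marks a gene with no usable test result.
    Assumes positive log2FC means higher expression in the tested group
    than in the control group, and a symmetric fold-change cutoff.
    """
    up, down = [], []
    for gene, lfc, padj in results:
        # Skip genes that are untested or not significant.
        if padj is None or padj >= padj_cutoff:
            continue
        if lfc > lfc_cutoff:
            up.append(gene)
        elif lfc < -lfc_cutoff:
            down.append(gene)
    return up, down


# Hypothetical toy results, not data from the study.
toy_results = [
    ("GENE_A", 2.3, 0.001),   # significant, upregulated
    ("GENE_B", -1.9, 0.01),   # significant, downregulated
    ("GENE_C", 3.0, 0.2),     # large change but not significant
    ("GENE_D", 0.4, 0.0001),  # significant but small change
    ("GENE_E", -2.5, None),   # no adjusted p-value available
]
up_genes, down_genes = classify_genes(toy_results)
```

Counting the two returned lists gives the kind of "upregulated versus downregulated" tallies reported for each pairwise comparison.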