Comprehensive analyses of m6A regulators and interactive coding and non-coding RNAs across 32 cancer types

N6-Methyladenosine (m6A) is an RNA modification that interacts with numerous coding and non-coding RNAs and plays important roles in the development of cancers. Nonetheless, the clinical impacts of m6A interactive genes on these cancers largely remain unclear since most studies focus only on a single cancer type. We comprehensively evaluated m6A modification patterns, including 23 m6A regulators and 83 interactive coding and non-coding RNAs among 9,804 pan-cancer samples. We used clustering analysis to identify m6A subtypes and constructed the m6A signature based on an unsupervised approach. We used the signatures to identify potential m6A modification targets across the genome. The prognostic value of one target was further validated in 3,444 samples from six external datasets. We developed three distinct m6A modification subtypes with different tumor microenvironment cell infiltration degrees: immunological, intermediate, and tumor proliferative. They were significantly associated with overall survival in 24 of 27 cancer types. Our constructed individual-level m6A signature was associated with survival, tumor mutation burden, and classical pathways. With the signature, we identified 114 novel genes as potential m6A targets. The gene shared most commonly between cancer types, BCL9L, is an oncogene and interacts with m6A patterns in the Wnt signaling pathway. In conclusion, m6A regulators and their interactive genes impact the outcome of various cancers. Evaluating the m6A subtype and the signature of individual tumors may inform the design of adjuvant treatments.

RNAs, which contribute a lot to cancers. In this study, we investigated m6A regulators along with their interactive coding and non-coding RNAs in a pan-cancer setting.

Results and discussions
Landscape of m6A patterns in TCGA pan-cancer To characterize m6A patterns and screen for potential targets, we developed a four-step computational framework among 9804 pan-cancer samples in The Cancer Genome Atlas (TCGA) (Additional file: Methods, Figure  1A). A total of 23 m6A regulators, 56 m6A interactive protein-coding genes, 10 lncRNAs, and 17 miRNAs were included in this study after quality control (Table S1).
The 106 genes had close co-expression relations (Pearson r > 0.3) (Fig. 1b). In the co-expression regulation network, most of the m6A regulators and some of the protein-coding genes were hub genes that interacted with other genes (Fig. 1c).
In the differential expression comparisons of tumoradjacent normal tissues across different cancer types, many genes had higher expression levels in tumor tissues while some protein-coding genes showed the opposite relationship ( Figure S1). These genes included ASB2, P2RX6, AXL, ID2, and SOCS2. Four miRNAs including miR-143, miR-29a, miR-125b-1, and miR-145 showed lower expression in tumor tissues. All of the lncRNAs had higher expression in tumor tissues. Using the first two principal components (PCs) from tumor and normal tissues, we found that the m6A patterns had good differential diagnostic value ( Figure S2). The Area Under Curve (AUC) passed 90% in most cancers (Fig. 1d, S3).

Constructing m6A subtypes and signature in pan-cancer
We used the K-means algorithm to categorize patients into different m6A subtypes (clusters) within each cancer type, separately. Three subtypes were identified by the Elbow method ( Figures S4-S5). The log-rank test detected that the defined subtypes were significantly associated with overall survival in 24 of 27 cancers after excluding cancers with death proportion < 10% ( Figure S6). Specifically, we defined the clusters sorted by median survival time (MST). The group with the longest MST was defined as cluster 1, while the group with the shortest MST was defined as cluster 3, and the middle group was cluster 2. Compared with cluster 1, clusters 2 and 3 had significantly worse survival in 22 of 27 cancers (P trend < 0.05 in Cox proportional-hazards model) (Fig. 2a). In addition, the classifiers remained significant in most cancer types when the clinical outcome was progression-free interval (PFI) (16 of 26) (Fig. 2b) or disease-specific survival (DSS) (18 of 26) (Fig. 2c). In the Kaplan-Meier survival analysis of overall population, m6A subtypes could stratify patients' survival significantly after adjusting for cancer types (P < 2 × 10 − 16 ) (Fig. 2d). Distributions of four somatic mutations were significantly different among different m6A subtypes (FDR-correct P values of Chi-square test < 0.05), including TP53, NOTCH1, CTNNB1, and PTEN ( Figure S7).
In the analysis of single sample gene set enrichment analysis (ssGSEA) of tumor microenvironment (TME) cell infiltration, the beta coefficients (95% CIs) of clusters 2 and 3 were shown in Fig. 2e while cluster 1 was used as the reference group. A total of 27 of 28 immune categories showed significant differences between m6A subtypes after FDR correction except the central memory CD8 T cell category. Cluster 3 had the lowest TME infiltration degree while cluster 1 had the highest. Thus, we defined cluster 1 as an immunological subtype, cluster 2 as an intermediate subtype, and cluster 3 as tumor proliferative subtype (Fig. 2f).
We checked the associations between the m6A signature and tumor mutation burden (TMB) score in the overall pan-cancer population. Unsurprisingly, they had a strong positive correlation with Pearson r = 0.53 in the overall population ( Figure S10A). The m6A signature was positively correlated (P FDR < 0.05) with the TMB scores in 16 cancer types ( Figure S10B).
We applied the ssGSEA to compute the enrichment score of 2236 canonical pathways across the genome. A total of 949 pathways (42.4%) were significantly associated with the m6A signature after FDR corrections, indicating that m6A modifications are linked to a broad range of biological processes (Table S2), with consistent results across most of cancer types ( Figure S11). Also identified were some classical pathways related to cancer therapy, such as the FOXM1 pathway, cell cycle regulation, polo-like kinase 1 (PLK1)-related pathways, and Aurora A/B pathways.
Further, we validated the 109 protein-coding genes in the external datasets. Forty genes (37.7%) were associated with survival in the meta-analysis with P FDR < 0.05, indicating they played important roles in cancers (Table S4).

Conclusions
In summary, this study demonstrates that m6A regulators and interactive genes may play an important role in cancer outcomes. Our systemic evaluation of m6A patterns improves the understanding of the dysregulation of RNA methylation in tumor microenvironments. The predicted interactive target genes may provide additional insight into clinical therapeutic targets.
Additional file 1. Supplementary materials and methods, and supplementary Table S1-S7, Figure S1-S15. Table S1. Summary of genes that interact with m6A modification. Table S2. Top 200 canonical pathways associated with the m6A signature. Table S3. Potential m6A modification targets associated with the m6A signature. Table S4.
(See figure on previous page.) Fig. 2 a-c Associations of the m6A subtypes and pan-cancer overall survival (OS), Progression-free interval (PFI) and disease-specific survival (DSS). The hazard ratios (HR) are evaluated by the trend association (clusters 1-3) in the Cox proportional hazard models. d Kaplan-Meier plots of the m6A subtypes and overall survival. e Associations between m6A subtypes and the ssGSEA scores of tumor microenvironment cell infiltration. Cluster 1 is used as a reference group. f Summary of the characteristics of different m6A subtypes. g Correlation of the m6A signature and m6A subtypes. h Distributions of the m6A signature across different clinical stage (I-IV). i Kaplan-Meier plot of the m6A signature and overall survival. We categorized the signature into five subgroups with equal sample sizes and adjusted the curves with age, gender, stage, cancer types and probable estimations of expression residual (PEER) factors. j Correlation plot of the m6A signature and median survival time (MST) of each cancer type. k The top potential m6A targets associated with the m6A signature in ≥12 cancer types. We used the OncoScore system to evaluate their relationships with cancer based on previous literature reports Independent validation of the potential m6A targets. Table S5. Study characteristics of the pan-cancer datasets. Table S6. Sample size of tumor-normal pairs in each cancer type. Table S7. Gene list used to generate the m6A subtypes in each cancer. Figure S1. Heatmap of the fold change (FC) values in pan-cancer. Figure S2. PCA plots of m6A related genes in tumors and adjacent normal tissues. Figure S3. Receiver-operating characteristic curves of predicted performance to discriminate from tumor and normal tissues. Figure S4. Average cost of different clusters using the Elbow method. Figure S5. Distribution of the m6A subtype across each cancer type. Figure S6. Kaplan-Meier plots of m6A subtypes and overall survival in pan-cancer. Figure S7. Associations between somatic mutations and m6A subtypes. Figure S8. Heatmap of the genes associated with overall survival in at least five cancer types and used to generate an m6A signature. Figure S9. Distribution of m6A signature in low-grade (grade 1&2) and high-grade (grade 3&4) patients. Figure S10. Associations between tumor mutation burden score and m6A signature in pan-cancer. Figure S11. The top 20 biological pathways that are associated with the m6A signature. Figure S12. Differential comparison of BCL9L in TCGA tumor and adjacent-normal tissues. Figure S13. BCL9L and survival outcome in pan-cancer. Figure S14. External validation of BCL9L prognostic value in public GEO datasets. Figure S15. Correlations between BCL9L and m6A interactive genes in the Wnt signaling pathway. (DOCX 7598kb)