Pan-cancer characterization of expression and clinical relevance of m6A-related tissue-elevated long non-coding RNAs

Main text N-methyladenosine (mA) has become a critical internal RNA modification, and it plays important roles in the development and progression of cancer [1]. mA has also been found in diverse non-coding RNAs, such as microRNAs and long noncoding RNAs (lncRNAs) [2]. LncRNAs comprise a large class of RNA transcripts and are critical regulators of gene expression. The regulatory effectiveness of lncRNAs is closely associated with spatial expression, whose dysregulation often influences cancer development and progression [3]. For these reasons, global characterization of lncRNA spatial expression across tissues or cancers could improve our understanding of lncRNA functions. Recently, LncRNA Spatial Atlas (LncSpA) and landscape of mA have been proposed as valuable resources to understand lncRNA and mA regulatory functions across different tissues [4, 5]. However, we still lack understanding of the distribution and functions of mA modification in lncRNAs, particularly the tissue-elevated (TE) lncRNAs. In this study, we aimed to systematically characterize the distribution and clinical relevance of mA-related TE lncRNAs across tissues and cancer types. We found that TE lncRNAs were found to be regulated by mA modification across tissues, particular brain tissues. We also investigated the correlation between expression of mA regulators and TE lncRNAs, and found that numbers of mA-related TE lncRNAs were associated with expression of mA regulators. We assessed the clinical prognostic values of mA-regulated TE lncRNAs. We identified several mA-related TE lncRNAs as potentially useful markers for prognostic stratification. Our analysis highlights the importance of mA modification in the regulation of lncRNA expression and helps bridge the knowledge gap between lncRNA expression and phenotypes.

Pan-cancer characterization of expression and clinical relevance of m 6 A-related tissue-elevated long non-coding RNAs Kang Xu 1,2 † , Yangyang Cai 1,2 † , Mengying Zhang 2 † , Haozhe Zou 2 , Zhenghong Chang 2 , Donghao Li 2 , Jing Bai 1,2* , Juan Xu 1,2* and Yongsheng Li 1* Main text N 6 -methyladenosine (m 6 A) has become a critical internal RNA modification, and it plays important roles in the development and progression of cancer [1]. m 6 A has also been found in diverse non-coding RNAs, such as microRNAs and long noncoding RNAs (lncRNAs) [2]. LncRNAs comprise a large class of RNA transcripts and are critical regulators of gene expression. The regulatory effectiveness of lncRNAs is closely associated with spatial expression, whose dysregulation often influences cancer development and progression [3]. For these reasons, global characterization of lncRNA spatial expression across tissues or cancers could improve our understanding of lncRNA functions. Recently, LncRNA Spatial Atlas (LncSpA) and landscape of m 6 A have been proposed as valuable resources to understand lncRNA and m 6 A regulatory functions across different tissues [4,5]. However, we still lack understanding of the distribution and functions of m 6 A modification in lncRNAs, particularly the tissue-elevated (TE) lncRNAs.
In this study, we aimed to systematically characterize the distribution and clinical relevance of m 6 A-related TE lncRNAs across tissues and cancer types. We found that TE lncRNAs were found to be regulated by m 6 A modification across tissues, particular brain tissues. We also investigated the correlation between expression of m 6 A regulators and TE lncRNAs, and found that numbers of m 6 A-related TE lncRNAs were associated with expression of m 6 A regulators. We assessed the clinical prognostic values of m 6 A-regulated TE lncRNAs. We identified several m 6 A-related TE lncRNAs as potentially useful markers for prognostic stratification. Our analysis highlights the importance of m 6 A modification in the regulation of lncRNA expression and helps bridge the knowledge gap between lncRNA expression and phenotypes.

TE lncRNAs are associated with m 6 A modification across tissues
We first retrieved the TE lncRNAs from 38 normal tissues from LncSpA in 4 data resources (Fig. 1a), including Human Body Map (HBM2.0), Human Protein Atlas (HPA), the Genotype-Tissue Expression (GTEx), and the Function Annotation Of The Mammalian Genome (FANTOM) project. In total, 9837, 13,337, 10,718, and 74,767 TE lncRNAs were obtained from GTEx, HPA, HBM2.0, and FANTOM5, respectively. Higher numbers of TE lncRNAs were found in tissues of the brain and testis tissues than in other tissues ( Fig. 1b and Additional file 1: Table S1). Next, we mapped all the m 6 A modification peaks to lncRNAs and identified approximately 511-1600 lncRNAs regulated by m 6 A across tissues (Fig. 1c, Additional file 2: Figure S1 and Additional file 3: Table S2). We next assessed the proportion of m 6 A-modified TE lncRNAs among human tissues. We found that brain tissues had the highest proportion of TE lncRNAs with m 6 A modifications ( Fig. 1d and Additional file 2: Figure S2). Approximately 14.89-19.20% TE lncRNAs were m 6 A-modified in brain tissues than in other tissues in four data resources. Although there were higher numbers of TE lncRNAs in testis tissues, the proportion of m 6 A-modified TE lncRNAs was small (Fig. 1d).
Next, we compared the overlap of m 6 A modified TE lncRNAs among tissues from different data resources. The Simpson index was calculated for two tissues from different sources. High correlations were observed for the same tissues across different sources (Fig. 1e), suggesting that m 6 A modified TE lncRNAs were conserved across different resources. To investigate potential tissue specificity of the m 6 A-modified TE lncRNAs, we calculated the percentage of m 6 A-modified TE lncRNAs and non-TE lncRNAs in each tissue. There were no significant differences observed for the two lncRNA categories for the most tissues, which is consistent with the observations in protein coding genes [5]. However, the proportion of m 6 A-modified TE lncRNAs is significantly higher than that of non-TE lncRNAs in brain tissues ( Fig. 1f and Additional file 2: Figure S3). We explored the number of m 6 A peaks for lncRNAs across tissues. We found that the majority of m 6 A peaks in lncRNAs were in brain tissues (Additional file 2: Figure S4). Collectively, these results indicated that TE lncRNAs are associated with m 6 A modification across tissues and are more prone to be regulated by m 6 A in brain tissues than in other tissues.

Co-expression network of TE lncRNAs and m 6 A regulators
The regulatory effects of m 6 A modification are primarily determined by regulators, including readers, writers, and erasers [6]. The extent to which variation in m 6 A modification of TE lncRNAs may be attributed to the expression of m 6 A regulators remains unknown. Thus, we next sought to analyze the correlation between the expressions of m 6 A-modified TE lncRNAs and regulators. In total, we identified 4862 correlations among 860 TE lncRNAs and 20 m 6 A regulators in 4 resources (Additional file 2: Figure  S5A and Additional file 4: Table S3). Numbers of TE lncRNAs were associated with expression of m 6 A regulators in all four sources, including AC091878.1, LINC00854 and AC007879.5 (Additional file 2: Figure  S5B). In contrast, we calculated the number of TE lncRNAs correlated with each m 6 A regulators. Higher numbers of TE lncRNAs were found to be correlated with  Figure S6).
Notably, we identified several TE lncRNA-regulator pairs that had been verified in literature. We took PVT1 as an example and found its expression to be significantly correlated with YTHDF2 (Additional file 2: Figure  S5C, R = 0.64, P = 0.0003). Evidence has shown that YTHDF2 and PVT1 interact and that YTHDF2 plays critical roles in the stability of PVT1 [7]. Another example is SOX2-OT, which has been reported to play an oncogenic role in cancer. It was identified as a TE lncRNA in brain tissues from all four sources. We found its expression to be significantly closely correlated with HNRNPA2B1 (Additional file 2: Figure S5D, R = 0.56, P = 0.0005). It has been shown that SOX2-OT can regulate cancer proliferation and metastasis through the miR-146b-5p/HNRNPA2B1 pathway. We also found a significant correlation between KCNK15 − AS1 and ALKBH5 (Additional file 2: Figure S5E, R = 0.48, P = 0.0108). ALKBH5 had been demonstrated to inhibit cancer motility by demethylating lncRNA KCNK15-AS1 [8]. Together, all these results suggest that m 6 A modification of TE lncRNAs is partially regulated by the expression of m 6 A regulators.

Association of m 6 A-modified TE lncRNAs with tumor prognosis
LncRNA has been identified as a biomarker suitable for the classification of cancer patients. We next investigated the relationship between expression of m 6 A modified TE lncRNAs and patient survival. We first manually mapped the m 6 A modification in human tissues to cancer types and identified 104-621 TE lncRNAs in 16 cancers (Fig. 2a). Cancers with similar tissue of origin were clustered together based on the overlap of TE lncRNAs, such as LGG and GBM, COAD, and READ. In addition, numbers of m 6 A-modified TE lncRNAs were identified across cancer types, ranging from 3 to 105 ( Fig. 2a and Additional file 2: Figure S7).
We next explored the differences in survival between patients with high-and low-levels of lncRNA expression and identified 83 protective and 18 risky m 6 A-modified TE lncRNAs across cancer types ( Fig. 2a and Additional file 5: Table S4). Moreover, we identified 28 m 6 Amodified TE lncRNAs that had significantly higher expression in cancer patients than in healthy controls and 8 m 6 A-modified TE lncRNAs that had significantly lower expression ( Fig. 2a and Additional file 5: Table  S4). There were two m 6 A-modified TE lncRNAs (F11-AS1 and LINC01018) showing significantly lower expression in hepatocellular carcinoma patients than in controls, and these lower expressions were associated with worse survival rates (Fig. 2b-e). F11-AS1 can inhibit HBV-related hepatocellular carcinoma progression by regulating NR1I3 via binding to microRNA-211-5p. LINC01018 has a novel tumor suppressor role in hepatocellular carcinoma by sponging miR-182-5p [9,10]. We also found lower expression of m 6 A-modified MIR325HG to be correlated with worse patient survival in both LGG and GBM (Fig. 2f-g). These results suggest that these TE lncRNAs could be potentially tumor suppressors in cancer.
We next tried to determine the functions of F11-AS1, LINC01018 and MIR325HG. We performed Gene Set Enrichment Analysis (GSEA) on cancer patients. We found that these m 6 A-modified lncRNAs were involved in a number of cancer hallmark-related functions (Additional file 2: Figure S8 and Additional file 6: Table S5), such as DNA repair and epithelial mesenchymal transition pathways (Additional file 2: Figure S9). Taken together, all these results suggest a connection between m 6 A modified TE lncRNAs and the risk of diseases.

Conclusions
We have shown the prevalence of m 6 A modification in TE lncRNAs across tissues and cancer types. The expression levels of m 6 A-modified TE lncRNAs were significantly closely associated with the activity of m 6 A regulators. Several studies have also shown that m 6 Am can regulate the expression of noncoding RNAs. Thus, it would also be interesting to integrate such m 6 A and m 6 Am data to identify potential lncRNA biomarkers in cancer. In summary, our work reveals the landscape of m 6 A-modified TE lncRNAs and provides a valuable resource for functional studies of m 6 A and lncRNA functions in the future.