Long noncoding RNAs as novel predictors of survival in human cancer: a systematic review and meta-analysis
© The Author(s). 2016
Received: 18 February 2016
Accepted: 14 June 2016
Published: 28 June 2016
Expression of various long noncoding RNAs (lncRNAs) may affect cancer prognosis. Here, we aim to gather and examine all evidence on the potential role of lncRNAs as novel predictors of survival in human cancer.
We systematically searched through PubMed, to identify all published studies reporting on the association between any individual lncRNA or group of lncRNAs with prognosis in human cancer (death or other clinical outcomes). Where appropriate, we then performed quantitative synthesis of those results using meta-analytic methods to identify the true effect size of lncRNAs on cancer prognosis. The reliability of those results was then examined using measures of heterogeneity and testing for selective reporting biases.
Three hundred ninety-two studies were screened to eventually identify 111 eligible studies on 127 datasets. In total, these represented 16,754 independent participants pertaining to 53 individual and 6 grouped lncRNAs within a total of 19 cancer sites. Overall, 83 % of the studies we identified addressed overall survival and 32 % of the studies addressed recurrence-free survival. For overall survival, 96 % (88/92) of studies identified a statistically significant association of lncRNA expression to prognosis. Meta-analysis of 6 out of 7 lncRNAs for which three or more studies were available, identified statistically significant associations with overall survival. The lncRNA HOTAIR was by far the most broadly studied lncRNA (n = 29; of 111 studies) and featured a summary hazard ratio (HR) of 2.22 (95 % confidence interval (CI), 1.86–2.65) with modest heterogeneity (I2 = 49 %; 95 % CI, 14–79 %). Prominent excess significance was demonstrated across all meta-analyses (p-value = 0.0003), raising the possibility of substantial selective reporting biases.
Multiple lncRNAs have been shown to be strongly associated with prognosis in diverse cancers, but substantial bias cannot be excluded in this field and larger studies are needed to understand whether this prognostic information may eventually be useful.
Keywords: LncRNA; Cancer; Cancer biomarkers; Prognosis; Survival analysis; Excess significance; Small-study effects; Selective reporting biases
Non-coding RNAs (ncRNAs) have been proposed in the last decade as regulators of cancer pathways and biomarkers of cancer outcomes [1–4]. Potentially informative biomarkers based on ncRNAs include microRNAs (miRs) and the larger long non-coding RNAs (lncRNAs). Until recently, ncRNAs were disregarded as ‘junk’; although they constitute the large majority of transcribed RNAs, their role in normal development and cellular physiology in health and disease is only now becoming apparent [2, 6, 7].
The term lncRNA refers to any ncRNA consisting of more than 200 nucleotides. LncRNAs are functionally heterogeneous molecules [6, 8], themselves sub-classified into large intergenic non-coding RNAs (lincRNAs), transcribed ultraconserved regions (T-UCRs) and many others. Of an estimated 140,000 putative ncRNAs in total, lncRNAs are thought to constitute the largest class, with the most comprehensive approach to date confirming 58,648 expressed lncRNAs. Even though the function of lncRNAs is still being debated, certain lncRNAs have been implicated in functions related to regulation of gene expression in health and disease [2, 6–8, 12–15]. Well-studied examples include the lncRNA Xist, which initiates X-chromosome inactivation in female cells by recruiting repressive complexes to the X-chromosome under inactivation [16–18], and H19, which has been shown to play a significant role in genomic imprinting [19, 20].
Of particular interest is that lncRNAs are now recognized as major players in tumorigenesis [7–9, 21–23]. In this context, the most well-studied lncRNA is HOTAIR (HOmeobox (HOX) Transcript AntIsense RNA), which has been shown to recruit Polycomb Repressive Complex 2 (PRC2) and eventually lead to epigenetic silencing of metastasis suppressor genes [2, 24].
More than 20 meta-analyses studying the role of lncRNAs in cancer prognosis have been published so far, all within the past 2 years. All of these studied a single lncRNA, either in relation to a specific cancer or to any cancer. The two most studied lncRNAs are MALAT1 and HOTAIR, which have been the subject of 10 and 7 meta-analyses respectively. The latest meta-analysis on MALAT1 for all cancer types showed that its upregulation is statistically significantly associated with poor overall survival (pooled hazard ratio [HR], 2.14; 95 % CI, 1.74–2.64) with low between-study heterogeneity (I2, 4.3 %; p-value = 0.399), on the basis of 9 studies. The results were similar to the latest meta-analysis of HOTAIR (HR, 2.33; 95 % CI, 1.77–3.09), but with significant between-study heterogeneity (Cochran’s Q-test p-value = 0.016), on the basis of 16 studies. Interestingly, all meta-analyses published so far have been produced by Chinese groups and all identified a statistically significant association between the lncRNAs studied and prognosis in cancer. However, no systematic review and meta-analysis to date has identified all lncRNAs studied in the context of cancer and assessed to what extent these might be of prognostic significance.
In this paper, we aimed to examine the potential role of all lncRNAs ever investigated in the context of cancer survival prediction, as novel predictors of survival in human cancer. We utilized a field-wide meta-analysis approach to systematically identify and examine all published papers trying to associate lncRNAs with prognosis in human cancer, and to quantitatively synthesize data directly related to prognosis wherever three or more studies on an lncRNA had been done.
This report has been structured on the basis of PRISMA.
We considered published reports of a prospective or retrospective study design that had explored the association of any single or combination of stated lncRNAs to any of the following types of survival analysis: disease-specific survival (DSS, duration of time from the day of diagnosis to the day of death due to cancer); metastasis-free survival (MFS, duration of time from day of diagnosis to the day of diagnosing a metastatic event); overall/cumulative survival (OS, duration of time from day of diagnosis to the day of death due to any cause); progression/event/disease-free survival (PFS, duration of time from day of first treatment to the day evidence of cancer progression is identified or the patient dies of any cause); and recurrence-free survival (RFS, duration of time from day of cure from cancer to the day evidence of cancer progression/recurrence is identified). Survival analyses measuring different types of survival were treated separately at all times. Studies describing the association of individual or groups of lncRNAs with clinicopathologic variables (e.g. Stage, Grade, Distant metastasis, etc.), without specifically examining associations to any of the aforementioned survival analyses, were excluded. We likewise excluded cross-sectional studies and studies concerning genetic alterations of an lncRNA (e.g. polymorphisms or methylation patterns). Any kind of quantitative lncRNA analysis (quantitative real time–PCR, in situ hybridization) was eligible.
For meta-analysis eligibility, a study had to also provide the effect size and confidence interval for the association of an individual or group of lncRNAs with any of the above survival outcomes, or report information through which this effect size and confidence interval could be calculated [29, 30]. Wherever the same cohort had published more than one overlapping analysis, we only used the most encompassing data (for example, the classification of glioma would be preferred over glioblastoma multiforme). Two reviewers (S. Serghiou and A. Kyriakopoulou) identified eligible studies, and any contested articles were adjudicated by a third reviewer (J. P. A. Ioannidis).
We systematically searched PubMed (1950 to September, 2015) for studies of any language that analyzed associations between lncRNAs and prognosis in human cancer. Our search strategy was developed in consideration of previous recommendations and used the clinical queries prognosis filter, which has been reported to have an average estimated sensitivity of 92 % for detecting articles related to prognosis [5, 31]. Our search term was: (Prognosis/Broad [filter]) AND ((lncRNA OR “lnc RNA” OR “long noncoding ribonucleic acid” OR “long noncoding RNA” OR “long non-coding ribonucleic acid” OR “long intergenic noncoding RNA” OR “long intergenic non-coding RNA” OR “long non-coding RNA” OR “long ncRNA” OR “lincRNA” OR “linc RNA”) AND (cancer OR carcinoma OR tumor OR neoplas* OR tumour OR malignan* OR metastat* OR metastas* OR leukemia OR leukaemia OR lymphoma OR recurren* OR “lymph node” OR response) AND (Humans[Mesh] AND English[lang])). The search was last updated to include articles published through September 26, 2015.
We used the programming language R to remove duplicate records. Titles and abstracts were screened to identify relevant articles, and the full manuscripts of the relevant articles were screened against our eligibility criteria. Any uncertainties were resolved by consensus with JPA. Data were collected by two reviewers (SS, AK) and saved in a pre-designed extraction form on Google Sheets. Where information was ambiguous (for example, mentioning multiple types of lncRNA quantification methods without clarifying which one provided the quantities used in the survival analysis), this was labelled as ‘unclear’. An attempt was made to contact the authors when information was clearly logically inconsistent, for example when a quoted hazard ratio (HR) lay outside its confidence interval (CI), but none replied. In one paper, the lncRNA expression level was subdivided into low versus medium versus high; for this paper we only extracted the comparison between low and high expression levels.
The following data were extracted for all articles following the CHARMS checklist: title; authors; year of publication; journal of publication; groupings (i.e. whether lncRNAs were studied one by one or in groups); what lncRNAs were studied; whether an agnostic approach to identifying the studied lncRNAs was used (where an agnostic approach would be one assuming no prior knowledge regarding the choice of lncRNA to be studied); cancer site (e.g. brain) and cancer subtype (e.g. glioblastoma multiforme); whether a paper reported clinicopathologic data of its sample and which ones; whether an attempt of associating those clinicopathologic data to lncRNAs was made and for which ones; whether an attempt of associating clinicopathologic data to prognosis was made and for which ones; whether an attempt was made to explain the clinical outcomes using non-clinical studies (in vivo, in vitro); the types of survival analyses used (as above); type of study design (prospective cohort, retrospective cohort, unreported); means of lncRNA quantitative analysis (qRT–PCR, qPCR, in situ hybridization (ISH), other); and whether the paper tried to make any non-clinical associations of the identified lncRNAs to cancer in vitro.
For eligible articles we further extracted: country and city of origin of the study cohort; period of sample recruitment; range of sample ages; mean/median age with confidence interval; the population type (general population, non-general population (e.g. veterans), unreported); stage of cancer upon initial patient presentation; sample size; means of tissue preservation (frozen, paraffin-embedded, both, other); whether any preoperative treatment was given and, if so, which; the total number of lncRNAs studied; the type of metric the paper used to characterize its results (hazard ratio, relative risk, odds ratio, p-value); type of analysis (i.e. univariable or multivariable); lncRNA quantity cut-off and its unit (i.e. the threshold based on which lncRNA expression was deemed upregulated or downregulated by the study); the sample size of each comparison group; the minimum and maximum participant follow-up time; the number of censored participants throughout follow-up and whether this was explicitly stated or read off the Kaplan-Meier curves; the HR and its CI (provided or inferred, e.g. from p-values and HR point estimates); the p-value and whether this was statistically significant at p < 0.05; and whether an attempt to validate the reported results was made and, if so, what type of validation method was used (internal or external). For eligibility for meta-analysis, enough information to extract or calculate the natural logarithm of the hazard ratio and its variance must have been provided.
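The inference of a CI from a reported HR point estimate and p-value, mentioned in the extraction items above, can be sketched as follows. This is a minimal illustration rather than the extraction code used for this review: it assumes the p-value is two-sided and comes from a Wald-type test on the log hazard ratio, and the function name `ci_from_p_value` is hypothetical.

```python
from math import exp, log
from statistics import NormalDist


def ci_from_p_value(hr, p_value, level=0.95):
    """Approximate the CI of a hazard ratio from its two-sided p-value.

    Assumption (not stated in the review): the p-value comes from a
    Wald-type test on ln(HR), so |ln(HR)| / SE equals the z-score.
    """
    if hr <= 0 or hr == 1.0:
        raise ValueError("HR must be positive and different from 1")
    if not 0.0 < p_value < 1.0:
        raise ValueError("p-value must lie strictly between 0 and 1")

    normal = NormalDist()
    # z-score that corresponds to the reported two-sided p-value
    z_observed = normal.inv_cdf(1.0 - p_value / 2.0)
    # Standard error of ln(HR) implied by the point estimate and z-score
    se = abs(log(hr)) / z_observed
    # Critical value for the requested confidence level (about 1.96 for 95 %)
    z_critical = normal.inv_cdf(1.0 - (1.0 - level) / 2.0)
    lower = exp(log(hr) - z_critical * se)
    upper = exp(log(hr) + z_critical * se)
    return lower, upper, se
```

For example, `ci_from_p_value(2.0, 0.05)` returns a CI of approximately 1.0 to 4.0, because a p-value of exactly 0.05 places the lower 95 % limit on the null value. Estimates reconstructed this way are only as precise as the rounding of the reported p-value allows.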
Whenever multiple datasets were combined into a single dataset to study a specific lncRNA, we only extracted the summary HR, rather than extracting the HR respective to each constitutive dataset. If multiple datasets were assessed within the same study without being combined into a single dataset, we extracted the HR respective to each dataset, as they represent separate estimates. Where both the log-rank and Breslow tests were reported, only the log-rank was extracted. No cohort was used more than once and effect sizes describing a broader class of cancer (e.g. glioma) were preferred over subclassifications of that class (e.g. glioblastoma multiforme). Three studies reported effect sizes that were excluded from further consideration because the quoted HRs contradicted the text or because they were either outside the CI or could not possibly have led to the quoted CI [36, 37]; this led to complete exclusion of two out of these three studies [35, 37]. Our database can be freely accessed here: https://goo.gl/EjCDAp.
Risk of bias in individual studies
Risk of bias in individual studies was assessed on the basis of the framework of assessing internal validity of articles dealing with prognosis [30, 38] and recommendations regarding reporting of biomarker studies [39, 40].
Summary measures and synthesis of results
We meta-analyzed data on lncRNAs for which three or more estimates of their effect on a specific survival outcome were available. Therefore, meta-analyses were only done for OS and RFS. Effect sizes for OS and RFS were meta-analyzed separately. Our principal summary measure was the summary HR. Standard errors of the log HR were calculated using: ln (upper limit of CI/lower limit of CI)/(2 × 1.96). Estimates were synthesized using a random-effects model fitted by the restricted maximum-likelihood method. As previously described, four types of meta-analysis were done, using: (1) multivariable data, (2) univariable data, (3) multivariable data combined with univariable data whenever multivariable data were unavailable (preferentially multivariable) and (4) univariable data combined with multivariable data whenever univariable data were unavailable (preferentially univariable). Given the similarity between the estimates of all four types of meta-analysis and the importance of multivariable modelling in prognostic studies, this report only quotes the estimates of the ‘preferentially multivariable’ meta-analysis; the rest can be found in Additional file 1: Table S2. For each estimate we provide the effect size and 95 % CI. Heterogeneity was analyzed using the Q and I2 statistics and the 95 % CI of I2 was also calculated [41, 42]. These analyses were done using R and the package metafor 1.9-8. Data were combined for each lncRNA regardless of cancer type. Wherever an lncRNA had been analyzed three or more times for one or more specific cancer types, a post hoc subgroup analysis per cancer type was done for that lncRNA.
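The standard-error formula and the pooling step described above can be illustrated with a short sketch. Note the assumptions: the review fitted its random-effects models by restricted maximum likelihood in R with metafor, whereas this Python example substitutes the closed-form DerSimonian-Laird estimator of between-study variance to stay short and dependency-free, so its output will not match metafor exactly. The names `se_from_ci` and `pool_random_effects` are hypothetical.

```python
from math import exp, log, sqrt

Z_95 = 1.96  # normal critical value used in the review's SE formula


def se_from_ci(lower, upper):
    """SE of ln(HR): ln(upper limit / lower limit) / (2 x 1.96)."""
    return log(upper / lower) / (2 * Z_95)


def pool_random_effects(studies):
    """Pool (hr, ci_lower, ci_upper) tuples with a random-effects model.

    Assumption: DerSimonian-Laird is used here in place of the REML
    estimator reported in the review.
    """
    if len(studies) < 3:
        raise ValueError("the review pooled only three or more estimates")

    y = [log(hr) for hr, _, _ in studies]             # log hazard ratios
    v = [se_from_ci(lo, hi) ** 2 for _, lo, hi in studies]
    w = [1.0 / vi for vi in v]                        # fixed-effect weights
    sum_w = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w

    # Cochran's Q and the I2 statistic (percentage of variation
    # attributable to heterogeneity rather than chance)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance, truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    w_re = [1.0 / (vi + tau2) for vi in v]            # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_mu = sqrt(1.0 / sum(w_re))
    return {
        "hr": exp(mu),
        "ci": (exp(mu - Z_95 * se_mu), exp(mu + Z_95 * se_mu)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }
```

Calling `pool_random_effects([(2.0, 1.0, 4.0), (2.5, 1.2, 5.2), (1.8, 1.1, 2.9)])` returns the summary HR, its 95 % CI, Q, I2 and the estimated between-study variance for three illustrative (made-up) studies.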
Risk of bias across studies
Risk of publication bias is a significant concern in prognostic studies. We explored excess significance for factors reported by at least 3 studies. Briefly, for every meta-analyzed risk factor we compared the number of observed significant results (O) at α = 0.05 to the number of expected significant results (E), where E is the sum of the power of each study within a specific meta-analysis. Power was calculated taking as the plausible effect for the risk factor the effect seen in the most precise study (lowest standard error). The difference between O and E was assessed using a two-tailed binomial test, with α = 0.1, as previously suggested. O and E were also summed and compared across all meta-analyses.
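The comparison of O and E described above can be sketched as follows. This is an illustration under stated assumptions, not the code used for this review: the power of each study is approximated as the power of a two-sided Wald test on the log HR given that study's standard error, and the two-tailed binomial p-value sums the probabilities of all outcomes no more likely than the observed one. The function names are hypothetical.

```python
from math import comb
from statistics import NormalDist

_NORMAL = NormalDist()


def wald_power(log_hr, se, alpha=0.05):
    """Power of a two-sided Wald test to detect `log_hr` given `se`."""
    z_critical = _NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(log_hr) / se
    return _NORMAL.cdf(shift - z_critical) + _NORMAL.cdf(-shift - z_critical)


def binomial_two_tailed(k, n, p):
    """Two-tailed exact binomial p-value for k successes in n trials."""
    pmf = [comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(n + 1)]
    # Sum every outcome that is no more probable than the observed one
    threshold = pmf[k] * (1.0 + 1e-7)
    return min(1.0, sum(x for x in pmf if x <= threshold))


def excess_significance(studies, alpha=0.05):
    """Return (O, E, p-value) for a list of (log_hr, se) tuples."""
    # Plausible effect: the estimate from the most precise study
    plausible = min(studies, key=lambda s: s[1])[0]
    expected = sum(wald_power(plausible, se, alpha) for _, se in studies)
    z_critical = _NORMAL.inv_cdf(1.0 - alpha / 2.0)
    observed = sum(1 for y, se in studies if abs(y) / se > z_critical)
    n = len(studies)
    p_value = binomial_two_tailed(observed, n, expected / n)
    return observed, expected, p_value
```

The returned p-value would then be compared with the α = 0.1 threshold mentioned above. Using the average power E/n as the binomial success probability is a simplification that treats all studies in a meta-analysis as exchangeable.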
Literature search and description of studies
Of 127 identified datasets, only 2 were reported to represent a prospective cohort; of the rest, 19 were reported to represent a retrospective cohort and there was no relevant information for the remaining 106 datasets. No report specified what type of population their samples came from and for 113/127 datasets we have no information as to what sampling method was used to obtain the sample. For the remaining datasets, consecutive sampling was stated to have been used in 5 and random sampling in 4 datasets; 5 datasets were based on all patients ever seen by the clinic. Sampling method was reported disproportionately often for studies coming from the USA (4/9). A total of 94/127 datasets came from Asia (78 from China), followed by Europe (15/127) and America (13/127); there was no reported country of origin for 2 datasets and 3 datasets contained patients from multiple continents; the latter were multi-center cohorts. A total of 16,754 different patients were enrolled within these studies (avoiding double-counting samples that had been used for two or more analyses). Median sample size was 90 (IQR, 82; range, 30–997) and 69/127 datasets contained fewer than 100 participants (50 of these datasets came from China).
Mapping of lncRNA prognostic data
Table: Descriptive statistics of eligible studies (counts and percentages for the 111 eligible studies and 127 datasets)
Table: Details of the lncRNAs studied (columns: Number of cancer types (sample size); Times significant (%))
Meta-analysis for overall survival
Table: The results of our meta-analysis for each lncRNA using ‘primarily multivariable’ data (columns: lncRNA; HR (95 % CI); I2 (95 % CI); Observed (Expected, p-value)). Only the I2 and Observed columns and one row label are available here; each line below is one row.
I2 (95 % CI) | Observed (Expected, p-value)
49 % (14–79 %) | 25 (18.2, p-value = 0.002)
0 % (0–85 %) | 5 (4.1, p-value = 0.707)
0 % (0–47 %) | 2 (2.1, p-value = 1.000) [6 lncRNA risk score]
94 % (80–100 %) | 4 (2.1, p-value = 0.128)
0 % (0–98 %) | 2 (0.8, p-value = 0.170)
0 % (0–56 %) | 4 (2.7, p-value = 0.309)
0 % (0–98 %) | 3 (2.9, p-value = 1.000)
The only type of survival analysis other than OS studied 3 or more times in relation to a specific lncRNA was MFS for HOTAIR. This was investigated within 4 different studies in relation to 4 different cancers (breast, colorectal, esophageal, head and neck). Meta-analysis of these studies identified a summary HR of 2.54 (95 % CI, 1.62–3.98) with no statistically significant heterogeneity (Q-statistic, 5.16; p-value = 0.16).
Heterogeneity metrics and excess significance
Statistically significant heterogeneity was only observed in HOTAIR analyses, but substantial estimates of I2 were common. For HOTAIR and OS, a sensitivity analysis excluding the only study reporting an inverse correlation of HOTAIR to cancer survival generated an HR of 2.30 (95 % CI, 1.97–2.70) with I2 = 0 % (95 % CI, 0–59 %); for all other meta-analyses, no single study produced a major change in the I2.
There was excess significance across the whole field for overall survival: the binomial test gave a two-tailed p-value of 0.0003, with O = 42 statistically significant results against E = 30 expected statistically significant results across all meta-analyses with 3 or more studies each on OS. Among lncRNAs studied 5 or more times, excess significance was documented for HOTAIR (p-value = 0.002), but not for MALAT1 (p-value = 0.46).
In this systematic review and meta-analysis we have tried to gather all published papers evaluating the prognostic ability of lncRNAs in cancer. We have identified that a large number of lncRNAs have been evaluated within the context of cancer prognosis. Most of them have been evaluated only once in a published paper. Almost all of the published papers report that lncRNAs are statistically significant predictors of survival. There was often substantial heterogeneity between studies in the strength of the predictive effect. There was also strong evidence for small-study effects and for excess significance. This picture may be due to genuine differences across studies, such as different cancers and populations under study, and different adjustments made in multivariable models. However, it is also highly compatible with the presence of substantial publication bias and other selective reporting bias in this field, resulting in exaggerated effects in mostly small studies (most of which came from China) and in an implausibly high prevalence of nominally significant results.
It is well recognized that published literature on prognosis and the identification of prognostic markers is characterized by poor methodological quality, significant publication bias and wide heterogeneity in aspects of sample selection, such as pre/post-biopsy treatment or tissue preservation methods, and analysis, such as multivariable modelling and determination of cutoff values [30, 46]. As such, meta-analyses of prognostic studies may elicit summary effect sizes that are unrealistic. An evaluation of studies investigating the association of TP53 with risk of death from head and neck squamous cell carcinoma found that, even though readily available effect sizes would confirm that TP53 is a strongly significant prognostic factor, this association was completely abrogated after standardizing definitions of TP53 status and outcomes across papers and retrieving non-readily available information. These issues may also apply to the lncRNA literature. No two studies of our dataset were identical in all of lncRNA, cancer site, cut-off value and multivariable modelling, suggesting substantial room for selective reporting of analyses that could be done with very different models and definitions. Moreover, we suspect that publication bias may also be operating in the field.
Of particular interest is the excess significance we identified across the field (p-value = 0.0003). Despite the poor translation of cancer biomarkers into clinical practice [39, 49–51], out of 1575 studies on cancer biomarkers published in 2005, 95.8 % reported statistically significant results and only 1.3 % did not report any kind of statistically significant results. Indeed, as we have shown, this pattern is also prominent in the lncRNA cancer prognosis literature.
One way of reducing the selective reporting biases that have led to the above status quo, and thus of improving translatability, is transparency. The need to improve transparency has been mentioned repeatedly [39, 53]. Guidelines have been proposed to improve the reporting of prognostic markers (REMARK) [39, 51], multivariable prediction models (TRIPOD) and genetic risk prediction studies. Wider adoption of these guidelines may increase transparency, but it is unknown whether it will suffice to markedly reduce selective reporting.
In our cohort of studies, the extent of unreported items in Table 1 did not inspire confidence in the transparency and completeness of reporting practices. We also documented minimal use of validation (12/111 studies, 11 %), despite reports stressing the necessity and importance of validation in identifying the true effect size for prognostic tools [56, 57]. Furthermore, more than half of the identified studies had a sample size of less than 100. Small studies are known, both theoretically and empirically, to be associated with inflated estimates of effect size, not so much because of their limited sample size as because of lower quality standards, publication bias and selective reporting, which is why they lead to so-called ‘small-study effects’. Even though these have mostly been studied within the context of randomized-controlled trials, where they have been associated with a larger average effect size and at least double the between-study heterogeneity found in larger studies, similar problems may also occur in prognostic study research. The meta-analysis for HOTAIR, which is the most widely studied lncRNA in the context of cancer prognosis, clearly indicates that smaller studies tend to be less precise and report a higher effect size than larger studies. Inflated effects are common in biomarker studies, and this may also apply to the results for lncRNAs.
Another interesting point of note is the Chinese provenance of most papers in our collection of eligible studies (78/111, 70 %). In a previous analysis of genetic studies, it was shown that there is a vast Chinese literature, and that papers from China tend to utilize smaller sample sizes yet reach statistical significance far more commonly than other papers. This was attributed to more prominent publication bias against null results or other kinds of selection bias in pursuit of statistically significant results. Discrepancies between the Chinese literature and the rest of the world were also found in published meta-analyses of genomic data. Chinese meta-analyses (1) focused on the results of studies investigating individual candidate genes rather than the results of genome-wide association studies and (2) used nominal significance (i.e. p-value < 0.05) rather than genome-wide p-value thresholds to identify statistically significant results.
Although high-throughput methods, unlike traditional methods of identifying molecules directly relevant to a known cellular event, have led to an explosion in the number of potential biomarkers identified, very few have made their way to clinical practice, due to lack of appropriate evidence [50, 64, 65]. An important aspect in ascribing usefulness to a novel biomarker is its ability to add further predictive value, over and above that already possible using known prognostic factors. Unfortunately, in our sample, despite most multivariable analyses identifying lncRNAs as a statistically significant predictor, only about 30 % of the reported prognostic effects were adjusted for the two classically most relevant predictors of cancer prognosis (i.e. Stage and Grade).
Our analysis has several limitations. First, given that this report is only based on the results of a single database (PubMed), it is possible that relevant papers may have been missed. Second, our analysis utilized the Medical Subject Heading (MeSH) ‘Humans’ to limit our search results to those studies conducted in humans. Even though this is accepted practice and has been used previously in similar studies, that label is added to papers at the point of indexing, and thus some papers that were published close to our search date (September 26, 2015) and had not been MeSH-labeled yet would have been missed. We performed an updated search (June 5, 2016) for papers that did not have the Humans [MeSH] label and had been published before 2015 and found only two small studies [66, 67] that could potentially qualify for inclusion for the outcome of survival. This is a field with prolific literature and a substantial number of papers have continued to appear after our September 2015 search and will probably continue to appear in the near future. Third, our meta-analysis has attempted to combine multiple studies that are known to be heterogeneous in terms of cancer site and provenance of patient populations. Our estimates of heterogeneity metrics have wide 95 % confidence intervals. Fourth, on 51 occasions we had to calculate HRs ourselves based on data provided within the papers, which may not have provided the most accurate estimate of the HR possible, as most of the time these data were extracted from Kaplan-Meier curves. However, this practice has not been shown to yield results significantly different from direct methods of HR estimation. Fifth, even though every effort was made to exclude analyses of the same lncRNA using the same dataset of patients, it is possible that some overlapping data have been included, if their authors gave no hint as to the presence of overlap.
In conclusion, we have gathered a substantial amount of prognostic data regarding the association of various lncRNAs and survival. Our analysis identified a significant number of studies, most of which have been published within the last 2 years and most of which are of small sample size. Even though our systematic review and meta-analyses identified that almost all lncRNAs identified are statistically significant predictors of OS, it is very difficult to know the importance of these associations, given the detection of excess significance, small-study effects and the known difficulties with analyzing prognostic studies. Larger studies, ideally with collaborative teams using standardized approaches to measurement, adjustment, analysis, and reporting, will offer better insights into the prognostic value of lncRNAs.
RNA, Ribonucleic acid; ncRNAs, Noncoding RNAs; LncRNAs, Long noncoding RNAs; LincRNA, large intergenic non-coding RNAs; T-UCRs, transcribed ultraconserved regions; miR, microRNA; HOTAIR, HOmeobox (HOX) Transcript AntIsense RNA; PRC2, Polycomb Repressive Complex 2; HR, Hazard Ratio; CI, Confidence Interval; IQR, Interquartile range; DSS, Disease-specific survival; MFS, Metastasis-free survival; OS, Overall/cumulative survival; PFS, Progression/event/disease-free survival; RFS, Recurrence-free survival; O, Number of observed significant results; E, Number of expected significant results; PCR, Polymerase chain reaction; qPCR, Quantitative PCR; qRT-PCR, Quantitative real-time PCR; RT-qPCR, real-time quantitative PCR; ISH, in situ hybridization; LNM, Lymph node metastasis; LVM, Lymphovascular metastasis
No sources of funding to declare.
Availability of data and materials
The complete database upon which this review article has been constructed can be freely accessed here: https://goo.gl/EjCDAp. The size of this database does not permit its publication as an additional supporting file.
SS: study design, acquisition, analysis and interpretation of data, manuscript drafting; AK: study design, acquisition of data; final approval of the manuscript; JPA: study conception and design, data interpretation, drafting and critical appraisal of manuscript. All authors have given final approval to this version of the manuscript to be published.
As previously declared.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.