- Open Access
In vitro downregulated hypoxia transcriptome is associated with poor prognosis in breast cancer
© The Author(s). 2017
- Received: 18 November 2016
- Accepted: 2 June 2017
- Published: 15 June 2017
Hypoxia is a characteristic of breast tumours indicating poor prognosis. Based on the assumption that those genes which are up-regulated under hypoxia in cell-lines are expected to be predictors of poor prognosis in clinical data, many signatures of poor prognosis were identified. However, it was observed that cell line data do not always concur with clinical data, and therefore conclusions from cell line analysis should be considered with caution. As many transcriptomic cell-line datasets from hypoxia related contexts are available, integrative approaches which investigate these datasets collectively, while not ignoring clinical data, are required.
We analyse sixteen heterogeneous breast cancer cell-line transcriptomic datasets in hypoxia-related conditions collectively by employing the unique capabilities of the method, UNCLES, which integrates clustering results from multiple datasets and can address questions that cannot be answered by existing methods. This has been demonstrated by comparison with the state-of-the-art iCluster method. From this collection of genome-wide datasets include 15,588 genes, UNCLES identified a relatively high number of genes (>1000 overall) which are consistently co-regulated over all of the datasets, and some of which are still poorly understood and represent new potential HIF targets, such as RSBN1 and KIAA0195. Two main, anti-correlated, clusters were identified; the first is enriched with MYC targets participating in growth and proliferation, while the other is enriched with HIF targets directly participating in the hypoxia response. Surprisingly, in six clinical datasets, some sub-clusters of growth genes are found consistently positively correlated with hypoxia response genes, unlike the observation in cell lines. Moreover, the ability to predict bad prognosis by a combined signature of one sub-cluster of growth genes and one sub-cluster of hypoxia-induced genes appears to be comparable and perhaps greater than that of known hypoxia signatures.
We present a clustering approach suitable to integrate data from diverse experimental set-ups. Its application to breast cancer cell line datasets reveals new hypoxia-regulated signatures of genes which behave differently when in vitro (cell-line) data is compared with in vivo (clinical) data, and are of a prognostic value comparable or exceeding the state-of-the-art hypoxia signatures.
- Breast cancer
- Cell lines
- Genome-wide analysis
Hypoxia, that is reduced levels of oxygen, is a characteristic of solid tumours, including breast cancer, as a result of poor vascularisation, increased metabolism, and high proliferative rates [1, 2]. Amongst the different breast cancer subtypes, triple-negative breast cancer (TNBC, estrogen, progesterone and HER2 negative) is the one most frequently associated with hypoxia [3, 4]. Moreover, hypoxia is associated with increased metastasis and resistance to chemotherapy and radiotherapy, leading to poorer rates of survival . These observations indicate why the gene expression signature of breast cancer tumours under hypoxia has a prognostic value, and also the reasons that hypoxia is a key area for the development of targeted therapy [5–8].
As a response to hypoxia, transcriptional programmes are induced in tumour cells that produce resistance to the stress of the low-oxygen micro-environment. This hypoxia response is mediated by the stabilisation of the hypoxia inducible factor (HIF) proteins, which transcriptionally activate over 300 genes [2, 9]. The HIF complex is a heterodimer composed of an alpha subunit and a beta subunit [9–11]. Three different HIFα proteins are known, namely the ones encoded by HIF1α, HIF2α (EPAS1), and HIF3α, while the HIFβ subunit is encoded by ARNT . However, the role of HIF1α and HIF2α is relatively more understood in this process than HIF3α . Normal levels of oxygen provide an essential co-factor for prolyl hydroxylases to hydroxylate HIF1 and 2α. This marks them for degradation by the proteasome after being ubiquitinated by Von-Hippel Lindau (VHL) syndrome protein. Thus abundance of oxygen represents a signal for the degradation of HIF while hypoxia results in its stabilisation [9–11, 13].
HIF heterodimers bind to the hypoxia response element (HRE) motif, with the consensus sequence of RCGTG, at the promotors of many genes resulting in their transcriptional activation [2, 14]. These genes participate in key biological processes, such as angiogenesis, pH regulation, metabolism, autophagy, invasion, metastasis, and others, which promote tumour growth [2, 13, 15–18].
Targeting HIF directly has proven to be difficult . However, genes activated by HIF under hypoxia, such as the key angiogenesis regulator VEGF, or the unfolded protein response, mediated by factors like XBP1 and ATF4, can be targeted [8, 20–23]. So far there has been a larger focus on genes transcriptionally activated by HIFs. However, many genes are downregulated in hypoxia indirectly regulated by HIF [24, 25], which could provide opportunities for synthetic lethality. Therefore, we have conducted an analysis extending a recently published consensus clustering method  and using all available experiments exposing breast cell lines to hypoxia and hypoxia related treatments, using different setups and timing. The power of our clustering approach is that it allows us to analyse these datasets collectively, to define genes consistently co-expressed in hypoxia, upregulated and downregulated, and the potential mechanisms of their regulation.
Cell-line datasets & experimental procedures
Breast cancer microarray datasets in contexts related to hypoxia
Same samples of the last two datasets in the same order, but a different platform.
Illumina HumanWG-6 v3.0
Time-series data through 48 h of exposure to hypoxia (1% O)
Affymetrix HG-U133+ 2.0
Time-series data through 24 h of exposure to hypoxia (1% O)
Affymetrix HuGene 2.0 ST
Samples at normoxia, hypoxia, and anoxia, respectively
Affymetrix HG-U133+ 2.0
Time-series data through 12 h of exposure to hypoxia (0.5% O)
Samples at normoxia, hypoxia, and normoxia exposed to DMOG, respectively
Agilent Whole Human Genome G4112F
SCP2 subline of MDA-MB-231
Time-series data through 24 h of exposure to hypoxia.
Agilent Whole Human Genome G4112F
LM2 subline of MDA-MB-231
Time-series data through 24 h of exposure to hypoxia.
Illumina HumanHT-12 V3.0
Normoxia samples versus hypoxia samples, each is either transfected with non-specific shRNA or with reptin shRNA.
Illumina HumanHT-12 V4.0
MB231RN-LM derived from MDA-MB-231
Non-transfected samples versus transfected with has-miR-18a, each is in either a control medium or treated with Cobalt(II) chloride (CoCl2) hypoxia-mimicking agent.
Affymetrix HG-U133A 2.0
Normoxia samples versus hypoxia samples, each is either untreated or treated with lactic acid.
Illumina HumanWG-6 v3.0
Normoxia samples versus hypoxia samples, each is either non-transfected or transfected with siRNA#1 against JMJD2B
Affymetrix HG-U133+ 2.0
Samples exposed to / transfected with oligogectamine, HIF1α siRNA, HIF2α siRNA, or both siRNAs, respectively. All samples were grown under hypoxia (1% O) for 16 h.
Agilent Whole Human Genome G4112A
MCF7 & ZR-75-1
Samples from each of the two cell lines were exposed to hypoxia for 24 h or 48 h
Affymetrix HG-U133+ 2.0
T47D & MDA-MB-231
A control sample and an XBP1-knocked-down sample from each of the two cell lines. All samples are under hypoxia and glucose deprivation.
Illumina HumanWG-6 v3.0
Time-series data through 24 h of reoxygenation after having been in hypoxia (0.5% O) for 24 h.
An important generalisation was made to UNCLES type A method [26, 27] in order to enable its use to cases where missing features are present (see Methods). In particular, the extended UNCLES allows inclusion of genes which are represented by probes in some rather than all datasets analysed. Had the condition of gene inclusion been as required by the original UNCLES method, described in , only 7714 genes would have been considered in this analysis, which is less than 50% of the array content. Therefore, this modification allows for more comprehensive analysis of available datasets and reduces the amount of data filtered out due to missing data or less complete platforms.
The UNCLES method does not perform its collective analysis to such group of heterogeneous datasets by merging them into a single dataset; rather it clusters each one of them individually by multiple clustering methods, and then finds the consensus result of these individual clustering results. In other words, UNCLES finds the consensus membership of genes in clusters based on results from independent clustering analyses of multiple datasets rather than merging the datasets themselves. This approach overcomes the problems that appear while attempting merging datasets generated by different platforms and with different conditions. This is detailed in the Methods section and in [26, 27].
Individual clustering methods employed at the first step of the UNCLES method were k-means with Kauffman initialisation (KA), self-organising maps (SOM) with bubble neighbourhood and four-by-four grid, and hierarchical clustering (HC) with Ward’s linkage. The experiment was repeated with the numbers of clusters (K values) of 8, 9, 10, 16, 18, 24, 30, and 40, and while varying the tuning parameter δ from zero to unity with steps of 0.1. The resulting clusters from all of these experiments were exposed to the M-N scatter plots  in order to select the top clusters (see Methods).
Consensus clustering identifies two clusters of genes oppositely co-regulated under hypoxia across 16 diverse cell-line datasets
D02, D03, D05, D07, and D08 represent time-series datasets in which breast cancer cell lines were exposed to hypoxia for durations ranging from 12 to 48 h (Table 1). The first time-point in each of them is at normoxia, or in other words, at zero hours after exposure to hypoxia. It can be seen that, in all of these cases, C1 is downregulated gradually in hypoxic conditions while C2 is upregulated (Fig. 2). In contrast, the dataset D16 represents time-series through reoxygenation, i.e., the first time-point is at hypoxia (0.5% Oxygen), followed by shifting the cells back to normal oxygen levels (21%) and observing their genetic expression for 24 h following the shift (Table 1). Agreeing with the previous observation, C1 shows gradual upregulation under reoxygenation while C2 shows gradual downregulation (Fig. 2).
Similar observations can be seen in the other datasets while comparing normoxic with hypoxic conditions (Fig. 2 and Additional file 2). In some datasets, exposure to some agents mimics the effect of real hypoxia, such as exposure to 2-oxoglutarate dependent dioxygenase inhibitor dimethyloxalylglycine (DMOG) (D01 and D06), and to CoCl2 (D10). Similarly, normoxic effects were observed in some datasets in cell lines under hypoxia when certain hypoxia-negating modifications were applied; examples include transplanting with has-miRNA-18a (D10), treatment with lactic acid (D11), and knocking down key regulators of hypoxia such as HIF1A (D01 and D13) and XBP1 (D15).
However, some treatments which were expected to reduce the effect of hypoxia were not observed to have an effect on these two clusters. For instance, transfection with reptin siRNA in D09 or knocking down HIF2A in D01 or D13 did not show a significant reduction in the effects of hypoxia. As for reptin, it was previously shown that about 25% of genes up- or down-regulated by hypoxia are reptin-dependant while the rest are not . The clusters C1 and C2 are significantly enriched with reptin-independent hypoxia-regulated genes (p-values: 1.7 × 10−23 and 1.2 × 10−22 for C1 and C2 respectively) but are not as significantly enriched with reptin-dependent genes (p-values: 6.5 × 10−3 and 6.6 × 10−3 for C1 and C2 respectively); this indeed justifies our observation here. As for knocking down HIF2A, it was in comparison with knocking down HIF1A, and it is clear that these two clusters have dependency on HIF1A rather than on HIF2A. Despite that, knocking down HIF1A could not fully restore the expression of these two clusters to the same level as in normoxia (D01).
Clusters of co-regulated genes map to distinct pathways, biological processes and cellular components
We have performed GO term analysis over the genes in C1 and C2 to identify the most significantly enriched biological processes and cellular components in them using the GeneCodis online tool [29–31]. Full lists of results are available in Additional files 3 and 4 for C1 and C2 respectively.
The general characteristic of C1 is that it is highly enriched with genes participating in growth-related processes such as the mitotic cell-cycle, gene expression (e.g. RNA metabolic process and RNA splicing), and protein synthesis (e.g. translation, ribosome biogenesis, rRNA processing, tRNA processing, and protein folding).
The localisation of the genes participating in these processes highlights rRNA processing and ribosome biogenesis (aka RRB) genes in the nucleolus; mitotic cell-cycle, RNA metabolism, and RNA splicing in the nucleoplasm; genes participating in mitochondrial protein translation localise, clearly, in the mitochondrion; and some cell-cycle, RNA metabolism, and protein folding genes localised in the cytosol. These observations can be summarised as four groups of growth-related processes which take place in four different locations of the cell are down-regulated simultaneously by hypoxia. Thus there is a major suppression of processes that use high amounts of ATP, potentially contributing to cell survival.
Highly enriched cellular processes’ and components’ GO terms in C2 (598 genes)
Regulation of transcription; DNA-dependent
4.4 × 10−14
5.6 × 10−9
Carbohydrate metabolic process
7.3 × 10−7
8.2 × 10−7
4.9 × 10−5
Response to hypoxia
4.2 × 10−4
1.5 × 10−3
Negative regulation of cell proliferation
4.3 × 10−3
Multicellular organismal development
9.3 × 10−3
Positive regulation of IκB kinase/NFκB cascade
9.9 × 10−3
4.1 × 10−39
6.5 × 10−35
4.0 × 10−14
4.3 × 10−12
3.8 × 10−7
4.0 × 10−7
2.0 × 10−5
2.5 × 10−5
Integral to membrane
3.1 × 10−5
4.9 × 10−5
6.7 × 10−5
The distribution of genes participating in different processes over those different cellular components in C2 is not as distinct as was shown in C1 (Fig. 4). In summary, the C2 genes responding to hypoxia have roles throughout the cell, such as regulation of transcription and chromatin remodelling in the nucleus, the activation of carbohydrate metabolism, including glycolysis, and signalling.
KEGG pathway analysis were performed by using the GeneCodis online tool in order to identify the pathways with genes over represented in the clusters C1 and C2 [29–31]. Full results can be found in Additional files 3 and 4 for the clusters C1 and C2, respectively. The results of this analysis shows agreement with the results of the GO term analysis. For instance, C1 is highly enriched with various growth-related and cell-cycle pathways such as RNA transport, ribosome biogenesis, pyrimidine and purine metabolism, spliceosome, cell-cycle, folate biosynthesis, and RNA polymerase; all are with corrected p-values lower than 1 × 10−4. Additionally, the proteasome pathway is strongly correlated, corrected p-value (1.1 × 10−8).
As for C2, which is up-regulated under hypoxia, carbohydrate metabolism pathways, namely glycolysis and the metabolism of fructose, mannose, amino sugars, starch, sucrose, and pentose phosphate, have corrected p-values lower than 1 × 10−3. Importantly, the renal cell carcinoma pathway is also significantly represented in the cluster with a corrected p-value of 1.8 × 10−4. This pathway includes genes that are regulated by the HIF transcription factor, which is stabilised by the mutations of VHL which are typical of this cancer type. Nine genes from the cluster C2 are listed as members of this pathway, BRAF, EGLN1, EGLN3, GAB1, MAP2K1, MET, SLC2A1, VEGFA, and VEGFC. As for the 14 genes which are assigned by the GO term analysis to the GO process term ‘response to hypoxia’, they overlap with the aforementioned nine genes, but they do not include all of them. The 14 genes are ALDOC, ANGPTL4, BNIP3, CITED2, EGLN1, EGLN3, LONP1, NOL3, PAM, PLOD1, PLOD2, SCNN1B, TH, and VEGFA. As can be seen, only three genes are in the intersection of the two groups of genes, while the union of them includes 20 distinct genes. Again, the general observation here is that KEGG pathway analysis agrees with the GO term analysis in identifying carbohydrate metabolism and response to hypoxia/HIF targets as the main groups in C2. However this also shows the current deficits in the deposited pathways resulting in discordance between databases and indicting the need for more comprehensive approaches and multiple genes for any likely utility in tumour hypoxia classification.
Analysis of HIF targets confirms significant differences amongst upregulated genes and downregulated genes in response to hypoxia
Description of seven lists of genes which were identified as potential targets for HIF
Different combinations of the seven lists of HIF potential targets
In this study
Union of all
1.2 × 10−38
Intersection of all
7.5 × 10−2
Gene RSBN1 in (C2)
Intersection of studies with HIF1α (L1, L2, L4, L5, and L6)
5.7 × 10−7
Genes in (C2): CA9, DARS, GAPDH, PLOD2, RSBN1, and SPRY1
In 3 lists or more
1.2 × 10−52
In 4 lists or more
1.5 × 10−30
In 5 lists or more
6.4 × 10−17
Genes in (C2): CA9, DARS, ENO1, GAPDH, GBE1, HK2, INHA, INSIG2, KIAA0195, P4HA1, PLOD2, RSBN1, SPRY1, STC2, WSB1
The 15 genes in C2 which appear in at least five lists of HIF targets include genes which are well-known participants in response to hypoxia processes together with less well-understood genes, where according to GO term annotation, there is no specific biological process in which they are known to participate or a molecular function which they are known to perform (KIAA0195 and RSBN1). Amazingly, RSBN1 is one of the two only genes over which there is a consensus amongst all of the seven lists as a potential HIF target, not previously reported to be induced by hypoxia, and supposedly exclusive to spermatids. This is further discussed in the Discussion section.
Cell-line-derived clusters are not conserved in clinical datasets, but form sub-clusters of highly correlated genes with distinct biological functions
Having obtained C1 and C2 by clustering gene expression in cell lines, we investigated their expression in breast cancer clinical tumours. The Cancer Genome Atlas (TCGA) is a comprehensive resource for high-throughput molecular biological data regarding cancers, and analysed expression profiles in 1026 breast cancer tumours . This data measures the expression of 20,531 genes in those 1026 samples, and covers 14,493 genes out of the 15,588 genes included in our study (Additional file 5). In cell lines the cluster C1 contains 504 genes; 471 of these genes are represented and expressed in this clinical dataset. Similarly, the cluster C2 in cell lines contains 598 genes; 564 of these genes are represented and expressed in this clinical dataset (Additional file 6).
GO term analysis (using the GeneCodis tool [29–31]) was employed to identify any specific enrichment of biological processes in those sub-clusters. Full results are shown in Additional file 5. Both C1 sub-clusters (C1a and C1b) have significant shares of the mitotic cell-cycle genes. Nonetheless, C1b is specifically highly enriched with translation, RNA metabolism, gene expression, protein folding, rRNA processing and ribosome biogenesis (RRB), and regulation of apoptotic process, all with corrected p-values lower than 10−5 in C1b and higher than 10−3 in C1a; such differences in p-values between the two sub-clusters demonstrate the specificity of enrichment of these processes in C1b.
The distinction amongst the three C2 sub-clusters is more apparent; hypoxia response is focused in the smallest of the three sub-clusters, namely C2b, which also includes some glucose metabolism and signal transduction genes; all with corrected p-values lower than 0.01. As for C2a, the only process which is significantly enriched in it is the regulation of DNA-templated transcription, with a corrected p-value of 8.4 × 10−10; no other process is enriched herein with a corrected p-value lower than 0.01. Interestingly, the two other clusters do not have this process significantly enriched despite it being the most enriched process in the mother cluster C2. As for C2c, it is enriched with many carbohydrate metabolism (including glycolysis) and signal transduction genes. This shows that, while each one of the two clusters is consistently co-expressed over 16 different breast cancer cell line datasets related to hypoxia, they form finer sub-clusters in real clinical data.
Transcription factor analysis reveals multiple potential mechanisms of regulation for gene sub-clusters
TFs with significant enrichment in the clusters C1 and C2 as well as their sub-clusters
Number of genes
Number of genes
The TF with the most enriched targets in C1 was found to be MYC, which are also significantly enriched in both of C1’s sub-clusters. MYC is an oncogene encoding a transcription factor which selectively amplifies genes that contribute to cell growth and proliferation . In normal cells, MYC is only activated if both sufficient nutrients and growth signals were available, and is deactivated by active checkpoints or differentiation. In contrast, abnormal activation of MYC was seen in cancerous cells leading to unconstraint growth and proliferation while ignoring checkpoints . These known facts about MYC highly agree with the biological processes and pathways enriched in C1, C1a, and C1b, which are mainly related to cellular growth and proliferation.
In C2the solo significant player is HIF1A, which needs no further justification. However, C2a is not as significantly enriched with the targets of HIF1A as its parent cluster and two sibling sub-clusters do. This is consistent with the fact that it is not correlated with the hypoxia-response subset C2b in clinical datasets. Nonetheless, the top TFs in C2a are HIF1A, VDR, KDM5B, and FOXM1 with the respective adjusted CS values of 3.1, 3.1, 3.0, and 3.0. Notably, VDR and KDM5B appear in either or both of the C1 sub-clusters, either with a greater value than the threshold or just under it. These two TFs do not appear in the other two C2 sub-clusters (C2b and C2c) with any adjusted CS values close to significant.
Sub-clusters of hypoxia co-regulated genes are associated with poor prognosis of tumours and are correlated with estrogen receptor status
We have investigated the prognostic significance, measured by hazard ratios (HR), and the relation with estrogen (ER) of the sub-clusters of C1 and C2 in the five clinical datasets listed in Table 5 which have overall survival (OS) and ER status information. In order to compare our sub-clusters with the state-of-the-art hypoxia signatures, we have also applied the same analysis to a signature of 51 genes previously developed using large meta-analysis of multiple datasets from different cancers , which was shown to be more robust than other hypoxia signatures , and to the consensus group of 20 genes recently compiled as the most frequently appearing genes across 32 different hypoxia signatures . We shall refer to these two signatures hereinafter as the 51-gene signature and the 20-gene signature.
C2 has significant overlaps with the 51-gene and the 20-gene signatures, as it includes 19 and 12 of their genes, respectively. On the other hand, C1 does not have any significant overlaps with them. This is expected as C1 is down-regulated while the 51- and 20-gene signatures are upregulated under hypoxia. Looking into the sub-clusters of C2, C2b (88 genes) has the largest overlap with those two signatures as it includes 6 and 4 of their genes, respectively.
Fold-changes of expression of signatures in relation to ER (ER+/ER−) and their associated p-values between parentheses
0.94 (7.5 × 10 −4 )
0.95 (1.1 × 10 −4 )
0.94 (1.5 × 10 −6 )
0.77 (8.8 × 10 −18 )
0.89 (1.6 × 10 −11 )
0.91 (4.9 × 10 −16 )
0.91 (2.0 × 10 −29 )
0.90 (1.3 × 10 −38 )
1.28 (8.3 × 10 −13 )
1.14 (2.1 × 10 −10 )
1.10 (1.9 × 10 −12 )
1.08 (1.3 × 10 −17 )
1.09 (4.1 × 10 −20 )
0.56 (4.3 × 10 −28 )
0.74 (1.0 × 10 −23 )
0.86 (1.3 × 10 −14 )
0.79 (1.9 × 10 −59 )
0.81 (4.4 × 10 −51 )
1.04 (5.4 × 10 −4 )
0.63 (1.4 × 10 −10 )
0.76 (1.8 × 10 −12 )
0.86 (7.0 × 10 −9 )
0.80 (2.0 × 10 −30 )
0.81 (1.7 × 10 −26 )
0.81 (8.6 × 10 −4 )
0.86 (3.4 × 10 −7 )
0.90 (2.6 × 10 −4 )
We then investigated if the OS times of the samples with the ER+ status and the ones with the ER− status differ significantly by applying ANOVA analysis, which is the type of statistical analysis to assess if two groups of values are significantly different. Results showed that there was no significant differential OS time between ER+ and ER− samples in three out of five clinical datasets (p-values: 0.42 in TCGA, 0.20 in GSE2034, 0.26 in GSE3494, 6.1 × 10−4 in METABRIC-disc, and 2.6 × 10−6 in METABRIC-val).
Comparison of UNCLES with the iCluster method
The iCluster method has become a commonly used method for clustering in cancer research, especially for multiple datasets [25, 49]. Therefore, we have compared our method with it. We have applied iCluster to the same set of 16 cell-line datasets using its default parameters. The iCluster method requires pre-setting the number of clusters (K) to which the genes should be partitioned, so we applied it many times with K values ranging from 2 to 54. The details of the application of the method are provided in Additional file 10.
We have demonstrated that poor prognosis can be predicted in breast cancer clinical samples by the conjoint up-regulation of two distinct subsets of genes which are rather observed to be oppositely regulated under hypoxia in sixteen in vitro breast cancer cell-line datasets. One of these two subsets of genes is upregulated under hypoxia in cell-lines and is enriched with HIF targets that have roles in response to hypoxia. In contrast, the other subset of genes is down-regulated under hypoxia in cell-lines and is enriched with MYC targets that are involved in various growth processes. Interestingly, the gene expression values of these two subsets are positively correlated in six examined clinical datasets, predicting poor prognosis when up-regulated, and demonstrating a clear case of deep disagreement between cell-lines and clinical data.
To be able to generalise our results within the context of breast cancer tissues under hypoxia, the 16 cell-line datasets that we analysed were chosen from that context but from various cell-lines and biological conditions (Table 1), and this is feasible due to the capability of the developed computational methods herein. Moreover, the microarray platforms of these datasets vary widely in terms of the manufacturer (e.g. Affymetrix and Illumina) and the actual model. This allows for the inaccuracies and missing data which may exist in some datasets to be overcome by the majority of the other datasets. In other words, although every platform has its own inherent imperfections, they are reduced by using the entirety of the 16 datasets.
The core results have been obtained by applying a clustering approach which we have developed recently  and improved in this study, and which can be employed to identify cohorts of co-regulated genes from collections of many datasets generated under diverse conditions. Comparisons with the commonly used iCluster method have revealed the unique ability of our approach to analyse integrated data at such a scale.
What causes this mismatch between in vitro cell-lines and clinical data?
The mismatched relation between hypoxia-repressed genes (C1a) and hypoxia-induced genes (C2b and the 51-gene sub-cluster) in cell-line data versus clinical data may be due to their transcriptional regulators. Results showed that MYC seems to be the main activator of C1 and its sub-clusters (C1a and C1b) while HIF1A is the main activator of C2 and two of its sub-clusters (C2b and C2c) (Table 6). HIF1A and MYC interplaying is well-known in cancers ; under hypoxia, HIF1A represses MYC through different suggested ways, such as activating MXI-1, which is an antagonist of MYC with regards to mitochondrial biogenesis [50, 51], or by competing with MYC in binding directly to MAX, without which MYC does not carry out its function , or by binding to Sp1 to result in the displacement of MYC from Sp1 complexes and consequently decreased MYC activity . These facts well justify the consistent anti-correlation observed between these two groups of genes in cell lines.
In contrast, and in all six clinical datasets, either or both sub-clusters of genes activated by MYC (C1a and C1b) show positive correlation with those activated by HIF1A (mainly C2b) (Fig. 10 (b-g)). Apparently, in vivo further selection, maybe through mutations of genes like TP53 and HER2, or other micro-environmental or epigenetic factors, overcomes the suppression effect of hypoxia mediated by HIF on these MYC-regulated growth genes. The expression of MYC itself is significantly upregulated in ER− samples versus ER+ samples in four out of five clinical datasets with ER information (p-values of 6 × 10−8, 5 × 10−4, 0.8, 1 × 10−3, and 9 × 10−9 respectively). Indeed, HIF1A is significantly upregulated in ER− samples in all five datasets (p-values of 5 × 10−17, 5 × 10−7, 5 × 10−4, 5 × 10−20, and 9 × 10−24 respectively). Overexpression of MYC was shown to overcome the competition of HIF1A with it over MAX, and thus overexpressed MYC maintain stable MYC-MAX heterodimers despite the presence of HIF1A , and is associated with poor outcome . Nonetheless, what causes MYC to be overexpressed in tumours and not in cell lines? Also, MYC does not seem to be overexpressed in ER− tumours in one of the clinical datasets, yet its predicted targets involved in growth and proliferation are upregulated. Is there an alternative regulatory machinery which upregulates such a group of genes despite the presence of HIF1A and the absence of MYC?
For instance, HIF2A (EPAS1) was shown to have a promoting effect on MYC as opposed to HIF1A, and it was suggested that those two HIF factors are likely to negate each other’s effect on MYC when both are present [10, 53]. However, as seen in the expression of C1 and C2 in the dataset D13, knocking down HIF2A in the MCF7 cell line showed know difference in the expression of either C1 or C2 compared with wild-type cells, but knocking down HIF1A did down-regulate C2 and up-regulate C1, and that was similar between knocking down HIF1A only and a double-knocking down of both factors (Fig. 2 and Additional file 2). Also, there is no significant upregulation of HIF2A in ER− samples versus ER+ in any of the clinical datasets (p-values always higher than 0.1).
Such results further demonstrate, as discussed in previous studies and reviews , that cell lines, despite their usefulness, do not accurately represent real clinical tumours. We provide the community with new significant signatures of hypoxia which are derived from cell lines, and then mapped to clinical samples to derive prognostic signatures, namely C2b and the more interesting C1a. We also provide such a thorough comparison between the gene expression of different groups of genes under clinical or cell line datasets, which may lead to a better understanding of the factors influencing the differences between the two systems.
Speculations regarding novel association of genes with hypoxia response
A few genes with unknown or poorly understood functions have been found in the clusters C1 and C2. The most notable ones are RSBN1 (chromosomal location 1p13.2) and KIAA0195 (aka TMEM94; chromosomal location 17q25.1), which are induced under hypoxia, and they have been identified as potential HIF targets by 7 or 5 different HIF target lists respectively (Table 4).
Only one study focused on RSBN1 gene or on any of its homologues . This study suggested that murine RSBN1 may have a transcriptional regulatory role in haploid germ of male mice as it specifically localise in the nucleus at stages VII-VIII of murine spermatogenesis . Interestingly, scrutiny of Takahashi and colleagues’ experiments shows that they investigated RSBN1’s expression in the brain, heart, intestine, kidney, liver, lung, muscle, ovary, spleen, and testis of male mice, to conclude that it is only expressed in the testis . In other words, expression of RSBN1 in murine female breast tissues was not investigated, and it may well be regulated by HIF as all seven lists include it, it may be expressed in breast tissues, and may have a role in response to hypoxia in breast cancer consequently.
Few genome-wide studies identified interactions between KIAA0195, or one of its homologues, and some other gene products, such as the kinase CDK5 [56–58], ubiquitin C (UBC) [59, 60], and the Drosophila’s CG9099 protein . To find KIAA0195 consistently co-expressed with many other HIF targets in our study and in five out of seven lists of potential HIF targets indicates that this poorly understood gene may be a genuine target of HIF and may also have a role in response to hypoxia.
Poor prognosis can be predicted by the conjoint up-regulation of two subsets of genes in breast tumours, which we identified in this study. The first subset is enriched with MYC targets participating in various growth-related processes, while the second subset is enriched with targets of the hypoxia-induced factor (HIF) and is expected to participate in response to hypoxia. The HIF-targets subset includes genes that are consistently co-expressed in sixteen different breast cancer cell-line datasets generated from conditions related to hypoxia. As expected, it shows up-regulation exclusively under hypoxic conditions in all of these cell-lines, which agrees with the literature associating hypoxia with bad prognosis. Strikingly, the MYC-targets growth-related subset shows an opposite profile in all tested cell-line datasets, where its genes are always repressed by hypoxia. This is despite the fact that its up-regulation in tumours indicates poor prognosis. The repression of growth-related genes under hypoxia agrees well with expectations, which we observe in cell-lines (in vitro). However, the association of their up-regulation in tumours (in vivo) with poor prognosis is an interesting finding. In fact, we showed that the conjoint in vivo up-regulation of both subsets combined with an existing state-of-the-art hypoxia signature is as strong in predicting bad tumour outcomes as that state-of-the-art signature and potentially stronger.
These results were found by analysing six clinical datasets after the application of the UNCLES method, with its novel and important extension that we introduced in this study, to sixteen different genome-wide gene expression datasets of breast cancer cell-lines. These datasets were generated by different groups, using different platforms, and under different conditions related to hypoxia. This method, which is publically available as part of the Clust python package [27, 62], can be applied to other sets of gene expression datasets from other areas in order to find those subsets of genes which are consistently co-expressed over all of them. The uniqueness of the ability to perform this task fruitfully at a genome-wide scale as such has been demonstrated by comparisons with iCluster, which is a relevant state-of-the-art method.
Generation of many partitions for the same set of genes by applying C different clustering methods over the expression profiles of these genes from L different microarray datasets. As each of these C methods is applied independently to each one of the L microarray datasets, this step generates R = C × L different partitions. All of these partitions should have the same number of clusters (K).
Relabelling the R partitions. The clusters within each partition are permuted such that the i th cluster from one partition corresponds to the i th cluster from each one of the other partitions. This is essential because, due to the unsupervised nature of clustering, different partitions are not guaranteed to have this alignment of their clusters. We adopt the min-min relabelling technique  here because it aims at giving the fittest clusters a higher priority to be properly matched even if the poorer clusters are improperly matched. This is in contrast to the min-max technique  which aims at giving poorer clusters a chance to be fairly matched even if the compensation was to reduce the matching quality of the fittest clusters. The latter is better when all of the clusters are of interest, while the former is more suitable for the aim of our study, in which we prefer to optimise the fittest clusters and eliminate the remaining clusters from the final result.
Generation of the fuzzy consensus partition matrix (CoPaM) by element-by-element averaging of the relabelled partitions. This fuzzy CoPaM assigns each gene to each one of the K different clusters with a fuzzy value ranging from zero (does not belong) to unity (perfectly belongs).
Binarization of the CoPaM by one or more of the six tuneable binarization techniques proposed in . Here we adopt the difference threshold binarisation (DTB) technique, which assigns a gene to the cluster in which it has the highest fuzzy membership value if, and only if, the fuzzy membership value of that gene in any of the other clusters is lower than that highest value at least by the value of the tuning parameter δ. In other words, if two or more clusters compete over a given gene with close fuzzy membership values, the gene is not considered distinctly belonging to a single specific cluster, and is, consequently, not assigned to any of them. The tightness of the clusters is therefore controlled by the value of δ, which ranges from zero (loosest) to unity (most stringent).
Extension of UNCLES
M-N scatter plots technique
The UNCLES method requires a fixed number of clusters (K) to be set as well as the tuning parameter δ. However, the best values of K and δ may not be easily set manually. The M-N scatter plots technique addresses this issue by selecting the best clusters of the results of the application of UNCLES multiple times, each with a different K value and with varying δ values. All of the individual clusters from these different results of UNCLES are scattered on a 2-D plot. Each point on the plot represents one cluster, the horizontal axis represents the dispersion of the cluster measured by a modified version of the mean-square error (MSE) metric, and the vertical axis represents the logarithm of the size of the cluster, i.e. the logarithm of the number of genes included in the cluster.
Aiming at minimising the dispersion of the clusters (horizontal axis) while maximising their sizes (vertical axis), the cluster which is represented by closest point to the top-left corner of the plot is selected as the best cluster. This solves the problem of merely minimising the dispersion, as the dispersion of the trivially small clusters which include one or few genes is usually the minimum. In contrast, the M-N scatter plots favours larger clusters which maintain as low dispersion as possible.
It is likely that many of the repetitions of UNCLES have produced clusters with similar contents. Therefore, after selecting the top-left cluster in the M-N plot, all of those clusters with overlapping contents are removed from the plot. Out of the remaining clusters in the plot, the one which is closest to the top-left corner is selected as the second best cluster. The same steps are repeated until the M-N plots are empty or until the researcher is satisfied with the top few clusters obtained.
As can be seen, M-N plots do not select a whole partition and does not select the best K value or δ values. It rather selects the best clusters ordered in quality, which might have been generated by different K or δ values.
Prior to clustering, the one-colour datasets were normalised by quantile normalisation . Then each gene’s expression profile was shifted and scaled to be zero-mean and unity standard deviation. The log-ratios of the two-colour datasets were zero-centred by subtracting genes’ log-ratios’ mean values . In case of having multiple replicates per condition, the median value was taken. In the case of having multiple probes in the same dataset representing the same gene, the representative probe was considered as the one with the maximum coefficient of variation while at least a quarter of its samples exceed the first quartile of expression values in that dataset. If none of the probes representing a gene exceeds that quartile, the one with highest coefficient of variation of all of its probes was considered.
Statistical analysis of clinical datasets
Given a clinical dataset, a single representative probe-set is first selected for each gene which has multiple probe-sets. This is done by taking the most variable probe-set, as measured by covariance, out of the probe-sets exceeding the 25th percentile expression in at least 25% of the samples. If no probe-set for that particular gene exceeds the aforementioned threshold, the most variable probe-set amongst all of the genes’ probe-sets is selected. The dataset is thereafter normalised by quantile normalisation.
To test the prognostic capability of a signature given a normalised dataset, the signature is first summarised by taking the median expression value of its genes at each sample. The median values are ranked over the samples and then scaled to the range of 0.0 to 1.0. Those [0.0, 1.0] values are called the hypoxia scores (HS) of the signature over each one of the samples. The HS values are submitted to Cox proportional hazards regression analysis to obtain their hazard ratios (HR) and the associated p-values.
If the estrogen (ER) status of the samples is provided with the clinical dataset, analysis of variance (ANOVA) is performed for each gene in the dataset in order to identify if it is differentially expressed between ER+ and ER- samples. A p-value and the fold-changes are therefore associated with each gene. The average of the expression values of each gene was shifted to zero and the standard deviation was scaled to unity before submission to ANOVA analysis. The negative logarithm of those ANOVA p-values followed an approximately lognormal distribution, while the fold-changes followed, as they were, a lognormal distribution. Therefore, to test if the genes of a given signature have significantly lower ANOVA p-values or higher fold-changes over the ER variable compared with the rest of the genome, another ANOVA analysis was adopted over the logarithm of the negative logarithm of the p-values, and over the logarithm of the fold-changes.
Professor Nandi is Distinguished Visiting Professor in Tongji University, Shanghai, China.
Dr. Abu-Jamous would like to acknowledge the financial assistance from Brunel University London. Professors Buffa and Harris acknowledge support from Cancer Research UK, EU framework 7, and the Oxford NIHR Biomedical Research Centre. Professor Harris acknowledges support from the Breast Cancer Research Foundation. Professor Nandi would like to acknowledge that this work was partly supported by the National Science Foundation of China grant number 61520106006 and the National Science Foundation of Shanghai grant number 16JC1401300. The funding bodies have no role in the design of the study, in the collection, analysis, and interpretation of data, or in writing the manuscript.
Availability of data and materials
The implementation of the UNCLES method and the M-N scatter plots technique are available as part of the open-source R package “UNCLES”, available on https://cran.r-project.org/web/packages/UNCLES/index.html and the Python package “Clust” available on https://github.com/BaselAbujamous/clust . All datasets supporting the conclusions of this article are available in the repositories or the references listed in Table 1 for breast cancer cell line datasets and Table 5 for breast cancer clinical datasets.
All authors participated in conceiving and designing the experiments, interpreting the results, and writing the paper. BAJ performed the experiments. BAJ, FB, and AKN participated in the bioinformatic analysis of the data. FB and ALH are the contributors whose primary background in breast cancer research. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
This study makes use of data generated by the Molecular Taxonomy of Breast Cancer International Consortium, which was funded by Cancer Research UK and the British Columbia Cancer Agency Branch. The results published here are in part based upon data generated by The Cancer Genome Atlas pilot project established by the NCI and NHGRI. Information about TCGA and their policies regarding the ethics of collecting and sharing human tissues’ data, as well as the investigators and institutions who constitute the TCGA research network can be found at http://cancergenome.nih.gov/.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Harris AL. Hypoxia — a key regulatory factor in tumour growth. Nat Rev Cancer. 2002;2:38–47.View ArticlePubMedGoogle Scholar
- Semenza GL. Oxygen Sensing, Hypoxia-Inducible Factors, and Disease Pathophysiology. Ann Rev Pathol. 2014;9:47–71.View ArticleGoogle Scholar
- Bernardi R, Gianni L. Hallmarks of triple negative breast cancer emerging at last? Cell Res. 2014;24:904–5.View ArticlePubMedPubMed CentralGoogle Scholar
- Tan EY, et al. The key hypoxia regulated gene CAIX is upregulated in basal-like breast tumours and is associated with resistance to chemotherapy. Br J Cancer. 2009;100:405–11.View ArticlePubMedPubMed CentralGoogle Scholar
- Buffa FM, Harris AL, West CM, Miller CJ. Large meta-analysis of multiple cancers reveals a common, compact and highly prognostic hypoxia metagene. Br J Cancer. 2010;102:428–35.View ArticlePubMedPubMed CentralGoogle Scholar
- Fox NS, et al. Ensemble analyses improve signatures of tumour hypoxia and reveal inter-platform differences. BMC Bioinformatics. 2014;15:170.View ArticlePubMedPubMed CentralGoogle Scholar
- Semenza GL. The hypoxic tumor microenvironment: a driving force for breast cancer progression. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research. 2015. p. In Press.Google Scholar
- Ward C, et al. New strategies for targeting the hypoxic tumour microenvironment in breast cancer. Cancer Treat Rev. 2013;39(2):171–9.View ArticlePubMedGoogle Scholar
- Shen C, Kaelin WGJ. The VHL/HIF axis in clear cell renal carcinoma. Semin Cancer Biol. 2013;23(1):18–25.View ArticlePubMedGoogle Scholar
- Gordan JD, Thompson CB, Simon MC. HIF and c-Myc: sibling rivals for control of cancer cell metabolism and proliferation. Cancer Cell. 2007;12(2):108–13.View ArticlePubMedPubMed CentralGoogle Scholar
- Multhoff G, Radons J, Vaupel P. Critical role of aberrant angiogenesis in the development of tumor hypoxia and associated radioresistance. Cancers (Basel). 2014;6(2):813–28.View ArticleGoogle Scholar
- Ortiz-Barahona A, et al. Genome-wide identification of hypoxia-inducible factor binding sites and target genes by a probabilistic model integrating transcription-profiling data and in silico binding site prediction. Nucleic Acids Res. 2010;38(7):2332–45.View ArticlePubMedPubMed CentralGoogle Scholar
- McIntyre A, Harris AL. Metabolic and hypoxic adaptation to anti-angiogenic therapy: a target for induced essentiality. EMBO Mol Med. 2015;7(4):368–79.View ArticlePubMedPubMed CentralGoogle Scholar
- Wenger RH, Stiehl DP, Camenisch G. Integration of oxygen signaling at the consensus HRE. Sci Signal. 2005;2005(306):re12.View ArticleGoogle Scholar
- Brahimi-Horn MC, Bellot G, Pouysségur J. Hypoxia and energetic tumour metabolism. Curr Opin Genet Dev. 2011;21(1):67–72.View ArticlePubMedGoogle Scholar
- De Rock K, Mazzone M, Carmeliet P. Antiangiogenic therapy, hypoxia, and metastasis: risky liaisons, or not? Nat Rev Clin Oncol. 2011;8:393–404.View ArticleGoogle Scholar
- Favaro E, Lord S, Harris AL, Buffa FM. Gene expression and hypoxia in breast cancer. Genome Med. 2011;3:55.View ArticlePubMedPubMed CentralGoogle Scholar
- Singleton DC, et al. Hypoxic regulation of RIOK3 is a major mechanism for cancer cell invasion and metastasis. Oncogene. 2015;34:4713–22.View ArticlePubMedGoogle Scholar
- Jones DT, Harris AL. Small-molecule inhibitors of the HIF pathway and synthetic lethal interactions. Expert Opin Ther Targets. 2012;16(5):463–80.View ArticlePubMedGoogle Scholar
- Carmeliet P, Jain RK. Molecular mechanisms and clinical applications of angiogenesis. Nature. 2011;473:298–307.View ArticlePubMedPubMed CentralGoogle Scholar
- Grothey A, Galanis E. Targeting angiogenesis: progress with anti-VEGF treatment with large molecules. Nat Rev Clin Oncol. 2009;6:507–18.View ArticlePubMedGoogle Scholar
- Kümler I, Christiansen OG, Nielsen DL. A systematic review of bevacizumab efficacy in breast cancer. Cancer Treat Rev. 2014;40(8):960–73.View ArticlePubMedGoogle Scholar
- Rzymski T, Milani M, Singleton DC, Harris AL. Role of ATF4 in regulation of autophagy and resistance to drugs and hypoxia. Cell Cycle. 2009;8(23):3838–47.View ArticlePubMedGoogle Scholar
- Harris B, Barberis A, West C, Buffa F. Gene expression signatures as biomarkers of tumour hypoxia. Clin Oncol. 2015;27(10):547–60.View ArticleGoogle Scholar
- Mole DR, et al. Genome-wide association of hypoxia-inducible factor (HIF)-1alpha and HIF-2alpha DNA binding with expression profiling of hypoxia-inducible transcripts. J Biol Chem. 2009;284:16767–75.View ArticlePubMedPubMed CentralGoogle Scholar
- Abu-Jamous B, Fa R, Roberts DJ, Nandi AK. UNCLES: method for the identification of genes differentially consistently co-expressed in a specific subset of datasets. BMC Bioinformatics. 2015;16:184.View ArticlePubMedPubMed CentralGoogle Scholar
- Abu-Jamous B, Fa R, Roberts DJ, Nandi AK. Paradigm of Tunable clustering using Binarization of consensus partition matrices (bi-CoPaM) for Gene discovery. PLoS One. 2013a;8(2).Google Scholar
- Lee JS, et al. Negative regulation of hypoxic responses via induced Reptin methylation. Mol Cell. 2010;39(1):71–85.View ArticlePubMedPubMed CentralGoogle Scholar
- Carmona-Saez P, et al. GENECODIS: a web-based tool for finding significant concurrent annotations in gene lists. Genome Biol. 2007;8:R3.View ArticlePubMedPubMed CentralGoogle Scholar
- Nogales-Cadenas R, et al. GeneCodis: interpreting gene lists through enrichment analysis and integration of diverse biological information. Nucleic Acids Res. 2009;37(suppl 2):W317–22.View ArticlePubMedPubMed CentralGoogle Scholar
- Tabas-Madrid D, Nogales-Cadenas R, Pascual-Montano A. GeneCodis3: a non-redundant and modular enrichment analysis tool for functional genomics. Nucleic Acids Res. 2012;40(W1):W478–83.View ArticlePubMedPubMed CentralGoogle Scholar
- Wu M-Z, et al. Hypoxia drives breast tumor malignancy through a TET-TNFα-p38-MAPK signaling axis. Cancer Res. 2015;75(18):3912–24.View ArticlePubMedGoogle Scholar
- Tafani M, et al. Modulators of HIF1a and NFkB in cancer treatment: is it a rational approach for controlling malignant progression? Front Pharmacol. 2013;4:13.View ArticlePubMedPubMed CentralGoogle Scholar
- Favaro E, et al. Glucose utilization via glycogen phosphorylase sustains proliferation and prevents premature senescence in cancer cells. Cell Metab. 2012;16(6):687–8.View ArticleGoogle Scholar
- Kim J-W, Tchernyshyov I, Semenza GL, Dang CV. HIF-1-mediated expression of pyruvate dehydrogenase kinase: a metabolic switch required for cellular adaptation to hypoxia. Cell Metab. 2006;3:177–85.View ArticlePubMedGoogle Scholar
- Papandreou I, et al. HIF-1 mediates adaptation to hypoxia by actively downregulating mitochondrial oxygen consumption. Cell Metab. 2006a;3(3):187–97.View ArticlePubMedGoogle Scholar
- Papandreou I, et al. HIF-1 mediates adaptation to hypoxia by actively downregulating mitochondrial oxygen consumption. Cell Metab. 2006b;3:187–97.View ArticlePubMedGoogle Scholar
- Park JS, et al. Hypoxia-induced IL-32β increases glycolysis in breast cancer cells. Cancer Lett. 2015;356:800–8.View ArticlePubMedGoogle Scholar
- Lee JS, et al. Hypoxia-induced methylation of a pontin chromatin remodeling factor. PNAS. 2011;108(33):13510–5.View ArticlePubMedPubMed CentralGoogle Scholar
- Camps C, et al. Integrated analysis of microRNA and mRNA expression and association with HIF binding reveals the complexity of microRNA expression regulation under hypoxia. Mol Cancer. 2014;13:28.View ArticlePubMedPubMed CentralGoogle Scholar
- Benita Y, et al. An integrative genomics approach identifies hypoxia inducible factor-1 (HIF-1)-target genes that form the core response to hypoxia. Nucleic Acids Res. 2009;37(14):4587–602.View ArticlePubMedPubMed CentralGoogle Scholar
- Schödel J, et al. High-resolution genome-wide mapping of HIF-binding sites by ChIP-seq. Blood. 2011;117(23):e207–17.View ArticlePubMedPubMed CentralGoogle Scholar
- Xia X, Kung AL. Preferential binding of HIF-1 to transcriptionally active loci determines cell-type specific response to hypoxia. Genome Biol. 2009;10:R113.View ArticlePubMedPubMed CentralGoogle Scholar
- Ciriello G, et al. Comprehensive molecular portraits of invasive lobular breast cancer. Cell. 2015;163(2):506–19.View ArticlePubMedPubMed CentralGoogle Scholar
- Chen EY, et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics. 2013;14:128.View ArticlePubMedPubMed CentralGoogle Scholar
- Kuleshov MV, et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Research. 2016.Google Scholar
- Lachmann A, et al. ChEA: transcription factor regulation inferred from integrating genome-wide ChIP-X experiments. Bioinformatics. 2010;26(19):2438–44.View ArticlePubMedPubMed CentralGoogle Scholar
- Stine ZE, et al. MYC, metabolism, and cancer. Cancer Discov. 2015;5(10):1024–39.View ArticlePubMedPubMed CentralGoogle Scholar
- Shen R, et al. Integrative subtype discovery in glioblastoma using iCluster. PLoS One. 2012;7(4):e35236.View ArticlePubMedPubMed CentralGoogle Scholar
- Corn PG, et al. Mxi1 is induced by hypoxia in a HIF-1-dependent manner and protects cells from c-Myc-induced apoptosis. Cancer Biol Ther. 2005;4(11):1285–94.View ArticlePubMedGoogle Scholar
- Zhang H, et al. HIF-1 inhibits mitochondrial biogenesis and cellular respiration in VHL-deficient renal cell carcinoma by repression of C-MYC activity. Cancer Cell. 2007;11(5):407–20.View ArticlePubMedGoogle Scholar
- Koshiji M, et al. HIF-1alpha induces cell cycle arrest by functionallycounteracting Myc. EMBO J. 2004;23(9):1949–56.View ArticlePubMedPubMed CentralGoogle Scholar
- Dang, C. V., .Kim, J.-W., Gao, P. & Yustein, J, 2008. The interplay between MYC and HIF in cancer. Nat Rev Cancer, Volume 8, pp. 51-56.View ArticlePubMedGoogle Scholar
- Green AR, et al. MYC functions are specific in biological subtypes of breast cancer and confers resistance to endocrine therapy in luminal tumours. Br J Cancer. 2016;114(8):917–28.View ArticlePubMedPubMed CentralGoogle Scholar
- Takahashi T, et al. Rosbin: a novel homeobox-like protein gene expressed exclusively in round spermatids. Biol Reprod. 2004;70(5):1485–92.View ArticlePubMedGoogle Scholar
- So J, et al. Integrative analysis of kinase networks in TRAIL-induced apoptosis provides a source of potential targets for combination therapy. Sci Signal. 2015;8(371)Google Scholar
- Varjosalo M, et al. The protein interaction landscape of the human CMGC kinase group. Cell Rep. 2013a;3(4):1306–20.View ArticlePubMedGoogle Scholar
- Varjosalo M, et al. Interlaboratory reproducibility of large-scale human protein-complex analysis by standardized AP-MS. Nat Methods. 2013b;10(4):307–14.View ArticlePubMedGoogle Scholar
- Danielsen JMR, et al. Mass spectrometric analysis of lysine ubiquitylation reveals promiscuity at site level. Mol Cell Proteomics. 2011;10(3):M110.003590.View ArticlePubMedGoogle Scholar
- Wagner SA, et al. A proteome-wide, quantitative survey of in vivo ubiquitylation sites reveals widespread regulatory roles. Mol Cell Proteomics. 2011;10(10):M111.013284.View ArticlePubMedPubMed CentralGoogle Scholar
- Giot L, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727–36.View ArticlePubMedGoogle Scholar
- Abu-Jamous, B. & Kelly, S., 2017. Clust. [Online] Available at: https://github.com/BaselAbujamous/clust
- Abu-Jamous B, Fa R, Roberts DJ, Nandi AK. Yeast gene CMR1/YDL156W is consistently co-expressed with genes participating in DNA-metabolic processes in a variety of stringent clustering experiments. J R Soc Interface. 2013b;10(81)Google Scholar
- Abu-Jamous B, Fa R, Roberts DJ, Nandi AK. Comprehensive analysis of forty yeast microarray datasets reveals a novel subset of genes (APha-RiB) consistently negatively associated with ribosome biogenesis. BMC Bioinformatics. 2014;15:322.View ArticlePubMedPubMed CentralGoogle Scholar
- Bolstad B, Irizarry R, Astrand M, Speed T. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003;19:185–93.View ArticlePubMedGoogle Scholar
- Quackenbush J. Microarray data normalization and transformation. Nat Genet. 2002;32:496–501.View ArticlePubMedGoogle Scholar
- Elvidge GP, et al. Concordant regulation of gene expression by hypoxia and 2-oxoglutarate-dependent dioxygenase inhibition: the role of HIF-1alpha, HIF-2alpha, and other pathways. J Biol Chem. 2006;281(22):15215–26.View ArticlePubMedGoogle Scholar
- Koritzinsky M, et al. Two phases of disulfide bond formation have differing requirements for oxygen. J Cell Biol. 2013;203(4):615–27.View ArticlePubMedPubMed CentralGoogle Scholar
- Lu X, et al. In vivo dynamics and distinct functions of hypoxia in primary tumor growth and organotropic metastasis of breast cancer. Cancer Res. 2010;70(10):3905–14.View ArticlePubMedPubMed CentralGoogle Scholar
- Krutilina R, et al. MicroRNA-18a inhibits hypoxia-inducible factor 1α activity and lung metastasis in basal breast cancers. Breast Cancer Res. 2014;16:R78.View ArticlePubMedPubMed CentralGoogle Scholar
- Tang X, et al. Functional interaction between responses to lactic acidosis and hypoxia regulates genomic transcriptional outputs. Cancer Res. 2012;72(2):491–502.View ArticlePubMedGoogle Scholar
- Yang J, et al. The histone demethylase JMJD2B is regulated by estrogen receptor alpha and hypoxia, and is a key mediator of estrogen induced growth. Cancer Res. 2010;70(16):6456–66.View ArticlePubMedPubMed CentralGoogle Scholar
- Askautrud HA, et al. Global gene expression analysis reveals a link between NDRG1 and vesicle transport. PLoS One. 2014;9(1):e87268.View ArticlePubMedPubMed CentralGoogle Scholar
- Chen X, et al. XBP1 promotes triple-negative breast cancer by controlling the HIF1α pathway. Nature. 2014;508(7494):103–7.View ArticlePubMedPubMed CentralGoogle Scholar
- Lai L-C, et al. Down-regulation of NDRG1 promotes migration of cancer cells during reoxygenation. PLoS One. 2011;6(8):e24375.View ArticlePubMedPubMed CentralGoogle Scholar
- Buffa FM, et al. microRNA-associated progression pathways and potential therapeutic targets identified by integrated mRNA and microRNA expression profiling in breast cancer. Cancer Res. 2011;71(17):5635.View ArticlePubMedGoogle Scholar
- Wang Y, et al. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet. 2005;365(9460):671–9.View ArticlePubMedGoogle Scholar
- Miller LD, et al. An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. PNAS. 2005;102(38):13550–5.View ArticlePubMedPubMed CentralGoogle Scholar
- Curtis C, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature. 2012;486:346–52.PubMedPubMed CentralGoogle Scholar