Meta-analysis of human cancer microarrays reveals GATA3 is integral to the estrogen receptor alpha pathway

Background The transcription factor GATA3 has recently been shown to be necessary for mammary gland morphogenesis and luminal cell differentiation. There is also an increasing body of data linking GATA3 to the estrogen receptor α (ERα) pathway. Among these it was shown that GATA3 associates with the promoter of the ERα gene and ERα can reciprocally associate with the GATA3 gene. GATA3 has also been directly implicated in a differentiated phenotype in mouse models of mammary tumourigenesis. The purpose of our study was to compare coexpressed genes, by meta-analysis, of GATA3 and relate these to a similar analysis for ERα to determine the depth of overlap.
Results We have used a newly described method of meta-analysis of multiple cancer studies within the Oncomine database, focusing here predominantly upon breast cancer studies. We demonstrate that ERα and GATA3 reciprocally have the highest overlap with one another. Furthermore, we show that when both coexpression meta-analysis lists for ERα and GATA3 are compared there is a significant overlap between both and, like ERα, GATA3 coexpresses with ERα pathway partners such as pS2 (TFF1), TFF3, FOXA1, BCL2, ERBB4, XBP1, NRIP1, IL6ST, keratin 18 (KRT18) and cyclin D1 (CCND1). Moreover, as these data are derived from human tumour samples this adds credence to previous cell-culture or murine based studies.
Conclusion GATA3 is hypothesized to be integral to the ERα pathway given the following: (1) the large overlap of coexpressed genes, as seen by meta-analysis, between GATA3 and ERα; (2) the highest coexpressing gene for GATA3 was ERα and vice versa; (3) GATA3, like ERα, coexpresses with many well-known ERα pathway partners such as pS2.


Background
While GATA3 has been studied most intensively in the immune system [1], it is also expressed in other biological environments such as human endometrial epithelial cells, where its levels are regulated in a cyclic manner [2]. GATA3 levels are also considered a good prognostic biomarker in breast tumours. Specifically, the luminal A subtype of breast cancer has both a favorable prognostic outcome and the highest ERα and GATA3 expression of all breast tumours [3]. Consistent with this, basal-like tumours have the lowest GATA3 expression and the worst prognosis. GATA3 has also been shown in murine models to be essential to the development and maintenance of mammary luminal cells [4,5]. There are also tentative data showing that different polymorphisms of the GATA3 gene may associate with differential susceptibility to breast cancer [6]. GATA3 levels have previously been correlated with expression of ERα [7], and the two were shown to regulate one another reciprocally at the transcriptional level, in a cross-regulatory loop, in a cell-culture based system [8]. Furthermore, a meta-analysis of ERα proposed 10 genes as classifiers of ERα-positive breast tumours, listing GATA3 as one of these [9]. A study has also compared microarray experiments between estradiol-induced genes from MCF-7 cells and transfected GATA3-induced genes from 293T cells to assess common upregulated genes [10].
In an elegant series of experiments utilizing MMTV-PyMT (polyoma middle T antigen) mice it was first shown that GATA3 expression was downregulated with the transition from adenoma to carcinoma in mammary tumours, and that expression was lost in lung metastases. Infection of the MMTV-PyMT carcinomas with GATA3 upregulated markers of differentiation and resulted in a dramatic 27-fold reduction in lung metastases [11]. Further crossing of these mice with an inducible cre-WAP (whey acidic protein, specific to luminal mammary epithelial cells) driven knockout of GATA3 resulted in loss of markers of terminal differentiation, detachment from the basal membrane and apoptosis. This is consistent with the requirement of GATA3 in differentiated tumours.
As described in a recent study, known pathway partners have been shown to yield a similar 'meta-analysis coexpression signature', i.e. a significant overlap of coexpressed genes can link proteins to the same pathways [12]. Thus, performing independent meta-analyses for ERα and GATA3 (putative pathway partners) and then comparing the results for overlapping genes should yield a highly significant number of shared genes if these transcription factors are in the same pathway. We report here not only that these meta-analyses have a high degree of overlap, but also that the genes identified are consistent with previous reports of ERα pathway regulation. Additionally, we show this correlation with previously identified ERα target genes by combining our meta-analysis data with both RT-PCR and genome-wide location analysis data from other studies. These data not only confirm GATA3 as a key player in the ERα pathway, but also give fresh insights into the pathway itself.

Meta-analysis
The following procedure was undertaken for independent meta-analyses of GATA3 or ERα: a co-expression gene search was performed within Oncomine [13]. Twenty-one studies were chosen for analysis, most of which were breast cancer studies. The top 400 coexpressed genes were extracted and filtered to give one representative gene per study (removing duplicates and ESTs). These filtered gene lists were then compared for coexpressed genes recurring over multiple studies. The frequency cutoff was 3 studies (14% of 21 studies). This generated a meta-analysis list for each of ERα and GATA3, and the two lists were then compared for overlap. As the overlap was high, the stringency was increased to 4 studies (19%), and the resulting data are used for Table 1. Gene names were obtained using Genecards [14].
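The frequency-counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`meta_coexpression`, `cutoff_percent`) and the toy gene lists are hypothetical, and the sketch assumes the per-study top-400 lists have already been exported from Oncomine and stripped of ESTs.

```python
from collections import Counter

def meta_coexpression(study_lists, min_studies=3):
    """Count in how many studies each gene appears and keep genes
    reaching the frequency cutoff (hypothetical helper)."""
    counts = Counter()
    for genes in study_lists:
        # A set keeps one representative entry per gene per study,
        # so duplicate probes within a study are counted once.
        counts.update(set(genes))
    return {gene: n for gene, n in counts.items() if n >= min_studies}

def cutoff_percent(min_studies, n_studies=21):
    """Express a study-count cutoff as a rounded percentage of all studies."""
    return round(100 * min_studies / n_studies)

# Toy input (not real Oncomine output): four small per-study lists.
toy_studies = [
    ["ESR1", "TFF1", "FOXA1", "TFF1"],
    ["ESR1", "TFF1", "XBP1"],
    ["ESR1", "FOXA1", "TFF1"],
    ["ESR1", "KRT18"],
]
toy_result = meta_coexpression(toy_studies, min_studies=3)
print(toy_result)
print(cutoff_percent(3), cutoff_percent(4))
```

With 21 studies, cutoffs of 3 and 4 studies correspond to the 14% and 19% quoted in the text.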
Here 200-400 ng reporter was transfected with 200 ng pcDNA3 or pcDNA3-GATA3, and 3 U/well of β-galactosidase protein (Sigma) as a transfection efficiency control. Cells were incubated with 10 nM Tamoxifen (Sigma) for 14 h prior to cell assay.

Results and Discussion
Using the Oncomine™ integrated cancer profiling database, GATA3 and ERα were searched for coexpressing genes [13]. Coexpression data from 21 multi-array studies were extracted and analysed separately for ERα and GATA3. While these studies varied in cancer type, the overwhelming majority extracted for analysis were breast cancer based [see Additional files 1 and 2]. The frequency of coexpressing genes over the 21 studies was determined and the cutoff set to 3 studies or more (3 studies = 14% frequency overlap [see Additional files 1 and 2]). Next, to ascertain the extent to which GATA3 may play a role in ERα pathways, the frequency coexpression lists were compared for overlap. Interestingly, there was an extensive overlap between the GATA3 and ERα lists at the cutoff of 3 studies (Figure 1A). Increasing the cutoff to 4 or more studies (almost one-fifth of the studies) did not change the relative overlap with respect to total gene numbers, with 43% of the ERα coexpressed genes and 56% of the GATA3 coexpressed genes represented in the overlap (Figure 1B). The overlap data with the frequency cutoff of 4 studies are shown in Table 1.
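The overlap percentages quoted above are each expressed relative to the size of one list, which is why a single set of shared genes gives two different figures. A minimal sketch of that calculation follows; the function name `overlap_summary` and the toy gene lists are hypothetical and are not the study's actual lists.

```python
def overlap_summary(list_a, list_b):
    """Return the shared genes and the overlap as a percentage of each list."""
    a, b = set(list_a), set(list_b)
    shared = a & b
    return {
        "shared": sorted(shared),
        # The same shared set is divided by each list's own size.
        "pct_of_a": 100 * len(shared) / len(a) if a else 0.0,
        "pct_of_b": 100 * len(shared) / len(b) if b else 0.0,
    }

# Toy lists (illustrative only): GENE_A is a placeholder, not a real symbol.
toy_era_list = ["GATA3", "TFF1", "FOXA1", "XBP1", "GENE_A"]
toy_gata3_list = ["ESR1", "TFF1", "FOXA1", "XBP1"]

summary = overlap_summary(toy_era_list, toy_gata3_list)
print(summary)
```

In this toy case three shared genes make up 60% of the five-gene list and 75% of the four-gene list, mirroring how the shorter GATA3 list has the larger percentage in the overlap.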
Every technique has its caveats, and the limitation here is that we are assessing the common genes that are consistently coexpressed with ERα and GATA3 over many different human cancer studies. This implies that coexpressed genes are in the same pathways as GATA3 and ERα. However, the meta-analyses can only infer association within
A recent study identified 51 genes significantly upregulated in ERα-positive breast tumours, using a real-time PCR based approach [16]. Attesting to the stringency of the meta-analysis approach used here, 32 of these genes were found to overlap with the ERα coexpression list, while an identical number also overlapped with the GATA3 list (Table 2). This was reflected in a similar study comparing ERα over-expressed transcripts in both oligonucleotide microarray and SAGE platforms [17], where 27 genes common to the ERα pathway are represented in our common ERα:GATA3 meta-analysis comparison [see Additional file 3]. These data not only acted as wide-ranging external validation for the individual meta-analyses, but also confirmed the extent of the involvement of GATA3 in ERα pathways.
Furthermore, when compared to a list of genome-wide promoters shown to be bound by ERα in MCF-7 cells [18] or on chromosomes 21 and 22 [19], 23 genes were identified in the ERα meta-analysis list, while 27 were identified within the GATA3 list (Table 3). This again supports both the validity of the meta-analysis technique used here and the role of GATA3 in ERα pathways. It is also possible that the overlap would be even higher if the ERα genomic location analysis were performed on a pool of human ERα-positive breast tumour samples as opposed to a cell-culture model system. While this is not to detract from the power of a model system such as MCF-7, there are likely to be a great many differences between a homogeneous cell monolayer and a 3-dimensional cancer made up of a heterogeneous cell population.
Of the 10 classifier genes previously identified in a meta-analysis of ERα, the same 4 were identified in both meta-analyses of this study (ESR1, GATA3, FOXA1, SLC39A6) [9]. Once again this adds credence to the quality of the data obtained in our current meta-analyses.
Implicating GATA3 in control of some of these gene products is a microarray experiment performed after overexpression of GATA3 in 293T cells [20]. After expression of GATA3, elevated levels of TFF1, TFF3, KRT18, FOXA1, SLC9A3R1, TPD52 and BCAS1 were observed, all of which we identified here in both the GATA3 and ERα meta-analyses. While 293T are not breast cancer cells, this raises the question of how many more of our predicted pathway partners of GATA3 would be identified if the microarray were repeated in cells such as MCF-7, which also retain high ERα expression. In the example of SLC9A3R1 (NHERF1), a putative tumour suppressor, knockdown by shRNA was shown to increase growth of 2 breast cancer cell lines [21]. If GATA3 does help to control expression of NHERF1, this might be one mechanism consistent with its role in the less-aggressive differentiated luminal A breast cancers. Another example is BCAS1 (NABC1), which is overexpressed in breast carcinomas but downregulated in colorectal tumours [22,23]. Indeed, overexpression of NABC1 did not result in changes in cell-cycle or anchorage-dependent growth properties in NIH3T3 cells, implying it may not be intrinsically oncogenic [24].
As GATA3 is expressed in, and regulates, luminal epithelial cells and has also been shown to regulate the MUC1 gene, it is no surprise that MUC1 is also mostly expressed in luminal breast epithelial cells as well as other glandular epithelia [25]. MUC1, when abnormally expressed, leads to a loss of both cell-extracellular and cell-cell contacts. It has also been shown that MUC1 levels can be regulated by estrogen and that ERα can bind putative binding sites derived from the MUC1 promoter in vitro [26]. Here we reveal that both GATA3 and ERα coexpress with MUC1, which acts as further validation of the meta-analysis technique used here. Furthermore, transfected GATA3 can activate a MUC1 promoter reporter in MCF-7 cells, even in the presence of Tamoxifen, i.e. independently of ERα activation. This activation could be repeated in the ERα-negative breast cancer cell line SKBR3 (Figure 2). The activation of ERα pathway genes was also observed with pS2 (TFF1) and KRT18 reporters (Figure 2). These data indicate that GATA3 can have its own impact on the ERα pathway and is not just acting indirectly via ERα.
It has also been postulated that, because deletion of GATA3 in mammary primordia (by K14-Cre) results in an inability to form mammary placodes, a phenotype similar to that caused by loss of LEF1, Msx1 and Msx2, these factors may all be intertwined in a transcriptional network [4,27]. It is of interest that in our present study we observe MSX2 coexpression with both GATA3 and ERα, which helps to support this notion.
Oncomine meta-analysis data for GATA3 or ERα were compared both to a promoter list published by Laganiere et al. (P = 0.05) and to a chromosome array list of 30 genes identified by Carroll et al. The overlap is shown, and common overlap between ERα and GATA3 is shown in bold.
Using the meta-analysis data presented it is easy to build up transcriptional networks such as this, and all of the data presented strongly support (1) the quality of the meta-analysis results and (2) the concept that GATA3 is firmly entrenched within ERα pathways. Future in-depth analysis of the data presented may reveal novel aspects of ERα or GATA3 regulated pathways, and help in understanding the etiology of ERα-positive breast cancers and the management of their outcomes.

Conclusion
While GATA3 has been identified previously in a meta-analysis of ERα, only 10 genes were identified in total [9]. Here we give an extensive list of coexpressed ERα genes, and for the first time a reciprocal meta-analysis for GATA3 has been performed and the results compared for overlap. This overlap was considerable, confirming the important role of GATA3 in the ERα pathway. The vital question raised is whether GATA3 is crucial to the ERα pathway only by regulation of ERα levels, or through further control of ERα-regulated genes in concert with ERα itself. The GATA3 overexpression microarray experiment in 293T cells [20] and our reporter gene assays certainly imply the latter. Genome-wide location analysis (ChIP-chip) of GATA3 in a well-established ERα system such as MCF-7 cells, as well as specific analysis of the ERα pathway in GATA3 conditional knockout mice, will yield vital information regarding the extent to which GATA3 is integral to the ERα pathway.