Gene expression profiling of mouse p53-deficient epidermal carcinoma defines molecular determinants of human cancer malignancy

Background The epidermal specific ablation of Trp53 gene leads to the spontaneous development of aggressive tumors in mice through a process that is accelerated by the simultaneous ablation of Rb gene. Since alterations of p53-dependent pathway are common hallmarks of aggressive, poor prognostic human cancers, these mouse models can recapitulate the molecular features of some of these human malignancies. Results To evaluate this possibility, gene expression microarray analysis was performed in mouse samples. The mouse tumors display increased expression of cell cycle and chromosomal instability associated genes. Remarkably, they are also enriched in human embryonic stem cell gene signatures, a characteristic feature of human aggressive tumors. Using cross-species comparison and meta-analytical approaches, we also observed that spontaneous mouse tumors display robust similarities with gene expression profiles of human tumors bearing mutated TP53, or displaying poor prognostic outcome, from multiple body tissues. We have obtained a 20-gene signature whose genes are overexpressed in mouse tumors and can identify human tumors with poor outcome from breast cancer, astrocytoma and multiple myeloma. This signature was consistently overexpressed in additional mouse tumors using microarray analysis. Two of the genes of this signature, AURKA and UBE2C, were validated in human breast and cervical cancer as potential biomarkers of malignancy. Conclusions Our analyses demonstrate that these mouse models are promising preclinical tools aimed to search for malignancy biomarkers and to test targeted therapies of prospective use in human aggressive tumors and/or with p53 mutation or inactivation.


Introduction
Mouse models of human cancer have become essential tools for preclinical analysis of antitumoral drug discovery. To demonstrate that these models faithfully recapitulate human disease, a deep characterization of the tumors is required. Functional comparative genomics is one of the most powerful techniques for such validation. Moreover, such analyses have also shown that mouse models display the complexity of human cancer genomes. Cross-species studies using genomic-based technologies have indicated the preservation of oncogene transcriptional signatures [1,2] or the synteny of tumor-associated copy number alterations [3][4][5]. Furthermore, comparisons between mouse and human samples have demonstrated the conservation of somatic signature mutational events [4,5], and have enabled the efficient identification of new oncogenes in human cancers [6].
The p53 protein is a transcription factor that responds to diverse stress signals (including DNA damage, oncogene activation and various metabolic limitations) to regulate many target genes that induce cell-cycle arrest, apoptosis, senescence, autophagy, DNA repair and/or metabolic changes [7,8]. As a consequence, the p53 pathway is a crucial mechanism for effective tumor suppression. Somatic or germline mutations in the TP53 gene that compromise its function occur in around 50% of all human cancers (IARC TP53 mutation database; the latest version is R14, November 2009 [9]), and even those tumors that retain wild-type p53 frequently show defects in the pathways leading to its functional inactivation [10], such as amplification of MDM2 [11]. Furthermore, somatic mutations in TP53 have been associated with poor outcome in most human cancers [9,11]. Importantly, both somatic and germline TP53 mutations are usually followed by loss of heterozygosity (LOH) during tumor progression [12], which suggests that a selective force inactivates the remaining wild-type allele. The majority of TP53 mutations are missense (73.6%), and many of these missense mutant p53 forms not only lose their tumor suppressive function and acquire dominant-negative activities, but also gain new oncogenic properties that are independent of wild-type p53, the so-called gain-of-function mutants [12]. However, an important proportion of mutations would give rise to a truncated p53 protein, such as nonsense, frameshift and large deletion mutations (16.6% of all mutations). The essential role of p53 in tumor suppression has also been demonstrated using genetically modified mice, whereby Trp53 deletion or missense mutations induce tumor formation in multiple tissues and organs [13].
We and others have reported that the somatic inactivation of p53 tumor suppressor in stratified epithelia, using the Cre-LoxP system (hereafter Trp53 ΔEC ), induces spontaneous development of skin squamous cell carcinoma (SCC) [14,15]. Besides, skin tumor development is accelerated by inactivation of both Trp53 and Rb genes (hereafter Rb ΔEC ; Trp53 ΔEC ) [15]. Interestingly, tumors arising in both genotypes, which originate in close proximity to hair bulge, where the adult epidermal stem cells reside, display high aggressive characteristics including premature epithelial-mesenchymal transition (EMT) and distant metastasis (manuscript in preparation).
Here we have characterized the differential gene expression patterns between tumor and normal skin tissue, in order to obtain putative target genes for antitumoral therapies and/or for biomarker discovery. We have observed that primary tumors from Trp53 ΔEC and Rb ΔEC ; Trp53 ΔEC show a predominant overexpression of genes involved in cell cycle progression and mitosis regulation. The mouse tumors also display a core transcriptional profile similar to human embryonic stem cells, a feature associated with increased aggressiveness of human tumors. Cross-species studies demonstrate that the overexpressed genes could significantly identify human cancers bearing p53-mutations and/or highly aggressive behavior. Collectively, we have obtained a set of genes with reproducible overexpression in mouse samples, which could be used as targets for preclinical antitumor therapies and as biomarkers of malignancy in primary tumors.

Results
Inactivation of Trp53 in stratified epithelia leads to the generation of spontaneous epidermal tumors with complete penetrance by one year of age [15]. The simultaneous inactivation of Rb1 and Trp53 leads to earlier appearance of the tumors and faster growth at early stages when compared to inactivation of only Trp53 alleles. To fully characterize the molecular features of these tumors we performed gene expression profiling on Affymetrix microarrays using total RNA from 27 carcinomas arising in Trp53 ΔEC and Rb ΔEC ; Trp53 ΔEC mice, and 9 normal, wild type skin samples.

Gene expression comparison of tumours arising in Trp53 ΔEC and Rb ΔEC ; Trp53 ΔEC mouse models
In both genotypes the tumors appeared as small subcutaneous squamous lesions originating in or close to the hair follicles (Fig. 1a). They exhibit a fast growth leading to poorly differentiated squamous cell carcinomas (Fig. 1b), which rapidly progress, lose the expression of differentiation markers such as K10, K6 and K17 [15], and become highly undifferentiated carcinomas (Fig. 1c) and, in some cases, evolve to spindle cell carcinomas (Fig. 1d), possibly by means of a premature EMT process. Overall, at the most advanced stage Trp53 ΔEC and Rb ΔEC ; Trp53 ΔEC mice tumors are very similar and histopathologically indistinguishable (Fig. 1e) [15].
In order to characterize the tumor progression and to compare tumors with different histological grade at the molecular level, we performed supervised analysis of differential gene expression. This analysis showed significant differences depending on tumor histological grade (undifferentiated/spindle vs. poorly differentiated carcinomas) (Fig. 2). Enrichment analysis of Gene Ontology Biological Processes (GOBP) terms demonstrated increased expression of genes involved in vasculature development, cell adhesion, and endocytosis in undifferentiated/spindle carcinomas with respect to poorly differentiated carcinomas. More specifically, we have found overexpression of genes that mediate EMT such as Snai1, Zeb1 or Zeb2, or genes associated with EMT such as Tgf-brII, Dab2, Vimentin, Col6a1 and Col6a2 and Adam19. Also, we found in undifferentiated/spindle carcinomas a significant reduced expression of genes involved in keratinocyte and/or epidermal differentiation such as Cdh1 (E-cadherin), Krt17, desmocollin 1, 2 and 3, desmoplakin, claudin 1, 4 and 8, Lama5, plakophilin 1 and 3, or plakoglobin. Some of these genes can be repressed in EMT by Snai1, Zeb1 or Zeb2 transcription factors [16]. The results confirm that the undifferentiated/spindle carcinomas display molecular features of EMT tumors.
The comparative analysis of tumor appearance in Trp53 ΔEC and Rb ΔEC ; Trp53 ΔEC mice has revealed an accelerated tumor onset in Rb ΔEC ; Trp53 ΔEC indicating the existence of cooperative functions for these tumor suppressors in epidermis, which is in contrast with the absence of spontaneous tumors [17] and the reduced susceptibility to chemical carcinogenesis in mice lacking epidermal pRb [18]. However, at the advanced stage there are no overt differences in differentiation, grade or growth rate between the two genotypes [15]. In order to identify possible molecular differences/similarities in the tumors arising in both mouse models, we also performed a supervised analysis of differential expression based on mouse genotype (Trp53 ΔEC vs Rb ΔEC ; Trp53 ΔEC ). This analysis revealed that tumors in both genotypes are very similar, as only 83 probesets were differentially expressed at the significance level of FDR < 0.1 (Additional file 1). The result might indicate that although Rb somatic inactivation in double deficient mice accelerated the appearance and initial growth of tumors, it does not substantially contribute to the overall gene expression pattern of overt primary tumors.

Trp53 ΔEC and Rb ΔEC ; Trp53 ΔEC mouse tumors are enriched in human stem cell genes
Using gene enrichment analysis of 13 partially overlapping gene signatures that were compiled from the literature, Ben-Porath et al. reported that high grade, metastatic human tumors displayed gene expression programs similar to those described for human embryonic stem (ES) cells, and are also enriched for targets of key regulators of ES cell identity (Oct4, Sox2, and Nanog) and targets of Myc oncogene (key regulator of cell differentiation) [19]. Conversely, these ES-like human tumor samples displayed down-regulation of genes bound by the Polycomb repressive complex 2 (PRC2) [19]. The study by Ben-Porath et al. provides important evidence of the similarities between gene expression programs of metastatic tumors and ES cells.
Spontaneous tumors arising in Trp53 ΔEC and Rb ΔEC ; Trp53 ΔEC mice are high grade and aggressive, they originate from hair follicles (where adult epidermal stem cells reside) and, at early stages, display increased expression of certain epidermal stem cell markers such as keratin K15 [15]. Therefore, we wanted to analyze whether they also share gene expression patterns of human ES cells. To this end, we downloaded the 13 gene signatures (described in Materials and Methods) used by Ben-Porath et al., and performed a similar analysis using Gene Set Enrichment Analysis (GSEA) [20,21] on the mouse samples. We observed that tumors are enriched in human ES cell genes, and in targets of Nanog, Sox2, and Myc transcription factors (Table 1). Conversely, mouse tumors displayed repression of Polycomb targets. The analysis demonstrates similar patterns of human ES cell gene programs within the mouse epidermal tumors from p53-deficient mice, thus resembling most of the molecular features of high-grade, malignant human tumors.

Generation of a gene expression signature for epidermal tumors from p53-deficient mouse
Gene expression profiles comparing normal and tumoral samples provide information about genes that could display important functions in the carcinoma maintenance or aggressiveness, and non-essential roles in the normal tissue. The therapeutic inhibition of these genes would not affect normal tissue homeostasis but may affect tumor growth or metastasis, thus becoming potential molecular targets for therapy. In addition, interspecies comparison between human and mouse could also be useful to determine which genes display similar expression patterns so they can be considered validated targets for therapy and/or biomarkers of human cancer. In order to identify such possible genes in our mouse tumor samples, we divided the full dataset into two datasets containing 20 and 16 samples selected randomly: i) a training dataset, with 5 normal skin and 15 tumors from both genotypes, which were used to compare with human tumors and to select gene targets/biomarkers; ii) a testing dataset, with 4 normal skin and 12 tumors from both genotypes, in order to validate the selected genes in new, external samples.
Differential expression analysis between mouse tumors and control normal skin in the training dataset provided a gene expression signature of 682 probesets (371 overexpressed and 311 underexpressed in tumors) (Additional files 2 and 3). Unsupervised hierarchical clustering analysis and principal component analysis of the samples using this gene signature revealed a high degree of similarity between tumors of both Trp53 ΔEC and Rb ΔEC ; Trp53 ΔEC mouse genotypes (Fig. 3a and 3b), which is in line with the above results (see Additional file 1). Thus, the observed gene expression profile can be ascribed to a tumor gene signature from p53-deficient mouse, as somatic inactivation of Trp53 alleles is the common hallmark of both transgenic lines. Consistent with the functional roles of p53, most of the overexpressed genes in the tumors are involved in DNA replication and repair, or genomic instability and cell cycle checkpoint, as evidenced by enrichment analysis of GOBP terms (Fig. 3c). This finding is coincident with previous reports showing that TP53 mutations are associated with increased global genomic instability [22] and the observation of a high chromosomal instability in tumors and in pretumoral skin of Trp53 ΔEC mice [23]. In sharp contrast, downregulated genes are related to muscle development and physiology, which may be explained by the absence of dermal muscle layers in tumor samples (see also Fig. 1).
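The random division into training and testing datasets described above can be sketched as follows. This is only a minimal illustration: the sample identifiers, the fixed random seed, and the choice to split normal skin and tumors separately are assumptions, since the text states only the group sizes and that samples were selected randomly.

```python
import random

def split_train_test(normals, tumors, n_train_normal=5, n_train_tumor=15, seed=0):
    """Randomly split sample IDs into training and testing sets, class by class."""
    rng = random.Random(seed)  # fixed seed is an assumption, used for reproducibility
    normals, tumors = list(normals), list(tumors)
    rng.shuffle(normals)
    rng.shuffle(tumors)
    train = {"normal": normals[:n_train_normal], "tumor": tumors[:n_train_tumor]}
    test = {"normal": normals[n_train_normal:], "tumor": tumors[n_train_tumor:]}
    return train, test

# Hypothetical sample identifiers: 9 normal skin samples and 27 carcinomas.
normal_ids = [f"normal_{i}" for i in range(1, 10)]
tumor_ids = [f"tumor_{i}" for i in range(1, 28)]

train, test = split_train_test(normal_ids, tumor_ids)
print(len(train["normal"]), len(train["tumor"]), len(test["normal"]), len(test["tumor"]))
```

Splitting each class separately guarantees the 5/15 and 4/12 group sizes reported for the two datasets.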

Gene expression comparison between mouse and human tumors
To test whether the gene expression patterns that characterize Trp53 ΔEC and Rb ΔEC ; Trp53 ΔEC mouse tumors (training dataset) are also present in human cancers with TP53 mutations and/or with poor clinical outcome, we performed an exhaustive comparison of the mouse tumor signature (682-probesets) with gene datasets of human cancer samples using the Oncomine human cancer genomics database (see Materials and Methods) [24,25]. First we analyzed the genes overexpressed in the mouse tumors (371 probesets), and compared them with the overexpressed genes in human samples bearing TP53 mutations. This meta-analysis showed a very significant overlap with many human epithelial and non-epithelial cancers, indicating that multiple genes overexpressed in the mouse epidermal tumors are in common with human tumors from distinct body sites and characterized by bearing mutant TP53 gene ( Fig. 4a and Additional file 4). To study the existence of possible correlation with different types of p53 mutations in human tumors, we analyzed in further detail a breast cancer dataset containing gene expression and p53 mutation data of 247 patients [26], and which also showed the highest overlapping sig-  As a measure of the status similarity with respect to mouse tumors, we calculated an overlapping score of each breast cancer sample (see Materials and Methods) and represented it as a function of TP53 mutational status. The mean differences and significance were calculated as the tumors were grouped by TP53 mutation status (mutant or wild type) or by mutation type (missense or truncating). As expected, the differences were highly significant between tumor samples bearing TP53 mutation or wild type (Fig. 4b, left panel), demonstrating that the mouse tumors expression profile could distinguish between both types of human tumors. Furthermore, the mean values were also significantly higher in the samples with truncating mutations when compared with missense mutations (Fig. 4b, right panel). 
Overall these analyses suggest that mouse tumors with somatic deficiency of both p53 alleles resemble human tumors with TP53 mutations, especially tumors that can produce truncated p53 proteins. Next, we extracted common genes overexpressed in the mouse tumors and in human tumors bearing TP53 mutations. We found 51 genes that are overexpressed in at least 5 out of 10 studies of human tumors analyzed, representing a mouse and human tumor signature associated with TP53 mutation (51-gene signature) (Table 2).
On the other hand, we analyzed whether the mouse tumor gene signature can distinguish human aggressive tumor samples (tumors from which patients died at an early time post surgery or diagnosis) independently of the TP53 mutational status. Again, we found a highly significant overlap with human cancers of different origin (Fig. 4c, and Additional file 4), including carcinomas (breast, brain, bladder, and renal), and hematological cancers (lymphoma and myeloma). The overlapping of the tumor signature of p53-deficient mouse with genes differentially expressed in human aggressive tumors suggests that this signature could be predictive of malignant progression, in agreement with the observation that TP53 mutations have been associated with poor prognosis in human cancer [9,11]. To study this, we also extracted mouse tumor genes that display overexpression in highly metastatic human tumors (in at least 14 out of 28 studies). These genes (n = 26) represent a mouse and human tumor signature associated with poor prognosis (26-gene signature) (Table 2).
Remarkably, 20 genes (asterisks in Table 2) are common between the 26-gene signature (associated with poor prognosis) and the 51-gene signature (associated with TP53 mutation), corroborating, in agreement with others [9,11], that TP53-mutations or p53 functional inhibition is a common hallmark of human cancer malignancy. These 20 genes (20-gene signature) represent biomarkers of TP53 mutant and/or aggressive tumors, and, consequently, possible therapeutic targets.
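The selection logic behind the 51-gene, 26-gene and 20-gene signatures (genes recurrently overexpressed across studies, then intersected) can be illustrated with a small sketch. The gene names and study contents below are placeholders, not data from the paper, and `recurrent_genes` is a hypothetical helper name.

```python
from collections import Counter

def recurrent_genes(studies, min_studies):
    """Return genes reported as overexpressed in at least `min_studies` studies."""
    counts = Counter(gene for study in studies for gene in set(study))
    return {gene for gene, n in counts.items() if n >= min_studies}

# Toy inputs (placeholders): genes overexpressed in the mouse tumors and
# per-study lists of genes overexpressed in human tumors.
mouse_overexpressed = {"G1", "G2", "G3", "G4"}
tp53_studies = [{"G1", "G2"}, {"G1", "G3"}, {"G1", "G2", "G9"}]
outcome_studies = [{"G2", "G4"}, {"G2", "G4"}, {"G1"}, {"G2"}]

# The paper uses thresholds of 5 out of 10 and 14 out of 28 studies;
# a threshold of 2 is used here only because the toy example is small.
tp53_signature = recurrent_genes(tp53_studies, 2) & mouse_overexpressed
outcome_signature = recurrent_genes(outcome_studies, 2) & mouse_overexpressed
common_signature = tp53_signature & outcome_signature
print(sorted(tp53_signature), sorted(outcome_signature), sorted(common_signature))
```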
It is worth mentioning that a similar analysis was also performed using genes showing reduced expression in mouse tumors as compared with normal mouse skin. However, the number of overlapping signatures and their significance is lower (Additional files 4 and 5).

Validation of the 20-gene signature in the mouse testing dataset
In order to validate the genes of the 20-gene signature as possible cancer targets or biomarkers, it is necessary to analyze its predictive capability to distinguish between normal skin and tumor samples arising in new mice. Gene expression values for the 20 genes from the testing mouse dataset were extracted (Fig. 5a), and the prediction accuracy to distinguish between normal or tumoral condition was calculated using Support Vector Machine (SVM) [27], Uncorrelated Shrunken Centroid Classification (USC) [28], K-Nearest Neighbor Classification (KNNC) [29] and Discriminant Analysis Module (DAM) [30]. The accuracy of the classifiers was very high: 87.5% for USC, 93.8% for SVM and KNNC, and 100% for DAM (Fig. 5b). This result demonstrates that the 20-gene signature is a good predictor of carcinoma vs. normal skin samples, so the 20 genes could be considered good markers of Trp53 ΔEC and Rb ΔEC ; Trp53 ΔEC mouse tumors and good targets for preclinical antitumor therapies.
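As an illustration of how such a prediction accuracy is computed, the sketch below implements a plain k-nearest-neighbor classifier and an accuracy function. It is not the implementation used in the study; the two-dimensional toy expression values, the Euclidean distance and k = 3 are assumptions. With 16 testing samples, accuracies of 87.5%, 93.8% and 100% would correspond to 14, 15 and 16 correct calls.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training samples."""
    dists = sorted((math.dist(row, x), label) for row, label in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label matches the real label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

# Toy training data (placeholders): two expression values per sample.
train_x = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (3.0, 3.2), (2.9, 3.1), (3.3, 2.8)]
train_y = ["normal"] * 3 + ["tumor"] * 3

test_x = [(1.0, 1.0), (3.0, 3.0)]
test_y = ["normal", "tumor"]
predicted = [knn_predict(train_x, train_y, x) for x in test_x]
print(predicted, accuracy(test_y, predicted))
```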

Patient stratification using 20-gene signature
To further confirm that the 20-gene signature obtained could be suitable for prediction analysis of human cancer outcome, we analyzed the overall survival of human patients depending on the expression pattern of this signature using four studies representing three different cancer types: breast cancer, astrocytoma, and multiple myeloma. To do this, we computed the sum of the expression values of the 20 genes (20-gene score) in each human tumor sample, we classified the samples depending on this score, and stratified the samples in three groups: low, intermediate and high score (Fig. 6). The association with survival of the resulting sample clusters was analyzed using Kaplan-Meier curves. The results showed that patients displaying low scores (that is, low expression values of the 20-gene signature) had a higher survival probability than those displaying intermediate or high scores (Fig. 6a-d), which suggests that the mouse-derived signature could help to determine the prognosis of human cancers.
Table 2 footnotes: 1 Genes were selected if they have a significant differential expression between wild type and TP53-mutated human tumors, in at least 5 out of 10 human studies (Fig. 4a). 2 Genes were selected if they have a significant differential expression between primary tumors with poor versus good outcome in at least 14 out of 28 human studies (Fig. 4c). *Genes common between both lists: 20-gene signature.
Fig. 5b panel labels: Real class (4, 12); Predicted class.
Gene expression predictive tests have been previously developed for breast cancer with the aim to be implemented in clinical use. Thus, we wanted to analyze how the 20-gene derived patient stratification compares with prognostic signatures such as 70-gene [31] or 76-gene [32,33]. Survival curves according to 70-gene (Fig. 6a) and 76-gene (Fig. 6b) for good and poor prognosis breast cancer patients are very similar to the curves of low and high 20-gene scores, respectively. Furthermore, survival of patient groups using 20-gene scores displayed similar significance p-values compared to survival of 70-gene and 76-gene prognostic groups. Interestingly, 20-gene has 4 genes in common with 70-gene (BIRC5, BUB1, CENPA, and CKS2) and 2 genes in common with 76-gene (PLK1 and KPNA2). Finally, we also investigated whether the genes identified could also behave as possible biomarkers for malignant progression of human cancer. To this end, we analyzed the expression of two of the genes belonging to the 20-gene signature, UBE2C and AURKA, using immunohistochemistry on tissue array samples from human cervical (n = 55) and breast cancer (n = 86). The p53 pathway in both cancer types is frequently inhibited, either by expression of human papillomavirus E6 oncogene, which induces p53 protein degradation in cervical carcinomas, or by mutation of the TP53 gene in breast carcinomas. As the tissue collections include primary cancer samples of different tumor grades, we could relate the expression of both proteins with tumor grade.
Furthermore, in the case of breast cancer, we could assess the expression in metastases. The analysis of immunohistochemical data (Fig. 7) revealed that UBE2C and AURKA expression levels are higher in undifferentiated tumors of cervical (p < 0.0001 and p < 0.004, for UBE2C and AURKA respectively) and breast (p < 0.001 and p < 0.0001, for UBE2C and AURKA respectively) cancer samples. Moreover, in both cases there is significant correlation between the expression levels of these putative biomarkers and grade (p < 0.001 and p < 0.02, for cervix and breast tumors respectively). As undifferentiated tumors are more aggressive and metastatic, we can suggest that UBE2C and AURKA are overexpressed in malignant tumors. Furthermore, expression levels are higher in metastases of breast cancer tumors (p < 0.001 and p < 0.01, for UBE2C and AURKA respectively), which again points to a role for these proteins in metastatic behavior.
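The scoring and stratification steps described above can be sketched in a few lines. The text does not state how the low, intermediate and high groups were delimited, so the rank-based split into thirds used here is an assumption, as are all function names and toy values. A basic Kaplan-Meier estimator is included to show how survival probabilities for one group would be derived.

```python
def signature_score(expression, genes):
    """Sum of the expression values of the signature genes in one sample."""
    return sum(expression[g] for g in genes)

def tertile_groups(scores):
    """Assign each sample to low / intermediate / high by rank (assumed thirds)."""
    order = sorted(scores, key=scores.get)
    n = len(order)
    labels = ("low", "intermediate", "high")
    return {s: labels[min(3 * rank // n, 2)] for rank, s in enumerate(order)}

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an observed death.

    events[i] is 1 for a death at times[i] and 0 for a censored observation.
    """
    at_risk, surv, curve = len(times), 1.0, []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving  # deaths and censored samples both leave the risk set
    return curve

# Toy data (placeholders): per-sample scores and one group's follow-up.
scores = {f"s{i}": float(i) for i in range(1, 7)}
groups = tertile_groups(scores)
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(groups, curve)
```

Comparing the curves of the three groups (for example with a log-rank test) would be the next step; that part is not shown here.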
Collectively, these results demonstrate that the mouse tumor derived 20-gene signature efficiently distinguishes groups of patients with different outcomes and from different types of human cancer.

Discussion
Mouse models of human cancer could potentially be used as tools for preclinical analysis of antitumor therapies. However, before doing so, a full molecular characterization of the animal tumors is necessary to be able to compare them with the human tumors counterpart and to validate them. Gene expression studies are perfectly Moreover, interspecies comparison could also be useful to determine which genes display similar expression patterns in human and mouse models so they can be considered targets for therapy and/or cancer biomarkers in preclinical settings. Here we report the gene expression profiling of mouse tumors arising in epidermis as a consequence of the somatic ablation of either Trp53 or Trp53 and Rb tumor suppressor genes.
The supervised gene expression profiles of the mouse tumors obtained demonstrated that there are no major differences between the two genotypes. This finding is in agreement with our previous results using different allelic combinations and indicating that loss of Rb mediated acceleration of tumor appearance but did not affect the histological grade, growth or aggressiveness of overt tumors [15]. The fact that most of the overexpressed genes are involved in cell cycle or mitotic control as well as chromosome instability (see Fig. 3c) is also in agreement with our previous data indicating that development of tumors in the Trp53 ΔEC mice is associated with early chromosome aberrations due to deregulated centrosome division occurring in pretumoral epidermis [23]. In this context, the increase in proliferation due to Rb loss [17] can accelerate the process of tumorigenesis by increasing the number of cells subject to such chromosome alterations.
Our data also suggested another important hypothesis: since mutations in TP53 and chromosome instability are associated with increased aggressiveness and malignancy in human patients, we can speculate that mouse p53-deficient epidermal tumors can represent a well suited model for human malignant cancer analysis. Three major findings of the mouse-to-human microarray gene expression comparison presented here support this hypothesis: i) there is a significant overlapping of gene expression patterns between mouse tumors and human cancers bearing mutant TP53; ii) the gene expression signatures typical of human ES cells are displayed by mouse tumors; and iii) the overlapping of overexpressed genes between mouse tumors and highly malignant human primary tumors from various origins. The similarities that we have found between mouse tumors and human cancer samples from different tissues and origin could be explained by the undifferentiated status of animal tumor samples. In this sense, the 20-gene signature (asterisks in Table 2) shares 13 genes with a meta-signature of undifferentiated human cancer (69 genes) previously published [34], demonstrating the molecular similarities between mouse tumors and human high grade samples. Furthermore, the undifferentiated and aggressive features of the mouse tumors could be related to the expression pattern representative of an ES cell-like phenotype. GSEA using signatures of human ES cell targets and key regulators of ES cell identity (Oct4, Sox2, and Nanog) or Myc oncogene demonstrates that mouse tumors display an expression pattern similar to that of ES cells.
One important feature of the spontaneous tumors arising in both genotypes is the rapid loss of the differentiated phenotype. The microarray analysis corroborates the changes observed by histology studies and confirms, at the molecular level, that an important fraction of the tumors undergo EMT. Since EMT also correlates with metastatic properties, this result would also support a highly metastatic behavior of the mouse epidermal tumors. Analyses to characterize these features in more detail are underway.
These observations also reinforce the possibility that mouse models can be useful as tools for preclinical analysis of potential antitumor therapies targeted against specific signaling pathways or gene products. By means of meta-analysis approaches and interspecies comparison we developed a signature composed of 20 genes, which displays the following attributes: i) it is composed of genes displaying increased expression in tumors compared to normal tissue, being possible targets with important functions in the carcinoma maintenance/aggressiveness; ii) it is independent of the genotype (Trp53 ΔEC or Rb ΔEC ; Trp53 ΔEC ) or the histological subtype; and more importantly, iii) it can identify human primary tumors bearing TP53 mutations and/or displaying a more aggressive malignant behavior. In consequence, we postulate that these 20 genes, besides being considered biomarkers of malignant human cancer progression, could also be useful cancer therapeutic targets, whose inhibition can be preclinically tested in our mouse models.
Consistent with the functional roles of p53, the 20-gene signature contains genes involved in DNA replication and repair, or genomic instability and cell cycle checkpoint (Table 2). This finding is coincident with previous reports showing that TP53 mutations are associated with increased global genomic instability [22], and with a report describing a signature of chromosomal instability (CIN25) inferred from gene expression profiles that predicts clinical outcome in multiple human cancers [35]. Thus, the results suggest that chromosomal instability mediated by the loss of TP53 could be the driving force of metastatic behavior in primary tumors with TP53 mutations.
Epidemiological studies of human cancer demonstrate that somatic mutations in the TP53 gene are mostly missense (73.6% of all mutations). Some missense mutations produce p53-mutant proteins that have been associated with gain of function, or dominant-negative activities. Supported by studies in genetically engineered mice [13], gain of function activity renders a more transformation prone phenotype and confers major metastatic advantages. However, it is important to note that such features have been characterized in mice when the mutations were introduced in combination with specific oncogenes such as K-ras [36] or in the absence of MDM2 [37]. In addition, the dominant-negative activity of mutated p53 is only apparent upon external carcinogenic challenges [38]. On the other hand, an important proportion of TP53 mutations found in human cancers would give rise to truncated p53 proteins, such as nonsense, frameshift and large deletion mutations (16.6% of all mutations). In this sense, our animal models of somatic inactivation of the Trp53 mouse gene could represent a suitable model for these types of human mutations. Analysis of the expression deregulation similarities of the mouse tumors and human samples with either truncating or missense TP53 mutations (Fig. 4b) suggests that animal samples share molecular features with both types of mutations, but more significantly with those producing truncated p53 protein. Recently, it has been described that an important proportion of BRCA1-related human breast tumors display TP53 mutations that produce truncated p53 proteins [39]. Approximately half of all hereditary breast cancers are due to loss of BRCA1 or BRCA2 function. Thus, cancer treatments that can restore TP53 pathway function in the mouse models could then be used to treat these BRCA1-related breast tumors, where the p53 loss has been suggested to be essential for tumorigenesis.
The results shown here constitute a comprehensive metagenomic comparison of mouse p53-deficient skin SCCs with human cancer in which we describe common genes of p53-dependent malignancy. These genes are markers of malignant cancer, and potential targets for antitumor therapy. Furthermore, as these genes share expression patterns in both species and in different types of human cancer, our mouse models constitute validated models to initiate preclinical analysis of antitumor therapies that could be useful against p53-mutated human cancer.

Microarray analysis of mouse skin tumors
RNA was obtained from 9 normal wild type control skin samples and 27 tumors of both genotypes, and purified from mice tissue as previously described [15]. Hybridization was done to the Affymetrix Mouse Gene expression MOE430 2.0 array. Raw and processed data for training and testing mouse datasets were deposited in the GEO database with the accession identifiers GSE11990 and GSE19616, respectively. Supervised analysis of differential expression between tumors with different histological grade or arising in the two different genotypes was done using the t-test available in the open source software Multiexperiment Viewer 4.5 (MeV 4.5) [40]. The p-values were corrected using FDR. To obtain a tumor signature of p53-deficient mice, differential expression of mouse tumors compared to normal skin tissue was performed using the t-test and Significance Analysis of Microarrays (SAM) [41]. Probes were selected if they passed two criteria: i) t-test analysis with FDR-corrected p-value < 3 × 10^-7; and ii) SAM analysis with q-value < 1 × 10^-3 for FDR. A total of 682 probesets were selected as differentially regulated, 371 being overexpressed and 311 underexpressed in mouse tumors (Fig. 3, Additional files 2 and 3). MOE430 2.0 Affymetrix chip probeset IDs were mapped to human using the Ailun web utility [42].
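The two-criterion probe selection can be illustrated as follows. The text says only that p-values were "corrected using FDR"; the Benjamini-Hochberg procedure shown here is an assumption about which correction that refers to, and the function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def passes_both(adjusted_p, sam_q, p_cut=3e-7, q_cut=1e-3):
    """Selection rule from the text: t-test FDR p < 3e-7 and SAM q < 1e-3."""
    return adjusted_p < p_cut and sam_q < q_cut

adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])  # toy p-values
print(adjusted)
print(passes_both(1e-8, 1e-4), passes_both(1e-8, 0.01))
```

The SAM q-values themselves come from a permutation procedure that is not reproduced in this sketch.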

Enrichment analysis of Gene Ontology terms
Probeset identifiers of differentially expressed genes were uploaded into the DAVID Functional Annotation web tool, which computes enrichment of Gene Ontology biological process terms using the EASE score [43,44].

Enrichment analysis in human stem cells signatures
GSEA [20,21] was used to analyze the enrichment of human embryonic stem (ES) cell gene signatures within the mouse tumors when compared to normal skin. Gene sets were downloaded from Ben-Porath et al. [19], and fall into four groups: (i) ES expressed genes: two sets of genes overexpressed in ES cells compared to other cells and tissues according to a multistudy compilation and meta-analysis [45]; (ii) Nanog, Oct4 and Sox2 (NOS) targets: four sets of genes whose promoters are bound and activated in human ES cells by each of these regulators of ES cell identity, or co-activated by all three [46], and an additional set (NOS TFs) including a subset of NOS activation targets encoding transcription regulators; (iii) Polycomb targets: four sets representing genes bound by the Polycomb repressive complex 2 (PRC2) in human ES cells [47]; and (iv) Myc targets: two sets of genes bound and activated by c-Myc, identified in two independent studies [48,49].
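For readers unfamiliar with GSEA, the core of the method is a running-sum enrichment score computed over a list of genes ranked by their tumor-versus-normal difference. The sketch below illustrates that statistic only. It is not the GSEA software used in this study, it omits the permutation step used to assess significance, and the function name `enrichment_score` and its arguments are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score of gene_set in a ranked gene list.

    ranked_genes: gene symbols sorted by decreasing tumor-vs-normal score.
    scores: the matching ranking scores, in the same order.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_norm == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        # Step up at set members (weighted by score), step down otherwise.
        running += (abs(s) ** weight / hit_norm) if h else (-1.0 / n_miss)
        if abs(running) > abs(best):
            best = running  # signed maximum deviation from zero
    return best
```

A positive score indicates that the gene set is concentrated among the genes most overexpressed in tumors, and a negative score that it is concentrated among the most underexpressed.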

Overlapping analysis in human cancer gene expression studies
We used the Oncomine Gene Expression Signatures database [24] to search for overlaps with human cancer gene expression signatures. Genes overexpressed or underexpressed in the tumor signature of p53-deficient mice were mapped to human gene symbols and loaded into the Oncomine database. Association of the mapped signatures with the database signatures was tested using Fisher's exact test, and was considered significant for an odds ratio > 1.25 and a p-value < 0.01. Although both sets of genes display similar trends, the significance was lower for the underexpressed genes. We searched for overlaps using different filtering criteria, based on the type of human cancer comparison performed. These criteria were "Molecular Subtype: Mutation" and "Clinical Outcome". For the overlap with the Ivshina et al. study of TP53 mutational status in breast cancer samples [26] (asterisk in Fig. 4a), we calculated the sum of the expression values of the 98 common genes in each breast cancer sample as a measure of its similarity to the mouse tumors. The higher the sum, the greater the overlap with the mouse tumor signature.
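The overlap test can be illustrated with a small sketch that computes the odds ratio and a Fisher's exact p-value from four counts. Several points are assumptions rather than details of the Oncomine implementation: the test is taken as one-sided (enrichment only), the gene universe size must be supplied by the user, and the function names `overlap_test` and `is_significant` are hypothetical. The thresholds are the ones stated above.

```python
from math import comb


def overlap_test(n_overlap, n_query, n_concept, n_universe):
    """Odds ratio and one-sided Fisher's exact p-value for a gene-set overlap."""
    a = n_overlap                   # genes in both signatures
    b = n_query - n_overlap         # genes only in the mouse-derived signature
    c = n_concept - n_overlap       # genes only in the database signature
    d = n_universe - a - b - c      # genes in neither
    if min(a, b, c, d) < 0:
        raise ValueError("inconsistent counts")
    odds_ratio = float("inf") if b * c == 0 else (a * d) / (b * c)
    # Hypergeometric upper tail: probability of an overlap of at least `a`.
    upper = min(n_query, n_concept)
    tail = sum(comb(n_concept, k) * comb(n_universe - n_concept, n_query - k)
               for k in range(a, upper + 1))
    return odds_ratio, tail / comb(n_universe, n_query)


def is_significant(odds_ratio, p_value, or_cut=1.25, p_cut=0.01):
    """Apply the two significance thresholds used in the text."""
    return odds_ratio > or_cut and p_value < p_cut
```

Using the odds ratio alongside the p-value guards against overlaps that are statistically detectable in large gene sets but too small in magnitude to be meaningful.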

Validation analysis in the mouse tumor testing dataset
For both the training (GSE11990) and testing (GSE19616) mouse datasets, the gene expression values for the 20-gene signature were extracted. For genes represented by more than one probeset, the probeset with the maximal value was selected. Similar results were obtained using the median value of the probesets. For each prediction method (SVM, USC, KNNC and DAM), the classifier was trained on the GSE11990 dataset and tested on the GSE19616 dataset, using default settings within the MeV 4.5 software.
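The probeset-collapsing step and the train-then-test design can be sketched as below. This is an illustration under stated assumptions, not the MeV 4.5 code: the maximal probeset is chosen within each sample (the text does not say whether the maximum is taken per sample or across the dataset), a plain k-nearest-neighbour vote with Euclidean distance stands in for the KNNC method, and the names `collapse_to_genes` and `knn_predict` are hypothetical.

```python
from collections import Counter
from math import dist


def collapse_to_genes(probe_values, probe_to_gene):
    """Keep, for each gene, the maximal value among its probesets (one sample)."""
    genes = {}
    for probe, value in probe_values.items():
        gene = probe_to_gene[probe]
        genes[gene] = max(value, genes.get(gene, value))
    return genes


def knn_predict(train_x, train_y, sample, k=3):
    """Label a sample by majority vote among its k nearest training samples."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda xy: dist(xy[0], sample))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

In this design the training samples would come from GSE11990 and each GSE19616 sample would be passed as `sample`, so the test data never influence the classifier.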

Immunohistochemistry on human tissue arrays
FFPE tissue arrays of breast and cervical carcinomas were purchased from Cybrdi, Inc. (Maryland, USA). Individual clinical specimens were pathologically confirmed. Immunohistochemistry was done using standard protocols on deparaffinized sections using a polyclonal rabbit antibody to human UBE2C or Aurora kinase A (AbCam). The slides were microwaved for 15 min. Biotin-conjugated secondary antibody was purchased from Jackson ImmunoResearch Laboratories and used at 1:1,000. Peroxidase was visualized using the avidin-biotin complex method and a 3,3'-diaminobenzidine kit (Vector). Double-blind analysis was performed to assign a staining score (0 to 5) to each sample, considering the intensity and the percentage of tumor cells stained. Statistical analyses (χ² and Cox correlation) were performed using SPSS software.

Ethics Statement
All animals were handled in strict accordance with good animal practice as defined by the relevant international laboratory animal welfare bodies (FELASA), and all animal work was approved by the Animal Ethical Committee and conducted in compliance with Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT) Guidelines.