Loss, mutation and deregulation of L3MBTL4 in breast cancers

Background: Many alterations are involved in mammary oncogenesis, including amplifications of oncogenes and losses of tumor suppressor genes (TSG). Losses may affect almost all chromosome arms and many TSGs remain to be identified.
Results: We studied 307 primary breast tumors and 47 breast cancer cell lines by high resolution array comparative genomic hybridization (aCGH). We identified a region on 18p11.31 lost in about 20% of the tumors and 40% of the cell lines. The minimal common region of loss (Chr18:6,366,938-6,375,929 bp) targeted the L3MBTL4 gene. This gene was also targeted by breakage in one tumor and in two cell lines. We studied the exon sequence of L3MBTL4 in 180 primary tumor samples and 47 cell lines and found six missense and one nonsense heterozygous mutations. Compared with normal breast tissue, L3MBTL4 mRNA expression was downregulated in 73% of the tumors, notably in luminal, ERBB2 and normal-like subtypes. Losses of the 18p11 region were associated with low L3MBTL4 expression level. Integrated analysis combining genome and gene expression profiles of the same tumors pointed to 14 other potential 18p TSG candidates. Downregulated expression of ZFP161, PPP4R1 and YES1 was correlated with luminal B molecular subtype. Low ZFP161 gene expression was associated with adverse clinical outcome.
Conclusion: We have identified L3MBTL4 as a potential TSG of chromosome arm 18p. The gene is targeted by deletion, breakage and mutations and its mRNA is downregulated in breast tumors. Additional 18p TSG candidates might explain the aggressive phenotype associated with the loss of 18p in breast tumors.


Background
The development and progression of breast cancer result from the accumulation of genetic alterations, such as amplifications of oncogenes and deletions of tumor suppressor genes (TSG), in the epithelial cells of the mammary gland. Frequent deletions have been reported on chromosome arms 1p, 3p, 7q, 8p, 9p, 16q and 17p, but only a few reports describe such deletions on chromosome arm 18p [1][2][3].
We profiled a series of 307 primary breast tumors and 47 breast cancer cell lines by using high resolution array comparative genomic hybridization (aCGH). We identified a region on 18p11.31 deleted in 25% of the tumors and 40% of the cell lines. We delineated a minimal common region of deletion that targeted the L3MBTL4 gene. The L3MBTL4 gene is one of the four human orthologs of Drosophila lethal (3) malignant brain tumor (l(3)mbt). L(3)mbt is a bona fide TSG in the fly [4,5]. The L3MBTL4 protein contains three "malignant brain tumor" (MBT) domains. This domain of about 100 amino acid residues is conserved in protostomians and deuterostomians and often exists as repeats [6]. The MBT domain binds methylated histone residues. The human genome contains several MBT-containing proteins, some of which have been linked to gene regulatory pathways and polycomb-mediated repression, and to cancer [7]. L3MBTL1 is a TSG implicated in myeloid malignancies [8], and L3MBTL3 deregulation is associated with medulloblastoma [9].
To document the involvement of L3MBTL4 in breast cancer, we searched for mutations by sequence analysis of 180 primary tumor samples and 47 cell lines. We found that, in addition to deletions, L3MBTL4 is targeted by mutations. Finally, we found that L3MBTL4 deletions correlate with low mRNA expression and with the presence of lymph node metastasis, high Scarff-Bloom-Richardson (SBR) grade and luminal B molecular subtype.
Our study is the first to identify a region on chromosome arm 18p likely to contain a putative TSG involved in breast oncogenesis, and to show that this TSG may be L3MBTL4. We did not exclude the existence of other potential 18p TSGs, some of which could explain the aggressive phenotype of breast tumors with 18p loss.

Breast tumors
Tumor tissues were collected from 307 patients with primary adenocarcinoma who underwent initial surgery at the Institut Paoli-Calmettes (Marseille, France) between 1992 and 2004. Immediately after macroscopic examination of the surgery specimen by two pathologists, samples containing more than 60% of tumor cells were obtained, frozen in liquid nitrogen and stored at -80°C until nucleic acid extraction. The main histoclinical characteristics of the tumors are listed in Additional file 1, Table S1. Each patient gave written informed consent and the study was approved by our institutional review committee.

DNA and RNA extraction
DNA and RNA were extracted from frozen samples by using guanidinium isothiocyanate and cesium chloride gradient, as previously described [15]. DNA quality and RNA integrity were controlled by polyacrylamide gel electrophoresis and on an Agilent Bioanalyzer (Agilent Technologies, Massy, France), respectively. DNA was also extracted from a normal area of paraffin-embedded T8584, T8847, T9193 and T8525 tissues.

Genome analysis by array-comparative genomic hybridization (aCGH)
Genome profiles were established on 307 tumors and 47 breast cancer cell lines by aCGH using 244K CGH Microarrays (Hu-244A, Agilent Technologies, Massy, France) as previously described [16]. A pool of 13 normal male DNAs was used as reference. Scanning was done with an Agilent Autofocus Dynamic Scanner (G2565BA, Agilent Technologies). Data were analyzed and visualized with CGH Analytics 3.4 software (Agilent Technologies). Log2 ratios were extracted from CGH Analytics, while normalized and filtered log2 ratios were obtained from "Feature Extraction" software (Agilent Technologies). Data generated by probes mapped to the X and Y chromosomes were eliminated. The final dataset contained 225,388 unique probes covering 22,509 genes and intergenic regions according to the hg17/NCBI human genome mapping database (build 35). Data were analyzed using circular binary segmentation (CBS) [17] as implemented in the DNAcopy R/Bioconductor package [18] with default parameters to translate intensity measurements into regions of equal copy number, each region being defined by at least five consecutive probes. Thus, each probe was assigned a segment value referred to as its "smoothed" value.
We used a threshold value of |0.33| (log2 ratio) to define a copy number aberration (CNA) [16]. Identification of copy number variations (CNV) was done using the regions published by McCarroll et al. [19], which are stored in the Database of Genomic Variants (release v8) [20]. To determine altered regions, we used the GISTIC algorithm [21], which computes for each genomic segment across the whole genome a score based on the frequency of CNA combined with its amplitude, with bootstrapping to calculate the significance level (p < 0.001). To establish significant association between CNA and categorical variables, Fisher's exact test was used. Gains and losses were handled separately.
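The thresholding step can be illustrated with a minimal Python sketch. It assumes that a smoothed log2 ratio strictly above 0.33 is called a gain and strictly below -0.33 a loss; the text does not say whether the cutoff is inclusive. The function names and the example values are hypothetical and are not taken from the study.

```python
from typing import List

CNA_THRESHOLD = 0.33  # log2-ratio cutoff stated in the text


def call_cna(smoothed_log2_ratio: float, threshold: float = CNA_THRESHOLD) -> str:
    """Label one smoothed segment value as 'gain', 'loss' or 'neutral'.

    Assumption: the cutoff is exclusive on both sides.
    """
    if smoothed_log2_ratio > threshold:
        return "gain"
    if smoothed_log2_ratio < -threshold:
        return "loss"
    return "neutral"


def loss_frequency(values: List[float]) -> float:
    """Fraction of samples whose smoothed value at one locus is called a loss."""
    calls = [call_cna(v) for v in values]
    return calls.count("loss") / len(calls)


# Hypothetical smoothed log2 ratios for one probe across five samples
example = [-0.52, 0.05, -0.10, 0.41, -0.35]
print([call_cna(v) for v in example])
print(loss_frequency(example))
```

Gains and losses are counted separately here, in line with the statement that they were handled separately.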

Mutation analysis
Sequence analysis was done on 180 of the 307 breast tumors and on the 47 breast cell lines after amplification of genomic DNA. Seventeen primer pairs were designed to amplify L3MBTL4 DNA by polymerase chain reaction (PCR) (Additional file 1, Table S2). PCR amplifications were done in a total volume of 25 μl PCR mix containing at least 10 ng template DNA, Taq buffer, 500 μmol of each deoxynucleotide triphosphate, 100 μmol of each primer and 1 unit of Hot Star Taq DNA polymerase (Qiagen, Courtaboeuf, France). PCR amplification conditions were as follows: 95°C for 10 min; 35 cycles of 95°C for 30 sec, annealing at 55°C for 30 sec and 72°C for 45 sec to 1 min depending on PCR product length; and 72°C for 10 min. PCR products were purified using Millipore plate MSNU030 (Millipore SAS, Molsheim, France). Aliquots (1 μl) of the purified PCR products were sequenced using the Big Dye terminator v1.1 Cycle Sequencing Kit (Applied Biosystems, Courtaboeuf, France) including the forward or reverse primer. The sequencing products were purified through Multi-Screen-HV 96-well filter plates (Millipore, Billerica, MA, USA) preloaded with Sephadex G-50 (Sigma, St Louis, MO, USA). The reactions were run on the ABI 3130XL Genetic Analyser (Applied Biosystems, Courtaboeuf, France). The sequence data files were analyzed using the phred/phrap/consed software. All mutations were confirmed by a second round of PCR and sequencing reactions in both directions.

Quantitative real time reverse-transcribed PCR (qRT-PCR)
L3MBTL4 gene expression level was analyzed by qRT-PCR on a set of 52 of the 307 tumors. Two μg of total RNA, treated beforehand with RNase-free DNase (Promega, France), was reverse-transcribed using SuperScript II RT and 100 ng of random hexamers (Invitrogen, France). PCR reactions were carried out in a LightCycler 2.0 instrument (Roche, Germany) in a final volume of 20 μl according to the supplier's recommendations using the LightCycler FastStart DNA Master plus SYBR Green I Kit (Roche, Germany). L3MBTL4 primers, L3MBTL4-F (CTTGGAGCAAGCTGAAGAGG) and L3MBTL4-R (TGGAAAGGACTGATCCTTGG) (Sigma-Aldrich, Austria), were designed to anneal to exon 4 and exon 6, respectively. Primers for the control gene, GUSB, were GUSB-F (GAAAATATGTGGTTGGAGAGCT) and GUSB-R (CCGAGTGAAGATCCCCTTTTTA) (Sigma-Aldrich, Austria). The mean threshold cycle (Cp) was calculated for each gene and ΔCp was defined as ΔCp = Cp (GUSB) - Cp (L3MBTL4). The ΔCp was determined two to six times for each sample and the mean was calculated. The fold ratio of L3MBTL4 transcripts was calculated using the equation fold ratio = 2^ΔCp. Commercial pools of normal breast RNA (Clontech, Palo Alto, CA) were analyzed and used as control.
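As an illustration of the ΔCp calculation, the following Python sketch computes the fold ratio from replicate Cp pairs. Dividing the tumor fold ratio by that of the normal breast pool is an assumption about how the control was used; the text only states that the pool served as control. All Cp values and function names are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def delta_cp(cp_gusb: float, cp_l3mbtl4: float) -> float:
    """ΔCp = Cp(GUSB) - Cp(L3MBTL4), as defined in the text."""
    return cp_gusb - cp_l3mbtl4


def fold_ratio(replicates: Sequence[Tuple[float, float]]) -> float:
    """Return 2 ** (mean ΔCp) over replicate (Cp GUSB, Cp L3MBTL4) pairs."""
    mean_dcp = mean(delta_cp(g, t) for g, t in replicates)
    return 2 ** mean_dcp


# Hypothetical Cp measurements (two replicates each)
tumor = [(22.0, 29.0), (22.5, 29.5)]    # ΔCp = -7.0 in both replicates
normal = [(22.0, 27.0), (22.5, 27.5)]   # ΔCp = -5.0 in both replicates

# Assumption: expression relative to the normal breast pool is the ratio
# of the two fold ratios.
relative_to_normal = fold_ratio(tumor) / fold_ratio(normal)
print(relative_to_normal)
```

With these made-up values the tumor ΔCp is two cycles lower than the normal ΔCp, which corresponds to a four-fold lower transcript level.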

Gene expression profiling with DNA microarrays
A total of 229 tumors studied by aCGH, as well as 4 normal breast tissue samples, were profiled with Affymetrix U133 Plus 2.0 human oligonucleotide microarrays as previously described [22]. Scanning was done with Affymetrix GeneArray scanner. Data were analyzed by 'Robust Multichip Average' (RMA) with the nonparametric quantile algorithm as normalization parameter in R/Bioconductor and associated packages [18].
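The quantile normalization step used within RMA can be sketched in plain Python as follows. This is a simplified illustration, not the Bioconductor implementation: it omits background correction and probe set summarization, and it does not average tied values. The input intensities are hypothetical.

```python
from typing import List


def quantile_normalize(arrays: List[List[float]]) -> List[List[float]]:
    """Force every array to share the same intensity distribution.

    arrays: one list of probe intensities per microarray, all of equal length.
    Ties are not averaged in this simplified sketch.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean intensity across arrays at each rank
    reference = [
        sum(s[rank] for s in sorted_arrays) / len(arrays)
        for rank in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Probe indices ordered by ascending intensity in this array
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Two hypothetical arrays with three probes each
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(quantile_normalize(arrays))
```

After normalization each array contains the same set of values, assigned according to the original rank of each probe.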
All probes were mapped based on their EntrezGeneID. When multiple probe sets were mapped to the same gene, probe sets with the extension "_at" were kept preferentially, then those with "_s_at", followed by all other extensions. When several probe sets with the best extension were available, the one with the highest median value was retained.
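The selection rule above can be written as a small Python sketch. It assumes that Affymetrix probe set identifiers end in "_at", optionally preceded by a one-letter code such as "_s" or "_x". The identifiers and expression values below are hypothetical, and the function names are illustrative only.

```python
import re
from statistics import median
from typing import Dict, List


def suffix_priority(probe_set_id: str) -> int:
    """Rank the extension: 0 for plain '_at', 1 for '_s_at', 2 for anything else."""
    match = re.search(r"_([a-z])_at$", probe_set_id)
    if match is None:
        return 0 if probe_set_id.endswith("_at") else 2
    return 1 if match.group(1) == "s" else 2


def select_probe_set(expression: Dict[str, List[float]]) -> str:
    """Pick one probe set for a gene.

    expression maps each probe set ID of the gene to its values across samples.
    The best extension wins; among equals, the highest median is retained.
    """
    return min(
        expression,
        key=lambda pid: (suffix_priority(pid), -median(expression[pid])),
    )


# Hypothetical probe sets mapped to the same gene
gene_a = {"111_s_at": [9.0, 9.0, 9.0], "111_at": [5.0, 6.0, 7.0], "111_x_at": [10.0, 10.0, 10.0]}
gene_b = {"222_s_at": [4.0, 5.0, 6.0], "222_x_at": [9.0, 9.0, 9.0], "223_s_at": [6.0, 7.0, 8.0]}
print(select_probe_set(gene_a))
print(select_probe_set(gene_b))
```

In the first example the plain "_at" probe set is kept despite its lower signal. In the second, two "_s_at" probe sets compete and the one with the higher median is kept.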
The five molecular subtypes related to the intrinsic breast cancer tumor classification were determined using the single sample predictor (SSP) classifier [23] associated with 'Distance Weighted Discrimination' (DWD) as data set adjustment [24]. Prior to statistical analyses, the L3MBTL4 expression level in samples was centered using its expression level in the 4 pooled normal breast samples. Over- and under-expression were defined using a two-fold threshold, i.e. |1| in the log2-transformed data.
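The centering and two-fold threshold can be sketched as follows. Two assumptions are made: the pooled normal reference is taken as the mean of the normal log2 values, and the cutoff is treated as inclusive. Neither detail is specified in the text, and all values are hypothetical.

```python
from statistics import mean
from typing import List


def center_on_normal(tumor_log2: List[float], normal_log2: List[float]) -> List[float]:
    """Subtract the pooled normal level (assumed here to be the mean) from each tumor value."""
    reference = mean(normal_log2)
    return [value - reference for value in tumor_log2]


def expression_status(centered_log2: float, threshold: float = 1.0) -> str:
    """Two-fold rule: |1| on the log2 scale (inclusive cutoff is an assumption)."""
    if centered_log2 >= threshold:
        return "over"
    if centered_log2 <= -threshold:
        return "under"
    return "unchanged"


# Hypothetical log2 expression values
normal_samples = [8.0, 8.5, 7.5, 8.0]   # four normal breast samples, mean 8.0
tumor_samples = [6.5, 7.6, 9.2]

centered = center_on_normal(tumor_samples, normal_samples)
print([expression_status(c) for c in centered])
```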

Comparative analyses of genome and expression data
To identify other potential 18p TSGs we compared the degree of CNA-driven RNA downregulation in 229 of the 307 samples by analyzing the genes common to the genome and expression platforms (aCGH Agilent Technologies and Affymetrix) and retained after filtering based on the expression level. Briefly, a potential TSG had to show a lower expression value in samples with loss than in those without (Student's t test) and an underexpression frequency overrepresented in samples with the loss (Fisher's exact test). For both tests the significance level was 5% with a false discovery rate (FDR, [25]) of less than 1%.
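The false discovery rate step can be illustrated with a short Python sketch of the Benjamini-Hochberg procedure. Using Benjamini-Hochberg is an assumption, since this section does not name the method behind reference [25]. The gene labels and p-values are hypothetical.

```python
from typing import Dict, List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def retained_genes(pvalues_by_gene: Dict[str, float], fdr_cutoff: float) -> List[str]:
    """Keep the genes whose adjusted p-value falls below the chosen FDR cutoff."""
    genes = list(pvalues_by_gene)
    adjusted = benjamini_hochberg([pvalues_by_gene[g] for g in genes])
    return [g for g, q in zip(genes, adjusted) if q < fdr_cutoff]


# Hypothetical per-gene p-values from one of the two tests
example = {"gene_A": 0.001, "gene_B": 0.008, "gene_C": 0.039, "gene_D": 0.041, "gene_E": 0.6}
print(benjamini_hochberg(list(example.values())))
print(retained_genes(example, fdr_cutoff=0.01))
```

In the procedure described in the text, a gene would need to pass this filter for both the t test and Fisher's exact test. How the two criteria were combined in practice is not detailed here.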

Statistical analyses
Correlations between L3MBTL4 CNA groups and histoclinical factors were calculated with the Fisher's exact test. Student's t test and one-way ANOVA test were used to evaluate association of L3MBTL4 gene expression level within histoclinical factors. Overall specific survival (OS) and metastasis-free survival (MFS) curves were estimated using the Kaplan-Meier method and statistical significance of pairwise comparisons was assessed using the log-rank test. OS and MFS follow-up times were measured from the date of diagnosis till death from breast cancer and till the first occurrence of distant metastases, respectively. All statistical tests were two-sided at the 5% level of significance. Statistical analysis was done using the survival package (version 2.30), in the R software (version 2.9.1).
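The Kaplan-Meier estimate can be illustrated with a minimal pure-Python sketch. The analyses in this study were run with the R survival package; the code below only shows the principle and uses hypothetical follow-up data.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier survival estimate.

    times: follow-up time for each patient.
    events: 1 if the event (e.g. death or metastasis) occurred, 0 if censored.
    Returns (time, survival probability) at each time where an event occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == current_time:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((current_time, survival))
        at_risk -= n_removed
    return curve


# Hypothetical follow-up in months; 1 = event, 0 = censored
months = [6, 12, 12, 20, 30]
status = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(months, status):
    print(t, round(s, 3))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate only steps down at times where an event was observed.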

Results
Losses and breakages of 18p target the L3MBTL4 gene at p11.31
Genome profiles of 307 primary breast tumors and 47 cancer cell lines were established by aCGH. GISTIC analysis showed that chromosome arm 18p was targeted by losses in more than 20% of tumor samples (Additional file 2, Figure S1). None of these losses were homozygous deletions. Profiles showed various sizes of losses (Figure 1A). In tumor T8700, the loss within 18p11.31 (Figure 1A) spanned the L3MBTL4 gene only (Figure 1B). 18p losses including this region were found in 77 (25%) primary tumors and 19 (40%) cell lines.
Copy number transitions are likely to reflect DNA strand breakages that may lead to nonreciprocal translocations [26]. L3MBTL4 was targeted by copy number transition in tumor T50115, and in the HCC38 (Figure 1C, D) and MDA-MB-453 cell lines. In T50115, the breakpoint was within the genome interval chr18: 6,202,799-6,212,615, which contains L3MBTL4 exons 10 and 11. The resulting protein should lack its C-terminal part starting from the second MBT motif. In HCC38 and MDA-MB-453 the breakpoints spanned genome intervals chr18: 6,009,803-6,020,061 (between exons 17 and 18) and chr18: 6,030,519-6,044,093 (including exon 16), respectively. These two breakages should generate a protein lacking the sterile alpha motif (SAM) domain. None of these potential fusion transcripts involving the L3MBTL4 gene has been identified yet.
Thus, the L3MBTL4 gene is targeted by various genomic alterations, including loss and breakage, in a high proportion of breast cancers.

Features of tumors with L3MBTL4 loss
L3MBTL1, L3MBTL2 and L3MBTL3 paralogs were not targeted by deletion in our breast tumor set. We did not find any L3MBTL4 deletion among genome profiles similarly established in 80 colon cancers, 115 myeloid hematopoietic diseases and 53 sarcomas (data not shown). This suggests that L3MBTL4 loss occurs specifically in breast cancers.
Figure 1 (caption fragment): [...] Table S4A). (C, D) Breakpoint targeting the L3MBTL4 gene in breast primary tumor T50115 and HCC38 breast cancer cell line.
The 35 other cases showed a loss of L3MBTL4 as a consequence of the complete loss of chromosome arm 18p. They included 10 luminal A, 12 luminal B, 7 basal, 4 ERBB2, 1 normal-like and 1 non-informed tumor. Interestingly, the complete loss of chromosome arm 18p was associated with the luminal B molecular subtype (p < 0.05), as well as with cancer in older women (>50 years old) (Additional file 1, Table S4B).
Comparison of clinical features between breast tumors with and without L3MBTL4 loss (Table 1) showed that L3MBTL4 loss was associated with the presence of lymph node metastases (p = 1.02 × 10^-2), high SBR grade (p = 1.15 × 10^-2) and luminal B molecular subtype defined by SSP classification [23] (p = 3 × 10^-5). No impact on survival was noted.
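Associations of this kind were assessed with Fisher's exact test. A minimal pure-Python sketch of the two-sided test is given below. Because the counts behind Table 1 are not reproduced in this section, it is applied to the basal tumor counts quoted in the Figure 4 caption (15 of 58 tumors with loss versus 52 of 162 without), for which the article reports p = 0.41. The function name is illustrative, and the two-sided rule used (summing all tables no more probable than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    The p-value sums the hypergeometric probabilities of all tables with the
    same margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


# Basal versus non-basal tumors, with and without L3MBTL4 loss
# (15/58 with loss, 52/162 without loss, from the Figure 4 caption)
p_value = fisher_exact_two_sided(15, 58 - 15, 52, 162 - 52)
print(round(p_value, 2))
```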

L3MBTL4 is targeted by mutations
We searched for mutations in L3MBTL4 exons in 180 and 47 of the aCGH-profiled tumors and cell lines, respectively. Sequence analysis of the tumor samples identified 32 variants including 25 synonymous (23 p.Ile570Ile, 2 p.Val201Val), 6 missense (3 p.Ser123Asn, 1 p.Ser493Leu, 1 p.Glu560Lys, 1 p.Ile615Ser) and 1 nonsense (p.Tyr339X). In the BT-483 cell line a missense (p.Arg96Gln) mutation was identified. Figure 2A shows two examples of mutation. The localization and nature of the mutations are shown in Figure 2B. The p.Ser123Asn substitution and the nonsense mutation were located in regions encoding the conserved MBT1 and MBT3 motifs, respectively. The nonsense mutation should generate a truncated L3MBTL4 protein without C2HC zinc finger and SAM domains. The p.Arg96Gln substitution in BT-483 should affect the MBT1 domain. The SAM motif was also targeted by the p.Glu560Lys substitution. We were able to show that the p.Ser123Asn and p.Tyr339X mutations, found respectively in T8584, T8847 and T9193 and in T8525, were acquired (Figure 2A). We could not confirm that the other missense mutations are similarly somatic. However, they were not included as SNPs or missense variants in NCBI dbSNP build 131 [27].
Counting only the 7 potentially deleterious mutations, the frequency of such events would be 3.9% in tumor samples.

L3MBTL4 mRNA is downregulated in breast tumors
L3MBTL4 gene expression was measured using qRT-PCR in normal breast tissues and 52 breast tumor samples including 16 with a loss of the L3MBTL4 gene region (T7420, T8009, T8189, T8600, T8700, T9059, T9398, T9888, T9941, T11348, T11485, T11568, T10684, T12854, T13469, T13018), one mutated and deleted (T10003), and one mutated but not deleted (T8525). L3MBTL4 gene expression was quantified by comparison with the expression of the housekeeping GUSB gene. Overall, tumor samples expressed a low level of L3MBTL4 mRNA as compared with normal breast tissue. L3MBTL4 mRNA level was decreased at least two-fold in all tumors with L3MBTL4 loss as well as in the mutated and non-deleted sample (Figure 3). Some non-deleted tumors also exhibited a low L3MBTL4 mRNA level, suggesting that L3MBTL4 downregulation could be attributed to several mechanisms including deletions but also regulation of transcription.
Deletion, mutation and expression data are described in the Additional file 1, Table S4A and summarized in Table 2. Expression microarray profile revealed five genes (TTMA, PTPN2, NDC80, SLMO1 and TUBB6) coordinately deregulated with L3MBTL4 and also localized in an 18p region frequently lost in breast tumors, suggesting a common mechanism of deregulation of their expression, which might contribute to breast oncogenesis.

Sixty-four genes are targeted by 18p copy number losses in breast cancer
To identify other 18p genes targeted by copy number losses, a GISTIC analysis was done on the genome profiles of the 307 cases. Recurrent copy number losses targeted 64 genes within the 18p11.21-11.32 region (p < 1 × 10^-5) (Additional file 2, Figure S1), including L3MBTL4 and the potential TSG EPB41L3/DAL1 [28]. This suggests that this 18p11 region probably hosts multiple TSGs.

Fourteen new 18p TSG candidates
To identify these other potential 18p11 TSGs we compared the degree of CNA-driven RNA downregulation in the 307 samples by analyzing the genes common to the genome and expression platforms (aCGH Agilent Technologies and Affymetrix) and retained after filtering based on the expression level. Sixty-three of the 64 were common. The gene expression of 14 genes (YES1, SMCHD1, LPIN2, MRCL3, MRCL2, ZFP161, RALBP1, PPP4R1, NAPG, AFG3L2, SPIRE1, CEP76, PTPN2 and SEH1L) was downregulated in relation with their copy number losses (p < 0.05) (Additional file 1, Table S6). We noted that EPB41L3/DAL1 and L3MBTL4 were not retained. The lowest expression of EPB41L3 in breast tumors was associated with the presence of lymph node metastases (p < 0.05), correlated with basal and luminal B molecular subtypes (p < 0.01) and showed clinical impact (Additional file 1, Table S7).

L3MBTL4 gene alterations
Few works have reported the presence of TSGs on 18p for breast cancer [1,2] and only EPB41L3/DAL1 [28] has been identified as potential TSG involved in this neoplasm thus far. Our aCGH results provide a detailed view of the alterations of chromosome arm 18p in breast tumors. Losses at 18p11.31 occurred in 25% of tumors and the smallest common region of deletion targeted the L3MBTL4 locus and was also the site of breakages. The karyotype of the MDA-MB-453 cell line displays a der(18)t(7;18) [29]. The L3MBTL4 breakpoint suggests involvement of this gene in this alteration. L3MBTL4 loss was associated with high grade and with lymph node metastasis. In agreement, an increased risk of relapse in patients with high risk breast cancer has been associated with 18p loss [30]. L3MBTL4 loss was associated with the luminal B subtype, which is characterized by a poor prognosis. Interestingly, L3MBTL4 is centromeric and in close proximity to the loss of heterozygosity (LOH) region spanning EPB41L3 previously reported in non-small cell lung carcinomas [1], breast carcinomas [3], and meningiomas [31].
L3MBTL4 was targeted by point mutations in a few cases. These mutations clustered in the vicinity of the MBT or SAM motifs. A study of L3MBTL1 has shown that a point mutation in the second MBT repeat motif affected the binding to H1K26me and H4K20me [32]. We surmise that the nonsense mutation in the third MBT motif of L3MBTL4 may have a similar functional implication.
In three tumors and in the BT-483 cell line L3MBTL4 mutations were associated with the loss of the other allele, suggesting a double-hit mechanism leading to complete loss-of-function of L3MBTL4 in these samples. We did not find homozygous mutations or deletions of L3MBTL4. However, we cannot exclude that in addition to these alterations, total inactivation could result from promoter hypermethylation of the other allele. The study of mRNA expression tended to confirm this.

L3MBTL4 loss of expression
L3MBTL4 expression level was decreased at least two-fold in 73% of breast cancer samples, particularly in tumors deleted at the L3MBTL4 locus, since decreased expression correlated with genomic alteration. All the deleted tumors showed absent or decreased L3MBTL4 expression. A few cases displayed mRNA downregulation in the absence of loss, suggesting that other mechanisms such as epigenetic repression play a role in the downregulation of L3MBTL4 in breast tumors. Downregulated L3MBTL4 expression was found not only in luminal B tumors but also in luminal A, ERBB2 and normal-like cases. Only the basal subtype presented a normal expression of the gene, independently of loss. This suggests that L3MBTL4 does not play a TSG role in this subtype of breast cancer. Inactivation of other TSGs may explain 18p losses in basal breast cancers.

L3MBTL4 as a tumor suppressor gene
Our observations are in agreement with a previous study that suggested the presence of TSGs on 18p with a role in the genesis of breast cancer [33]. There are several reasons to believe that L3MBTL4 is a good candidate TSG. First, it lies within the region of chromosome 18 that is frequently deleted in breast cancers; L3MBTL4 is also targeted by mutations and breakages and is downregulated in tumors. Second, the L3MBTL4 gene is a human homolog of Drosophila l(3)mbt, which functions as a TSG in the fly. The loss-of-function generated by mutation of l(3)mbt causes brain tumors in Drosophila [4,5]. Third, L3MBTL4 has three paralogs that are suspected to play a role in the etiology of certain types of cancer [7]. L3MBTL1, a known transcriptional repressor [34], has been proposed as a TSG in myeloid malignancies associated with 20q deletion [8,35,36]. A recent study reported focal hemi- and homozygous deletions of L3MBTL2 and L3MBTL3 in medulloblastoma [9].
The four human L3MBTL proteins have MBT domains involved in transcriptional repression and chromatin remodeling. The MBT domain was originally identified in the Drosophila l(3)mbt protein [5] and binds methyl-lysine residues [37,38], particularly and strongly H3K9me and H4K20me [39]. The transcriptional repressor L3MBTL1 requires its three MBT domains for compacting chromatin and silencing [32]. Although the exact biochemical properties and cellular functions of L3MBTL4 are unclear to date, the presence of MBT domains in L3MBTL4 suggests that it interacts with chromatin [40], may potentially bind methylated histones and thus play a role in transcriptional regulation of stem cell genes, oncogenes and tumor suppressors. Loss of this regulator may thus affect several breast cancer genes. Some of these genes may be part of the E2F/RB pathway. Indeed, the E2F/RB pathway is altered in luminal B cancers and L3MBTL proteins are known to regulate this pathway [7].
Taken together, our results suggest that aberrations targeting L3MBTL4 could confer to cancer cells specific advantages but do not exclude the role of other potential 18p candidates.

Other 18p TSG candidates
One of the potential 18p TSGs might be EPB41L3/DAL1 [28], which is just telomeric to L3MBTL4, between the D18S59 and D18S452 markers. However, its expression was not correlated with that of L3MBTL4. We also identified 14 TSG candidates whose gene expression was downregulated in relation to their copy number losses. We noted that EPB41L3/DAL1 and L3MBTL4 were not included in the 14, suggesting that their downregulated gene expression is not only the consequence of their copy number loss but could result from other mechanisms of deregulation including epigenetic modifications. This is in agreement with a study showing that the hypermethylation of EPB41L3/DAL1 was associated with its downregulation in lung cancer [41]. Our EPB41L3/DAL1 expression data showed that its downregulation could be associated with an increased risk of relapse in patients with high risk breast cancer.
Figure 3 Comparison of L3MBTL4 mRNA levels in deleted and non deleted breast tumors. mRNA levels were measured by quantitative RT-PCR. GUSB mRNA expression was used as internal control for mRNA normalization. L3MBTL4 mRNA expression in deleted tumors was lower than in normal breast (NB). Among the non-deleted tumors, L3MBTL4 mRNA expression was lower in luminal than in normal breast (NB) and basal tumors. This suggests that the L3MBTL4 gene expression can be affected by heterozygous deletion but also regulated by an epigenetic mechanism. Asterisk (*) indicates the mutated samples.
EPB41L3/DAL1 undergoes allelic losses in various cancers and in a significant proportion of ductal carcinomas in situ of the breast. The EPB41L3/DAL1 protein suppresses the growth of MCF7 breast cancer cells and increases attachment of these cells to a variety of extracellular matrices [28]. Modulation of post-translational methylation may be an important mechanism through which EPB41L3/DAL1 affects tumor cell growth [42]. EPB41L3/DAL1 plays a critical role in the suppression of lung tumor formation and metastasis [43]. However, the role of EPB41L3/DAL1 as a TSG has yet to be validated in vivo. EPB41L3/DAL1 deficient mice are healthy and do not develop spontaneous tumors [44]. Mutational screening failed to identify inactivating mutations of the EPB41L3/DAL1 gene [33].
Among the 14 other TSG candidates, none except PTPN2/TCPTP has so far been associated with cancer. PTPN2/TCPTP codes for the T-cell protein tyrosine phosphatase (TCPTP) and is an important negative regulator of SFK, JAK1 and STAT3 signaling during the cell cycle [45]. TCPTP suppresses the tumorigenicity of glioblastoma cells expressing a mutant epidermal growth factor receptor [46]. CEP76 encodes a centrosomal protein controlling centrosome duplication during cell division. Abnormal centrosome duplication contributes to mitotic failure, genome instability, aneuploidy, and cancer. Depletion of CEP76 drives the accumulation of centrosome intermediates in certain types of cancer cells [47].
Figure 4 (caption fragment): [...] Figure 4B). However, the correlation between L3MBTL4 mRNA downregulation and gene loss was not simply a consequence of the proportion of basal tumors in the category "tumors without loss". Indeed, a similar proportion of basal tumors [26% (15/58) and 32% (52/162)] was observed in both "tumors with loss" and "tumors without loss" groups, respectively (Fisher test, p = 0.41). Moreover, downregulated L3MBTL4 expression was correlated with L3MBTL4 gene loss in basal tumors (p < 0.05).
Only the downregulated expressions of ZFP161, PPP4R1 and YES1 were correlated with the luminal B subtype suggesting their potential involvement in the genesis of a particularly aggressive form of breast cancer with 18p loss. The downregulated expression of ZFP161 in breast tumors was correlated with a poor clinical evolution. ZFP161/ZF5 encodes a ubiquitously-expressed protein originally identified by its ability to bind and repress the murine Myc promoter [48,49]. The protein contains an N-terminal POZ domain, which recruits cofactors to modulate transcription [50]. ZFP161/ZF5 mediates both transcriptional activation and repression of cellular and viral promoters [48,50,51]. ZFP161 may compete with MYC-induced transcription [52].
PPP4R1 encodes the regulatory subunit of a ~125-kDa protein phosphatase. PPP4R1 interacts with PPP4C [53,54], which is implicated in the regulation of histone acetylation, DNA damage checkpoint signaling, NF-κB activation, and microtubule organization at centrosomes [55][56][57][58][59]. YES1 encodes a SRC-family kinase [60] and its tyrosine kinase activity has been shown to be elevated in colonic adenomas compared to its activity in adjacent normal mucosa [61]. A number of studies have linked increased expression of YES in cancer with increased cell motility and tumor invasion [62,63]. It is therefore surprising to find YES1 among the 18p TSG candidates. The downregulation of YES1 could be the consequence of its loss as a simple passenger of a larger region lost within 18p.

Conclusion
We have delineated a region of frequent loss in breast cancer on chromosome arm 18p. We have identified L3MBTL4 as the gene targeted by these losses. L3MBTL4 is also targeted by mutations and breakages. L3MBTL4 mRNA expression is low in non-basal breast tumors and in particular in tumors with loss of the gene. Alteration of L3MBTL4, coding for a regulator of epigenetic marks, is well in line with recent advances in cancer research [64,65]. We have also pointed to other 18p TSG candidates including ZFP161, CEP76, PPP4R1 and PTPN2 whose involvement might explain the aggressive phenotype of breast tumors with 18p loss.

Additional material
Additional file 1: Supplementary Tables S1-S8. Table S1. Clinical and histological features of 307 breast tumor cases. Table S2. Primers used for L3MBTL4 sequence determination. Table S3. Association of genes with copy number alterations and L3MBTL4 genomic loss.
Figure S1. The 18p is targeted by losses in breast cancer. On the top, combining the CNA frequency and gene copy number alterations level, the GISTIC algorithm plotted the score index observed in genomic profiles of 307 breast tumors as a function of chromosome 18 locations. The chromosome arm 18p was targeted by losses in more than 20% of tumor samples. The dotted line indicates the threshold of significance for the score. At the bottom, the figure shows only an 18p11.21-p11.32 region of 12.79 Mb including 64 loci as significant 18p losses (p < 10^-5). L3MBTL4 and EPB41L3 are contained within this region.