A dedicated microarray for in-depth analysis of pre-mRNA splicing events: application to the study of genes involved in the response to targeted anticancer therapies

Alternative pre-mRNA splicing (AS) widely expands proteome diversity through the combinatorial assembly of exons. The analysis of AS on a large scale, by using splice-sensitive microarrays, is a highly efficient method to detect the majority of known and predicted alternative transcripts for a given gene. The response to targeted anticancer therapies cannot easily be anticipated without prior knowledge of the expression, by the tumor, of target proteins or genes. To analyze, in depth, transcript structure and levels for genes involved in these responses, including AKT1-3, HER1-4, HIF1A, PIK3CA, PIK3R1-2, VEGFA-D and PIR, we engineered a dedicated gene chip with coverage of an average 185 probes per gene and, especially, exon-exon junction probes. As a proof of concept, we demonstrated the ability of such a chip to detect the effects of over-expressed SRSF2 RNA binding protein on the structure and abundance of mRNA products in H358 lung cancer cells conditionally over-expressing SRSF2. Major splicing changes were observed, including in HER1/EGFR pre-mRNA, which were also seen in human lung cancer samples over-expressing the SRSF2 protein. In addition, we showed that variations in HER1/EGFR pre-mRNA splicing triggered by SRSF2 overexpression in H358 cells resulted in a drop in HER1/EGFR protein level, which correlated with increased sensitivity to gefitinib, an EGFR tyrosine kinase inhibitor. We propose, therefore, that this novel tool could be especially relevant for clinical applications, with the aim to predict the response before treatment.


Background
Alternative pre-mRNA splicing (AS) occurs for an estimated 90% of genes in the human genome [1], with remarkable repercussions on proteome diversity [2]. The outcome of AS strongly depends on context. Hence, AS occurs to allow the onset of development or differentiation programs, to participate in cancer occurrence or progression, and to develop integrated responses to stressful conditions [3][4][5]. Importantly, AS transcripts may encode alternative protein isoforms, which quite often display distinct or even opposite functions, such as for the pro- or anti-apoptotic caspases or Bcl-2 family proteins [6][7][8]. In addition, AS may also lead to the assembly of short-lived mRNAs targeted for degradation through the nonsense-mediated decay (NMD) system [9]. However, even if NMD transcripts do not encode proteins, their occurrence may modify the ratio of mRNA isoforms, potentially affecting protein synthesis outcome [10].
Analytical tools to study AS on a large scale have been developed by Affymetrix™, with the Human Exon 1.0 ST arrays, also referred to as splice-sensitive microarrays, which allow surveying known and predicted AS events throughout the transcriptome [11,12]. Recently, deep sequencing methods have made it possible to determine both mRNA levels and structure [13][14][15]. Nevertheless, the mathematical tools necessary to decipher the structure and amount of mRNA species identified by sequencing are still under constant development [16,17]. In addition, a recent comparison between RNA-Seq and Affymetrix™ Exon arrays has revealed that the chip method was more powerful at detecting and quantifying exons [18]. It was also demonstrated that microarray technologies could be used as a reliable routine diagnostic tool, thanks to the development of a small custom-made microarray able to predict disease outcome in breast cancer patients [19]. Following on that path, the aim of the present study was to develop a customized microarray able to detect both known and predictable AS events for a small number of genes involved in tumor growth and in the response to targeted anticancer therapies. To take advantage of the DNA chip experimental setup, we wished to improve the methodology by increasing the number of probes and by including exon-exon junction probes, which are absent from Affymetrix™ Exon arrays, in order to detect virtually all AS events that could occur in this subset of genes.
Targeted anticancer therapies include drugs, such as inhibitors of tyrosine kinase or monoclonal antibodies (mAbs), which oppose cell growth signaling or tumor blood vessel development, promote the specific death of cancer cells, or stimulate the immune system. Among specific molecules with which targeted therapies interfere, the HER (human epidermal growth factor receptor) family regulates cell growth, survival, adhesion, migration and differentiation. Trastuzumab (Herceptin™), which was FDA-approved in 1998, was the first treatment using a humanized mAb to target the receptor tyrosine kinase encoded by the HER2 oncogene, and is mainly used to treat breast cancers over-expressing this receptor [20,21]. Cetuximab (Erbitux™) and gefitinib (Iressa™) target HER1/EGFR (epidermal growth factor receptor), or its tyrosine kinase activity, respectively, and bevacizumab (Avastin™) blunts VEGF-A (vascular endothelial growth factor A) activity upon binding to the Gly88 residue from the extracellular domain [22]. AS transcript variants have been characterized for all these targets, especially for VEGFA [23][24][25], and could account for part of the inefficacy of the responses to mAbs. The PIK3/Akt pathway is a major signaling cascade downstream of the receptor tyrosine kinases. In addition, VEGFA expression is regulated by the hypoxia factor HIF-1α. The genes analyzed on this custom microarray include AKT1-3, HER1-4, HIF1A, PIK3CA, PIK3R1-2, VEGFA-D, and PIR, which lies close to the VEGFD locus and could be fused to VEGFD upon read-through transcription. Collectively, these genes can lead to the assembly of more than 100 mRNAs with protein-coding capacity (http://www.ensembl.org). Hence, the response to targeted anticancer therapy will likely depend, at least in part, on the selection of specific combinations of protein targets derived from AS events.
In order to validate our custom DNA chip, we took advantage of the human lung adenocarcinoma H358 cell line that we previously engineered to conditionally overexpress the pre-mRNA splicing enhancer protein SRSF2, which controls the splicing of VEGFA pre-mRNA [26], but also has a role in transcriptional elongation [27]. Positive results were further validated by specific quantitative RT-PCR in both H358 cells and human non-small cell lung carcinoma (NSCLC) samples that we previously showed to over-express the SRSF2 protein [28]. The repercussion of altered splicing on the amount of the HER1/EGFR protein and the response to gefitinib were analyzed in H358 cells.

Results
Validation of the splice-inducing ability of SRSF2
Using an E1A-based plasmid minigene in transient transfection experiments, we analyzed the splice-inducing ability of SRSF2 (Additional file 1: Figure S1). There was an up-regulation of the 13S PCR band associated with a down-regulation of the 9S band, indicating that SRSF2 over-expression could modify the balance of E1A-derived transcripts, as originally described [29].

Cross validation with 44 k Agilent microarray
To analyze the gene expression changes triggered by over-expression of SRSF2 in H358 lung cancer cells, we performed an analysis using 44 k Agilent™ microarrays. These data have been deposited in NCBI's Gene Expression Omnibus and are accessible through GEO Series accession number GSE50467. Many genes were differentially expressed between SRSF2-over-expressing H358 lung cancer cells and H358 control cells (1,709 deregulated probes; ≥ 2.0 FC, P-value ≤ 0.05 by t-test with FDR; Additional file 2: Table S1), corresponding to 52% up- and 48% down-regulations. Hence, in addition to its already reported role in the regulation of VEGFA splicing, over-expression of SRSF2 led to the regulation of transcript abundance of many additional genes, including genes present on the 15 k custom chip (Additional file 3: Table S2), as demonstrated with the 44 k Agilent™ microarrays.
Validation of the labeling method: comparison of the 15 k custom and 44 k Agilent microarrays
The labeled cRNA yield and the specific activity of cyanine3 were examined for each of three labeling experiments (Additional file 4: Table S3). A comparison of the 15 k custom and 44 k commercial microarrays, with respect to Agilent™ probes present on both chips, was performed in order to validate the use of the labeling method with the 15 k custom microarray. Four replicates per condition (control or SRSF2 over-expression) were hybridized on the 15 k chip using Quick Amp labeling, and six replicates per condition on the 44 k chip. We found that 313 Agilent™ probes (corresponding to 16% of the total number of Agilent™ probes on the 15 k chip) were deregulated on the 15 k custom microarray (≥ 1.5 FC, P-value ≤ 0.05), among which 310 (99%) had the same type of (up- or down-) regulation on the 44 k commercial microarrays (Additional file 5: Table S4). The Pearson correlation between the expression signals of these 313 common probes on the two platforms gave a coefficient of 0.89. Therefore, Quick Amp labeling was considered validated for the 15 k custom microarray.
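The cross-platform comparison above rests on two simple quantities: the fraction of shared probes regulated in the same direction on both chips, and the Pearson correlation of their signals. The following is a minimal sketch of both, assuming the per-probe values are available as plain lists of log2 fold-changes; the function names and the toy values are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def direction_concordance(log_fc_a, log_fc_b):
    """Fraction of probes whose log fold-changes have the same sign."""
    same = sum(1 for a, b in zip(log_fc_a, log_fc_b) if a * b > 0)
    return same / len(log_fc_a)


# Toy log2 fold-changes for five shared probes (illustrative values only).
custom_15k = [1.2, -0.8, 0.9, -1.5, 0.7]
agilent_44k = [1.0, -1.1, 1.3, -1.2, -0.2]

concordance = direction_concordance(custom_15k, agilent_44k)
correlation = pearson(custom_15k, agilent_44k)
```

With the figures reported in the text, 310 concordant probes out of 313 correspond to a concordance of about 99%.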

Detection of the mRNA regulation
We analyzed the expression of the 16 selected genes present in the 15 k custom microarray, considering the expression of all custom probes for each gene (Table 1). Four genes (HER4, PIK3CA, PIK3R1 and VEGFD) were not expressed; five genes (AKT2, AKT3, HER2, PIK3R2 and VEGFC) were not differentially expressed; five genes (AKT1, HER3, HIF1A, PIR and VEGFB) were slightly down-regulated (≤ 1.5 FC, P-value ≤ 0.05); HER1/EGFR was more strongly down-regulated (≥ 1.5 FC, P-value ≤ 0.05), and VEGFA was up-regulated (≥ 1.5 FC, P-value ≤ 0.05) in SRSF2-over-expressing H358 lung cancer cells in comparison to H358 control cells. A good concordance between the 15 k and 44 k microarray results was found: 8 out of the 16 genes present in 15 k custom chip were deregulated on 44 k chips (≥ 1.1 FC, P-value ≤ 0.05), considering Agilent™ probes, and showed the same type of regulation on the 15 k chip, considering custom probes (Additional file 3: Table S2).

Regulation events among the expressed genes
The bioinformatics analysis of the 15 k custom microarray showed that 30 custom probe sets from expressed genes were differentially expressed in SRSF2-over-expressing H358 lung cancer cells in comparison to H358 control cells (≥ 1.5 FC, P-value ≤ 0.05; Table 2). Deregulated probe sets with low expression were not considered. The regulation events corresponded to 70% down- and 30% up-regulations, mostly affecting cassette exons, but also 5′-untranslated regions and terminal or donor splice sites, of 9 genes among the 12 expressed genes (AKT2, AKT3, HER1/EGFR, HER2, HER3, HIF1A, PIK3R2, VEGFA and VEGFB). Regulations were associated with a high, medium or low confidence, depending on the regulation of probes close to the deregulated probe sets. A list of supporting evidence (Additional file 6: Table S5) was defined, corresponding to regulations that were not always statistically significant but that confirmed the deregulation of some probe sets. Consequently, these regulations were associated with a high confidence. On the contrary, the confidence was considered low if neighboring probes were not deregulated or if their regulation was opposite. The regulations associated with a high fold-change and corresponding to unknown and predicted pre-mRNA splicing events could be of special interest.

Validation of regulation events by real-time polymerase chain reaction
Quantitative RT-PCR was used to measure the expression of 9 genes deregulated on both the 15 k custom and the 44 k commercial microarrays, and the differential expression of all genes in SRSF2-over-expressing H358 lung cancer cells in comparison to H358 control cells was analyzed with RNA isolated independently from that used for chip hybridization (Additional file 7: Table S6). These results confirmed the validity of our experimental approach used to analyze the 15 k custom microarray. Ten out of the 30 deregulated probe sets were selected according to their high confidence (Table 2), and concerned 4 genes, namely AKT3, HER1/EGFR, HIF1A and VEGFA (Figure 1). The results of quantitative RT-PCR experiments are shown in Table 3. Relative mRNA levels were normalized to control gene mRNA levels or a fold-change was calculated comparing to a reference event. For HER1/EGFR, we showed a down-regulation of one of the transcripts (last exon > e20) in SRSF2-over-expressing H358 lung cancer cells in comparison to H358 control cells. For AKT3, we validated the up-regulation of exon 7 and the down-regulation of exon 8, because the e7+/e8- transcript was over-expressed as compared to the e7+/e8+ transcript including both exons. For HIF1A, the up-regulation of two (e9+/e10- and e9-/e10-) of the three alternative transcripts compared to the e9+/e10+ transcript led us to conclude that both exons 9 and 10 were down-regulated. For VEGFA, we validated the alternative polyadenylation in intron 4 by an over-expression of the smaller transcript (last exon = e4) in comparison to the longer transcript (last exon > e5). We also confirmed the alternative donor site for exon 6 by an up-regulation of the "alternative donor e6" transcript in comparison with the "constitutive donor e6" transcript.
Table 1 legend: The expression and the regulation of the 16 genes were analyzed on the 15 k custom microarray in SRSF2-over-expressing H358 lung cancer cells in comparison to control cells. Some genes were not expressed; others were not differentially expressed. Five genes were slightly down-regulated (≤ 1.5 FC, P-value ≤ 0.05), and one gene (HER1/EGFR) was more strongly down-regulated (≥ 1.5 FC, P-value ≤ 0.05). Only one gene (VEGFA) was up-regulated in the SRSF2 over-expression condition (≥ 1.5 FC, P-value ≤ 0.05).

HER1/EGFR protein expression analysis
The 15 k custom microarray predicted multiple exon skipping in the 3′ region of HER1/EGFR in SRSF2-overexpressing H358 lung cancer cells, which was confirmed by quantitative RT-PCR. These observations led us to test whether these splicing events would have an impact on the amount of the HER1/EGFR protein. Western blotting analysis was performed using various anti-EGFR antibodies directed against the N-terminal (31G7) or the C-terminal (D38B1) portion of the protein, as well as against the phosphorylated active form of EGFR (P-HER1/EGFR-Tyr1068). The results demonstrated that SRSF2 overexpression in H358 cells led to a decrease in EGFR protein amount, as detected using all antibodies (Figure 2). These data suggested that SRSF2-regulated EGFR pre-mRNA splicing strongly affects EGFR protein expression. In addition, H358 cells express a wild-type EGFR protein and are resistant to apoptosis in response to EGFR tyrosine kinase inhibitors such as gefitinib. In order to determine if SRSF2-induced EGFR protein down-regulation could modify the response of H358 cells to gefitinib, we performed a dose-response analysis of the drug in the presence or absence of SRSF2 induction (Figure 3). As expected, a 24-hour treatment with gefitinib significantly prevented EGFR-Tyr1068 phosphorylation in these cells, but only partially engaged apoptosis at the highest concentration, as detected by poly-ADP ribose polymerase (PARP) processing. However, caspase-3 was never activated in gefitinib-treated cells. Of note, at the highest gefitinib concentration, a reduction in the amount of total EGFR together with the appearance of protein bands of smaller sizes was observed, mainly when using the 31G7 antibody. These data suggested that EGFR could be processed in response to high gefitinib doses.
Importantly, when SRSF2 was overexpressed in gefitinib-treated cells, the decrease in EGFR protein amount was more pronounced and apoptosis was strongly engaged, as evidenced by procaspase-3 and PARP cleavages (Figure 3). This result indicated that SRSF2, through its ability to control EGFR protein expression, sensitizes H358 cells to the apoptosis induced by EGFR tyrosine kinase inhibitors.

Alternative splicing events in lung cancer biopsy samples
Finally, we aimed at extending some of our in vitro data to cancer tissues. For this purpose, we took advantage of the cancer-associated over-expression of SRSF2, as it may occur in NSCLC [28]. SRSF2 and phospho-SRSF2 expression scores (0-300) were established in 10 NSCLC biopsy samples (Table 4A) by multiplying the percentage of labeled tumor cells (0 to 100%) by the staining intensity (0, null; 1, low; 2, moderate; 3, strong). Interestingly, the three NSCLC samples with the highest SRSF2 and phospho-SRSF2 scores all displayed a drop in the HER1/EGFR "last exon > e20" transcript, as determined by quantitative RT-PCR, similarly to what occurred in lung cancer cells. We also analyzed the occurrence of the AKT3, HIF1A and VEGFA splicing events in NSCLC biopsy samples (Table 4B). For several samples, we observed an over-expression of exon 7 and an under-expression of exon 8 of AKT3, and an over-expression of exon 4 and of the alternative exon 6 donor splice site for VEGFA. Although the relationships between SRSF2 status and these splicing events were less clear in these cases, possibly owing to the small number of samples, these data validated, in cancer samples, some of the pre-mRNA splicing events detected in the SRSF2-over-expressing H358 cell line. The results were inconclusive for HIF1A, possibly reflecting heterogeneity among the NSCLC samples with respect to expression of this gene.
Table 3 legend: The regulation of the 10 selected deregulated custom probe sets was analyzed by quantitative RT-PCR in SRSF2-over-expressing lung cancer cells in comparison to control cells. Relative mRNA levels were normalized to that of beta-2-microglobulin or a fold-change was calculated comparing to a reference event. The cut-off value was equal to 1.40. n/a: not available.
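The immunohistochemistry score used for the biopsy samples is the product of the percentage of labeled tumor cells and the staining intensity, giving a value between 0 and 300. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the over-expression thresholds (≥ 150 for SRSF2, > 175 for phospho-SRSF2) are taken from the Table 4 legend rather than from any published software.

```python
def expression_score(percent_labeled, intensity):
    """Score (0-300) = percentage of labeled tumor cells x staining intensity.

    intensity: 0 (null), 1 (low), 2 (moderate) or 3 (strong).
    """
    if not 0 <= percent_labeled <= 100:
        raise ValueError("percent_labeled must lie between 0 and 100")
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return percent_labeled * intensity


def overexpresses_both(srsf2_score, p_srsf2_score):
    """True when both scores exceed the thresholds given in the Table 4 legend."""
    return srsf2_score >= 150 and p_srsf2_score > 175


# Hypothetical sample: 80% of tumor cells labeled with strong staining.
example_score = expression_score(80, 3)
```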

Discussion
In this study, we designed a custom gene expression microarray amenable to the study of alternative pre-mRNA splicing (AS) events of a selection of genes involved in the response to targeted anticancer therapies. This approach was preferred to commercial microarrays, such as the Human Exon 1.0 ST arrays (Affymetrix™), because it allowed a deeper analysis of AS, in this case of a small number of genes highly relevant from a clinical standpoint. Indeed, it is clear that our custom splice-sensitive microarray could theoretically detect many more events than Affymetrix™ Exon Arrays (Table 5), considering probe length, probe number and, especially, exon-exon junction probes, which were not present on Affymetrix™ Exon Arrays. At a practical level, several high confidence events revealed, thanks to exon-exon junction probes, specific splicing events (Table 2). For example, the AKT3 je7_e8, HER1/EGFR je16_e19 or HIF1A je10_e11 junction-specific events would have been undetected on Affymetrix™ arrays. In addition, selecting only the high confidence events, the regulations observed through the chip analysis were confirmed by quantitative RT-PCR, emphasizing the robustness of both the technical and the analytical tools used in this study. Nevertheless, we anticipate that RNA-Seq methodologies will probably soon be another reliable means for characterizing AS throughout the transcriptome [30,31].
Table 4 legend: The regulation of the 10 selected deregulated custom probe sets was analysed by quantitative RT-PCR in 10 non-small cell lung carcinoma-normal sample pairs (patients numbered from 1 to 10). SRSF2 protein expression levels in biopsy samples were analysed by immunohistochemistry in a previous study. A score (0-300) was established for SRSF2 and phosphorylated SRSF2 (P-SRSF2). Patients with scores ≥ 150 and > 175 were those over-expressing SRSF2 and P-SRSF2 proteins respectively, as compared to normal lung tissues. Patients in bold characters over-expressed both proteins. n/a: not available. Relative mRNA levels were normalised to that of beta-2-microglobulin or a fold-change was calculated comparing to a reference event. The cut-off value was equal to 1.40. Patients in bold characters over-expressed both SRSF2 and phospho-SRSF2 proteins (see Table 4A).
Table 4 surviving rows (transcript regulation, values for patients 1 to 10 where complete, then observation):
HER1/EGFR relative expression, last exon = e17: n/a, n/a, n/a, n/a, 7.17, n/a, n/a, 16. 15
e7-e8+ vs. e7+e8+: n/a, 1.33, n/a, 1.35, n/a, n/a, n/a, 0.38, n/a, n/a; low expression of e7-e8+
e7-e8- vs. e7+e8+: n/a, n/a, n/a, n/a, n/a, n/a, n/a, n/a, n/a, n/a; no expression of e7-e8-
e9-e10+ vs. e9+e10+: 1.87, n/a, n/a, 1.06, n/a, n/a, n/a, n/a, 1.14, 1.
We are aware of only one study that used a designed chip to analyze the occurrence of splicing variants which, in that case, corresponded to AS events from a single gene, CIZ1, encoding a Cip1-interacting zinc finger protein [32]. This approach led to the identification of a splice variant that may be specific for pediatric cancer. There is an absolute need for predictive biomarkers of therapeutic responses, especially to targeted anticancer therapies, as many patients do not respond or acquire resistance. For instance, VEGF-A isoforms may not respond identically to anti-VEGF-A mAbs (bevacizumab). In fact, the co-occurrence of both pro-angiogenic (VEGF-Axxx) and anti-angiogenic (VEGF-Axxxb) splice isoforms might restrict the therapeutic response [33][34][35][36][37]. In addition, the occurrence of soluble EGFR isoforms, as detected in meningiomas [38], presumably unresponsive to tyrosine kinase inhibitor therapy, might also dampen the therapeutic response. Furthermore, an exon 4-lacking EGFR variant mRNA was associated with an increased metastatic potential, a molecular event that would likely have been detected with our splice-sensitive microarray [39]. Hence, in addition to providing a comprehensive picture of splicing events and potential therapy response, our chip could also help predict clinical outcome, based on the detection of pro-metastatic mRNA species. Nevertheless, beyond the concept, more predictive studies should be performed to make our splice-screening methodology an efficient option for therapy selection.
We showed that SRSF2 has an effect on transcriptional regulation and on AS of several genes analyzed in this study. Notably, SRSF2 over-expression modified HER1/EGFR and VEGFA expression in H358 lung cancer cells. Using patient-derived material, we observed that strong SRSF2 over-expression in NSCLC is associated with splicing alterations of the HER1/EGFR and VEGFA transcripts, as predicted from the results in the SRSF2-over-expressing H358 lung cancer cell line. In addition, HER1/EGFR splicing events have also been identified in lung adenocarcinomas [40], lending support to our results. The observation that the increase in SRSF2 protein level induced massive procaspase-3 cleavage when associated with gefitinib in H358 cells, which express wild-type and non-amplified EGFR protein, may be particularly relevant for patients with lung adenocarcinomas without EGFR mutations, as one of the challenges is to understand why only some of them respond to EGFR tyrosine kinase inhibitors.
The expression level of HER1 mRNA, measured through analysis of the 44 k Agilent™ chip, and the western blotting analysis of the protein, showed a good correlation in response to SRSF2 over-expression. In this specific case, use of the custom 15 k chip would not have been more predictive. Nevertheless, there is little doubt that AS, analyzed globally for all genes from the chip, will provide considerably more information on both transcript abundance and structure, which should make it possible to define a prognostic indicator of response to antibody-based therapy [41]. An important challenge will be to develop specific antibodies to detect full-length or modified proteins encoded by AS-derived transcripts. Alternatively, mass spectrometry proteomics could be used to identify and quantify such proteins [42]. The custom chip analysis could thus ideally supplement immunology- or proteomics-based approaches aimed at looking for the expression of protein targets. Our DNA gene chip could also be used to analyze the effect of other triggers, such as over-expression or silencing of other splice-modifying proteins, or treatment with drugs, especially anticancer drugs, which can profoundly affect pre-mRNA splicing [3,43].

Conclusion
Our results describe, for the first time, the design and validation of a custom splice-sensitive microarray to detect AS events occurring in genes involved in the response to targeted anticancer therapies. Such an experimental setup could help clinicians choose anticancer drugs depending on the tumor expression of gene targets with proficient mRNA structures.

Methods

Custom microarray design
A custom microarray was designed taking advantage of the 15 k Whole Human Genome microarray, available from Agilent™ (Agilent, Massy, France). Among the Agilent™ probes initially loaded on the chip, 11,881 (Additional file 8: Figure S2) were substituted by custom oligonucleotides, corresponding to known and predicted exons, introns and junctions of 16 selected genes, among which there were members of the AKT (AKT1, AKT2, AKT3), HER (HER1/EGFR, HER2, HER3, HER4), PIK3 (PIK3CA, PIK3R1, PIK3R2) and VEGF (VEGFA, VEGFB, VEGFC, VEGFD) families, but also HIF1A and PIR. On the microarray, the majority (60%) of custom probes had a length of 40 bp; some were shorter (down to 22 bp; 8%); others were longer (up to 50 bp; 26%), which was mostly the case for the exon-exon junction probes. This was especially important to ensure a good detection of alternative 5′ and 3′ splice sites, i.e. alternative exon boundaries. Each custom probe length was adjusted to 60 bp with linker addition. The other 3,863 probes on the microarray corresponded to replicates of commercial Agilent™ probes (genes or controls). As a whole, the expression of 1,967 distinct genes can be analyzed with our chip.
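To illustrate what an exon-exon junction probe is, the sketch below builds a probe that straddles the boundary between two exons and then pads it to 60 nt with a linker, as described above. It is an illustration only: the symmetric split across the junction, the placement of the linker at the 3′ end, the linker sequence and the function names are all assumptions, and the actual probe design rules used for the chip are not described in the text.

```python
def junction_probe(upstream_exon, downstream_exon, probe_len=50):
    """Return a probe spanning the junction between two exon sequences.

    Assumption: the probe is split evenly across the junction.
    """
    left = probe_len // 2
    right = probe_len - left
    if len(upstream_exon) < left or len(downstream_exon) < right:
        raise ValueError("exon sequences are too short for this probe length")
    return upstream_exon[-left:] + downstream_exon[:right]


def pad_with_linker(probe, linker, total_len=60):
    """Extend a probe to total_len by appending part of a linker sequence.

    Assumption: the linker is appended at the 3' end of the probe.
    """
    missing = total_len - len(probe)
    if missing < 0:
        raise ValueError("probe is longer than the target length")
    if missing > len(linker):
        raise ValueError("linker is too short to reach the target length")
    return probe + linker[:missing]


# Toy sequences (not real exon sequences) and a placeholder linker.
exon_a = "ACGT" * 10
exon_b = "TTGCA" * 8
probe = junction_probe(exon_a, exon_b, probe_len=50)
full_length = pad_with_linker(probe, "N" * 38, total_len=60)
```

A probe of this kind hybridizes efficiently only when the two exons are joined in the transcript, which is what makes junction probes informative about exon skipping and alternative splice sites.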

Cell culture and RNA extraction
The H358 human lung adenocarcinoma cell line was cultured as described previously [44]. The H358/Tet-On/SRSF2 inducible clone, conditionally over-expressing the SRSF2 splicing factor under the control of a Tet-responsive promoter, has been described previously [44,45]. SRSF2 over-expression was induced by a 24-hour treatment with 1 μg/mL doxycycline (Additional file 9: Figure S3). Gefitinib was added to the cells at the indicated final concentrations for 24 hours. Total RNA was isolated using the Trizol reagent (Invitrogen, Cergy-Pontoise, France), according to the manufacturer's instructions. RNA purity and integrity were determined by measuring the optical density ratio (A260/A280) and the RNA integrity number (RIN) using the RNA 6000 Nano LabChip (Agilent™) and the 2100 Bioanalyzer (Agilent™). Only RNA samples with a 28S/18S ratio > 1.0 and RIN ≥ 7.0 were used for microarray analyses.
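The RNA quality criteria stated above can be expressed as a simple filter. This is a minimal sketch with a hypothetical function name and made-up sample values; only the two thresholds come from the text.

```python
def passes_rna_qc(ratio_28s_18s, rin):
    """Keep a sample only if 28S/18S > 1.0 and RIN >= 7.0."""
    return ratio_28s_18s > 1.0 and rin >= 7.0


# Hypothetical samples: (name, 28S/18S ratio, RIN).
samples = [("control_1", 1.8, 9.2), ("induced_1", 0.9, 8.1), ("induced_2", 1.4, 6.5)]
usable = [name for name, ratio, rin in samples if passes_rna_qc(ratio, rin)]
```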

Plasmid transfection and minigene analysis
An E1A reporter minigene-containing plasmid (pXJ41-E1A) to study the effect of splice modifier proteins was used to further validate the effect of SRSF2 protein overexpression. The plasmid was transfected using Lipofectamine 2000 (Invitrogen). Cells were harvested 24 hours after transfection, and total RNA was extracted using the RNeasy Mini kit (Qiagen, Courtaboeuf, France), according to the manufacturer's instructions. The RNAs (200 ng) were further used for first-strand cDNA synthesis with the High-Capacity cDNA Reverse Transcription kit (Applied Biosystems, Courtaboeuf, France). For the detection of E1A splice variants, PCR amplification was performed using primers 5′-TTT-GGA-CCA-GCT-GAT-CGA-AG-3′ and 5′-AAG-CTT-GGG-CTG-CAG-GTC-GA-3′, and PCR products were analyzed by agarose gel electrophoresis.

Microarray hybridization
Analyses of the H358/Tet-On/SRSF2 mRNA content were performed on both the 15 k custom microarray and the 44 k Whole Human Genome microarray (Agilent™) that contains roughly 41,000 probes, providing full coverage of human transcripts. Double-stranded cDNA was synthesized from 500 ng of total RNA using the Quick Amp Labeling kit, One-color, as instructed by the manufacturer (Agilent™). Labeling with cyanine3-CTP, fragmentation of cRNA, hybridization and washing were performed according to the manufacturer's instructions. The microarrays were scanned and the data were extracted with the Agilent™ Feature Extraction Software.

Gene expression analysis
The bioinformatics analysis of the 15 k custom microarray data and the comparison of 15 k chip results with 44 k commercial chip results were performed by GenoSplice technology™. Concerning the 15 k custom microarray data analysis, data were normalized using median normalization based on Agilent™ control genes. Gene expression level was assessed using constitutive probes only (i.e., probes targeting regions that are not known to be alternative regions). For each gene of interest, all possible splicing patterns were defined and analyzed. All types of alternative events can be analyzed: alternative first exons, alternative terminal exons, cassette exons, mutually exclusive exons, alternative 5′ donor splice sites, alternative 3′ acceptor splice sites, and intron retentions. Analyses were performed using unpaired Student's t-test on the splicing-index as previously described [46,47]. Results were considered statistically significant for unadjusted P-values ≤ 0.05 and fold-changes ≥ 1.5. After bioinformatics analysis of microarray data, a manual inspection using the GenoSplice EASANA™ interface was conducted to select high-confidence events. A separate bioinformatics analysis was carried out for the 44 k microarray data. Raw gene expression data were imported into the GeneSpring GX 11.0.2 software program (Agilent™). Genes with missing values in more than 25% of the samples were excluded from the analysis. A 2-fold cut-off difference was applied to select the up- and down-regulated genes (P-value ≤ 0.05 by t-test with Benjamini-Hochberg false discovery rate).
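The two statistical steps named above can be sketched as follows. The first part assumes a common formulation of the splicing index, namely the log2 ratio of a probe intensity to the gene-level expression estimated from constitutive probes, compared between conditions with a pooled-variance Student's t statistic; the exact formulation used in [46,47] may differ. The second part is the standard Benjamini-Hochberg adjustment applied in the 44 k analysis. Function names and toy inputs are hypothetical, and the conversion of the t statistic into a P-value is left out (in practice a statistics library would be used for that step).

```python
from math import log2, sqrt
from statistics import mean, variance


def splicing_index(probe_intensities, gene_levels):
    """Per-sample log2 ratio of probe intensity to gene-level expression."""
    return [log2(p / g) for p, g in zip(probe_intensities, gene_levels)]


def splicing_fold_change(si_treated, si_control):
    """Linear fold-change between the mean splicing indices of two groups."""
    return 2 ** (mean(si_treated) - mean(si_control))


def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data: one probe measured in four control and four induced samples.
si_control = splicing_index([400, 420, 390, 410], [800, 820, 790, 805])
si_induced = splicing_index([200, 210, 190, 205], [780, 810, 800, 795])
fc = splicing_fold_change(si_induced, si_control)
t_stat = student_t(si_induced, si_control)
```

Under these assumptions, a probe would be retained when the fold-change of its splicing index is at least 1.5 in either direction and the unadjusted P-value is at most 0.05.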

Real-time polymerase chain reaction analysis
Regulation events detected in the 15 k custom and 44 k commercial microarrays were analyzed by quantitative RT-PCR using RNA isolated from cell preparations separate from those originally used for microarray hybridization. Reverse transcription was performed as instructed by the manufacturer (Applied Biosystems), as described previously, and quantitative RT-PCR was conducted using the SYBR GREEN PCR Master Mix (Applied Biosystems), according to the manufacturer's instructions, with an ABI 7300 real-time PCR system (Applied Biosystems). All determinations were performed in duplicate, normalized against beta-2-microglobulin or GAPDH as internal control genes. These reference transcripts were found to be stable when surveyed in several cell culture systems (data not shown).
The results were expressed as the relative gene expression using the ΔΔCt method [48]. The fold-change was also calculated comparing to a reference event. The sequences of the primers used for the 15 k custom microarray validation are presented in Additional file 10: Table S7.
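For readers unfamiliar with the ΔΔCt method, the calculation can be sketched as below. The sketch assumes 100% amplification efficiency (a doubling of product per cycle), which is the usual simplification of this method; the function names are hypothetical, and applying the 1.40 cut-off symmetrically to up- and down-regulation is an assumption, since the text only states the cut-off value.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression by the delta-delta-Ct method (2 ** -ddCt)."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)


def call_regulation(ratio, cutoff=1.40):
    """Classify a ratio as 'up', 'down' or 'unchanged' (symmetric cut-off assumed)."""
    if ratio >= cutoff:
        return "up"
    if ratio <= 1 / cutoff:
        return "down"
    return "unchanged"


# Hypothetical Ct values: target vs. reference gene (e.g. beta-2-microglobulin).
ratio = relative_expression(24.0, 18.0, 26.0, 18.0)
status = call_regulation(ratio)
```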

Human samples
Tissue samples were collected from resection of lung tumors, and stored for scientific research in a biological resource repository (Centre de Ressources Biologiques, CHU Albert Michallon, Grenoble Hospital). National ethical guidelines were followed. All patients enrolled provided written informed consent. Tissue banking and research conduct was approved by the Ministry of Research (approval AC-2010-1129) and by the regional IRB (CPP 5 Sud Est). Protein and RNA samples were isolated and analysed as described above.