Non random distribution of genomic features in breakpoint regions involved in chronic myeloid leukemia cases with variant t(9;22) or additional chromosomal rearrangements

Background The t(9;22)(q34;q11), generating the Philadelphia (Ph) chromosome, is found in more than 90% of patients with chronic myeloid leukemia (CML). As a result of the translocation, the 3' portion of the ABL1 oncogene is transposed from 9q34 to the 5' portion of the BCR gene on chromosome 22 to form the BCR/ABL1 fusion gene. At diagnosis, in 5-10% of CML patients the Ph chromosome is derived from variant translocations other than the standard t(9;22). Results We report a molecular cytogenetic study of 452 consecutive CML patients at diagnosis, which identified 50 cases falling into three main subgroups: i) cases with variant chromosomal rearrangements other than the classic t(9;22)(q34;q11) (9.5%); ii) cases with cryptic insertions of ABL1 into BCR, or vice versa (1.3%); iii) cases bearing additional chromosomal rearrangements concomitant to the t(9;22) (1.1%). For each cytogenetic group, the mechanism underlying the rearrangement is discussed. All breakpoints on other chromosomes involved in variant t(9;22) and in additional rearrangements have been characterized for the first time by Fluorescence In Situ Hybridization (FISH) experiments and bioinformatic analyses. This study revealed a high Alu repeat content, gene density, GC frequency, and miRNA content in the great majority of the analyzed breakpoints. Conclusions Taken together with literature data about CML with variant t(9;22), our findings identified several new cytogenetic breakpoints as hotspots for recombination, demonstrating that the involvement of chromosomes other than 9 and 22 is not a random event but could depend on specific genomic features. The presence of several genes and/or miRNAs at the identified breakpoints suggests their potential involvement in CML pathogenesis.


Background
Chronic myeloid leukemia (CML) is characterized by the constitutive expression of the 5'BCR/3' ABL1 fusion gene resulting from the t(9;22)(q34;q11); this translocation is evident in more than 90% of patients and produces the Philadelphia chromosome (Ph) [1].
In 5-10% of CML patients, the 5'BCR/3' ABL1 fusion gene arises from complex variant rearrangements which may involve one or more chromosomes in addition to 9 and 22 [2,3]. In some variant t(9;22) cases, additional material is transferred onto the Ph chromosome, resulting in a "masked" Ph whereas other CML patients show a classic Ph and an atypical der(9) chromosome as a consequence of a rearrangement between the der(9)t(9;22) and another chromosome [4,5]. Serial translocations or a single simultaneous event are alternative hypotheses proposed to justify the occurrence of these complex rearrangements [6].
In a subset of CML patients, cryptic rearrangements have been postulated to induce the chimeric gene formation, such as a nonreciprocal insertion between chromosomes 9 and 22 or two sequential translocations restoring the morphology of the partner chromosomes [7][8][9][10][11].
Microdeletions on the der(9) chromosome next to the t(9;22) breakpoint have been described in patients with classic and variant Ph translocations, and appear to be a valuable prognostic factor [12][13][14][15][16][17]. Recently, the frequency of such deletions has been investigated in the subgroup of CML patients with a masked Ph chromosome [18]. Additional genomic deletions on the third derivative chromosome have also been described in CML cases with variant translocations [19,20].
To our knowledge, accurate breakpoint identification and bioinformatic analysis of other chromosomes involved in variant t(9;22) or in chromosomal rearrangements concomitant to the t(9;22) have never been performed in CML.
In this paper, a detailed molecular cytogenetic characterization of 50 (11.1%) out of 452 chronic phase (CP) CML patients was carried out to define the precise breakpoints on chromosomes other than 9 and 22. Bioinformatic analysis of breakpoint regions was performed to investigate the presence of repeated elements (Alu, LINE), GC content, Segmental Duplications (SDs), miRNAs, and known genes. Our findings, taken together with a review of literature data, allowed us to identify new cytogenetic hotspots in CML cases with variant t(9;22).

Patients
The study included 452 CP-CML patients. All of them were newly diagnosed at our hospital between 1990 and 2009. As a consequence of the long time span for sample accrual, several therapeutic regimens (hydroxyurea, interferon-α, imatinib, nilotinib, and dasatinib) were adopted.

Conventional cytogenetics
Conventional cytogenetic analysis of a 24-48 hour culture was performed at diagnosis of CML on bone marrow cells by standard techniques and evaluated by Giemsa-Trypsin-Giemsa (GTG) banding at about the 400-band level according to the ISCN [24]. At least 25 metaphases were analyzed for each case.

Identification of cytogenetic hotspots
To identify new cytogenetic hotspots, an estimate of the Haploid Autosomal Length (HAL) of the bands involved in variant t(9;22) cases was performed [25,26]. We calculated the number of breaks expected (E) in any band, given the null hypothesis of a random distribution of all breaks across the genome. By reviewing large series of CML patients with variant t(9;22), we assessed the number of breaks observed (O) in each band and divided this value by the expected (E) value to determine an O/E ratio. Bands with an O/E ratio >1 were considered cytogenetic hotspots.
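The O/E calculation described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes that, under the random-distribution null hypothesis, the expected number of breaks in a band is the total number of reviewed breaks multiplied by the fraction of the HAL occupied by that band. The function names and the numeric values are hypothetical.

```python
def expected_breaks(total_breaks: int, band_hal_fraction: float) -> float:
    """Breaks expected in a band if breaks were spread in proportion to band length.

    Assumption: band_hal_fraction is the band's share of the Haploid
    Autosomal Length, expressed as a fraction between 0 and 1.
    """
    if not 0 < band_hal_fraction <= 1:
        raise ValueError("band_hal_fraction must be in (0, 1]")
    return total_breaks * band_hal_fraction


def oe_ratio(observed: int, total_breaks: int, band_hal_fraction: float) -> float:
    """Observed/expected ratio for one band."""
    return observed / expected_breaks(total_breaks, band_hal_fraction)


# Hypothetical values for illustration only (not data from this study):
# a band covering 1% of the HAL, 3 observed breaks, 100 breaks reviewed overall.
ratio = oe_ratio(observed=3, total_breaks=100, band_hal_fraction=0.01)
print(f"O/E = {ratio:.2f}; hotspot: {ratio > 1}")
```

With these made-up inputs the band would be flagged as a hotspot, because three breaks were observed where about one would be expected by chance.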

FISH analysis
FISH analysis was performed on bone marrow samples of all CP-CML patients at diagnosis using "home-brew" FISH probes specific for the ABL1 and BCR genes, validated in previous papers [13,16,27]. Breakpoint characterization and definition of deletion size were carried out with additional bacterial artificial chromosome (BAC) and phage P1-derived artificial chromosome (PAC) probes. All clones were selected according to the University of California Santa Cruz database (UCSC, http://genome.ucsc.edu/index.html; March 2006 release) [28]; the mapping of each clone was first tested on normal human metaphases. Chromosome preparations were hybridized in situ with probes labeled with biotin by nick translation [29]. Briefly, 500 ng of labeled probe were used for FISH experiments; hybridization was performed at 37°C in 2× standard saline citrate (SSC), 50% (vol/vol) formamide, 10% (wt/vol) dextran sulphate, 5 μg COT1 DNA (Bethesda Research Laboratories, Gaithersburg, MD), and 3 μg sonicated salmon sperm DNA in a volume of 10 μL. Post-hybridization washing was done at 60°C in 0.1× SSC. Biotin-labeled DNA was detected with Cy3-conjugated avidin. In cohybridization experiments, other probes were directly labeled with fluorescein. Chromosomes were identified by 4',6-diamidino-2-phenylindole (DAPI) staining. Digital images were obtained using a Leica DMRXA epifluorescence microscope equipped with a cooled CCD camera (Princeton Instruments, Boston, MA). Cy3 (red; New England Nuclear, NJ), fluorescein (green; NEN Life Science Products, Boston, MA), and DAPI (blue) fluorescence signals, which were detected using specific filters, were recorded separately as gray-scale images. Pseudocoloring and merging of images were performed with Adobe Photoshop software.

Bioinformatic analysis
Breakpoint regions on other chromosomes involved in variant t(9;22) and additional rearrangements were included in 250 Kb intervals, according to the resolution limit of the BAC clones used for breakpoint definition. Each interval was checked for interspersed repeat classes (Alu and LINE repeats), SDs, GC content, and gene density. The UCSC Table Browser [28] was queried for summary statistics on the items belonging to the tracks "RepeatMasker", "Segmental Dups", "GC Percent", and "RefSeq Genes". For SDs and RefSeq gene analysis, both "Item count" and "Item Bases" values were considered, to assess their number and the percentage of bases involved in SDs or coding sequences, respectively. For each genomic feature, the obtained value was normalized to the mean value for the examined chromosome. For example, in case 1, the breakpoint mapped in 1q32.1 (chr1:203,949,120-204,199,120) showed an Alu frequency of 13.47%. As the mean Alu content of chromosome 1 is estimated to be 11.9%, the normalized value is 1.13 (i.e. 13.47/11.90). Therefore, values greater or less than 1 correspond to regions with a richer or poorer content, respectively, of a specific genomic feature than that observed along the entire chromosome.
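The normalization step can be illustrated with a short sketch that reproduces the worked example for case 1 given in the text. The function name is hypothetical and the code is only an illustration of the arithmetic, not the tool used in the study.

```python
def normalized_content(region_value: float, chromosome_mean: float) -> float:
    """Ratio of a feature's value in a breakpoint interval to the chromosome mean."""
    if chromosome_mean <= 0:
        raise ValueError("chromosome_mean must be positive")
    return region_value / chromosome_mean


# Case 1 from the text: Alu frequency of 13.47% in the 1q32.1 interval,
# against a mean Alu content of 11.90% for chromosome 1.
alu_case1 = normalized_content(13.47, 11.90)
print(round(alu_case1, 2))  # 1.13, i.e. richer in Alu than the chromosome average
```

A value above 1 marks an interval enriched for the feature relative to its chromosome, and a value below 1 marks a depleted interval.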
In view of the known low miRNA density in the human genome, regions spanning 2 Mb proximally and distally to the breakpoints were investigated by querying the UCSC database at the track "sno/miRNA". For each chromosome, the expected number of miRNAs within a 4 Mb interval was established according to the following formula: (number of miRNAs along the entire chromosome / size of the chromosome in bp) × 4,000,000 bp. The predicted miRNA target genes were identified by querying the miRGen database (http://www.diana.pcbi.upenn.edu/cgi-bin/miRGen/v3/Targets.cgi). Intersection data from three widely used target prediction programs (miRanda, PicTar, TargetScan) were considered. The function of target genes as oncogenes or tumor suppressor genes (TSGs) was defined according to the National Center for Biotechnology Information database (NCBI, http://www.ncbi.nlm.nih.gov/gene/).
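The expected miRNA count per 4 Mb window follows directly from the formula above. The sketch below is illustrative only: the function name is hypothetical and the chromosome figures are invented for the example, not taken from the UCSC database.

```python
WINDOW_BP = 4_000_000  # 2 Mb proximal plus 2 Mb distal to the breakpoint


def expected_mirnas(n_mirnas_on_chromosome: int,
                    chromosome_size_bp: int,
                    window_bp: int = WINDOW_BP) -> float:
    """Expected number of miRNAs in a window, assuming a uniform density
    along the chromosome (equivalent to count / size * window)."""
    if chromosome_size_bp <= 0:
        raise ValueError("chromosome_size_bp must be positive")
    return n_mirnas_on_chromosome * window_bp / chromosome_size_bp


# Hypothetical chromosome: 50 annotated miRNAs over 100 Mb.
expected = expected_mirnas(50, 100_000_000)
observed = 5  # hypothetical count found in the 4 Mb interval
print(f"expected = {expected:.2f}, observed = {observed}, "
      f"enriched: {observed > expected}")
```

An observed count above the expected value would place the breakpoint region among those with a higher miRNA density than the chromosome average.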

FISH data
Cytogenetic analysis and FISH experiments with specific BAC/PAC probes for the ABL1 and BCR genes allowed us to detect 50 (11.1%) out of 452 cases that fall into 3 main subgroups of CML patients, showing variant t(9;22) rearrangements, cryptic insertions of ABL1 into the BCR region (or vice versa), and additional chromosomal abnormalities, respectively (Table 1).

Variant t(9;22) rearrangements
Forty-three (9.5%) out of 452 CML patients showed the involvement of one (90.7%) or more (9.3%) chromosomes in addition to 9 and 22. These complex variant translocations generated a classic Ph together with a masked der(9) in 36 out of 43 cases (83.7%) and a masked Ph in association with a classic der(9) chromosome in 7 patients (16.3%) (Table 1). Cases with a masked der(9) showed the presence of additional material belonging to partner chromosomes other than chromosome 22 (Fig. 1A, B). Several chromosomes were involved in these variant translocations, with a prevalence of chromosomes 4, 6, 12, and 17 (Table 1). The 5'BCR/3' ABL1 fusion gene was localized on the Ph chromosome in all these cases, whereas the 5' ABL1/3'BCR gene was retained on the der(9) in only 4 (11.1%) out of 36 patients (Table 1). In the remaining 32 (88.9%) cases, the 5' ABL1/3'BCR gene was not detected on der(9) due to deletions and/or 3'BCR transfer onto partner chromosomes (Table 1; Fig. 1A, B). Molecular cytogenetic characterization performed to verify the presence of microdeletions at the rearrangement breakpoints revealed sequence loss in 18 out of the 43 (42%) cases. Among these 18 patients, 10 (55.6%) showed microdeletions of sequences belonging to the third partner chromosome, revealing a high incidence of this kind of deletion in cases with variant t(9;22) rearrangements (Table 1).
Among 7 CML cases with a "masked Ph" chromosome, 3 showed the 5'BCR/3' ABL1 fusion signal on 22q11, the second breakpoint on the derivative chromosome 22 mapping inside chromosome 9 sequences distal to the ABL1 gene (Table 1; Fig. 1C). In 3 cases the fusion gene was detected on the third partner chromosome, the second chromosome 22 breakpoint being localized centromerically to the BCR gene (Table 1; Fig. 1D). In patient #42 the insertion of 5' BCR into the ABL1 gene caused the 5'BCR/3' ABL1 localization on der(9). The 5' ABL1/3'BCR gene was detected on the der(9) in 3 cases with masked Ph, was deleted in case #39 whereas in the remaining patients the 5' ABL1 gene was retained on the der(9) and the 3'BCR gene was transferred onto other derivative chromosomes (Table 1). Chromosome 9 sequences loss next to the rearrangement breakpoint was observed in case #43 and an unusual loss of a region of about 400 Kb localized telomerically to the ABL1 gene was detected in case #41 [22] (Table 1).

Cryptic insertions
Six (1.3%) out of the 452 CML cases showed cryptic insertions of ABL1 into BCR, or vice versa, as the cause of the 5'BCR/3' ABL1 fusion gene generation (Fig. 1E, F). Four (66.7%) of these cases are indicated as "Ph negative" (Ph-), with chromosome 22 appearing normal without the presence of additional genomic material (Table 1). Two (33.3%) out of these 6 cases were also included in the previous group as they showed variant rearrangements generating a masked Ph (Table 1). The 5'BCR/3' ABL1 gene was detected on the der(9) or on the der(22) at a ratio of 1:1 as a consequence of 5' BCR insertion in 9q34 or 3' ABL1 insertion in 22q11, respectively (Table 1).

Bioinformatic analysis of breakpoints on other chromosomes involved in variant t(9;22) or in concomitant chromosomal rearrangements
FISH experiments with BAC clones specific for other chromosomes involved in variant or additional chromosomal rearrangements revealed a total of 58 breakpoints. These breakpoints were mapped within a single BAC clone or in the region between two overlapping or adjacent clones (Table 2). In cases with sequence loss, two different breakpoints were defined at the boundaries of the deleted regions. Interestingly, the majority of breakpoints on chromosomes involved in variant or additional chromosomal rearrangements showed a high frequency of Alu repeats (Table 2; Fig. 2A). In fact, 41 out of 58 (71%) breakpoints showed a normalized Alu content of more than one, whereas the remaining 17 out of 58 (29%) had a content of less than one. By contrast, the normalized LINE content was lower than one in 44 out of 58 (76%) breakpoints (Table 2). Thirty-five out of 41 breakpoints (85%) with Alu >1 showed a LINE content <1 (Table 2).
Most of the analyzed breakpoints map within gene-rich regions, as a RefSeq Genes Item count of more than one was observed in 45 out of 58 (78%) breakpoints (Table 2; Fig. 2B). Moreover, 40 out of 58 (69%) breakpoints showed a RefSeq Genes Item bases value of more than one (Table 2). It is worthy of note that 34 out of 41 breakpoints (83%) with Alu >1 showed a RefSeq Genes Item count >1 (Table 2). The number of known genes localized at breakpoints and their function as oncogenes and/or TSGs are reported in Table 3.
In the search for SDs, 49 out of 58 (84%) and 51 out of 58 (88%) breakpoints revealed SDs Item count and SDs Item bases of less than one, respectively (Table 2). In cases showing the presence of SDs within breakpoint regions, no specific association with chromosomes 9 and 22 was detected, as the duplicated elements recognized several chromosomal regions.
The search for miRNAs revealed a density different from the expected value in 32 out of 58 (55%) breakpoint regions (Fig. 2D). In detail, a higher number of miRNAs than expected was identified in 30 out of 32 (94%) breakpoints and a lower number in 2 out of 32 (6%) (Fig. 2D). In the remaining 26 out of 58 (45%) breakpoints, no miRNA was found in the 4 Mb analyzed intervals. It is noteworthy that in case #49, with an additional t(14;15)(q32;q24), a miRNA cluster of 54 elements was revealed in the 14q32 breakpoint region. In this patient a microdeletion of about 450 Kb was detected on 14q32, resulting in the loss of almost the entire miRNA cluster. The list of miRNAs found at the breakpoints is reported in Table 4; in addition to the 14q32 miRNA cluster, a total of 63 known miRNAs was identified, 8 (13%) of which show involvement in hematological malignancies. Moreover, querying the miRGen database (the intersection data from the miRanda, PicTar, and TargetScan programs) allowed the identification of the predicted target genes for 19 out of 63 (30%) analyzed miRNAs (see Additional File 1). Among the identified target genes, several play a role as oncogenes or TSGs (see Additional File 1).
Notably, some miRNAs share the same target oncogenes or TSGs; for example, the PPM1D (protein phosphatase, Mg2+/Mn2+ dependent, 1D) and AKT3 (v-akt murine thymoma viral oncogene homolog 3) genes are the most frequent miRNA targets (see Additional File 1).
Table 2. The 250 Kb intervals covering the molecular breakpoints were analyzed for the presence of Alu, LINE, RefSeq Genes, SDs, and GC. The reported values were normalized to the mean value for each chromosome. In cases characterized by sequence deletions or the involvement of several chromosomes, more than one molecular breakpoint was identified.
Table 3. The number of genes with known function mapping in the breakpoint regions or located inside the deleted regions is reported for each case. Known oncogenes and TSGs have been identified according to the NCBI.

Identification of cytogenetic hotspots
Our study revealed 46 cytogenetic breakpoints on other chromosomes involved in variant t(9;22) rearrangements (see Additional File 2). The assessment of the O/E ratio for each breakpoint allowed us to identify 24 hotspots, 12 of which have been previously described in literature [26] (see Additional File 2). Notably, 4 out of 12 new hotspots showed a ratio >2 involving the chromosomal bands 4q12, 9p11, 11q21 and 21q22 (see Additional File 2).
To investigate the breakpoint distribution in the genome, a review of literature data about variant t(9;22) following the study by Fisher et al. was carried out [4,30-33]. In total, 60 new hotspots were identified, 18 of which have already been reported [26]. However, 10 previously published hotspots were not supported by our literature review (see Additional File 2). Among the 60 new hotspots, 27 showed an O/E ratio >2.

Treatment response
Data on the response to treatment in the analyzed CML patients were available for only about 50% of the cases; a summary is shown in Table 5. All the cases evaluable for the response to interferon-α therapy were non-responders, whereas 11 out of 17 (65%) cases treated with imatinib achieved a cytogenetic response. Among patients resistant to imatinib, 3 (75%) treated with dasatinib achieved a complete cytogenetic response (CCyR).

Discussion
Literature data indicate that breakpoints on additional chromosomes involved in CML cases with variant t(9;22) are not distributed randomly in the genome but show hotspots [26]. Several genomic features such as the density of CpG islands, genes, Alu repeats, recombination events, openness of the chromatin structure and tran-    [26,33,34].
In this study, we have performed for the first time a precise molecular cytogenetic characterization of breakpoints involved in variant t(9;22) or in additional rearrangements in 50 CML cases. To identify genomic elements with a role in the occurrence of chromosomal translocations, bioinformatic analysis was carried out to investigate the distribution and density of several genomic features, such as Alu, LINE, GC, SDs, miRNAs, and genes at breakpoint regions. To date, according to the miRBase database (http://www.mirbase.org) [35], the total number of known miRNAs is very low (about 720) compared to the human genome size (3.1 × 10^9 bp). In this study, the miRNA density within the 4 Mb analyzed intervals differed from the expected value in 32 out of 58 (55%) breakpoint regions and was higher than expected in 30 of them. These findings suggest a potential role for miRNAs in the pathogenesis of CML cases with variant or additional chromosomal rearrangements. A few of the miRNAs located at breakpoint regions have previously been described in several hematological malignancies [36][37][38][39][40][41][42][43][44][45][46][47][48]. However, none of them has been implicated in CML. It is worth noting the presence of the miRNA cluster next to the breakpoint region in 14q32 (case #49). miRNAs in this region are organized in an imprinted domain regulated by a differentially methylated region located upstream of the miRNA cluster. It has been reported that these miRNAs act as tumor suppressor genes and that changes in their methylation status could promote tumor development [49]. Querying the miRGen and NCBI databases showed the involvement of interesting target oncogenes or TSGs implicated in a wide variety of biological processes including cell proliferation, differentiation, apoptosis, and tumorigenesis.
Increasing evidence shows a high density of interspersed repetitive elements, such as Alu and LINE, at some chromosomal translocation breakpoints, suggesting that these elements mediate some recurrent rearrangements in tumors [50]. Because a much higher density of Alu repeats has been observed in the DNA sequences flanking the ABL1 and BCR genes, it has been hypothesized that Alu elements provide hotspots for non-allelic homologous recombination and mediate chromosomal translocation in CML [34,50]. Our data, supported by bioinformatic evidence, suggest that a high density of Alu repeats could also increase the propensity of other chromosomes involved in variant t(9;22) to undergo rearrangements. In our CML series, a high Alu density was detected in 71% of the analyzed breakpoints. Moreover, a rich content of Alu repeats was also revealed in breakpoint regions identified in chromosomal rearrangements concomitant to the t(9;22).
Literature data revealed a preferential breakpoint distribution in CML cases with variant t(9;22) within the GC-richest regions of the genome, corresponding to the G-light bands of the karyotype [26,33]. Our data confirmed this association, as 83% of the identified cytogenetic breakpoints mapped inside G-light bands. Moreover, we report the first bioinformatic evidence of the association between GC content and breaks in cases with variant t(9;22), as 73% of the molecular breakpoints showed a normalized GC content >1. In addition, these data showed that GC richness was related to other genomic features, such as Alu content and a gene density greater than the mean expected value.
The search for SDs revealed a low density in the majority of the analyzed breakpoints, without showing any specific association with chromosomes 9 and 22 regions, unlike what has been reported about the occurrence of the t(9;22) in CML [51].
Moreover, our study provided an outline of the frequency and molecular features of the most relevant cytogenetic groups identified in a very large series of CML patients at diagnosis. Three-way translocations were the most frequent among variant t(9;22) rearrangements, with chromosomes 4, 6, 12, and 17 being common partners. However, no clustering of cytogenetic breakpoints was revealed when the same partner chromosome was rearranged, except for the 3p21 band, which was involved in 3 CML cases with variant t(9;22).
As to the mechanisms involved in the formation of the variant t(9;22) rearrangements, our data indicated that the most probable mechanism, identified in cases with a "masked der(9)" chromosome, is a single event consisting of multiple simultaneous breaks and rejoins (one-step model). In fact, splitting of the 5' ABL1/3'BCR fusion gene signal was observed in the majority (27 out of 36, 75%) of analyzed cases. A two-step mechanism was hypothesized in about 11% of cases bearing a "masked der(9)" chromosome; the retention of the 5' ABL1/3'BCR gene on the der(9) suggests that a second break occurred inside the chromosome 22 sequence telomeric to the BCR gene. Conversely, in 71.4% of cases (#37-#41) with a "masked Ph" chromosome, a second break located proximally to BCR or distally to ABL1 was identified, suggesting the occurrence of a two-step mechanism in the majority of the CML patients included in this group.
In our study, FISH 'walking' with BAC/PAC contigs belonging to the chromosomes 9 and 22 next to the t(9;22) breakpoint regions allowed us to assess the frequency of deletions in three main cytogenetic subgroups of CML patients and the size of these microdeletions. Confirming the deletion frequency reported in literature [12], 12 out of the 36 (33%) cases with a "masked der(9)" chromosome showed chromosome 9 and/or 22 sequences loss. Moreover, in about 55% of these patients we found extensive genomic deletions on the third chromosome, in addition to deletions on der(9). Chromosome