Mechanism of ERBB2 gene overexpression by the formation of super-enhancer with genomic structural abnormalities in lung adenocarcinoma without clinically actionable genetic alterations

Kaneko, Syuzo; Takasawa, Ken; Asada, Ken; Shiraishi, Kouya; Ikawa, Noriko; Machino, Hidenori; Shinkai, Norio; Matsuda, Maiko; Masuda, Mari; Adachi, Shungo; Takahashi, Satoshi; Kobayashi, Kazuma; Kouno, Nobuji; Bolatkan, Amina; Komatsu, Masaaki; Yamada, Masayoshi; Miyake, Mototaka; Watanabe, Hirokazu; Tateishi, Akiko; Mizuno, Takaaki; Okubo, Yu; Mukai, Masami; Yoshida, Tatsuya; Yoshida, Yukihiro; Horinouchi, Hidehito; Watanabe, Shun-Ichi; Ohe, Yuichiro; Yatabe, Yasushi; Saloura, Vassiliki; Kohno, Takashi; Hamamoto, Ryuji

doi:10.1186/s12943-024-02035-6

Research
Open access
Published: 11 June 2024

Molecular Cancer volume 23, Article number: 126 (2024)

Abstract

Background

In an extensive genomic analysis of lung adenocarcinomas (LUADs), driver mutations have been recognized as potential targets for molecular therapy. However, there remain cases where target genes are not identified. Super-enhancers and structural variants are frequently identified in several hundred loci per case. Despite this, most cancer research has approached the analysis of these data sets separately, without merging and comparing the data, and there are no examples of integrated analysis in LUAD.

Methods

We performed an integrated analysis of super-enhancers and structural variants in a cohort of 174 LUAD cases that lacked clinically actionable genetic alterations. To achieve this, we conducted both WGS and H3K27Ac ChIP-seq analyses using samples with driver gene mutations and those without, allowing for a comprehensive investigation of the potential roles of super-enhancers in LUAD cases.

Results

We demonstrate that most genes situated in these overlapped regions were associated with known and previously unknown driver genes and aberrant expression resulting from the formation of super-enhancers accompanied by genomic structural abnormalities. Hi-C and long-read sequencing data further corroborated this insight. When we employed CRISPR-Cas9 to induce structural abnormalities that mimicked cases with outlier ERBB2 gene expression, we observed an elevation in ERBB2 expression. These abnormalities are associated with a higher risk of recurrence after surgery, irrespective of the presence or absence of driver mutations.

Conclusions

Our findings suggest that aberrant gene expression linked to structural polymorphisms can significantly impact personalized cancer treatment by facilitating the identification of driver mutations and prognostic factors, contributing to a more comprehensive understanding of LUAD pathogenesis.

Introduction

Lung adenocarcinoma (LUAD) is a major subtype of non-small cell lung cancer (NSCLC), with ALK, EGFR, and KRAS gene mutations being the most common driver gene mutations [1]. These mutations are critical for selecting targeted therapies and determining treatment strategies, with specific molecularly targeted therapies available for patients carrying these gene mutations [1]. Driver gene mutations are detected in approximately 50–70% of patients, though the exact percentage varies by study and patient population [2,3,4]. Nevertheless, a considerable number of LUAD patients lack these specific somatic mutations, presenting challenges in both diagnosis and treatment planning.

Advancements in whole-genome sequencing (WGS) technology have made it possible to investigate novel lung cancer-related mutations and complex structural variants. Structural variants have emerged as key events in causing copy number alterations (CNAs), generating gene fusions, and dysregulating gene expression through super-enhancer hijacking and the disruption of 3D genomic structure [5]. However, determining structural variant events related to super-enhancer formation using WGS alone remains challenging [6,7,8]. Furthermore, it is unclear whether these events can serve as druggable targets in the same way as driver mutations [9].

Super-enhancers span extensive genomic regions, with median sizes considerably larger than those of typical enhancers. From a molecular biology perspective, super-enhancer regions have been found to be enriched in numerous factors related to enhancer activity, including RNA polymerase II (RNA Pol II), RNA from transcribed enhancer loci (eRNA), histone acetyltransferases p300 and CBP, chromatin factors such as cohesin, and histone modifications such as histone H3 lysine 27 acetylation (H3K27Ac), H3 lysine 4 di-methylation (H3K4me2), and H3 lysine 4 mono-methylation (H3K4me1). Additionally, increased chromatin accessibility has been identified within these regions. Abnormalities in the function of super-enhancers have been reported to be associated with cancer, type 1 diabetes, and Alzheimer’s disease [10, 11]. Particularly in cancer, super-enhancers may play a crucial role in the dysregulation of gene expression. For instance, during tumorigenesis, malignant cells acquire super-enhancers at key oncogenes, and higher levels of transcription of these genes have been reported compared to normal cells [12, 13]. However, it remains unclear whether these phenomena are genuinely attributable to epigenetic abnormalities or result from genomic alterations [10, 14]. Recent studies have shed light on the role of extrachromosomal DNA (ecDNA) in connection with structural variants. Rather than existing merely as isolated circular DNAs, these ecDNAs form substantial clusters that potentially catalyze the emergence of super-enhancers [15].

ERBB2, also known as HER2 (human epidermal growth factor receptor 2), is a receptor tyrosine kinase that belongs to the epidermal growth factor receptor (EGFR) family [16, 17]. It plays a crucial role in cell growth, differentiation, and survival. Overexpression or amplification of ERBB2 has been reported in various cancers, including breast cancer and NSCLC, and is associated with aggressive disease and poor prognosis [18]. As a druggable target, ERBB2 has been the focus of several targeted therapies. In breast cancer, the monoclonal antibody trastuzumab has been successfully used to treat patients with HER2-positive tumors [19]. Other HER2-targeted therapies include pertuzumab (another monoclonal antibody), ado-trastuzumab emtansine (an antibody–drug conjugate), and small molecule tyrosine kinase inhibitors such as lapatinib and neratinib [20,21,22,23]. In the context of NSCLC, ERBB2-targeted therapies have shown promise in clinical trials, particularly for patients with ERBB2 mutations or amplifications [24].

In this study, we aimed to identify genomic alterations accompanied by the formation of super-enhancers. To achieve this, we conducted both WGS and H3K27Ac chromatin immunoprecipitation sequencing (ChIP-seq) analyses using cases with driver gene mutations and those without, allowing for a comprehensive investigation of the potential roles of super-enhancers in the context of these genetic alterations. Specifically, the super-enhancer formation surrounding the ERBB2 gene locus is associated with exceptionally high gene expression and involves structural variant events, as revealed by Hi-C and long-read sequencing. We provide evidence that ERBB2 gene expression increased when one of the structural variant events, specifically an inversion, brought the ERBB2 genomic region near the HNF1β gene locus. Finally, 23 genes displaying significantly aberrant expression patterns were identified as potential indicators of driver mutations in LUAD. These genes were associated with decreased recurrence-free survival in patients, suggesting their clinical relevance as prognostic factors for postoperative outcomes.

Materials and methods

Ethical considerations and clinical materials

All methods used in this study adhered to the ethical guidelines for medical and health research involving human subjects. Informed consent was obtained from all participating patients. The institutional review board of the National Cancer Center (NCC) approved the study (2005–109, 2016–496, 2019–018), which was conducted in accordance with the Declaration of Helsinki.

Patient samples and clinical records were collected based on the Public/Private R&D Investment Strategic Expansion PrograM (PRISM), an in-house lung cancer database of the NCC Japan, containing clinical information (n = 1,714), whole-exome sequencing (WES, n = 1,599) and RNA sequencing (RNA-seq, n = 1,682). In addition, DNA methylation data (n = 402) and H3K27Ac ChIP-seq data (n = 222) were collected as of April 15, 2023.

Tumor samples were collected from individuals who underwent either surgery or medical treatment at the NCC hospital in Tokyo, Japan, between 1997 and 2019. Data for the analysis was retrospectively gathered from electronic medical records. Tumor diagnoses were made through cytological and/or histological evaluations, following the World Health Organization classification guidelines. Freshly frozen tissue samples from surgical specimens were obtained from the NCC Biobank.

WGS

We used the AllPrep DNA/RNA mini kit to extract genomic DNA from fresh frozen samples. Sequencing was performed on the Illumina HiSeq 2500 or Illumina NovaSeq 6000 platforms. To identify somatic mutations in tumor samples, we sequenced the tumor tissues at a coverage of 100X and the peripheral blood lymphocytes from the same cases at a coverage of 30X. The raw sequencing data were then processed using NVIDIA Clara Parabricks, a GPU-based framework for genomic sequence analysis. Structural variants were called with Manta. To consolidate and screen the detected variants, we applied SURVIVOR, a tool that helps eliminate potential false positives and improves the precision and reliability of the resulting structural variant dataset. More comprehensive methods are available in supplementary methods.

Identification of LUAD without clinically actionable genetic alterations (CAGAs)

To investigate the underlying mechanism of non-CAGAs LUAD-specific cancer pathogenesis, we filtered out cases carrying driver mutations in specific genes. These mutations were annotated as pathogenic or likely pathogenic in the ClinVar database, or as oncogenic or likely oncogenic in the OncoKB database [25, 26]. Specifically, the alterations analyzed included mutations in EGFR, KRAS, BRAF, and ERBB2, MET exon skipping, and fusion genes of ALK, ROS1, NRG1, RET, NTRK, and FGFR, all of which were considered CAGAs. We identified these alterations using both WES and RNA-seq datasets.

ChIP-seq

The ChIP-seq procedure used in the study, which was performed with a semi-automated dual-arm robot, has been described previously [27]. The full method for ChIP-seq analysis is available in supplementary methods.

Overlap analysis of super-enhancers and structural variants

To investigate potential functional relationships or co-regulation between genomic regions, we examined the genomic coordinates of the peaks to ascertain whether their ranges intersect. This overlap can manifest as partial, wherein only a segment of one peak intersects with the other, or complete, where one peak is entirely subsumed by the other. Specifically, for the overlap analysis of super-enhancer and structural variant regions, we employed the findOverlappingPeaks function from the “ChIPpeakAnno” R package. Recognizing that structural variant events are characterized by extensive disruptions involving the 3D genomic structure, we deem an overlap between a 20 kb region surrounding the genomic breakpoint and super-enhancer region to be significant.
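The overlap criterion described above can be illustrated with a minimal Python sketch. It is not the ChIPpeakAnno implementation used in the study. It assumes that the "20 kb region surrounding the genomic breakpoint" means a window of 10 kb on each side of the breakpoint (the flank is a parameter, since the text does not state this explicitly), and all coordinates and names below are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int


def breakpoint_window(chrom: str, pos: int, flank: int = 10_000) -> Interval:
    """Window around a breakpoint; flank=10_000 (20 kb total) is an assumption."""
    return Interval(chrom, max(0, pos - flank), pos + flank)


def overlaps(a: Interval, b: Interval) -> bool:
    """True for partial or complete overlap on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def overlapping_super_enhancers(breakpoints, super_enhancers, flank=10_000):
    """Return (breakpoint, super-enhancer) pairs whose regions intersect."""
    hits = []
    for chrom, pos in breakpoints:
        window = breakpoint_window(chrom, pos, flank)
        for se in super_enhancers:
            if overlaps(window, se):
                hits.append(((chrom, pos), se))
    return hits


# Hypothetical toy data, not taken from the study.
toy_super_enhancers = [Interval("chr17", 39_700_000, 39_750_000)]
toy_breakpoints = [
    ("chr17", 39_695_000),  # window reaches into the super-enhancer
    ("chr17", 39_600_000),  # too far upstream
    ("chr12", 39_700_000),  # same position, different chromosome
]
toy_hits = overlapping_super_enhancers(toy_breakpoints, toy_super_enhancers)
print(toy_hits)
```

A pairwise scan like this is quadratic in the number of regions; for tens of thousands of loci an interval tree or sorted sweep would be the usual choice.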

Super-enhancer (SE)-to-gene links analysis

In light of a noticeable bias inherent in differing RNA-seq methodologies, we used samples processed through polyA RNA-seq for the following SE-to-gene links analysis (n = 142). Given that super-enhancer regions are often annotated over large areas encompassing multiple gene clusters, we first examined the correlation between H3K27Ac peaks and gene expression, referred to as peak-to-gene links. We then extracted the genes that were annotated as super-enhancer regions by the method of rank ordering of super-enhancers (ROSE) and as structural variants by Manta. The comprehensive methodology for the SE-to-gene links analysis can be found in supplementary methods.
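As a rough illustration of a peak-to-gene link, the sketch below correlates the H3K27Ac signal of one peak with the expression of one gene across samples and adjusts a list of p-values for multiple testing. The use of Pearson correlation and Benjamini-Hochberg adjustment is an assumption, as is the 500,000 bp default window (taken from the 0.5 Mbp distance reported in the Results); the actual procedure is described in the supplementary methods, and all function names and values here are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def peaks_near_promoter(peaks, chrom, tss, window=500_000):
    """Keep (chrom, center) peaks within `window` bp of a promoter position."""
    return [p for p in peaks if p[0] == chrom and abs(p[1] - tss) <= window]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy values: peak signal and gene expression in four samples.
toy_signal = [1.0, 2.0, 3.0, 4.0]
toy_expression = [2.0, 4.0, 6.0, 8.0]
toy_r = pearson_r(toy_signal, toy_expression)
toy_fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(toy_r, toy_fdr)
```

Links would then be kept when the adjusted value falls below the chosen threshold (FDR < 0.05 in the text).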

Hi-C

The high-throughput chromosome conformation capture (Hi-C) procedure has been previously described in the study by Rao et al. [28]. The full method for the Hi-C analysis is described in supplementary methods.

Long-read sequencing

The complete methodology for acquiring long-read sequencing data using the PacBio Sequel II system is detailed in supplementary methods. For de novo assembly to obtain contiguous assemblies, we employed hifiasm (v0.16.1-r375) in combination with the Hi-C dataset and option -t 86. Note that this is particularly beneficial when assembling complex genomes or resolving repetitive regions, which are often difficult to decipher with short-read sequencing data. Because the resulting assembly is too large to visualize as a whole, we used Bandage’s reduce command with the options --scope aroundblast --evfilter 1e-100 --distance 2 to extract the nodes matching the ERBB2 cDNA query sequence, along with adjacent nodes, from the assembly graph. To query sequences and visualize the HNF1β and ERBB2 genes, we locally performed a BLAST search (v2.9.0) within Bandage with filter parameters e-value 1e-100 and bit score 10,000 to identify genomic regions encompassing GRCh38 chr17:37,686,431–37,745,059 and GRCh38 chr17:39,687,914–39,730,426, respectively. The continuity of genomes assembled with PacBio long reads is crucial because it improves structural variant detection and helps resolve complex regions. To determine the continuity of the genome sequence according to Bandage’s rule, the following conditions were applied: one of the edges connected to node A uniquely leads to node B in all possible paths, or one of the edges connected to node B uniquely leads to node A in all possible paths.

Targeted chromosomal rearrangements

The generation of inducible Cas9 expression in cell lines is detailed in supplementary methods. To design highly specific single guide RNAs (sgRNAs) targeting the genomic regions near the cleavage sites that cause structural variants identified from the WGS of LUAD, we used the crispRdesignR (v1.1.6) package and further verified the selected sgRNAs using the CRISPR-Cas9 guide RNA design checker (Integrated DNA Technologies). The sgRNAs were then synthesized as single molecules comprising both crRNA and tracrRNA sequences, with chemical modifications for a high level of functional stability (Integrated DNA Technologies). The targeted sequences for the sgRNAs were as follows:

gRNA #1: 5'-GTT ATG AAC ATT GGC AAT GT-3',
gRNA #2: 5'-GTC ACC TAG ATG CCC ATC CA-3',
gRNA #3: 5'-GAG ACT GGC GTG CAG CGC GA-3',
gRNA #4: 5'-GCC TAG GAG ATC AAA ATC TG-3'.

We then transfected Cas9-inducible HBEC3-KT and HSAEC1-KT cells with single guide RNAs (sgRNAs) using Lipofectamine RNAiMax transfection reagent (ThermoFisher Scientific, 13778–150) according to the manufacturer’s instructions to achieve targeted chromosomal rearrangements. To screen for the presence of mutations or small insertions/deletions (indels) in the specific DNA region of interest, we performed T7 endonuclease I (T7EI) mismatch detection assays using the Alt-R Genome Editing Detection Kit (Integrated DNA Technologies, 1075932) according to the manufacturer’s instructions. Genomic inversion of HNF1β-ERBB2 region was confirmed by PCR and sequencing.

FACS

Forty-eight hours post-transfection, cells were subjected to analysis. The cells were resuspended in 50 µL of Stain buffer (BD, 554656) and treated with 5 µL (2.5 µg) of Human BD Fc Block (BD, 564219) per 10⁶ cells, followed by a 10-min incubation. Then, we added either Anti-Her2/neu (BD, 340552) or Mouse IgG1 (20 µL, 0.1 µg/20 µL) and incubated at 4 °C for at least 30 min. We obtained the data from 50,000 individual cells. The detailed FACS analysis method is available in supplementary methods.

Recurrence-free survival (RFS) analysis

We utilized the most comprehensive RNA-seq dataset available for LUAD (n = 1,115). To identify LUAD cases exhibiting outlier gene expression, we calculated the quartiles of the expression values for each gene and determined the interquartile range (IQR). We then computed the upper bound for outliers, defined as the third quartile plus 1.5 times the IQR. This approach is considered robust for detecting outliers and is applicable across polyA RNA-seq, Ribo-Zero RNA-seq, and SMART-seq methods, irrespective of the differences between these techniques. The outlier genes used for RFS analysis are listed in Table 1. RFS curves for cases with and without outlier gene expression were estimated using the Kaplan–Meier method. Differences in RFS, including postoperative recurrence, were assessed using the log-rank test. GraphPad Prism (GraphPad Software, v9) was employed for statistical analyses.
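The outlier rule above (third quartile plus 1.5 times the IQR) can be written as a short Python sketch. The quartile interpolation method is not stated in the text, so the "inclusive" method of the standard library is an assumption, and the expression values below are hypothetical.

```python
import statistics


def upper_outlier_bound(values):
    """Return Q3 + 1.5 * IQR for one gene's expression values."""
    # method="inclusive" is an assumption; the source does not specify
    # how quartiles were interpolated.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


def outlier_cases(expression_by_case):
    """Return the case IDs whose expression exceeds the upper bound."""
    bound = upper_outlier_bound(list(expression_by_case.values()))
    return [case for case, value in expression_by_case.items() if value > bound]


# Hypothetical expression values for one gene across nine cases.
toy_expression = {
    "case1": 1, "case2": 2, "case3": 3, "case4": 4, "case5": 5,
    "case6": 6, "case7": 7, "case8": 8, "case9": 100,
}
toy_bound = upper_outlier_bound(list(toy_expression.values()))
toy_outliers = outlier_cases(toy_expression)
print(toy_bound, toy_outliers)
```

In this toy example the quartiles are 3 and 7, so the bound is 13 and only the case with a value of 100 is flagged. Cases flagged this way would form the "outlier" group compared against the remaining cases in the survival analysis.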

Table 1 A ranked list of genes according to the SE-to-gene links analysis. The peak-to-gene links analysis was conducted on the non-CAGAs LUAD cohort. The peaks annotated as SE regions with FDR less than 0.05 were extracted. The gene symbol (Symbol), chromosome number (Chr), Start and End positions, r as correlation coefficient, and FDR are displayed and ranked based on FDR scores

Bioinformatic analysis

The complete methods are available in supplementary methods.

Statistical analysis

Comparisons between group means were performed using a two-tailed Student’s t-test, as indicated. A P-value of less than 0.05 was considered statistically significant.

Results

Identification of driver mutations driven by super-enhancer formation with structural variants

Because cancer-relevant somatic mutations are present in only a subset of patients with LUAD, applying targeted therapies often poses significant challenges. To broaden the scope of precision medicine to include patients without clinically actionable genetic alterations (CAGAs), we used a comprehensive strategy to classify patients with LUAD. Our initial classification scheme emphasized the identification of primary mutations in essential oncogenes, such as EGFR, KRAS, BRAF, ERBB2, and MET (exon skipping), as well as in oncogenic fusion genes, including ALK, ROS1, NRG1, RET, NTRK, and FGFR. Identifying these mutations is particularly important for cancer therapy because targeted treatments specifically designed for these mutations have demonstrated significant therapeutic benefits [1]. Therefore, we selected LUAD cases from 938 patients using the WES and poly(A) RNA-seq datasets. From the subset that did not possess these CAGAs (n = 420, termed non-CAGAs), we selected 174 cases for WGS and H3K27Ac ChIP-seq analyses (Fig. S1A). Of note, driver mutations in genes including EGFR, KRAS, BRAF, and ERBB2 were identified in 476 cases, representing 50.7% of the entire cohort (Fig. S1B). Importantly, a higher frequency of mutations was observed in the non-CAGA cohort (Fig. S2A, and the oncoprints shown in Fig. S2B and C), suggesting that these variants may not serve as specific markers for non-CAGA cases but rather indicate a general elevation in mutation frequency. Furthermore, the CNV and SV landscapes revealed hotspots associated with the CDK4/MDM2 loci, where copy number amplification was observed. This may be partly explained by complex chromothripsis events characterized by extensive copy number amplification (Fig. S3).

To elucidate the distinct characteristics of normal and tumor tissues, we conducted H3K27Ac ChIP-seq analysis of seven non-CAGA LUADs. Adjacent matched tissues were used as normal controls for comparison. The PCA results indicated that, while the adjacent tissues were homogeneous, the lung adenocarcinoma samples exhibited diverse features (Fig. S4). Subsequently, we performed WGS and H3K27Ac ChIP-seq in 174 patients without CAGAs (non-CAGAs, see Fig. S2B and C and Table S1) and 45 patients with CAGAs to comprehensively investigate the potential roles of super-enhancers in our LUAD cohort (QC data are summarized in Fig. S5 and Dataset S1).

To explore the direct correlation between the formation of super-enhancers and genomic structural variants, and to better understand their molecular interplay in disease mechanisms, we employed Manta analysis to identify genomic breakpoints from WGS and ROSE analysis to identify super-enhancer regions from ChIP-seq, and obtained the genomic loci where these two sets of data overlapped (Fig. 1A, super-enhancer and structural variant regions summarized in Dataset S2-3, 4–5, respectively). Although the total number of loci in the entire dataset was 67,349 and 69,991, we found that only a small fraction, 700 (~1%), showed overlapping regions (Fig. S6, genome coordinates listed in Dataset S6-7), suggesting that structural variants play a confined role in specific regions as direct triggers for the formation of super-enhancers in non-CAGAs LUAD. A noteworthy finding was that when focusing on regions where super-enhancers and structural variants overlapped, the frequency of overlaps per patient in non-CAGA LUAD was substantially higher than that in CAGAs LUAD (Fig. 1B). This suggests that in some instances, the concurrent presence of super-enhancers and structural polymorphisms may act as a discriminating factor for non-CAGA LUAD. Furthermore, all pathways were significantly associated with cancer-related processes in the non-CAGA LUAD group (Fig. 1C). Conversely, cancer-related pathways were not consistently observed for gene groups located near the super-enhancers and structural variant regions alone (Fig. S7). Finally, we confirmed the formation of super-enhancers accompanied by structural variants in genes such as BAX, CCND1, CDK4, EGFR, ERBB2, FOXO3, RXRA, and STAT3, which are all frequently related to NSCLC (Fig. 1D, Fig. S8).

To explore the potential impact of structural variants on gene expression in our dataset, we conducted an integrated analysis of RNA sequencing data and structural variants. Most genes showed no significant changes in expression levels (black dotted line in Fig. S9A). However, a subset of genes (n = 632) exhibited elevated expression levels, which may be associated with the presence of structural variants (red dotted line). Conversely, 170 genes exhibited decreased expression levels (blue dotted line). Notably, no significant differences in outlier gene expression (red dots, n = 20) were observed between the non-CAGA and CAGA cohorts, suggesting that structural variants alone cannot be used to distinguish between non-CAGA and CAGA cases (Fig. S9B). Hence, identifying the genomic regions where both super-enhancers and structural variants coexist might offer insights into the cancer-related attributes of non-CAGAs LUAD. These findings suggest that the genes identified in these regions have potential therapeutic implications.

Impact of gene expression on super-enhancer formation accompanied by structural variants in non-CAGAs LUAD

To investigate distinct cellular or tissue signatures within LUAD through transcriptional profiling, we performed a clustering analysis on the entire RNA-seq dataset comprising 938 cases. This analysis revealed that a specific subset of non-CAGA LUAD cases exhibited prominent characteristics similar to those of limbal and corneal epithelial stem cells. In contrast, the EGFR mutation-positive group was markedly enriched in Type II pneumocytes and epithelial progenitor cells, as shown in groups 1 and 3, respectively (Fig. S10 and Table S2). To decipher the super-enhancer and structural variant landscape in our non-CAGAs LUAD cohort and better understand its impact on gene expression, we performed a peak-to-gene links analysis [29] by correlating H3K27Ac peaks within 0.5 Mbp of the gene promoter with the expression of the gene (n = 142). In this analysis, 10,683 genes were identified to have a significant quantitative correlation with H3K27Ac peaks (FDR < 0.05, top 1,000 lists summarized in Dataset S8). Our data suggest a positive correlation among gene clusters annotated as super-enhancer regions that also have accompanying structural variants (Fig. S11). Strikingly, genes such as ERBB2 and EGFR, which are recognized as representative driver genes in LUAD, ranked prominently in this assessment (Fig. 2A, B, Table 1). Moreover, although CDK4 and MDM2 have been demonstrated to be involved in lung cancer, their roles as therapeutic targets have not yet been firmly established [30]. Regardless, they ranked the most prominently in this assessment (Fig. 2C, D, Table 1). In a limited number of non-CAGA LUAD cases involving the ERBB2, EGFR, CDK4, and MDM2 genes, we identified events where gene expression was induced to such an extent that the cases were deemed outliers (Fig. 2A-D). We confirmed that the super-enhancer and structural variant overlaps, which served as the origin of the genomic rearrangements, were present in all these cases (Fig. 2E-G). Importantly, structural variations associated with super-enhancers did not exhibit extensive copy number amplification, showing only a moderate gain in copy number (Fig. 2E-G). Finally, to elucidate the differences in gene expression patterns and pathway engagements between cases with super-enhancers and structural variants in genes including ERBB2, EGFR, KRAS, CCND1, and MDM2 and cases primarily displaying copy number alterations (CNAs), we conducted a comparative expression analysis. This analysis distinctly identified the chemokine activity pathway as significantly involved in cases with super-enhancers and structural variants, as highlighted in Group 4 (Fig. S12, Table S3). These findings indicate that H3K27Ac peaks provide a more explicit marker for gene expression amplification associated with the formation of super-enhancers concomitant with structural variants. Therefore, our analysis of the super-enhancer and structural variant landscape successfully identified gene clusters with strong correlations to expression levels. Moreover, in instances where super-enhancer and structural variant overlaps were present, we observed an exceptionally aberrant elevation in gene expression.

Candidates of driver mutations driven by exceptionally aberrant elevation in gene expression

Our SE-to-gene link analysis revealed a group of genes displaying remarkably aberrant expression that are compelling candidates for driver mutations driven by both super-enhancers and structural variants. Therefore, additional driver genes may need to be identified. Indeed, genes such as FRS2 and CAV2 may emerge as candidates (Fig. S13; the peak-to-gene link analyses for all other candidate genes are shown in Fig. S14 and the Circos plots in Fig. S15). FRS2 (Fibroblast Growth Factor Receptor Substrate 2) plays a critical role in activating the MAPK and PI3K signaling pathways, which are essential for cell proliferation, migration, and survival [31]. It has been identified as oncogenic and is amplified in high-grade serous ovarian cancer, highlighting its potential as a driver gene in oncogenesis [32]. Similarly, CAV2 (Caveolin 2) is implicated in cancer progression; genetic variants leading to high CAV2 expression have been shown to promote pancreatic cancer progression and are associated with poor prognosis [33]. Furthermore, CAV2 influences focal adhesion and extracellular matrix organization pathways, underscoring its role in tumor development and metastasis [33]. In summary, this analysis suggests that FRS2 and CAV2 are involved in the molecular dynamics of non-CAGA LUAD. A deeper understanding of their molecular mechanisms may provide insights into potential therapeutic strategies.

Chromosomal structure of super-enhancer and structural variant overlapped ERBB2 gene locus

Considering the notably aberrant increase in gene expression driven by super-enhancer formation associated with SV events, it is noteworthy that such super-enhancer and structural variant overlapping cases were observed in 40.8% of patients with non-CAGA LUAD (Fig. 3A, gene clusters on KEGG pathway enrichment analysis: FDR < 0.05). Although a small patient group with super-enhancer and structural variant formation was observed for ZFP36L1, DDIT4, and MIR21, unique super-enhancer and structural variant formations were observed in individual patients (Table S4). Among these, we focused on non-CAGAs LUAD cases displaying super-enhancer formation around ERBB2, comprising 1.15% of non-CAGA LUAD patients (Table S4). To further evaluate the validity of ERBB2 as a potential drug target in non-CAGA LUAD cases, we conducted H3K27Ac ChIP-seq analysis in HER2-overexpressing LUAD cases verified by IHC and RNA-seq, in which the relationship between overexpression and genomic amplification remains unclear [34]. These analyses were performed using patient-derived xenograft (PDX) models established at the NCC Japan [34, 35]. Extensive super-enhancer formation in the HER2 region was indeed observed (Fig. 3B, Fig. S16A). This super-enhancer formation led to marked overexpression of HER2 at both the transcript (Fig. S16B) and protein (Fig. S16C) levels. To investigate the activation mechanisms of overexpressed ERBB2, we analyzed the same PDX samples as previously mentioned: one harboring an EGFR activating mutation L858R, sample #1, and the other exhibiting ERBB2 overexpression, sample #2. This analysis was performed utilizing both mass spectrometry and reverse-phase protein array methodologies. Although we confirmed ERBB2 overexpression (Fig. S17A), we did not observe a significant increase in phosphorylated ERBB2 at Y1248, a well-established marker of ERBB2 activation (Fig. S17B). However, the phosphorylation levels of ERK1/2 and of the S6 ribosomal protein within the PI3K-AKT-mTOR pathway were comparable in both PDX samples (Fig. S17C). Despite the limited number of samples, this suggests that there are common activation mechanisms in ERBB2-overexpressing cases that do not depend on Tyr-1248 phosphorylation, indicating that alternative pathways could be involved in ERBB2-driven signaling [36]. Importantly, in drug testing, the pan-HER inhibitor poziotinib exhibited a promising antitumor effect, whereas afatinib showed none. In contrast to afatinib, trastuzumab deruxtecan (T-DXd) induced significant tumor shrinkage in a dose-dependent manner [34]. These findings suggest that ERBB2-targeting therapies, particularly poziotinib and T-DXd, could be effective therapeutic options for LUAD with super-enhancer formation around ERBB2.

To delve deeper into large-scale chromosomal structural changes and interactions, we conducted Hi-C analysis of cases exhibiting extensive super-enhancer formation surrounding the ERBB2 gene. Genomic alterations coinciding with H3K27Ac peaks were corroborated by the Hi-C results, as demonstrated by altered genomic organization (Fig. 3C). To directly identify the bona fide structural variants, we conducted de novo assembly using long-read sequencing with the PacBio Sequel II platform in conjunction with Hi-C data obtained from the same specimen (Fig. S18A, referring to Materials and methods). Upon analysis, we observed that the ERBB2 gene loci were situated in closer proximity (~ 125 kb) to the HNF1β gene loci compared to their respective positions in the standard GRCh38 reference genome (~ 1.9 Mb apart) (Fig. 3D) while preserving contiguity (Fig. S18B). This observation suggested that a structural variant event was responsible for the rearrangement of the ERBB2-HNF1β gene loci (Fig. 3D). This comprehensive analysis not only elucidates the complex genomic landscape of non-CAGA LUAD, but also highlights the potential of ERBB2-targeting therapies for a subset of patients with specific super-enhancer formations.
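The reference separation quoted above can be checked against the two GRCh38 chr17 intervals given in the Materials and methods. The following Python sketch computes the gap between those intervals and compares it with the approximate distance observed in the assembly. The interval labels and the rounded 125,000 bp value are assumptions made for illustration; the assembled distance is stated only approximately in the text.

```python
# GRCh38 chr17 intervals copied from the Materials and methods.
# The labels are neutral on purpose: only the coordinates are used here.
LOCUS_A = ("chr17", 37_686_431, 37_745_059)
LOCUS_B = ("chr17", 39_687_914, 39_730_426)


def gap_between(a, b):
    """Distance in bp between two intervals on the same chromosome."""
    if a[0] != b[0]:
        raise ValueError("intervals are on different chromosomes")
    first, second = sorted([a, b], key=lambda interval: interval[1])
    return max(0, second[1] - first[2])  # 0 if the intervals overlap


reference_gap = gap_between(LOCUS_A, LOCUS_B)

# Approximate distance in the de novo assembly, rounded from "~125 kb".
assembled_gap = 125_000

fold_closer = reference_gap / assembled_gap
print(f"reference gap: {reference_gap:,} bp")
print(f"assembled gap: {assembled_gap:,} bp (approximate)")
print(f"fold reduction: {fold_closer:.1f}")
```

The reference gap evaluates to 1,942,855 bp, consistent with the ~1.9 Mb stated above, so the rearranged loci lie roughly 15 times closer together than in the reference genome.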

Targeted chromosomal rearrangements between ERBB2 and HNF1β loci in cultured cells

To directly determine whether the characteristic structural abnormalities identified above led to aberrant gene expression, we induced genomic structural abnormalities in cultured cells using the CRISPR-Cas9 system. WGS and Hi-C analyses revealed highly complex structural abnormalities in the ERBB2 region. Meanwhile, from the Hi-C analysis results (Fig. 4A), we identified common chromosomal inversions in the ERBB2 and HNF1β gene loci, respectively (Fig. 4B-D). Therefore, we designed gRNAs targeting the regions adjacent to these two breakpoints and attempted to induce chromosomal inversions in HBEC3-KT and HSAEC1-KT cells (Fig. S19). These cell lines, immortalized with CDK4 and hTERT, represent human bronchial and small airway epithelial cells, respectively, and neither form colonies on soft agar nor initiate tumor growth in mice [37, 38]. No oncogenic mutations in EGFR have been detected in HBEC3-KT using WES [39]. To confirm specific inversions, we employed T7EI assays and sequencing techniques (Fig. S20-21). Chromosomal inversions require simultaneous double-strand breaks at two distinct locations. When double-strand breaks were simultaneously induced, approximately 0.20–0.69% of the cells displayed an increase in HER2 expression, as confirmed by FACS (Fig. 5A-B, Fig. S22A-B) and RT-PCR (Fig. S23). This is comparable to the reported frequency of chromosomal inversions of approximately 1–8% [40, 41]. This increase was also observed with gRNAs targeting different sequences, albeit in the proximate regions (Fig. 5C-D, Fig. S22C-D). Conversely, upon inducing a break at only one site, we observed no significant difference in HER2 expression compared to baseline, with an approximate frequency of 0.01–0.04% (Fig. 5E-J, Fig. S22E-J, summarized in Fig. 5K, Fig. S22K). These results indicate that an increase in HER2 expression occurs only when double-strand breaks are induced in both ERBB2 and HNF1β genomic regions, strongly suggesting that the observed genomic structural abnormalities directly impact HER2 expression.

Significance of outlier genes in clinical outcomes

Our SE-to-gene link analysis, prioritized by the top six genes (CDK4, ERBB2, MDM2, FRS2, EGFR, CAV2), identified a set of genes displaying markedly aberrant expression patterns, indicative of potential driver mutations (Table 1). To evaluate the clinical implications of gene overexpression in the absence of somatic mutations, we analyzed its correlation with recurrence-related clinical outcomes. In patients with non-CAGA LUAD (n = 312), markedly elevated gene expression, as ascertained by gene expression outlier analysis (refer to Materials and methods), was associated with significantly shorter RFS than in patients without such elevation (Fig. 6A). Moreover, these results were observed irrespective of driver mutations in the LUAD cohort (n = 1,147, Fig. 6B). Additionally, comparable results were obtained when the entire set of 26 genes extracted from the SE-to-gene link analysis (Table 1) was considered as the target group (Fig. 6C-D). Among the 26 genes, outliers in 23 genes were associated with an increased risk of recurrence across all LUAD cases, particularly for FGF3, FGF4, and FGF19 (Fig. 6E). These findings underscore the robustness of the gene set derived from the super-enhancer and structural variant landscape analyses and imply that, regardless of the presence or absence of driver gene mutations such as CAGAs, the identified genes possess clinical significance as prognostic factors for predicting postoperative outcomes in LUAD.
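The comparison of recurrence-free survival between groups can be illustrated with a minimal Kaplan-Meier sketch. This is not the analysis code used in the study: the function name `kaplan_meier`, the toy follow-up times, and the split into two groups are assumptions made for illustration only, and the outlier criterion that defines the groups is the one given in Materials and methods, which is not reproduced here.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) steps of a Kaplan-Meier estimate.

    times  : follow-up time per patient (any consistent unit)
    events : 1 if recurrence was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        # The curve only steps down at times with at least one recurrence.
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set afterwards.
        n_at_risk -= n_leaving
    return curve


# Toy data (hypothetical, not taken from the study cohort).
with_outlier = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
without_outlier = kaplan_meier([4, 6, 7, 9, 10], [0, 1, 0, 0, 0])

for label, curve in (("outlier", with_outlier), ("no outlier", without_outlier)):
    for t, s in curve:
        print(f"{label}: t={t} RFS={s:.2f}")
```

A formal comparison of the two curves would additionally need a test such as the log-rank test, which this sketch omits.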

Discussion

Although super-enhancers and structural variants are often detected in the range of several hundred spots per case, most cancer research conducted thus far has analyzed these datasets independently [42], and there are no examples of integrated analysis in LUAD. In this study, we focused on understanding the interplay between super-enhancers and structural variants in the regulation of gene expression in non-CAGA LUAD. We found that the co-localization of super-enhancers and structural variants was limited, accounting for approximately 1% of the overall spots detected using our methodology. However, this co-localization was observed in approximately 40% of non-CAGA LUAD cases. Importantly, genes such as ERBB2, EGFR, CDK4, and MDM2, all with established links to NSCLC, demonstrated increased expression due to super-enhancer and structural variant overlap without extensive copy number amplifications. Furthermore, we identified clusters of genes that form super-enhancers linked to structural variations. This indicates that adjacent genes, including FRS2, CAV2, FGF3, FGF4, and FGF19, may also serve as driver genes in addition to well-established driver genes [32, 43,44,45]. Although further investigation is required to determine whether these genes are drivers, the significance of our analysis lies in extending the driver concept from somatic mutations alone to include driver changes caused by overexpression of wild-type genes [46,47,48,49].
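The notion of co-localization discussed above can be sketched as an interval query. The following is only an illustration under stated assumptions: a super-enhancer is treated as co-localized when at least one structural variant breakpoint falls inside it (or within an optional `window`), which may differ from the criterion actually used in this study, and the function name and coordinates are hypothetical.

```python
import bisect
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]   # (chromosome, start, end), half-open
Breakpoint = Tuple[str, int]      # (chromosome, position)


def colocalized_super_enhancers(
    super_enhancers: List[Interval],
    breakpoints: List[Breakpoint],
    window: int = 0,
) -> List[Interval]:
    """Return super-enhancers with an SV breakpoint inside them,
    or within `window` bp of their boundaries."""
    # Index breakpoint positions per chromosome and sort them for bisection.
    by_chrom: Dict[str, List[int]] = {}
    for chrom, pos in breakpoints:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    hits: List[Interval] = []
    for chrom, start, end in super_enhancers:
        positions = by_chrom.get(chrom, [])
        # First breakpoint at or after the (extended) start of the interval.
        idx = bisect.bisect_left(positions, start - window)
        if idx < len(positions) and positions[idx] < end + window:
            hits.append((chrom, start, end))
    return hits


# Hypothetical coordinates, for illustration only.
ses = [("chr17", 100, 200), ("chr17", 500, 600), ("chr12", 50, 80)]
bps = [("chr17", 150), ("chr12", 90)]
hits = colocalized_super_enhancers(ses, bps)
print(f"{len(hits)} of {len(ses)} super-enhancers overlap a breakpoint")
```

Counting hits per super-enhancer and hits per case separately is what allows a small fraction of spots to correspond to a much larger fraction of cases.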

Therapies targeting HER2, such as poziotinib and T-DXd, have shown significant efficacy in treating PDX models of LUAD with super-enhancer formation in the vicinity of the ERBB2 gene. To further elucidate the influence of genomic structure on gene expression, we utilized the CRISPR-Cas9 system to induce chromosomal rearrangements between the ERBB2 and HNF1β loci within a cell culture system. Our results revealed that an increase in HER2 expression was observed only when double-strand breaks occurred concurrently at both loci. Although this observation strongly reinforces the hypothesis that structural abnormalities within the gene directly influence ERBB2 expression, the structural variant event alone seems insufficient for full ERBB2 activation and subsequent cellular transformation. This suggests a potential need for other genetic or epigenetic alterations. In line with this, it would be intriguing to explore how EGF influences ERBB2 expression mediated by super-enhancers and structural variant formation. Thus, our culture conditions may unmask the complete array of genetic and epigenetic modifications necessary for cellular transformation. Overall, these findings underscore the pivotal role of genomic structures, such as super-enhancers and structural variants, in modulating gene expression in non-CAGA LUAD.

One of the most recent and ambitious efforts in this field is TRACERx, which was designed to trace genetic alterations in cancer, providing a profound understanding of how these driver genes contribute to disease progression and treatment responses [50, 51]. Such mutations often have considerable implications for the function or regulation of associated proteins, and when present, these mutations can lead to disease states such as cancer. However, when these mutations are absent, it becomes notably challenging to categorize a gene as a “driver” gene. In the context of our research, we propose a promising alternative approach for instances in which mutations in driver genes are not detected. In addition, the identification of super-enhancers and structural variants is a qualitative process that is less burdened by the complexities associated with quantitative analysis such as RNA-seq. Therefore, our approach presents an alternative pathway for identifying potential driver events and provides a new direction for research in cases where conventional methods fail to identify somatic mutations within the protein-coding regions of driver genes.

Copy number amplification is a significant event in cancer that often results in the overexpression of oncogenes and promotes tumor development and progression [52, 53]. It is plausible that regions of the genome with amplified copy numbers also coincide with areas where super-enhancers and structural variants overlap, leading to further enhancement of gene expression. Indeed, within our non-CAGA cohort, specific cases demonstrated complex chromothripsis events characterized by extensive copy number amplification around the CDK4/MDM2 loci (Fig. S3). Since chromothripsis inherently involves complex structural variations, further investigations are required to determine whether the analyses of super-enhancers associated with structural variations indicate chromothripsis events [54]. However, it is important to note that while copy number amplification often leads to the overexpression of genes, gene expression is also regulated by other factors, including epigenetic changes and transcription factor binding [8, 55]. Therefore, an understanding of genomic-epigenetic configurations could potentially aid in the accurate identification of target genes for therapeutic interventions.

Translating our findings from WGS and ChIP-seq analyses into clinical applications requires prospective trials. By applying our method, we identified a preponderance of probable driver genes, some of which are currently under clinical investigation [30, 56,57,58,59]. This approach offers significant benefits for patient selection and potentially improves the efficacy of clinical trials by targeting individuals with relevant genetic profiles. This may lead to more personalized treatment strategies, enhanced therapeutic outcomes, and better patient prognoses. However, this study has some limitations that must be acknowledged. For example, the size of the obtained clinical samples may impose constraints on the scope and depth of the analyses that can be performed. Furthermore, the quality and quantity of genomic and epigenomic data may have been affected by the small sample size, potentially influencing the statistical power and reliability of the study outcomes. Despite these limitations, we previously reported that automated techniques using a dual-arm robot [27] can partially mitigate these challenges, enabling more efficient and accurate data collection and analysis.

In summary, our study provides valuable insights into the interplay between genomic and epigenetic configurations in non-CAGA LUAD. We envision that our findings will contribute to the development of novel therapeutic strategies for patients with non-CAGA LUAD by identifying potential therapeutic targets. Our work paves the way for further research to verify and expand upon these findings, aiming to improve patient outcomes in LUAD.

Conclusions

Our study elucidated the intricate interplay between super-enhancers and structural variants in non-CAGA LUAD, underscoring their significant contribution to the modulation of gene expression. The methodology employed facilitated the identification of a substantial number of putative driver genes, thereby enabling a more precise selection of patients for clinical trials, potentially augmenting the effectiveness of personalized therapeutic approaches and improving patient prognoses.

Availability of data and materials

The dataset for exploring genomic-epigenetic configurations in lung adenocarcinoma without clinically actionable genetic alterations is available at https://doi.org/10.6084/m9.figshare.22826636.

The dataset encompasses the following files:

1) complete_p2gl_dataset.tsv contains the complete results of the peak-to-gene link analysis in non-CAGA LUAD.

2) LUAD_56.bigWig, LUAD_JPDX0057.bigWig, and LUAD_222.bigWig include H3K27Ac ChIP-seq data for genome-wide visualization and analysis, as shown in Fig. 4B-D, respectively.

3) LUAD_56.mcool, LUAD_JPDX0057.mcool, and LUAD_222.mcool represent Hi-C data for genome-wide visual representation and analysis, as demonstrated in Fig. 4B-D, respectively.
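As a hedged starting point for working with the tab-separated file listed above, the sketch below parses a header-bearing TSV and filters rows by gene. The column names used here (`peak_id`, `gene`, `score`) and the in-memory sample are hypothetical, since the actual header of complete_p2gl_dataset.tsv is not described in this section; inspect the real header before relying on any column name.

```python
import csv
import io
from typing import Dict, List, TextIO


def read_links(handle: TextIO) -> List[Dict[str, str]]:
    """Parse a tab-separated table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def links_for_gene(
    rows: List[Dict[str, str]],
    gene: str,
    gene_column: str = "gene",  # hypothetical column name
) -> List[Dict[str, str]]:
    """Return the rows whose gene column equals `gene`."""
    return [row for row in rows if row.get(gene_column) == gene]


# In-memory sample with made-up columns and values, standing in for the real file.
sample = (
    "peak_id\tgene\tscore\n"
    "p1\tERBB2\t0.9\n"
    "p2\tCDK4\t0.7\n"
    "p3\tERBB2\t0.5\n"
)
rows = read_links(io.StringIO(sample))
print(sorted(rows[0].keys()))             # inspect which columns are present
print(len(links_for_gene(rows, "ERBB2")))
```

For the downloaded file, the same `read_links` function can be given a handle from `open("complete_p2gl_dataset.tsv", newline="")` in place of the in-memory sample.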

Raw sequence data, including WGS, Hi-C, RNA-seq, and ChIP-seq data, are not publicly available because of their sensitive nature. However, they can be obtained from the corresponding author upon reasonable request. Each request will be evaluated for appropriateness and will require a data access agreement that aligns with the relevant ethical approvals. This procedure is coordinated through our platform, Mine (https://www.nibiohn.go.jp/mine/). Mine publicly showcases the outcomes of the inter-agency research project “Development of artificial intelligence to accelerate drug discovery”, which operates under the framework of the PRISM project in Japan. One of its projects focuses on drug development for patients with so-called “Pan-negative” lung cancer, in which no specific therapeutic targets have been identified.

Abbreviations

ALK: Anaplastic lymphoma kinase
BAX: BCL2 associated X
BLAST: Basic local alignment search tool
BRAF: B-Raf proto-oncogene
CAGAs: Clinically actionable genetic alterations
CAV2: Caveolin 2
CCND1: Cyclin D1
CDK4: Cyclin-dependent kinase 4
ChIP-seq: Chromatin immunoprecipitation sequencing
CNAs: Copy number alterations
DDIT4: DNA damage-inducible transcript 4
EGFR: Epidermal growth factor receptor
ERBB2/HER2: Human epidermal growth factor receptor 2
FGF3: Fibroblast growth factor 3
FGF4: Fibroblast growth factor 4
FGF19: Fibroblast growth factor 19
FGFR: Fibroblast growth factor receptor
FOXO3: Forkhead box O3
FRS2: Fibroblast growth factor receptor substrate 2
FDR: False discovery rate
H3K27Ac: Histone H3 lysine 27 acetylation
HNF1β: Hepatocyte nuclear factor 1 beta
Hi-C: High-throughput chromosome conformation capture
KEGG: Kyoto encyclopedia of genes and genomes
KRAS: Kirsten rat sarcoma
LUADs: Lung adenocarcinomas
MDM2: MDM2 proto-oncogene
MET: MET proto-oncogene
MIR21: MicroRNA 21
NCC: National cancer center
NRG1: Neuregulin 1
NSCLC: Non-small cell lung cancer
NTRK: Neurotrophic tyrosine kinase
PDX: Patient-derived xenograft
polyA RNA-seq: Polyadenylated RNA sequencing
PRISM: Public/Private R&D Investment Strategic Expansion PrograM
RET: Ret proto-oncogene
RNA-seq: RNA sequencing
ROSE: Rank ordering of super-enhancers
ROS1: ROS proto-oncogene 1
Ribo-Zero RNA-seq: Ribosomal RNA depletion sequencing
RXRA: Retinoid X receptor alpha
SE: Super-enhancer
SMART-seq: Switching mechanism at 5' end of RNA template sequencing
STAT3: Signal transducer and activator of transcription 3
SVs: Structural variants
T-DXd: Trastuzumab deruxtecan
T7EI: T7 endonuclease I
TRACERx: Tracking cancer evolution through therapy
WES: Whole-exome sequencing
WGS: Whole-genome sequencing
ZFP36L1: ZFP36 ring finger protein-like 1

References

Herbst RS, Morgensztern D, Boshoff C. The biology and management of non-small cell lung cancer. Nature. 2018;553:446–54.
Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 2014;511:543–50.
Saito M, Shiraishi K, Kunitoh H, Takenoshita S, Yokota J, Kohno T. Gene aberrations for precision medicine against lung adenocarcinoma. Cancer Sci. 2016;107:713–20.
Carrot-Zhang J, Yao X, Devarakonda S, Deshpande A, Damrauer JS, Silva TC, et al. Whole-genome characterization of lung adenocarcinomas lacking alterations in the RTK/RAS/RAF pathway. Cell Rep. 2021;34:108784.
Weischenfeldt J, Dubash T, Drainas AP, Mardin BR, Chen Y, Stutz AM, et al. Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat Genet. 2017;49:65–74.
Newman S, Nakitandwe J, Kesserwan CA, Azzato EM, Wheeler DA, Rusch M, et al. Genomes for Kids: the scope of pathogenic mutations in pediatric cancer revealed by comprehensive DNA and RNA sequencing. Cancer Discov. 2021;11:3008–27.
Duncavage EJ, Schroeder MC, O’Laughlin M, Wilson R, MacMillan S, Bohannon A, et al. Genome sequencing as an alternative to cytogenetic analysis in myeloid cancers. N Engl J Med. 2021;384:924–35.
Dubois F, Sidiropoulos N, Weischenfeldt J, Beroukhim R. Structural variations in cancer and the 3D genome. Nat Rev Cancer. 2022;22:533–46.
Mohammad HP, Barbash O, Creasy CL. Targeting epigenetic modifications in cancer therapy: erasing the roadmap to cancer. Nat Med. 2019;25:403–18.
Hnisz D, Abraham BJ, Lee TI, Lau A, Saint-Andre V, Sigova AA, et al. Super-enhancers in the control of cell identity and disease. Cell. 2013;155:934–47.
Hamamoto R, Takasawa K, Shinkai N, Machino H, Kouno N, Asada K, et al. Analysis of super-enhancer using machine learning and its application to medical biology. Brief Bioinform. 2023;24:bbad107.
Loven J, Hoke HA, Lin CY, Lau A, Orlando DA, Vakoc CR, et al. Selective inhibition of tumor oncogenes by disruption of super-enhancers. Cell. 2013;153:320–34.
Hnisz D, Schuijers J, Lin CY, Weintraub AS, Abraham BJ, Lee TI, et al. Convergence of developmental and oncogenic signaling pathways at transcriptional super-enhancers. Mol Cell. 2015;58:362–70.
Novo CL, Javierre BM, Cairns J, Segonds-Pichon A, Wingett SW, Freire-Pritchett P, et al. Long-range enhancer interactions are prevalent in mouse embryonic stem cells and are reorganized upon pluripotent state transition. Cell Rep. 2018;22:2615–27.
Hung KL, Yost KE, Xie L, Shi Q, Helmsauer K, Luebeck J, et al. ecDNA hubs drive cooperative intermolecular oncogene expression. Nature. 2021;600:731–6.
Avraham R, Yarden Y. Feedback regulation of EGFR signalling: decision making by early and delayed loops. Nat Rev Mol Cell Biol. 2011;12:104–17.
Arteaga CL, Engelman JA. ERBB receptors: from oncogene discovery to basic science to mechanism-based cancer therapeutics. Cancer Cell. 2014;25:282–303.
Scholl S, Beuzeboc P, Pouillart P. Targeting HER2 in other tumor types. Ann Oncol. 2001;12(Suppl 1):S81–87.
Slamon DJ, Leyland-Jones B, Shak S, Fuchs H, Paton V, Bajamonde A, et al. Use of chemotherapy plus a monoclonal antibody against HER2 for metastatic breast cancer that overexpresses HER2. N Engl J Med. 2001;344:783–92.
Baselga J, Cortes J, Kim SB, Im SA, Hegg R, Im YH, et al. Pertuzumab plus trastuzumab plus docetaxel for metastatic breast cancer. N Engl J Med. 2012;366:109–19.
Verma S, Miles D, Gianni L, Krop IE, Welslau M, Baselga J, et al. Trastuzumab emtansine for HER2-positive advanced breast cancer. N Engl J Med. 2012;367:1783–91.
Geyer CE, Forster J, Lindquist D, Chan S, Romieu CG, Pienkowski T, et al. Lapatinib plus capecitabine for HER2-positive advanced breast cancer. N Engl J Med. 2006;355:2733–43.
Chan A, Delaloge S, Holmes FA, Moy B, Iwata H, Harvey VJ, et al. Neratinib after trastuzumab-based adjuvant therapy in patients with HER2-positive breast cancer (ExteNET): a multicentre, randomised, double-blind, placebo-controlled, phase 3 trial. Lancet Oncol. 2016;17:367–77.
Li BT, Smit EF, Goto Y, Nakagawa K, Udagawa H, Mazieres J, et al. Trastuzumab deruxtecan in HER2-mutant non-small-cell lung cancer. N Engl J Med. 2022;386:241–51.
Chakravarty D, Gao J, Phillips SM, Kundra R, Zhang H, Wang J, et al. OncoKB: a precision oncology knowledge base. JCO Precis Oncol. 2017;2017:PO.17.00011.
Landrum MJ, Lee JM, Benson M, Brown GR, Chao C, Chitipiralla S, et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018;46:D1062–7.
Kaneko S, Mitsuyama T, Shiraishi K, Ikawa N, Shozu K, Dozen A, et al. Genome-wide chromatin analysis of FFPE tissues using a dual-arm robot with clinical potential. Cancers (Basel). 2021;13:2126.
Rao SS, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson JT, et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159:1665–80.
Corces MR, Granja JM, Shams S, Louie BH, Seoane JA, Zhou W, et al. The chromatin accessibility landscape of primary human cancers. Science. 2018;362:eaav1898.
Portman N, Chen J, Lim E. MDM2 as a rational target for intervention in CDK4/6 inhibitor resistant, hormone receptor positive breast cancer. Front Oncol. 2021;11:777867.
Santhana Kumar K, Brunner C, Schuster M, Kopp LL, Gries A, Yan S, et al. Discovery of a small molecule ligand of FRS2 that inhibits invasion and tumor growth. Cell Oncol (Dordr). 2023;46:331–56.
Luo LY, Kim E, Cheung HW, Weir BA, Dunn GP, Shen RR, et al. The tyrosine kinase adaptor protein FRS2 is oncogenic and amplified in high-grade serous ovarian cancer. Mol Cancer Res. 2015;13:502–9.
Zhu Y, Tian J, Peng X, Wang X, Yang N, Ying P, et al. A genetic variant conferred high expression of CAV2 promotes pancreatic cancer progression and associates with poor prognosis. Eur J Cancer. 2021;151:94–105.
Jo H, Yagishita S, Hayashi Y, Ryu S, Suzuki M, Kohsaka S, et al. Comparative study on the efficacy and exposure of molecular target agents in non-small cell lung cancer PDX models with driver genetic alterations. Mol Cancer Ther. 2022;21:359–70.
Yagishita S, Kato K, Takahashi M, Imai T, Yatabe Y, Kuwata T, et al. Characterization of the large-scale Japanese patient-derived xenograft (J-PDX) library. Cancer Sci. 2021;112:2454–66.
Ramic S, Paic F, Smajlbegovic V, PericBalja M, Hirsl L, Marton I, et al. Non-phosphorylated Tyr-1248 form of human epidermal growth factor receptor 2 (HER2) predicts resistance to trastuzumab therapy and poor disease-free survival of HER2-positive breast cancer patients. Croat Med J. 2022;63:126–40.
Ramirez RD, Sheridan S, Girard L, Sato M, Kim Y, Pollack J, et al. Immortalization of human bronchial epithelial cells in the absence of viral oncoproteins. Cancer Res. 2004;64:9027–34.
Kalita M, Tian B, Gao B, Choudhary S, Wood TG, Carmical JR, et al. Systems approaches to modeling chronic mucosal inflammation. Biomed Res Int. 2013;2013:505864.
McMillan EA, Ryu MJ, Diep CH, Mendiratta S, Clemenceau JR, Vaden RM, et al. Chemistry-first approach for nomination of personalized treatment in lung cancer. Cell. 2018;173:864–878 e829.
Choi PS, Meyerson M. Targeted genomic rearrangements using CRISPR/Cas technology. Nat Commun. 2014;5:3728.
Maddalo D, Manchado E, Concepcion CP, Bonetti C, Vidigal JA, Han YC, et al. In vivo engineering of oncogenic chromosomal rearrangements with the CRISPR/Cas9 system. Nature. 2014;516:423–7.
Li Y, Roberts ND, Wala JA, Shapira O, Schumacher SE, Kumar K, et al. Patterns of somatic structural variation in human cancer genomes. Nature. 2020;578:112–21.
Wang Y, Wang Y, Liu R, Wang C, Luo Y, Chen L, et al. CAV2 promotes the invasion and metastasis of head and neck squamous cell carcinomas by regulating S100 proteins. Cell Death Discov. 2022;8:386.
Sawey ET, Chanrion M, Cai C, Wu G, Zhang J, Zender L, et al. Identification of a therapeutic strategy targeting amplified FGF19 in liver cancer by oncogenomic screening. Cancer Cell. 2011;19:347–58.
Hajitou A, Deroanne C, Noel A, Collette J, Nusgens B, Foidart JM, et al. Progression in MCF-7 breast cancer cell tumorigenicity: compared effect of FGF-3 and FGF-4. Breast Cancer Res Treat. 2000;60:15–28.
Hirata Y, Noorani A, Song S, Wang L, Ajani JA. Early stage gastric adenocarcinoma: clinical and molecular landscapes. Nat Rev Clin Oncol. 2023;20:453–69.
Wong GS, Zhou J, Liu JB, Wu Z, Xu X, Li T, et al. Targeting wild-type KRAS-amplified gastroesophageal cancer through combined MEK and SHP2 inhibition. Nat Med. 2018;24:968–77.
Nukaga S, Yasuda H, Tsuchihara K, Hamamoto J, Masuzawa K, Kawada I, et al. Amplification of EGFR wild-type alleles in non-small cell lung cancer cells confers acquired resistance to mutation-selective EGFR tyrosine kinase inhibitors. Cancer Res. 2017;77:2078–89.
Talasila KM, Soentgerath A, Euskirchen P, Rosland GV, Wang J, Huszthy PC, et al. EGFR wild-type amplification and activation promote invasion and development of glioblastoma independent of angiogenesis. Acta Neuropathol. 2013;125:683–98.
Frankell AM, Dietzen M, Al Bakir M, Lim EL, Karasaki T, Ward S, et al. The evolution of lung cancer and impact of subclonal selection in TRACERx. Nature. 2023;616:525–33.
Al Bakir M, Huebner A, Martinez-Ruiz C, Grigoriadis K, Watkins TBK, Pich O, et al. The evolution of non-small cell lung cancer metastases in TRACERx. Nature. 2023;616:534–42.
Zack TI, Schumacher SE, Carter SL, Cherniack AD, Saksena G, Tabak B, et al. Pan-cancer patterns of somatic copy number alteration. Nat Genet. 2013;45:1134–40.
Beroukhim R, Mermel CH, Porter D, Wei G, Raychaudhuri S, Donovan J, et al. The landscape of somatic copy-number alteration across human cancers. Nature. 2010;463:899–905.
Lee JJ, Jung YL, Cheong TC, Espejo Valle-Inclan J, Chu C, Gulhan DC, et al. ERalpha-associated translocations underlie oncogene amplifications in breast cancer. Nature. 2023;618:1024–32.
Cramer P. Eukaryotic transcription turns 50. Cell. 2019;179:808–12.
Sternberg CN, Petrylak DP, Bellmunt J, Nishiyama H, Necchi A, Gurney H, et al. FORT-1: phase II/III study of rogaratinib versus chemotherapy in patients with locally advanced or metastatic urothelial carcinoma selected based on FGFR1/3 mRNA expression. J Clin Oncol. 2023;41:629–39.
Yu F, Yu C, Li F, Zuo Y, Wang Y, Yao L, et al. Wnt/beta-catenin signaling in cancers and targeted therapies. Signal Transduct Target Ther. 2021;6:307.
Llombart V, Mansour MR. Therapeutic targeting of “undruggable” MYC. EBioMedicine. 2022;75:103756.
Ciardiello D, Elez E, Tabernero J, Seoane J. Clinical development of therapies targeting TGFbeta: current knowledge and future perspectives. Ann Oncol. 2020;31:1336–49.

Acknowledgements

The authors express their deepest gratitude to the patients who donated tumor tissues for this study. We sincerely thank Shigehiro Yagishita and Akinobu Hamada for their generous provision of J-PDX tumor samples, Katsuji Takeda and Ryohei Kawanago for their invaluable assistance with code scripting, and Yoko Shimada for technical assistance. We would like to express our sincere gratitude to Masashi Sugiyama for his invaluable guidance and continuous support throughout this research. We greatly thank Drs. Shinji Kohsaka and Kazuya Takamochi for their support in this study.

Funding

This study was supported by funding from the AMED Innovative Cancer Medical Practice Research Project (Grant Number JP22ck0106643), the JST CREST (Grant Number JPMJCR1689), the JSPS Grant-in-Aid for Scientific Research on Innovative Areas (Grant Number JP18H04908), the JST AIP-PRISM (Grant Number JPMJCR18Y4), and the MEXT subsidy for the Advanced Integrated Intelligence Platform to R.H.; and by the JSPS Grant-in-Aid for Scientific Research (Grant Number JP21H03550), the Takeda Science Foundation, and the Research Grant of the Princess Takamatsu Cancer Research Fund (19–25108) to S.K.

Author information

Syuzo Kaneko, Ken Takasawa and Ken Asada contributed equally to this work.

Authors and Affiliations

Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-Ku, Tokyo, 104-0045, Japan
Syuzo Kaneko, Ken Takasawa, Ken Asada, Noriko Ikawa, Hidenori Machino, Norio Shinkai, Satoshi Takahashi, Kazuma Kobayashi, Nobuji Kouno, Amina Bolatkan, Masaaki Komatsu & Ryuji Hamamoto
Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, 103-0027, Japan
Syuzo Kaneko, Ken Takasawa, Ken Asada, Hidenori Machino, Norio Shinkai, Satoshi Takahashi, Kazuma Kobayashi, Nobuji Kouno, Amina Bolatkan, Masaaki Komatsu & Ryuji Hamamoto
Division of Genome Biology, National Cancer Center Research Institute, Tokyo, 104-0045, Japan
Kouya Shiraishi, Maiko Matsuda, Takaaki Mizuno & Takashi Kohno
Department of Proteomics, National Cancer Center Research Institute, Tokyo, 104-0045, Japan
Mari Masuda & Shungo Adachi
Endoscopy Division, National Cancer Center Hospital, Tokyo, 104-0045, Japan
Masayoshi Yamada
Department of Diagnostic Radiology, National Cancer Center Hospital, Tokyo, 104-0045, Japan
Mototaka Miyake & Hirokazu Watanabe
Department of Thoracic Oncology, National Cancer Center Hospital, Tokyo, 104-0045, Japan
Akiko Tateishi, Takaaki Mizuno, Tatsuya Yoshida, Hidehito Horinouchi & Yuichiro Ohe
Department of Experimental Therapeutics, National Cancer Center Hospital, Tokyo, 104-0045, Japan
Takaaki Mizuno
Department of Thoracic Surgery, National Cancer Center Hospital, Tokyo, 104-0045, Japan
Yu Okubo, Yukihiro Yoshida & Shun-Ichi Watanabe
Division of Medical Informatics, National Cancer Center Hospital, Tokyo, 104-0045, Japan
Masami Mukai
Department of Diagnostic Pathology, National Cancer Center Hospital, Tokyo, 104-0045, Japan
Yasushi Yatabe
Center for Cancer Research, National Cancer Institute, Bethesda, MD, 20892, USA
Vassiliki Saloura

Authors

Syuzo Kaneko
Ken Takasawa
Ken Asada
Kouya Shiraishi
Noriko Ikawa
Hidenori Machino
Norio Shinkai
Maiko Matsuda
Mari Masuda
Shungo Adachi
Satoshi Takahashi
Kazuma Kobayashi
Nobuji Kouno
Amina Bolatkan
Masaaki Komatsu
Masayoshi Yamada
Mototaka Miyake
Hirokazu Watanabe
Akiko Tateishi
Takaaki Mizuno
Yu Okubo
Masami Mukai
Tatsuya Yoshida
Yukihiro Yoshida
Hidehito Horinouchi
Shun-Ichi Watanabe
Yuichiro Ohe
Yasushi Yatabe
Vassiliki Saloura
Takashi Kohno
Ryuji Hamamoto

Contributions

SK, KT, KA, and RH designed this study. SK, KT, KA, KS, NI, HM, NS, MMatsu, MMasu, SA, ST, KK, NK, AB, MK, MY, AT, TM, YOk, MMu, TY, YYo, and HH performed data analysis. SW, YOh, YYa, MS, VS, TK, and RH supervised this study. SK wrote the manuscript, and RH edited the manuscript. All authors contributed to interpreting the data and critically revised the manuscript. All authors have read and approved the final version of the manuscript.

Corresponding authors

Correspondence to Syuzo Kaneko or Ryuji Hamamoto.

Ethics declarations

Ethics approval and consent to participate

All methods were performed in accordance with the Ethical Guidelines for Medical and Health Research Involving Human Subjects. The study was approved by the institutional review board of the National Cancer Center Japan (2005–109, 2016–496, 2019–018). In addition, this study was conducted in accordance with the Declaration of Helsinki. All patients provided written informed consent. All authors have read and approved the final version of the manuscript.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Includes Figs. S1–S23 and Tables S1–S4.

Additional file 2. Contains supplementary methods.

12943_2024_2035_MOESM3_ESM.html

Additional file 3. Contains Dataset S1, the QC reports generated by the nf-core/chipseq analysis pipeline.

Additional file 4. Provides Dataset S2 for the list of super-enhancer regions of each non-CAGAs LUAD analyzed by ROSE.

Additional file 5. Includes Dataset S3 for the list of super-enhancer regions of each CAGAs LUAD analyzed by ROSE.

12943_2024_2035_MOESM6_ESM.xlsx

Additional file 6. Includes Dataset S4 for the list of structural variant regions of each non-CAGAs LUAD analyzed by Manta.

Additional file 7. Includes Dataset S5 for the list of structural variant regions of each CAGAs LUAD analyzed by Manta.

12943_2024_2035_MOESM8_ESM.xlsx

Additional file 8. Provides Dataset S6 for the list of genome coordinates for regions where super-enhancer and structural variant overlap in non-CAGAs LUAD.

12943_2024_2035_MOESM9_ESM.xlsx

Additional file 9. Includes Dataset S7 for the list of genome coordinates for regions where super-enhancer and structural variant overlap in CAGAs LUAD.

12943_2024_2035_MOESM10_ESM.pdf

Additional file 10. Provides Dataset S8 for the results of the peak-to-gene links analysis showing a scatter plot of the top 1,000 expression values and peak values, their correlation coefficient, and the null distribution of the correlation coefficient.

Additional file 11. Contains the uncropped immunoblot images for the data shown in Fig. S16C.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Kaneko, S., Takasawa, K., Asada, K. et al. Mechanism of ERBB2 gene overexpression by the formation of super-enhancer with genomic structural abnormalities in lung adenocarcinoma without clinically actionable genetic alterations. Mol Cancer 23, 126 (2024). https://doi.org/10.1186/s12943-024-02035-6

Received: 21 August 2023
Accepted: 30 May 2024
Published: 11 June 2024
DOI: https://doi.org/10.1186/s12943-024-02035-6
