Identification of a panel of sensitive and specific DNA methylation markers for squamous cell lung cancer
Molecular Cancer volume 7, Article number: 62 (2008)
Lung cancer is the leading cause of cancer death in men and women in the United States and Western Europe. Over 160,000 Americans die of this disease every year. The five-year survival rate is 15% – significantly lower than that of other major cancers. Early detection is a key factor in increasing lung cancer patient survival. DNA hypermethylation is recognized as an important mechanism for tumor suppressor gene inactivation in cancer and could yield powerful biomarkers for early detection of lung cancer. Here we focused on developing DNA methylation markers for squamous cell carcinoma of the lung. Using the sensitive, high-throughput DNA methylation analysis technique MethyLight, we examined the methylation profile of 42 loci in a collection of 45 squamous cell lung cancer samples and adjacent non-tumor lung tissues from the same patients.
We identified 22 loci showing significantly higher DNA methylation levels in tumor tissue than adjacent non-tumor lung. Of these, eight showed highly significant hypermethylation in tumor tissue (p < 0.0001): GDNF, MTHFR, OPCML, TNFRSF25, TCF21, PAX8, PTPRN2 and PITX2. Used in combination on our specimen collection, this eight-locus panel showed 95.6% sensitivity and specificity.
We have identified 22 DNA methylation markers for squamous cell lung cancer, several of which have not previously been reported to be methylated in any type of human cancer. The top eight markers show great promise as a sensitive and specific DNA methylation marker panel for squamous cell lung cancer.
Cancer is responsible for one in four deaths in the US, making it the second most common cause of death . Lung cancer is the leading cancer killer in men and women.
Over 160,000 Americans will die of this disease in 2007. In men, lung cancer accounts for 31% of cancer deaths, killing more men than leukemia and prostate, colorectal, and pancreatic cancer combined. In women, lung cancer accounts for 27% of all cancer deaths, taking as many lives as breast and colorectal cancer combined . The overall five-year survival rate of lung cancer patients is 15%, significantly lower than that of patients with prostate cancer (99.9%), breast cancer (88.5%) or colon cancer (64.1%) . This rate increases dramatically to greater than 50% when lung cancer is diagnosed at an early stage. However, only 14–16% of cases are detected early .
In contrast to breast, colon, and prostate cancer, no routine screening method for early detection of lung cancer exists. Methods based on imaging (chest X-ray, low dose spiral computed tomography (LDSCT), autofluorescence bronchoscopy (AFB)) and sputum cytology have been tested; however, none has proven ideal. Screening via chest X-ray is not sufficiently sensitive, and trials of its use in high risk populations showed no decrease in mortality. LDSCT screening can detect a number of stage I lung cancers, with survival at 10 years reported as high as 88%. However, the possibility of lead-time bias and the high false positive rate limit the utility of this screening modality. These false positive tests frequently lead to invasive procedures to remove lesions that later prove to be benign. In addition, LDSCT appears to favor detection of peripheral lesions, being less effective at detecting small pre-invasive/micro-invasive lesions in the central airways. Its effect on reducing lung cancer mortality remains in question. Autofluorescence bronchoscopy (AFB) also has a high false positive rate [9, 10], and preferentially detects centrally located cancers. Screening by sputum cytology can detect a number of asymptomatic cases, but it has not been shown to decrease lung cancer mortality. Studies using molecular marker techniques on sputum samples appear promising.
Given the poor five-year survival rates and limitations of current screening techniques, it is clear that improved methods for early detection of lung cancer are needed. One strategy is to develop sensitive and specific molecular markers that distinguish cancer type and subtype, that are detectable in 'remote' patient media (e.g. blood, sputum) by non-invasive/minimally invasive means, and that can be assayed using a quantitative approach.
DNA methylation has emerged as a prime source of potential cancer-specific biomarkers. In cancer, despite global DNA hypomethylation, many genes become hypermethylated. Typically this occurs in CpG rich regions called CpG islands at/near gene promoters. Methylation often results in the silencing of tumor suppressor or growth regulatory genes . Such cancer-specific hypermethylation results in differential DNA methylation profiles between tumor and non-tumor tissues, which can be exploited to distinguish the two, allowing DNA methylation to serve as a cancer-specific molecular marker. Using bisulfite treatment, which embeds methylation information in the DNA sequence, coupled with a sensitive and quantitative real-time PCR-based assay (MethyLight), hypermethylated CpGs form stable, easily amplifiable, and readily available biomarkers . As no one locus can be expected to detect all cancers of a particular type, reactions for multiple loci can be easily combined into panels of markers, increasing the potential to detect lung cancer in a highly sensitive and specific manner. Because our end goal is a non-invasive lung cancer detection method using DNA methylation markers, it is worth noting that DNA hypermethylation has been detected in remote patient media such as sputum, blood  and bronchoalveolar lavage (BAL)  from lung cancer patients.
Lung cancer is divided clinically into two major subtypes – the rapidly progressing small cell lung cancer (SCLC), and the more common non-small cell lung cancer (NSCLC). As NSCLC accounts for > 85% of all lung cancer cases, and is less aggressive than SCLC, there is a greater chance for early detection, resulting in increased patient survival. NSCLC is divided into four major histological subtypes: adenocarcinoma (AD), squamous cell carcinoma (SQ), large cell carcinoma and others (carcinoids, neuroendocrine cancers, etc). A comparison of SQ and AD of the lung shows differences in DNA hypermethylation profiles [17–19], in expression of therapeutic targets , in the mutational and polymorphic spectra [21, 22] and in gene expression profiles . The region of the lung in which these tumors usually occur also differs, with AD typically located at the periphery and SQ arising near the central airways. Given the distinct nature of SQ and AD, it is to be expected that different molecular markers would need to be developed to sensitively detect these two types of lung cancer. We have recently identified a panel of DNA methylation markers for lung adenocarcinoma . Here we focus on the development of molecular markers for squamous cell lung cancer.
SQ accounts for 25 – 35% of all lung cancer cases in the United States . Our goal was to identify a panel of DNA markers that are frequently and highly methylated in SQ lung tumors when compared to non-tumor lung. Such a panel may be used for non-invasive/minimally invasive and potentially subtype-specific early detection of SQ lung cancer. We envision that in the future, detection of DNA methylation markers in remote media (blood, sputum, bronchoalveolar lavage) might complement less specific imaging-based lung cancer screening tests, and if sensitivity and specificity are high enough, might eventually be directly applied to the screening of high risk populations.
In an effort to develop sensitive and specific molecular markers for squamous cell carcinoma (SQ) of the lung, the methylation status of 42 candidate loci was examined in a collection of 45 tumors and histologically normal adjacent non-tumor lung samples from the same patients. These 42 loci were identified in a pre-screen examination of the methylation status of 304 MethyLight reactions on cell lines and a small number of tumors distinct from the ones used in this study (data not shown). As our aim was to identify novel high penetrance markers for lung SQ, many loci previously reported as methylated in NSCLC/SQ were not included in our study due to their lower methylation frequency. In five of the 42 loci (HRAS, MGMT, MTHFR, PAX8 and SLC38A4), the region examined is not in a CpG island. In our pre-screen, multiple reactions in and around the CpG islands of these loci were tested and the chosen reactions showed the highest methylation in cancer. Paired histologically normal adjacent lung tissue samples, derived from a separate non-cancer block of the lung cancer patients, were used as control samples. Thus, our control tissue matched tumor tissue fully with respect to most variables, including environmental exposures, age, gender, ethnicity and genetic background. The use of paired control tissue from lung cancer patients, which may show higher background methylation, ensures the identification of markers that are hypermethylated in a cancer-specific manner. MethyLight provides a quantitative measure for methylation at each locus; the percentage of methylated reference (PMR) value reflects the level of DNA methylation at the locus examined compared to in vitro methylated control DNA.
We observed a high methylation frequency (the fraction of samples showing any methylation) for all 42 loci in both the tumors and the adjacent non-tumor tissues taken from the same patient (Figure 1, Table 1). The DNA methylation in histologically normal adjacent non-tumor lung is likely due, on the one hand, to the sensitivity of MethyLight, and on the other, to age and/or environmental exposure, and has been observed in other studies [26–28]. We examined the statistical significance of differences in DNA methylation levels in tumor versus adjacent non-tumor tissue using the PMR as a continuous variable. Out of the 42 loci studied, 13 were previously reported to be methylated in NSCLC. Hence, a marker from these 13 was considered statistically significant if it attained the 0.05 level of significance without correction for multiple testing. A marker from the remaining 29 targets was declared statistically significant if it exceeded the 5% false-discovery rate threshold defined using the Benjamini and Hochberg  approach. Overall, twenty-five of the 42 loci examined showed a statistically significant difference (highlighted in italics in Table 1). Three markers – DIRAS3, MGMT, and HRAS – showed statistically significant hypermethylation in non-tumor tissue. The importance of this suggested loss of methylation in the tumors was not further explored here, as we are focused on identifying positive methylation markers for SQ of the lung. The phenomenon could be of interest for future studies. The remaining 22 loci were found to be statistically significantly hypermethylated in the tumors (Table 1). This is the first report of methylation in any cancer for five loci (CPVL, HOXC9, PAX8, PTPRN2, and SLC38A4), flagging these loci as potential novel cancer markers. Eight loci (GDNF, MTHFR, OPCML, TNFRSF25, TCF21, PAX8, PTPRN2, and PITX2) showed highly statistically significant differences with p-values <0.0001.
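The two-tier significance rule described above can be sketched in Python. This is a minimal illustration, not the authors' analysis code (the study used JMP); the function name `benjamini_hochberg` and all p-values shown are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the p-value passes the
    Benjamini-Hochberg step-up procedure at the given false-discovery rate."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            largest_k = rank
    # Every hypothesis ranked at or below largest_k is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_k:
            significant[idx] = True
    return significant


# Hypothetical p-values for loci with no prior report of methylation in NSCLC.
novel_p = [0.0001, 0.025, 0.012, 0.20, 0.045]
print(benjamini_hochberg(novel_p))

# Hypothetical p-values for previously reported loci, tested at an
# uncorrected 0.05 level.
known_p = [0.003, 0.048, 0.30]
print([p < 0.05 for p in known_p])
```

In this toy input, the locus with p = 0.045 would pass an uncorrected 0.05 threshold but does not pass the false-discovery rate threshold, which is the distinction the two-tier rule draws between previously reported and novel loci.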
Potential biomarkers should be effective in all patients regardless of cancer stage, age, gender or ethnicity. We examined DNA methylation levels in tumors vs. adjacent non-tumor tissue in relation to tumor stage. Because the number of cases was not very large, we grouped stage IA and IB cases together (six IA and twenty-five IB), and stages II and III (no IIA, seven IIB and five IIIA). Each of the eight highly significant loci showed higher DNA methylation levels in tumors vs. adjacent non-tumor lung in both early (stage I; n = 31, p-value range = 1 × 10⁻⁷ to 0.0041) and advanced (stage II/III; n = 12, p-value range = 6 × 10⁻⁵ to 0.0194) lung cancer patients. When analyzing each stage (IA, IB, IIB and IIIA) independently, the two most significant markers (GDNF and MTHFR) showed significantly higher DNA methylation levels in tumor vs. adjacent non-tumor in every stage, despite the modest number of cases. Comparison of DNA methylation levels for the top eight markers in early vs. advanced cancers showed no significant differences between the methylation levels in these tumors, reinforcing the idea that these markers are not stage-specific. This is important, since effective DNA methylation markers for SQ lung cancer must function on every stage of cancer, but particularly on early stage tumors.
We also examined methylation in tumors in relation to age and gender. HOXC9 showed higher levels of DNA methylation in patients under the median age (70; p = 0.021) and TCF21 showed increased DNA methylation in females (p = 0.047). However, if a multiple comparisons correction were applied, these differences would not be significant. DNA methylation of PAX8 appeared higher in males (p = 0.001; significant even with application of a multiple comparison threshold), a factor that might require consideration if it were to be developed for clinical use. As our population is primarily Caucasian, we were not able to examine DNA methylation levels in relation to ethnicity. Studies are in progress in a larger, more ethnically diverse population to examine the possible relationship of DNA methylation to ethnicity.
To provide more insight into the distribution of DNA methylation levels in the tumor and non-tumor samples, we plotted the distribution of PMR values for tumor and non-tumor tissues for the eight most highly significant loci (Figure 2). These plots illustrate differences in the nature of these markers that are not evident from the p-values. For example, GDNF appears to promise substantial specificity and sensitivity due to frequently highly elevated DNA methylation of this locus in tumor tissues. A similar pattern is seen in MTHFR, OPCML, and TNFRSF25. For TCF21, PTPRN2, and PITX2, the DNA methylation levels of tumor tissues show a wider distribution and more overlap with non-tumor samples. The PAX8 DNA methylation values were tightly clustered, and while the difference is highly statistically significant (p = 9 × 10⁻⁶), the fold-difference is small, indicating that this marker may not be as useful in the clinical setting.
The utility of clinical markers is often evaluated by generating a receiver operating characteristic (ROC) curve, in which sensitivity versus 1-specificity at all possible cutoff values is plotted. Ultimately, such ROC curves will be generated based on methylation values detected in remote media. However, here we used ROC curves based on the tumor and non-tumor PMR values to provide an early indication of the potential of the top eight loci as cancer-specific markers. The area under the curve (AUC), an indicator of marker performance, ranged from a modest 0.75 for PITX2 to a much better 0.9 for GDNF (Figure 3). The sensitivity and specificity values for each of the eight top loci were individually calculated using the present tumor collection in a five-fold cross validation (Table 2). The quantitative marker values were dichotomized at a level that would minimize the classification error. Sensitivity ranged from 58–89% and specificity from 69–100%.
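As a rough illustration of the single-marker evaluation described above, the following Python sketch computes an AUC from tumor and non-tumor PMR values and picks the cutoff that minimizes classification error. It is a simplified stand-in: the cutoff is chosen on the full toy data set rather than within a five-fold cross validation, and the function names and PMR values are hypothetical.

```python
def roc_auc(tumor, non_tumor):
    """AUC as the probability that a random tumor PMR exceeds a random
    non-tumor PMR (ties count one half)."""
    wins = 0.0
    for t in tumor:
        for n in non_tumor:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor) * len(non_tumor))


def best_cutoff(tumor, non_tumor):
    """Return (cutoff, sensitivity, specificity) for the cutoff that
    misclassifies the fewest samples. A sample is called tumor when its
    PMR is >= cutoff; the first cutoff reaching the minimum is kept."""
    candidates = sorted(set(tumor) | set(non_tumor))
    best_errors, cutoff = None, None
    for c in candidates:
        false_neg = sum(1 for t in tumor if t < c)
        false_pos = sum(1 for n in non_tumor if n >= c)
        errors = false_neg + false_pos
        if best_errors is None or errors < best_errors:
            best_errors, cutoff = errors, c
    sensitivity = sum(1 for t in tumor if t >= cutoff) / len(tumor)
    specificity = sum(1 for n in non_tumor if n < cutoff) / len(non_tumor)
    return cutoff, sensitivity, specificity


# Hypothetical PMR values for one locus in five paired samples.
tumor_pmr = [40.0, 55.0, 3.0, 70.0, 25.0]
non_tumor_pmr = [1.0, 4.0, 0.5, 2.0, 30.0]
print(roc_auc(tumor_pmr, non_tumor_pmr))
print(best_cutoff(tumor_pmr, non_tumor_pmr))
```

Several cutoffs can tie for the minimum error, as they do in this toy data; a real analysis would need an explicit tie-breaking rule and held-out data to avoid optimistic estimates.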
While measurements for several individual markers look promising, it is unrealistic to expect detection of all cases of a particular type of cancer using a single biomarker. Thus, our goal is to develop a panel of DNA methylation markers that, used in combination, can sensitively and specifically detect lung SQ. To assess the performance of combinations of our markers in the identification of tumors, we fit a random forest classifier to the data set, using 90 samples and 42 variables. Using bootstrap samples of the data, we grew a forest of 30,000 trees. Splits were determined using a random sample of five variables and trees were grown until there was only one observation in each leaf. When the 42 loci were ranked using the random forests classifier, the top four loci were the same as when the data was ranked by p-value or AUC value, and the order of the ranking is the same for these top four in all three groups (data not shown). Using all 42 loci in combination, we observed 97.7% sensitivity and 97.7% specificity. While this is encouraging, 42 loci are too many to test in a clinical setting. Trimming the panel down to just the top eight loci resulted in 95.6% sensitivity and specificity. Further restricting our analysis to the four most highly ranked loci maintained sensitivity at 95.6% while specificity dropped to 93.3%.
Thirteen of the 42 loci examined here were previously reported to be methylated in lung cancer tumor samples. Consistent with the literature, ten loci (MTHFR, OPCML, TNFRSF25, TCF21, SFRP2, SFRP1, CYP1B1, GPIBB, DLEC and ONECUT2) [17, 24, 30–39] are hypermethylated in tumor tissue in our study. Indeed, MTHFR, OPCML, TNFRSF25 and TCF21 show highly statistically significant differences (p < 1 × 10⁻⁶) between tumor and adjacent non-tumor tissues in our study. The results for three loci are in contrast with the published literature. MGMT, DIRAS3 (previously described as ARHI) and TMEFF2 (previously described as HPP1) have been reported to be hypermethylated in lung cancer [17, 18, 28, 33–36, 40–45]. We found that MGMT and DIRAS3 were statistically significantly more highly methylated in adjacent non-tumor than in SQ samples, while for TMEFF2, we observed almost no difference in methylation levels between tumor and non-tumor tissue (Table 1). The differences between our results and the published literature may be due to a variety of reasons, including technical differences (such as the use of the quantitative MethyLight versus qualitative methylation specific PCR, or the less sensitive CpG island microarrays), the sampling of a different region of the gene, differences in the lung cancer histologies studied (many studies contain a mix of NSCLC samples), and ethnic/racial differences in the patient populations studied. In the case of MGMT we sampled regions in and out of the CpG island in our pre-screen, and the region outside of the CpG island looked more promising, and was therefore tested. Thus, the primer/probe set we used differs from what has been published in the literature.
When examining the function of the 22 statistically significant potential markers for SQ, four major functional categories emerged. Eight loci encode proteins involved in signaling and growth regulation, seven loci encode transcription factors, four loci encode proteins with metabolic function, and three loci belong to no particular group (Table 3). Our strongest potential biomarkers, the eight most statistically significantly hypermethylated loci, are scattered across the first three of these groups. Because our focus is development of DNA methylation markers, our primary concern is consistent methylation of a particular locus, not whether the associated gene is actually silenced by methylation. Hence, genes in which the consistently hypermethylated locus is outside of the CpG island can serve as markers (e.g. HRAS, MGMT, MTHFR, PAX8, SLC38A4), even though the DNA methylation may not be of functional significance. While we have not determined whether the genes for our eight top markers are silenced, there is published evidence for the inactivation of some of these genes in lung cancer. For others, their expression in cancer has not yet been investigated, and might be worth examining in future, more mechanistic, studies. As six of the top eight loci show potentially functionally relevant DNA hypermethylation in tumors, we will discuss what is known about their role in cancer development.
OPCML, TNFRSF25 and TCF21 have been previously reported to be hypermethylated in lung cancer [30–32] and based on their function, methylation-induced silencing could favor tumor growth. Opioid binding protein/cell adhesion molecule (OPCML) is an opioid receptor and is involved in cell-cell adhesion. It binds opioid peptides (e.g. enkephalin) and causes apoptosis of lung cancer cell lines, indicating it functions as a tumor suppressor gene. This inhibition was reversed by nicotine , which may be of particular interest in lung cancer pathogenesis. It is of note that PENK, which encodes the precursor peptide of the OPCML ligand enkephalin, was also found to be significantly hypermethylated in tumor tissue in our studies. This might suggest methylation-induced silencing of a tumor suppressor pathway. We recently reported OPCML as highly methylated in lung adenocarcinoma,  indicating that it is a potential AD/SQ lung cancer biomarker.
Tumor necrosis factor receptor superfamily member 25 (TNFRSF25) has been shown to be methylated in bladder cancer, and very recently methylation in lung SQ was reported [31, 47]. As this receptor mediates apoptosis, methylation-induced silencing may facilitate evasion of cell death – a key step in cancer growth. The transcription factor TCF21 has been reported to be more highly methylated in lung cancer tissue than non-tumor adjacent lung, and overexpression in mouse xenografts results in a reduction in tumor size and weight . This implies a tumor suppressor function for TCF21, therefore tumor-associated promoter DNA methylation, and possibly transcriptional silencing, are not surprising.
For other genes, such as PITX2, PAX8 and PTPRN2, the biological consequences of DNA methylation remain a question. Functionally, it is unclear how PITX2 silencing would contribute to lung cancer growth. This member of the paired-like homeodomain transcription factor family functions in left-right asymmetry in development , but has no described function in adult lung. However, cancer-related methylation is reported in other tissues in which the gene has no described function, for example, in acute myeloid leukemia , breast cancer , and prostate cancer . Interestingly, higher DNA methylation levels of PITX2 are associated with greater recurrence of both breast and prostate cancer [50, 51]. Whether such a link exists in lung cancer will require further studies. Protein tyrosine phosphatase, receptor type, N polypeptide 2 (PTPRN2) is an autoantigen involved in insulin dependent diabetes mellitus . No previous reports of methylation of PTPRN2 exist, making it a potentially novel cancer biomarker.
The most intriguing of the identified loci is the top marker GDNF, encoding glial cell line-derived neurotrophic factor. GDNF has been reported to be overexpressed in lung tumor tissue  and is silent in normal adult lung . As a ligand for the RET proto-oncogene, GDNF would be a likely candidate for promoting cancer progression, and has been proposed to do so in pancreatic cancer . DNA methylation of this locus would seem contradictory. However, the high DNA methylation we report is at promoter 2 (located at the intron 1/exon 2 boundary of GDNF), a promoter that has been shown to have low activity . Indeed, in our preliminary studies, a primer designed against the primary promoter of GDNF showed no hypermethylation (data not shown). It may be possible that DNA methylation at the downstream promoter is somehow related to the transcriptional activity from the upstream promoter. Given the fact that GDNF is, to our knowledge, the strongest candidate DNA methylation marker for lung SQ identified to date, this issue would be worth investigating further.
While the top eight markers identified in this study show highly significant DNA hypermethylation in cancer, it will of course be important to validate these markers in an independent collection of samples. Such studies are in progress using a specimen collection balanced for gender and the major ethnic groups in the United States.
Our primary goal is to find sensitive and specific biomarkers for the early detection of lung cancer. Differences in the biology and treatment of different lung cancer histological subtypes warrant the development of markers for each cancer subtype. We have recently reported a panel of DNA methylation markers for lung adenocarcinoma. Here we report the identification of promising DNA methylation markers for squamous cell lung cancer. Statistical analysis of the difference in DNA methylation levels between SQ tumor and adjacent non-tumor lung tissue identified 25 statistically significant loci. Of these, three are potential negative DNA methylation markers (more methylated in adjacent non-tumor tissues), while 22 are potential positive DNA methylation markers. Of the 22 loci, we focused on those eight that were ranked most significantly hypermethylated in the cancer versus paired non-cancer samples by p-value and ROC curves. These eight loci are significantly hypermethylated in both early (stage I) and more advanced cancers. Two of those eight loci (PAX8, PTPRN2) have never been reported to be hypermethylated in human cancer specimens, and thus constitute promising new candidate cancer markers. To our knowledge, the eight-locus panel consisting of GDNF, MTHFR, OPCML, TNFRSF25, TCF21, PAX8, PTPRN2 and PITX2 constitutes the highest sensitivity and specificity DNA methylation marker panel for lung SQ reported to date. Following its validation on a separate set of tumor and non-tumor lung samples, the next step will be to examine the DNA methylation of these loci in remote media (such as blood, sputum, bronchoalveolar lavage) from lung cancer patients and control non-cancer cases. In conjunction with our work on AD lung cancer and ongoing studies of other NSCLC subtypes, we hope to develop a panel of markers for the sensitive and specific detection of non-small cell lung cancer that would also identify the histological subtype.
The further development of DNA methylation markers promises to be important not only for diagnostics, but also for prognostication, the ability to follow response to therapy, and guidance in the choice of treatment.
Tissue samples and DNA extraction
Samples were collected from the Los Angeles County Hospital archives, the Norris Comprehensive Cancer Center archives and the National Disease Research Interchange (NDRI). Study subjects included 21 males and 22 females ranging in age from 45 – 84 at time of diagnosis (median age: 70 years old). Age and gender information was missing for 2 patients. The study population was primarily Caucasian, with 35 Caucasians, 2 African Americans and race unknown for 8 patients. Information as to tumor stage was available for 43 of the 45 patients. TNM status was either listed in the pathology report, or discerned from the report using the International System for Staging Lung Cancer . This information was used to assign tumor stage. There were 6 stage IA, 25 stage IB, 7 stage IIB and 5 stage IIIA patients. Sections were cut from separate, histologically verified, tumor and adjacent non-tumor paraffin blocks. A 5 μm slide was haematoxylin & eosin (H&E) stained and coverslipped for histological confirmation of tumor histological type, and presence or absence of tumor, by an expert lung pathologist (MNK). Five adjacent 10 μm slides were cut, H&E stained, and tumor or non-tumor material was manually microdissected. DNA was extracted via proteinase K digestion . Briefly, cells were lysed in a solution containing 100 mM Tris-HCl (pH 8.0), 10 mM EDTA (pH 8.0), 1 mg/mL proteinase K, and 0.05 mg/mL tRNA and incubated at 50°C overnight. The DNA was bisulfite converted as previously described . All studies were institutionally approved by the University of Southern California Institutional Review Board (IRB# HS-016041, HS-06-00447), and the identities of patients were not made available to laboratory investigators.
DNA methylation analysis was done by MethyLight as previously described. A pre-screen methylation analysis using cell lines and five sets of paired SQ/non-tumor adjacent lung (distinct from the samples used in this study) was used to screen over 300 DNA methylation loci, and led to the identification of 42 loci of interest, which were evaluated in this study. The primer and probe sequences are described in the supplemental data [see additional file 1]. In addition to primer and probe sets designed specifically for the locus of interest, two internal reference primer and probe sets directed against collagen and ALU repeats were included in the analysis to normalize for input DNA [60, 61]. The percentage methylated reference (PMR) compares the level of methylation in the sample to in vitro methylated control DNA. It is calculated by dividing the GENE:reference ratio of a sample by the GENE:reference ratio of M. SssI-treated in vitro methylated human DNA and multiplying by 100. PMRs were individually calculated using the collagen and ALU controls and then averaged.
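The PMR calculation described above can be written out as a short Python sketch. The function and argument names are hypothetical, and the input quantities are made-up values standing in for the amounts measured for each reaction.

```python
def pmr(gene_sample, ref_sample, gene_control, ref_control):
    """PMR = 100 * (GENE:reference ratio in the sample) divided by
    (GENE:reference ratio in M. SssI-treated, in vitro methylated DNA)."""
    sample_ratio = gene_sample / ref_sample
    control_ratio = gene_control / ref_control
    return 100.0 * sample_ratio / control_ratio


def averaged_pmr(gene_sample, gene_control, references):
    """Average the PMRs obtained with each reference reaction.
    `references` maps a reference name to a (sample, control) pair of
    quantities measured for that reference reaction."""
    values = [
        pmr(gene_sample, ref_s, gene_control, ref_c)
        for ref_s, ref_c in references.values()
    ]
    return sum(values) / len(values)


# Hypothetical quantities in arbitrary units.
refs = {"collagen": (200.0, 400.0), "ALU": (500.0, 800.0)}
print(averaged_pmr(gene_sample=50.0, gene_control=400.0, references=refs))
```

Because each PMR is a ratio of ratios, the units of the underlying measurements cancel, provided the sample and control are quantified on the same scale for a given reaction.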
Using PMR as a continuous variable, methylation levels of tumor samples were compared to adjacent non-tumor lung by means of the Wilcoxon signed rank test. The large number of loci analyzed increases the potential for false discovery. To counteract this risk, a multiple comparisons threshold was set and applied to those loci for which no previous data demonstrated their methylation in SQ of the lung at the time of analysis (Table 1, last column). To examine whether tumor-specific hypermethylation was seen in early as well as later stages of SQ lung cancer, methylation levels in tumor and adjacent non-tumor tissue were compared for "early" (stages IA and IB, n = 31) and more advanced cancers (stages II and III, n = 12), as well as for each individual stage (IA, IB, IIB and IIIA) using the Wilcoxon test. The same test was applied to the comparison of methylation levels in tumor samples between the early and advanced cancers. Associations with gender and age were tested using the Wilcoxon test to compare methylation levels within the tumor sample collection only. As an indicator of the potential utility of methylation of these loci as a marker for cancer, Receiver Operating Characteristic (ROC) curves were calculated for each of our top markers, using the PMR values for the tumor and adjacent non-tumor lung specimens. All statistical tests were two-sided. Statistical tests were carried out using JMP (v 5.0.1a, SAS Institute Inc, NC).
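For readers unfamiliar with the paired comparison used above, here is a minimal Python sketch of a two-sided Wilcoxon signed rank test. It is not the JMP procedure used in the study: it assumes a normal approximation without tie or continuity corrections, and the function name and PMR values are hypothetical.

```python
import math


def wilcoxon_signed_rank(tumor, non_tumor):
    """Two-sided Wilcoxon signed rank test for paired samples.
    Returns (w_plus, w_minus, p_value), using a plain normal
    approximation (a simplification for illustration)."""
    # Paired differences; pairs with zero difference are discarded.
    diffs = [t - n for t, n in zip(tumor, non_tumor) if t != n]
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 1.0
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    # Sum the ranks of positive and negative differences separately.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return w_plus, w_minus, p_value


# Hypothetical paired PMR values (tumor vs. adjacent non-tumor).
tumor_pmr = [10.0, 20.0, 30.0, 40.0, 5.0]
non_tumor_pmr = [1.0, 2.0, 3.0, 4.0, 6.0]
print(wilcoxon_signed_rank(tumor_pmr, non_tumor_pmr))
```

With only five pairs the normal approximation is crude; it is shown here only to make the ranking of paired differences concrete.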
To determine which combinations of markers would be most effective to correctly identify tumor vs. non-tumor samples, we fit a random forest classifier to the data set, using the R programming language (v 2.5) and 90 samples and 42 variables. Using bootstrap samples of the data, we grew a forest of 30,000 trees. Splits were determined using a random sample of five variables and trees were grown until there was only one observation in each leaf. We determined error rates using the observations that were not used to generate the trees. For each observation, the outcome was predicted by majority vote of the trees that were generated without that observation in their bootstrap sample. These predicted values were compared against the true tissue type to estimate prediction error.
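The out-of-bag error estimate described above can be illustrated with a small Python sketch. It is deliberately simplified and is not the R analysis used in the study: each "tree" is a single-split stump on one randomly chosen variable, rather than a fully grown tree choosing among five variables at each split. All names and data values are hypothetical.

```python
import random


def fit_stump(values, labels):
    """Pick the threshold on one variable that misclassifies the fewest
    in-bag samples, calling a sample tumor (1) when value >= threshold.
    Returns None when no split is possible."""
    if len(set(labels)) < 2:
        return None  # bootstrap sample contains a single class
    points = sorted(set(values))
    candidates = [(a + b) / 2.0 for a, b in zip(points, points[1:])]
    best_threshold, best_errors = None, len(values) + 1
    for c in candidates:
        errors = sum(1 for v, y in zip(values, labels) if (v >= c) != (y == 1))
        if errors < best_errors:
            best_threshold, best_errors = c, errors
    return best_threshold


def oob_error(rows, labels, n_trees=500, seed=0):
    """Out-of-bag misclassification rate by majority vote."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    votes = [[0, 0] for _ in range(n)]  # votes[i][predicted class]
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feature = rng.randrange(n_features)            # random variable
        threshold = fit_stump([rows[i][feature] for i in in_bag],
                              [labels[i] for i in in_bag])
        if threshold is None:
            continue
        # Only samples left out of this bootstrap sample get a vote.
        for i in set(range(n)) - set(in_bag):
            votes[i][1 if rows[i][feature] >= threshold else 0] += 1
    scored = [(v, y) for v, y in zip(votes, labels) if sum(v) > 0]
    wrong = sum(1 for v, y in scored if (1 if v[1] > v[0] else 0) != y)
    return wrong / len(scored)


# Hypothetical PMR values: 4 tumor (1) and 4 non-tumor (0) samples, 3 loci.
rows = [[90, 85, 95], [80, 100, 88], [99, 92, 81], [85, 90, 97],
        [2, 5, 0], [10, 1, 3], [0, 8, 6], [4, 3, 9]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(oob_error(rows, labels, n_trees=200, seed=1))
```

The key point is that each sample is scored only by trees that never saw it, so the error estimate does not require a separate held-out set.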
Jemal A, Siegel R, Ward E, Murray T, Xu J, Thun MJ: Cancer statistics, 2007. CA Cancer J Clin. 2007, 57 (1): 43-66.
Kaneko M, Eguchi K, Ohmatsu H, Kakinuma R, Naruke T, Suemasu K, Moriyama N: Peripheral lung cancer: screening and detection with low-dose spiral CT versus radiography. Radiology. 1996, 201 (3): 798-802.
Gavelli G, Giampalma E: Sensitivity and specificity of chest X-ray screening for lung cancer: review article. Cancer. 2000, 89 (11 Suppl): 2453-2456. 10.1002/1097-0142(20001201)89:11+<2453::AID-CNCR21>3.0.CO;2-M
Henschke CI, Yankelevitz DF, Libby DM, Pasmantier MW, Smith JP, Miettinen OS: Survival of patients with stage I lung cancer detected on CT screening. N Engl J Med. 2006, 355 (17): 1763-1771. 10.1056/NEJMoa060476
Crestanello JA, Allen MS, Jett JR, Cassivi SD, Nichols FC, Swensen SJ, Deschamps C, Pairolero PC: Thoracic surgical operations in patients enrolled in a computed tomographic screening trial. J Thorac Cardiovasc Surg. 2004, 128 (2): 254-259. 10.1016/j.jtcvs.2004.02.017
Diederich S, Wormanns D: Impact of low-dose CT on lung cancer screening. Lung Cancer. 2004, 45 (Suppl 2): S13-19. 10.1016/j.lungcan.2004.07.997
McWilliams A, MacAulay C, Gazdar AF, Lam S: Innovative molecular and imaging approaches for the detection of lung cancer and its precursor lesions. Oncogene. 2002, 21 (45): 6949-6959. 10.1038/sj.onc.1205831
Bach PB, Jett JR, Pastorino U, Tockman MS, Swensen SJ, Begg CB: Computed tomography screening and lung cancer outcomes. JAMA. 2007, 297 (9): 953-961. 10.1001/jama.297.9.953
Haussinger K, Becker H, Stanzel F, Kreuzer A, Schmidt B, Strausz J, Cavaliere S, Herth F, Kohlhaufl M, Muller KM: Autofluorescence bronchoscopy with white light bronchoscopy compared with white light bronchoscopy alone for the detection of precancerous lesions: a European randomised controlled multicentre trial. Thorax. 2005, 60 (6): 496-503. 10.1136/thx.2005.041475
Feller-Kopman D, Lunn W, Ernst A: Autofluorescence bronchoscopy and endobronchial ultrasound: a practical review. Ann Thorac Surg. 2005, 80 (6): 2395-2401. 10.1016/j.athoracsur.2005.04.084
Bach PB, Kelley MJ, Tate RC, McCrory DC: Screening for lung cancer: a review of the current literature. Chest. 2003, 123 (1 Suppl): 72S-82S. 10.1378/chest.123.1_suppl.72S
Li R, Todd NW, Qiu Q, Fan T, Zhao RY, Rodgers WH, Fang HB, Katz RL, Stass SA, Jiang F: Genetic deletions in sputum as diagnostic markers for early detection of stage I non-small cell lung cancer. Clin Cancer Res. 2007, 13 (2): 482-487. 10.1158/1078-0432.CCR-06-1593
Laird PW, Jaenisch R: The role of DNA methylation in cancer genetics and epigenetics. Annu Rev Genet. 1996, 30: 441-464. 10.1146/annurev.genet.30.1.441
Eads CA, Danenberg KD, Kawakami K, Saltz LB, Blake C, Shibata D, Danenberg PV, Laird PW: MethyLight: a high-throughput assay to measure DNA methylation. Nucleic Acids Res. 2000, 28 (8): E32- 10.1093/nar/28.8.e32
Belinsky SA: Gene-promoter hypermethylation as a biomarker in lung cancer. Nat Rev Cancer. 2004, 4 (9): 707-717. 10.1038/nrc1432
de Fraipont F, Moro-Sibilot D, Michelland S, Brambilla E, Brambilla C, Favrot MC: Promoter methylation of genes in bronchial lavages: a marker for early diagnosis of primary and relapsing non-small cell lung cancer?. Lung Cancer. 2005, 50 (2): 199-209. 10.1016/j.lungcan.2005.05.019
Field JK, Liloglou T, Warrak S, Burger M, Becker E, Berlin K, Nimmrich I, Maier S: Methylation discriminators in NSCLC identified by a microarray based approach. Int J Oncol. 2005, 27 (1): 105-111.
Toyooka S, Toyooka KO, Maruyama R, Virmani AK, Girard L, Miyajima K, Harada K, Ariyoshi Y, Takahashi T, Sugio K: DNA methylation profiles of lung tumors. Mol Cancer Ther. 2001, 1 (1): 61-67.
Ehrich M, Field JK, Liloglou T, Xinarianos G, Oeth P, Nelson MR, Cantor CR, van den Boom D: Cytosine methylation profiles as a molecular marker in non-small cell lung cancer. Cancer Res. 2006, 66 (22): 10911-10918. 10.1158/0008-5472.CAN-06-0400
Vischioni B, Oudejans JJ, Vos W, Rodriguez JA, Giaccone G: Frequent overexpression of aurora B kinase, a novel drug target, in non-small cell lung carcinoma patients. Mol Cancer Ther. 2006, 5 (11): 2905-2913. 10.1158/1535-7163.MCT-06-0301
Tam IY, Chung LP, Suen WS, Wang E, Wong MC, Ho KK, Lam WK, Chiu SW, Girard L, Minna JD: Distinct epidermal growth factor receptor and KRAS mutation patterns in non-small cell lung cancer patients with different tobacco exposure and clinicopathologic features. Clin Cancer Res. 2006, 12 (5): 1647-1653. 10.1158/1078-0432.CCR-05-1981
Zhou W, Heist RS, Liu G, Neuberg DS, Asomaning K, Su L, Wain JC, Lynch TJ, Giovannucci E, Christiani DC: Polymorphisms of vitamin D receptor and survival in early-stage non-small cell lung cancer patients. Cancer Epidemiol Biomarkers Prev. 2006, 15 (11): 2239-2245. 10.1158/1055-9965.EPI-06-0023
Raponi M, Zhang Y, Yu J, Chen G, Lee G, Taylor JM, Macdonald J, Thomas D, Moskaluk C, Wang Y: Gene expression signatures for predicting prognosis of squamous cell and adenocarcinomas of the lung. Cancer Res. 2006, 66 (15): 7466-7472. 10.1158/0008-5472.CAN-06-1191
Tsou JA, Galler JS, Siegmund KD, Laird PW, Turla S, Cozen W, Hagen JA, Koss MN, Laird-Offringa IA: Identification of a panel of sensitive and specific DNA methylation markers for lung adenocarcinoma. Mol Cancer. 2007, 6: 70- 10.1186/1476-4598-6-70
Janssen-Heijnen ML, Coebergh JW: Trends in incidence and prognosis of the histological subtypes of lung cancer in North America, Australia, New Zealand and Europe. Lung Cancer. 2001, 31 (2–3): 123-137. 10.1016/S0169-5002(00)00197-5
Dammann R, Strunnikova M, Schagdarsurengin U, Rastetter M, Papritz M, Hattenhorst UE, Hofmann HS, Silber RE, Burdach S, Hansen G: CpG island methylation and expression of tumour-associated genes in lung carcinoma. Eur J Cancer. 2005, 41 (8): 1223-1236. 10.1016/j.ejca.2005.02.020
Kim YT, Lee SH, Sung SW, Kim JH: Can aberrant promoter hypermethylation of CpG islands predict the clinical outcome of non-small cell lung cancer after curative resection?. Ann Thorac Surg. 2005, 79 (4): 1180-1188; discussion 1180-1188. 10.1016/j.athoracsur.2004.09.060
Safar AM, Spencer H, Su X, Coffey M, Cooney CA, Ratnasinghe LD, Hutchins LF, Fan CY: Methylation profiling of archived non-small cell lung cancer: a promising prognostic system. Clin Cancer Res. 2005, 11 (12): 4400-4405. 10.1158/1078-0432.CCR-04-2378
Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc B. 1995, 57: 289-300.
Virmani AK, Tsou JA, Siegmund KD, Shen LY, Long TI, Laird PW, Gazdar AF, Laird-Offringa IA: Hierarchical clustering of lung cancer cell lines using DNA methylation markers. Cancer Epidemiol Biomarkers Prev. 2002, 11 (3): 291-297.
Nakas CT, Alonzo TA: ROC graphs for assessing the ability of a diagnostic marker to detect three disease classes with an umbrella ordering. Biometrics. 2007, 63 (2): 603-609. 10.1111/j.1541-0420.2006.00715.x
Smith LT, Lin M, Brena RM, Lang JC, Schuller DE, Otterson GA, Morrison CD, Smiraglia DJ, Plass C: Epigenetic regulation of the tumor suppressor gene TCF21 on 6q23-q24 in lung and head and neck cancer. Proc Natl Acad Sci USA. 2006, 103 (4): 982-987. 10.1073/pnas.0510171102
Zochbauer-Muller S, Fong KM, Virmani AK, Geradts J, Gazdar AF, Minna JD: Aberrant promoter methylation of multiple genes in non-small cell lung cancers. Cancer Res. 2001, 61 (1): 249-255.
Brabender J, Usadel H, Metzger R, Schneider PM, Park J, Salonga D, Tsao-Wei DD, Groshen S, Lord RV, Takebe N: Quantitative O(6)-methylguanine DNA methyltransferase methylation analysis in curatively resected non-small cell lung cancer: associations with clinical outcome. Clin Cancer Res. 2003, 9 (1): 223-227.
Gu J, Berman D, Lu C, Wistuba II, Roth JA, Frazier M, Spitz MR, Wu X: Aberrant promoter methylation profile and association with survival in patients with non-small cell lung cancer. Clin Cancer Res. 2006, 12 (24): 7329-7338. 10.1158/1078-0432.CCR-06-0894
Harden SV, Tokumaru Y, Westra WH, Goodman S, Ahrendt SA, Yang SC, Sidransky D: Gene promoter hypermethylation in tumors and lymph nodes of stage I lung cancer patients. Clin Cancer Res. 2003, 9 (4): 1370-1375.
Rauch T, Li H, Wu X, Pfeifer GP: MIRA-assisted microarray analysis, a new technology for the determination of DNA methylation patterns, identifies frequent methylation of homeodomain-containing genes in lung cancer cells. Cancer Res. 2006, 66 (16): 7939-7947. 10.1158/0008-5472.CAN-06-1888
Marsit CJ, Houseman EA, Christensen BC, Eddy K, Bueno R, Sugarbaker DJ, Nelson HH, Karagas MR, Kelsey KT: Examination of a CpG island methylator phenotype and implications of methylation profiles in solid tumors. Cancer Res. 2006, 66 (21): 10621-10629. 10.1158/0008-5472.CAN-06-1687
Dai Z, Lakshmanan RR, Zhu WG, Smiraglia DJ, Rush LJ, Fruhwald MC, Brena RM, Li B, Wright FA, Ross P: Global methylation profiling of lung cancer identifies novel methylated genes. Neoplasia. 2001, 3 (4): 314-323. 10.1038/sj.neo.7900162
Liu Y, Lan Q, Siegfried JM, Luketich JD, Keohavong P: Aberrant promoter methylation of p16 and MGMT genes in lung tumors from smoking and never-smoking lung cancer patients. Neoplasia. 2006, 8 (1): 46-51. 10.1593/neo.05586
Furonaka O, Takeshima Y, Awaya H, Kushitani K, Kohno N, Inai K: Aberrant methylation and loss of expression of O-methylguanine-DNA methyltransferase in pulmonary squamous cell carcinoma and adenocarcinoma. Pathol Int. 2005, 55 (6): 303-309. 10.1111/j.1440-1827.2005.01830.x
Guo M, House MG, Hooker C, Han Y, Heath E, Gabrielson E, Yang SC, Baylin SB, Herman JG, Brock MV: Promoter hypermethylation of resected bronchial margins: a field defect of changes?. Clin Cancer Res. 2004, 10 (15): 5131-5136. 10.1158/1078-0432.CCR-03-0763
Luo RZ, Fang X, Marquez R, Liu SY, Mills GB, Liao WS, Yu Y, Bast RC: ARHI is a Ras-related small G-protein with a novel N-terminal extension that inhibits growth of ovarian and breast cancers. Oncogene. 2003, 22 (19): 2897-2909. 10.1038/sj.onc.1206380
Suzuki M, Shigematsu H, Shames DS, Sunaga N, Takahashi T, Shivapurkar N, Iizasa T, Frenkel EP, Minna JD, Fujisawa T: DNA methylation-associated inactivation of TGFbeta-related genes DRM/Gremlin, RUNX3, and HPP1 in human cancers. Br J Cancer. 2005, 93 (9): 1029-1037. 10.1038/sj.bjc.6602837
Hanabata T, Tsukuda K, Toyooka S, Yano M, Aoe M, Nagahiro I, Sano Y, Date H, Shimizu N: DNA methylation of multiple genes and clinicopathological relationship of non-small cell lung cancers. Oncol Rep. 2004, 12 (1): 177-180.
Maneckjee R, Minna JD: Opioids induce while nicotine suppresses apoptosis in human lung cancer cells. Cell Growth Differ. 1994, 5 (10): 1033-1040.
Friedrich MG, Weisenberger DJ, Cheng JC, Chandrasoma S, Siegmund KD, Gonzalgo ML, Toma MI, Huland H, Yoo C, Tsai YC: Detection of methylated apoptosis-associated genes in urine sediments of bladder cancer patients. Clin Cancer Res. 2004, 10 (22): 7457-7465. 10.1158/1078-0432.CCR-04-0930
Blum M, Steinbeisser H, Campione M, Schweickert A: Vertebrate left-right asymmetry: old studies and new insights. Cell Mol Biol (Noisy-le-grand). 1999, 45 (5): 505-516.
Toyota M, Kopecky KJ, Toyota MO, Jair KW, Willman CL, Issa JP: Methylation profiling in acute myeloid leukemia. Blood. 2001, 97 (9): 2823-2829. 10.1182/blood.V97.9.2823
Maier S, Nimmrich I, Koenig T, Eppenberger-Castori S, Bohlmann I, Paradiso A, Spyratos F, Thomssen C, Mueller V, Nahrig J: DNA-methylation of the homeodomain transcription factor PITX2 reliably predicts risk of distant disease recurrence in tamoxifen-treated, node-negative breast cancer patients – Technical and clinical validation in a multi-centre setting in collaboration with the European organisation for research and treatment of cancer (EORTC) pathobiology group. Eur J Cancer. 2007, 43 (11): 1679-1686. 10.1016/j.ejca.2007.04.025
Hampton T: New markers may help predict prostate cancer relapse risk. JAMA. 2006, 295 (19): 2234-2238. 10.1001/jama.295.19.2234
Li Q, Borovitskaya AE, DeSilva MG, Wasserfall C, Maclaren NK, Notkins AL, Lan MS: Autoantigens in insulin-dependent diabetes mellitus: molecular cloning and characterization of human IA-2 beta. Proc Assoc Am Physicians. 1997, 109 (4): 429-439.
Garnis C, Davies JJ, Buys TP, Tsao MS, MacAulay C, Lam S, Lam WL: Chromosome 5p aberrations are early events in lung cancer: implication of glial cell line-derived neurotrophic factor in disease progression. Oncogene. 2005, 24 (30): 4806-4812. 10.1038/sj.onc.1208643
Fromont-Hankard G, Philippe-Chomette P, Delezoide AL, Nessmann C, Aigrain Y, Peuchmaur M: Glial cell-derived neurotrophic factor expression in normal human lung and congenital cystic adenomatoid malformation. Arch Pathol Lab Med. 2002, 126 (4): 432-436.
Funahashi H, Okada Y, Sawai H, Takahashi H, Matsuo Y, Takeyama H, Manabe T: The role of glial cell line-derived neurotrophic factor (GDNF) and integrins for invasion and metastasis in human pancreatic cancer cells. J Surg Oncol. 2005, 91 (1): 77-83. 10.1002/jso.20277
Grimm L, Holinski-Feder E, Teodoridis J, Scheffer B, Schindelhauer D, Meitinger T, Ueffing M: Analysis of the human GDNF gene reveals an inducible promoter, three exons, a triplet repeat within the 3'-UTR and alternative splice products. Hum Mol Genet. 1998, 7 (12): 1873-1886. 10.1093/hmg/7.12.1873
Mountain CF: The international system for staging lung cancer. Semin Surg Oncol. 2000, 18 (2): 106-115. 10.1002/(SICI)1098-2388(200003)18:2<106::AID-SSU4>3.0.CO;2-P
Laird PW, Zijderveld A, Linders K, Rudnicki MA, Jaenisch R, Berns A: Simplified mammalian DNA isolation procedure. Nucleic Acids Res. 1991, 19 (15): 4293- 10.1093/nar/19.15.4293
Weisenberger DJ, Siegmund KD, Campan M, Young J, Long TI, Faasse MA, Kang GH, Widschwendter M, Weener D, Buchanan D: CpG island methylator phenotype underlies sporadic microsatellite instability and is tightly associated with BRAF mutation in colorectal cancer. Nat Genet. 2006, 38 (7): 787-793. 10.1038/ng1834
Eads CA, Danenberg KD, Kawakami K, Saltz LB, Danenberg PV, Laird PW: CpG island hypermethylation in human colorectal tumors is not associated with DNA methyltransferase overexpression. Cancer Res. 1999, 59 (10): 2302-2306.
Weisenberger DJ, Campan M, Long TI, Kim M, Woods C, Fiala E, Ehrlich M, Laird PW: Analysis of repetitive element DNA methylation by MethyLight. Nucleic Acids Res. 2005, 33 (21): 6823-6836. 10.1093/nar/gki987
Ihaka R, Gentleman R: R: a language for data analysis and graphics. J Comput Graph Statist. 1996, 5: 299-314. 10.2307/1390807
The authors thank members of the Laird lab for help with MethyLight and probe/primer design, and Laird-Offringa lab members for critical comments on the manuscript. We thank Joe Hacia, Gyeong-Hoon Kang, Brian Pike, Jeffrey Tsou and Deborah Weener for help with primer/probe design. This project was funded by grant support for IALO: National Institutes of Health/National Cancer Institute R21 CA102247 and R01 CA119029, Whittier Foundation Translational Research Grant, a STOP Cancer award and generous support by the Kazan, McClain, Abrams, Fernandez, Lyons & Farrise Foundation and Paul and Michelle Zygielbaum. Two of the cancer samples used in this study were provided by the Norris Comprehensive Cancer Center's NIH-funded Slide Retrieval and Tissue Discard Repository. None of the funding agencies played any role in the collection, analysis, or interpretation of the data, the writing of the manuscript, or the decision to publish. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
IALO and PWL are shareholders of Epigenomics AG, which has a commercial interest in the development of DNA markers for disease detection and diagnosis. None of the work performed in the laboratories of the authors is or has been supported or directed by Epigenomics.
PPA was involved in experimental execution and extensive data analysis, drafting the manuscript, and generation of figures. JSG was involved in marker design, experimental execution and initial analysis. MNK reviewed all histological slides prior to microdissection. JAH provided samples and statistical discussions. ST helped locate and section tissues from the Los Angeles County Hospital and provided the linked and de-identified clinicopathological information. MC and DJW provided experimental advice and designed several of the MethyLight reactions used in this study. PWL provided experimental advice and discussion regarding data interpretation. KDS oversaw statistical analysis and drafted statistical sections of the manuscript. IALO designed the study, oversaw all aspects of the project, mentored PPA and JSG, and revised manuscript drafts. All authors reviewed and commented on the manuscript during its drafting and approved the final version.
Paul P Anglim and Janice S Galler contributed equally to this work.