Correlation between EGFR Gene Mutations and Lung Cancer: a Hospital-Based Study

Epidermal growth factor receptor (EGFR) is one of the targeted molecular markers in many cancers, including lung malignancies. Gefitinib and erlotinib are two available therapeutics that act as specific inhibitors of the tyrosine kinase (TK) domain. We performed a case-control study with formalin-fixed paraffin-embedded (FFPE) tissue blocks from tissue biopsies of 167 non-small cell lung carcinoma (NSCLC) patients and 167 healthy controls. The tissue biopsies were studied for mutations in exons 18-21 of the EGFR gene using PCR followed by DNA sequencing. We identified 63 mutations in 33 men and 30 women. Mutations were detected in exon 19 (delE746-A750, delE746-T751, delL747-E749, delL747-P753, delL747-T751) in 32 patients, exon 20 (S786I, T790M) in 16, and exon 21 (L858R) in 15. No mutations were observed in exon 18. The 63 patients with EGFR mutations were considered for upfront therapy with oral tyrosine kinase inhibitor (TKI) drugs and have responded well to therapy over the last 15 months. The control subjects had no mutations in any of the exons studied. The advent of EGFR TKI therapy has provided a powerful new treatment modality for patients diagnosed with NSCLC. The study emphasizes the frequency of EGFR mutations in NSCLC patients and their role as an important predictive marker for response to oral TKI in the south Indian population.


Introduction
Lung cancer is the most common cancer and the leading cause of cancer death in the world. This is also true in India, where cancer accounts for 28% of the total number of deaths. The incidence of lung cancer in Asia has become the greatest threat to human health (Varalakshmi et al., 2013; Zhao et al., 2014). The etiological factors affecting lung cancer have not been completely uncovered, but some studies propose that environmental exposure to tobacco smoke is a primary risk factor. Other known carcinogens related to lung cancer are radon, arsenic, cadmium, chromates, and asbestos (Wang et al., 2014). Lung cancer is divided into non-small-cell lung cancer (NSCLC) and small-cell lung cancer. Adenocarcinoma, squamous cell carcinoma, and small-cell carcinoma are three major forms of lung cancer. The pivotal role of genetics in cancer predisposition, especially in the most malignant lung cancers, has prompted substantial interest in recent years (Jin et al., 2014).
Mutations in the epidermal growth factor receptor (EGFR) gene are reported far more frequently in lung cancer than in any other cancer (Powrozek et al., 2014). EGFR mutation analysis is one of the best predictive markers for the use of EGFR-tyrosine kinase inhibitor (TKI) therapy in NSCLC (Thunnissen et al., 2014). EGFR mutations may also impart an improved overall prognosis for advanced NSCLC patients compared with EGFR wild-type tumors. The predictive and prognostic role of EGFR mutational status in stage I through III NSCLC remains less well defined, with conflicting results in studies reported to date. In many of these studies, the administration of neoadjuvant or adjuvant chemotherapy or TKIs clouds the interpretation of the impact of EGFR status on prognosis. Tumors with EGFR mutations show a significant and durable response to treatment with the EGFR-TKIs erlotinib or gefitinib (Izar et al., 2013). Single-nucleotide polymorphisms (SNPs) are the most common and stable markers of human genetic variation, and may be associated with the risk of a variety of cancers, including that of the lung. Deletions in exon 19 and the L858R point mutation in exon 21 occur most frequently and are associated with a response rate of approximately 70% to EGFR-TKI therapy (Thunnissen et al., 2014).
We performed this study to examine the prognostic value of EGFR mutation status in NSCLC patients in a cosmopolitan city in the South of India. We aimed to identify the impact of EGFR mutational status on the risk of lung cancer, and the NSCLC patients' clinical outcome when treated with oral EGFR-TKI. We carried out a hospital-based study in the former capital city of Hyderabad, India. We investigated the hypothesis that EGFR mutations may be an important indicator of lung cancer and may be correlated with clinical characteristics.

Selection of subjects
This is a prospective, hospital-based case-control study carried out in the Department of Molecular Biology and Cytogenetics at the Apollo Hospital in Hyderabad, in the Telangana region of India. Histological type was determined according to WHO criteria. In this study, 274 subjects were included and among them 167 patients were randomly selected from the Oncology department. This study included only patients with NSCLC, confirmed by postoperative pathology, who were receiving EGFR-TKI treatment. Normal healthy controls (n=167) were selected from the general population and included in our study. Freshly harvested NSCLC biopsies were fixed in 10% buffered formalin within 30 min and embedded in paraffin (FFPE). Immunohistochemistry (IHC) staining was performed on two cores extracted from each specimen. Written informed consent was obtained from each patient, and the ethical committee of the Apollo Hospital approved the study.

DNA isolation and genotyping of EGFR mutations
Genomic DNA was isolated from FFPE lung biopsies of NSCLC patients and control subjects as per the standard QIAamp protocol recommended by the manufacturer (QIAGEN Inc., Valencia, CA). The quality and quantity of the DNA were assessed using a spectrophotometer. The purity was determined by calculating the ratio of absorbance at 260 nm to absorbance at 280 nm (A260/A280). Non-annealed DNA should have an A260/A280 ratio of 1.7-1.9. Polymerase chain reaction (PCR) was performed in a thermal cycler for exons 18, 19, 20, and 21 with flanking intronic sequences. Specific primers were designed for the selected exons in the EGFR gene. The details of the selected exons and primer sequences are listed in Table 1. The PCR profile consisted of 35 cycles of denaturation (95°C for 5 min), annealing (52°C for 30 s), and extension (72°C for 5 min) (Khan et al., 2015). The amplified products were separated on a 2% agarose gel, stained with ethidium bromide (Cambrex, East Rutherford, NJ, USA), and visualized on a UV transilluminator (Dafco, USA). The Sanger DNA sequencing method was then used (Khan et al., 2015a).

Sanger sequencing
The purified PCR fragments were then sequenced in both the forward and reverse directions. Chain termination sequencing involves the synthesis of new strands of DNA complementary to a single-stranded template. The template DNA (~50 ng) was purified with a PCR clean-up kit and was amplified in a total reaction volume of 20 μL containing Big Dye terminator reaction buffer, forward and reverse primers (10 pmoles each), and molecular grade water. DNA sequencing was performed with an Applied Biosystems machine. The reaction was performed at 95°C for 15 s, 60°C for 15 s, and 60°C for 5 min. The amplified product was precipitated after several washes in 95% and 70% alcohol, dried in a vacuum centrifuge, resuspended in Hi-Di formamide, and loaded onto a 7200 Genetic Analyzer (Applied Biosystems, Chicago, USA) for sequencing. The resultant sequences were compared with the Cambridge sequence using Seascapes software.

Statistical analysis
Clinical characteristics of all subjects are expressed as the mean±SD. Allele and genotype frequency differences between patients and controls were tested using a chi-square test. Odds ratios (ORs) and 95% confidence intervals were calculated by binomial logistic regression for the allele, genotype, and haplotype frequencies, and the chi-square test was used to identify departures from Hardy-Weinberg equilibrium. Statistical analyses were performed with SPSS (version 19.0) software.

Baseline characteristics
A total of 167 cases and 167 controls were included in this study. All subjects were natives of Hyderabad, India. Among the case subjects, 112 (67%) were male and 55 (33%) were female. The control subjects consisted of 110 (65.9%) males and 57 (34.1%) females. The baseline characteristics are shown in Table 2. The mean age was 55.12 years in cases and 51.12 years in controls (p=0.56). In this study, 90 (53.9%) subjects were smokers and the remaining 77 (46.1%) were non-smokers. The patients who were selected for this study had NSCLC tumors (n=167), whereas the control subjects all had benign tumors (n=50).

Mutational analysis
Genotyping was performed with direct sequencing in all 167 patients and 167 controls. Sixty-three (37.7%) patients had an EGFR mutation in one of the exons tested (18-21), as determined by Sanger sequencing. Each of these mutation-positive patients had a single mutation; none had a double mutation (i.e., no more than one mutation across exons 18-21). The most frequently observed EGFR mutation was a deletion (Del 19) in the LREA region of exon 19, found in 32 (19.1%) patients. The second most common EGFR mutation, T790M in exon 20, was found in 16 (9.6%) of the NSCLC patients. T790M has been implicated in primary and secondary resistance to EGFR-TKIs. The exon 21 point mutation L858R was found in 15 (9%) patients (Table 3). No mutations were observed in exon 18 (G719). Genotyping and sequencing were also performed in the control subjects (n=167); however, we did not find any mutations in any of the tested exons (exons 18-21) (Figures 1 and 2).
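The mutation frequencies above are point estimates from 167 cases, so it can help to see how much sampling uncertainty surrounds them. The following is a minimal sketch, not part of the original analysis (which was run in SPSS): it recomputes each proportion from the counts stated in the text and attaches a 95% Wilson score interval. The function name `wilson_interval` and the dictionary labels are illustrative assumptions.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval."""
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return p, center - half, center + half


# Counts as reported in the text for the 167 NSCLC cases.
N_CASES = 167
mutation_counts = {
    "any exon 18-21": 63,
    "exon 19 deletion": 32,
    "exon 20": 16,
    "exon 21 (L858R)": 15,
}

# Illustrative summary: proportion and 95% interval for each mutation class.
results = {name: wilson_interval(k, N_CASES) for name, k in mutation_counts.items()}

for name, (p, lo, hi) in results.items():
    print(f"{name}: {100 * p:.1f}% (95% CI {100 * lo:.1f}-{100 * hi:.1f}%)")
```

The Wilson interval is chosen here only because it behaves well for moderate sample sizes; any standard binomial interval would serve the same illustrative purpose.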

Gender stratification
We compared the stratification of mutations in the selected exons based on gender and found 29.2% of the mutations were male-specific, while 54.5% of the mutations were female-specific. A total of 18.7% (n=26) mutations were found in males on exon 19, 3.5% (n=4) were in exon 20, and 7.1% (n=4) were in exon 21. In females we found 20% (n=11) of the total mutations in exon 19, 21.8% (n=12) in exon 20, and 12.7% (n=7) in exon 21. There were no mutations observed in exon 18 (Table 4).

Smoking
Furthermore, we analyzed our study with respect to patient smoking status. Our calculations indicated that 55.5% (n=50) of the total mutations were present in the tumor biopsies isolated from smokers, while 16.9% (n=13) of the mutations were observed in the non-smokers' tumor biopsies. In smokers, we found 28.9% (n=26) of the mutations in exon 19, 16.7% (n=15) in exon 20, and 10% (n=9) of the total mutations in exon 21. However, in non-smokers we found 13% (n=10) of the total mutations in exon 19, 2.6% (n=2) in exon 20, and 1.3% (n=1) in exon 21. There were no mutations in exon 18 (Table 5).
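The smoking comparison can be expressed as a 2x2 table and summarized with an odds ratio and a chi-square statistic. The sketch below is an illustration only and is not the authors' SPSS logistic regression: it uses the Woolf (log) method for the 95% confidence interval and the uncorrected Pearson chi-square with one degree of freedom. The mutation-negative counts (40 smokers, 64 non-smokers) are an assumption, derived by subtracting the reported mutation counts from the 90 smokers and 77 non-smokers given in the baseline characteristics; the function name `two_by_two_summary` is hypothetical.

```python
import math


def two_by_two_summary(a, b, c, d, z=1.96):
    """Summarize a 2x2 table.

    a, b: exposed group with / without the outcome
    c, d: unexposed group with / without the outcome
    Returns (odds_ratio, ci_low, ci_high, chi2, p_value).
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Woolf's method: standard error of log(OR).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)
    # Pearson chi-square for a 2x2 table, no continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, ci_low, ci_high, chi2, p_value


# Smokers: 50 mutation-positive of 90 (40 negative, derived by subtraction).
# Non-smokers: 13 mutation-positive of 77 (64 negative, derived by subtraction).
odds_ratio, ci_low, ci_high, chi2, p_value = two_by_two_summary(50, 40, 13, 64)

print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
print(f"chi-square = {chi2:.2f}, p = {p_value:.2g}")
```

An odds ratio above 1 with an interval excluding 1 would indicate that, in this sample, mutations were more frequent among smokers, which is the direction the text reports.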

Discussion
We scrutinized the influence of EGFR mutations on lung cancer risk by conducting a hospital-based case-control study. EGFR is one of the most highly targeted molecular markers in many cancers, including lung cancer. Genetic modifications such as deletions, insertions, and SNPs in the TK domain of EGFR are a common feature observed in most lung cancers. We studied the prevalence of EGFR mutations in NSCLC patients from samples obtained from biopsy/cytology/pleural fluid and fine needle aspiration (FNA), across South India. We screened for 13 somatic mutations that span exons 18, 19, 20, and 21 of the EGFR gene using PCR and Sanger sequencing. The underlying reason for the high somatic mutation rate noted in this study is uncertain. All of the cases included in the current study had been confirmed to have NSCLC tumors, as reported in the majority of the literature. This also may reflect the finding that cases in the current study were more selective because at the study institution a specific molecular test was often requested by clinicians in consultation with pathologists after reviewing the clinicopathologic characteristics of each individual case (Cai et al., 2013).
Gefitinib is an oral EGFR TKI that has been shown to be efficacious and well tolerated in patients with pretreated advanced NSCLC (Kris et al., 2003; Fukuoka et al., 2013). The complex relationship between EGFR-related biomarkers and response to EGFR-TKIs has been investigated extensively. EGFR mutation testing was recommended prior to systemic chemotherapy for all patients with advanced NSCLC, excluding SCC (David et al., 2006). To select the appropriate treatment regimen, rapid and accurate mutation test results are necessary in clinical practice. Although various methods are used to detect EGFR mutations, there is no universal consensus on which method is the most effective. However, studies have shown that direct DNA sequencing and TaqMan-based real-time PCR followed by pyrosequencing are the current standard for EGFR mutation detection (Ettinger et al., 2010).
Previous studies have suggested that the -216G/T (Reinersman et al., 2010; Noronha et al., 2013) and D994D (Paez et al., 2004) polymorphisms are associated with the clinical outcome of gefitinib therapy. CA-SSR in intron 1 of EGFR is the most studied polymorphism. CA-SSR has been associated with EGFR gene expression and has been reported to correlate with the clinical outcome of gefitinib therapy (Reinersman et al., 2010; Sun and Ueno et al., 2012; Noronha et al., 2013). Shorter CA repeats have been associated with higher transcription levels of EGFR and have been reported to be correlated with better clinical outcome of gefitinib therapy. Liu et al. found that the -216G/T polymorphism and CA-19 genotypes are found more frequently in patients with exon 19 deletions (Liu et al., 2011). On the other hand, Suzuki et al. reported that the EGFR protein expression level was significantly higher in the shorter CA repeats group than in the longer allele group, but its length was not associated with EGFR somatic mutations (Suzuki et al., 2008). Jou et al. (2009) revealed that the EGFR 8227G/A polymorphism was associated with lung cancer, especially in non-smoking female lung adenocarcinoma patients in the Taiwanese population (Jou et al., 2009; Shitara et al., 2012).
In our study, 37.7% of the patients were found to harbor an EGFR mutation. The previous study from India found that the mutation rate was 35% (24), which was similar to our study. It is likely that the present study and the prior Indian report overestimated the incidence of EGFR mutations because of small sample sizes and clinically selected patients. Worldwide, the incidence of EGFR mutations has been well characterized and has been reported to occur at a rate of 10-15% in North Americans and Europeans, 19% in African-Americans, and about 30% in East Asians (Paez et al., 2004;Cortes-Funes et al., 2005;Reinersman et al., 2010;Dong et al., 2012).
In this study, we found that 13 (16.9%) of the non-smokers harbored EGFR mutations, compared with 50 (55.5%) of the patients with a smoking history (Table 5). Our study thus indicated that smokers have more frequent EGFR mutations than non-smokers (Table 5). This is in contrast to the studies by Sun et al. (2012) and Noronha et al. (2013), which showed that EGFR mutations were more frequent in females than in males and in non-smokers than in smokers. The reason for these inconsistencies in our study is currently unknown; however, it may be due to our small sample size and the greater number of male smokers included in our study group. In our study, the highest frequency of mutations was seen in exon 19, followed by exon 20 and exon 21 (Table 3). Similar observations were made by Chang et al., who found more EGFR mutations in patients who are Asian, female, non-smokers, and have adenocarcinoma (Kosaka et al., 2004; Marchetti et al., 2005; Shigematsu et al., 2005; Tokumo et al., 2005; Chang et al., 2006).
In similar studies by Shigematsu et al. (2005) and Rosell et al. (2009), the most common EGFR mutations were short, in-frame deletions (most often 15 or 18 bp) in exon 19. Our study group did not show any mutations in exon 18. A similar observation was made by Noronha et al. (2013), where 74% of patients were noted to have an in-frame deletion in exon 19, while 23% had the L858R point mutation in exon 21, and only 2.5% of patients had the G719C point mutation in exon 18. There was a positive association between the number of EGFR mutations and age amongst never-smokers regardless of sex, indicating that EGFR mutations accumulate through unidentified internal/external factors other than smoking. Aging is one of the strongest, but rarely cited, risk factors for various types of cancer including lung cancer, because age could be a surrogate for the accumulation of genetic events in cancers. Smoking is inversely associated with the presence of EGFR mutations in lung cancer, but because smoking status is strongly confounded by age and sex, the sole impact of age is difficult to evaluate. Our patients with EGFR-activating mutations had a significantly better response rate, progression-free survival, and overall survival when treated with EGFR-targeted therapies. A similar result was reported by Noronha et al. (2013).
In this study, 80% of the EGFR-positive patients were treated with an oral TKI. Among the 36 patients, only 24 patients (66.6%) responded to the oral TKI therapy and survived for 15 months (Table 6). The remaining 12 patients harboring activating mutations in the EGFR tyrosine kinase domain were found to be resistant to oral TKI at their 2- to 3-month follow-up scan, and survived for 9 months. This might be due to exon 20 mutations, which are considered to be TK resistant (Wang et al., 2013). Thus, it is necessary to identify more markers for the effective prediction of patient response to EGFR TKIs, and it is also necessary to obtain biopsies of the primary tumor during the course of treatment to detect the presence of secondary mutations that could alter the patient's drug response (Pao et al., 2004). Overall, Indian patients with EGFR-activating mutations have a significantly better response rate and progression-free survival when treated with EGFR-targeted therapies.
To summarize, the advent of EGFR TKI therapy has provided a powerful new treatment modality for patients diagnosed with NSCLC. Yet, primary and acquired resistance to targeted therapy continues to be a major obstacle to satisfactory clinical outcomes. Thus, the identification of specific molecular alterations that contribute to EGFR-targeted therapy response has become critical for selecting patients for appropriate treatments. We found that among the mutation-positive cases, the deletion delE746-A750 in exon 19 and the missense mutation L858R in exon 21 were the most predominant. Therefore, these mutations may serve as prognostic indicators and can be used as biomarkers to customize treatment for a particular patient.