Prediction and Analysis of Breast Cancer Related Deleterious Non-Synonymous Single Nucleotide Polymorphisms in the PTEN Gene

C Kumaraswamy Naidu, Y Suneetha*

Introduction
Breast cancer is one of the most common cancers affecting women today. In countries such as the United States, one in 8 women develops breast cancer in her lifetime, and incidence rates are increasing slightly among African American women (De Santis et al., 2014). Along with colorectal cancer, it ranks high in all countries. Its incidence in the United States and Europe is twice as high as in Asian countries, and incidence rates have been increasing in all countries (Saika and Sobue, 2013). Tumor heterogeneity and lifestyle factors including obesity, breastfeeding, and alcohol consumption are among the traditional risk factors associated with breast cancer (Kwan et al., 2009). Genetic and hormonal factors also contribute to breast cancer risk (Martin and Weber, 2000).
Phosphatase and tensin homolog deleted on chromosome ten (PTEN) is one of the most frequently mutated genes in many primary and metastatic malignancies, including breast cancer (Kechagioglou et al., 2014). In human estrogen receptor-positive (ER+) breast cancer cell lines, inducible PTEN short hairpin RNAs result in hyperactivation of the PI3K pathway and a concomitant change in gene expression similar to that of luminal B breast cancers (Maggi and Weber, 2015). In breast cancer cells, reduced expression of PTEN is known to confer susceptibility to inhibitors of the PI3 kinase/Akt pathway (De Graffenried et al., 2004). Triple-negative breast cancers are an aggressive form of the disease; CIB1 plays a broad role in their cell survival and tumor growth, and low expression of PTEN is a key predictor of sensitivity to CIB1 depletion (Black et al., 2015). Treatment with trastuzumab to improve disease-free and overall survival is a standard approach for HER2-overexpressing breast cancer patients, and PTEN status has been suggested as one of the indicators (Adamczyk et al., 2015).
Variations in the promoter region of PTEN are known to affect the progression of breast cancer and the survival of patients (Heikkinen et al., 2011). Several single nucleotide polymorphisms, such as rs1234212, rs11202586, rs1234221, rs1903860, rs1234220, rs1234219, rs1903858, rs2299939, rs1234224, rs1234223, rs1234213 and rs2673832, with a PTEN haplotype associated with breast cancer risk were previously predicted (Haiman et al., 2006). A previous study showed that a high level of discordance in PTEN level, PIK3CA mutations and receptor status between primary tumors and metastases influenced patient selection and response to PI3K-targeted therapies (Gonzalez-Angulo et al., 2011). An in vivo study showed that a noncatalytic PTEN missense mutation predisposes to organ-selective cancer development (Caserta et al., 2015). In the present study, we aim to predict the breast cancer-associated nSNPs in PTEN and to further analyze their impact on the PTEN protein product.

Materials and Methods
SNP datasets used for the study
SNP datasets for PTEN were retrieved from the dbSNP database (http://www.ncbi.nlm.nih.gov/projects/SNP/, Build 138; access date: August 22, 2015) (Sherry et al., 2001).

Prediction of Deleterious SNPs
The SIFT and PolyPhen-2 servers were used to separate the deleterious coding nSNPs from the other SNPs of PTEN. 'Sorting Tolerant From Intolerant' (SIFT) (http://sift.jcvi.org/) uses a sequence homology-based approach to predict whether an amino acid substitution affects protein function (Kumar et al., 2009). It assigns scores of 0-0.05 to intolerant (deleterious) amino acid substitutions and scores of 0.05-1 to tolerant (neutral) substitutions (Ng and Henikoff, 2003; Ng and Henikoff, 2006). PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), on the other hand, predicts the functional significance of a variation using a naïve Bayes classifier. WHESS.db, a quick-access set of precomputed PolyPhen-2 predictions, was used for our analysis (Adzhubei et al., 2010). We submitted our query in the

Phenotype of predicted deleterious coding nSNPs
Phenotype information for the breast cancer SNPs was searched in the databases SNPedia (Cariaso and Lennon, 2012), (Schaefer et al., 2012), HapMap (International HapMap et al., 2010) and PubMed (Sood and Ghosh, 2006). The SNPs with a breast cancer phenotype were further cross-checked for deleterious nature using the PROVEAN software (Choi et al., 2012).

Modeling nSNPs locations in protein structure
The crystal structure of PTEN downloaded from the Protein Data Bank (PDB ID: 1d5r, chain A) (Lee et al., 1999) was used for modelling the nSNPs in the protein structure. All water molecules and the TLA ligand were removed from the crystal structure, and the mutants (MTs) G129E, R130Q and D107N were created by replacing the wild-type (WT) residue with its polymorphic residue using PyMOL (PyMol, 2006). The mutants were optimized and energy-minimized on the Nomad-Ref server (Lindahl et al., 2006) using the conjugate gradient method.
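
The structure clean-up step described above can be sketched in plain Python. This is only an illustrative stand-in for the PyMOL operation, not the procedure actually used in the study; the function name `strip_heteroatoms` and its parameters are hypothetical, and the column positions follow the fixed-width PDB coordinate record layout.

```python
def strip_heteroatoms(pdb_lines, chain="A", remove_resnames=("HOH", "TLA")):
    """Keep coordinate records of one chain, dropping waters and named ligands.

    Hypothetical helper: it mirrors the clean-up described in the text
    (chain A only, no water, no TLA) but is not the authors' PyMOL workflow.
    """
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue  # skip headers, remarks, connectivity records, etc.
        resname = line[17:20].strip()  # residue name, PDB columns 18-20
        chain_id = line[21:22]         # chain identifier, PDB column 22
        if chain_id != chain or resname in remove_resnames:
            continue
        kept.append(line)
    return kept
```

Given a locally downloaded coordinate file (for example a file named 1d5r.pdb, an assumed path), the function would be applied to the list of its lines and the result written out before building the mutants.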

Analysis of the impact of mutant on the PTEN protein product
The effect of the amino acid changes on the stability of the PTEN protein was analyzed using the MUpro web server (http://mupro.proteomics.ics.uci.edu/) (Cheng et al., 2006). The thermodynamic stability of the mutants was analyzed using the PoPMuSiC server (Dehouck et al., 2009). The solvent accessibility of the WT and MTs was analyzed using the GETAREA server (http://curie.utmb.edu/getarea.html) (Fraczkiewicz and Braun, 1998), which considers a residue solvent-exposed if its ratio value exceeds 50% and buried if the ratio is less than 20%; these are marked as "o" and "i" respectively.
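
The exposed/buried rule stated above can be written as a short function. This is a minimal sketch of the thresholds as described in the text; the function name `classify_exposure` is hypothetical, and returning an empty string for intermediate ratios is an assumption about how unmarked residues are represented.

```python
def classify_exposure(ratio_percent):
    """Label a residue from its side-chain accessibility ratio (in percent).

    "o" = solvent-exposed (ratio above 50%), "i" = buried (ratio below 20%).
    Ratios in between carry no label; the empty string is an assumed placeholder.
    """
    if ratio_percent > 50.0:
        return "o"
    if ratio_percent < 20.0:
        return "i"
    return ""
```

Comparing the label of a residue in the WT with its label in each MT gives a quick view of whether a substitution moves the position between the exposed and buried classes.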

nSNPs from dbSNP database
A search for PTEN in the dbSNP database returned a total of 18242 SNPs, of which 5597 were Human (active) SNPs (i.e., Active Human RS, not including those that have been merged). Among the 5597 Human (active) SNPs, 247 were coding non-synonymous SNPs (nSNPs), 83 were coding synonymous, 358 occurred in the mRNA 3' UTR, 451 occurred in the mRNA 5' UTR and 4851 occurred in intronic regions. As shown in Fig. 1, the vast majority of SNPs occur in intronic regions (86.6%); nSNPs (4.4%) outnumber synonymous SNPs (1.4%) but are less frequent than SNPs in the mRNA 3' UTR (6.3%) and 5' UTR (8%) regions. We selected the coding nSNPs for our investigation.
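
The percentages quoted above can be recomputed from the reported counts. The sketch below assumes that each percentage is taken relative to the 5597 active human SNPs and truncated, not rounded, to one decimal place, which is what reproduces the quoted figures; the variable and function names are illustrative only.

```python
import math

TOTAL_ACTIVE = 5597  # Human (active) SNPs reported for PTEN

# Counts per functional class, as reported in the text.
COUNTS = {
    "coding non-synonymous": 247,
    "coding synonymous": 83,
    "mRNA 3' UTR": 358,
    "mRNA 5' UTR": 451,
    "intronic": 4851,
}

def percent_of_total(count, total=TOTAL_ACTIVE):
    """Share of the active SNPs falling in one class, in percent."""
    return 100.0 * count / total

def truncate_1dp(value):
    """Truncate (not round) to one decimal place; an assumed convention."""
    return math.floor(value * 10) / 10

for name, count in COUNTS.items():
    print(f"{name}: {truncate_1dp(percent_of_total(count))}%")
```

The class counts add up to 5990, which is more than 5597; a plausible reading, stated here as an assumption, is that a single SNP can carry more than one functional annotation.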

Deleterious nSNPs in PTEN gene
Among the 247 coding nSNPs from dbSNP, 18 were found to be deleterious by the SIFT server, with a tolerance index score of less than or equal to 0.05. Of these 18 deleterious SNPs, 13 had a highly deleterious tolerance index score of 0.00 using orthologues and homologues in the protein alignment, and the remaining 5 had tolerance index scores of 0.04, 0.05, 0.08, 0.01 and 0.05 respectively (Table 1). Among the 18 nSNPs predicted to be deleterious by SIFT, seven showed a nucleotide change of A/G, three showed a change of C/T, two showed a change of C/G, one showed a change of A/T, one showed a change of G/T, two showed a change of A/C/G, one showed a change of C/G/T and one showed a change of A/G/T. A/G and C/T changes occurred most frequently.
The 18 nSNPs predicted to be deleterious by SIFT were submitted to PolyPhen-2 to predict the functional significance of each allele replacement. All 18 nSNPs were found to be possibly damaging or probably damaging by both the HumDiv and HumVar predictions (Table 2).
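
The SIFT cutoff used in this section can be illustrated with a small helper. This is a sketch under stated assumptions: the function name `sift_call` is hypothetical, and a score of exactly 0.05 is treated as deleterious, following the "less than or equal to 0.05" wording of the Results. The five non-zero scores are copied as they appear in the text.

```python
SIFT_CUTOFF = 0.05  # tolerance index threshold quoted in the text

def sift_call(score, cutoff=SIFT_CUTOFF):
    """Return 'deleterious' for scores at or below the cutoff, else 'tolerated'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT tolerance index scores lie between 0 and 1")
    return "deleterious" if score <= cutoff else "tolerated"

# Non-zero tolerance index scores as listed in the text (see Table 1).
nonzero_scores = [0.04, 0.05, 0.08, 0.01, 0.05]
calls = [sift_call(s) for s in nonzero_scores]
```

Under this rule the listed score of 0.08 falls on the tolerated side of the cutoff; the values are reproduced here exactly as they appear in the text, and Table 1 remains the reference for the individual scores.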

Phenotype prediction of deleterious nSNPs
The 18 nSNPs predicted to be deleterious or damaging by SIFT and PolyPhen-2 were subjected to phenotype prediction. Among them, three SNPs, rs121909218 (G129E), rs121909229 (R130Q) and rs57374291 (D107N), showed a phenotype in breast tumors (Table 3). The PROVEAN server also predicted these SNPs to be deleterious (Table 4), so they were considered for further analysis.

Deleterious nSNPs impact on PTEN protein
To analyze the impact of the deleterious nSNPs on the PTEN protein product, we analyzed the stability of each mutant. Stability analysis with the MUpro server showed that the mutants R130Q and D107N decreased stability, whereas the mutant G129E increased it (Table 5). The respective changes in solvent accessibility and free energy are given in Table 6. The total area/energy of each mutant showed that G129E and D107N changed markedly compared with the wild-type PTEN protein total area/energy (15517.79) (Table 7).
In conclusion, the results of our study indicate that three mutations in PTEN, R130Q, D107N and G129E, are associated with the breast cancer phenotype, and that all three alter the stability of the protein. Overall, the computational approach reported in this study allowed elucidation of the role of deleterious mutations in PTEN, thereby providing useful information for the design of PTEN mutant-based therapeutic strategies against breast cancer.

References
Adamczyk A, Niemiec J, Janecka A, et al (2015). Prognostic value of PIK3CA mutation status, PTEN and androgen receptor expression for metastasis-free survival in HER2-positive breast cancer patients treated with trastuzumab in