Mutation Detection of E6 and LCR Genes from HPV 16 Associated with Carcinogenesis


Introduction
Human papillomavirus type 16 (HPV16) is the main etiological agent of cervical cancer, which is the second most common type of cancer in women worldwide (Muñoz et al., 2003). In a study of circulating HPV genotypes in women from Córdoba, Argentina, HPV16 proved to be the most frequent genotype (Venezuela et al., 2012).
The HPV16 genome has about 8000 base pairs and includes 8 protein-coding genes (L1, L2, E1, E2, E4, E5, E6 and E7) plus a non-coding region, the Long Control Region (LCR) (Smith et al., 2011). According to the Papillomavirus Nomenclature Committee, a new HPV type is defined by a nucleotide sequence variation of more than 10% in the L1 open reading frame compared to an already known HPV type. Types differing by 2 to 10% are considered subtypes, whereas intratype variants differ by less than 2% in the L1 region (Pande et al., 2008).
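The divergence thresholds above can be illustrated with a minimal sketch. It assumes two L1 sequences that have already been aligned to the same length, counts mismatches over ungapped columns, and treats anything below the 2% subtype boundary as an intratype variant. The function names and the gap handling are illustrative choices, not part of the nomenclature rules or of the methods used in this study.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of differing positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: alignment columns containing a gap in either sequence are skipped.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable positions")
    mismatches = sum(a != b for a, b in pairs)
    return 100.0 * mismatches / len(pairs)


def classify_l1_divergence(percent: float) -> str:
    """Map an L1 divergence percentage onto the categories described in the text."""
    if percent > 10.0:
        return "new type"
    if percent >= 2.0:
        return "subtype"
    return "intratype variant"
```

For example, `classify_l1_divergence(percent_difference(l1_query, l1_reference))` would label a query whose L1 region differs from the reference by 0.5% as an intratype variant, where `l1_query` and `l1_reference` are hypothetical aligned strings.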
The genetic variability study started with the analysis of HPV16, as this is the most prevalent viral agent responsible for increased risk of developing cervical cancer. The HPV16 genome used as reference was sequenced for the first time by Seedorf (Seedorf et al., 1985). The first worldwide study of HPV16 lineages using the LCR was performed in 1993 by Ho (Ho et al., 1993), who reported 4 major variant lineages: European (E), Asian American (AA), African-1 (AF-1) and African-2 (AF-2). Smith proposed sublineages within the European lineage, differentiating European prototype (Ep) and European Asian (Ea), because of the small difference (0.5-1%) between the sequences of the European and Asian lineages. The lineage names are derived from the geographical origin of the populations in which they are most prevalent (Ho et al., 1993).
Previous studies have also shown that specific lineages may influence the persistence of HPV infection and the progression of cervical cancer precursor lesions (Londesborough et al., 1996; Zehbe et al., 1998). HPV lineages may also affect virus assembly, immunological response, pathogenicity, p53 degradation, immortalization activity and transcription regulation (Pande et al., 2008).
L1 encodes the major structural protein of the viral particle, which is expressed at the late stages of viral production (Fernandes Brenna and Syrjanen, 2003), and differences in its nucleotide sequence could be linked to changes in the host immune response (Picconi et al., 2002). E6 and E7 are the major oncoproteins and are involved in tumorigenesis. The LCR has been shown to be the most variable region of HPV16 and contains the early promoter and regulatory elements involved in viral DNA replication and transcription (Cornet et al., 2012). Several authors have evaluated the association of specific HPV16 mutations with viral persistence and the development of cervical intraepithelial neoplastic lesions (Londesborough et al., 1996; Bontkes et al., 1998; Villa et al., 2000; Cornet et al., 2013).
The aim of this study was to perform a phylogenetic analysis of HPV16 sequences in cervical lesion samples obtained from women in the city of Córdoba, Argentina, in order to determine the circulating lineages and to analyze the presence of mutations related to malignancy.

Samples
The analyzed samples came from women with different grades of cervical lesions who were examined to rule out HPV infection. A total of 432 endocervical samples were collected between August 2011 and July 2012 in Córdoba, Argentina; all of them were obtained by swab or cervical smear. All patients included in the study were adult females; those younger than 18 years of age were excluded.
This study was approved by the Institutional Committee of Ethics of Health "Oulton Romagosa" of Córdoba, Argentina, in accordance with the ethical principles stated in the Declaration of Helsinki.

DNA extraction
The samples were collected in 500 μL of phosphate-buffered saline (PBS). DNA was extracted using the commercial AxyPrep Body Fluid Viral DNA/RNA Extraction Kit (Axygen Scientific Inc., USA), in accordance with the manufacturer's instructions.
The products were detected by electrophoresis in 1.5% agarose gel using a UV transilluminator. The β-globin gene was used as a marker of DNA preservation; samples negative for β-globin were considered inadequate.
HPV-DNA positive samples were typed by Restriction Fragment Length Polymorphism (RFLP), according to Bernard et al. Briefly, aliquots of the PCR products obtained using the degenerate consensus primers MY09-MY11, which target a region of approximately 450 bp in the L1 ORF of the viral genome, were mixed with 7 different restriction enzymes (Bam HI, Dde I, Hae III, Hinf I, Pst I, Rsa I and Sau III) in separate reactions. The digestion products were separated by electrophoresis in 3% agarose gel and the pattern obtained was compared with published data.

Sequencing
PCR products were purified by gel electrophoresis using the QIAquick Gel Extraction Kit (Qiagen, Valencia, CA, US) and subjected to direct nucleotide sequencing reaction in both directions by Macrogen, Inc. (Seoul, Korea).

Sequence analysis and phylogenetic tree construction
The sequences obtained were edited and prepared with MEGA version 5 software and aligned with sequences downloaded from GenBank. The relatedness of the newly characterized sequences was assessed with the Basic Local Alignment Search Tool (BLAST), version 2.2.19.
For each dataset, the best-fit model of nucleotide substitution was selected using jModelTest v0.1.1 (Posada, 2008), according to the Akaike Information Criterion. The phylogenetic relationships of each dataset were evaluated by the Maximum Likelihood method with PhyML 3.0 (Guindon and Gascuel, 2003), setting the parameters obtained from the model selection program. Branch support was evaluated by non-parametric bootstrapping with 1000 pseudo-replicates. The trees were prepared for publication using the Dendroscope program.
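As a rough illustration of model selection by the Akaike Information Criterion, the sketch below computes AIC = 2k - 2 ln L for a set of candidate substitution models and picks the lowest score. The log-likelihood values are invented for illustration and are not results from this study; the function names are hypothetical, and this is not how jModelTest is invoked.

```python
def aic(log_likelihood: float, n_free_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_free_params - 2.0 * log_likelihood


def best_model(candidates: dict) -> tuple:
    """Return (name of the lowest-AIC model, dict of all AIC scores).

    `candidates` maps a model name to (log-likelihood, free parameters).
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical log-likelihoods; the parameter counts cover the substitution
# model only (JC: 0, HKY: 4, GTR: 8), ignoring branch lengths.
EXAMPLE_CANDIDATES = {
    "JC": (-1250.0, 0),
    "HKY": (-1230.0, 4),
    "GTR": (-1228.5, 8),
}
```

With these made-up numbers, `best_model(EXAMPLE_CANDIDATES)` selects HKY: the small likelihood gain of GTR does not offset the penalty for its four additional parameters.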
The sequences obtained in this study were deposited in the GenBank under the following accession numbers: KC291255 to KC291269 for the E6 region, KC291270 to KC291277 for the L1 region and KC291278 to KC291292 for the LCR region.

Results
One hundred and thirty-three of the 432 cervical samples (31%) tested positive for HPV; of these, 24 (18%) belonged to the HPV16 genotype. Fifteen samples were suitable for sequencing of the LCR and E6 regions, and 8 for the L1 region.
The European lineage was observed in 4 HSIL samples, 5 LSIL, 1 SCC and 3 samples without defined cytology; African-1 lineage in 1 sample without defined cytology and African-2 lineage in 1 LSIL sample (Table 1).

LCR sequence variation
The LCR is the most variable region of HPV16 and presents a high number of nucleotide variations in the binding sites of cellular and viral transcription and regulatory factors, such as E2, YY1, AP1, Tef-1, NF1 and Oct-1. The most frequently observed mutation was G7521A, found in 80% of the analyzed samples and located in one of the many YY1 binding sites. This change was detected in most of the Ep sublineage samples as well as in the African lineages, but it was not detected in any of the Ea sublineage samples. Other point mutations were also detected: G7436C, T7450C, G7462C and G7502C (Table 1).
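The mutation labels used here follow the pattern reference base, genome position, variant base. A minimal sketch of how such labels can be parsed and how a carrier percentage is computed is given below; the class and function names are hypothetical, and the counts in the usage note are derived only from the 80% figure and the 15 sequenced samples reported in the text.

```python
import re
from typing import NamedTuple


class PointMutation(NamedTuple):
    ref: str        # base in the reference sequence
    position: int   # nucleotide position in the reference genome
    alt: str        # base observed in the sample


_LABEL = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_mutation(label: str) -> PointMutation:
    """Parse a label such as 'G7521A' into its three components."""
    match = _LABEL.match(label.strip().upper())
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    ref, pos, alt = match.groups()
    return PointMutation(ref, int(pos), alt)


def carrier_percentage(n_carriers: int, n_samples: int) -> float:
    """Percentage of samples carrying a given mutation."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return 100.0 * n_carriers / n_samples
```

For instance, `parse_mutation("G7521A")` yields reference base G, position 7521 and variant base A, and `carrier_percentage(12, 15)` gives the 80% reported for this mutation, since 80% of 15 samples corresponds to 12 samples.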

E6 sequence variation
The most frequently detected mutation was T350G, in 67% of the analyzed samples; it produces an amino acid change of leucine to valine (L83V). This mutation was detected in 100% of the Ep sublineage samples but was not detected in any of the samples of the Ea sublineage.
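The link between the nucleotide label T350G and the amino acid label L83V can be sketched as follows. Two assumptions are made that are not stated in the text: codons are numbered from an E6 start codon at nucleotide 104 of the reference genome, and the reference codon is taken to be TTG for illustration (either leucine codon beginning with T, TTA or TTG, becomes a valine codon when its first base changes to G). The function names are hypothetical.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

E6_ORF_START = 104  # assumption: numbering from this start codon, not given in the text


def codon_position(nt_pos: int, orf_start: int = E6_ORF_START) -> tuple:
    """Return (1-based codon number, 0-based offset within the codon)."""
    offset = nt_pos - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the ORF start")
    return offset // 3 + 1, offset % 3


def describe_substitution(ref_codon: str, nt_pos: int, alt_base: str,
                          orf_start: int = E6_ORF_START) -> str:
    """Build an amino acid label such as 'L83V' from a single-base change."""
    codon_number, within = codon_position(nt_pos, orf_start)
    # Replace the base at the affected position of the reference codon.
    alt_codon = ref_codon[:within] + alt_base + ref_codon[within + 1:]
    return f"{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[alt_codon]}"
```

Under these assumptions, `describe_substitution("TTG", 350, "G")` places nucleotide 350 at the first position of codon 83 and translates the change as leucine to valine.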
Two other point mutations were detected, G176A and T290A; they also produce amino acid changes, D59N and C97S, respectively (Tables 1-2).

Discussion
Human papillomavirus type 16 (HPV16) is the most prevalent oncogenic HPV type worldwide, and the prevalence of its different lineages differs across geographical locations (zur Hausen, 1991; Cento et al., 2009; Cornet et al., 2013).
Recent investigations have evaluated the association of specific HPV16 lineages with viral persistence and with the development of high-grade cervical lesions (Xi et al., 1995; Londesborough et al., 1996; Xi et al., 2007; Cornet et al., 2013).
Regarding the geographical distribution of its lineages, it has been reported that the European lineage predominates in the general European and American populations; the African 1 and 2 lineages predominate in Africa; the Asian lineage (an emerging branch of the European lineage) predominates in southern areas of Asia; and the Asian-American lineage predominates in Central and South America and Europe. The geographical distribution of HPV types and their specific lineages may be influenced by several factors, including founder effects, coevolution of HPV with human beings, human migration patterns and viral fitness measures such as transmissibility (Cornet et al., 2013).
The high detection rate of the European lineage in this study agrees with previous studies performed in Argentina and other South American countries, such as those by Yamada, Picconi, Tonón and Cornet (Yamada et al., 1997; Picconi et al., 2003; Tonón et al., 2007; Cornet et al., 2013). Argentina received many European immigrants during the first decades of the 20th century; this may have led to the entrance of this variant into the country.
Even though the number of samples analyzed in this study is relatively low (15 HPV16 positive samples from 432 women), it can be considered representative of our population, since the mortality rate due to cervical cancer in Córdoba province, Argentina, is 5.7%, unlike other provinces of Argentina such as Misiones (Tonón et al., 2007), which has a mortality rate of 16.5%, or Jujuy (Picconi et al., 2002), which has a mortality rate of 13.3%, according to data provided by the National Ministry of Health in recent years.
It is important to point out that we found a correlation between the lineages detected by the analysis of the three regions; however, the percentage of the European lineage was greater in the analysis of L1 because only 8 of the HPV16 positive samples were analyzed for this region, whereas all the samples (N=15) were analyzed for LCR and E6. Even though most of the studied samples showed high homology with the European sequences, some differences were detected when classifying them by sublineage: samples 883 and 877 grouped with sublineage Ep in LCR but with Ea in E6. The opposite occurred with samples 811, 847, 926, 1091 and 1217, which grouped with sublineage Ea in LCR but with Ep in E6. These differences were not observed in the sequences that grouped with the African lineages.
In order to get a better idea of which lineages circulate in the studied population, we performed a complementary alignment of E6-LCR, which showed that most sequences grouped with the European prototype sublineage. This is, in part, also consistent with the origin of the people who migrated to Argentina during the early 20th century, mainly to the central regions of the country.
Many authors have suggested that some lineages of HPV16 show greater associations with cervical neoplasms (Londesborough et al., 1996; Bontkes et al., 1998; Zehbe et al., 1998; Villa et al., 2000; Xi et al., 2007; Schiffman et al., 2010; Smith et al., 2011; Cornet et al., 2012). This would partially explain why some HPV16 infections progress to HSIL or cancer, while others do not (Picconi et al., 2003). Improvements in DNA sequencing technology allowed a finer resolution of the different lineages, and the association of non-European lineages with high-grade lesions and cervical cancer was established (Xi et al., 2007; Smith et al., 2011). However, the underlying genetic details that make non-European lineages of HPV16 more carcinogenic are still unknown, and the genetic basis of this association deserves further investigation (Schiffman et al., 2010).
The expression of the viral genes is a complex process that involves viral and cellular transcription factors with either stimulatory or repressive effects (O'Connor et al., 1995). Deletion and point mutation experiments indicate that the activity of viral enhancers depends on several factors.
The LCR contains several binding sites for the viral transcriptional regulatory protein E2 and for cellular transcription factors, which modulate its function positively (AP1, Sp1, NF1, Oct-1, Tef-1) or negatively (YY1) (O'Connor et al., 1995; Bernard, 2006). The transcription factor YY1 is a repressor responsible for silencing transcription from the E6 promoter. The mutation that leads to a loss of this silencer was detected in patients with cancer, but was not found in asymptomatic carriers. This would explain why, although HPV infection is the most frequent sexually transmitted disease, only a fraction of the women with HPV16 positive cervical precursor lesions progress to cancer.
The analysis of the nucleotide changes found in cervical cancer isolates showed that the changes were mainly located within or close to YY1 binding sites, while in asymptomatic carriers the changes were spread across a wide variety of transcription factor binding sites (Schmidt et al., 2001). The mutation most frequently observed in LCR was G7521A, in a binding site of the transcription factor YY1 (Pande et al., 2008). In this study, in accordance with other authors (Yamada et al., 1997; Schmidt et al., 2001; Kammer et al., 2002; Pande et al., 2008; Shang et al., 2011), the mutation most frequently detected in the LCR was G7521A, found in the African lineages and in most European ones. Thus, only 20% of the analyzed samples matched the prototype sequence at this position. In this analysis, the samples that presented the G7521A mutation belonged mostly to low-grade squamous intraepithelial lesions (LSIL); this is an interesting and concerning finding, since it suggests that these patients might be at higher risk of developing more severe lesions.
Mutations in the coding genes can change the amino acid composition, which in turn can alter the tertiary structure of the proteins and their functions. In vitro studies of different HPV16 mutations have suggested some differences in the biological activity of the proteins. The T350G change is the most frequent mutation in E6 and a common polymorphism within the European (E) lineage (Cornet et al., 2013); it produces a replacement of leucine by valine, named L83V (Zuna et al., 2011). Amino acid 83 is located in the center of the conserved amino acid motif S-L83V-YG, which is identical to that of genital high-risk oncogenic HPV types but different from that of the low-risk types. The E6 protein has transforming activity, which may be affected by amino acid changes in its functional sites (Andersson et al., 2000). E6 variants may differ in their immunogenicity through the generation of different peptides for presentation by polymorphic HLA molecules to specific T cells (Zehbe et al., 2001). Several studies have reported that the T350G mutation is associated with an increased risk of developing persistent infections or with progression of LSIL to HSIL (Cornet et al., 2013); however, other studies have found opposite results (Zehbe et al., 1998). One possible hypothesis is that this phenomenon may be associated with the host cellular immune response: different haplotypes of the major histocompatibility complex may recognize E6 variants with different efficiencies (Zehbe et al., 1998).
The most frequently detected nucleotide variation in the E6 sequences of this study was the change at position 350 (T350G, L83V), which was found in most samples of the European lineage. These results agree with previous studies (Brady et al., 1990; Ho et al., 1991; Andersson et al., 2000; Kammer et al., 2002; Sichero et al., 2012), which have described this change as the most frequent one in the E6 sequence of the E lineage (Cornet et al., 2013). Notably, we detected this mutation in all of the Ep sublineage samples, but in none of the Ea ones. This result is in accordance with the findings reported by Pande (Pande et al., 2008).
Tonón reported this mutation in 32% of their Ep sublineage samples, whereas we detected it in 100% of our samples belonging to this sublineage. We consider that this difference could be due to the fact that Tonón et al. studied Guarani Indian women. The Guarani Indians number no more than 1700 people, who inhabit the rainforest of Misiones province (Northeast region of Argentina) and are concentrated in more than 20 small communities (Tonón et al., 2007).
In contrast to information published by other authors (Zehbe et al., 1998; Xi et al., 2007), most of the samples in which this change was observed belonged to patients with low-grade intraepithelial lesions. This could be because, regardless of other risk factors, the samples were obtained before the lesions progressed to malignant processes; access to appropriate health care affects not only the likelihood of early detection but also the chances of lesion progression. For these patients, access to routine gynecological tests allows timely therapeutic decisions.
Nucleotide variations in the L1 region are linked to changes in the host immune response: nucleotide changes can affect the efficiency with which the L1 protein self-assembles, and this variability can lead to conformational changes within the relevant neutralization epitopes (Pande et al., 2008).
Unlike the findings for LCR and E6, in the sequence analysis of the L1 gene the most frequent mutation observed in previous studies, T6862C (Pande et al., 2008), was not detected in our study. However, other previously unreported mutations were found (T6651A, A6871T, G6891T, C6908T, A6970T, G6994A and A6999T); they produced amino acid substitutions, which could affect the structure or function of the L1 protein and thereby play an important role in the immune response. Table 1 shows that 9 samples (801, 811, 847, 926, 954, 1087, 1091, 1170 and 1217) presented the two mutations most frequently associated with progression of the lesions (G7521A in LCR and T350G in E6). This could indicate that this variant is the most widely circulating one in Córdoba, Argentina, which constitutes a warning for the public health system.
In this study, we detected two other nucleotide changes in the E6 sequences: G176A and T290A. Both resulted in amino acid changes; importantly, the G176A variation replaced a negatively charged amino acid with a polar one, which could modify the biological activity of the protein (Table 2).
Four other nucleotide changes were detected in the LCR sequences, all located close to transcription factor binding sites. G7436C is located near the YY1 binding site and T7450C is located close to the binding site of the viral transcriptional regulatory protein E2. G7462C changes were detected close to the Tef-1 binding site and G7502C. All these changes, both in LCR and in E6, had not been previously reported by other authors. As observed, in the LCR sequences almost 95% of the nucleotide changes found were located inside or close to transcription factor binding sites. Five of the studied samples contained 2 of these specific changes at different sites, which could affect the regulation of HPV transcription (positively or negatively) and might trigger the progression of the lesions.
These results are the first contribution to the molecular epidemiology of HPV16 in patients from Córdoba province, Argentina, and indicate the importance of detecting and studying circulating lineages and of analyzing nucleotide changes. This is significant not only in terms of molecular epidemiology, but also for the impact of these variations on public health when the results are correlated with the progression of the disease towards malignant processes.
DOI: http://dx.doi.org/10.7314/APJCP.2015.16.3.1151