RESEARCH ARTICLE Prevalence of Human Papillomavirus Types and Phylogenetic Analysis of HPV-16 L1 Variants from Southern India

Background: Human papillomavirus (HPV) and its variants show wide geographical distribution and have been reported to cause cervical lesions. With cervical neoplasia the leading cancer in Indian women, the aim of the present study was to evaluate HPV type distribution, multiple infections and variant genotypes in cervical samples from the coastal Karnataka region, India. Materials and Methods: A total of 212 samples were screened by nested polymerase chain reaction using PGMY9/11 and GP5+/6+ primers. HPV positive samples were sequenced to identify the types, and a phylogenetic tree was constructed using the neighbor-joining method. Results: Sequence analysis identified a total of 14 HPV types, distributed in 20%, 73.3% and 82.5% of non-malignant, pre-malignant [low grade squamous intraepithelial lesion (LSIL) and high grade squamous intraepithelial lesion (HSIL)] and cervical cancer samples, respectively. The distribution of high risk HPV in cancer samples was HPV16, 76.4%; HPV18, 11.7%; HPV81, 2.9%; HPV31, 1.4%; HPV35, 1.4%; and HPV45, 1.4%. Multiple infections were observed in 11.8% of tumor samples, with HPV16 contributing to 62.5% of these cases. Of the non-malignant samples, 20% were HPV positive, with HPV16 (82.3%), HPV33 (5.8%) and HPV58 (5.8%) detected and a very low incidence of multiple infections. Comparative phylogenetic analysis of HPV variants identified 9 HPV sequences as new papillomavirus species, with variants predominantly classified as European lineage. Conclusions: The findings on HPV infections associated with progression of cervical cancer in the coastal Karnataka region, together with the HPV variant analysis, provide baseline data for prevention and HPV vaccination programs.

To date, more than 100 types of HPV have been identified and classified as high risk (HR), intermediate risk (IR) and low risk (LR) types. In February 2009, the "IARC monographs on the evaluation of carcinogenic risks to Human beings" meeting identified HPV 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58 and 59 as high risk (HR-HPVs). Worldwide, HPV-16 and 18 account for over 70% of cervical cancer cases, and other HR group viruses for about an additional 20% (de Sanjose et al., 2010). Although it has been widely reported that nearly all cervical cancer cases harbor HR and IR HPV, recent studies have shown low prevalence of HPV in different parts of the world (Khorasanizadeh et al., 2012; Tabone et al., 2012; Alsbeih et al., 2013; Ribeiro et al., 2014). In India, the prevalence of HPV 16 and 18 in normal cytology, low grade lesions, high grade lesions and cervical cancer is 6.0, 29.4, 56.0 and 82.5%, respectively (http://www.hpvcentre.net/statistics/reports/IND_FS.pdf). HPV variants are classified taxonomically, based on L1 DNA sequence typing, into six major phylogenetic branches: North-American (NA1), Asian-American (AA), Asian (As), European (E), African-1 (Af1) and African-2 (Af2). Some studies have suggested that HPV variants can influence viral persistence and the development of cervical cancer (Cornet et al., 2012). However, little information is available on multiple HPV infections, the identity of the types involved, and the role of HPV sub-type variants in viral persistence and cervical cancer. Since HPV vaccines are type specific and HPV types show geographical variation in their distribution among malignant and non-malignant subjects, it is important to identify the HPV types in various regions. Moreover, the role of HPV subtype L1 variants in determining the effectiveness of the vaccine remains to be understood, as the L1 capsid is one of the targets of neutralizing antibodies.
The aim of the study was to report HPV prevalence and type distribution and to identify the predominant HPV 16 variants in rural women who visit secondary and tertiary care centers in the coastal part of Karnataka, India.

Sample collection
A total of 212 samples, comprising 85 non-malignant, 45 pre-malignant [17 LSIL and 28 HSIL] and 82 malignant samples, were used in the study. The study was approved by the Kasturba Hospital ethical committee of Manipal University, and samples were collected after obtaining written informed consent from the participants. Non-malignant, LSIL, HSIL and malignant samples were collected using cervical brush and punch biopsy respectively. Clinically diagnosed and histopathologically confirmed cases were subsequently used for HPV typing. Non-malignant cervical samples, which acted as controls, were age-matched samples from women who were undergoing (i) hysterectomy for reasons other than cervical cancer, such as cervicitis and fibroids, or (ii) regular cervical screening. The cervical cancer cell lines SiHa, CaSki and HeLa were used as positive controls in the study.

DNA extraction and detection of HPV by polymerase chain reaction (PCR) and sequencing
The biopsy samples were cut into small pieces and digested with proteinase-K (10 mg/mL), and DNA was extracted by the standard phenol-chloroform method. HPV genotyping was performed by nested polymerase chain reaction (PCR) using PGMY09/11 and GP5+/GP6+ primers, with beta globin as internal control (Gravitt et al., 2000; Evans et al., 2005). Samples showing amplification of a 150 bp product were identified as HPV positive; the product was purified from the agarose gel and sequenced using the Big Dye terminator kit (ABI, USA) in a Genetic Analyzer 3130XL (ABI, USA) according to the manufacturer's instructions. The HPV strain was identified using an NCBI BLAST search. Samples were scored as HPV negative only after at least two independent tests.

Statistical analysis
Microsoft Office Excel (Microsoft, USA) and MedCalc (http://www.medcalc.org/calc/odds_ratio.php) were used for statistical analysis. HPV prevalence and type distribution were compared using the two-sided χ² test and Fisher's exact test. A P value < 0.05 was considered statistically significant. Statistical comparisons were performed between non-malignant vs pre-malignant (LSIL and HSIL) and non-malignant vs malignant cervical samples.
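The comparison described above can be illustrated with a minimal sketch in Python. It is not the software used in the study (Excel and MedCalc were); it assumes a 2x2 table of HPV status by sample group, computes a two-sided Fisher exact P value by summing the hypergeometric probabilities of all tables no more likely than the observed one, and reports an odds ratio with a Wald 95% confidence interval. The function names and the counts are hypothetical and chosen only for illustration.

```python
from math import comb, exp, log, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x positives in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p_sum = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_sum)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald 95% confidence interval on the log scale."""
    ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return ratio, exp(log(ratio) - z * se), exp(log(ratio) + z * se)


# Hypothetical counts, for illustration only (not the study's data):
# rows = malignant / non-malignant, columns = HPV positive / HPV negative.
p_value = fisher_exact_two_sided(40, 10, 10, 40)
odds, ci_low, ci_high = odds_ratio_ci(40, 10, 10, 40)
print(f"p = {p_value:.3g}, OR = {odds:.1f} (95% CI {ci_low:.1f}-{ci_high:.1f})")
```

The Wald interval requires all four cells to be non-zero; a table with an empty cell would need a continuity correction, which this sketch does not apply.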
In order to identify sequence variation within the HPV genome, the 150 bp of the L1 region (6614-6788) from HPV16 positive samples were aligned with NCBI database accession number NC_001526.2 as the reference sequence (Figure 1B-E). Numerous uniform deletions and insertions were identified (Figure 1B-E). When the individual samples were studied for their sequence alterations, a number of tumor specific changes were observed, reflecting the emergence of aberrant clones. The majority of the sequence alterations were of either the G to A or the A to G type. Among the nucleotide alterations, deletions and insertions were common at nucleotide positions 6695, 6722, 6733, 6736, 6737, 6738 and 6744, while SNPs were frequently observed at nucleotide positions 6722, 6741, 6742, 6743, 6759 and 6760. The nucleotide changes at 6726, 6729, 6730, 6732, 6738, 6739, 6741, 6742, 6744, 6759 and 6760 were common to single infection and multiple infection cases. Phylogenetic analysis was performed for all the HPV sequences and for the HPV 16 sequences separately (Figure 2). Results showed 3 main clusters, namely group I (47.5%), group II (33.75%) and group III (47.5%) (Figure 2A); phylogenetic analysis of only the HPV 16 samples identified 3 groups (group I, group II and group III) with 48.5%, 29% and 18.9% of samples, respectively. Moreover, 48% of the aligned HPV16 variant sequences belonged to the European lineage. A PaVE database search identified 9 of the HPV 16 sequences as new papillomavirus species (Table 3).
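How SNPs, insertions and deletions can be read off a pairwise alignment and mapped to reference coordinates is sketched below in Python. This is an illustrative assumption about the bookkeeping, not the procedure used in the study: the function name `list_variants` is hypothetical, the input is a pre-computed alignment with `-` marking gaps, and the toy bases are invented rather than taken from NC_001526.2. Only the starting coordinate (6614) comes from the text.

```python
def list_variants(ref_aligned, sample_aligned, ref_start):
    """Return (position, kind, change) tuples for a pairwise alignment.

    Both strings must be the same length, with '-' marking gaps.
    Positions are counted on the reference, starting at ref_start.
    """
    if len(ref_aligned) != len(sample_aligned):
        raise ValueError("aligned sequences must have equal length")

    variants = []
    pos = ref_start - 1  # last reference position consumed so far
    for ref_base, sample_base in zip(ref_aligned.upper(), sample_aligned.upper()):
        if ref_base != "-":
            pos += 1
        if ref_base == sample_base:
            continue
        if ref_base == "-":
            # Extra base in the sample, reported after the last reference base.
            variants.append((pos, "insertion", sample_base))
        elif sample_base == "-":
            variants.append((pos, "deletion", ref_base))
        else:
            variants.append((pos, "SNP", f"{ref_base}>{sample_base}"))
    return variants


# Toy alignment (hypothetical bases, not taken from NC_001526.2).
reference = "ACGT-ACG"
sample = "ACATTA-G"
for variant in list_variants(reference, sample, ref_start=6614):
    print(variant)
```

Counting positions only on non-gap reference bases keeps the reported coordinates in the reference numbering even when the sample carries insertions.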

Discussion
Cervical cancer is the most common malignancy affecting Indian women. Studies have clearly shown that persistent infection with HR-HPV is necessary for the pathogenesis of cervical cancer (zur Hausen, 2002). It is estimated that about 80% of sexually active individuals encounter an HPV infection during their life span, most of which go undetected; persistent infection and co-infection, along with deleterious host factors, lead to cervical cancer (Baseman and Koutsky, 2005). HPV prevalence depends on age and geographic area, and is highest (~20%) among women between 16 and 25 years of age worldwide. It is reported to decline markedly (to about 5%) in women who are 40 years or older (Burchell et al., 2006).
Eighty percent of all HR-HPV infections are transient and will not result in lesions. Of the remaining 20%, the majority develop into non-progressive CIN1 lesions that reflect a tolerant state of a productive infection and will regress spontaneously over time. A minority of the HPV infections persist and induce high-grade CIN lesions, i.e., CIN2 and CIN3. It is estimated that only 5% of the CIN lesions, when left untreated, would result in cervical cancer, which equals a maximum of 1% of all HR-HPV infections. Thus, cervical cancer is a rare complication of HR-HPV infection. Apart from multiple infections, genetic/epigenetic events may also be necessary for the development of cervical cancer, and the underlying genes can be used as potential diagnostic and prognostic markers (Bai et al., 2014; Chujan et al., 2014).
To date, several studies have reported varying prevalence of HPV in the general population and in cervical cancer patients. Though HPV is the main causative agent of cervical cancer, its prevalence and distribution vary across geographical regions of the world. Several studies have reported HPV prevalence from different parts of India; however, studies of molecular variants within HPV types are limited (Senapathy et al., 2011). Our study has generated valuable information on the most prevalent HPV genotypes and their molecular variants in our region.
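The "maximum of 1%" figure follows from multiplying the two fractions quoted in this paragraph. A minimal Python sketch of that arithmetic is given here; it assumes, for the purpose of an upper bound only, that every non-transient infection gives rise to a CIN lesion, and the variable names are illustrative.

```python
# Fractions quoted in the text above.
non_transient = 1.0 - 0.80      # HR-HPV infections that are not transient
cin_to_cancer = 0.05            # untreated CIN lesions progressing to cancer

# Assumption: every non-transient infection yields a CIN lesion, which
# overstates the true figure and therefore gives an upper bound.
upper_bound = non_transient * cin_to_cancer
print(f"At most {upper_bound:.1%} of HR-HPV infections")
```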
Several meta-analyses have identified the distribution of HPV types associated with cervical cancer (De Vuyst et al., 2009; Sankaranarayanan et al., 2009; Senapathy et al., 2011). The prevalence of HPV in the healthy population in India is found to be in the range of 7-13%, with HPV 16 and 18 being the most common HPV types (Senapathy et al., 2011). HR-HPV prevalence was approximately 91%, followed by 2% of IR-HPV infection and 7% of LR-HPV infection. In our study, HPV16 and 18 were the most prevalent, in addition to other HPV types [HPV 31, 35, 42, 45, 6, 81, 33 and 58]. Sowjanya et al. (2005) reported the detection of only HR-HPVs in Andhra Pradesh, India, with a prevalence of 87.8% in cervical neoplasia. One of the earlier studies from Madurai, India, showed HPV prevalence as low as 70% in cervical cancer samples (Munirajan et al., 1998). In a case control study from Chennai, India, the prevalence of HPV infection was reported to be as high as 99.4% in invasive cancer samples (Franceschi et al., 2003; Das et al., 2013). In our study, we have reported a 91% prevalence of HR-HPVs. Yet another study, using a consensus PCR method followed by pyrosequencing for HPV genotyping, showed HR-HPVs in 96.9% of cervical cancer samples (Travasso et al., 2008). The same authors also showed a high prevalence of HPV infection (76.19%) in non-malignant samples, with a large proportion of cases (52.38%) co-infected with HPV16 and 18. Apart from this, in a meta study involving 131,746 healthy women, it was observed that, in a low resource setting, a single round of HPV screening reduced the incidence of and mortality from cervical cancer (Sankaranarayanan et al., 2009). Gheit et al. (2009) reported 93% prevalence of HPV in cervical cancer cases. Gupta et al. (2009) reported 16.6% prevalence of HPV among cytologically normal women of reproductive age.
Our HPV prevalence (20.48%) in non-malignant samples was higher than in other studies reported from India, except for one study that reported HPV infection in 76.19% of non-malignant samples (Travasso et al., 2008). Moreover, the high proportion of HR-HPV types (15/17, 88.2%) indicates a high prevalence of pathogenic HPV types in the general population.
The prevalence of HPV in healthy individuals from Noida and Delhi, India was 3.5% in 2012 (Hussain et al., 2011). A population based study from Eastern India reported 9.9% HPV prevalence, with HPV 18 being highest (1.4%) followed by HPV 16 (0.6%) (Dutta et al., 2012). Another study from Uttar Pradesh, India reported 9.9% of the asymptomatic population as HPV positive, with HPV16 (63.7%) and HPV31 (6.7%) as the predominant HPV types (Srivastava et al., 2012). In addition, studies have investigated type-specific HPV infection and its persistence in Indian women, reporting a rate of 5 per 1000 women per month (Datta et al., 2012). Recent studies have also focused on socio-demographic and behavioral risk factors for cervical cancer (Raychaudhuri and Mandal, 2012a; Raychaudhuri and Mandal, 2012b; Thulaseedharan et al., 2013; Joshi et al., 2014). The chromosomal loci 1p, 3q, 6q, 11q, 13q and 20q were found to be the most frequent HPV integration sites reported from India (Das et al., 2013). Several studies from India have reported multiple HPV infections in cervical cancer (Basu et al., 2009; Srivastava et al., 2012; Srivastava et al., 2014). Studies have also shown a high prevalence of high risk HPVs in HIV patients (Aggarwal et al., 2012; Joshi et al., 2012). Pandey et al. (2012) reported 11.7% of samples screened (139 out of 890) as HPV positive. In a case control study in Tiruchirapalli, Tamil Nadu, India, 54.9% of samples were found to be HPV positive (Vinodhini et al., 2012). Taken together, HPV prevalence in India is similar to that in some high risk areas such as Latin America (16%); however, it is still lower than in sub-Saharan Africa (24%) and Eastern Europe (21%) (Forman et al., 2012). Recent Indian studies have also focused on public awareness of HPV infection and the impact of HPV vaccines on cervical cancer (Hussain et al., 2014).
These studies have revealed a low level of awareness about HPV, cervical cancer and HPV vaccines.
Hybrid capture assay and PCR using MY9/11, PGMY9/11 and GP5+/GP6+ are commonly used techniques for HPV detection in clinical specimens; they target a broad range of HPVs, which are subsequently identified by sequencing (Fuessel Haws et al., 2004; Giovannelli et al., 2004; Estrade et al., 2011; Natphopsuk et al., 2014; Rai et al., 2014). The PGMY/GP+ system has been shown to detect HPV DNA at as little as 1 copy and to allow characterization of multiple infections (Fuessel Haws et al., 2004). Although it is one of the more sensitive methods of HPV detection, the low HPV frequency in the study population might be due to a low copy number of HPV or to subtypes not detected by the primers. In our study, the presence of HPV infection in cervical cancer samples is lower than the worldwide estimate of 85% to 100% and also contrasts with previous Indian studies, except for a study reporting 70% HPV infection in cervical cancer (Munirajan et al., 1998). However, recent studies have also shown lower prevalence of HPV in cervical cancer (75.7% and 76% in central south China and Iran respectively). Alsbeih et al. (2013) reported a similar HPV prevalence (82%) in cervical cancer subjects from Saudi Arabia.
In the present study, molecular characterization of HPV variants was performed within the L1 region amplified by GP5+/GP6+. Our study showed a number of sequence alterations in individual samples, ranging from as few as 1 to as many as 34 single nucleotide changes (0.7 to 23%). This indicates the development of aberrant clones, which might reflect virus infectivity and pathogenicity, as L1 molecular variants of HPV are the targets of neutralizing antibodies (Serrano et al., 2014; Yang et al., 2014). Our study has shown the prevalence of novel HPV types, indicating the emergence of new HPV types from existing ones; their potential in cervical carcinogenesis needs to be understood. When the individual samples were studied for their sequence alterations, a number of tumor specific changes were observed, reflecting the emergence of aberrant clones in cervical cancer samples. However, at present the significance of the large number of HPV nucleotide alterations for an HPV mutator phenotype is unknown. In addition, the causes and consequences of new sequence variations for HPV function need to be evaluated in relation to the pathogenicity and severity of the disease. Within the L1 region of HPV16, several sequence variations were identified, indicating that the HPV sequence varies within a given HPV type. The characterization of sequence variation could allow identification of natural variants and support further serological testing and vaccine development. The importance of these sequence variations within HPV16 in relation to pathogenicity needs to be addressed using a large sample size. To our knowledge, this is the first study to report cervical HPV infection in coastal Karnataka and HPV molecular variant analysis from India, thus providing baseline data for the evaluation of HPV vaccination programs. A recent study has shown that testing for HPV DNA is more specific and sensitive than testing for oncogenic E6/E7 mRNA (Tezcan et al., 2014).
In summary, we have performed nested PCR, sequencing and phylogenetic analysis for the identification and molecular classification of HPV types. Our results show a relatively lower prevalence of HPV in the cervical cancer samples studied. Moreover, multiple infections with different HPV types are also of concern. Further studies therefore need to be undertaken to identify the distribution of HR-HPV types other than HPV16 and 18 at the population level, in order to obtain a comprehensive view of the efficacy of the existing HPV vaccine and to understand the need for development of a population specific vaccine.
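The percentage range quoted for the number of altered nucleotides can be checked with a short Python sketch. It assumes that the percentages were computed against the 150 bp amplicon length given in the methods; the source does not state the denominator explicitly, so this is an inference, and the function name is hypothetical.

```python
AMPLICON_LENGTH = 150  # bp, the GP5+/GP6+ product size given in the methods


def percent_altered(n_changes, length=AMPLICON_LENGTH):
    """Percentage of positions in the fragment that differ from the reference."""
    return 100.0 * n_changes / length


for n in (1, 34):
    print(f"{n} change(s): {percent_altered(n):.1f}%")
```

Under that assumption, 1 and 34 changes correspond to roughly 0.7% and 22.7% of the fragment, consistent with the reported range once rounded.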