Application of Bayesian Multilevel Space-Time Models to Study Relative Risk of Esophageal Cancer in Iran 2005-2007 at a County Level
Sedigheh Rastaghi 1, Tohid Jafari-Koshki 2,3, Behzad Mahaki 1*

Background: Reported age-standardized incidence rates for esophageal cancer in Iran are 0.88 and 6.15 for females and males, ranking fifth and eighth, respectively, among all cancers. The present study aimed to map relative risk using more realistic and less problematic methods than common estimators. Materials and Methods: In this ecological investigation, the study population consisted of all esophageal cancer patients in Iran from 2005 to 2007. A Bayesian multilevel space-time model with three levels (county, province, and time) was used to estimate the relative risk of esophageal cancer. Analyses were conducted using the R package INLA. Results: The total number of registered patients was 7,160. The three-level model adjusted for the risk factors of physical activity and smoking had the best fit among all models. The overall temporal trend was significantly increasing. At the county level, Ahar, Marand, Salmas, Bojnoord, Saghez, Sarakhs, Shahroud and Torbatejam had the highest relative risks. Physical activity was found to have a significant direct association with the risk of developing esophageal cancer. Conclusions: Given the great variation across geographical areas, many different factors appear to affect the incidence of esophageal cancer. Further studies at the individual level in areas with high incidence could provide more detailed information on risk factors for esophageal cancer.


Introduction
The causes of human death have shifted from infectious diseases to non-communicable ones; in developed countries, the incidence of cancers is outpacing that of cardiovascular diseases (Mohammadpour Tahamtan et al., 2013). Cancer is one of the main causes of death and disability around the world. According to the World Health Organization, the incidence of cancer will increase by 50 percent by 2020, and cancer is expected to be the leading cause of death in 2030 (Goya, 2007; Zarnegar Nia et al., 2011). Currently, cancer is the second leading cause of death in the world (Scholefield, 2000), and over 70 percent of deaths occur due to cancer in developing and developed countries (Mathers and Boschi-Pinto, 2000). It is the third leading cause of death in Iran after cardiovascular diseases and accidents. Hence, the search for treatments and preventive methods is a priority for health officials and the medical community (Naghavi, 2006).
Gastrointestinal tract cancers are prevalent, especially in developing countries (Parkin, 1998). According to [...] and sixth cause of death induced by cancer around the world (Mathers and Boschi-Pinto, 2000; Parkin et al., 2005; Semnani et al., 2005; Kamangar et al., 2006; Ma and Yu, 2006; Ferlay et al., 2010; Scarpa et al., 2011). Esophageal cancer is rare in western countries, and around 80 percent of all esophageal cancer cases occur in developing countries (Parkin et al., 2005). The prevalence of esophageal cancer is about 5-10 per 100,000 population in North America and western Europe (Kamangar et al., 2006; Kamangar et al., 2007). Esophageal cancer is relatively common in eastern countries (Delpisheh et al., 2014). Half a million deaths are attributable to esophageal cancer, and its five-year survival rate is less than 10 percent (Ma and Yu, 2006; Fauci et al., 2008; Rasouli et al., 2011). Previous epidemiological studies in Asia have shown considerable changes in the incidence of esophageal cancer. Iran is located in an area of Asia with high incidence rates (Long et al., 2010; Moore et al., 2010a; Moore et al., 2010b; Salim et al., 2010). This cancer is more common in certain geographical areas, known as the esophageal cancer belt, which extends from northern China through western and central Asia to northern Iran (Kamangar et al., 2007). This suggests the existence of risk factors with a heterogeneous geographical distribution that are related to the incidence pattern (Corley and Buffler, 2001; Stein et al., 2005).
The incidence of esophageal cancer in Iran is higher than the global rate in both sexes (Moore et al., 2010b; Radmard, 2010), with age-standardized incidence rates of 0.88 and 6.15 for females and males, ranking fifth and eighth among all cancer sites, respectively (E'temad and Gooya, 2011).
Mapping the geographical pattern of disease has been in use since the early decades of the 20th century, for both communicable and non-communicable diseases. Its use has since developed to the point that health planners apply it to design better-targeted interventions and to support disease prevention. In addition, disease registration by geographical region makes it possible to test patterns of variability and to identify low- and high-risk regions (Lawson et al., 2001).
A number of studies have mapped GI tract cancers in Iran (Mahaki et al., 2011; Asmarian et al., 2012; Asmarian et al., 2013a; Asmarian et al., 2013b; Mahaki et al., 2013), but few investigations have mapped esophageal cancer in recent years (Poor Ahmad and Yavar, 2002; Asmarian et al., 2013c).
Existing incidence studies are descriptive and have been conducted for specific individual years or provinces. None of them covered both the province and county levels and/or a wider time range.
Given the high incidence of esophageal cancer and the need for knowledge about its geographical distribution at the province and county levels, the present study aims to use Bayesian multilevel space-time models to obtain more accurate estimates and maps of the relative risk of esophageal cancer.
The merits of this model include tolerance of missing data without the need for imputation or loss of data and study population, the possibility of accounting for the role of neighboring areas at different levels, the possibility of determining the relative contribution of every level of the hierarchy to the response variable and, finally, the feasibility of identifying the factors that affect the response variable at each level. These models provide more precise estimates and clearer answers than single-level analyses (Hosseni and Gohari, 2014).

Materials and Methods
This is an ecological study conducted in Iran at the county level. Data on esophageal cancer (ICD-10) in 320 counties from 2005 to 2007 were obtained from reports by the Noncommunicable Diseases Center of the Health and Medical Education Ministry's Diseases Management Center (http://www.ircancer.ir). The 2006 census and the population estimates of the Iranian Statistics Center were used for the population potentially at risk (Iran, 2012).
The two geographically nested levels, counties and provinces, were nested within time periods. The expected number of esophageal cancer cases was calculated on the basis of population size and the observed number of cancer cases during this period. To prevent the assessment of the geographical distribution from being distorted by other factors, we included risk factors at the province level, namely physical activity, smoking, fruit and vegetable consumption and weight, and obtained adjusted estimates. Smoking was defined as the product of the percentage of smokers in the population and the mean number of cigarettes smoked per day in each province. Obesity was defined as the percentage of the population with BMI>25 in each province. Fruit and vegetable consumption was measured as the sum of the mean daily numbers of fruit and vegetable servings in each province. Physical activity was calculated using a combined index called the Metabolic Equivalent (MET).
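The expected-count calculation described here can be illustrated with a minimal sketch. It assumes simple internal standardization, where each county's expected count is its population multiplied by the overall crude rate, with no age stratification; the text does not state whether age strata were used. The county labels and numbers are hypothetical and are not taken from the study data.

```python
from typing import Dict


def expected_counts(observed: Dict[str, int],
                    population: Dict[str, float]) -> Dict[str, float]:
    """Expected cases per area: E_i = n_i * (total cases / total population)."""
    overall_rate = sum(observed.values()) / sum(population.values())
    return {area: population[area] * overall_rate for area in observed}


def standardized_ratios(observed: Dict[str, int],
                        expected: Dict[str, float]) -> Dict[str, float]:
    """Crude relative risk per area: observed / expected."""
    return {area: observed[area] / expected[area] for area in observed}


# Hypothetical toy data for three areas (not from the study).
observed = {"A": 30, "B": 10, "C": 20}
population = {"A": 100_000, "B": 100_000, "C": 200_000}

expected = expected_counts(observed, population)
ratios = standardized_ratios(observed, expected)
```

These crude ratios are unstable for areas with small populations, which is the usual motivation for smoothing them with a Bayesian model.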
For rare diseases such as cancers, it is usually assumed that the number of patients in every region follows a Poisson distribution. To model the data, a multilevel model was used in which the parameters at every level can be random and can be a function of the characteristics of that level or of other levels. One variant of the multilevel model is the multiple-membership model.
In this model, [i] is the county indicator, random effects are denoted by u, and superscript numbers show the corresponding level. Since number 1 is reserved for the observation level, numbering starts from 2. Regional and neighborhood effects are indicated by the numbers 2 and 3, respectively. Here β and x represent the fixed effects and the design matrix.
Region [i] refers to the county, and neighbor [i] denotes the set of areas neighboring each county. Therefore, in this model, the observed count is affected by different predictor variables. Typically, the weights in this model are such that $\sum_{j \in \text{neighbor}[i]} w^{(3)}_{i,j} = 1$ and, generally, all neighbors are given equal weights, so that $w^{(3)}_{i,j} = 1/n_i$, where $n_i$ is the number of neighbors of region [i]. Bayesian estimation is used for fitting the model (Lawson et al., 2003).
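One way to write a multiple-membership Poisson model that matches these definitions is sketched below. The symbols $O_i$, $E_i$ and $\theta_i$ for the observed count, expected count and relative risk, and the normal priors on the random effects, are assumed notation for illustration. This is the general form of such models rather than necessarily the exact specification fitted in this study.

```latex
\begin{align}
O_i &\sim \text{Poisson}(E_i \theta_i), \\
\log \theta_i &= x_i \beta + u^{(2)}_{\text{region}[i]}
  + \sum_{j \in \text{neighbor}[i]} w^{(3)}_{i,j}\, u^{(3)}_j, \\
u^{(2)}_{k} &\sim N\!\left(0, \sigma^2_{u(2)}\right), \qquad
u^{(3)}_{j} \sim N\!\left(0, \sigma^2_{u(3)}\right), \\
\sum_{j \in \text{neighbor}[i]} w^{(3)}_{i,j} &= 1, \qquad
w^{(3)}_{i,j} = \frac{1}{n_i}.
\end{align}
```

With equal weights, the neighborhood term is the average of the level-3 effects over the $n_i$ neighbors of county $i$.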
The full Bayesian method is one of the common approaches in disease mapping. It accounts for both structured and unstructured heterogeneity. We calculated the relative risk of esophageal cancer from 2005 to 2007 using the full Bayesian model suggested by Besag et al., known as the BYM model. This spatial model has no parameter to capture the effect of time (Lawson et al., 2003).
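For reference, the BYM model is commonly written as below. The notation is assumed for illustration: $\alpha$ is an intercept, $v_i$ is the unstructured (non-spatial) effect, $u_i$ is the spatially structured effect with an intrinsic conditional autoregressive prior, $\delta_i$ is the set of neighbors of area $i$, and $n_i$ is its size. Covariates, if any, would enter the linear predictor additively.

```latex
\begin{align}
O_i &\sim \text{Poisson}(E_i \theta_i), \\
\log \theta_i &= \alpha + u_i + v_i, \\
v_i &\sim N\!\left(0, \sigma^2_v\right), \\
u_i \mid u_{j \neq i} &\sim
  N\!\left(\frac{1}{n_i} \sum_{j \in \delta_i} u_j,\ \frac{\sigma^2_u}{n_i}\right).
\end{align}
```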
We also used the Deviance Information Criterion (DIC) and the logarithmic score (LS) to compare the models and find the best one. The model with the lowest LS and DIC is considered to have the best predictive quality among all models. DIC is used to assess the goodness of fit of Bayesian models and is calculated as $\text{DIC} = \bar{D} + p_D = \hat{D} + 2p_D$, where $\bar{D}$, $\hat{D}$ and $p_D$ represent the posterior mean of the deviance, the deviance evaluated at the point estimates of the parameters, and the effective number of parameters of the model, respectively. $p_D$ is obtained as $p_D = \bar{D} - \hat{D}$. Accordingly, the model with the best trade-off between DIC, LS, and $p_D$ was considered the best fit for the data (Spiegelhalter et al., 2002; Celeux et al., 2006).
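The arithmetic behind these criteria can be illustrated with a small sketch. It assumes that posterior deviance values and the deviance at the point estimate are already available, and it takes the logarithmic score to be the mean negative log of the conditional predictive ordinates (CPO), which is a common definition but is not spelled out in the text. All input numbers are hypothetical.

```python
import math
from typing import Dict, Sequence


def dic_summary(deviance_samples: Sequence[float],
                deviance_at_estimate: float) -> Dict[str, float]:
    """Return D_bar, pD and DIC, where pD = D_bar - D_hat and DIC = D_bar + pD."""
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_estimate
    return {"D_bar": d_bar, "pD": p_d, "DIC": d_bar + p_d}


def log_score(cpo: Sequence[float]) -> float:
    """Mean negative log CPO (assumed definition); lower is better."""
    return -sum(math.log(c) for c in cpo) / len(cpo)


# Hypothetical values, not results from the study.
result = dic_summary([102.0, 98.0, 100.0], deviance_at_estimate=96.0)
score = log_score([0.5, 0.25])
```

In this toy case the two forms of the formula agree: the mean deviance plus pD equals the point-estimate deviance plus twice pD.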
In this study, the analysis was performed using the INLA Bayesian approach in R. INLA is an efficient computational method. It is not only faster than MCMC methods but also avoids the challenges related to convergence and other assumptions of MCMC (Schrödle and Held, 2011; Blangiardo et al., 2013).

Results
There were a total of 7,160 esophageal cancer cases in Iran in 2005-2007. The highest number of cases was in Mashhad, with 684 cases over the three-year period. Table 1 shows the DIC, pD, and LS for the different models. The description of each model is given in the table.
Model 6 has the lowest DIC and LS and fits the data best. The results of these models are shown in Table 2. In model 6, the effect of year was significant and increasing. Physical activity was also significant. The same results were obtained in model 7, but the other factors were not significant. Figure 1 (a, b and c) shows the relative risk of the counties after adjusting for risk factors, including smoking and [...] and Fariman had the highest risk, and Shahryar, Robatkarim, Saravan (Saravan, Zaboli and Sibvasaravan), Kish and Lamerd counties had the lowest risk of esophageal cancer incidence. Figure 2 shows the map of relative risk at the county level over the three years without adjustment for risk factors, accounting for both structured and unstructured heterogeneity. According to this map, Torbatjam, Sarakhs, Fariman, Saghez, Gonbad, Kolale (Kolale and MoravehTappeh), Minoodasht (Minoodasht and Galikash), Mshkynshahr and Shahroud (Shahroud, Miami and Bastam) have the highest risk of incidence, and Tonbebozorg, Aboumousa, Nikshahr, Shahryar, Robatkarim and Saravan (Saravan, Zaboli and Sibvasaravan) counties have the lowest risk of cumulative incidence during the three years of study.
According to Table 3, the residuals at the province and county levels are assumed to follow a normal distribution. The ICC is expressed as the percentage of variance in the data attributable to the residuals of each level. The province level has the largest ICC (57%), which shows that a multilevel model is appropriate.
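As an illustration of how an ICC of this kind can be read, the sketch below computes each level's share of the total residual variance. This is a simplified variance-partition definition and may differ from the computation behind Table 3, which is not given in the text. The variance components are hypothetical.

```python
from typing import Dict


def variance_shares(variances: Dict[str, float]) -> Dict[str, float]:
    """Share of total residual variance attributable to each level."""
    total = sum(variances.values())
    return {level: value / total for level, value in variances.items()}


# Hypothetical variance components (not the estimates from the study).
shares = variance_shares({"province": 2.0, "county": 1.0, "time": 1.0})
```

A large share at a higher level, as in this toy example for the province level, indicates that observations within the same unit of that level are strongly correlated, which is the usual argument for a multilevel specification.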

Discussion
According to the results, the north-east and north-west regions and some parts of the central zone (especially Khorasan, North Khorasan, Golestan, Ardabil, West Azerbaijan, East Azerbaijan, Kurdistan and some regions of Semnan provinces) have higher levels of esophageal cancer incidence (relative risk greater than 3.5) than the desert and southern regions (especially Sistan and Baluchestan, Bushehr and Hormozgan provinces). It is necessary to investigate the causes of cancer in high-risk areas and the relationship between these factors and the results of this study.
The present investigation is the first mapping of the esophageal cancer incidence rate in Iran at the county level. Esophageal cancer incidence and the related mortality are important health problems in Iran, and climate conditions facilitate the incidence and prevalence of this cancer. Hence, in the present study, an environmental investigation and mapping were carried out, and relative risks were estimated for the counties over three years. Islami et al. (2009a; 2009b), Aazami et al. (2006) and Kamangar et al. (2007) showed that Ardebil and Golestan provinces are high-risk regions for esophageal cancer.
In their ecological study, Semnani et al. (2010) found that the highest incidence of esophageal cancer in Golestan province was related to excess selenium in the soil of the region. Rahimzadeh Bozorgi et al. (2013) reported that the presence of selenium in the soil and rice of Golestan was a possibly important factor in the high incidence of esophageal cancer. Ahari et al. (2013) pointed to the Sabalan volcano as an important environmental factor in the incidence of esophageal cancer in Ardebil province.
As in the work by Azami et al. (2006), we found no statistically significant association between smoking or vegetable intake and esophageal cancer incidence. Hajizadeh et al. (2012) likewise found no correlation between vegetable intake and esophageal cancer. In contrast to the results of previous studies (Lagergren et al., 1999; Lagergren et al., 2014; Thrift et al., 2014), weight was not a significant factor in esophageal cancer incidence in our study. Physical activity had a significant effect, which is in accordance with the results of previous studies (Chen et al., 2014; Dos Santos and Coutinho, 2014; Singh et al., 2014). This factor may be associated with a decreased risk of esophageal cancer incidence.
Many articles on esophageal cancer have been published in Iran, in which this cancer has been studied in a single province or a few provinces, especially in the north. They are mostly descriptive and cross-sectional (Ghavamzadeh et al., 2001; Radmard, 2010; Sadjadi et al., 2010; Rasouli et al., 2011; Rahimzada Bozorgi et al., 2013; Somi et al., 2014). We found no study in which environmental factors have been considered in relation to esophageal cancer incidence nationwide.
Specialists and health planners should contribute their views to the design of effective programs, so that the factors influencing the incidence of esophageal cancer and the observed pattern of variation in this cancer can be studied precisely, especially with regard to the approach to high-risk areas.
It should be noted that only pathological reports are taken into account in the Iranian cancer registration system. So, at best, only 80 percent of cancer cases can be registered. In addition, the registration system was of inadequate quality in some counties and years, which apparently leads to underestimation of the number of cancer cases (Asmarian et al., 2013a). Mapping of a disease at the county level will be smoother than at the province level because county-level mapping involves more areas.
Because up-to-date data on esophageal cancer were not available, other risk factors such as tobacco consumption, alcohol use, warm drinks like tea, socioeconomic condition, unhealthy nutrition, selenium shortage, and viral and genetic agents were not included. In addition, data on the relevant risk factors were not registered at the county level, and some county-level data were missing. Using more precise and complete data will help to further clarify the process of esophageal cancer in Iranian counties. It is also recommended to extend the study period or to investigate the impact of each factor in every region using varying-slope models. To identify risk factors for esophageal cancer incidence more clearly, further studies in the areas with high incidence are suggested.