Clustering Asian and North African Countries According to Trend of Colon and Rectum Cancer Mortality Rates: an Application of Growth Mixture Models

Background: Colorectal cancer is the second most common cause of cancer death, with half a million deaths per year. Incidence and mortality rates have changed notably in Asian and African countries during the last few decades. In this study, we first aimed to determine the trend of the colorectal cancer mortality rate in each Institute for Health Metrics and Evaluation (IHME) region, and then to re-classify the countries to find more homogeneous classes. Materials and Methods: Our study population consisted of 52 countries of Asia and North Africa in six IHME pre-defined regions, for both genders and age-standardized groups, from 1990 to 2010. We first applied simple growth models to the pre-defined IHME regions to estimate the intercepts and slopes of the mortality rate trends. Then, we clustered the 52 countries using the latent growth mixture modeling approach, classifying them based on their colorectal mortality rates over time. Results: Statistical analysis revealed that males and people in high income Asia Pacific and East Asia countries were at greater risk of death from colon and rectum cancer. In addition, the South Asia region had the lowest rates of mortality due to this cancer. Simple growth modeling showed that the majority of IHME regions had a decreasing trend in the mortality rate of colorectal cancer. However, re-classifying these countries based on their mortality trend using the latent growth mixture model resulted in classes that were more homogeneous with respect to colorectal mortality trend. Conclusions: In general, our statistical analyses showed that most Asian and North African countries had an upward trend in their colorectal cancer mortality. We therefore urge health policy makers in these countries to evaluate the causes of growing mortality and to study the interventional programs of countries that have been successful in managing the consequences of this cancer.


Farid Zayeri, Ali Sheidaei, Anita Mansouri*

[...] per 100000 in 2002 in China (Sung et al., 2005; Sung et al., 2008). Asia, the largest and most populous continent in the world, encompasses countries with wide variation in culture, race, lifestyle and diet. This variety usually affects the pattern of mortality trends for a wide range of diseases, including colon and rectum cancer.
Although cancer of the colon is more common in developed countries (Merika et al., 2010), the incidence and mortality rates of colon and rectum cancer have changed notably in Asian and African developing countries during the last few decades (Kono, 2004; Sung et al., 2005). However, these changes in the trend of colon cancer mortality vary from region to region because of dietary habits, lifestyle and environmental exposure (Janout and Kollarova, 2001; Sung et al., 2005; Behnampour et al., 2014). On the other hand, some researchers believe that colorectal cancer mostly occurs in people at low genetic risk (Watson and Collins, 2011). In this regard, studying mortality time trend patterns and classifying regions according to their trend of colorectal mortality rate can provide an opportunity for health policy makers to screen for and explore the risk factors of this cancer.
There are many useful statistical tools for assessing trends over time. Methods for modeling longitudinal data are common approaches in this context. In previous decades, different versions of marginal, random effects and transition models have been proposed for the analysis of longitudinal data (Diggle et al., 2002; Fitzmaurice et al., 2012). These modeling approaches have many advantages, such as describing the relationship between different types of outcomes (binary, ordinal, continuous, etc.) and a variety of covariates while accounting for the correlation between response variables. However, they cannot be applied directly to clustering longitudinal data. To do this, we need statistical approaches that cluster the data while simultaneously accounting for the correlation between observations. The latent growth model (LGM) family, a combination of mixed effects and structural equation modeling (SEM), is an appropriate choice for clustering longitudinal outcomes. In this context, latent growth mixture models (LGMM), as members of the LGM family, have been increasingly utilized for the analysis of social and medical longitudinal data in recent years (Colder et al., 2002; Duncan et al., 2006; Swanson et al., 2007; deRoon-Cassini et al., 2010). LGMs provide an efficient means of modeling outcome growth trajectories. A common application of LGMs is to assess features of the outcome growth trajectory, such as the form of the latent growth trajectory, the initial level of the outcome, the rate of outcome change, and the association between the rate of change and the initial level of the outcome (Wang and Wang, 2012). In contrast, the main aim of LGMMs is to identify homogeneous subpopulations and meaningful classes of individuals (McArdle and Epstein, 1987; Nylund et al., 2007; Jung and Wickrama, 2008; Wang and Wang, 2012).
In our literature review, we found a number of published studies on the incidence and mortality rate of colorectal cancer in different parts of Asia (Reddy et al., 1989; Koyama and Kotake, 1997; Janout and Kollarova, 2001; Boyle and Leon, 2002; Kono, 2004; Yiu et al., 2004; Sung et al., 2005; Sung et al., 2008; Center et al., 2009; Herszenyi and Tulassay, 2010; Merika et al., 2010; Moghimi-Dehkordi and Safaee, 2012; Pourhoseingholi, 2012; Siegel et al., 2012; Atrkar-Roushan et al., 2013; Zheng et al., 2014). However, we could not find any research on clustering the Asian countries based on their colorectal cancer mortality rate over time. Therefore, we decided to perform this study in order to: 1) introduce the application of LGMM in analyzing longitudinal health and medical data, and 2) utilize these models for clustering Asian countries with regard to their colorectal mortality rate during 1990-2010, using the registered data on the Institute for Health Metrics and Evaluation (IHME) website (IHME, 2014). To do this, we first present the time trend estimates in six IHME regions (as pre-defined clusters), and then use the LGMM to determine the new clusters and present the time trend estimates of colorectal mortality rate in them.

Study setting
As mentioned before, in this study we used the IHME classification of Asian countries as a part of our analysis. IHME has clustered countries based on geographical and economic criteria. This classification includes all Asian and North African countries, and some Asian and North African countries lie in the same region (cluster). Accordingly, our study population consisted of 52 countries of Asia and North Africa in six IHME pre-defined regions. The list of countries in each region is available on the IHME website.

Data source
In this study, we used the data available online on the IHME website for the 52 countries described above, for both genders and age-standardized groups. These longitudinal data are available for the years 1990, 1995, 2000, 2005 and 2010. We used the registered data for colon and rectum cancer mortality rates to explore the time trend and cluster these countries over this period (IHME, 2014).

Statistical methods
Random effects models, one of the common approaches for analyzing longitudinal data, allow between-subject and within-subject sources of variation to be analyzed at the same time by distinguishing fixed and random effects. In this approach, one part of the model describes the population average and another part describes the deviation of each individual from the population average (Fitzmaurice et al., 2012). The LGM can be thought of as an application of mixed effects longitudinal modeling in the SEM framework. A well-known application of LGMs is to assess features of the outcome growth trajectory through latent growth factors.
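As an illustration of the split between a population-average part and individual deviations, a linear random intercept and slope model can be written in generic textbook notation as follows. The symbols here ($\beta_0$, $\beta_1$, $b_{0i}$, $b_{1i}$, $D$, $\sigma^2$) are assumptions introduced for this sketch and are not the notation used in the rest of this paper.

```latex
% Generic linear mixed model for subject i at time t (illustrative notation)
\begin{align}
  y_{it} &= \underbrace{\beta_0 + \beta_1 t}_{\text{population average}}
          + \underbrace{b_{0i} + b_{1i} t}_{\text{deviation of subject } i}
          + \varepsilon_{it}, \\
  (b_{0i}, b_{1i})^{\top} &\sim N(\mathbf{0}, D), \qquad
  \varepsilon_{it} \sim N(0, \sigma^2).
\end{align}
```

In the SEM formulation, the subject-specific intercept $\beta_0 + b_{0i}$ and slope $\beta_1 + b_{1i}$ play the role of the latent growth factors.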
LGMM is an extension of LGM that accommodates population heterogeneity in the outcome growth by classifying individual trajectories into subpopulations or classes (Wang and Wang, 2012).
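More formally, if we assume that our population of interest has K latent classes, then the following equations describe an LGMM:

$$y_{it}^{k} = \eta_{i0}^{k} + \eta_{i1}^{k}\,\lambda_{t}^{k} + \varepsilon_{it}^{k}$$

$$\eta_{i0}^{k} = \eta_{00}^{k} + \sum_{j} \beta_{01j}^{k}\, x_{ij} + \varsigma_{i0}^{k}$$

$$\eta_{i1}^{k} = \eta_{10}^{k} + \sum_{j} \beta_{11j}^{k}\, x_{ij} + \varsigma_{i1}^{k}$$

where $y_{it}^{k}$ represents the observed outcome for case $i$ at time $t$ in latent class $k$; $\eta_{i0}^{k}$ and $\eta_{i1}^{k}$ are the latent growth factors (intercept and slope); and the time scores $\lambda_{t}^{k}$ can be specified as linear or nonlinear polynomial functions of time, or even as free time scores. The first equation is the level 1 model and the other two are level 2 models. Thus $\eta_{00}^{k}$ and $\eta_{10}^{k}$ are the intercept coefficients and $\varsigma_{i0}^{k}$ and $\varsigma_{i1}^{k}$ are the error terms at level 2, while $\varepsilon_{it}^{k}$ is the error term at time $t$. Finally, $\beta_{01j}^{k}$ and $\beta_{11j}^{k}$ are the slope coefficients of the covariates $x_{j}$, with $x_{ij}$ denoting the value of covariate $j$ for case $i$ (Wang and Wang, 2012).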
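To make the role of the class-specific intercept and slope concrete, the following is a minimal Python sketch of how one observed trajectory could be assigned a posterior class probability. It rests on strong simplifying assumptions that are not part of the model described above: no covariates, no within-class variation in the growth factors, independent normal residuals with a known standard deviation, and known class proportions. All function names and numbers are hypothetical and chosen only for illustration.

```python
import math


def class_mean(intercept, slope, time_scores):
    """Class-specific mean trajectory: intercept + slope * time score."""
    return [intercept + slope * t for t in time_scores]


def log_normal_pdf(y, mean, sd):
    """Log density of a normal distribution evaluated at y."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (y - mean) ** 2 / (2 * sd ** 2)


def posterior_class_probs(y, classes, time_scores, resid_sd):
    """Posterior probability of each class for one observed trajectory y.

    classes: list of (class proportion, intercept, slope) tuples.
    """
    log_terms = []
    for weight, intercept, slope in classes:
        mu = class_mean(intercept, slope, time_scores)
        loglik = sum(log_normal_pdf(obs, m, resid_sd) for obs, m in zip(y, mu))
        log_terms.append(math.log(weight) + loglik)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_terms)
    unnormalized = [math.exp(v - top) for v in log_terms]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Linear time scores for the five waves 1990, 1995, 2000, 2005, 2010.
time_scores = [0, 1, 2, 3, 4]
# Hypothetical classes (proportion, intercept, slope); not estimates from this study.
classes = [(0.8, 8.0, 0.5), (0.2, 20.0, -1.0)]
# Hypothetical rising trajectory (deaths per 100,000).
rising = [8.2, 8.4, 9.1, 9.4, 10.1]
probs = posterior_class_probs(rising, classes, time_scores, resid_sd=1.0)
```

In a full LGMM the class proportions, growth parameters and variances are estimated jointly by maximum likelihood rather than fixed in advance, so this sketch only shows the classification step.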
The first step in LGMM is to determine the optimal number of latent trajectory classes and to examine the significance of class membership. In addition, a wide set of functions over the trajectories should be examined in order to find the best fitting one in each class (deRoon-Cassini et al., 2010; Wang and Wang, 2012). After these explorations, one can choose the best number of classes and growth functions according to fit criteria such as AIC, BIC and SSBIC. Lower values of these criteria indicate better fit. In addition, a likelihood ratio test (LRT) can be carried out to examine whether the number of classes is statistically significant (Nylund et al., 2007).

DOI:http://dx.doi.org/10.7314/APJCP.2015.16.9.4115
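The fit criteria mentioned here have standard closed forms, which the following Python sketch computes from a model's maximized log-likelihood. The log-likelihoods and parameter counts in the example are hypothetical values invented for illustration; they are not results from this study, and the function name is an assumption of this sketch.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, BIC and sample-size adjusted BIC (SSBIC); lower is better."""
    deviance = -2 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        # SSBIC replaces n with (n + 2) / 24 inside the logarithm.
        "SSBIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }


# Hypothetical (log-likelihood, number of free parameters) per class count.
candidates = {2: (-410.0, 8), 3: (-380.0, 11), 4: (-377.5, 14)}
n_countries = 52  # sample size: the 52 countries under study

table = {
    k: information_criteria(loglik, n_params, n_countries)
    for k, (loglik, n_params) in candidates.items()
}
best_by_bic = min(table, key=lambda k: table[k]["BIC"])
```

Because the three criteria penalize extra parameters differently, they need not agree on the same number of classes, which is one reason they are usually reported together with a likelihood ratio test.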
In this study, we first reviewed the descriptive statistics for the countries in each region to explore the trajectory functions suggested by the data. Then, we applied simple growth models to the pre-defined IHME regions to estimate the intercepts and slopes of the mortality rate trends in these clusters. Finally, we clustered the 52 countries using the latent growth mixture modeling approach, our suggested method for classifying them based on their colorectal mortality rates over time. To identify the latent classes of colon and rectum cancer trend, we used Mplus software version 6.12, which uses a maximum likelihood methodology for estimating LGMM parameters (Muthen and Muthen, 2011).

Results
As mentioned before, the registered data for colon and rectum cancer mortality rates in 52 countries were downloaded from the IHME website. Exploring the raw data shows that males and people in high income Asia Pacific and East Asia countries were at greater risk of death from colon and rectum cancer. In addition, the lowest rates of mortality were observed in South Asia. Moreover, the majority of regions had a decreasing trend in the mortality rate of colon and rectum cancer. This result may be highly misleading because of the heterogeneity of mortality rates among the countries of each region. For instance, South Korea and Singapore are both classified in high income Asia Pacific despite their differences in mortality rate trend: the slope of the trend in the mortality rate of colon and rectum cancer is highly negative for Singapore and positive for South Korea. Contradictions like this call for further exploration and more complex statistical methods to achieve reliable results.
For all regions, we began with a simple growth model to fit a linear trend for the countries in each IHME region. Table 1 shows the estimated intercepts and slopes in each region for both genders. These results confirm our earlier statement about negative slopes in some regions. Again, consider the negative estimates for the slopes in the high income Asia Pacific region (-0.047 for females and -0.637 for males). These negative estimates do not reveal the actual trends of mortality rates in this region (a negative trend for Singapore and a positive trend for South Korea). To overcome this problem, we need to re-classify these 52 countries according to their trends of mortality rates and then determine the trends of colon and rectum cancer death in these new classes. To do this, we used latent growth mixture models in the next step of our analysis.
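The masking effect described in this paragraph can be reproduced with a small Python sketch. It uses ordinary least squares on each country's five time points as a simplified stand-in for the simple growth model (an assumption of this sketch, not the estimation method used in the study), and the rates are hypothetical values rather than IHME data.

```python
def ols_line(times, values):
    """Least-squares intercept and slope of values regressed on times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope


# Time scores 0..4 stand for 1990, 1995, 2000, 2005 and 2010.
time_scores = [0, 1, 2, 3, 4]

# Hypothetical mortality rates per 100,000 for two countries in one region.
region = {
    "country_A": [20.0, 18.5, 17.0, 15.5, 14.0],  # falling trend
    "country_B": [8.0, 8.5, 9.0, 9.5, 10.0],      # rising trend
}

# Slope of each country's own trend.
slopes = {name: ols_line(time_scores, rates)[1] for name, rates in region.items()}

# Regional trend: average the countries at each time point, then fit one line.
pooled = [sum(values) / len(values) for values in zip(*region.values())]
pooled_slope = ols_line(time_scores, pooled)[1]
```

Here the single regional slope is negative even though one of the two countries is rising, which is the kind of contradiction that motivates clustering countries by their own trajectories.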
In this stage of the data analysis, we first assessed the LGMM with different numbers of classes to find the best fitting one. Table 2 summarizes the results. According to these results, the optimum number of classes was three for females and five for males. In addition, the LRT p-values were less than 0.05 for both genders at the optimum number of classes, which indicated the statistical significance of the classes.
LGMM clusters countries into new classes based on their growth (trend) parameters. In addition, this modeling approach estimates these parameters in each class. Table 3 shows the intercepts and slopes for the new classes, and Figures 1 and 2 display the shape of the trend in all classes. In contrast to the results in Table 1, only one class in each gender has a negative slope according to this clustering approach.
Although the new classes obtained from LGMM are more heterogeneous than the IHME classification with regard to cluster size, they are more homogeneous in the variation of their mortality rates. The largest classes are class 1 for females and class 3 for males, with 40 and 26 countries, respectively. The smallest classes are class 1 for females, with a single country (Singapore), and class 3 for males, with two countries (Singapore and Japan). The geographic distribution of the LGMM classification is depicted in Figures 3 and 4. The ISO 3 codes used for naming countries are given in the legends of these maps. To show the differences more clearly, some small countries are depicted again in separate small maps within the figures.

Discussion
In this study, we first aimed to explore the trend of the colon and rectum cancer death rate in Asia and North Africa, then to categorize these countries according to their trends, and eventually to compare these model-based clusters with the IHME regions. Pooling the male and female data for all Asian and North African countries results in a slope of 0.097 for the colon and rectum cancer mortality rate. In addition, our results showed an upward slope of 0.201 for males and a downward slope of -0.051 for females in these countries. An increasing trend in the mortality rate of this cancer has been confirmed by other researchers in some areas of Asia (Sung et al., 2005; Sung et al., 2008; Pourhoseingholi, 2012).
As mentioned before, although the majority of countries (as individual observations) in our data had an ascending trend in their colorectal cancer mortality rate, simple growth models estimated negative slopes for many IHME regions. This contrast arose from heterogeneity in the colorectal cancer mortality trend within these regions: in the latent growth modeling process, a single slope is estimated for each region, so this heterogeneity resulted in negative slopes. However, the LGMM could overcome this limitation by distinguishing countries with negative slopes from the others and classifying them as separate clusters.
Our model-based clusters showed that most of the obtained classes had a rising trend in the colorectal cancer death rate. More specifically, only one cluster (in each gender) had a downward mortality rate trend. For males, Singapore and Japan were in the smallest cluster, with a downward trend, while for females Singapore was alone in such a cluster. These results are compatible with many studies on the trend of the colorectal cancer mortality rate in Japan and Singapore (Sung et al., 2008; Katanoda et al., 2012; Lim et al., 2012; Teo and Soo, 2013; Lee, 2014). The relative improvement in the mortality rate in Singapore may be explained by the cancer control program in this country. Prevention, cancer screening for early detection and facilities for cancer patients are the main components of this cancer control program (Hock, 2002; Teo and Soo, 2013). The main activities in the field of cancer prevention are hepatitis vaccination, anti-smoking campaigns, ongoing health campaigns, healthy lifestyle promotion and awareness programs. Cancer screening for early detection includes awareness clinics, public education and making screening affordable for the relevant groups (Hock, 2002). On the other hand, the rising colorectal cancer mortality trend in other countries may reflect rising colorectal cancer incidence trends. It also illustrates a lack of colorectal cancer screening programs and of interventions to control the disease (Center et al., 2009). Further studies of the reasons for the reduction in the death rate of colorectal cancer in Singapore and Japan could provide an opportunity for other nations to screen for this cancer and control deaths due to it.
Our LGMM analysis suggested that five clusters for males and three clusters for females could appropriately explain the variation in the colorectal cancer death rate trend among the countries under study. This shows that mortality trends in colorectal cancer were more heterogeneous in males than in females. In other words, the trend of colorectal cancer mortality differed substantially among clusters in men. In contrast, females had more constant trends, and it was the different initial rates (intercepts) of colorectal cancer mortality that resulted in different clusters. This reinforces the idea that males have more exposure to risk factors such as smoking and alcohol and that, in general, females have healthier nutrition-related habits (Denton et al., 2004; de Kok et al., 2008). Variation in risk factors therefore probably leads to variation in trends.
As we described above, the LGMM has some advantages compared to other classic statistical methods. However, these models need a rather large sample size to give the best results. The small sample size (number of countries) in some clusters was one of the limitations of this study, and it possibly decreases the reliability of our statistical method. Moreover, the lack of information about death rates for all countries compelled us to use the estimates of death rates available on the IHME website. Future research is encouraged to study this issue with registered mortality data.
In conclusion, this study revealed that most Asian and North African countries had an upward trend in their colorectal cancer mortality, and that the new clusters obtained from LGMM were more homogeneous than the IHME and WHO classifications. Of course, the IHME clusters (regions) are based on variables such as economic and geographic situation. Therefore, when we wish to cluster countries based on other variables, such as the mortality rates of different cancers, we should expect to identify classes that differ from the IHME or WHO clusters. Our findings indicated that the patterns of change in death due to colon cancer are different for males and females in some countries, and that males had a steeper trend in the colorectal mortality rate. In general, we suggest that when researchers aim to classify countries according to variables such as disease mortality rates, it would be more appropriate to use a statistical modeling approach instead of reporting the results for IHME or WHO regions. This strategy helps them to extract the most information and make better-informed decisions.