Analysis of SEER Glassy Cell Carcinoma Data: Underuse of Radiotherapy and Predictors of Cause Specific Survival



Introduction
Glassy cell carcinoma is a relatively new and rare diagnosis. There are very few reports on the clinical history of this disease, and those published so far have been small series and case reports (Takahashi et al., 2011; Zhu and Li, 2011; Garg and Arora, 2012). This is the largest study to use the Surveillance, Epidemiology and End Results (SEER) cancer registry data to analyze the prognostic and socioeconomic factors affecting the outcome of glassy cell carcinoma.
SEER (http://seer.cancer.gov/) is a public use cancer registry of the United States of America (US). SEER is funded by the National Cancer Institute and the Centers for Disease Control and Prevention to cover 28% of all oncology cases in the US. SEER started collecting data in 1973 for 7 states and metropolitan registries. Its main purpose is to decrease the burden of cancer by collecting and distributing data on cancer. SEER data are used widely as a benchmark data source for studying cancer outcomes in the US and in other countries (Ognjanovic et al., 2009; Sultan et al., 2009; Cheung et al., 2010; McDowell et al., 2010; Pappo et al., 2010; Bhatia, 2011; Perez et al., 2011). The extensive ground coverage by the SEER data is ideal for identifying the disparity in oncology outcome and

Abstract
Background: This study used receiver operating characteristic (ROC) curves to analyze Surveillance, Epidemiology and End Results (SEER) data for glassy cell carcinoma, in order to identify predictive models and potential disparities in outcome. Materials and Methods: This study analyzed socio-economic, staging and treatment factors. For risk modeling, each factor was fitted by a generalized linear model to predict the cause specific survival. Areas under the ROC curves were computed. Similar strata were combined to construct the most parsimonious models. A random sampling algorithm was used to estimate modeling errors. The risk of glassy cell carcinoma death was computed for each predictor for comparison. Results: There were 79 patients included in this study. The mean follow up time (S.D.) was 37 (32.8) months. Female patients outnumbered males 4:1. The mean (S.D.) age was 54.4 (19.8) years. SEER stage was the most predictive factor of outcome (ROC area of 0.69). The risks of cause specific death were, respectively, 9.4% for localized, 16.7% for regional, 35% for the un-staged/others category, and 60% for distant disease. After optimization, the separation between the regional and un-staged/others categories was removed, giving a higher ROC area of 0.72. Several socio-economic factors had small but measurable effects on outcome. Radiotherapy had not been used in 90% of patients with regional disease. Conclusions: Optimized SEER stage was predictive and useful in treatment selection. Underuse of radiotherapy may have contributed to poor outcome.
Keywords: Glassy cell carcinoma - radiotherapy - SEER registry - under usage - cause specific survival

Rex Cheung

treatment in different geographical and cultural areas for cancers in the US, and could serve as a model for global public health registries (Cheung, 2014a; 2014b; 2014c; 2014d; 2015a; 2015b; 2016 (in press)).
In addition to the biological staging factors and the treatment factors, this database also contains a large number of county-level socio-economic factors. This study aimed to explore the potential barriers to good treatment outcome that may be discernible from a national database.

Materials and Methods
The SEER registry has a massive amount of data available for analysis; however, manipulating this data pipeline can be challenging. SEER Clinical Outcome Prediction Expert (SCOPE) was used to mine SEER datasets and construct accurate and efficient prediction models (Cheung, 2014a; 2014b; 2014c). The data were obtained from the SEER 18 database. SEER is a public use database that can be used for analysis without institutional review board approval. SEER*Stat (http://seer.cancer.gov/seerstat/) was used for listing the cases. The filter used was: Site and Morphology. ICD-O-3 Hist/behav, malignant = '8015/3: Glassy cell carcinoma'. This study explored a long list of socio-economic, staging and treatment factors that were available in the SEER database. The codes of SCOPE are posted on Matlab Central (www.mathworks.com). SCOPE has a number of utility programs that are adapted to handle the large SEER data pipeline. All statistics and programming were performed in Matlab (www.mathworks.com). The areas under the receiver operating characteristic (ROC) curve were computed. Similar strata were fused to make more efficient models if the ROC performance did not degrade (Cheung et al., 2001a; Cheung et al., 2001b).

Results
(Cheung, 2015 (in press)) varied greatly with the stage of the disease and other factors (Table 1 and Table 2). About half of the cases arose in the cervix uteri (Table 3), but the disease also occurred in the lung, corpus uteri, pancreas, breast and prostate. More than fifty percent of patients did not receive radiation treatment (Figure 1 and Table 1). The predictive power of each model was measured by the area under the ROC curve (Table 1). Table 1 shows that the SEER stage was the most predictive factor of outcome (ROC area of 0.69). The risks of cause specific death were, respectively, 9.4% for localized, 16.7% for regional, 35% for the un-staged/others category, and 60% for distant disease.
After optimization (Table 1), the separation between the regional and un-staged/others categories was removed, giving a higher ROC area of 0.72. Radiotherapy had not been used in 90% of patients with regional disease (Figure 1). Poorly differentiated and undifferentiated glassy cell carcinoma had a 52% risk of cause specific death (Table 1 and Table 2). Among the socio-economic factors, lower county family income level and rural residence were associated with better prognosis (Table 2). This finding is worth further investigation.

Discussion
This study aimed to construct models that will aid patient and treatment selection for glassy cell carcinoma patients. To that end, this study examined the ROC models (Hanley and McNeil, 1982) of a long list of potential explanatory factors (Table 1). ROC models take into account both the sensitivity and the specificity of the prediction. An ideal model would have a ROC area of 1, and a random model is expected to have an area of 0.5 (Hanley and McNeil, 1982). For example, a clinical ROC model can be used to predict whether a patient receiving the recommended treatment will die from the disease. SEER stage was the most predictive of patient outcome (Table 1). SEER stage had a ROC area of 0.69, the highest among the factors tested. Thus complete staging is important for patient selection and counseling.
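To make the ROC area concrete, the following is a minimal Python sketch, assuming each stratum of a factor is scored by its observed risk of cause specific death (a saturated logistic model on a single categorical factor gives the same fitted risks). It is not the SCOPE Matlab code; the function names and the toy cohort are hypothetical and are not SEER data. The area is computed as the Mann-Whitney statistic discussed by Hanley and McNeil (1982): the probability that a randomly chosen patient who died has a higher score than a randomly chosen patient who did not, with ties counted as one half.

```python
from collections import defaultdict


def stratum_risks(strata, died):
    """Observed risk of cause specific death within each stratum."""
    totals, events = defaultdict(int), defaultdict(int)
    for s, d in zip(strata, died):
        totals[s] += 1
        events[s] += d
    return {s: events[s] / totals[s] for s in totals}


def roc_area(scores, died):
    """Area under the ROC curve as a Mann-Whitney statistic (ties count 1/2)."""
    pos = [x for x, d in zip(scores, died) if d == 1]  # patients who died
    neg = [x for x, d in zip(scores, died) if d == 0]  # patients who did not
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy cohort, invented for illustration (NOT SEER data).
strata = ["localized"] * 10 + ["regional"] * 6 + ["distant"] * 4
died = [1] + [0] * 9 + [1, 1] + [0] * 4 + [1, 1, 1, 0]

risks = stratum_risks(strata, died)       # per-stratum observed risk
scores = [risks[s] for s in strata]       # each patient gets the stratum risk
area = roc_area(scores, died)
print(risks, round(area, 3))
```

In this sketch a factor that gives every patient the same score yields an area of 0.5, and a factor that separates deaths from survivors perfectly yields an area of 1.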
After binary fusion by SCOPE (Cheung, 2014a; Cheung, 2014b), the 4-tiered staging was reduced to a 3-tiered staging based on ROC area calculations (Table 1). Un-staged/others was associated with an intermediate risk of cause specific death (Table 2). Although the un-staged category was fused with the regional category in this study, there was no a priori reason to put the un-staged category between the regional and metastatic categories. The solution to the uncertainty in placing these cases is to complete the staging. The binary fusion was performed to demonstrate how a complex predictive model could be numerically optimized to a much simpler model that may also be useful.
When there are competing prediction or prognostic models, the most efficient (i.e. the simplest) model is thought to prevail (D'Amico et al., 1998). This has an information theoretic underpinning. For practical purposes, simpler models require fewer patients for randomized trials because fewer risk strata need to be balanced. In the clinic, simpler models are easier to use. SCOPE streamlined ROC models by binary fusion (Table 1). Two adjacent strata were tested iteratively to see if they could be combined without sacrificing the higher predictive power that usually belongs to the more complex models. This study has shown that SCOPE can build efficient and accurate prediction models.
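The binary fusion step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SCOPE implementation: strata are scored by their observed risk of cause specific death, adjacent strata are merged greedily, and a merge is kept when the ROC area falls by no more than a hypothetical tolerance `tol`. The function names and the toy cohort are invented for illustration and are not SEER data.

```python
def roc_area(scores, died):
    """Area under the ROC curve as a Mann-Whitney statistic (ties count 1/2)."""
    pos = [x for x, d in zip(scores, died) if d == 1]
    neg = [x for x, d in zip(scores, died) if d == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def grouped_area(groups, strata, died):
    """ROC area when each group of strata is scored by its observed risk."""
    group_of = {s: i for i, g in enumerate(groups) for s in g}
    n = [0] * len(groups)
    events = [0] * len(groups)
    for s, d in zip(strata, died):
        n[group_of[s]] += 1
        events[group_of[s]] += d
    scores = [events[group_of[s]] / n[group_of[s]] for s in strata]
    return roc_area(scores, died)


def fuse_adjacent(order, strata, died, tol=0.0):
    """Greedily merge adjacent strata while the ROC area drops by at most tol."""
    groups = [[s] for s in order]          # start with one group per stratum
    best = grouped_area(groups, strata, died)
    merged = True
    while merged and len(groups) > 1:
        merged = False
        for i in range(len(groups) - 1):
            trial = groups[:i] + [groups[i] + groups[i + 1]] + groups[i + 2:]
            area = grouped_area(trial, strata, died)
            if area >= best - tol:         # assumption: tol is a free parameter
                groups, best, merged = trial, area, True
                break                      # restart the scan on the new grouping
    return groups, best


# Hypothetical toy cohort, invented for illustration (NOT SEER data).
order = ["localized", "regional", "unstaged", "distant"]
strata = ["localized"] * 10 + ["regional"] * 4 + ["unstaged"] * 8 + ["distant"] * 4
died = [1] + [0] * 9 + [1] + [0] * 3 + [1, 1] + [0] * 6 + [1, 1, 1, 0]

fused, fused_area = fuse_adjacent(order, strata, died)
print(fused, round(fused_area, 3))
```

In this toy cohort the regional and unstaged strata have the same observed risk, so they fuse without any loss of ROC area, while the localized and distant strata stay separate. With observed risks as scores a merge cannot raise the area, which is why this sketch uses a tolerance; the increase reported in Table 1 comes from SCOPE and its own modeling procedure.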
For radiotherapy, the ROC area of 0.56 (Table 1) was only modestly above 0.5. As a point of reference, the prostate risk model we computed scored 0.75 in its accuracy of predicting biochemical failure (Cheung et al., 2001a; Cheung et al., 2001b). Low ROC areas imply that the information content (i.e. the staging accuracy) of the models may be limited. This is consistent with the fact that most patients did not have complete grading or staging (Table 2). This is an area for improvement, which may follow from having a better guidance model for treatment and patient selection.
Glassy cell carcinoma is an aggressive disease. With regional spread, there was a 16.7% risk of glassy cell carcinoma death (Table 2) despite treatment. Radiotherapy (RT) was used in only 8% of these patients (Figure 1). Thus radiation oncologists should be more attentive in recommending RT for these patients.
In conclusion, this study has identified staging models as the most prognostic of treatment outcomes in glassy cell carcinoma patients. The high under-staging rates may have prevented patients from selecting definitive local therapy. The low rates of radiotherapy use after surgery may have contributed to the poor outcome in patients with this aggressive disease.