Analysis of SEER Adenosquamous Carcinoma Data to Identify Cause Specific Survival Predictors and Socioeconomic Disparities

The code for the SEER Clinical Outcome Prediction Expert (SCOPE) is posted on Matlab Central (www.mathworks.com). SCOPE has a number of utility programs that are adapted to handle the large SEER data pipeline. All statistics and programming were performed in Matlab (www.mathworks.com) (Cheung, 2014a; 2014b;


Introduction
The SEER registry has a massive amount of data available for analysis; however, manipulating this data pipeline can be challenging. SEER Clinical Outcome Prediction Expert (SCOPE) (Cheung, 2014c; Cheung, 2014a; Cheung, 2014d; Cheung, 2014b) was used to mine SEER data and construct accurate and efficient prediction models (Cheung et al., 2001a; Cheung et al., 2001b). The data were obtained from the SEER 18 database. SEER is a public use database that can be used for analysis with no institutional review board approval needed. SEER*Stat (http://seer.cancer.gov/seerstat/) was used for listing the cases. The filter used was: Site and Morphology.ICD-O-3 Hist/behav, malignant = '8560/3: Adenosquamous carcinoma'. This study explored a long list of socio-economic, staging and treatment factors that were available in the SEER database (Cheung, 2014a; 2014b; 2014c; 2014b; 2014e; Cheung, 2015a; 2015b; Cheung, 2015 (In press)).
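The case selection above was done inside SEER*Stat. For readers who work from an exported case listing instead, the same filter can be sketched in a few lines of Python (not the Matlab used in this study). The column names `patient_id` and `hist_behav`, the function `select_cases`, and the sample rows are hypothetical; a real SEER*Stat export may use different field names.

```python
import csv
import io

# ICD-O-3 histology/behavior code quoted in the text.
ADENOSQUAMOUS_CODE = "8560/3"


def select_cases(rows, code=ADENOSQUAMOUS_CODE, column="hist_behav"):
    """Keep rows whose histology/behavior field starts with the given code.

    `column` is a hypothetical field name used only for illustration.
    """
    return [row for row in rows if row[column].strip().startswith(code)]


# Made-up stand-in for an exported case listing (not real SEER records).
sample_export = io.StringIO(
    "patient_id,hist_behav\n"
    "1,8560/3: Adenosquamous carcinoma\n"
    "2,8070/3: Squamous cell carcinoma\n"
    "3,8560/3: Adenosquamous carcinoma\n"
)

all_rows = list(csv.DictReader(sample_export))
selected = select_cases(all_rows)
selected_ids = [row["patient_id"] for row in selected]
print(selected_ids)
```

Matching on the code prefix rather than the full label keeps the filter robust to small differences in how the histology description is spelled in an export.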

Results
There were 20712 patients included in this study (Table 1). The mean (S.D.) follow-up was 54.2 (78.4) months. 64% of the patients were female. The mean (S.D.) age was 63 (13.8) years. Of the adenosquamous carcinoma patients listed in the SEER database, 60% were adults. Only 7 patients were younger than 20 years old in the SEER data, and this was a poor prognostic factor (Table 1 and Table 2). There was a significant difference between female and male patients in the risk of cause specific death, favoring the female sex (Table 2). 46% of the patients had lung cancers. The uterus and uterine cervix were also common anatomic sites (Table 3). 30.6% of the tumors were not graded. Unknown grade had the highest risk of cause specific death, at 51.8%. The SEER stage model (localized, regional, distant, un-staged/others) was the most predictive model (ROC area of 0.71). The 4-tiered staging model was optimized to a 3-tiered model (with a ROC area of 0.67) by SCOPE (Table 1). ROC areas were used to optimize the risk models. For example, the SEER staging could be slimmed down to a 3-tiered structure while not abandoning the poor (Tables 1, 2 and 3). Among the socioeconomic factors studied, African American patients had a 53.8% risk of death compared with 43.7% for others. However, this level of difference raised the ROC area only mildly, to 0.52 (Table 1). Rural residence and living in a cosmopolitan area were associated with 48.7% and 44.2% risks of cause specific death, respectively (Tables 1, 2 and 3).
The overall risk of adenosquamous carcinoma death was about 44.73% for patients listed in SEER. The risks were 19.1% and 45.3% for localized and regional adenosquamous carcinoma, respectively (Table 2). Age older than 20 years correlated with a higher percentage mortality during the study period from 1973 to 2009 (Table 1 and Table 2). RT with external beam was associated with a 54.5% risk of death, compared with a 32.5% risk of death without RT (Table 2). Patients who had surgery had a 34% risk of death, compared with a 66% risk of death among patients who did not have surgery.

Discussion
This study is interested in constructing models that will aid patient and treatment selection for adenosquamous carcinoma patients. To that end, this study examined the ROC models (Hanley and McNeil, 1982) of a long list of potential explanatory factors (Table 1). ROC models take into account both the sensitivity and the specificity of the prediction. An ideal model would have a ROC area of 1, and a random model is expected to have an area of 0.5 (Hanley and McNeil, 1982; Cheung, 2014c; Cheung, 2014a; Cheung, 2014d; Cheung, 2014b; Cheung, 2014e; Cheung, 2015b; Cheung, 2015a; Cheung, 2015 (In press)). For example, a clinical ROC model can be used to predict whether a patient receiving the recommended treatment will die from the disease. To remain consistent over decades, SEER stage abstracts staging into simple but important stages of cancer progression: localized, regional and distant. Stage was the most predictive of patient outcome (Table 1). The ROC area of stage, 0.71, was higher than the 0.65 of surgery. Thus complete staging is important in this disease, and it may improve patient selection and counseling.
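The ROC area of a tiered risk model can be computed directly from the counts in each tier. The following is a minimal sketch in Python (not the Matlab SCOPE code), assuming the rank-based interpretation of the ROC area from Hanley and McNeil (1982): the probability that a randomly chosen patient who died sits in a higher risk tier than a randomly chosen patient who did not, with ties counted as one half. The function name `roc_area` and the counts are hypothetical and are not taken from Table 1.

```python
def roc_area(strata):
    """ROC area of an ordinal risk model.

    `strata` is a list of (events, non_events) pairs ordered from the
    lowest to the highest risk tier. Ties within a tier count as 1/2.
    """
    total_events = sum(events for events, _ in strata)
    total_non_events = sum(non_events for _, non_events in strata)
    if total_events == 0 or total_non_events == 0:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    non_events_below = 0  # non-events in strictly lower tiers
    for events, non_events in strata:
        # Each event outranks every non-event in a lower tier and ties
        # with the non-events in its own tier.
        concordant += events * (non_events_below + 0.5 * non_events)
        non_events_below += non_events
    return concordant / (total_events * total_non_events)


# Hypothetical three-tier model: (deaths, survivors) per tier.
example_strata = [(10, 90), (50, 50), (90, 10)]
example_area = roc_area(example_strata)

# A single tier cannot discriminate, so it behaves like a random model.
random_area = roc_area([(5, 5)])

# Tiers that separate deaths from survivors completely give an ideal model.
ideal_area = roc_area([(0, 10), (10, 0)])

print(round(example_area, 3), random_area, ideal_area)
```

The two boundary cases match the reference values quoted above: an uninformative model gives an area of 0.5 and a perfectly separating model gives an area of 1.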
After binary fusion by SCOPE (Table 1), the 4-tiered stage model was reduced to a 3-tiered model based on ROC area calculations (Table 1). Un-staged status was associated with a high risk of cause specific death (Table 2). However, there is no a priori reason to place it between localized and distant. Thus it was left as a high risk factor. The solution to the uncertainty in the placement of these cases is to complete the staging (Cheung, 2014c; Cheung, 2014a; Cheung, 2014d; Cheung, 2014b). The binary fusion was performed to demonstrate how a complex predictive model could be numerically optimized to a much simpler model that may also be useful (Cheung, 2014c; Cheung, 2014a; Cheung, 2014d; Cheung, 2014b).
When there are competing prediction or prognostic models, the most efficient (i.e. the simplest) model is thought to prevail (D'Amico et al., 1998). This has an information theoretic underpinning (Cheung, 2014c; Cheung, 2014a; Cheung, 2014d; Cheung, 2014b). For practical purposes, simpler models require fewer patients for a randomized trial because fewer risk strata need to be balanced using epidemiology data (Cheung, 2014a; 2014b; 2014c; 2014b; 2014e; Cheung, 2015a; 2015b; Cheung, 2015 (In press)). In the clinic, simpler models are easier to use. SCOPE streamlined ROC models by binary fusion (Table 1). Two adjacent strata were tested iteratively to see if they could be combined without sacrificing the higher predictive power that usually belongs to the more complex models (Cheung, 2014c; Cheung, 2014a; Cheung, 2014d; Cheung, 2014b). This study has shown that SCOPE can build efficient and accurate prediction models (Cheung, 2014c; Cheung, 2014a; Cheung, 2014d; Cheung, 2014b). For the optimized stage model (Table 1), the ROC area of 0.67 was modestly higher than that of surgery. As a point of reference, the prostate risk model we computed had an accuracy of 0.75 in predicting biochemical failure (Cheung et al., 2001a; Cheung et al., 2001b). Low ROC areas imply that the information content (i.e. the staging accuracy) of the models may be limited. This is consistent with the fact that most patients did not have complete grading or staging (Table 2). This is an area for improvement, and improvement may follow from having a better guidance model for treatment and patient selection.
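The iterative test of adjacent strata can be illustrated with a short Python sketch. This is one plausible reading of binary fusion and is not the SCOPE implementation: at each pass it tries every merge of two adjacent tiers, keeps the merge that preserves the most ROC area, and stops once the loss relative to the full model exceeds a tolerance. The names `roc_area` and `fuse_adjacent`, the `tolerance` value of 0.01, and the counts are all hypothetical.

```python
def roc_area(strata):
    """ROC area for (events, non_events) tiers ordered by rising risk."""
    total_events = sum(events for events, _ in strata)
    total_non_events = sum(non_events for _, non_events in strata)
    concordant = 0.0
    non_events_below = 0
    for events, non_events in strata:
        concordant += events * (non_events_below + 0.5 * non_events)
        non_events_below += non_events
    return concordant / (total_events * total_non_events)


def fuse_adjacent(strata, tolerance=0.01):
    """Greedily merge adjacent tiers while the ROC area stays within
    `tolerance` of the original, unmerged model (assumed stopping rule)."""
    current = [tuple(tier) for tier in strata]
    baseline = roc_area(current)
    while len(current) > 2:
        best_area, best_model = None, None
        for i in range(len(current) - 1):
            merged = (current[i][0] + current[i + 1][0],
                      current[i][1] + current[i + 1][1])
            candidate = current[:i] + [merged] + current[i + 2:]
            area = roc_area(candidate)
            if best_area is None or area > best_area:
                best_area, best_model = area, candidate
        if baseline - best_area > tolerance:
            break  # every remaining merge costs too much predictive power
        current = best_model
    return current


# Hypothetical four-tier model: (deaths, survivors) per tier.
four_tier = [(10, 90), (12, 88), (50, 50), (90, 10)]
fused = fuse_adjacent(four_tier)
print(fused, round(roc_area(four_tier), 4), round(roc_area(fused), 4))
```

In this made-up example the two lowest tiers have nearly the same death rate, so merging them costs little ROC area, while any further merge would exceed the tolerance. A different tolerance or stopping rule would yield a different simplified model.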
Adenosquamous carcinoma is an aggressive disease: even for early stage cancer, there was a 19% risk of adenosquamous carcinoma death despite treatment (Table 2).
In conclusion, this study has identified that staging models are the most prognostic of treatment outcomes for adenosquamous carcinoma patients. The high under-staging rates may have prevented patients from selecting definitive local therapy and may have contributed to the poor outcomes in patients with this aggressive disease.