• Title/Abstract/Keywords: statistical data analysis

Machine learning-based corporate default risk prediction model verification and policy recommendation: Focusing on improvement through stacking ensemble model (머신러닝 기반 기업부도위험 예측모델 검증 및 정책적 제언: 스태킹 앙상블 모델을 통한 개선을 중심으로)

  • Eom, Haneul;Kim, Jaeseong;Choi, Sangok
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.105-129 / 2020
  • This study uses corporate data from 2012 to 2018, when K-IFRS was applied in earnest, to predict default risk. The data used in the analysis totaled 10,545 rows and 160 columns, including 38 from the statement of financial position, 26 from the statement of comprehensive income, 11 from the statement of cash flows, and 76 financial ratio indices. Unlike most prior studies, which used the default event as the basis for learning about default risk, this study calculated default risk from each company's market capitalization and stock price volatility based on the Merton model. This approach addressed two limitations of the existing methodology: the data imbalance caused by the scarcity of default events, and the difficulty of reflecting the differences in default risk that exist among ordinary companies. Because the models were trained using only corporate information that is also available for unlisted companies, the default risk of unlisted companies without stock price information can be appropriately derived. The approach can therefore provide stable default risk assessment services to unlisted companies, such as small and medium-sized companies and startups, whose default risk is difficult to determine with traditional credit rating models. Although predicting corporate default risk with machine learning has been actively studied recently, model bias issues exist because most studies make predictions based on a single model. A stable and reliable valuation methodology is required for calculating default risk, given that an entity's default risk information is very widely used in the market and that sensitivity to differences in default risk is high. Strict standards are also required for the calculation methods.
The credit rating method stipulated by the Financial Services Commission in the Financial Investment Regulations calls for the preparation of evaluation methods, including verification of their adequacy, in consideration of past statistical data and experience on credit ratings and changes in future market conditions. This study reduced the bias of individual models by using stacking ensemble techniques that synthesize various machine learning models. This makes it possible to capture complex nonlinear relationships between default risk and various corporate information and to maximize the advantages of machine learning-based default risk prediction models, which take less time to calculate. To produce the sub-model forecasts used as input data for the Stacking Ensemble model, the training data were divided into seven pieces, and the sub-models were trained on the divided sets to produce forecasts. To compare the predictive power of the Stacking Ensemble model, Random Forest, MLP, and CNN models were trained on the full training data, and the predictive power of each model was then verified on the test set. The analysis showed that the Stacking Ensemble model exceeded the predictive power of the Random Forest model, which had the best performance among the single models. Next, to check for statistically significant differences between the forecasts of the Stacking Ensemble model and those of each individual model, pairs were constructed between the Stacking Ensemble model and each individual model. Because the Shapiro-Wilk normality test showed that none of the pairs followed a normal distribution, the nonparametric Wilcoxon rank sum test was used to check whether the two model forecasts in each pair showed statistically significant differences. The analysis showed that the forecasts of the Stacking Ensemble model differed statistically significantly from those of the MLP and CNN models.
In addition, this study provides a methodology that allows existing credit rating agencies to apply machine learning-based bankruptcy risk prediction, given that traditional credit rating models can also be included as sub-models to calculate the final default probability. The Stacking Ensemble techniques proposed in this study can also help design models that meet the requirements of the Financial Investment Business Regulations through the combination of various sub-models. We hope that this research will be used as a resource to increase practical use by overcoming and improving the limitations of existing machine learning-based models.

A Study on the Correlation between Uniaxial Compressive Strength of Rock by Elastic Wave Velocity and Elastic Modulus of Granite in Seoul and Gyeonggi Region (서울·경기지역 화강암의 탄성파속도와 탄성계수에 의한 암석의 일축압축강도와의 상관성 연구)

  • Son, In-Hwan;Kim, Byong-kuk;Lee, Byok-Kyu;Jang, Seung-jin;Lee, Su-Gon
    • Journal of the Society of Disaster Information / v.15 no.2 / pp.249-258 / 2019
  • Purpose: The purpose of this study is to carry out a correlation analysis and thereby estimate the uniaxial compressive strength of rock specimens from the elastic wave velocity and the elastic modulus, among the physical characteristics measured on rock specimens collected during drilling investigations in the Seoul and Gyeonggi region. Method: Laboratory experiments were conducted on 119 granite specimens to derive the correlation between the compressive strength of the rocks and their elastic wave velocity and elastic modulus. Results: For granite, the analysis of the relationship between compressive strength and the elastic wave velocity and elastic modulus found the relation equations to be less reliable as a whole. It is believed that estimating compressive strength from the elastic wave velocity and elastic modulus is less used because granite is composed of non-homogeneous particles. Conclusion: In this study, the correlation between the compressive strength of a rock and the elastic wave velocity and elastic modulus was analyzed with simple regression analysis and multiple regression analysis. The coefficient of determination ($R^2$) of the simple regression analyses was between 0.61 and 0.67, and that of the multiple regression analysis was 0.71. Thus, using multiple regression analysis when estimating compressive strength can increase the reliability of the correlation. In the future, a variety of statistical analysis techniques such as recovery analysis, artificial neural network analysis, and big data analysis may lead to more reliable results when estimating the compressive strength of a rock based on the elastic wave velocity and elastic modulus.

Mobility Change around Neighborhood Parks and Green Spaces before and after the Outbreak of the COVID-19 Pandemic (COVID-19 발생 전·후 생활권 공원녹지 모빌리티 변화 분석)

  • Choi, Ga yoon;Kim, Yong gook;Kwon, Oh kyu;Yoo, Ye seul
    • Journal of the Korean Institute of Landscape Architecture / v.51 no.4 / pp.101-118 / 2023
  • During the COVID-19 pandemic, the utilization rate of neighborhood parks and green spaces increased significantly, and the outbreak served as an opportunity to highlight the values and functions of neighborhood parks and green spaces for urban residents. This study aims to empirically analyze how citizens' movement and the use of neighborhood parks and green spaces changed before and after COVID-19 and examine the social and spatial characteristics that affected these changes. As a research method, first, people's mobility around neighborhood parks and green spaces before and after the COVID-19 pandemic was compared using signal data from telecommunication carriers. Through the analysis of changes in residence time and movement volume, the movement characteristics of citizens after COVID-19 and changes in walking-based park visits were examined. Second, the factors affecting the mobility change in neighborhood parks and green spaces were analyzed. The social and spatial characteristics that affect citizens' visits to neighborhood parks and green spaces before and after COVID-19 were examined through correlation and multiple regression analysis. Subsequently, through cluster analysis, the types of living areas for the post-COVID era were classified from the perspective of the supply and management of neighborhood parks and green spaces services, and directions for improving neighborhood parks and green spaces by type were presented. Major research findings are as follows: First, since the outbreak of COVID-19, activities within 500m of the residence have increased. The amount of stay and walking movement increased in both 2020 and 2021, which means that the need to review the quantitative standards and attractions of neighborhood parks and green spaces has increased considering the changed scope of the walking and living area. Second, the overall number of visits to neighborhood parks and green spaces by walking has increased since the outbreak of COVID-19.
The number of visits to neighborhood parks and green spaces centered on the house and the workplace increased significantly. The park green policy in the post-COVID era should be promoted by discovering underprivileged areas, focusing on areas where residential, commercial, and business facilities are concentrated, and improving neighborhood parks and green services in quantitative and qualitative terms. Third, it was found that the higher the level of park green service, the higher the amount of walking movement. It is necessary to use indicators that contribute to improving citizens' actual park green services, such as walking accessibility, rather than looking at the criteria for securing green areas. Fourth, as a result of cluster analysis, five types of neighborhood parks and green spaces were derived in response to the post-COVID era. This suggests that it is necessary to consider the socioeconomic status and characteristics of living areas and the level of park green services required in future park green policies. This study has academic and policy significance in that it has laid the basis for establishing neighborhood parks and green spaces policy in response to the post-COVID era by using various analysis methodologies such as carrier signal data analysis, GIS analysis, and statistical analysis.

The Relationship of Social Support, Stress, Health Status and Quality of Life in Caregivers of Home-stay Cancer Patient in a Community (지역사회 재가 암환자 가족의 사회적 지지 스트레스, 건강상태 및 삶의 질과의 관계)

  • Kim, Boon-Han;Kim, Tae-Su;Kim, Eui-Sook;Jung, Yun
    • Journal of Hospice and Palliative Care / v.3 no.2 / pp.144-151 / 2000
  • Purpose: This investigation aimed to identify the relationships among social support, stress, health status and quality of life in caregivers of home-stay cancer patients. Method: We used a questionnaire and obtained data from the records of 79 caregivers of home-stay cancer patients in a community. SPSS-PC for Windows was used for the data analysis, and the statistical methods used were the t-test, ANOVA and Pearson's correlation coefficient. Result: The mean score of family support (3.24) was higher than that of nurse's support (3.03). The mean score of stress was 3.52 and that of health status was 2.98. The mean score of quality of life was 2.34. The health status of caregivers was influenced by age (F=3.17, p=0.018) and education (F=3.59, p=0.032). There was a correlation between nurse's support and family support (r=0.263, p<0.05). There was a correlation between stress and health status (r=0.597, p<0.01). Quality of life was correlated with stress (r=-0.678, p<0.01) and health status (r=-0.741, p<0.01). Conclusion: These results indicate that social support, stress and health status must be considered in order to promote the quality of life of caregivers of cancer patients.

The Relationships among Quality of Life and Stress, Health-related Habits and Food Intake in Korean Healthy Adults Based on 2013 Korea National Health and Nutrition Examination Survey (건강한 성인에게서 삶의 질과 스트레스, 건강관련 생활습관, 영양소 및 음식 섭취와의 관련성 연구 - 2013 국민건강영양조사를 근거하여 -)

  • Lee, Su Bin;Choi, Hyun Jin;Kim, Mi Joung
    • Korean Journal of Community Nutrition / v.20 no.6 / pp.411-422 / 2015
  • Objectives: This study investigated the socioeconomic factors that affect quality of life (QL) in healthy adults and examined the relationship between QL and health-related habits and food intake. Methods: Subjects consisted of 1,154 healthy adults without any known disease, aged 19 to 65 years, from the 2013 Korean National Health and Nutrition Examination Survey data. We used SPSS statistical program version 20.0 for data analysis. Results: The average age and QL score of the study population were 36.7 years and 0.99 points, respectively. Males had a significantly higher QL score than females (p < 0.001), and employed subjects and those employed in permanent positions had significantly higher scores than unemployed subjects and those employed in temporary positions, respectively (p < 0.001, p < 0.05). The group that responded "almost every day" to the "frequency of binge drinking" and "frequency of disruption of daily life due to drinking" had significantly lower QL scores than the other groups (p < 0.05). Further, the scores were significantly higher for individuals who practiced "intense physical activities" and "walking" (p < 0.001). The groups that responded that they were "very stressed" showed significantly lower QL scores than the other groups (p < 0.05). There were no significant differences in QL scores according to anthropometric or biochemical indices. When subjects were divided into two groups based on average QL scores, the frequency of intake of "barbecued beef" was significantly higher while the frequency of intake of "fried eggs or rolled omelet" and "soy milk" was significantly lower in the high QL group.
Conclusions: Based on these findings, it is evident that in healthy adults without any known underlying illnesses, psychological factors such as economic activity, occupational environment, and stress are considered to have a greater impact on their QL than are nutrient intake, blood biochemical indices, and anthropometric status.

Effect of The First Authors Determine to Paper (논문의 주저자 결정에 미치는 영향)

  • Seong, Jeong-Min;Park, Yong-Duk
    • Journal of dental hygiene science / v.11 no.2 / pp.129-134 / 2011
  • The purpose of this study was to examine the awareness of authorship determination, as an aspect of research ethics, among professors who teach dental hygiene programs in Korea. Three hundred and six full-time professors and four hundred and eighty-four part-time professors at seventy-eight universities across Korea were surveyed. The collected data were analyzed statistically using the SPSS 12.0 program. The results were as follows. 1. Regarding who determined the authors of their studies, 95 respondents (55.6%) answered the advising professor and 67 respondents (39.2%) answered themselves. 2. Regarding the range and order of authors, respondents agreed most strongly that a person who interpreted the results or gave important information does not always become the first author of the article ($2.81 \pm 0.485$). 3. Among general characteristics, authorship recognition level differed statistically significantly by the number of articles published as first author (p<0.05). 4. The highest correlation, .433, was found between the items "students who helped with collection of data and references also have the right to be an author of the research article" and "a person who did the rough draft translation also has the right to be a co-author of the research article", and it was statistically significant (p<0.01). In conclusion, rules and regulations on research ethics should be publicized more widely through educational institutions.

A Survey on Fish Habitat Conditions of Domestic Rivers and Construction of Its Database (국내 어류 서식환경 조사 및 데이터베이스 구축)

  • Jung, Jin-Hong;Park, Ji-Young;Yoon, Young-Han;Lim, Hyun-Man;Kim, Weon-Jae
    • Journal of Korean Society of Environmental Engineers / v.36 no.3 / pp.221-230 / 2014
  • In restoring an ecologically damaged river, the freshwater fish that inhabit the target aquatic ecosystem are highly applicable as essential indicators. Although information about the habitat conditions of freshwater fish is a key element reflecting the biological, physical, and chemical properties of the aquatic environment, it has not been applied effectively to ecological river restoration projects in Korea because of the lack of preceding related research and an insufficient database with scattered data. To cope with these problems, based on a nationwide detailed investigation of domestic freshwater fish habitat conditions, we selected 70 species as possible candidates for flagship species, constructed a database of their population, physical, and chemical habitat properties, and suggested a methodology for applying it to river restoration projects. In particular, the utility of the database has been enhanced by additional statistical analysis presenting the resistance and optimum ranges of each physical and chemical habitat property. It is expected that the database constructed in this study can be used to calculate and evaluate the appropriate ecological flow rate and target water quality for the selected flagship species (fish), and as basic data for the restoration of river environments.

A study on the Image of Nurses and Determinants the Image (간호사이미지 결정요인에 관한 연구)

  • Yang, Il-Shim
    • Journal of Korean Academy of Nursing Administration / v.4 no.2 / pp.289-306 / 1998
  • For the continuous development of professional nursing into a powerful professional organization, it is essential that the public understand and support nursing. This research was done to identify the image of nurses and the factors that determine that image. The study subjects were 97 admitted patients and 95 family members of patients who were admitted to a university hospital and a general hospital in Seoul, and 164 parents of students in elementary, middle, and high schools in Seoul. The total number of subjects was 356. The researcher collected the data from April 13, 1998 to April 20, 1998. The research tool was developed by the researcher following a literature review. Cronbach's $\alpha$ for the tool for the image of nurses was 0.9397 and Cronbach's $\alpha$ for the tool for determinants of the image was 0.8764. The obtained data were processed with SPSS (Statistical Package for Social Science) and the results are as follows: 1. The mean score for the image of nurses was $90.40 \pm 15.15$ (range 47 to 138), indicating a positive response. 2. Analysis of the image of nurses: Four factors were identified: traditional, social, professional and personal image. The mean score for traditional image was 3.27, the second highest score, and for social image, 2.95, the lowest score. The mean score for professional image was 3.48, the highest score, and for personal image, 3.20, a lower score. 3. The image of nurses according to respondents: There were significant differences in the traditional, social, professional, and personal factors between subject groups. More positive responses were found in the patients and patients' families as compared to the students' parents. 4. Image of nurses related to general characteristics: There were significant differences by age and educational level. More negative responses were found in the 31 to 40 age group and in the more highly educated group. 5. 
Image of nurses related to experience of nurses: The respondents showed a more negative image when their experience of nurses came through the mass media, as compared to those who had talked with a patient who had been admitted to hospital. For the social image factor, a more negative attitude was revealed among those who had experience of a patient who had been admitted to hospital, as compared to other factors. 6. Determinants of the image of nurses: There were three factors, named subjective, administrative and media. The mean score for the subjective factor was 3.85, the highest of the three factors. The mean score for the administrative factor was 3.53, and that for the media factor was 3.27. 7. Determinants of the image of nurses according to respondent group: There were no significant differences (F=1.95, P=.14). Consequently, the results showed a low social image of nurses. Nurses must therefore work to improve the social image of nurses through scientific approaches and by monitoring the mass media for correct descriptions of nurses. It is also necessary that excellent education in service and politeness be continually provided in order to positively affect the personal image field. It is also important to raise the expectations of the recipients of nursing care by having a strategy for the determinants of the image of nurses that allows nurses to develop personally and professionally.

A Study on the Positioning of the Korea Dental Hygienists Association(KDHA) - Based on Undergraduates in Dental Hygienics - (대한치과위생사협회의 포지셔닝에 관한 연구 -치위생과 재학생 대상-)

  • Kim, Bit-Na;Kwon, Hong-Min
    • Journal of dental hygiene science / v.6 no.3 / pp.163-167 / 2006
  • The purpose of this study is to position the Korean Dental Hygienists Association (KDHA) among undergraduates as prospective dental hygienists, and thereby suggest KDHA's future potential businesses and promising directions from a comprehensive perspective. To meet this goal, a total of 430 undergraduates in dental hygienics were asked to take part in a questionnaire survey from November 28 to December 9, 2005. The collected data were then analyzed using SPSS WIN 12.0. The results of the data analysis can be outlined as follows: 1. Almost all respondents (95.1%) were aware of KDHA, mainly via departmental faculty (37.7%), the Internet (26.7%) and other channels. 2. It was found that KDHA's future businesses should be devoted primarily to promoting the rights and benefits of dental hygienists, and secondly to developing their capabilities. 3. In terms of KDHA membership, 73.0% of respondents said they would certainly join KDHA if they obtained the relevant qualifications, and 81.2% of respondents answered that it is necessary to pay a membership fee to KDHA if they join. 4. A test of associations between KDHA's positioning and general characteristics showed statistically significant differences in KDHA membership experience depending on age (P = .022), in the intention to join KDHA depending on grade (P = .000), and in membership fee payment depending on both age (P = .000) and grade (P = .000).

Impact of Chemotherapy on Hypercalcemia in Breast and Lung Cancer Patients

  • Hassan, Bassam Abdul Rasool;Yusoff, Zuraidah Binti Mohd;Hassali, Mohamed Azmi;Othman, Saad Bin;Weiderpass, Elisabete
    • Asian Pacific Journal of Cancer Prevention / v.13 no.9 / pp.4373-4378 / 2012
  • Introduction: Hypercalcemia is mainly caused by bone resorption due to either secretion of cytokines including parathyroid hormone-related protein (PTHrP) or bone metastases. However, hypercalcemia may occur in patients with or without bone metastases. The present study aimed to describe the effect of chemotherapy treatment, regimens and doses on calcium levels among breast and lung cancer patients with hypercalcemia. Methods: We carried out a review of medical records of breast and lung cancer patients hospitalized in years 2003 and 2009 at Penang General Hospital, a public tertiary care center in Penang Island, north of Malaysia. Patients with hypercalcemia (defined as a calcium level above 10.5 mg/dl) at the time of cancer diagnosis or during cancer treatment had their medical history abstracted, including presence of metastasis, chemotherapy types and doses, calcium levels throughout cancer treatment, and other co-morbidity. The mean calcium levels at first hospitalization before chemotherapy were compared with calcium levels at the end of or at the latest chemotherapy treatment. Statistical analysis was conducted using the Chi-square test for categorical data, logistic regression test for categorical variables, and Spearman correlation test, linear regression and the paired sample t-test for continuous data. Results: Of a total of 1,023 breast cancer and 814 lung cancer patients identified, 292 had hypercalcemia at first hospitalization or during cancer treatment (174 breast and 118 lung cancer patients). About a quarter of these patients had advanced stage cancers: 26.4% had mild hypercalcemia (10.5-11.9 mg/dl), 55.5% had moderate (12-12.9 mg/dl), and 18.2% severe hypercalcemia (13-13.9; 14-16 mg/dl). Chemotherapy lowered calcium levels significantly in both breast and lung cancer patients with hypercalcemia, in particular with the chemotherapy regimen 5-fluorouracil+epirubicin+cyclophosphamide (FEC) for breast cancer, and gemcitabine+cisplatin for lung cancer.
Conclusion: Chemotherapy decreases calcium levels in breast and lung cancer cases with hypercalcemia at cancer diagnosis, probably by reducing PTHrP levels.