• Title/Summary/Keyword: 불균형비율 (imbalance ratio)

Search Result 179, Processing Time 0.021 seconds

Resolving CTGAN-based data imbalance for commercialization of public technology (공공기술 사업화를 위한 CTGAN 기반 데이터 불균형 해소)

  • Hwang, Chul-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.1 / pp.64-69 / 2022
  • Commercialization of public technology is the transfer of government-led scientific and technological innovation and R&D results to the private sector, and is recognized as a key achievement driving economic growth. To promote technology transfer, various machine learning methods are therefore being studied to identify success factors or to match public technologies with high commercialization potential to the companies that need them. However, public technology commercialization data is tabular and heavily imbalanced, with a large gap between the numbers of success and failure cases, so machine learning performance on it is poor. In this paper, we present a method that uses CTGAN to resolve the imbalance in tabular public technology data. To verify the effectiveness of the proposed method, we ran a comparative experiment against SMOTE, a statistical approach, using actual public technology commercialization data. In many experimental cases, CTGAN was confirmed to predict successful public technology commercialization cases reliably.

Development of Prediction Model of Financial Distress and Improvement of Prediction Performance Using Data Mining Techniques (데이터마이닝 기법을 이용한 기업부실화 예측 모델 개발과 예측 성능 향상에 관한 연구)

  • Kim, Raynghyung;Yoo, Donghee;Kim, Gunwoo
    • Information Systems Review / v.18 no.2 / pp.173-198 / 2016
  • Financial distress can damage stakeholders and even lead to significant social costs. Thus, financial distress prediction is an important issue in macroeconomics. However, most existing studies on building a financial distress prediction model have only considered idiosyncratic risk factors without considering systematic risk factors. In this study, we propose a prediction model that considers both the idiosyncratic risk based on a financial ratio and the systematic risk based on a business cycle. Ultimately, we build several IT artifacts associated with financial ratio and add them to the idiosyncratic risk factors as well as address the imbalanced data problem by using an oversampling technique and synthetic minority oversampling technique (SMOTE) to ensure good performance. When considering systematic risk, our study ensures that each data set consists of both financially distressed companies and financially sound companies in each business cycle phase. We conducted several experiments that change the initial imbalanced sample ratio between the two company groups into a 1:1 sample ratio using SMOTE and compared the prediction results from the individual data set. We also predicted data sets from the subsequent business cycle phase as a test set through a built prediction model that used business contraction phase data sets, and then we compared previous prediction performance and subsequent prediction performance. Thus, our findings can provide insights into making rational decisions for stakeholders that are experiencing an economic crisis.
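
The 1:1 rebalancing with SMOTE described in this abstract can be illustrated with a short sketch. This is a minimal pure-Python illustration of SMOTE-style interpolation between minority-class neighbours, not the authors' implementation; the function name `smote`, the toy financial ratios, and the choice `k=2` are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding the point itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy data: 3 distressed firms vs 8 sound firms, two ratios each.
distressed = [(0.10, 1.2), (0.15, 1.0), (0.12, 1.4)]
sound = [(0.5 + 0.05 * i, 2.0 + 0.1 * i) for i in range(8)]

# Oversample the minority class until both groups have the same size (1:1).
balanced_distressed = distressed + smote(distressed, len(sound) - len(distressed), k=2)
```

Because every synthetic point lies on a segment between two real minority points, the new samples stay inside the region already covered by the minority class.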

Effect of Raw Brown Rice and Job's Tear Supplemented Diet on Serum and Hepatic Lipid Concentrations, Antioxidative System, and Immune Function of Rats (현미 및 율무 함유 생식이 영양불균형이 유도된 흰쥐의 체내 지질농도, 항산화체계 및 면역기능에 미치는 영향)

  • 박진영;양미자;전혜승;이진희;배희경;박태선
    • Journal of the Korean Society of Food Science and Nutrition / v.32 no.2 / pp.197-206 / 2003
  • Physiological functions of a raw grain diet composed of brown rice and Job's Tear (1 : 1) were evaluated in rats raised for 4 weeks on a nutritionally unbalanced diet including 1% cholesterol, a high proportion of animal lipids (lard : soybean oil = 8 : 2), and sub-optimal levels of vitamin and mineral mixture, along with 0.5% ethanol in drinking water. Control rats were fed the AIN-93G diet for 9 weeks, and nutritionally unbalanced rats were divided into 3 groups and fed one of the following diets with 0.5% ethanol in drinking water for another 5 weeks: unbalanced control diet (UCD), raw grain diet (RGD) (UCD + 20% brown rice and Job's Tear mixture), and cooked grain diet (CGD) (RGD autoclaved at 121$^{\circ}$C for 3 hours). Feeding UCD for 5 weeks significantly lowered the food efficiency ratio (FER) of rats relative to control animals, and supplementing UCD with the brown rice and Job's Tear mixture significantly restored the FER. Serum total cholesterol concentration was significantly lower in rats fed RGD (24% decrease) or CGD (16% decrease) than in rats fed UCD. Feeding RGD for 5 weeks significantly lowered the serum LDL+VLDL-cholesterol concentration (26% decrease) as well as the hepatic cholesterol level (16% decrease) relative to UCD rats. Animals fed CGD (38% decrease) or RGD (59% decrease) showed significantly lower levels of hepatic thiobarbituric acid reactive substances (TBARS) than rats fed UCD (p<0.05), although hepatic activities of antioxidative enzymes were not influenced by dietary supplementation. Feeding RGD for 5 weeks significantly increased the CD4$^{+}$ T-cell population and the CD4$^{+}$/CD8$^{+}$ ratio of mesenteric lymph nodes compared to UCD rats (p<0.05).
In conclusion, dietary supplementation with the brown rice and Job's Tear mixture as raw grains was superior to the cooked grain mixture of identical ingredients in lowering blood and hepatic cholesterol levels and in improving the immune function of mesenteric lymph nodes in rats.

Evaluating the Imbalance of Green Space and Establishing its Management Zone Using Spatial Analysis - Focused on the Use of Green Space - (공간분석을 활용한 녹지의 불균형 평가 및 관리권역 설정 - 녹지의 이용적 측면을 중심으로 -)

  • Lee, Woo-Sung;Jung, Sung-Gwan
    • Journal of the Korean Association of Geographic Information Studies / v.15 no.2 / pp.126-138 / 2012
  • The purpose of this study is to evaluate the imbalance of green space in Daegu using various spatial analysis methods and to establish management zones for green space based on service supply, from the perspective of its use. The total green space of Daegu is 48,936.1 ha, which ranks second among the 7 metropolitan cities of Korea. In the imbalance analysis, the Gini coefficient based on area was not high, whereas the Gini coefficient based on population was high, above 0.6. In the evaluation of green space service supply in Dalseo-gu, areas within about 100 m of large green spaces were supplied with more than 25 $m^2$ of green space per person, whereas areas such as Sangin, Jukjeon, and Yongsan were supplied with almost none. Finally, a 'Rich zone', 'Fair zone', 'Poor zone', and 'Broken zone' could be established based on service supply to guide the management of green space. The findings of this study can be used as basic data for prioritizing the construction of new green spaces.
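
The contrast between an area-based and a population-based Gini coefficient can be made concrete with a small sketch. It assumes the standard Lorenz-curve (trapezoid) definition, with districts ordered by green space per unit of weight; the function name `gini` and the district figures are hypothetical and are not taken from the paper.

```python
def gini(green, weight):
    """Gini coefficient of green space distributed over units with the given
    weights (e.g. land area or population). Weights must be positive."""
    # Order units from least to most green space per unit of weight.
    units = sorted(zip(green, weight), key=lambda gw: gw[0] / gw[1])
    total_green = sum(green)
    total_weight = sum(weight)
    x_prev = y_prev = 0.0
    area = 0.0  # twice the area under the Lorenz curve
    for g, w in units:
        x = x_prev + w / total_weight   # cumulative share of weight
        y = y_prev + g / total_green    # cumulative share of green space
        area += (x - x_prev) * (y + y_prev)
        x_prev, y_prev = x, y
    return 1.0 - area


# Hypothetical districts: green space (ha), land area (ha), population.
green_ha = [400.0, 300.0, 200.0, 100.0]
land_ha = [4000.0, 3000.0, 2000.0, 1000.0]
people = [10_000, 20_000, 120_000, 250_000]

gini_by_area = gini(green_ha, land_ha)       # green space proportional to land
gini_by_population = gini(green_ha, people)  # most people live where green is scarce
```

In this toy setup, green space is spread evenly over land but very unevenly over residents, which is the kind of gap between the two coefficients that the abstract reports.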

Application of Random Over Sampling Examples(ROSE) for an Effective Bankruptcy Prediction Model (효과적인 기업부도 예측모형을 위한 ROSE 표본추출기법의 적용)

  • Ahn, Cheolhwi;Ahn, Hyunchul
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.525-535 / 2018
  • In a classification problem, if one class occurs far more frequently than the others, a data imbalance problem arises that distorts machine learning. Corporate bankruptcy prediction often suffers from data imbalance because the share of insolvent companies is generally very low, whereas the share of solvent companies is very high. Mitigating this problem requires a proper sampling technique. Until now, oversampling techniques, which adjust the class distribution of a data set by sampling the minority class with replacement, have been widely used. However, they carry a risk of overfitting. Against this background, this study proposes the ROSE (Random Over Sampling Examples) technique, introduced by Menardi and Torelli in 2014, for effective corporate bankruptcy prediction. ROSE creates new training samples by synthesizing them from the existing ones, which improves the prediction accuracy of classifiers while avoiding the risk of overfitting. Specifically, our study proposes combining the ROSE method with SVM (support vector machine), which is known as the best binary classifier. We applied the proposed method to a real-world bankruptcy prediction case from a major Korean bank and compared its performance with that of other sampling techniques. Experimental results showed that ROSE improved the prediction accuracy of SVM in bankruptcy prediction compared to the other techniques, with statistical significance. These results suggest that ROSE can be a good alternative for resolving data imbalance in prediction problems in social science areas other than bankruptcy prediction.
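
The idea of synthesizing new training samples, rather than repeating existing ones, can be sketched as a smoothed bootstrap in the spirit of ROSE. This is a rough pure-Python illustration under stated assumptions, not the paper's or the ROSE package's implementation: it assumes Gaussian kernels with a Silverman-style rule-of-thumb bandwidth per class and feature, and the function name `rose_sample` and the toy data are hypothetical.

```python
import random
import statistics


def rose_sample(X, y, n_new, seed=0):
    """Draw n_new synthetic examples: pick a class with equal probability,
    pick one of its observations, then sample around it from a Gaussian kernel."""
    rng = random.Random(seed)
    classes = sorted(set(y))
    by_class = {c: [x for x, label in zip(X, y) if label == c] for c in classes}
    d = len(X[0])
    # Assumed bandwidth: rule-of-thumb factor times the per-feature spread in each class.
    bandwidth = {}
    for c, rows in by_class.items():
        n = len(rows)
        factor = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
        bandwidth[c] = [factor * statistics.pstdev(col) for col in zip(*rows)]
    new_X, new_y = [], []
    for _ in range(n_new):
        c = rng.choice(classes)           # each class is equally likely
        centre = rng.choice(by_class[c])  # seed observation from that class
        new_X.append(tuple(rng.gauss(v, h) for v, h in zip(centre, bandwidth[c])))
        new_y.append(c)
    return new_X, new_y


# Hypothetical toy data: label 1 = insolvent (rare), label 0 = solvent (common).
X = [(0.1, 1.0), (0.2, 1.1), (0.9, 3.0), (1.0, 3.2), (1.1, 2.9), (0.95, 3.1), (1.05, 3.05)]
y = [1, 1, 0, 0, 0, 0, 0]
new_X, new_y = rose_sample(X, y, n_new=1000)
```

Unlike sampling with replacement, the generated points are not exact copies of existing observations, which is the property the abstract credits with reducing the risk of overfitting.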

Ion-Concentrations of Discharged Nutrient Solution in Closed Perlite Culture for Cucumber (순환식 펄라이트 오이 재배의 배액내 이온 농도 변화)

  • 최영수;유수남
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2003.02a / pp.484-489 / 2003
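  • In nutrient-solution culture, the ideal is considered to be a closed (recirculating) nutrient-solution management system that tracks the uptake rate of nutrient components as the environment and the crop growth stage change, and automatically adjusts both the mixing ratio of the individual nutrients and the blending concentration of the nutrient solution and the drainage. In Korea, although the area under nutrient-solution culture is increasing, there are still almost no cases in which a fully closed nutrient-solution culture system is in use. Even at sites equipped with some closed-system equipment, nutrients are merely added to the recovered drainage to keep it at a constant concentration, so particular components in the solution become depleted or elevated, causing a severe imbalance of components. (remainder omitted)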

Comparison of sex ratio between the wild and cultured olive flounder Paralichthys olivaceus (넙치 양식산과 자연산의 성비 비교)

  • Jeong, Dal Sang;Kim, Chul Won
    • Journal of Practical Agriculture & Fisheries Research / v.16 no.1 / pp.135-141 / 2014
  • Olive flounder (Paralichthys olivaceus) has great potential value for aquaculture in Korea. The theoretical sex ratio of this flounder is close to 1:1, as in many other types of fish. However, size selection during artificial seedling production may unbalance the sex ratio. Flounder culture in practice favors female seedlings because females grow faster than males, which is economically advantageous. However, little is known about the sex ratio of cultured flounder. Six populations, three wild and three cultured, were analyzed to investigate the sex ratio. In the wild populations, the sex ratio ranged from 1.10 to 1.93 for females and from 0.52 to 0.91 for males; in the cultured populations, it ranged from 0.20 to 2.15 for females and from 0.46 to 4.88 for males. This result indicates that the sex ratio of cultured flounder varies depending on the size selection method.

Effects of Uncooked Powdered Food on Antioxidative System and Serum Mineral Concentrations in Rats Fed Unbalanced Diet (생식제품 급여가 영양불균형식이를 섭취하는 흰쥐의 항산화체계 및 혈청 무기질 농도에 미치는 영향)

  • 이여진;이해미;박태선
    • Journal of Nutrition and Health / v.36 no.9 / pp.898-907 / 2003
  • The antioxidative function of uncooked powdered food (Sangsik) was evaluated in rats consuming a nutritionally unbalanced diet including 1% cholesterol, a high proportion of animal lipids (lard : soybean oil = 8 : 2), and sub-optimal levels of vitamin and mineral mixture, along with 0.5% ethanol in drinking water. The uncooked powdered food tested in the present study was a mixture of 42 kinds of plant foods (cereals, legumes, seaweeds, vegetables, and fruits) supplemented with vitamins, minerals, and dietary fiber. Control rats were fed a semi-purified diet based on the AIN-93G composition, and nutritionally unbalanced rats were divided into 3 groups and fed one of the following diets with 0.5% ethanol in drinking water for 5 weeks: unbalanced control diet (UC), 20% Sangsik powder supplemented diet (S20), and 40% Sangsik powder supplemented diet (S40). The food efficiency ratio was significantly higher in rats fed S40 than in rats fed UC (p < 0.05). The hepatic level of thiobarbituric acid reactive substances (TBARS) was significantly lower in rats fed UC than in control rats (p < 0.05), and was not influenced by dietary supplementation with the Sangsik powder. Hepatic superoxide dismutase (SOD) activity was significantly higher in rats fed UC than in control rats (p < 0.05), and significantly lower in rats fed S20 or S40 than in unbalanced control rats. Feeding the unbalanced control diet significantly reduced the ratio of hepatic GSH-Px + catalase/SOD activities relative to control rats, and this decrease in the ratio of antioxidant enzyme activities was reversed by adding the Sangsik powder to the diet at 20% (p < 0.05). Based on the antioxidant enzyme activities, feeding the uncooked powdered diet appears to provide a favorable environment for the body's antioxidative defense mechanism.
Serum levels of Fe and Cu were significantly lower in rats fed the Sangsik powder supplemented diets than in unbalanced control rats (p < 0.05), and levels of Se, Mn, and Zn also tended to decrease with dietary supplementation of the Sangsik powder. These results suggest that ingredients used in the uncooked powdered food may decrease the bioavailability of trace elements in rats.

Parameter estimation for the imbalanced credit scoring data using AUC maximization (AUC 최적화를 이용한 낮은 부도율 자료의 모수추정)

  • Hong, C.S.;Won, C.H.
    • The Korean Journal of Applied Statistics / v.29 no.2 / pp.309-319 / 2016
  • For binary classification models, we consider a risk score that is a function of linear scores and estimate the coefficients of the linear scores. There are two estimation methods: one obtains MLEs using logistic models, and the other estimates the coefficients by maximizing AUC. Estimates from the AUC approach are better than the MLEs from logistic models in general situations where the logistic assumptions do not hold. This paper considers imbalanced data for credit assessment models, in which the default class contains fewer observations than the non-default class, and applies the AUC approach to such data. Various logit link functions are used to generate the imbalanced data. The coefficients estimated by the AUC approach are found to be equivalent to (or better than) those from logistic models for imbalanced data with low default probability.
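
Estimating coefficients by maximizing AUC can be sketched for the two-feature case. The empirical AUC below is the share of (default, non-default) pairs in which the default receives the higher score, with ties counted as one half. Because AUC depends only on the ranking of scores, the coefficient vector can be restricted to unit length and searched over an angle grid. The grid search, the function names, and the toy data are assumptions made for illustration; the paper's own optimization procedure is not described in this abstract.

```python
import math


def auc(scores_default, scores_nondefault):
    """Empirical AUC: fraction of (default, non-default) pairs ranked correctly."""
    wins = 0.0
    for s1 in scores_default:
        for s0 in scores_nondefault:
            if s1 > s0:
                wins += 1.0
            elif s1 == s0:
                wins += 0.5
    return wins / (len(scores_default) * len(scores_nondefault))


def fit_by_auc(defaults, nondefaults, steps=360):
    """Crude grid search over unit-length coefficient vectors (cos t, sin t)."""
    best_auc, best_beta = -1.0, None
    for i in range(steps):
        theta = 2 * math.pi * i / steps
        beta = (math.cos(theta), math.sin(theta))
        s1 = [beta[0] * x[0] + beta[1] * x[1] for x in defaults]
        s0 = [beta[0] * x[0] + beta[1] * x[1] for x in nondefaults]
        value = auc(s1, s0)
        if value > best_auc:
            best_auc, best_beta = value, beta
    return best_auc, best_beta


# Hypothetical imbalanced toy data: 2 defaults vs 5 non-defaults.
defaults = [(2.0, 1.0), (3.0, 0.5)]
nondefaults = [(0.0, 0.0), (1.0, 2.0), (0.5, 1.0), (1.0, 0.0), (0.0, 1.5)]
best_auc, best_beta = fit_by_auc(defaults, nondefaults)
```

The pairwise definition uses only comparisons between the two classes, so it does not depend on the class proportions, which is one reason AUC-based criteria are often considered for low-default data.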

Image-Based Skin Cancer Classification System Using Attention Layer (Attention layer를 활용한 이미지 기반 피부암 분류 시스템)

  • GyuWon Lee;SungHee Woo
    • Journal of Practical Engineering Education / v.16 no.1_spc / pp.59-64 / 2024
  • As the aging population grows, the incidence of cancer is increasing. Skin cancer appears externally, but people often do not notice it or simply overlook it. As a result, if the window for early detection is missed, the survival rate for late-stage cancer is only 7.5-11%. However, diagnosing serious skin cancer has the drawback of requiring considerable time and money for detailed examinations and cell tests, rather than a simple visual diagnosis. To overcome these challenges, we propose a skin cancer classification system built on an Attention-based CNN model. If skin cancer can be detected early, it can be treated quickly, and the proposed system can greatly assist the work of specialists. To mitigate the imbalance of image data across skin cancer types, this classification model applies the oversampling technique to data with a high distribution ratio and adds a pre-trained model without an Attention layer. This model is then compared with the model without the Attention layer. We also plan to address the data imbalance problem by strengthening data augmentation techniques for specific classes.