• Title/Summary/Keyword: c-index


Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review
    • /
    • v.16 no.3
    • /
    • pp.161-177
    • /
    • 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts should be employed to assess the ratings. As a result, the data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM) have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model, and apply it to corporate credit rating prediction in order to enhance the accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies like Lorena and de Carvalho (2008), and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results from the studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate the biological evolution phenomenon. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results.
In particular, the mutation operator prevents GA from falling into local optima, so it can find the globally optimal or a near-optimal solution. GA has popularly been applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including the one-way ANOVA and the stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled as four classes: 1(A1); 2(A2); 3(A3); 4(B and C). Eighty percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also experimented with several comparative models including MDA, MLOGIT, CBR, ANN and MSVM. In the case of MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that enables GA. The other comparative models were run using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the comparative models.
In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting the corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same across the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than those of other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
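
The McNemar comparison mentioned at the end of this abstract can be illustrated with a minimal sketch. The abstract does not say which variant of the test was used, so the exact (binomial) form below is an assumption, and the function name `mcnemar_exact` and the discordant counts are hypothetical, not taken from the paper.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: cases model A classified correctly and model B incorrectly
    c: cases model A classified incorrectly and model B correctly
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair favours either model
    # with probability 0.5, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts, for illustration only.
p_value = mcnemar_exact(b=10, c=2)
print(f"two-sided exact p-value: {p_value:.4f}")
```

Only the cases on which the two classifiers disagree enter the test, which is why it suits paired comparisons of models evaluated on the same validation samples.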

A study on dietary habits, nutrient intakes and dietary quality in adults of a health screening and promotion center according to non-alcoholic fatty liver disease (건강증진센터 고객의 비알콜성 지방간 유무에 따른 식습관 및 영양섭취, 식사의 질에 관한 연구)

  • Chang, Ji Ho;Lee, Hye Seung;Kang, Eun Hee
    • Journal of Nutrition and Health
    • /
    • v.47 no.5
    • /
    • pp.330-341
    • /
    • 2014
  • Purpose: The purpose of this study was to evaluate dietary habits, food intakes, nutrient intakes, and diet quality in relation to non-alcoholic fatty liver disease (NAFLD) among adults at a health screening and promotion center. Methods: The total number of study subjects was 10,111 adults, of whom 3,087 (30.5%) were diagnosed with NAFLD. The dietary intakes were obtained using a food frequency questionnaire. They were then compared with the dietary reference intakes for Koreans (KDRIs). Results: Mean age of subjects in the normal group was $52.9{\pm}10.3$ yrs and body mass index (BMI) was $22.4{\pm}2.6$ kg/m$^2$, and those of the NAFLD group were $55.1{\pm}9.2$ yrs and $25.4{\pm}2.9$ kg/m$^2$. BMI and blood pressure of the NAFLD group were significantly higher than those of the normal group. The rates of skipping breakfast, overeating, and eating out were significantly higher in the NAFLD group (p < 0.05, p < 0.000, p < 0.000 respectively). The speed of eating was fast in the NAFLD group (p < 0.000). The NAFLD group consumed significantly higher amounts of grains, meats, fish, seaweeds, kimchi, sugars, sweets, coffee, teas, and oils compared to the normal group (p < 0.05). Meanwhile, intakes of starch products, fruits, milk, and milk products were significantly lower in the NAFLD group compared with those of the normal group (p < 0.05). Riboflavin, calcium, and dietary fiber nutrient adequacy ratio (NAR) of the NAFLD group were significantly lower than those of the normal group. The Korean's dietary diversity score (KDDS) of the NAFLD group was lower than that of the normal group. Conclusion: In conclusion, we suggest that diet guidelines, such as increasing the intake of calcium and dietary fiber and reducing the intake of energy, fat, and simple carbohydrates, are necessary for the improvement of NAFLD.
The results could be used in the future for development of diet and nutrition guidelines for NAFLD.

A survey on daily physical activity level, energy expenditure and dietary energy intake by university students in Chungnam Province in Korea (충남지역 대학생의 신체활동수준, 에너지소비량 및 에너지섭취량 조사)

  • Kim, Sun Hyo
    • Journal of Nutrition and Health
    • /
    • v.46 no.4
    • /
    • pp.346-356
    • /
    • 2013
  • This study investigated the daily physical activity level, energy expenditure, energy balance, and body composition of university students and the relationships among them. The participants were 130 male students ($19.5{\pm}0.5$ yrs) and 139 female students ($19.5{\pm}0.3$ yrs) at a university in Chungnam province. Physical activity level was evaluated by an equation based on a 24-hr activity record, and dietary nutrient intake was evaluated using the food record method during a three-day period consisting of two weekdays and one weekend day. Body composition was measured using Inbody 430 (Biospace Co., Cheonan, Korea). As a result, mean body mass index (BMI) of subjects indicated that they had normal weight; however, mean body fat ratio was $19.1{\pm}5.4\%$ for males and $28.4{\pm}5.0\%$ for females, indicating higher body fat than expected for normal weight. Daily mean physical activity level was 1.55 for males and 1.47 for females, both of which were regarded as 'low active'. Females had more light activity than males (p<0.01). Daily mean energy expenditure was $2,803.5{\pm}788.9$ kcal/d for males and $1,915.4{\pm}510.2$ kcal/d for females (p<0.001). Daily mean dietary energy intake was $2,327.0{\pm}562.5$ kcal/d for males and $1,802.1{\pm}523.6$ kcal/d for females (p<0.001), and daily mean energy balance was $-476.5{\pm}955.9$ kcal/d for males and $-113.3{\pm}728.1$ kcal/d for females (p<0.01). Daily mean dietary intake of protein, vitamins, and minerals, except Ca, satisfied recommended nutrient intake. Daily energy expenditure was positively related to body weight (p<0.01), BMI (p<0.01), and fat free mass ratio (p<0.05), but was negatively related to body fat ratio (p<0.01). In conclusion, subjects had a negative energy balance and low physical activity. They had a normal weight by BMI but had more fat than normal weight by body fat ratio. This appears to be related to their low physical activity.
Thus, nutrition education should be provided for university students in order to increase their physical activity for maintenance of normal weight by body composition and health promotion.
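
The reported energy balance figures are consistent with a simple difference between mean intake and mean expenditure. The short check below is a sketch under that assumption (the abstract does not state the formula explicitly); the function name `energy_balance` is hypothetical, and all values are the group means quoted in the abstract, taken to be in kcal/d.

```python
# Group means in kcal/d, as quoted in the abstract.
reported = {
    "male": {"intake": 2327.0, "expenditure": 2803.5, "balance": -476.5},
    "female": {"intake": 1802.1, "expenditure": 1915.4, "balance": -113.3},
}


def energy_balance(intake_kcal: float, expenditure_kcal: float) -> float:
    """Energy balance, assumed here to be intake minus expenditure."""
    return intake_kcal - expenditure_kcal


for sex, values in reported.items():
    computed = energy_balance(values["intake"], values["expenditure"])
    print(f"{sex}: computed {computed:.1f} kcal/d, "
          f"reported {values['balance']:.1f} kcal/d")
```

A negative value means that, on average, the group expended more energy than it consumed over the recording period.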

The Effects of Supplemental Bacterial Phytase to the Calcium and Nonphosphorus Levels in Feed of Laying Hens (산란계 사료 내 칼슘 및 무기태 인 수준에 따른 Bacterial Phytase 급여 효과)

  • Kang, H.K.;Park, S.Y.;Yu, D.J.;Kim, J.H.;Kang, G.H.;Na, J.C.;Kim, D.W.;Suh, O.S.;Lee, S.J.;Lee, W.J.;Kim, S.H.
    • Korean Journal of Poultry Science
    • /
    • v.35 no.2
    • /
    • pp.143-151
    • /
    • 2008
  • This study was conducted to identify the relationship between bacterial phytase ($Transphos^{(R)}$) and the calcium level in feed. A total of 720 21-week-old HyLine Brown laying hens were used, with 2 birds of similar weight placed in each cage. The experiment used a $3{\times}2{\times}3$ factorial design with 3 levels of phytase (0, 300, and 1,000 DPU/kg), 2 levels of calcium (3.5% and 4.0%), and 3 levels of NPP addition: 0% (0.095% NPP), 0.5% (0.185% NPP), and 1.0% (0.275% NPP). The diets were formulated to contain 2,800 kcal/kg ME and 16% crude protein. The diet was fed ad libitum and 17 hours of lighting was provided throughout the experimental period. Egg production tended to increase in the group given 300 DPU of bacterial phytase, and the incidence of cracked eggs tended to decrease in the Transphos-added groups. Egg production did not differ significantly by dietary calcium level, whereas it was lower in the group without added NPP (0.095% NPP) than in the NPP-added groups (P<0.05). The highest mean egg weight and the highest daily egg mass were detected in the 300 DPU phytase group. Although the mean egg weight was significantly higher in treatment groups fed the 3.5% calcium feeds (P<0.05), daily egg mass did not differ among treatment groups. The mean egg weight and daily egg mass were lowest in the group without added NPP (0.095% NPP) compared to the other treatment groups (P<0.05). Feed intake showed a similar pattern regardless of the bacterial phytase and calcium levels in the diet. However, the treatment groups fed diets containing NPP levels of 0.275% and 0.165% showed significantly higher feed intake than the group fed 0.095% NPP (P<0.05). Although feed conversion was not affected by calcium and NPP levels in the diet, the most improved result was obtained in the 300 DPU phytase group (P<0.05).
The eggshell breaking strength and thickness increased as the dietary calcium level increased. The treatment groups fed diets containing 0.275% and 0.165% NPP showed improved eggshell breaking strength and yolk color index compared to the treatment group without added NPP (0.095% NPP). The results of the present study suggest that the appropriate level of microbial phytase is 300 DPU and that, at this level, tricalcium phosphate supplementation in feed can be reduced to 40% of the NRC recommendation. A higher calcium level in feed failed to show a synergistic effect with added microbial phytase.
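
The factorial layout described in this abstract can be enumerated with a short sketch. The factor levels are those quoted in the abstract; the variable names are hypothetical, and the even split of hens across treatment combinations is an assumption, since the abstract does not state the number of replicates.

```python
from itertools import product

# Factor levels as quoted in the abstract.
phytase_dpu_per_kg = [0, 300, 1000]
calcium_pct = [3.5, 4.0]
npp_addition_pct = [0.0, 0.5, 1.0]

# Every combination of the three factors is one dietary treatment.
treatments = list(product(phytase_dpu_per_kg, calcium_pct, npp_addition_pct))

TOTAL_HENS = 720
HENS_PER_CAGE = 2
cages = TOTAL_HENS // HENS_PER_CAGE

# Assumption: hens were allocated evenly across treatment combinations.
hens_per_treatment = TOTAL_HENS // len(treatments)

print(f"{len(treatments)} treatment combinations, {cages} cages")
print(f"{hens_per_treatment} hens per treatment if evenly allocated")
for phytase, calcium, npp in treatments[:3]:
    print(f"phytase={phytase} DPU/kg, Ca={calcium}%, NPP addition={npp}%")
```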

A Study on Nutritional Intake Status and Health-related Behaviors of the Elderly People in Gyeongsan Area (경산시 노인의 영양섭취상태 및 건강관련인자에 관한 연구)

  • Yang, Kyung-Mi
    • Journal of the Korean Society of Food Science and Nutrition
    • /
    • v.34 no.7
    • /
    • pp.1018-1027
    • /
    • 2005
  • The purpose of this study was to investigate nutrient intake and health-related behaviors in elderly people residing in Gyeongsan-si, Gyeongbuk who have no problem in daily living. Information on general characteristics of the elderly, health-related behaviors and dietary habits was obtained by interview based on a questionnaire. Dietary nutrient intake data were obtained through the 24-hour recall method. The subject group of this study was composed of 113 males and 112 females, with an average age of $73.1\pm6.06$ years. Among health-related factors, $76.9\%$ of subjects exercised regularly. The rates of alcohol drinking and smoking were $38.2\%$ and $22.2\%$, respectively. Many of the elderly had neuralgia, hypertension, and gastrointestinal disease, particularly females. Average heights of the subjects were lower than the standard established in the Korean Recommended Dietary Allowances, and average weights were similar to the standards. The mean BMI and WHR were 24.8 (male 23.7, female 25.7) and 0.92 (male 0.92, female 0.89), respectively. Most of the subjects had a regular meal pattern, consuming three meals a day, and many elderly, especially more than $79.5\%$ of females, prepared the meals for themselves. Mean daily energy intakes and RDA percentage of energy intakes of the male and female subjects were estimated as 1426.9 kcal $(79.3\%)$ and 1381.3 kcal $(86.3\%)$, respectively. Mean daily intakes of nutrients were estimated as 48.1 g for protein, 411.3 mg for calcium, 8.05 mg for iron, 541.8 R.E. for vitamin A, 0.84 mg for vitamin $B_1$, and 0.79 mg for vitamin $B_2$. Most nutrients except protein, calcium, iron and vitamin $B_2$ were consumed at over $75\%$ of the RDA. Elderly females showed significantly lower intakes (p<0.05) for most of the nutrients except calcium, phosphorus and vitamin E than the elderly males.

The Variation of Natural Population of Pinus densiflora S. et Z. in Korea (III) -Genetic Variation of the Progeny Originated from Mt. Chu-wang, An-Myon Island and Mt. O-Dae Populations- (소나무 천연집단(天然集團)의 변이(變異)에 관(關)한 연구(硏究)(III) -주왕산(周王山), 안면도(安眠島), 오대산(五臺山) 소나무집단(集團)의 차대(次代)의 유전변이(遺傳變異)-)

  • Yim, Kyong Bin;Kwon, Ki Won
    • Journal of Korean Society of Forest Science
    • /
    • v.32 no.1
    • /
    • pp.36-63
    • /
    • 1976
  • The purpose of this study is to elucidate the genetic variation of the natural forest of Pinus densiflora. Three natural populations of the species, which are considered to be of superior quality phenotypically, were selected. The locations and conditions of the populations are shown in tables 1 and 2. The morphological traits of tree and needle and some other characteristics were presented already in our first report of this series, in which population and family differences according to observed characteristics were statistically analyzed. Twenty trees were sampled from each population, i.e., 60 trees in total. During the autumn of 1974, matured cones were collected from each tree and open-pollinated seeds were extracted in the laboratory. Immediately after cone collection, while the cones were still closed, their morphological characteristics were measured. Seed and seed-wing dimensions were also studied. In the spring of 1975, the seeds were sown in the experimental tree nursery located in Suweon. In April 1976, the 1-0 seedlings were transplanted according to the predetermined experimental design, a randomized block design with three replications. Because of cone-setting conditions, the number of families from which progenies were raised was not equal among populations. The numbers of families were 20 in population 1, 18 in population 2, and 15 in population 3. Each randomized block thus contained seedlings of 53 families from 3 populations. The present paper is mainly concerned with the variation of some characteristics of cone, seed, needle, growth performance of seedlings, and chlorophyll and monoterpene compositions of needles. The results obtained are summarized as follows. 1. The meteorological data obtained by averaging the records of a 30-year period, observed from the nearest station to each location of populations, are shown in Fig. 3, 4, and 5. The distributional patterns of monthly precipitation are quite similar among locations.
However, the precipitation density in population 2, Seosan area, during the growing season is lower than in the other two populations. Population 1, Cheong-song area, and population 3, Pyong-chang area, are located inland, but population 2 is on the western seacoast. Differences in the average monthly air temperatures and the average monthly lowest temperatures among populations can hardly be found. 2. Available information on each of the mother trees (families) studied, such as age, stem height, diameter at breast height, clear-bole length, crown conditions and others, is shown in tables 6, 7, and 8. 3. The measurements of fresh cone weight, length and the widest diameter of cone are given in Table 9. All these traits showed highly significant population differences and family differences within populations. A population difference was also found in the cone index, that is, the length-diameter ratio. 4. Seed-wing length and seed-wing width showed population differences, and family differences were also found in both characteristics. Although not discussed in this paper, seed-wing colours and shapes indicate a specificity inherent to individual trees, as shown in photo 3 on page 50. The colour and shape are fully the expression of the genetic make-up of the mother tree. The small variation in these traits results from this. Significant differences among populations and among families were found in characteristics such as 1000-seed weight, seed length, seed width, and seed thickness, as shown in table 11. For all these dimensions, the values are always larger in population 1, which is younger in age than the other two. The population differences evaluated by cone, seed and seed-wing sizes could partly be attributed to growth vigor. 5. The values of correlation between the characteristics of cone and seed are presented in table 12.
As shown, positive correlations between cone diameter and seed-wing width were calculated in all populations studied. The correlation between seed-wing length and seed length was significantly positive in populations 1 and 3 but not in population 2, where the r-value is as small as 0.002. The correlation between cone length and seed-wing length was highly significant in population 1, but not in population 2. 6. Differences among progenies in growth performance, such as 1-0 and 1-1 seedling height and root collar diameter, were highly significant among populations as well as among families within populations (Table 13). 7. The heritability values in the narrow sense of population characteristics were estimated on the basis of variance components. The values based on seedling height at each age stage of 1-1 and 1-0 ranged from 0.146 to 0.288 and the values of root collar diameter from 0.060 to 0.130 (Table 14). These heritability values varied according to characteristics and seedling ages. It must be stated here that, for calculation of heritability values, the variance value of population was divided by the variance values of environment (error), family, and population. The present authors want to add the heritability values based on family level in the coming report. It might be considered that, as tree age increases in the future, the heritability value will be altered or lowered. Examining the heritability values studied previously by many authors in the pine group at ages of 7 to 15, the values of height growth ranged from 0.2 to 0.4 in general. The values we obtained are lower than these. 8. The correlations between seedling growth and seed characteristics were examined and the resulting values are shown in table 16. Contrary to our hypothetical premise of a positive correlation between 1-0 seedling height and seed weight, no significant correlation was found. However, 1-0 seedling height correlated positively with seed length.
Significant correlations between 1-0 and 1-1 seedling heights were also found. 9. The numbers of stomata rows, calculated separately for the abaxial and adaxial sides, showed highly significant differences among populations, but serration density did not. For serration density, the differences among families within populations were highly significant (Table 17). It should be noted that the correlation between stomata rows on the abaxial side and the adaxial side was highly significant in all populations. The correlation coefficients between progenies and parents for stomata rows on the abaxial side were non-significant in all populations studied (Table 18). 10. The content of chlorophyll b in the needle was a little higher than that of chlorophyll a irrespective of the populations examined. The differences in chlorophyll a, b and a plus b contents were highly significant among populations but not among families within populations, as shown in table 20. The contents of chlorophyll a and b are presented by individual trees of each population in table 21. 11. The occurrence of monoterpene components was examined by gas liquid chromatography (Shimadzu, GC-1C type) to evaluate the population difference. There are some papers reporting the chemical geography of pines based on monoterpene composition. The number of populations studied here is not enough to address this question. The kinds of monoterpene observed in the needle were ${\alpha}$-pinene, camphene, ${\beta}$-pinene, myrcene, limonene, ${\beta}$-phellandrene and terpinolene plus two unknowns. In the analysis of monoterpene composition, the number of sample trees varied with population, i.e., 18 families for population 1, 15 for population 2 and 11 for population 3 (Tables 22, 23 and 24). The histograms (Fig. 6) of 7 components of monoterpene by population show noticeably higher percentages of ${\alpha}$-pinene irrespective of population, with ${\beta}$-phellandrene next in order.
The minor monoterpene components of Pinus densiflora, namely camphene, myrcene, limonene and terpinolene, generally made up less than 10 percent of the total. The average coefficients of variation of ${\alpha}$-pinene and ${\beta}$-phellandrene were 11 percent. In contrast, the average coefficients of variation of camphene, limonene and terpinolene varied from 20 to 30 percent. Significant differences between populations were observed only in myrcene and ${\beta}$-phellandrene (Table 25).
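
The heritability calculation described in item 7 of this abstract can be read as a ratio of variance components. The sketch below assumes that the denominator is the sum of the population, family, and error components, which is one interpretation of the abstract's wording rather than a confirmed formula; the function name `variance_ratio_heritability` and the numeric inputs are hypothetical.

```python
def variance_ratio_heritability(var_population: float,
                                var_family: float,
                                var_error: float) -> float:
    """Population variance component as a share of the total.

    Assumed reading of the abstract: the population component divided by
    the sum of the population, family, and error (environment) components.
    """
    total = var_population + var_family + var_error
    if total <= 0:
        raise ValueError("the variance components must sum to a positive value")
    return var_population / total


# Hypothetical variance components, not taken from the paper.
h2 = variance_ratio_heritability(var_population=2.0, var_family=3.0, var_error=5.0)
print(f"estimated ratio: {h2:.3f}")
```

Under this reading the ratio lies between 0 and 1 when all components are non-negative, which matches the range of values reported in the abstract.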
