• Title/Summary/Keyword: Classification Problem Solving


Corporate Credit Rating based on Bankruptcy Probability Using AdaBoost Algorithm-based Support Vector Machine (AdaBoost 알고리즘기반 SVM을 이용한 부실 확률분포 기반의 기업신용평가)

  • Shin, Taek-Soo;Hong, Tae-Ho
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.25-41 / 2011
  • Support vector machines (SVMs) have recently been recognized as competitive with other data mining techniques for pattern recognition and classification problems. Many studies have reported that they outperform traditional artificial neural networks (ANNs) (Amendolia et al., 2003; Huang et al., 2004; Huang et al., 2005; Tay and Cao, 2001; Min and Lee, 2005; Shin et al., 2005; Kim, 2003). Classification decisions, whether binary or multi-class, are highly cost-sensitive in financial applications such as credit rating: a misclassified credit rating can cause severe economic losses for investors and financial decision makers. It is therefore necessary to convert classifier outputs into well-calibrated posterior probabilities and to base multi-class credit ratings on the resulting bankruptcy probabilities. SVMs, however, do not natively provide such probabilities, so an additional method is required to produce them (Platt, 1999; Drish, 2001). This paper applied AdaBoost algorithm-based SVMs to bankruptcy prediction, a binary classification problem, for IT companies in Korea, and then assigned multi-class credit ratings to the companies by forming a normal-distribution-shaped set of posterior bankruptcy probabilities from the loss functions extracted from the SVMs. The proposed approach also showed that misclassification can be minimized by adjusting the credit grade interval ranges, on the condition that each credit grade for loan borrowers has its own credit risk, i.e., bankruptcy probability.
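
The calibration step cited above (Platt, 1999) can be illustrated with a minimal sketch. It assumes SVM decision scores and binary labels are already available, fits a sigmoid to them by plain gradient descent, and maps the resulting probabilities to grades. The names `fit_platt`, `platt_probability`, and `to_grade`, the toy scores, and the grade cutoffs are all hypothetical; this is not the authors' procedure, and Platt's original paper uses a more careful optimiser and smoothed targets.

```python
import math
from bisect import bisect_right

def fit_platt(scores, labels, lr=0.1, epochs=5000):
    """Fit P(y=1 | s) = 1 / (1 + exp(a*s + b)) by gradient descent on log loss."""
    a = b = 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * s + b))
            grad_a += (y - p) * s   # d(log loss)/da for this sample
            grad_b += (y - p)       # d(log loss)/db for this sample
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def platt_probability(score, a, b):
    """Calibrated probability for one decision score."""
    return 1.0 / (1.0 + math.exp(a * score + b))

def to_grade(p, cutoffs=(0.1, 0.3, 0.6), grades=("A", "B", "C", "D")):
    """Map a bankruptcy probability to a grade; the cutoffs are hypothetical."""
    return grades[bisect_right(cutoffs, p)]

# Toy decision scores (positive = bankrupt side) and labels (1 = bankrupt).
scores = [-2.0, -1.0, -0.5, 0.2, 0.4, 1.0, 1.5, 2.5]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
a, b = fit_platt(scores, labels)
for s in scores:
    p = platt_probability(s, a, b)
    print(f"score={s:+.1f}  P(bankrupt)={p:.3f}  grade={to_grade(p)}")
```

Moving the `cutoffs` values is the sketch's analogue of adjusting the credit grade interval ranges described in the abstract.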

Classification of Fire Causes in Warehouses Using the TRIZ Technique and Analysis of Preventive Measures According to 4M (TRIZ기법에 의한 물류창고의 화재원인 및 4M에 따른 예방대책 분석)

  • Han, Sang-Hun;Kong, Ha-Sung
    • The Journal of the Convergence on Culture Technology / v.6 no.3 / pp.401-412 / 2020
  • This study analyzed the causes of warehouse fires using a creative problem-solving technique called TRIZ. It identified preventive measures by applying 4M. The results are as follows. First, this study examined the inconsistency among the causes of warehouse fires using TRIZ. Second, it analyzed human factors and fire prevention measures in warehouses such as safety standards for managers, and methods for the promotion of safety consciousness among workers, and for the reinforcement of construction technology for sandwich panel workers. Third, it identified the mechanical and facility factors and fire prevention measures in warehouses such as safety facilities, the expanded installation of safety devices, the adoption and development of fire suppression equipment, and the deployment of methods to improve the fire resistance of sandwich panels. Fourth, it presented working and environmental factors and fire prevention measures in warehouses such as the tightening of safety precautions and the supervision of working methods, and setting fire partitions both in loading places and based on performance-based design. Finally, it proposed managerial factors and fire prevention measures in warehouses such as specific targeting for firefighting with low fire hazards, reviewing the material quality regulations of non-combustible or higher for sandwich panels in the specific target of firefighting that cannot apply fire safety standards, installing sprinklers in cold storage, and mandating the installation of automated facilities with retroactive application regardless of the floor area in the warehouse with a sandwich panel structure.

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.167-181 / 2018
  • Over the past decade, deep learning has been in the spotlight among machine learning algorithms. In particular, the convolutional neural network (CNN), known as an effective solution for recognizing and classifying images or voices, has been widely applied to classification and prediction problems. In this study, we investigate how to apply CNN to business problem solving. Specifically, this study proposes applying CNN to stock market prediction, one of the most challenging tasks in machine learning research. As mentioned, CNN is strong at interpreting images. Thus, the proposed model adopts CNN as a binary classifier that predicts stock market direction (upward or downward) using time series graphs as its inputs. That is, our proposal is to build a machine learning algorithm that mimics the experts called 'technical analysts', who examine graphs of past price movements and predict future price movements. Our proposed model, named 'CNN-FG (Convolutional Neural Network using Fluctuation Graph)', consists of five steps. In step 1, it divides the dataset into intervals of 5 days. In step 2, it creates time series graphs for the divided dataset. Each graph is drawn in an image of $40 \times 40$ pixels, and the graph of each independent variable is drawn in a different color. In step 3, the model converts the images into matrices. Each image is converted into a combination of three matrices that express the color values on the R (red), G (green), and B (blue) scales. In step 4, it splits the dataset of graph images into training and validation datasets. We used 80% of the total dataset as the training dataset and the remaining 20% as the validation dataset. In step 5, CNN classifiers are trained on the images of the training dataset.
Regarding the parameters of CNN-FG, we adopted two convolution filters ($5 \times 5 \times 6$ and $5 \times 5 \times 9$) in the convolution layer. In the pooling layer, a $2 \times 2$ max pooling filter was used. The numbers of nodes in the two hidden layers were set to 900 and 32, respectively, and the number of nodes in the output layer was set to 2 (one for the prediction of an upward trend and the other for a downward trend). The activation functions for the convolution layer and the hidden layers were set to ReLU (Rectified Linear Unit), and that of the output layer to the Softmax function. To validate CNN-FG, we applied it to the prediction of KOSPI200 over 2,026 days in eight years (from 2009 to 2016). To match the proportions of the two groups in the dependent variable (i.e., tomorrow's stock market movement), we selected 1,950 samples by random sampling. Finally, we built the training dataset using 80% of the total dataset (1,560 samples) and the validation dataset using 20% (390 samples). The independent variables of the experimental dataset included twelve technical indicators widely used in previous studies. They include Stochastic %K, Stochastic %D, Momentum, ROC (rate of change), LW %R (Larry Williams' %R), A/D oscillator (accumulation/distribution oscillator), OSCP (price oscillator), CCI (commodity channel index), and so on. To confirm the superiority of CNN-FG, we compared its prediction accuracy with that of other classification models. Experimental results showed that CNN-FG outperforms LOGIT (logistic regression), ANN (artificial neural network), and SVM (support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models on these graphs can be effective in terms of prediction accuracy.
Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
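
The data preparation described in this abstract (5-day windows, 40 x 40 graph images, three colour matrices, 80/20 split) might look roughly like the sketch below. It is only an illustration under several assumptions: windows slide by one day, each variable is min-max scaled within its window, one pixel is drawn per image column, and the helper names `windows`, `rasterize`, and `split` are hypothetical. The CNN training step itself is not shown.

```python
import random

SIZE = 40    # image is 40 x 40 pixels, as stated in the abstract
WINDOW = 5   # each graph covers a 5-day interval

def windows(series, width=WINDOW, step=1):
    """Cut a sequence into windows of `width` days (sliding by `step`, an assumption)."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, step)]

def rasterize(window_by_var, colors):
    """Draw one line per variable into three SIZE x SIZE matrices (R, G, B)."""
    r = [[0] * SIZE for _ in range(SIZE)]
    g = [[0] * SIZE for _ in range(SIZE)]
    b = [[0] * SIZE for _ in range(SIZE)]
    for values, (cr, cg, cb) in zip(window_by_var, colors):
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0          # avoid division by zero for flat series
        n = len(values)
        for x in range(SIZE):
            # Linearly interpolate the series at pixel column x.
            t = x * (n - 1) / (SIZE - 1)
            i = min(int(t), n - 2)
            v = values[i] + (values[i + 1] - values[i]) * (t - i)
            y = SIZE - 1 - round((v - lo) / span * (SIZE - 1))  # row 0 is the top
            r[y][x], g[y][x], b[y][x] = cr, cg, cb
    return r, g, b

def split(samples, train_ratio=0.8, seed=0):
    """Shuffle and split samples into training and validation sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Two made-up indicator series, drawn in red and blue.
momentum = [float(i % 7) for i in range(12)]
roc = [float((i * 3) % 5) for i in range(12)]
images = [
    rasterize([m, c], [(255, 0, 0), (0, 0, 255)])
    for m, c in zip(windows(momentum), windows(roc))
]
train, valid = split(images)
print(len(images), len(train), len(valid))
```

Each element of `images` is a triple of matrices that could be stacked into a 40 x 40 x 3 input for a CNN library of choice.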

Research about feature selection that use heuristic function (휴리스틱 함수를 이용한 feature selection에 관한 연구)

  • Hong, Seok-Mi;Jung, Kyung-Sook;Chung, Tae-Choong
    • The KIPS Transactions:PartB / v.10B no.3 / pp.281-286 / 2003
  • A large number of features are collected for real-world problem solving, but using all of them is difficult, and it is not easy to collect correct data for every feature. When all collected data are used for learning, the resulting model becomes complicated and does not perform well. In addition, interrelationships or hierarchical relations exist among the features. The number of features can be reduced by analyzing these relations using heuristic knowledge or statistical methods. A heuristic technique refers to learning through repeated trial and error and experience. Experts can approach the relevant problem domain through a process of collecting opinions grounded in experience, and these properties can be used to reduce the number of features used in learning: experts generate new, highly abstract features from raw data. This paper describes a machine learning model that reduces the number of features used in learning by means of a heuristic function and uses the abstracted features as the input values of a neural network. We applied this model to win/lose prediction in professional baseball games. The results show that the model combining the two techniques not only reduces the complexity of the neural network model but also significantly improves classification accuracy compared with using the neural network and the heuristic model separately.
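
The idea of collapsing raw features into an abstract one before feeding a neural network can be sketched as follows. Everything here is hypothetical: the abstract does not say which statistics or which heuristic the authors used, so the field names (`batting_avg`, `runs_per_game`, `era`) and the scoring formula are stand-ins chosen only to show the shape of the technique.

```python
def heuristic_strength(team):
    """Hypothetical heuristic: collapse several raw stats into one abstract score."""
    offense = team["batting_avg"] * 10 + team["runs_per_game"]
    defense = team["era"]  # earned run average: lower is better, so it is subtracted
    return offense - defense

def abstract_features(home, away):
    """Reduce six raw inputs (three per team) to one abstract network input."""
    return [heuristic_strength(home) - heuristic_strength(away)]

# Made-up numbers for two teams.
home = {"batting_avg": 0.280, "runs_per_game": 5.0, "era": 3.5}
away = {"batting_avg": 0.250, "runs_per_game": 4.0, "era": 4.5}
print(abstract_features(home, away))
```

A network trained on the single abstract input needs far fewer input weights than one trained on all raw statistics, which is the complexity reduction the abstract refers to.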

A Study on the Performance and the Importance of Ambulatory Nursing Activities (외래 간호인력 업무활동 수행도와 중요도 분석;종합병원·종합전문요양기관 중심으로)

  • Hwang, Hye-Young;Park, Jeong-Hye;Kim, Ji-Soo;Chen, In-Sug;Bae, Kyung-Ok;Seo, Mi-Sook;Yang, Woo-Jeong;Jung, Moon-Young;Chae, Ji-Sun;Hong, Ji-Yeon;Kim, Moon-Sil
    • Journal of Korean Academy of Nursing Administration / v.13 no.1 / pp.109-117 / 2007
  • Purpose: This study analysed the performance and perceived importance of the work activities of ambulatory nurses and nurse-aides, with a view to the quality of nursing. Method: The subjects were 126 ambulatory nurses and 117 nurse-aides in 6 secondary and 4 tertiary hospitals. Data were collected by questionnaire. Result: First, the nurses' activities with a performance score above 3.0 were reception, guidance, reservation, confirmation, checking medical records, operating reports, explanation of disease, explanation of examination, discussion with the medical part, discussion with the supporting part, solving patient problems, environment management, and paperwork. Those of the nurse-aides were reception, guidance, reservation, preparation for clinic, assistance for clinic, preparation for examination, material transfer and receipt, confirmation, checking medical records, and arrangement. Second, nurse-aides perceived the activities with a performance score above 3.0 to be important for themselves. Finally, compared with nurse-aides, nurses perceived the three categories of patient education/counselling, patient advocacy, and quality improvement to be more important and performed them more. Conclusions: The important nursing activities of ambulatory nurses are therapeutic care, patient education/counselling, patient advocacy, communication, personal management, and quality improvement.


A Development of Career Aptitude Scale for Design Majoring University Students (디자인 진로적성검사의 개발)

  • Gil, Im-Joo;Yang, Sung-Yong
    • Archives of design research / v.19 no.1 s.63 / pp.283-292 / 2006
  • This study developed the 'Design Career Aptitude Scale' to help design-majoring college students who are seeking career goals or struggling to decide on their majors. The subscales of the 'Design Career Aptitude Scale' are 'basic job competency', 'basic design competency', and 'advanced design competency'. The study further classified 'basic job competency' and 'basic design competency' into several subareas and defined each concept. Based on this classification, tentative test items were developed, and their validity was verified three times by seven design professionals. A pilot study of the developed scale was administered to 506 design-majoring college students. Exploratory factor analysis showed that basic job competency was composed of four factors: interpersonal relations ability, goal-driven ability, problem solving ability, and self-development ability. Basic design competency was composed of five factors: grounding in design, computer skills, material sensitivity, formative ability, and color sensitivity. The results can be seen as an adequate, fine-grained factor structure representing design aptitude, and the scale can be a useful tool for students who are struggling to decide on their majors and careers. Further study is needed to validate the scale by investigating its relationship with related scales measuring design ability and with other criterion-referenced groups.


Analysis of inquiry activities in the life science chapters of middle school 'science' textbooks: Focusing on Science Process Skills and 8 Scientific Practices (중학교 과학교과서 생명과학 단원의 탐구 활동 분석: 과학탐구 기능과 8가지 과학 실천을 중심으로)

  • Kim, Mijung;Hong, Juneuy;Kim, Sung-Ha;Lim, Chae-Seong
    • Journal of Science Education / v.41 no.3 / pp.318-333 / 2017
  • In this study, we analyzed the activities in the life science chapters of middle school 'science' textbooks for the 2009 revised Korean national curriculum and examined the difference between an analysis based on scientific practices and an analysis based on inquiry skills. As a result, the most frequent inquiry skills in all grades were, in order, 'reasoning', 'observing', and 'classification'. Among the scientific practices, the activities were biased toward 'data analysis and interpretation' and 'constructing explanations and devising problem solving'. This shows that life science inquiry activities in middle school 'science' textbooks lack diversity in scientific practice elements as well as in inquiry skills, and that the goals of the activities are limited. In addition, through the interrelationships between scientific inquiry skills and scientific practice elements, we examined content relevance in the transition from an inquiry-skill-centered approach to scientific practice and compared it with the results for the inquiry activities in the textbooks. The matches were monotonous, tending toward basic inquiry-data interpretation and basic inquiry-explanation pairs. This results from the lack of diversity in the activities presented in middle school 'science' textbooks. Based on this analysis of the textbooks for the 2009 revised curriculum, this study suggests that efforts should be made, in implementing the 2015 revised Korean national curriculum, to move away from simple inquiry activities lacking diversity and to include diverse scientific practice elements.

Analysis of Actors' Interaction Patterns in the Formation Process of Sexual Crime Prevention Policy: Focusing on classification and case analysis (성범죄예방정책의 형성과정에서 행위자의 상호작용 패턴분석: 유형분류 및 사례분석을 중심으로)

  • Yoo, Keun-Hwan;Kim, Duck-Hwan;Suh, Kyung-Do
    • Journal of the Korea Convergence Society / v.9 no.9 / pp.209-215 / 2018
  • The purpose of this study is to grasp the overall policy decision system of sex crime prevention policy and to analyze the interactions and patterns of actors in the policy formation process. This is a useful way to identify the causes of failure and ways to improve the policy if the sex crime prevention policy fails. As a research method, we applied an advocacy coalition model through case analysis and language network analysis. The external environment was characterized by low reporting of sex offenses, technical improvement and supplementation for preventive management, the consciousness of victims of sexual crimes, amendment of legislation, and the support of the president. The advocacy coalitions were in conflict over strong regulation, the prevention of recidivism, the expansion of the range of objects to be worn, the temporary effect of the system, and the retroactivity of the bill. Regarding problem-solving strategies, it was confirmed that the pro and con positions were sharply opposed over the lack of manpower and the negligence of management associated with extending the system. In the context of media reports, this tendency is more likely to be understood as a concern for prevention and management at the central government level to prevent sex crimes. Therefore, although the methods of preventing sex crimes have been insufficient in the past, it is hoped that this study will help break the vicious cycle of negative policy.

A Study on Analysis of national R&D research trends for Artificial Intelligence using LDA topic modeling (LDA 토픽모델링을 활용한 인공지능 관련 국가R&D 연구동향 분석)

  • Yang, MyungSeok;Lee, SungHee;Park, KeunHee;Choi, KwangNam;Kim, TaeHyun
    • Journal of Internet Computing and Services / v.22 no.5 / pp.47-55 / 2021
  • Research trends in a specific subject area are mostly analyzed by applying topic modeling techniques, via keyword extraction, to literature information (papers, patents, etc.) and examining the related topics and how they change. Unlike existing research methods, this paper extracts topics using the LDA topic modeling technique from the project information of national R&D projects in the field of artificial intelligence provided by the National Science and Technology Knowledge Information Service (NTIS). By analyzing these topics, this study aims to analyze research topics and investment directions for national R&D projects. NTIS provides a vast amount of national R&D information, from information on tasks carried out through national R&D projects to research results (theses, patents, etc.) generated through research. In this paper, search results were obtained by running artificial intelligence keyword searches and related classification searches in the NTIS integrated search, and the basic data was constructed by downloading the project information for the latest three years. Using an LDA topic modeling library for Python, related topics and keywords were extracted from the basic data (research goals, research content, expected effects, keywords, etc.) and analyzed to derive insights on the direction of research investment.
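
The abstract does not name the Python library the authors used. As a library-free illustration of what LDA topic extraction does, here is a small collapsed Gibbs sampler written with the standard library only. The function names `lda_gibbs` and `top_words`, the hyperparameter values, and the toy documents are assumptions made for this sketch; real project descriptions would first need tokenisation and stop-word removal.

```python
import random

def lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenised documents (lists of words)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]    # topic counts per document
    topic_word = [{} for _ in range(n_topics)]    # word counts per topic
    topic_total = [0] * n_topics                  # token counts per topic
    assign = [[rng.randrange(n_topics) for _ in doc] for doc in docs]

    def update(d, w, z, delta):
        doc_topic[d][z] += delta
        topic_word[z][w] = topic_word[z].get(w, 0) + delta
        topic_total[z] += delta

    # Count the random initial assignments.
    for d, doc in enumerate(docs):
        for w, z in zip(doc, assign[d]):
            update(d, w, z, +1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assign[d][i], -1)    # remove the current assignment
                # Conditional weight of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k].get(w, 0) + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                update(d, w, z, +1)
    return doc_topic, topic_word

def top_words(topic_word, n=3):
    """Most frequent words per topic, skipping words whose count fell to zero."""
    return [
        [w for w in sorted(tw, key=tw.get, reverse=True) if tw[w] > 0][:n]
        for tw in topic_word
    ]

# Made-up, already tokenised project descriptions.
docs = [
    ["image", "recognition", "deep", "learning"],
    ["deep", "learning", "image"],
    ["robot", "control", "sensor"],
    ["sensor", "robot", "control", "robot"],
]
doc_topic, topic_word = lda_gibbs(docs, n_topics=2)
print(top_words(topic_word))
```

In practice an established library would replace the sampler, and the per-topic keyword lists would be read alongside project funding data to interpret investment directions.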

A Hybrid Oversampling Technique for Imbalanced Structured Data based on SMOTE and Adapted CycleGAN (불균형 정형 데이터를 위한 SMOTE와 변형 CycleGAN 기반 하이브리드 오버샘플링 기법)

  • Jung-Dam Noh;Byounggu Choi
    • Information Systems Review / v.24 no.4 / pp.97-118 / 2022
  • As generative adversarial network (GAN) based oversampling techniques have achieved impressive results on class imbalance in unstructured datasets such as images, many studies have begun to apply them to the imbalance problem in structured datasets. However, these studies have failed to reflect the characteristics of structured data because they convert the data into an unstructured format. To overcome this limitation, this study adapted CycleGAN to reflect the characteristics of structured data and proposed a hybridization of the synthetic minority oversampling technique (SMOTE) and the adapted CycleGAN. In particular, this study used a one-dimensional convolutional neural network, unlike previous studies that used two-dimensional convolutional neural networks. The proposed oversampling method was tested on various datasets, and its performance was compared with that of existing oversampling methods such as SMOTE and adaptive synthetic sampling (ADASYN). The results indicated that the proposed hybrid method outperformed the existing methods when the data had more dimensions or a higher degree of imbalance. This study implies that the classification performance on oversampled structured data can be improved by using the proposed hybrid oversampling method, which considers the characteristics of structured data.
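
For readers unfamiliar with the SMOTE half of the hybrid, the following is a minimal standard-library sketch of its core interpolation step: pick a minority sample, pick one of its nearest minority neighbours, and create a new point on the segment between them. The function name `smote`, its parameters, and the toy data are assumptions for illustration; the adapted CycleGAN component of the proposed method is not represented here.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # The k nearest other minority samples by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

# Made-up two-dimensional minority class with four samples.
minority = [[1.0, 2.0], [2.0, 3.0], [1.5, 2.5], [3.0, 1.0]]
for point in smote(minority, n_new=5):
    print([round(v, 3) for v in point])
```

Because every synthetic point lies between two existing minority samples, it stays inside the range already covered by the minority class. The sketch needs at least two minority samples to work.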