• Title/Summary/Keyword: tree classification


A study on the analysis of customer loan for the credit finance company using classification model (분류모형을 이용한 여신회사 고객대출 분석에 관한 연구)

  • Kim, Tae-Hyung;Kim, Yeong-Hwa
    • Journal of the Korean Data and Information Science Society / v.24 no.3 / pp.411-425 / 2013
  • The importance and necessity of credit loans are increasing over time. As borrower risk rises, the risk of non-performing loans naturally rises with it, so accurate prediction is needed to prevent losses for a credit loan company. Our final goal is to build a reliable and accurate prediction model, and we proceed in three steps. First, we obtain an appropriate sample using several resampling methods. Second, we consider a variety of models and tools to fit the resampled data. Finally, we compare and assess the models to find the best one for our real data.
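
The abstract does not say which resampling methods were used. As one common possibility, the sketch below shows random oversampling of the minority class (for example, non-performing loans) until every class is equally frequent. The function name `oversample_minority` and the toy data are assumptions made for illustration, not the authors' procedure.

```python
import random

def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest.

    Hypothetical helper: one of many possible resampling schemes.
    """
    rng = random.Random(seed)
    by_label = {}
    for row, label in zip(rows, labels):
        by_label.setdefault(label, []).append(row)
    target = max(len(group) for group in by_label.values())
    out_rows, out_labels = [], []
    for label, group in by_label.items():
        # Draw with replacement from the class to fill the gap to the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

# Toy data: 8 performing loans (label 0), 2 non-performing loans (label 1).
rows = list(range(10))
labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = oversample_minority(rows, labels, seed=1)
```

Resampling of this kind should be applied to the training split only, so that the models are still assessed on data with the real class proportions.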

Design and Implementation for Efficient Multi Version ADS-B Target Report Message Processing (효율적인 다중 버전 ADS-B 타깃 리포트 메시지 처리를 위한 모듈 설계 및 구현)

  • Kim, Kanghee;Jang, Eunmee;Song, Inseong;Cho, Taehwan;Choi, Sangbang
    • Journal of Advanced Navigation Technology / v.19 no.4 / pp.265-277 / 2015
  • Automatic dependent surveillance-broadcast (ADS-B), a core technology of communication, navigation and surveillance/air traffic management (CNS/ATM), automatically broadcasts an aircraft's own position obtained from GNSS, and offers fewer geographical constraints and a faster update rate than legacy radar systems. EUROCONTROL defined the All Purpose Structured Eurocontrol Surveillance Information Exchange (ASTERIX) CAT.021. ASTERIX CAT.021 has been revised several times, but the revisions have compatibility issues with earlier versions. In this paper, we design a message processing module that works efficiently regardless of the ASTERIX CAT.021 version. The implemented module generates patterns for collecting messages received from the network, and then processes each received message in the routine defined for the matching pattern.

Community Structure and Habitat Environment of Genus Liriope Group in Korea (한반도 맥문동속 집단의 자생지 생육환경과 군락구조)

  • Song, Hong-Seon;Lee, Jung-Hoon;Kim, Seong-Min;Shin, Dong-Il;Kim, Chang-Ho;Koo, Han-Mo;Park, Chung-Berm;Park, Yong-Jin
    • Korean Journal of Medicinal Crop Science / v.19 no.1 / pp.24-30 / 2011
  • This study investigated the vegetation and floristic composition by cluster analysis and phytosociological classification, to evaluate the species composition, habitat environment and community structure of the Liriope platyphylla and Liriope spicata groups in Korea. The southeast slope gradient of the habitats of L. platyphylla and L. spicata was 6.7 to 8.4%, and the habitat altitudes of L. platyphylla (41.0 m) and L. spicata (114.9 m) differed. The habitat distribution of L. spicata was broader than that of L. platyphylla. The L. platyphylla and L. spicata groups contained 58 and 99 plant taxa, respectively, and the coverage of the tree layer was 87.5% and 92.5%, respectively. In the genus Liriope groups, the most frequently occurring plant was Quercus serrata, which grows in moist valleys. Thus, plants of the genus Liriope grew better in moist shade. The vegetation of the L. platyphylla group was classified into the Quercus serrata, Castanopsis sieboldii, Pinus densiflora and Pinus thunbergii communities, and that of the L. spicata group into the Quercus serrata, Quercus aliena, Quercus acutissima, Prunus verecunda, Robinia pseudoacacia, Pinus densiflora and Pinus thunbergii communities. In the genus Liriope groups, the Quercus serrata and Pinus densiflora communities were the most similar.

Spam-Filtering by Identifying Automatically Generated Email Accounts (자동 생성 메일계정 인식을 통한 스팸 필터링)

  • Lee Sangho
    • Journal of KIISE:Software and Applications / v.32 no.5 / pp.378-384 / 2005
  • In this paper, we describe a novel spam-filtering method that improves the performance of conventional spam-filtering systems. Conventional systems filter emails by examining the distribution of words in email headers or bodies. Spammers have recently begun creating accounts on web-based email services and sending emails that do not look like spam. Examining the accounts behind such spam, we noticed a large difference between automatically generated accounts and ordinary ones. Based on that difference, incoming emails are classified as spam or non-spam. To classify emails from account strings alone, we used decision trees, which are widely used for conventional pattern classification problems. We collected about 2.15 million account strings from email service sites, and our account checker achieved an accuracy of 96.3%. Adding the checker to the previous filter system improved its filtering performance.

Development of An Expert system with Knowledge Learning Capability for Service Restoration of Automated Distribution Substation (고도화된 자동화 변전소의 사고복구 지원을 위한 지식학습능력을 가지는 전문가 시스템의 개발)

  • Ko Yun-Seok;Kang Tae-Gue
    • The Transactions of the Korean Institute of Electrical Engineers A / v.53 no.12 / pp.637-644 / 2004
  • This paper proposes an expert system with knowledge learning capability that can enhance the safety and effectiveness of substation operation in automated as well as existing substations. It infers multiple events, such as main transformer faults, busbar faults and main transformer work schedules, under multiple inference modes and multiple objective modes, while considering the switch status and the main transformer operating constraints as a whole. In particular, the inference modes include a local minimum tree search method and a pattern recognition method to enhance the performance of the real-time bus reconfiguration strategy. The inference engine of the expert system consists of an intuitive inferencing part and a logical inferencing part. The intuitive inferencing part offers the control strategy for the stored event most similar to the real event, found by a search based on minimum distance classification, a pattern recognition method. The logical inferencing part, on the other hand, builds a real-time control strategy using the real-time mode (best-first search method) when intuitive inferencing fails. It also builds up the knowledge base, or appends new knowledge to it, using a pattern learning function. The expert system provides processing functions for main transformer faults, main transformer maintenance work and bus faults. It is implemented in Visual C++, which offers a dynamic programming function for implementing the inference engine and MFC functions for implementing the MMI. Finally, its accuracy and effectiveness are demonstrated by several event simulations for a typical substation.
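
The intuitive inferencing step can be sketched as a minimum distance classifier: the observed event is compared with stored event patterns and the closest one is returned. The feature vectors, case names, Euclidean metric and `max_distance` threshold below are assumptions for illustration. The abstract does not specify how events are encoded or when intuitive inferencing is considered to have failed.

```python
import math

def nearest_case(event, known_cases):
    """Return (label, distance) of the stored pattern closest to the event."""
    best_label, best_dist = None, math.inf
    for label, pattern in known_cases.items():
        dist = math.dist(event, pattern)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist

def intuitive_inference(event, known_cases, max_distance):
    """Return the closest known case, or None if nothing is close enough.

    Returning None stands in for handing over to the logical inferencing part;
    the threshold is a hypothetical failure criterion.
    """
    label, dist = nearest_case(event, known_cases)
    if label is not None and dist <= max_distance:
        return label
    return None

# Hypothetical encoded event patterns (e.g. relay or switch states as 0/1).
known_cases = {
    "transformer_fault": (1, 0, 0),
    "bus_fault": (0, 1, 1),
}
print(intuitive_inference((1, 0, 1), known_cases, max_distance=1.5))
```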

Climatic Perturbation and Plant Livestock of a Secondary Forest in Kantou Area, Japan (일본 관동지역 2차림지대의 기상환경과 식물군락에 관한 연구)

  • 이성기;안영희;이갑연
    • Korean Journal of Agricultural and Forest Meteorology / v.6 no.1 / pp.1-10 / 2004
  • The climate of Minamiakikawa forest in Japan is similar to that of Mt. Jiri in South Korea. There is a large development plan for Minamiakikawa forest, and a change in the species composition is expected. This study was initiated to compare forest transition caused by artificial perturbations in Korea and Japan. Long-term field observations on species composition are reported. We found 98 families, 231 genera, 315 species, 29 varieties, and 8 races, making a total of 352 classification groups of higher plants in the Minamiakikawa forest area. Among them, 11 families, 12 species and 2 varieties are rare or endangered. The study area is dominated by Cryptomeria japonica and Chamaecyparis obtusa. The time and restoration effects on secondary and latent forestation consider the development of the Quercus mongolica community, the Quercus serrata community, and deciduous-broadleaved tree ascension. This indicates that the forest has been restored to Abies firma, Pinus densiflora or Cryptomeria japonica and Fagus japonica, which is considered latent natural forestation of the area in a natural transfer.

A Study on the Development of Web-based Expert System for Urban Transit (웹 기반의 도시철도 전문가시스템 개발에 관한 연구)

  • Kim Hyunjun;Bae Chulho;Kim Sungbin;Lee Hoyong;Kim Moonhyun;Suh Myungwon
    • Transactions of the Korean Society of Automotive Engineers / v.13 no.5 / pp.163-170 / 2005
  • Urban transit is a complex system that combines electrical and mechanical components, so a maintenance system is needed to secure safety during high-speed operation and to carry out maintenance promptly. An expert system is a computer program that uses numerical or non-numerical domain-specific knowledge to solve problems. In this research, we develop an expert system that quickly diagnoses the causes of failures and displays countermeasures. To develop the expert system, we first standardized the failure code classification system and created a BOM (Bill of Materials). We then constructed a knowledge base by analyzing failure histories and maintenance manuals. To retrieve failure diagnosis and repair procedures linked to the knowledge base, we built an RBR (Rule-Based Reasoning) engine using a pattern matching technique and a CBR (Case-Based Reasoning) engine using a similarity search method. The system is web-based to maximize accessibility.

The Development of Korean Rehabilitation Patient Group Version 1.0 (한국형 재활환자분류체계 버전 1.0 개발)

  • Hwang, Soojin;Kim, Aeryun;Moon, Sunhye;Kim, Jihee;Kim, Jinhwi;Ha, Younghea;Yang, Okyoung
    • Health Policy and Management / v.26 no.4 / pp.289-304 / 2016
  • Background: Rehabilitation in the subacute phase differs from acute treatment in the characteristics of the treatment and the resources it consumes. The Korean Diagnosis Related Group and Korean Out-Patient Group, designed for acute patients, lack accuracy and validity as case-mix and payment tools for rehabilitation inpatients. The objective of the study was to develop the Korean Rehabilitation Patient Group (KRPG) reflecting the characteristics of rehabilitation inpatients. Methods: In a retrospective medical record survey of rehabilitation inpatients, 4,207 episodes were collected from 42 hospitals. Based on the opinions of clinical experts and decision-tree analysis, the variables for the KRPG system that capture the characteristics of rehabilitation inpatients were derived, and the splitting criteria for those variables were set. Using the derived variables, we built a rehabilitation inpatient classification model reflecting the clinical situation in Korea, and evaluated the performance of the KRPG system. Results: The KRPG targets inpatients with brain or spinal cord injury. The classification variables were the etiologic disease, functional status (cognitive function, activities of daily living, muscle strength, spasticity, level and grade of spinal cord injury), and the patient's age. Applying the derived variables, the KRPG algorithm and a total of 204 rehabilitation patient groups were developed. The KRPG explained 11.8% of the variance in charges for rehabilitation inpatients and 13.8% of the variance in their length of stay. Conclusion: KRPG version 1.0, reflecting the clinical characteristics of rehabilitation inpatients, classifies patients into 204 groups.
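
A grouping system's "explained variance" is commonly measured as the reduction in variance: one minus the within-group sum of squares divided by the total sum of squares. The abstract does not state the exact statistic used, so the sketch below is an assumption about how figures such as 11.8% could be computed, with toy values and a hypothetical function name.

```python
def variance_explained(values, groups):
    """Share of variance in `values` explained by group membership (0 to 1).

    Computed as 1 - SS_within / SS_total. Assumes `values` are not all equal.
    """
    n = len(values)
    grand_mean = sum(values) / n
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    within_ss = 0.0
    for members in by_group.values():
        group_mean = sum(members) / len(members)
        within_ss += sum((v - group_mean) ** 2 for v in members)
    return 1 - within_ss / total_ss

# Toy charges for four episodes split into two hypothetical patient groups.
print(variance_explained([1, 1, 3, 3], ["a", "a", "b", "b"]))
```

A value near 1 means the groups are internally homogeneous in cost. A value near 0 means group membership says little about cost.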

A Study on Wildlife Habitat Suitability Modeling for Goral (Nemorhaedus caudatus raddeanus) in Seoraksan National Park (설악산 산양을 대상으로 한 야생동물 서식지 적합성 모형에 관한 연구)

  • Seo, Chang Wan;Choi, Tae Young;Choi, Yun Soo;Kim, Dong Young
    • Journal of the Korean Society of Environmental Restoration Technology / v.11 no.3 / pp.28-38 / 2008
  • The purposes of this study are to compare existing presence-absence predictive models and to predict suitable habitat for the goral (Nemorhaedus caudatus raddeanus), an endangered and protected species, in Seoraksan National Park using the best of those models. The methods were as follows. First, 375 location records and 9 environmental data layers were used to build the models. Second, four existing presence-absence models, the Generalized Linear Model (GLM), Generalized Additive Model (GAM), Classification and Regression Tree (CART), and Artificial Neural Network (ANN), were tested for predicting goral habitat. Third, ROC (Receiver Operating Characteristic) and Kappa statistics were used to measure model performance. Lastly, we verified the models and created habitat suitability maps. The ROC AUC (Area Under the Curve) and Kappa values were 0.697/0.266 (GLM), 0.729/0.313 (GAM), 0.776/0.453 (CART), and 0.858/0.559 (ANN). ANN was therefore selected as the best of the four models. The models showed that elevation, slope, and distance to stream were the significant factors for goral habitat. The proportion of area predicted as habitat by ANN using a threshold was 31.29%, but the area decreased when human effects were considered. The differences among models need to be investigated in order to build a suitable wildlife habitat model under a given condition.
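
The two performance measures reported above can be computed from presence/absence labels and model scores as in the following minimal sketch. It uses the standard definitions of Cohen's kappa and of ROC AUC as a pairwise ranking probability. The toy data and the 0.5 threshold are assumptions and are not taken from the study.

```python
def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for binary labels (1 = presence, 0 = absence).

    Undefined (division by zero) if chance agreement equals 1.
    """
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    p_true = sum(y_true) / n
    p_pred = sum(y_pred) / n
    # Agreement expected by chance from the marginal frequencies.
    expected = p_true * p_pred + (1 - p_true) * (1 - p_pred)
    return (observed - expected) / (1 - expected)

def roc_auc(y_true, scores):
    """AUC as the probability that a presence outscores an absence (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy example: four sites, two with observed presence.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical threshold
print(roc_auc(y_true, scores), cohen_kappa(y_true, y_pred))
```

AUC is threshold-free, while kappa depends on the threshold used to turn scores into presence/absence predictions, which is one reason the two measures are often reported together.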

Integrative Analysis of Microarray Data with Gene Ontology to Select Perturbed Molecular Functions using Gene Ontology Functional Code

  • Kim, Chang-Sik;Choi, Ji-Won;Yoon, Suk-Joon
    • Genomics & Informatics / v.7 no.2 / pp.122-130 / 2009
  • A systems biology approach for identifying perturbed molecular functions is required to understand complex progressive diseases such as breast cancer. In this study, we analyze microarray data with Gene Ontology terms of molecular functions to select perturbed molecular functional modules in breast cancer tissues, based on the definition of the Gene Ontology Functional Code. The Gene Ontology comprises three structured vocabularies describing genes and their products in terms of their associated biological processes, cellular components and molecular functions. The Gene Ontology is hierarchically organized as a directed acyclic graph. However, it is difficult to visualize the Gene Ontology as a directed tree, since a Gene Ontology term may have more than one parent and therefore multiple paths from the root. We therefore applied the definition of Gene Ontology codes, assigning one or more GO codes to each GO term, to visualize the hierarchical classification of GO terms as a network. The selected molecular functions can be considered perturbed molecular functional modules that putatively contribute to the progression of disease. We evaluated the method by analyzing a microarray dataset of breast cancer tissues, i.e., normal and invasive breast cancer tissues. Based on the integration approach, we selected several interesting perturbed molecular functions that are implicated in the progression of breast cancer. Moreover, these selected molecular functions include several known breast cancer-related genes. We conclude that the present strategy is capable of selecting perturbed molecular functions that putatively play roles in the progression of diseases, and that it improves the interpretability of GO terms based on the definition of Gene Ontology codes.
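
One plausible reading of "one or more GO codes per term" is that a term receives one code for each path leading to it from the root, so a term with several parents is unfolded into several tree positions. The sketch below illustrates that idea on a toy graph. The dotted-path encoding, the function name `path_codes` and the term names are assumptions, not the coding scheme defined in the paper.

```python
def path_codes(children, root):
    """Map each term to its list of root-to-term path codes (one per path).

    `children` maps a term to its child terms; the graph must be acyclic.
    """
    codes = {}

    def walk(term, prefix):
        path = prefix + (term,)
        codes.setdefault(term, []).append(".".join(path))
        for child in children.get(term, []):
            walk(child, path)

    walk(root, ())
    return codes

# Toy DAG: term C has two parents (A and B), so it gets two codes.
children = {"root": ["A", "B"], "A": ["C"], "B": ["C"]}
codes = path_codes(children, "root")
print(codes["C"])
```

Because every code corresponds to a single path, the set of codes forms a tree even though the underlying terms form a directed acyclic graph.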