Search | Korea Science

A Combinatorial Optimization for Influential Factor Analysis: a Case Study of Political Preference in Korea

Yun, Sung Bum;Yoon, Sanghyun;Heo, Joon
- Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.35 no.5 / pp.415-422 / 2017
Finding the influential factors behind a given clustering result is a typical data science problem. A genetic-algorithm-based method is proposed to derive influential factors, and its performance is compared with two conventional methods, Classification and Regression Tree (CART) and Chi-Squared Automatic Interaction Detection (CHAID), using Dunn's index. To extract the factors influencing preference for political parties in South Korea, the voting results of the 18th presidential election and 'Demographic', 'Health and Welfare', 'Economic' and 'Business' related data were used. Based on the analysis, reverse engineering was implemented. A reverse-engineering-based approach to influential factor analysis can yield a new set of influential variables, which may offer new insight to the data mining field.
https://doi.org/10.7848/ksgpc.2017.35.5.415
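
The abstract above uses Dunn's index to compare clusterings but does not say which variant was computed. As a minimal sketch, assuming the common definition (smallest distance between points in different clusters divided by the largest within-cluster diameter), the measure could look like this. The function name `dunn_index` and the input layout are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations
from math import dist


def dunn_index(clusters):
    """Dunn index for a list of clusters, each a list of coordinate tuples.

    Assumed variant: minimum point-to-point distance between different
    clusters, divided by the maximum point-to-point distance inside any
    single cluster. Higher values indicate compact, well-separated clusters.
    Needs at least two clusters; otherwise min() raises ValueError.
    """
    # Smallest distance between any two points that lie in different clusters.
    min_between = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    # Largest distance between any two points that share a cluster.
    max_diameter = max(
        (dist(p, q) for c in clusters for p, q in combinations(c, 2)),
        default=0.0,
    )
    if max_diameter == 0.0:
        raise ValueError("every cluster is a single point; the index is undefined")
    return min_between / max_diameter
```

For example, `dunn_index([[(0, 0), (0, 1)], [(3, 0), (3, 1)]])` divides the inter-cluster gap of 3 by the largest cluster diameter of 1.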

Evaluation of Ultrasound for Prediction of Carcass Meat Yield and Meat Quality in Korean Native Cattle (Hanwoo)

Song, Y.H.;Kim, S.J.;Lee, S.K.
- Asian-Australasian Journal of Animal Sciences / v.15 no.4 / pp.591-595 / 2002
Three hundred thirty-five progeny-testing steers of Korean beef cattle were evaluated ultrasonically for back fat thickness (BFT), longissimus muscle area (LMA) and intramuscular fat (IF) before slaughter. Class measurements associated with the Korean yield grade and quality grade were also obtained. The residual standard deviations between ultrasonic estimates and carcass measurements of BFT and LMA were 1.49 mm and 0.96 cm², respectively. The linear correlation coefficients (p<0.01) between ultrasonic estimates and carcass measurements of BFT, LMA and IF were 0.75, 0.57 and 0.67, respectively. Yield grade prediction results for four methods (the Korean yield grade index equation, fat depth alone, regression, and decision tree) were 75.4%, 79.6%, 64.3% and 81.4%, respectively. We conclude that the decision tree method can easily predict yield grade and is also useful for increasing the prediction accuracy rate.
https://doi.org/10.5713/ajas.2002.591

Frequent Itemset Search Using LSI Similarity (LSI 유사도를 이용한 효율적인 빈발항목 탐색 알고리즘)

Ko, Younhee;Kim, Hyeoncheol;Lee, Wongyu
- The Journal of Korean Association of Computer Education / v.6 no.1 / pp.1-8 / 2003
We introduce an efficient vertical mining algorithm that significantly reduces the search complexity for frequent k-itemsets. The method sorts items by their LSI (Least Support Itemsets) similarity and then searches for frequent itemsets in a tree-based manner. The search tree structure provides several useful heuristics and therefore reduces the search space significantly at early stages. Experimental results on various data sets show that the proposed algorithm improves search performance compared to other algorithms, especially for databases with long patterns.

Scoring models to detect foreign exchange money laundering (외국환 거래의 자금세탁 혐의도 점수모형 개발에 관한 연구)

Hong, Seong-Ik;Moon, Tae-Hee;Sohn, So-Young
- IE interfaces / v.18 no.3 / pp.268-276 / 2005
In recent years, money laundering crimes committed through foreign exchange transactions have been increasing. Our study proposes four scoring models to provide early warning of laundering in foreign exchange transactions for both inward and outward remittances: a logistic regression model, a decision tree, a neural network, and an ensemble model that combines the three. In terms of accuracy on test data, the decision tree model is selected for inward remittances and the ensemble model for outward remittances. According to our results, the accumulated number of transactions turns out to be the most important predictor variable. The proposed scoring models operate at the transaction level and are expected to help bank tellers detect laundering-related transactions at an early stage.

A Survey of Applications of Artificial Intelligence Algorithms in Eco-environmental Modelling

Kim, Kang-Suk;Park, Joon-Hong
- Environmental Engineering Research / v.14 no.2 / pp.102-110 / 2009
The application of artificial intelligence (AI) approaches in eco-environmental modeling has gradually increased over the last decade. A comprehensive understanding and evaluation of the applicability of this approach to eco-environmental modeling are needed. In this study, we reviewed previous studies that used AI techniques in eco-environmental modeling. Decision Tree (DT) and Artificial Neural Network (ANN) were found to be the major AI algorithms preferred by researchers in ecological and environmental modeling. When the effect of training data size on model prediction accuracy was explored using data from the previous studies, prediction accuracy and training data size showed a nonlinear correlation, which was best described by a hyperbolic saturation function among the tested nonlinear functions, including power and logarithmic functions. The hyperbolic saturation equations are proposed as a guideline for optimizing the size of the training data set, which is critically important in designing the field experiments required for training AI-based eco-environmental models.
https://doi.org/10.4491/eer.2009.14.2.102
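
The abstract above names a hyperbolic saturation function but does not give its equation. One common form, shown here only as an assumed illustration with hypothetical symbols ($A$ for prediction accuracy, $n$ for training data size, $A_{\max}$ for the saturation level, $K$ for the half-saturation size), is:

```latex
A(n) = \frac{A_{\max}\, n}{K + n},
\qquad
A(K) = \frac{A_{\max}}{2},
\qquad
\lim_{n \to \infty} A(n) = A_{\max}
```

Under this form, accuracy rises almost linearly while $n \ll K$ and flattens toward $A_{\max}$ as $n$ grows, which is why such a curve can serve as a rough guide for how much training data is worth collecting.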

A Feature Analysis of Industrial Accidents Using CHAID Algorithm (CHAID 알고리즘을 이용한 산업재해 특성분석)

Leem Young-Moon;Hwang Young-Seob
- Journal of the Korea Safety Management & Science / v.7 no.5 / pp.59-67 / 2005
The main objective of statistical analysis of industrial accidents is to identify the hazard factors in each industrial field so that potential accidents can be prevented or reduced by educating workers in those fields about safety tools. So far, however, there has been no technique for the quantitative evaluation of danger. Almost all previous research on industrial accidents has relied only on frequency analysis, such as analysis of the constituent ratio of accidents. As an application of data mining, this paper analyzes the efficiency of the CHAID algorithm in classifying types of industrial accident data and thereby identifies potential weak points in accident risk grouping.

A Technique for Making Efficient Travel Routes using the Mining Method of Frequent Patterns-growth (FP-growth 마이닝을 이용한 효율적인 여행경로 수립 기법)

Yoo, Kibeom;Cho, Kyungsoo;Kim, Ung-Mo
- Annual Conference of KIPS / 2010.11a / pp.10-13 / 2010
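As the uses of computers have diversified, many people now travel for a variety of reasons and afterwards store and publish information about their trips on blogs or the web. Although a large amount of travel-related data therefore exists on the web, the data are scattered and not systematically organized into databases, so searching for information and planning an itinerary still take considerable time and effort. This paper therefore proposes a travel planning technique based on FP-tree-based frequent pattern growth. In the proposed technique, data are stored in an FP-tree, which dramatically reduces the time and effort needed for searching, and FP-growth mining helps the user select effective travel routes.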
https://doi.org/10.3745/PKIPS.y2010m11a.10

Mining of Stocks Having Similar Pattern using FP-Tree (FP-tree를 이용한 유사 패턴 주식종목 추출)

Sim, Jong-Bo;Kim, Won-Young;Kim, Ung-Mo
- Annual Conference of KIPS / 2009.11a / pp.727-728 / 2009
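With the recent development of computers and the Internet, trading has shifted from over-the-counter transactions to the HTS (Home Trading System), making it easy for individual investors to invest in stocks. However, it is quite difficult for individuals to analyze vast amounts of historical data. This paper discusses a method for extracting associations among specific stocks from a stock database of past data so that investors can use them as a reference when selecting stocks. Unlike previous papers, which predict future price changes from past patterns, the aim here is to provide useful investment information by using associations among stocks to identify, once a theme has formed, how changes in the leading stock relate to changes in related stocks.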
https://doi.org/10.3745/PKIPS.y2009m11a.727

Mining Technique of Tour Destination by weighted FP-tree (가중치가 부여된 FP-tree를 이용한 여행지 추출 기법)

MinJu Kim;EunJu Lee;Eung-Mo Kim
- Annual Conference of KIPS / 2008.11a / pp.233-236 / 2008
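With the recent rapid development of computer and communication technologies, every part of society has undergone a new change, informatization, that it had not experienced before. As the level of informatization has advanced, ever more diverse and vast data have been generated and accumulated in databases. Data mining techniques for obtaining useful information from vast data have thus emerged as an important issue, and they are becoming essential for making rational choices in more and more fields. This paper applies a mining technique so that a vast database can support the selection of an optimal travel route. It assigns weights within the frequent pattern growth technique to provide an environment in which travelers can easily select destinations. The tourism industry, one of the most important industries of the future, continues to grow, and further development is expected from the data mining technique presented in this paper.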
https://doi.org/10.3745/PKIPS.y2008m011a.233

A Study on Variable Selection Bias in Data Mining Software Packages (데이터마이닝 패키지에서 변수선택 편의에 관한 연구)

송문섭;윤영주
- The Korean Journal of Applied Statistics / v.14 no.2 / pp.475-486 / 2001
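We compared the variable selection methods of CART, CHAID, QUEST, and C4.5 among the classification tree algorithms implemented in data mining packages. It is well known that the exhaustive search method of CART is biased; here, the bias and selection power of these algorithms in commercial packages were compared through a simulation study. The commercial packages used were CART, Enterprise Miner, AnswerTree, and Clementine. According to the limited simulation results of this paper, both C4.5 and CART have serious bias in variable selection, whereas CHAID and QUEST show relatively stable results.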

