• Title/Summary/Keyword: Tree mining

The Relationship of the Concentration in Physical space and the proliferation of Cyber space : focusing on the Concentration of Plastic Surgery Clinics at Kangnam-gu, Korea (사이버 공간의 확산과 물리적 공간에서의 집중화 현상의 관련성 : 성형외과의 강남구 집중현상 고찰)

  • Cho, Yeong-Bin;Choi, Young-Keun
    • Journal of Information Technology Applications and Management / v.19 no.1 / pp.85-100 / 2012
  • Technological development brings about many changes. Many researchers have argued that the proliferation of cyber space changes physical space. Their arguments fall into three positions: that the proliferation of cyber space brings about concentration in physical space, that it brings about decentralization, or that it brings about both at the same time. In Korea, plastic surgery clinics became concentrated in the Kangnam-gu area at around the same time as the Internet spread. In this research, we conduct an empirical study of whether the concentration of plastic surgery clinics in specific areas correlates with the proliferation of cyber space. To do this, we tested the homogeneity of plastic surgery websites between the Kangnam-gu and non-Kangnam-gu areas, using three statistical and data-mining techniques: multiple discriminant analysis, decision tree analysis, and artificial neural network analysis. The results showed homogeneity between the clinic websites of the two areas, although no great heterogeneity was found either. Therefore, in the case of the concentration of plastic surgery clinics in Korea, the proliferation of cyber space correlates only to a limited extent with concentration in physical space.

Computational Methods for Traditional Korean Medicine : A survey (한의 정보의 계산적 방법 조사)

  • Kim, Sang-Kyun;Jang, Hyun-Chul;Kim, Jin-Hyun;Kim, Chul;Yea, Sang-Jun;Song, Mi-Young
    • Journal of Physiology & Pathology in Korean Medicine / v.25 no.5 / pp.894-899 / 2011
  • Traditional Korean Medicine (TKM) has been actively researched through various approaches, including computational methods. This paper aims to provide an overview of domestic studies that use computational techniques in the TKM field. A literature search of Korean publications was conducted using the OASIS system, and major data mining studies in TKM were identified. The review covers six diagnosis fields: sasang constitution diagnosis, eight constitution diagnosis, tongue diagnosis, pattern diagnosis for stroke, ontology-based diagnosis, and diagnosis of the cause of disease. The studies collected their own clinical data for experiments and primarily applied decision tree, SVM, neural network, case-based reasoning, ontology reasoning, and discriminant analysis algorithms. Future work needs to identify which algorithms are suited to diagnosis and to other fields of TKM.

Effective R & D Management using Data Mining Classification Techniques (데이터마이닝 분류기법을 이용한 효과적인 연구관리에 관한 연구)

  • 황석해;문태수;이준한
    • Journal of Information Technology Application / v.3 no.2 / pp.1-24 / 2001
  • The purpose of this study is to derive important criteria for improving the customer relationships of R institute using data mining techniques. The research examines patterns and interactions among research variables in the research management database of R institute, classifies research contract organizations into outside and inside organizations, and sets directions for customer relationship management by analyzing the research type and research cost of research topics. To derive criteria variables through pattern analysis of the research database, a decision tree algorithm is employed. The results show that, of 17 input variables, research period, overhead cost, and R & D cost are the determinant variables for classifying outside and inside contract organizations.

Cloud Computing Adoption Decision-Making Modeling Using CART (CART 방법론을 사용한 클라우드 컴퓨팅 도입 의사 결정 모델링)

  • Baek, Seung Hyun;Chang, Byeong-Yun
    • Journal of the Korea Society for Simulation / v.23 no.4 / pp.189-195 / 2014
  • In this paper, we study a decision-making model for adopting cloud computing (CC), which is free of constraints of place and time. Panel survey data collected from 65 people and CART (classification and regression tree), a data mining approach, are used to construct the decision-making model. The modeling has two steps: first, significant questions (variables) are selected; then the CART decision-making model is constructed from the selected variables. In the variable selection stage, the 25 questions are reduced to 5. The benefits of reducing the questions are quicker responses from respondents and shorter model-construction time.
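
The two-step procedure in this abstract (select variables, then build a CART model) can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the abstract does not say how the significant questions were selected, so ranking yes/no questions by Gini gain is an assumption, and the names `select_features`, `build_tree`, and `predict` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_gain(rows, labels, feature):
    """Impurity reduction from splitting on a yes/no (1/0) feature."""
    left = [y for r, y in zip(rows, labels) if r[feature] == 0]
    right = [y for r, y in zip(rows, labels) if r[feature] == 1]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def select_features(rows, labels, k):
    """Step 1 (assumed criterion): keep the k features with the largest Gini gain."""
    features = list(rows[0].keys())
    ranked = sorted(features, key=lambda f: gini_gain(rows, labels, f), reverse=True)
    return ranked[:k]


def build_tree(rows, labels, features, depth=0, max_depth=3):
    """Step 2: grow a small CART-style classification tree on the selected features.

    A leaf is a class label (a string); an internal node is (feature, {0: subtree, 1: subtree}).
    """
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or gini(labels) == 0.0 or not features:
        return majority
    best = max(features, key=lambda f: gini_gain(rows, labels, f))
    if gini_gain(rows, labels, best) <= 0.0:
        return majority
    branches = {}
    for value in (0, 1):
        subset = [(r, y) for r, y in zip(rows, labels) if r[best] == value]
        if not subset:
            branches[value] = majority
        else:
            sub_rows, sub_labels = zip(*subset)
            branches[value] = build_tree(
                list(sub_rows), list(sub_labels), features, depth + 1, max_depth
            )
    return (best, branches)


def predict(tree, row):
    """Follow the branches until a leaf label is reached."""
    while isinstance(tree, tuple):
        feature, branches = tree
        tree = branches[row[feature]]
    return tree
```

With survey answers stored as dictionaries of 0/1 values, `select_features(rows, labels, 5)` would mirror the reduction from 25 questions to 5, and `build_tree` would then be called with only those five names.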

A Method of Predicting Service Time Based on Voice of Customer Data (고객의 소리(VOC) 데이터를 활용한 서비스 처리 시간 예측방법)

  • Kim, Jeonghun;Kwon, Ohbyung
    • Journal of Information Technology Services / v.15 no.1 / pp.197-210 / 2016
  • With the advent of text analytics, VOC (Voice of Customer) data have become an important resource that gives managers and marketing practitioners access to consumers' veiled opinions and requirements. In other words, making relevant use of VOC data potentially improves customer responsiveness and satisfaction, each of which eventually improves business performance. However, unstructured data such as the customer complaints in VOC data have seldom been used in marketing practice, for example to predict service time as an index of service quality. This is because VOC data containing unstructured data take too complicated a form, and converting the unstructured data into structured data is a difficult process. Hence, this study aims to propose a prediction model that improves the estimation accuracy of the level of customer satisfaction by combining unstructured features obtained through text mining with the structured data features in VOC. The relationship between the unstructured and structured data and service processing time is also examined through regression analysis. Text mining techniques, sentiment analysis, keyword extraction, classification algorithms, decision trees, and multiple regression are considered and compared. For the experiment, we used actual VOC data from a company.

Social Network Analysis to Analyze the Purchase Behavior Of Churning Customers and Loyal Customers (사회 네트워크 분석을 이용한 충성고객과 이탈고객의 구매 특성 비교 연구)

  • Kim, Jae-Kyeong;Choi, Il-Young;Kim, Hyea-Kyeong;Kim, Nam-Hee
    • Korean Management Science Review / v.26 no.1 / pp.183-196 / 2009
  • Customer retention has been a pressing issue for companies seeking to win and keep loyal customers in a competitive environment. Many researchers have sought to identify the characteristics of churning customers and loyal customers using data mining techniques such as decision trees. However, such existing research does not consider relationships among customers. Social network analysis has been used to examine relationships among entities in networks such as genetic networks, traffic networks, and organization networks. In this study, a customer network is proposed to investigate the differences in network characteristics between churning customers and loyal customers. The customer networks are constructed by analyzing real purchase data collected from a Korean cosmetics provider. We investigated whether the churning customers and the loyal customers have different degree centralities and network densities. In addition, we compared the products purchased by the churning customers with those purchased by the loyal customers. Our results indicate that the degree centrality and density of the churning customer network are higher than those of the loyal customer network, and that churning customers purchase a wider variety of products than loyal customers. We expect that the suggested social network analysis can be used as a methodology complementary to existing statistical and data mining analyses.
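
The two network measures compared in this abstract, degree centrality and density, can be computed for a small undirected network as in the sketch below. The abstract does not say how links between customers are defined, so the edge list is simply taken as given; the function names and the toy network are hypothetical.

```python
def degree_centrality(nodes, edges):
    """Normalized degree centrality: degree / (n - 1) in an undirected simple graph.

    Assumes at least two nodes; self-loops and duplicate edges are ignored.
    """
    neighbors = {v: set() for v in nodes}
    for a, b in edges:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(nodes)
    return {v: len(neighbors[v]) / (n - 1) for v in nodes}


def density(nodes, edges):
    """Share of possible undirected links that are present: 2E / (n * (n - 1))."""
    n = len(nodes)
    unique_edges = {frozenset(e) for e in edges if e[0] != e[1]}
    return 2 * len(unique_edges) / (n * (n - 1))


if __name__ == "__main__":
    # Hypothetical four-customer network in which customer "a" is linked to everyone.
    customers = ["a", "b", "c", "d"]
    links = [("a", "b"), ("a", "c"), ("a", "d")]
    print(degree_centrality(customers, links))
    print(density(customers, links))
```

Comparing the two customer groups would then amount to building one edge list per group and comparing the resulting centrality distributions and densities.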

Estimate Soil Moisture Using Satellite Image and Data Mining (위성영상과 데이터 마이닝 기법을 이용한 토양수분 산정)

  • Kim, Gwang-Seob;Park, Han-Gyun;Cho, So-Hyun
    • Proceedings of the Korea Water Resources Association Conference / 2010.05a / pp.1615-1619 / 2010
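  • Soil moisture is the water held in soil particles and is an important factor in regulating the energy balance and water cycle between the land surface and the atmosphere. In this study, soil moisture was estimated with CART (Classification And Regression Tree), a data mining technique, using Normalized Difference Vegetation Index (NDVI) and land surface temperature data obtained from MODIS (Moderate Resolution Imaging Spectroradiometer) satellite observations from January 2003 to December 2008; precipitation and soil temperature data from 57 of Korea's 76 weather stations, excluding stations with records of 30 years or less and island sites; and land cover and effective soil depth data for the whole country. First, soil moisture was estimated at six sites in the Yongdam Dam basin, which have reliable soil moisture observations, to assess the applicability of the approach. Observations from three sites were used to build the estimation model, and the correlation coefficient between observed and estimated soil moisture at the one site used for validation showed that the model reproduces the overall soil moisture behavior well, confirming its applicability. The model was then used to estimate the soil moisture distribution in the Yongdam Dam basin and a soil moisture distribution map for the whole of Korea. Although reliable ground-based soil moisture observations are not available for a variety of surface conditions, the proposed method is judged to be a reasonable approach for estimating soil moisture across Korea with the limited data available.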

Financial Instruments Recommendation based on Classification Financial Consumer by Text Mining Techniques (비정형 데이터 분석을 통한 금융소비자 유형화 및 그에 따른 금융상품 추천 방법)

  • Lee, Jaewoong;Kim, Young-Sik;Kwon, Ohbyung
    • Journal of Information Technology Services / v.15 no.4 / pp.1-24 / 2016
  • With the innovation of information technology, non-face-to-face robo-advisors offering high accessibility and convenience are spreading. Current robo-advisors recommend appropriate investment products after assessing investment propensity from structured data entered directly or indirectly by individuals. However, asking financial consumers to report or input their own subjective propensity to invest is inconvenient and obtrusive. Hence, this study proposes a way to infer the propensity to invest from unstructured data that customers voluntarily reveal during consultation or online. Since prediction performance on unstructured documents differs according to the characteristics of the text, this study selects a classification algorithm suited to the text left by financial consumers by evaluating the prediction performance of various learning algorithms, and proposes an intelligent method that automatically recommends investment products. User tests were conducted with MBA students. After being shown the recommended investment and a list of investment products, participants were asked about their satisfaction. Satisfaction was measured separately for investment propensity and for recommended products. The results suggest that users were highly satisfied with the investment products recommended by the proposed method, and that the method can be applied to non-face-to-face robo-advisors.

Stream-based Biomedical Classification Algorithms for Analyzing Biosignals

  • Fong, Simon;Hang, Yang;Mohammed, Sabah;Fiaidhi, Jinan
    • Journal of Information Processing Systems / v.7 no.4 / pp.717-732 / 2011
  • Classification in biomedical applications is an important task that predicts or classifies an outcome based on a given set of input variables, such as diagnostic tests or the symptoms of a patient. Traditionally, classification algorithms have had to digest a stationary set of historical data in order to train a decision-tree model, and the learned model could then be used for testing new samples. However, a new breed of classification called stream-based classification can handle continuous data streams, which are ever-evolving, unbounded, and unstructured, for instance live biosignal feeds. These emerging algorithms can potentially be used for real-time classification over biosignal data streams such as EEG and ECG. This paper presents a pioneering effort that studies the feasibility of classification algorithms for analyzing biosignals in the form of infinite data streams. First, a performance comparison is made between traditional and stream-based classification. The results show that accuracy declines intermittently for traditional classification because the model must be re-learned as new data arrive. Second, we show by simulation that biosignal data streams can be processed with a satisfactory level of performance in terms of accuracy, memory requirement, and speed by using a collection of stream-mining algorithms called Optimized Very Fast Decision Trees. The algorithms can effectively serve as a cornerstone technology for real-time classification in future biomedical applications.
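
Very Fast Decision Trees decide when to split a leaf by applying the Hoeffding bound to the statistics gathered from the stream so far, which is what lets them learn without storing or re-reading past samples. The sketch below shows only that basic split test; it does not reproduce the optimizations studied in the paper, and the function names and the default values of `delta` and `tie_threshold` are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that, with probability 1 - delta, the true mean of a variable
    with the given range lies within epsilon of the mean observed over n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, n, num_classes=2,
                 delta=1e-7, tie_threshold=0.05):
    """Split when the best attribute beats the runner-up by more than epsilon,
    or when epsilon is so small that the two candidates are effectively tied."""
    value_range = math.log2(num_classes)  # range of information gain
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    # The same gain difference becomes decisive only after enough samples arrive.
    for n in (200, 2000, 20000):
        print(n, round(hoeffding_bound(1.0, 1e-7, n), 4),
              should_split(0.30, 0.25, n))
```

Because epsilon shrinks as the number of samples at a leaf grows, a leaf holds off on splitting until the stream has supplied enough evidence, and each sample only needs to be seen once.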

Implementing Linear Models in Genetic Programming to Utilize Accumulated Data in Shipbuilding (조선분야의 축적된 데이터 활용을 위한 유전적프로그래밍에서의 선형(Linear) 모델 개발)

  • Lee, Kyung-Ho;Yeun, Yun-Seog;Yang, Young-Soon
    • Journal of the Society of Naval Architects of Korea / v.42 no.5 s.143 / pp.534-541 / 2005
  • Korean shipyards have accumulated a great amount of data, but they lack appropriate tools to use the data in practical work. Engineering data embody experts' experience and know-how, so it is very useful to extract knowledge or information from the accumulated data using data mining techniques. This paper treats evolutionary computation based on genetic programming (GP), which can be one of the components for realizing data mining. The paper deals with linear models of GP for regression or approximation problems in which the given learning samples are insufficient. The linear model, which is a function of unknown parameters, is built by extracting all possible base functions from the standard GP tree using a symbolic processing algorithm. In addition to a standard linear model consisting of mathematical functions, the paper considers a variant linear model that can be built using low-order Taylor series and converted into the standard form of a polynomial. The suggested model can be used as a design tool to predict design parameters from a small amount of accumulated data.
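
A model that is linear in its unknown parameters, y ≈ c_1 f_1(x) + ... + c_m f_m(x), can be fitted by ordinary least squares once the base functions f_j are fixed. The sketch below assumes hand-picked base functions; the paper extracts them from a GP tree by symbolic processing, which is not reproduced here, and `fit_linear_model` is a hypothetical name.

```python
def fit_linear_model(base_functions, xs, ys):
    """Least-squares coefficients c_j for y ~ sum_j c_j * f_j(x).

    Solves the normal equations (Phi^T Phi) c = Phi^T y by Gauss-Jordan
    elimination with partial pivoting.
    """
    m = len(base_functions)
    phi = [[f(x) for f in base_functions] for x in xs]
    # Augmented matrix [Phi^T Phi | Phi^T y].
    a = [
        [sum(row[i] * row[j] for row in phi) for j in range(m)]
        + [sum(row[i] * y for row, y in zip(phi, ys))]
        for i in range(m)
    ]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("base functions are linearly dependent on these samples")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(m):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][m] / a[i][i] for i in range(m)]


if __name__ == "__main__":
    # Hypothetical low-order polynomial base functions and a handful of samples.
    base = [lambda x: 1.0, lambda x: x, lambda x: x * x]
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [2.0 + 3.0 * x - 0.5 * x * x for x in xs]
    print(fit_linear_model(base, xs, ys))
```

Only the coefficients are estimated from data, which is why this kind of model remains usable when few learning samples are available.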