• Title/Summary/Keyword: CART algorithm

Search Result 91, Processing Time 0.023 seconds

Integrating Ant Colony Clustering Method to a Multi-Robot System Using Mobile Agents

  • Kambayashi, Yasushi;Ugajin, Masataka;Sato, Osamu;Tsujimura, Yasuhiro;Yamachi, Hidemi;Takimoto, Munehiro;Yamamoto, Hisashi
    • Industrial Engineering and Management Systems
    • /
    • v.8 no.3
    • /
    • pp.181-193
    • /
    • 2009
  • This paper presents a framework for controlling mobile multiple robots connected by communication networks. This framework provides novel methods to control coordinated systems using mobile agents. The combination of the mobile agent and mobile multiple robots opens a new horizon of efficient use of mobile robot resources. Instead of physical movement of multiple robots, mobile software agents can migrate from one robot to another so that they can minimize energy consumption in aggregation. The imaginary application is making "carts," such as found in large airports, intelligent. Travelers pick up carts at designated points but leave them arbitrary places. It is a considerable task to re-collect them. It is, therefore, desirable that intelligent carts (intelligent robots) draw themselves together automatically. Simple implementation may be making each cart has a designated assembly point, and when they are free, automatically return to those points. It is easy to implement, but some carts have to travel very long way back to their own assembly point, even though it is located close to some other assembly points. It consumes too much unnecessary energy so that the carts have to have expensive batteries. In order to ameliorate the situation, we employ mobile software agents to locate robots scattered in a field, e.g. an airport, and make them autonomously determine their moving behaviors by using a clustering algorithm based on the Ant Colony Optimization (ACO). ACO is the swarm intelligence-based methods, and a multi-agent system that exploit artificial stigmergy for the solution of combinatorial optimization problems. Preliminary experiments have provided a favorable result. In this paper, we focus on the implementation of the controlling mechanism of the multi-robots using the mobile agents.

The Factors of Participating in a Smoking Cessation Program using Integrated Method of Decision Tree and Neural Network Algorithm (인공신경망 분석과 결정트리 융합에 의한 금연 프로그램 참여 결정 요인)

  • Byeon, Haewon
    • Journal of the Korea Convergence Society
    • /
    • v.6 no.2
    • /
    • pp.25-30
    • /
    • 2015
  • The purpose of this study was to analyze the factors that affects the participating in a smoking cessation program. Data were from the A Study on the Seoul Welfare Panel Study 2010. Subjects were 1,326 smokers aged 19 and older living in the community. Dependent variable was defined as experience of smoking cessation. Explanatory variables were included as age, gender, level of education, employment status, household income, marital status, drinking, self-reported health status, depression, disease, and physical activity. A prediction model was developed by the use of a Decision Tree and Neural Network Algorithm. In the Prediction model, self reported health status, disease, income, household income were significantly associated with participating in a smoking cessation program. Based this study, systematic education and development of programs are required.

Enhanced Recommendation Algorithm using Semantic Collaborative Filtering: E-commerce Portal (전자상거래 포탈을 위한 시맨틱 협업 필터링을 이용한 확장된 추천 알고리즘)

  • Ahmed, Shohel;Kim, Jong-Woo;Kang, Sang-Gil
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.3
    • /
    • pp.79-98
    • /
    • 2011
  • This paper proposes a semantic recommendation technique for a personalized e-commerce portal. Semantic recommendation is achieved by utilizing the attributes of products. The semantic similarity of the products is merged with the rating information of the products to provide an accurate recommendation. The recommendation technique also analyzes various attitudes of the customer to evaluate the implicit rating of products. Attitudes are classifies into three types such as "purchasing product", "adding product to shopping cart", and "viewing the product information." We implicitly track customer attitude to estimate the rating of products for recommending products. Also we implement a session validation process to identify the valid sessions that are highly important for giving an accurate recommendation. Our recommendation technique shows a high degree of accuracy as we use age groupings of customers with similar preferences. The experimental section shows that our proposed recommendation method outperforms well known collaborative filtering methods not only for the existing customer, but also for the new user with no previous purchase record.

A Machine Learning Approach for Mechanical Motor Fault Diagnosis (기계적 모터 고장진단을 위한 머신러닝 기법)

  • Jung, Hoon;Kim, Ju-Won
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.40 no.1
    • /
    • pp.57-64
    • /
    • 2017
  • In order to reduce damages to major railroad components, which have the potential to cause interruptions to railroad services and safety accidents and to generate unnecessary maintenance costs, the development of rolling stock maintenance technology is switching from preventive maintenance based on the inspection period to predictive maintenance technology, led by advanced countries. Furthermore, to enhance trust in accordance with the speedup of system and reduce maintenances cost simultaneously, the demand for fault diagnosis and prognostic health management technology is increasing. The objective of this paper is to propose a highly reliable learning model using various machine learning algorithms that can be applied to critical rolling stock components. This paper presents a model for railway rolling stock component fault diagnosis and conducts a mechanical failure diagnosis of motor components by applying the machine learning technique in order to ensure efficient maintenance support along with a data preprocessing plan for component fault diagnosis. This paper first defines a failure diagnosis model for rolling stock components. Function-based algorithms ANFIS and SMO were used as machine learning techniques for generating the failure diagnosis model. Two tree-based algorithms, RadomForest and CART, were also employed. In order to evaluate the performance of the algorithms to be used for diagnosing failures in motors as a critical railroad component, an experiment was carried out on 2 data sets with different classes (includes 6 classes and 3 class levels). According to the results of the experiment, the random forest algorithm, a tree-based machine learning technique, showed the best performance.

A Study on Factors of Education's Outcome using Decision Trees (의사결정트리를 이용한 교육성과 요인에 관한 연구)

  • Kim, Wan-Seop
    • Journal of Engineering Education Research
    • /
    • v.13 no.4
    • /
    • pp.51-59
    • /
    • 2010
  • In order to manage the lectures efficiently in the university and improve the educational outcome, the process is needed that make diagnosis of the present educational outcome of each classes on a lecture and find factors of educational outcome. In most studies for finding the factors of the efficient lecture, statistical methods such as association analysis, regression analysis are used usually, and recently decision tree analysis is employed, too. The decision tree analysis have the merits that is easy to understand a result model, and to be easy to apply for the decision making, but have the weaknesses that is not strong for characteristic of input data such as multicollinearity. This paper indicates the weaknesses of decision tree analysis, and suggests the experimental solution using multiple decision tree algorithm to supplement these problems. The experimental result shows that the suggested method is more effective in finding the reliable factors of the educational outcome.

  • PDF

Machine Learning Algorithm for Estimating Ink Usage (머신러닝을 통한 잉크 필요량 예측 알고리즘)

  • Se Wook Kwon;Young Joo Hyun;Hyun Chul Tae
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.46 no.1
    • /
    • pp.23-31
    • /
    • 2023
  • Research and interest in sustainable printing are increasing in the packaging printing industry. Currently, predicting the amount of ink required for each work is based on the experience and intuition of field workers. Suppose the amount of ink produced is more than necessary. In this case, the rest of the ink cannot be reused and is discarded, adversely affecting the company's productivity and environment. Nowadays, machine learning models can be used to figure out this problem. This study compares the ink usage prediction machine learning models. A simple linear regression model, Multiple Regression Analysis, cannot reflect the nonlinear relationship between the variables required for packaging printing, so there is a limit to accurately predicting the amount of ink needed. This study has established various prediction models which are based on CART (Classification and Regression Tree), such as Decision Tree, Random Forest, Gradient Boosting Machine, and XGBoost. The accuracy of the models is determined by the K-fold cross-validation. Error metrics such as root mean squared error, mean absolute error, and R-squared are employed to evaluate estimation models' correctness. Among these models, XGBoost model has the highest prediction accuracy and can reduce 2134 (g) of wasted ink for each work. Thus, this study motivates machine learning's potential to help advance productivity and protect the environment.

유비쿼터스 컴퓨팅 황경에서 발생하는 에이전트간 충돌 해결 모델

  • 이건수;김민구
    • Proceedings of the Korea Inteligent Information System Society Conference
    • /
    • 2004.11a
    • /
    • pp.249-258
    • /
    • 2004
  • 오늘날 활발하게 이루어지고 있는 유비쿼터스 컴퓨팅 관련 기술 연구는 사용자가 시간과 장소에 구애받지 않고 네트워크에 접근해 다양한 컴퓨터 관련 서비스를 제공 받을 수 있는 방법에 초점을 맞추고 있다. 이 처럼 시간과 공간의 한계를 뛰어 넘은 네트워크로의 자유로운 접근은 일상 생활의 패러다임을 바꾸어 놓게 될 것이다. 유비쿼터스 컴퓨팅 기술을 통해 가장 큰 변화가 일어나는 분야는 일반 가정환경에서 일어나는 인텔리전트 홈 네트워크 (Intelligent Home Network) 라고 할 수 있다. 집에 들어오면, 자동으로 문을 열어주고, 불을 켜주며, 놓쳤던 TV 프로그램을 자동으로 녹화해 놓았다가 원하는 시간에 보여주고, 적당한 시간에 목욕물을 미리 받아준다. 또한 집밖으로 나가기 전, 일기예보에 따라 우산을 챙겨주고, 일정을 확인시켜주며 입고 나갈 옷을 골라줄 수도 있다. 이 모든 일들이 유비쿼터스 컴퓨팅 기술이 가져올 인텔리전트 홈 네트워크의 모습이다. 그러나, 모든 사용자에게 효과적인 서비스를 제공하기 위해서는 홈 네트워크 상의 자원 관리에서 일어날 수 있는 에이전트들간의 자원 접근 권한 충돌을 효율적으로 방지할 수 있는 기술이 필요하다. 유비쿼터스 컴퓨팅 환경에서 자원관리 특성은 점유의 연속성, 자원 사이의 연관성, 그리고 자원과 사용자 사 사이의 연계성의 3 가지 특성을 지니고 있다. 본 논문에서는 유비쿼터스 컴퓨팅 환경에서 일어날 수 있는 자원 충돌 상황을 효율적으로 처리하기 위한 자원 협상 방법을 제안한다. 본 방법은 자원 관리 특성을 바탕으로 시간논리에 기반을 둔 자원 선점과 분배 규칙으로 구성된다.트 시스템은 b-Cart를 기반으로 할 것으로 예측할 수 있다.타났다. 또한, 스네이크의 초기 제어점을 얼굴은 44개, 눈은 16개, 입은 24개로 지정하여 MER추출에 성공한 영상에 대해 스네이크 알고리즘을 수행한 결과, 추출된 영역의 오차율은 각각 2.2%, 2.6%, 2.5%로 나타났다.해서 Template-based reasoning 예를 보인다 본 방법론은 검색노력을 줄이고, 검색에 있어 Feasibility와 Admissibility를 보장한다.매김할 수 있는 중요한 계기가 될 것이다.재무/비재무적 지표를 고려한 인공신경망기법의 예측적중률이 높은 것으로 나타났다. 즉, 로지스틱회귀 분석의 재무적 지표모형은 훈련, 시험용이 84.45%, 85.10%인 반면, 재무/비재무적 지표모형은 84.45%, 85.08%로서 거의 동일한 예측적중률을 가졌으나 인공신경망기법 분석에서는 재무적 지표모형이 92.23%, 85.10%인 반면, 재무/비재무적 지표모형에서는 91.12%, 88.06%로서 향상된 예측적중률을 나타내었다.ting LMS according to increasing the step-size parameter $\mu$ in the experimentally computed. learning curve. Also we find that convergence speed of proposed algorithm is increased by (B+1) time proportional to B which B is the number of recycled data b

  • PDF

Development of Sentiment Analysis Model for the hot topic detection of online stock forums (온라인 주식 포럼의 핫토픽 탐지를 위한 감성분석 모형의 개발)

  • Hong, Taeho;Lee, Taewon;Li, Jingjing
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.187-204
    • /
    • 2016
  • Document classification based on emotional polarity has become a welcomed emerging task owing to the great explosion of data on the Web. In the big data age, there are too many information sources to refer to when making decisions. For example, when considering travel to a city, a person may search reviews from a search engine such as Google or social networking services (SNSs) such as blogs, Twitter, and Facebook. The emotional polarity of positive and negative reviews helps a user decide on whether or not to make a trip. Sentiment analysis of customer reviews has become an important research topic as datamining technology is widely accepted for text mining of the Web. Sentiment analysis has been used to classify documents through machine learning techniques, such as the decision tree, neural networks, and support vector machines (SVMs). is used to determine the attitude, position, and sensibility of people who write articles about various topics that are published on the Web. Regardless of the polarity of customer reviews, emotional reviews are very helpful materials for analyzing the opinions of customers through their reviews. Sentiment analysis helps with understanding what customers really want instantly through the help of automated text mining techniques. Sensitivity analysis utilizes text mining techniques on text on the Web to extract subjective information in the text for text analysis. Sensitivity analysis is utilized to determine the attitudes or positions of the person who wrote the article and presented their opinion about a particular topic. In this study, we developed a model that selects a hot topic from user posts at China's online stock forum by using the k-means algorithm and self-organizing map (SOM). In addition, we developed a detecting model to predict a hot topic by using machine learning techniques such as logit, the decision tree, and SVM. We employed sensitivity analysis to develop our model for the selection and detection of hot topics from China's online stock forum. The sensitivity analysis calculates a sentimental value from a document based on contrast and classification according to the polarity sentimental dictionary (positive or negative). The online stock forum was an attractive site because of its information about stock investment. Users post numerous texts about stock movement by analyzing the market according to government policy announcements, market reports, reports from research institutes on the economy, and even rumors. We divided the online forum's topics into 21 categories to utilize sentiment analysis. One hundred forty-four topics were selected among 21 categories at online forums about stock. The posts were crawled to build a positive and negative text database. We ultimately obtained 21,141 posts on 88 topics by preprocessing the text from March 2013 to February 2015. The interest index was defined to select the hot topics, and the k-means algorithm and SOM presented equivalent results with this data. We developed a decision tree model to detect hot topics with three algorithms: CHAID, CART, and C4.5. The results of CHAID were subpar compared to the others. We also employed SVM to detect the hot topics from negative data. The SVM models were trained with the radial basis function (RBF) kernel function by a grid search to detect the hot topics. The detection of hot topics by using sentiment analysis provides the latest trends and hot topics in the stock forum for investors so that they no longer need to search the vast amounts of information on the Web. Our proposed model is also helpful to rapidly determine customers' signals or attitudes towards government policy and firms' products and services.

Development of Predictive Models for Rights Issues Using Financial Analysis Indices and Decision Tree Technique (경영분석지표와 의사결정나무기법을 이용한 유상증자 예측모형 개발)

  • Kim, Myeong-Kyun;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.4
    • /
    • pp.59-77
    • /
    • 2012
  • This study focuses on predicting which firms will increase capital by issuing new stocks in the near future. Many stakeholders, including banks, credit rating agencies and investors, performs a variety of analyses for firms' growth, profitability, stability, activity, productivity, etc., and regularly report the firms' financial analysis indices. In the paper, we develop predictive models for rights issues using these financial analysis indices and data mining techniques. This study approaches to building the predictive models from the perspective of two different analyses. The first is the analysis period. We divide the analysis period into before and after the IMF financial crisis, and examine whether there is the difference between the two periods. The second is the prediction time. In order to predict when firms increase capital by issuing new stocks, the prediction time is categorized as one year, two years and three years later. Therefore Total six prediction models are developed and analyzed. In this paper, we employ the decision tree technique to build the prediction models for rights issues. The decision tree is the most widely used prediction method which builds decision trees to label or categorize cases into a set of known classes. In contrast to neural networks, logistic regression and SVM, decision tree techniques are well suited for high-dimensional applications and have strong explanation capabilities. There are well-known decision tree induction algorithms such as CHAID, CART, QUEST, C5.0, etc. Among them, we use C5.0 algorithm which is the most recently developed algorithm and yields performance better than other algorithms. We obtained data for the rights issue and financial analysis from TS2000 of Korea Listed Companies Association. A record of financial analysis data is consisted of 89 variables which include 9 growth indices, 30 profitability indices, 23 stability indices, 6 activity indices and 8 productivity indices. For the model building and test, we used 10,925 financial analysis data of total 658 listed firms. PASW Modeler 13 was used to build C5.0 decision trees for the six prediction models. Total 84 variables among financial analysis data are selected as the input variables of each model, and the rights issue status (issued or not issued) is defined as the output variable. To develop prediction models using C5.0 node (Node Options: Output type = Rule set, Use boosting = false, Cross-validate = false, Mode = Simple, Favor = Generality), we used 60% of data for model building and 40% of data for model test. The results of experimental analysis show that the prediction accuracies of data after the IMF financial crisis (59.04% to 60.43%) are about 10 percent higher than ones before IMF financial crisis (68.78% to 71.41%). These results indicate that since the IMF financial crisis, the reliability of financial analysis indices has increased and the firm intention of rights issue has been more obvious. The experiment results also show that the stability-related indices have a major impact on conducting rights issue in the case of short-term prediction. On the other hand, the long-term prediction of conducting rights issue is affected by financial analysis indices on profitability, stability, activity and productivity. All the prediction models include the industry code as one of significant variables. This means that companies in different types of industries show their different types of patterns for rights issue. We conclude that it is desirable for stakeholders to take into account stability-related indices and more various financial analysis indices for short-term prediction and long-term prediction, respectively. The current study has several limitations. First, we need to compare the differences in accuracy by using different data mining techniques such as neural networks, logistic regression and SVM. Second, we are required to develop and to evaluate new prediction models including variables which research in the theory of capital structure has mentioned about the relevance to rights issue.

An Analytical Approach Using Topic Mining for Improving the Service Quality of Hotels (호텔 산업의 서비스 품질 향상을 위한 토픽 마이닝 기반 분석 방법)

  • Moon, Hyun Sil;Sung, David;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.21-41
    • /
    • 2019
  • Thanks to the rapid development of information technologies, the data available on Internet have grown rapidly. In this era of big data, many studies have attempted to offer insights and express the effects of data analysis. In the tourism and hospitality industry, many firms and studies in the era of big data have paid attention to online reviews on social media because of their large influence over customers. As tourism is an information-intensive industry, the effect of these information networks on social media platforms is more remarkable compared to any other types of media. However, there are some limitations to the improvements in service quality that can be made based on opinions on social media platforms. Users on social media platforms represent their opinions as text, images, and so on. Raw data sets from these reviews are unstructured. Moreover, these data sets are too big to extract new information and hidden knowledge by human competences. To use them for business intelligence and analytics applications, proper big data techniques like Natural Language Processing and data mining techniques are needed. This study suggests an analytical approach to directly yield insights from these reviews to improve the service quality of hotels. Our proposed approach consists of topic mining to extract topics contained in the reviews and the decision tree modeling to explain the relationship between topics and ratings. Topic mining refers to a method for finding a group of words from a collection of documents that represents a document. Among several topic mining methods, we adopted the Latent Dirichlet Allocation algorithm, which is considered as the most universal algorithm. However, LDA is not enough to find insights that can improve service quality because it cannot find the relationship between topics and ratings. To overcome this limitation, we also use the Classification and Regression Tree method, which is a kind of decision tree technique. Through the CART method, we can find what topics are related to positive or negative ratings of a hotel and visualize the results. Therefore, this study aims to investigate the representation of an analytical approach for the improvement of hotel service quality from unstructured review data sets. Through experiments for four hotels in Hong Kong, we can find the strengths and weaknesses of services for each hotel and suggest improvements to aid in customer satisfaction. Especially from positive reviews, we find what these hotels should maintain for service quality. For example, compared with the other hotels, a hotel has a good location and room condition which are extracted from positive reviews for it. In contrast, we also find what they should modify in their services from negative reviews. For example, a hotel should improve room condition related to soundproof. These results mean that our approach is useful in finding some insights for the service quality of hotels. That is, from the enormous size of review data, our approach can provide practical suggestions for hotel managers to improve their service quality. In the past, studies for improving service quality relied on surveys or interviews of customers. However, these methods are often costly and time consuming and the results may be biased by biased sampling or untrustworthy answers. The proposed approach directly obtains honest feedback from customers' online reviews and draws some insights through a type of big data analysis. So it will be a more useful tool to overcome the limitations of surveys or interviews. Moreover, our approach easily obtains the service quality information of other hotels or services in the tourism industry because it needs only open online reviews and ratings as input data. Furthermore, the performance of our approach will be better if other structured and unstructured data sources are added.