• Title/Summary/Keyword: Association Rules Analysis


Analysis of Internet User Features using Multi-dimensional Association Analysis (다차원 연관 분석을 이용한 인터넷 이용자의 특징 분석)

  • Lee, Su-Eun;Jung, Yong-Gyu
    • Journal of Service Research and Studies / v.1 no.1 / pp.61-69 / 2011
  • Data mining means finding useful, previously unknown knowledge in large databases that cannot be extracted with a simple query. It can be defined as gaining insight into the data. In this paper, we look for useful patterns in Web data, or in data stored for a target Web site, in order to analyze the characteristics of its users. By applying association analysis to general statistical data on Internet users, we analyze the user characteristics that affect the amount of time spent on the Internet. Through experiments that extract association rules from the data, we applied the data preprocessing and the mining algorithm that produced the best results, and analyzed the characteristics of Internet users.


An Investigation on Expanding Co-occurrence Criteria in Association Rule Mining (연관규칙 마이닝에서의 동시성 기준 확장에 대한 연구)

  • Kim, Mi-Sung;Kim, Nam-Gyu;Ahn, Jae-Hyeon
    • Journal of Intelligence and Information Systems / v.18 no.1 / pp.23-38 / 2012
  • There is a large difference between purchasing patterns in an online shopping mall and in an offline market. This difference may be caused mainly by the difference in accessibility between online and offline markets. That is, the interval between the initial purchasing decision and its realization appears to be relatively short in an online shopping mall, because a customer can place an order immediately. Because of this short interval, an online shopping mall transaction usually contains fewer items than an offline market transaction. In an offline market, customers usually keep some items in mind and buy them all at once a few days after deciding to buy them, instead of buying each item individually and immediately. In contrast, more than 70% of online shopping mall transactions contain only one item. This statistic implies that traditional data mining techniques cannot be directly applied to online market analysis, because with so many null transactions hardly any association rule reaches an acceptable level of support. Most market basket analyses of online shopping mall transactions, therefore, have been performed by expanding the co-occurrence criterion of traditional association rule mining. While the traditional co-occurrence criterion defines items purchased in one transaction as concurrently purchased items, the expanded co-occurrence criterion regards items purchased by a customer during some predefined period (e.g., a day) as concurrently purchased items. In studies using expanded co-occurrence criteria, however, the criteria have been defined arbitrarily by researchers without any theoretical grounds or agreement. The lack of clear grounds for adopting a certain co-occurrence criterion degrades the reliability of the analytical results. Moreover, it is hard to derive new meaningful findings by combining the outcomes of previous individual studies. 
In this paper, we attempt to compare expanded co-occurrence criteria and propose a guideline for selecting an appropriate one. First of all, we compare the accuracy of association rules discovered according to various co-occurrence criteria. With this experiment we expect to provide a guideline for selecting co-occurrence criteria that correspond to the purpose of the analysis. Additionally, we perform similar experiments with several groups of customers segmented by each customer's average duration between orders. With this experiment, we attempt to discover the relationship between the optimal co-occurrence criterion and the customer's average duration between orders. Finally, through this series of experiments, we expect to provide basic guidelines for developing customized recommendation systems. Our experiments use a real dataset acquired from one of the largest internet shopping malls in Korea. We use 66,278 transactions of 3,847 customers conducted during the last two years. Overall results show that the accuracy of association rules for frequent shoppers (whose average duration between orders is relatively short) is higher than that for casual shoppers. In addition, we discover that with frequent shoppers, the accuracy of association rules is very high when the co-occurrence criterion of the training set corresponds to that of the validation set (i.e., the target set). This implies that the co-occurrence criterion for frequent shoppers should be set according to the application purpose period. For example, an analyst should use a day as the co-occurrence criterion if he/she wants to offer potential customers a coupon valid only for a day. In contrast, an analyst should use a month as the co-occurrence criterion if he/she wants to publish a coupon book that can be used for a month. 
In the case of casual shoppers, the accuracy of association rules does not appear to be affected by the application purpose period. The accuracy of the casual shoppers' association rules becomes higher when a longer co-occurrence criterion is adopted. This implies that an analyst should set the co-occurrence criterion for as long a period as possible, regardless of the application purpose period.
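
The expanded co-occurrence criterion described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `build_baskets` and `pair_support`, the fixed-length windows counted from the calendar origin, and the toy orders are all assumptions made for illustration. Orders by the same customer within the same window are merged into one basket, and the support of an item pair is then measured under a one-day and a seven-day criterion.

```python
from collections import defaultdict
from datetime import date
from itertools import combinations

def build_baskets(orders, period_days):
    """Merge each customer's orders into baskets spanning `period_days` days."""
    baskets = defaultdict(set)
    for customer, day, item in orders:
        # Assumption: windows are fixed-length blocks counted from day ordinal 0.
        window = day.toordinal() // period_days
        baskets[(customer, window)].add(item)
    return list(baskets.values())

def pair_support(baskets):
    """Fraction of baskets that contain each unordered item pair."""
    counts = defaultdict(int)
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

# Hypothetical toy orders: (customer id, order date, item)
orders = [
    ("c1", date(2011, 3, 1), "milk"),
    ("c1", date(2011, 3, 2), "bread"),
    ("c2", date(2011, 3, 1), "milk"),
    ("c2", date(2011, 3, 1), "bread"),
    ("c3", date(2011, 3, 5), "milk"),
]

daily = pair_support(build_baskets(orders, period_days=1))
weekly = pair_support(build_baskets(orders, period_days=7))
print(daily[("bread", "milk")], weekly[("bread", "milk")])
```

In this toy data the one-day criterion yields three single-item baskets out of four, while the seven-day criterion merges customer c1's two orders into one basket, so the support of the (bread, milk) pair rises. This mirrors the null-transaction problem that motivates the expanded criterion.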

Comparison of Association Rule Learning and Subgroup Discovery for Mining Traffic Accident Data (교통사고 데이터의 마이닝을 위한 연관규칙 학습기법과 서브그룹 발견기법의 비교)

  • Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.1-16 / 2015
  • Traffic accidents have been one of the major causes of death worldwide for the last several decades. According to World Health Organization statistics, approximately 1.24 million deaths occurred on the world's roads in 2010. To reduce future traffic accidents, multipronged approaches have been adopted, including traffic regulations, injury-reducing technologies, and driver training programs. Records on traffic accidents are generated and maintained for this purpose. To make these records meaningful and effective, it is necessary to analyze the relationship between traffic accidents and related factors such as vehicle design, road design, weather, and driver behavior. Insight derived from such analyses can be used in accident prevention. Traffic accident data mining is the activity of finding useful knowledge about these relationships that is not well known and that may interest the user. Many studies on mining accident data have been reported over the past two decades. Most of them focused on predicting the risk of an accident from accident-related factors. Supervised learning methods such as decision trees, logistic regression, k-nearest neighbors, and neural networks are used for this prediction. However, the prediction models derived by these algorithms are too complex for humans to understand, because the main purpose of these algorithms is prediction, not explanation of the data. Some studies use unsupervised clustering algorithms to divide the data into several groups, but the derived groups are still not easy for humans to understand, so additional analysis is needed. Rule-based learning methods are adequate when we want to derive a comprehensible form of knowledge about the target domain. They derive a set of if-then rules that represent the relationship between the target feature and other features. 
Rules are fairly easy for humans to understand, so they can provide insight and comprehensible results. Association rule learning methods and subgroup discovery methods are representative rule-based learning methods for descriptive tasks. These two kinds of algorithms have been used in a wide range of areas, including transaction analysis, accident data analysis, detection of statistically significant patient risk groups, and discovery of key persons in social communities. We use both the association rule learning method and the subgroup discovery method to discover useful patterns in a traffic accident dataset consisting of many features, including the driver's profile, the location and type of the accident, vehicle information, and violations of regulations. The association rule learning method, which is an unsupervised learning method, searches for frequent item sets in the data and translates them into rules. In contrast, the subgroup discovery method is a kind of supervised learning method that discovers rules for user-specified concepts satisfying a certain degree of generality and unusualness. Depending on which aspect of the data we focus on, we may combine multiple relevant features of interest into a synthetic target feature and give it to the rule learning algorithms. After a set of rules is derived, postprocessing steps make the ruleset more compact and easier to understand by removing uninteresting or redundant rules. We conducted a set of experiments mining our traffic accident data in both unsupervised and supervised modes to compare these rule-based learning algorithms. Experiments with the traffic accident data reveal that association rule learning, in its pure unsupervised mode, can discover some hidden relationships among the features. 
Under a supervised learning setting with a combinatorial target feature, however, the subgroup discovery method finds good rules much more easily than the association rule learning method, which requires a lot of effort to tune its parameters.
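
The contrast between the two rule-learning styles can be made concrete with a small Python sketch. The abstract does not say which quality measures the authors used, so the choices here are assumptions: support and confidence stand in for the association rule side, and weighted relative accuracy (WRAcc), a common subgroup discovery measure that multiplies generality by unusualness, stands in for the subgroup side. The function `rule_measures` and the toy accident records are hypothetical.

```python
def rule_measures(records, condition, target):
    """Return (support, confidence, wracc) for the rule condition -> target.

    records   : list of dicts mapping feature name to value
    condition : dict of feature/value pairs forming the rule body
    target    : (feature, value) pair forming the rule head
    """
    n = len(records)
    feature, value = target
    covered = [r for r in records
               if all(r.get(k) == v for k, v in condition.items())]
    hits = [r for r in covered if r.get(feature) == value]
    positives = [r for r in records if r.get(feature) == value]

    support = len(hits) / n
    confidence = len(hits) / len(covered) if covered else 0.0
    # WRAcc = generality * unusualness
    #       = P(condition) * (P(target | condition) - P(target))
    wracc = (len(covered) / n) * (confidence - len(positives) / n)
    return support, confidence, wracc

# Hypothetical toy accident records (not from the paper's dataset)
records = [
    {"weather": "rain", "severity": "fatal"},
    {"weather": "rain", "severity": "fatal"},
    {"weather": "rain", "severity": "fatal"},
    {"weather": "rain", "severity": "minor"},
    {"weather": "clear", "severity": "fatal"},
    {"weather": "clear", "severity": "minor"},
    {"weather": "clear", "severity": "minor"},
    {"weather": "clear", "severity": "minor"},
]

s, c, w = rule_measures(records, {"weather": "rain"}, ("severity", "fatal"))
print(s, c, w)
```

Support and confidence say how often the rule fires and how often it is right. WRAcc additionally discounts the confidence by the base rate of the target, so a rule scores well only when the covered subgroup is both large and unusual relative to the whole dataset.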

Development of an Expert System to Improve the Methods of Parameter Estimation (매개변수 추정방법의 개선을 위한 전문가 시스템의 개발)

  • Lee, Beom-Hui;Lee, Gil-Seong
    • Journal of Korea Water Resources Association / v.31 no.6 / pp.641-655 / 1998
  • Methods for developing and applying an expert system are suggested to solve more efficiently the water resources and water quality problems induced by rapid urbanization. Major parameters of water quantity and quality in urban areas are selected, and their characteristics are presented through sensitivity analysis. Rules for deciding the parameters effectively are proposed based on these characteristics. The ESPE (Expert System for Parameter Estimation), an expert system based on 'facts' and 'rules', is developed using CLIPS 6.0 and applied to the basin of the An-Yang stream. The results of estimating the parameters of water quantity show high applicability, but those of water quality imply the need to improve the present methods because of both the complexity of the estimation processes and the lack of decision rules.


An Analysis of Delivery/Transport Documents Content in Relation to the Contract of Carriage under Incoterms 2020 Rules

  • Jeon, Soon-Hwan
    • Journal of Korea Trade / v.25 no.1 / pp.203-219 / 2021
  • Purpose - The purpose of this study is to review and analyze the contract of carriage and the delivery/transport document in light of the major changes made in the Incoterms® 2020 rules, which came into effect on January 1, 2020. Design/methodology - This study analyzed responsibility for the loading and unloading of goods under the contract of carriage in the Incoterms® 2020 rules, put into effect by the ICC on January 1, 2020, and what document the seller must present as evidence of delivery. Findings - The review revealed that in the C rules, the costs of unloading at the place of destination are determined by the terms of the contract of carriage, and in the DAP and DDP rules, if the seller bears the unloading costs, such costs cannot be recovered from the buyer. To settle this issue, the seller needs to conclude a contract of carriage by sea with the carrier on FI terms. Furthermore, for containerized goods, where FCA should be used, FOB was misused because the seller could not present an on-board bill of lading in the L/C transaction. However, it was confirmed that under FCA, the parties can use an optional mechanism to issue an on-board bill of lading. Originality/value - The Incoterms® 2020 rules are still widely used in international trade by parties to sales contracts around the world, just like the Incoterms® 2010 rules. This study attempts to reduce or eliminate disputes that may arise from interpretative misunderstandings between the parties to the contract of sale concluded by the seller and the buyer.

Analysis on Relation between Rehabilitation Training Movement and Muscle Activation using Weighted Association Rule Discovery (가중연관규칙 탐사를 이용한 재활훈련운동과 근육 활성의 연관성 분석)

  • Lee, Ah-Reum;Piao, Youn-Jun;Kwon, Tae-Kyu;Kim, Jung-Ja
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.6 / pp.7-17 / 2009
  • Precise analysis of exercise data is very important in designing an effective rehabilitation system, because it serves as feedback for planning the next exercise step. Many subjective and reliable research outcomes, obtained by analyzing and evaluating human motor ability with various biomechanical experiments, have been introduced. Most of them rely on quantitative analysis based on basic statistical methods, which are not practical enough for application to real clinical problems. In this situation, data mining technology can be a promising approach for clinical decision support systems, because it discovers meaningful hidden rules and patterns in the large volumes of data obtained from the problem domain. In this research, in order to find relational rules between posture training type and muscle activation pattern, we investigated an application of WAR (Weighted Association Rule) discovery to biomechanical data obtained mainly for the evaluation of postural control ability. The discovered rules can serve as quantitative prior knowledge for an expert's decision making on a rehabilitation plan. They can also serve as more qualitative and useful prior knowledge for rehabilitation and clinical experts' decision making, and as an index for planning an optimal rehabilitation exercise model for a patient.

Analysis of the Research Trends by Environmental Spatial-Information Using Text-Mining Technology (텍스트 마이닝 기법을 활용한 환경공간정보 연구 동향 분석)

  • OH, Kwan-Young;LEE, Moung-Jin;PARK, Bo-Young;LEE, Jung-Ho;YOON, Jung-Ho
    • Journal of the Korean Association of Geographic Information Studies / v.20 no.1 / pp.113-126 / 2017
  • This study aimed to quantitatively analyze trends in environmental research that uses environmental geospatial information through text mining, one of the big data analysis technologies. The analysis was conducted on a total of 869 papers published in the Republic of Korea, which were collected from the National Digital Science Library (NDSL). On the basis of the classification scheme, the keywords extracted from the papers were recategorized into 10 environmental fields, including "general environment", "climate", and "air quality", and 20 environmental geospatial information fields, including "satellite image", "numerical map", and "disaster". With the recategorized keywords, their frequency levels and time series changes in the collected papers were analyzed, as well as the association rules between keywords. First, the results of the frequency analysis showed that "general environment" (40.85%) and "satellite image" (24.87%) had the highest frequency levels among environmental fields and environmental geospatial information fields, respectively. Second, the results of the time series analysis on environmental fields showed that the share of "climate" was high between 1996 and 2000, but since 2001 that of "general environment" has increased. In terms of environmental geospatial information fields, the demand for "satellite image" was highest throughout the period analyzed, and its utilization share has also gradually increased. Third, a total of 80 association rules were generated between environmental fields and environmental geospatial information fields. Among environmental fields, "general environment" generated the highest number of association rules (17) with environmental geospatial information fields such as "satellite image" and "digital map".

A Study on Target-Tracking Algorithm using Fuzzy-Logic

  • Kim, Byeong-Il;Yoon, Young-Jin;Won, Tae-Hyun;Bae, Jong-Il;Lee, Man-Hyung
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 1999.10a / pp.206-209 / 1999
  • Conventional target tracking techniques are primarily based on Kalman filtering or probabilistic data association (PDA). However, they have difficulty performing well in a highly cluttered tracking environment because of the difficulty of measurement, the problem of mathematical simplification, and the difficulty of combined target detection in the tracking association problem. This paper analyzes the target tracking problem using fuzzy-logic theory, determines the fuzzy rules used by a fuzzy tracker, and designs the fuzzy tracker using these fuzzy rules and Kalman filtering.


Evaluation Method of College English Education Effect Based on Improved Decision Tree Algorithm

  • Dou, Fang
    • Journal of Information Processing Systems / v.18 no.4 / pp.500-509 / 2022
  • With the rapid development of educational informatization, teaching methods have become diversified, but the large amount of information data restricts the evaluation of the teaching subject and object in terms of the effect of English education. Therefore, this study adopts the concept of incremental learning and an eigenvalue interval algorithm to improve the weighted decision tree, and builds an English education effect evaluation model based on association rules. According to the results, the average information classification accuracy of the improved decision tree algorithm is 96.18%, the classification error rate can be as low as 0.02%, and the anti-fitting performance is good. The classification error rate between the improved decision tree algorithm and the original decision tree does not exceed 1%. The proposed educational evaluation method can effectively provide early warning in the analysis of students' academic situation, accelerate the improvement of teachers' professional skills, and perfect the education system.

Economic Effects of FTA Logistics Hub Utilizing Direct Transportation Rules of Origin in RCEP (RCEP 직접운송원칙을 활용한 우리나라의 FTA 물류 허브 가능성과 경제적 효과)

  • Byeong-Ho Lim
    • Korea Trade Review / v.46 no.3 / pp.135-149 / 2021
  • This study analyzes the economic effect of using the RCEP direct transport rules, and suggests the necessity of logistics efficiency and policy alternatives. The advantage of the hub network has been widely applied in the international logistics system, but it is limited in the FTA logistics system, in which goods must be transported directly between two contracting parties. Therefore, based on the new RCEP direct transport rules and a theoretical review of the possibility of an FTA logistics hub, the improvement in FTA logistics efficiency is estimated. Unlike previous studies, which were limited to the analysis of judicial precedents or surveys, this study quantitatively estimated the economic effect of direct transportation. The GTAP model was applied to five scenarios defined by the impact of the RCEP tariff cut and the establishment of an FTA logistics hub in Singapore or Korea. According to the analysis, Korea's exports increased by 0.38% and its imports by 1.63%, and RCEP would increase exports by 0.27% and imports by 0.42%. In particular, the establishment of an FTA logistics hub (0.71%) was found to have a greater effect on the improvement of the terms of trade than a tariff cut (0.12%), confirming the necessity of establishing an FTA logistics hub in RCEP. As policy proposals, the study suggests institutional support from the customs authorities for the use of RCEP, expansion of the free trade area where BWT-traded cargo can be stored, and establishment of a system for issuing back-to-back certificates of origin with approved exporters.