• Title/Summary/Keyword: Knowledge-Based Data Mining

An Efficient Approach for Single-Pass Mining of Web Traversal Sequences (단일 스캔을 통한 웹 방문 패턴의 탐색 기법)

  • Kim, Nak-Min;Jeong, Byeong-Soo;Ahmed, Chowdhury Farhan
    • Journal of KIISE:Databases, v.37 no.5, pp.221-227, 2010
  • Web access sequence mining discovers the web pages that users access frequently. Utility-based web access sequence mining handles non-binary occurrences of web pages and extracts more useful knowledge from web logs. However, the existing utility-based approach considers web access sequences from the very beginning of the web logs, so it is not suitable for mining data streams, where the volume of data is huge and unbounded. Nor can it adaptively find recent changes of knowledge in data streams. The existing approach has other limitations as well: it considers only forward references of web access sequences, relies on the level-wise candidate generation-and-test methodology, and needs several database scans. In this paper, we propose a new approach for high utility web access sequence mining over data streams with a sliding window method. Our approach can not only handle large-scale data but also efficiently discover recently generated information from data streams. Moreover, it addresses the other limitations of the existing algorithm over data streams. Extensive performance analyses show that our approach is very efficient and outperforms the existing algorithm.
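
The sliding-window idea in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the class name `SlidingWindowUtility`, the utility definition (the sum of per-page utilities along the first left-to-right match of a pattern), and the sample sequences are all assumptions made for illustration.

```python
from collections import deque


class SlidingWindowUtility:
    """Keeps only the most recent `window_size` sequences of (page, utility) pairs."""

    def __init__(self, window_size):
        # A bounded deque drops the oldest sequence automatically on overflow.
        self.window = deque(maxlen=window_size)

    def add(self, sequence):
        self.window.append(list(sequence))

    @staticmethod
    def _match_utility(pattern, sequence):
        # Greedy left-to-right subsequence match (an assumed utility definition).
        # Returns 0 when the pattern does not occur in the sequence.
        total, i = 0, 0
        for page, utility in sequence:
            if i < len(pattern) and page == pattern[i]:
                total += utility
                i += 1
        return total if i == len(pattern) else 0

    def utility(self, pattern):
        # Utility of the pattern over the current window only.
        return sum(self._match_utility(pattern, s) for s in self.window)

    def high_utility(self, patterns, min_utility):
        return [p for p in patterns if self.utility(p) >= min_utility]


# Hypothetical stream of web access sequences; each page carries a utility value.
miner = SlidingWindowUtility(window_size=2)
miner.add([("a", 2), ("b", 5), ("c", 1)])
miner.add([("a", 1), ("c", 4)])
before = miner.utility(["a", "c"])   # both sequences contribute
miner.add([("b", 3), ("c", 2)])      # the oldest sequence leaves the window
after = miner.utility(["a", "c"])    # only the second sequence still matches
```

Because the window is bounded, memory use stays constant however long the stream runs, and a pattern's utility reflects only recent behavior. A real miner would update utilities incrementally instead of rescanning the window for every query.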

A Spatial Data Mining System Extending Generalization based on Rulebase (규칙베이스 기반의 일반화를 확장한 공간 데이터 마이닝 시스템)

  • Choi, Seong-Min;Kim, Ung-Mo
    • The Transactions of the Korea Information Processing Society, v.5 no.11, pp.2786-2796, 1998
  • Extracting interesting and general knowledge from large spatial databases is an important task in the development of geographical information systems and knowledge-base systems. In this paper, we propose a spatial data mining system that uses a generalization method. In this system, we extend an existing generalization-based mining method and design a rulebase to support deriving new spatial knowledge. For this purpose, we propose an interleaved method that integrates spatial-data-dominated and nonspatial-data-dominated mining, and we construct a rulebase to extract topological relationships between spatial objects.

DSS Architectures to Support Data Mining Activities for Supply Chain Management (데이터 마이닝을 활용한 공급사슬관리 의사결정지원시스템의 구조에 관한 연구)

  • Jhee, Won-Chul;Suh, Min-Soo
    • Asia Pacific Journal of Information Systems, v.8 no.3, pp.51-73, 1998
  • This paper evaluates the application potential of data mining in Supply Chain Management (SCM) and suggests architectures for Decision Support Systems (DSS) that support data mining activities. We first briefly introduce data mining and review the recent literature on SCM, and then evaluate data mining applications to SCM in three aspects: marketing, operations management, and information systems. By analyzing cases on pricing models in distribution channels, demand forecasting, and quality control, we show that artificial intelligence techniques such as artificial neural networks, case-based reasoning, and expert systems, combined with traditional analysis models, effectively mine useful knowledge from the large volume of SCM data. An agent-based information system is addressed as an important architecture that enables the pursuit of global optimization of SCM through communication and information sharing among supply chain constituents without loss of their characteristics and independence. We expect the suggested architectures of intelligent DSS to provide a basis for developing information systems for SCM that improve the quality of organizational decisions.

RFID-based Supply Chain Process Mining for Imported Beef

  • Kang, Yong-Shin;Lee, Kyounghun;Lee, Yong-Han;Chung, Ku-Young
    • Food Science of Animal Resources, v.33 no.4, pp.463-473, 2013
  • Through the development of efficient data collection technologies like RFID and inter-enterprise collaboration platforms such as web services, companies that participate in supply chains can acquire visibility over the whole supply chain and can make decisions to optimize the overall supply chain networks and processes, based on knowledge extracted from historical data collected by the visibility system. Although not currently active, the MeatWatch system has been developed, and is used in part for this purpose, in the imported beef distribution network in Korea. However, the imported beef distribution network is too complicated for its various aspects to be analyzed with ordinary process analysis approaches. In this paper, we suggest a novel approach, called RFID-based supply chain process mining, to automatically discover and analyze the overall supply chain processes from distributed RFID event data, without any prior knowledge. The proposed approach was implemented and validated through a case study of the imported beef distribution network in Korea. Specifically, we demonstrated that the proposed approach can be successfully applied to discover supply chain networks from the distributed event data, to simplify the supply chain networks, and to analyze anomalies in the distribution networks. Such novel process mining functionalities can reinforce the capability of traceability services like MeatWatch in the future.
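
One common way to discover a process network from event data is a directly-follows graph. The Python sketch below shows that general technique under stated assumptions and is not the method of the paper above: the event layout `(tag_id, location, timestamp)`, the function names, the frequency threshold, and the sample events are all hypothetical.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count location-to-location transitions per tagged item.

    events: iterable of (tag_id, location, timestamp) tuples (assumed layout).
    """
    traces = defaultdict(list)
    for tag_id, location, timestamp in events:
        traces[tag_id].append((timestamp, location))

    edges = Counter()
    for visits in traces.values():
        visits.sort()  # order each item's reads by timestamp
        path = [location for _, location in visits]
        for src, dst in zip(path, path[1:]):
            edges[(src, dst)] += 1
    return edges


def split_by_frequency(edges, min_count):
    """Separate frequent edges (a simplified network) from rare ones (anomaly candidates)."""
    frequent = {e: n for e, n in edges.items() if n >= min_count}
    rare = {e: n for e, n in edges.items() if n < min_count}
    return frequent, rare


# Hypothetical RFID reads for three tagged boxes.
events = [
    ("box1", "importer", 1), ("box1", "warehouse", 2), ("box1", "retailer", 3),
    ("box2", "importer", 1), ("box2", "warehouse", 3), ("box2", "retailer", 4),
    ("box3", "importer", 2), ("box3", "retailer", 5),
]
edges = directly_follows(events)
frequent, rare = split_by_frequency(edges, min_count=2)
```

In this toy data the rare edge is the box that skipped the warehouse. Treating low-frequency transitions as anomaly candidates is only one simple heuristic.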

Designing Cost Effective Open Source System for Bigdata Analysis (빅데이터 분석을 위한 비용효과적 오픈 소스 시스템 설계)

  • Lee, Jong-Hwa;Lee, Hyun-Kyu
    • Knowledge Management Research, v.19 no.1, pp.119-132, 2018
  • Many advanced products and services are emerging in the market thanks to data-based technologies such as the Internet of Things (IoT), Big Data, and AI. Building a data processing system in an IoT network environment is not simple, and it is heavily constrained by the high cost of constructing a high-performance server environment. Therefore, in this paper, we design a low-cost, practical development environment for a large-scale data analysis computing platform using open source software. Specifically, this study intends to implement a big data processing system using Raspberry Pi, an ultra-small PC environment, and open source APIs. Implementing this system includes building a portable server system, building a web server for web mining, developing Python IDE classes for crawling, and developing R libraries for NLP and visualization. Through this research, we will develop a web environment that can control real-time collection and analysis of web media data in a mobile environment, and present it as a curriculum for non-IT specialists.

Mining Spatio-Temporal Patterns in Trajectory Data

  • Kang, Ju-Young;Yong, Hwan-Seung
    • Journal of Information Processing Systems, v.6 no.4, pp.521-536, 2010
  • Spatio-temporal patterns extracted from historical trajectories of moving objects reveal important knowledge about movement behavior for high-quality location-based services (LBS). Existing approaches transform trajectories into sequences of location symbols and derive frequent subsequences by applying conventional sequential pattern mining algorithms. However, spatio-temporal correlations may be lost due to inappropriate approximations of spatial and temporal properties. In this paper, we address the problem of mining spatio-temporal patterns from trajectory data. An inefficient description of temporal information decreases both the mining efficiency and the interpretability of the patterns. We provide a formal statement of an efficient representation of spatio-temporal movements and propose a new approach to discover spatio-temporal patterns in trajectory data. The proposed method first finds meaningful spatio-temporal regions and then extracts frequent spatio-temporal patterns from the sequences of these regions using a prefix-projection approach. Our experiments show that the proposed method improves mining performance and derives more intuitive patterns.
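
The prefix-projection step mentioned in the abstract above can be sketched as a minimal PrefixSpan-style miner in Python. It covers only the generic technique on sequences of region labels. The paper's region discovery and temporal handling are not reproduced, and the function name, the region labels, and the sample trajectories are assumptions.

```python
from collections import Counter


def prefixspan(sequences, min_support):
    """Return {pattern_tuple: support} for frequent subsequences of single items."""
    results = {}

    def mine(prefix, projected):
        # Count each item at most once per projected sequence.
        counts = Counter()
        for seq in projected:
            counts.update(set(seq))

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            # Projection: keep the suffix after the first occurrence of `item`.
            suffixes = [seq[seq.index(item) + 1:] for seq in projected if item in seq]
            mine(pattern, suffixes)

    mine((), [list(s) for s in sequences])
    return results


# Hypothetical trajectories already mapped to region labels.
trajectories = [["A", "B", "C"], ["A", "C"], ["B", "C"]]
patterns = prefixspan(trajectories, min_support=2)
```

Each recursive call works only on the suffixes that follow the current prefix, so the search never generates candidates that are absent from the data.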

Heterogeneous Lifelog Mining Model in Health Big-data Platform (헬스 빅데이터 플랫폼에서 이기종 라이프로그 마이닝 모델)

  • Kang, Ji-Soo;Chung, Kyungyong
    • Journal of the Korea Convergence Society, v.9 no.10, pp.75-80, 2018
  • In this paper, we propose a heterogeneous lifelog mining model for a health big-data platform. It is an ontology-based mining model for collecting users' lifelogs in real time and providing healthcare services. The proposed method distributes heterogeneous lifelog data and processes it in real time in a cloud computing environment. The knowledge base is reconstructed by an upper ontology method suitable for the environment constructed based on the heterogeneous ontology. The restructured knowledge base generates inference rules using Jena 4.0 inference engines, and provides real-time healthcare services by rule-based inference methods. Lifelog mining constructs an analysis of hidden relationships and a predictive model for time-series bio-signals. By exploring negative or positive correlations that are not included in the relationships or inference rules, this enables real-time preventive healthcare services that detect changes in the users' bio-signals. The performance evaluation shows that the proposed heterogeneous lifelog mining model is superior to other models, with an accuracy of 0.734 and a precision of 0.752.

Ontology based Preprocessing Scheme for Mining Data Streams from Sensor Networks (센서 네트워크의 데이터 스트림 마이닝을 위한 온톨로지 기반의 전처리 기법)

  • Jung, Jason J.
    • Journal of Intelligence and Information Systems, v.15 no.3, pp.67-80, 2009
  • Using a number of sensors and sensor networks, we can collect environmental information from a given sensor space. To discover more useful information and knowledge, we want to apply data mining methodologies to the sensor data streams from such sensor spaces. In this paper, we present a novel data preprocessing scheme to improve the performance of data mining algorithms. In particular, ontologies are applied to represent the meanings of the sensor data. To evaluate the proposed method, we collected sensor streams for about 30 days and ran simulations on them to compare the method with other approaches.

Data Mining Techniques for Medical Informatics: Application to SNP Analysis

  • Chun, Se-Hak;Kim, Jin;Park, Yoon-Joo;Ham, Ki-Baek;Chun, Se-Chul
    • Proceedings of the Korea Intelligent Information System Society Conference, 2005.11a, pp.258-263, 2005
  • Haplotype-based analysis using high-density SNP markers has gained great attention for evaluating genes in gene analysis and in various clinical situations. However, to our knowledge, there has been no research on disease diagnostic modeling based on SNP analysis. The purpose of this study is to explore how knowledge discovery techniques are applied in the medical informatics area and to propose a Case-Based Reasoning (CBR) technique for the diagnosis of gastric cancer using Single Nucleotide Polymorphisms (SNPs).
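
The retrieve-and-reuse core of case-based reasoning can be sketched in a few lines of Python. The abstract above does not specify the study's similarity measure or case representation, so the matching-fraction similarity, the majority vote, the function names, and the case base are assumptions. The genotype data is synthetic and has no biological meaning.

```python
from collections import Counter


def similarity(a, b):
    """Fraction of SNP positions where two genotype tuples agree (assumed measure)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def retrieve(case_base, query, k):
    """Retrieve step: the k stored cases most similar to the query."""
    ranked = sorted(case_base, key=lambda case: similarity(case[0], query), reverse=True)
    return ranked[:k]


def diagnose(case_base, query, k=3):
    """Reuse step: majority label among the retrieved cases."""
    votes = Counter(label for _, label in retrieve(case_base, query, k))
    return votes.most_common(1)[0][0]


# Synthetic case base: (genotypes at four hypothetical SNP sites, label).
case_base = [
    (("AA", "AG", "GG", "CT"), "case"),
    (("AA", "AG", "GG", "CC"), "case"),
    (("AG", "GG", "GG", "CT"), "control"),
    (("GG", "GG", "AG", "TT"), "control"),
    (("GG", "AG", "AG", "TT"), "control"),
]
query = ("AA", "AG", "GG", "TT")
prediction = diagnose(case_base, query, k=3)
```

A full CBR cycle would also revise the proposed solution and retain the confirmed case in the case base. Those steps are omitted here.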

Process Planning Method under Make-to-Order Production System using Data Mining (데이터마이닝을 이용한 수주생산시스템의 공정계획방안)

  • Oh, Kyung-Mo;Park, Chang-Kwon
    • IE interfaces, v.18 no.2, pp.148-157, 2005
  • In manufacturing industries that use a Make-to-Order production system, it is difficult to determine the standard information for a product, and demand is too variable to estimate easily. In this paper, we are concerned with a process planning method that uses data mining in a Make-to-Order manufacturing environment. The subject of our study is an industrial transformer plant, which receives diverse customer orders and then manufactures the products. Currently, process planning classifies the standard information by hand, based on knowledge acquired through experience. The standard information comprises various items, such as work sequence and time. This process planning method requires an expert with several years of field experience. Because the product specification varies with each order, the current process planning method is inefficient and takes a long time. To solve this problem, we extract information on each processing time using a data mining process and then construct a knowledge base. We propose a method that uses this knowledge base for process planning of industrial transformer products in a Make-to-Order environment.