• Title/Summary/Keyword: Intelligent Data Analysis

Addressing Big Data solution enabled Connected Vehicle services using Hadoop (Hadoop을 이용한 스마트 자동차 서비스용 빅 데이터 솔루션 개발)

  • Nkenyereye, Lionel;Jang, Jong-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.3 / pp.607-612 / 2015
  • As the volume of vehicle diagnostics data grows, actors in the automotive ecosystem will find it difficult to perform real-time analysis for simulating or designing new services based on data gathered from connected cars. In this paper, we study a Big Data solution that provides the deep analytics needed to process and analyze the vast quantities of on-board diagnostics data generated by cars. Hadoop and its ecosystem were deployed to process a large dataset and delivered useful outcomes that actors in the automotive ecosystem may use to offer new services to car owners. Because intelligent transport systems aim to guarantee safety and to reduce the rate of crashes and injuries caused by speeding, a big data solution based on vehicle diagnostics data is emerging as a way to monitor real-time outcomes, collect data from many connected cars, and make processing more reliable and storage of the collected data easier.
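
The processing style that Hadoop applies to such diagnostics data can be illustrated with a minimal, single-process sketch of the map, shuffle, and reduce steps. This is not the authors' job: the record layout (`vehicle_id,timestamp,speed`) and all function names are hypothetical, and a real deployment would run on a Hadoop cluster rather than in plain Python.

```python
from collections import defaultdict


def map_record(line):
    """Map step: emit (vehicle_id, speed) from one CSV record.

    The 'vehicle_id,timestamp,speed' layout is a hypothetical example.
    """
    vehicle_id, _timestamp, speed = line.strip().split(",")
    yield vehicle_id, float(speed)


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_speeds(vehicle_id, speeds):
    """Reduce step: summarize one vehicle as (id, max speed, mean speed)."""
    return vehicle_id, max(speeds), sum(speeds) / len(speeds)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence on an iterable of records."""
    pairs = (pair for line in lines for pair in map_record(line))
    return sorted(reduce_speeds(k, v) for k, v in shuffle(pairs).items())
```

Because each reduce call sees only the values for one vehicle, the same logic can be spread across many machines, which is the property that makes this pattern suitable for large volumes of diagnostics records.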

Smart-Coord: Enhancing Healthcare IoT-based Security by Blockchain Coordinate Systems

  • Talal Saad Albalawi
    • International Journal of Computer Science & Network Security / v.24 no.8 / pp.32-42 / 2024
  • The Internet of Things (IoT) is set to transform patient care by enhancing data collection, analysis, and management through medical sensors and wearable devices. However, the convergence of IoT device vulnerabilities and the sensitivity of healthcare data raises significant data integrity and privacy concerns. In response, this research introduces the Smart-Coord system, a practical and affordable solution for securing healthcare IoT. Smart-Coord leverages blockchain technology and coordinate-based access management to fortify healthcare IoT. It employs IPFS for immutable data storage and intelligent Solidity Ethereum contracts for data integrity and confidentiality, creating a hierarchical, AES-CBC-secured data transmission protocol from IoT devices to blockchain repositories. Our technique uses a unique coordinate system to embed confidentiality and integrity regulations into a single access control model, dictating data access and transfer based on subject-object pairings in a coordinate plane. This dual enforcement technique governs and secures the flow of healthcare IoT information. With its implementation on the Matic network, the Smart-Coord system's computational efficiency and cost-effectiveness are unparalleled. Smart-Coord boasts significantly lower transaction costs and data operation processing times than other blockchain networks, making it a practical and affordable solution. Smart-Coord holds the promise of enhancing IoT-based healthcare system security by managing sensitive health data in a scalable, efficient, and secure manner. The Smart-Coord framework heralds a new era in healthcare IoT adoption, expertly managing data integrity, confidentiality, and accessibility to ensure a secure, reliable digital environment for patient data management.

Feature Subset Selection in the Induction Algorithm using Sensitivity Analysis of Neural Networks (신경망의 민감도 분석을 이용한 귀납적 학습기법의 변수 부분집합 선정)

  • 강부식;박상찬
    • Journal of Intelligence and Information Systems / v.7 no.2 / pp.51-63 / 2001
  • In supervised machine learning, an induction algorithm, which extracts rules from data through learning, is a useful tool for data mining. Practical induction algorithms are known to lose prediction accuracy and to generate unnecessarily complex rules when trained on data containing superfluous features. Feature subset selection is therefore needed to improve their performance. In feature subset selection for an induction algorithm, the wrapper method runs the algorithm repeatedly on the dataset using various feature subsets. However, searching the whole space exhaustively is impractical unless the number of features is small. This study proposes a heuristic method that applies sensitivity analysis of neural networks to the wrapper method in order to generate rules with the highest possible accuracy. First, it prioritizes all features using sensitivity analysis of neural networks. Then it uses the wrapper method to search the ordered feature space. In experiments on three datasets, we show that the suggested method can select a feature subset that improves the performance of the induction algorithm within a certain number of iterations.
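
The two stages described in this abstract (rank the features, then run a wrapper over the ordered list) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: sensitivity is approximated by perturbing each input of an arbitrary `predict` function that stands in for a trained neural network, the wrapper simply evaluates growing prefixes of the ranking, and `evaluate` is a hypothetical callback that would train the induction algorithm on a subset and return its accuracy.

```python
def rank_by_sensitivity(predict, rows, delta=1.0):
    """Rank features by the mean absolute output change when each is perturbed.

    `predict` stands in for a trained neural network (an assumption).
    Returns (feature indices in descending sensitivity, raw scores).
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        total = 0.0
        for row in rows:
            bumped = list(row)
            bumped[j] += delta  # perturb one feature, hold the others fixed
            total += abs(predict(bumped) - predict(row))
        scores.append(total / len(rows))
    order = sorted(range(n_features), key=lambda j: -scores[j])
    return order, scores


def ordered_wrapper_search(order, evaluate):
    """Evaluate the top-1, top-2, ... feature prefixes and keep the best.

    `evaluate(subset)` is a hypothetical callback returning the accuracy of
    the induction algorithm trained on that subset.
    """
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(order) + 1):
        subset = order[:k]
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score
```

Searching prefixes of the ranking needs only as many wrapper runs as there are features, compared with the exponential number of subsets an exhaustive search would visit.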

Application of Market Basket Analysis to One-to-One Marketing on Internet Storefront (인터넷 쇼핑몰에서 원투원 마케팅을 위한 장바구니 분석 기법의 활용)

  • 강동원;이경미
    • Journal of the Korea Computer Industry Society / v.2 no.9 / pp.1175-1182 / 2001
  • One-to-one marketing (a.k.a. database marketing or relationship marketing) is one of the many fields that will benefit from the electronic revolution and from shifts in consumer sales and advertising. As a component of intelligent customer services for Internet storefronts, this paper describes a technique for providing personalized advertisements using market basket analysis, a well-known data mining technique. The underlying theories of recommendation techniques are statistics, data mining, artificial intelligence, and/or rule-based matching. In the rule-based approach to personalized recommendation, marketing rules for personalization are usually collected from marketing experts and used for inference over customer data. However, it is difficult to extract marketing rules from marketing experts, and also difficult to validate and maintain the constructed knowledge base. In this paper, marketing rules for cross-selling are extracted using market basket analysis and are used to select personalized advertisements when a customer visits an Internet store.
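
As a rough illustration of how market basket analysis yields cross-selling rules, the sketch below mines pairwise association rules using the standard support and confidence measures. The thresholds, function names, and the restriction to item pairs are assumptions made for brevity; the abstract does not state which algorithm or parameters the paper uses.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5, min_confidence=0.6):
    """Mine rules 'lhs -> rhs' between single items.

    support    = fraction of baskets containing both items
    confidence = baskets containing both / baskets containing lhs
    Returns (lhs, rhs, support, confidence) tuples, highest confidence first.
    """
    n = len(baskets)
    item_counts, pair_counts = Counter(), Counter()
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


def recommend(rules, cart):
    """Return items to advertise: rule consequents not already in the cart."""
    cart = set(cart)
    seen, suggestions = set(), []
    for lhs, rhs, _support, _confidence in rules:
        if lhs in cart and rhs not in cart and rhs not in seen:
            seen.add(rhs)
            suggestions.append(rhs)
    return suggestions
```

With rules mined from past orders, `recommend(rules, current_cart)` picks which advertisement to show a visiting customer, replacing rules that would otherwise have to be collected from marketing experts.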

Artificial Intelligence-based Security Control Construction and Countermeasures (인공지능기반 보안관제 구축 및 대응 방안)

  • Hong, Jun-Hyeok;Lee, Byoung Yup
    • The Journal of the Korea Contents Association / v.21 no.1 / pp.531-540 / 2021
  • As cyber attacks and crimes increase exponentially and hacking attacks become more intelligent and advanced, attack methods and routes are evolving unpredictably and in real time. To strengthen the ability to respond to attackers, this study proposes a method for developing an artificial intelligence-based security control platform by building a next-generation security system that uses artificial intelligence to learn by itself, monitor abnormal signs, and block attacks. The platform should be developed around four functions: data collection, data analysis, next-generation security system operation, and security system management. These comprise (1) a data collection step that draws on a big data base, the control system, and external threat information; (2) a data analysis step that pre-processes and formalizes the collected data and uses deep learning-based algorithms for true/false positive detection and abnormal behavior analysis; (3) operation of a next-generation security system that applies the analyzed data in an organic cycle of prevention, control, response, and analysis, increasing the scope and speed of handling new threats and strengthening the identification of normal and abnormal behavior; and (4) management of the security threat response system, including harmful IP management, detection policy management, and management of the legal framework for security work. Through this, we seek a way to comprehensively analyze vast amounts of data and to respond preemptively in a short time.

User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

  • Kim, Jieun;Kim, Namgyu;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.93-107 / 2014
  • In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of data collection by companies about customer needs. Most companies have failed to uncover such needs for products or services properly in terms of demographic data such as age, income levels, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with appropriate business information for current business circumstances. However, part of the problem is the increasing regulation of personal data gathering and privacy. This makes demographic or transaction data collection more difficult, and is a significant hurdle for traditional recommendation approaches because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper to academia is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the proposed approach in this paper attempts to construct double two-mode networks, such as a user-news network and news-issue network, and to integrate these into one quasi-network as the input for issue clustering. One of the contributions of this research is the development of a methodology utilizing enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. In order to build multi-layered two-mode networks of news logs, we need some tools such as text mining and topic analysis. We used not only SAS Enterprise Miner 12.1, which provides a text miner module and cluster module for textual data analysis, but also NetMiner 4 for network visualization and analysis. 
Our approach for user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites with a crawler. After gathering unstructured news article data, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed with access patterns derived from web transaction logs. The double two-mode networks are then merged into a quasi-network of user-issue. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence, and compare these with the clustering results from statistical tools and network analysis. An experiment with a large dataset was performed to build a multi-layer two-mode network. After that, we compared the results of issue clustering from SAS with those of network analysis. The experimental dataset was from a web site ranking site, and the biggest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles of 5,000 panels over one year. User-article and article-issue networks are constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results applied the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), and are consistent with the results from SAS clustering. In spite of extensive efforts to provide user information with recommendation systems, most projects are successful only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support to decision-making in companies because it enhances user-related data from unstructured textual data. 
To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
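
The network merging step, in which a user-article network and an article-issue network are combined into a user-issue quasi-network, can be sketched as a weighted composition of the two relations. This is a minimal illustration under assumptions: the paper performs the merge in NetMiner, whereas the dictionary layout, the multiplication of visit counts by topic weights, and the Euclidean distance used as a stand-in for structural equivalence are all choices made here for clarity.

```python
import math
from collections import defaultdict


def merge_two_mode(user_article, article_issue):
    """Compose user->article visits with article->issue weights.

    user_article:  {user: {article: visit_count}}
    article_issue: {article: {issue: topic_weight}}
    Returns {user: {issue: weight}}, the user-issue quasi-network.
    """
    user_issue = defaultdict(lambda: defaultdict(float))
    for user, articles in user_article.items():
        for article, visits in articles.items():
            for issue, weight in article_issue.get(article, {}).items():
                user_issue[user][issue] += visits * weight
    return {user: dict(issues) for user, issues in user_issue.items()}


def issue_distance(user_issue, issue_a, issue_b):
    """Euclidean distance between two issues' user profiles.

    Issues reached by the same users with similar weights are close, which
    approximates structural equivalence (an illustrative choice).
    """
    return math.sqrt(sum(
        (issues.get(issue_a, 0.0) - issues.get(issue_b, 0.0)) ** 2
        for issues in user_issue.values()
    ))
```

A pairwise distance matrix built with `issue_distance` could then be passed to a medoid-based clustering routine such as PAM, the algorithm the abstract names.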

Research of Semantic Considered Tree Mining Method for an Intelligent Knowledge-Services Platform

  • Paik, Juryon
    • Journal of the Korea Society of Computer and Information / v.25 no.5 / pp.27-36 / 2020
  • In this paper, we propose a method for deriving valuable but hidden information from data, which is the core foundation for pursuing knowledge-based service fusion in the 4th Industrial Revolution. Hyper-connected societies characterized by IoT inevitably produce big data, and deriving optimal services for problem situations from that data first requires discovering the valuable information in it. A data-centric IoT platform collects, stores, manages, and integrates data from various devices, and is in effect a type of middleware platform. Its purpose is to provide suitable solutions to challenging problems after processing and analyzing the data, which depends on efficient and accurate data analysis algorithms. To this end, we propose specially designed structures that store IoT data without losing its semantics, and provide algorithms for discovering useful information, together with several definitions and proofs that show their soundness.

Students' Performance Prediction in Higher Education Using Multi-Agent Framework Based Distributed Data Mining Approach: A Review

  • M.Nazir;A.Noraziah;M.Rahmah
    • International Journal of Computer Science & Network Security / v.23 no.10 / pp.135-146 / 2023
  • An effective educational program warrants an innovative structure that enhances the efficacy of higher education in a way that accelerates the achievement of desired results and reduces the risk of failure. The Educational Decision Support System (EDSS) is currently a hot topic in educational systems, as it enables students' results to be monitored and evaluated during their development. Inadequate information systems struggle to take full advantage of EDSS owing to a lack of accuracy, incorrect analysis of characteristics, and inadequate databases. Data Mining Techniques (DMTs) provide helpful tools for finding models or patterns in data and are extremely useful in the decision-making process. Several researchers have taken part in research on distributed data mining with multi-agent technology. The rapid growth of network technology and IT use has led to the widespread use of distributed databases. This article explains the available data mining technology and the distributed data mining system framework. A Distributed Data Mining approach is used in this work so that a classifier capable of predicting the success of students in the economic domain can be constructed. This research also discusses the Intelligent Knowledge Base Distributed Data Mining framework, which assesses student performance on mid-term and final-term exams using multi-agent system-based educational mining techniques. Using single and ensemble-based classifiers, this study intends to investigate the factors that influence student performance in higher education and to construct a classification model that can predict academic achievement. We also discuss the importance of multi-agent systems and comparative machine learning approaches in EDSS development.

Design of Client-Server Model For Effective Processing and Utilization of Bigdata (빅데이터의 효과적인 처리 및 활용을 위한 클라이언트-서버 모델 설계)

  • Park, Dae Seo;Kim, Hwa Jong
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.109-122 / 2016
  • Recently, big data analysis has become a field of interest to individuals and non-experts as well as companies and professionals. Accordingly, it is used for marketing and for solving social problems by analyzing data that is publicly available or collected directly. In Korea, various companies and individuals are attempting big data analysis, but they face difficulty from the initial stage because of limits on big data disclosure and the difficulty of collection. System improvements for promoting big data and big data disclosure services are being carried out in various forms in Korea and abroad, mainly as services that open public data, such as Korea's Government 3.0 (data.go.kr). In addition to the government's efforts, services that share data held by corporations or individuals are running, but it is difficult to find useful data because little data is shared. Moreover, big data traffic problems can occur because the entire dataset must be downloaded and examined just to grasp the attributes of, and basic information about, the shared data. Therefore, a new system for big data processing and utilization is needed. First, big data pre-analysis technology is needed to solve the big data sharing problem. Pre-analysis is a concept proposed in this paper: it means providing users with results generated by analyzing the data in advance. Pre-analysis improves the usability of big data by providing information that conveys the properties and characteristics of the data when a data user searches for it. In addition, sharing the summary data or sample data generated through pre-analysis avoids the security problems that may arise when the original data is disclosed, thereby enabling big data sharing between the data provider and the data user. 
Second, appropriate preprocessing results must be generated quickly according to the disclosure level of the raw data or the network status, and provided to users through distributed big data processing using Spark. Third, to solve the big traffic problem, the system monitors network traffic in real time. When preprocessing the data requested by a user, the system must reduce it to a size that the current network can carry before transmitting it, so that no big traffic occurs. In this paper, we present various data sizes according to the level of disclosure through pre-analysis. This method is expected to produce lower traffic volume than the conventional method of sharing only raw data across a large number of systems. We describe how to solve the problems that occur when big data is released and used, and how to facilitate sharing and analysis. The client-server model uses Spark for fast analysis and processing of user requests, and consists of a Server Agent and a Client Agent, deployed on the server side and the client side respectively. The Server Agent is the agent the data provider needs: it performs pre-analysis of big data to generate a Data Descriptor containing information about the Sample Data, Summary Data, and Raw Data. It also performs fast and efficient big data preprocessing through distributed big data processing and continuously monitors network traffic. The Client Agent is placed on the data user side. It can search big data quickly through the Data Descriptor produced by pre-analysis. The desired data can then be requested from the server and downloaded according to the level of disclosure. The Server Agent and the Client Agent are kept separate when the data provider publishes data for users. 
In particular, we focus on big data sharing, distributed big data processing, and the big traffic problem, construct the detailed modules of the client-server model, and present the design of each module. In a system designed on the basis of the proposed model, a user who acquires data analyzes it in the desired direction or preprocesses it into new data. By analyzing the newly processed data through the Server Agent, the data user takes on the role of data provider. The data provider can likewise obtain useful statistical information from the Data Descriptor of the data it discloses and become a data user by performing new analysis on the sample data. In this way, raw data is processed and the processed big data is used by other users, forming a natural sharing environment. The roles of data provider and data user are not fixed, which yields an ideal sharing service in which everyone can be both a provider and a user. The client-server model solves the problem of sharing big data, provides a free sharing environment in which big data can be disclosed securely, and makes big data easy to find.
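
The pre-analysis idea (publish a compact descriptor first, then send only as much data as the disclosure level and the network allow) can be sketched as follows. Everything here is hypothetical: the descriptor fields, the three payload levels, and the byte budget are illustrative choices, and the paper's Server Agent performs this work with Spark rather than in plain Python.

```python
import random
from statistics import mean

# Payload levels ordered from smallest to largest (illustrative names).
LEVELS = ("summary", "sample", "raw")


def build_descriptor(rows, sample_size=2, seed=0):
    """Pre-analyze a dataset into a small descriptor a client can browse.

    rows: non-empty list of dicts with identical keys. Numeric columns get
    min/max/mean; a fixed-seed random sample stands in for the sample data.
    """
    numeric = [k for k, v in rows[0].items() if isinstance(v, (int, float))]
    summary = {
        col: {
            "min": min(r[col] for r in rows),
            "max": max(r[col] for r in rows),
            "mean": mean(r[col] for r in rows),
        }
        for col in numeric
    }
    rng = random.Random(seed)
    sample = rng.sample(rows, min(sample_size, len(rows)))
    return {"row_count": len(rows), "summary": summary, "sample": sample}


def choose_level(size_by_level, budget_bytes, max_level="raw"):
    """Pick the richest payload that the disclosure level permits and that
    fits the current network budget; return None if nothing fits."""
    allowed = LEVELS[: LEVELS.index(max_level) + 1]
    fitting = [lvl for lvl in allowed if size_by_level[lvl] <= budget_bytes]
    return fitting[-1] if fitting else None
```

A client would read the descriptor to decide whether the dataset is relevant, and the server would call `choose_level` with a budget derived from its traffic monitor before answering a download request.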

A Study on Detecting Selfish Nodes in Wireless LAN using Tsallis-Entropy Analysis (뜨살리스-엔트로피 분석을 통한 무선 랜의 이기적인 노드 탐지 기법)

  • Ryu, Byoung-Hyun;Seok, Seung-Joon
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.1 / pp.12-21 / 2012
  • The IEEE 802.11 MAC protocol standard, DCF (CSMA/CA), was originally designed to ensure fair channel access among mobile nodes sharing a local wireless channel. It has been revealed, however, that in rapidly spreading hot spot areas some misbehaving nodes transmit more data than other nodes through artificial means. Misbehaving nodes may modify the internal process of their own MAC protocol or interrupt the MAC procedure of normal nodes to transmit more data. This is known as the selfish node problem, and most of the literature proposes detecting selfish nodes by analyzing the MAC procedures of all mobile nodes. However, such protocol analysis methods are not effective enough at detecting all kinds of selfish nodes. This paper addresses the problem of detecting selfish nodes using Tsallis entropy, a statistical measure. Tsallis entropy is a criterion that indicates how concentrated or dispersed a probability distribution is. The proposed algorithm, which runs on the AP node of a wireless LAN, extracts the probability distribution of the data interval time for each node and compares its entropy with a threshold value to detect selfish nodes. To evaluate the performance of the proposed algorithm, simulation experiments are performed with ns2 in various wireless LAN environments (varying congestion level, selfish node behavior, and threshold level). The simulation results show that the proposed algorithm achieves a higher successful detection rate.
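
For reference, the Tsallis entropy of a discrete distribution p with index q is S_q = (1 - sum_i p_i^q) / (q - 1), which tends to the Shannon entropy as q approaches 1. The sketch below applies it to per-node inter-frame intervals in the manner the abstract outlines. The bin width, the value of q, the threshold, and the direction of the comparison (flagging nodes whose entropy falls below the threshold) are illustrative assumptions; the abstract does not specify them.

```python
import math
from collections import Counter


def tsallis_entropy(probs, q=2.0):
    """S_q = (1 - sum(p_i ** q)) / (q - 1); Shannon entropy in the q -> 1 limit."""
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


def interval_distribution(intervals, bin_width):
    """Histogram the intervals between a node's frames into probabilities."""
    bins = Counter(int(t // bin_width) for t in intervals)
    total = sum(bins.values())
    return [count / total for count in bins.values()]


def flag_nodes(intervals_by_node, bin_width, threshold, q=2.0):
    """Return nodes whose interval distribution has entropy below `threshold`.

    Treating low entropy (tightly concentrated intervals) as suspicious is an
    assumption of this sketch, not a detail taken from the paper.
    """
    flagged = []
    for node, intervals in intervals_by_node.items():
        entropy = tsallis_entropy(interval_distribution(intervals, bin_width), q)
        if entropy < threshold:
            flagged.append(node)
    return sorted(flagged)
```

Because the check needs only the arrival times observed at the access point, it does not depend on inspecting each node's MAC procedure, which is the limitation of protocol analysis methods that the abstract points out.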