• Title/Summary/Keyword: Intelligent Data Analysis

Sentiment analysis on movie review through building modified sentiment dictionary by movie genre (영역별 맞춤형 감성사전 구축을 통한 영화리뷰 감성분석)

  • Lee, Sang Hoon;Cui, Jing;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.97-113 / 2016
  • With the growth of internet data and the rapid development of internet technology, "big data" analysis is actively conducted to analyze enormous volumes of data for various purposes. In recent years in particular, a number of studies have applied text mining techniques to overcome the limitations of existing structured data analysis. Sentiment analysis, one branch of text mining, is actively studied as a way to score opinions based on the distribution of word polarity in documents. Sentiment analysis usually relies on a sentiment dictionary that records the positivity and negativity of vocabulary items. As part of this line of work, this study constructs a sentiment dictionary customized to a specific data domain. A common sentiment dictionary applied without regard to domain characteristics cannot reflect contextual expressions used only in that domain, so a modified dictionary customized to the data domain can be expected to improve sentiment analysis efficiency. This study therefore proposes a way to construct a customized dictionary that reflects the characteristics of the data domain. Specifically, movie review data are divided by genre and a genre-customized dictionary is constructed for each genre. The performance of the customized dictionaries in sentiment analysis is then compared with that of a common sentiment dictionary. IMDb data are chosen as the subject of analysis, and movie reviews are categorized by genre. Six IMDb genres are selected: 'action', 'animation', 'comedy', 'drama', 'horror', and 'sci-fi'. The five highest-ranking and five lowest-ranking movies per genre are selected as the training data set, and two years of movie data, from September 2012 to June 2014, are collected as the test data set. 
Using the SO-PMI (Semantic Orientation from Point-wise Mutual Information) technique, we build a customized sentiment dictionary per genre and compare prediction accuracy on review ratings. The analysis shows that prediction with the customized dictionaries improves accuracy. The overall performance improvement is 2.82% and is statistically significant. In particular, the customized dictionary for 'sci-fi' yields the highest accuracy improvement among the six genres. Although this study shows the usefulness of customized dictionaries in sentiment analysis, further studies are required to generalize the results. In this study, we consider only adjectives as additional terms in the customized sentiment dictionary. Other parts of speech, such as verbs and adverbs, could be considered to improve sentiment analysis performance. The customized sentiment dictionary also needs to be applied to other domains, such as product reviews.
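
The SO-PMI scoring mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes document-level co-occurrence counts, additive smoothing, and hypothetical seed words, and the function name `so_pmi` is made up for illustration. A word's score is the sum of its PMI with the positive seeds minus the sum of its PMI with the negative seeds.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=0.01):
    """Score each non-seed word by SO-PMI over tokenized documents.

    docs: list of token lists (e.g. adjectives extracted from reviews).
    pos_seeds / neg_seeds: hypothetical seed words of known polarity.
    smoothing: assumed additive constant that keeps log() defined
               when a word never co-occurs with a seed.
    """
    n_docs = len(docs)
    doc_freq = Counter()   # number of documents containing each word
    co_freq = Counter()    # number of documents containing both words
    for doc in docs:
        words = set(doc)
        doc_freq.update(words)
        for a, b in combinations(sorted(words), 2):
            co_freq[(a, b)] += 1

    def pmi(word, seed):
        pair = (word, seed) if word < seed else (seed, word)
        joint = (co_freq[pair] + smoothing) / n_docs
        return math.log2(joint / ((doc_freq[word] / n_docs) * (doc_freq[seed] / n_docs)))

    seeds = set(pos_seeds) | set(neg_seeds)
    scores = {}
    for word in doc_freq:
        if word in seeds:
            continue
        # Seeds that never occur in the corpus are skipped.
        pos = sum(pmi(word, s) for s in pos_seeds if doc_freq[s])
        neg = sum(pmi(word, s) for s in neg_seeds if doc_freq[s])
        scores[word] = pos - neg
    return scores


# Invented toy corpus: one token list per review.
reviews = [
    ["gripping", "excellent"],
    ["gripping", "good"],
    ["dull", "bad"],
    ["dull", "poor"],
]
scores = so_pmi(reviews, pos_seeds=["excellent", "good"], neg_seeds=["bad", "poor"])
```

Words with a positive score would be added to the genre dictionary as positive terms and words with a negative score as negative terms. Any cutoff on the score magnitude is a further assumption, not something stated in the abstract.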

A Study on the Accuracy of GPS Received Data in Travel Vehicle (통행차량에 대한 GPS수신자료의 정확도에 관한 연구)

  • Kim, Jae-Seok;Lee, Seung Jun;Woo, Yong-Han
    • Journal of the Korean Association of Geographic Information Studies / v.5 no.4 / pp.75-85 / 2002
  • The introduction of GPS techniques in transportation studies makes real-time tracking of a moving vehicle's position possible. Position data measured in three dimensions (X, Y, Z) can be obtained continuously over time and from two or more moving vehicles. In the past, collecting real-time data in this field was very difficult, but GPS now makes it easy. For the results to be analyzed reasonably, however, the accuracy of GPS data must first be fully understood, because the accuracy of a study depends on the magnitude of the accompanying error. For this reason, this study surveys the error in GPS data and suggests a calibration technique. The results will be helpful for subsequent studies using DGPS data. To this end, the study carries out two types of on-road survey, sets triangulation coordinates, and compares them with the GPS data. The DGPS data contain an error of less than 0.6 m.
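
The comparison of GPS fixes against surveyed reference coordinates described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: both point sets are assumed to be in the same planar metric coordinate system (X, Y, Z in meters), the function names are hypothetical, and the sample coordinates are invented.

```python
import math


def position_errors(gps_points, reference_points):
    """Euclidean distance (m) between each GPS fix and its reference point."""
    return [math.dist(g, r) for g, r in zip(gps_points, reference_points)]


def rmse(errors):
    """Root-mean-square of a list of per-point errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Invented sample: surveyed triangulation points and the matching GPS fixes.
reference = [(100.0, 200.0, 50.0), (150.0, 260.0, 52.0)]
gps_fixes = [(100.3, 200.4, 50.0), (150.0, 260.0, 52.5)]

errors = position_errors(gps_fixes, reference)
worst = max(errors)      # largest single-point error
overall = rmse(errors)   # summary error over all points
```

A bound such as the reported "less than 0.6 m" would correspond to checking `worst` (or `overall`, depending on how the bound is defined) against 0.6.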

Implementation of Feeding Management Service Model based on Pig Raising Data (양돈 데이터 기반의 급이 관리 서비스 모델 구현)

  • Kim, Bong-Hyun
    • Journal of Digital Convergence / v.19 no.10 / pp.105-110 / 2021
  • The pig ICT automatic feeder can dispense feed automatically according to set conditions. However, it has the disadvantage that the setting conditions themselves depend on the user's experience. This leads to trial and error and lowers efficiency. It is therefore necessary to develop a system and implement a service model that can improve pig productivity by suggesting optimal feeding setting conditions based on data. In this paper, a pig feeding management service model was developed using existing feeding data, breeding management data, and performance analysis programs such as the pig production management system. Through this, we developed a consumer-oriented feed management service model that can be used efficiently by analyzing pig data. In addition, the intelligent automatic feeding management service can contribute to a lower mortality rate and a higher MSY on farms, thereby improving the productivity and increasing the income of pig farms.

A Study on Implementation of a Disaster Crisis Alert System based on National Disaster Management System

  • Hyong-Seop, Shim
    • Journal of the Korea Society of Computer and Information / v.28 no.1 / pp.55-63 / 2023
  • In this paper, we propose the functions and services of a Disaster Crisis Alert Management System that automatically analyzes the situation judgment criteria for issuing a disaster crisis alert, together with a plan for operating it within the National Disaster Management System (NDMS). In the event of a disaster, a crisis alert (interest-caution-alert-serious) is issued according to the crisis alert level. First, to analyze and determine the crisis alert level automatically, we implemented data collection, crisis alert level analysis, crisis alert level judgment, and a disaster crisis alert management system that expresses the crisis alert level by spatial scale (province, city, district). The crisis alert level was analyzed and expressed in two ways: by applying the intelligent crisis alert level (determination of regional sensitivity, risk level, and crisis alert level) and by applying the crisis alert standard of the crisis management manual (province-level standard setting). Second, standard metadata, linkage of target situation information, and API standards for data provision are presented so that data linkage and the crisis alert data of the disaster and safety data sharing platform can be used jointly and the system can be operated within the NDMS.

Detection of Phantom Transaction using Data Mining: The Case of Agricultural Product Wholesale Market (데이터마이닝을 이용한 허위거래 예측 모형: 농산물 도매시장 사례)

  • Lee, Seon Ah;Chang, Namsik
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.161-177 / 2015
  • With the rapid evolution of technology, the size, number, and types of databases have increased concomitantly, so data mining approaches face many challenging applications. One such application is the discovery of fraud patterns in agricultural product wholesale transaction instances. The agricultural product wholesale market in Korea is huge, and vast numbers of transactions are made every day. The demand for agricultural products continues to grow, and the use of electronic auction systems raises the operational efficiency of the wholesale market. The number of unusual transactions is also assumed to increase in proportion to the trading volume, and an unusual transaction is often the first sign of fraud. However, it is very difficult to identify and detect these transactions and the corresponding fraud in the agricultural product wholesale market because the types of fraud are more sophisticated than ever before. Fraud can be detected by manually verifying all transaction records, but this requires a significant amount of human resources and is ultimately not a practical approach. Fraud can also be revealed by a victim's report or complaint, but there are usually no victims in agricultural product wholesale fraud because it is committed through collusion between an auction company and an intermediary wholesaler. Nevertheless, transaction records must be monitored continuously and efforts made to prevent fraud, because fraud not only disturbs the fair trade order of the market but also rapidly reduces its credibility. Applying data mining to such an environment is very useful since it can discover unknown fraud patterns or features in a large volume of transaction data. 
The objective of this research is to empirically investigate the factors necessary to detect fraudulent transactions in an agricultural product wholesale market by developing a data mining based fraud detection model. One major type of fraud is the phantom transaction, a transaction in which the seller (auction company or forwarder) and the buyer (intermediary wholesaler) collude to commit fraud. They pretend to fulfill the transaction by recording false data in the online transaction processing system without actually selling products, and the seller receives money from the buyer. This leads to overstated sales performance and illegal money transfers, which reduce the credibility of the market. This paper reviews the wholesale market environment, including the types of transactions, the roles of market participants, and the various types and characteristics of fraud, and introduces the whole process of developing the phantom transaction detection model. The process consists of the following 4 modules: (1) data cleaning and standardization, (2) statistical data analysis such as distribution and correlation analysis, (3) construction of a classification model using a decision-tree induction approach, and (4) verification of the model in terms of hit ratio. We collected real data from 6 associations of agricultural producers in metropolitan markets. The final model, built with a decision-tree induction approach, revealed that the monthly average trading price of items offered by forwarders is a key variable in detecting phantom transactions. The verification procedure also confirmed the suitability of the results. However, even though the performance of the results is satisfactory, sensitive issues remain in improving classification accuracy and the conciseness of rules. One such issue is the robustness of the data mining model. Data mining is very much data-oriented, so data mining models tend to be very sensitive to changes in data or situations. 
Thus, this lack of robustness evidently requires continuous remodeling as the data or situation changes. We hope that this paper offers valuable guidelines to organizations and companies that are considering introducing or constructing a fraud detection model in the future.
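
Modules (3) and (4) of the process above can be illustrated with a deliberately small sketch. It is not the paper's model: it fits a depth-one decision tree (a single threshold split) on one feature and reports the hit ratio, taken here to mean the share of correctly classified transactions. The feature values, labels, and function names are hypothetical, and the direction of the price effect in the sample data is invented.

```python
def hit_ratio(y_true, y_pred):
    """Share of transactions whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_stump(values, labels):
    """Find the single threshold split with the best hit ratio.

    values: one numeric feature per transaction (here, an assumed
            monthly average trading price).
    labels: True for phantom transactions, False otherwise.
    Returns (threshold, flag_above, hit_ratio), or None if all values are equal.
    flag_above=True means values above the threshold are flagged as phantom.
    """
    best = None
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between adjacent distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        for flag_above in (True, False):
            preds = [(v > threshold) == flag_above for v in values]
            score = hit_ratio(labels, preds)
            if best is None or score > best[2]:
                best = (threshold, flag_above, score)
    return best


# Invented sample data for illustration only.
prices = [100, 110, 120, 300, 320]
is_phantom = [False, False, False, True, True]
stump = fit_stump(prices, is_phantom)
```

A full decision-tree induction would apply such splits recursively over several variables and would measure the hit ratio on held-out transactions rather than on the training data used here.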

Digital Signage service through Customer Behavior pattern analysis

  • Shin, Min-Chan;Park, Jun-Hee;Lee, Ji-Hoon;Moon, Nammee
    • Journal of the Korea Society of Computer and Information / v.25 no.9 / pp.53-62 / 2020
  • Product recommendation services researched recently make recommendations based only on the customer's product purchase history. In this paper, we propose a digital signage service based on customer behavior pattern analysis that makes recommendations using not only purchase history but also the behavior patterns customers show when choosing products. The service analyzes customer behavior patterns and extracts customers' interest in products that are of practical interest to them. It learns the extracted interest rates and customers' purchase histories through the Wide & Deep model. Based on this learning, the sparse vector of other products is predicted through MF (Matrix Factorization). After deriving the ranking of predicted product interest rates, the service uses indoor signage that can interact with customers to display suitable advertisements. With the proposed service, customers' interest information can be captured not only online but also in an offline environment. It will also create a satisfactory purchasing environment by providing customers with suitable advertisements rather than advertisements that advertisers expose at random.

Decoding Brain Patterns for Colored and Grayscale Images using Multivariate Pattern Analysis

  • Zafar, Raheel;Malik, Muhammad Noman;Hayat, Huma;Malik, Aamir Saeed
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.4 / pp.1543-1561 / 2020
  • Creating a taxonomy of human brain activity is a complicated and rather challenging procedure. Its multifaceted aspects, including experiment design, stimuli selection, and presentation of images, in addition to feature extraction and selection techniques, contribute to its challenging nature. Although researchers have applied various methods to create a taxonomy of human brain activity, the use of multivariate pattern analysis (MVPA) for image recognition to catalog human brain activities is scarce. Moreover, experiment design is a complex procedure, and the selection of image type, color, and order is challenging too. This research bridges the gap by using MVPA to create a taxonomy of human brain activity for different categories of images, both colored and grayscale. An experiment is conducted using EEG, with feature extraction, selection, and classification approaches, on data collected from 25 graduates of Universiti Teknologi PETRONAS (UTP) who met prequalification criteria. The participants are shown both colored and grayscale images, and accuracy and reaction time are recorded. The results show that colored images produce better results in terms of accuracy and response time when using the wavelet transform, t-test, and support vector machine. The research finds that MVPA is a better approach for the analysis of EEG data, as more useful information can be extracted from the brain using colored images. It discusses in detail the behavior of the human brain in response to colored and grayscale images for a specific and unique task, and it contributes to further improving the decoding of the human brain with increased accuracy. Such experiment settings can also be implemented in and contribute to other areas, including medicine, the military, business, lie detection, and many others.

Performance Evaluation of a Fat-tree Network with Output-Buffered $a{\times}b$ Switches (출력 버퍼형 $a{\times}b$스위치로 구성된 Fat-tree 망의 성능 분석)

  • 신태지;양명국
    • Journal of KIISE:Information Networking / v.30 no.4 / pp.520-534 / 2003
  • In this paper, a performance evaluation model of the Fat-tree network with multiple-buffered crossbar switches is proposed and examined. The buffered switch technique is well known as a solution to the data collision problem in switch networks. The proposed evaluation model is developed by investigating the transfer patterns of data packets in a switch with output buffers. Two important parameters of network performance, throughput and delay, are then evaluated. To demonstrate the analysis procedure clearly, the proposed model takes simple and primitive switch networks, i.e., networks with no flow control that drop packets. The model can, however, not only be applied to other complicated modern switch networks that have intelligent flow control but also estimate the performance of networks of any size with multiple-buffered switches. To validate the proposed analysis model, simulations are carried out on Fat-tree networks of various sizes that use multiple-buffered crossbar switches. Differences of less than 2% between the analysis and simulation results are observed.

Trading rule extraction in stock market using the rough set approach

  • Kim, Kyoung-jae;Huh, Jin-nyoung;Han, Ingoo
    • Proceedings of the Korea Intelligent Information System Society Conference / 1999.10a / pp.337-346 / 1999
  • In this paper, we propose a rough set approach to extracting trading rules that can discriminate between bullish and bearish stock markets. The rough set approach is very valuable for extracting trading rules. First, it does not make any assumption about the distribution of the data. Second, it not only handles noise well but also eliminates irrelevant factors. In addition, the rough set approach is appropriate for detecting stock market timing because it does not generate a trading signal when the market pattern is uncertain. The experimental results are encouraging and demonstrate the usefulness of the rough set approach for stock market analysis with respect to profitability.

Computer-based Automated System for Determining the Characteristics, Losses and Efficiency of Separately Excited DC Motors

  • Kaur, Puneet;Chatterji, S.
    • Journal of International Conference on Electrical Machines and Systems / v.1 no.4 / pp.440-447 / 2012
  • This paper reports on research carried out to develop a 'dc motor test and analysis platform' that can provide dc motor characteristics, calculate losses and efficiency, and also work as a dc motor speed controller. A user can perform these analyses on a given dc motor with different conventional methods, but the concept discussed in this paper shows how all of them can be intelligently integrated into a single user-friendly automated setup. Integration has been accomplished with a technique that can accommodate all types of dc motors with different ratings under various loading conditions. The paper, however, presents experimentally measured results only for a 0.5 HP separately excited dc motor tested with the discussed scheme. A comparison of the system's methodology with conventional techniques is also given to show the effectiveness of the system.