• Title/Summary/Keyword: mining monitor

Detection of Phantom Transaction using Data Mining: The Case of Agricultural Product Wholesale Market (데이터마이닝을 이용한 허위거래 예측 모형: 농산물 도매시장 사례)

  • Lee, Seon Ah;Chang, Namsik
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.161-177 / 2015
  • With the rapid evolution of technology, the size, number, and types of databases have increased concomitantly, so data mining approaches face many challenging applications. One such application is the discovery of fraud patterns in agricultural product wholesale transactions. The agricultural product wholesale market in Korea is huge, and vast numbers of transactions are made every day. The demand for agricultural products continues to grow, and the use of electronic auction systems raises the operational efficiency of the wholesale market. The number of unusual transactions is also assumed to increase in proportion to the trading volume, and an unusual transaction is often the first sign of fraud. However, these transactions and the corresponding fraud are very difficult to identify and detect because fraud schemes are more sophisticated than ever before. Fraud can be detected by manually verifying all transaction records, but this requires a significant amount of human resources and is ultimately not practical. Fraud can also be revealed by a victim's report or complaint, but agricultural product wholesale fraud usually has no victims because it is committed through collusion between an auction company and an intermediary wholesaler. Nevertheless, transaction records must be monitored continuously and efforts made to prevent fraud, because fraud not only disturbs the fair trade order of the market but also rapidly reduces its credibility. Applying data mining in such an environment is very useful because it can discover unknown fraud patterns or features from a large volume of transaction data.
The objective of this research is to empirically investigate the factors necessary to detect fraudulent transactions in an agricultural product wholesale market by developing a data mining based fraud detection model. One of the major types of fraud is the phantom transaction, in which the seller (auction company or forwarder) and the buyer (intermediary wholesaler) collude. They pretend to fulfill the transaction by recording false data in the online transaction processing system without actually selling products, and the seller receives money from the buyer. This leads to the overstatement of sales performance and illegal money transfers, which reduces the credibility of the market. This paper reviews the wholesale market environment, such as types of transactions, roles of market participants, and the various types and characteristics of fraud, and introduces the whole process of developing the phantom transaction detection model. The process consists of the following four modules: (1) data cleaning and standardization, (2) statistical data analysis such as distribution and correlation analysis, (3) construction of a classification model using a decision-tree induction approach, and (4) verification of the model in terms of hit ratio. We collected real data from 6 associations of agricultural producers in metropolitan markets. The final model, built with a decision-tree induction approach, revealed that the monthly average trading price of items offered by forwarders is a key variable in detecting the phantom transaction. The verification procedure also confirmed the suitability of the results. However, even though the performance of the results is satisfactory, issues remain in improving classification accuracy and the conciseness of rules. One such issue is the robustness of the data mining model. Data mining is very much data-oriented, so data mining models tend to be very sensitive to changes in data or situations.
Thus, this non-robustness requires continuous remodeling as data or situations change. We hope that this paper offers a valuable guideline to organizations and companies that consider introducing or constructing a fraud detection model in the future.
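
Modules (3) and (4) of the process described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it fits a one-level decision tree (a "stump") on a single numeric feature and scores it by hit ratio. The function names, the price values, and the labels are all hypothetical and chosen only for illustration; the abstract does not state the actual thresholds or the direction of the relationship.

```python
def predict_one(value, threshold, direction):
    """Return 1 (flag as suspicious) or 0 for a single feature value.

    direction == 1 flags values above the threshold;
    direction == -1 flags values at or below it.
    """
    flagged = value > threshold
    if direction == -1:
        flagged = not flagged
    return 1 if flagged else 0


def hit_ratio(predictions, labels):
    """Fraction of predictions that match the known labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def fit_stump(values, labels):
    """Search every observed value as a threshold, in both directions,
    and keep the split with the highest hit ratio on the training data."""
    best = (0.0, None, 1)  # (hit ratio, threshold, direction)
    for threshold in sorted(set(values)):
        for direction in (1, -1):
            preds = [predict_one(v, threshold, direction) for v in values]
            ratio = hit_ratio(preds, labels)
            if ratio > best[0]:
                best = (ratio, threshold, direction)
    return best


# Synthetic example: monthly average trading prices (hypothetical units)
# and labels where 1 marks a transaction assumed to be a phantom transaction.
prices = [10, 12, 11, 30, 28, 13, 35, 9]
labels = [0, 0, 0, 1, 1, 0, 1, 0]
ratio, threshold, direction = fit_stump(prices, labels)
```

A full decision-tree induction would repeat this split search recursively over several variables, and the hit ratio should be measured on held-out data rather than on the training set used here.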

A Topographical Classifier Development Support System Cooperating with Data Mining Tool WEKA from Airborne LiDAR Data (항공 라이다 데이터로부터 데이터마이닝 도구 WEKA를 이용한 지형 분류기 제작 지원 시스템)

  • Lee, Sung-Gyu;Lee, Ho-Jun;Sung, Chul-Woong;Park, Chang-Hoo;Cho, Woo-Sug;Kim, Yoo-Sung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.28 no.1 / pp.133-142 / 2010
  • To monitor the composition and change of the national land, an intelligent topographical classifier that accurately classifies land-cover types from airborne LiDAR data is highly desirable. We developed a topographical classifier development support system that cooperates with the data mining tool WEKA to help users construct accurate topographical classification systems. The system provides the following functions: superposing LiDAR data on the corresponding aerial images, dividing LiDAR data into tiles for efficient processing, 3D visualization of partial LiDAR data, feature extraction from tiles, automatic WEKA input generation, and automatic C++ program generation from the classification rule set. In addition, with the data mining tool WEKA, we can choose highly distinguishable features using the attribute selection function and choose the best classification model as the resulting topographical classifier. Users can therefore easily develop an intelligent topographical classifier that is well suited to their development objectives.

Satellite-based Hybrid Drought Assessment using Vegetation Drought Response Index in South Korea (VegDRI-SKorea) (식생가뭄반응지수 (VegDRI)를 활용한 위성영상 기반 가뭄 평가)

  • Nam, Won-Ho;Tadesse, Tsegaye;Wardlow, Brian D.;Jang, Min-Won;Hong, Suk-Young
    • Journal of The Korean Society of Agricultural Engineers / v.57 no.4 / pp.1-9 / 2015
  • Developing a drought index that provides drought information at detailed spatial resolution is essential for improving drought planning and preparedness. The objective of this study was to develop a satellite-based hybrid drought index, the Vegetation Drought Response Index in South Korea (VegDRI-SKorea), that could improve spatial resolution for monitoring local and regional drought. VegDRI-SKorea was developed using the Classification And Regression Trees (CART) algorithm based on remote sensing data such as the Normalized Difference Vegetation Index (NDVI) from MODIS satellite images, climate drought indices such as the Self Calibrating Palmer Drought Severity Index (SC-PDSI) and the Standardized Precipitation Index (SPI), and biophysical data such as land cover, eco region, and soil available water capacity. A case study of the 2012 drought was conducted to evaluate the VegDRI-SKorea model for South Korea. VegDRI-SKorea captured the drought areas from the end of May through the severe drought at the end of June. The results show that integrating satellite imagery and various associated data with a data mining technique yields drought information that is improved both spatially and temporally, and a better understanding of drought conditions. In addition, VegDRI-SKorea is expected to help monitor current drought conditions for local and regional drought risk assessment and to assist drought-related decision making.
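
One of the inputs named in this abstract, NDVI, has a standard definition: (NIR - Red) / (NIR + Red), where NIR and Red are the near-infrared and red reflectances. The abstract itself does not give the formula, so the sketch below is background only; the function name and the sample reflectance values are hypothetical and are not taken from the study's MODIS data.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir and red are surface reflectances (typically between 0 and 1).
    Returns None when both bands are zero, because the ratio is undefined.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Hypothetical reflectance pairs (nir, red) for three pixels.
pixels = [(0.5, 0.1), (0.3, 0.3), (0.0, 0.0)]
values = [ndvi(nir, red) for nir, red in pixels]
```

Higher values indicate denser green vegetation, which is why a drop in NDVI relative to the seasonal norm is commonly used as a vegetation stress signal.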

Secure Data Transaction Protocol for Privacy Protection in Smart Grid Environment (스마트 그리드 환경에서 프라이버시 보호를 위한 안전한 데이터 전송 프로토콜)

  • Go, Woong;Kwak, Jin
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.8 / pp.1701-1710 / 2012
  • Recently, it has been found that it is important to use a smart grid to reduce greenhouse-gas emissions worldwide. A smart grid is a digitally enabled electrical grid that gathers, distributes, and acts on information regarding the behavior of all participants (suppliers and consumers) to improve the efficiency, importance, reliability, economics, and sustainability of electricity services. The smart grid technology uses two-way communication, where users can monitor and limit the electricity consumption of their home appliances in real time. Likewise, power companies can monitor and limit the electricity consumption of home appliances for stabilization of the electricity supply. However, if information regarding the measured electricity consumption of a user is leaked, serious privacy issues may arise, as such information may be used as a source of data mining of the electricity consumption patterns or life cycles of home residents. In this paper, we propose a data transaction protocol for privacy protection in a smart grid. In addition, a power company cannot decrypt an encrypted home appliance ID without the user's password.

Analyzing Operation Deviation in the Deasphalting Process Using Multivariate Statistics Analysis Method

  • Park, Joo-Hwang;Kim, Jong-Soo;Kim, Tai-Suk
    • Journal of Korea Multimedia Society / v.17 no.7 / pp.858-865 / 2014
  • In a system such as a Manufacturing Execution System (MES), various sensors collect data in real time and store it as big data to monitor the process. If big data mining is performed in a distributed computing system, the whole process can be improved. In this paper, a system for analyzing the cause of operation deviation was built using big data collected from the deasphalting process at two different plants. By applying multivariate statistical analysis to the big data collected through the MES, the main cause of operation deviation was analyzed, and we present this analysis as an example. Regression analysis with the forward stepwise method produced a regression equation that can explain a 52% increase in performance compared to the existing model. The suggested method can replace the existing manual analysis method in the petrochemical process, which risks being subjective depending on the tester, and provides an objective analysis based on numbers and statistics.

Evaluation of Shopping Items: Focused on Purchase of Foreign Tourists in South Korea

  • Jeong, Dong-Bin
    • East Asian Journal of Business Economics (EAJBE) / v.7 no.2 / pp.21-30 / 2019
  • Purpose - In this work, we categorize the 21 shopping items that foreign tourists purchase in South Korea and examine the level of dissimilarity (or similarity) between items by using a distance matrix and both hierarchical and k-means cluster analyses, based on several purpose-of-visit attributes in 2017. In addition, the multidimensional scaling (MDS) method is applied to visualize proximities among shopping items based on purpose-of-visit attributes. Research design and methodology - This study uses a face-to-face survey, carried out in 2017 by the Ministry of Culture, Sports and Tourism, of foreign tourists from 20 countries who purchased shopping items in South Korea. The CLUSTER, PROXIMITIES and ALSCAL modules in IBM SPSS 23.0 are used to perform this work. Results - We ascertain that the 21 shopping items can be classified into five groups with homogeneous traits through two-step cluster analysis. We can position the clusters and the shopping items belonging to each cluster. Conclusions - We can assess the relative patterns and characteristics of each shopping item, obtain useful information for promoting shopping tourism based on the actual perceptions of foreign tourists, and apply the results in practice to each tourism industry.

Analysis and Subclass Classification of Microarray Gene Expression Data Using Computational Biology (전산생물학을 이용한 마이크로어레이의 유전자 발현 데이터 분석 및 유형 분류 기법)

  • Yoo, Chang-Kyoo;Lee, Min-Young;Kim, Young-Hwang;Lee, In-Beum
    • Journal of Institute of Control, Robotics and Systems / v.11 no.10 / pp.830-836 / 2005
  • The application of microarray technologies, which simultaneously monitor the expression patterns of thousands of individual genes in different biological systems, has tremendously increased the amount of available gene expression data and has provided new insights into gene expression during drug development, within disease processes, and across species. There is a great need for data mining methods that allow straightforward interpretation, visualization, and analysis of the relevant information contained in gene expression profiles. In particular, classifying biological samples into known classes or phenotypes is an important practical application of microarray gene expression profiles. Gene expression profiles obtained from tissue samples of patients thus allow cancer classification. In this research, molecular classification of microarray gene expression data is applied to multi-class cancer using computational biology techniques such as gene selection, principal component analysis, and fuzzy clustering. The proposed method was applied to microarray data from leukemia patients; specifically, it was used to interpret the gene expression pattern and analyze the leukemia subtype whose expression profiles correlated with four cases of acute leukemia gene expression. A basic introduction to microarray data analysis is also provided.

Securing a Cyber Physical System in Nuclear Power Plants Using Least Square Approximation and Computational Geometric Approach

  • Gawand, Hemangi Laxman;Bhattacharjee, A.K.;Roy, Kallol
    • Nuclear Engineering and Technology / v.49 no.3 / pp.484-494 / 2017
  • In industrial plants such as nuclear power plants, system operations are performed by embedded controllers orchestrated by Supervisory Control and Data Acquisition (SCADA) software. A targeted attack (also termed a control aware attack) on the controller/SCADA software can lead a control system to operate in an unsafe mode or sometimes to complete shutdown of the plant. Such malware attacks can result in tremendous cost to the organization for recovery, cleanup, and maintenance activity. SCADA systems in operational mode generate huge log files. These files are useful for analyzing plant behavior and for diagnostics during an ongoing attack. However, they are bulky and difficult to inspect manually. Data mining techniques such as least squares approximation and computational methods can be used to analyze logs and to take proactive actions when required. This paper explores methodologies and algorithms for developing an effective monitoring scheme against control aware cyber attacks. It also explains soft computation techniques such as the computational geometric method and least squares approximation that can be effective in monitor design. This paper provides insights into diagnostic monitoring and its effectiveness through attack simulations on a four-tank model, using computation techniques to diagnose the attacks. Cyber security of instrumentation and control systems used in nuclear power plants is of paramount importance and hence could be a possible target of such applications.
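
The idea of using least squares approximation to monitor logged process values can be sketched as follows. This is a generic illustration, not the scheme from the paper: it fits a straight line to a baseline window of logged readings and flags later readings that deviate from the extrapolated line by more than a tolerance. The function names, the tank-level readings, and the tolerance are all hypothetical.

```python
def fit_line(times, values):
    """Ordinary least squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


def flag_deviations(readings, slope, intercept, tolerance):
    """Return the (time, value) readings whose residual from the fitted
    line exceeds the tolerance in absolute value."""
    flagged = []
    for t, v in readings:
        residual = v - (slope * t + intercept)
        if abs(residual) > tolerance:
            flagged.append((t, v))
    return flagged


# Hypothetical baseline log of a tank level sampled at t = 0..5.
baseline_t = [0, 1, 2, 3, 4, 5]
baseline_level = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
slope, intercept = fit_line(baseline_t, baseline_level)

# New readings: the second one departs sharply from the fitted trend.
new_readings = [(6, 4.1), (7, 9.0)]
suspicious = flag_deviations(new_readings, slope, intercept, tolerance=0.5)
```

A straight-line model is only reasonable over short windows; a real process model (for example, the four-tank dynamics mentioned in the abstract) would need a fit suited to its physics.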

Big data platform for health monitoring systems of multiple bridges

  • Wang, Manya;Ding, Youliang;Wan, Chunfeng;Zhao, Hanwei
    • Structural Monitoring and Maintenance / v.7 no.4 / pp.345-365 / 2020
  • At present, many machine learning and data mining methods are used for analyzing and predicting structural response characteristics. However, a platform that combines big data analysis methods with online and offline analysis modules has not been used in actual projects. This work is dedicated to developing a multifunctional Hadoop-Spark big data platform for bridges to monitor and evaluate serviceability based on a structural health monitoring system. It realizes rapid processing, analysis, and storage of collected health monitoring data. The platform contains offline computing and online analysis modules, using a Hadoop-Spark environment. Hadoop provides the overall framework and storage subsystem for the big data platform, while Spark is used for online computing. Finally, the computational performance of the Hadoop-Spark big data platform is verified through several actual analysis tasks. Experiments show that the platform has good fault tolerance, scalability, and online analysis performance. It can meet the daily analysis requirements of 5s/time for one bridge and 40s/time for 100 bridges.

Predictive maintenance technology for smart factory (스마트 팩토리를 위한 예지보전 기술)

  • Kwon, Dae-hoon;Oh, Chang-heon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.172-174 / 2021
  • In existing industry, maintenance has been carried out as preventive maintenance, which causes unnecessary idle time because monitoring is limited. However, with the advent of the Fourth Industrial Revolution, real-time monitoring has become possible in many industries, including mining, manufacturing, oil and gas, and commercial agriculture, and there is a desire to minimize idle time due to maintenance. In particular, there is growing interest in predictive maintenance, which can reduce costs and maximize operational efficiency by predicting a failure and performing maintenance before the equipment fails. In this study, we look at predictive maintenance technology that can identify abnormal conditions of smart factory equipment in advance and monitor them in real time.
