Detection of Phantom Transaction using Data Mining: The Case of Agricultural Product Wholesale Market

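Korean title (translated): A Prediction Model for Phantom Transactions Using Data Mining: The Case of an Agricultural Product Wholesale Market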

  • Lee, S. A. (이선아; Technology Division, Wavus Co., Ltd.) ;
  • Chang, N. (장남식; College of Business Administration, University of Seoul)
  • Received : 2014.11.25
  • Accepted : 2015.01.14
  • Published : 2015.03.31

Abstract

With the rapid evolution of technology, the size, number, and variety of databases have grown, and data mining faces many challenging applications as a result. One such application is discovering fraud patterns in agricultural product wholesale transactions. The agricultural product wholesale market in Korea is huge, and vast numbers of transactions are made every day. Demand for agricultural products continues to grow, and electronic auction systems have raised the operational efficiency of the wholesale market. The number of unusual transactions is assumed to increase in proportion to trading volume, and an unusual transaction is often the first sign of fraud. However, these transactions and the fraud behind them are very difficult to detect because fraud schemes have become more sophisticated than ever. Fraud can be detected by manually verifying all transaction records, but this requires substantial human resources and is ultimately impractical. Fraud can also be revealed by a victim's report or complaint, but agricultural wholesale fraud usually has no victim because it is committed through collusion between an auction company and an intermediary wholesaler. Nevertheless, transaction records must be monitored continuously and effort made to prevent fraud, because fraud not only disturbs fair trade in the market but also rapidly erodes the market's credibility. Data mining is well suited to this environment because it can discover previously unknown fraud patterns or features in a large volume of transaction data.
The objective of this research is to empirically investigate the factors needed to detect fraudulent transactions in an agricultural product wholesale market by developing a data mining based fraud detection model. One major type of fraud is the phantom transaction, in which the seller (auction company or forwarder) and the buyer (intermediary wholesaler) collude. They pretend to complete a transaction by recording false data in the online transaction processing system without actually selling any products, and the seller receives money from the buyer. This leads to overstated sales performance and illegal money transfers, which reduce the credibility of the market. This paper reviews the wholesale market environment, including the types of transactions, the roles of market participants, and the types and characteristics of fraud, and introduces the whole process of developing the phantom transaction detection model. The process consists of four modules: (1) data cleaning and standardization, (2) statistical data analysis such as distribution and correlation analysis, (3) construction of a classification model using decision-tree induction, and (4) verification of the model in terms of hit ratio. We collected real data from 6 associations of agricultural producers in metropolitan markets. The final decision-tree model revealed that the monthly average trading price of items offered by forwarders is a key variable in detecting phantom transactions. The verification procedure also confirmed the suitability of the results. However, although the performance of the model is satisfactory, issues remain in improving classification accuracy and the conciseness of the rules. One such issue is the robustness of the data mining model. Data mining is highly data-oriented, so its models tend to be very sensitive to changes in data or circumstances. This lack of robustness means the model must be rebuilt continuously as the data or circumstances change. We hope that this paper offers a useful guideline to organizations and companies considering introducing or constructing a fraud detection model in the future.
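
As a rough illustration of modules (3) and (4), the sketch below induces a one-level decision tree (a decision stump) on a single numeric feature by minimizing Gini impurity, then reports the hit ratio on held-out data. It is not the authors' model. The data are synthetic, the feature stands in for a monthly average trading price, and the assumption that phantom transactions carry higher prices is purely hypothetical and chosen only to make the example separable.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def fit_stump(xs, ys):
    """Pick the threshold on one feature that minimizes weighted Gini impurity.

    Returns (threshold, label_if_at_or_below, label_if_above).
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no split point between equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or score < best[0]:
            # Each leaf predicts its majority class.
            left_label = int(2 * sum(left) >= len(left))
            right_label = int(2 * sum(right) >= len(right))
            best = (score, threshold, left_label, right_label)
    if best is None:
        raise ValueError("feature is constant; no split is possible")
    return best[1:]


def predict(stump, x):
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label


def hit_ratio(stump, xs, ys):
    """Fraction of transactions whose predicted class matches the true class."""
    hits = sum(predict(stump, x) == y for x, y in zip(xs, ys))
    return hits / len(ys)


def make_synthetic(n, rng):
    """Hypothetical data: label 1 (phantom) is drawn around a higher price."""
    xs, ys = [], []
    for _ in range(n):
        label = int(rng.random() < 0.3)
        center = 150.0 if label else 100.0
        xs.append(rng.gauss(center, 10.0))
        ys.append(label)
    return xs, ys


rng = random.Random(0)
train_x, train_y = make_synthetic(400, rng)
test_x, test_y = make_synthetic(200, rng)

stump = fit_stump(train_x, train_y)
test_hit_ratio = hit_ratio(stump, test_x, test_y)
print(f"threshold={stump[0]:.1f}, test hit ratio={test_hit_ratio:.3f}")
```

A full decision tree applies the same split search recursively to each branch and across many features. Evaluating on data the model has not seen, as here, is what makes the hit ratio a meaningful check.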

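With the rapid evolution of information technology, the emergence of big data, and the growing sophistication of analytical techniques, there are active attempts to apply data mining, which extracts meaningful information from large volumes of data, across diverse domains. One such domain is agricultural product distribution: driven by steadily growing demand for agricultural products and the spread of electronic auctions, the wholesale markets in the capital region alone handle tens of millions of transactions or more each year. However, alongside the rapid growth in trading volume, fraudulent transactions that have long been carried out as customary practice are also increasing. Fraud in agricultural wholesale markets, which arises from collusion among market participants, is becoming increasingly sophisticated and is very difficult to detect and expose. As a result, the fair-trade order of the agricultural distribution environment is undermined and trust in the market is damaged. A scientific, automated fraud detection system that improves transaction transparency and structurally addresses distribution corruption is therefore needed more urgently than ever. This study presents a model that uses decision trees, a data mining technique, to detect phantom transactions, that is, the act of settling payment for a transaction that never took place by fabricating records as though goods had been traded. To this end, we collected data from actual agricultural wholesale markets and performed preliminary work such as data cleaning and standardization. We then characterized the data through correlation and distribution analysis of the variables, built a prediction model to derive patterns that separate phantom transactions from normal ones, and finally evaluated the model on test data to confirm the suitability of the results. Applying data mining based fraud detection models not only to phantom transactions but also to the growing variety of fraud, such as successful-bid fraud and auction manipulation, is expected to yield even greater benefits.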
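
The preliminary steps named in the abstract (cleaning, standardization, and correlation analysis) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's procedure. The field names `price` and `quantity`, the cleaning rule, and the sample rows are all hypothetical.

```python
import statistics


def clean(rows):
    """Drop rows with a missing or non-positive price or quantity (assumed rule)."""
    return [
        r for r in rows
        if r["price"] is not None and r["quantity"] is not None
        and r["price"] > 0 and r["quantity"] > 0
    ]


def standardize(values):
    """Return z-scores: (value - mean) / population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mean) / std for v in values]


def pearson(xs, ys):
    """Pearson correlation computed as the mean product of paired z-scores."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(zx)


# Hypothetical transaction records, including two that cleaning should reject.
raw_rows = [
    {"price": 100.0, "quantity": 10.0},
    {"price": 120.0, "quantity": 8.0},
    {"price": None, "quantity": 5.0},    # missing price
    {"price": 90.0, "quantity": 12.0},
    {"price": 150.0, "quantity": -1.0},  # invalid quantity
    {"price": 110.0, "quantity": 9.0},
]

rows = clean(raw_rows)
prices = [r["price"] for r in rows]
quantities = [r["quantity"] for r in rows]

z_prices = standardize(prices)
r_price_quantity = pearson(prices, quantities)
print(f"kept {len(rows)} rows, price/quantity correlation = {r_price_quantity:.3f}")
```

Standardizing first puts variables with different units on a common scale, which also makes the correlation a simple average of products.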
