A Method for Detecting Program Plagiarism Comparing Class Structure Graphs

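  • 김연어 (Department of Electrical and Computer Engineering, Pusan National University) ;
  • 이윤정 (IT-Based Convergence Industry Creative Talent Development Program, Pusan National University) ;
  • 우균 (Department of Electrical and Computer Engineering, Pusan National University; Smart Control Center, LG Electronics)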
  • Received : 2013.08.16
  • Accepted : 2013.10.24
  • Published : 2013.11.28

Abstract

Code theft has become more frequent as code mobility has increased, and many studies on program comparison have been reported in response. This paper proposes a method for detecting plagiarism in Java programs using class structures. The proposed method constructs a graph that represents the referential relationships between the member variables and the methods of a class. This relationship forms a bipartite graph, and a graph isomorphism test is applied to the resulting set of graphs to measure the similarity between programs. To measure the effectiveness of the method, an experiment was conducted on a test set of Java source code submitted as solutions to programming assignments in the Object-Oriented Programming course at Pusan National University in 2012. To evaluate accuracy, the F-measure of the proposed method was compared with those of JPlag and Stigmata, two existing similarity checkers. In the experiment, the F-measure of the proposed method was higher than those of JPlag and Stigmata by 0.17 and 0.34, respectively.
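The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: each class is assumed to be given already as a set of (method, member variable) reference edges, a brute-force check stands in for a proper graph isomorphism algorithm, and the program-level score (matched classes divided by the size of the larger program) is an assumed metric, since the abstract does not state the formula. All names (`is_isomorphic`, `similarity`, `f_measure`, and the example classes) are hypothetical.

```python
from itertools import permutations


def is_isomorphic(g1, g2):
    """Brute-force isomorphism test for two bipartite class graphs.

    Each graph is a set of (method, variable) edges. Methods may only map
    to methods and variables to variables. Only practical for small graphs.
    """
    if len(g1) != len(g2):
        return False
    m1 = sorted({m for m, _ in g1})
    v1 = sorted({v for _, v in g1})
    m2 = sorted({m for m, _ in g2})
    v2 = sorted({v for _, v in g2})
    if len(m1) != len(m2) or len(v1) != len(v2):
        return False
    # Try every pairing of methods and of variables until edges line up.
    for m_perm in permutations(m2):
        m_map = dict(zip(m1, m_perm))
        for v_perm in permutations(v2):
            v_map = dict(zip(v1, v_perm))
            if {(m_map[m], v_map[v]) for m, v in g1} == set(g2):
                return True
    return False


def similarity(prog_a, prog_b):
    """Assumed metric: isomorphic class pairs / size of the larger program."""
    if not prog_a or not prog_b:
        return 0.0
    unmatched = list(prog_b)
    matched = 0
    for g in prog_a:
        for i, h in enumerate(unmatched):
            if is_isomorphic(g, h):
                matched += 1
                del unmatched[i]  # each class is matched at most once
                break
    return matched / max(len(prog_a), len(prog_b))


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical classes: identifiers differ, reference structure is the same.
account = {("deposit", "balance"), ("withdraw", "balance"),
           ("getBalance", "balance"), ("getOwner", "owner")}
renamed = {("add", "amt"), ("sub", "amt"), ("get", "amt"), ("who", "name")}
# Same node and edge counts, but a different reference structure.
other = {("p", "x"), ("q", "x"), ("r", "y"), ("s", "y")}

print(similarity([account], [renamed]))
print(similarity([account], [other]))
```

Because only the reference structure is compared, renaming identifiers does not change the result, which is the property a structure-based detector relies on. For graphs of realistic size, the brute-force search would need to be replaced by a dedicated isomorphism algorithm.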
