Java API Pattern Extraction and Recommendation using Collocation Analysis

A Method for Extracting and Recommending Java API Patterns through Collocation Analysis

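  • 권찬우 (Division of Computer and Telecommunications Engineering, Yonsei University) ;
  • 황상원 (Division of Computer and Telecommunications Engineering, Yonsei University) ;
  • 남영광 (Division of Computer and Telecommunications Engineering, Yonsei University)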
  • Received : 2017.04.10
  • Accepted : 2017.08.11
  • Published : 2017.11.15

Abstract

Many developers rely on specific APIs when building software. To learn how a particular API is used, a developer can consult the website that provides the API or search the web for usage examples. However, the provider's site does not always explain how to use the API, and in many cases the guidance it offers is only partial. In this paper, we propose JACE (Java AST Collocation-pattern Extractor), a system that supplements such documentation by extracting commonly used code for reuse. JACE extracts API call nodes from source code, identifies collocations among them, and analyzes the relations between the collocations to extract significant API patterns. To verify the accuracy of the defined patterns, we performed the following experiment: 794 open source projects were analyzed, yielding about 15M API call nodes. An Eclipse plug-in test program was then used to retrieve patterns for the top 10 classes of API call nodes. Finally, the results of the test program were compared with the code found on the reference pages of those API classes and with the search results from Searchcode [1].

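During software development, developers search for API usage in various ways but often fail to obtain the results they want. To address this problem, this study develops JACE (Java AST Collocation-pattern Extractor), a system that extracts API patterns using collocation relations in the abstract syntax tree and recommends them. JACE analyzes the Java abstract syntax tree to extract API call nodes, then analyzes the collocation relations between the nodes and builds a collocation dictionary. Using this dictionary, it generates collocation lists and defines them as patterns. The defined patterns are recommended on user request through a test program built as an Eclipse plug-in. For the experiment, 794 open source projects were analyzed and about 15 million API call nodes were extracted. As a result, JACE presented more useful example code and usage guidance than existing search systems.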
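
The pipeline summarized in the abstract (API call nodes, a collocation dictionary, then collocation lists used as patterns) can be illustrated with a minimal sketch. This is not the JACE implementation: the abstract does not state which collocation measure or AST traversal JACE uses, so the sketch assumes that API call sequences have already been extracted per method, and it treats raw adjacent-pair counts as the collocation strength. The class name `CollocationSketch`, the methods `buildDictionary` and `extractPattern`, and the parameters `minCount` and `maxLength` are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; the names and the counting scheme are assumptions, not JACE's. */
final class CollocationSketch {

    /** Counts how often call b directly follows call a inside one call sequence. */
    static Map<String, Map<String, Integer>> buildDictionary(List<List<String>> callSequences) {
        Map<String, Map<String, Integer>> dictionary = new HashMap<>();
        for (List<String> sequence : callSequences) {
            for (int i = 0; i + 1 < sequence.size(); i++) {
                dictionary
                    .computeIfAbsent(sequence.get(i), k -> new HashMap<>())
                    .merge(sequence.get(i + 1), 1, Integer::sum);
            }
        }
        return dictionary;
    }

    /**
     * Builds a collocation list by repeatedly following the most frequent successor.
     * Stops when no successor reaches minCount, when a call would repeat,
     * or when the pattern reaches maxLength.
     */
    static List<String> extractPattern(Map<String, Map<String, Integer>> dictionary,
                                       String start, int minCount, int maxLength) {
        List<String> pattern = new ArrayList<>();
        pattern.add(start);
        String current = start;
        while (pattern.size() < maxLength) {
            Map<String, Integer> successors = dictionary.get(current);
            if (successors == null) {
                break;
            }
            String best = null;
            int bestCount = 0;
            for (Map.Entry<String, Integer> entry : successors.entrySet()) {
                int count = entry.getValue();
                // Ties are broken alphabetically so the result is deterministic.
                boolean wins = count > bestCount
                    || (count == bestCount && best != null && entry.getKey().compareTo(best) < 0);
                if (wins) {
                    best = entry.getKey();
                    bestCount = count;
                }
            }
            if (best == null || bestCount < minCount || pattern.contains(best)) {
                break;
            }
            pattern.add(best);
            current = best;
        }
        return pattern;
    }

    public static void main(String[] args) {
        // Hand-written call sequences standing in for API call nodes taken from ASTs.
        List<List<String>> sequences = Arrays.asList(
            Arrays.asList("FileReader.<init>", "BufferedReader.<init>",
                          "BufferedReader.readLine", "BufferedReader.close"),
            Arrays.asList("FileReader.<init>", "BufferedReader.<init>",
                          "BufferedReader.readLine", "BufferedReader.readLine",
                          "BufferedReader.close"),
            Arrays.asList("FileReader.<init>", "BufferedReader.<init>",
                          "BufferedReader.close"));
        Map<String, Map<String, Integer>> dictionary = buildDictionary(sequences);
        System.out.println(extractPattern(dictionary, "FileReader.<init>", 2, 5));
    }
}
```

A statistical association test could replace the raw counts without changing the overall structure; the greedy walk is only one simple way to turn a pairwise dictionary into an ordered list.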

Acknowledgement

Supported by: National Research Foundation of Korea

References

  1. Searchcode. [Online]. Available: https://Searchcode.com/
  2. Research Institute of Korean Studies, "Korea University Korean Dictionary," pp. 7535, Research Institute of Korean Studies, Korea University, Seoul, 2009.
  3. K. H. Moon, "A Study on the Korean Vocabulary Education by the Collocation," Journal of Korean Language Education, No. 109, pp. 217-250, Dec. 2002.
  4. S. K. Hsu, S. J. Lin, "MACs: Mining API code snippets for code reuse," Journal of Expert Systems with Applications, Vol. 38, No. 6, pp. 7291-7301, Jun. 2011. https://doi.org/10.1016/j.eswa.2010.12.021
  5. A. Michail, "Data mining library reuse patterns using generalized association rules," Proc. of the international conference on Software engineering, pp. 167-176, 2000.
  6. J. Pei, J. Han, B. Mortazavi-Asl, J. Wang, H. Pinto, Q. Chen, U. Dayal, and M. C. Hsu, "Mining sequential patterns by pattern-growth: the PrefixSpan approach," Journal of IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 11, pp. 1424-1440, Nov. 2004. https://doi.org/10.1109/TKDE.2004.77
  7. D. Mandelin, L. Xu, R. Bodik, and D. Kimelman, "Jungloid mining: helping to navigate the API jungle," Proc. of the ACM SIGPLAN conference on Programming language design and implementation, pp. 48-61, 2005.
  8. T. Xie, J. Pei, "MAPO: mining API usages from open source repositories," Proc. of the international workshop on Mining software repositories, pp. 54-57, 2006.
  9. R. Holmes, R. J. Walker, and G. C. Murphy, "Approximate Structural Context Matching: An Approach to Recommend Relevant Examples," Journal of IEEE Transactions on Software Engineering, Vol. 32, No. 12, pp. 952-970, Dec. 2006. https://doi.org/10.1109/TSE.2006.117
  10. C. D. Manning, H. Schütze, "Foundations of statistical natural language processing," pp. 680, MIT Press, Cambridge, 1999.
  11. J. R. Firth, "A synopsis of linguistic theory 1930-1955," Journal of Studies in Linguistic Analysis, pp. 1-32, 1962.
  12. SHA. [Online]. Available: https://ko.wikipedia.org/wiki/SHA
  13. A. M. Mood, F. A. Graybill, and D. C. Boes, "Introduction to the theory of statistics," 3rd Ed., pp. 564, McGraw-Hill, 1974.
  14. Github. [Online]. Available: https://github.com/
  15. Sourceforge. [Online]. Available: https://sourceforge.net/
  16. Softpedia. [Online]. Available: http://www.softpedia.com/
  17. Bitbucket. [Online]. Available: https://bitbucket.org/
  18. Google Code. [Online]. Available: https://code.google.com/