Designing a Repository Independent Model for Mining and Analyzing Heterogeneous Bug Tracking Systems

  • Lee, Jae-Kwon (Dept. of Computer Engineering, Chungbuk National University)
  • Jung, Woo-Sung (Dept. of Computer Engineering, Chungbuk National University)
  • Received : 2014.08.26
  • Accepted : 2014.09.17
  • Published : 2014.09.30

Abstract

In this paper, we propose UniBAS (Unified Bug Analysis System), which provides a unified repository model by integrating data extracted from heterogeneous bug tracking systems. UniBAS reduces the cost and complexity of the MSR (Mining Software Repositories) research process and lets researchers focus on their analysis logic rather than on tedious, repetitive tasks such as extracting repositories, processing data, and building analysis models. In addition to extracting data, the system automatically generates the database tables, views, and stored procedures that researchers need for query-based analysis. It can also export files in various formats for use with external analysis tools or for managing research data. To evaluate the usefulness of the system, we performed a case study with UniBAS on detecting duplicate bug reports in the Firefox project on the Mozilla site. In the experiments, the automatically extracted data could be queried and analyzed flexibly, and applying various natural language processing algorithms produced valid results, which demonstrates the effectiveness of the proposed system.

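The duplicate-report case study mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which natural language processing algorithms were applied, so the TF-IDF weighting with cosine similarity used here is an assumed, commonly used baseline and not necessarily the authors' method. The `UnifiedBugReport` record, its field names, and the sample reports are all hypothetical and only stand in for data that a unified repository model might expose.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class UnifiedBugReport:
    """Hypothetical tracker-independent record; field names are assumptions."""
    source: str       # label of the originating tracker (illustrative only)
    bug_id: int
    summary: str
    description: str


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(reports):
    """Build one sparse TF-IDF vector (a dict) per report."""
    docs = [Counter(tokenize(r.summary + " " + r.description)) for r in reports]
    n = len(docs)
    # Document frequency: in how many reports each term occurs.
    df = Counter(term for doc in docs for term in doc)
    return [
        {t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, tf in doc.items()}
        for doc in docs
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_duplicates(reports, query_index):
    """Return (score, bug_id) pairs for all other reports, most similar first."""
    vectors = tfidf_vectors(reports)
    query = vectors[query_index]
    scored = [
        (cosine(query, vec), reports[i].bug_id)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    return sorted(scored, reverse=True)


# Invented sample reports, not taken from any real tracker.
reports = [
    UnifiedBugReport("bugzilla", 1, "Browser crashes when opening a new tab",
                     "Crash occurs after clicking the new tab button"),
    UnifiedBugReport("bugzilla", 2, "Bookmarks toolbar does not render icons",
                     "Icons are missing in the bookmarks toolbar"),
    UnifiedBugReport("bugzilla", 3, "Crash on new tab",
                     "Opening a new tab crashes the browser"),
]

ranked = rank_duplicates(reports, query_index=2)
print("Most likely duplicate of bug 3:", ranked[0][1])
```

In this sketch the ranking works only on the normalized record, so the same function could be applied to reports from any tracker once they are mapped into one model, which is the kind of reuse a repository-independent design is meant to allow.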
