• Title/Summary/Keyword: Software Repository Mining

Designing a Repository Independent Model for Mining and Analyzing Heterogeneous Bug Tracking Systems (다형의 버그 추적 시스템 마이닝 및 분석을 위한 저장소 독립 모델 설계)

  • Lee, Jae-Kwon;Jung, Woo-Sung
    • Journal of the Korea Society of Computer and Information / v.19 no.9 / pp.103-115 / 2014
  • In this paper, we propose UniBAS (Unified Bug Analysis System), which provides a unified repository model by integrating data extracted from heterogeneous bug tracking systems. UniBAS reduces the cost and complexity of the MSR (Mining Software Repositories) research process and lets researchers focus on their analysis logic rather than on tedious, repetitive work such as extracting repositories, processing data, and building analysis models. In addition to extracting the data, the system automatically generates the database tables, views, and stored procedures that researchers need for query-based analysis. It can also export files in various formats for use with external analysis tools or for managing research data. To evaluate the usefulness of the system, we performed a case study with UniBAS on detecting duplicate bug reports in the Firefox project on the Mozilla site. Experiments applying various natural language processing algorithms and flexible queries to the automatically extracted data also showed the effectiveness of the proposed system.
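
The idea of a repository-independent model can be illustrated with a minimal sketch. This is not the UniBAS schema, which the abstract does not describe. The class `UnifiedBugReport`, the adapter functions, and the simplified record shapes below are all hypothetical and only show how records from two differently structured trackers might be mapped onto one common form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnifiedBugReport:
    """Hypothetical common record; field names are assumptions, not UniBAS's."""
    tracker: str
    bug_id: str
    summary: str
    status: str


def from_bugzilla_like(record: dict) -> UnifiedBugReport:
    # Assumed flat record shape: {"id": ..., "summary": ..., "status": ...}
    return UnifiedBugReport(
        tracker="bugzilla",
        bug_id=str(record["id"]),
        summary=record["summary"],
        status=record["status"].lower(),
    )


def from_jira_like(record: dict) -> UnifiedBugReport:
    # Assumed nested record shape: {"key": ..., "fields": {...}}
    fields = record["fields"]
    return UnifiedBugReport(
        tracker="jira",
        bug_id=record["key"],
        summary=fields["summary"],
        status=fields["status"]["name"].lower(),
    )


# Two illustrative (made-up) records, normalized into the same structure.
reports = [
    from_bugzilla_like({"id": 101, "summary": "Crash on startup", "status": "RESOLVED"}),
    from_jira_like({"key": "PROJ-7",
                    "fields": {"summary": "Crash on startup", "status": {"name": "Open"}}}),
]

# Analysis code can now treat both trackers uniformly.
open_reports = [r for r in reports if r.status == "open"]
```

Once every tracker has an adapter of this kind, downstream analysis only needs to know the common record, which is the kind of saving the abstract attributes to a unified model.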

A Technique to Link Bug and Commit Report based on Commit History (커밋 히스토리에 기반한 버그 및 커밋 연결 기법)

  • Chae, Youngjae;Lee, Eunjoo
    • KIISE Transactions on Computing Practices / v.22 no.5 / pp.235-239 / 2016
  • A 'commit-bug link', the link between a commit in the commit history and a bug report, is used for software maintenance and defect prediction in bug tracking systems. Previous studies detect these links automatically based on text similarity, time interval, and keywords. Such approaches depend on the quality of the commit history and can therefore miss links. In this paper, we propose a technique that links commits and bug reports using not only the messages in the commit history but also the similarity of the files in the commit history that are coupled with bug reports. The experimental results demonstrate the applicability of the proposed approach.
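
Combining message similarity with file similarity can be sketched roughly as follows. The abstract does not give the authors' similarity measures or weighting, so this sketch assumes Jaccard similarity for both parts and a hypothetical weight `w_text`. The function names and the notion of "files already associated with a bug" are illustrative assumptions.

```python
import re


def tokens(text: str) -> set:
    """Lowercased alphanumeric word set of a message or report."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_score(commit_msg, commit_files, bug_text, bug_files, w_text=0.5):
    """Weighted mix of text and file similarity (the weighting is an assumption)."""
    text_sim = jaccard(tokens(commit_msg), tokens(bug_text))
    file_sim = jaccard(set(commit_files), set(bug_files))
    return w_text * text_sim + (1.0 - w_text) * file_sim


# Made-up example: the commit touches the file already tied to the bug.
score = link_score(
    "fix parser crash", ["parser.py"],
    "parser crash", ["parser.py"],
)
```

A candidate pair would then be linked when its score exceeds some threshold. A commit with an uninformative message can still score well through the file component, which is the gap in message-only approaches that the abstract points to.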

A Market Positioning Analysis using Mobile Shopping App Reviews (모바일 쇼핑 앱 리뷰를 이용한 시장 포지셔닝 분석)

  • Kim, Yong-Hwan;Park, Ji-hoon;Lee, Seung-Jun;Kim, Ja-Hee
    • Proceedings of the Korean Society of Computer Information Conference / 2016.01a / pp.157-160 / 2016
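  • Recently, the transaction volume of the mobile shopping market has been growing exponentially every year, and companies are interested in which characteristics of mobile applications can increase their sales. In this paper, we therefore use text mining to extract frequently used nouns from the reviews of widely used mobile shopping applications and derive evaluation items through content analysis. We then apply the repertory grid technique to the derived evaluation items to evaluate the mobile shopping applications and carry out market positioning. Through this, we analyze which characteristics of mobile shopping applications affect users' service preferences.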

Analyzing Characteristics of Code Refactoring for Python Deep-Learning Applications (파이썬 딥러닝 응용의 코드 리팩토링 특성 분석)

  • Kim, Dong Kwan
    • The Journal of the Korea Contents Association / v.22 no.10 / pp.754-764 / 2022
  • Code refactoring refers to a maintenance task that changes the code of a software system in order to accommodate new requirements, fix bugs, and restructure code. There have been various studies on refactoring topics such as refactoring types, refactoring benefits, and CASE tools. However, refactoring-based coding practices have mainly benefited Java applications rather than Python ones, and there are few refactoring studies on Python applications. This paper finds and analyzes single refactoring operations and composite refactoring operations for Python-based deep learning systems. In addition, we find a statistically significant difference in the frequency of single and composite refactoring operations between two groups: deep learning applications and typical Python applications. Furthermore, we analyze keywords in commit messages to capture the refactoring intentions of software developers.

A Technique to Recommend Appropriate Developers for Reported Bugs Based on Term Similarity and Bug Resolution History (개발자 별 버그 해결 유형을 고려한 자동적 개발자 추천 접근법)

  • Park, Seong Hun;Kim, Jung Il;Lee, Eun Joo
    • KIPS Transactions on Software and Data Engineering / v.3 no.12 / pp.511-522 / 2014
  • During software development, a variety of bugs are reported. Many open source development projects use bug tracking systems such as Bugzilla, MantisBT, Trac, and JIRA to handle reported bug information. Bug reports in a bug tracking system are triaged to manage bugs and to determine the developer responsible for resolving each report. As software grows in size and bug reports tend to be duplicated, bug triage becomes increasingly complex and difficult. In this paper, we present an approach for assigning bug reports to appropriate developers, which is a main part of the bug triage task. First, the words that appear in resolved bug reports are classified by developer. Second, words in newly reported bugs are selected. After these two steps, vectors whose items are the selected words are generated. Third, the TF-IDF (term frequency-inverse document frequency) of each selected word is computed and used as the weight of the corresponding vector item. Finally, developers are recommended based on the similarity between each developer's word vector and the vector of the new bug report. We conducted an experiment on the Eclipse JDT and CDT projects to show the applicability of the proposed approach. We also compared the proposed approach with an existing study based on machine learning. The experimental results show that the proposed approach outperforms the existing method.
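
The steps in this abstract (per-developer word collections, TF-IDF weights, vector similarity) can be sketched as below. The abstract does not specify the exact TF-IDF variant or similarity measure, so this sketch assumes raw term counts, a smoothed IDF, and cosine similarity. The function `recommend` and the sample data are hypothetical.

```python
import math
from collections import Counter


def recommend(dev_words: dict, new_bug_words: list, top_k: int = 1) -> list:
    """Rank developers by cosine similarity of TF-IDF vectors.

    dev_words maps a developer name to the words of the bug reports that
    developer resolved; new_bug_words holds the words of the new report.
    """
    n = len(dev_words)
    # Document frequency: in how many developers' collections a word occurs.
    df = Counter()
    for words in dev_words.values():
        df.update(set(words))

    def idf(term):
        # Smoothed IDF (an assumed variant) so unseen terms stay finite.
        return math.log((1 + n) / (1 + df[term])) + 1.0

    def vectorize(words):
        return {t: count * idf(t) for t, count in Counter(words).items()}

    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    query = vectorize(new_bug_words)
    ranked = sorted(dev_words,
                    key=lambda d: cosine(vectorize(dev_words[d]), query),
                    reverse=True)
    return ranked[:top_k]


# Made-up history: alice resolved parser bugs, bob resolved debugger bugs.
history = {
    "alice": ["parser", "syntax", "parser"],
    "bob": ["debugger", "breakpoint"],
}
suggested = recommend(history, ["parser", "error"])
```

In this toy data the new report shares the word "parser" only with alice's history, so she is ranked first. A real setup would also need tokenization and stop-word handling, which the sketch leaves out.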

A Technique to Detect Change-Coupled Files Using the Similarity of Change Types and Commit Time (변경 유형의 유사도 및 커밋 시간을 이용한 파일 변경 결합도)

  • Kim, Jung Il;Lee, Eun Joo
    • KIPS Transactions on Software and Data Engineering / v.3 no.2 / pp.65-72 / 2014
  • Change coupling is a measure of how strongly two entities are related through changes. When two source files have frequently been changed together, they are regarded as change-coupled files and will probably be changed together in the near future. In previous studies, the change coupling between two files is defined by the number of common change times, that is, the common commit times of the files. However, this frequency-based technique has limitations because of 'tangled changes', which frequently occur in development environments with version control systems. A tangled change means that several code hunks are changed at the same time even though they are unrelated to each other. In this paper, the change types of the code hunks are also used to define change coupling, in addition to the common commit times of the target files. First, frequency vectors based on change types are defined from the extracted change types, and then the similarity of change patterns is calculated using the cosine similarity measure. We conducted experiments on the open source projects Eclipse JDT and CDT as case studies. The results show the applicability of the proposed method compared to the previous studies.
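
The frequency-vector and cosine-similarity step can be illustrated with a small sketch. The change-type names and the sample data are hypothetical, and the abstract does not say how the authors extract change types or combine the similarity with commit times. Only the vector comparison is shown.

```python
import math
from collections import Counter


def change_type_vector(change_types: list, vocabulary: list) -> list:
    """Count how often each change type in the vocabulary occurs."""
    counts = Counter(change_types)
    return [counts[t] for t in vocabulary]


def cosine_similarity(u: list, v: list) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical change-type vocabulary and changes seen in commits common
# to two files.
vocabulary = ["add_method", "change_condition", "rename"]
file_a = change_type_vector(["add_method", "add_method", "rename"], vocabulary)
file_b = change_type_vector(["add_method", "rename"], vocabulary)

similarity = cosine_similarity(file_a, file_b)  # vectors [2, 0, 1] and [1, 0, 1]
```

Two files that are committed together but undergo unrelated kinds of change would get a low similarity here. This is how change types can help separate tangled changes from genuinely coupled ones.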