• Title/Summary/Keyword: Source Code Similarity

Search Result 47, Processing Time 0.027 seconds

Objective Material analysis to the device with IoT Framework System

  • Lee, KyuTae;Ki, Jang Geun
    • International Journal of Advanced Culture Technology
    • /
    • v.8 no.2
    • /
    • pp.289-296
    • /
    • 2020
  • Software copyright are written in text form of documents and stored as files, so it is easy to expose on an illegal copyright. The IOT framework configuration and service environment are also evaluated in software structure and revealed to replication environments. Illegal copyright can be easily created by intelligently modifying the program code in the framework system. This paper deals with similarity comparison to determine the suspicion of illegal copying. In general, original source code should be provided for similarity comparison on both. However, recently, the suspected developer have refused to provide the source code, and comparative evaluation are performed only with executable code. This study dealt with how to analyze the similarity with the execution code and the circuit configuration and interface state of the system without the original source code. In this paper, we propose a method of analyzing the data of the object without source code and verifying the similarity comparison result through evaluation examples.

Comparison procedure in evaluation analysis of source code comparison on Embedded system (정보기기 소스코드 유사성 분석에서 목적물 검증)

  • Nam, SangYep;Kim, Do-Hyeun;Lee, Kyu-Tae
    • Journal of Software Assessment and Valuation
    • /
    • v.17 no.2
    • /
    • pp.31-38
    • /
    • 2021
  • In order to analyze the similarity of the source code object material, the source code on both sides must be able to be compiled and executed. In particular, in the case of hardware-integrated software, it is necessary to check whether the hardware interface matches. However, currently, the source code is provided in an incomplete state which is not original of source code used in developing steps. The complainant confirms that the executing characteristics are similar to their own in the expression and function of the output, and request an evaluation. When a source code compilation error occurs during the evaluation process, the experts draw a flowchart of the source code and applies the method of tracing the code flow for each function as indirect method. However, this method is indirect and the subjective judgment is applied, so there is concern about the contention of objectivity in the similarity evaluation result. In this paper, the problems of unverified source code similarity analysis and improvement directions are dealt with, through the analysis cases of source code disputes applied to embedded systems.

Object Material Confirmation for Source Code Comparison on Embedded System (임베디드 시스템의 동일기능 소스코드 유사도 분석 요구사항)

  • Kim, Do-Hyeun;Lee, Kyu-Tae
    • Journal of Software Assessment and Valuation
    • /
    • v.17 no.1
    • /
    • pp.25-30
    • /
    • 2021
  • In case of evaluating the similarity of the source code analysis material in the embedded system, the provided source code must be confirmed to be executable. However, it is currently being in which compilation and interface matching with hardware are provided in an unconfirmed materials. The complainant assumes that many parts of the source code are similar because the characteristics of the operation are similar and the expression of the function is similar. As for the analysis result, the analysis result may appear different than expected due to these unidentified objects. In this study, the improvement direction is sugested through the case study by the analysis process of the source code and the similarity of the unverified source code.

Software Similarity Measurement based on Dependency Graph using Harmony Search

  • Yun, Ho Yeong;Joe, Yong Joon;Jung, Byung Ok;Shin, Dong myung;Bahng, Hyo Keun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.21 no.12
    • /
    • pp.1-10
    • /
    • 2016
  • In this paper, we attempt to prevent certain cases by tracing a history and making genogram about open source software and its modification using similarity of source code. There are many areas which use open source software actively and widely, and open source software contributes their development. However, there are many unconscious cases like ignoring license or intellectual properties infringe which can lead litigation. To prevent such situation, we analyze source code similarity using program dependence graph which resembles subgraph isomorphism problem, a typical NP-complete problem. To solve subgraph isomorphism problem, we utilized harmony search of metaheuristic algorithm and compared its result with a genetic algorithm. For the future works, we represent open source software as program dependence graph and analyze their similarity.

Comparison of Code Similarity Analysis Performance of funcGNN and Siamese Network (funcGNN과 Siamese Network의 코드 유사성 분석 성능비교)

  • Choi, Dong-Bin;Jo, In-su;Park, Young B.
    • Journal of the Semiconductor & Display Technology
    • /
    • v.20 no.3
    • /
    • pp.113-116
    • /
    • 2021
  • As artificial intelligence technologies, including deep learning, develop, these technologies are being introduced to code similarity analysis. In the traditional analysis method of calculating the graph edit distance (GED) after converting the source code into a control flow graph (CFG), there are studies that calculate the GED through a trained graph neural network (GNN) with the converted CFG, Methods for analyzing code similarity through CNN by imaging CFG are also being studied. In this paper, to determine which approach will be effective and efficient in researching code similarity analysis methods using artificial intelligence in the future, code similarity is measured through funcGNN, which measures code similarity using GNN, and Siamese Network, which is an image similarity analysis model. The accuracy was compared and analyzed. As a result of the analysis, the error rate (0.0458) of the Siamese network was bigger than that of the funcGNN (0.0362).

A Plagiarism Detection Technique for Java Program Using Bytecode Analysis (바이트코드 분석을 이용한 자바 프로그램 표절검사기법)

  • Ji, Jeong-Hoon;Woo, Gyun;Cho, Hwan-Gue
    • Journal of KIISE:Software and Applications
    • /
    • v.35 no.7
    • /
    • pp.442-451
    • /
    • 2008
  • Most plagiarism detection systems evaluate the similarity of source codes and detect plagiarized program pairs. If we use the source codes in plagiarism detection, the source code security can be a significant problem. Plagiarism detection based on target code can be used for protecting the security of source codes. In this paper, we propose a new plagiarism detection technique for Java programs using bytecodes without referring their source codes. The plagiarism detection procedure using bytecode consists of two major steps. First, we generate the token sequences from the Java class file by analyzing the code area of methods. Then, we evaluate the similarity between token sequences using the adaptive local alignment. According to the experimental results, we can find the distributions of similarities of the source codes and that of bytecodes are very similar. Also, the correlation between the similarities of source code pairs and those of bytecode pairs is high enough for typical test data. The plagiarism detection system using bytecode can be used as a preliminary verifying tool before detecting the plagiarism by source code comparison.

A Study on Open Source Version and License Detection Tool (오픈소스 버전 및 라이선스 탐지 도구에 관한 연구)

  • Ki-Hwan Kim;Seong-Cheol Yoon;Su-Hyun Kim;Im-Yeong Lee
    • The Transactions of the Korea Information Processing Society
    • /
    • v.13 no.7
    • /
    • pp.299-310
    • /
    • 2024
  • Software is expensive, labor-intensive, and time-consuming to develop. To solve this problem, many organizations turn to publicly available open source, but they often do so without knowing exactly what they're getting into. Older versions of open source have various security vulnerabilities, and even when newer versions are released, many users are still using them, exposing themselves to security threats. Additionally, compliance with licenses is essential when using open source, but many users overlook this, leading to copyright issues. To solve this problem, you need a tool that analyzes open source versions, vulnerabilities, and license information. Traditional Blackduck provide a wealth of open source information when you request the source code, but it's a heavy lift to build the environment. In addition, Fossology extracts the licenses of open source, but does not provide detailed information such as versions because it does not have its own database. To solve these problems, this paper proposes a version and license detection tool that identifies the open source of a user's source code by measuring the source code similarity, and then detects the version and license. The proposed method improves the accuracy of similarity over existing source code similarity measurement programs such as MOSS, and provides users with information about licenses, versions, and vulnerabilities by analyzing each file in the corresponding open source in a web-based lightweight platform environment. This solves capacity issues such as BlackDuck and the lack of open source details such as Fossology.

Efficient Similarity Analysis Methods for Same Open Source Functions in Different Versions (서로 다른 버전의 동일 오픈소스 함수 간 효율적인 유사도 분석 기법)

  • Kim, Yeongcheol;Cho, Eun-Sun
    • Journal of KIISE
    • /
    • v.44 no.10
    • /
    • pp.1019-1025
    • /
    • 2017
  • Binary similarity analysis is used in vulnerability analysis, malicious code analysis, and plagiarism detection. Proving that a function is equal to a well-known safe functions of different versions through similarity analysis can help to improve the efficiency of the binary code analysis of malicious behavior as well as the efficiency of vulnerability analysis. However, few studies have been carried out on similarity analysis of the same function of different versions. In this paper, we analyze the similarity of function units through various methods based on extractable function information from binary code, and find a way to analyze efficiently with less time. In particular, we perform a comparative analysis of the different versions of the OpenSSL library to determine the way in which similar functions are detected even when the versions differ.

Generating Pylogenetic Tree of Homogeneous Source Code in a Plagiarism Detection System

  • Ji, Jeong-Hoon;Park, Su-Hyun;Woo, Gyun;Cho, Hwan-Gue
    • International Journal of Control, Automation, and Systems
    • /
    • v.6 no.6
    • /
    • pp.809-817
    • /
    • 2008
  • Program plagiarism is widespread due to intelligent software and the global Internet environment. Consequently the detection of plagiarized source code and software is becoming important especially in academic field. Though numerous studies have been reported for detecting plagiarized pairs of codes, we cannot find any profound work on understanding the underlying mechanisms of plagiarism. In this paper, we study the evolutionary process of source codes regarding that the plagiarism procedure can be considered as evolutionary steps of source codes. The final goal of our paper is to reconstruct a tree depicting the evolution process in the source code. To this end, we extend the well-known bioinformatics approach, a local alignment approach, to detect a region of similar code with an adaptive scoring matrix. The asymmetric code similarity based on the local alignment can be considered as one of the main contribution of this paper. The phylogenetic tree or evolution tree of source codes can be reconstructed using this asymmetric measure. To show the effectiveness and efficiency of the phylogeny construction algorithm, we conducted experiments with more than 100 real source codes which were obtained from East-Asia ICPC(International Collegiate Programming Contest). Our experiments showed that the proposed algorithm is quite successful in reconstructing the evolutionary direction, which enables us to identify plagiarized codes more accurately and reliably. Also, the phylogeny construction algorithm is successfully implemented on top of the plagiarism detection system of an automatic program evaluation system.

Study on the comparison result of Machine code Program (실행코드 비교 감정에서 주변장치 분석의 유효성)

  • Kim, Do-Hyeun;Lee, Kyu-Tae
    • Journal of Software Assessment and Valuation
    • /
    • v.16 no.1
    • /
    • pp.37-44
    • /
    • 2020
  • The similarity of the software is extracted by the verification of comparing with the source code. The source code is the intellectual copyright of the developer written in the programming language. And the source code written in text format contains the contents of the developer's expertise and ideas. The verification for judging the illegal use of software copyright is performed by comparing the structure and contents of files with the source code of the original and the illegal copy. However, there is hard to do the one-to-one comparison in practice. Cause the suspected source code do not submitted Intentionally or unconsciously. It is now increasing practically. In this case, the comparative evaluation with execution code should be performed, and indirect methods such as reverse assembling method, reverse engineering technique, and sequence analysis of function execution are applied. In this paper, we analyzed the effectiveness of indirect comparison results by practical evaluation . It also proposes a method to utilize to the system and executable code files as a verification results.