• Title/Summary/Keyword: Source file identification

Semantic Similarity-Based Contributable Task Identification for New Participating Developers

  • Kim, Jungil;Choi, Geunho;Lee, Eunjoo
    • Journal of information and communication convergence engineering / v.16 no.4 / pp.228-234 / 2018
  • In software development, the quality of a product often depends on whether its developers can rapidly find and contribute to the proper tasks. Currently, the word data of projects to which newcomers have previously contributed are mainly utilized to find appropriate source files in an ongoing project. However, because of the vocabulary gap between software projects, the accuracy of source file identification based on information retrieval is not guaranteed. In this paper, we propose a novel source file identification method to reduce the vocabulary gap between software projects. The proposed method employs DBpedia Spotlight to identify proper source files based on semantic similarity between source files of software projects. In an experiment based on the Spring Framework project, we evaluate the accuracy of the proposed method in the identification of contributable source files. The experimental results show that the proposed approach can achieve better accuracy than the existing method based on comparison of word vocabularies.
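
The vocabulary gap described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the `CONCEPTS` table below is a hypothetical stand-in for the concept annotations an entity-linking service would return, and the two file descriptions are invented. It only shows how two texts with no shared words can still share concepts.

```python
# Hypothetical stand-in for an entity-linking service: maps surface words
# to a shared concept label. A real system would query an annotator instead.
CONCEPTS = {
    "servlet": "Web_server",
    "http": "Web_server",
    "endpoint": "Web_server",
    "jdbc": "Database",
    "sql": "Database",
    "repository": "Database",
}


def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def concepts(text):
    """Set of concept labels found for the words in the text."""
    return {CONCEPTS[w] for w in tokens(text) if w in CONCEPTS}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented word data for a previously edited file and a candidate file.
past_file = "servlet http handler"
new_file = "endpoint request mapping"

word_sim = jaccard(tokens(past_file), tokens(new_file))
concept_sim = jaccard(concepts(past_file), concepts(new_file))
print(word_sim, concept_sim)
```

With these inputs the word-level similarity is zero because no word is shared, while the concept-level similarity is nonzero because both texts map to the same assumed concept.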

A Comparative Study on Function and Performance of Snort and Suricata (Snort와 Suricata의 탐지 기능과 성능에 대한 비교 연구)

  • Jeong, Myeong Ki;Ahn, Seongjin;Park, Won Hyung
    • Convergence Security Journal / v.14 no.5 / pp.3-8 / 2014
  • We compare two intrusion detection systems (IDSs) widely used by network administrators, Snort and Suricata, in terms of function and performance. Specifically, we analyze which threat-detection functions were newly added in Suricata and how its newly introduced multi-threading affects performance. We found that Suricata has several features that do not exist in Snort, such as Protocol Identification, HTTP Normalizer & Parser, and File Identification. We also showed that the gap in PPS (packets per second) widens as the number of working CPU cores increases. We therefore conclude that Suricata can be an efficient alternative to Snort, as it is more effective both quantitatively and qualitatively.

Crowdsourcing Identification of License Violations

  • Lee, Sanghoon;German, Daniel M.;Hwang, Seung-won;Kim, Sunghun
    • Journal of Computing Science and Engineering / v.9 no.4 / pp.190-203 / 2015
  • Free and open source software (FOSS) has created a large pool of source code that can be easily copied to create new applications. However, a copy should preserve the copyright notice and license of the original file unless the license explicitly permits such a change. As software evolves, it is challenging to keep original licenses or choose proper ones. As a result, there are many potential license violations. Although violations can have a high impact on copyright protection, identifying them is highly complex and relies on manual inspection by experts. However, such inspection cannot scale to the volume of open source software released daily worldwide. To make this process scalable, we propose two methods: using machine-based algorithms to narrow down the potential violations, and guiding non-experts to manually inspect violations. Using the first method, we found 219 projects (76.6%) with potential violations. Using the second method, we show that the accuracy of crowds is comparable to that of experts. Our techniques might help developers identify potential violations, understand the causes, and resolve these violations.

Measurement for License Identification of Open Source Software (오픈소스 소프트웨어 라이선스 파일 식별 기술)

  • Yun, Ho-Yeong;Joe, Yong-Joon;Jung, Byung-Ok;Shin, Dong-Myung
    • Journal of Software Assessment and Valuation / v.12 no.2 / pp.1-8 / 2016
  • In this paper, we study extracting and identifying license files from a package to prevent unintentional intellectual property infringement caused by lost, modified, or conflicting license information when redistributing open source software. To investigate the characteristics of license files, we analyzed 322 licenses using n-gram and TF-IDF methods and extracted license files from the packages. We identified license information by measuring cosine similarity against the registered licenses.
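
The combination of n-grams, TF-IDF weighting, and cosine similarity mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the word-bigram tokenization, the smoothed IDF formula, and the two abbreviated "registered" texts (`MIT-like`, `GPL-like`) are all choices made here for the example.

```python
import math
from collections import Counter


def ngrams(text, n=2):
    """Word n-grams of a lowercased text (bigrams by default)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def tfidf_vectors(docs, n=2):
    """One sparse TF-IDF vector (dict) per document."""
    grams = [Counter(ngrams(d, n)) for d in docs]
    df = Counter(g for c in grams for g in c)  # document frequency
    total = len(docs)
    # Smoothed IDF (an assumption) so widely shared n-grams keep some weight.
    idf = {g: math.log((1 + total) / (1 + df[g])) + 1 for g in df}
    return [{g: tf * idf[g] for g, tf in c.items()} for c in grams]


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


# Abbreviated, illustrative license snippets (not full license texts).
registered = {
    "MIT-like": "permission is hereby granted free of charge to any person obtaining a copy",
    "GPL-like": "this program is free software you can redistribute it and or modify it",
}
candidate = "permission is hereby granted free of charge to any person"

names = list(registered)
vectors = tfidf_vectors([registered[k] for k in names] + [candidate])
scores = {k: cosine(vectors[-1], vectors[i]) for i, k in enumerate(names)}
best = max(scores, key=scores.get)
print(best, scores)
```

The candidate shares no bigram with the second snippet, so only the first registered text receives a nonzero score. A real license file would need normalization (punctuation, copyright lines) before comparison, which this sketch omits.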

Hand-held Multimedia Device Identification Based on Audio Source (음원을 이용한 멀티미디어 휴대용 단말장치 판별)

  • Lee, Myung Hwan;Jang, Tae Ung;Moon, Chang Bae;Kim, Byeong Man;Oh, Duk-Hwan
    • Journal of Korea Society of Industrial Information Systems / v.19 no.2 / pp.73-83 / 2014
  • Thanks to the development of diverse audio editing technologies, audio files can be easily altered. As a result, social problems such as forgery may arise. Digital forensic technology is being actively studied to solve these problems. In this paper, we propose a hand-held device identification method, one area of digital forensic technology. It uses the noise features of devices, which are caused by the design and integrated circuits of each device but cannot be perceived by listeners. A Wiener filter is used to obtain the device noise, acoustic features are extracted with MIRtoolbox, and a multi-layer neural network is trained on them. To evaluate the proposed method, we use 5-fold cross-validation on recorded data collected from 6 mobile devices. The experiments show a performance of 99.9%. We also perform experiments to check whether the noise features of mobile devices remain useful after the data are uploaded to UCC. The experiments show a performance of 99.8% for UCC data.

A Study on Identification of the Source of Videos Recorded by Smartphones (스마트폰으로 촬영된 동영상의 출처 식별에 대한 연구)

  • Kim, Hyeon-seung;Choi, Jong-hyun;Lee, Sang-jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.26 no.4 / pp.885-894 / 2016
  • As smartphones become more common, anyone can easily take pictures and record videos. Video files recorded on smartphones can serve as important clues and evidence. When analyzing such video files, it is sometimes necessary to prove that a video file was recorded by a specific smartphone. To do this, various fingerprinting techniques from existing research can be used. However, there are situations in which the fingerprinting result must be corroborated or fingerprinting techniques cannot be used. Therefore, a forensic investigation of the smartphone must be done before fingerprinting, and a database of video file metadata should be established. This paper discusses the artifacts left in a smartphone after video recording and the database mentioned above.

An Accurate Log Object Recognition Technique

  • Jiho, Ju;Byungchul, Tak
    • Journal of the Korea Society of Computer and Information / v.28 no.2 / pp.89-97 / 2023
  • In this paper, we identify factors that make log analysis difficult and design a technique for detecting various objects embedded in logs, which helps subsequent analysis. In today's IT systems, logs have become a critical data source for many advanced AI analysis techniques. Although logs contain a wealth of useful information, it is difficult to apply these techniques directly because logs are semi-structured by nature. The factors that interfere with log analysis are various embedded objects such as file paths, identifiers, and JSON documents. We have designed a BERT-based object pattern recognition algorithm for these objects and performed object identification. Object pattern recognition algorithms are based on object definitions, GROK patterns, and regular expressions. We find that simple pattern matching based on known patterns and regular expressions is ineffective. Our results show significantly better accuracy than using only the patterns and regular expressions. In addition, the BERT model classified objects with accuracy as high as 99%.
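
The regular-expression baseline that this abstract compares against can be sketched as below. This is only the simple pattern-matching side, not the BERT-based model, and the three patterns and the sample log line are hypothetical; real GROK pattern sets are far more extensive.

```python
import re

# Hypothetical patterns for three object types named in the abstract.
# Order matters: earlier patterns claim their spans first.
PATTERNS = [
    ("JSON", re.compile(r"\{.*\}")),
    ("UUID", re.compile(
        r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b")),
    ("PATH", re.compile(r"(?:/[\w.\-]+)+")),
]


def find_objects(line):
    """Return (label, text) pairs for non-overlapping pattern matches."""
    found = []
    taken = []  # spans already claimed by an earlier pattern
    for label, pattern in PATTERNS:
        for m in pattern.finditer(line):
            # Skip matches that overlap an already claimed span.
            if any(m.start() < end and start < m.end() for start, end in taken):
                continue
            taken.append(m.span())
            found.append((label, m.group()))
    return found


# Invented log line for illustration.
line = ('job 3f2b8c1e-9d4a-4c6b-8e2f-1a2b3c4d5e6f wrote '
        '/var/log/app/out.log meta={"rows": 10}')
objects = find_objects(line)
for label, text in objects:
    print(label, text)
```

A baseline like this is brittle: the greedy JSON pattern would merge two JSON fragments on one line, and relative paths or unusual identifiers are missed entirely, which is consistent with the abstract's observation that pattern matching alone is ineffective.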

Digital Camera Identification Based on Interpolation Pattern Used Lens Distortion Correction (디지털 카메라의 렌즈 왜곡 보정에 사용된 보간 패턴 추출을 통한 카메라 식별 방법)

  • Hwang, Min-Gu;Kim, Dong-Min;Har, Dong-Hwan
    • Journal of Internet Computing and Services / v.13 no.3 / pp.49-59 / 2012
  • With the development of digital technology, image reproduction is improving day by day. At the same time, diverse image editing software has been developed to manage images easily. While editing images, these programs may delete or modify the EXIF data that holds the original image information; as a result, edited images without source information are widely spread on the web. This can affect image analysis because originality is compromised. In a court of law in particular, the source of evidence should be stated clearly; a digital image whose EXIF data has been deleted or altered therefore cannot serve as objective evidence. In this research, we try to identify the source digital camera in order to establish the originality of digital images, focusing on the lens distortion correction algorithm used in digital image processing. Lens distortion correction uses a mapping algorithm, together with an interpolation algorithm to prevent aliasing and reconstruction artifacts. Because the interpolation follows a pattern similar to the mapping, we look for evidence of interpolation. We propose a minimum filter algorithm to detect the interpolation pattern and apply the same minimum filter coefficient to two areas, one with interpolation and one without. Using the DFT, we examine the frequency characteristics of each area. Based on this result, we build the final detection map from the differences between the two areas. In other words, the area containing interpolation caused by mapping is processed with the minimum filter for detection, while the area without interpolation tends to show different frequency characteristics.