• Title/Summary/Keyword: 분류 코드 (classification code)

Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA

  • Jeon, Dong-Ha;Lee, Soo-Jin
    • Journal of the Korea Society of Computer and Information / v.27 no.11 / pp.123-130 / 2022
  • Studies on detecting and classifying Android malware from API call sequences have been actively carried out in recent years. However, API call sequence based classification has serious limitations: because the data are vast and the features high-dimensional, malware analysis and model construction consume excessive time and resources. In this study, we significantly reduced the feature dimension of the CICAndMal2020 dataset, which contains extensive API call information, using PCA (Principal Component Analysis), and then evaluated several classification models, including LightGBM, Random Forest, and k-Nearest Neighbors. The experimental results show that PCA significantly reduces the feature dimension while preserving the characteristics of the original data and yields efficient malware classification. Both binary and multi-class classification achieve higher accuracy than previous studies, even when the features are reduced to less than 1% of their original size.
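
The dimensionality-reduction step described in this abstract can be illustrated with a minimal, standard-library sketch of PCA. It is not the authors' pipeline: the function name `pca_reduce`, the power-iteration solver, and the toy API-call count matrix are all assumptions made for illustration, and no classifier (LightGBM, Random Forest, k-NN) is trained here.

```python
import math
import random


def pca_reduce(rows, k, iters=200, seed=0):
    """Project rows onto their top-k principal components.

    Hypothetical helper. Returns (projected_rows, explained_variance_ratio).
    Assumes at least two rows of equal length.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    total_var = sum(cov[i][i] for i in range(d))

    rng = random.Random(seed)
    components, kept_var = [], 0.0
    for _ in range(k):
        # Power iteration: repeated multiplication by the covariance matrix
        # converges to the eigenvector with the largest remaining eigenvalue.
        v = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        eigval = sum(v[i] * cov[i][j] * v[j]
                     for i in range(d) for j in range(d))
        components.append(v)
        kept_var += eigval
        # Deflation: subtract the found component before the next search.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigval * v[i] * v[j]

    projected = [[sum(r[j] * c[j] for j in range(d)) for c in components]
                 for r in centered]
    ratio = kept_var / total_var if total_var > 0 else 0.0
    return projected, ratio


# Made-up API-call counts: one row per app, one column per API.
api_counts = [[5, 10, 0, 1], [6, 12, 1, 0], [1, 2, 9, 8], [0, 1, 10, 9]]
reduced, ratio = pca_reduce(api_counts, k=2)
print(len(reduced[0]), "features kept, variance ratio", round(ratio, 3))
```

In a fuller experiment the reduced rows, rather than the raw API-call features, would be passed to the classifier, and the explained-variance ratio gives a rough check on how much of the original structure survives the reduction.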

A Development the Standard of Construction to ERP Template (건설표준 ERP템플릿 개발 사례)

  • Lee, Min-Nam;Oh, Dong-Hwan;Kwon, Oh-In
    • Proceedings of the Korea Information Processing Society Conference / 2003.05b / pp.939-942 / 2003
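  • Construction companies currently face several problems: each company uses its own classification and identification code system; item notation, attribute formats, terminology, and units are inconsistent; and the code classification systems used in the industry are not interoperable and are unsuitable for international use. This project therefore aims to develop a standard construction ERP template system by applying core construction CALS/EC technologies to the various kinds of construction information that have so far been managed and operated separately from existing integrated information systems. To that end, the study links the standard construction ERP template system with the standard codes of the integrated construction classification system to build a standard construction ERP (integrated DB) based on core construction CALS/EC technologies. It systematizes quantity forecasting from the quantity take-off data of construction works, integrates quantities with schedules, links design, construction, and maintenance information, and provides integrated construction information by making site management and cost management information accessible over a communication network.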

A Study for Customer Clustering Mechanism using Automatic Meter Reading Data (자동검침 데이터를 이용한 고객 분류 기법에 대한 연구)

  • Kim, Young-Il;Shin, Jin-Ho;Song, Jae-Ju;Yi, Bong-Jae
    • Proceedings of the KIEE Conference / 2008.07a / pp.179-180 / 2008
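  • Research on load analysis using automatic meter reading (AMR) data has recently been under way to operate distribution lines effectively. The usual approach generates representative load patterns from the data of AMR customers, uses them to generate load patterns for customers without AMR, and then analyzes peak load and load patterns for each circuit and section of the distribution network at 15-minute, hourly, daily, weekly, and monthly resolution. Customers have previously been classified by contract type code alone, but customers with the same contract type code often have different load patterns, which lowers the accuracy of load analysis. This study proposes classifying customers with the k-means method, using various customer attributes and 15-minute AMR data in addition to the contract type code.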
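
As a rough illustration of the clustering idea in this abstract, the following sketch runs k-means on normalized load profiles. Everything concrete in it is an assumption: the contract codes `"100"` and `"200"`, the four-value profiles (standing in for 15-minute readings), the deterministic farthest-first initialisation, and the choice to cluster on load shape only rather than on the fuller attribute set the abstract mentions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100):
    """Plain k-means; returns (labels, centroids)."""
    # Farthest-first initialisation (an assumption, chosen for determinism):
    # start from the first point, then repeatedly add the point that is
    # farthest from the centroids picked so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))

    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Made-up customers: (contract type code, share of daily load per period).
customers = [
    ("100", [0.10, 0.40, 0.40, 0.10]),  # daytime-heavy
    ("100", [0.12, 0.38, 0.42, 0.08]),  # daytime-heavy
    ("100", [0.10, 0.10, 0.20, 0.60]),  # evening-heavy, same contract code
    ("200", [0.08, 0.12, 0.20, 0.60]),  # evening-heavy
]
labels, _ = kmeans([profile for _, profile in customers], k=2)
for (code, _), label in zip(customers, labels):
    print("contract", code, "-> cluster", label)
```

The third toy customer shares a contract code with the first two but lands in the other cluster, which is the situation the abstract describes as degrading code-only classification.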

The Improved UCI Identifier Syntax for Convergence Digital Contents (융합 디지털콘텐츠에 적합한 UCI 식별자 구문구조 개선)

  • Kang, Sang-ug;Park, Sanghyun;Lim, Gyoo Gun
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.9 / pp.82-88 / 2016
  • The proposed new UCI syntax is compatible with the existing identifier and defines a fixed length for cases such as printable IDs, bar codes, and QR codes, which may improve the usability of the identifier itself. For compatibility, the identifiable metadata "key" is used for the existing UCI identifier, and a "UCI" metadata element is defined for the new UCI identifier. The new UCI identifier handles the resolution service and representation, while the old UCI identifier is used for internal DB management. The object code has two types, meaningless and meaningful. The meaningful type can be used according to the content classification standards of various fields such as comics, games, and advertising. The root agency of UCI can support the standardization activities.

The Development of Prefix/Suffix Code of DOI Syntax (DOI 구문 식별 코드 개발)

  • 김세정;안계성
    • Proceedings of the Korean Society for Information Management Conference / 2000.08a / pp.63-66 / 2000
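  • As part of a commercial service for DOI, a persistent and unique identification scheme for intellectual content provided on the Internet, this study developed the Prefix and Suffix identification codes that make up the DOI syntax. To do so, organizations currently distributing intellectual content on the Internet were surveyed and classified, and an identification code for content-holding organizations was developed. The structure and identification codes of the Suffix were developed based on an analysis of the attributes and types of intellectual content, and copyright identifiers for linking related content were also considered.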
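
For readers unfamiliar with the prefix/suffix structure this abstract refers to, here is a small sketch of splitting a DOI name into its two parts. It relies only on the general DOI convention that the prefix starts with `10.` and is separated from the suffix by the first forward slash; the helper name `split_doi` is hypothetical, and the code does not reproduce the specific prefix or suffix codes developed in the study.

```python
def split_doi(doi):
    """Split a DOI name into (prefix, suffix) at the first '/'.

    Hypothetical helper for illustration. The suffix may itself
    contain further slashes, so only the first one is a separator.
    """
    prefix, sep, suffix = doi.partition("/")
    if not sep or not suffix or not prefix.startswith("10."):
        raise ValueError(f"not a DOI name: {doi!r}")
    return prefix, suffix


prefix, suffix = split_doi("10.1000/182")
print("prefix:", prefix, "suffix:", suffix)
```

The prefix identifies the registrant (the content-holding organization in the abstract's terms), while the suffix is chosen by that registrant to identify the individual item.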

Research on text mining based malware analysis technology using string information (문자열 정보를 활용한 텍스트 마이닝 기반 악성코드 분석 기술 연구)

  • Ha, Ji-hee;Lee, Tae-jin
    • Journal of Internet Computing and Services / v.21 no.1 / pp.45-55 / 2020
  • With the development of information and communication technology, the number of new and variant malicious codes is increasing rapidly every year, and various types of malicious code are spreading with the growth of the Internet of Things and cloud computing. In this paper, we propose a malware analysis method based on string information, which can be used regardless of the operating system environment and which captures library call information related to malicious behavior. Attackers can easily create malware by reusing existing code or by using automated authoring tools, and the resulting malware operates much like existing malware. Since most of the strings that can be extracted from malicious code are closely related to malicious behavior, we weight them with a text mining based method to obtain effective features for malware analysis. Using the processed data, models are built with various machine learning algorithms to run experiments on malware detection and on classification of malware groups. The data were compared and verified against all files used on Windows and Linux operating systems. The detection accuracy is about 93.5%, and the group classification accuracy is about 90%. The proposed technique is widely applicable because it is relatively simple, fast, and operating system independent, and because a single model suffices: there is no need to build a model for each group when classifying malware groups. In addition, since the string information is extracted through static analysis, it can be processed faster than analysis methods that execute the code directly.
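
The abstract does not name the weighting scheme it uses, so the following is only a sketch of one common text-mining choice, TF-IDF, applied to strings extracted from files. The function name `tfidf`, the smoothed IDF formula, and the sample strings are assumptions for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Weight each string in each document by smoothed TF-IDF.

    docs: list of lists of strings (one inner list per analysed file).
    Returns one {string: weight} dict per document.
    """
    n = len(docs)
    # Document frequency: in how many files does each string occur?
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    weights = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        weights.append({
            # Term frequency times smoothed inverse document frequency.
            s: (count / total) * (math.log((1 + n) / (1 + df[s])) + 1)
            for s, count in tf.items()
        })
    return weights


# Made-up strings, as might be pulled from three binaries by static analysis.
files = [
    ["kernel32.dll", "CreateRemoteThread"],
    ["kernel32.dll", "printf"],
    ["kernel32.dll", "fopen"],
]
for w in tfidf(files):
    print(sorted(w.items(), key=lambda kv: -kv[1]))
```

Strings that occur in every file, such as a ubiquitous library name, receive a lower weight than strings that are specific to a few files, which is the property that makes such weights usable as classifier features.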

Stacked Autoencoder Based Malware Feature Refinement Technology Research (Stacked Autoencoder 기반 악성코드 Feature 정제 기술 연구)

  • Kim, Hong-bi;Lee, Tae-jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.4 / pp.593-603 / 2020
  • The amount of malicious code has increased exponentially as malicious code generation tools have spread along with the development of networks, and existing detection methods can no longer keep up. In response, machine learning based malicious code detection is evolving. In this paper, features are extracted from the PE header for machine learning based detection, and an autoencoder is then used to study how to extract features that characterize the malware, together with their feature importance. We extract 549 features, composed of information such as DLL/API that can be identified from PE files commonly used in malware analysis, and apply an autoencoder to the extracted features to improve the performance of machine learning based malware detection. By compressing the data, the autoencoder extracts its features effectively; it provided excellent accuracy and reduced the processing time by a factor of two. The test results were shown to be useful for classifying malware groups, and in future work a classifier such as SVM will be introduced to continue research toward more accurate malware detection.

A R&D strategies for development using structured association map (구조화된 연관맵을 이용한 연구개발 전략 수립)

  • Song, Wonho;Lee, Junseok;Park, Sangsung
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.3 / pp.190-195 / 2016
  • Technology develops continuously in a rapidly changing global market, and a company needs an appropriate R&D strategy to adapt to this environment. In other words, the technologies a company owns need to be analyzed thoroughly to improve its competitiveness. Technology classification using IPC codes has recently been used to do this objectively and quantitatively. The International Patent Classification (IPC) is an internationally specified classification system, so it supports objective and quantitative patent analysis of technology. In this study, all patents owned by company C are investigated, and a matrix representing the IPC codes of each patent is created. A structured association map of the patents is then built through association rule mining based on confidence. The association map can be used to inspect a company's current patent situation, and it allows highly associated technologies to be clustered. Using the association map, this study analyzes the technologies of company C and how they change over time. A strategy for future technologies is established based on the result.
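
To make the confidence-based rule mining concrete, the sketch below computes pairwise rules between IPC codes, using the standard definition confidence(A → B) = count(patents with A and B) / count(patents with A). The patent list, the thresholds, and the function name `association_rules` are assumptions for illustration; the paper's actual matrix and thresholds are not given in the abstract.

```python
from collections import Counter
from itertools import permutations


def association_rules(patents, min_conf=0.6, min_support=2):
    """Return {(A, B): confidence} for IPC code pairs A -> B.

    patents: list of IPC code lists, one per patent.
    A rule is kept if A and B co-occur in at least min_support patents
    and confidence(A -> B) is at least min_conf.
    """
    single = Counter()  # patents containing each code
    pair = Counter()    # patents containing each ordered pair of codes
    for codes in patents:
        unique = sorted(set(codes))
        single.update(unique)
        pair.update(permutations(unique, 2))

    rules = {}
    for (a, b), n_ab in pair.items():
        confidence = n_ab / single[a]
        if n_ab >= min_support and confidence >= min_conf:
            rules[(a, b)] = confidence
    return rules


# Made-up portfolio: each inner list holds the IPC subclasses of one patent.
portfolio = [
    ["G06F", "H04L"],
    ["G06F", "H04L"],
    ["G06F", "G06Q"],
    ["H04L"],
]
for (a, b), conf in sorted(association_rules(portfolio).items()):
    print(f"{a} -> {b}: confidence {conf:.2f}")
```

Each surviving rule can be read as a directed edge between two IPC codes, so the rule set as a whole forms the kind of graph that an association map visualizes.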

AutoML Machine Learning-Based for Detecting Qshing Attacks Malicious URL Classification Technology Research and Service Implementation (큐싱 공격 탐지를 위한 AutoML 머신러닝 기반 악성 URL 분류 기술 연구 및 서비스 구현)

  • Dong-Young Kim;Gi-Seong Hwang
    • Smart Media Journal / v.13 no.6 / pp.9-15 / 2024
  • 'Qshing' attacks, a hybrid form of phishing that exploits fake QR (Quick Response) codes impersonating government agencies to steal personal and financial information, have recently been increasing. The attack is notably stealthy: simply by scanning a QR code, victims can be redirected to phishing pages or led to download malicious software, which makes it difficult for them to realize they have been targeted. In this paper, we developed a classification technique that uses machine learning algorithms to identify malicious URLs embedded in QR codes, and we explored ways to integrate it with existing QR code readers. To this end, we constructed a dataset from 128,587 malicious URLs and 428,102 benign URLs, extracted 35 features such as protocol and parameters, and used AutoML to identify the optimal algorithm and hyperparameters, achieving an accuracy of approximately 87.37%. We then designed the integration of the trained classification model with existing QR code readers to implement a service capable of countering Qshing attacks. Our findings confirm that deriving an optimized algorithm for classifying malicious URLs in QR codes and integrating it with existing QR code readers is a viable way to combat Qshing attacks.
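
The abstract names only protocol and parameters among its 35 URL features, so the sketch below is an illustrative guess at what lexical URL feature extraction can look like, not the paper's feature set. The function name `url_features` and the extra features (URL length, IP-literal host, dot count) are assumptions; the example addresses come from reserved documentation ranges.

```python
import ipaddress
from urllib.parse import parse_qsl, urlsplit


def url_features(url):
    """Extract a few lexical features from a URL decoded from a QR code.

    Hypothetical feature set for illustration only.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    try:
        ipaddress.ip_address(host)
        host_is_ip = 1  # raw IP address used instead of a domain name
    except ValueError:
        host_is_ip = 0
    return {
        "uses_https": int(parts.scheme == "https"),  # protocol
        "num_params": len(parse_qsl(parts.query, keep_blank_values=True)),
        "url_length": len(url),
        "host_is_ip": host_is_ip,
        "num_host_dots": host.count("."),
    }


print(url_features("http://192.0.2.10/login?user=a&next=b"))
print(url_features("https://example.com/"))
```

A feature dictionary like this would be turned into a numeric row and passed to whichever model the AutoML search selects; a QR reader integration would call the extractor on the decoded URL before opening it.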

Malware API Classification Technology Using LSTM Deep Learning Algorithm (LSTM 딥러닝 알고리즘을 활용한 악성코드 API 분류 기술 연구)

  • Kim, Jinha;Park, Wonhyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.259-261 / 2022
  • Recent malicious code no longer relies on a single technique; several techniques are combined and merged, with only the important parts extracted. As new malicious codes are created and transformed, attack patterns and attack targets are both diversifying. In particular, the number of damage cases caused by malicious actions in corporate security is increasing over time. However, even when attackers combine several malicious codes, the APIs for each type of malicious code are used repeatedly, and the patterns and names of the APIs are likely to be similar. For this reason, this paper proposes a classification technique that finds patterns of APIs frequently used in malicious code, calculates the meaning and similarity of the APIs, and determines the level of risk.
