• Title/Summary/Keyword: Software classification

Classifying Malicious Web Pages by Using an Adaptive Support Vector Machine

  • Hwang, Young Sup;Kwon, Jin Baek;Moon, Jae Chan;Cho, Seong Je
    • Journal of Information Processing Systems
    • /
    • v.9 no.3
    • /
    • pp.395-404
    • /
    • 2013
  • In order to classify a web page as benign or malicious, we designed 14 basic and 16 extended features. The basic features were selected to represent the essential characteristics of a web page. The system heuristically combines two basic features into one extended feature in order to distinguish benign and malicious pages effectively. A support vector machine can be trained to classify pages successfully by using these features. Because more and more malicious web pages are appearing, and they change so rapidly, classifiers trained on old data may misclassify some new pages. To overcome this problem, we selected an adaptive support vector machine (aSVM) as the classifier. The aSVM can learn training data and can quickly learn additional training data based on the support vectors it obtained during its previous learning session. Experimental results verified that the aSVM can classify malicious web pages adaptively.

Road Surface Classification Using Weight-Based Clustering Algorithm (가중치 기반 클러스터링 기술을 이용한 도로표면 유형 분류 알고리즘)

  • Kim, Hyungmin;Song, Joongseok;Park, Jong-Il
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2014.11a
    • /
    • pp.146-149
    • /
    • 2014
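  • With the recent convergence of the automotive industry and IT, intelligent vehicles such as smart cars and autonomous (driverless) vehicles are being actively developed, along with vision-based technologies for them. Among the technologies that support active safety systems that account for ride comfort and that ensure stable autonomous driving, on paved roads such as highways as well as on unpaved roads such as gravel roads, determining the road type is one of the key elements. This paper therefore proposes an algorithm that classifies road surface types using a weight-based clustering technique. From road surface images of asphalt, gravel, dirt, and snow, feature values are extracted using the histogram distribution and peak position, the amount of edges in the edge image, and the saturation component, and clusters are formed. For an input road surface image to be classified, the feature values are analyzed, the distance to the vector of each selected cluster within the search range is measured to compute weights, and the cluster with the highest weight is chosen to determine the road surface of the input image. Experiments showed that the proposed method classifies road surfaces with an accuracy of about 91.25% using only the feature values of each road surface image and the weights derived from them.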
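
The weighting step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the weight formula, so inverse Euclidean distance is an assumption here, as are the feature values, the search radius, and the names `CLUSTERS` and `classify_surface`.

```python
import math
from collections import defaultdict

# Hypothetical 4-D feature vectors: (histogram spread, histogram peak
# position, edge amount, saturation), each scaled to [0, 1].
CLUSTERS = {
    "asphalt": [(0.20, 0.30, 0.10, 0.10), (0.25, 0.35, 0.15, 0.12)],
    "gravel":  [(0.60, 0.50, 0.80, 0.20), (0.55, 0.45, 0.75, 0.25)],
    "dirt":    [(0.50, 0.40, 0.40, 0.60), (0.45, 0.42, 0.35, 0.65)],
    "snow":    [(0.10, 0.90, 0.05, 0.05), (0.12, 0.85, 0.08, 0.04)],
}

def classify_surface(features, clusters=CLUSTERS, search_radius=0.5, eps=1e-9):
    """Return (label, weights); weights are inverse distances (an assumption)."""
    weights = defaultdict(float)
    for label, vectors in clusters.items():
        for vec in vectors:
            dist = math.dist(features, vec)
            if dist <= search_radius:
                # Only cluster vectors inside the search range contribute.
                weights[label] += 1.0 / (dist + eps)
    if not weights:
        return None, {}  # nothing within range: leave the image unclassified
    best = max(weights, key=weights.get)
    return best, dict(weights)

label, weights = classify_surface((0.58, 0.48, 0.78, 0.22))
```

Restricting the vote to vectors inside the search range keeps distant clusters from influencing the decision, which seems to be the intent of the "search range" in the abstract.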

A Sentiment Classification Approach of Sentences Clustering in Webcast Barrages

  • Li, Jun;Huang, Guimin;Zhou, Ya
    • Journal of Information Processing Systems
    • /
    • v.16 no.3
    • /
    • pp.718-732
    • /
    • 2020
  • Sentiment analysis and opinion mining are challenging tasks in natural language processing. Many sentiment analysis and opinion mining applications focus on product reviews, social media reviews, forums, and microblogs, whose reviews are topic-similar and opinion-rich. In this paper, we analyze the sentiments of sentences from online webcast reviews that scroll across the screen, which we call live barrages. In contrast to social media comments or product reviews, the topics in live barrages are more fragmented, and there are plenty of invalid comments that must be removed in the preprocessing phase. To extract evaluative sentiment sentences, we propose a novel approach that clusters the barrages from the same commenter, which addresses the scattering of information across individual barrages. The method contains two subtasks: in the data preprocessing phase, we cluster the sentences from the same commenter and remove unavailable sentences; then we use a semi-supervised machine learning approach, the naïve Bayes algorithm, to analyze the sentiment of the barrages. Our experimental results show that this method performs well in analyzing the sentiment of online webcast barrages.
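
The two subtasks in this abstract can be sketched as follows. This is a simplified illustration rather than the authors' method: it uses a plain supervised multinomial naïve Bayes with Laplace smoothing (the semi-supervised part is omitted), and the rule for dropping "unavailable" text (fewer than `min_tokens` words per commenter) is an assumption, as are all names and sample data.

```python
import math
from collections import Counter, defaultdict

def group_by_commenter(barrages, min_tokens=2):
    """Merge each commenter's barrages; drop groups that are too short (assumed filter)."""
    grouped = defaultdict(list)
    for user, text in barrages:
        grouped[user].extend(text.lower().split())
    return {u: toks for u, toks in grouped.items() if len(toks) >= min_tokens}

class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for toks, y in zip(docs, labels):
            self.word_counts[y].update(toks)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, toks):
        total = sum(self.class_counts.values())
        scores = {}
        for y, n in self.class_counts.items():
            denom = sum(self.word_counts[y].values()) + len(self.vocab)
            score = math.log(n / total)
            for w in toks:
                if w in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[y][w] + 1) / denom)
            scores[y] = score
        return max(scores, key=scores.get)

# Hypothetical labeled training data and live barrages.
train_docs = [["great", "stream", "love", "it"], ["boring", "bad", "stream"],
              ["love", "this", "great"], ["bad", "awful", "boring"]]
train_labels = ["pos", "neg", "pos", "neg"]
barrages = [("u1", "love"), ("u1", "great host"), ("u2", "lol"),
            ("u3", "so boring"), ("u3", "bad")]

clf = NaiveBayes().fit(train_docs, train_labels)
grouped = group_by_commenter(barrages)
predictions = {user: clf.predict(toks) for user, toks in grouped.items()}
```

Merging by commenter gives the classifier more words per decision than a single short barrage would, which is the motivation the abstract gives for the clustering step.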

Developing of New a Tensorflow Tutorial Model on Machine Learning : Focusing on the Kaggle Titanic Dataset (텐서플로우 튜토리얼 방식의 머신러닝 신규 모델 개발 : 캐글 타이타닉 데이터 셋을 중심으로)

  • Kim, Dong Gil;Park, Yong-Soon;Park, Lae-Jeong;Chung, Tae-Yun
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.14 no.4
    • /
    • pp.207-218
    • /
    • 2019
  • The purpose of this study is to develop a model with which the whole learning process of machine learning can be studied systematically. Because the existing model describes the learning process with minimal coding, the new model lets learners follow the stages of machine learning sequentially and visualize each stage using TensorFlow. The new model used all of the algorithms of the existing model and confirmed the importance of the variables that affect the target variable, survival. The training data were split into training and validation sets, and the performance of the model was evaluated with test data. In the final analysis, ensemble techniques showed high performance across the tutorial models, and the maximum performance of the model improved by up to 5.2% compared with the existing model. Future research needs to construct an environment in which machine learning can be learned regardless of the data preprocessing method and OS, and in which a model that outperforms the existing one can be trained.

Generating Test Data for Deep Neural Network Model using Synonym Replacement (동의어 치환을 이용한 심층 신경망 모델의 테스트 데이터 생성)

  • Lee, Min-soo;Lee, Chan-gun
    • Journal of Software Engineering Society
    • /
    • v.28 no.1
    • /
    • pp.23-28
    • /
    • 2019
  • Recently, in order to test deep neural network models for image processing applications effectively, research has been actively conducted on automatically generating corner-case data that the model does not predict correctly. This paper proposes a test data generation method that selects arbitrary words from the system input and replaces them with synonyms, in order to test an automatic bug reporter assignment system based on a sentence classification deep neural network model. In addition, we compare and evaluate the proposed test data generation method against existing difference-inducing test data generation methods based on various neuron coverages.
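
A minimal sketch of synonym-replacement test generation is shown below. The synonym table, the function names `generate_variants` and `find_failures`, and the stand-in model are all hypothetical; the paper's actual synonym source and model are not described in the abstract.

```python
import random

# Hypothetical synonym table; a real setup might draw on a thesaurus.
SYNONYMS = {
    "crash": ["failure", "abort"],
    "button": ["control"],
    "slow": ["sluggish", "laggy"],
}

def generate_variants(sentence, n_variants=3, n_replace=1, seed=0):
    """Create variants by replacing randomly chosen words with synonyms."""
    rng = random.Random(seed)
    words = sentence.split()
    candidates = [i for i, w in enumerate(words) if w.lower() in SYNONYMS]
    if not candidates:
        return []
    variants = []
    for _ in range(n_variants):
        new_words = list(words)
        for i in rng.sample(candidates, min(n_replace, len(candidates))):
            new_words[i] = rng.choice(SYNONYMS[words[i].lower()])
        variants.append(" ".join(new_words))
    return variants

def find_failures(model, sentence, variants):
    """Return variants whose prediction differs from the original's."""
    expected = model(sentence)
    return [v for v in variants if model(v) != expected]

# Stand-in for a sentence classifier that assigns a bug report to a team.
toy_model = lambda s: "ui-team" if "button" in s else "core-team"

sentence = "App crash when clicking the slow button"
variants = generate_variants(sentence)
failures = find_failures(toy_model, sentence, variants)
```

Because a synonym should preserve meaning, any variant in `failures` is a candidate corner case: the input means the same thing but the model's output changed.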

Quality Analysis of the Request for Proposals of Public Information Systems Project : System Operational Concept (공공정보화사업 제안요청서 품질분석 : 시스템 운영 개념을 중심으로)

  • Park, Sanghwi;Kim, Byungcho
    • Journal of Information Technology Services
    • /
    • v.18 no.2
    • /
    • pp.37-54
    • /
    • 2019
  • The purpose of this study is to present an evaluation model that measures how clearly stakeholder requirements are specified in public sector software projects in the Republic of Korea. We used the evaluation model to assess the quality of requests for proposals (RFPs) and also examined the impact of the level of stakeholder requirements on the level of system requirements. To do this, we analyzed existing research models and standards related to business requirements and stakeholder requirements, and constructed an evaluation model for the system operational concept document in ISO/IEC/IEEE 29148. The system operational concept document organizes the requirements of stakeholders in the organization and communicates the organization's intentions. The evaluation model proposed in this study focuses on evaluating whether the contents related to the system operational concept are faithfully written in the request for proposal. The evaluation consisted of three items: 'organization status', 'desired changes', and 'operational constraints'. The sample consisted of 217 RFPs extracted from the national procurement system. The analysis showed that the evaluation model was valid and internally consistent. The level of the system operational concept was very low, and it was also found to affect the quality of system requirements. Writing stakeholders' requirements clearly is more important than writing the functional requirements.

Towards Effective Analysis and Tracking of Mozilla and Eclipse Defects using Machine Learning Models based on Bugs Data

  • Hassan, Zohaib;Iqbal, Naeem;Zaman, Abnash
    • Soft Computing and Machine Intelligence
    • /
    • v.1 no.1
    • /
    • pp.1-10
    • /
    • 2021
  • Analysis and tracking of bug reports is a challenging field in software repository mining. It is one of the fundamental ways to explore the large amounts of data acquired from defect tracking systems in order to discover patterns and valuable knowledge about the bug triaging process. Bug data is publicly accessible from systems such as Bugzilla and JIRA. Moreover, with robust machine learning (ML) techniques, it is possible to process and analyze massive amounts of data to extract underlying patterns, knowledge, and insights. It is therefore an interesting area in which to propose innovative and robust solutions for analyzing and tracking bug reports originating from different open source projects, including Mozilla and Eclipse. This study presents an ML-based classification model to analyze and track bug defects for enhancing software engineering management (SEM) processes. In this work, Artificial Neural Network (ANN) and Naive Bayesian (NB) classifiers are implemented using open-source bug datasets from Mozilla and Eclipse. Different evaluation measures are employed to analyze and evaluate the experimental results, and a comparative analysis of ANN and NB is given. The experimental results indicate that the ANN achieved higher accuracy than the NB. The proposed study will enhance SEM processes and contribute to the body of knowledge of the data mining field.

2-stage Classification Model of vehicles based on CNN Algorithm (CNN 알고리즘 기반 2단계 차종 분류 모델)

  • Kim, Han-Kyum;Ahn, Yoo-Lim;Yoon, Seong-Ho;Lee, Young-Jae;Lee, Young-Heung;Lee, Weon-June;Kim, Hyun-Min;Kim, Young-Ok
    • Annual Conference of KIPS
    • /
    • 2021.11a
    • /
    • pp.791-794
    • /
    • 2021
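  • Research on vehicle-related visual intelligence, such as criminal vehicle identification systems and intelligent CCTV, is attracting great interest. Among these technologies, vehicle classification is a core technology for recognizing a specific vehicle. Existing studies either classify only into broad vehicle categories or support few vehicle models with low accuracy, which limits their practicality and reliability. This paper therefore proposes a two-stage vehicle model classification algorithm that can classify vehicle models accurately. Based on models trained with a CNN, the proposed system first classifies the vehicle type and then classifies the exact vehicle model. In experiments classifying 52 vehicle models, it achieved a classification accuracy of 90.2%, which is 5.3 percentage points higher than a single classification model. This enables more accurate vehicle model classification.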
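
The two-stage structure can be sketched as a coarse classifier that routes each image to a type-specific fine classifier. This is only a structural illustration: the CNNs are replaced by toy stand-in functions, and every name, label, and threshold below is hypothetical.

```python
from typing import Any, Callable, Dict, Tuple

Classifier = Callable[[Any], str]

def two_stage_classify(image: Any,
                       type_model: Classifier,
                       model_by_type: Dict[str, Classifier]) -> Tuple[str, str]:
    """Stage 1 predicts the vehicle type; stage 2 predicts the model within that type."""
    vehicle_type = type_model(image)
    fine_model = model_by_type[vehicle_type]
    return vehicle_type, fine_model(image)

# Toy stand-ins for trained CNNs; an "image" here is just a dict of attributes.
def toy_type_model(image):
    return image["shape"]

toy_fine_models = {
    "sedan": lambda img: "sedan_A" if img["length_m"] < 4.8 else "sedan_B",
    "suv": lambda img: "suv_A",
}

result = two_stage_classify({"shape": "sedan", "length_m": 4.5},
                            toy_type_model, toy_fine_models)
```

Each second-stage classifier only has to separate models within one vehicle type, which is a smaller problem than distinguishing all 52 models at once.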

Feature Ensemble-based Wolff Parkinson White Syndrome classification through ECG (ECG를 통한 Feature Ensemble 기반 Wolff Parkinson White 증후군 분류)

  • Gyutae Oh;Inki Kim;Beomjun Kim;Younghoon Jeon;Jeonghwan Gwak
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2023.01a
    • /
    • pp.169-171
    • /
    • 2023
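  • Wolff Parkinson White Syndrome (WPW) is a condition in which, unlike in the general population, a congenital accessory pathway exists between the atria and ventricles and stimulates the ventricles faster than normal conduction does, causing arrhythmia. Although arrhythmia is the main symptom of WPW, the condition is often asymptomatic and can appear suddenly in adulthood, so many patients live without being aware of it. For this reason, early detection and treatment of WPW to prevent risk in advance is very important in occupations such as truck drivers and doctors, where a sudden deterioration in health can endanger other people's lives. This paper therefore proposes a Feature Ensemble-based deep learning framework for automatically classifying WPW from Electrocardiogram (ECG) data. The proposed method improves on methods using a single 1D-CNN and GRU in terms of F1-Score and Accuracy, which shows that it is suitable for this task.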

Development of deep learning-based rock classifier for elementary, middle and high school education (초중고 교육을 위한 딥러닝 기반 암석 분류기 개발)

  • Park, Jina;Yong, Hwan-Seung
    • Journal of Software Assessment and Valuation
    • /
    • v.15 no.1
    • /
    • pp.63-70
    • /
    • 2019
  • As interest in image recognition with deep learning grows, a great deal of research on the topic has been carried out. In this study, we propose a system that classifies rocks from images of 18 types of rock (6 igneous, 6 metamorphic, and 6 sedimentary) covered in the high school curriculum, using a CNN model based on TensorFlow, an open source deep learning framework. We developed a classifier that distinguishes rocks by learning from rock images and confirmed its classification performance. Finally, through the mobile application we implemented, students can use the classifier as a learning tool in the classroom or during on-site experience.