• Title/Summary/Keyword: 클래스 분류 (class classification)


Refining Rules of Decision Tree Using Extended Data Expression (확장형 데이터 표현을 이용하는 이진트리의 룰 개선)

  • Jeon, Hae Sook;Lee, Won Don
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.6 / pp.1283-1293 / 2014
  • In ubiquitous environments, data change rapidly and new data arrive as time passes. Past data may be lost entirely when memory is insufficient. It is therefore necessary to extract rules and combine them with new data, both to avoid losing all past data and to handle large amounts of data. When decision trees are built and rules are extracted, the weight of each rule is generally determined by the total number of instances of the class at the leaf. The computational problem of finding a minimum finite state acceptor compatible with given data is NP-hard. We assume that the extracted rules are not fully correct and may have lost some information. Given this premise, this paper presents a new approach to refining rules that controls the weights of rules derived from previous knowledge or data. To refine rules, the paper generates a variety of rules using a pruning method with majority and minority properties, controls the weight of each rule, and observes the change in performance. A decision tree classifier with extended data expression and static weights is used for the proposed study. Experiments show that the new rule-refinement policy can improve performance.
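The weighting idea in this abstract can be illustrated with a minimal sketch. It assumes that a rule's weight is the number of training instances of its predicted class at the leaf, and that "controlling" the weight means multiplying it by a scale factor; the `Rule` class, its fields, and the `scale` parameter are hypothetical names, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """One rule extracted from a decision-tree leaf (hypothetical structure)."""
    conditions: tuple   # e.g. (("temp", "<=", 20),); illustrative only
    class_counts: dict  # class label -> number of training instances at the leaf

    def predicted_class(self):
        # The class with the most instances at the leaf.
        return max(self.class_counts, key=self.class_counts.get)

    def weight(self, scale=1.0):
        # Assumed weighting: count of the predicted class, scaled so that
        # rules from older knowledge can be emphasised or damped.
        return scale * self.class_counts[self.predicted_class()]
```

With this sketch, lowering `scale` for rules built from past data reduces their influence when they are combined with rules learned from new data.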

Predicting Power Generation Patterns Using the Wind Power Data (풍력 데이터를 이용한 발전 패턴 예측)

  • Suh, Dong-Hyok;Kim, Kyu-Ik;Kim, Kwang-Deuk;Ryu, Keun-Ho
    • Journal of the Korea Society of Computer and Information / v.16 no.11 / pp.245-253 / 2011
  • Imprudent consumption of fossil fuels has seriously contaminated the environment, and the depletion of fossil fuels has become a major concern. As a result, interest is growing in alternative energy resources that can solve these problems. Wind power is one of the most closely watched new and renewable energy sources. However, both wind power plants and traditional power plants must keep power generation balanced with power consumption. Analysis and prediction are therefore needed to generate power efficiently from wind energy. In this paper, we predict power generation patterns using wind power data. Prediction approaches from data mining can be used to build the prediction model. The research steps are as follows: 1) We preprocessed the data to handle missing values and anomalous data, and extracted the characteristic vector data. 2) We found the representative patterns by applying the MIA (Mean Index Adequacy) measure and SOM (Self-Organizing Feature Map) clustering to the normalized dataset, and assigned a class label to each data item. 3) We built a new prediction model for wind power generation using a classification approach. In the experiment, we built a forecasting model that predicts wind power generation patterns using a decision tree.

A Personal Digital Library on a Distributed Mobile Multiagents Platform (분산 모바일 멀티에이전트 플랫폼을 이용한 사용자 기반 디지털 라이브러리 구축)

  • Cho Young Im
    • Journal of KIISE:Software and Applications / v.31 no.12 / pp.1637-1648 / 2004
  • When digital libraries are developed on a traditional client/server system using a single agent in a distributed environment, several problems occur. First, because the search method is one-dimensional, the search results have little relationship to each other. Second, the results do not reflect the user's preferences. Third, every time a client connects to the server, the user has to be certified. As a result, document retrieval is less efficient, which causes dissatisfaction with the system. I propose a new mobile multiagent platform for a personal digital library to overcome these problems. To develop this platform, I combine the existing DECAF multiagent platform with the Voyager mobile ORB and propose a new negotiation algorithm and a new scheduling algorithm. Although there has been some research on personal digital libraries, I believe there have been few studies on their integration and systemization. For searches of related information, the proposed platform can strengthen the relationship among search results by subdividing the related documents, which are classified by a supervised neural network. For user preferences, the search results are optimized by applying modular clients to a neural network. Combining a mobile platform with a multiagent platform yields a new mobile multiagent platform that reduces the network burden. Furthermore, a new negotiation algorithm and a new scheduling algorithm are used to make PDS more effective. The simulation results demonstrate that as the number of servers and agents increases, the search time for PDS decreases, while the degree of user satisfaction is four times greater than with the C/S model.

A Research on Network Intrusion Detection based on Discrete Preprocessing Method and Convolution Neural Network (이산화 전처리 방식 및 컨볼루션 신경망을 활용한 네트워크 침입 탐지에 대한 연구)

  • Yoo, JiHoon;Min, Byeongjun;Kim, Sangsoo;Shin, Dongil;Shin, Dongkyoo
    • Journal of Internet Computing and Services / v.22 no.2 / pp.29-39 / 2021
  • As damage to individuals, the private sector, and businesses increases due to newly emerging cyber attacks, network security has become a major problem in computer systems. NIDS based on machine learning and deep learning is therefore being studied to overcome the limitations of the existing Network Intrusion Detection System. This study investigates a deep learning-based NIDS model using the Convolution Neural Network (CNN) algorithm. To train the image classification-based CNN algorithm, a discretization algorithm for continuous variables was added to the previously used preprocessing stage, and the predicted variables were expressed in a linear relationship and converted into easy-to-interpret data. Finally, each network packet processed in this way is mapped to a square matrix structure and converted into a pixel image. The proposed model was evaluated on NSL-KDD, a representative network packet dataset, with accuracy, precision, recall, and F1-score as performance indicators. In the experiment, the proposed model showed the highest performance with an accuracy of 85%, and the harmonic mean (F1-score) for the R2L class, which has a small number of training samples, was 71%, a very good result compared with other models.
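The preprocessing described in this abstract (discretize continuous features, then lay the record out as a square matrix) can be sketched as follows. The abstract does not name the binning method or the padding rule, so equal-width binning and zero padding are assumptions here, and the function names are hypothetical.

```python
import math


def equal_width_bins(values, n_bins=10):
    """Map continuous values to integer bin indices in [0, n_bins - 1].

    Equal-width binning is an assumed choice; the paper's exact
    discretization algorithm is not given in the abstract.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant feature: everything in one bin
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def to_square_matrix(features):
    """Lay one record's features out row by row in the smallest square matrix.

    Unused cells are zero-padded (an assumption).
    """
    side = math.ceil(math.sqrt(len(features)))
    padded = list(features) + [0] * (side * side - len(features))
    return [padded[r * side:(r + 1) * side] for r in range(side)]
```

Each cell of the resulting matrix can then be scaled to a pixel intensity before the matrix is passed to an image classifier.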

Prediction of Land Surface Temperature by Land Cover Type in Urban Area (도시지역에서 토지피복 유형별 지표면 온도 예측 분석)

  • Kim, Geunhan
    • Korean Journal of Remote Sensing / v.37 no.6_3 / pp.1975-1984 / 2021
  • Urban expansion raises the temperature in the city, which can cause social, economic, and physical damage. To prevent the urban heat island effect and reduce urban land surface temperature, it is important to quantify the cooling effect of the features of urban space. To understand the relationship between each land cover object and the land surface temperature in Seoul, the land cover map was classified into 6 classes. Correlation and multiple regression analyses were then performed between land surface temperature and object area, perimeter/area ratio, and the normalized difference vegetation index. The analysis showed that the normalized difference vegetation index was highly correlated with land surface temperature. In the multiple regression analysis, the normalized difference vegetation index also had a greater influence on the land surface temperature prediction than the other variables. However, the explanatory power of the models derived from the multiple regression analysis was low. In the future, continuous monitoring using high-resolution MIR imagery from KOMPSAT-3A should make it possible to improve the explanatory power of the model. Using the relationship between land surface temperature and the various land cover types, including the vegetation vitality of green areas, in urban planning is expected to help reduce land surface temperature in urban spaces.
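The abstract does not spell out the vegetation index. Its standard definition is NDVI = (NIR - Red) / (NIR + Red), computed from near-infrared and red reflectance. The sketch below pairs that formula with a plain Pearson correlation, as one way the NDVI and land surface temperature relationship could be checked. The function names are hypothetical, and the zero-denominator convention is an assumption.

```python
import math


def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel or object."""
    total = nir + red
    # Returning 0.0 when both bands are zero is an assumed convention.
    return 0.0 if total == 0 else (nir - red) / total


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

Calling `pearson` on per-object NDVI values and the matching land surface temperatures gives the kind of correlation the abstract reports.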

IPC Code Based Analysis of Technology Convergence of the IoT Patents in South Korea, China, and Japan : Focusing on PCT International Applications (한중일 사물인터넷(IoT) 관련 특허의 IPC 코드 기반 기술융복합 분석 : PCT 국제출원을 중심으로)

  • Shim, Jaeruen
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.7 / pp.949-955 / 2020
  • In this study, a social network analysis of IoT-related patents in South Korea, China, and Japan was conducted from the viewpoint of patent informatics. To this end, 2,526 patents filed through the PCT up to December 2019 were investigated down to the subclass level of the IPC code. In South Korea, the representative IPC codes are, in order, G06Q, H04L, G06F, and H04W, and the most frequent interconnections are H04L→H04W, H04W→H04L, and H04W→H04B. In China, the representative IPC codes are, in order, H04L, H04W, G05B, and G06Q. South Korea shows strong technological convergence centered on G06Q, while China shows strong convergence centered on H04L and H04W. Moreover, in China, H04L and H04W form more diverse combinations than in South Korea across Sections A, B, G, and H. Future work should study the diversity of technology convergence involving H04L and H04W in China.
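Counting subclass-level interconnections such as H04L→H04W can be sketched as follows. An IPC subclass is the first four characters of a code (section letter, two-digit class, subclass letter). The abstract does not say how link direction is defined, so this sketch assumes a link runs from a patent's first-listed code to each of its other codes; the function names are hypothetical.

```python
from collections import Counter


def subclass(ipc_code):
    """Return the subclass part of an IPC code, e.g. 'H04L 29/08' -> 'H04L'."""
    return ipc_code.replace(" ", "")[:4].upper()


def directed_links(patents):
    """Count directed subclass links over a list of per-patent IPC code lists.

    Direction is assumed to run from the first-listed subclass to each
    other distinct subclass on the same patent.
    """
    links = Counter()
    for codes in patents:
        subclasses = []
        for code in codes:
            sub = subclass(code)
            if sub not in subclasses:  # keep first-seen order, drop repeats
                subclasses.append(sub)
        if not subclasses:
            continue
        main = subclasses[0]
        for other in subclasses[1:]:
            links[(main, other)] += 1
    return links
```

`directed_links(...).most_common()` then ranks the interconnections by frequency, which is the form of result the abstract reports.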

A hybrid intrusion detection system based on CBA and OCSVM for unknown threat detection (알려지지 않은 위협 탐지를 위한 CBA와 OCSVM 기반 하이브리드 침입 탐지 시스템)

  • Shin, Gun-Yoon;Kim, Dong-Wook;Yun, Jiyoung;Kim, Sang-Soo;Han, Myung-Mook
    • Journal of Internet Computing and Services / v.22 no.3 / pp.27-35 / 2021
  • With the development of the Internet, various IT technologies such as IoT and cloud computing have emerged, and countries and companies have built a wide range of systems. Because these systems generate and share vast amounts of data, they need systems that can detect threats in order to protect the critical data they contain, and such systems have been actively studied to date. Typical techniques include anomaly detection and misuse detection, which detect threats that are already known or that exhibit behavior different from normal. However, as IT technology advances, so do the technologies that threaten systems and the methods needed to detect them. Advanced Persistent Threat (APT) attacks target national or corporate systems to steal important information and carry out attacks such as system shutdowns. These threats use previously unknown malware and attack technologies. Therefore, in this paper, we propose a hybrid intrusion detection system that combines anomaly detection and misuse detection to detect unknown threats. The two detection techniques together enable the detection of both known and unknown threats, and applying machine learning makes threat detection more accurate. For misuse detection, we applied Classification Based on Association Rule (CBA) to generate rules for known threats, and for anomaly detection, we used One-Class SVM (OCSVM) to detect unknown threats. Experiments show that the detection accuracy for unknown threats is about 94%, confirming that unknown threats can be detected.

The Detection of Online Manipulated Reviews Using Machine Learning and GPT-3 (기계학습과 GPT3를 사용한 조작된 리뷰의 탐지)

  • Chernyaeva, Olga;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.347-364 / 2022
  • Fraudulent companies or sellers strategically manipulate reviews to influence customers' purchase decisions, so the reliability of reviews has become crucial for customer decision-making. Because customers increasingly rely on online reviews to find detailed information about products or services before purchasing, many researchers focus on detecting manipulated reviews. However, the main obstacle is the difficulty of obtaining enough manipulated reviews to train machine learning techniques. In addition, manipulated reviews are far fewer than non-manipulated reviews, so the class imbalance problem occurs. The class with fewer examples is under-represented and can hamper a model's accuracy, so solving the class imbalance problem is important for building an accurate model for detecting manipulated reviews. We therefore propose an OpenAI-based review generation model to solve the imbalance in manipulated reviews and thereby enhance the accuracy of manipulated review detection. In this research, we applied the novel autoregressive language model GPT-3 to generate reviews based on manipulated reviews. We found that using the GPT-3 model to oversample manipulated reviews recovers a satisfactory portion of the performance loss and yields better classification performance (logit, decision tree, neural networks) than traditional oversampling methods such as random oversampling and SMOTE.
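Random oversampling, one of the baselines named in this abstract, duplicates minority-class examples until the classes are balanced. The sketch below is a generic illustration of that baseline, not the authors' code; the function name and the `seed` parameter are hypothetical. The GPT-3 approach described above would replace the duplication step with newly generated reviews.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        # Candidates for duplication: original samples of this class only.
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels
```

Oversampling should be applied to the training split only, so that duplicated reviews do not leak into the evaluation data.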

A Study on Mapping Relations between eBook Contents for Conversion (전자책 문서 변환을 위한 컨텐츠 대응 관계에 관한 연구)

  • 고승규;임순범;김성혁;최윤철
    • The Journal of Society for e-Business Studies / v.8 no.2 / pp.99-111 / 2003
  • Thanks to the diverse advantages of digital media, eBooks are beginning to come into use, and many market research agencies have predicted that the market will expand greatly in the near future. Contrary to those expectations, copyright-related problems and access difficulties caused by the variety of eBook content formats have become obstacles to their diffusion. The first problem can be solved by DRM technology. To solve the second, each nation has published its own standard content format. However, domestic standards are useful only at the domestic level and leave the problem unresolved between nations. The variety of content formats has created a demand for mechanisms that allow the exchange of eBook contents. We therefore study the mapping relations between eBook contents for conversion. To define the mapping relations, we first extract the mappings both between eBook contents and between normal XML documents. From those mappings, we define seven mapping relations and classify them by cardinality. We then analyze whether each classified relation can be generated automatically. Using these results, we also classify eBook content conversion as automatic, semi-automatic, or manual. In addition, we provide conversion templates for the mapping relations so that conversion scripts can be generated automatically. To show the feasibility of the conversion templates, we apply them to eBook content conversion. The experiment shows that our conversion templates generate the conversion scripts properly. We expect that the defined mapping relations and conversion templates can be used not only in eBook content conversion but also in normal XML document conversion.
