• Title/Summary/Keyword: One-hot Encoding

Search Result 21, Processing Time 0.024 seconds

A Study on the traffic flow prediction through Catboost algorithm (Catboost 알고리즘을 통한 교통흐름 예측에 관한 연구)

  • Cheon, Min Jong;Choi, Hye Jin;Park, Ji Woong;Choi, HaYoung;Lee, Dong Hee;Lee, Ook
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.22 no.3
    • /
    • pp.58-64
    • /
    • 2021
  • As the number of registered vehicles increases, traffic congestion will worsen worse, which may act as an inhibitory factor for urban social and economic development. Through accurate traffic flow prediction, various AI techniques have been used to prevent traffic congestion. This paper uses the data from a VDS (Vehicle Detection System) as input variables. This study predicted traffic flow in five levels (free flow, somewhat delayed, delayed, somewhat congested, and congested), rather than predicting traffic flow in two levels (free flow and congested). The Catboost model, which is a machine-learning algorithm, was used in this study. This model predicts traffic flow in five levels and compares and analyzes the accuracy of the prediction with other algorithms. In addition, the preprocessed model that went through RandomizedSerachCv and One-Hot Encoding was compared with the naive one. As a result, the Catboost model without any hyper-parameter showed the highest accuracy of 93%. Overall, the Catboost model analyzes and predicts a large number of categorical traffic data better than any other machine learning and deep learning models, and the initial set parameters are optimized for Catboost.

Ensemble of Degraded Artificial Intelligence Modules Against Adversarial Attacks on Neural Networks

  • Sutanto, Richard Evan;Lee, Sukho
    • Journal of information and communication convergence engineering
    • /
    • v.16 no.3
    • /
    • pp.148-152
    • /
    • 2018
  • Adversarial attacks on artificial intelligence (AI) systems use adversarial examples to achieve the attack objective. Adversarial examples consist of slightly changed test data, causing AI systems to make false decisions on these examples. When used as a tool for attacking AI systems, this can lead to disastrous results. In this paper, we propose an ensemble of degraded convolutional neural network (CNN) modules, which is more robust to adversarial attacks than conventional CNNs. Each module is trained on degraded images. During testing, images are degraded using various degradation methods, and a final decision is made utilizing a one-hot encoding vector that is obtained by summing up all the output vectors of the modules. Experimental results show that the proposed ensemble network is more resilient to adversarial attacks than conventional networks, while the accuracies for normal images are similar.

A Study on the Visualization of an Airline's Fleet State Variation (항공사 기단의 상태변화 시각화에 관한 연구)

  • Lee, Yonghwa;Lee, Juhwan;Lee, Keumjin
    • Journal of the Korean Society for Aviation and Aeronautics
    • /
    • v.29 no.2
    • /
    • pp.84-93
    • /
    • 2021
  • Airline schedule is the most basic data for flight operations and has significant importance to an airline's management. It is crucial to know the airline's current schedule status in order to effectively manage the company and to be prepared for abnormal situations. In this study, machine learning techniques were applied to actual schedule data to examine the possibility of whether the airline's fleet state could be artificially learned without prior information. Given that the schedule is in categorical form, One Hot Encoding was applied and t-SNE was used to reduce the dimension of the data and visualize them to gain insights into the airline's overall fleet status. Interesting results were discovered from the experiments where the initial findings are expected to contribute to the fields of airline schedule health monitoring, anomaly detection, and disruption management.

Isolation and Characterization of Pathogen-Inducible Putative Zinc Finger DNA Binding Protein from Hot Pepper Capsicum annuum L.

  • Oh, Sang-Keun;Park, Jeong-Mee;Jung, Young-Hee;Lee, Sanghyeob;Kim, Soo-Yong;Eunsook Chung;Yi, So-Young;Kim, Young-Cheol;Seung, Eun-Soo
    • Proceedings of the Korean Society of Plant Pathology Conference
    • /
    • 2003.10a
    • /
    • pp.79.2-80
    • /
    • 2003
  • To better understand plant defense responses against pathogen attack, we identified the transcription factor-encoding genes in the hot pepper Capsicum annuum that show altered expression patterns during the hypersensitive response raised by challenge with bacterial pathogens. One of these genes, Ca1244, was characterized further. This gene encodes a plant-specific Type IIIA - zinc finger protein that contains two Cys$_2$His$_2$zinc fingers. Ca1244 expression is rapidly and specifically induced when pepper plants are challenged with bacterial pathogens to which they are resistant. In contrast, challenge with a pathogen to which the plants are susceptible only generates weak Ca1244 expression. Ca1244 expression is also strongly induced in pepper leaves by the exogenous application of ethephon, an ethylene releasing compound. Whereas, salicylic acid and methyl jasmonate had moderate effects. Pepper protoplasts expressing a Ca1244-smGFP fusion protein showed Ca1244 localizes in the nucleus. Transgenic tobacco plants overexpressing Ca1244 driven by the CaMV 355 promoter show increased resistance to challenge with a tobacco-specific bacterial pathogen. These plants also showed constitutive upregulation of the expression of multiple defense-related genes. These observations provide the first evidence that an Type IIIA - zinc finger protein, Ca1244, plays a crucial role in the activation of the pathogen defense response in plants.

  • PDF

Correcting Misclassified Image Features with Convolutional Coding

  • Mun, Ye-Ji;Kim, Nayoung;Lee, Jieun;Kang, Je-Won
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2018.11a
    • /
    • pp.11-14
    • /
    • 2018
  • The aim of this study is to rectify the misclassified image features and enhance the performance of image classification tasks by incorporating a channel- coding technique, widely used in telecommunication. Specifically, the proposed algorithm employs the error - correcting mechanism of convolutional coding combined with the convolutional neural networks (CNNs) that are the state - of- the- arts image classifier s. We develop an encoder and a decoder to employ the error - correcting capability of the convolutional coding. In the encoder, the label values of the image data are converted to convolutional codes that are used as target outputs of the CNN, and the network is trained to minimize the Euclidean distance between the target output codes and the actual output codes. In order to correct misclassified features, the outputs of the network are decoded through the trellis structure with Viterbi algorithm before determining the final prediction. This paper demonstrates that the proposed architecture advances the performance of the neural networks compared to the traditional one- hot encoding method.

  • PDF

Design of an Effective Deep Learning-Based Non-Profiling Side-Channel Analysis Model (효과적인 딥러닝 기반 비프로파일링 부채널 분석 모델 설계방안)

  • Han, JaeSeung;Sim, Bo-Yeon;Lim, Han-Seop;Kim, Ju-Hwan;Han, Dong-Guk
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.6
    • /
    • pp.1291-1300
    • /
    • 2020
  • Recently, a deep learning-based non-profiling side-channel analysis was proposed. The deep learning-based non-profiling analysis is a technique that trains a neural network model for all guessed keys and then finds the correct secret key through the difference in the training metrics. As the performance of non-profiling analysis varies greatly depending on the neural network training model design, a correct model design criterion is required. This paper describes the two types of loss functions and eight labeling methods used in the training model design. It predicts the analysis performance of each labeling method in terms of non-profiling analysis and power consumption model. Considering the characteristics of non-profiling analysis and the HW (Hamming Weight) power consumption model is assumed, we predict that the learning model applying the HW label without One-hot encoding and the Correlation Optimization (CO) loss will have the best analysis performance. And we performed actual analysis on three data sets that are Subbytes operation part of AES-128 1 round. We verified our prediction by non-profiling analyzing two data sets with a total 16 of MLP-based model, which we describe.

A Classification Model for Customs Clearance Inspection Results of Imported Aquatic Products Using Machine Learning Techniques (머신러닝 기법을 활용한 수입 수산물 통관검사결과 분류 모델)

  • Ji Seong Eom;Lee Kyung Hee;Wan-Sup Cho
    • The Journal of Bigdata
    • /
    • v.8 no.1
    • /
    • pp.157-165
    • /
    • 2023
  • Seafood is a major source of protein in many countries and its consumption is increasing. In Korea, consumption of seafood is increasing, but self-sufficiency rate is decreasing, and the importance of safety management is increasing as the amount of imported seafood increases. There are hundreds of species of aquatic products imported into Korea from over 110 countries, and there is a limit to relying only on the experience of inspectors for safety management of imported aquatic products. Based on the data, a model that can predict the customs inspection results of imported aquatic products is developed, and a machine learning classification model that determines the non-conformity of aquatic products when an import declaration is submitted is created. As a result of customs inspection of imported marine products, the nonconformity rate is less than 1%, which is very low imbalanced data. Therefore, a sampling method that can complement these characteristics was comparatively studied, and a preprocessing method that can interpret the classification result was applied. Among various machine learning-based classification models, Random Forest and XGBoost showed good performance. The model that predicts both compliance and non-conformance well as a result of the clearance inspection is the basic random forest model to which ADASYN and one-hot encoding are applied, and has an accuracy of 99.88%, precision of 99.87%, recall of 99.89%, and AUC of 99.88%. XGBoost is the most stable model with all indicators exceeding 90% regardless of oversampling and encoding type.

Deep Learning Model for Incomplete Data (불완전한 데이터를 위한 딥러닝 모델)

  • Lee, Jong Chan
    • Journal of the Korea Convergence Society
    • /
    • v.10 no.2
    • /
    • pp.1-6
    • /
    • 2019
  • The proposed model is developed to minimize the loss of information in incomplete data including missing data. The first step is to transform the learning data to compensate for the loss information using the data extension technique. In this conversion process, the attribute values of the data are filled with binary or probability values in one-hot encoding. Next, this conversion data is input to the deep learning model, where the number of entries is not constant depending on the cardinality of each attribute. Then, the entry values of each attribute are assigned to the respective input nodes, and learning proceeds. This is different from existing learning models, and has an unusual structure in which arbitrary attribute values are distributedly input to multiple nodes in the input layer. In order to evaluate the learning performance of the proposed model, various experiments are performed on the missing data and it shows that it is superior in terms of performance. The proposed model will be useful as an algorithm to minimize the loss in the ubiquitous environment.

Hyperparameter Optimization and Data Augmentation of Artificial Neural Networks for Prediction of Ammonia Emission Amount from Field-applied Manure (토양에 살포된 축산 분뇨로부터 암모니아 방출량 예측을 위한 인공신경망의 초매개변수 최적화와 데이터 증식)

  • Pyeong-Gon Jung;Young-Il Lim
    • Korean Chemical Engineering Research
    • /
    • v.61 no.1
    • /
    • pp.123-141
    • /
    • 2023
  • A sufficient amount of data with quality is needed for training artificial neural networks (ANNs). However, developing ANN models with a small amount of data often appears in engineering fields. This paper presented an ANN model to improve prediction performance of the ammonia emission amount with 83 data. The ammonia emission rate included eleven inputs and two outputs (maximum ammonia loss, Nmax and time to reach half of Nmax, Km). Categorical input variables were transformed into multi-dimensional equal-distance variables, and 13 data were added into 66 training data using a generative adversarial network. Hyperparameters (number of layers, number of neurons, and activation function) of ANN were optimized using Gaussian process. Using 17 test data, the previous ANN model (Lim et al., 2007) showed the mean absolute error (MAE) of Km and Nmax to 0.0668 and 0.1860, respectively. The present ANN outperformed the previous model, reducing MAE by 38% and 56%.

Development of deep learning structure for complex microbial incubator applying deep learning prediction result information (딥러닝 예측 결과 정보를 적용하는 복합 미생물 배양기를 위한 딥러닝 구조 개발)

  • Hong-Jik Kim;Won-Bog Lee;Seung-Ho Lee
    • Journal of IKEEE
    • /
    • v.27 no.1
    • /
    • pp.116-121
    • /
    • 2023
  • In this paper, we develop a deep learning structure for a complex microbial incubator that applies deep learning prediction result information. The proposed complex microbial incubator consists of pre-processing of complex microbial data, conversion of complex microbial data structure, design of deep learning network, learning of the designed deep learning network, and GUI development applied to the prototype. In the complex microbial data preprocessing, one-hot encoding is performed on the amount of molasses, nutrients, plant extract, salt, etc. required for microbial culture, and the maximum-minimum normalization method for the pH concentration measured as a result of the culture and the number of microbial cells to preprocess the data. In the complex microbial data structure conversion, the preprocessed data is converted into a graph structure by connecting the water temperature and the number of microbial cells, and then expressed as an adjacency matrix and attribute information to be used as input data for a deep learning network. In deep learning network design, complex microbial data is learned by designing a graph convolutional network specialized for graph structures. The designed deep learning network uses a cosine loss function to proceed with learning in the direction of minimizing the error that occurs during learning. GUI development applied to the prototype shows the target pH concentration (3.8 or less) and the number of cells (108 or more) of complex microorganisms in an order suitable for culturing according to the water temperature selected by the user. In order to evaluate the performance of the proposed microbial incubator, the results of experiments conducted by authorized testing institutes showed that the average pH was 3.7 and the number of cells of complex microorganisms was 1.7 × 108. Therefore, the effectiveness of the deep learning structure for the complex microbial incubator applying the deep learning prediction result information proposed in this paper was proven.