• Title/Summary/Keyword: Softmax function

Search Result 22, Processing Time 0.02 seconds

Improvement of Attack Traffic Classification Performance of Intrusion Detection Model Using the Characteristics of Softmax Function (소프트맥스 함수 특성을 활용한 침입탐지 모델의 공격 트래픽 분류성능 향상 방안)

  • Kim, Young-won;Lee, Soo-jin
    • Convergence Security Journal
    • /
    • v.20 no.4
    • /
    • pp.81-90
    • /
    • 2020
  • In the real world, new types of attacks or variants are constantly emerging, but attack traffic classification models developed through artificial neural networks and supervised learning do not properly detect new types of attacks that have not been trained. Most of the previous studies overlooked this problem and focused only on improving the structure of their artificial neural networks. As a result, a number of new attacks were frequently classified as normal traffic, and attack traffic classification performance was severly degraded. On the other hand, the softmax function, which outputs the probability that each class is correctly classified in the multi-class classification as a result, also has a significant impact on the classification performance because it fails to calculate the softmax score properly for a new type of attack traffic that has not been trained. In this paper, based on this characteristic of softmax function, we propose an efficient method to improve the classification performance against new types of attacks by classifying traffic with a probability below a certain level as attacks, and demonstrate the efficiency of our approach through experiments.

Design of Partial Discharge Pattern Classifier of Softmax Neural Networks Based on K-means Clustering : Comparative Studies and Analysis of Classifier Architecture (K-means 클러스터링 기반 소프트맥스 신경회로망 부분방전 패턴분류의 설계 : 분류기 구조의 비교연구 및 해석)

  • Jeong, Byeong-Jin;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.67 no.1
    • /
    • pp.114-123
    • /
    • 2018
  • This paper concerns a design and learning method of softmax function neural networks based on K-means clustering. The partial discharge data Information is preliminarily processed through simulation using an Epoxy Mica Coupling sensor and an internal Phase Resolved Partial Discharge Analysis algorithm. The obtained information is processed according to the characteristics of the pattern using a Motor Insulation Monitoring System program. At this time, the processed data are total 4 types that void discharge, corona discharge, surface discharge and slot discharge. The partial discharge data with high dimensional input variables are secondarily processed by principal component analysis method and reduced with keeping the characteristics of pattern as low dimensional input variables. And therefore, the pattern classifier processing speed exhibits improved effects. In addition, in the process of extracting the partial discharge data through the MIMS program, the magnitude of amplitude is divided into the maximum value and the average value, and two pattern characteristics are set and compared and analyzed. In the first half of the proposed partial discharge pattern classifier, the input and hidden layers are classified by using the K-means clustering method and the output of the hidden layer is obtained. In the latter part, the cross entropy error function is used for parameter learning between the hidden layer and the output layer. The final output layer is output as a normalized probability value between 0 and 1 using the softmax function. The advantage of using the softmax function is that it allows access and application of multiple class problems and stochastic interpretation. First of all, there is an advantage that one output value affects the remaining output value and its accompanying learning is accelerated. Also, to solve the overfitting problem, L2-normalization is applied. To prove the superiority of the proposed pattern classifier, we compare and analyze the classification rate with conventional radial basis function neural networks.

Long Short Term Memory based Political Polarity Analysis in Cyber Public Sphere

  • Kang, Hyeon;Kang, Dae-Ki
    • International Journal of Advanced Culture Technology
    • /
    • v.5 no.4
    • /
    • pp.57-62
    • /
    • 2017
  • In this paper, we applied long short term memory(LSTM) for classifying political polarity in cyber public sphere. The data collected from the cyber public sphere is transformed into word corpus data through word embedding. Based on this word corpus data, we train recurrent neural network (RNN) which is connected by LSTM's. Softmax function is applied at the output of the RNN. We conducted our proposed system to obtain experimental results, and we will enhance our proposed system by refining LSTM in our system.

Comparison of Gradient Descent for Deep Learning (딥러닝을 위한 경사하강법 비교)

  • Kang, Min-Jae
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.2
    • /
    • pp.189-194
    • /
    • 2020
  • This paper analyzes the gradient descent method, which is the one most used for learning neural networks. Learning means updating a parameter so the loss function is at its minimum. The loss function quantifies the difference between actual and predicted values. The gradient descent method uses the slope of the loss function to update the parameter to minimize error, and is currently used in libraries that provide the best deep learning algorithms. However, these algorithms are provided in the form of a black box, making it difficult to identify the advantages and disadvantages of various gradient descent methods. This paper analyzes the characteristics of the stochastic gradient descent method, the momentum method, the AdaGrad method, and the Adadelta method, which are currently used gradient descent methods. The experimental data used a modified National Institute of Standards and Technology (MNIST) data set that is widely used to verify neural networks. The hidden layer consists of two layers: the first with 500 neurons, and the second with 300. The activation function of the output layer is the softmax function, and the rectified linear unit function is used for the remaining input and hidden layers. The loss function uses cross-entropy error.

Multi-labeled Domain Detection Using CNN (CNN을 이용한 발화 주제 다중 분류)

  • Choi, Kyoungho;Kim, Kyungduk;Kim, Yonghe;Kang, Inho
    • 한국어정보학회:학술대회논문집
    • /
    • 2017.10a
    • /
    • pp.56-59
    • /
    • 2017
  • CNN(Convolutional Neural Network)을 이용하여 발화 주제 다중 분류 task를 multi-labeling 방법과, cluster 방법을 이용하여 수행하고, 각 방법론에 MSE(Mean Square Error), softmax cross-entropy, sigmoid cross-entropy를 적용하여 성능을 평가하였다. Network는 음절 단위로 tokenize하고, 품사정보를 각 token의 추가한 sequence와, Naver DB를 통하여 얻은 named entity 정보를 입력으로 사용한다. 실험결과 cluster 방법으로 문제를 변형하고, sigmoid를 output layer의 activation function으로 사용하고 cross entropy cost function을 이용하여 network를 학습시켰을 때 F1 0.9873으로 가장 좋은 성능을 보였다.

  • PDF

SIFT Image Feature Extraction based on Deep Learning (딥 러닝 기반의 SIFT 이미지 특징 추출)

  • Lee, Jae-Eun;Moon, Won-Jun;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of Broadcast Engineering
    • /
    • v.24 no.2
    • /
    • pp.234-242
    • /
    • 2019
  • In this paper, we propose a deep neural network which extracts SIFT feature points by determining whether the center pixel of a cropped image is a SIFT feature point. The data set of this network consists of a DIV2K dataset cut into $33{\times}33$ size and uses RGB image unlike SIFT which uses black and white image. The ground truth consists of the RobHess SIFT features extracted by setting the octave (scale) to 0, the sigma to 1.6, and the intervals to 3. Based on the VGG-16, we construct an increasingly deep network of 13 to 23 and 33 convolution layers, and experiment with changing the method of increasing the image scale. The result of using the sigmoid function as the activation function of the output layer is compared with the result using the softmax function. Experimental results show that the proposed network not only has more than 99% extraction accuracy but also has high extraction repeatability for distorted images.

Adaptive Obstacle Avoidance Algorithm using Classification of 2D LiDAR Data (2차원 라이다 센서 데이터 분류를 이용한 적응형 장애물 회피 알고리즘)

  • Lee, Nara;Kwon, Soonhwan;Ryu, Hyejeong
    • Journal of Sensor Science and Technology
    • /
    • v.29 no.5
    • /
    • pp.348-353
    • /
    • 2020
  • This paper presents an adaptive method to avoid obstacles in various environmental settings, using a two-dimensional (2D) LiDAR sensor for mobile robots. While the conventional reaction based smooth nearness diagram (SND) algorithms use a fixed safety distance criterion, the proposed algorithm autonomously changes the safety criterion considering the obstacle density around a robot. The fixed safety criterion for the whole SND obstacle avoidance process can induce inefficient motion controls in terms of the travel distance and action smoothness. We applied a multinomial logistic regression algorithm, softmax regression, to classify 2D LiDAR point clouds into seven obstacle structure classes. The trained model was used to recognize a current obstacle density situation using newly obtained 2D LiDAR data. Through the classification, the robot adaptively modifies the safety distance criterion according to the change in its environment. We experimentally verified that the motion controls generated by the proposed adaptive algorithm were smoother and more efficient compared to those of the conventional SND algorithms.

A Study on the Characteristics of a series of Autoencoder for Recognizing Numbers used in CAPTCHA (CAPTCHA에 사용되는 숫자데이터를 자동으로 판독하기 위한 Autoencoder 모델들의 특성 연구)

  • Jeon, Jae-seung;Moon, Jong-sub
    • Journal of Internet Computing and Services
    • /
    • v.18 no.6
    • /
    • pp.25-34
    • /
    • 2017
  • Autoencoder is a type of deep learning method where input layer and output layer are the same, and effectively extracts and restores characteristics of input vector using constraints of hidden layer. In this paper, we propose methods of Autoencoders to remove a natural background image which is a noise to the CAPTCHA and recover only a numerical images by applying various autoencoder models to a region where one number of CAPTCHA images and a natural background are mixed. The suitability of the reconstructed image is verified by using the softmax function with the output of the autoencoder as an input. And also, we compared the proposed methods with the other method and showed that our methods are superior than others.

Multi-labeled Domain Detection Using CNN (CNN을 이용한 발화 주제 다중 분류)

  • Choi, Kyoungho;Kim, Kyungduk;Kim, Yonghe;Kang, Inho
    • Annual Conference on Human and Language Technology
    • /
    • 2017.10a
    • /
    • pp.56-59
    • /
    • 2017
  • CNN(Convolutional Neural Network)을 이용하여 발화 주제 다중 분류 task를 multi-labeling 방법과, cluster 방법을 이용하여 수행하고, 각 방법론에 MSE(Mean Square Error), softmax cross-entropy, sigmoid cross-entropy를 적용하여 성능을 평가하였다. Network는 음절 단위로 tokenize하고, 품사정보를 각 token의 추가한 sequence와, Naver DB를 통하여 얻은 named entity 정보를 입력으로 사용한다. 실험결과 cluster 방법으로 문제를 변형하고, sigmoid를 output layer의 activation function으로 사용하고 cross entropy cost function을 이용하여 network를 학습시켰을 때 F1 0.9873으로 가장 좋은 성능을 보였다.

  • PDF

New Inference for a Multiclass Gaussian Process Classification Model using a Variational Bayesian EM Algorithm and Laplace Approximation

  • Cho, Wanhyun;Kim, Sangkyoon;Park, Soonyoung
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.4 no.4
    • /
    • pp.202-208
    • /
    • 2015
  • In this study, we propose a new inference algorithm for a multiclass Gaussian process classification model using a variational EM framework and the Laplace approximation (LA) technique. This is performed in two steps, called expectation and maximization. First, in the expectation step (E-step), using Bayes' theorem and the LA technique, we derive the approximate posterior distribution of the latent function, indicating the possibility that each observation belongs to a certain class in the Gaussian process classification model. In the maximization step, we compute the maximum likelihood estimators for hyper-parameters of a covariance matrix necessary to define the prior distribution of the latent function by using the posterior distribution derived in the E-step. These steps iteratively repeat until a convergence condition is satisfied. Moreover, we conducted the experiments by using synthetic data and Iris data in order to verify the performance of the proposed algorithm. Experimental results reveal that the proposed algorithm shows good performance on these datasets.