• Title/Summary/Keyword: Activation function

Performance Improvement Method of Fully Connected Neural Network Using Combined Parametric Activation Functions (결합된 파라메트릭 활성함수를 이용한 완전연결신경망의 성능 향상)

  • Ko, Young Min;Li, Peng Hang;Ko, Sun Woo
    • KIPS Transactions on Software and Data Engineering / v.11 no.1 / pp.1-10 / 2022
  • Deep neural networks are widely used to solve various problems. In a fully connected neural network, the nonlinear activation function transforms its input nonlinearly before passing it on. Nonlinear activation functions play an important role in solving nonlinear problems, and many have been studied. In this study, we propose a combined parametric activation function that can improve the performance of a fully connected neural network. A combined parametric activation function is created by simply adding parametric activation functions. A parametric activation function applies parameters that change the scale and location of the activation function according to the input data, and these parameters can be optimized in the direction that minimizes the loss function. Combining parametric activation functions creates more diverse nonlinear intervals, and the parameters of each component can likewise be optimized to minimize the loss function. The combined parametric activation function was tested on the MNIST and Fashion MNIST classification problems, where it performed better than the existing nonlinear activation functions and the parametric activation function.
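The idea of adding parametric activation functions can be sketched as follows. The abstract does not give the paper's exact parametrization, so the form used here, a sum of sigmoids with a per-component scale and location, and all function and parameter names are assumptions made for illustration only.

```python
import numpy as np


def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))


def parametric_sigmoid(x, scale, location):
    """Sigmoid with a scale and a location parameter (hypothetical form)."""
    return sigmoid(scale * (x - location))


def combined_parametric(x, scales, locations):
    """Sum of parametric sigmoids, one per (scale, location) pair."""
    return sum(parametric_sigmoid(x, s, c) for s, c in zip(scales, locations))


# Two components centred at -2 and +2 give two separate nonlinear intervals.
x = np.linspace(-6.0, 6.0, 7)
y = combined_parametric(x, scales=[1.0, 2.0], locations=[-2.0, 2.0])
```

In a trained network the scales and locations would be learned together with the weights; here they are fixed by hand to keep the sketch short.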

Comparative analysis of activation functions within reinforcement learning for autonomous vehicles merging onto highways

  • Dongcheul Lee;Janise McNair
    • International Journal of Internet, Broadcasting and Communication / v.16 no.1 / pp.63-71 / 2024
  • Deep reinforcement learning (RL) significantly influences autonomous vehicle development by optimizing decision-making and adaptation to complex driving environments through simulation-based training. Deep RL relies on an activation function, and although various activation functions have been proposed, their performance varies greatly with the application environment. Finding the best activation function for a given environment is therefore important for effective learning. In this paper, we analyzed nine activation functions commonly used in RL to evaluate which is most effective when deep RL is used to teach autonomous vehicles highway merging. To do this, we built a performance evaluation environment and compared the average reward of each activation function. The results showed that the highest reward was achieved using Mish, and the lowest using SELU. The difference in reward between the two activation functions was 10.3%.
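For reference, the two activation functions named at the extremes of this comparison can be written in a few lines. This is a minimal NumPy sketch of their standard published definitions, not the code used in the paper. The SELU constants are the usual fixed values.

```python
import numpy as np

# Standard SELU constants.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def softplus(x):
    """log(1 + exp(x)), computed in a numerically stable way."""
    return np.logaddexp(0.0, x)


def mish(x):
    """Mish: x * tanh(softplus(x))."""
    return x * np.tanh(softplus(x))


def selu(x):
    """SELU: scale * x for x > 0, scale * alpha * (exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=float)
    # Clamp before expm1 so large positive inputs do not overflow.
    negative_branch = SELU_ALPHA * np.expm1(np.minimum(x, 0.0))
    return SELU_SCALE * np.where(x > 0, x, negative_branch)
```

Both functions are smooth or piecewise smooth and unbounded above. They differ mainly in how they treat negative inputs: Mish dips slightly below zero and returns toward it, while SELU saturates at about -1.76.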

Performance Improvement Method of Convolutional Neural Network Using Agile Activation Function (민첩한 활성함수를 이용한 합성곱 신경망의 성능 향상)

  • Kong, Na Young;Ko, Young Min;Ko, Sun Woo
    • KIPS Transactions on Software and Data Engineering / v.9 no.7 / pp.213-220 / 2020
  • A convolutional neural network is composed of convolutional layers and fully connected layers, and a nonlinear activation function is used in each of these layers. The activation function in a neural network mimics how a neuron transmits information: it passes a signal on when the input signal exceeds a certain threshold and does not otherwise. The conventional activation function has no relationship with the loss function, so the search for the optimal solution is slow. To improve this, we propose an agile activation function that generalizes the activation function. The agile activation function can improve the performance of a deep neural network by selecting the optimal agile parameter during learning, using the first derivative of the loss function with respect to the agile parameter in the backpropagation process. On the MNIST classification problem, agile activation functions performed better than conventional activation functions.
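Learning an activation parameter from the first derivative of the loss can be illustrated on a toy problem. The abstract does not define the agile activation function, so the sketch below uses tanh(a * x) with a single learnable slope `a` as a stand-in. The function name, learning rate, and data are all assumptions.

```python
import numpy as np


def fit_slope(x, y, a_init=1.0, lr=1.0, n_steps=1000):
    """Gradient descent on the slope a of tanh(a * x) under squared error."""
    a = a_init
    for _ in range(n_steps):
        out = np.tanh(a * x)
        # Chain rule: dL/da for L = mean((out - y)^2),
        # using d tanh(u)/du = 1 - tanh(u)^2 and du/da = x.
        grad = np.mean(2.0 * (out - y) * (1.0 - out ** 2) * x)
        a -= lr * grad
    return a


# Targets generated with a "true" slope of 2; training starts from a = 1.
x = np.linspace(-2.0, 2.0, 41)
y = np.tanh(2.0 * x)
a_hat = fit_slope(x, y)
```

The same derivative would normally be produced by automatic differentiation inside backpropagation. It is written out by hand here only to show that the parameter is updated exactly like a weight.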

Performance Improvement Method of Convolutional Neural Network Using Combined Parametric Activation Functions (결합된 파라메트릭 활성함수를 이용한 합성곱 신경망의 성능 향상)

  • Ko, Young Min;Li, Peng Hang;Ko, Sun Woo
    • KIPS Transactions on Software and Data Engineering / v.11 no.9 / pp.371-380 / 2022
  • Convolutional neural networks are widely used to process data arranged in a grid, such as images. A general convolutional neural network consists of convolutional layers and fully connected layers, and each layer contains a nonlinear activation function. This paper proposes a combined parametric activation function to improve the performance of convolutional neural networks. The combined parametric activation function is created by adding parametric activation functions, each of which applies parameters that change the scale and location of the activation function. Various nonlinear intervals can be created through the multiple scale and location parameters, and the parameters can be learned in the direction that minimizes the loss function calculated from the given input data. When tested on the MNIST, Fashion MNIST, CIFAR10 and CIFAR100 classification problems, the convolutional neural network using the combined parametric activation function performed better than those using other activation functions.

Masking Exponential-Based Neural Network via Approximated Activation Function (활성화 함수 근사를 통한 지수함수 기반 신경망 마스킹 기법)

  • Joonsup Kim;GyuSang Kim;Dongjun Park;Sujin Park;HeeSeok Kim;Seokhie Hong
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.5 / pp.761-773 / 2023
  • This paper proposes a method to increase the power-analysis resistance of a neural network model's feedforward process by replacing the exponential-based activation functions used in deep learning with approximated functions, focusing on the multi-layer perceptron model. By its nature, the feedforward process of a neural network computes with the secret, already-trained weights and biases, so internal information risks being exposed through side-channel attacks. However, because various functions are used as activation functions in neural networks, it is difficult to apply conventional side-channel countermeasures, such as masking, to the activation function (especially to exponential-based activation functions). This paper therefore shows that replacing an exponential-based activation function with an approximated function of simple form causes no fatal performance degradation of the model, and then proposes a power-analysis-resistant feedforward neural network with exponential-based activation functions by masking the approximated function and the whole network.

Comparison of Reinforcement Learning Activation Functions to Improve the Performance of the Racing Game Learning Agent

  • Lee, Dongcheul
    • Journal of Information Processing Systems / v.16 no.5 / pp.1074-1082 / 2020
  • Recently, research has been actively conducted to create artificial intelligence agents that learn games through reinforcement learning. Several factors determine performance when the agent learns a game, and the choice of activation function is an important one. This paper compares and evaluates which activation function gives the best results when the agent learns a game through reinforcement learning in a 2D racing game environment. We built the agent using a reinforcement learning algorithm and a neural network, and evaluated the activation functions in the network by switching between them. We measured the reward, the output of the advantage function, and the output of the loss function during training and testing. From this performance evaluation, we identified the best activation function for the agent to learn the game. The difference between the best and the worst was 35.4%.

Alleviation of Vanishing Gradient Problem Using Parametric Activation Functions (파라메트릭 활성함수를 이용한 기울기 소실 문제의 완화)

  • Ko, Young Min;Ko, Sun Woo
    • KIPS Transactions on Software and Data Engineering / v.10 no.10 / pp.407-420 / 2021
  • Deep neural networks are widely used to solve various problems. However, a deep neural network with many hidden layers frequently suffers from vanishing or exploding gradients, which is a major obstacle to training it. In this paper, we propose a parametric activation function to alleviate the vanishing gradient problem that the nonlinear activation function can cause. The proposed parametric activation function is obtained by applying parameters that change the scale and location of the activation function according to the characteristics of the input data, so that the loss function can be minimized through backpropagation without limiting the derivative of the activation function. The original nonlinear and the parametric activation functions were compared on the XOR problem with 10 hidden layers and the MNIST classification problem with 8 hidden layers, and the proposed parametric activation function proved superior in alleviating the vanishing gradient.
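A short calculation shows why a bounded activation derivative makes gradients vanish with depth. The sigmoid derivative never exceeds 0.25, so the activation-derivative factor accumulated across 10 sigmoid layers is at most 0.25 to the power 10. The sketch below ignores the weight matrices, which is a simplifying assumption of this example and not something stated in the abstract.

```python
import numpy as np


def sigmoid_grad(z):
    """Derivative of the logistic sigmoid, s(z) * (1 - s(z)); maximum 0.25 at z = 0."""
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)


def backprop_factor(pre_activations):
    """Product of activation derivatives along one path through the layers."""
    return float(np.prod(sigmoid_grad(np.asarray(pre_activations, dtype=float))))


# Best case: every layer sits at z = 0, where the derivative is largest.
best_case = backprop_factor(np.zeros(10))        # equals 0.25 ** 10, about 9.5e-7

# Saturated units make the factor shrink much further.
saturated = backprop_factor(np.full(10, 3.0))
```

A scale parameter inside the activation changes this bound, which is one way to read the abstract's remark about not limiting the derivative.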

Effect of Nonlinear Transformations on Entropy of Hidden Nodes

  • Oh, Sang-Hoon
    • International Journal of Contents / v.10 no.1 / pp.18-22 / 2014
  • Hidden nodes play a key role in the information processing of feed-forward neural networks, in which inputs are processed through a series of weighted sums and nonlinear activation functions. To understand the role of hidden nodes, we must analyze the effect of the nonlinear activation functions on the weighted sums to hidden nodes. In this paper, we examine the effect of nonlinear functions from the viewpoint of information theory. Under the assumption that the nonlinear activation function can be approximated piece-wise linearly, we prove that the entropy of the weighted sums to hidden nodes decreases after the piece-wise linear function is applied. We therefore argue that the nonlinear activation function decreases the uncertainty among hidden nodes. Furthermore, the more the hidden nodes are saturated, the more their entropy decreases. Based on this result, we can say that after successful training of feed-forward neural networks, hidden nodes tend to lie not in the linear regions but in the saturated regions of the activation function, with the effect of reducing uncertainty.

Stable activation-based regression with localizing property

  • Shin, Jae-Kyung;Jhong, Jae-Hwan;Koo, Ja-Yong
    • Communications for Statistical Applications and Methods / v.28 no.3 / pp.281-294 / 2021
  • In this paper, we propose an adaptive regression method based on the single-layer neural network structure. We adopt a symmetric activation function as the unit of the structure. The activation function is flexible in form through a parametrization and has a localizing property that is useful for improving the quality of estimation. To provide a spatially adaptive estimator, we regularize the coefficients of the activation functions via ℓ1-penalization, which removes the activation functions regarded as unnecessary. For implementation, an efficient coordinate descent algorithm is applied to the proposed estimator. To obtain stable estimation results, we present an initialization scheme suited to our structure. A model selection procedure based on the Akaike information criterion is described. The simulation results show that the proposed estimator performs favorably in relation to existing methods and recovers the local structure of the underlying function based on the sample.
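How ℓ1-penalization combined with coordinate descent removes unnecessary terms can be sketched with a generic lasso solver. This is the textbook algorithm for the objective (1/2n)·||y − Xβ||² + λ·||β||₁, not the paper's estimator. The columns of `X` stand in for evaluated activation functions, and all names are assumptions.

```python
import numpy as np


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Cyclic coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = np.asarray(y, dtype=float).copy()  # y - X @ beta with beta = 0
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            residual += X[:, j] * beta[j]         # drop column j's contribution
            rho = X[:, j] @ residual / n          # correlation with partial residual
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            residual -= X[:, j] * beta[j]         # add the updated contribution back
    return beta


# Orthogonal toy design: the weak second coefficient is set exactly to zero.
X = 2.0 * np.eye(4)
y = np.array([4.0, 0.2, -4.0, 0.0])
beta_hat = lasso_coordinate_descent(X, y, lam=0.5)
```

Coefficients driven exactly to zero correspond to basis functions dropped from the fit, which is the mechanism the abstract describes for discarding unnecessary activation functions.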

Optimization of Sigmoid Activation Function Parameters using Genetic Algorithms and Pattern Recognition Analysis in Input Space of Two Spirals Problem (유전자알고리즘을 이용한 시그모이드 활성화 함수 파라미터의 최적화와 이중나선 문제의 입력공간 패턴인식 분석)

  • Lee, Sang-Wha
    • The Journal of the Korea Contents Association / v.10 no.4 / pp.10-18 / 2010
  • This paper presents an optimization of sigmoid activation function parameters using genetic algorithms, together with a pattern recognition analysis in the input space of the two spirals benchmark problem. The cascade correlation learning algorithm is used for the experiments. In the first experiment, the normal sigmoid activation function is used to analyze the pattern classification in the input space of the two spirals problem. In the second experiment, sigmoid activation functions with different fixed parameter values are organized into 8 pools. In the third experiment, the values of the three parameters that determine the displacement of the sigmoid function are obtained using genetic algorithms, and these parameter values are applied to the sigmoid activation functions of the candidate neurons. To evaluate the performance of these algorithms, the classification of the training input patterns at each step is shown as the shape of the two spirals.