• Title/Summary/Keyword: Parametric Activation Function

Search results: 11

Performance Improvement Method of Fully Connected Neural Network Using Combined Parametric Activation Functions (결합된 파라메트릭 활성함수를 이용한 완전연결신경망의 성능 향상)

  • Ko, Young Min;Li, Peng Hang;Ko, Sun Woo
    • KIPS Transactions on Software and Data Engineering / v.11 no.1 / pp.1-10 / 2022
  • Deep neural networks are widely used to solve various problems. In a fully connected neural network, the nonlinear activation function nonlinearly transforms its input and passes the result to the next layer. Nonlinear activation functions play an important role in solving nonlinear problems, and many have been studied. In this study, we propose a combined parametric activation function that can improve the performance of a fully connected neural network. A combined parametric activation function is created simply by summing parametric activation functions. A parametric activation function applies parameters that adjust the scale and location of the activation function according to the input data, and these parameters can be optimized in the direction of minimizing the loss function. Combining parametric activation functions creates more diverse nonlinear intervals, and their parameters can likewise be optimized to minimize the loss. The combined parametric activation function was tested on the MNIST and Fashion MNIST classification problems and was confirmed to outperform both existing nonlinear activation functions and the single parametric activation function.
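The abstract does not give the exact functional form, so the following PyTorch sketch is only one plausible reading: each parametric activation applies learnable scale and location parameters to a base function, and the combined activation simply sums k of them. The sigmoid base, k=2, and all class names here are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ParametricSigmoid(nn.Module):
    """Sigmoid with learnable scale/location parameters (illustrative form)."""
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(1))   # adjusts the scale
        self.loc = nn.Parameter(torch.zeros(1))    # adjusts the location

    def forward(self, x):
        return torch.sigmoid(self.scale * (x - self.loc))

class CombinedParametricActivation(nn.Module):
    """Sum of k parametric activations: creates more diverse nonlinear intervals."""
    def __init__(self, k=2):
        super().__init__()
        self.parts = nn.ModuleList(ParametricSigmoid() for _ in range(k))

    def forward(self, x):
        return sum(p(x) for p in self.parts)

# Fully connected network for 28x28 inputs such as MNIST; all parameters,
# including the activation's scale/location, are trained by backpropagation.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 256),
    CombinedParametricActivation(k=2),
    nn.Linear(256, 10),
)
```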

Performance Improvement Method of Convolutional Neural Network Using Combined Parametric Activation Functions (결합된 파라메트릭 활성함수를 이용한 합성곱 신경망의 성능 향상)

  • Ko, Young Min;Li, Peng Hang;Ko, Sun Woo
    • KIPS Transactions on Software and Data Engineering / v.11 no.9 / pp.371-380 / 2022
  • Convolutional neural networks are widely used to process data arranged on a grid, such as images. A typical convolutional neural network consists of convolutional layers and fully connected layers, and each layer contains a nonlinear activation function. This paper proposes a combined parametric activation function to improve the performance of convolutional neural networks. The combined parametric activation function is created by summing parametric activation functions whose parameters adjust the scale and location of the activation function. Diverse nonlinear intervals can be created according to the multiple scale and location parameters, and the parameters can be learned in the direction of minimizing the loss function computed from the given input data. Testing convolutional neural networks with the combined parametric activation function on the MNIST, Fashion MNIST, CIFAR10, and CIFAR100 classification problems confirmed better performance than other activation functions.
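For a convolutional layer, the same idea can be applied to feature maps. The per-channel parameter sharing in this sketch is an assumption rather than necessarily the paper's scheme, and the ReLU base is likewise illustrative.

```python
import torch
import torch.nn as nn

class ChannelwiseParametricReLU(nn.Module):
    """k scale/location pairs per channel, broadcast over height and width."""
    def __init__(self, channels, k=2):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(k, channels, 1, 1))
        self.loc = nn.Parameter(torch.zeros(k, channels, 1, 1))
        self.k = k

    def forward(self, x):                      # x: (N, C, H, W)
        return sum(torch.relu(self.scale[i] * (x - self.loc[i]))
                   for i in range(self.k))

# A conv block of the kind tested on MNIST/CIFAR-style inputs
block = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    ChannelwiseParametricReLU(32),
    nn.MaxPool2d(2),
)
```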

Alleviation of Vanishing Gradient Problem Using Parametric Activation Functions (파라메트릭 활성함수를 이용한 기울기 소실 문제의 완화)

  • Ko, Young Min;Ko, Sun Woo
    • KIPS Transactions on Software and Data Engineering / v.10 no.10 / pp.407-420 / 2021
  • Deep neural networks are widely used to solve various problems. However, a deep neural network with many hidden layers frequently suffers from vanishing or exploding gradients, which is a major obstacle to training it. In this paper, we propose a parametric activation function to alleviate the vanishing gradient problem that can be caused by a nonlinear activation function. The proposed parametric activation function is obtained by applying parameters that adjust the scale and location of the activation function according to the characteristics of the input data, so the loss function can be minimized through backpropagation without the derivative of the activation function being capped. The original nonlinear and parametric activation functions were compared on the XOR problem with 10 hidden layers and the MNIST classification problem with 8 hidden layers, confirming that the proposed parametric activation function is superior at alleviating the vanishing gradient.
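As a rough numerical illustration of the mechanism (the sigmoid base, depth, and scale values are assumptions, not the paper's setup): a plain sigmoid caps each layer's derivative at 1/4, so gradients shrink geometrically with depth, whereas a scale parameter s raises that cap to s/4.

```python
import torch

def mean_input_grad(scale, depth=10):
    """Mean |d out / d x| through `depth` layers of sigmoid(scale * h)."""
    x = torch.linspace(-3, 3, 101, requires_grad=True)
    h = x
    for _ in range(depth):
        h = torch.sigmoid(scale * h)
    h.sum().backward()
    return x.grad.abs().mean().item()

print(mean_input_grad(scale=1.0))  # plain sigmoid: gradient has all but vanished
print(mean_input_grad(scale=4.0))  # larger scale: per-layer derivative cap is s/4 = 1
```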

Performance Improvement Method of Deep Neural Network Using Parametric Activation Functions (파라메트릭 활성함수를 이용한 심층신경망의 성능향상 방법)

  • Kong, Nayoung;Ko, Sunwoo
    • The Journal of the Korea Contents Association / v.21 no.3 / pp.616-625 / 2021
  • Deep neural networks approximate an arbitrary function by alternating linear models with additional nonlinear transformations through activation functions, and the quality of the approximation is evaluated with a loss function. Existing deep learning methods account for the loss function in the linear approximation step, but the nonlinear approximation step that uses activation functions applies a fixed nonlinear transformation unrelated to reducing the loss. This study proposes parametric activation functions that introduce a scale parameter, which changes the scale of the activation function, and a location parameter, which changes its location. Introducing scale and location parameters can improve the performance of the nonlinear approximation performed by the activation functions. The scale and location parameters in each hidden layer are determined during training, using the first derivative of the loss function with respect to the parameters in backpropagation, so as to minimize the loss function value and thereby improve the performance of the deep neural network. On the MNIST classification problem and the XOR problem, the parametric activation functions were found to outperform existing activation functions.
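The abstract does not state the exact functional form. One plausible reading, with $g$ the base activation, $s_\ell$ and $t_\ell$ the scale and location parameters of hidden layer $\ell$, $L$ the loss, and $\eta$ the learning rate, is a gradient update that uses the first derivative of the loss:

\[
\tilde{g}_\ell(x) = g\big(s_\ell\,(x - t_\ell)\big), \qquad
s_\ell \leftarrow s_\ell - \eta\,\frac{\partial L}{\partial s_\ell}, \qquad
t_\ell \leftarrow t_\ell - \eta\,\frac{\partial L}{\partial t_\ell}.
\]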

The Effect of regularization and identity mapping on the performance of activation functions (정규화 및 항등사상이 활성함수 성능에 미치는 영향)

  • Ryu, Seo-Hyeon;Yoon, Jae-Bok
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.10 / pp.75-80 / 2017
  • In this paper, we describe the effect of regularization methods and of networks with identity mapping on the performance of activation functions in deep convolutional neural networks. Activation functions act as nonlinear transformations. Early convolutional neural networks used a sigmoid function. To overcome problems of existing activation functions such as gradient vanishing, various activation functions were developed, such as ReLU, Leaky ReLU, parametric ReLU, and ELU. To address overfitting, regularization methods such as dropout and batch normalization were developed alongside the activation functions, and data augmentation is also commonly applied in deep learning. The activation functions mentioned above have different characteristics, but the newer regularization methods and networks with identity mapping were validated only with ReLU. Therefore, we experimentally show the effect of regularization methods and of networks with identity mapping on the performance of the activation functions, and present the resulting performance tendencies. These results should reduce the number of training trials needed to find the best activation function.
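A minimal sketch of this kind of comparison, assuming a PyTorch residual block in which the activation is swappable and batch normalization/dropout provide the regularization; the layer sizes and block layout are illustrative, not the paper's networks.

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Conv block with an identity mapping; the activation is a constructor
    argument so ReLU, LeakyReLU, PReLU, and ELU can be compared directly."""
    def __init__(self, channels, act=nn.ReLU, p_drop=0.0):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),   # regularization: batch normalization
            act(),
            nn.Dropout2d(p_drop),       # regularization: dropout
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)        # identity mapping (skip connection)

variants = [ResidualBlock(32, act=a)
            for a in (nn.ReLU, nn.LeakyReLU, nn.PReLU, nn.ELU)]
```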

The Effects of Combined Complex Exercise with Abdominal Drawing-in Maneuver on Expiratory Abdominal Muscles Activation and Forced Pulmonary Function for Post Stroke Patients (복합운동과 복부 끌어당김 조정 훈련의 병행이 뇌졸중 환자의 호기 시 복부근육 활성도 및 노력성 폐기능에 미치는 영향)

  • Yun, Jeung-Hyun;Kim, Tae-Soo;Lee, Byung-Ki
    • Journal of the Korean Society of Physical Medicine / v.8 no.4 / pp.513-523 / 2013
  • PURPOSE: The purpose of this study was to investigate the effects of combined complex exercise with abdominal drawing-in maneuver training on forced pulmonary function and abdominal muscle activation in chronic stroke patients. METHODS: 14 post-stroke patients (10 males and 4 females) voluntarily participated in this study and were divided into two groups: CEG (complex exercise group) and CEAG (complex exercise and abdominal drawing-in maneuver group) (n=7 per group). Each group performed 30-minute exercise sessions twice a day for 6 weeks. The CEAG performed 15 minutes of complex exercise and 15 minutes of the abdominal drawing-in maneuver. For data analysis, the mean and standard deviation were estimated, and a non-parametric independent t-test was carried out. RESULTS: In the combined complex exercise with abdominal drawing-in maneuver group, FVC and activation of the transversus abdominis/internal oblique showed statistically significant differences compared to the complex exercise group. CONCLUSION: These results indicate that combined complex exercise with the abdominal drawing-in maneuver was effective in enhancing abdominal muscle activation and pulmonary function in chronic stroke patients.

Function Approximation Based on a Network with Kernel Functions of Bounds and Locality : an Approach of Non-Parametric Estimation

  • Kil, Rhee-M.
    • ETRI Journal / v.15 no.2 / pp.35-51 / 1993
  • This paper presents function approximation based on nonparametric estimation. As the estimation model, a three-layered network composed of input, hidden, and output layers is considered. The input and output layers have linear activation units, while the hidden layer has nonlinear activation units, or kernel functions, which have the characteristics of bounds and locality. Using this type of network, a many-to-one function is synthesized over the domain of the input space by a number of kernel functions. In this network, we must estimate the necessary number of kernel functions as well as the parameters associated with them. For this purpose, a new method of parameter estimation is considered in which a linear learning rule is applied between the hidden and output layers while a nonlinear (piecewise-linear) learning rule is applied between the input and hidden layers. The linear learning rule updates the output weights between the hidden and output layers in the sense of the linear minimization of mean square error (LMMSE) in the space of kernel functions, while the nonlinear learning rule updates the parameters of the kernel functions based on the gradient of the actual network output with respect to the parameters (especially the shape) of the kernel functions. This approach provides near-optimal values of the parameters associated with the kernel functions in the sense of minimizing mean square error. As a result, the suggested nonparametric estimation provides an efficient way of function approximation in terms of both the number of kernel functions and learning speed.
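A compact NumPy sketch of the hybrid rule described above, assuming Gaussian kernels: output weights come from least squares (the linear, minimum-MSE rule), while kernel centers and widths take gradient steps (the nonlinear rule). The target function, sizes, and learning rate are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy many-to-one target to approximate (illustrative only)
X = np.linspace(-3, 3, 200)
y = np.sin(X) + 0.1 * rng.standard_normal(X.size)

m, lr = 8, 0.05                        # number of kernel units, learning rate
c = rng.uniform(-3, 3, m)              # kernel centers
s = np.full(m, 0.8)                    # kernel widths (bounds and locality)

for _ in range(300):
    Phi = np.exp(-(X[:, None] - c) ** 2 / (2 * s ** 2))   # (200, m) kernels
    # Linear rule: output weights by least squares between hidden and output
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    # Nonlinear rule: gradient of squared error w.r.t. kernel shape parameters
    err = Phi @ w - y
    dc = (err[:, None] * w * Phi * (X[:, None] - c) / s ** 2).mean(axis=0)
    ds = (err[:, None] * w * Phi * (X[:, None] - c) ** 2 / s ** 3).mean(axis=0)
    c -= lr * dc
    s -= lr * ds

print(np.mean((Phi @ w - y) ** 2))     # final training mean squared error
```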


Controller design using parametric neural networks

  • HashemiNejad, M.;Murata, J.;Banihabib, M.E.
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings / 1994.10a / pp.616-621 / 1994
  • Neural networks (henceforth NNs, with the adjective "artificial" implied) have been used in the field of control but still have a long way to go to live up to their abilities. One of the best ways to aid them is to support them with knowledge of linear classical control theory. In this regard, we have developed two kinds of parametric activation function and used them in both identification and control strategies. We then test their capabilities on a nonlinear tank system. The simulation results for the identification phase are promising.


Center estimation of n-fold engineering parts using self-organizing neural networks with generating and merging learning (뉴런의 생성 및 병합 학습 기능을 갖는 자기 조직화 신경망을 이용한 n-각형 공업용 부품의 중심추정)

  • 성효경;최흥문
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.11 / pp.95-103 / 1997
  • A robust center-estimation technique for n-fold engineering parts is presented, which uses self-organizing neural networks with generating and merging learning to train the neural units. To estimate the center of an n-fold engineering part, the segmented boundaries of the part of interest are approximated by straight lines, and the temporary centers, estimated by the cosine theorem applied between each approximated straight line and the reference point, are indexed as (σ-θ) parametric vectors. The entries of the parametric vectors are then fed into the self-organizing neural network. Finally, the center of the n-fold part is extracted by means of the generating and merging learning of the neurons. To accelerate learning, the network applies an adaptive learning-rate function to the merging process and a self-adjusting activation to the generating process. Simulation results show that the proposed technique effectively estimates the centers of n-fold engineering parts, even without knowing the error distribution of the estimated centers and with limited boundary information.


Gas detonation cell width prediction model based on support vector regression

  • Yu, Jiyang;Hou, Bingxu;Lelyakin, Alexander;Xu, Zhanjie;Jordan, Thomas
    • Nuclear Engineering and Technology / v.49 no.7 / pp.1423-1430 / 2017
  • Detonation cell width is an important parameter in hydrogen explosion assessments. The experimental data on gas detonation are statistically analyzed to establish a universal method to numerically predict detonation cell widths. It is commonly understood that the detonation cell width, λ, is highly correlated with the characteristic reaction zone width, δ. Classical parametric regression methods were widely applied in earlier research to build an explicit semiempirical correlation for the ratio λ/δ. The obtained correlations formulate the dependency of the ratio λ/δ on a dimensionless effective chemical activation energy and a dimensionless temperature of the gas mixture. In this paper, support vector regression (SVR), which is based on nonparametric machine learning, is applied to achieve functions with better fitness to experimental data and more accurate predictions. Furthermore, a third parameter, dimensionless pressure, is considered as an additional independent variable. It is found that three-parameter SVR can significantly improve the performance of the fitting function. Meanwhile, SVR also provides better adaptability, and the model functions can easily be renewed when the experimental database is updated or new regression parameters are considered.
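A minimal sketch of the three-parameter regression described, using scikit-learn's SVR; the feature layout and toy values below are placeholders, not the paper's experimental database.

```python
import numpy as np
from sklearn.svm import SVR

# Placeholder rows: [dimensionless activation energy, dimensionless temperature,
# dimensionless pressure] -> ratio lambda/delta; values are illustrative only.
X = np.array([[5.2, 8.1, 1.0],
              [6.0, 7.4, 1.5],
              [4.8, 9.0, 0.8],
              [5.5, 8.6, 1.2],
              [6.3, 7.0, 2.0]])
y = np.array([3.1, 3.8, 2.7, 3.2, 4.1])

model = SVR(kernel="rbf", C=10.0, epsilon=0.05)
model.fit(X, y)                         # easy to refit when the database grows
print(model.predict([[5.0, 8.0, 1.0]]))
```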