• Title/Summary/Keyword: LeakyReLU


A Performance Comparison of Super Resolution Model with Different Activation Functions (활성함수 변화에 따른 초해상화 모델 성능 비교)

  • Yoo, Youngjun;Kim, Daehee;Lee, Jaekoo
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.9 no.10
    • /
    • pp.303-308
    • /
    • 2020
  • The ReLU (Rectified Linear Unit) function has been used as the dominant standard activation function in most deep artificial neural network models since it was proposed. Later, the Leaky ReLU, Swish, and Mish activation functions were introduced to replace ReLU, and they showed improved performance over ReLU in image classification tasks. We therefore examined whether replacing ReLU with other activation functions can also improve performance in the super resolution task. In this paper, performance was compared while changing the activation functions of the EDSR model, which shows stable performance in super resolution. In these experiments, when the resolution was upscaled by a factor of two, the original activation function, ReLU, performed similarly to or better than the other activation functions tested. When the resolution was upscaled by a factor of four, the Leaky ReLU and Swish functions performed slightly better than ReLU: PSNR and SSIM, which quantitatively evaluate image quality, improved on average by 0.06% and 0.05% with Leaky ReLU, and by 0.06% and 0.03% with Swish. When the resolution was upscaled by a factor of eight, the Mish function showed a slight average improvement over ReLU, with average PSNR and SSIM gains of 0.06% and 0.02%. In conclusion, Leaky ReLU and Swish outperformed ReLU for four-times super resolution, and Mish outperformed ReLU for eight-times super resolution. Future work should run comparative experiments that replace the activation function with Leaky ReLU, Swish, or Mish to improve performance in other super resolution models.
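
As a rough illustration of the kind of experiment this abstract describes, the sketch below (a hypothetical PyTorch snippet, not the authors' code) defines an EDSR-style residual block whose activation can be swapped among ReLU, Leaky ReLU, Swish (nn.SiLU in PyTorch), and Mish; the channel count and block structure are assumptions.

```python
import torch
import torch.nn as nn

# Activations compared in the abstract; nn.SiLU is PyTorch's Swish.
ACTIVATIONS = {
    "relu": nn.ReLU,
    "leaky_relu": nn.LeakyReLU,
    "swish": nn.SiLU,
    "mish": nn.Mish,
}

class ResBlock(nn.Module):
    """EDSR-style residual block (no batch norm) with a configurable activation."""
    def __init__(self, channels: int = 64, activation: str = "leaky_relu"):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = ACTIVATIONS[activation]()
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        return x + self.conv2(self.act(self.conv1(x)))

# Example: one block per candidate activation applied to a dummy feature map.
x = torch.randn(1, 64, 48, 48)
for name in ACTIVATIONS:
    y = ResBlock(activation=name)(x)
    print(name, tuple(y.shape))
```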

Performance Analysis of Various Activation Functions in Super Resolution Model (초해상화 모델의 활성함수 변경에 따른 성능 분석)

  • Yoo, YoungJun;Kim, DaeHee;Lee, JaeKoo
    • Annual Conference of KIPS
    • /
    • 2020.05a
    • /
    • pp.504-507
    • /
    • 2020
  • The ReLU (Rectified Linear Unit) function has been used as the dominant standard activation function in most deep artificial neural network models since it was proposed. Later, the Leaky ReLU, Swish, and Mish activation functions were introduced to replace ReLU, and they showed improved performance over ReLU in image classification tasks. This motivated us to test whether a performance gain can also be obtained in the super resolution task by replacing ReLU with other activation functions. In this study, we compared performance while changing the activation functions of the EDSR (Enhanced Deep Super-Resolution Network) model, which shows stable performance in super resolution. As a result, when the resolution was upscaled by a factor of two, the original activation function, ReLU, performed similarly to or better than the other activation functions used in the experiment. However, when the resolution was upscaled by a factor of four, the Leaky ReLU and Swish functions showed slightly improved performance over ReLU. Specifically, compared with ReLU, the PSNR and SSIM metrics, which quantitatively evaluate image quality, improved on average by 0.06% and 0.05% with Leaky ReLU, and by 0.06% and 0.03% with Swish. Because Leaky ReLU and Swish outperformed ReLU for four-times super resolution, future work should also run comparative experiments that replace the activation function with Leaky ReLU or Swish to improve performance in other super resolution models.

A study on activation functions of Artificial Neural Network model suitable for prediction of the groundwater level in the mid-mountainous area of eastern Jeju island (제주도 동부 중산간지역 지하수위 예측에 적합한 인공신경망 모델의 활성화함수 연구)

  • Mun-Ju Shin;Jeong-Hun Kim;Su-Yeon Kang;Jeong-Han Lee;Kyung Goo Kang
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2023.05a
    • /
    • pp.520-520
    • /
    • 2023
  • In the mid-mountainous area of eastern Jeju Island, the groundwater level fluctuates widely and in a complex pattern because of the volcanic-rock geology, which makes prediction with models such as an Artificial Neural Network (ANN) difficult. Because the prediction performance of an ANN can vary with the activation function applied, comparing activation functions and selecting an appropriate one is essential. In this study, five activation functions (sigmoid, hyperbolic tangent (tanh), Rectified Linear Unit (ReLU), Leaky Rectified Linear Unit (Leaky ReLU), and Exponential Linear Unit (ELU)) were compared for two groundwater observation wells located in the mid-mountainous area of eastern Jeju Island, with the goal of deriving the optimal activation function. In addition, to evaluate the applicability of the ANN with the optimal activation function, it was compared with the Long Short-Term Memory (LSTM) model, a widely used recurrent neural network model. As a result, the ELU function was most suitable for the well with relatively large groundwater level fluctuations, and the Leaky ReLU function for the well with relatively small fluctuations. The sigmoid function showed the lowest prediction performance, so its use should be avoided when predicting peak and lowest groundwater levels. The ANN-ELU and ANN-Leaky ReLU models built with the derived optimal activation functions showed groundwater level prediction performance comparable to the LSTM model. This means that even a feed-forward ANN can produce results comparable to a recent recurrent neural network if an appropriate activation function is used, and thus has sufficient potential for application. Finally, the LSTM model showed the most appropriate prediction performance and can serve as a reference model when comparing the prediction performance of various artificial intelligence models. The method presented in this study can be usefully applied to various time-series prediction and analysis studies, such as stream water level prediction as well as groundwater level prediction.
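
The comparison described in this abstract can be prototyped along the following lines. This is a minimal, hypothetical PyTorch sketch, not the authors' model: the layer sizes, input features, and dummy data are assumptions made only to show how the five activation functions would be swapped in a small feed-forward ANN.

```python
import torch
import torch.nn as nn

# The five activations compared in the abstract.
ACTIVATIONS = {
    "sigmoid": nn.Sigmoid,
    "tanh": nn.Tanh,
    "relu": nn.ReLU,
    "leaky_relu": nn.LeakyReLU,
    "elu": nn.ELU,
}

def make_ann(n_inputs: int, activation: str, hidden: int = 32) -> nn.Sequential:
    """Small feed-forward ANN for one-step-ahead groundwater level regression."""
    act = ACTIVATIONS[activation]
    return nn.Sequential(
        nn.Linear(n_inputs, hidden), act(),
        nn.Linear(hidden, hidden), act(),
        nn.Linear(hidden, 1),  # predicted groundwater level
    )

# Dummy lagged-input features (e.g. rainfall and past levels) for illustration.
X = torch.randn(128, 8)
y = torch.randn(128, 1)
for name in ACTIVATIONS:
    model = make_ann(n_inputs=8, activation=name)
    mse = nn.functional.mse_loss(model(X), y)
    print(f"{name}: untrained MSE {mse.item():.3f}")
```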


DQN Reinforcement Learning for Acrobot in OpenAI Gym Environment (OpenAI Gym 환경의 Acrobot에 대한 DQN 강화학습)

  • Myung-Ju Kang
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2023.07a
    • /
    • pp.35-36
    • /
    • 2023
  • In this paper, an agent is trained with DQN (Deep Q-Networks) reinforcement learning on the Acrobot-v1 environment provided by OpenAI Gym, and the performance of the activation functions applied in the process is compared and analyzed. The activation functions applied to the DQN are ReLU, Leaky ReLU, ELU, SELU, and softplus. The experimental results show that, on average, the reward was highest when the Leaky ReLU activation function was used, while the maximum reward was obtained with the SELU activation function.
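
For context, a minimal Q-network of the kind a DQN agent for Acrobot-v1 might use with Leaky ReLU could look like the sketch below. This is hypothetical PyTorch code, not the paper's implementation: the state dimension 6 and the 3 discrete actions follow the Acrobot-v1 specification, while the hidden sizes and epsilon value are assumptions.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Q-value network for Acrobot-v1: 6-dim state -> Q-values for 3 actions."""
    def __init__(self, state_dim: int = 6, n_actions: int = 3, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.LeakyReLU(),            # activation under comparison
            nn.Linear(hidden, hidden),
            nn.LeakyReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state):
        return self.net(state)

def epsilon_greedy(q_net: QNetwork, state: torch.Tensor, eps: float = 0.1) -> int:
    """Standard DQN action selection: random with probability eps, else argmax Q."""
    if torch.rand(1).item() < eps:
        return int(torch.randint(0, 3, (1,)).item())
    with torch.no_grad():
        return int(q_net(state).argmax().item())

# Example with a dummy state vector.
q = QNetwork()
print("chosen action:", epsilon_greedy(q, torch.randn(6)))
```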


Comparative Analysis of RNN Architectures and Activation Functions with Attention Mechanisms for Mars Weather Prediction

  • Jaehyeok Jo;Yunho Sin;Bo-Young Kim;Jihoon Moon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.10
    • /
    • pp.1-9
    • /
    • 2024
  • In this paper, we propose a comparative analysis to evaluate the impact of activation functions and attention mechanisms on the performance of time-series models for Mars meteorological data. Mars meteorological data are nonlinear and irregular due to low atmospheric density, rapid temperature variations, and complex terrain. We use long short-term memory (LSTM), bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and bidirectional GRU (BiGRU) architectures to evaluate the effectiveness of different activation functions and attention mechanisms. The activation functions tested include rectified linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), Gaussian error linear unit (GELU), Swish, and scaled ELU (SELU), and model performance is measured using mean absolute error (MAE) and root mean square error (RMSE) metrics. Our results show that integrating attention mechanisms lowers both MAE and RMSE, with Swish and ReLU achieving the best performance for minimum temperature prediction. Conversely, GELU and ELU were less effective for pressure prediction. These results highlight the critical role of selecting appropriate activation functions and attention mechanisms in improving model accuracy for complex time-series forecasting.
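
As a loose illustration (not the authors' architecture), the sketch below wires a GRU encoder, a simple additive attention layer, and a dense head whose activation can be swapped among the functions listed in the abstract; in PyTorch, Swish corresponds to nn.SiLU. The feature count, window length, and hidden sizes are assumptions for demonstration only.

```python
import torch
import torch.nn as nn

ACTIVATIONS = {"relu": nn.ReLU, "leaky_relu": nn.LeakyReLU, "elu": nn.ELU,
               "gelu": nn.GELU, "swish": nn.SiLU, "selu": nn.SELU}

class AttnGRURegressor(nn.Module):
    """GRU encoder + additive attention + dense head with a swappable activation."""
    def __init__(self, n_features: int, hidden: int = 64, activation: str = "swish"):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)          # scores each time step
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden), ACTIVATIONS[activation](), nn.Linear(hidden, 1)
        )

    def forward(self, x):                          # x: (batch, time, features)
        h, _ = self.gru(x)                         # (batch, time, hidden)
        w = torch.softmax(self.attn(h), dim=1)     # attention weights over time
        context = (w * h).sum(dim=1)               # weighted sum of hidden states
        return self.head(context)                  # e.g. next-day minimum temperature

# Dummy 30-step window of 5 meteorological features.
model = AttnGRURegressor(n_features=5)
print(model(torch.randn(2, 30, 5)).shape)          # torch.Size([2, 1])
```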

The Effect of regularization and identity mapping on the performance of activation functions (정규화 및 항등사상이 활성함수 성능에 미치는 영향)

  • Ryu, Seo-Hyeon;Yoon, Jae-Bok
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.18 no.10
    • /
    • pp.75-80
    • /
    • 2017
  • In this paper, we describe the effect of regularization methods and networks with identity mapping on the performance of activation functions in deep convolutional neural networks. Activation functions act as nonlinear transformations. Early convolutional neural networks used the sigmoid function. To overcome problems of existing activation functions such as vanishing gradients, various activation functions such as ReLU, Leaky ReLU, parametric ReLU, and ELU were developed. To reduce overfitting, regularization methods such as dropout and batch normalization were developed alongside these activation functions, and data augmentation is also commonly applied in deep learning. Although the activation functions mentioned above have different characteristics, the newer regularization methods and networks with identity mapping have been validated only with ReLU. We therefore experimentally examined the effect of regularization and identity mapping on the performance of the activation functions and present the resulting performance tendencies. These results will reduce the number of training trials needed to find the best activation function.

Comparative analysis of activation functions of artificial neural network for prediction of optimal groundwater level in the middle mountainous area of Pyoseon watershed in Jeju Island (제주도 표선유역 중산간지역의 최적 지하수위 예측을 위한 인공신경망의 활성화함수 비교분석)

  • Shin, Mun-Ju;Kim, Jin-Woo;Moon, Duk-Chul;Lee, Jeong-Han;Kang, Kyung Goo
    • Journal of Korea Water Resources Association
    • /
    • v.54 no.spc1
    • /
    • pp.1143-1154
    • /
    • 2021
  • The selection of activation function has a great influence on the groundwater level prediction performance of artificial neural network (ANN) model. In this study, five activation functions were applied to ANN model for two groundwater level observation wells in the middle mountainous area of the Pyoseon watershed in Jeju Island. The results of the prediction of the groundwater level were compared and analyzed, and the optimal activation function was derived. In addition, the results of LSTM model, which is a widely used recurrent neural network model, were compared and analyzed with the results of the ANN models with each activation function. As a result, ELU and Leaky ReLU functions were derived as the optimal activation functions for the prediction of the groundwater level for observation well with relatively large fluctuations in groundwater level and for observation well with relatively small fluctuations, respectively. On the other hand, sigmoid function had the lowest predictive performance among the five activation functions for training period, and produced inappropriate results in peak and lowest groundwater level prediction. The ANN-ELU and ANN-Leaky ReLU models showed groundwater level prediction performance comparable to that of the LSTM model, and thus had sufficient potential for application. The methods and results of this study can be usefully used in other studies.

Combining multi-task autoencoder with Wasserstein generative adversarial networks for improving speech recognition performance (음성인식 성능 개선을 위한 다중작업 오토인코더와 와설스타인식 생성적 적대 신경망의 결합)

  • Kao, Chao Yuan;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea
    • /
    • v.38 no.6
    • /
    • pp.670-677
    • /
    • 2019
  • Since background noise in an acoustic signal degrades the performance of speech or acoustic event recognition, extracting noise-robust acoustic features from a noisy signal remains challenging. In this paper, we propose a combined structure of a Wasserstein Generative Adversarial Network (WGAN) and a MultiTask AutoEncoder (MTAE) as a deep learning architecture that integrates the strengths of MTAE and WGAN so that it estimates not only the noise but also the speech features from a noisy acoustic source. The proposed MTAE-WGAN structure estimates the speech signal and the residual noise by employing a gradient penalty and a weight initialization method for the Leaky Rectified Linear Unit (LReLU) and Parametric ReLU (PReLU). The proposed MTAE-WGAN structure with the adopted gradient penalty loss function enhances the speech features and achieves substantial Phoneme Error Rate (PER) improvements over the stand-alone Deep Denoising Autoencoder (DDAE), MTAE, Redundant Convolutional Encoder-Decoder (R-CED), and Recurrent MTAE (RMTAE) models for robust speech recognition.
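
The weight-initialization point can be illustrated with PyTorch's built-in Kaiming/He initializer, which accepts the negative slope of a leaky ReLU so that the initialization gain matches the activation. The sketch below is an assumption about how such an initialization might be applied, not the paper's implementation; the layer sizes and the example encoder are hypothetical.

```python
import torch.nn as nn

def init_for_leaky_relu(module: nn.Module, negative_slope: float = 0.01) -> None:
    """He-style initialization whose gain accounts for the LReLU/PReLU slope."""
    for m in module.modules():
        if isinstance(m, (nn.Conv1d, nn.Conv2d, nn.Linear)):
            nn.init.kaiming_normal_(m.weight, a=negative_slope,
                                    nonlinearity="leaky_relu")
            if m.bias is not None:
                nn.init.zeros_(m.bias)

# Example: a small encoder using PReLU (its slope is learned, but the initial
# value 0.25 is used for the gain here). Feature dimension is hypothetical.
encoder = nn.Sequential(nn.Linear(257, 512), nn.PReLU(), nn.Linear(512, 257))
init_for_leaky_relu(encoder, negative_slope=0.25)
```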

Comparison of Activation Functions for Shrinkage Prediction Model using DNN (DNN을 활용한 콘크리트 건조수축 예측 모델의 활성화 함수 비교분석)

  • Han, Jun-Hui;Kim, Su-Hoo;Han, Soo-Hwan;Beak, Sung-Jin;Kim, Jong;Han, Min-Cheol
    • Proceedings of the Korean Institute of Building Construction Conference
    • /
    • 2022.11a
    • /
    • pp.121-122
    • /
    • 2022
  • In this study, various activation functions were compared and analyzed to present a methodology for developing an artificial-intelligence-based shrinkage prediction system. ELU performed best, with an RMSE of 62.87, an R2 of 0.96, and an error rate of 4%. However, it is considered desirable to build the prediction system by combining the individual algorithm models for further optimization.


Searching Effective Network Parameters to Construct Convolutional Neural Networks for Object Detection (물체 검출 컨벌루션 신경망 설계를 위한 효과적인 네트워크 파라미터 추출)

  • Kim, Nuri;Lee, Donghoon;Oh, Songhwai
    • Journal of KIISE
    • /
    • v.44 no.7
    • /
    • pp.668-673
    • /
    • 2017
  • Deep neural networks have shown remarkable performance in various fields of pattern recognition such as voice recognition, image recognition, and object detection. However, the underlying mechanisms of such networks have not been fully revealed. In this paper, we focus on an empirical analysis of the network parameters. The Faster R-CNN (region-based convolutional neural network) was used as the baseline network, and three important parameters were analyzed: the dropout ratio, which prevents overfitting of the neural network; the size of the anchor boxes; and the activation function. We also compared the performance of dropout and batch normalization. The network performed favorably when the dropout ratio was 0.3, and the size of the anchor boxes showed no notable relation to network performance. The results showed that batch normalization cannot entirely substitute for dropout. The leaky ReLU (rectified linear unit) activation with a negative-domain slope of 0.02 showed comparably good performance.
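
The two settings reported in this abstract (dropout ratio 0.3 and leaky ReLU with a negative slope of 0.02) map directly onto standard layers. The block below is a hypothetical PyTorch illustration of those two choices only, not the Faster R-CNN head used in the paper; the channel counts are assumptions.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Conv -> LeakyReLU(0.02) -> Dropout(0.3), the settings reported above."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.act = nn.LeakyReLU(negative_slope=0.02)
        self.drop = nn.Dropout(p=0.3)

    def forward(self, x):
        return self.drop(self.act(self.conv(x)))

# Shape check on a dummy image-sized tensor.
print(ConvBlock(3, 16)(torch.randn(1, 3, 224, 224)).shape)
```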