• Title/Summary/Keyword: Weight initialization

Efficient weight initialization method in multi-layer perceptrons

  • Han, Jaemin;Sung, Shijoong;Hyun, Changho
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 1995.09a
    • /
    • pp.325-333
    • /
    • 1995
  • Back-propagation is the most widely used algorithm for supervised learning in multi-layer feed-forward networks, but it converges very slowly. In this paper, a new weight initialization method for multi-layer perceptrons, called rough map initialization, is proposed. To overcome the long convergence time, possibly caused by the random weight initialization of existing multi-layer perceptrons, the rough map method initializes weights by exploiting the relationship between input and output features via singular value decomposition. The results of this initialization procedure are compared with random initialization on encoder and XOR problems.
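
The abstract does not spell out the rough map procedure, so the sketch below only illustrates the general idea it names: seeding the input-layer weights from a singular value decomposition of the input-output relationship rather than from random values. All function and variable names are ours, and the cross-product form of the relationship is an assumption.

```python
import numpy as np

def svd_seeded_init(X, Y, n_hidden, rng=np.random.default_rng(0)):
    """Seed input-to-hidden weights from the SVD of an input-output
    cross-product instead of purely random values (illustrative only).
    X: (n_samples, n_in), Y: (n_samples, n_out)."""
    C = X.T @ Y                               # crude input-output relationship
    U, s, _ = np.linalg.svd(C, full_matrices=False)
    k = min(n_hidden, U.shape[1])
    W1 = np.zeros((X.shape[1], n_hidden))
    W1[:, :k] = U[:, :k] * s[:k]              # leading singular directions
    if n_hidden > k:                          # pad any extra units randomly
        W1[:, k:] = rng.normal(scale=0.1, size=(X.shape[1], n_hidden - k))
    return W1

# XOR problem, as in the paper's experiments: 2 inputs, 1 output
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)
W1 = svd_seeded_init(X, Y, n_hidden=2)
```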

Comparison of Weight Initialization Techniques for Deep Neural Networks

  • Kang, Min-Jae;Kim, Ho-Chan
    • International Journal of Advanced Culture Technology
    • /
    • v.7 no.4
    • /
    • pp.283-288
    • /
    • 2019
  • Neural networks have been reborn as deep learning thanks to big data, improved processors, and modifications to training methods. Early neural networks initialized weights naively and chose ill-suited nonlinear activation functions. Weight initialization is a significant factor in both the final quality of a network and its convergence rate. This paper discusses different approaches to weight initialization. The MNIST dataset is used in experiments comparing the approaches, to find the technique that achieves the highest accuracy in the shortest training time.
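
For reference, the initializers most commonly compared in such studies, alongside the naive fixed-scale Gaussian the abstract criticizes, can be written in a few lines of NumPy. This is a generic sketch, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def naive_init(fan_in, fan_out):
    # Fixed-scale Gaussian: the "old" approach the abstract criticizes.
    return rng.normal(scale=0.01, size=(fan_in, fan_out))

def xavier_init(fan_in, fan_out):
    # Glorot/Xavier: variance 2/(fan_in+fan_out), suited to tanh/sigmoid.
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_init(fan_in, fan_out):
    # He: variance 2/fan_in, suited to ReLU layers.
    return rng.normal(scale=np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

# e.g. first layer of an MNIST classifier: 784 inputs, 256 hidden units
W = he_init(784, 256)
```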

Approach to Improving the Performance of Network Intrusion Detection by Initializing and Updating the Weights of Deep Learning (딥러닝의 가중치 초기화와 갱신에 의한 네트워크 침입탐지의 성능 개선에 대한 접근)

  • Park, Seongchul;Kim, Juntae
    • Journal of the Korea Society for Simulation
    • /
    • v.29 no.4
    • /
    • pp.73-84
    • /
    • 2020
  • As the Internet has grown in popularity, networks and the systems on them have been subject to hacking and attacks, and as these techniques evolve day by day they impose risks and burdens on companies and society. To alleviate that risk and burden, hacking and attacks must be detected early and responded to appropriately, and that in turn requires reliable network intrusion detection. This study applies weight initialization and weight optimization to the KDD'99 dataset to improve the accuracy of network intrusion detection. For weight initialization, experiments showed that initialization methods tied to the weight-learning structure, such as the Xavier and He methods, affect accuracy. For weight optimization, experiments on the network intrusion detection dataset confirmed that the Adam algorithm, which combines the advantages of Momentum, which reflects previous changes, and RMSProp, which adapts the learning rate to the current weights, stands out in terms of accuracy.
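
The combination the abstract describes corresponds to the standard Adam update: a first-moment running average (the Momentum side) and a second-moment running average (the RMSProp side), with bias correction. A minimal NumPy sketch of one update step, with all names ours:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a parameter array w. m tracks the running
    gradient mean (the Momentum side); v tracks the running squared
    gradient (the RMSProp side). t is the 1-based step counter."""
    m = b1 * m + (1 - b1) * grad            # first moment (Momentum)
    v = b2 * v + (1 - b2) * grad ** 2       # second moment (RMSProp)
    m_hat = m / (1 - b1 ** t)               # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```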

Performance Comparison of Convolution Neural Network by Weight Initialization and Parameter Update Method (가중치 초기화 및 매개변수 갱신 방법에 따른 컨벌루션 신경망의 성능 비교)

  • Park, Sung-Wook;Kim, Do-Yeon
    • Journal of Korea Multimedia Society
    • /
    • v.21 no.4
    • /
    • pp.441-449
    • /
    • 2018
  • Deep learning has been used for various processing tasks centered on image recognition. The convolutional neural network, one of the core algorithms of deep learning, is a deep neural network specialized for image recognition. In this paper, we use a convolutional neural network to classify forest insects and propose an optimization method. Experiments were carried out by combining two weight initialization methods with six parameter update methods. Among the 12 combinations, the Xavier-SGD method showed the highest performance, with an accuracy of 82.53%. From this we conclude that the latest learning algorithms, which remedy the disadvantages of earlier parameter update methods, do not necessarily lead to higher performance than existing methods in all application environments.
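
The paper's 2 × 6 grid can be reproduced in spirit with PyTorch's built-in initializers and optimizers. The abstract names only Xavier and SGD explicitly, so the other update methods below are assumptions chosen for illustration:

```python
import torch
import torch.nn as nn

def make_model(init_fn):
    model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
    for layer in model:
        if isinstance(layer, nn.Linear):
            init_fn(layer.weight)
            nn.init.zeros_(layer.bias)
    return model

inits = {"xavier": nn.init.xavier_uniform_,
         "he": nn.init.kaiming_normal_}
# The abstract names only SGD; the remaining five update methods are assumed.
optims = {"sgd":      lambda p: torch.optim.SGD(p, lr=0.01),
          "momentum": lambda p: torch.optim.SGD(p, lr=0.01, momentum=0.9),
          "adagrad":  lambda p: torch.optim.Adagrad(p, lr=0.01),
          "adadelta": lambda p: torch.optim.Adadelta(p),
          "rmsprop":  lambda p: torch.optim.RMSprop(p, lr=0.001),
          "adam":     lambda p: torch.optim.Adam(p, lr=0.001)}

for init_name, init_fn in inits.items():
    for opt_name, make_opt in optims.items():
        model = make_model(init_fn)
        optimizer = make_opt(model.parameters())
        # ... train and record accuracy for the (init_name, opt_name) pair
```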

Initialization by using truncated distributions in artificial neural network (절단된 분포를 이용한 인공신경망에서의 초기값 설정방법)

  • Kim, MinJong;Cho, Sungchul;Jeong, Hyerin;Lee, YungSeop;Lim, Changwon
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.5
    • /
    • pp.693-702
    • /
    • 2019
  • Deep learning has gained popularity for classification and prediction tasks, and neural network layers become deeper as more data becomes available. Saturation is the phenomenon in which the gradient of an activation function approaches 0; it can occur when weight values are too large, and it limits the ability of the weights to learn, so it has received increasing attention. To resolve this problem, Glorot and Bengio (Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 249-256, 2010) argued that efficient neural network training is possible when data flows evenly between layers, and proposed an initialization method that makes the variance of each layer's output equal to the variance of its input. In this paper, we propose a new initialization method that adopts the truncated normal and truncated Cauchy distributions. We decide where to truncate the distribution while following the initialization principle of Glorot and Bengio (2010): the input and output variances are made equal by setting both to the variance of the truncated distribution. This keeps the initial weights from growing too large while avoiding values that collapse toward zero. To compare the performance of our proposed method with existing methods, we conducted experiments on the MNIST and CIFAR-10 datasets using a DNN and a CNN. Our proposed method outperformed the existing methods in terms of accuracy.
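
The abstract summarizes the construction as: truncate a distribution, then match its variance to the Glorot target. A sketch of that recipe for the truncated normal case, where the truncation point and the rescaling step are our illustrative choices rather than the paper's exact procedure:

```python
import numpy as np
from scipy.stats import truncnorm

def glorot_truncated_normal(fan_in, fan_out, cut=2.0, seed=0):
    """Sample weights from a standard normal truncated to [-cut, cut],
    then rescale so their variance equals the Glorot target
    2 / (fan_in + fan_out). 'cut' is an illustrative choice; the paper
    selects the truncation point itself."""
    target_var = 2.0 / (fan_in + fan_out)
    dist = truncnorm(-cut, cut)                  # truncated standard normal
    w = dist.rvs(size=(fan_in, fan_out), random_state=seed)
    return w * np.sqrt(target_var / dist.var())  # match the target variance

W = glorot_truncated_normal(784, 256)
```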

Analysis of Weight Distribution of Feedforward Two-Layer Neural Networks and its Application to Weight Initialization (순방향 2층 신경망의 연결강도 분포 특성 분석 및 연결강도 초기화에 적용)

  • Go, Jin-Wook;Park, Mig-Non;Hong, Dae-Sik;Lee, Chul-Hee
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.38 no.3
    • /
    • pp.1-12
    • /
    • 2001
  • In this paper, we investigate and analyze the weight distribution of feedforward two-layer neural networks with a hidden layer in order to understand and improve the time-consuming training process of neural networks. Generally, when a new problem is presented, a neural network must be trained again without any benefit from previous training. To address this problem, training is viewed as finding a solution point in weight space, and the distribution of such solution points is analyzed. We then propose initializing neural networks using information about the distribution of the solution points. Experimental results show that the proposed initialization using the weight distribution performs better than the conventional one.
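
The abstract does not give the estimation procedure, so the following sketch realizes the idea in the simplest way: fit a per-weight Gaussian to solution points collected from earlier training runs and sample a new starting point from it. Everything here is an assumption for illustration:

```python
import numpy as np

def init_from_solutions(solution_weights, rng=np.random.default_rng(0)):
    """solution_weights: list of trained weight matrices of the same shape,
    each a solution point in weight space from a previous training run.
    Fit a per-entry Gaussian to them and sample a new starting point."""
    stacked = np.stack(solution_weights)   # shape: (n_runs, rows, cols)
    mu = stacked.mean(axis=0)
    sigma = stacked.std(axis=0) + 1e-8     # avoid a degenerate zero spread
    return rng.normal(mu, sigma)
```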

Time Series Forecasting Based on Modified Ensemble Algorithm (시계열 예측의 변형된 ENSEMBLE ALGORITHM)

  • Kim, Yon Hyong;Kim, Jae Hoon
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.1
    • /
    • pp.137-146
    • /
    • 2005
  • The neural network is one of the most notable techniques, usually providing more powerful forecasting models than traditional time series methods. When employing the ensemble technique in a forecasting model, one must provide an initial distribution. Usually the uniform distribution is assumed so that the initialization is noninformative. However, a sequential, informative initialization based on the data can be expected to reduce forecasting error further than uniform initialization. In this note, a modified ensemble algorithm using sequential initial probabilities is developed. The sequential distribution is designed to place more weight on recent data.
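
The abstract does not specify the sequential distribution, so the sketch below illustrates one simple realization: exponentially decaying resampling probabilities that place more weight on recent observations. The geometric decay scheme is our assumption, not the paper's design:

```python
import numpy as np

def recency_weights(n, decay=0.95):
    """Sequential initial probabilities that place more weight on recent
    observations (index n-1 is the most recent)."""
    w = decay ** np.arange(n - 1, -1, -1)   # oldest observation decays most
    return w / w.sum()                      # normalize to a distribution

# e.g. bootstrap-resample a length-100 series for one ensemble member
p = recency_weights(100)
rng = np.random.default_rng(0)
idx = rng.choice(100, size=100, p=p)
```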

Robust Test Generation for Stuck-Open Faults in CMOS Circuits (CMOS 회로의 Stuck-open 고장검출을 위한 로보스트 테스트 생성)

  • Jung, Jun-Mo;Lim, In-Chil
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.27 no.11
    • /
    • pp.42-48
    • /
    • 1990
  • In this paper, robust test generation for stuck-open faults in CMOS circuits is proposed. By obtaining initialization patterns and test patterns using the relationship between bit position and Hamming weight among input vectors, the test generation time for stuck-open faults can be reduced, the problem of input transition skew, which makes fault detection difficult, is solved, and the number of test sequences is minimized. The number of test sequences is further reduced by ordering them according to the Hamming distance between initialization patterns and test patterns.
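
As a rough illustration of the ordering idea only (not the paper's algorithm), test sequences can be sorted by the Hamming distance between each initialization pattern and its test pattern; the pattern pairs below are made up:

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two input vectors coded as bit masks."""
    return bin(a ^ b).count("1")

# Hypothetical (initialization pattern, test pattern) pairs; real pairs
# would come from the circuit's stuck-open fault list.
pairs = [(0b0110, 0b0111), (0b0000, 0b1011), (0b1100, 0b1101)]

# Order sequences so that pattern pairs differing in fewer inputs come
# first, reducing the chance of input transition skew between patterns.
pairs.sort(key=lambda p: hamming(*p))
```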

Comparison of Loss Function Minimization Techniques for a Strength Prediction Model Using DNN (DNN을 활용한 강도예측모델의 손실함수 최소화 기법 비교분석)

  • Han, Jun-Hui;Kim, Su-Hoo;Beak, Sung-Jin;Han, Soo-Hwan;Kim, Jong;Han, Min-Cheol
    • Proceedings of the Korean Institute of Building Construction Conference
    • /
    • 2022.04a
    • /
    • pp.182-183
    • /
    • 2022
  • In this study, we compared and analyzed various loss function minimization techniques to present a methodology for developing an AI-based prediction system. In the analysis, He initialization performed best, with an RMSE of 3.78, an R² of 0.94, and an error rate of 6%. However, it is considered desirable to construct the prediction system by combining the individual techniques for optimization.
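
For reference, the RMSE and R² figures quoted are the standard regression metrics; a minimal NumPy version:

```python
import numpy as np

def rmse(y_true, y_pred):
    # Root mean squared error of the predictions.
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - residual SS / total SS.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot
```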

Effects of Hyper-parameters and Dataset on CNN Training

  • Nguyen, Huu Nhan;Lee, Chanho
    • Journal of IKEEE
    • /
    • v.22 no.1
    • /
    • pp.14-20
    • /
    • 2018
  • The purpose of training a convolutional neural network (CNN) is to obtain weight factors that give high classification accuracy. The initial values of the hyper-parameters affect the training results, so it is important to train a CNN with a suitable hyper-parameter set: learning rate, batch size, weight initialization, and optimizer. We investigate the effect of each hyper-parameter individually, with the others fixed, in order to obtain a hyper-parameter set that gives higher classification accuracy and shorter training time, using a proposed VGG-like CNN since VGG is widely used. The CNN is trained on four datasets: CIFAR10, CIFAR100, GTSRB, and DSDL-DB. The effects of normalization and data transformation on the datasets are also investigated, and a training scheme using merged datasets is proposed.
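
The one-at-a-time protocol the abstract describes, varying a single hyper-parameter while the others stay at a fixed baseline, can be organized as a simple sweep. The names, values, and the train_and_eval function below are illustrative assumptions, not the paper's settings:

```python
# One-at-a-time sweep: vary a single hyper-parameter while the others
# stay at a fixed baseline. Names and values are illustrative.
baseline = {"lr": 0.01, "batch_size": 128, "init": "xavier", "optimizer": "adam"}
sweeps = {"lr": [0.1, 0.01, 0.001],
          "batch_size": [32, 128, 512],
          "init": ["xavier", "he"],
          "optimizer": ["sgd", "adam"]}

results = {}
for name, values in sweeps.items():
    for value in values:
        config = dict(baseline, **{name: value})  # override one setting
        # results[(name, value)] = train_and_eval(config)  # hypothetical trainer
```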