Title/Summary/Keyword: Gradient Descent Learning

Comparison of Gradient Descent for Deep Learning

  • Kang, Min-Jae
• Journal of the Korea Academia-Industrial cooperation Society, v.21 no.2, pp.189-194, 2020
  • This paper analyzes gradient descent, the method most widely used for training neural networks. Learning means updating the parameters so that the loss function, which quantifies the difference between actual and predicted values, reaches its minimum. Gradient descent uses the slope of the loss function to update the parameters so as to minimize error, and it is implemented in the major deep learning libraries. However, these algorithms are provided as black boxes, making it difficult to identify the advantages and disadvantages of the various gradient descent methods. This paper analyzes the characteristics of the gradient descent methods in current use: stochastic gradient descent, the momentum method, AdaGrad, and Adadelta. The experiments use the Modified National Institute of Standards and Technology (MNIST) data set, which is widely used to validate neural networks. The network has two hidden layers, the first with 500 neurons and the second with 300. The output layer uses the softmax activation function, the rectified linear unit is used for the input and hidden layers, and the loss function is the cross-entropy error.
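
For reference, the four update rules compared in this paper can be stated compactly. Below is a minimal NumPy sketch of SGD, momentum, AdaGrad, and Adadelta as single-step update functions; the hyperparameter values are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def sgd(w, grad, lr=0.01):
    # Plain stochastic gradient descent: step against the gradient.
    return w - lr * grad

def momentum(w, grad, v, lr=0.01, beta=0.9):
    # Momentum: a velocity term smooths and accelerates successive steps.
    v = beta * v - lr * grad
    return w + v, v

def adagrad(w, grad, g2, lr=0.01, eps=1e-8):
    # AdaGrad: per-parameter rate shrinks as squared gradients accumulate.
    g2 = g2 + grad ** 2
    return w - lr * grad / (np.sqrt(g2) + eps), g2

def adadelta(w, grad, eg2, edx2, rho=0.95, eps=1e-6):
    # Adadelta: decaying averages replace AdaGrad's running sum, which
    # removes the global learning rate entirely.
    eg2 = rho * eg2 + (1 - rho) * grad ** 2
    dx = -np.sqrt(edx2 + eps) / np.sqrt(eg2 + eps) * grad
    edx2 = rho * edx2 + (1 - rho) * dx ** 2
    return w + dx, eg2, edx2
```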

Gradient Descent Training Method for Optimizing Data Prediction Models

  • Hur, Kyeong
• Journal of Practical Engineering Education, v.14 no.2, pp.305-312, 2022
  • In this paper, we focus on training students to create and optimize a basic data prediction model, and we propose a method for teaching gradient descent, which is widely used in machine learning to optimize such models. The method visually shows the entire operation of gradient descent as differentiation is applied to optimize the parameter values required by a data prediction model, and it teaches the effective use of mathematical differentiation in machine learning. To visualize the whole process, gradient descent is implemented in a spreadsheet. First, a two-variable gradient descent training method is presented, and the accuracy of the resulting two-variable prediction model is verified by comparison with the least squares method. Second, a three-variable gradient descent training method is presented, and the accuracy of the three-variable prediction model is verified. A direction for gradient descent optimization practice is then presented, and the educational effect of the proposed method is analyzed through the results of a satisfaction survey of non-majors.
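
The two-variable case described here (a linear model y ≈ ax + b fit by gradient descent and checked against least squares) is easy to reproduce outside a spreadsheet. A minimal Python sketch, with made-up data standing in for the paper's spreadsheet examples:

```python
import numpy as np

# Hypothetical sample data; the paper's spreadsheet data is not reproduced here.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

a, b, lr = 0.0, 0.0, 0.01
for _ in range(5000):
    err = (a * x + b) - y           # prediction error
    a -= lr * 2 * np.mean(err * x)  # partial derivative of MSE w.r.t. a
    b -= lr * 2 * np.mean(err)      # partial derivative of MSE w.r.t. b

# Closed-form least squares for comparison, as the paper does.
A = np.vstack([x, np.ones_like(x)]).T
a_ls, b_ls = np.linalg.lstsq(A, y, rcond=None)[0]
print(a, b, "vs least squares:", a_ls, b_ls)
```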

A Study on the Development of Teaching-Learning Materials for Gradient Descent Method in College AI Mathematics Classes

  • Lee, Sang-Gu;Nam, Yun;Lee, Jae Hwa
• Communications of Mathematical Education, v.37 no.3, pp.467-482, 2023
  • In this paper, we present new teaching and learning materials on the gradient descent method, which is widely used in artificial intelligence, suitable for college mathematics. The materials explain gradient descent at the level of college calculus, and the accompanying SageMath code helps students solve minimization problems easily. We also introduce how to solve the least squares problem using gradient descent. This study can be helpful to instructors who teach college-level mathematics subjects such as calculus, engineering mathematics, numerical analysis, and applied mathematics.
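
The paper's SageMath materials are not reproduced here, but the least squares application it mentions can be sketched in plain Python: minimize ||Ax - b||^2 by stepping against its gradient 2 A^T (Ax - b). The matrix, step size, and iteration count below are illustrative assumptions.

```python
import numpy as np

# Least squares min ||Ax - b||^2 via gradient descent.
A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

x = np.zeros(2)
lr = 0.05  # illustrative step size, small enough for this A
for _ in range(2000):
    x -= lr * 2 * A.T @ (A @ x - b)   # gradient of the squared residual

# Check against the normal-equations solution.
print(x, "vs normal equations:", np.linalg.solve(A.T @ A, A.T @ b))
```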

Perceptron-like LVQ: Generalization of LVQ

  • Song, Geun-Bae;Lee, Haing-Sei
• Journal of the Institute of Electronics Engineers of Korea CI, v.38 no.1, pp.1-6, 2001
  • In this paper we reanalyze Kohonen's learning vector quantization (LVQ) learning rule, which is based on Hebb's learning rule, from the viewpoint of gradient descent. Kohonen's LVQ can be classified into two algorithms according to the learning mode: unsupervised LVQ (ULVQ) and supervised LVQ (SLVQ). Both algorithms can be represented as gradient descent methods if the target values of the output neurons are generated properly. As a result, the LVQ learning rule is seen to be a special case of gradient descent, and LVQ is represented by a generalized perceptron-like LVQ (PLVQ).
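
As a point of reference for the supervised case, the standard LVQ1 update, which this gradient-descent view interprets as a step on ±||x - w||^2 for the winning prototype, can be sketched as follows (a minimal illustration, not the paper's generalized PLVQ formulation):

```python
import numpy as np

def lvq1_step(prototypes, proto_labels, x, y, lr=0.05):
    # Supervised LVQ (LVQ1): move the nearest prototype toward x when its
    # class matches y, away otherwise. Seen as gradient descent, this is
    # a step on +/- ||x - w||^2 for the winning prototype w.
    winner = np.argmin(np.linalg.norm(prototypes - x, axis=1))
    sign = 1.0 if proto_labels[winner] == y else -1.0
    prototypes[winner] += sign * lr * (x - prototypes[winner])
    return prototypes
```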

Optimal Learning Rates in Gradient Descent Training of Multilayer Perceptrons

• Oh, Sang-Hoon
    • The Journal of the Korea Contents Association, v.4 no.3, pp.99-105, 2004
  • This paper proposes optimal learning rates for the gradient descent training of multilayer perceptrons: a separate learning rate for the weights associated with each neuron, and a separate one for assigning the virtual hidden targets associated with each training pattern. The effectiveness of the proposed method was demonstrated on handwritten digit recognition and isolated-word recognition tasks, and very fast learning convergence was obtained.
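
The abstract does not give the paper's formulas for choosing the optimal rates, but the underlying mechanism, a separate learning rate per neuron rather than one global rate, can be sketched as follows (all array shapes and values are hypothetical):

```python
import numpy as np

# One layer of an MLP with 10 neurons, 5 inputs each (hypothetical shapes).
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 5))        # weight matrix, one row per neuron
grad_W = rng.normal(size=(10, 5))   # loss gradient w.r.t. W (placeholder)

# One learning rate per neuron instead of a single global rate; how these
# rates are chosen optimally is the subject of the paper.
lr_per_neuron = np.full(10, 0.01)
W -= lr_per_neuron[:, None] * grad_W  # each row steps with its own rate
```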

Learning algorithms for big data logistic regression on RHIPE platform

  • Jung, Byung Ho;Lim, Dong Hoon
• Journal of the Korean Data and Information Science Society, v.27 no.4, pp.911-923, 2016
  • Machine learning becomes increasingly important in the big data era. Logistic regression is a classification method in machine learning and has been widely used in various fields, including medicine, economics, marketing, and the social sciences. Rhipe, which integrates the R and Hadoop environments, has been discussed by few researchers owing to the difficulty of its installation and of MapReduce implementation. In this paper, we present MapReduce implementations of the gradient descent and Newton-Raphson algorithms for logistic regression using Rhipe. The Newton-Raphson algorithm does not require a learning rate, while gradient descent needs a manually chosen one; we choose the learning rate by a mixed procedure of grid search and binary search in order to process big data efficiently. In the performance study, our Newton-Raphson algorithm outperforms the gradient descent algorithm on all the tested data.
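
On a single machine, the two algorithms compared here reduce to a few lines each; the MapReduce/RHIPE distribution is omitted. A hedged sketch, with the design matrix X and binary labels y assumed given:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_descent(X, y, lr=0.1, iters=5000):
    # Gradient descent: w <- w - lr * X^T (sigmoid(Xw) - y) / n.
    # Needs a hand-picked learning rate lr.
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

def newton_raphson(X, y, iters=10):
    # Newton-Raphson (IRLS): w <- w + (X^T R X)^{-1} X^T (y - p).
    # No learning rate is required.
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = sigmoid(X @ w)
        R = np.diag(p * (1 - p))
        w += np.linalg.solve(X.T @ R @ X, X.T @ (y - p))
    return w
```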

Parameter Learning of Dynamic Bayesian Networks using Constrained Least Square Estimation and Steepest Descent Algorithm

  • Cho, Hyun-Cheol;Lee, Kwon-Soon;Koo, Kyung-Wan
• The Transactions of the Korean Institute of Electrical Engineers P, v.58 no.2, pp.164-171, 2009
  • This paper presents a new learning algorithm for dynamic Bayesian networks (DBN) based on a constrained least squares (LS) estimation algorithm and the gradient descent method. First, we propose constrained LS-based parameter estimation for a Markov chain (MC) model given observed data sets. Next, gradient descent optimization is used for the online estimation of a hidden Markov model (HMM), which is constructed bilinearly by adding an observation variable to the MC model. Numerical simulations, in which a series of non-stationary random signals is applied to the DBN models, demonstrate the reliability and superiority of the approach.
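
The paper's constrained least squares formulation is not given in the abstract; as an illustration of gradient-descent learning of a Markov chain under the row-sum constraint, the sketch below keeps each transition row a probability distribution via a softmax parameterization (the parameterization, data, and step size are all assumptions):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

states = np.array([0, 1, 1, 0, 2, 2, 1, 0, 0, 2])  # hypothetical observations
n = 3
theta = np.zeros((n, n))  # unconstrained parameters; P = softmax(theta)
lr = 0.5

for _ in range(500):
    P = softmax(theta)
    grad = np.zeros_like(theta)
    for s, t in zip(states[:-1], states[1:]):
        # d(-log P[s, t]) / d theta[s, :] = P[s, :] - onehot(t)
        grad[s] += P[s]
        grad[s, t] -= 1.0
    theta -= lr * grad / (len(states) - 1)

print(np.round(softmax(theta), 2))  # learned transition matrix
```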

Gradient Descent Approach for Value-Based Weighting

  • Lee, Chang-Hwan;Bae, Joo-Hyun
• The KIPS Transactions: Part B, v.17B no.5, pp.381-388, 2010
  • Naive Bayesian learning has been widely used in many data mining applications and performs surprisingly well on many of them. However, because naive Bayesian learning assumes that all attributes are equally important, the posterior probabilities it estimates are sometimes poor. In this paper, we propose a more fine-grained weighting method, called value weighting, in the context of naive Bayesian learning. While current weighting methods assign a weight to each attribute, we assign a weight to each attribute value. We investigate how the proposed value weighting affects the performance of naive Bayesian learning, and we develop new gradient descent methods for both value weighting and feature weighting. The performance of the proposed methods was compared with attribute weighting and standard naive Bayesian, and the value weighting method performed better in most cases.
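
The value-weighting idea, one weight per attribute value rather than per attribute, can be illustrated with a small sketch. The log-score form and the objective (cross-entropy over a softmax of the weighted naive Bayes score) are assumptions for illustration, not necessarily the paper's exact formulation:

```python
import numpy as np

log_prior = np.log(np.array([0.5, 0.5]))  # log P(class), 2 classes
# log_lik[i][v][c] = log P(attribute i has value v | class c); rows = values.
log_lik = [np.log(np.array([[0.7, 0.2], [0.3, 0.8]])),
           np.log(np.array([[0.4, 0.6], [0.6, 0.4]]))]
# One weight per attribute VALUE; all ones recovers standard naive Bayes.
weights = [np.ones(2), np.ones(2)]

def score(xs):
    # Weighted naive Bayes log-score over classes for instance xs.
    s = log_prior.copy()
    for i, v in enumerate(xs):
        s += weights[i][v] * log_lik[i][v]
    return s

def grad_step(xs, y, lr=0.1):
    # One gradient descent step on the cross-entropy of softmax(score).
    s = score(xs)
    p = np.exp(s - s.max())
    p /= p.sum()
    err = p.copy()
    err[y] -= 1.0                    # d(-log p[y]) / d score
    for i, v in enumerate(xs):
        weights[i][v] -= lr * err @ log_lik[i][v]

grad_step((0, 1), y=0)  # update the weights touched by one instance
```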

Cluster Analysis Algorithms Based on the Gradient Descent Procedure of a Fuzzy Objective Function

  • Rhee, Hyun-Sook;Oh, Kyung-Whan
• Journal of Electrical Engineering and Information Science, v.2 no.6, pp.191-196, 1997
  • Fuzzy clustering plays an important role in solving many problems, and the fuzzy c-means (FCM) algorithm is the one most frequently used for it. However, some fixed points of the FCM algorithm, known from Tucker's counterexample, are not reasonable solutions. Moreover, the FCM algorithm cannot perform on-line learning, since it is fundamentally a batch learning scheme. This paper presents unsupervised learning networks as an attempt to remedy these shortcomings of the conventional clustering algorithm. The model integrates the optimization function of the FCM algorithm into unsupervised learning networks, and the learning rule of the proposed scheme is derived formally from the gradient descent procedure applied to a fuzzy objective function. From this derivation, two fuzzy cluster analysis algorithms, a batch learning version and an on-line learning version, are devised. They are tested on several data sets and compared with FCM, and the experimental results show that the proposed algorithms find reasonable solutions on Tucker's counterexample.
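
For the on-line case, a per-sample gradient step on the fuzzy objective J = Σᵢⱼ uᵢⱼᵐ ‖xⱼ − vᵢ‖² can be sketched as follows; the closed-form membership and the center update are a plain FCM-style illustration, not the paper's network formulation:

```python
import numpy as np

def online_fcm_step(centers, x, lr=0.05, m=2.0):
    # One on-line step on the fuzzy objective sum_i u_i^m ||x - v_i||^2.
    d2 = np.sum((centers - x) ** 2, axis=1) + 1e-12
    u = (1.0 / d2) ** (1.0 / (m - 1.0))  # FCM membership for fixed centers
    u /= u.sum()
    # Gradient descent step on the centers with the memberships held fixed
    # (the factor 2 from the derivative is folded into lr).
    centers += lr * (u ** m)[:, None] * (x - centers)
    return centers
```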

An Adaptive PID Controller Design based on a Gradient Descent Learning

• Park, Jin-Hyun;Kim, Hyun-Duck;Choi, Young-Kiu
    • Journal of the Korea Institute of Information and Communication Engineering, v.10 no.2, pp.276-282, 2006
  • The PID controller has been widely used in industry because it has a simple structure and is robust to modeling error. However, it is difficult to obtain uniformly good control performance under variations in system parameters or for different velocity commands. In this paper, we propose an adaptive PID controller based on gradient descent learning. The algorithm has a structure as simple as a conventional PID controller, yet is robust to system parameter variations and different velocity commands. To verify the performance of the proposed adaptive PID controller, speed control of a nonlinear DC motor is performed. The simulation results show that the proposed control system is effective in tracking a command velocity under system parameter variations.
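
The adaptation law is not spelled out in the abstract; a common gradient-descent sketch adapts the gains on the squared tracking error e = r − y, approximating the plant sensitivity dy/du by its sign (an assumption for illustration, not the paper's exact law):

```python
import numpy as np

def pid_adapt_step(K, e, e_int, e_dot, dy_du_sign=1.0, lr=1e-4):
    # One gradient step on E = e^2 / 2 for the gains K = [Kp, Ki, Kd].
    # With u = Kp*e + Ki*e_int + Kd*e_dot, we have du/dK = [e, e_int, e_dot]
    # and dE/dK = -e * (dy/du) * (du/dK); dy/du is replaced by its sign.
    du_dK = np.array([e, e_int, e_dot])
    return K + lr * e * dy_du_sign * du_dK
```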