Search | Korea Science

Comparison of Reinforcement Learning Activation Functions to Improve the Performance of the Racing Game Learning Agent

Lee, Dongcheul
- Journal of Information Processing Systems / v.16 no.5 / pp.1074-1082 / 2020
Recently, research has been actively conducted on artificial intelligence agents that learn games through reinforcement learning. Several factors determine performance when an agent learns a game, and the choice of activation function is one of the important ones. This paper compares and evaluates which activation function yields the best results when an agent learns a game through reinforcement learning in a 2D racing game environment. We built the agent using a reinforcement learning algorithm and a neural network, and evaluated the activation functions by swapping them in the network. We measured the reward, the output of the advantage function, and the output of the loss function during training and testing. The performance evaluation identified the best activation function for the agent to learn the game. The difference between the best and the worst was 35.4%.
https://doi.org/10.3745/JIPS.02.0141
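
The idea of swapping activation functions in an otherwise unchanged network can be sketched as follows. This is only an illustration: the abstract does not say which activation functions were compared or how the network was laid out, so the three functions, the layer sizes, and the names `ACTIVATIONS` and `dense_forward` are all assumptions.

```python
import math

# Hypothetical registry of activation functions; the abstract does not
# name the functions that were actually compared.
ACTIVATIONS = {
    "relu": lambda x: max(0.0, x),
    "tanh": math.tanh,
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
}


def dense_forward(inputs, weights, biases, activation):
    """Forward pass of one fully connected layer using the named activation."""
    act = ACTIVATIONS[activation]
    return [
        act(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Same (made-up) layer parameters and input, different activation each time.
weights = [[0.5, -1.0], [1.5, 2.0]]
biases = [0.0, -1.0]
state = [1.0, 2.0]
outputs = {
    name: dense_forward(state, weights, biases, name) for name in ACTIVATIONS
}
```

Keeping the weights fixed and changing only the `activation` argument isolates the effect of the activation function, which is the kind of comparison the abstract describes.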

Prediction of Asphalt Pavement Service Life using Deep Learning (딥러닝을 활용한 일반국도 아스팔트포장의 공용수명 예측)

Choi, Seunghyun;Do, Myungsik
- International Journal of Highway Engineering / v.20 no.2 / pp.57-65 / 2018
PURPOSES : The study aims to predict the service life of national highway asphalt pavements through deep learning methods, using maintenance history data from the National Highway Pavement Management System. METHODS : To configure the deep learning network, this study used TensorFlow 1.5, an open-source deep learning framework with excellent usability. For the analysis, nine variables (cumulative annual average daily traffic, cumulative equivalent single axle loads, maintenance layer, surface, base, subbase, anti-frost layer, structural number of pavement, and region) were selected as input data and service life was chosen as output data, forming the input and output layers. Additionally, for scenario analysis, models were built with 1, 2, 4, and 8 hidden layers, and a simulation analysis was performed according to whether an overfitting-mitigation algorithm was applied. RESULTS : The analysis showed that, regardless of the number of hidden layers, applying an overfitting-mitigation algorithm such as dropout improves prediction capability, as the coefficient of determination ($R^2$) on the test data increases. Furthermore, the sensitivity analysis on the region variable demonstrates that estimating service life requires sufficient consideration of regional characteristics, as $R^2$ had a maximum of between 0.73 and 0.84 when regional variables were taken into consideration. CONCLUSIONS : This study proposes that it is possible to precisely predict the service life of national highway pavement sections by considering traffic, pavement thickness, and regional factors, and concludes that the predicted service life can serve as fundamental data for decision making within pavement management systems.
https://doi.org/10.7855/IJHE.2018.20.2.057
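
The two quantities the abstract relies on, dropout and the coefficient of determination, can be sketched in plain Python. This is a minimal illustration and not the study's TensorFlow 1.5 code; the function names and the "inverted dropout" rescaling are assumptions made here for clarity.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate` and
    rescale the survivors by 1 / (1 - rate) so the expected value is kept."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
hidden = [1.0, 1.0, 1.0, 1.0]         # made-up hidden-layer activations
dropped = dropout(hidden, 0.5, rng)   # each entry is either 0.0 or 2.0
```

Dropout is applied only during training; at test time the full layer is used, and $R^2$ on held-out data then indicates how well the fitted model generalizes.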

Influence on overfitting and reliability due to change in training data

Kim, Sung-Hyeock;Oh, Sang-Jin;Yoon, Geun-Young;Jung, Yong-Gyu;Kang, Min-Soo
- International Journal of Advanced Culture Technology / v.5 no.2 / pp.82-89 / 2017
The range of problems that can be handled has expanded rapidly with the growing use of big data and the development of hardware, and machine learning, including deep learning, has become a very versatile technology. In this paper, the MNIST data set is used as experimental data, and the cross-entropy function is used as the loss function for evaluating the efficiency of machine learning. We applied the GradientDescentOptimize algorithm, a steepest descent method, to minimize the value of the loss function, and updated the weights and biases via backpropagation. In this way we analyze the optimal reliability value corresponding to the number of training iterations and the optimal reliability value without overfitting. Comparing the onset of overfitting as the amount of data changes, based on the number of training iterations, we obtained a result of 92% at 1110 training iterations, which is the optimal reliability value without overfitting.
https://doi.org/10.17703/IJACT.2017.5.2.82
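
Minimizing a cross-entropy loss by steepest descent, as the abstract describes, can be sketched for a tiny linear softmax classifier. This is a stdlib-only illustration under assumed values: the two-feature input, the learning rate, and the helper names are hypothetical, and the paper's own MNIST model is not reproduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-entropy loss for a single example with an integer class label."""
    return -math.log(probs[label])


def gd_step(weights, biases, x, label, lr):
    """One steepest-descent update; returns the loss before the update."""
    logits = [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    loss = cross_entropy(probs, label)
    for k in range(len(weights)):
        grad = probs[k] - (1.0 if k == label else 0.0)  # dLoss / dlogit_k
        weights[k] = [w - lr * grad * xi for w, xi in zip(weights[k], x)]
        biases[k] -= lr * grad
    return loss


weights = [[0.0, 0.0], [0.0, 0.0]]   # 2 classes, 2 features (made-up sizes)
biases = [0.0, 0.0]
losses = [gd_step(weights, biases, [1.0, 2.0], 1, 0.1) for _ in range(20)]
```

Tracking `losses` on training data alongside accuracy on held-out data is one common way to spot the point at which further training iterations start to overfit.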

The Cucumber Cognizance for Back Propagation of Neural Network (신경회로망의 오류역전파 알고리즘을 이용한 오이 인식)

Min, Byeong-Ro;Lee, Dae-Weon
- Journal of Bio-Environment Control / v.20 no.4 / pp.277-282 / 2011
We carried out shape recognition, identifying the characteristic shape of cucumbers by means of a neural network trained with the backpropagation algorithm. We developed an algorithm that finds the position and shape of objects in real images, and reached the following conclusions. Feature-shape extraction was performed so that cucumbers could be detected automatically. Among the output patterns recognized as cucumber, the rate of mis-detected objects was 0.1~4.2%. Output patterns were obtained at image resolutions of $445{\times}363$, $501{\times}391$, $450{\times}271$, and $297{\times}421$; no change was detected across these resolutions. When the number of learning patterns was increased to 25, the mis-detection ratio was 16.02%, and with 2 learning patterns, 8 cucumbers in 40 images were not detected.

Adaptive control for robot manipulator through repeated learning (반복 학습을 통한 로보트 매니퓰레이터의 적응 제어)

Lee, Cheol;An, Duk-Hwan;Lee, Sang-Hyo
- 제어로봇시스템학회:학술대회논문집 (Institute of Control, Robotics and Systems: Conference Proceedings) / 1990.10a / pp.269-274 / 1990
Usually, robot manipulators in production lines are operated along repeating work trajectories. This paper presents a repeated adaptive learning algorithm for robot manipulators for the case of such a trajectory. The algorithm uses a nonlinear dynamic model that includes a repeated friction-compensating term. The advantage of the scheme is that it allows friction compensation, which may otherwise be difficult for differently constructed models. A secondary advantage of the scheme is that it can also be adapted to torque calculation in order to reduce the computational load of the control computer. To show the efficiency of the proposed controller, a computer simulation is performed for a planar robot manipulator with 2 degrees of freedom.

Assembly performance evaluation method for prefabricated steel structures using deep learning and k-nearest neighbors

Hyuntae Bang;Byeongjun Yu;Haemin Jeon
- Smart Structures and Systems / v.32 no.2 / pp.111-121 / 2023
This study proposes an automated assembly performance evaluation method for prefabricated steel structures (PSSs) using machine learning methods. Assembly component images were segmented using a modified version of the receptive field pyramid. By factorizing channel modulation and the receptive field exploration layers of the convolution pyramid, highly accurate segmentation results were obtained. After completing segmentation, the positions of the bolt holes were calculated using various image processing techniques, such as fuzzy-based edge detection, Hough's line detection, and image perspective transformation. By calculating the distance ratio between bolt holes, the assembly performance of the PSS was estimated using the k-nearest neighbors (kNN) algorithm. The effectiveness of the proposed framework was validated using a 3D PSS printing model and a field test. The results indicated that this approach could recognize assembly components with an intersection over union (IoU) of 95% and evaluate assembly performance with an error of less than 5%.
https://doi.org/10.12989/sss.2023.32.2.111
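
Two building blocks named in the abstract, intersection over union and k-nearest-neighbors classification on bolt-hole distance ratios, can be sketched as below. The training ratios, the labels "aligned" and "misaligned", and the function names are hypothetical; the study's segmentation network and image-processing steps are not reproduced.

```python
from collections import Counter


def iou(pixels_a, pixels_b):
    """Intersection over union of two masks given as sets of (row, col)."""
    a, b = set(pixels_a), set(pixels_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def knn_predict(train, query, k):
    """Majority label among the k training points closest to `query`.

    `train` is a list of (feature_vector, label) pairs; squared Euclidean
    distance is used, which gives the same ranking as Euclidean distance.
    """
    nearest = sorted(
        train,
        key=lambda item: sum((f - q) ** 2 for f, q in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Made-up distance ratios between bolt holes, with made-up labels.
train = [
    ([1.00], "aligned"), ([0.99], "aligned"), ([1.02], "aligned"),
    ([1.20], "misaligned"), ([1.25], "misaligned"), ([0.80], "misaligned"),
]
prediction = knn_predict(train, [1.01], 3)
```

An odd `k` avoids ties between two classes; with more classes or noisier ratios, a distance-weighted vote might be a reasonable alternative.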

The Gripping Force Control of Robot Manipulator Using the Repeated Learning Function Techniques (반복 학습기능을 이용한 로봇 매니퓰레이터의 파지력제어)

Kim, Tea-Kwan;Baek, Seung-Hack;Kim, Tea-Soo
- Journal of the Korean Society of Industry Convergence / v.18 no.1 / pp.45-52 / 2015
In this paper, the repeated learning technique of a neural network was used for a gripping force control algorithm. A hybrid control system was introduced, and the manipulator's fingers were reconfigured from two to three for comfortable gripping. The data were obtained using the gripping force produced by the repeated learning technique. In the future, an adjustable gripping force will be obtained and its accuracy improved using artificial intelligence techniques.
https://doi.org/10.21289/KSIC.2015.18.1.045

LMI-Based Synthesis of Robust Iterative Learning Controller with Current Feedback for Linear Uncertain Systems

Xu, Jianming;Sun, Mingxuan;Yu, Li
- International Journal of Control, Automation, and Systems / v.6 no.2 / pp.171-179 / 2008
This paper addresses the synthesis of an iterative learning controller for a class of linear systems with norm-bounded parameter uncertainties. We take into account an iterative learning algorithm with current cycle feedback in order to achieve both robust convergence and robust stability. The synthesis problem of the developed iterative learning control (ILC) system is reformulated as the ${\gamma}$-suboptimal $H_{\infty}$ control problem via the linear fractional transformation (LFT). A sufficient convergence condition of the ILC system is presented in terms of linear matrix inequalities (LMIs). Furthermore, the ILC system with fast convergence rate is constructed using a convex optimization technique with LMI constraints. The simulation results demonstrate the effectiveness of the proposed method.
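
The structure of an iterative learning law with current-cycle feedback can be illustrated on a scalar plant. This sketch does not perform the LMI-based synthesis from the abstract: the plant `x(t+1) = a*x(t) + b*u(t)`, the sinusoidal reference, and the hand-picked gains `k_fb` and `l_gain` are all assumptions chosen so that the tracking error shrinks from trial to trial.

```python
import math


def run_ilc(trials=20, horizon=50, a=0.9, b=1.0, k_fb=0.5, l_gain=0.5):
    """Simulate ILC with current-cycle feedback; returns the maximum
    absolute tracking error observed in each trial."""
    ref = [math.sin(2.0 * math.pi * t / horizon) for t in range(horizon + 1)]
    u_ff = [0.0] * horizon            # learned feedforward input
    max_errors = []
    for _ in range(trials):
        x = 0.0                       # identical initial state every trial
        err = [ref[0] - x]
        for t in range(horizon):
            # Feedforward from earlier trials plus feedback on the
            # error measured in the current trial.
            u = u_ff[t] + k_fb * err[t]
            x = a * x + b * u
            err.append(ref[t + 1] - x)
        max_errors.append(max(abs(e) for e in err))
        # Learning update: correct the input at time t using the error
        # observed one step later in the trial that just finished.
        u_ff = [u_ff[t] + l_gain * err[t + 1] for t in range(horizon)]
    return max_errors


errors = run_ilc()
```

With these particular gains the closed-loop pole is `a - b*k_fb = 0.4` and the learning factor is `1 - l_gain*b = 0.5`, which is why the error contracts here; for an uncertain plant, gains with a guaranteed convergence margin would have to come from a synthesis procedure such as the one the paper proposes.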

Aspect-based Sentiment Analysis of Product Reviews using Multi-agent Deep Reinforcement Learning

M. Sivakumar;Srinivasulu Reddy Uyyala
- Asia Pacific Journal of Information Systems / v.32 no.2 / pp.226-248 / 2022
The existing model for sentiment analysis of product reviews learned from past data, and new data was labeled based on that training. But new data was never used by the existing system for making a decision. The proposed Aspect-based multi-agent Deep Reinforcement learning Sentiment Analysis (ADRSA) model learns from its very first data without the help of any training dataset and labels a sentence with aspect category and sentiment polarity. It keeps on learning from new data and updates its knowledge to improve its intelligence. The decision of the proposed system changes over time based on the new data. As a result, the accuracy of sentiment analysis using deep reinforcement learning improved over supervised and unsupervised learning methods. Hence, the sentiments of premium customers on a particular site can be shared with other customers effectively. A dynamic environment with a strong knowledge base can help the system remember sentences, and the use of the State Action Reward State Action (SARSA) algorithm with the Bidirectional Encoder Representations from Transformers (BERT) model improved the performance of the proposed system in terms of accuracy when compared to state-of-the-art methods.
https://doi.org/10.14329/apjis.2022.32.2.226
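
The SARSA update named in the abstract can be written in a few lines for a tabular value function. This is the generic textbook rule, not the ADRSA model: the state and action names, the reward values, and the default `alpha` and `gamma` are hypothetical, and the BERT encoder and multi-agent setup are left out.

```python
def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """On-policy update:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a))."""
    current = q.get((state, action), 0.0)
    target = reward + gamma * q.get((next_state, next_action), 0.0)
    q[(state, action)] = current + alpha * (target - current)
    return q[(state, action)]


# Made-up example: states are sentences, actions are polarity labels.
q_table = {}
first = sarsa_update(q_table, "sentence_0", "positive", 1.0,
                     "sentence_1", "negative")
q_table[("sentence_1", "negative")] = 2.0
second = sarsa_update(q_table, "sentence_0", "positive", 1.0,
                      "sentence_1", "negative")
```

Because the target uses the action actually taken in the next state, the estimate keeps adjusting as new data arrives, which matches the continual-learning behavior the abstract emphasizes.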

Support Vector Machine based on Stratified Sampling

Jun, Sung-Hae
- International Journal of Fuzzy Logic and Intelligent Systems / v.9 no.2 / pp.141-146 / 2009
The support vector machine is a classification algorithm based on statistical learning theory. It has shown good performance in many results in the data mining field, but the algorithm has some problems. One of them is its heavy computing cost, which has made it difficult to use the support vector machine in dynamic and online systems. To overcome this problem, we propose to use stratified sampling from statistical sampling theory. Stratified sampling helps reduce the size of the training data. In our paper, although the size of the data is small, the accuracy is maintained. We verify our improved performance by experimental results using data sets from the UCI machine learning repository.
https://doi.org/10.5391/IJFIS.2009.9.2.141
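
Reducing a training set by stratified sampling, so that each class keeps its share of the data, might look like the sketch below. Using the class label as the stratum, the sampling fraction, and the function name are assumptions; the abstract does not specify how the strata were defined, and the support vector machine training itself is omitted.

```python
import random
from collections import defaultdict


def stratified_sample(samples, labels, fraction, seed=0):
    """Draw the same fraction from every class (stratum), keeping at
    least one example per class; returns (sample, label) pairs."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    picked = []
    for label, members in by_class.items():
        n = max(1, round(fraction * len(members)))
        picked.extend((s, label) for s in rng.sample(members, n))
    return picked


# Made-up data: 80 examples of class "a" and 20 of class "b".
samples = list(range(100))
labels = ["a"] * 80 + ["b"] * 20
subset = stratified_sample(samples, labels, 0.25)
```

The reduced `subset` keeps the original 4:1 class ratio, so a classifier trained on it sees the same class balance at a quarter of the cost in training examples.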
