• Title/Summary/Keyword: Q-Learning algorithm

153 search results

Q-learning to improve learning speed using Minimax algorithm (미니맥스 알고리즘을 이용한 학습속도 개선을 위한 Q러닝)

  • Shin, YongWoo
    • Journal of Korea Game Society, v.18 no.4, pp.99-106, 2018
  • Board games have many game characters and large state spaces, so learning takes a long time. This paper applies a reinforcement learning algorithm, which has a known weakness: at the beginning of training, learning is slow. We therefore tried to improve the learning speed with a heuristic that uses knowledge of the problem domain, consulting the game tree whenever several actions share the same best value during learning. To compare the existing character with the improved one, we implemented a board game and pitted the improved character against a one-sidedly attacking character. The improved character attacked its opponent while considering the game tree. The experiments show that the improved character learned faster.
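
The abstract gives no pseudocode; a minimal sketch of the tie-breaking idea, assuming a hypothetical `minimax_value(state, action)` heuristic supplied by the game's domain knowledge, might look like this:

```python
import random

def select_action(Q, state, actions, minimax_value, epsilon=0.1):
    """Epsilon-greedy Q-learning action selection; ties among equal
    best Q-values are broken by a shallow game-tree heuristic.
    `minimax_value(state, action)` is an assumed domain function."""
    if random.random() < epsilon:
        return random.choice(actions)
    best_q = max(Q[(state, a)] for a in actions)
    ties = [a for a in actions if Q[(state, a)] == best_q]
    if len(ties) == 1:
        return ties[0]
    # Several equally good actions: let the minimax heuristic decide,
    # which is where the claimed early-learning speed-up comes from.
    return max(ties, key=lambda a: minimax_value(state, a))
```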

Q-Learning Policy and Reward Design for Efficient Path Selection (효율적인 경로 선택을 위한 Q-Learning 정책 및 보상 설계)

  • Yong, Sung-Jung;Park, Hyo-Gyeong;You, Yeon-Hwi;Moon, Il-Young
    • Journal of Advanced Navigation Technology, v.26 no.2, pp.72-77, 2022
  • Q-Learning is a reinforcement learning technique that learns an optimal policy by learning a Q-function, which evaluates the action taken in a given state and predicts the expected future return. It is widely used as a basic algorithm for reinforcement learning. In this paper, we studied how effectively efficient paths can be selected and learned by designing policies and rewards on top of Q-Learning. The existing algorithm with a punishment compensation policy and the proposed punishment reinforcement policy were compared under the same number of training episodes in the 8x8 grid environment of the Frozen Lake game. The comparison shows that the proposed Q-Learning punishment reinforcement policy can significantly increase learning speed over the conventional algorithms.
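
The abstract does not spell out the reward constants; a self-contained sketch of penalty-shaped ("punishment") Q-learning on an 8x8 Frozen-Lake-style grid, with assumed rewards of +1 for the goal and -1 for a hole, could be:

```python
import random

# 8x8 Frozen-Lake-style map: S start, F frozen, H hole, G goal
# (the widely used Gym 8x8 layout).
MAP = ["SFFFFFFF", "FFFFFFFF", "FFFHFFFF", "FFFFFHFF",
       "FFFHFFFF", "FHHFFFHF", "FHFFHFHF", "FFFHFFFG"]
N = 8
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

# Assumed reward design: goal +1, hole -1 (the "punishment"), step 0.
def step(r, c, a):
    dr, dc = MOVES[a]
    r = min(max(r + dr, 0), N - 1)
    c = min(max(c + dc, 0), N - 1)
    tile = MAP[r][c]
    if tile == "G":
        return r, c, 1.0, True
    if tile == "H":
        return r, c, -1.0, True   # penalty instead of a plain 0
    return r, c, 0.0, False

Q = [[0.0] * 4 for _ in range(N * N)]
alpha, gamma, eps = 0.1, 0.95, 0.1

for episode in range(5000):
    r = c = 0
    for t in range(200):          # cap episode length
        s = r * N + c
        a = random.randrange(4) if random.random() < eps \
            else max(range(4), key=lambda x: Q[s][x])
        r, c, reward, done = step(r, c, a)
        s2 = r * N + c
        target = reward if done else reward + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])
        if done:
            break
```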

Multi Behavior Learning of Lamp Robot based on Q-learning (강화학습 Q-learning 기반 복수 행위 학습 램프 로봇)

  • Kwon, Ki-Hyeon;Lee, Hyung-Bong
    • Journal of Digital Contents Society, v.19 no.1, pp.35-41, 2018
  • The Q-learning algorithm, based on reinforcement learning, is useful for learning one behavior's goal at a time over a combination of discrete states and actions. To learn multiple behaviors, applying a behavior-based architecture with an appropriate behavior-arbitration method lets a robot act quickly and reliably. Q-learning is a popular reinforcement learning method and is widely used for robot learning because it is simple, convergent, and, being off-policy, little affected by the training environment. In this paper, the Q-learning algorithm is applied to a lamp robot to learn multiple behaviors (human recognition and desk-object recognition). Since the learning rate of Q-learning can affect the robot's performance when learning multiple behaviors, we derive an optimal multi-behavior learning model by varying the learning rate.
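
The paper's environment is a physical lamp robot, so only the shape of the update can be sketched here; the role of the learning rate shows up directly in the standard Q-learning rule (behavior names come from the abstract, everything else is an assumption):

```python
N_STATES, N_ACTIONS = 16, 4          # sizes are assumptions

def q_update(Q, s, a, reward, s_next, alpha, gamma=0.9):
    """Standard off-policy Q-learning update; the learning rate alpha
    is the quantity the paper varies across multi-behavior training."""
    Q[s][a] += alpha * (reward + gamma * max(Q[s_next]) - Q[s][a])

# One Q-table per behavior, as in a behavior-based architecture.
behaviors = ["human_recognition", "desk_object_recognition"]
q_tables = {b: [[0.0] * N_ACTIONS for _ in range(N_STATES)]
            for b in behaviors}
learning_rates = [0.1, 0.3, 0.5, 0.7, 0.9]
# For each behavior, a table would be trained per alpha on the robot
# and the fastest-converging alpha kept for that behavior.
```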

Real-Time Path Planning for Mobile Robots Using Q-Learning (Q-learning을 이용한 이동 로봇의 실시간 경로 계획)

  • Kim, Ho-Won;Lee, Won-Chang
    • Journal of IKEEE, v.24 no.4, pp.991-997, 2020
  • Reinforcement learning has been applied mainly to sequential decision-making problems, and in recent years reinforcement learning combined with neural networks has brought success in previously unsolved fields. However, reinforcement learning with deep neural networks has the disadvantage of being too complex for immediate use in the field. In this paper, we implemented a path-planning algorithm for mobile robots using Q-learning, one of the simpler reinforcement learning algorithms. Because generating the Q-table in advance has obvious limitations, we used real-time Q-learning, which updates the Q-table on the fly. By adjusting the exploration strategy, we obtained the learning speed required for real-time Q-learning. Finally, we compared the performance of real-time Q-learning and DQN.
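
The exploration schedule the authors used is not given in the abstract; one common adjustment that yields real-time-friendly learning speed is an exponentially decaying epsilon, sketched here under assumed constants:

```python
import math, random

def epsilon(t, eps_min=0.05, eps_max=1.0, decay=0.001):
    """Assumed exploration schedule: exponential decay from eps_max
    to eps_min as the step count t grows, so early exploration gives
    way to exploitation fast enough for real-time use."""
    return eps_min + (eps_max - eps_min) * math.exp(-decay * t)

def act(Q, s, n_actions, t):
    """Epsilon-greedy action under the decaying schedule."""
    if random.random() < epsilon(t):
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[s][a])
```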

A Learning based Algorithm for Traveling Salesman Problem (강화학습기법을 이용한 TSP의 해법)

  • Lim, JoonMook;Bae, SungMin;Suh, JaeJoon
    • Journal of Korean Institute of Industrial Engineers, v.32 no.1, pp.61-73, 2006
  • This paper deals with the traveling salesman problem (TSP) with stochastic travel times. In practice, the travel time between demand points changes by day and time of day because of traffic interference and congestion. Since almost all previous studies focus on the TSP with deterministic travel times, their results are difficult to apply directly to logistics problems; yet many logistics problems involve stochastic elements such as stochastic travel times, so an efficient solution method for this variant is needed. Previous research shows that Q-learning can cope with stochastic environments and that a neural network can approximate the Q-values of the Q-learning algorithm. In this paper, we suggest an algorithm for the TSP with stochastic travel times that integrates Q-learning and a neural network, and we evaluate its validity through computational experiments. The simulation results indicate that routes obtained from the suggested algorithm give comparatively more reliable travel times in logistics settings with stochastic travel times.
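
The paper's network architecture and features are not described in the abstract; the sketch below shows one plausible integration of Q-learning with a small neural network for a cost-minimizing TSP with sampled stochastic travel times (city count, noise model, features, and network size are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10                                   # number of cities (assumed)
base = rng.uniform(1.0, 10.0, (N, N))    # mean travel times (assumed)

def sample_time(i, j):
    """Stochastic travel time: mean plus traffic noise (assumption)."""
    return max(0.1, base[i, j] + rng.normal(0.0, 1.0))

# Tiny one-hidden-layer Q-network: input = one-hot current city plus
# visited mask; output = predicted cost-to-go for moving to each city.
W1 = rng.normal(0, 0.1, (2 * N, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.1, (32, N));     b2 = np.zeros(N)

def features(city, visited):
    x = np.zeros(2 * N); x[city] = 1.0; x[N:] = visited
    return x

def q_values(x):
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

alpha, gamma, eps = 0.01, 1.0, 0.2
for episode in range(2000):
    visited = np.zeros(N); city = 0; visited[0] = 1.0
    while visited.sum() < N:
        x = features(city, visited)
        h, q = q_values(x)
        choices = [j for j in range(N) if visited[j] == 0.0]
        nxt = (rng.choice(choices) if rng.random() < eps
               else min(choices, key=lambda j: q[j]))  # minimize cost
        cost = sample_time(city, nxt)
        visited2 = visited.copy(); visited2[nxt] = 1.0
        # TD target: cost now plus smallest predicted cost-to-go.
        if visited2.sum() < N:
            _, q2 = q_values(features(nxt, visited2))
            rest = [j for j in range(N) if visited2[j] == 0.0]
            target = cost + gamma * min(q2[j] for j in rest)
        else:
            target = cost + sample_time(nxt, 0)  # return to depot
        # One SGD step on the squared TD error of the chosen action.
        err = q[nxt] - target
        gh = W2[:, nxt] * err
        W2 -= alpha * np.outer(h, np.eye(N)[nxt] * err)
        b2 -= alpha * err * np.eye(N)[nxt]
        W1 -= alpha * np.outer(x, gh * (1 - h ** 2))
        b1 -= alpha * gh * (1 - h ** 2)
        city, visited = nxt, visited2
```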

Behavior Learning and Evolution of Swarm Robot based on Harmony Search Algorithm (Harmony Search 알고리즘 기반 군집로봇의 행동학습 및 진화)

  • Kim, Min-Kyung;Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems, v.20 no.3, pp.441-446, 2010
  • In a swarm robot system, each robot decides its own behavior according to its surrounding circumstances and must carry out assigned tasks in cooperation with the other robots. Each robot therefore needs the ability to learn and evolve in order to adapt to a changing environment. In this paper, we propose learning based on the Q-learning algorithm and evolution based on the Harmony Search algorithm, aiming to improve accuracy by using Harmony Search instead of a Genetic Algorithm. We verify that the swarm robots' ability to perform the task is improved.
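
Harmony Search itself is a standard metaheuristic; a minimal sketch follows, with the objective left as a stub (for example, a robot's task error for a candidate set of Q-learning parameters, which is an assumption since the abstract does not define the fitness):

```python
import random

def harmony_search(objective, dim, bounds, hm_size=20, hmcr=0.9,
                   par=0.3, bandwidth=0.05, iters=2000):
    """Minimal Harmony Search (standard algorithm, not the paper's
    exact settings). `objective` maps a vector to a value to minimize;
    `bounds` is a (lo, hi) pair applied to every dimension."""
    lo, hi = bounds
    memory = [[random.uniform(lo, hi) for _ in range(dim)]
              for _ in range(hm_size)]
    scores = [objective(h) for h in memory]
    for _ in range(iters):
        new = []
        for d in range(dim):
            if random.random() < hmcr:             # memory consideration
                v = random.choice(memory)[d]
                if random.random() < par:          # pitch adjustment
                    v += random.uniform(-1, 1) * bandwidth
            else:                                  # random selection
                v = random.uniform(lo, hi)
            new.append(min(max(v, lo), hi))
        s = objective(new)
        worst = max(range(hm_size), key=lambda i: scores[i])
        if s < scores[worst]:                      # replace worst harmony
            memory[worst], scores[worst] = new, s
    best = min(range(hm_size), key=lambda i: scores[i])
    return memory[best], scores[best]
```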

Q-Learning based Collision Avoidance for 802.11 Stations with Maximum Requirements

  • Chang Kyu Lee;Dong Hyun Lee;Junseok Kim;Xiaoying Lei;Seung Hyong Rhee
    • KSII Transactions on Internet and Information Systems (TIIS), v.17 no.3, pp.1035-1048, 2023
  • The IEEE 802.11 WLAN adopts a random backoff algorithm as its collision avoidance mechanism, and it is well known that this contention-based algorithm may suffer performance degradation, especially in congested networks. In this paper, we design an efficient backoff algorithm that uses a reinforcement learning method to determine optimal backoff values. In our scheme the mobile nodes share a common contention window (CW), and using a Q-learning algorithm they can avoid collisions by finding and implicitly reserving their optimal time slot(s). In addition, we introduce a Frame Size Control (FSC) algorithm to minimize the possible degradation of aggregate throughput when the number of nodes exceeds the CW size. Our simulations show that the proposed backoff algorithm with the FSC method outperforms the 802.11 protocol regardless of traffic conditions, and an analytical model proves that our mechanism has a unique operating point that is fair and stable.
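
As a toy reading of the scheme (not the paper's exact reward design or protocol details), each node can run a stateless, bandit-style Q-update over the shared CW's slots, rewarding exclusive use and penalizing collisions:

```python
import random

SLOTS = 16          # shared contention window size (assumed)
NODES = 8           # contending stations (assumed)
ALPHA, EPS = 0.1, 0.1

# Each node keeps Q-values over the CW slots: +1 for a successful
# (exclusive) transmission in its slot, -1 for a collision.
Q = [[0.0] * SLOTS for _ in range(NODES)]

for round_ in range(10000):
    picks = []
    for n in range(NODES):
        if random.random() < EPS:
            picks.append(random.randrange(SLOTS))
        else:
            picks.append(max(range(SLOTS), key=lambda s: Q[n][s]))
    for n, s in enumerate(picks):
        reward = 1.0 if picks.count(s) == 1 else -1.0
        Q[n][s] += ALPHA * (reward - Q[n][s])  # stateless update

# After training, nodes settle on distinct argmax slots, i.e. an
# implicit slot reservation within the shared CW.
```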

Object Tracking Algorithm of Swarm Robot System for using Polygon Based Q-Learning and Cascade SVM (다각형 기반의 Q-Learning과 Cascade SVM을 이용한 군집로봇의 목표물 추적 알고리즘)

  • Seo, Sang-Wook;Yang, Hyung-Chang;Sim, Kwee-Bo
    • IEMEK Journal of Embedded Systems and Applications, v.3 no.2, pp.119-125, 2008
  • This paper presents a polygon-based Q-learning and Cascade Support Vector Machine algorithm for object search with multiple robots. We organized an experimental environment with ten mobile robots, twenty-five obstacles, and an object, and then sent the robots into a hallway, where some obstacles were lying about, to search for a hidden object. In the experiments, we used four different control methods: a random search; a fusion model with Distance-based action making (DBAM) and Area-based action making (ABAM) processes to determine the robots' next actions; and hexagon-based and dodecagon-based Q-learning with Cascade SVM to enhance the DBAM/ABAM fusion model.
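
A hypothetical reading of the polygon-based state encoding: split the robot's surroundings into k angular sectors (k=6 for the hexagon variant, k=12 for the dodecagon variant) and use per-sector occupancy as the Q-table key. All names and parameters below are assumptions:

```python
import math

def polygon_state(robot_xy, heading, detections, k=12, rng_max=2.0):
    """Divide the robot's surroundings into k angular sectors and mark
    each sector occupied if anything is detected within sensing range.
    `heading` is in radians; the k-bit tuple serves as a Q-table key."""
    occupied = [0] * k
    rx, ry = robot_xy
    for (ox, oy) in detections:
        dx, dy = ox - rx, oy - ry
        if math.hypot(dx, dy) > rng_max:
            continue
        angle = (math.atan2(dy, dx) - heading) % (2 * math.pi)
        occupied[int(angle / (2 * math.pi / k))] = 1
    return tuple(occupied)
```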


A Study on the Implementation of Crawling Robot using Q-Learning

  • Hyunki KIM;Kyung-A KIM;Myung-Ae CHUNG;Min-Soo KANG
    • Korean Journal of Artificial Intelligence, v.11 no.4, pp.15-20, 2023
  • Machine learning divides into supervised learning, unsupervised learning, and reinforcement learning according to the type of data and the processing mechanism. In this paper, because the inputs and outputs are unclear and concrete mathematical modeling is difficult, a reinforcement learning method is applied to a crawling robot; Q-Learning in particular is among the most effective model-free reinforcement learning techniques. This paper presents a method to implement a crawling robot that finds the best crawling motion through trial and error in a dynamic environment using a Q-learning algorithm. The goal is to use reinforcement learning to find the two optimal motor angles for the best performance and, finally, to maintain mature and stable motion on the EV3 crawling robot. The robot was built with Lego Mindstorms using two motors, an ultrasonic sensor, a brick, and switches, and the EV3 Classroom software was used for the implementation. By repeating learning three times, 60 data points in total were acquired, and a graph of the two motor angles versus crawling distance was plotted for better understanding. Applying the Q-learning algorithm, we confirmed that the crawling robot found the optimal motor angles and operated according to the trained policy, which points the direction for future research.
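
Treating each discretized (angle1, angle2) pair as an action whose reward is the measured crawl distance reduces this to a bandit-style Q-update; the sketch below is an assumption-laden simulation, with `measure_distance` standing in for a real EV3 trial:

```python
import random

ANGLES = [30, 45, 60, 75, 90]            # candidate angles, degrees (assumed)

def measure_distance(a1, a2):
    """Stand-in for a real EV3 trial: run one crawl cycle at the given
    motor angles and return the distance traveled (synthetic here)."""
    return -((a1 - 60) ** 2 + (a2 - 45) ** 2) / 100.0 + random.random()

Q = {(a1, a2): 0.0 for a1 in ANGLES for a2 in ANGLES}
alpha, eps = 0.2, 0.3

for trial in range(60):                  # 60 trials, as in the abstract
    if random.random() < eps:
        a1, a2 = random.choice(list(Q))
    else:
        a1, a2 = max(Q, key=Q.get)
    d = measure_distance(a1, a2)
    Q[(a1, a2)] += alpha * (d - Q[(a1, a2)])

print("best angles:", max(Q, key=Q.get))
```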

Object tracking algorithm of Swarm Robot System for using SVM and Dodecagon based Q-learning (12각형 기반의 Q-learning과 SVM을 이용한 군집로봇의 목표물 추적 알고리즘)

  • Seo, Sang-Wook;Yang, Hyun-Chang;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems, v.18 no.3, pp.291-296, 2008
  • This paper presents a dodecagon-based Q-learning and SVM algorithm for object search with multiple robots. We organized an experimental environment with several mobile robots, obstacles, and an object, and then sent the robots into a hallway, where some obstacles were lying about, to search for a hidden object. In the experiments, we used four different control methods: a random search; a fusion model with Distance-based action making (DBAM) and Area-based action making (ABAM) processes to determine the robots' next actions; and hexagon-based and dodecagon-based Q-learning with SVM to enhance the DBAM/ABAM fusion model.
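
The SVM component can be read as a classifier that tells the tracker when the object, rather than an obstacle, is in view; a sketch using scikit-learn on synthetic 12-sector features (the paper's actual features are not specified) might be:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Synthetic training data: per-sector sensor signatures for the object
# versus obstacles. Real features would come from the robots' sensors.
X_obj = rng.normal(0.8, 0.1, (100, 12))     # object-like signatures
X_obs = rng.normal(0.3, 0.1, (100, 12))     # obstacle-like signatures
X = np.vstack([X_obj, X_obs])
y = np.array([1] * 100 + [0] * 100)

clf = SVC(kernel="rbf").fit(X, y)

reading = rng.normal(0.8, 0.1, (1, 12))     # new 12-sector reading
if clf.predict(reading)[0] == 1:
    print("object detected: switch the Q-learning policy to tracking")
```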