• Title/Summary/Keyword: Q learning

DYNAMIC ROUTE PLANNING BY Q-LEARNING - Cellular Automaton Based Simulator and Control

  • Sano, Masaki;Jung, Si
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 2001.10a
    • /
    • pp.24.2-24
    • /
    • 2001
  • In this paper, the authors present a new method of dynamic route planning by Q-learning. The proposed algorithm is executed in a newly created cellular automaton based traffic simulator. In the Vehicle Information and Communication System (VICS), an active field of Intelligent Transport Systems (ITS), information on traffic congestion is sent to each vehicle in real time. However, a centralized navigation system is not realistic for guiding millions of vehicles in a megalopolis. Autonomous distributed systems should be more flexible and scalable, and also have a chance to focus on each vehicle's demand. In such systems, each vehicle can search for its own optimal route. We employ Q-learning, a reinforcement learning method, to search for an optimal or sub-optimal route on which drivers can avoid traffic congestion. Some applications of reinforcement learning exist for "static" environments, but there are ...
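
A minimal sketch of the tabular Q-learning update this kind of route search builds on. The epsilon-greedy policy, the state/action encoding (road cells and outgoing links), and all parameter values are illustrative assumptions, not details from the paper:

```python
import random
from collections import defaultdict

# Illustrative parameters; the paper does not specify values.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
Q = defaultdict(float)  # Q[(state, action)] -> estimated return

def choose_action(state, actions):
    """Epsilon-greedy selection over the outgoing road links of a cell."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    """One-step Q-learning: Q <- Q + alpha * (r + gamma * max_a' Q' - Q)."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```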

Adaptive Packet Scheduling Algorithm in IoT environment (IoT 환경에서의 적응적 패킷 스케줄링 알고리즘)

  • Kim, Dong-Hyun;Lim, Hwan-Hee;Lee, Byung-Jun;Kim, Kyung-Tae;Youn, Hee-Yong
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2018.07a
    • /
    • pp.15-16
    • /
    • 2018
  • In this paper, we propose a new scheduling scheme to reduce the time needed to adapt to a new environment in an Internet of Things (IoT) environment composed of many sensor nodes. Because data collection and transmission patterns are not defined in advance in an IoT environment, conventional static packet scheduling schemes have limitations. Q-learning can establish a scheduling policy through iterative learning without prior knowledge of the network environment. Building on an existing Q-learning scheduling scheme, this paper proposes a new Q-learning scheduling scheme that initializes the Q-table and reward table using bound values on the packet arrival rate of each queue. Simulation results show that, compared with the existing scheme, the time required to adapt to changing packet arrival rates and service requirements is reduced.
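
The abstract does not spell out the exact initialization rule, so the sketch below assumes a simple proportional seeding of the Q-table and reward table from per-queue arrival-rate bounds; the shapes and numbers are illustrative only:

```python
import numpy as np

def init_tables(arrival_rate_bounds, n_states):
    """Seed Q-table and reward table from per-queue arrival-rate bounds.

    arrival_rate_bounds: assumed upper bound on packet arrival rate per queue.
    The proportional rule below is an illustrative assumption, not the
    paper's formula.
    """
    bounds = np.asarray(arrival_rate_bounds, dtype=float)
    priors = bounds / bounds.sum()  # busier queues start with higher value
    q_table = np.tile(priors, (n_states, 1))       # shape: (state, queue/action)
    reward_table = np.tile(priors, (n_states, 1))
    return q_table, reward_table

# Hypothetical example: three queues with bounded arrival rates (packets/s).
q_table, reward_table = init_tables([120.0, 60.0, 20.0], n_states=16)
```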

Patterns of Self-Directed Learning in Nurses (일 대학 종합병원 간호사의 자기주도학습 유형)

  • Oh Won-Oak
    • Journal of Korean Academy of Fundamentals of Nursing
    • /
    • v.9 no.3
    • /
    • pp.447-461
    • /
    • 2002
  • Purpose: The purpose of this study was to identify and understand the self-directed learning patterns of nurses. Q methodology was used to collect the data. Method: 43 Q-statements were collected through individual interviews and a review of related literature. The 43 Q-statements were sorted by the 34 participants in the study, and the data were analyzed with the PC-QUANL program using principal component analysis. Result: Four patterns of self-directed learning were identified. Nurses in Type I, the Future Provision Type, studied to promote their own professional development and leadership qualities for the future. Nurses in Type II, the Learning Passion Type, enjoyed learning something new and had a strong desire to learn. Nurses in Type III, the Self-reflective Type, continuously evaluated themselves and their own practice through introspection. Nurses in Type IV, the Accompanying Companion Type, studied with companion support and maintained collaborative rather than competitive relationships. Conclusion: This study explains and helps us understand self-directed learning in nurses, and will contribute to building a theoretical base for developing a self-directed learning model in nursing practice.
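
For readers unfamiliar with the analysis step, this is roughly what PC-QUANL's principal component analysis does: factor-analyze the person correlation matrix of the Q-sorts. The random data below only mirrors the study's dimensions (34 sorters, 43 statements, 4 types) and is not the study's data:

```python
import numpy as np

rng = np.random.default_rng(0)
sorts = rng.standard_normal((34, 43))  # 34 participants x 43 Q-statements

# Q methodology correlates persons, not items: a 34 x 34 correlation matrix.
corr = np.corrcoef(sorts)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
# Loadings of each participant on the first 4 principal components ("types").
loadings = eigvecs[:, order[:4]] * np.sqrt(eigvals[order[:4]])
print(loadings.shape)  # (34, 4)
```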

Behavior Learning and Evolution of Swarm Robot System using Q-learning and Cascade SVM (Q-learning과 Cascade SVM을 이용한 군집로봇의 행동학습 및 진화)

  • Seo, Sang-Wook;Yang, Hyun-Chang;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.19 no.2
    • /
    • pp.279-284
    • /
    • 2009
  • In swarm robot systems, each robot must behave by itself according to its own state and environment and, if necessary, cooperate with other robots to carry out a given task. It is therefore essential that each robot has both learning and evolution abilities to adapt to dynamic environments. In this paper, a reinforcement learning method using multiple SVMs based on structural risk minimization, combined with distributed genetic algorithms, is proposed for behavior learning and evolution of collective autonomous mobile robots. Through a distributed genetic algorithm that exchanges, by communication, chromosomes acquired under different environments, each robot can improve its behavior ability. In particular, to improve the performance of evolution, selective crossover based on the characteristics of Cascade-SVM-based reinforcement learning is adopted in this paper.
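
A rough sketch of the distributed-GA part of this scheme: each robot evolves a chromosome of behavior parameters and, by communication, selectively crosses it with a fitter neighbor's chromosome. The bit-string encoding, the rates, and the omission of the Cascade-SVM value estimation are all assumptions for illustration:

```python
import random

CHROM_LEN, MUT_RATE = 16, 0.05  # illustrative sizes, not from the paper

def mutate(chrom):
    return [g ^ 1 if random.random() < MUT_RATE else g for g in chrom]

def crossover(a, b):
    cut = random.randrange(1, CHROM_LEN)  # single-point crossover
    return a[:cut] + b[cut:]

class Robot:
    def __init__(self):
        self.chrom = [random.randint(0, 1) for _ in range(CHROM_LEN)]
        self.fitness = 0.0  # filled in by the robot's own learning episodes

    def exchange(self, neighbor):
        """Selective crossover: prefer the fitter neighbor's chromosome."""
        donor = neighbor.chrom if neighbor.fitness > self.fitness else self.chrom
        self.chrom = mutate(crossover(self.chrom, donor))
```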

A Reinforcement Learning Method using TD-Error in Ant Colony System (개미 집단 시스템에서 TD-오류를 이용한 강화학습 기법)

  • Lee, Seung-Gwan;Chung, Tae-Choong
    • The KIPS Transactions: Part B
    • /
    • v.11B no.1
    • /
    • pp.77-82
    • /
    • 2004
  • In reinforcement learning, an agent receives a reward for selecting an action and making a state transition from the present state; assigning this reward over time, the temporal credit-assignment problem, is an important subject in reinforcement learning. In this paper, we examine Ant-Q, a meta-heuristic method proposed to solve hard combinatorial optimization problems such as the Traveling Salesman Problem (TSP); it is a population-based approach that uses positive feedback as well as greedy search. We then propose Ant-TD, a reinforcement learning method that applies a diversification strategy to state transitions and uses the TD-error in the update. We show through experiments that the proposed method can find an optimal solution faster than other reinforcement learning methods such as ACS and Ant-Q.
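
The core of the proposed update is a TD-error applied to the Ant-Q value (pheromone-like) table over TSP edges; a minimal sketch, with illustrative parameter values:

```python
ALPHA, GAMMA = 0.1, 0.3  # illustrative learning rate and discount

def ant_td_update(AQ, r, s, delayed_reward, next_values):
    """TD-style update of the Ant-Q value on edge (r, s).

    td_error = reward + gamma * max_z AQ(s, z) - AQ(r, s), with the
    delayed, tour-based reinforcement standing in for the reward.
    """
    best_next = max(next_values) if next_values else 0.0
    td_error = delayed_reward + GAMMA * best_next - AQ[(r, s)]
    AQ[(r, s)] += ALPHA * td_error
```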

A Case Study of Flipped Learning application of Basics Cooking Practice Subject using YouTube (유튜브를 활용한 기초조리실습과목의 플립드러닝 적용사례 연구)

  • Shin, Seoung-Hoon;Lee, Kyung-Soo
    • The Journal of the Korea Contents Association
    • /
    • v.21 no.5
    • /
    • pp.488-498
    • /
    • 2021
  • This study applied the flipped learning teaching and learning method to a basic cooking practice subject using YouTube. The purpose was to determine whether the curriculum progressed properly by measuring the effects before and after learning and by analyzing learners' subjectivity through the learning process. The investigation was conducted from August 1, 2020 to September 10, 2020. Following the research design of Q methodology, the study proceeded in five stages: Q-sample selection, P-sample selection, Q sorting, coding and interpretation of results, and conclusion and discussion. The analysis yielded three types: the first type (N=5), a prior-learning effect; the second type (N=7), a simulated-practice effect; and the third type (N=3), a self-efficacy effect. Applying the flipped learning method to the basic cooking practice subject using YouTube produced positive effects in active learners, such as increased interest in the class and greater confidence, but some learners lacked understanding of how the class was operated. The smaller number of practice sessions compared with other subjects also remains an issue to be addressed later.

Lane Change Methodology for Autonomous Vehicles Based on Deep Reinforcement Learning (심층강화학습 기반 자율주행차량의 차로변경 방법론)

  • DaYoon Park;SangHoon Bae;Trinh Tuan Hung;Boogi Park;Bokyung Jung
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.22 no.1
    • /
    • pp.276-290
    • /
    • 2023
  • Several efforts are currently underway in Korea with the goal of commercializing autonomous vehicles, and various studies are emerging on autonomous vehicles that drive safely and quickly according to operating guidelines. The current study examines the path search of an autonomous vehicle from a microscopic viewpoint and evaluates the efficiency gained by learning lane changes through Deep Q-Learning. The SUMO traffic simulator was used for this purpose. The scenario starts in a random lane at the origin and ends with a right turn at the destination, reached by changing lanes into the third lane. The analysis compared simulated lane changes with and without Deep Q-Learning. With Deep Q-Learning applied, the average traffic speed improved by about 40% compared with the case without it, the average waiting time was reduced by about 2 seconds, and the average queue length by about 2.3 vehicles.
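
A minimal sketch of the kind of Deep Q-Network such a lane-change agent uses. The state features (ego speed, lane index, gaps to neighboring vehicles) and the three discrete actions are common-practice assumptions, not details from the paper:

```python
import torch
import torch.nn as nn

class LaneChangeDQN(nn.Module):
    """Maps a state vector to one Q-value per lane-change action."""
    def __init__(self, state_dim=6, n_actions=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state):
        return self.net(state)

policy = LaneChangeDQN()
q_values = policy(torch.zeros(1, 6))    # Q(s, a) for a dummy state
action = q_values.argmax(dim=1).item()  # 0 = keep lane, 1 = left, 2 = right
```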

Reinforcement Learning Using State Space Compression (상태 공간 압축을 이용한 강화학습)

  • Kim, Byeong-Cheon;Yun, Byeong-Ju
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.3
    • /
    • pp.633-640
    • /
    • 1999
  • Reinforcement learning learns through trial-and-error interaction with a dynamic environment. In dynamic environments, reinforcement learning methods such as Q-learning and TD (Temporal Difference) learning therefore learn faster than conventional stochastic learning methods. However, because many proposed reinforcement learning algorithms grant the reinforcement value only when the learning agent reaches its goal state, most of them converge to the optimal solution too slowly. In this paper, we present the COMREL (COMpressed REinforcement Learning) algorithm for quickly finding the shortest path in a maze environment: candidate states that can guide the shortest path are selected in a compressed maze environment, and only those candidate states are learned. Comparing COMREL with the existing Q-learning and Prioritized Sweeping algorithms showed that the learning time was shortened considerably.
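
The compression idea can be pictured as mapping fine maze cells to coarse blocks and keeping Q-entries only for candidate blocks; the block size, maze, and candidate set below are illustrative assumptions, not COMREL's actual selection rule:

```python
BLOCK = 4  # compress 4x4 regions of cells into one abstract state (assumed)

def compress(cell):
    """Map a fine (row, col) maze cell to its coarse block state."""
    r, c = cell
    return (r // BLOCK, c // BLOCK)

# Only candidate blocks (e.g. those along promising corridors) get Q-entries,
# so the table, and with it the learning time, shrinks sharply.
candidate_blocks = {(0, 0), (0, 1), (1, 1), (2, 1), (2, 2)}
Q = {(b, a): 0.0 for b in candidate_blocks for a in range(4)}  # 4 moves
```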

Comparison of value-based Reinforcement Learning Algorithms in Cart-Pole Environment

  • Byeong-Chan Han;Ho-Chan Kim;Min-Jae Kang
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.15 no.3
    • /
    • pp.166-175
    • /
    • 2023
  • Reinforcement learning can be applied to a wide variety of problems, but its fundamental limitation is that deriving an answer within a given time is difficult because real-world problems are too complex. With the development of neural network technology, research on deep reinforcement learning, which combines deep learning with reinforcement learning, is receiving much attention. In this paper, two types of neural networks, an FNN and a CNN, are combined with reinforcement learning, and their characteristics are compared and analyzed against the existing value-based reinforcement learning algorithms SARSA and Q-learning.
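
The one-line difference between the two compared algorithms: SARSA bootstraps from the action actually taken next (on-policy), while Q-learning bootstraps from the greedy action (off-policy). A minimal sketch with illustrative parameters, where Q[s] is a list of action values:

```python
ALPHA, GAMMA = 0.1, 0.99  # illustrative step size and discount

def sarsa_update(Q, s, a, r, s_next, a_next):
    # On-policy target: value of the action the agent will actually take.
    Q[s][a] += ALPHA * (r + GAMMA * Q[s_next][a_next] - Q[s][a])

def q_learning_update(Q, s, a, r, s_next):
    # Off-policy target: value of the greedy action in the next state.
    Q[s][a] += ALPHA * (r + GAMMA * max(Q[s_next]) - Q[s][a])
```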