• Title/Summary/Keyword: Q learning

Search Results: 426

Construction of a Web-Based Enrichment Learning Support System for the Science-Gifted (웹을 활용한 과학영재 심화 학습 지원 체제 구축)

  • Jhun, Young-Seok
    • Journal of Gifted/Talented Education
    • /
    • v.12 no.4
    • /
    • pp.72-107
    • /
    • 2002
  • In order to satisfy gifted students' desire to learn and maximize the effectiveness of their learning, we constructed a system that provides them with supplementary activities based on Internet boards. At the outset, we investigated the personalities of the gifted and the classroom environment they prefer by studying related references and administering questionnaires. We then discussed how to improve the lectures, decided on the basic structure of the web-based support system, and designed teaching strategies for the gifted, named 'GIFTED'. The web-based support system, which is composed of several boards, has been established and is now in operation. Each subject has its own boards, which basically consist of Notice, Learning Materials, Q&A, Homework, and Recommended Sites. The results of operating the system are as follows: teachers and students were generally satisfied with the system, although students wanted more materials. Both groups responded positively that the Learning Materials and Homework boards are actively used, while the Q&A and Recommended Sites boards receive few postings and are regarded as unimportant by students and teachers.

A Study on Ship Route Generation with Deep Q Network and Route Following Control

  • Min-Kyu Kim;Hyeong-Tak Lee
    • Journal of Navigation and Port Research
    • /
    • v.47 no.2
    • /
    • pp.75-84
    • /
    • 2023
  • Ships must ensure safety during navigation, which makes route determination highly important, and the route must be accompanied by a following controller that can track it accurately. This study proposes a method for automatically generating a ship route with a deep reinforcement learning algorithm and following it with a route following controller. To generate the route, under-keel clearance was applied to secure the ship's safety, and navigation chart information was used to apply ship navigation regulations. For the experiment, a target ship with a draft of 8.23 m was designated, and the target route departed from Busan port and arrived at the pilot boarding place of Ulsan port. As the route following controller, a velocity-type fuzzy PID controller that compensates for the limitations of a linear controller was applied. Using the deep Q network, a route with a total distance of 62.22 km and 81 waypoints was generated. To simplify the route, the Douglas-Peucker algorithm was introduced, reducing the total distance to 55.67 km and the number of waypoints to 3. An experiment was then conducted in which the target ship followed the generated route. The results revealed that the velocity-type fuzzy PID controller had less overshoot and a faster settling time; in addition, because the change in rudder angle was smooth, it reduced the ship's energy loss. This study can serve as a basic study of automatic route generation, suggesting a method of combining ship route generation with route following control.
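
The Douglas-Peucker simplification mentioned above can be sketched as a short recursive routine; this is a minimal illustrative version, and the coordinates and tolerance used below are made up, not taken from the paper:

```python
import math

def perpendicular_distance(pt, start, end):
    """Distance from pt to the infinite line through start and end."""
    (x, y), (x1, y1), (x2, y2) = pt, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / math.hypot(dx, dy)

def douglas_peucker(points, epsilon):
    """Recursively drop waypoints closer than epsilon to the chord
    joining the segment endpoints."""
    if len(points) < 3:
        return points
    # Find the point farthest from the line joining the endpoints.
    dmax, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > dmax:
            dmax, index = d, i
    if dmax <= epsilon:
        return [points[0], points[-1]]    # all inner points are redundant
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right              # avoid duplicating the split point
```

A larger epsilon discards more waypoints, which is how a route of 81 waypoints can collapse to a handful while staying within a chosen corridor of the original track.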

Task offloading scheme based on the DRL of Connected Home using MEC (MEC를 활용한 커넥티드 홈의 DRL 기반 태스크 오프로딩 기법)

  • Ducsun Lim;Kyu-Seek Sohn
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.23 no.6
    • /
    • pp.61-67
    • /
    • 2023
  • The rise of 5G and the proliferation of smart devices have underscored the significance of multi-access edge computing (MEC). Amid this trend, interest in effectively processing computation-intensive and latency-sensitive applications has increased. This study investigated a novel task offloading strategy that considers a probabilistic MEC environment to address these challenges. First, we considered the frequency of dynamic task requests and the unstable conditions of wireless channels to propose a method for minimizing vehicle power consumption and latency. We then investigated a deep reinforcement learning (DRL) based offloading technique, offering a way to balance local computation against offloading transmission power. We analyzed the power consumption and queuing latency of vehicles using the deep deterministic policy gradient (DDPG) and deep Q-network (DQN) techniques. Finally, we derived and validated the optimal performance enhancement strategy in a vehicle-based MEC environment.

The Capacity of Multi-Valued Single Layer CoreNet(Neural Network) and Precalculation of its Weight Values (단층 코어넷 다단입력 인공신경망회로의 처리용량과 사전 무게값 계산에 관한 연구)

  • Park, Jong-Joon
    • Journal of IKEEE
    • /
    • v.15 no.4
    • /
    • pp.354-362
    • /
    • 2011
  • One of the unsolved problems in artificial neural networks concerns the capacity of a network. This paper presents CoreNet, a 2-layered artificial neural network with a multi-leveled input and a multi-leveled output. I suggest an equation for the capacity of a CoreNet with a p-leveled input and a q-leveled output, $a_{p,q}=\frac{1}{2}p(p-1)q^2-\frac{1}{2}(p-2)(3p-1)q+(p-1)(p-2)$. For an odd value of p and an even value of q, $(p-1)(p-2)(q-2)/2$ must additionally be subtracted from the above equation. The simulation model 1(3)-1(6) has 3 input levels and 6 output levels with no hidden layer. For this model, out of 216 possible functions, the simulation yields 80 convergences for the number of implementable functions using the cot(x) input leveling method. I also show that the two diverged functions become implementable by precalculating the weight values. The simulation result together with the precalculated weight values gives the same total number of implementable functions as the above equation.
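
The capacity equation quoted in the abstract can be evaluated directly; a small sketch, checking the 1(3)-1(6) case against the counts the abstract reports:

```python
def corenet_capacity(p, q):
    """Number of implementable functions for a CoreNet with a
    p-leveled input and a q-leveled output, per the abstract's equation."""
    a = p * (p - 1) * q**2 / 2 - (p - 2) * (3 * p - 1) * q / 2 + (p - 1) * (p - 2)
    # Correction term for odd p and even q.
    if p % 2 == 1 and q % 2 == 0:
        a -= (p - 1) * (p - 2) * (q - 2) / 2
    return int(a)

# The 1(3)-1(6) model: 3 input levels, 6 output levels.
print(corenet_capacity(3, 6))  # 82 = 80 converged + 2 precalculated, of 6^3 = 216
```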

Model predictive control combined with iterative learning control for nonlinear batch processes

  • Lee, Kwang-Soon;Kim, Won-Cheol;Lee, Jay H.
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집)
    • /
    • 1996.10a
    • /
    • pp.299-302
    • /
    • 1996
  • A control algorithm is proposed for nonlinear multi-input multi-output (MIMO) batch processes by combining quadratic iterative learning control (Q-ILC) with model predictive control (MPC). Both controls are designed based on output feedback, and a Kalman filter is incorporated for state estimation. The novelty of the proposed algorithm lies in the fact that, unlike feedback-only control, unknown sustained disturbances that repeat over batches can be completely rejected, and asymptotically perfect tracking is possible in the zero random-disturbance case even with an uncertain process model.


A Study on Reinforcement Learning of Behavior-based Multi-Agent (다중에이전트 행동기반의 강화학습에 관한 연구)

  • Do, Hyun-Ho;Chung, Tae-Choong
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2002.11a
    • /
    • pp.369-372
    • /
    • 2002
  • Behavior learning for multi-agent systems, which have diverse characteristics, greatly eases the burden of agent design. Effective learning of the various behaviors arising from these characteristics can increase the agents' autonomy and reactivity to the environment. For behavior learning, unsupervised methods such as reinforcement learning, which learn by directly perceiving each state, are more effective than supervised methods such as model-based learning. This paper applies Modular Q-learning, an improved reinforcement learning method, to agent behaviors in a robot soccer environment and proposes a reinforcement learning architecture that effectively partitions the complex state space to increase the agents' autonomy and reactivity.
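
As a rough illustration of the Modular Q-learning idea (separate Q-tables per sub-problem whose values are combined at action selection), here is a minimal tabular sketch; the module decomposition, rewards, and parameters are illustrative, not taken from the paper:

```python
import random

def q_update(q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update for one module's table."""
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)

def modular_q_select(q_modules, state_parts, actions, epsilon=0.1):
    """Choose the action with the highest summed Q-value across modules,
    each module seeing only its own slice of the state."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions,
               key=lambda a: sum(q.get((s, a), 0.0)
                                 for q, s in zip(q_modules, state_parts)))
```

Splitting the state across modules keeps each table small, which is the point of the modular scheme in a large state space such as robot soccer.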


How the Learning Speed and Tendency of Reinforcement Learning Agents Change with Prior Knowledge (사전 지식에 의한 강화학습 에이전트의 학습 속도와 경향성 변화)

  • Kim, Jisoo;Lee, Eun Hun;Kim, Hyeoncheol
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2020.05a
    • /
    • pp.512-515
    • /
    • 2020
  • Research is actively being conducted to make reinforcement learning, whose learning speed is slow, applicable to a wide range of problems. Providing prior knowledge can speed up learning, but there is a risk that the prior knowledge provided is wrong. This study examines how uncertain or incorrect prior knowledge affects learning. Experiments were conducted in Gamble, Cliff, and Maze environments built with the OpenAI Gym library. The results confirmed that prior knowledge can impart a tendency to the agent's behavior. We also investigated how much incorrect prior knowledge hinders learning in path finding.

Methodology for Apartment Space Arrangement Based on Deep Reinforcement Learning

  • Cheng Yun Chi;Se Won Lee
    • Architectural research
    • /
    • v.26 no.1
    • /
    • pp.1-12
    • /
    • 2024
  • This study introduces a deep reinforcement learning (DRL)-based methodology for optimizing apartment space arrangements, addressing the limitations of human capability in evaluating all potential spatial configurations. Leveraging computational power, the methodology facilitates the autonomous exploration and evaluation of innovative layout options, considering architectural principles, legal standards, and client requirements. Through comprehensive simulation tests across various apartment types, the research demonstrates the DRL approach's effectiveness in generating efficient spatial arrangements that align with current design trends and meet predefined performance objectives. The comparative analysis of AI-generated layouts with those designed by professionals validates the methodology's applicability and potential in enhancing architectural design practices by offering novel, optimized spatial configuration solutions.

Performance Comparison of Reinforcement Learning Algorithms for Futures Scalping (해외선물 스캘핑을 위한 강화학습 알고리즘의 성능비교)

  • Jung, Deuk-Kyo;Lee, Se-Hun;Kang, Jae-Mo
    • The Journal of the Convergence on Culture Technology
    • /
    • v.8 no.5
    • /
    • pp.697-703
    • /
    • 2022
  • Due to the recent economic downturn caused by COVID-19 and the unstable international situation, many investors are choosing the derivatives market as a means of investment. However, the derivatives market carries greater risk than the stock market, and research to support its participants is insufficient. Recently, with the development of artificial intelligence, machine learning has been widely applied to the derivatives market. In this paper, reinforcement learning, one of the machine learning techniques, is applied to analyze the scalping technique, which trades futures within minutes. The data set consists of 21 attributes built from the closing price, moving average, and Bollinger band indicators of 1-minute and 3-minute data over 6 months, for 4 products selected from the futures traded at a trading firm. In the experiment, a DNN artificial neural network model and three reinforcement learning algorithms, DQN (Deep Q-Network), A2C (Advantage Actor-Critic), and A3C (Asynchronous A2C), were trained and verified on the training and test data sets. For scalping, the agent chooses one of the actions of buying and selling, and the ratio of the portfolio value resulting from the action is given as the reward. The results show that energy sector products such as Heating Oil and Crude Oil yield relatively high cumulative returns compared to index sector products such as Mini Russell 2000 and Hang Seng Index.
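
The moving-average and Bollinger-band attributes mentioned above can be computed simply; a sketch, where the 20-period window and k = 2 are common defaults rather than values given in the paper:

```python
def bollinger(closes, window=20, k=2.0):
    """Return (lower band, moving average, upper band) over the last
    `window` closing prices, with bands k standard deviations out."""
    recent = closes[-window:]
    ma = sum(recent) / len(recent)
    sd = (sum((c - ma) ** 2 for c in recent) / len(recent)) ** 0.5
    return ma - k * sd, ma, ma + k * sd
```

Band width tracks recent volatility, so features like these give the agent a compact view of both trend and dispersion at each decision step.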

Reinforce Learning Based Cooperative Sensing for Cognitive Radio Networks (인지 무선 시스템에서 강화학습 기반 협력 센싱 기법)

  • Kim, Do-Yun;Choi, Young-June;Roh, Bong-Soo;Choi, Jeung-Won
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.13 no.5
    • /
    • pp.1043-1050
    • /
    • 2018
  • In this paper, we propose a reinforcement learning based cooperative sensing scheme that selects optimal secondary users (SUs) to enhance the detection performance of spectrum sensing in cognitive radio (CR) networks. SUs with high accuracy are identified based on the similarity between the global sensing result obtained through cooperative sensing and each SU's local sensing result. A fusion center (FC) uses the similarity of the SUs as the reward value for Q-learning, determining which SUs with accurate sensing results participate in cooperative sensing. The experimental results show that the proposed method improves detection performance compared to conventional cooperative sensing schemes.
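
The similarity-as-reward step described above can be sketched in a stateless, bandit-style form; the binary agreement measure, learning rate, and selection rule here are illustrative simplifications, not the paper's exact scheme:

```python
def update_su_scores(q, local_reports, global_decision, alpha=0.1):
    """Update each secondary user's Q-value with the agreement between
    its local sensing report and the fused global decision."""
    for su, report in local_reports.items():
        reward = 1.0 if report == global_decision else 0.0  # similarity reward
        old = q.get(su, 0.0)
        q[su] = old + alpha * (reward - old)

def select_sus(q, k):
    """The fusion center keeps the k SUs with the highest Q-values
    for the next round of cooperative sensing."""
    return sorted(q, key=q.get, reverse=True)[:k]
```

Over repeated sensing rounds, SUs whose local reports consistently agree with the fused decision accumulate higher scores and are preferentially included in cooperative sensing.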