• Title/Summary/Keyword: Q learning

Design and Development of m-Learning Service Based on 3G Cellular Phones

  • Chung, Kwang-Sik;Lee, Jeong-Eun
    • Journal of Information Processing Systems / v.8 no.3 / pp.521-538 / 2012
  • As the knowledge society matures, not only distance universities but also off-line universities are trying to provide learners with on-line educational content. In particular, the university sector, which uses distance learning based on blended learning, has demonstrated the high effectiveness of mobile devices for e-Learning. In this paper, we analyzed previous m-Learning scenarios and future technology prospects. Based on the proposed m-Learning scenario, we designed cellular phone-based educational content and a service structure, implemented an m-Learning system, and analyzed m-Learning service satisfaction. The design principles of the m-Learning service are 1) to provide learners with an m-Learning environment on both cellular phones and desktop computers; 2) to serve announcements, discussion boards, Q&A boards, course materials, and exercises on cellular phones and desktop computers; and 3) to serve learning activities such as reviewing full lectures, discussions, and writing term papers using desktop computers and cellular phones. The m-Learning service was developed on a cellular phone that supports the H.264 codec over 3G communication technology. Some of the functions in the m-Learning design principles are implemented on a 3G cellular phone. Lecture content is provided in the forms of video, text, audio, and video with text. One-way educational content is complemented by exercises (quizzes).

L-CAA : An Architecture for Behavior-Based Reinforcement Learning (L-CAA : 행위 기반 강화학습 에이전트 구조)

  • Hwang, Jong-Geun;Kim, In-Cheol
    • Journal of Intelligence and Information Systems / v.14 no.3 / pp.59-76 / 2008
  • In this paper, we propose an agent architecture called L-CAA that is effective in real-time dynamic environments. L-CAA is an extension of CAA, the behavior-based agent architecture that was also developed by our research group. To improve adaptability to a changing environment, CAA is extended with a reinforcement learning capability. To obtain stable performance, however, behavior selection and execution in the L-CAA architecture do not rely entirely on learning. In L-CAA, learning is used only as a complementary means for behavior selection and execution. The behavior selection mechanism in this architecture consists of two phases. In the first phase, behaviors are extracted from the behavior library by checking the user-defined applicability conditions and the utility of each behavior. If multiple behaviors are extracted in the first phase, a single behavior is selected for execution in the second phase with the help of reinforcement learning. That is, the behavior with the highest expected reward is selected by comparing the Q values of the individual behaviors, which are updated through reinforcement learning. L-CAA can monitor the maintenance conditions of the executing behavior and immediately stop the behavior when some of the conditions fail due to a dynamic change in the environment. Additionally, L-CAA can suspend and later resume the current behavior whenever it encounters a behavior with higher utility. To analyze the effectiveness of the L-CAA architecture, we implement an L-CAA-enabled agent that plays autonomously in Unreal Tournament, a well-known dynamic virtual environment, and conduct several experiments with it.
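
The two-phase selection described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Behavior` class, the `select_behavior` function, the `min_utility` threshold, and the example behaviors and Q values are all hypothetical names and numbers introduced here, and the abstract does not say exactly how utility enters the first phase (a simple threshold is assumed).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Behavior:
    """Hypothetical behavior record; the field names are assumptions."""
    name: str
    utility: float
    applicable: Callable[[dict], bool]  # user-defined applicability condition


def select_behavior(
    library: List[Behavior],
    state: dict,
    q_values: Dict[str, float],
    min_utility: float = 0.0,  # assumed utility threshold for phase 1
) -> Optional[Behavior]:
    # Phase 1: keep behaviors whose condition holds and whose utility is high enough.
    candidates = [
        b for b in library if b.applicable(state) and b.utility >= min_utility
    ]
    if len(candidates) <= 1:
        return candidates[0] if candidates else None
    # Phase 2: among several candidates, pick the one with the highest learned Q value.
    return max(candidates, key=lambda b: q_values.get(b.name, 0.0))


# Illustrative data only.
library = [
    Behavior("attack", 0.8, lambda s: s["enemy_visible"]),
    Behavior("retreat", 0.6, lambda s: s["health"] < 30),
    Behavior("explore", 0.2, lambda s: True),
]
q_values = {"attack": 1.5, "retreat": 2.0, "explore": 0.1}

chosen = select_behavior(library, {"enemy_visible": True, "health": 20}, q_values)
print(chosen.name)  # retreat
```

Keeping the rule-based filter ahead of the learned ranking reflects the abstract's point that learning only complements behavior selection: the Q values never bring back a behavior that the first phase has excluded.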

DQN Reinforcement Learning for Acrobot in OpenAI Gym Environment (OpenAI Gym 환경의 Acrobot에 대한 DQN 강화학습)

  • Myung-Ju Kang
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.35-36 / 2023
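  • In this paper, we train Acrobot-v1, provided by the OpenAI Gym environment, with DQN (Deep Q-Networks) reinforcement learning and compare the performance of the activation functions applied. The activation functions applied to DQN reinforcement learning are ReLU, LeakyReLU, ELU, SELU, and softplus. In the experiments, the reward was highest on average with the LeakyReLU activation function, while the maximum reward was obtained with the SELU activation function.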

Decision Support Method in Dynamic Car Navigation Systems by Q-Learning

  • Hong, Soo-Jung;Hong, Eon-Joo;Oh, Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems / v.12 no.4 / pp.361-365 / 2002
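  • The long-standing human dream of building great means of transportation has today borne fruit in a remarkable variety of vehicles. The car navigation system is one example. Equipping a vehicle with a car navigation system that can make intelligent judgments and process information allows it to evolve into a more advanced means of transportation. A drawback of such systems is that they must perform many tasks with limited resources. Route planning, one of the main tasks of a navigation system, must therefore be an intelligent method that can find an optimal route even with limited resources. Two methods commonly used for this task are Dijkstra's algorithm and the A* algorithm. Both find an optimal route, but by their nature they have, respectively, the drawback of having to search a wide area with a long run time, and the drawback of having to compute a heuristic function as additional information in order to calculate the route. In this paper, we implement an optimal route extraction method using Q-learning, a kind of reinforcement learning, that searches a small area, needs little run time to extract the optimal route, and can also extract optimal routes in a dynamic traffic environment.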
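
As a rough illustration of route extraction with Q-learning, the sketch below runs generic tabular Q-learning on a small road graph. It is not the method from the paper: the function name `q_learning_route`, the four-node graph, the edge costs, and the hyperparameters are all assumptions chosen for illustration, and the sketch assumes every non-goal node has at least one outgoing edge.

```python
import random


def q_learning_route(graph, start, goal, episodes=1000,
                     alpha=0.5, gamma=1.0, epsilon=0.3, seed=0):
    """Learn Q values on a weighted directed graph and return the greedy route.

    graph maps each node to {neighbor: travel_cost}; the reward is -cost.
    """
    rng = random.Random(seed)
    q = {node: {nbr: 0.0 for nbr in nbrs} for node, nbrs in graph.items()}

    for _ in range(episodes):
        state = start
        while state != goal and q[state]:
            actions = list(q[state])
            # Epsilon-greedy: explore with probability epsilon, else exploit.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=q[state].get)
            reward = -graph[state][action]
            # The goal is terminal, so its future value is zero.
            if action == goal:
                next_best = 0.0
            else:
                next_best = max(q[action].values(), default=0.0)
            # Standard Q-learning update.
            q[state][action] += alpha * (reward + gamma * next_best - q[state][action])
            state = action

    # Follow the learned Q values greedily from start to goal.
    route, state = [start], start
    while state != goal and q[state] and len(route) <= len(graph):
        state = max(q[state], key=q[state].get)
        route.append(state)
    return route, q


# Illustrative road graph; the costs could stand for current travel times.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 1, "D": 5},
    "C": {"D": 1},
    "D": {},
}
route, q = q_learning_route(graph, "A", "D")
print(route)
```

In a dynamic traffic setting, the edge costs would change over time and learning would continue on the updated costs; that part is not shown here.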

The Effect of Segment Size on Quality Selection in DQN-based Video Streaming Services (DQN 기반 비디오 스트리밍 서비스에서 세그먼트 크기가 품질 선택에 미치는 영향)

  • Kim, ISeul;Lim, Kyungshik
    • Journal of Korea Multimedia Society / v.21 no.10 / pp.1182-1194 / 2018
  • Dynamic Adaptive Streaming over HTTP (DASH) is expected to evolve to meet the increasing demand for seamless video streaming services in the near future. DASH performance depends heavily on the client's adaptive quality selection algorithm, which is not included in the standard. Existing conventional algorithms are procedural and cannot easily capture and reflect all variations of dynamic network and traffic conditions across a variety of network environments. To solve this problem, this paper proposes a novel quality selection mechanism based on the Deep Q-Network (DQN) model, the DQN-based DASH Adaptive Bitrate (ABR) mechanism. The proposed mechanism adopts a new reward calculation method based on five major performance metrics to reflect the current conditions of networks and devices in real time. In addition, the size of the next video segment to be downloaded is also considered as a major learning metric to reflect a variety of video encodings. Experimental results show that the proposed mechanism quickly selects a suitable video quality even in high error rate environments, significantly reducing the frequency of quality changes compared to the existing algorithm while also improving average video quality during playback.

Smart Target Detection System Using Artificial Intelligence (인공지능을 이용한 스마트 표적탐지 시스템)

  • Lee, Sung-nam
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.538-540 / 2021
  • In this paper, we propose a smart target detection system that detects and recognizes a designated target and provides relative motion information while a drone performs a target detection mission. The work focused on developing an algorithm that secures adequate accuracy (i.e., mAP, IoU) and high real-time performance at the same time. The proposed system showed an accuracy close to 1.0 after 100k training iterations of the Google Inception V2 deep learning model, and the inference speed was about 60-80 Hz on a high-performance laptop with an Nvidia GTX 2070 Max-Q. The proposed smart target detection system will be operated like a drone and will help in successfully performing surveillance and reconnaissance missions by automatically recognizing the target using computer image processing and following it.

Optimal route generation method for ships using reinforcement learning (강화학습을 이용한 선박의 최적항로 생성기법)

  • Min-Kyu Kim;Jong-Hwa Kim;Ik-Soon Choi;Hyeong-Tak Lee;Hyun Yang
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 2022.06a / pp.167-168 / 2022
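  • In operating a ship, determining the optimal route is one of the important factors in reducing voyage time and fuel consumption. Conventionally, determining a route requires the navigator's expert knowledge, but it is difficult to judge whether a route chosen this way is optimal. It is therefore necessary to generate an optimal route that considers both fuel cost savings and ship safety. Some studies have applied the A* algorithm and Dijkstra's algorithm to minimize fuel consumption or voyage time. However, these studies only find the shortest distance and do not consider ship safety, sea state, and similar factors. To address this, this study applies a reinforcement learning algorithm. Reinforcement learning finds a policy by choosing actions that maximize the cumulative future reward. This study proposes a technique that uses Q-learning, one such reinforcement learning algorithm, to generate an optimal route that takes ship safety into account.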
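
For reference, the standard Q-learning update that such a method builds on can be written as below. The symbols are the usual textbook ones and do not come from this abstract: $s_t$ and $a_t$ are the state and action at step $t$, $r_{t+1}$ is the reward received, $\alpha$ is the learning rate, and $\gamma$ is the discount factor. How the paper encodes ship safety in the reward is not stated here.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]
```

The greedy policy $\pi(s) = \arg\max_{a} Q(s, a)$ then picks, in each state, the action with the highest estimated cumulative discounted reward.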

A Study used Q-methodology on the Subjective Cognition-Patterns of School Aged Children with Borderline Intelligence Function to the School (학령기 경계선 지능 아동의 학교에 대한 주관적 인식 유형 연구: Q방법론 적용)

  • Lee, Keum Jin
    • The Journal of the Korea Contents Association / v.17 no.2 / pp.384-393 / 2017
  • The purpose of this study was to identify the subjective cognition patterns of school-aged children with borderline intelligence function toward school, using Q methodology. The Q-sample comprised 21 statements obtained from the literature and from in-depth interviews with 4 specialists and 4 children with borderline intelligence function. The P-sample consisted of 18 children with borderline intelligence function, recruited with the consent of the children and their parents. The 21 selected Q-statements were sorted into a normal distribution on a 5-point scale. The collected data were analyzed using the Quanl PC program. The study found two subjective cognition patterns of school-aged children with borderline intelligence function toward school: the 'participatory & dependent type' and the 'onlooking & atrophic type'. These findings can be used to better understand the diverse voices of school-aged children with borderline intelligence function regarding school. The results will also contribute to mediation in educational welfare practice for maintaining a safe and healthy learning environment.

Perspectives of Nurse Students on Problem-Based Learning - Learning Experience in Pediatric Nursing - (문제중심 학습방법 경험에 대한 간호학생의 인식유형 - 아동간호학 학습경험을 중심으로 -)

  • Baek, Kyoung-Seon
    • Child Health Nursing Research / v.15 no.1 / pp.15-23 / 2009
  • Purpose: This research was done to provide fundamental data to improve learning methods in pediatric nursing and meet the needs of students in actual nursing by analyzing nursing students' experiences with problem-based learning in pediatric nursing. Method: Using the 31 Q-samples selected, 20 nursing students from J college were selected as P-samples. The students were personally interviewed in January or February 2008. Result: The study identified 3 types. The first type was the "negative resister", who failed to adapt to problem-based learning and resisted it negatively. The second type was the "active receiver", who participated in the process of problem-based learning and received it actively. The third type was the "passive accepter", who accepted problem-based learning but worried because they were familiar only with traditional learning. Conclusions: In this study, problem-based learning was used for classes in pediatric nursing. The findings indicate that preparation for learning and details should be considered when developing and using modules for pediatric nursing. Further study on the development of problem-based learning modules is also indicated.

UAV-MEC Offloading and Migration Decision Algorithm for Load Balancing in Vehicular Edge Computing Network (차량 엣지 컴퓨팅 네트워크에서 로드 밸런싱을 위한 UAV-MEC 오프로딩 및 마이그레이션 결정 알고리즘)

  • A Young, Shin;Yujin, Lim
    • KIPS Transactions on Computer and Communication Systems / v.11 no.12 / pp.437-444 / 2022
  • Recently, research on mobile edge services has been conducted to handle computationally intensive and latency-sensitive tasks in wireless networks. However, MEC (mobile edge computing) servers, which are fixed on the ground, cannot flexibly cope with situations where task processing requests increase sharply, such as commuting hours. To solve this problem, a technology that provides edge services using UAVs (Unmanned Aerial Vehicles) has emerged. Unlike ground MEC servers, UAVs have limited battery capacity, so it is necessary to optimize energy efficiency through load balancing between UAV MEC servers. In this paper, we therefore propose a load balancing technique that considers the energy state of UAVs and the mobility of vehicles. The proposed technique consists of a task offloading scheme using a genetic algorithm and a task migration scheme using Q-learning. To evaluate the proposed technique, experiments were conducted with varying vehicle speeds and numbers of vehicles, and performance was analyzed in terms of load variance, energy consumption, communication overhead, and delay constraint satisfaction rate.