• Title/Summary/Keyword: Q learning

Practical Implementation and Stability Analysis of ALOHA-Q for Wireless Sensor Networks

  • Kosunalp, Selahattin; Mitchell, Paul Daniel; Grace, David; Clarke, Tim
    • ETRI Journal / v.38 no.5 / pp.911-921 / 2016
  • This paper presents the description, practical implementation, and stability analysis of a recently proposed, energy-efficient medium access control protocol for wireless sensor networks, ALOHA-Q, which employs a reinforcement-learning framework as an intelligent transmission strategy. Channel performance is evaluated through simulation and through experiments conducted on a real-world test-bed. The stability of the system against possible changes in the environment and changing channel conditions is studied, with a discussion of the system's level of resilience. A Markov model is derived to represent the system behavior and estimate the time at which the system ceases to operate. A novel scheme is also proposed to preserve the lifetime of the system when the environment and channel conditions cannot sustain its operation.
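
  The abstract does not spell out the learning rule, but published descriptions of ALOHA-Q use stateless Q-learning over the slots of a repeating frame: each node keeps one Q-value per slot, transmits in its best slot, and moves that value toward +1 on a collision-free transmission and toward -1 otherwise. A minimal sketch along those lines (the class name, parameters, and the ±1 rewards are assumptions, not the paper's code):

      import random

      class AlohaQNode:
          """Stateless Q-learning over frame slots, in the spirit of ALOHA-Q."""
          def __init__(self, frame_len, alpha=0.1):
              self.alpha = alpha                 # learning rate
              self.q = [0.0] * frame_len         # one Q-value per frame slot

          def pick_slot(self):
              # Transmit in the highest-valued slot; break ties randomly.
              best = max(self.q)
              return random.choice([i for i, v in enumerate(self.q) if v == best])

          def update(self, slot, success):
              # Reward +1 for a collision-free transmission, -1 otherwise.
              r = 1.0 if success else -1.0
              self.q[slot] += self.alpha * (r - self.q[slot])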

Deep Q-Network based Game Agents (심층 큐 신경망을 이용한 게임 에이전트 구현)

  • Han, Dongki; Kim, Myeongseop; Kim, Jaeyoun; Kim, Jung-Su
    • The Journal of Korea Robotics Society / v.14 no.3 / pp.157-162 / 2019
  • The video game Tetris is one of the most popular games, and it is well known that its rules can be modelled as an MDP (Markov Decision Process). This paper presents a DQN (Deep Q-Network) based agent for the Tetris game. To this end, the state is defined as a captured image of the Tetris board, and the reward is designed as a function of the lines cleared by the agent. The action set consists of left, right, rotate, drop, and a finite number of their combinations. In addition, PER (Prioritized Experience Replay) is employed to enhance learning performance. More than 500,000 episodes are used to train the network, and the agent uses the trained network to make decisions. The performance of the developed algorithm is validated not only in simulation but also on a real Tetris robot agent built from a camera, two Arduinos, four servo motors, and 3D-printed artificial fingers.
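
  For reference, PER replaces uniform replay sampling with sampling proportional to TD-error-based priorities, corrected by importance-sampling weights. A minimal proportional-priority buffer in the style of Schaul et al. (2016); the class and parameter names are illustrative and not taken from the paper:

      import numpy as np

      class PrioritizedReplay:
          """Minimal proportional prioritized experience replay."""
          def __init__(self, capacity, alpha=0.6, eps=1e-6):
              self.capacity, self.alpha, self.eps = capacity, alpha, eps
              self.data, self.prio = [], []

          def add(self, transition):
              # New transitions get the current max priority so they are replayed soon.
              p = max(self.prio, default=1.0)
              if len(self.data) >= self.capacity:
                  self.data.pop(0); self.prio.pop(0)
              self.data.append(transition); self.prio.append(p)

          def sample(self, batch_size, beta=0.4):
              probs = np.array(self.prio) ** self.alpha
              probs /= probs.sum()
              idx = np.random.choice(len(self.data), batch_size, p=probs)
              # Importance-sampling weights correct the bias from non-uniform sampling.
              w = (len(self.data) * probs[idx]) ** (-beta)
              w /= w.max()
              return idx, [self.data[i] for i in idx], w

          def update_priorities(self, idx, td_errors):
              for i, e in zip(idx, td_errors):
                  self.prio[i] = abs(e) + self.eps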

Comparative Analysis of Multi-Agent Reinforcement Learning Algorithms Based on Q-Value (상태 행동 가치 기반 다중 에이전트 강화학습 알고리즘들의 비교 분석 실험)

  • Kim, Ju-Bong; Choi, Ho-Bin; Han, Youn-Hee
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.447-450 / 2021
  • In many multi-agent environments, including simulations, the centralized training with decentralized execution (CTDE) paradigm is widely used. Under CTDE, there has been extensive research on state-action value (Q-value) based multi-agent algorithms for multi-agent learning. These algorithms derive from the strong benchmark algorithm Independent Q-learning (IQL) and have focused on the problem of decomposing the joint state-action value of multiple agents. This paper analyzes these prior algorithms and validates them through experimental analysis in practical and general domains.
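
  The IQL baseline referred to above simply gives each agent its own Q-function and standard Q-learning update over its local observation, treating all other agents as part of the environment. A toy tabular sketch (all names and parameters are illustrative):

      import random
      from collections import defaultdict

      def make_iql_agent(n_actions, alpha=0.1, gamma=0.95, eps=0.1):
          q = defaultdict(lambda: [0.0] * n_actions)

          def act(obs):
              if random.random() < eps:
                  return random.randrange(n_actions)
              return max(range(n_actions), key=lambda a: q[obs][a])

          def learn(obs, a, r, next_obs):
              # Standard Q-learning on this agent's *local* observation only;
              # the other agents' behaviour is folded into the environment dynamics.
              target = r + gamma * max(q[next_obs])
              q[obs][a] += alpha * (target - q[obs][a])

          return act, learn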

An Application of the HoQ Framework to Website Performance Improvement: Case Study of an Online Education Website (웹사이트 경쟁력 강화를 위한 평가 및 개선 방안 : HoQ 모형에 기반한, 온라인교육 K사 웹사이트의 품질 개선)

  • Kim, Do-Hoon; Suh, Young-Ho; Roh, In-Sung
    • Journal of Korean Society for Quality Management / v.33 no.2 / pp.40-50 / 2005
  • HoQ (House of Quality) provides an effective tool not only to organize and evaluate the VoC (Voice of the Customer) and VoE (Voice of the Engineer), but also to link and combine them, thereby giving explicit directions for quality improvement. There has been, however, little research on the HoQ framework in the IT industry. The case study discussed here illustrates the applicability and usefulness of the HoQ approach to website quality improvement. The proposed HoQ framework shows great potential since customer needs are explicitly considered in the framework, and it helps website administrators develop better web services by providing guidelines for reengineering website operations.

Proactive Operational Method for the Transfer Robot of FMC (FMC 반송용 로봇의 선견형 운영방법)

  • Yoon, Jung-Ik; Um, In-Sup; Lee, Hong-Chul
    • Journal of the Korea Society for Simulation / v.17 no.4 / pp.249-257 / 2008
  • This paper presents an applied Q-learning algorithm that selects both the waiting position of a robot and the part to be serviced next in a flexible manufacturing cell (FMC) consisting of one robot and various types of facilities. To verify the performance of the suggested algorithm, we model a general FMC made up of a single transfer robot and multiple machines in simulation and compare the output with that of other control methods. The analysis shows that the algorithm improves average processing time and total throughput by increasing robot utilization and, conversely, decreasing robot waiting time. Furthermore, because it is easier to use than other, more complex methods and is readily adoptable in the real world, we expect this method to contribute to overall FMC efficiency.
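
  The paper does not reproduce its update rule, but the controller it describes fits ordinary tabular Q-learning with a composite action of (waiting position, next part to serve). A sketch under that assumption; the state encoding, action counts, and reward are illustrative:

      import random
      from collections import defaultdict
      from itertools import product

      WAIT_POSITIONS = range(3)      # illustrative: 3 candidate waiting spots
      PARTS = range(4)               # illustrative: 4 part types queued
      ACTIONS = list(product(WAIT_POSITIONS, PARTS))  # (where to wait, what to serve)

      Q = defaultdict(lambda: [0.0] * len(ACTIONS))
      alpha, gamma, eps = 0.1, 0.9, 0.1

      def choose(state):
          if random.random() < eps:
              return random.randrange(len(ACTIONS))
          return max(range(len(ACTIONS)), key=lambda a: Q[state][a])

      def update(state, a, reward, next_state):
          # Reward could be, e.g., negative robot waiting time, so maximizing Q
          # minimizes idle time and raises utilization (as the paper reports).
          Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])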

Dynamic Resource Allocation in Distributed Cloud Computing (분산 클라우드 컴퓨팅을 위한 동적 자원 할당 기법)

  • Ahn, TaeHyoung; Kim, Yena; Lee, SuKyoung
    • The Journal of Korean Institute of Communications and Information Sciences / v.38B no.7 / pp.512-518 / 2013
  • A resource allocation algorithm has a high impact on user satisfaction as well as on the ability to accommodate and process services in distributed cloud computing. In other words, service rejections, which occur when datacenters do not have enough resources, degrade user satisfaction. Therefore, in this paper, we propose a resource allocation algorithm that considers a cloud domain's remaining resources in order to minimize the number of service rejections. The Q-learning-based allocation rate rises toward the maximum when the remaining resources are sufficient and falls otherwise, thereby avoiding service rejections. To demonstrate its effectiveness, we compare the proposed algorithm with two previous works and show that it yields a smaller number of service rejections.
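
  Read as reinforcement learning, the scheme maps a domain's remaining-resource level to an allocation rate. One plausible reward design, assumed here rather than taken from the paper, makes a rejection outweigh any gain from allocating more aggressively:

      RATES = [0.25, 0.5, 0.75, 1.0]     # candidate allocation rates (actions)
      LEVELS = range(5)                  # discretized remaining-resource states
      Q = {(s, a): 0.0 for s in LEVELS for a in range(len(RATES))}

      def reward(allocated_rate, rejected):
          # Assumed shaping: one rejection outweighs any gain from allocating more.
          return -10.0 if rejected else allocated_rate

      def update(s, a, r, s_next, alpha=0.1, gamma=0.9):
          best_next = max(Q[(s_next, b)] for b in range(len(RATES)))
          Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])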

Q Learning MDP Approach to Mitigate Jamming Attack Using Stochastic Game Theory Modelling With WQLA in Cognitive Radio Networks

  • Vimal, S.; Robinson, Y. Harold; Kaliappan, M.; Pasupathi, Subbulakshmi; Suresh, A.
    • Journal of Platform Technology / v.9 no.1 / pp.3-14 / 2021
  • A cognitive radio (CR) network is a promising paradigm that helps unlicensed (secondary) users analyse the spectrum and coordinate spectrum access to support the creation of a common control channel (CCC). Secondary users cooperate and broadcast to one another by transmitting messages over the CCC. If the control channels are jammed, the network's performance degrades directly, and in such a scenario jammers can devastate the control channels. Hopping sequences are one of the predominant approaches used to confront the jammer. The jamming attack can be alleviated using a game-modelling approach: in the proposed scheme, stochastic games with multiple users are analysed to provide flexible control channels against intrusive attacks, specifying each player's states, strategies, actions, and rewards. The proposed work applies a stochastic game-theoretic model with modern player actions and an improved strategic view to prevent jamming attacks in the CR network. Decisions are selected with a Q-learning approach to mitigate the jamming nodes via an optimal MDP decision process.
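
  One concrete, assumed reading of the Q-learning component is to treat channel hopping itself as the action space: the state is the channel used in the last slot, the action is the next channel, and the reward records whether the jammer hit it. A sketch (channel count, rewards, and parameters are illustrative):

      import random

      N_CH = 8                                  # illustrative channel count
      Q = [[0.0] * N_CH for _ in range(N_CH)]   # Q[last_channel][next_channel]

      def hop(last_ch, eps=0.1):
          # Epsilon-greedy hop: mostly exploit the best next channel found so far.
          if random.random() < eps:
              return random.randrange(N_CH)
          return max(range(N_CH), key=lambda c: Q[last_ch][c])

      def observe(last_ch, ch, jammed, alpha=0.2, gamma=0.9):
          r = -1.0 if jammed else 1.0           # assumed reward: survive the slot
          Q[last_ch][ch] += alpha * (r + gamma * max(Q[ch]) - Q[last_ch][ch])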

Deep reinforcement learning for optimal life-cycle management of deteriorating regional bridges using double-deep Q-networks

  • Xiaoming, Lei; You, Dong
    • Smart Structures and Systems / v.30 no.6 / pp.571-582 / 2022
  • Optimal life-cycle management is a challenging issue for deteriorating regional bridges. Due to the complexity of regional bridge structural conditions and the large number of possible inspection and maintenance actions, decision-makers generally choose traditional passive management strategies, which are less efficient and less cost-effective. To tackle these problems, this paper proposes a deep reinforcement learning framework employing double-deep Q-networks (DDQNs) to improve the life-cycle management of deteriorating regional bridges. It produces optimal maintenance plans that maximize maintenance cost-effectiveness subject to constraints. The DDQN method handles the overestimation of Q-values that affects the Nature DQN. This study also identifies regional bridge deterioration characteristics and the consequences of scheduled maintenance from years of inspection data. To validate the proposed method, a case study containing hundreds of bridges is used to develop optimal life-cycle management strategies. The optimized solutions recommend fewer replacement actions and prefer preventive repair actions when bridges are damaged or expected to be damaged. By employing the optimal life-cycle regional maintenance strategies, bridge conditions can be kept at a good level. Compared to the Nature DQN, the DDQN yields an optimized scheme with fewer low-condition bridges and a more cost-effective life-cycle management plan.
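
  The overestimation fix that the abstract credits to DDQNs is a one-line change to the bootstrap target: the online network selects the next action and the target network evaluates it (van Hasselt et al., 2016). Sketched below with plain arrays standing in for network outputs:

      import numpy as np

      def dqn_target(r, q_target_next, gamma=0.99):
          # Nature DQN: the target network both selects and evaluates the next
          # action, which systematically overestimates noisy Q-values.
          return r + gamma * np.max(q_target_next)

      def ddqn_target(r, q_online_next, q_target_next, gamma=0.99):
          # Double DQN: the online network selects, the target network evaluates,
          # decoupling action selection from value estimation.
          a_star = np.argmax(q_online_next)
          return r + gamma * q_target_next[a_star]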

A Survey on UAV Network for Secure Communication and Attack Detection: A focus on Q-learning, Blockchain, IRS and mmWave Technologies

  • Madhuvanthi T; Revathi A
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.3 / pp.779-800 / 2024
  • Unmanned Aerial Vehicle (UAV) networks, also known as drone networks, have gained significant attention for their potential in various applications, including communication. A UAV communication network uses a fleet of drones to establish wireless connectivity and provide communication services in areas where traditional infrastructure is lacking or disrupted. Such networks must be highly secure to protect both the technology and the safety of its users. This survey provides a comprehensive overview of current state-of-the-art UAV network security solutions. We analyze the existing literature on UAV security and identify the various types of attacks and the underlying vulnerabilities they exploit. Detailed mitigation techniques and countermeasures for the protection of UAVs are described. The survey focuses on the implementation of novel technologies such as Q-learning, blockchain, IRS, and mmWave, and discusses network simulation tools that range in complexity, features, and programming capabilities. Finally, future research directions and challenges are highlighted.

A study on the Types of Perception for the Liberal arts Education of University Students Using Q Methodology (Q 방법을 활용한 대학생의 교양교육에 대한 인식 유형 연구)

  • Lee, Hye-Ju
    • Journal of Digital Convergence / v.19 no.12 / pp.103-113 / 2021
  • In this study, the Q methodology is used to investigate the types of perceptions of liberal arts education held by college students and the characteristics of each type. Thirty-three Q samples were extracted from a Q population collected through literature research, open questionnaires, and in-depth interviews. Q sorting was conducted with 27 students of A University, located in B City, and the data were analyzed using the QUANL program. Four types of perception of liberal arts education were derived: "pursuit of diverse experiences," "pursuit of practical studies," "pursuit of expanded thinking," and "pursuit of social change." The results of this study re-establish the meaning of liberal arts education in university education and suggest that diverse educational contents and teaching-learning methods should be considered.