• Title/Summary/Keyword: Multi-armed bandit (MAB) problem

Opportunistic Spectrum Access Based on a Constrained Multi-Armed Bandit Formulation

  • Ai, Jing;Abouzeid, Alhussein A.
    • Journal of Communications and Networks / v.11 no.2 / pp.134-147 / 2009
  • Tracking and exploiting instantaneous spectrum opportunities are fundamental challenges in opportunistic spectrum access (OSA) in the presence of the bursty traffic of primary users and the limited spectrum sensing capability of secondary users. To take advantage of the history of spectrum sensing and access decisions, a sequential decision framework is widely used to design optimal policies. However, many existing schemes based on a partially observed Markov decision process (POMDP) framework reveal that optimal policies are non-stationary in nature, which renders them difficult to calculate and implement. Therefore, this work pursues stationary OSA policies, which are efficient yet low in complexity, while still incorporating many practical factors, such as spectrum sensing errors and a priori unknown statistical spectrum knowledge. First, with an approximation on channel evolution, OSA is formulated in a multi-armed bandit (MAB) framework. As a result, the optimal policy is specified by the well-known Gittins index rule, where the channel with the largest Gittins index is always selected. Then, depending on whether the Markovian channel parameters are available a priori or not, either closed-form formulas with tunable approximation are derived for the Gittins indices, or a reinforcement learning algorithm is designed to calculate them. Finally, extensive experiments demonstrate the superiority of the scheme over other existing schemes in terms of the quality of policies and optimality.
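
The Gittins index rule mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's closed-form formula or its reinforcement learning algorithm. It assumes a fully observed, finite-state Markov arm with known transition matrix `P`, per-state rewards `r`, and discount factor `beta`, and it computes the indices with the standard restart-in-state formulation solved by value iteration. The function names `gittins_indices` and `select_channel`, and the two-state example, are hypothetical.

```python
import numpy as np


def gittins_indices(P, r, beta, tol=1e-10, max_iter=100_000):
    """Gittins index of every state of one Markov arm (restart formulation).

    Assumptions: P is a row-stochastic transition matrix, r holds the
    per-state rewards, and 0 < beta < 1 is the discount factor.
    """
    P = np.asarray(P, dtype=float)
    r = np.asarray(r, dtype=float)
    n = len(r)
    indices = np.empty(n)
    for s in range(n):
        # Restart problem: in any state, either continue from that state
        # or jump back to state s and continue from there.
        V = np.zeros(n)
        for _ in range(max_iter):
            cont = r + beta * (P @ V)
            V_new = np.maximum(cont, cont[s])
            converged = np.max(np.abs(V_new - V)) < tol
            V = V_new
            if converged:
                break
        # Normalizing by (1 - beta) puts the index in per-step reward units.
        indices[s] = (1.0 - beta) * V[s]
    return indices


def select_channel(channel_states, index_tables):
    """Index rule: pick the channel whose current state has the largest index."""
    current = [index_tables[k][s] for k, s in enumerate(channel_states)]
    return int(np.argmax(current))


# Hypothetical arm: state 0 pays 1 once, then moves to absorbing state 1 (pays 0).
example = gittins_indices(P=[[0.0, 1.0], [0.0, 1.0]], r=[1.0, 0.0], beta=0.9)
```

In this toy arm the index of state 0 is the reward collected before stopping, and the absorbing state has index zero. With one index table per channel, `select_channel` applies the rule of always choosing the largest index.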

Reinforcement Learning-Based Illuminance Control Method for Building Lighting System (강화학습 기반 빌딩의 방별 조명 시스템 조도값 설정 기법)

  • Kim, Jongmin;Kim, Sunyong
    • Journal of IKEEE / v.26 no.1 / pp.56-61 / 2022
  • Various efforts have been made worldwide to respond to environmental problems such as climate change. Research on artificial intelligence (AI)-based energy management has been widely conducted as the most effective way to alleviate the climate change problem. In particular, buildings, which account for more than 20% of the total energy delivered worldwide, have become a focus of energy management using the building energy management system (BEMS). In this paper, we propose a multi-armed bandit (MAB)-based energy management algorithm that can efficiently decide the energy consumption level of the lighting system in each room of the building, while minimizing the discomfort levels of the occupants of each room.
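
The abstract does not say which bandit algorithm or reward definition the authors use, so the following is only a sketch of how an MAB learner could pick an illuminance level for a single room. It swaps in the standard UCB1 rule. The candidate levels in `LEVELS_LUX`, the occupant's `preferred_lux`, the `energy_weight`, and the simulated reward model are all hypothetical assumptions made for illustration.

```python
import math
import random

# Hypothetical candidate illuminance settings (arms) for one room, in lux.
LEVELS_LUX = [100, 300, 500, 700]


def simulated_reward(level, preferred_lux, rng, energy_weight=0.2):
    """Assumed reward in [0, 1]: high when energy use and discomfort are low."""
    energy = level / max(LEVELS_LUX)
    discomfort = abs(level - preferred_lux) / max(LEVELS_LUX)
    cost = energy_weight * energy + (1.0 - energy_weight) * discomfort
    noisy = 1.0 - cost + rng.gauss(0.0, 0.05)  # noisy occupant feedback
    return min(1.0, max(0.0, noisy))


def run_ucb1(preferred_lux, rounds=5000, seed=0):
    """Run UCB1 for one room; returns pull counts and mean reward per level."""
    rng = random.Random(seed)
    n_arms = len(LEVELS_LUX)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # try every level once before using the index
        else:
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = simulated_reward(LEVELS_LUX[arm], preferred_lux, rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
    return counts, means


counts, means = run_ucb1(preferred_lux=300)
best_level = LEVELS_LUX[counts.index(max(counts))]
```

A building-wide version could run one such learner per room, each with its own feedback signal. How the paper couples the rooms or measures discomfort is not stated in the abstract.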