• Title/Abstract/Keyword: Reinforcement Learning-based Protocol


Performance Enhancement of CSMA/CA MAC Protocol Based on Reinforcement Learning

  • Kim, Tae-Wook;Hwang, Gyung-Ho
    • Journal of information and communication convergence engineering / Vol. 19, No. 1 / pp.1-7 / 2021
  • Reinforcement learning is an area of machine learning that studies how an intelligent agent takes actions in a given environment to maximize the cumulative reward. In this paper, we propose a new MAC protocol based on the Q-learning technique of reinforcement learning to improve the performance of the IEEE 802.11 wireless LAN CSMA/CA MAC protocol, and we describe the corresponding operation of each access point (AP) and station. The AP adjusts the value of the contention window (CW), which determines the range of the station's backoff number, according to the wireless traffic load. The station improves performance by using Q-learning, within the CW value transmitted by the AP, to select an optimal backoff number with the lowest packet collision rate and the highest transmission success rate. Performance evaluation through computer simulations showed that the proposed scheme achieves higher throughput than the existing CSMA/CA scheme.
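The station-side idea in this abstract can be sketched in a few lines; this is an illustrative reading, not the paper's exact algorithm, and the class name, reward values, and hyperparameters are our assumptions:

```python
import random

# Illustrative sketch: a station keeps one Q-value per backoff slot within the
# AP-announced contention window (CW) and updates it from transmission outcomes.
class QBackoffAgent:
    def __init__(self, cw, alpha=0.1, epsilon=0.1):
        self.cw = cw            # CW value announced by the AP
        self.alpha = alpha      # learning rate
        self.epsilon = epsilon  # exploration probability
        self.q = [0.0] * cw     # one Q-value per backoff slot

    def choose_slot(self):
        # Epsilon-greedy selection over backoff slots 0..CW-1.
        if random.random() < self.epsilon:
            return random.randrange(self.cw)
        return self.q.index(max(self.q))

    def update(self, slot, success):
        # Reward +1 for a successful transmission, -1 for a collision.
        reward = 1.0 if success else -1.0
        self.q[slot] += self.alpha * (reward - self.q[slot])

agent = QBackoffAgent(cw=16)
# Suppose slot 3 consistently succeeds while slot 0 consistently collides:
for _ in range(100):
    agent.update(3, success=True)
    agent.update(0, success=False)
assert agent.q[3] > agent.q[0]
```

Over time the station concentrates its backoff choices on slots with high success rates, which is the collision-avoidance effect the abstract describes.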

QLGR: A Q-learning-based Geographic FANET Routing Algorithm Based on Multi-agent Reinforcement Learning

  • Qiu, Xiulin;Xie, Yongsheng;Wang, Yinyin;Ye, Lei;Yang, Yuwang
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 11 / pp.4244-4274 / 2021
  • The utilization of UAVs in various fields has led to the development of flying ad hoc network (FANET) technology. In a network environment with a highly dynamic topology and frequent link changes, traditional FANET routing techniques cannot satisfy the new communication demands, and traditional geographic routing algorithms can fall into routing holes. To address this problem, we propose a geographic routing protocol based on multi-agent reinforcement learning that decreases the packet loss rate and routing cost. The protocol views each node as an intelligent agent that evaluates the value of its neighbor nodes from local information. In the value function, nodes consider information such as link quality, residual energy, and queue length, which reduces the possibility of a routing hole, and the protocol uses global rewards so that individual nodes collaborate in transmitting data. The performance of the protocol is experimentally analyzed for UAVs under extreme conditions such as topology changes and energy constraints. Simulation results show that the proposed QLGR-S protocol outperforms the traditional GPSR protocol in metrics such as throughput, end-to-end delay, and energy consumption. QLGR-S provides more reliable connectivity for UAV networking, safeguards the communication requirements between UAVs, and further promotes the development of UAV technology.
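The neighbor-evaluation idea can be illustrated with a toy scoring function; the weights, normalization, and function names below are our assumptions, not the authors' exact value function:

```python
# Illustrative sketch: a node scores each neighbor from locally observable
# quantities (link quality, residual energy, queue length) and forwards to
# the highest-scoring one.
def neighbor_value(link_quality, residual_energy, queue_length,
                   max_queue, w=(0.4, 0.4, 0.2)):
    # Inputs normalized to [0, 1]; a fuller queue lowers the score, steering
    # traffic away from congested nodes and reducing routing-hole risk.
    congestion = queue_length / max_queue
    return (w[0] * link_quality
            + w[1] * residual_energy
            + w[2] * (1.0 - congestion))

def best_next_hop(neighbors):
    # neighbors: {node_id: (link_quality, residual_energy, queue_length)}
    return max(neighbors, key=lambda n: neighbor_value(*neighbors[n], max_queue=50))

# A healthy neighbor beats one that is low on energy and heavily queued.
hops = {"u1": (0.9, 0.8, 5), "u2": (0.9, 0.2, 45)}
assert best_next_hop(hops) == "u1"
```

In the paper's multi-agent setting, these local scores would additionally be refined by Q-learning with global rewards; the static weighted sum here only conveys the structure of the decision.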

Reinforcement Learning based Autonomous Emergency Steering Control in Virtual Environments

  • 이훈기;김태윤;김효빈;황성호
    • Journal of Drive and Control / Vol. 19, No. 4 / pp.110-116 / 2022
  • Recently, various studies have applied deep learning and AI to fields of autonomous driving such as recognition, sensor processing, decision-making, and control. This paper proposes a reinforcement learning-based controller for autonomous vehicles that is applicable to path following, static obstacle avoidance, and pedestrian avoidance. For repetitive driving simulation, a reinforcement learning environment was constructed using virtual environments. After training on path-following scenarios, we compared control performance with the Pure Pursuit and Stanley controllers, which are widely used for their good performance and simplicity. Based on the test cases of the KNCAP test and assessment protocol, autonomous emergency steering and autonomous emergency braking scenarios were created and used for training. Experimental results showed zero collisions, demonstrating that the reinforcement learning controller succeeded in both the stationary obstacle avoidance and pedestrian avoidance scenarios under the given conditions.

Adaptive Success Rate-based Sensor Relocation for IoT Applications

  • Kim, Moonseong;Lee, Woochan
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 9 / pp.3120-3137 / 2021
  • Small-sized IoT wireless sensing devices can be deployed from small aircraft such as drones, and with efficient relocation algorithms mobile IoT devices can be repositioned to suit data collection. However, the shape of the terrain may be unpredictable, and the mobile IoT devices suited to such terrain are hopping devices that move by jumping. To date, most hopping-sensor relocation studies have made the unrealistic assumption that every hopping device knows the overall state of the entire network and the current state of every other device. Recent work has proposed more realistic relocation algorithms for distributed network environments that do not require all information to be shared simultaneously. However, because shortest-path-based algorithms exchange communication and movement requests with terminal nodes, they are not suitable for areas where obstacles are unevenly distributed. The proposed scheme applies a simple Monte Carlo method, as a form of reinforcement learning, to a relay-node selection random variable that reflects the characteristics of the obstacle distribution, choosing the best relay node rather than fixed relay nodes. Using this random variable significantly reduces the additional messages generated when selecting the shortest path. An additional contribution of this paper is a distributed-environment relocation protocol, the first of its kind, that reflects the characteristics of real-world physical devices, implemented in the OMNeT++ simulator. We also reconstruct a three-day disaster environment and evaluate the proposed protocol in this simulated real-world setting.
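The relay-selection random variable can be sketched as sampling a relay in proportion to learned weights; the weight-update rule and node names here are illustrative assumptions, not the paper's exact method:

```python
import random

# Illustrative sketch: a hopping device keeps a weight per candidate relay,
# nudges it toward 1 after a successful hop and toward 0 after a failure,
# and samples the next relay in proportion to the weights (Monte Carlo style).
def update_weight(w, success, lr=0.3):
    target = 1.0 if success else 0.0
    return w + lr * (target - w)

def sample_relay(weights, rng=random):
    # Roulette-wheel sampling proportional to weight.
    total = sum(weights.values())
    r = rng.random() * total
    for node, w in weights.items():
        r -= w
        if r <= 0:
            return node
    return node  # fallback for floating-point rounding

weights = {"A": 0.5, "B": 0.5}
# Suppose hops via relay A usually clear the obstacles while hops via B fail:
for _ in range(30):
    weights["A"] = update_weight(weights["A"], success=True)
    weights["B"] = update_weight(weights["B"], success=False)
assert weights["A"] > weights["B"]
```

Because selection is probabilistic rather than a committed shortest-path choice, no extra message exchange is needed to agree on a single relay, which matches the message-reduction claim in the abstract.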

Weight Adjustment Scheme Based on Hop Count in Q-routing for Software Defined Networks-enabled Wireless Sensor Networks

  • Godfrey, Daniel;Jang, Jinsoo;Kim, Ki-Il
    • Journal of information and communication convergence engineering / Vol. 20, No. 1 / pp.22-30 / 2022
  • Reinforcement learning has proven its potential for solving sequential decision-making problems under uncertainty, such as finding paths to route data packets in wireless sensor networks. With reinforcement learning, computing the optimum path requires careful definition of the reward function, a linear function that aggregates multiple objective functions into a single numerical value (the reward) to be maximized. In a typical linear reward function, the objectives to be optimized are combined as a weighted sum with fixed weighting factors for all learning agents. This study proposes a reinforcement learning-based routing protocol for wireless sensor networks in which different learning agents prioritize different objectives by assigning different weighting factors to the aggregated objectives of the reward function. We assign weighting factors to the objectives in a sensor node's reward function according to its hop-count distance to the sink node, expecting this approach to enhance the effectiveness of multi-objective reinforcement learning with a balanced trade-off among competing parameters. Furthermore, we propose an SDN (Software Defined Networks) architecture with multiple controllers for constant network monitoring, allowing learning agents to adapt to the dynamics of the network conditions. Simulation results show that the proposed scheme enhances the performance of wireless sensor networks under varied conditions, such as node density and traffic intensity, with a good trade-off among competing performance metrics.

Optimizing Energy Efficiency in Mobile Ad Hoc Networks: An Intelligent Multi-Objective Routing Approach

  • Sun Beibei
    • IEMEK Journal of Embedded Systems and Applications / Vol. 19, No. 2 / pp.107-114 / 2024
  • Mobile ad hoc networks are self-configuring networks of mobile devices that communicate without relying on a fixed infrastructure. Traditional routing protocols in such networks struggle to select efficient and reliable routes because of the dynamic topology caused by the unpredictable mobility of nodes, often failing to meet the low-delay and low-energy-consumption requirements crucial for such networks. To overcome these challenges, this paper introduces a novel multi-objective, adaptive routing scheme based on the Q-learning reinforcement learning algorithm. The proposed scheme adjusts itself dynamically to measured network states, such as traffic congestion and mobility, and uses Q-learning to select routes in a decentralized manner, considering factors such as energy consumption, load balancing, and the selection of stable links. We formulate the multi-objective optimization problem and discuss adaptive adjustment of the Q-learning parameters to handle the dynamic nature of the network. To speed up learning, the scheme incorporates informative shaped rewards that give the learning agents additional guidance toward better solutions. Implemented on top of the widely used AODV routing protocol, the proposed approach demonstrates better energy efficiency and message delivery delay than traditional AODV, even in highly dynamic network environments. These findings show the potential of leveraging reinforcement learning for efficient routing in ad hoc networks, paving the way for future advances in mobile ad hoc networking.
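"Informative shaped rewards" are commonly realized as potential-based shaping; the abstract does not specify the exact form, so the potential function below (negative hop distance to the destination) is our illustrative assumption:

```python
# Illustrative sketch of potential-based reward shaping for route learning:
# the shaped reward adds the discounted change in a potential function, which
# guides the agent without changing which policy is optimal.
def shaped_reward(base_reward, phi_s, phi_s_next, gamma=0.9):
    return base_reward + gamma * phi_s_next - phi_s

# Assumed potential: the closer to the destination (fewer hops), the higher.
def phi(hops_to_dst):
    return -float(hops_to_dst)

# Moving one hop closer to the destination yields positive guidance even
# before any end-to-end delivery reward arrives.
r = shaped_reward(0.0, phi(5), phi(4))
assert r > 0
```

This is why shaping speeds up learning: intermediate forwarding decisions receive immediate feedback instead of waiting for the sparse delivery reward.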

Reinforcement Learning based Multi-Channel MAC Protocol for Cognitive Radio Ad-hoc Networks

  • 박형근
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 26, No. 7 / pp.1026-1031 / 2022
  • Cognitive radio ad-hoc networks (CRAHNs) are a network technology that can overcome the spectrum scarcity caused by the growth of wireless services. In a CRAHN, channel sensing is required to identify idle channels and avoid interference with primary users, and when a primary user appears, the time delay caused by handover must be minimized through fast idle-channel selection. In this study, reinforcement learning is used to reduce the set of channels a secondary user must sense and to sense first those channels with a high probability of being idle, thereby improving transmission efficiency. In addition, a multi-channel medium access control (MAC) protocol is proposed that senses the channel at the moment of data transmission rather than performing periodic sensing, minimizing the possibility of collision with primary users caused by the gap between the sensing time and the transmission time, and its performance is analyzed through simulation.
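The sensing-prioritization idea can be sketched as learning a per-channel idle estimate and sensing in descending order of it; the class name, update rule, and parameters are our assumptions, not the paper's exact protocol:

```python
# Illustrative sketch: a secondary user keeps a running estimate of each
# channel's idle probability, updates it from sensing outcomes, and restricts
# sensing to the top-k most promising channels.
class ChannelRanker:
    def __init__(self, n_channels, alpha=0.2):
        self.alpha = alpha
        self.idle_est = [0.5] * n_channels  # neutral initial estimate

    def observe(self, ch, idle):
        # Exponential moving average of idle observations (1 = idle, 0 = busy).
        target = 1.0 if idle else 0.0
        self.idle_est[ch] += self.alpha * (target - self.idle_est[ch])

    def sensing_order(self, k):
        # Sense only the k channels most likely to be idle, best first.
        order = sorted(range(len(self.idle_est)), key=lambda c: -self.idle_est[c])
        return order[:k]

ranker = ChannelRanker(5)
for _ in range(50):
    ranker.observe(2, idle=True)   # channel 2 is usually idle
    ranker.observe(0, idle=False)  # channel 0 is occupied by a primary user
assert ranker.sensing_order(1) == [2]
```

Shrinking the sensing set this way is what shortens the handover delay when a primary user appears: the device checks the most promising channels first instead of scanning all of them.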

Learning Automata Based Multipath Multicasting in Cognitive Radio Networks

  • Ali, Asad;Qadir, Junaid;Baig, Adeel
    • Journal of Communications and Networks / Vol. 17, No. 4 / pp.406-418 / 2015
  • Cognitive radio networks (CRNs) have emerged as a promising solution to the problem of spectrum underutilization and artificial radio spectrum scarcity. The paradigm of dynamic spectrum access allows a secondary network comprising secondary users (SUs) to coexist with a primary network comprising licensed primary users (PUs), subject to the condition that the SUs do not interfere with the primary network. Since SUs must avoid any interference to the primary network, PU activity precludes SU attempts to access the licensed spectrum and forces frequent channel switching for SUs. This dynamic nature of CRNs, coupled with the possibility that an SU may not share a common channel with all its neighbors, makes multicast routing especially challenging. In this work, we propose a novel multipath on-demand multicast routing protocol for CRNs. The approach of multipath routing, although common in unicast routing, had not previously been explored for multicasting. Motivated by the fact that CRNs have highly dynamic conditions whose parameters are often unknown, the multicast routing problem is modeled in the reinforcement learning-based framework of learning automata. Simulation results demonstrate that multipath multicasting is feasible, with the proposed protocol outperforming a baseline state-of-the-art CRN multicasting protocol.

Q-Learning based Collision Avoidance for 802.11 Stations with Maximum Requirements

  • Chang Kyu Lee;Dong Hyun Lee;Junseok Kim;Xiaoying Lei;Seung Hyong Rhee
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 3 / pp.1035-1048 / 2023
  • The IEEE 802.11 WLAN adopts a random backoff algorithm for collision avoidance, and this contention-based algorithm is well known to suffer performance degradation, especially in congested networks. In this paper, we design an efficient backoff algorithm that uses a reinforcement learning method to determine optimal backoff values. In our scheme, the mobile nodes share a common contention window (CW) and, using a Q-learning algorithm, avoid collisions by finding and implicitly reserving their optimal time slot(s). In addition, we introduce a Frame Size Control (FSC) algorithm to minimize the possible degradation of aggregate throughput when the number of nodes exceeds the CW size. Our simulations show that the proposed backoff algorithm with the FSC method outperforms the 802.11 protocol regardless of traffic conditions, and an analytical model proves that our mechanism has a unique operating point that is fair and stable.

Retained Message Delivery Scheme utilizing Reinforcement Learning in MQTT-based IoT Networks

  • 경연웅;김태국;김영준
    • Journal of Internet of Things and Convergence / Vol. 10, No. 2 / pp.131-135 / 2024
  • In the MQTT protocol, if the retained flag of a message published by a publisher is set, the message is stored at the broker as a retained message, and when a new subscriber subscribes, the broker immediately delivers the retained message. This allows a new subscriber to update its view of the current state through the retained message without waiting for the publisher to publish a new message. However, when the publisher publishes new messages frequently, sending the retained message can become traffic overhead, and this overhead grows larger when new subscribers subscribe frequently. To solve this problem, this study proposes a retained message delivery scheme for the broker that considers the characteristics of the published messages. We model the broker's choice between transmitting to a new subscriber and waiting as a reinforcement learning problem and determine the optimal delivery policy using a Q-learning algorithm. Performance analysis confirms that the proposed scheme improves on the existing method.
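The broker's send-or-wait decision can be sketched as a tiny Q-learning problem; the state bucketing, reward values, and class name below are our illustrative assumptions, not the paper's exact model:

```python
import random

# Illustrative sketch: on each new subscription, the broker chooses SEND
# (deliver the retained message now) or WAIT (a fresh publish is expected
# soon). The state is a coarse bucket of the topic's publish rate.
ACTIONS = ("SEND", "WAIT")

class RetainedPolicy:
    def __init__(self, alpha=0.1, epsilon=0.1):
        self.q = {}             # (rate_bucket, action) -> value
        self.alpha = alpha      # learning rate
        self.epsilon = epsilon  # exploration probability

    def act(self, rate_bucket):
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q.get((rate_bucket, a), 0.0))

    def update(self, rate_bucket, action, reward):
        key = (rate_bucket, action)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.alpha * (reward - old)

policy = RetainedPolicy(epsilon=0.0)
# On a high-publish-rate topic, SEND is redundant traffic (negative reward)
# while WAIT costs little, since a fresh message arrives soon anyway.
for _ in range(100):
    policy.update("high", "SEND", -1.0)
    policy.update("high", "WAIT", 0.5)
assert policy.act("high") == "WAIT"
```

On a low-publish-rate topic the rewards would flip (waiting leaves the subscriber stale), so the learned policy naturally becomes publish-rate-dependent, which is the adaptivity the abstract claims.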