• Title/Summary/Keyword: optimal learning

Application of Deep Recurrent Q Network with Dueling Architecture for Optimal Sepsis Treatment Policy

  • Do, Thanh-Cong;Yang, Hyung Jeong;Ho, Ngoc-Huynh
    • Smart Media Journal
    • /
    • v.10 no.2
    • /
    • pp.48-54
    • /
    • 2021
  • Sepsis is one of the leading causes of mortality globally, and it costs billions of dollars annually. However, treating septic patients remains highly challenging, and more research is needed into a general treatment method for sepsis. Therefore, in this work, we propose a reinforcement learning method for learning optimal treatment strategies for septic patients. We model the patient physiological time series data as the input to a deep recurrent Q-network that learns reliable treatment policies. We evaluate our model using an off-policy evaluation method, and the experimental results indicate that it outperforms the physicians' policy, reducing patient mortality by up to 3.04%. Thus, our model can be used as a tool to reduce patient mortality by supporting clinicians in making dynamic decisions.
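
The title names a dueling architecture, but the abstract does not describe the network itself. As a minimal sketch, the aggregation step commonly used in dueling Q-networks combines a state value V(s) with per-action advantages A(s, a), subtracting the mean advantage. The function names and the example numbers below are illustrative assumptions, not taken from the paper.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values: List[float]) -> int:
    """Return the index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical outputs of the value and advantage streams for one state.
q = dueling_q_values(1.0, [0.5, -0.5, 0.0])
best = greedy_action(q)
```

Subtracting the mean advantage keeps the value and advantage streams identifiable; in a recurrent variant, both streams would presumably be fed by the recurrent layer's hidden state.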

Machine-Learning-Based User Group and Beam Selection for Coordinated Millimeter-wave Systems

  • Ju, Sang-Lim;Kim, Nam-il;Kim, Kyung-Seok
    • International journal of advanced smart convergence
    • /
    • v.9 no.4
    • /
    • pp.156-166
    • /
    • 2020
  • In this paper, we propose an optimal user group and beam selection scheme to improve spectral efficiency and mitigate interference in coordinated millimeter-wave systems. The proposed scheme improves spectral efficiency by mitigating intra- and inter-cell interference (ICI). By examining the effective channel capacity for all possible user combinations, the user combinations and beams that minimize ICI can be selected. However, implementing this search in an environment dense with cells and users is computationally very demanding, which we address by applying multiclass classifiers based on machine learning. Numerical results show that, compared with the conventional scheme, the proposed scheme achieves near-optimal performance, making it an attractive option for these systems.

RL-based Path Planning for SLAM Uncertainty Minimization in Urban Mapping (도시환경 매핑 시 SLAM 불확실성 최소화를 위한 강화 학습 기반 경로 계획법)

  • Cho, Younghun;Kim, Ayoung
    • The Journal of Korea Robotics Society
    • /
    • v.16 no.2
    • /
    • pp.122-129
    • /
    • 2021
  • In the Simultaneous Localization and Mapping (SLAM) problem, different paths produce different SLAM results. Usually, SLAM passively follows the trail of its input data. Active SLAM, which determines where to sense at the next step, can suggest a better path, and hence a better SLAM result, during the data acquisition step. In this paper, we use reinforcement learning to decide where to perceive next. By setting coverage of the entire target area as the goal and assigning uncertainty a negative reward, the reinforcement learning network finds an optimal path that minimizes trajectory uncertainty and maximizes map coverage. However, most active SLAM research is performed in indoor or aerial environments where robots can move in every direction. In an urban environment, vehicles can only move along the road structure and according to traffic rules. A graph structure can efficiently express the road environment by treating crossroads and streets as nodes and edges, respectively. We therefore propose a novel method for finding an optimal SLAM path using a graph structure and reinforcement learning.
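
The road-graph idea can be sketched as follows: crossroads are nodes, streets are edges, and the agent's available actions at a node are simply its neighbors. The toy graph, the `coverage_bonus` parameter, and the exact reward shape are assumptions made for illustration; the abstract only states that coverage is the goal and uncertainty is a negative reward.

```python
from typing import Dict, List, Set

# Hypothetical toy road network: keys are crossroads (nodes), values are
# the crossroads reachable over one street (edge).
ROAD_GRAPH: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}


def valid_actions(graph: Dict[str, List[str]], node: str) -> List[str]:
    """A vehicle may only drive to a crossroad connected by a street."""
    return list(graph[node])


def step_reward(visited: Set[str], next_node: str, uncertainty: float,
                coverage_bonus: float = 1.0) -> float:
    """Reward newly covered nodes and penalize pose uncertainty (assumed form)."""
    bonus = coverage_bonus if next_node not in visited else 0.0
    return bonus - uncertainty
```

Restricting actions to graph neighbors is what distinguishes this setting from indoor or aerial active SLAM, where the robot can move in any direction.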

Q-Learning based Collision Avoidance for 802.11 Stations with Maximum Requirements

  • Chang Kyu Lee;Dong Hyun Lee;Junseok Kim;Xiaoying Lei;Seung Hyong Rhee
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.3
    • /
    • pp.1035-1048
    • /
    • 2023
  • The IEEE 802.11 WLAN adopts a random backoff algorithm as its collision avoidance mechanism, and it is well known that this contention-based algorithm may suffer from performance degradation, especially in congested networks. In this paper, we design an efficient backoff algorithm that uses reinforcement learning to determine optimal backoff values. In our scheme, the mobile nodes share a common contention window (CW), and using a Q-learning algorithm, they avoid collisions by finding and implicitly reserving their optimal time slot(s). In addition, we introduce a Frame Size Control (FSC) algorithm to minimize the possible degradation of aggregate throughput when the number of nodes exceeds the CW size. Our simulations show that the proposed backoff algorithm with the FSC method outperforms the 802.11 protocol regardless of traffic conditions, and an analytical model proves that our mechanism has a unique operating point that is fair and stable.
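
To make the slot-reservation idea concrete, here is a simplified, stateless Q-learning sketch in which each station keeps one Q-value per slot of a shared contention window and is rewarded for collision-free transmissions. The reward values (+1 and -1), the epsilon-greedy policy, and all function names are assumptions; the paper's actual state, reward design, and FSC algorithm are not given in the abstract and are not modeled here.

```python
import random
from typing import List


def choose_slot(q_row: List[float], epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy choice of a backoff slot, breaking ties at random."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([i for i, q in enumerate(q_row) if q == best])


def update_q(q_row: List[float], slot: int, reward: float, alpha: float) -> None:
    """Move the slot's Q-value toward the observed reward."""
    q_row[slot] += alpha * (reward - q_row[slot])


def simulate(n_stations: int, cw_size: int, rounds: int,
             alpha: float = 0.1, epsilon: float = 0.1,
             seed: int = 0) -> List[List[float]]:
    """Run contention rounds and return each station's learned Q-values."""
    rng = random.Random(seed)
    q = [[0.0] * cw_size for _ in range(n_stations)]
    for _ in range(rounds):
        picks = [choose_slot(row, epsilon, rng) for row in q]
        for station, slot in enumerate(picks):
            collided = picks.count(slot) > 1
            update_q(q[station], slot, -1.0 if collided else 1.0, alpha)
    return q


q_table = simulate(n_stations=3, cw_size=4, rounds=200)
```

A station that repeatedly succeeds in one slot accumulates a high Q-value there, which is one way to read "implicitly reserving" a slot without any explicit signaling.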

Reinforcement Learning using Propagation of Goal-State-Value (목표상태 값 전파를 이용한 강화 학습)

  • Kim, Byeong-Cheon;Yun, Byeong-Ju
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.5
    • /
    • pp.1303-1311
    • /
    • 1999
  • To learn in dynamic environments, reinforcement learning algorithms such as Q-learning, TD(0)-learning, and TD(λ)-learning have been proposed. However, most of them learn very slowly because the reinforcement value is given only when the agent reaches its goal state. In this paper, we propose a reinforcement learning method that approaches the goal state quickly in maze environments. The proposed method divides learning into global learning and local learning. Global learning uses the replacing eligibility trace method to search for the goal state. Local learning propagates the goal-state value found through global learning to neighboring states and then searches for the goal state among those neighboring states. Experiments show that the proposed method finds an optimal solution faster than other reinforcement learning methods such as Q-learning, TD(0)-learning, and TD(λ)-learning.
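
As an illustration of the replacing eligibility trace method mentioned for global learning, the sketch below performs one TD(λ) state-value update in which the visited state's trace is reset to 1 rather than incremented. The state names, hyperparameters, and dictionary-based tables are assumptions; the paper's global/local split and its goal-value propagation rule are not reproduced.

```python
from typing import Dict


def td_lambda_step(values: Dict[str, float], traces: Dict[str, float],
                   state: str, reward: float, next_state: str,
                   alpha: float = 0.1, gamma: float = 0.9,
                   lam: float = 0.8) -> None:
    """One TD(lambda) update of state values using replacing traces."""
    # Temporal-difference error for the observed transition.
    delta = reward + gamma * values.get(next_state, 0.0) - values.get(state, 0.0)
    # Replacing trace: set the visited state's trace to 1 (do not accumulate).
    traces[state] = 1.0
    for s in list(traces):
        values[s] = values.get(s, 0.0) + alpha * delta * traces[s]
        traces[s] *= gamma * lam  # decay every trace after the update


# Hypothetical two-step episode ending in a rewarded goal state.
values: Dict[str, float] = {}
traces: Dict[str, float] = {}
td_lambda_step(values, traces, "s0", 0.0, "s1", alpha=0.5, gamma=1.0, lam=0.5)
td_lambda_step(values, traces, "s1", 1.0, "goal", alpha=0.5, gamma=1.0, lam=0.5)
```

After the second step, the reward at the goal also raises the value of `s0` through its decayed trace, which is how traces speed up credit assignment compared with one-step updates.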

Reinforcement Learning Approach to Agents Dynamic Positioning in Robot Soccer Simulation Games

  • Kwon, Ki-Duk;Kim, In-Cheol
    • Proceedings of the Korea Society for Simulation Conference
    • /
    • 2001.10a
    • /
    • pp.321-324
    • /
    • 2001
  • The robot soccer simulation game is a dynamic multi-agent environment. In this paper, we suggest a new reinforcement learning approach to each agent's dynamic positioning in such a dynamic environment. Reinforcement learning is a form of machine learning in which an agent learns, from indirect and delayed reward, an optimal policy for choosing sequences of actions that produce the greatest cumulative reward. Reinforcement learning therefore differs from supervised learning in that no input-output pairs are presented as training examples. Furthermore, model-free reinforcement learning algorithms such as Q-learning do not require defining or learning any model of the surrounding environment. Nevertheless, Q-learning can learn the optimal policy if the agent can visit every state-action pair infinitely often. However, the biggest problem of monolithic reinforcement learning is that its straightforward applications do not scale up to more complex environments because of the intractably large state space. To address this problem, we suggest Adaptive Mediation-based Modular Q-Learning (AMMQL) as an improvement on the existing Modular Q-Learning (MQL). While simple modular Q-learning combines the results from each learning module in a fixed way, AMMQL combines them more flexibly by assigning a different weight to each module according to its contribution to rewards. Therefore, in addition to handling the large state space effectively, AMMQL can show higher adaptability to environmental changes than pure MQL. This paper introduces the concept of AMMQL and presents details of its application to the dynamic positioning of robot soccer agents.
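
The difference between fixed and weighted combination of module outputs can be sketched as below. Each module reports Q-values over the same action set, and a mediator forms a weighted sum before picking an action. The weights are passed in directly; how AMMQL actually adapts them from reward contributions is not specified in the abstract, so no update rule is shown, and all names are hypothetical.

```python
from typing import List


def combine_modules(module_q: List[List[float]], weights: List[float]) -> List[float]:
    """Weighted sum of per-module Q-values for each action."""
    n_actions = len(module_q[0])
    return [sum(w * q[a] for w, q in zip(weights, module_q))
            for a in range(n_actions)]


def select_action(module_q: List[List[float]], weights: List[float]) -> int:
    """Pick the action with the highest combined Q-value."""
    combined = combine_modules(module_q, weights)
    return max(range(len(combined)), key=lambda a: combined[a])


# Two hypothetical modules that disagree about the best of two actions.
module_q = [[1.0, 0.0], [0.0, 2.0]]
fixed_choice = select_action(module_q, [0.5, 0.5])      # equal, fixed weights
adaptive_choice = select_action(module_q, [1.0, 0.25])  # first module trusted more
```

With equal weights the second module dominates, while the unequal weights flip the decision, which shows why per-module weighting can change behavior as the environment changes.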

Landslide susceptibility assessment using feature selection-based machine learning models

  • Liu, Lei-Lei;Yang, Can;Wang, Xiao-Mi
    • Geomechanics and Engineering
    • /
    • v.25 no.1
    • /
    • pp.1-16
    • /
    • 2021
  • Machine learning models have been widely used for landslide susceptibility assessment (LSA) in recent years. The large number of inputs, or conditioning factors, for these models, however, can reduce computational efficiency and make data collection more difficult. Feature selection is a good tool for addressing this problem: it selects the most important features among all factors to reduce the number of input variables. However, two important questions need to be answered: (1) how do feature selection methods affect the performance of machine learning models? and (2) which feature selection method is the most suitable for a given machine learning model? This paper aims to address these two questions by comparing the predictive performance of 13 feature selection-based machine learning (FS-ML) models and 5 ordinary machine learning models on LSA. First, five commonly used machine learning models (i.e., logistic regression, support vector machine, artificial neural network, Gaussian process, and random forest) and six typical feature selection methods from the literature are adopted to constitute the proposed models. Then, fifteen conditioning factors are chosen as input variables, and 1,017 landslides are used as recorded data. Next, feature selection methods are used to obtain the importance of the conditioning factors and create feature subsets, based on which 13 FS-ML models are constructed. For each machine learning model, the best-performing FS-ML model is selected according to the area under the curve value. Finally, five optimal FS-ML models are obtained and applied to the LSA of the studied area. The predictive abilities of the FS-ML models on LSA are verified and compared through the receiver operating characteristic curve and statistical indicators such as sensitivity, specificity, and accuracy. The results showed that different feature selection methods have different effects on the performance of LSA machine learning models. FS-ML models generally outperform the ordinary machine learning models. The best FS-ML model is the random forest (RF) optimized by recursive feature elimination (RFE), and RFE is an optimal method for feature selection.
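
The statistical indicators named in the abstract follow directly from a binary confusion matrix. The sketch below computes sensitivity, specificity, and accuracy from true and predicted labels, where 1 marks a landslide cell and 0 a non-landslide cell; the label encoding and the sample data are assumptions for illustration only.

```python
from typing import Dict, List


def indicators(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),   # share of landslides detected
        "specificity": tn / (tn + fp),   # share of stable cells recognized
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical labels for eight map cells.
scores = indicators([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

Reporting all three together matters because accuracy alone can look good on imbalanced data even when few landslides are detected.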

Reinforcement Learning Using State Space Compression (상태 공간 압축을 이용한 강화학습)

  • Kim, Byeong-Cheon;Yun, Byeong-Ju
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.3
    • /
    • pp.633-640
    • /
    • 1999
  • Reinforcement learning learns through trial-and-error interaction with a dynamic environment. In such environments, reinforcement learning methods such as Q-learning and TD (Temporal Difference) learning therefore learn faster than conventional stochastic learning methods. However, because many of the proposed reinforcement learning algorithms receive the reinforcement value only when the learning agent has reached its goal state, most of them converge to the optimal solution too slowly. In this paper, we present the COMREL (COMpressed REinforcement Learning) algorithm for quickly finding the shortest path in a maze environment. COMREL selects the candidate states that can guide the shortest path in a compressed maze environment and learns only those candidate states to find the shortest path. Comparing COMREL with the existing Q-learning and Prioritized Sweeping algorithms, we found that the learning time was shortened considerably.

Investigations on data-driven stochastic optimal control and approximate-inference-based reinforcement learning methods (데이터 기반 확률론적 최적제어와 근사적 추론 기반 강화 학습 방법론에 관한 고찰)

  • Park, Jooyoung;Ji, Seunghyun;Sung, Keehoon;Heo, Seongman;Park, Kyungwook
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.25 no.4
    • /
    • pp.319-326
    • /
    • 2015
  • Recently, in the fields of stochastic optimal control (SOC) and reinforcement learning (RL), a great deal of research effort has gone into the problem of finding data-based sub-optimal control policies. The conventional theory for finding optimal controllers via value-function-based dynamic programming was established for solving stochastic optimal control problems and rests on a solid theoretical background. However, such methods can be applied successfully only to extremely simple cases. Hence, the modern data-based approach, which tries to find sub-optimal solutions using relevant data such as state-transition and reward signals instead of rigorous mathematical analysis, is particularly attractive for practical applications. In this paper, we consider a couple of methods that combine modern SOC strategies and approximate inference with machine-learning-based data treatment methods. We also apply the resulting methods to a variety of application domains, including financial engineering, and observe their performance.

A Genetic Algorithm Based Learning Path Optimization for Music Education (유전 알고리즘 기반의 음악 교육 학습 경로 최적화)

  • Jung, Woosung
    • Journal of the Korea Convergence Society
    • /
    • v.10 no.2
    • /
    • pp.13-20
    • /
    • 2019
  • For customized education, it is essential to find a suitable learning path for the learner. A genetic algorithm makes it possible to find optimal solutions within a practical time when they are difficult to obtain with deterministic approaches because of the problem's very large search space. In this research, a genetic algorithm was used to optimize the learning paths for learning 200 chords in 27 music sheets, maximizing the learning effect by balancing and minimizing the learner's burden and the learning size at each step of the path. Although the number of possible learning-path permutations for 27 learning contents is more than $10^{28}$, the optimal solution could be obtained within 20 minutes on average by the tool implemented in this research. Experimental results showed that a genetic algorithm can be used effectively to design complex learning paths for customized education with various purposes. The proposed method is expected to be applicable to other educational domains as well.
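
The search-space figure follows from 27! being roughly 1.09 × 10^28. The sketch below checks that number and shows a generic permutation-based genetic algorithm using order crossover, swap mutation, and elitism. The operators, the parameters, and the `toy_cost` function are assumptions chosen for illustration; the paper's actual encoding and fitness (learner's burden and learning size per step) are not given in the abstract.

```python
import math
import random
from typing import Callable, List

SEARCH_SPACE = math.factorial(27)  # number of orderings of 27 learning contents


def order_crossover(p1: List[int], p2: List[int], rng: random.Random) -> List[int]:
    """Copy a slice from p1, then fill the remaining positions in p2's order."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    middle = p1[i:j + 1]
    kept = set(middle)
    rest = [g for g in p2 if g not in kept]
    return rest[:i] + middle + rest[i:]


def swap_mutation(perm: List[int], rng: random.Random, rate: float) -> List[int]:
    """With probability `rate`, swap two positions of a copy of the path."""
    perm = perm[:]
    if rng.random() < rate:
        a, b = rng.sample(range(len(perm)), 2)
        perm[a], perm[b] = perm[b], perm[a]
    return perm


def evolve(n_items: int, cost: Callable[[List[int]], float], pop_size: int = 30,
           generations: int = 50, mutation_rate: float = 0.2,
           seed: int = 0) -> List[int]:
    """Return the lowest-cost ordering found; the better half survives each round."""
    rng = random.Random(seed)
    population = [rng.sample(range(n_items), n_items) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=cost)
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = order_crossover(p1, p2, rng)
            children.append(swap_mutation(child, rng, mutation_rate))
        population = elite + children
    return min(population, key=cost)


def toy_cost(path: List[int]) -> float:
    """Hypothetical burden: total jump in item index between consecutive steps."""
    return float(sum(abs(b - a) for a, b in zip(path, path[1:])))


best_path = evolve(8, toy_cost)
```

Order crossover is used here because it always yields a valid permutation, so every candidate remains a complete learning path with no repeated or missing content.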