• Title/Summary/Keyword: a Markov reward model


MDP Modeling for the Prediction of Agent Movement in Limited Space (폐쇄공간에서의 에이전트 행동 예측을 위한 MDP 모델)

  • Jin, Hyowon; Kim, Suhwan; Jung, Chijung; Lee, Moongul
    • Journal of the Korean Operations Research and Management Science Society / v.40 no.3 / pp.63-72 / 2015
  • This paper addresses the problem of predicting the movement of an agent in an enclosed space using an MDP (Markov Decision Process). Recent research on optimal path finding is largely confined to deriving the shortest path with deterministic algorithms such as $A^*$ or Dijkstra. This study instead focuses on predicting, with a stochastic method, the path the agent chooses to escape the limited space as time passes. An MDP reward structure built from GIS (Geographic Information System) data makes the model feasible. The model was shown to have high predictability when applied to the route of a previously infiltrated armed guerrilla.
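To make the idea concrete, here is a minimal sketch (not the authors' code) of an MDP over a small grid whose per-cell rewards stand in for GIS-derived terrain costs; value iteration then yields the value surface from which the agent's likely escape movement toward an exit can be predicted. The grid size, reward values, slip probability, and exit location are all illustrative assumptions.

```python
# Minimal grid-MDP sketch: terrain-like rewards + value iteration.
# All numbers below (grid size, rewards, slip probability, exit cell)
# are assumptions for illustration, not values from the paper.
import numpy as np

ROWS, COLS = 5, 5
GAMMA = 0.95          # discount factor
SLIP = 0.1            # probability the intended move slips to another direction
GOAL = (0, 4)         # assumed exit cell of the enclosed space (absorbing)

# Per-cell reward: small step cost, larger penalty for an assumed rough-terrain
# band, and a large bonus for reaching the exit (a stand-in for a GIS-derived
# reward structure).
reward = np.full((ROWS, COLS), -1.0)
reward[2, 1:4] = -5.0
reward[GOAL] = 100.0

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

def step(state, action):
    """Deterministic move, clipped to the grid boundary."""
    r = min(max(state[0] + action[0], 0), ROWS - 1)
    c = min(max(state[1] + action[1], 0), COLS - 1)
    return (r, c)

def value_iteration(tol=1e-6):
    V = np.zeros((ROWS, COLS))
    while True:
        V_new = np.zeros_like(V)
        for r in range(ROWS):
            for c in range(COLS):
                if (r, c) == GOAL:
                    continue          # absorbing exit state keeps value 0
                q_values = []
                for a in ACTIONS:
                    # With prob. 1-SLIP the intended move succeeds; otherwise
                    # the agent slips uniformly to one of the other moves.
                    q = 0.0
                    for b in ACTIONS:
                        p = 1 - SLIP if b == a else SLIP / (len(ACTIONS) - 1)
                        nr, nc = step((r, c), b)
                        q += p * (reward[nr, nc] + GAMMA * V[nr, nc])
                    q_values.append(q)
                V_new[r, c] = max(q_values)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

V = value_iteration()
print(np.round(V, 1))   # the greedy policy over this surface predicts the escape path
```

The slip probability is what distinguishes this stochastic formulation from a deterministic shortest-path search: the predicted route trades distance against the risk of being pushed into high-cost cells.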

MDP(Markov Decision Process) Model for Prediction of Survivor Behavior based on Topographic Information (지형정보 기반 조난자 행동예측을 위한 마코프 의사결정과정 모형)

  • Jinho Son; Suhwan Kim
    • Journal of Intelligence and Information Systems / v.29 no.2 / pp.101-114 / 2023
  • In wartime, aircraft carrying out deep-strike missions against the enemy are exposed to the risk of being shot down. As a key combat force in modern warfare, the military flight personnel who operate high-tech weapon systems require a great deal of time, effort, and national budget to train. This study therefore addresses the path problem of predicting the route of an emergency escape from enemy territory to a target recovery point while avoiding obstacles, thereby increasing the possibility of safely recovering downed military flight personnel. Previous studies have treated this as a network-based problem, transforming it into a TSP or VRP or applying the Dijkstra algorithm, and solving it with optimization techniques. However, a network formulation makes it difficult to reflect the dynamic factors and uncertainties of the battlefield environment that military flight personnel in distress will face, so an MDP, which is well suited to modeling dynamic environments, was applied instead. In addition, GIS was used to obtain topographic information, and in designing the MDP reward structure the topography was reflected in greater detail so that the model is more realistic than in previous studies. A value iteration algorithm and a deterministic method were used to derive a path that allows the personnel in distress to move the shortest distance while making the most of topographic advantages. Actual topographic information and the obstacles the personnel may encounter during evasion and escape were added to increase the realism of the model, making it possible to predict the route along which the personnel would evade and escape in an actual situation. The model presented in this study can be applied to various operational situations by redesigning the reward structure. In real situations, it enables decision support based on scientific techniques that reflect the many factors involved in predicting the escape route of military flight personnel in distress and in conducting combat search and rescue operations.
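As a companion sketch (again not the paper's model), the fragment below shows how a predicted evasion route could be read off a value surface by greedy rollout from the ejection point toward the recovery point, with impassable terrain marked as obstacle cells. The grid dimensions, value numbers, and obstacle placement are assumptions for illustration, not GIS data from the paper.

```python
# Greedy route extraction from a value surface.
# The value array V below stands in for the output of value iteration on a
# reward structure that penalises exposed terrain; -inf marks cells the
# survivor cannot enter. All numbers are illustrative assumptions.
import numpy as np

ROWS, COLS = 4, 6
START, RECOVERY = (3, 0), (0, 5)   # assumed ejection point and recovery point

V = np.array([
    [ 2.0,     4.0,     7.0, 11.0, 16.0, 22.0],
    [ 1.0,     3.0,     6.0,  9.0, 13.0, 18.0],
    [ 0.5, -np.inf, -np.inf,  7.0, 10.0, 14.0],
    [ 0.0,     1.0,     3.0,  5.0,  8.0, 11.0],
])

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def predicted_route(start, goal, max_steps=50):
    """Follow the highest-valued reachable neighbour until the goal is hit."""
    path, state = [start], start
    for _ in range(max_steps):
        if state == goal:
            break
        candidates = []
        for dr, dc in MOVES:
            r, c = state[0] + dr, state[1] + dc
            if 0 <= r < ROWS and 0 <= c < COLS and np.isfinite(V[r, c]):
                candidates.append(((r, c), V[r, c]))
        state = max(candidates, key=lambda x: x[1])[0]
        path.append(state)
    return path

print(predicted_route(START, RECOVERY))
# e.g. [(3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (2, 5), (1, 5), (0, 5)]
```

Redesigning the reward structure (and hence the value surface) is what lets the same extraction step serve different operational scenarios, as the abstract notes.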