• Title/Summary/Keyword: Markov decision problem

Fault-Tolerant Tasking and Guidance of an Airborne Location Sensor Network

  • Wu, N.Eva;Guo, Yan;Huang, Kun;Ruschmann, Matthew C.;Fowler, Mark L.
    • International Journal of Control, Automation, and Systems
    • /
    • v.6 no.3
    • /
    • pp.351-363
    • /
    • 2008
  • This paper is concerned with the tasking and guidance of networked airborne sensors to achieve fault-tolerant sensing. The sensors are coordinated to locate hostile transmitters by intercepting and processing their signals. Faults occur when some sensor-carrying vehicles engaged in target-location missions are lost; such faults effectively change the network architecture and therefore degrade network performance. The first objective of the paper is to optimally allocate a finite number of sensors to targets so as to maximize network life and availability. To that end, allocation policies are obtained by solving the relevant Markov decision problems (an illustrative sketch of this type of formulation follows below). The sensors allocated to a target must continue to adjust their trajectories until the estimate of the target location reaches a prescribed accuracy. The second objective is to establish a criterion for vehicle guidance under which fault-tolerant sensing is achieved by incorporating knowledge of the vehicle-loss probability and by allowing network reconfiguration in the event of vehicle loss. Superior sensing performance in terms of location accuracy is demonstrated under the established criterion.
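
The abstract does not spell out the authors' exact MDP, so the following is a minimal, hypothetical Python sketch of a finite-horizon sensor-allocation problem of the kind described, solved by backward induction. The fleet size, per-stage loss probability, horizon, reward shape, and state/action encoding are all assumptions made for illustration, not values from the paper.

    # Hypothetical sketch: allocating sensors to a target as a finite-horizon MDP.
    # State: number of surviving sensor vehicles; action: sensors assigned this stage.
    from math import comb

    N_SENSORS = 5     # fleet size (assumed)
    P_LOSS = 0.1      # probability an assigned vehicle is lost in a stage (assumed)
    HORIZON = 4       # number of allocation stages (assumed)

    def transitions(s, a):
        """Yield (next_state, prob): each assigned vehicle is lost w.p. P_LOSS."""
        for k in range(a + 1):  # k of the a assigned vehicles are lost
            yield s - k, comb(a, k) * P_LOSS**k * (1 - P_LOSS)**(a - k)

    def reward(a):
        """More sensors sharpen the location estimate, with diminishing returns."""
        return 1.0 - 0.5**a

    # Backward induction: V[t][s] = best expected reward-to-go from stage t.
    V = [{s: 0.0 for s in range(N_SENSORS + 1)} for _ in range(HORIZON + 1)]
    policy = [{} for _ in range(HORIZON)]
    for t in reversed(range(HORIZON)):
        for s in range(N_SENSORS + 1):
            qs = {a: reward(a) + sum(p * V[t + 1][s2] for s2, p in transitions(s, a))
                  for a in range(s + 1)}        # cannot assign more than survive
            policy[t][s] = max(qs, key=qs.get)
            V[t][s] = qs[policy[t][s]]

    print(policy[0])  # stage-0 allocation for each possible fleet size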

Applying the Bi-level HMM for Robust Voice-activity Detection

  • Hwang, Yongwon;Jeong, Mun-Ho;Oh, Sang-Rok;Kim, Il-Hwan
    • Journal of Electrical Engineering and Technology
    • /
    • v.12 no.1
    • /
    • pp.373-377
    • /
    • 2017
  • This paper presents a voice-activity detection (VAD) method for sound sequences with various SNRs. For real-time VAD applications, it is inadequate to rely on post-processing to remove burst clippings from the VAD output decision. To tackle this problem, we built on the bi-level hidden Markov model, in which an additional state layer is inserted into a typical hidden Markov model (HMM), and formulated a robust VAD method that requires no additional post-processing. In the method, a forward-inference-ratio test was devised to detect the speech endpoints (a simplified sketch of such a ratio test follows below), and Mel-frequency cepstral coefficients (MFCC) were used as the features. Our experimental results show that, across different SNRs, the proposed approach outperforms conventional methods.
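
The bi-level HMM itself is beyond the abstract's detail; below is a minimal sketch of the general idea behind a forward-likelihood ratio test: score a frame sequence under two ordinary discrete HMMs (speech vs. non-speech) and compare forward log-likelihoods. The two-state models, the quantized features, and all probabilities are invented; the paper's method uses MFCC features and an extra state layer instead.

    import numpy as np

    def forward_loglik(obs, pi, A, B):
        """Log-likelihood of a discrete observation sequence under an HMM
        (pi: initial probs, A: transitions, B: emissions), with scaling."""
        alpha = pi * B[:, obs[0]]
        loglik = np.log(alpha.sum())
        alpha = alpha / alpha.sum()
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]
            s = alpha.sum()                 # rescale to avoid underflow
            loglik += np.log(s)
            alpha = alpha / s
        return loglik

    # Toy 2-state models over 3 quantized feature levels (all numbers invented).
    pi = np.array([0.5, 0.5])
    A_speech = np.array([[0.9, 0.1], [0.2, 0.8]])
    B_speech = np.array([[0.1, 0.3, 0.6], [0.2, 0.5, 0.3]])
    A_noise  = np.array([[0.95, 0.05], [0.4, 0.6]])
    B_noise  = np.array([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]])

    frames = [2, 2, 1, 2, 2, 0, 0]          # a short quantized feature sequence
    ratio = forward_loglik(frames, pi, A_speech, B_speech) \
          - forward_loglik(frames, pi, A_noise, B_noise)
    print("speech" if ratio > 0.0 else "non-speech", round(ratio, 2))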

Sparse Data Cleaning using Multiple Imputations

  • Jun, Sung-Hae;Lee, Seung-Joo;Oh, Kyung-Whan
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.4 no.1
    • /
    • pp.119-124
    • /
    • 2004
  • Real data such as web log files tend to be incomplete, yet we must extract useful knowledge from them for optimal decision making. Web log data contain much useful information, such as hyperlink structure and the usage patterns of connected users. However, web data are often too large for effective knowledge discovery and, to make matters worse, they are very sparse. We overcome this sparseness problem by using the Markov chain Monte Carlo (MCMC) method for multiple imputation (a simplified sketch follows below). This missing-value imputation turns sparse web data into complete data, so our study may be a useful tool for discovering knowledge from sparse data sets. The sparser the data, the better MCMC imputation performs. We verified our work by experiments using data from the UCI machine learning repository.
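
The abstract does not name a specific sampler, so the sketch below illustrates multiple imputation in the same spirit using scikit-learn's IterativeImputer with posterior sampling, a chained-equations approximation rather than necessarily the authors' MCMC procedure. The synthetic data, the 30% missingness rate, and the number of imputations m are assumptions.

    # Hedged sketch: multiple imputation with posterior sampling (MCMC spirit).
    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer
    from sklearn.linear_model import BayesianRidge

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    X[:, 4] = X[:, :4].sum(axis=1) + rng.normal(scale=0.1, size=200)
    mask = rng.random(X.shape) < 0.3        # 30% of entries missing (assumed)
    X_sparse = np.where(mask, np.nan, X)

    # Draw m imputations; each run samples imputed values from a conditional
    # posterior, so spread across the m completed datasets reflects uncertainty.
    m = 5
    completed = [
        IterativeImputer(estimator=BayesianRidge(), sample_posterior=True,
                         random_state=i).fit_transform(X_sparse)
        for i in range(m)
    ]
    pooled = np.mean(completed, axis=0)     # pool, e.g., by averaging estimates
    print("mean spread of imputed values:",
          np.std([c[mask] for c in completed], axis=0).mean().round(3))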

A Study on the Criteria to Decide the Number of Aircrafts Considering Operational Characteristics (항공기 운용 특성을 고려한 적정 운용 대수 산정 기준 연구)

  • Son, Young-Su;Kim, Seong-Woo;Yoon, Bong-Kyoo
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.17 no.1
    • /
    • pp.41-49
    • /
    • 2014
  • In this paper, we consider a method to assess the required number of aircraft, a strategic variable in national security. This problem becomes all the more important in light of the ROKAF F-X and KF-X projects. Traditionally, the ATO (Air Tasking Order) and fighting-power indices have been used to evaluate the number of aircraft the ROKAF requires; however, those methods consider only the static aspect of the aircraft requirement. This paper presents a model that accommodates the dynamic nature of the aircraft requirement using an absorbing Markov chain (illustrated in the sketch below). In conclusion, we suggest a dynamic model for evaluating the number of aircraft required, with key decision variables such as the destruction rate, failure rate, and repair rate.
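
As a hedged illustration of the absorbing-chain machinery the paper uses, the sketch below computes the fundamental matrix of a toy chain with states {operational, under repair, destroyed}. The states and all transition rates are invented; the paper's actual chain and decision variables may differ.

    import numpy as np

    # Canonical form P = [[Q, R], [0, I]]: transient states 0 = operational,
    # 1 = under repair; one absorbing state, destroyed/written off.
    Q = np.array([[0.90, 0.07],    # operational -> {operational, repair}
                  [0.60, 0.38]])   # repair -> {operational, repair}
    R = np.array([[0.03],          # operational -> destroyed (attrition)
                  [0.02]])         # repair -> written off

    N = np.linalg.inv(np.eye(2) - Q)   # fundamental matrix: expected visits
    t = N @ np.ones(2)                 # expected lifetime (periods) by start state
    availability = N[0, 0] / t[0]      # fraction of lifetime spent operational
    print(f"expected lifetime: {t[0]:.1f} periods, availability: {availability:.2f}")
    # If a mission needs k aircraft continuously available, k / availability
    # gives a rough dynamic estimate of how many to operate.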

An Inspection-Maintenance Policy for a System with Various Types of Maintenance (다수의 보수형태를 갖는 시스템에서의 검사.보수정책)

  • 이창훈;홍성희
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.6 no.2
    • /
    • pp.7-11
    • /
    • 1981
An inspection-maintenance policy is investigated for a system having various states. A policy is characterized by the type of maintenance and the next inspection time, with maintenance actions classified into types according to the depth of maintenance. The policy evaluation criterion is the expected cost accumulated up to the failure of the system. The problem is formulated as a Markov decision process, and an optimal policy is found using a policy improvement procedure (a toy version of which is sketched below). A numerical example illustrates the policy for a system having five states.
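
The paper optimizes expected cost up to failure; the toy below swaps in a discounted-cost criterion so that policy evaluation is a plain linear solve, and invents the five-state transition and cost data. It is a sketch of policy iteration in general, not the authors' exact procedure.

    import numpy as np

    n_states, actions = 5, [0, 1, 2]   # 0: do nothing, 1: minor repair, 2: overhaul
    gamma = 0.95                       # discount factor (assumed)
    rng = np.random.default_rng(1)

    # P[a][s] is a distribution over next states; C[s, a] is the one-step cost.
    P = {a: rng.dirichlet(np.ones(n_states), size=n_states) for a in actions}
    C = rng.uniform(1.0, 10.0, size=(n_states, len(actions)))

    policy = np.zeros(n_states, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = c_pi exactly.
        P_pi = np.array([P[policy[s]][s] for s in range(n_states)])
        c_pi = C[np.arange(n_states), policy]
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, c_pi)
        # Policy improvement: greedy (cost-minimizing) one-step lookahead.
        Qsa = np.stack([C[:, a] + gamma * P[a] @ v for a in actions], axis=1)
        new_policy = Qsa.argmin(axis=1)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy

    print("optimal maintenance action per state:", policy)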

Learning Multi-Character Competition in Markov Games (마르코프 게임 학습에 기초한 다수 캐릭터의 경쟁적 상호작용 애니메이션 합성)

  • Lee, Kang-Hoon
    • Journal of the Korea Computer Graphics Society
    • /
    • v.15 no.2
    • /
    • pp.9-17
    • /
    • 2009
  • Animating multiple characters so that they compete with each other is an important problem in computer games and animation films. However, it remains difficult to simulate strategic competition among characters because of the inherently complex decision process needed to cope with the often unpredictable behavior of opponents. We apply a reinforcement learning method for Markov games to action models built from captured motion data. This enables two characters to perform globally optimal counter-strategies with respect to each other (the core matrix-game computation is sketched below). We also extend this method to simulate competition between two teams, each of which can consist of an arbitrary number of characters. We demonstrate the usefulness of our approach through various competitive scenarios, including playing tag, keeping distance, and shooting.
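
A Markov-game learner in the minimax-Q style must solve a zero-sum matrix game at each state to obtain a mixed counter-strategy. The sketch below performs that single core step by linear programming with SciPy; the payoff matrix is invented, and the paper's full algorithm involves much more than this step.

    import numpy as np
    from scipy.optimize import linprog

    def solve_matrix_game(A):
        """Max-min mixed strategy for the row player of zero-sum payoff matrix A."""
        m, n = A.shape
        shift = A.min()
        B = A - shift + 1.0                       # make payoffs strictly positive
        # minimize sum(x) s.t. B^T x >= 1, x >= 0; game value of B = 1/sum(x).
        res = linprog(c=np.ones(m), A_ub=-B.T, b_ub=-np.ones(n),
                      bounds=[(0, None)] * m, method="highs")
        x = res.x
        value_B = 1.0 / x.sum()
        return x * value_B, value_B + shift - 1.0  # strategy, value of A

    # Row player = chaser, column player = evader (toy payoff: chance of a tag).
    A = np.array([[0.8, 0.2, 0.4],
                  [0.3, 0.7, 0.5],
                  [0.5, 0.4, 0.6]])
    strategy, value = solve_matrix_game(A)
    print("chaser's mixed strategy:", strategy.round(3), "value:", round(value, 3))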

Hierarchical Power Management Architecture and Optimal Local Control Policy for Energy Efficient Networks

  • Wei, Yifei;Wang, Xiaojun;Fialho, Leonardo;Bruschi, Roberto;Ormond, Olga;Collier, Martin
    • Journal of Communications and Networks
    • /
    • v.18 no.4
    • /
    • pp.540-550
    • /
    • 2016
  • Since energy efficiency has become a significant concern for network infrastructure, next-generation network devices are expected to have embedded advanced power management capabilities. However, how to effectively exploit these green capabilities is still a big challenge, especially given the high heterogeneity of devices and their internal architectures. In this paper, we introduce a hierarchical power management architecture (HPMA) which represents, as entities, the physical components of a device whose power can be monitored and controlled at various levels. We use the energy-aware state (EAS) as the power-management setting mode of each device entity. The power policy controller is capable of discovering how many EASes of an entity are manageable inside a device and of setting a certain EAS configuration for the entity. We propose an optimal local control policy which aims to minimize router power consumption while meeting performance constraints. A first-order Markov chain is used to model the statistical features of the network traffic load. The dynamic EAS configuration problem is formulated as a Markov decision process and solved using a dynamic programming algorithm (a toy version is sketched below). In addition, we demonstrate a reference implementation of the HPMA and EAS concepts in a NetFPGA frequency-scaled router which can toggle among five operating-frequency options and/or turn off unused Ethernet ports.
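
As a hedged sketch of the paper's formulation, the code below runs value iteration over a first-order Markov traffic model, choosing an EAS per decision epoch to trade power against a performance penalty. Power figures, the traffic chain, and the penalty are invented; in a fuller model with switching costs, the state would also include the current EAS.

    import numpy as np

    traffic_P = np.array([[0.7, 0.3, 0.0],     # low -> {low, med, high} (assumed)
                          [0.2, 0.6, 0.2],
                          [0.0, 0.4, 0.6]])
    power = np.array([2.0, 5.0, 9.0])          # watts for EAS 0..2 (assumed)
    capacity = np.array([0, 1, 2])             # highest traffic level an EAS serves
    PENALTY, gamma = 50.0, 0.9                 # performance-violation cost (assumed)
    n_traffic, n_eas = 3, 3

    # Traffic is exogenous, so the EAS choice does not affect the transition;
    # the DP structure is the same as in the general controlled case.
    V = np.zeros(n_traffic)
    for _ in range(500):
        Q = np.empty((n_traffic, n_eas))
        for t in range(n_traffic):
            for a in range(n_eas):
                cost = power[a] + (PENALTY if capacity[a] < t else 0.0)
                Q[t, a] = cost + gamma * traffic_P[t] @ V
        V = Q.min(axis=1)

    print("EAS to select per traffic level:", Q.argmin(axis=1))  # expect [0 1 2]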

Efficient Approximation of State Space for Reinforcement Learning Using Complex Network Models (복잡계망 모델을 사용한 강화 학습 상태 공간의 효율적인 근사)

  • Yi, Seung-Joon;Eom, Jae-Hong;Zhang, Byoung-Tak
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.6
    • /
    • pp.479-490
    • /
    • 2009
  • A number of temporal abstraction approaches have been suggested so far to handle the high computational complexity of Markov decision problems (MDPs). Although the structure of a temporal abstraction can significantly affect the efficiency of solving the MDP, to our knowledge none of the current temporal abstraction approaches explicitly considers the relationship between topology and efficiency. In this paper, we first show that a topological measurement from the complex-network literature, the mean geodesic distance, reflects the efficiency of solving an MDP (see the sketch below). Based on this, we develop an incremental method that systematically builds temporal abstractions using a network model guaranteeing a small mean geodesic distance. We test our algorithm in a realistic 3D game environment, and experimental results show that our model exhibits subpolynomial growth of the mean geodesic distance with problem size, which enables efficient solving of the resulting MDP.
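
One way to see the paper's key quantity in action is to compute the mean geodesic distance of two toy state-transition topologies with NetworkX. The ring lattice and small-world graphs below are stand-ins invented for illustration, not the paper's game environment.

    import networkx as nx

    ring = nx.watts_strogatz_graph(n=200, k=4, p=0.0, seed=0)   # no shortcuts
    small_world = nx.connected_watts_strogatz_graph(n=200, k=4, p=0.1, seed=0)

    for name, G in [("ring lattice", ring), ("small-world", small_world)]:
        print(name, "mean geodesic distance:",
              round(nx.average_shortest_path_length(G), 2))
    # A few long-range options in a temporal abstraction play the same role as
    # the rewired edges: they sharply shrink the mean geodesic distance.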

Reinforcement Learning-based Dynamic Weapon Assignment to Multi-Caliber Long-Range Artillery Attacks (다종 장사정포 공격에 대한 강화학습 기반의 동적 무기할당)

  • Hyeonho Kim;Jung Hun Kim;Joohoe Kong;Ji Hoon Kyung
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.45 no.4
    • /
    • pp.42-52
    • /
    • 2022
  • North Korea continues to upgrade and display its long-range rocket launchers to emphasize its military strength. Recently, the Republic of Korea kicked off the development of an anti-artillery interception system, similar to Israel's "Iron Dome", designed to protect against North Korea's arsenal of long-range rockets. The system cannot work smoothly without a function that assigns interceptors to incoming artillery rockets of various calibers. We view this assignment task as a dynamic weapon-target assignment (DWTA) problem. DWTA is a multistage decision process in which a decision at one stage affects the decision processes and their results at subsequent stages. We represent the DWTA problem as a Markov decision process (MDP). The distance from Seoul to North Korea's multiple rocket launchers positioned near the border limits the processing time of the model solver to only a few seconds, within which it is impossible to compute the exact optimal solution owing to the curse of dimensionality inherent in the MDP model of a practical DWTA problem. We apply two reinforcement learning-based algorithms to obtain an approximate solution of the MDP model within the time limit (a toy variant is sketched below). To check the quality of the approximate solution, we adopt the Shoot-Shoot-Look (SSL) policy as a baseline. Simulation results showed that both algorithms provide better solutions than the baseline strategy.
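
The abstract does not name the paper's two RL algorithms, so the sketch below uses plain tabular Q-learning on a drastically simplified DWTA in which salvos are fired only at the lead threat. The kill probability, leaker penalty, and problem sizes are all invented for illustration.

    import random

    P_KILL, N_THREATS, N_INTERCEPTORS = 0.7, 3, 6
    ALPHA, GAMMA, EPS, EPISODES = 0.1, 0.95, 0.1, 20000
    Q = {}                                   # tabular action values

    def q(s, a):
        return Q.get((s, a), 0.0)

    def step(state, a):
        """Fire a salvo of size a at the lead threat; it is destroyed or leaks."""
        threats, stock = state
        hit = any(random.random() < P_KILL for _ in range(a))
        reward = 1.0 if hit else -5.0        # heavy penalty for a leaker
        return (threats - 1, stock - a), reward

    for _ in range(EPISODES):
        state = (N_THREATS, N_INTERCEPTORS)
        while state[0] > 0:                  # threats remain
            acts = list(range(min(state[1], 2) + 1))   # salvo size 0..2
            a = (random.choice(acts) if random.random() < EPS
                 else max(acts, key=lambda x: q(state, x)))
            nxt, r = step(state, a)
            nxt_acts = range(min(nxt[1], 2) + 1) if nxt[0] > 0 else [0]
            target = r + GAMMA * max(q(nxt, x) for x in nxt_acts)
            Q[(state, a)] = q(state, a) + ALPHA * (target - q(state, a))
            state = nxt

    start = (N_THREATS, N_INTERCEPTORS)
    print("learned salvo size in the initial state:",
          max(range(3), key=lambda x: q(start, x)))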

Approximate Dynamic Programming Based Interceptor Fire Control and Effectiveness Analysis for M-To-M Engagement (근사적 동적계획을 활용한 요격통제 및 동시교전 효과분석)

  • Lee, Changseok;Kim, Ju-Hyun;Choi, Bong Wan;Kim, Kyeongtaek
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.50 no.4
    • /
    • pp.287-295
    • /
    • 2022
  • As the low-altitude long-range artillery threat has grown, the development of an anti-artillery interception system to protect assets against such attacks is set to begin. We view the defense against long-range artillery attacks as a typical dynamic weapon-target assignment (DWTA) problem. DWTA is a sequential decision process in which decision making under uncertain future attacks affects the subsequent decision processes and their results; these are the typical characteristics of a Markov decision process (MDP). We formulate the problem as an MDP to examine the assignment policy for the defender. The proximity of the South Korean capital to the North Korean border limits the computation time for a solution to a few seconds, within which it is impossible to compute the exact optimal solution. We apply an approximate dynamic programming (ADP) approach to check whether it can solve the MDP model within the processing time limit (a minimal ADP sketch follows below). We employ the Shoot-Shoot-Look (SSL) policy as a baseline strategy and compare it with the ADP approach over three scenarios. Simulation results show that the ADP approach provides better solutions than the baseline strategy.
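
The paper's specific ADP design is not given in the abstract; the following is a minimal sketch of one common ADP pattern, a linear value-function approximation updated from simulated transitions with one-step lookahead. The features, kill probability, rewards, and sizes are invented, not the paper's model.

    import numpy as np

    rng = np.random.default_rng(0)
    P_KILL, GAMMA, ALPHA = 0.75, 0.95, 0.01
    theta = np.zeros(2)          # linear weights over features (threats, stock)

    def features(threats, stock):
        return np.array([threats, stock], dtype=float)

    def future(threats, stock):
        """Approximate value-to-go; zero once every threat has been handled."""
        return 0.0 if threats == 0 else theta @ features(threats, stock)

    for _ in range(5000):
        threats, stock = 4, 8
        while threats > 0:
            # One-step lookahead over salvo sizes using the current approximation.
            best_a, best_q = 0, -np.inf
            for a in range(min(stock, 2) + 1):
                p_hit = 1.0 - (1.0 - P_KILL) ** a
                q_est = (p_hit * 1.0 + (1.0 - p_hit) * -5.0
                         + GAMMA * future(threats - 1, stock - a))
                if q_est > best_q:
                    best_a, best_q = a, q_est
            # Simulate one engagement and nudge theta toward the sampled target.
            hit = rng.random() < 1.0 - (1.0 - P_KILL) ** best_a
            r = 1.0 if hit else -5.0
            td_error = (r + GAMMA * future(threats - 1, stock - best_a)
                        - theta @ features(threats, stock))
            theta += ALPHA * td_error * features(threats, stock)
            threats, stock = threats - 1, stock - best_a

    print("learned value weights (per threat, per interceptor):", theta.round(3))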