• Title/Summary/Keyword: markov decision problem

Optimal maintenance procedure for multi-state deteriorated system with incomplete monitoring

  • Jin, L.;Suzuki, K.
    • International Journal of Reliability and Applications / v.11 no.2 / pp.69-87 / 2010
  • The optimal replacement problem was investigated for a multi-state deteriorated system for which the true internal state cannot be observed directly except when the system breaks down completely. The internal state was assumed to be monitored incompletely by a monitor that gives information related to the true state of the system. The problem was formulated as a partially observable Markov decision process. The optimal procedure was found to be a monotone procedure with respect to stochastic increasing ordering of the state probability vectors under some assumptions. Limiting the optimal procedure to a monotone procedure would greatly reduce the tremendous amount of calculation time required to find the optimal procedure.
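
The state probability vector referred to in this abstract is usually maintained with a Bayesian belief update. The following is a minimal sketch of that update, not the authors' procedure: the three-state system, the matrices `TRANSITION` and `OBSERVATION`, and the function name `belief_update` are all hypothetical, and the sketch ignores the paper's assumption that complete breakdown is observed directly.

```python
def belief_update(belief, transition, observation, obs):
    """One Bayes-filter step: predict with the transition matrix, then
    correct with the likelihood of the monitor reading `obs`."""
    n = len(belief)
    # Prediction: probability of each state after one deterioration step.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Correction: weight each state by the probability of observing `obs` there.
    unnormalized = [predicted[j] * observation[j][obs] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return [u / total for u in unnormalized]


# Hypothetical 3-state system: 0 = good, 1 = degraded, 2 = failed.
TRANSITION = [
    [0.8, 0.15, 0.05],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
]
# Hypothetical monitor with two readings: 0 = "looks fine", 1 = "warning".
OBSERVATION = [
    [0.9, 0.1],
    [0.4, 0.6],
    [0.1, 0.9],
]

# Start from a system known to be good, then receive a "warning" reading.
belief = belief_update([1.0, 0.0, 0.0], TRANSITION, OBSERVATION, obs=1)
```

A monotone procedure in the sense of the abstract would then compare such belief vectors under a stochastic ordering and replace the system once the belief is "bad enough".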

Bayesian Model for Cost Estimation of Construction Projects

  • Kim, Sang-Yon
    • Journal of the Korea Institute of Building Construction / v.11 no.1 / pp.91-99 / 2011
  • A Bayesian network is a form of probabilistic graphical model. It incorporates human reasoning to deal with sparse data and to determine the probabilities of uncertain cases. In this research, a Bayesian network is adopted to model construction project cost. General information, time, cost, and material, the four main factors that dominate the characteristics of construction costs, are incorporated into the model. The research presents and verifies a model that illustrates the functionality and application of a decision support system for predicting costs. The Markov chain Monte Carlo (MCMC) method is applied to estimate parameter distributions. Furthermore, it is shown that not all the parameters are normally distributed. In addition, cost estimation based on the Gibbs output is performed. The model can enhance the decision-making process.

A Joint Allocation Algorithm of Computing and Communication Resources Based on Reinforcement Learning in MEC System

  • Liu, Qinghua;Li, Qingping
    • Journal of Information Processing Systems / v.17 no.4 / pp.721-736 / 2021
  • For mobile edge computing (MEC) systems supporting dense networks, a joint allocation algorithm for computing and communication resources based on reinforcement learning is proposed. The energy consumption of task execution is defined as the maximum energy consumed by any user's task execution in the system. Considering the constraints of task unloading, power allocation, transmission rate, and calculation resource allocation, the joint task unloading and resource allocation problem is modeled as minimizing the maximum task execution energy consumption. As a mixed-integer nonlinear programming problem, it is difficult to solve directly with traditional optimization methods, so this paper uses a reinforcement learning algorithm instead. The Markov decision process and the theoretical basis of reinforcement learning are then introduced to support the algorithm simulation experiment. Based on reinforcement learning and joint allocation of communication resources, the data task unloading and power control strategy is jointly optimized for each terminal device, and the local computing model and task unloading model are built. The simulation results show that the total task computation cost of the proposed algorithm is 5%-10% less than that of the two comparison algorithms under the same task input. At the same time, the total task computation cost of the proposed algorithm is more than 5% less than that of the two new comparison algorithms.

A Localized Adaptive QoS Routing Scheme Using POMDP and Exploration Bonus Techniques (POMDP와 Exploration Bonus를 이용한 지역적이고 적응적인 QoS 라우팅 기법)

  • Han Jeong-Soo
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.3B / pp.175-182 / 2006
  • In this paper, we propose a localized adaptive QoS routing scheme using POMDP and exploration bonus techniques. Because performing dynamic programming to solve a POMDP is computationally very expensive, we also show that the CEA technique, which uses expectation values, can simplify the POMDP problem. We use an exploration bonus to search for detour paths that are better than the current path, and to this end we propose an algorithm (SEMA) that searches for multiple paths. In particular, we evaluate the service success rate and average hop count with respect to the performance parameters $\phi$ and k, which are defined as the exploration count and interval. The results show that the larger $\phi$ is, the better the detour path search, and that increasing n increases the amount of exploration.

Two-Dimensional POMDP-Based Opportunistic Spectrum Access in Time-Varying Environment with Fading Channels

  • Wang, Yumeng;Xu, Yuhua;Shen, Liang;Xu, Chenglong;Cheng, Yunpeng
    • Journal of Communications and Networks / v.16 no.2 / pp.217-226 / 2014
  • In this research, we study the problem of opportunistic spectrum access (OSA) in a time-varying environment with fading channels, where the channel state is characterized by both channel quality and the occupancy of primary users (PUs). First, a finite-state Markov channel model is introduced to represent a fading channel. Second, by jointly probing channel quality and exploring the activities of PUs, a two-dimensional partially observable Markov decision process framework is proposed for OSA. In addition, a greedy strategy is designed in which a secondary user (SU) selects the channel with the best expected data transmission rate to maximize the instantaneous reward in the current slot. Compared with the optimal strategy that considers future reward, the greedy strategy offers low complexity and relatively ideal performance. The spectrum sensing error that causes collisions between a PU and an SU is also discussed. Furthermore, we analyze the multiuser situation in which the proposed single-user strategy is adopted by every SU. Simulation results show that the proposed strategy attains a larger throughput than previous works under various parameter configurations.
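
The greedy strategy described in this abstract can be illustrated with a short sketch. It assumes, hypothetically, that the secondary user keeps a belief `idle_belief[c]` that channel `c` is free of primary users and an expected rate `expected_rate[c]` given the channel quality; the function name and the numbers are illustrative and do not come from the paper.

```python
def greedy_channel(idle_belief, expected_rate):
    """Return the index of the channel with the largest expected immediate reward.

    The immediate reward of a channel is taken to be the probability that it
    is idle times the data rate expected if it is idle (a modelling assumption).
    """
    rewards = [p * r for p, r in zip(idle_belief, expected_rate)]
    return max(range(len(rewards)), key=rewards.__getitem__)


# Hypothetical beliefs and rates for three channels.
idle_belief = [0.9, 0.5, 0.7]
expected_rate = [1.0, 3.0, 2.0]
choice = greedy_channel(idle_belief, expected_rate)
```

Because the rule looks only at the current slot, it avoids solving the full POMDP, which is the source of the low complexity claimed in the abstract.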

Rental Resource Management Model with Capacity Expansion and Return (용량 확장과 반납을 갖는 렌탈 자원 관리모델)

  • Kim Eun-Gab;Byun Jin-Ho
    • Journal of the Korean Operations Research and Management Science Society / v.31 no.3 / pp.81-96 / 2006
  • We consider a rental company that dynamically manages its capacity level through capacity addition and return. While serving customers with its own capacity, the company expands its capacity by renting items from an outside source so that it can avoid lost rental opportunities, which occur when stock is insufficient. If stock becomes large enough to cope with demand, the company returns the expanded capacity to the outside source. Formulating the model as a Markov decision problem, we identify an optimal capacity management policy that states when the company should expand its capacity and when it should return the expanded capacity after capacity addition. Since it is intractable to find the optimal capacity management policy and the optimal size of capacity expansion analytically, we present a numerical procedure that finds these optimal values based on the value iteration method. A numerical analysis is carried out, and we observe monotonic properties of the optimal performance measures with respect to system parameters, which are meaningful in developing effective heuristic policies.
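
The value iteration method mentioned in this abstract can be sketched on a toy problem. This is a generic discounted-cost version under stated assumptions, not the paper's model: the two states (stock sufficient, stock short), the two actions (keep capacity, expand), and every number below are hypothetical.

```python
def value_iteration(transition, cost, discount=0.9, tol=1e-8, max_iter=10_000):
    """Minimize expected discounted cost by value iteration.

    transition[a][s][t] is the probability of moving from s to t under action a;
    cost[a][s] is the one-step cost of taking action a in state s.
    Returns (values, policy).
    """
    n_actions, n_states = len(transition), len(transition[0])
    values = [0.0] * n_states

    def q(a, s, v):
        # Cost of taking action a in state s, then following values v.
        return cost[a][s] + discount * sum(
            transition[a][s][t] * v[t] for t in range(n_states)
        )

    for _ in range(max_iter):
        new_values = [
            min(q(a, s, values) for a in range(n_actions)) for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = [
        min(range(n_actions), key=lambda a: q(a, s, values)) for s in range(n_states)
    ]
    return values, policy


# Hypothetical data. States: 0 = stock sufficient, 1 = stock short.
# Actions: 0 = keep current capacity, 1 = expand capacity.
TRANSITION = [
    [[0.7, 0.3], [0.2, 0.8]],    # keep
    [[0.95, 0.05], [0.9, 0.1]],  # expand
]
COST = [
    [0.0, 10.0],  # keep: lost rentals are costly when stock is short
    [2.0, 4.0],   # expand: pay a rental fee to the outside source
]

values, policy = value_iteration(TRANSITION, COST)
```

With these numbers the resulting policy keeps capacity when stock is sufficient and expands when it is short, a threshold-type structure of the kind the abstract refers to when it mentions monotonic properties.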

Demand Variability Impact on the Replenishment Policy in a Two-Echelon Supply Chain Model (두 계층 공급사슬 모형에서 발주정책에 대한 수요 변동성 영향)

  • Kim Eungab
    • Journal of the Korean Operations Research and Management Science Society / v.29 no.3 / pp.111-127 / 2004
  • We consider a supply chain model with a make-to-order production facility and a single supplier. The model we treat here is a special case of a two-echelon inventory model. Unlike classical two-echelon systems, the demand process at the supplier is affected by the production process at the production facility as well as by the customer order arrival process. In this paper, we address how demand variability affects the optimal replenishment policy. To this end, we incorporate Erlang and phase-type demand distributions into the model. Formulating the model as a Markov decision problem, we investigate the structure of the optimal replenishment policy. We also carry out a sensitivity analysis on the optimal policy and establish its monotonicity with respect to system cost parameters.

A Stochastic Dynamic Programming Model to Derive Monthly Operating Policy of a Multi-Reservoir System (댐 군 월별 운영 정책의 도출을 위한 추계적 동적 계획 모형)

  • Lim, Dong-Gyu;Kim, Jae-Hee;Kim, Sheung-Kown
    • Korean Management Science Review / v.29 no.1 / pp.1-14 / 2012
  • The goal of multi-reservoir operation planning is to provide an optimal release plan that maximizes reservoir storage and hydropower generation while minimizing spillage. However, reservoir operation is difficult because of the uncertainty associated with inflows. To account for uncertain inflows in the reservoir operating problem, we present a Stochastic Dynamic Programming (SDP) model based on the Markov decision process (MDP). The objective of the model is to maximize the expected value of the system performance, which is the weighted sum of all expected objective values. With the SDP model, a multi-reservoir operating rule can be derived, and the model also generates the steady-state probabilities of reservoir storage and inflow as output. We applied the model to the Geum River basin in Korea and generated a multi-reservoir monthly operating plan that accounts for the uncertainty of inflow.
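
The steady-state probabilities mentioned in this abstract can be illustrated with a small sketch. Once an operating policy is fixed, storage evolves as a Markov chain, and its long-run distribution can be approximated by power iteration. The three storage levels and the matrix below are hypothetical, the function name is illustrative, and the method assumes the chain is ergodic.

```python
def stationary_distribution(transition, tol=1e-12, max_iter=100_000):
    """Approximate the stationary distribution of an ergodic Markov chain
    by repeatedly applying the transition matrix to a starting distribution."""
    n = len(transition)
    dist = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        new_dist = [
            sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_dist, dist)) < tol:
            return new_dist
        dist = new_dist
    return dist


# Hypothetical storage levels under a fixed policy: 0 = low, 1 = medium, 2 = high.
STORAGE_TRANSITION = [
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.4, 0.5],
]

steady_state = stationary_distribution(STORAGE_TRANSITION)
```

For this matrix the long-run probabilities work out to 0.25, 0.5, and 0.25 for low, medium, and high storage, which can be checked by substituting them into the balance equations.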

Optimal SMDP-Based Connection Admission Control Mechanism in Cognitive Radio Sensor Networks

  • Hosseini, Elahe;Berangi, Reza
    • ETRI Journal / v.39 no.3 / pp.345-352 / 2017
  • Traffic management is a highly beneficial mechanism for satisfying quality-of-service requirements and overcoming the resource scarcity problems in networks. This paper introduces an optimal connection admission control mechanism to decrease the packet loss ratio and end-to-end delay in cognitive radio sensor networks (CRSNs). This mechanism admits data flows based on the value of information sent by the sensor nodes, the network state, and the estimated required resources of the data flows. The number of required channels of each data flow is estimated using a proposed formula that is inspired by a graph coloring approach. The proposed admission control mechanism is formulated as a semi-Markov decision process and a linear programming problem is derived to obtain the optimal admission control policy for obtaining the maximum reward. Simulation results demonstrate that the proposed mechanism outperforms a recently proposed admission control mechanism in CRSNs.

Reliability-guaranteed multipath allocation algorithm in mobile network

  • Jaewook Lee;Haneul Ko
    • ETRI Journal / v.44 no.6 / pp.936-944 / 2022
  • The mobile network allows redundant transmission via disjoint paths to support high-reliability communication (e.g., ultrareliable and low-latency communications [URLLC]). Although redundant transmission can improve communication reliability, it also increases network costs (e.g., traffic and control overhead). In this study, we propose a reliability-guaranteed multipath allocation algorithm (RG-MAA) that allocates appropriate paths by considering the path setup time and dynamicity of the reliability paths. We develop an optimization problem using a constrained Markov decision process (CMDP) to minimize network costs while ensuring the required communication reliability. The evaluation results show that RG-MAA can reduce network costs by up to 30% compared with the scheme that uses all possible paths while ensuring the required communication reliability.