• Title/Abstract/Keywords: Optimal policy

Rental Resource Management Model with Capacity Expansion and Return

  • 김은갑;변진호
    • 한국경영과학회지 / Vol. 31, No. 3 / pp. 81-96 / 2006
  • We consider a rental company that dynamically manages its capacity level through capacity addition and return. While serving customers with its own capacity, the company expands its capacity by renting items from an outside source so that it can avoid lost rental opportunities, which occur when stock is not sufficient. If stock becomes large enough to cope with demand, the company returns the expanded capacity to the outside source. Formulating the model as a Markov decision problem, we identify an optimal capacity management policy that states when the company should expand its capacity and when it should return the expanded capacity after capacity addition. Since it is intractable to find the optimal capacity management policy and the optimal size of capacity expansion analytically, we present a numerical procedure that finds these optimal values based on the value iteration method. Numerical analysis is implemented, and we observe monotonic properties of the optimal performance measures with respect to system parameters, which are meaningful in developing effective heuristic policies.
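
The value iteration step mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes a hypothetical discrete-time toy chain in which, per step, one customer arrives with probability `p_arr` and each rented item comes back with probability `p_ret`. The function name `solve_rental_mdp` and all parameter values (`base`, `extra`, `reward`, `hold`, `setup`, `gamma`) are invented for illustration only.

```python
def solve_rental_mdp(base=3, extra=2, p_arr=0.3, p_ret=0.1, reward=5.0,
                     hold=0.4, setup=1.0, gamma=0.95, tol=1e-9):
    """Value iteration on a toy discrete-time rental model (hypothetical numbers)."""
    cmax = base + extra
    # Per-step event probabilities must sum to at most 1.
    assert p_arr + cmax * p_ret <= 1.0
    # State (x, e): x items currently rented out, e = 1 if capacity is expanded.
    states = [(x, e) for e in (0, 1) for x in range(base + e * extra + 1)]
    V = {s: 0.0 for s in states}

    def actions(x):
        # Expanded capacity can be returned (a = 0) only if own stock covers x.
        return (0, 1) if x <= base else (1,)

    def q_value(x, e, a, values):
        cap = base + a * extra
        # Holding cost while expanded, plus a one-off cost when newly expanding.
        cost = hold * a + (setup if (a == 1 and e == 0) else 0.0)
        p_back = x * p_ret
        if x < cap:   # the arrival is served and earns the rental reward
            arrival = p_arr * (reward + gamma * values[(x + 1, a)])
        else:         # the arrival is lost
            arrival = p_arr * gamma * values[(x, a)]
        back = p_back * gamma * values[(x - 1, a)] if x > 0 else 0.0
        idle = (1.0 - p_arr - p_back) * gamma * values[(x, a)]
        return -cost + arrival + back + idle

    while True:
        new_V = {(x, e): max(q_value(x, e, a, V) for a in actions(x))
                 for (x, e) in states}
        diff = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if diff < tol:
            break
    # Greedy policy with respect to the converged value function.
    policy = {(x, e): max(actions(x), key=lambda a: q_value(x, e, a, V))
              for (x, e) in states}
    return V, policy


if __name__ == "__main__":
    V, policy = solve_rental_mdp()
    for state in sorted(policy):
        print(state, "expand" if policy[state] == 1 else "own capacity only")
```

Printing the policy for several parameter settings is one way to look for threshold-type structure, although whether it appears depends on the assumed numbers.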

Worst-case optimal feedback control policy for a remote electrical drive system with time-delay

  • 고유;장정;이창구;정길도
    • 대한전기학회:학술대회논문집 / 대한전기학회 2007년도 심포지엄 논문집 정보 및 제어부문 / pp. 92-94 / 2007
  • This paper considers an optimal control problem for the remote control of an electrical drive system with a DC motor. Since it is a linear control system with time delay subject to unknown but bounded disturbances, we construct a worst-case feedback control policy. This policy guarantees that, for all admissible uncertain disturbances, the real system state lies in a prescribed neighborhood of a desired value and the cost functional takes the best guaranteed value. The worst-case feedback control policy is allowed to be corrected at one correction point between the initial and final times, which is equivalent to solving a 1-level min-max problem. Since the min-max problem at this stage does not yield a simple analytical solution, we consider an approximate control policy, which is equivalent and can be solved explicitly in the numerical experiments.

A Study on the Service Control Policy of M/M/2/K Queueing System with Two Types of Customers

  • 유인선;문기석
    • 산업경영시스템학회지 / Vol. 6, No. 8 / pp. 93-103 / 1983
  • In this paper, we study an optimal service policy for the M/M/2/K queueing system with two types of customers. The incurred costs consist of waiting cost, service cost, and changeover cost. The changeover cost occurs when a server assigned to serve a particular type of customer is reassigned to the other type. Two servers serve two types of customers who arrive at two separate queues. The two types of customers differ in their arrival rate, service rate, waiting cost, and service cost. The servers require a policy for determining when they should change their service type so as to minimize the long-run expected total cost. The policy is obtained from a Markov decision process model that consists of a finite number of states and actions. To find the optimal service policy, we define the states and actions of the system, compute one-step transition probabilities, and apply the successive approximations algorithm.

Actor-Critic Algorithm with Transition Cost Estimation

  • Sergey, Denisov;Lee, Jee-Hyong
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 16, No. 4 / pp. 270-275 / 2016
  • We present an approach for accelerating the actor-critic algorithm for reinforcement learning with a continuous action space. The actor-critic algorithm has already proved robust to infinitely large action spaces in various high-dimensional environments. Despite that success, its main problem remains the same: the speed of convergence to the optimal policy. In a high-dimensional state and action space, searching for the correct action in each state takes an enormously long time. Therefore, in this paper we suggest a search-accelerating function that speeds up convergence of the algorithm so that it reaches the optimal policy faster. In our method, we assume that actions may have their own distribution of preference that is independent of the state. Since the agent acts randomly in the environment at the beginning of learning, it would be more efficient if actions were taken according to some heuristic function. We demonstrate that the heuristically accelerated actor-critic algorithm learns the optimal policy faster, using the Educational Process Mining dataset with records of students' course learning process and their grades.

Optimal Preventive Maintenance Policy for a Repairable System

  • Ji Hwan Cha;Jong Tae Jung;Jae Joo Kim
    • 품질경영학회지 / Vol. 29, No. 2 / pp. 46-53 / 2001
  • In this paper, a preventive maintenance (PM) policy for a repairable system is considered. The failure rate model proposed by Park et al. (2000) is generalized by assuming that each PM not only slows down the degradation process of the system but also reduces the system failure rate by a certain fixed amount. The long-run expected cost rate of the PM policy is derived, and properties of the joint solution of the optimal PM period and optimal number of PMs that minimize the expected cost rate are obtained. Numerical examples for the case of a Weibull-type failure rate are given.
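
To make the notion of a long-run expected cost rate concrete, here is a minimal sketch of the classic and much simpler periodic replacement model with minimal repair, not the generalized model of this paper. It assumes a Weibull cumulative hazard H(T) = (T/eta)^beta, so the cost rate is C(T) = (c_min * H(T) + c_pm) / T, minimized for beta > 1 at T* = eta * (c_pm / (c_min * (beta - 1)))^(1/beta). The names `cost_rate`, `optimal_period`, `golden_section_min` and the numbers used are hypothetical.

```python
import math


def cost_rate(T, beta, eta, c_min, c_pm):
    """Long-run expected cost per unit time when the system is renewed every T."""
    expected_failures = (T / eta) ** beta  # Weibull cumulative hazard H(T)
    return (c_min * expected_failures + c_pm) / T


def optimal_period(beta, eta, c_min, c_pm):
    """Closed-form minimizer of cost_rate; needs an increasing failure rate."""
    if beta <= 1:
        raise ValueError("beta must exceed 1 for a finite optimal period")
    return eta * (c_pm / (c_min * (beta - 1))) ** (1.0 / beta)


def golden_section_min(f, lo, hi, tol=1e-8):
    """Numerical minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2


if __name__ == "__main__":
    # Hypothetical parameters: shape 2, scale 100, repair cost 1, PM cost 4.
    t_star = optimal_period(2.0, 100.0, 1.0, 4.0)
    t_num = golden_section_min(
        lambda T: cost_rate(T, 2.0, 100.0, 1.0, 4.0), 1.0, 1000.0)
    print(t_star, t_num, cost_rate(t_star, 2.0, 100.0, 1.0, 4.0))
```

The numerical search is included only as a cross-check of the closed form. Models with a second decision variable, such as the number of PMs, generally need a joint search instead.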

Preventive Maintenance Policy Following the Expiration of Extended Warranty Under Replacement-Repair Warranty

  • 정기문
    • 한국신뢰성학회지:신뢰성응용연구 / Vol. 14, No. 2 / pp. 122-128 / 2014
  • In this paper, we consider a periodic preventive maintenance model for a repairable system following the expiration of an extended warranty under a replacement-repair warranty. Under the replacement-repair warranty, the failed system is replaced or minimally repaired by the manufacturer at no cost to the user. Likewise, under the extended warranty, the failed system is minimally repaired by the manufacturer at no cost to the user during the original extended warranty period. As the optimality criterion, we use the expected cost rate per unit time during the life cycle from the user's perspective. We then determine the optimal preventive maintenance period and the optimal number of preventive maintenance actions by minimizing the expected cost rate per unit time. Finally, the optimal periodic preventive maintenance policy is given for the Weibull distribution case.

A genetic algorithm for determining the optimal operating policies in an integrated-automated manufacturing system

  • 임준묵
    • 한국산업정보학회:학술대회논문집 / 한국산업정보학회 1999년도 춘계학술대회 발표논문집 / pp. 145-153 / 1999
  • We consider a Direct Input Output Manufacturing System (DIOMS) which has a number of machine centers placed along a built-in Automated Storage/Retrieval System (AS/RS). The Storage/Retrieval (S/R) machine handles parts placed on pallets for the machine centers located on either one or both sides of the AS/RS. This report studies the operational aspect of DIOMS and determines the optimal operating policy by combining computer simulation and a genetic algorithm. The operational problem includes input sequencing control, the dispatching rule of the S/R machine, the machine center-based part type selection rule, and the storage assignment policy. For each of these operating policies, several alternatives are considered based on known research results. Using computer simulation and a genetic algorithm, we suggest a method that gives the optimal configuration of operating policies within reasonable computation time.

A Study on Optimal Preventive Maintenance Policy When Failure Rate is Exponentially Increasing After Repair

  • 김태희;나명환
    • 한국신뢰성학회지:신뢰성응용연구 / Vol. 11, No. 2 / pp. 167-176 / 2011
  • This paper introduces models for preventive maintenance policies and considers a periodic preventive maintenance policy with minimal repair when the system fails. It is assumed that minimal repairs do not change the failure rate of the system. The failure rate under preventive maintenance is affected by the previous preventive maintenance, and the model considers the case where the slope of the failure rate increases. The starting point of the failure rate under preventive maintenance is also assumed to increase by a ratio, reflecting the degradation of the system. The expected cost per unit time under this preventive maintenance policy is derived. We obtain the optimal period and the optimal number of periodic preventive maintenance actions, which minimize the expected cost per unit time, by using Nakagawa's algorithm.

Explorized Policy Iteration For Continuous-Time Linear Systems

  • 이재영;전태윤;최윤호;박진배
    • 전기학회논문지 / Vol. 61, No. 3 / pp. 451-458 / 2012
  • This paper addresses the problem that policy iteration (PI) for continuous-time (CT) systems requires exploration of the state space, which is known as persistency of excitation in the adaptive control community, and proposes a PI scheme explorized by an additional probing signal to solve it. The proposed PI method efficiently finds the related CT linear quadratic (LQ) optimal control online without knowing the system matrix A, and guarantees stability and convergence to the LQ optimal control, which is proven in this paper in the presence of the probing signal. A design method for the probing signal is also presented to balance exploration of the state space against control performance. Finally, several simulation results are provided to verify the effectiveness of the proposed explorized PI method.
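
For orientation, the evaluate-then-improve structure of policy iteration for the CT LQ problem can be sketched in the scalar case. This is the classical model-based (Kleinman-style) iteration, which does use the system coefficient `a`. It is not the model-free explorized scheme of this paper and includes no probing signal. The system dx/dt = a*x + b*u, the cost weights `q` and `r`, the initial gain `k0`, and both function names are assumptions chosen for illustration.

```python
import math


def kleinman_scalar(a, b, q, r, k0, iters=20):
    """Policy iteration for dx/dt = a*x + b*u, cost = integral of q*x^2 + r*u^2."""
    k = k0
    history = []
    for _ in range(iters):
        closed_loop = a - b * k
        if closed_loop >= 0:
            raise ValueError("current gain is not stabilizing")
        # Policy evaluation: scalar Lyapunov equation
        #   2*(a - b*k)*p + q + r*k^2 = 0.
        p = (q + r * k * k) / (-2.0 * closed_loop)
        # Policy improvement: new feedback gain for u = -k*x.
        k = b * p / r
        history.append((p, k))
    return p, k, history


def riccati_scalar(a, b, q, r):
    """Stabilizing solution of the scalar Riccati equation 2*a*p - (b*p)^2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    # Hypothetical unstable plant (a = 1) with a stabilizing initial gain k0 = 2.
    p, k, history = kleinman_scalar(1.0, 1.0, 1.0, 1.0, k0=2.0)
    print(p, riccati_scalar(1.0, 1.0, 1.0, 1.0))
```

The iteration must start from a stabilizing gain, since otherwise the evaluation step has no finite cost to compute.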

The Use of Optimal Control Techniques to Design Regional Policies: With Special Reference to the Evaluation of Regional Economic Policies

  • 강동희
    • 지역연구 / Vol. 15, No. 1 / pp. 1-22 / 1999
  • It is widely known that optimal control techniques are useful for measuring the performance of macroeconomic policy. This paper examines how the method could be applied to the evaluation of the public investment expenditures of the local government of Choongbook Province in Korea. The numerical example illustrates the usefulness of the method for evaluating regional economic policies and suggests the following main findings: (1) If the local government of Choongbook Province had increased public investment expenditures, allowing budget deficits for the first three to four years of the period between 1985 and 1990, its GRDP would have risen earlier to more than three percent of Korea's total GDP. (2) The additional welfare losses incurred by not following the optimal policy were 0.191 in 1986, 0.607 in 1987, 1.585 in 1988, and 0.132 in 1989, indicating that the public investment policy performed best in 1989 and worst in 1988.
