Title/Summary/Keyword: Policy Optimization

Policy Safety Stock Cost Optimization : Xerox Consumable Supply Chain Case Study (정책적 안전재고의 비용 최적화 : 제록스 소모품 유통공급망 사례연구)

  • Suh, Eun Suk
    • Journal of Korean Institute of Industrial Engineers, v.41 no.5, pp.511-520, 2015
  • Inventory, cost, and service level are three interrelated key metrics that most supply chain organizations strive to optimize. One way to achieve this goal is to create a simulation model for conducting sensitivity analysis and optimization of the different supply chain policies that can be implemented in actual operation. In this paper, a case study of Xerox global supply chain modeling and analysis to assess several "what if" scenarios for the consumable policy safety stock is presented. The simulation model, combined with an analytical cost model and an optimization module, is used to optimize the policy safety stock level to achieve the lowest total value chain cost. It was shown quantitatively that the policy safety stock can be reduced, but the savings are offset by the inbound premium transportation cost of expediting supplies in shortage and the outbound premium transportation cost of sending supplies to customers via express shipment, requiring a fine balance.
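
The cost tradeoff described in this abstract (carrying less policy safety stock versus paying more premium transportation when shortages occur) can be illustrated at toy scale. A minimal sketch follows, assuming hypothetical cost rates and a normal demand distribution; none of these numbers come from the paper.

```python
import random

# Hypothetical cost parameters; the paper's actual cost model is not public.
HOLDING_COST = 2.0        # carrying cost per unit of safety stock per period
INBOUND_PREMIUM = 15.0    # expedite cost per unit short at the warehouse
OUTBOUND_PREMIUM = 25.0   # express-shipment cost per unit short at the customer

def total_cost(safety_stock, n_periods=10_000, mean_demand=100, sd=20):
    """Average per-period cost for one policy safety stock level."""
    rnd = random.Random(42)   # common random numbers across stock levels
    cost = 0.0
    for _ in range(n_periods):
        demand = max(0.0, rnd.gauss(mean_demand, sd))
        shortage = max(0.0, demand - (mean_demand + safety_stock))
        cost += HOLDING_COST * safety_stock                       # carrying cost
        cost += (INBOUND_PREMIUM + OUTBOUND_PREMIUM) * shortage   # expediting
    return cost / n_periods

# Sweep candidate policy safety stock levels and pick the cheapest.
best = min(range(0, 101, 5), key=total_cost)
print("lowest-cost safety stock level:", best)
```

Sweeping the stock level and picking the cheapest mirrors, at toy scale, the paper's sensitivity-analysis-plus-optimization workflow.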

Evaluation of Human Demonstration Augmented Deep Reinforcement Learning Policies via Object Manipulation with an Anthropomorphic Robot Hand (휴먼형 로봇 손의 사물 조작 수행을 이용한 사람 데모 결합 강화학습 정책 성능 평가)

  • Park, Na Hyeon;Oh, Ji Heon;Ryu, Ga Hyun;Lopez, Patricio Rivera;Anazco, Edwin Valarezo;Kim, Tae Seong
    • KIPS Transactions on Software and Data Engineering, v.10 no.5, pp.179-186, 2021
  • Manipulating complex objects with an anthropomorphic robot hand, as a human hand would, is a challenge in human-centric environments. To train an anthropomorphic robot hand with a high degree of freedom (DoF), human demonstration augmented deep reinforcement learning policy optimization methods have been proposed. In this work, we first show that augmenting deep reinforcement learning (DRL) with human demonstrations is effective for object manipulation by comparing the performance of the augmentation-free Natural Policy Gradient (NPG) and Demonstration Augmented NPG (DA-NPG). Then three DRL policy optimization methods, namely NPG, Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO), are evaluated with DA (i.e., DA-NPG, DA-TRPO, and DA-PPO) and without DA by manipulating six objects: an apple, a banana, a bottle, a light bulb, a camera, and a hammer. The results show that DA-NPG achieved an average success rate of 99.33%, whereas NPG achieved only 60%. In addition, DA-NPG succeeded in grasping all six objects, while DA-TRPO and DA-PPO failed to grasp some objects and showed unstable performance.
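
Demonstration augmentation is commonly realized by adding an imitation (behavior-cloning) term to the policy update. The sketch below illustrates that general idea for a hypothetical linear-Gaussian policy in plain NumPy; it is a schematic of the technique, not the authors' DA-NPG implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy policy: pi(a|s) = N(W s, sigma^2 I).
obs_dim, act_dim, sigma = 4, 2, 0.1
W = rng.normal(scale=0.1, size=(act_dim, obs_dim))

def policy_grad(states, actions, advantages):
    """REINFORCE-style gradient of log N(a; W s, sigma^2 I), advantage-weighted."""
    err = actions - states @ W.T                      # (N, act_dim)
    return (err * advantages[:, None]).T @ states / sigma**2 / len(states)

def bc_grad(demo_states, demo_actions):
    """Behavior-cloning gradient: move W toward reproducing demo actions."""
    err = demo_actions - demo_states @ W.T
    return err.T @ demo_states / len(demo_states)

# Demonstration-augmented update: RL term plus a weighted imitation term.
lam, lr = 0.5, 1e-2                                   # hypothetical weights
states = rng.normal(size=(64, obs_dim))
actions = rng.normal(size=(64, act_dim))
advantages = rng.normal(size=64)
demo_s, demo_a = rng.normal(size=(16, obs_dim)), rng.normal(size=(16, act_dim))
W += lr * (policy_grad(states, actions, advantages) + lam * bc_grad(demo_s, demo_a))
```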

Healthcare Optimization : Current Status and Vitalization Suggestions (의료서비스 최적화 : 현황 및 활성화 방안)

  • Kang, Sung-Hong;Kim, Byung-In;Jun, Chi-Hyuck;Choi, Byung Kwan;Lee, Shin-Ho
    • Journal of Korean Institute of Industrial Engineers, v.39 no.4, pp.313-324, 2013
  • Healthcare optimization is mandatory to strengthen the competitiveness of the domestic healthcare industry. It aims to increase service quality, patient safety, and system efficiency. This paper reviews healthcare optimization cases from developed countries, summarizes the current status of the domestic healthcare industry, points out several reasons why healthcare optimization is not active in Korea, and suggests ways to vitalize it.

A Study on Load Distribution of Gaming Server Using Proximal Policy Optimization (Proximal Policy Optimization을 이용한 게임서버의 부하분산에 관한 연구)

  • Park, Jung-min;Kim, Hye-young;Cho, Sung Hyun
    • Journal of Korea Game Society, v.19 no.3, pp.5-14, 2019
  • Gaming servers are based on distributed servers. To distribute the workload of gaming servers, distributed gaming servers apply algorithms that divide each server's workload evenly among the servers and, as a result, efficiently manage the response time and usability of the servers requested by clients. In this paper, we propose a load balancing agent using PPO (Proximal Policy Optimization), combining a greedy algorithm with the Policy Gradient approach from reinforcement learning. The proposed load balancing agent is compared with previous research through simulation.
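
At the heart of PPO is the clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. A minimal NumPy illustration of that loss (not the authors' load-balancing agent) follows.

```python
import numpy as np

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, eps=0.2):
    """Clipped surrogate loss of PPO (Schulman et al., 2017).

    ratio = pi_new(a|s) / pi_old(a|s); clipping it to [1-eps, 1+eps] keeps
    each update close to the behavior policy that collected the data.
    """
    ratio = np.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))  # minimize the negative

# Toy example: an action whose probability doubled gets its gain clipped.
lp_old = np.log(np.array([0.2, 0.5, 0.3]))
lp_new = np.log(np.array([0.4, 0.5, 0.1]))
adv = np.array([1.0, 0.5, -0.2])
print(ppo_clip_loss(lp_new, lp_old, adv))
```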

Optimization of Job-Shop Schedule Considering Deadlock Avoidance (교착 회피를 고려한 Job-Shop 일정의 최적화)

  • Jeong, Dong-Jun;Lee, Du-Yong;Im, Seong-Jin
    • Transactions of the Korean Society of Mechanical Engineers A, v.24 no.8 s.179, pp.2131-2142, 2000
  • As recent production facilities are usually operated with unmanned material-handling systems, developing an efficient schedule that avoids deadlock has become a critical problem. Related research on deadlock avoidance usually focuses on real-time control of manufacturing systems using a deadlock avoidance policy, but little work on off-line optimization of deadlock-free schedules has been reported. This paper presents an optimization method for deadlock-free scheduling of a Job-Shop system with no buffers. The deadlock-free schedule is obtained by a procedure that generates candidate lists of waiting operations and applies a deadlock avoidance policy. To verify the proposed approach, simulation results are presented for minimizing makespan in three problem types. According to the simulation results, the effect of each deadlock avoidance policy depends on the type of problem. When the proposed LOEM (Last Operation Exclusion Method) is employed, both the computing time for optimization and the makespan are reduced.
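
As a rough illustration of applying a deadlock avoidance policy to candidate waiting operations, the sketch below admits an operation in a bufferless job shop only if a conservative safety check passes. This is a generic banker's-style rule written for illustration; it is not the paper's LOEM.

```python
def safe_to_start(op_route, busy):
    """Admit an operation only if its target machine is free and at least one
    machine on the part's remaining route stays free (conservative check,
    so the part always has somewhere to go next in a bufferless shop)."""
    first = op_route[0]
    if first in busy:
        return False                       # target machine is occupied
    after = busy | {first}                 # machine occupancy after the move
    if len(op_route) == 1:
        return True                        # last operation: part then exits
    return any(m not in after for m in op_route[1:])

# Hypothetical example: a part must visit M1 then M2 while M2 holds another part.
print(safe_to_start(["M1", "M2"], busy={"M2"}))   # False: starting risks deadlock
print(safe_to_start(["M1", "M3"], busy={"M2"}))   # True: M3 remains free
```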

An Efficient Load Balancing Scheme for Gaming Server Using Proximal Policy Optimization Algorithm

  • Kim, Hye-Young
    • Journal of Information Processing Systems, v.17 no.2, pp.297-305, 2021
  • A large amount of data is being generated in gaming servers due to the increase in the number of users and the variety of game services being provided. In particular, load balancing schemes for gaming servers are a crucial consideration. The existing literature proposes algorithms that distribute server loads by mostly concentrating on load balancing and cooperative offloading. However, many proposed schemes impose heavy restrictions and assumptions, and such limited service classification methods are not enough to satisfy the wide range of service requirements. We propose a load balancing agent that combines the dynamic allocation programming method, a type of greedy algorithm, with proximal policy optimization, a reinforcement learning method. We also compare the performance of our proposed scheme with that of a scheme from the previous literature, ProGreGA, by running a simulation.
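
The greedy dynamic-allocation component can be pictured as routing each incoming request to the currently least-loaded server; an RL agent is then trained to improve on rules of this kind. A hypothetical sketch, not the paper's scheme:

```python
import heapq

def greedy_dispatch(loads, jobs):
    """Assign each job to the least-loaded server (greedy dynamic allocation).

    loads: current load per server; jobs: work units of incoming requests.
    Returns the per-job server assignment and the resulting maximum load.
    """
    heap = [(load, sid) for sid, load in enumerate(loads)]
    heapq.heapify(heap)
    assignment = []
    for job in jobs:
        load, sid = heapq.heappop(heap)    # least-loaded server
        assignment.append(sid)
        heapq.heappush(heap, (load + job, sid))
    return assignment, max(l for l, _ in heap)

# Hypothetical workload: 3 servers, 6 requests of varying cost.
print(greedy_dispatch([0.0, 0.2, 0.5], [0.3, 0.1, 0.4, 0.2, 0.3, 0.1]))
```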

Restructuring Primary Health Care Network to Maximize Utilization and Reduce Patient Out-of-pocket Expenses

  • Bardhan, Amit Kumar;Kumar, Kaushal
    • Asian Journal of Innovation and Policy, v.8 no.1, pp.122-140, 2019
  • Providing free primary care to everyone is an important goal pursued by many countries under universal health care programs. Countries like India need to efficiently utilize their limited capacities towards this purpose. Unfortunately, due to a variety of reasons, patients incur substantial travel and out-of-pocket expenses for getting primary care from publicly-funded facilities. We propose a set-covering optimization model to assist health policy-makers in managing existing capacity in a better way. Decision-making should consider upgrading centers with better potential to reduce patient expenses and reallocating capacities from less preferred facilities. A multinomial logit choice model is used to predict the preferences. In this article, a brief background and literature survey along with the mixed integer linear programming (MILP) optimization model are presented. The working of the model is illustrated with the help of numerical experiments.
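
The multinomial logit model assigns each patient group a facility-choice probability proportional to the exponential of its utility; these probabilities can then weight the covering decisions. The sketch below shows the MNL formula and a greedy stand-in for the covering step, with made-up utilities and coverage sets (the paper itself solves a MILP):

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit: P(choose j) = exp(u_j) / sum_k exp(u_k)."""
    m = max(utilities)                     # subtract max for numerical stability
    w = [math.exp(u - m) for u in utilities]
    s = sum(w)
    return [x / s for x in w]

# Hypothetical utilities of three facilities for one patient group
# (e.g., decreasing in travel distance and out-of-pocket expense).
print(mnl_probabilities([-1.0, -0.2, -2.5]))

def greedy_set_cover(demand_points, covers, k):
    """Greedily pick up to k facilities maximizing newly covered demand
    (an LP-free stand-in for the paper's set-covering MILP)."""
    uncovered, chosen = set(demand_points), []
    for _ in range(k):
        best = max(covers, key=lambda f: len(uncovered & covers[f]))
        if not uncovered & covers[best]:
            break
        chosen.append(best)
        uncovered -= covers[best]
    return chosen, uncovered

covers = {"F1": {1, 2, 3}, "F2": {3, 4}, "F3": {4, 5, 6}}
print(greedy_set_cover({1, 2, 3, 4, 5, 6}, covers, k=2))
```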

Ant Colony Optimization Approach to the Utility Maintenance Model for Connected-(r, s)-out of-(m, n) : F System ((m, n)중 연속(r, s) : F 시스템의 정비모형에 대한 개미군집 최적화 해법)

  • Lee, Sang-Heon;Shin, Dong-Yeul
    • IE interfaces, v.21 no.3, pp.254-261, 2008
  • The connected-(r,s)-out-of-(m,n):F system is an important topic in redundancy design for complex system reliability and its maintenance policy. Previous studies applied Monte Carlo simulation, genetic algorithms, and simulated annealing to tackle the difficulty of the maintenance policy problem, suggesting the most suitable maintenance cycle to optimize the maintenance pattern of the connected-(r,s)-out-of-(m,n):F system. However, the genetic algorithm requires a relatively long execution time, and simulated annealing improves computation time but yields rather poor solutions. In this paper, we propose an ant colony optimization approach for the connected-(r,s)-out-of-(m,n):F system that determines the maintenance cycle and the minimum unit cost. Computational results show that the ant colony optimization algorithm is superior to the genetic algorithm, simulated annealing, and tabu search in both execution time and solution quality.
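
Ant colony optimization over a discrete set of candidate maintenance cycles amounts to pheromone-biased sampling with evaporation, reinforcing cycles that yield low unit cost. The sketch below uses a made-up unit-cost function and hypothetical ACO parameters, not the paper's (m,n):F cost model:

```python
import random

random.seed(1)
cycles = list(range(1, 21))                 # candidate maintenance cycles (periods)
tau = {c: 1.0 for c in cycles}              # pheromone per candidate

def unit_cost(c):
    """Hypothetical stand-in for the system's expected unit maintenance cost."""
    return 5.0 / c + 0.4 * c                # setup amortization vs. failure risk

rho, Q, best = 0.1, 1.0, None               # evaporation rate, deposit constant
for _ in range(200):                        # generations of ants
    for _ant in range(10):
        c = random.choices(cycles, weights=[tau[x] for x in cycles])[0]
        cost = unit_cost(c)
        tau[c] += Q / cost                  # reinforce cheap cycles
        if best is None or cost < best[1]:
            best = (c, cost)
    for c in cycles:
        tau[c] *= 1 - rho                   # pheromone evaporation
print("best cycle, unit cost:", best)
```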

Cloud Task Scheduling Based on Proximal Policy Optimization Algorithm for Lowering Energy Consumption of Data Center

  • Yang, Yongquan;He, Cuihua;Yin, Bo;Wei, Zhiqiang;Hong, Bowei
    • KSII Transactions on Internet and Information Systems (TIIS), v.16 no.6, pp.1877-1891, 2022
  • As a part of cloud computing technology, cloud task scheduling algorithms have an important influence on data centers. In our earlier work, we proposed DeepEnergyJS, designed based on the original version of the policy gradient reinforcement learning algorithm, and verified its effectiveness through simulation experiments. In this study, we used the Proximal Policy Optimization (PPO) algorithm to update DeepEnergyJS to DeepEnergyJSV2.0. First, we verify the convergence of the PPO algorithm on the Alibaba Cluster Data V2018 dataset. Then we compare it with the earlier reinforcement learning algorithm in terms of convergence rate, converged value, and stability. The results indicate that PPO performed better on the training and test data sets than the earlier reinforcement learning algorithm, as well as other general heuristic algorithms such as First Fit, Random, and Tetris. DeepEnergyJSV2.0 achieves about 7.814% better energy efficiency than DeepEnergyJS.
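
First Fit, one of the heuristic baselines mentioned, places each task on the first machine with enough remaining capacity; the number of machines powered on serves here as a crude proxy for energy consumption. A hypothetical sketch with made-up task sizes:

```python
def first_fit(tasks, capacity):
    """Place each task on the first machine that still has room (First Fit).

    Fewer active machines is used as a rough proxy for data-center energy
    consumption; the capacity and task sizes are hypothetical.
    """
    machines = []                            # remaining capacity per active machine
    for t in tasks:
        for i, free in enumerate(machines):
            if t <= free:
                machines[i] -= t
                break
        else:
            machines.append(capacity - t)    # power on a new machine
    return len(machines)

tasks = [0.5, 0.7, 0.3, 0.2, 0.8, 0.4]
print("machines powered on:", first_fit(tasks, capacity=1.0))
```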

Application of Stochastic Optimization Method to (s, S) Inventory System ((s, S) 재고관리 시스템에 대한 확률최적화 기법의 응용)

  • Chimyung Kwon
    • Journal of the Korea Society for Simulation, v.12 no.2, pp.1-11, 2003
  • In this paper, we focus on finding an optimal policy for a class of (s, S) inventory control systems. To this end, we use perturbation analysis and apply a stochastic optimization algorithm to minimize the average cost over a period. We obtain the gradients of the objective function with respect to the ordering amount S and the reorder point s via a combined perturbation method, which alternates between infinitesimal perturbation analysis and smoothed perturbation analysis according to the occurrence of ordering-event changes. Our simulation results indicate that the optimal estimates of s and S obtained from the stochastic optimization algorithm are quite accurate. We attribute this to the low-noise gradient estimates from the regenerative system simulation and their effect on the search procedure of the stochastic optimization algorithm. Directions for future study stemming from this research pertain to extensions to more general inventory systems with regard to demand distribution, backlogging policy, lead time, and review period. Another direction involves the efficiency of the stochastic optimization algorithm's search procedure for an improving point of (s, S).
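
The paper's perturbation-analysis gradients can be mimicked at toy scale by finite-difference stochastic approximation on simulated sample paths with common random numbers. The sketch below uses hypothetical costs and exponential demand; it stands in for, rather than reproduces, the combined IPA/SPA estimator:

```python
import random

def avg_cost(s, S, seed=0, periods=2000, h=1.0, p=9.0, K=20.0):
    """Average per-period cost of an (s, S) policy on one simulated path.

    h: holding cost, p: backlog penalty, K: fixed ordering cost (all made up).
    Zero lead time is assumed for simplicity.
    """
    rnd = random.Random(seed)               # common random numbers across policies
    inv, cost = S, 0.0
    for _ in range(periods):
        inv -= rnd.expovariate(1 / 10)      # random demand, mean 10 per period
        if inv < s:                         # review: order up to S
            cost += K
            inv = S
        cost += h * max(inv, 0) - p * min(inv, 0)
    return cost / periods

# Finite-difference stochastic approximation on (s, S); a decaying step size
# would be used in practice for convergence, a constant one suffices here.
s, S, lr, d = 5.0, 40.0, 0.5, 1.0
for it in range(200):
    g_s = (avg_cost(s + d, S, it) - avg_cost(s - d, S, it)) / (2 * d)
    g_S = (avg_cost(s, S + d, it) - avg_cost(s, S - d, it)) / (2 * d)
    s, S = s - lr * g_s, S - lr * g_S
    S = max(S, s + 1)                       # keep the policy well-formed
print("estimated optimum:", round(s, 1), round(S, 1))
```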
