• Title/Summary/Keyword: Policy-based method


Solving Survival Gridworld Problem Using Hybrid Policy Modified Q-Based Reinforcement

  • Montero, Vince Jebryl;Jung, Woo-Young;Jeong, Yong-Jin
    • Journal of IKEEE / v.23 no.4 / pp.1150-1156 / 2019
  • This paper explores a model-free value-based approach for solving the survival gridworld problem. The survival gridworld problem poses a challenge that involves taking risks to gain better rewards. The classic value-based approach in model-free reinforcement learning assumes minimal-risk decisions. The proposed method applies hybrid on-policy and off-policy updates to experience roll-outs using a modified Q-based update equation that introduces a parametric linear rectifier and a motivational discount. The significance of this approach is that it allows model-free training of agents that take into account risk factors and motivated exploration to make better path decisions. Experiments suggest that the proposed method achieves better exploration and path selection, resulting in higher episode scores than classic off-policy and on-policy Q-based updates.
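For context, the classic off-policy and on-policy Q-based updates that the abstract names as baselines can be sketched as below. This is the textbook Q-learning and SARSA form, not the paper's modified equation: the abstract does not specify the parametric linear rectifier or the motivational discount, so both are left out. The function names, the dictionary-based Q-table, and the default `alpha` and `gamma` values are assumptions made for illustration.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the greedy action in s_next."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken in s_next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


# Hypothetical usage: a Q-table keyed by (state, action), defaulting to 0.0.
Q = defaultdict(float)
q_learning_update(Q, s=0, a="up", r=1.0, s_next=1, actions=["up", "down"])
```

The two updates differ only in the bootstrap target, which is where a hybrid scheme of the kind the abstract describes would have to choose between them.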

An Evaluation Method for Security Policy Model Based on Common Criteria (공통평가기준에 의한 보안정책모델 평가방법)

  • 김상호;임춘성
    • Journal of the Korea Institute of Information Security & Cryptology / v.13 no.5 / pp.57-67 / 2003
  • A Security Policy Model is a structured representation, using an informal, semiformal, or formal method, of the security policy to be enforced by the target of evaluation (TOE). It gives the TOE assurance by mitigating security flaws that result from inconsistency between security functional requirements and functional specifications. A Security Policy Model is therefore required at a high evaluation assurance level under evaluation criteria such as ISO/IEC 15408 (Common Criteria, CC). In this paper, we present an evaluation method for security policy models based on the Common Criteria assurance requirements for such models, through an analysis of concepts, related research, and those assurance requirements.

A Simulation Sample Accumulation Method for Efficient Simulation-based Policy Improvement in Markov Decision Process (마르코프 결정 과정에서 시뮬레이션 기반 정책 개선의 효율성 향상을 위한 시뮬레이션 샘플 누적 방법 연구)

  • Huang, Xi-Lang;Choi, Seon Han
    • Journal of Korea Multimedia Society / v.23 no.7 / pp.830-839 / 2020
  • As a popular mathematical framework for modeling decision making, the Markov decision process (MDP) has been widely used to solve problems in many engineering fields. An MDP consists of a set of discrete states, a finite set of actions, and rewards received after reaching a new state by taking an action from the previous state. The objective of an MDP is to find an optimal policy, that is, the best action to take in each state to maximize the expected discounted reward (EDR) of the policy. In practice, the MDP is typically unknown, so simulation-based policy improvement (SBPI), which sequentially improves a given base policy by selecting the best action in each state according to rewards observed via simulation, can be a practical way to find the optimal policy. However, the efficiency of SBPI remains a concern because many simulation samples are required to estimate the EDR precisely for each action in each state. In this paper, we propose a method that selects the best action accurately in each state using a small number of simulation samples, thereby improving the efficiency of SBPI. The proposed method accumulates the simulation samples observed in previous states, so the EDR can be estimated precisely even with a small number of samples in the current state. The results of comparative experiments against the existing method demonstrate that the proposed method improves the efficiency of SBPI.
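The basic SBPI step described in this abstract, estimating the EDR of each action by simulation and picking the best one, might look roughly like the sketch below. It does not implement the paper's sample accumulation across states, which the abstract does not detail. The simulator interface `step(state, action, rng)` returning `(next_state, reward)`, the finite `horizon` truncation, and all function names are assumptions made for illustration.

```python
import random


def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one simulated trajectory."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total


def estimate_edr(step, base_policy, state, action, gamma, horizon, n_samples, rng):
    """Monte Carlo EDR of taking `action` in `state`, then following base_policy."""
    total = 0.0
    for _ in range(n_samples):
        s, a, rewards = state, action, []
        for _ in range(horizon):
            s, r = step(s, a, rng)      # hypothetical simulator call
            rewards.append(r)
            a = base_policy(s)          # follow the base policy afterwards
        total += discounted_return(rewards, gamma)
    return total / n_samples


def improved_action(step, base_policy, state, actions,
                    gamma=0.9, horizon=20, n_samples=100, seed=0):
    """Return the action with the highest estimated EDR, plus all estimates."""
    rng = random.Random(seed)
    estimates = {
        a: estimate_edr(step, base_policy, state, a, gamma, horizon, n_samples, rng)
        for a in actions
    }
    return max(estimates, key=estimates.get), estimates
```

The cost the abstract points to is visible here: every state and action pair needs its own `n_samples` rollouts, which is what motivates reusing samples from previously visited states.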

The Block-Based Storage Policy and Order Processing in Logistics Warehouse (물류창고에서 블록별 저장방식 및 주문 처리에 관한 연구)

  • 김명훈;김종화
    • Journal of the Korea Society of Computer and Information / v.8 no.4 / pp.159-164 / 2003
  • The location of stock in a warehouse directly affects the total materials handling expense of all goods moving through the warehouse. The purpose of this paper is to develop a storage policy for order picking warehouses, the block-based storage policy, that minimizes the total order picking time. In the block-based storage policy, the rack is divided into blocks, and items are assigned to each block based on the turnover rate of each item and the average distance between the blocks and the dock. To demonstrate the performance of the proposed policy, we compare it with the existing class-based storage policy under various matching methods.


Learning Relational Instance-Based Policies from User Demonstrations (사용자 데모를 이용한 관계적 개체 기반 정책 학습)

  • Park, Chan-Young;Kim, Hyun-Sik;Kim, In-Cheol
    • Journal of KIISE:Software and Applications / v.37 no.5 / pp.363-369 / 2010
  • Demonstration-based learning has the advantage that a user can easily teach a robot new task knowledge simply by demonstrating how to perform the task. However, many previous demonstration-based learning techniques used an attribute-value vector model to represent their state spaces and policies. Because of the limitations of this model, they suffered from both low efficiency in the learning process and low reusability of the learned policy. In this paper, we present a new demonstration-based learning method in which the relational model is adopted in place of the attribute-value model. By applying relational instance-based learning to the training examples extracted from records of the user demonstrations, the method derives a relational instance-based policy that can easily be reused for other similar tasks in the same domain. A relational policy maps a context, represented as a (state, goal) pair, to a corresponding action to be executed. We give a detailed explanation of our demonstration-based relational policy learning method and then analyze its effectiveness through experiments using a robot simulator.

Data-based Method of Selecting Excellent SMEs for Governmental Funding Policy: Focused on Fishery Industry in Korea (데이터 기반 정책지원 대상 우수 중소기업 발굴 방법론 연구 : 국내 수산산업을 대상으로)

  • Hwang, Soon-Wook;Chun, Dong-Phil
    • The Journal of Fisheries Business Administration / v.49 no.4 / pp.1-17 / 2018
  • The Korean fisheries industry is a traditional business in which the majority of firms are small and medium-sized enterprises (SMEs). It has played an important role in the South Korean economy over the past several decades, but it currently faces limits on growth potential and profitability due to a declining workforce, an aging population, deteriorating fishery environments, climate change, and rapid changes in the global industrial ecosystem. Many studies have suggested solutions for the fisheries industry from a macro perspective, but few have taken a strategic approach to the problem. If governments can selectively support companies that are likely to increase their value added, the industry could break through the current situation more effectively. This paper introduces a selection method that uses data envelopment analysis (DEA) to find SMEs with the potential to increase profits and growth. We suggest selecting SMEs with high management efficiency and the ability to utilize intangible assets as the target companies. We also suggest policy objectives for SMEs in the domestic fisheries industry based on the DEA results and propose a data-based method for policy decisions.

Analysis of Problems and Causal Relations of Functional Changes of Local Educational Authority Policy(FCLEAP) based on the Systems Thinking (시스템 사고에 기반한 "지역교육청 기능 및 조직개편" 정책의 문제 및 원인 분석)

  • Ha, Jung-Youn;Rah, Minjoo
    • Korean System Dynamics Review / v.15 no.2 / pp.75-96 / 2014
  • The purpose of this paper is to analyze the policy on functional changes of local educational authorities from a systems thinking perspective using causal loop diagrams. In the past, the main function of the local educational authority was to manage and supervise schools. Under this policy, the local educational authority was to be transformed into a support agency. However, the policy did not achieve its goal; it caused confusion and requires improvement. This study identifies the structural causes of the problem based on systems thinking. The diagrams can give educational policy makers ideas even in a complicated environment. The findings indicate that a systems thinking analysis of this policy can help those involved in policy decisions more than existing diagnostic methods.


The big data analysis framework of information security policy based on security incidents

  • Jeong, Seong Hoon;Kim, Huy Kang;Woo, Jiyoung
    • Journal of the Korea Society of Computer and Information / v.22 no.10 / pp.73-81 / 2017
  • In this paper, we propose an analysis framework to capture trends in information security incidents and to evaluate security policy based on the incident analysis. We build a large dataset from news media by collecting security incident news and policy news, identify key trends in information security from it, and present an analytical method for evaluating policies from the point of view of incidents. More specifically, we propose a network-based analysis model that makes it easy to identify trends in information security incidents and policy at a glance, and a cosine similarity measure for finding important events among incidents and policy announcements.
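As a rough illustration of the cosine similarity measure mentioned in this abstract, the sketch below compares two short texts as bag-of-words count vectors. The abstract does not say how documents are tokenized or weighted, so the lowercase whitespace tokenization, the raw term counts, and the function name are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-count vectors of two texts."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    # Only terms present in both texts contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical usage: score an incident headline against a policy headline.
score = cosine_similarity("bank data breach reported", "new data breach policy")
```

A score near 1 means the two texts use nearly the same terms in the same proportions, and a score of 0 means they share no terms.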

A Study on Developing Science Service of Science and Technology Policy (과학기술 정책의 과학화 서비스 개발에 관한 연구)

  • Shin, Mun-Bong;Chun, Seung-Su;WhangBo, Taeg-Keun
    • Journal of Information Technology Services / v.11 no.1 / pp.83-92 / 2012
  • The development of a science- and technology-oriented knowledge society accelerates the convergence of scientific theory and industrial technology and increases the complexity of problems in social and economic sectors. This makes it difficult to secure the reliability and objectivity of science and technology policy, and it hinders balanced evaluation across rational science and technology policy making, management, and policy coordination. In this regard, countries advanced in science and technology are developing policy support systems and promoting evidence-based SciSIP (Science of Science and Innovation Policy) programs. This paper introduces a new approach to developing a science service for science and technology policy in Korea using business intelligence technology. It also proposes a method for integrating a policy knowledge base with component-based services that support the S&T policy decision-making process, and it introduces service case studies.

Developing a Combined Forecasting Model on Hospital Closure (병원도산의 예측모형 개발연구)

  • 정기택;이훈영
    • Health Policy and Management / v.10 no.2 / pp.1-21 / 2000
  • This study reviewed various parametric and nonparametric methods for forecasting hospital closures in Korea. We compared multivariate discriminant analysis, multivariate logistic regression, classification and regression tree, and neural network methods based on the hit ratio of each model for forecasting hospital closure. As in other studies in the literature, neural network analysis showed the highest average hit ratio. For policy and business purposes, we combined the four analytical methods and constructed a forecasting model that can easily be used to predict the probability of hospital closure given a hospital's financial information.
