• Title/Abstract/Keyword: Reward rate

76 search results

Scheduling Algorithms for the Maximal Total Revenue on a Single Processor with Starting Time Penalty

  • Joo, Un-Gi
    • Management Science and Financial Engineering
    • /
    • Vol. 18 No. 1
    • /
    • pp.13-20
    • /
    • 2012
  • This paper considers a revenue maximization problem on a single processor. Each job is characterized by its processing time, initial reward, reward decreasing rate, and preferred start time. If the processor starts a job at time zero, the revenue of the job is its initial reward. However, the revenue decreases linearly at the reward decreasing rate as the start time increases toward the preferred start time, and the revenue is zero if processing starts after the preferred time. Our objective is to find the optimal sequence which maximizes the total revenue. For the problem, we characterize the optimal solution properties and prove NP-hardness. Based upon the characterization, we develop a branch-and-bound algorithm for the optimal sequence and suggest five heuristic algorithms for efficient solutions. The numerical tests show that the characterized properties are useful for effective and efficient algorithms.
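For intuition, the revenue function and sequence objective described in the abstract can be written down directly; the sketch below is a minimal illustration under our own naming (the fields `p`, `r0`, `d`, and `s` for processing time, initial reward, decreasing rate, and preferred start time are assumptions, not the paper's notation):

```python
from dataclasses import dataclass

@dataclass
class Job:
    p: float   # processing time
    r0: float  # initial reward (revenue if started at time 0)
    d: float   # reward decreasing rate
    s: float   # preferred start time (revenue is 0 from this point on)

def revenue(job: Job, start: float) -> float:
    """Revenue falls linearly in the start time and is zero once the
    preferred start time has passed."""
    if start >= job.s:
        return 0.0
    return max(0.0, job.r0 - job.d * start)

def total_revenue(sequence: list[Job]) -> float:
    """Total revenue of one processing order on a single processor starting at time 0."""
    t = total = 0.0
    for job in sequence:
        total += revenue(job, t)
        t += job.p
    return total

# Example: the first job earns 10, the second starts at t=2 and earns 12 - 2*2 = 8.
jobs = [Job(p=2, r0=10, d=1, s=8), Job(p=3, r0=12, d=2, s=5)]
print(total_revenue(jobs))  # 18.0
```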

Hi Herzberg?: The Role of Compensation Factors and Suggestions for Performance Compensation System

  • Kim, Yoo-Gue;Yang, Woo-Ryeong;Kim, Ha-Ryong;Yang, Hoe-Chang
    • 융합경영연구
    • /
    • Vol. 5 No. 1
    • /
    • pp.21-26
    • /
    • 2017
  • Purpose - This study extracts performance-reward factors based on previous studies related to Herzberg's two-factor theory and performance rewards, and proposes a research method to identify how these factors influence task performance, which is directly related to production performance, and contextual performance, which has an indirect influence. Research Design, Data, and Methodology - This study draws performance-reward factors through Focus Group Interviews (FGI), classifies them into economic/non-economic and direct/indirect factors, draws maintenance/improvement factors and unnecessary ones through IPA, and maximizes the effectiveness of performance-reward factors. Results - It also identifies how performance-reward factors influence internal and external motives based on previous studies, classifies performance-reward factors into task performance and contextual performance and identifies the influence relationship between these, and proposes a research model to identify the role of equity sensitivity based on equity theory. Conclusion - The findings from this study are expected to lay the groundwork for various methods to reduce the employee turnover rate and to serve as an important resource for reinforcing the competitiveness of businesses, by classifying the performance-reward factors that may cause internal and external motives from the small and medium-sized manufacturing perspective and presenting methods to identify whether these influence task performance and contextual performance.

Note on Fuzzy Random Renewal Process and Renewal Rewards Process

  • Hong, Dug-Hun
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 9 No. 3
    • /
    • pp.219-223
    • /
    • 2009
  • Recently, Zhao et al. [Fuzzy Optimization and Decision Making (2007) 6, 279-295] characterized the interarrival times as fuzzy random variables and presented a fuzzy random elementary renewal theorem on the limit value of the expected renewal rate of the process in the fuzzy random renewal process. They also depicted both the interarrival times and the rewards as fuzzy random variables and provided a fuzzy random renewal reward theorem on the limit value of the long-run expected reward per unit time in the fuzzy random renewal reward process. In this note, we simplify the proofs of the two main results of that paper.
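For comparison, the crisp renewal reward theorem that these fuzzy random results generalize states that the long-run reward per unit time converges to the ratio of the expected cycle reward to the expected interarrival time (a standard textbook statement, not a formula quoted from the note):

```latex
\[
  \lim_{t \to \infty} \frac{C(t)}{t}
  \;=\; \frac{\mathbb{E}[R_1]}{\mathbb{E}[X_1]} \quad \text{a.s.},
\]
% C(t): total reward earned up to time t
% X_1:  a generic interarrival time
% R_1:  the reward earned in one renewal cycle
```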

Stochastic Petri Nets Modeling Methods of Channel Allocation in Wireless Networks

  • Ro, Cheul-Woo;Kim, Kyung-Min
    • International Journal of Contents
    • /
    • Vol. 4 No. 3
    • /
    • pp.20-28
    • /
    • 2008
  • To obtain realistic performance measures for wireless networks, one should consider changes in performance due to failure-related behavior. In performability analysis, simultaneous consideration is given to both pure performance and performance-with-failure measures. SRN is an extension of stochastic Petri nets and provides compact modeling facilities for system analysis. In this paper, a new methodology to model and analyze performability based on stochastic reward nets (SRN) is presented. Composite performance and availability SRN models for wireless handoff schemes are developed, and these models are then decomposed hierarchically. The SRN models can yield measures of interest such as blocking and dropping probabilities. These measures are expressed in terms of the expected values of reward rate functions for SRNs. Numerical results show the accuracy of the hierarchical model. The key contributions of this paper are the Petri net modeling techniques, which replace complicated numerical analysis of Markov chains, and the straightforward performance analysis of channel allocation that the SRN reward concepts enable.
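To make the phrase "expected values of reward rate functions" concrete: in an SRN, a steady-state measure is obtained by attaching a reward rate to every tangible marking and averaging over the stationary distribution. The generic form below is standard SRN usage; the example reward assignment is ours rather than the paper's:

```latex
\[
  \mathbb{E}[X] \;=\; \sum_{i \in \mathcal{T}} r_i \, \pi_i
\]
% \mathcal{T}: set of tangible markings of the SRN
% \pi_i:       steady-state probability of marking i
% r_i:         reward rate assigned to marking i
%              (e.g., r_i = 1 in markings with no free channel
%               gives the call-blocking probability)
```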

로봇을 위한 인공 두뇌 개발 (Artificial Brain for Robots)

  • 이규빈;권동수
    • 로봇학회논문지
    • /
    • Vol. 1 No. 2
    • /
    • pp.163-171
    • /
    • 2006
  • This paper introduces the research progress on the artificial brain in the Telerobotics and Control Laboratory at KAIST. This series of studies is based on the assumption that it will be possible to develop an artificial intelligence by copying the mechanisms of the animal brain. Two important brain mechanisms are considered: spike-timing dependent plasticity and dopaminergic plasticity. Each mechanism is implemented in two coding paradigms: spike codes and rate codes. Spike-timing dependent plasticity is essential for self-organization in the brain. Dopamine neurons deliver reward signals and modify the synaptic efficacies in order to maximize the predicted reward. This paper addresses how artificial intelligence can emerge from the synergy between self-organization and reinforcement learning. For implementation, rate-code versions of the brain mechanisms are developed to compute the neuron dynamics efficiently.

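The rate-coded, dopamine-modulated plasticity mentioned above is often written as a three-factor Hebbian rule in which a reward prediction error gates the weight change. The snippet below is only a generic sketch of that idea under assumed names and constants; it is not the laboratory's actual model:

```python
import numpy as np

def reward_modulated_update(w, pre, post, reward, predicted_reward, lr=0.01):
    """Three-factor rule: a dopamine-like prediction error
    (reward - predicted_reward) gates the Hebbian correlation
    of pre- and post-synaptic firing rates."""
    delta = reward - predicted_reward           # reward prediction error
    return w + lr * delta * np.outer(post, pre)

# Example: 3 presynaptic rates drive 2 postsynaptic units (rate code).
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(2, 3))
pre = np.array([0.2, 0.8, 0.5])   # presynaptic firing rates
post = w @ pre                     # linear rate-code response
w = reward_modulated_update(w, pre, post, reward=1.0, predicted_reward=0.4)
```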

소프트웨어 신뢰성의 정량적 분석 방법론 (A Quantitative Analysis Theory for Reliability of Software)

  • 조용순;윤현상;이은석
    • 한국정보과학회논문지:컴퓨팅의 실제 및 레터
    • /
    • Vol. 15 No. 7
    • /
    • pp.500-504
    • /
    • 2009
  • From a traditional software engineering perspective, reliability, one of the non-functional requirements of software, can only be verified after integration testing, the final stage of the software development process. However, this entails significant risk and development cost. This paper therefore proposes a method for analyzing reliability in the early stages of software development using a mathematical analysis model. For software reliability analysis, this paper proposes two things. First, it proposes a software modeling methodology for reliability analysis based on Hierarchical Queueing Petri Nets. Second, it proposes a method for deriving a Markov Reward Model for reliability analysis from the completed Hierarchical Queueing Petri Net model. To verify the validity of the approach, it is applied to a video conferencing system development case. The results show that a quantitative analysis of software reliability is possible.
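As a toy illustration of the final step of that pipeline, a Markov Reward Model pairs a continuous-time Markov chain with a reward rate per state, and a reliability-style measure is the expected reward under the stationary distribution. The generator matrix, states, and rewards below are invented for illustration and are not taken from the paper's video-conferencing case study:

```python
import numpy as np

# Hypothetical 3-state CTMC: 0 = up, 1 = degraded, 2 = failed.
Q = np.array([[-0.02,  0.02,  0.00],   # generator matrix (rows sum to 0)
              [ 0.10, -0.15,  0.05],
              [ 0.50,  0.00, -0.50]])
reward = np.array([1.0, 0.5, 0.0])     # reward rate per state (1 = fully reliable)

# Stationary distribution: solve pi Q = 0 with the probabilities summing to 1.
A = np.vstack([Q.T, np.ones(len(Q))])
b = np.zeros(len(Q) + 1)
b[-1] = 1.0
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

steady_state_reliability = float(reward @ pi)  # expected reward rate
print(pi, steady_state_reliability)
```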

비트코인 채굴 수익성 모델 및 분석 (Bitcoin Mining Profitability Model and Analysis)

  • 이진우;조국래;염대현
    • 정보보호학회논문지
    • /
    • Vol. 28 No. 2
    • /
    • pp.303-310
    • /
    • 2018
  • Bitcoin is a cryptocurrency proposed by Satoshi Nakamoto in 2009; it has a distributed consensus structure in which the currency is issued and managed without a central authority. Mining is the work at the core of this distributed consensus structure: it packages pending Bitcoin transactions into blocks and includes them in the Bitcoin blockchain (ledger). Because creating a block requires computing resources, the miner who performs the work is paid bitcoins as a reward, and new bitcoins are issued through this reward. Bitcoin is designed so that at most 21 million coins can be issued, and the concept of halving was introduced into the mining process to guard against inflation. The reward, which was 50 BTC in 2009, is currently 12.5 BTC, but the real value of the mining reward has grown: Bitcoin, which traded at 924,000 KRW per BTC on January 12, 2017, reached 16,103,306 KRW per BTC on December 10, 2017, increasing the real reward amount. Although the price rise keeps attracting new miners, research on how profitable mining actually is remains scarce. This paper examines the structure of Bitcoin mining and how much profitability can be expected from it.
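A back-of-the-envelope version of such a profitability estimate multiplies the miner's share of the network hash rate by the expected block rewards per day and the exchange rate, and subtracts electricity cost. The sketch below uses invented parameter values and is not the model developed in the paper:

```python
def daily_mining_profit(hashrate_ths, network_hashrate_ths, block_reward_btc,
                        btc_price_krw, power_kw, electricity_krw_per_kwh,
                        blocks_per_day=144):
    """Rough expected daily profit in KRW: the miner's share of network
    hash power times the day's block rewards, minus electricity cost."""
    share = hashrate_ths / network_hashrate_ths
    revenue_krw = share * blocks_per_day * block_reward_btc * btc_price_krw
    cost_krw = power_kw * 24 * electricity_krw_per_kwh
    return revenue_krw - cost_krw

# Illustrative numbers only (12.5 BTC block reward, late-2017 price level).
print(daily_mining_profit(hashrate_ths=14, network_hashrate_ths=14_000_000,
                          block_reward_btc=12.5, btc_price_krw=16_103_306,
                          power_kw=1.4, electricity_krw_per_kwh=100))
```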

A study on the structure and corrosion characteristics of polyethylene terephthalate and polyvinyl chloride

  • Chilnam Choe;Hyo
    • 한국환경과학회:학술대회논문집
    • /
    • 한국환경과학회 1997년도 가을 학술발표회 프로그램
    • /
    • pp.58-58
    • /
    • 1997
  • The corrosion rate of the polymers polyethylene terephthalate and polyvinyl chloride was characterized under various conditions by the potentiostat/galvanostat method. The cell and working electrode used for this study were specially prepared. The potential was scanned with a forward scan from -2 V to 3 V and a reverse scan from 3 V to -2 V, at 50 mV/s (R: auto-compensation).


공 던지기 로봇의 정책 예측 심층 강화학습 (Deep Reinforcement Learning of Ball Throwing Robot's Policy Prediction)

  • 강영균;이철수
    • 로봇학회논문지
    • /
    • Vol. 15 No. 4
    • /
    • pp.398-403
    • /
    • 2020
  • A robot's throwing control is difficult to calculate accurately because of air resistance, rotational inertia, and other factors. This complexity can be addressed by using machine learning. Reinforcement learning that relies on a reward function limits a robot's ability to adapt to new environments. Therefore, this paper applies deep reinforcement learning using a neural network without a reward function. Each throw is evaluated as a success or failure. The network learns by taking the target position and control policy as input and yielding the evaluation as output. The task is then carried out by predicting the success probability for each target location and control policy and searching for the policy with the highest probability. Repeating this task improves performance as data accumulate. The model can even predict tasks that were not previously attempted, which makes it a universally applicable learning model for any new environment. According to the results of 520 experiments, this learning model achieves a 75% success rate.
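The search step described above (predict a success probability for every candidate control policy at a given target, then act with the most promising one) can be sketched as follows; the scoring model is a stand-in for the paper's trained network, and all names and dimensions are illustrative:

```python
import numpy as np

def predict_success(model, target, policy):
    """Stand-in for the trained network: maps (target position,
    control-policy parameters) to a success probability in [0, 1]."""
    return model(np.concatenate([target, policy]))

def best_policy(model, target, candidates):
    """Score every candidate policy for the target and return the one
    with the highest predicted success probability."""
    probs = [predict_success(model, target, p) for p in candidates]
    i = int(np.argmax(probs))
    return candidates[i], probs[i]

# Example with a dummy sigmoid scorer and random candidate policies.
rng = np.random.default_rng(1)
dummy_model = lambda x: 1.0 / (1.0 + np.exp(-x.sum()))  # placeholder, not a trained net
target = np.array([1.5, 0.3])                           # e.g., target position in metres
candidates = [rng.uniform(-1, 1, size=3) for _ in range(50)]
policy, prob = best_policy(dummy_model, target, candidates)
```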

액터-크리틱 모형기반 포트폴리오 연구 (A Study on the Portfolio Performance Evaluation using Actor-Critic Reinforcement Learning Algorithms)

  • 이우식
    • 한국산업융합학회 논문집
    • /
    • Vol. 25 No. 3
    • /
    • pp.467-476
    • /
    • 2022
  • The Bank of Korea raised the benchmark interest rate by a quarter percentage point to 1.75 percent per year, and analysts predict that South Korea's policy rate will reach 2.00 percent by the end of calendar year 2022. Furthermore, because market volatility has increased significantly due to a variety of factors, including rising rates and inflation, many investors have struggled to meet their financial objectives or deliver returns. In this situation, banks and financial institutions are attempting to provide robo-advisors that manage client portfolios without human intervention. In this regard, determining the best hyper-parameter combination is becoming increasingly important. This study compares several activation functions of the Deep Deterministic Policy Gradient (DDPG) and Twin-Delayed Deep Deterministic Policy Gradient (TD3) algorithms for choosing a sequence of actions that maximizes long-term reward. According to the results, DDPG and TD3 outperformed their benchmark index. One reason for this is that we need to understand the action probabilities in order to choose an action and receive a reward, which we then compare to the state value to determine an advantage. As interest in machine learning has grown and research into deep reinforcement learning has become more active, finding an optimal hyper-parameter combination for DDPG and TD3 has become increasingly important.
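As a sketch of what comparing activation functions means for such an actor network, the toy deterministic actor below maps a state vector to long-only portfolio weights through one hidden layer whose activation can be swapped between ReLU and tanh; it is a generic numpy illustration, not the DDPG/TD3 implementation used in the study:

```python
import numpy as np

ACTIVATIONS = {"relu": lambda x: np.maximum(0.0, x), "tanh": np.tanh}

def actor(state, W1, W2, activation="tanh"):
    """Deterministic actor: state -> hidden layer -> softmax portfolio weights."""
    h = ACTIVATIONS[activation](W1 @ state)
    logits = W2 @ h
    w = np.exp(logits - logits.max())
    return w / w.sum()                 # long-only weights summing to 1

# Compare the two activations on the same random weights and state vector.
rng = np.random.default_rng(42)
state = rng.normal(size=8)                       # e.g., recent returns and indicators
W1, W2 = rng.normal(size=(16, 8)), rng.normal(size=(4, 16))
for act in ("relu", "tanh"):
    print(act, actor(state, W1, W2, activation=act))
```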