• Title/Summary/Keyword: Reward rate


Scheduling Algorithms for the Maximal Total Revenue on a Single Processor with Starting Time Penalty

  • Joo, Un-Gi
    • Management Science and Financial Engineering
    • /
    • v.18 no.1
    • /
    • pp.13-20
    • /
    • 2012
  • This paper considers a revenue maximization problem on a single processor. Each job is characterized by its processing time, initial reward, reward decreasing rate, and preferred start time. If the processor starts a job at time zero, the revenue of the job equals its initial reward. The revenue then decreases linearly at the reward decreasing rate as the processing start time is delayed toward the preferred start time, and the revenue is zero if processing starts after the preferred time. Our objective is to find the optimal sequence that maximizes the total revenue. For this problem, we characterize properties of the optimal solution and prove NP-hardness. Based upon the characterization, we develop a branch-and-bound algorithm for the optimal sequence and suggest five heuristic algorithms for efficient solutions. Numerical tests show that the characterized properties are useful for building effective and efficient algorithms.
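
A minimal sketch (not the paper's algorithm) of the revenue model described in this abstract, assuming a job's revenue at start time t is max(initial reward - decrease rate * t, 0) up to its preferred start time and zero afterwards; all names and figures are hypothetical.

```python
# Sketch of the start-time-penalty revenue model and total revenue of a sequence.

def job_revenue(start, initial_reward, decrease_rate, preferred_start):
    """Revenue if the job starts processing at time `start`."""
    if start > preferred_start:          # started after the preferred time: no revenue
        return 0.0
    return max(initial_reward - decrease_rate * start, 0.0)

def total_revenue(sequence):
    """Total revenue of jobs processed back-to-back in the given order.
    Each job is (processing_time, initial_reward, decrease_rate, preferred_start)."""
    t, total = 0.0, 0.0
    for proc, r0, rate, pref in sequence:
        total += job_revenue(t, r0, rate, pref)
        t += proc
    return total

# Example: two jobs; the processing order changes the total revenue (10.0 vs. 14.0).
jobs = [(3.0, 10.0, 1.0, 8.0), (2.0, 6.0, 2.0, 4.0)]
print(total_revenue(jobs), total_revenue(list(reversed(jobs))))
```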

Hi Herzberg?: The Role of Compensation Factors and Suggestions for Performance Compensation System

  • Kim, Yoo-Gue;Yang, Woo-Ryeong;Kim, Ha-Ryong;Yang, Hoe-Chang
    • The Journal of Economics, Marketing and Management
    • /
    • v.5 no.1
    • /
    • pp.21-26
    • /
    • 2017
  • Purpose - This study extracts performance-reward factors based on previous studies related to Herzberg's two-factor theory and performance rewards, and proposes a research method to identify how these factors influence task performance, which is directly related to production performance, and contextual performance, which influences it indirectly. Research Design, Data, and Methodology - The study draws performance-reward factors through Focus Group Interviews (FGI), classifies them into economic/non-economic and direct/indirect factors, identifies factors to maintain, improve, or discard through IPA, and thereby maximizes the effectiveness of the performance-reward factors. Results - The study also identifies how performance-reward factors influence internal and external motives based on previous studies, classifies their effects on task performance and contextual performance, identifies the influence relationships between them, and proposes a research model to identify the role of equity sensitivity based on equity theory. Conclusion - The findings are expected to lay the groundwork for various methods of reducing employee turnover and to serve as important resources for reinforcing business competitiveness, by classifying the performance-reward factors that may produce internal and external motives from the perspective of small and medium-sized manufacturers and by presenting methods to identify whether these factors influence task performance and contextual performance.

Note on Fuzzy Random Renewal Process and Renewal Rewards Process

  • Hong, Dug-Hun
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.9 no.3
    • /
    • pp.219-223
    • /
    • 2009
  • Recently, Zhao et al. [Fuzzy Optimization and Decision Making (2007) 6, 279-295] characterized the interarrival times as fuzzy random variables and presented a fuzzy random elementary renewal theorem on the limit value of the expected renewal rate of the process in the fuzzy random renewal process. They also depicted both the interarrival times and the rewards as fuzzy random variables and provided a fuzzy random renewal reward theorem on the limit value of the long-run expected reward per unit time in the fuzzy random renewal reward process. In this note, we simplify the proofs of the two main results of that paper.
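
For reference, the classical (crisp) renewal reward theorem that the fuzzy random version above generalizes: with i.i.d. interarrival times and i.i.d. rewards of finite mean, the long-run reward per unit time converges to the ratio of the mean reward to the mean interarrival time.

```latex
% C(t): cumulative reward earned by time t; X_i: i.i.d. interarrival times;
% R_i: i.i.d. rewards, with E[X_1] < \infty and E[|R_1|] < \infty.
\[
  \lim_{t \to \infty} \frac{C(t)}{t}
  = \lim_{t \to \infty} \frac{\mathbb{E}[C(t)]}{t}
  = \frac{\mathbb{E}[R_1]}{\mathbb{E}[X_1]}
\]
```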

Stochastic Petri Nets Modeling Methods of Channel Allocation in Wireless Networks

  • Ro, Cheul-Woo;Kim, Kyung-Min
    • International Journal of Contents
    • /
    • v.4 no.3
    • /
    • pp.20-28
    • /
    • 2008
  • To obtain realistic performance measures for wireless networks, one should consider changes in performance due to failure-related behavior. In performability analysis, simultaneous consideration is given to both pure performance and performance-with-failure measures. A stochastic reward net (SRN) is an extension of stochastic Petri nets and provides compact modeling facilities for system analysis. In this paper, a new methodology to model and analyze performability based on SRNs is presented. Composite performance and availability SRN models for wireless handoff schemes are developed, and these models are then decomposed hierarchically. The SRN models can yield measures of interest such as blocking and dropping probabilities, expressed in terms of the expected values of reward rate functions for SRNs. Numerical results show the accuracy of the hierarchical model. The key contributions of this paper are Petri net modeling techniques that replace complicated numerical analysis of Markov chains and an easy way of analyzing channel-allocation performance using SRN reward concepts.
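
A minimal sketch of the reward-rate idea behind SRN measures, not the paper's handoff model: a reward rate is attached to each marking of the underlying continuous-time Markov chain, and the steady-state measure is the expected reward rate under the stationary distribution. The generator matrix and reward vector below are hypothetical; NumPy is assumed.

```python
import numpy as np

def expected_reward_rate(Q, reward):
    """Expected steady-state reward rate of a CTMC with generator Q (rows sum to 0)."""
    n = Q.shape[0]
    # Solve pi Q = 0 together with the normalization sum(pi) = 1.
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(pi @ reward)

# Hypothetical 3-marking example (0, 1, 2 channels busy); reward 1 marks the
# "all channels busy" marking, so the measure is a blocking probability.
Q = np.array([[-2.0,  2.0,  0.0],
              [ 1.0, -3.0,  2.0],
              [ 0.0,  2.0, -2.0]])
print(expected_reward_rate(Q, np.array([0.0, 0.0, 1.0])))
```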

Artificial Brain for Robots

  • Lee, Kyoo-Bin;Kwon, Dong-Soo
    • The Journal of Korea Robotics Society
    • /
    • v.1 no.2
    • /
    • pp.163-171
    • /
    • 2006
  • This paper introduces the research progress on the artificial brain in the Telerobotics and Control Laboratory at KAIST. This series of studies is based on the assumption that it will be possible to develop artificial intelligence by copying the mechanisms of the animal brain. Two important brain mechanisms are considered: spike-timing-dependent plasticity and dopaminergic plasticity. Each mechanism is implemented in two coding paradigms: spike codes and rate codes. Spike-timing-dependent plasticity is essential for self-organization in the brain. Dopamine neurons deliver reward signals and modify synaptic efficacies in order to maximize the predicted reward. This paper addresses how artificial intelligence can emerge from the synergy between self-organization and reinforcement learning. For implementation, rate-coded versions of the brain mechanisms are developed to compute the neuron dynamics efficiently.
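
A toy sketch, not the authors' implementation, of reward-modulated plasticity in the rate-code paradigm: the Hebbian weight change (pre-rate times post-rate) is gated by a dopamine-like reward prediction error. All names, learning rates, and the reward signal are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=4)      # synaptic efficacies
value_estimate = 0.0                  # running estimate of expected reward
eta, alpha = 0.05, 0.1                # learning rates (hypothetical values)

for step in range(1000):
    pre = rng.random(4)                       # presynaptic firing rates in [0, 1]
    post = 1.0 / (1.0 + np.exp(-(w @ pre)))   # postsynaptic rate (sigmoid of drive)
    reward = float(pre[0] > 0.5) * post       # toy reward signal
    dopamine = reward - value_estimate        # reward prediction error
    w += eta * dopamine * post * pre          # Hebbian term gated by the dopamine signal
    value_estimate += alpha * dopamine        # update the reward prediction

print(w)
```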


A Quantitative Analysis Theory for Reliability of Software

  • Cho, Yong-Soon;Youn, Hyun-Sang;Lee, Eun-Seok
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.7
    • /
    • pp.500-504
    • /
    • 2009
  • Software reliability is a type of nonfunctional requirement. Traditionally, validation of reliability is performed at the integration phase of the software development life cycle; however, this increases development cost and risk. In this paper, we propose a reliability analysis method based on a mathematical analytic model applied at the architecture design phase, as follows. First, we propose a software modeling methodology for reliability analysis using Hierarchical Combined Queueing Petri Nets (HQPN). Second, we derive a Markov Reward Model from the HQPN-based model. We apply our approach to a video conference system to verify its usefulness. Our approach supports quantitative evaluation of reliability.
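
A minimal numerical sketch, not the paper's HQPN-derived model, of how a Markov Reward Model yields a reliability-style measure: reward 1 is attached to operational states, and the expected instantaneous reward at time t is the probability that the system is up at t. NumPy and SciPy are assumed; all transition rates are hypothetical.

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[-0.01,  0.01,  0.00],   # up -> degraded (hypothetical rates)
              [ 0.50, -0.52,  0.02],   # degraded -> up (repair) or failed
              [ 0.00,  1.00, -1.00]])  # failed -> degraded (repair)
reward = np.array([1.0, 1.0, 0.0])     # up and degraded both count as operational
p0 = np.array([1.0, 0.0, 0.0])         # start in the fully working state

for t in (1.0, 10.0, 100.0):
    pt = p0 @ expm(Q * t)              # state distribution at time t
    print(t, pt @ reward)              # expected reward = availability at time t
```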

Bitcoin Mining Profitability Model and Analysis

  • Lee, Jinwoo;Cho, Kookrae;Yum, Dae Hyun
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.28 no.2
    • /
    • pp.303-310
    • /
    • 2018
  • Bitcoin (BTC) is a cryptocurrency proposed by Satoshi Nakamoto in 2009. Bitcoin processes its transactions without a central authority. This decentralization is accomplished through mining, an operation in which participants compete to solve mathematical puzzles in order to include new transactions in a block and, eventually, in Bitcoin's block chain (ledger). Because miners need to solve complex puzzles, they need substantial computing resources. In return, the Bitcoin network rewards miners with newly minted bitcoins when they succeed in mining a block. To prevent inflation, the reward is halved every four years; for example, the block reward was 50 BTC in 2009, whereas today it is 12.5 BTC. Meanwhile, the exchange rate between bitcoin and the Korean won (KRW) changed drastically from 924,000 KRW/BTC (January 12, 2017) to 16,103,306 KRW/BTC (December 10, 2017), which made mining more attractive. However, there has been little rigorous research on the profitability of bitcoin mining. In this paper, we evaluate the profitability of bitcoin mining.
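
A back-of-the-envelope sketch, not the paper's model, of how block reward, exchange rate, and hash-rate share enter a profitability estimate: expected revenue is the miner's share of the network hash rate times the blocks mined per day times the reward, and the dominant cost is electricity. All figures are hypothetical.

```python
def daily_profit_krw(miner_hashrate, network_hashrate, block_reward_btc,
                     btc_to_krw, power_watts, krw_per_kwh):
    blocks_per_day = 24 * 6                      # roughly one block every 10 minutes
    share = miner_hashrate / network_hashrate    # expected fraction of blocks won
    revenue = share * blocks_per_day * block_reward_btc * btc_to_krw
    cost = power_watts / 1000.0 * 24 * krw_per_kwh
    return revenue - cost

# Hypothetical figures: 14 TH/s rig, 25 EH/s network, 12.5 BTC reward, 1350 W, 100 KRW/kWh.
print(daily_profit_krw(14e12, 25e18, 12.5, 16_103_306, 1350, 100))
```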

A study on the structure and corrosion characteristics of polyethylene terephthalate and polyvinyl chloride

  • Chilnam Choe;Hyo
    • Proceedings of the Korean Environmental Sciences Society Conference
    • /
    • 1997.10a
    • /
    • pp.58-58
    • /
    • 1997
  • The corrosion rate of the polymers polyethylene terephthalate and polyvinyl chloride was characterized under various conditions by the potentiostat/galvanostat method. The cell and working electrode used for this study were specially prepared. The potential was scanned forward from -2 V to 3 V and in reverse from 3 V to -2 V, at 50 mV/s (R: auto-compensation).


Deep Reinforcement Learning of a Ball-Throwing Robot's Policy Prediction

  • Kang, Yeong-Gyun;Lee, Cheol-Soo
    • The Journal of Korea Robotics Society
    • /
    • v.15 no.4
    • /
    • pp.398-403
    • /
    • 2020
  • A robot's throwing control is difficult to calculate accurately because of air resistance, rotational inertia, and other factors. This complexity can be addressed with machine learning. Reinforcement learning using a reward function limits a robot's ability to adapt to new environments. Therefore, this paper applies deep reinforcement learning using a neural network without a reward function. Each throw is evaluated as a success or a failure. The network learns by taking the target position and control policy as input and yielding this evaluation as output. The task is then carried out by predicting the success probability for a given target location and control policy and searching for the policy with the highest predicted probability. Repeating this task improves performance as data accumulates. The model can even make predictions for tasks that were not previously attempted, which means it is a universally applicable learning model for any new environment. According to the results of 520 experiments, this learning model achieves a 75% success rate.
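
A minimal sketch of the idea described above, not the authors' network: a model maps (target position, policy parameter) to a success probability, is trained on accumulated throw outcomes, and the robot then selects the policy with the highest predicted probability for a requested target. The ground-truth function, features, and learning rate are hypothetical; NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)

def true_success(target, policy):
    # Hypothetical environment, used only to generate training outcomes.
    return rng.random() < np.exp(-abs(target - 2.0 * policy))

def features(target, policy):
    return np.array([1.0, target, policy, target * policy, policy ** 2])

# Logistic "network" trained on accumulated (target, policy, outcome) data.
w = np.zeros(5)
for _ in range(5000):
    t, p = rng.uniform(0, 5), rng.uniform(0, 5)
    x, y = features(t, p), float(true_success(t, p))
    pred = 1.0 / (1.0 + np.exp(-w @ x))
    w += 0.05 * (y - pred) * x                 # gradient step on the log-loss

def best_policy(target, candidates=np.linspace(0, 5, 501)):
    probs = [1.0 / (1.0 + np.exp(-w @ features(target, c))) for c in candidates]
    return candidates[int(np.argmax(probs))]

print(best_policy(3.0))   # policy predicted most likely to succeed for target 3.0
```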

A Study on the Portfolio Performance Evaluation using Actor-Critic Reinforcement Learning Algorithms

  • Lee, Woo Sik
    • Journal of the Korean Society of Industry Convergence
    • /
    • v.25 no.3
    • /
    • pp.467-476
    • /
    • 2022
  • The Bank of Korea raised its benchmark interest rate by a quarter of a percentage point to 1.75 percent per year, and analysts predict that South Korea's policy rate will reach 2.00 percent by the end of calendar year 2022. Furthermore, because market volatility has increased significantly due to a variety of factors, including rising rates and inflation, many investors have struggled to meet their financial objectives or deliver returns. In this situation, banks and financial institutions are attempting to provide robo-advisors that manage client portfolios without human intervention. In this regard, determining the best hyper-parameter combination is becoming increasingly important. This study compares several activation functions of the Deep Deterministic Policy Gradient (DDPG) and Twin-Delayed Deep Deterministic Policy Gradient (TD3) algorithms, which choose a sequence of actions that maximizes long-term reward. According to the results, DDPG and TD3 outperformed their benchmark index. One reason for this is that the action probabilities must be understood in order to choose an action and receive a reward, which is then compared to the state value to determine an advantage. As interest in machine learning has grown and research into deep reinforcement learning has become more active, finding an optimal hyper-parameter combination for DDPG and TD3 has become increasingly important.
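
A minimal sketch, not the paper's implementation, of the TD3 critic target that distinguishes TD3 from DDPG: target policy smoothing noise is added to the target actor's action, and the minimum of two target critics is used to curb overestimation. The stand-in actor and critics below are hypothetical placeholders; NumPy is assumed.

```python
import numpy as np

def td3_target(reward, next_state, done, actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0):
    """Compute the clipped double-Q target y = r + gamma * (1 - done) * min(Q1', Q2')."""
    action = actor_target(next_state)
    noise = np.clip(np.random.normal(0.0, noise_std, size=action.shape),
                    -noise_clip, noise_clip)                 # target policy smoothing
    next_action = np.clip(action + noise, -act_limit, act_limit)
    next_q = min(critic1_target(next_state, next_action),
                 critic2_target(next_state, next_action))    # twin-critic minimum
    return reward + gamma * (1.0 - done) * next_q

# Hypothetical stand-ins for the target networks (actions could be portfolio weights).
actor = lambda s: np.tanh(s[:3])
q1 = lambda s, a: float(-np.sum((a - 0.1) ** 2))
q2 = lambda s, a: float(-np.sum((a + 0.1) ** 2))
print(td3_target(0.5, np.array([0.2, -0.4, 0.1, 0.3]), 0.0, actor, q1, q2))
```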