• Title/Abstract/Keyword: reward

Search results: 1,126 items (processing time: 0.029 s)

A Model of System Design for Rewarding Researchers' Performance on R&D Activities (R&D 활동에서 연구자의 성과보상을 위한 시스템설계모형)

  • 박준호;김점복;권철신
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • Proceedings of the 1998 Fall Conference of the Korean Operations and Management Science Society
    • /
    • pp.111-113
    • /
    • 1998
  • In this paper, we deal with a model to reward researchers' performance. Rewards that disregard researchers' preferences do not satisfy them and cause only conflicts. In order to increase research productivity by resolving these conflicts, we design a new model of a performance rewarding system. For this purpose, we investigate researchers' preference structure for rewards using conjoint analysis, and on the basis of this investigation we propose reasonable and practical programs for rewarding performance.

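A minimal sketch of how conjoint analysis can estimate part-worth utilities for reward attributes, as mentioned in the abstract; the attribute names, levels, and ratings below are hypothetical illustrations, not the paper's data or design.

```python
# Minimal conjoint-analysis sketch: estimate part-worth utilities for
# hypothetical reward attributes via dummy-coded linear regression.
# Attribute names, levels, and ratings are illustrative only.
import numpy as np

# Each profile: (reward type, payout timing), rated 1-9 by a researcher.
profiles = [
    ("bonus", "immediate", 8),
    ("bonus", "deferred", 6),
    ("promotion", "immediate", 7),
    ("promotion", "deferred", 5),
    ("sabbatical", "immediate", 4),
    ("sabbatical", "deferred", 2),
]

# Dummy-code the two attributes (one reference level each) plus an intercept.
X = np.array([
    [1,
     1 if rtype == "promotion" else 0,
     1 if rtype == "sabbatical" else 0,
     1 if timing == "deferred" else 0]
    for rtype, timing, _ in profiles
], dtype=float)
y = np.array([rating for *_, rating in profiles], dtype=float)

# Ordinary least squares gives the part-worth estimates.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["intercept", "promotion", "sabbatical", "deferred"], coef.round(2))))
```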

A Study on the Converged Difficulty Rewards of Action Game (액션게임의 융합적 난이도보상에 관한 연구)

  • Li, Xin Yu;Cho, Dong Min
    • Journal of Korea Multimedia Society
    • /
    • Vol. 24, No. 7
    • /
    • pp.933-941
    • /
    • 2021
  • When people enjoy something, they want to repeat it and focus on actions that are rewarded. The same applies to games. The purpose of this study is to help players keep enjoying a game over repeated play. Based on the artificial rewards of games, two reward methods are studied, and converged difficulty-reward elements of action games are identified through confirmatory factor analysis in AMOS.
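The abstract reports a confirmatory factor analysis run in AMOS; below is a rough Python analogue using the semopy package. The factor and indicator names are hypothetical placeholders, not the constructs measured in the paper.

```python
# Rough CFA sketch (the paper used AMOS; semopy is used here as a stand-in).
# Factor and indicator names are hypothetical, not the paper's constructs.
import pandas as pd
import semopy

# Measurement model: two latent reward factors, three survey items each.
desc = """
DifficultyReward =~ d1 + d2 + d3
ArtificialReward =~ a1 + a2 + a3
DifficultyReward ~~ ArtificialReward
"""

data = pd.read_csv("survey_items.csv")   # hypothetical item-level responses
model = semopy.Model(desc)
model.fit(data)
print(model.inspect())                   # loadings, covariances, p-values
print(semopy.calc_stats(model))          # fit indices such as CFI and RMSEA
```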

Generating Cooperative Behavior by Multi-Agent Profit Sharing on the Soccer Game

  • Miyazaki, Kazuteru;Terada, Takashi;Kobayashi, Hiroaki
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • ISIS 2003, Korea Fuzzy Logic and Intelligent Systems Society
    • /
    • pp.166-169
    • /
    • 2003
  • Reinforcement learning is a kind of machine learning. It aims to adapt an agent to a given environment using rewards and penalties as cues. Q-learning [8], a representative reinforcement learning system, treats a reward and a penalty at the same time, which raises the problem of how to decide appropriate reward and penalty values. The Penalty Avoiding Rational Policy Making algorithm (PARP) [4] and the Penalty Avoiding Profit Sharing (PAPS) [2] are reinforcement learning systems that treat a reward and a penalty independently. Though PAPS is a descendant algorithm of PARP, both PARP and PAPS tend to learn a locally optimal policy. To overcome this, in this paper we propose the Multi Best method (MB), which is PAPS combined with the multi-start method [5]. MB selects the best policy among several policies learned by PAPS agents. By applying PS, PAPS, and MB to a soccer game environment based on SoccerBots [9], we show that MB is the best solution for this environment.

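A simplified sketch of the multi-start idea described above: several profit-sharing learners are trained independently and the policy that evaluates best is kept. The environment interface, credit function, and evaluation routine are placeholders, not the SoccerBots setup or the penalty-avoiding rules used in the paper.

```python
# Simplified multi-start profit-sharing sketch. The environment and the
# evaluation function are placeholders, not the paper's SoccerBots setup.
import random
from collections import defaultdict

def profit_sharing_run(env, episodes=200, decay=0.5):
    """Train one learner: on each rewarded episode, reinforce every
    (state, action) pair on the trace with geometrically decaying credit."""
    weights = defaultdict(float)
    for _ in range(episodes):
        state, trace, done = env.reset(), [], False
        while not done:
            actions = env.actions(state)
            # epsilon-greedy selection over current rule weights
            if random.random() < 0.1:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: weights[(state, a)])
            trace.append((state, action))
            state, reward, done = env.step(action)
        if reward > 0:  # credit assignment back along the episode
            credit = reward
            for sa in reversed(trace):
                weights[sa] += credit
                credit *= decay
    return weights

def multi_best(env, evaluate, n_starts=5):
    """Multi-start wrapper: keep the learned policy that evaluates best."""
    policies = [profit_sharing_run(env) for _ in range(n_starts)]
    return max(policies, key=evaluate)
```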

RENEWAL AND RENEWAL REWARD THEORIES FOR T-INDEPENDENT FUZZY RANDOM VARIABLES

  • KIM, JAE DUCK;HONG, DUG HUN
    • Journal of Applied Mathematics & Informatics
    • /
    • Vol. 33, No. 5-6
    • /
    • pp.607-625
    • /
    • 2015
  • Recently, Wang et al. [Computers and Mathematics with Applications 57 (2009) 1232-1248] and Wang and Watada [Information Sciences 179 (2009) 4057-4069] studied the renewal process and the renewal reward process with fuzzy random inter-arrival times and rewards under the T-independence associated with any continuous Archimedean t-norm. However, their main results do not cover the classical random elementary renewal theorem and random renewal reward theorem when the fuzzy random variables degenerate to random variables, and some of the given assumptions, relating to the membership function of the fuzzy variable and the Archimedean t-norm, are restrictive. This paper improves the results of Wang and Watada and Wang et al. from a mathematical perspective. We relax some assumptions of their results and completely generalize the classical stochastic renewal theorem and renewal reward theorem.
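For reference, the classical results that these fuzzy random theorems generalize: for i.i.d. interarrival times $X_i$ and rewards $R_i$ with finite positive means, where $N(t)$ is the number of renewals by time $t$ and $R(t)=\sum_{i=1}^{N(t)} R_i$ is the accumulated reward, the elementary renewal theorem and the renewal reward theorem state

$$
\lim_{t\to\infty}\frac{\mathbb{E}[N(t)]}{t}=\frac{1}{\mathbb{E}[X_1]},
\qquad
\lim_{t\to\infty}\frac{\mathbb{E}[R(t)]}{t}=\frac{\mathbb{E}[R_1]}{\mathbb{E}[X_1]}.
$$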

Dopamine signaling in food addiction: role of dopamine D2 receptors

  • Baik, Ja-Hyun
    • BMB Reports
    • /
    • Vol. 46, No. 11
    • /
    • pp.519-526
    • /
    • 2013
  • Dopamine (DA) regulates emotional and motivational behavior through the mesolimbic dopaminergic pathway. Changes in DA signaling in mesolimbic neurotransmission are widely believed to modify reward-related behaviors and are therefore closely associated with drug addiction. Recent evidence now suggests that, as with drug addiction, obesity with compulsive eating behaviors involves the reward circuitry of the brain, particularly the circuitry involving dopaminergic neural substrates. Increasing amounts of data from human imaging studies, together with genetic analysis, have demonstrated that obese people and drug addicts tend to show altered expression of DA D2 receptors in specific brain areas, and that similar brain areas are activated by food-related and drug-related cues. This review focuses on the functions of the DA system, with particular emphasis on the physiological interpretation and the role of DA D2 receptor signaling in food addiction.

Scheduling Algorithms for the Maximal Total Revenue on a Single Processor with Starting Time Penalty

  • Joo, Un-Gi
    • Management Science and Financial Engineering
    • /
    • Vol. 18, No. 1
    • /
    • pp.13-20
    • /
    • 2012
  • This paper considers a revenue maximization problem on a single processor. Each job is characterized by its processing time, initial reward, reward decreasing rate, and preferred start time. If the processor starts a job at time zero, the revenue of the job is its initial reward; otherwise the revenue decreases linearly at the reward decreasing rate with the processing start time, up to the preferred start time, and the revenue is zero if processing starts after the preferred time. Our objective is to find the optimal sequence that maximizes the total revenue. For this problem, we characterize properties of the optimal solution and prove NP-hardness. Based on the characterization, we develop a branch-and-bound algorithm for the optimal sequence and suggest five heuristic algorithms for efficient solutions. Numerical tests show that the characterized properties are useful for effective and efficient algorithms.
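A small sketch of the revenue model under one reading of the abstract: job $i$ earns its initial reward $w_i$ if started at time 0, loses $k_i$ per unit of start-time delay, and earns nothing once its start reaches its preferred start time $d_i$. The symbols and the greedy ordering below are illustrative assumptions, not the paper's branch-and-bound or its five heuristics.

```python
# Illustrative sketch of the revenue model: revenue(i, s) = max(w_i - k_i * s, 0)
# for start time s < d_i, and 0 once s >= d_i. The greedy rule is a simple
# ratio heuristic for comparison, not one of the paper's algorithms.
from dataclasses import dataclass

@dataclass
class Job:
    p: float  # processing time
    w: float  # initial reward at start time 0
    k: float  # reward decreasing rate per unit of delay
    d: float  # preferred start time (revenue is 0 at or after this)

def revenue(job, start):
    if start >= job.d:
        return 0.0
    return max(job.w - job.k * start, 0.0)

def total_revenue(sequence):
    t, total = 0.0, 0.0
    for job in sequence:
        total += revenue(job, t)
        t += job.p
    return total

def greedy_sequence(jobs):
    # Schedule jobs with the steepest revenue loss per unit of processing
    # time first (a ratio rule in the spirit of WSPT).
    return sorted(jobs, key=lambda j: j.k / j.p, reverse=True)

jobs = [Job(3, 30, 4, 8), Job(2, 20, 5, 6), Job(4, 40, 3, 12)]
print(total_revenue(greedy_sequence(jobs)))
```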

Note on Fuzzy Random Renewal Process and Renewal Rewards Process

  • Hong, Dug-Hun
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 9, No. 3
    • /
    • pp.219-223
    • /
    • 2009
  • Recently, Zhao et al. [Fuzzy Optimization and Decision Making (2007) 6, 279-295] characterized the interarrival times as fuzzy random variables and presented a fuzzy random elementary renewal theorem on the limit value of the expected renewal rate of the process in the fuzzy random renewal process. They also characterized both the interarrival times and rewards as fuzzy random variables and provided a fuzzy random renewal reward theorem on the limit value of the long-run expected reward per unit time in the fuzzy random renewal reward process. In this note, we simplify the proofs of the two main results of that paper.

Design and Implementation of a Behavior-Based Control and Learning Architecture for Mobile Robots (이동 로봇을 위한 행위 기반 제어 및 학습 구조의 설계와 구현)

  • 서일홍;이상훈;김봉오
    • Journal of Institute of Control, Robotics and Systems
    • /
    • Vol. 9, No. 7
    • /
    • pp.527-535
    • /
    • 2003
  • A behavior-based control and learning architecture is proposed in which reinforcement learning is applied to learn proper associations between stimulus and response using two types of memory, Short-Term Memory and Long-Term Memory. In particular, to solve the delayed-reward problem, a knowledge-propagation (KP) method is proposed in which well-designed or well-trained S-R (stimulus-response) associations for low-level sensors are used to learn new S-R associations for high-level sensors, in cases where those S-R associations serve the same objective, such as obstacle avoidance. To show the validity of the proposed KP method, comparative experiments are performed for the cases in which (i) only a delayed reward is used, (ii) some of the S-R pairs are preprogrammed, (iii) an immediate reward is available, and (iv) the proposed KP method is applied.
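A rough illustration of the knowledge-propagation idea as described in the abstract: S-R associations already learned for low-level sensors seed the associations over high-level sensors that serve the same objective, and supply an immediate pseudo-reward in place of the delayed external reward. The table layout, state mapping, and update rule are assumptions made for illustration, not the paper's design.

```python
# Illustrative sketch of knowledge propagation (KP): a trained low-level
# stimulus-response table seeds learning of a high-level table with the same
# objective (e.g. obstacle avoidance). The state mapping and update rule are
# assumptions, not the architecture proposed in the paper.
from collections import defaultdict

def propagate(low_table, project, high_states, actions):
    """Initialize high-level S-R weights from the low-level table.
    `project(hs)` maps a high-level state to the low-level state it refines."""
    high_table = defaultdict(float)
    for hs in high_states:
        for a in actions:
            high_table[(hs, a)] = low_table.get((project(hs), a), 0.0)
    return high_table

def update(high_table, low_table, project, hs, a, alpha=0.1):
    """Use the low-level association as an immediate pseudo-reward signal
    instead of waiting for a delayed external reward."""
    pseudo_reward = low_table.get((project(hs), a), 0.0)
    high_table[(hs, a)] += alpha * (pseudo_reward - high_table[(hs, a)])
```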

The Effect of Online Community's Interactivity, Reward, Commitment and Loyalty on Purchase Intention in Portal Sites (포털사이트에서 온라인 커뮤니티의 상호작용성, 보상, 몰입과 충성도가 구매의도에 미치는 영향)

  • Ahn, Tae-Youn;Kim, Jong-Uk
    • Journal of Information Technology Services
    • /
    • Vol. 5, No. 3
    • /
    • pp.25-43
    • /
    • 2006
  • Based on relevant theories, this research studied the effects of online communities' interactivity, reward, commitment, and loyalty on purchase intention in portal sites. Data were collected from users who had purchase experience in portal sites. An empirical analysis of the hypothesized structural equation model was performed using SPSS 10.0 and PLS Graph 3.0. As a result, the interactivity of communities was found to significantly affect commitment and loyalty, and the reward of the community was shown to significantly influence commitment, but not loyalty. The commitment and loyalty of the community were shown to have strong effects on purchase intention. Finally, trust in portal sites was found to have an interaction effect on purchase intention.

Orienteering Problem with Unknown Stochastic Reward to Informative Path Planning for Persistent Monitoring and Its Solution (지속정찰 임무의 경로계획을 위한 불확실 기댓값 오리엔티어링 문제와 해법)

  • Kim, Dooyoung
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • Vol. 22, No. 5
    • /
    • pp.667-673
    • /
    • 2019
  • We present an orienteering problem with unknown stochastic reward (OPUSR) model for persistent monitoring tasks with unknown event probabilities at each point of interest. Prior studies on the orienteering problem for persistent monitoring assume that rewards and event probabilities are known a priori. In this paper, we propose a stochastic reward model with unknown event statistics and a path re-planning algorithm based on Bayesian reward inference. Experiments demonstrate the efficiency of our method.
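A minimal sketch of Bayesian reward inference for points of interest with unknown event probabilities, using a Beta-Bernoulli model whose posterior mean feeds a re-planning step. The prior, the scoring, and the planner interface are illustrative assumptions, not the exact algorithm proposed in the paper.

```python
# Minimal Beta-Bernoulli sketch: infer unknown event probabilities at points
# of interest from observations and re-plan with posterior-mean rewards.
# The prior and planner interface are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class POI:
    name: str
    alpha: float = 1.0  # Beta pseudo-count: visits where an event was observed
    beta: float = 1.0   # Beta pseudo-count: visits with no event

    def observe(self, event_detected: bool):
        if event_detected:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def expected_reward(self) -> float:
        # Posterior mean of the event probability serves as the POI reward.
        return self.alpha / (self.alpha + self.beta)

def replan(pois, budget, plan_orienteering_path):
    """Re-plan the monitoring path using current posterior-mean rewards.
    `plan_orienteering_path` is any orienteering solver taking (rewards, budget)."""
    rewards = {p.name: p.expected_reward() for p in pois}
    return plan_orienteering_path(rewards, budget)
```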