• Title/Summary/Keyword: Game Optimal

A Naive Bayesian-based Model of the Opponent's Policy for Efficient Multiagent Reinforcement Learning (효율적인 멀티 에이전트 강화 학습을 위한 나이브 베이지만 기반 상대 정책 모델)

  • Kwon, Ki-Duk
    • Journal of Internet Computing and Services
    • /
    • v.9 no.6
    • /
    • pp.165-177
    • /
    • 2008
  • An important issue in multiagent reinforcement learning is how an agent should learn its optimal policy in a dynamic environment where other agents can influence its performance. Most previous work on multiagent reinforcement learning either applies single-agent reinforcement learning techniques without any extension or, even when it uses explicit models of other agents, requires unrealistic assumptions. This paper introduces a Naive Bayesian policy model of the opponent agent and then explains a multiagent reinforcement learning method that uses this model. Unlike previous work, the proposed method models the opponent's policy with a Naive Bayesian model rather than modeling the opponent's Q function. Moreover, it can improve learning efficiency because this model is simpler than richer but more time-consuming policy models such as finite state machines (FSMs) and Markov chains. The Cat and Mouse game is introduced as an adversarial multiagent environment, and the effectiveness of the proposed Naive Bayesian policy model is analyzed through experiments using this game as a test bed.
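
The idea of modeling an opponent's policy with a Naive Bayes classifier can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `NaiveBayesOpponentModel`, the feature encoding, and the Laplace smoothing constant `alpha` are all assumptions made for illustration. The model counts how often each observed opponent action co-occurs with each state feature and predicts P(action | features) under the usual conditional-independence assumption.

```python
import math


class NaiveBayesOpponentModel:
    """Hypothetical sketch: estimates P(opponent action | state features)."""

    def __init__(self, actions, alpha=1.0):
        self.actions = list(actions)
        self.alpha = alpha          # Laplace smoothing constant (assumed)
        self.action_counts = {}     # action -> number of observations
        self.feature_counts = {}    # (feature index, value, action) -> count
        self.feature_values = {}    # feature index -> set of values seen

    def observe(self, features, action):
        """Record one observed (state features, opponent action) pair."""
        self.action_counts[action] = self.action_counts.get(action, 0) + 1
        for i, value in enumerate(features):
            key = (i, value, action)
            self.feature_counts[key] = self.feature_counts.get(key, 0) + 1
            self.feature_values.setdefault(i, set()).add(value)

    def predict(self, features):
        """Return a dict mapping each action to its estimated probability."""
        total = sum(self.action_counts.values())
        log_scores = {}
        for action in self.actions:
            n_a = self.action_counts.get(action, 0)
            # Smoothed prior P(action)
            log_p = math.log((n_a + self.alpha)
                             / (total + self.alpha * len(self.actions)))
            for i, value in enumerate(features):
                seen = self.feature_values.get(i, set()) | {value}
                count = self.feature_counts.get((i, value, action), 0)
                # Smoothed likelihood P(feature_i = value | action)
                log_p += math.log((count + self.alpha)
                                  / (n_a + self.alpha * len(seen)))
            log_scores[action] = log_p
        # Normalize in a numerically stable way
        top = max(log_scores.values())
        unnorm = {a: math.exp(s - top) for a, s in log_scores.items()}
        z = sum(unnorm.values())
        return {a: u / z for a, u in unnorm.items()}


# Toy usage with made-up Cat and Mouse style features
model = NaiveBayesOpponentModel(["left", "right"])
for _ in range(5):
    model.observe(("cat_right", "near"), "left")
    model.observe(("cat_left", "near"), "right")
prediction = model.predict(("cat_right", "near"))
```

A learning agent could feed such a prediction into its own action selection, for example by weighting its action values by the predicted opponent behavior; how the paper couples the two is not stated in the abstract.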

An Alternative Approach for Environmental Education to overcome free rider egoism based on the Perspectives of Prisoner's Dilemma Situation (죄수딜렘마(PD) 게임상황을 활용한 환경교육의 가능성)

  • 김태경
    • Hwankyungkyoyuk
    • /
    • v.13 no.2
    • /
    • pp.38-50
    • /
    • 2000
  • We are evidently Homo economicus, egoistic rational utility maximizers; the capitalist economy leads us to adapt to such a life and to regard acting this way as rational. This can be demonstrated with the Prisoner's Dilemma (PD), in which the rational selection process for public goods always leads to the non-cooperative, free-rider choice. This paper asks what the problem is. The problem lies not in the free rider itself but in free-rider egoism, whose practical behavior can be explained by the Prisoner's Dilemma. In a PD situation, the prisoner makes the rational, non-cooperative choice but does not arrive at Pareto optimality. This is the dilemma. Why can he not arrive there? Because he is isolated from the other prisoner; hence the name prisoner's dilemma. The PD situation can be compared with our real economic life, which we believe is sustained by rational choices about public goods. We actually live as individuals even though we have organized capitalist communities. Of course, we know each other as members of the same society, but individuals cannot secure the belief that forms the basis of a community. The PD situation and our real economic life are therefore very similar with respect to the production of public goods. We conclude that the non-cooperative process of the PD situation can be used as an instrument of environmental education (EE), and that it shows the effectiveness of EE as follows. (1) Game situations like PD can be a good instrument for explaining the dilemma (error) of rational selection to Homo economicus, the rational agent, in optimal and rational language. (2) We can show that the result of the selection is a dilemma that does not reach Pareto optimality. (3) The dilemma can be resolved by achieving a good communal life based on belief, not on isolation.
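
The claim that the rational choice in a PD is non-cooperative yet not Pareto-optimal can be checked on a small payoff table. The sketch below uses hypothetical payoffs (negated prison years, so higher is better) that are not taken from the paper; the function names are likewise illustrative.

```python
C, D = "cooperate", "defect"

# Hypothetical payoffs (row player, column player); higher is better.
PAYOFF = {
    (C, C): (-1, -1),
    (C, D): (-3, 0),
    (D, C): (0, -3),
    (D, D): (-2, -2),
}


def best_response(opponent_action):
    """Row player's payoff-maximizing action against a fixed opponent action."""
    return max((C, D), key=lambda a: PAYOFF[(a, opponent_action)][0])


def is_pareto_optimal(profile):
    """True if no other outcome helps one player without hurting the other."""
    current = PAYOFF[profile]
    for other in PAYOFF.values():
        no_one_worse = all(o >= c for o, c in zip(other, current))
        someone_better = any(o > c for o, c in zip(other, current))
        if no_one_worse and someone_better:
            return False
    return True


# Defection is the best response whatever the other prisoner does ...
dominant = {opp: best_response(opp) for opp in (C, D)}
# ... but mutual defection is Pareto-dominated by mutual cooperation.
defect_is_pareto = is_pareto_optimal((D, D))
cooperate_is_pareto = is_pareto_optimal((C, C))
```

With these payoffs, each isolated prisoner defects, and both end up worse off than under mutual cooperation, which is the dilemma the abstract describes.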

Architecture and Path-Finding Behavior of An Intelligent Agent Deploying within 3D Virtual Environment (3차원 가상환경에서 동작하는 지능형 에이전트의 구조와 경로 찾기 행위)

  • Kim, In-Cheol;Lee, Jae-Ho
    • The KIPS Transactions:PartB
    • /
    • v.10B no.1
    • /
    • pp.1-12
    • /
    • 2003
  • In this paper, we introduce the Unreal Tournament (UT) game and the Gamebots system. The former is a well-known 3D first-person action game, and the latter is an intelligent agent research testbed based on UT. We then explain the design and implementation of KGBot, an intelligent non-player character that operates effectively within the 3D virtual environment provided by UT and the Gamebots system. KGBot is a bot client within the Gamebots system. Its task is to find and dominate several domination points pre-located on the complex surface map of the 3D virtual environment. KGBot adopts UM-PRS, a general BDI agent architecture, as its control engine, and contains a hierarchical knowledge base representing its complex behaviors in multiple layers. We explain the details of KGBot's intelligent behaviors, such as locating the hidden domination points by exploring the unknown world effectively, constructing a path map by collecting the waypoints and paths distributed over the world, and finding an optimal path to a given destination based on this path graph. Finally, we analyze the performance of KGBot's exploration strategy and control engine through experiments on different 3D maps.
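
Finding an optimal path over a graph of collected waypoints can be sketched with Dijkstra's algorithm. The abstract does not say which search algorithm KGBot uses, so this is an assumed stand-in; the waypoint names and edge costs below are invented for illustration.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; graph[u][v] is the non-negative cost of edge u -> v.

    Returns (path, cost), or (None, inf) if the goal is unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    visited = set()
    while queue:
        d, u = heapq.heappop(queue)
        if u in visited:
            continue  # stale queue entry
        visited.add(u)
        if u == goal:
            break
        for v, cost in graph.get(u, {}).items():
            candidate = d + cost
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                prev[v] = u
                heapq.heappush(queue, (candidate, v))
    if goal not in dist:
        return None, float("inf")
    # Walk the predecessor links back from the goal
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[goal]


# Hypothetical waypoint graph gathered during exploration
waypoints = {
    "spawn": {"hall": 2.0, "bridge": 5.0},
    "hall": {"bridge": 1.0, "point_a": 6.0},
    "bridge": {"point_a": 2.0},
    "point_a": {},
}
path, cost = shortest_path(waypoints, "spawn", "point_a")
```

In a game map the edge costs would typically be distances between waypoints; an A* search with a distance heuristic is a common alternative when coordinates are available.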

Utilization Status of Internet and Dietary Information of School Children in Gyeongnam and Jeonbuk Areas (경남과 전북지역 초등학교 고학년생의 인터넷 및 식생활정보 이용실태)

  • 허은실;이경혜
    • Korean Journal of Community Nutrition
    • /
    • v.8 no.1
    • /
    • pp.15-25
    • /
    • 2003
  • This study investigated the use of the internet and of dietary information by gender (boys: 442, girls: 461) among school children (total 903). The results are summarized as follows. Most children used the internet regularly (98.1%), and the major purposes of use were 'game (39.0%)' and 'social intercourse (49.5%)'. The duration of internet use was '< 2 hours (80.9%)'. They used the internet mainly at 'home (88.8%)', and their favorite search engines were 'Yahoo (54.2%)' and 'Daum (31.1%)'. Only 35.6% of subjects had experience searching for dietary information, mainly 'for homework (39.6%)' and 'for health (36.9%)'. Satisfaction with the information found was 'high (79.5%)'. The reasons given for dissatisfaction with dietary information sites were 'bring little interest (28.9%)', 'difficult contents (19.2%)', and 'poor information (18.2%)'. Only 15% of subjects had experience of nutrition counseling over the internet, and the purpose of counseling was mainly 'for homework (51.4%)' and 'for health problem (24.3%)'. The problems reported with nutrition counseling sites were 'difficult answer content (31.7%)', 'insincere answer (28.6%)', and 'poor answer content (25.4%)'. The children acquired information on nutrition and health management mainly through the 'internet (43.7%)'. 'Growth and nutrition (28.3%)', 'improvement in studying ability (13.8%)', 'right weight control (13.3%)', and 'cooking (12.8%)' were the most frequently requested information. They preferred 'game (40.5%)', 'animation (29.9%)', and 'quiz (18.1%)' as learning tools. The favorite site color was 'green (51.3%)'. The results of this study showed that although internet use was very high, the children very seldom used the internet to search for dietary information. Therefore, information providers should determine what the optimal tool is and what kind of dietary information school children need.

Development of Brain-machine Interface for MindPong using Internet of Things (마인드 퐁 제어를 위한 사물인터넷을 이용하는 뇌-기계 인터페이스 개발)

  • Hoon-Hee Kim
    • Journal of Internet of Things and Convergence
    • /
    • v.9 no.6
    • /
    • pp.17-22
    • /
    • 2023
  • Brain-machine interfaces (BMIs) are interfaces that control machines by decoding brainwaves, which are electrical signals generated by neural activity. Although BMIs can be applied in various fields, their widespread use is hindered by the low portability of the hardware required for brainwave measurement and decoding. To address this issue, previous research proposed a brain-machine interface system based on the Internet of Things (IoT) using cloud computing. In this study, we developed and tested an application that uses brainwaves to control the Pong game, demonstrating the real-time usability of the system. The results showed that users of the proposed BMI achieved scores comparable to those of an optimal-control artificial intelligence in real-time Pong matches. Thus, this research suggests that IoT-based brain-machine interfaces can be used in a variety of real-time applications in everyday life.

Design and implementation of Robot Soccer Agent Based on Reinforcement Learning (강화 학습에 기초한 로봇 축구 에이전트의 설계 및 구현)

  • Kim, In-Cheol
    • The KIPS Transactions:PartB
    • /
    • v.9B no.2
    • /
    • pp.139-146
    • /
    • 2002
  • The robot soccer simulation game is a dynamic multi-agent environment. In this paper we propose a new reinforcement learning approach to each agent's dynamic positioning in such an environment. Reinforcement learning is a form of machine learning in which an agent learns, from indirect and delayed rewards, an optimal policy for choosing sequences of actions that produce the greatest cumulative reward. It therefore differs from supervised learning in that no input-output pairs are presented as training examples. Furthermore, model-free reinforcement learning algorithms such as Q-learning do not require defining or learning any model of the surrounding environment. Nevertheless, these algorithms can learn the optimal policy if the agent can visit every state-action pair infinitely often. However, the biggest problem of monolithic reinforcement learning is that its straightforward application does not scale up to more complex environments because the state space becomes intractably large. To address this problem, we propose Adaptive Mediation-based Modular Q-Learning (AMMQL) as an improvement on the existing Modular Q-Learning (MQL). While simple modular Q-learning combines the results from each learning module in a fixed way, AMMQL combines them more flexibly by assigning a different weight to each module according to its contribution to rewards. Therefore, in addition to handling the large state space effectively, AMMQL can adapt better to environmental changes than pure MQL. In this paper we use the AMMQL algorithm as the learning method for dynamic positioning of the robot soccer agent, and implement a robot soccer agent system called Cogitoniks.
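
The two ingredients mentioned above, the tabular Q-learning update and a weighted combination of per-module Q-values, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give AMMQL's weighting rule, so the module weights here are simply passed in, and the state and action names are invented.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def combine_modules(module_qs, module_states, weights, actions):
    """Choose the action with the highest weighted sum of module Q-values.

    Each module sees its own (smaller) state; the weights are assumed to be
    supplied by some mediation rule that is not specified here.
    """
    def score(action):
        return sum(w * q[(s, action)]
                   for q, s, w in zip(module_qs, module_states, weights))
    return max(actions, key=score)


actions = ["kick", "move"]

# Two updates of the same state-action pair with reward 1.0
q = defaultdict(float)
first = q_update(q, "s0", "kick", 1.0, "s1", actions)
second = q_update(q, "s0", "kick", 1.0, "s1", actions)

# Two hypothetical modules that disagree about the best action
ball_module = defaultdict(float, {("a0", "kick"): 1.0})
defense_module = defaultdict(float, {("b0", "move"): 2.0})
equal_choice = combine_modules([ball_module, defense_module],
                               ["a0", "b0"], [0.5, 0.5], actions)
skewed_choice = combine_modules([ball_module, defense_module],
                                ["a0", "b0"], [0.9, 0.1], actions)
```

With fixed equal weights the combined choice follows the module with the larger Q-values, while shifting weight toward one module changes the selected action, which is the kind of flexibility the abstract attributes to adaptive weighting.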

A Study of a Virtual Reality Interface of Person Search in Multimedia Database for the US Defense Industry (미국 방위산업체 상황실의 인물검색 활동을 돕는 가상현실 공간 인터페이스 환경에 관한 연구)

  • Kim, Na-Young;Lee, Chong-Ho
    • Journal of Korea Game Society
    • /
    • v.11 no.5
    • /
    • pp.67-78
    • /
    • 2011
  • This paper introduces an efficient and satisfactory search interface that enables users to browse and find the video data they want in the very large video databases widely used in various multimedia environments. The target user group is information analysts in the US defense industry or governmental intelligence agencies whose job is to identify a certain person in large amounts of video footage taken by CCTV (closed-circuit television) cameras. For the first user test, we proposed that a CAVE-like virtual reality interface would be optimal for the tasks we designed, so we compared this interface with a desktop interface. The software and database developed and optimized for each task were used in this user test. For the second user test, we investigated which input devices would be optimal for enhancing the efficiency of search tasks in the CAVE-like virtual reality system. In particular, we focused on measuring the effectiveness and user satisfaction of three different types of devices that embody a gestural input interface encouraging ergonomic control of the interface. We also measured the time taken to perform each task to find the most efficient input device among those tested.

Research on Optimal Deployment of Sonobuoy for Autonomous Aerial Vehicles Using Virtual Environment and DDPG Algorithm (가상환경과 DDPG 알고리즘을 이용한 자율 비행체의 소노부이 최적 배치 연구)

  • Kim, Jong-In;Han, Min-Seok
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.15 no.2
    • /
    • pp.152-163
    • /
    • 2022
  • In this paper, we present a method that enables an unmanned aerial vehicle to drop sonobuoys, an essential element of anti-submarine warfare, in an optimal deployment. To this end, an environment simulating the distribution of sound detection performance was built with the Unity game engine; this environment, configured with Unity ML-Agents, communicated with an external reinforcement learning algorithm written in Python to carry out the learning. In particular, reinforcement learning is introduced to prevent wrong actions from accumulating and affecting learning, and to secure the maximum detection area for the sonobuoys while the vehicle flies to the target point in the shortest time. The optimal placement of the sonobuoys was achieved by applying the Deep Deterministic Policy Gradient (DDPG) algorithm. As a result of the learning, the agent flew through the sea area and passed only those points, among the 70 target candidates, needed to achieve the optimal placement. This means that an autonomous aerial vehicle has been implemented that deploys sonobuoys in the shortest time and with the maximum detection area, which are the requirements for optimal placement.

Developing a comprehensive model of the optimal exploitation of dam reservoir by combining a fuzzy-logic based decision-making approach and Young's bilateral bargaining model

  • M.J. Shirangi;H. Babazadeh;E. Shirangi;A. Saremi
    • Membrane and Water Treatment
    • /
    • v.14 no.2
    • /
    • pp.65-76
    • /
    • 2023
  • Given the limited water resources and the presence of multiple decision makers with different and usually conflicting objectives in the exploitation of water resources systems, especially dam reservoirs, determining the optimal allocation of reservoir water among decision makers and stakeholders is a difficult task. In this study, a fuzzy VIKOR technique, or fuzzy multi-criteria decision making (FMCDM), was combined with Young's bilateral bargaining model to develop a new method for determining the optimal quantitative and qualitative allocation of dam reservoir water, with the aim of increasing the utility of decision makers and stakeholders and reducing the conflicts among them. By identifying the stakeholders involved in the exploitation of the dam reservoir and determining their utility, the optimal points on the trade-off curve with quantitative and qualitative objectives presented by Mojarabi et al. (2019) were ranked using the fuzzy VIKOR technique, based on quantitative and qualitative criteria and on economic, social and environmental factors. In the proposed method, the weights of the criteria were determined by each decision maker using the entropy method. Building on the results of the fuzzy decision-making method, Young's bilateral bargaining model was developed to determine the point on the trade-off curve agreed on by the decision makers. In the proposed method, (a) the opinions of decision makers and stakeholders were considered according to different criteria in the exploitation of the dam reservoir; (b) because the decision makers considered different factors in addition to quantitative and qualitative criteria, they were willing to participate in bargaining and reconsider their ideals; (c) owing to the fuzzy-logic based decision-making approach and the consideration of different criteria, the utilities of all decision makers were close to each other and the scope of bargaining became smaller, increasing the possibility of reaching an agreement in a shorter time using game theory; and (d) all qualitative judgments of the decision makers, without requiring explicitness, were applied to the model using fuzzy logic. The results of using the proposed method for the optimal exploitation of Iran's 15-Khordad dam reservoir over a 30-year period (1968-1997) showed the possibility of agreement on a water allocation with monthly total dissolved solids (TDS) = 1,490 mg/L, considering the different factors based on the opinions of decision makers and reducing conflicts among them.
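
The entropy method for criteria weights mentioned above can be sketched in its standard textbook form. This is not the authors' code: the decision matrix below is invented, all entries are assumed positive, and the fuzzy VIKOR ranking and the bargaining step are not reproduced. A criterion whose values differ more across alternatives has lower entropy and therefore receives a larger weight.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method.

    matrix: rows are alternatives, columns are criteria; values are assumed
    positive and there must be at least two alternatives.
    """
    m = len(matrix)
    n = len(matrix[0])
    k = 1.0 / math.log(m)  # scales each entropy into [0, 1]
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [x / total for x in column]
        entropy = -k * sum(p * math.log(p) for p in shares if p > 0)
        divergences.append(1.0 - entropy)
    spread = sum(divergences)
    if spread <= 0:
        # All criteria are uninformative: fall back to equal weights
        return [1.0 / n] * n
    return [d / spread for d in divergences]


# Hypothetical scores of three allocation options on three criteria
decision_matrix = [
    [5.0, 10.0, 1.0],
    [5.0, 20.0, 2.0],
    [5.0, 30.0, 9.0],
]
weights = entropy_weights(decision_matrix)
```

In this toy matrix the first criterion is identical for every option, so it carries almost no weight, while the third criterion varies the most and receives the largest weight.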

An Improvement of the Decision-Making of Categorical Data in Rough Set Analysis (범주형 데이터의 러프집합 분석을 통한 의사결정 향상기법)

  • Park, In-Kyu
    • Journal of Digital Convergence
    • /
    • v.13 no.6
    • /
    • pp.157-164
    • /
    • 2015
  • Efficient retrieval of useful information is a prerequisite for an optimal decision-making system. Hence, research on data mining techniques that find useful patterns in various forms of data has progressed as Big Data is increasingly applied for convergence and integration with other industries. Each technique tends to have its own drawbacks, so its ability to generalize in retrieving useful information is weak, and an integrated technique is essential. In this paper, an uncertainty measure of information is calculated by measuring the algebraic probability with Bayesian theory and then measuring the information entropy of that probability. The proposed measure generates the effective reduct set (i.e., the reduced set of necessary attributes) and formulates the core of the attribute set, from which the optimal decision rules are induced. In a simulation of deciding on contact lenses, the proposed approach is compared with the equivalence and value-reduct theories. The results show that the proposed approach is more general than the previous theories for useful decision making.