Search | Korea Science

Research Trends on Deep Reinforcement Learning (심층 강화학습 기술 동향)

Jang, S.Y.;Yoon, H.J.;Park, N.S.;Yun, J.K.;Son, Y.S.
- Electronics and Telecommunications Trends / v.34 no.4 / pp.1-14 / 2019
Recent trends in deep reinforcement learning (DRL) reveal considerable improvements to DRL algorithms in performance, learning stability, and computational efficiency. DRL has also vastly expanded the scenarios it can cover (e.g., partial observability; cooperation, competition, coexistence, and communication among multiple agents; multi-task learning; decentralized intelligence). These features have stimulated multi-agent reinforcement learning research. DRL applications are also expanding beyond robotics, natural language processing, and computer vision into a wide array of fields such as finance, healthcare, chemistry, and even art. In this report, we briefly summarize various DRL techniques and research directions.
https://doi.org/10.22648/ETRI.2019.J.340401

DRM-FL: A Decentralized and Randomized Mechanism for Privacy Protection in Cross-Silo Federated Learning Approach (DRM-FL: Cross-Silo Federated Learning 접근법의 프라이버시 보호를 위한 분산형 랜덤화 메커니즘)

Firdaus, Muhammad;Latt, Cho Nwe Zin;Aguilar, Mariz;Rhee, Kyung-Hyune
- Proceedings of the Korea Information Processing Society Conference / 2022.05a / pp.264-267 / 2022
Recently, federated learning (FL) has gained prominence as a viable approach for enhancing user privacy and data security by allowing collaborative multi-party model learning without exchanging sensitive data. Despite this, most present FL systems still depend on a centralized aggregator that generates a global model by gathering all models submitted by users, which could compromise user privacy and expose the system to various threats from malicious users. To solve these issues, we propose a secure FL framework that employs differential privacy to counter membership inference attacks during the collaborative FL model training process and uses blockchain to replace the centralized aggregator server.
https://doi.org/10.3745/PKIPS.y2022m05a.264

Analysis of Teaching Behavior and Visual Attention according to Teacher's Career in Elementary Science Inquire-based Class on Respiration (탐구형 초등과학수업 '호흡' 차시에서 교사의 경력에 따른 교수행동 및 시각적 주의 분석)

Kim, Jang-Hwan;Shin, Won-Sub;Shin, Dong-Hoon
- Journal of Korean Elementary Science Education / v.37 no.2 / pp.206-218 / 2018
The purpose of this study is to analyze teaching behaviors and visual attention according to teachers' career experience in an elementary science inquiry-based class. Participants were four elementary school teachers in Seoul, all of whom taught fifth-grade science. Based on their experience in elementary science education, two were identified as novice teachers and two as expert teachers. Participants taught the Respiration lesson in the 'Structure and Function of our Body' unit of fifth-grade elementary science. The mobile eye tracker used in this study was SMI's ETG 2w, a binocular tracking system. In addition, a video camera was installed at the back of the classroom to record the entire class. We documented the full contents of the recorded video and analyzed them. The actual practice time, participants' visual attention, and decentralized attention ability were analyzed by class phase. The results of the study are as follows. First, there was a difference between planned class time and actual practice time. The novice teachers had difficulty reconstructing the educational content, whereas the expert teachers reconstructed the curriculum and interacted with students on the basis of a strong understanding and application of the curriculum. There were many differences between the novice and expert teachers in the circulating guidance used to check student activities. Second, regarding visual attention to areas related to teaching and learning by class phase, the novice teachers concentrated on a specific area in every phase, whereas the expert teachers distributed their visual attention evenly across the meaningful areas of teaching and learning activities. Third, there was a statistically significant difference in activities 1-1, 1-2, 2-1, and 2-2 when the participants' decentralized attention ability was compared. Expert teachers frequently checked students' understanding and interest and interacted with students a great deal. The decentralized attention results also show that the novice teachers concentrated on a specific area, whereas the expert teachers had a high degree of decentralized attention ability and distributed their visual attention evenly.
https://doi.org/10.15267/keses.2018.37.2.206 KSCI

Optimizing Energy Efficiency in Mobile Ad Hoc Networks: An Intelligent Multi-Objective Routing Approach

Sun Beibei
- IEMEK Journal of Embedded Systems and Applications / v.19 no.2 / pp.107-114 / 2024
Mobile ad hoc networks are self-configuring networks of mobile devices that communicate without relying on a fixed infrastructure. However, traditional routing protocols in such networks struggle to select efficient and reliable routes because of the dynamic nature of these networks, which is caused by the unpredictable mobility of nodes. This often results in a failure to meet the low-delay and low-energy consumption requirements crucial for such networks. To overcome these challenges, our paper introduces a novel multi-objective and adaptive routing scheme based on the Q-learning reinforcement learning algorithm. The proposed routing scheme dynamically adjusts itself based on measured network states, such as traffic congestion and mobility. The proposed approach uses Q-learning to select routes in a decentralized manner, considering factors such as energy consumption, load balancing, and the selection of stable links. We present a formulation of the multi-objective optimization problem and discuss adaptive adjustments of the Q-learning parameters to handle the dynamic nature of the network. To speed up the learning process, our scheme incorporates informative shaped rewards, providing additional guidance to the learning agents for better solutions. Implemented on the widely used AODV routing protocol, our proposed approach demonstrates better energy efficiency and improved message delivery delay than traditional AODV, even in highly dynamic network environments. These findings show the potential of leveraging reinforcement learning for efficient routing in ad hoc networks, paving the way for future advancements in the field of mobile ad hoc networking.
https://doi.org/10.14372/IEMEK.2024.19.2.107

Precision of Iterative Learning Control for the Multiple Dynamic Subsystems (복합구조물의 선형반복학습제어 정밀도 연구)

Lee, Soo-Cheol
- Journal of the Korean Society for Precision Engineering / v.18 no.3 / pp.131-142 / 2001
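In many industries, specific tasks are often performed repetitively. Improving the precision of such tasks, by carrying out the given work on the basis of experience with the repeated errors, yields performance gains that bear directly on quality control at the workplace. The original motivation for learning control was to improve the precision of industrial robots that perform repetitive work on production assembly lines. This paper starts from a decentralized discrete-time system and, to apply it to industrial robots, uses simulations of a mathematical model to show the stability of the algorithm and the process by which the repetition error is reduced. By showing that the precision of every subsystem (link) is satisfied even in a structure of multiple subsystems such as an industrial robot, where input and output information interfere with one another, the paper demonstrates the stability of linear iterative learning control for multiple dynamic subsystems.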

Research Trends of Multi-agent Collaboration Technology for Artificial Intelligence Bots (AI Bots를 위한 멀티에이전트 협업 기술 동향)

D., Kang;J.Y., Jung;C.H., Lee;M., Park;J.W., Lee;Y.J., Lee
- Electronics and Telecommunications Trends / v.37 no.6 / pp.32-42 / 2022
Recently, decentralized approaches to artificial intelligence (AI) development, such as federated learning, have been drawing attention as the cost and time inefficiency of AI development increase due to explosive data growth and rapid environmental changes. Collaborative AI technology is an alternative to centralized AI: it dynamically organizes collaborative groups of different agents to share data, knowledge, and experience, and it uses distributed resources to derive enhanced knowledge and analysis models through collaborative learning to solve given problems. This article investigates and analyzes recent technologies and applications relevant to research on multi-agent collaboration among AI bots, which can provide collaborative AI functionality autonomously.
https://doi.org/10.22648/ETRI.2022.J.370604

Comparative Analysis of Multi-Agent Reinforcement Learning Algorithms Based on Q-Value (상태 행동 가치 기반 다중 에이전트 강화학습 알고리즘들의 비교 분석 실험)

Kim, Ju-Bong;Choi, Ho-Bin;Han, Youn-Hee
- Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.447-450 / 2021
In many multi-agent environments, including simulations, the centralized training with decentralized execution (CTDE) approach is used. Under CTDE, much research has been conducted on multi-agent algorithms based on the state-action value (Q-value). These algorithms derive from Independent Q-learning (IQL), a strong benchmark algorithm, and have concentrated on the problem of decomposing the agents' joint state-action value. In this paper, we analyze the algorithms from these prior studies and verify them through experimental analysis in a practical and general domain.
https://doi.org/10.3745/PKIPS.y2021m05a.447

Intelligent Warehousing: Comparing Cooperative MARL Strategies

Yosua Setyawan Soekamto;Dae-Ki Kang
- International Journal of Internet, Broadcasting and Communication / v.16 no.3 / pp.205-211 / 2024
Effective warehouse management requires advanced resource planning to optimize profits and space. Robots offer a promising solution, but their effectiveness relies on embedded artificial intelligence. Multi-agent reinforcement learning (MARL) enhances robot intelligence in these environments. This study explores various MARL algorithms using the Multi-Robot Warehouse Environment (RWARE) to determine their suitability for warehouse resource planning. Our findings show that cooperative MARL is essential for effective warehouse management. IA2C outperforms MAA2C and VDA2C on smaller maps, while VDA2C excels on larger maps. IA2C's decentralized approach, focusing on cooperation over collaboration, allows for higher reward collection in smaller environments. However, as map size increases, reward collection decreases due to the need for extensive exploration. This study highlights the importance of selecting the appropriate MARL algorithm based on the specific warehouse environment's requirements and scale.
https://doi.org/10.7236/IJIBC.2024.16.3.205

The Analysis of ALT and Unuse of Learning Time in UCR Based Instruction (UCR활용수업의 실제학습시간 및 소실된 수업시간 분석)

Baek, Je-Eun;Kim, Kyung-Hyun
- The Journal of Korean Association of Computer Education / v.18 no.3 / pp.15-24 / 2015
Appropriate distribution and utilization of learning time in class are regarded as essential and basic conditions for successful education. Nonetheless, it has so far been difficult to find such research among studies of UCR (User Created Robot) based instruction. For this reason, we analyze the ALT (Actual Learning Time) and unused learning time in UCR based instruction. For this purpose, we observed three students in a combined third- and fourth-grade elementary school class and interviewed the teachers before and after class. The results of this study are as follows: (1) UCR based instruction shows lower ALT than traditional classes. (2) Most of the time lost in these classes tends to be spent preparing and arranging the robot modules; a small portion is lost to students' behaviors unrelated to learning, decentralized behaviors, and other external influences.
KSCI

C-COMA: A Continual Reinforcement Learning Model for Dynamic Multiagent Environments (C-COMA: 동적 다중 에이전트 환경을 위한 지속적인 강화 학습 모델)

Jung, Kyueyeol;Kim, Incheol
- KIPS Transactions on Software and Data Engineering / v.10 no.4 / pp.143-152 / 2021
In various real-world applications, it is very important to learn behavioral policies that allow multiple agents to work together organically toward common goals. In this multi-agent reinforcement learning (MARL) setting, most existing studies have adopted centralized training with decentralized execution (CTDE) methods as the de facto standard framework. However, such methods have difficulty coping with dynamic environments, in which changes not experienced during training may constantly occur in real-life situations. To cope effectively with such dynamic environments, this paper proposes a novel multi-agent reinforcement learning system, C-COMA. C-COMA is a continual learning model that assumes real-world conditions from the beginning and continuously learns the agents' cooperative behavior policies without separating training time from execution time. In this paper, we demonstrate the effectiveness and excellence of the proposed C-COMA model by implementing a dynamic mini-game based on StarCraft II, a representative real-time strategy game, and conducting various experiments in this environment.
https://doi.org/10.3745/KTSDE.2021.10.4.143 KSCI

Search Result 45, Processing Time 0.019 seconds
