Integrated Search | Korea Science

Performance Evaluation of Reinforcement Learning Algorithm for Control of Smart TMD

강주원;김현수
- 한국공간구조학회논문집 / Vol. 21, No. 2 / pp. 41-48 / 2021
Smart tuned mass dampers (TMDs) are widely studied for reducing the seismic response of various structures. The control algorithm is the most important factor in the control performance of a smart TMD. This study used the Deep Deterministic Policy Gradient (DDPG) method, a reinforcement learning technique, to develop a control algorithm for a smart TMD. A magnetorheological (MR) damper was used to build the smart TMD. A single-mass model with the smart TMD served as the reinforcement learning environment. During training, time-history analyses of the example structure subjected to an artificial seismic load were performed. An actor (policy network) and a critic (value network) were constructed for the DDPG agent. The agent's action was the command voltage sent to the MR damper. The reward for each action was calculated from the displacement and velocity responses of the main mass. A groundhook control algorithm was used for comparison. After training the DDPG agent for 10,000 episodes with appropriate hyperparameters, a semi-active control algorithm for the seismic responses of the example structure with the smart TMD was obtained. The simulation results showed that the developed DDPG model can provide effective control algorithms for a smart TMD to reduce seismic responses.
https://doi.org/10.9712/KASS.2021.21.2.41
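The abstract above states that the agent's action is the MR damper command voltage and that the reward is computed from the displacement and velocity of the main mass, but it gives no formula. The following is only a minimal sketch under stated assumptions: the weighted absolute-value penalty, the weights `w_disp` and `w_vel`, and the 0 to 5 V voltage range are all hypothetical and not taken from the paper.

```python
def reward(displacement: float, velocity: float,
           w_disp: float = 1.0, w_vel: float = 0.1) -> float:
    """Hypothetical reward: penalize main-mass displacement and velocity.

    Smaller structural responses give a reward closer to zero (the maximum).
    The weights are illustrative assumptions, not values from the paper.
    """
    return -(w_disp * abs(displacement) + w_vel * abs(velocity))


def clip_voltage(action: float, v_min: float = 0.0, v_max: float = 5.0) -> float:
    """Clamp the agent's raw action to an assumed command-voltage range."""
    return max(v_min, min(v_max, action))


if __name__ == "__main__":
    # A smaller displacement at the same velocity yields a higher reward.
    print(reward(0.01, 0.1), reward(0.02, 0.1))
    print(clip_voltage(7.3))
```

A reward of this shape makes "less motion of the main mass" the learning objective, which matches the stated goal of seismic response reduction.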

A Study on the Portfolio Performance Evaluation using Actor-Critic Reinforcement Learning Algorithms

이우식
- 한국산업융합학회 논문집 / Vol. 25, No. 3 / pp. 467-476 / 2022
The Bank of Korea raised the benchmark interest rate by a quarter percentage point to 1.75 percent per year, and analysts predict that South Korea's policy rate will reach 2.00 percent by the end of calendar year 2022. Furthermore, because market volatility has increased significantly owing to a variety of factors, including rising rates and inflation, many investors have struggled to meet their financial objectives or deliver returns. In this situation, banks and financial institutions are attempting to offer robo-advisors that manage client portfolios without human intervention. In this regard, determining the best hyperparameter combination is becoming increasingly important. This study compares several activation functions for the Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithms, which choose a sequence of actions that maximizes long-term reward. According to the results, DDPG and TD3 outperformed their benchmark index. One reason for this is that we need to understand the action probabilities in order to choose an action and receive a reward, which we then compare to the state value to determine an advantage. As interest in machine learning has grown and research into deep reinforcement learning has become more active, finding an optimal hyperparameter combination for DDPG and TD3 has become increasingly important.
https://doi.org/10.21289/KSIC.2022.25.3.467
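For readers comparing the two algorithms named in the abstract above, one well-known difference is how the critic's learning target is formed: DDPG bootstraps from a single target critic, while TD3 takes the minimum of two target critics to curb overestimation. The sketch below illustrates only that general idea; the function names and default discount factor are illustrative, and nothing here reproduces the paper's portfolio setup.

```python
def ddpg_target(reward: float, q_next: float, done: bool,
                gamma: float = 0.99) -> float:
    """DDPG critic target: bootstrap from one target critic's estimate."""
    return reward + gamma * (0.0 if done else q_next)


def td3_target(reward: float, q1_next: float, q2_next: float, done: bool,
               gamma: float = 0.99) -> float:
    """TD3 critic target: bootstrap from the smaller of two target critics."""
    return reward + gamma * (0.0 if done else min(q1_next, q2_next))


if __name__ == "__main__":
    # With critic estimates 2.0 and 4.0, TD3 uses the more pessimistic one.
    print(ddpg_target(1.0, 4.0, False, gamma=0.5))
    print(td3_target(1.0, 2.0, 4.0, False, gamma=0.5))
```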

Backstepping Sliding Mode-based Model-free Control of Electro-hydraulic Systems

Truong, Hoai-Vu-Anh;Trinh, Hoai-An;Ahn, Kyoung-Kwan
- 드라이브 ㆍ 컨트롤 / Vol. 19, No. 1 / pp. 51-61 / 2022
This paper presents a model-free control scheme for electro-hydraulic systems (EHSs) based on backstepping sliding mode control (BSMC) combined with a radial basis function neural network (RBFNN) and an adaptive mechanism. First, a mathematical model of the EHS was derived in detail to understand the system behavior. Based on the system structure, BSMC was employed to satisfy the output performance. Because of the highly nonlinear characteristics and the presence of parametric uncertainties, a model-free approximator based on an RBFNN was developed to compensate for the EHS dynamics, thus removing the need for detailed system information. Adaptive laws based on an actor-critic neural network (ACNN) were implemented to suppress the remaining approximation error and satisfy the system requirements. The stability of the closed-loop system was proven theoretically using a Lyapunov function. To evaluate the effectiveness of the proposed algorithm, three controllers were used as benchmarks in a comparative simulation: proportional-integral-derivative (PID) control and an improved PID with ACNN (ACPID), both considered fully model-free methods, and adaptive backstepping sliding mode control, considered an ideal model-based method with the same adaptive laws. The simulation results validated the superiority of the proposed algorithm, which achieved nearly the same performance as the ideal adaptive BSMC.
https://doi.org/10.7839/ksfc.2022.19.1.051
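The abstract above uses an RBFNN as a model-free approximator of the unknown dynamics. As a rough illustration of what such an approximator computes, the sketch below evaluates a weighted sum of Gaussian basis functions. The Gaussian kernel, the shared `width`, and all names are assumptions for illustration; the paper's actual network structure and adaptive weight-update laws are not given in the abstract and are not reproduced here.

```python
import math


def rbf_features(x, centers, width):
    """Gaussian radial basis activations: exp(-||x - c||^2 / (2 * width^2))."""
    return [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2))
        for c in centers
    ]


def rbfnn_output(x, centers, width, weights):
    """Network output: weighted sum of the basis activations."""
    phi = rbf_features(x, centers, width)
    return sum(w * p for w, p in zip(weights, phi))


if __name__ == "__main__":
    centers = [[0.0, 0.0], [1.0, 0.0]]  # hypothetical basis centers
    weights = [2.0, 0.0]                # hypothetical output weights
    print(rbfnn_output([0.0, 0.0], centers, 1.0, weights))
```

In an adaptive controller, the weights would be tuned online by an adaptation law rather than fixed as they are in this sketch.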

A DASH System Using the A3C-based Deep Reinforcement Learning

최민제;임경식
- 대한임베디드공학회논문지 / Vol. 17, No. 5 / pp. 297-307 / 2022
The simple procedural segment selection algorithm commonly used in Dynamic Adaptive Streaming over HTTP (DASH) shows severe weaknesses in providing high-quality streaming services over integrated mobile networks composed of various wired and wireless links. A major issue is how to cope properly with dynamically changing underlying network conditions. The key is to make the segment selection algorithm much more adaptive to fluctuations in network traffic. This paper presents a system architecture that replaces the existing procedural segment selection algorithm with a deep reinforcement learning algorithm based on the Asynchronous Advantage Actor-Critic (A3C). The distributed A3C-based deep learning server is designed and implemented to allow multiple clients in different network conditions to stream videos simultaneously, collect learning data quickly, and learn asynchronously, so that learning speed improves greatly as the number of video clients increases. The performance analysis shows that the proposed algorithm outperforms both the conventional DASH algorithm and the Deep Q-Network algorithm in terms of the user's quality of experience and the speed of deep learning.
https://doi.org/10.14372/IEMEK.2022.17.5.297
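The "advantage" in A3C measures how much better a rollout turned out than the critic predicted. A minimal sketch of that calculation follows, assuming n-step discounted returns bootstrapped from the critic's value of the final state. The function names and the discount factor are illustrative, and the paper's reward (a quality-of-experience measure) is not modeled here.

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """n-step returns, computed backwards from the critic's bootstrap value."""
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Advantage per step: n-step return minus the critic's value estimate."""
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    return [g - v for g, v in zip(returns, values)]


if __name__ == "__main__":
    # Two steps of reward 1.0, critic predicted 1.0 at each step.
    print(advantages([1.0, 1.0], [1.0, 1.0], 0.0, gamma=0.5))
```

In A3C, each worker would compute these quantities on its own rollout and send the resulting gradients to the shared model asynchronously.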

Trading Strategies Using Reinforcement Learning

조현민;신현준
- 한국산학기술학회논문지 / Vol. 22, No. 1 / pp. 123-130 / 2021
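With recent advances in computer technology, interest in machine learning has grown, and applications of machine learning theory across diverse fields have increased sharply. In finance in particular, predicting the future value of financial products is a hard problem, and from the 1980s to the present it has relied on technical and fundamental analysis. For machine-learning-based models that predict future value, designing the model to cope with diverse latent market variables is of paramount importance. This paper therefore uses reinforcement learning, a branch of machine learning, to quantitatively assess the price movements of individual stocks listed on the KOSPI market and applies the result to a stock trading strategy. The reinforcement learning models use the DQN algorithm, proposed by Google DeepMind in 2013, and the A2C algorithm; datasets based on approximately 13 years of historical time-series prices for stocks in 14 industry sectors listed on KOSPI serve as the input and test data, respectively. Each dataset consists of eight price-related attributes and two attributes representing the market, and the available actions are buy, sell, or hold. In the experiments, DQN and A2C outperformed the alternative algorithms in terms of the average annualized return of the trading strategy.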
https://doi.org/10.5762/KAIS.2021.22.1.123
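The abstract above describes an agent that chooses among three actions: buy, sell, or hold. The sketch below shows one simple way such actions could drive a position and accumulate return. It is an assumption-laden illustration (single stock, all-in or all-out position, no transaction costs, simple per-step returns) and not the paper's environment; every name in it is hypothetical.

```python
BUY, SELL, HOLD = 0, 1, 2


def apply_action(position: int, action: int) -> int:
    """Return the new position (1 = holding the stock, 0 = flat)."""
    if action == BUY:
        return 1
    if action == SELL:
        return 0
    return position  # HOLD keeps the current position


def episode_return(prices, actions) -> float:
    """Sum of per-step simple returns earned while holding the stock."""
    position = 0
    total = 0.0
    for t, action in enumerate(actions):
        position = apply_action(position, action)
        step_return = (prices[t + 1] - prices[t]) / prices[t]
        total += position * step_return
    return total


if __name__ == "__main__":
    # Buy before a rise, sell before a fall.
    print(episode_return([100.0, 110.0, 99.0], [BUY, SELL]))
```

In a reinforcement learning setup, the per-step return would play the role of the reward, and the agent's policy would pick the action from the observed attributes.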

A Study of Reinforcement Learning-based Cyber Attack Prediction using Network Attack Simulator (NASim)

김범석;김정현;김민석
- 반도체디스플레이기술학회지 / Vol. 22, No. 3 / pp. 112-118 / 2023
As technology advances, the need for enhanced preparedness against cyber-attacks becomes an increasingly critical problem. It is therefore imperative to consider various circumstances and to prepare strategic technology for responding to cyber-attacks. This paper proposes a method to solve network security problems by applying reinforcement learning to cyber-security. In general, traditional static cyber-security methods have difficulty responding effectively to modern dynamic attack patterns. To address this, we implement cyber-attack scenarios such as 'Tiny Alpha' and 'Small Alpha' and evaluate the performance of various reinforcement learning methods using Network Attack Simulator, a cyber-attack simulation environment based on the Gymnasium (formerly OpenAI Gym) interface. We experimented with different RL algorithms, including value-based methods (Q-Learning, Deep Q-Network, and Double Deep Q-Network) and policy-based methods (Actor-Critic). We observed that value-based methods with discrete action spaces consistently outperformed policy-based methods with continuous action spaces, with a performance difference ranging from a minimum of 20.9% to a maximum of 53.2%. These results suggest opportunities for enhancing cyber-security strategies and indicate potential applications in cyber-security education and system validation across many domains, such as the military, government, and corporate sectors.
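Q-Learning is the simplest of the value-based methods listed in the abstract above. A minimal sketch of its tabular update rule follows; the state and action encodings, learning rate, and discount factor are illustrative assumptions, and the sketch does not use or depend on the Network Attack Simulator environment itself.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, n_actions,
                      alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    print(q_learning_update(q, 0, 1, 1.0, 1, 2, alpha=0.5))
```

Deep Q-Network and Double Deep Q-Network replace the table with a neural network but keep the same bootstrapped target idea.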

Enhancing Smart Grid Efficiency through SAC Reinforcement Learning: Renewable Energy Integration and Optimal Demand Response in the CityLearn Environment

이자노브 알리벡 러스타모비치;성승제;임창균
- 한국전자통신학회논문지 / Vol. 19, No. 1 / pp. 93-104 / 2024
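Demand response encourages customers to adjust their consumption patterns during peak-demand periods in order to improve grid reliability and minimize costs. Integrating renewable energy sources into the smart grid poses significant challenges because of their intermittent and unpredictable nature. Demand response strategies combined with reinforcement learning techniques are emerging as an approach that can address these challenges and optimize grid operation where conventional methods fail to meet such complex requirements. This study focuses on finding and applying ways to use reinforcement learning algorithms in demand response for renewable energy integration. The core goals of the study are to optimize demand-side flexibility, improve renewable energy utilization, and strengthen grid stability. The results show that demand response strategies based on reinforcement learning are effective in improving grid flexibility and facilitating renewable energy integration.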
https://doi.org/10.13067/JKIECS.2024.19.1.93

Search results: 47 items (processing time: 0.019 s)
