• Title/Abstract/Keywords: reinforcement efficiency

345 search results, processing time 0.027 s

On the Reward Function of Latent SAC Reinforcement Learning to Improve Longitudinal Driving Performance

  • 조성빈;정한유
    • 전기전자학회논문지
    • /
    • Vol. 25, No. 4
    • /
    • pp.728-734
    • /
    • 2021
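  • Interest in end-to-end autonomous driving based on deep reinforcement learning has grown considerably in recent years. This paper presents a reward function for latent SAC-based deep reinforcement learning that improves a vehicle's longitudinal driving performance. Whereas existing reinforcement learning reward functions significantly degrade driving safety and efficiency, the proposed reward function is shown to maintain an appropriate inter-vehicle distance while avoiding the risk of collision with the vehicle ahead.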

Optimal Seismic Reinforcement Design of Adjacent Asymmetric-Stiffness Structures with Viscous Dampers

  • 성은희
    • 한국안전학회지
    • /
    • Vol. 37, No. 6
    • /
    • pp.60-70
    • /
    • 2022
  • This paper proposes an optimal design method for a seismic reinforcement system that improves the seismic performance of adjacent asymmetric-stiffness structures with viscous dampers. The first method considers plan asymmetry for efficient seismic reinforcement, and evaluates the seismic performance of the optimal design applied to two modeling cases: adjacent stiffness-asymmetric structures and adjacent stiffness-symmetric structures. The second method considers the response of asymmetric structures to derive the optimal objective function, and evaluates the seismic efficiency of the objective function applied to two response cases: horizontal displacement and torsion. Numerical analyses are conducted on 7- and 10-story structures with a uni-asymmetric-stiffness plan using six historic earthquakes, normalized to 0.4g. The results indicate that seismic performance is excellent when the structures are modeled as adjacent asymmetric-stiffness structures and when horizontal displacement is used as the objective function.

Reinforcement learning-based control with application to the once-through steam generator system

  • Cheng Li;Ren Yu;Wenmin Yu;Tianshu Wang
    • Nuclear Engineering and Technology
    • /
    • Vol. 55, No. 10
    • /
    • pp.3515-3524
    • /
    • 2023
  • A reinforcement learning framework is proposed for controlling the outlet steam pressure of the once-through steam generator (OTSG). A double-layer controller using the Proximal Policy Optimization (PPO) algorithm is applied in the control structure of the OTSG. The PPO algorithm trains the neural networks continuously through interaction with the environment, after which the trained controller can realize better control of the OTSG. Because reinforcement learning is difficult to apply to real-world objects, this paper also proposes an innovative pretraining method to address this problem. The difficulty in applying reinforcement learning lies in training: the optimal strategy at each step is found through trial and error, so the training cost is very high. In this paper, an LSTM model is adopted as the training environment for pretraining, which saves training time and improves efficiency. The experimental results show that this method can realize the self-adjustment of control parameters under various working conditions, and that the control effect has the advantages of small overshoot, fast stabilization, and strong adaptive ability.

Stochastic Initial States Randomization Method for Robust Knowledge Transfer in Multi-Agent Reinforcement Learning

  • 김도현;배정호
    • 한국군사과학기술학회지
    • /
    • Vol. 27, No. 4
    • /
    • pp.474-484
    • /
    • 2024
  • Reinforcement learning, which is also studied in the field of defense, faces the problem of sample efficiency: it requires a large amount of data to train. Transfer learning has been introduced to address this problem, but its effectiveness is sometimes marginal because the model does not effectively leverage prior knowledge. In this study, we propose a stochastic initial state randomization (SISR) method to enable robust knowledge transfer that is both generalized and sufficient. We developed a simulation environment involving a cooperative robot transportation task. Experimental results show that tasks succeed when SISR is applied and fail when it is not. We also analyzed how the amount of state information collected by the agents changes with the application of SISR.

PGA: An Efficient Adaptive Traffic Signal Timing Optimization Scheme Using Actor-Critic Reinforcement Learning Algorithm

  • Shen, Si;Shen, Guojiang;Shen, Yang;Liu, Duanyang;Yang, Xi;Kong, Xiangjie
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 14, No. 11
    • /
    • pp.4268-4289
    • /
    • 2020
  • Advanced traffic signal timing methods play a very important role in reducing road congestion and air pollution. Many recent studies consider reinforcement learning a superior approach for building traffic light timing schemes. It achieves truly adaptive control by taking real-time traffic information as the state and adjustments to the traffic light scheme as the action. However, existing works are inefficient at complex intersections and lack feasibility, because most of them adopt traffic light schemes whose phase sequence is flexible. To address these issues, a novel adaptive traffic signal timing scheme is proposed. It is based on the actor-critic reinforcement learning algorithm and integrates the advanced techniques of proximal policy optimization and generalized advantage estimation. In particular, a new kind of reward function and a simplified form of state representation are carefully defined; they improve the learning efficiency and reduce the computational complexity, respectively. Meanwhile, a fixed-phase-sequence signal scheme is derived, and a constraint on the variations of successive phase durations is introduced, which enhances feasibility and robustness in field applications. The proposed scheme is verified through field-data-based experiments in both medium and high traffic density scenarios. Simulation results show remarkable improvements in both traffic performance and learning efficiency compared with existing reinforcement learning-based methods such as 3DQN and DDQN.

Study of the longitudinal reinforcement in reinforced concrete-filled steel tube short column subjected to axial loading

  • Alifujiang Xiamuxi;Caijian Liu;Alipujiang Jierula
    • Steel and Composite Structures
    • /
    • Vol. 47, No. 6
    • /
    • pp.709-728
    • /
    • 2023
  • Experimental and analytical studies were conducted to clarify how longitudinal reinforcement influences the performance of axially loaded Reinforced Concrete-Filled Steel Tube (R-CFST) short columns. The longitudinal reinforcement ratio was set as the parameter, and 10 R-CFST specimens with five different ratios and three Concrete-Filled Steel Tube (CFST) specimens for comparison were prepared and tested. Based on the test results, the failure modes, load transfer responses, peak load, stiffness, yield to strength ratio, ductility, fracture toughness, composite efficiency and stress state of the steel tube were theoretically analyzed. For further examination, analytical investigations were then performed: a material model for the concrete core was proposed and verified against the tests, and thereafter 36 model specimens, combining four different steel tube wall thicknesses with nine reinforcement ratios, were simulated. Finally, considering the experimental and analytical results, the prediction equations for the ultimate load bearing capacity of R-CFSTs were modified from the equations for CFSTs given in codes, a new equation that embeds the effect of reinforcement was proposed, and the equations were validated against experimental data. The results indicate that longitudinal reinforcement significantly impacts the behavior of R-CFST, as the steel tube does; the proposed analytical model is effective and reasonable; proper ratios of longitudinal reinforcement enable R-CFSTs to obtain a better balance between performance and construction cost, and the recommended range for the ratio is between 1.0% and 3.0%, regardless of the wall thickness of the steel tube; the proposed equation is recommended for more accurate and stable prediction of the strength of R-CFSTs.

An Experimental Study on Reinforcing Efficiency of H-Shaped Steel Beams with a Rectangular Web Opening

  • 김진무;조철호
    • 한국구조물진단유지관리공학회 논문집
    • /
    • Vol. 3, No. 1
    • /
    • pp.171-178
    • /
    • 1999
  • Despite the resulting decrease in shear and moment strengths, most steel structural designers use web openings in beams because of economic benefits and practical requirements. The purpose of this study is to suggest a method of reinforcing H-shaped steel beams with a rectangular web opening. If shear predominates over bending, it is necessary to consider all possible combinations of shear force and bending moment acting at the opening. In this paper, the ultimate strength and behavior of perforated beams are investigated according to several parameters (ratio of M/V, opening width within opening height ratio D/h, and reinforcing types A/B/C/D/M/N/W). The results of this study are as follows: 1. The deformation of H-shaped steel beams with a rectangular web opening was greatly affected not only by bending but also by shear. 2. The SB1-2/3 series show little difference in reinforcing efficiency, but the SB1-2E/3E series differ in reinforcing efficiency according to the reinforcement type. 3. The efficiency of the SB1-2E/3E series is determined by the reinforcing type; the RB1-2E-B/M/C and RB1-3E-M/D/C specimens show good efficiency. An efficient reinforcing type for perforated beams should be chosen according to the M/V and D/h ratios.

A Diversified Message Type Forwarding Strategy Based on Reinforcement Learning in VANET

  • Xu, Guoai;Liu, Boya;Xu, Guosheng;Zuo, Peiliang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 9
    • /
    • pp.3104-3123
    • /
    • 2022
  • The development of Vehicular Ad hoc Network (VANET) has greatly improved the efficiency and safety of social transportation, and the routing strategy for VANET has also received high attention from both academia and industry. However, studies on dynamic matching of routing policies with the message types of VANET are in short supply, which affects the operational efficiency and security of VANET to a certain extent. This paper studies the message types in VANET and fully considers the urgency and reliability requirements of message forwarding under various types. Based on the diversified types of messages to be transmitted, and taking the diversified message forwarding strategies suitable for VANET scenarios as behavioral candidates, an adaptive routing method for the VANET message types based on reinforcement learning (RL) is proposed. The key parameters of the method, such as state, action and reward, are reasonably designed. Simulation and analysis show that the proposed method could converge quickly, and the comprehensive performance of the proposed method is obviously better than the comparison methods in terms of timeliness and reliability.

Axial Compressive Behavior of R/C Columns Confined with Carbon Fiber Sheets

  • 신성우;이광수;심성택;송민성
    • 한국콘크리트학회:학술대회논문집
    • /
    • 한국콘크리트학회 Fall 2001 Conference Proceedings
    • /
    • pp.727-732
    • /
    • 2001
  • External Confinement of concrete in CFS enhances strength and ductility of concrete columns. This paper presents the test results on the study of reinforced concrete columns strengthened with carbon fiber sheets. The purpose of this research is to evaluate the CFS confinement characteristics of square reinforced concrete columns and the CFS efficiency. The tests were performed with different lateral reinforcement ratios, CFS reinforcement ratios and concrete strength. Test results were characterized according to maximum loads and lateral strain of CFS.

MULTI-OBJECTIVE OPTIMIZATION OF THE INNER REINFORCEMENT FOR A VEHICLE'S HOOD CONSIDERING STATIC STIFFNESS AND NATURAL FREQUENCY

  • Choi, S.H.;Kim, S.R.;Park, J.Y.;Han, S.Y.
    • International Journal of Automotive Technology
    • /
    • Vol. 8, No. 3
    • /
    • pp.337-342
    • /
    • 2007
  • A multi-objective optimization technique was implemented to obtain optimal topologies of the inner reinforcement for a vehicle's hood, simultaneously considering the static stiffness in bending and torsion and the natural frequency. In addition, a smoothing scheme was used to suppress the checkerboard patterns in the ESO method. Two models with different curvature were chosen in order to investigate the effect of curvature on the static stiffness and natural frequency of the inner reinforcement. A scale factor was employed to properly reflect the effect of each objective function. From several combinations of weighting factors, a Pareto-optimal topology solution was obtained. As the weighting factor for the elastic strain efficiency went from 1 to 0, the optimal topologies transitioned from the optimal topology of a static stiffness problem to that of a natural frequency problem. It was also found that the higher curvature model had a larger static stiffness and natural frequency than the lower curvature model. From the results, it is concluded that the ESO method with a smoothing scheme was effectively applied to topology optimization of the inner reinforcement of a vehicle's hood.