• Title/Summary/Keyword: Model based reinforcement learning

Developing Reinforcement Learning based Job Allocation Model by Using FlexSim Software (FlexSim 소프트웨어를 이용한 강화학습 기반 작업 할당 모형 개발)

  • Jin-Sung Park;Jun-Woo Kim
    • Proceedings of the Korean Society of Computer Information Conference / 2023.01a / pp.311-313 / 2023
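  • To use resources efficiently in a parallel machine shop, each job must be assigned to an appropriate machine. Heuristics can be used to select the machine that processes a given job, but developing a heuristic tailored to a specific shop is not easy. This paper instead applies reinforcement learning to develop a job allocation model for a heterogeneous parallel machine shop. The episodes needed to train the job allocation model were generated with FlexSim, a commercial simulation software package. In addition, the stable-baselines3 library was used to apply reinforcement learning algorithms to the generated episodes. The experimental results show that simulation and reinforcement learning are useful for shop operations management.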

Real-time RL-based 5G Network Slicing Design and Traffic Model Distribution: Implementation for V2X and eMBB Services

  • WeiJian Zhou;Azharul Islam;KyungHi Chang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.9 / pp.2573-2589 / 2023
  • As 5G mobile systems carry multiple services and applications, the primary problem in constructing 5G networks is supporting numerous user and application types with varying quality-of-service requirements inside a single physical network infrastructure. Radio Access Network (RAN) slicing is introduced as a way to solve these challenges. This research focuses on optimizing RAN slices within a single physical cell for vehicle-to-everything (V2X) and enhanced mobile broadband (eMBB) UEs, highlighting the importance of adept resource management and allocation for the evolving landscape of 5G services. We put forth two strategies: offline network slicing, also referred to as standard network slicing, and online reinforcement learning (RL) network slicing. Both strategies aim to maximize network efficiency by gathering network model characteristics and augmenting radio resources for eMBB and V2X UEs. Compared to traditional network slicing, RL network slicing performs better in the allocation and utilization of UE resources. These steps are taken to adapt to fluctuating traffic loads using RL strategies, with the ultimate objective of bolstering the efficiency of generic 5G services.

Machine learning-based probabilistic predictions of shear resistance of welded studs in deck slab ribs transverse to beams

  • Vitaliy V. Degtyarev;Stephen J. Hicks
    • Steel and Composite Structures / v.49 no.1 / pp.109-123 / 2023
  • Headed studs welded to steel beams and embedded within the concrete of deck slabs are vital components of modern composite floor systems, where safety and economy depend on accurate predictions of the stud shear resistance. The multitude of existing deck profiles and the complex behavior of studs in deck slab ribs make developing accurate and reliable mechanical or empirical design models challenging. The paper addresses this issue by presenting a machine learning (ML) model developed using the natural gradient boosting (NGBoost) algorithm, which is capable of producing probabilistic predictions, and a database of 464 push-out tests, which is considerably larger than the databases used for developing existing design models. The proposed model outperforms models based on other ML algorithms and existing descriptive equations, including those in EC4 and AISC 360, while offering probabilistic predictions unavailable from other models and producing higher shear resistances for many cases. The present study also showed that the stud shear resistance is insensitive to the concrete elastic modulus, stud welding type, location of slab reinforcement, and other parameters considered important by existing models. The NGBoost model was interpreted by evaluating the feature importance and dependence determined with the SHapley Additive exPlanations (SHAP) method. The model was calibrated via reliability analyses in accordance with the Eurocodes to ensure that its predictions meet the required reliability level and to facilitate its use in design. An interactive open-source web application was created and deployed to the cloud to allow for convenient and rapid stud shear resistance predictions with the developed model.

Stealthy Behavior Simulations Based on Cognitive Data (인지 데이터 기반의 스텔스 행동 시뮬레이션)

  • Choi, Taeyeong;Na, Hyeon-Suk
    • Journal of Korea Game Society / v.16 no.2 / pp.27-40 / 2016
  • Predicting stealthy behaviors plays an important role in designing stealth games. It is, however, difficult to automate this task because human players interact with dynamic environments in real time. In this paper, we present a reinforcement learning (RL) method for simulating stealthy movements in dynamic environments, in which an integrated model of Q-learning with Artificial Neural Networks (ANN) is exploited as an action classifier. Experimental results show that our simulation agent responds sensitively to dynamic situations and is thus useful for game level designers in determining various game parameters.

Machine Learning-Based Retrofit Scheme Development for Seismically Vulnerable Reinforced Concrete School Buildings (기계학습기반 기둥 파괴유형 분류모델을 활용한 학교건축물의 내진보강전략 구축)

  • Kim, Subin;Choi, Insub;Shin, Jiuk
    • Journal of the Earthquake Engineering Society of Korea / v.28 no.5 / pp.275-283 / 2024
  • Many school buildings are vulnerable to earthquakes because they were built before mandatory seismic design was applied. This study uses machine learning to develop an algorithm that rapidly constructs an optimal reinforcement scheme from simple information for non-ductile reinforced concrete school buildings built according to standard design drawings in the 1980s. We utilize a decision tree (DT) model, trained through machine learning, that conservatively and rapidly predicts the failure type of reinforced concrete columns from simple information. Based on this model, we develop a methodology for constructing an optimal reinforcement scheme in terms of the confinement ratio (CR) for ductility enhancement and the stiffness ratio (SR) for stiffness enhancement. By examining the failure types of columns according to changes in confinement ratio and stiffness ratio, we propose a retrofit scheme for school buildings with masonry walls and present the maximum applicable stiffness ratio and the allowable range of stiffness ratio increase for the minimum and maximum values of confinement ratio. This methodology allows retrofit schemes to be constructed faster than with existing analysis methods.

Design and Implementation of Project Learning Site by Using XML (XML을 이용한 프로젝트 학습사이트의 설계 및 구현)

  • Choe, Hyun-Kun;Ha, Tai-Hyun
    • Journal of Digital Convergence / v.5 no.2 / pp.123-134 / 2007
  • The purpose of this study is to design and implement a project learning site using XML. The development of the Internet site for project learning was planned in preparation, development, and test/application stages. The research shows that the elements used for the development of the Internet site for project learning are learner motivation, specification of learning goals, recall of prior knowledge, active participation in teaching activities, learning-guide feedback, evaluation, reinforcement, and correction. It is expected that many teachers will apply this model to their classes and show realistic results to motivate their students.

Neuro-fuzzy optimisation to model the phenomenon of failure by punching of a slab-column connection without shear reinforcement

  • Hafidi, Mariam;Kharchi, Fattoum;Lefkir, Abdelouhab
    • Structural Engineering and Mechanics / v.47 no.5 / pp.679-700 / 2013
  • Two new predictive design methods are presented in this study. The first is a hybrid method, called neuro-fuzzy, based on neural networks with fuzzy learning. A total of 280 experimental datasets obtained from the literature concerning concentric punching shear tests of reinforced concrete slab-column connections without shear reinforcement were used to test the model (194 for experimentation and 86 for validation) and were endorsed by statistical validation criteria. The punching shear strength predicted by the neuro-fuzzy model was compared with those predicted by current models of punching shear, widely used in the design practice, such as ACI 318-08, SIA262 and CBA93. The neuro-fuzzy model showed high predictive accuracy of resistance to punching according to all of the relevant codes. A second, more user-friendly design method is presented based on a predictive linear regression model that supports all the geometric and material parameters involved in predicting punching shear. Despite its simplicity, this formulation showed accuracy equivalent to that of the neuro-fuzzy model.

Design and Implementation of Project Learning Site by Using XML (XML을 이용한 프로젝트 학습사이트의 설계 및 구현)

  • Choe, Hyeon-Geun;Ha, Tae-Hyeon
    • 한국디지털정책학회:학술대회논문집 / 2005.11a / pp.613-628 / 2005
  • The purpose of this study was to design and implement a project learning site using XML. The development of the Internet site for project learning was planned in preparation, development, and test/application stages. At the preparation stage, literature and cases were reviewed to find the design principles needed for developing a project learning program and the elements required in the actual development. At the development stage, the design principles and development elements identified during preparation were used to develop a draft Internet site for project learning. The elements used for the development of the Internet site for project learning were motivation, specification of learning goals, recall of prior knowledge, active participation in teaching activities, learning-guide feedback, evaluation, reinforcement, and correction. Web-based formats are changing: XML emerged because of the limitations of HTML on the web-based Internet and is expanding its field of use. XML can gather data from text-based HTML, which makes it possible to control its internal content. It is expected that many teachers will apply this model to their classes and show realistic results.

Edge Computing Task Offloading of Internet of Vehicles Based on Improved MADDPG Algorithm

  • Ziyang Jin;Yijun Wang;Jingying Lv
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.2 / pp.327-347 / 2024
  • Edge computing is frequently employed in the Internet of Vehicles, although the computation and communication capabilities of roadside units with edge servers are limited. As a result, to perform distributed machine learning on resource-limited MEC systems, resources have to be allocated sensibly. This paper presents an Improved MADDPG algorithm to overcome the current IoV concerns of high delay and limited offloading utility. Firstly, we employ the MADDPG algorithm for task offloading. Secondly, the edge server aggregates the updated model and modifies the aggregation model parameters to achieve optimal policy learning. Finally, the new approach is contrasted with current reinforcement learning techniques. The simulation results show that compared with MADDPG and MAA2C algorithms, our algorithm improves offloading utility by 2% and 9%, and reduces delay by 29.6%.

A Study on the attitude control of the quadrotor using neural networks (신경회로망을 이용한 쿼드로터의 자세 제어에 관한 연구)

  • Kim, Sung-Dea
    • The Journal of the Korea institute of electronic communication sciences / v.9 no.9 / pp.1019-1025 / 2014
  • Recently, unmanned aerial vehicles (UAVs) have been studied widely, from military and civilian aircraft to aircraft for general hobby activities. In particular, research on small unmanned aircraft has focused mainly on the quadrotor, a type well suited to such research because of its ease of turning, hovering, and vertical take-off and landing (VTOL). Because these aircraft are supported by aerodynamic forces, studying them requires a complex process of kinetic analysis, followed by controller design based on this dynamical analysis and on experimental model analysis. In this paper, after implementing basic attitude control based on a general PID controller, we propose a concept design for a quadrotor attitude control method that uses a neural-network reinforcement learning algorithm to handle non-linear elements not considered in the controller design.