• Title/Summary/Keyword: deep reinforcement learning

DRL based Dynamic Service Mobility for Marginal Downtime in Multi-access Edge Computing

  • Mwasinga, Lusungu Josh;Raza, Syed Muhammad;Chu, Hyeon-Seung
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.05a
    • /
    • pp.114-116
    • /
    • 2022
  • The advent of the Multi-access Edge Computing (MEC) paradigm allows mobile users to offload resource-intensive and delay-sensitive services to nearby servers, significantly enhancing the quality of experience. Because mobile users roam erratically through the network, maintaining maximum quality of experience becomes challenging as they move farther from the serving edge server, mainly because latency grows with distance. Services could be migrated to an optimal edge server under policies obtained using Deep Reinforcement Learning (DRL) techniques. However, migration incurs significant costs in terms of service downtime, which adversely affects the quality of experience. This study therefore addresses the service mobility problem of deciding whether to migrate the service instance, and where to migrate it, so as to maximize migration benefits while keeping service downtime marginal.

Technical Trends in Artificial Intelligence for De Novo Drug Design (신규 약물 설계를 위한 인공지능 기술 동향)

  • Y.W. Han;H.Y. Jung;S.J. Park
    • Electronics and Telecommunications Trends
    • /
    • v.38 no.3
    • /
    • pp.38-46
    • /
    • 2023
  • The value of living a long and healthy life without suffering has increased owing to aging populations, the transition to welfare societies, and global interest in health arising from the novel coronavirus disease pandemic. New drug development has gained attention both as a tool to improve the quality of life and as a high-value market, with blockbuster drugs potentially generating over 10 billion dollars in annual revenue. However, before a newly discovered substance can be used as a drug, its properties must be verified in a long, time-consuming, and costly process. Recently, the development of artificial intelligence technologies, such as deep learning and reinforcement learning, has led to significant changes in drug development by enabling the effective identification of drug candidates that satisfy desired properties. We explore and discuss trends in artificial intelligence for de novo drug design.

Deep Reinforcement Learning based Adaptive GOP Selection for HEVC/H.265 Encoder (심층적 강화학습 기반 적응적 GOP 선택을 통한 HEVC/H.265 인코더 제어)

  • Lee, Jung-Kyung;Kim, Nayoung;Kang, Je-Won
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2020.11a
    • /
    • pp.140-142
    • /
    • 2020
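  • In this paper, we propose a method for controlling an HEVC/H.265 encoder by selecting the GOP (Group of Pictures) size with deep reinforcement learning. Existing methods suffer from a coding-dependency problem: encoding the current video signal requires information that has already been encoded. The proposed method introduces reinforcement learning to overcome this problem and adaptively selects the GOP size according to the temporal correlation of the input video. We newly define a reinforcement learning environment for GOP selection and train the agent by assigning rewards according to coding performance. When the encoder is controlled with the proposed adaptive GOP selection, experiments show a coding-efficiency gain of -6.07% BD-rate, demonstrating the effectiveness of the method.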

Task Scheduling Using Deep Reinforcement Learning in Mobile Edge Computing-based Smart Factory Environment (MEC 기반 스마트 팩토리 환경에서 DRL를 이용한 태스크 스케줄링)

  • Koo, Seolwon;Lim, Yujin
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.05a
    • /
    • pp.147-150
    • /
    • 2022
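  • Recently, MEC technology has been widely used to process tasks effectively in domains with diverse constraints, such as smart cities and smart factories. However, the complex and dynamic scenarios that arise in these domains are difficult to solve with conventional heuristic or metaheuristic techniques because the computational complexity increases. DRL, which combines reinforcement learning with deep learning, has therefore attracted attention as one way to address this problem. This study proposes a task scheduling technique for a smart factory environment that minimizes both the execution time of tasks with dependencies and the standard deviation of the load across the MEC servers that process them. Simulations show that the proposed technique performs well even in a dynamic environment where the number of tasks increases.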
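
To make the two objectives in this abstract concrete, the following is a minimal sketch of a combined scheduling cost. It is not the authors' formulation: the weighted sum, the use of makespan as the execution-time term, and the names `scheduling_cost`, `w_time`, and `w_balance` are all assumptions made for illustration.

```python
from statistics import pstdev


def scheduling_cost(task_finish_times, server_loads, w_time=1.0, w_balance=1.0):
    """Hypothetical cost: weighted sum of makespan and load imbalance.

    task_finish_times: finish time of each scheduled task.
    server_loads: load assigned to each MEC server.
    Both weights are illustrative assumptions, not values from the paper.
    """
    makespan = max(task_finish_times)   # time at which the last task finishes
    load_std = pstdev(server_loads)     # population standard deviation of server load
    return w_time * makespan + w_balance * load_std


# Same finish times, different load distributions across three servers.
balanced = scheduling_cost([3.0, 4.0, 5.0], [4.0, 4.0, 4.0])
skewed = scheduling_cost([3.0, 4.0, 5.0], [10.0, 1.0, 1.0])
print(balanced, skewed)
```

A DRL agent would typically receive the negative of such a cost as its reward, so that a schedule with evenly loaded servers scores higher than a skewed one with the same finish times.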

A Deep Reinforcement Learning Framework for Optimal Path Planning of Industrial Robotic Arm (산업용 로봇 팔 최적 경로 계획을 위한 심층강화학습 프레임워크)

  • Kwon, Junhyung;Cho, Deun-Sol;Kim, Won-Tae
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.11a
    • /
    • pp.75-76
    • /
    • 2022
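  • Currently, path plans for industrial robotic arms are produced by robot engineers who control the robot manually while searching for an optimal path. This conventional approach is unsuitable for the coming era of mass customization, in which processes are changed flexibly to meet diverse customer demands. The deep reinforcement learning framework learns robotic arm path planning in a virtual environment. When the process changes, it automatically establishes an optimal path plan and delivers it to the robotic arm, supporting fast and flexible process changes. In this paper, we design the structure of a deep reinforcement learning framework for robotic arm path planning, focusing on building the training environment for the deep reinforcement learning agent and on integrating the AI model with that environment.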

Explainable Deep Reinforcement Learning Knowledge Distillation for Global Optimal Solutions (글로벌 최적 솔루션을 위한 설명 가능한 심층 강화 학습 지식 증류)

  • Fengjun Li;Inwhee Joe
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.11a
    • /
    • pp.524-525
    • /
    • 2023
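  • An explainable deep reinforcement learning knowledge distillation method (ERL-KD) is proposed. The method collects scores from all sub-agents; the main agent acts as the primary teacher network and the sub-agents act as auxiliary teacher networks. The global optimal solution is obtained through interpretable methods such as Shapley values. In addition, a similarity constraint is introduced to adjust the similarity between the teacher and student networks, encouraging the student network to explore freely. Experimental results show that the student network achieves performance comparable to that of the large teacher network in Atari 2600 environments.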

Resource Allocation Strategy of Internet of Vehicles Using Reinforcement Learning

  • Xi, Hongqi;Sun, Huijuan
    • Journal of Information Processing Systems
    • /
    • v.18 no.3
    • /
    • pp.443-456
    • /
    • 2022
  • An efficient and reasonable resource allocation strategy can greatly improve the service quality of the Internet of Vehicles (IoV). However, most current allocation methods suffer from an overestimation problem, which makes it difficult to provide high-performance IoV network services. To solve this problem, this paper proposes a network resource allocation strategy based on the DDQN deep learning network model. First, the method builds a refined IoV model, including a communication model, a user-layer computing model, an edge-layer offloading model, and a mobility model, that resembles actual complex IoV application scenarios. Then, the DDQN network model is used to solve the mathematical model of resource allocation. Decoupling the selection of the target Q-value action from the calculation of the target Q value avoids overestimation, so the method can provide higher-quality network services and ensure superior computing and processing performance in actual complex scenarios. Finally, simulation results show that the proposed method keeps the network delay within 65 ms and delivers excellent network performance in highly concurrent, complex scenarios with a task data volume of 500 kbits.
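
The decoupling described in this abstract matches the standard Double DQN target, which can be sketched as follows. This is a generic illustration, not the paper's implementation: the function names, the plain-list Q values, and the example numbers are assumptions.

```python
def dqn_target(reward, gamma, q_target_next, done=False):
    """Standard DQN target: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Double DQN target: the online network selects the action,
    the target network evaluates it."""
    if done:
        return reward
    # Action selection uses the online network's estimates for the next state.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # Action evaluation uses the target network's estimate for that action.
    return reward + gamma * q_target_next[best_action]


# Hypothetical next-state Q values for three actions.
q_online_next = [1.0, 2.0, 0.5]
q_target_next = [3.0, 1.5, 0.2]

y_dqn = dqn_target(1.0, 0.9, q_target_next)
y_ddqn = double_dqn_target(1.0, 0.9, q_online_next, q_target_next)
print(y_dqn, y_ddqn)
```

In this example the plain DQN target takes the largest target-network value, while the Double DQN target uses the target network's value for the action the online network prefers, which is lower here. Taking a maximum over noisy estimates is what biases the plain target upward.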

Contextual-Bandit Based Log Level Setting for Video Wall Controller (Contextual Bandit에 기반한 비디오 월 컨트롤러의 로그레벨)

  • Kim, Sung-jin
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.05a
    • /
    • pp.633-635
    • /
    • 2022
  • If an error occurs during operation of a video wall controller, the control system creates a log file and records log entries. To minimize the load that logging places on the system, the log level is set so that as little as possible is recorded under normal operating conditions. When an error occurs, the log level is changed so that detailed logs are recorded, which allows the cause of the error to be analyzed and addressed. Because an operator must intervene to change the log level, work efficiency is reduced. In this paper, we propose a model that uses a contextual bandit to set the log level automatically according to the operating situation.
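
As a rough illustration of how a contextual bandit could pick a log level, the sketch below uses a tabular epsilon-greedy policy. The abstract does not state which bandit algorithm, contexts, or reward signal the paper uses, so the class name, the log levels, the two contexts, and the reward table are all hypothetical.

```python
import random
from collections import defaultdict

LOG_LEVELS = ["ERROR", "INFO", "DEBUG"]  # hypothetical arms


class EpsilonGreedyLogLevelBandit:
    """Tabular epsilon-greedy contextual bandit (illustrative assumption)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # pulls per (context, arm)
        self.values = defaultdict(float)  # running mean reward per (context, arm)

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        key = (context, arm)
        self.counts[key] += 1
        # Incremental update of the mean reward for this (context, arm) pair.
        self.values[key] += (reward - self.values[key]) / self.counts[key]


# Hypothetical rewards: terse logs are best when healthy, verbose logs during a fault.
REWARDS = {
    "normal": {"ERROR": 1.0, "INFO": 0.5, "DEBUG": 0.0},
    "fault": {"ERROR": 0.0, "INFO": 0.5, "DEBUG": 1.0},
}

bandit = EpsilonGreedyLogLevelBandit(LOG_LEVELS, epsilon=0.1)
for context, arm_rewards in REWARDS.items():
    for arm, reward in arm_rewards.items():
        bandit.update(context, arm, reward)

bandit.epsilon = 0.0  # act greedily once estimates exist
print(bandit.choose("normal"), bandit.choose("fault"))
```

In a real controller the context would come from observed operating metrics and the reward from some measure of logging cost versus diagnostic value; both are left abstract here.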

Research on Developing a Conversational AI Callbot Solution for Medical Counselling

  • Won Ro LEE;Jeong Hyon CHOI;Min Soo KANG
    • Korean Journal of Artificial Intelligence
    • /
    • v.11 no.4
    • /
    • pp.9-13
    • /
    • 2023
  • In this study, we explored the potential of integrating interactive AI callbot technology into the medical consultation domain as part of a broader service development initiative. Aimed at enhancing patient satisfaction, the AI callbot was designed to efficiently address queries from hospitals' primary users, especially the elderly and those using phone services. By incorporating an AI-driven callbot into the hospital's customer service center, routine tasks such as appointment modifications and cancellations were handled by the AI Callbot Agent, while tasks requiring more detailed attention or specialization were handled by Human Agents, ensuring a balanced and collaborative approach. The deep learning model for voice recognition was based on the Transformer model and was obtained by fine-tuning a pre-trained model for the medical field. Existing recording files were converted into training data for self-supervised learning (SSL), and the model was implemented. An artificial neural network (ANN) model was used to analyze voice signals and interpret them as text, and after deployment the intent set was enriched through reinforcement learning to continuously improve accuracy. For TTS (Text To Speech), the Transformer model was applied to the text analysis, acoustic model, and vocoder stages, and Google's Natural Language API was applied to recognize intent. As the research progresses, challenges remain to be solved, such as interconnection issues between various EMR providers, doctors' time slots, appointments at two or more hospitals, and patient usability. However, some specialized tasks, such as making reservations, are straightforward to handle. Implementation of the callbot service in hospitals appears to be applicable immediately.

Mapless Navigation Based on DQN Considering Moving Obstacles, and Training Time Reduction Algorithm (이동 장애물을 고려한 DQN 기반의 Mapless Navigation 및 학습 시간 단축 알고리즘)

  • Yoon, Beomjin;Yoo, Seungryeol
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.3
    • /
    • pp.377-383
    • /
    • 2021
  • Recently, with the 4th industrial revolution, the use of autonomous mobile robots for flexible logistics transfer has been increasing in factories, warehouses, and service areas. In large factories, using Simultaneous Localization and Mapping (SLAM) requires a great deal of manual work, so the need for improved autonomous driving of mobile robots is emerging. Accordingly, this paper proposes an algorithm for mapless navigation that travels along an optimal path while avoiding fixed or moving obstacles. For mapless navigation, the robot is trained with a Deep Q Network (DQN) to avoid fixed or moving obstacles, and accuracies of 90% and 93% are obtained for the two types of obstacle avoidance, respectively. In addition, DQN requires a long training time to reach the required performance before use. To shorten this time, a target size change algorithm is proposed, and the reduced training time and the obstacle avoidance performance are confirmed through simulation.