• Title/Summary/Keyword: Deep Learning Convergence Research (딥러닝 융합연구)

451 search results

RoutingConvNet: A Light-weight Speech Emotion Recognition Model Based on Bidirectional MFCC (RoutingConvNet: 양방향 MFCC 기반 경량 음성감정인식 모델)

  • Hyun Taek Lim;Soo Hyung Kim;Guee Sang Lee;Hyung Jeong Yang
    • Smart Media Journal / v.12 no.5 / pp.28-35 / 2023
  • In this study, we propose RoutingConvNet, a new lightweight model with fewer parameters, to improve the applicability and practicality of speech emotion recognition. To reduce the number of learnable parameters, the proposed model concatenates bidirectional MFCCs channel-by-channel to learn long-term emotional dependencies and extract contextual features. A lightweight deep CNN is constructed for low-level feature extraction, and self-attention is used to capture channel and spatial information in the speech signal. In addition, we apply dynamic routing to improve accuracy and build a model that is robust to feature variations. The proposed model reduces parameters while improving accuracy across the speech emotion datasets (EMO-DB, RAVDESS, and IEMOCAP), achieving 87.86%, 83.44%, and 66.06% accuracy, respectively, with about 156,000 parameters. We also propose a metric that quantifies the trade-off between parameter count and accuracy for evaluating lightweight models.
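The abstract does not reproduce the paper's trade-off metric, but the idea behind such a score is easy to illustrate. The sketch below is an assumed, generic form (the formula and the `efficiency_score` name are illustrative, not from the paper): it simply scores accuracy earned per million parameters, so lighter models rank higher at equal accuracy.

```python
# Illustrative only: one assumed way to score the parameter-accuracy trade-off.
def efficiency_score(accuracy_pct, num_params):
    """Accuracy per million parameters; higher is better for lightweight models."""
    return accuracy_pct / (num_params / 1e6)

# RoutingConvNet figures quoted in the abstract: ~156,000 parameters.
scores = {
    name: efficiency_score(acc, 156_000)
    for name, acc in [("EMO-DB", 87.86), ("RAVDESS", 83.44), ("IEMOCAP", 66.06)]
}
```

Under this assumed score, a model half the size with the same accuracy would rate twice as high, which is the kind of comparison such a metric is meant to enable.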

A Study on Improving Facial Recognition Performance to Introduce a New Dog Registration Method (새로운 반려견 등록방식 도입을 위한 안면 인식 성능 개선 연구)

  • Lee, Dongsu;Park, Gooman
    • Journal of Broadcast Engineering / v.27 no.5 / pp.794-807 / 2022
  • Although registration of dogs is mandatory under the revised Animal Protection Act, the registration rate remains low because the current registration method is inconvenient. In this paper, we study performance improvements to dog face recognition technology, which is being reviewed as a new registration method. Using deep learning, we generate an embedding vector for a dog's face and experiment with methods for identifying individual dogs. We built a dog image dataset for training and experimented with InceptionNet and ResNet-50 as backbone networks. The models were trained with the triplet loss, and the experiments were divided into face verification and face recognition. The ResNet-50-based model achieved the best face verification performance of 93.46%, and in the face recognition test the highest performance of 91.44% was obtained at rank-5. The experimental methods and results presented in this paper can be used in various applications, such as checking whether a dog is registered and verifying an individual at a dog-access facility.
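The triplet loss mentioned above can be sketched in a few lines. This is the standard formulation on embedding vectors; the margin value is chosen for illustration, since the abstract does not give the paper's hyperparameters.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the anchor toward the positive (same dog)
    and push it at least `margin` farther from the negative (different dog),
    measured in squared Euclidean distance between embeddings."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)
```

The loss reaches zero exactly when every same-dog pair is closer than every different-dog pair by the margin, which is what makes the learned embeddings usable for both verification and rank-based recognition.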

A Study on Deep Learning based Aerial Vehicle Classification for Armament Selection (무장 선택을 위한 딥러닝 기반의 비행체 식별 기법 연구)

  • Cha, Eunyoung;Kim, Jeongchang
    • Journal of Broadcast Engineering / v.27 no.6 / pp.936-939 / 2022
  • As air combat technologies have advanced in recent years, the development of air defense systems is increasingly required. In the operating concept of an anti-aircraft defense system, selecting an appropriate armament for the target is one of the capabilities needed to respond efficiently to threats with limited anti-aircraft resources. Much flying-threat identification still relies on the operator's visual identification, but there are many limitations in visually discriminating a flying object maneuvering at high speed from a distance. In addition, as demand for unmanned and intelligent weapon systems on the modern battlefield increases, it is essential to develop technology that automatically identifies and classifies aircraft in place of the operator's visual identification. Although some examples of weapon-system identification using deep learning-based models trained on video data of tanks and warships have been presented, work on aerial vehicle identification is still lacking. Therefore, in this paper, we present a model for classifying fighters, helicopters, and drones using a convolutional neural network and analyze its performance.

A Study on the Generation of Fouling Organism Information Based Aids to Navigation (항로표지 기반의 부착생물 정보 생성에 관한 연구)

  • Shin-Girl Lee;Chae-Uk Song;Yun-Ja Yoo;Min Jung
    • Journal of the Korean Society of Marine Environment &amp; Safety / v.29 no.5 / pp.456-461 / 2023
  • The Korea Maritime Environment Corporation conducts a comprehensive national marine ecosystem survey, commissioned by the Ministry of Oceans and Fisheries (MOF), to ensure continuous use of the ocean and to preserve and manage the marine ecosystem. The survey has designated major fixed stations to investigate changes in the marine ecosystem around the Korean Peninsula; however, because these stations are set along the coast, the scope of the investigation needs to be expanded to offshore areas. Meanwhile, the Aids to Navigation Division of the MOF supports the survey by providing photographs of fouling organisms taken during lifting inspections of Aids to Navigation, but the photographs are provided only in consultation with the Korea Maritime Environment Corporation. Therefore, this study generated information on fouling organisms by applying deep learning-based image-processing algorithms to lifted Aids to Navigation and light buoys, so that Aids to Navigation can serve as a major component of the national survey. If Aids to Navigation are used as survey stations, they could provide fundamental data that enhances their own value and supports analysis of abnormal marine conditions and ecosystem changes in Korea.

A Pilot Study on Outpainting-powered Pet Pose Estimation (아웃페인팅 기반 반려동물 자세 추정에 관한 예비 연구)

  • Gyubin Lee;Youngchan Lee;Wonsang You
    • Journal of the Institute of Convergence Signal Processing / v.24 no.1 / pp.69-75 / 2023
  • In recent years, there has been growing interest in deep learning-based animal pose estimation, especially for animal behavior analysis and healthcare. However, existing animal pose estimation techniques perform poorly when body parts are occluded or missing from the frame. In particular, occlusion of a dog's tail or ears can significantly degrade pet behavior and emotion recognition. To address this problem, we propose a simple yet novel framework for pet pose estimation in which the pose is predicted on an outpainted image: body parts hidden outside the input image are reconstructed by an image outpainting network that precedes the pose estimation network. We performed a preliminary study to test the feasibility of this approach, assessing CE-GAN and BAT-Fill for image outpainting and SimpleBaseline for pet pose estimation. Our experimental results show that pet pose estimation on outpainted images generated with BAT-Fill outperforms pose estimation on the original images without outpainting.

A Named Entity Recognition Model in Criminal Investigation Domain using Pretrained Language Model (사전학습 언어모델을 활용한 범죄수사 도메인 개체명 인식)

  • Kim, Hee-Dou;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.13 no.2 / pp.13-20 / 2022
  • This study develops a named entity recognition (NER) model specialized for the criminal investigation domain using deep learning techniques. We propose a system that can contribute to crime analysis for prevention and investigation by automatically extracting and categorizing crime-related information from text data such as criminal judgments and investigation documents. For this study, criminal investigation texts were collected and the required named entities were newly defined from the perspective of crime analysis. The proposed model, which applies KoELECTRA, a pretrained language model that has recently shown high performance in natural language processing, achieves a micro-average F1-score of 98% and a macro-average F1-score of 95% on the 9 main categories of the crime-domain NER data, and a micro-average F1-score of 98% and a macro-average F1-score of 62% on the 56 subcategories. We analyze the proposed model from the perspective of future improvement and utilization.
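The large gap between micro- and macro-averaged F1 on the 56 subcategories (98% vs. 62%) follows directly from how the two averages are defined. The sketch below is a generic computation for single-label multi-class predictions, not the paper's evaluation code:

```python
def f1_scores(y_true, y_pred, classes):
    """Per-class, macro-averaged, and micro-averaged F1 for single-label
    multi-class predictions."""
    tp = {c: 0 for c in classes}
    fp = {c: 0 for c in classes}
    fn = {c: 0 for c in classes}
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1   # predicted class gets a false positive
            fn[t] += 1   # true class gets a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in classes}
    macro = sum(per_class.values()) / len(classes)                     # every class weighs equally
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))   # pooled counts
    return per_class, macro, micro
```

Micro averaging pools counts, so frequent entity types dominate; macro averaging weighs all classes equally, so rare, poorly-predicted subcategories pull the score down, which is the likely reason for the 98%/62% split.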

A Study on a Real-Time Aerial Image-Based UAV-USV Cooperative Guidance and Control Algorithm (실시간 항공영상 기반 UAV-USV 간 협응 유도·제어 알고리즘 개발)

  • Do-Kyun Kim;Jeong-Hyeon Kim;Hui-Hun Son;Si-Woong Choi;Dong-Han Kim;Chan Young Yeo;Jong-Yong Park
    • Journal of the Society of Naval Architects of Korea / v.61 no.5 / pp.324-333 / 2024
  • This paper focuses on cooperation between an Unmanned Aerial Vehicle (UAV) and an Unmanned Surface Vessel (USV). It aims to develop efficient guidance and control algorithms for the USV based on obstacle identification and path planning from aerial images captured by the UAV. Various obstacle scenarios were implemented using the Robot Operating System (ROS) and the Gazebo simulation environment. The aerial images transmitted in real time from the UAV to the USV are processed with the computer vision-based deep learning model You Only Look Once (YOLO) to classify and recognize elements such as the water surface, obstacles, and ships. The recognized data are used to create a two-dimensional grid map, and algorithms such as A* and Rapidly-exploring Random Tree star (RRT*) are used for path planning. This process enhances the guidance and control strategies of the UAV-USV collaborative system, particularly improving the navigational capability of the USV in complex and dynamic environments. This research offers significant insights into obstacle avoidance and path planning in maritime environments and proposes new directions for the integrated operation of UAVs and USVs.
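Of the planners named above, A* on the two-dimensional grid map is straightforward to sketch. The implementation below is a generic 4-connected grid A* with a Manhattan-distance heuristic, not the paper's code:

```python
import heapq

def astar(grid, start, goal):
    """A* shortest path on a 2D occupancy grid (0 = free, 1 = obstacle).
    4-connected moves; Manhattan distance is an admissible heuristic here."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_heap = [(h(start), 0, start)]   # (f = g + h, g, cell)
    came_from = {start: None}
    g = {start: 0}
    while open_heap:
        _, cost, cur = heapq.heappop(open_heap)
        if cur == goal:                  # reconstruct path by walking parents
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        if cost > g[cur]:                # stale heap entry, skip
            continue
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g[cur] + 1
                if ng < g.get((nr, nc), float("inf")):
                    g[(nr, nc)] = ng
                    came_from[(nr, nc)] = cur
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None                          # goal unreachable
```

In the paper's setting, the occupancy grid would come from YOLO detections of water surface, obstacles, and ships rasterized into cells, with the USV's pose as the start and a waypoint as the goal.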

Transfer Learning-based Generated Synthetic Images Identification Model (전이 학습 기반의 생성 이미지 판별 모델 설계)

  • Chaewon Kim;Sungyeon Yoon;Myeongeun Han;Minseo Park
    • The Journal of the Convergence on Culture Technology / v.10 no.2 / pp.465-470 / 2024
  • The advancement of AI-based image generation has produced a wide variety of synthetic images, emphasizing the need for technology that can accurately discern them. Because the amount of generated-image data is limited, this study proposes a transfer learning-based model for discriminating generated images that achieves high performance on a limited dataset. We apply models pretrained on the ImageNet dataset directly to the CIFAKE input dataset to reduce training time, then add three hidden layers and one output layer to fine-tune the model. The results show that performance improves when the final layers are adjusted. By using transfer learning and then tuning the layers close to the output, accuracy problems caused by small image datasets can be mitigated and generated images can be classified.
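The core mechanic described above, keeping the pretrained backbone fixed while training only newly added layers, can be illustrated without a deep learning framework. In this toy sketch the "pretrained" backbone is stood in for by a frozen random projection and the new head is a logistic classifier; everything here is an assumed setup for illustration, not the paper's ImageNet/CIFAKE pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen stand-in for a pretrained backbone: its weights are never updated.
W_base = rng.normal(size=(4, 8))
W_snapshot = W_base.copy()

def features(x):
    return np.tanh(x @ W_base)  # fixed feature extraction, as with frozen layers

# Toy labeled data standing in for the target dataset.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Only the new head (a logistic classifier) is trained, by gradient descent.
w, b = np.zeros(8), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(features(X) @ w + b)))
    w -= 0.5 * (features(X).T @ (p - y)) / len(y)
    b -= 0.5 * (p - y).mean()

acc = ((1.0 / (1.0 + np.exp(-(features(X) @ w + b))) > 0.5) == y).mean()
```

In a real framework the same idea is expressed by marking the backbone's parameters non-trainable and optimizing only the appended layers, which is what makes training cheap on a small dataset.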

Spot The Difference Generation System Using Generative Adversarial Networks (생성적 적대 신경망을 활용한 다른 그림 찾기 생성 시스템)

  • Song, Seongheon;Moon, Mikyeong;Choi, Bongjun
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.673-674 / 2021
  • This paper proposes a system that makes it easy to generate spot-the-difference puzzles, a concentration-building game, against any theme the user likes. To help with early prevention of attention deficit hyperactivity disorder (ADHD), which is mainly diagnosed in childhood and can persist into adulthood, the goal is to take part of a selected picture, generate a new object with a generative adversarial network, and blend it naturally into the original picture. Creating a single spot-the-difference puzzle normally requires an expert working for a long time with a professional tool such as Photoshop. The final goal of this study is to enable non-experts to easily perform this work that currently demands professional skills.


Designing a 3D-CNN for Non-Contact PPG Signal Acquisition Based on Video Imaging (영상기반 비접촉식 PPG 신호 취득을 위한 3D-CNN 설계)

  • Tae-Wan Kim;Chan-Uk Yeom;Keun-Chang Kwak
    • Annual Conference of KIPS / 2023.05a / pp.627-629 / 2023
  • Research that analyzes biosignals to predict a user's physical and mental state and to prevent related diseases is increasing. Among biosignals, heart rate is a representative signal reflecting a person's physical and mental condition, but conventional methods of measuring it, ECG via contact pads or PPG via optical sensors, require a constrained environment and are hard to apply in everyday situations. To overcome this limitation, this paper preprocesses video frames from the UBFC-RPPG dataset by applying a different weight to each RGB channel, reducing the size of the training data while improving accuracy, and trains a 3D-CNN-based deep learning model on 1-second preprocessed clips so that the PPG signal can be predicted even from a brief video. Signals acquired in this non-contact way can be applied in diverse fields such as emotion classification, depression diagnosis, and disease detection across a wider range of environments.
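The per-channel weighting step described above can be sketched simply. The weights below are assumptions for illustration (the green channel is often emphasized in remote PPG because blood-volume changes modulate it most strongly), not the values used in the paper:

```python
import numpy as np

def weight_channels(frames, w=(0.25, 0.6, 0.15)):
    """Collapse an RGB video clip (T, H, W, 3) to a single weighted channel
    (T, H, W), shrinking the tensor a 3D-CNN must process by a factor of 3.
    The default weights are illustrative assumptions."""
    w = np.asarray(w, dtype=np.float32)
    return (frames.astype(np.float32) * w).sum(axis=-1)

# Toy 1-second clip at 30 fps (tiny spatial size), pure green frames.
clip = np.zeros((30, 4, 4, 3), dtype=np.uint8)
clip[..., 1] = 100
out = weight_channels(clip)
```

The weighted clip keeps the temporal dimension intact, so a 3D convolution can still learn the pulse-induced intensity variation across the 30 frames.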