• Title/Summary/Keyword: Deep Learning System


Development of Gas Type Identification Deep-learning Model through Multimodal Method (멀티모달 방식을 통한 가스 종류 인식 딥러닝 모델 개발)

  • Seo Hee Ahn;Gyeong Yeong Kim;Dong Ju Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.12 / pp.525-534 / 2023
  • A gas leak detection system is key to minimizing the loss of life caused by the explosiveness and toxicity of gas. Most leak detection systems rely on gas sensors or thermal imaging cameras. To improve on the performance of single-modal gas leak detection systems, this paper proposes a multimodal approach that combines gas sensor data and thermal camera data in a gas type identification model. MultimodalGasData, an open multimodal dataset, is used to compare four models developed with the multimodal approach against existing models. The 1D CNN and GasNet models show the highest performance, 96.3% and 96.4% respectively, and an early fusion model combining 1D CNN and GasNet reaches 99.3%, 3.3% higher than the existing model. We hope that damage caused by gas leaks can be further reduced through the gas leak detection system proposed in this study.
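
A minimal PyTorch sketch of the fusion idea described above, assuming a 1D-CNN branch for the sensor time series and a small 2D CNN standing in for GasNet on the thermal frame; the sensor count, sequence length, class count, and layer sizes are illustrative, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class EarlyFusionGasClassifier(nn.Module):
    """Illustrative fusion of a 1D-CNN sensor branch and a 2D-CNN thermal branch."""
    def __init__(self, n_sensors=7, n_classes=4):
        super().__init__()
        # 1D CNN over gas-sensor time series (length 100 assumed)
        self.sensor_branch = nn.Sequential(
            nn.Conv1d(n_sensors, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        # Small 2D CNN standing in for GasNet over the thermal image
        self.thermal_branch = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Fuse the two feature vectors, then classify the gas type
        self.classifier = nn.Linear(32 + 32, n_classes)

    def forward(self, sensor_seq, thermal_img):
        fused = torch.cat([self.sensor_branch(sensor_seq),
                           self.thermal_branch(thermal_img)], dim=1)
        return self.classifier(fused)

# Example shapes: a batch of 8 sensor sequences and thermal frames
logits = EarlyFusionGasClassifier()(torch.randn(8, 7, 100), torch.randn(8, 3, 64, 64))
```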

Real-time Printed Text Detection System using Deep Learning Model (딥러닝 모델을 활용한 실시간 인쇄물 문자 탐지 시스템)

  • Ye-Jun Choi;Song-Won Kim;Mi-Kyeong Moon
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.3 / pp.523-530 / 2024
  • Online media such as web pages and digital documents allow users to search for specific words or phrases in real time. In printed materials such as books and reference books, however, it is often difficult to find a specific word or phrase in real time. This paper describes the development of a real-time text detection system that uses a deep learning model to detect text and OCR to recognize it. The study proposes detecting text with the EAST model, recognizing the detected text with EasyOCR, and marking the recognized text with a bounding box when it matches the word or phrase the user is searching for. With this system, users are expected to find the words or phrases they are looking for in printed materials such as books and reference books in real time, and to obtain the information they need easily and quickly.
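
A brief sketch of the search-and-highlight step using the easyocr package; here EasyOCR's built-in pipeline stands in for the paper's EAST + EasyOCR combination, and the image path and query phrase are placeholders.

```python
import easyocr

# Hedged sketch: easyocr's built-in pipeline stands in for the paper's
# EAST + EasyOCR combination; 'page_scan.jpg' and the query are placeholders.
reader = easyocr.Reader(['ko', 'en'])          # Korean + English recognition
results = reader.readtext('page_scan.jpg')     # list of (box, text, confidence)

query = 'deep learning'                        # word or phrase the user searches for
for box, text, conf in results:
    if query.lower() in text.lower():
        print(f'Found "{text}" (conf={conf:.2f}) at box {box}')
```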

Context-Awareness Cat Behavior Captioning System (반려묘의 상황인지형 행동 캡셔닝 시스템)

  • Chae, Heechan;Choi, Yoona;Lee, Jonguk;Park, Daihee;Chung, Yongwha
    • Journal of Korea Multimedia Society / v.24 no.1 / pp.21-29 / 2021
  • With the recent increase in the number of households raising pets, various engineering studies on pets have been under way. The ultimate purpose of this research is to automatically generate context-aware captions that express a cat's implicit intentions based on its behavior and sound, by embedding already mature pet behavior detection technology as a basic component of video captioning research. As a pilot study to this end, this paper proposes a captioning system that uses optical-flow, RGB, and sound information from cat videos. The proposed system extracts feature vectors from the video and sound of datasets collected in an actual breeding environment and then, through a hierarchical LSTM encoder and decoder, learns to identify the cat's behavior and its implicit intentions and to generate context-aware captions. The performance of the proposed system was verified experimentally using video data collected in an environment where cats are actually raised.
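
A hedged sketch of a hierarchical LSTM encoder-decoder of the kind described above, assuming per-segment feature vectors that already fuse RGB, optical-flow, and sound information; all dimensions and the vocabulary size are illustrative.

```python
import torch
import torch.nn as nn

class HierarchicalCaptioner(nn.Module):
    """Illustrative two-level LSTM encoder with an LSTM caption decoder."""
    def __init__(self, feat_dim=512, hid=256, vocab=1000):
        super().__init__()
        self.frame_enc = nn.LSTM(feat_dim, hid, batch_first=True)   # frames -> segment
        self.segment_enc = nn.LSTM(hid, hid, batch_first=True)      # segments -> video
        self.embed = nn.Embedding(vocab, hid)
        self.decoder = nn.LSTM(hid, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    def forward(self, segments, captions):
        # segments: (B, n_segments, n_frames, feat_dim) fused RGB/flow/sound features
        B, S, F, D = segments.shape
        _, (h, _) = self.frame_enc(segments.reshape(B * S, F, D))
        _, (video_state, c) = self.segment_enc(h[-1].reshape(B, S, -1))
        dec_out, _ = self.decoder(self.embed(captions), (video_state, c))
        return self.out(dec_out)                                     # (B, T, vocab)

# Toy batch: 2 videos x 4 segments x 16 frames, 12-token captions
logits = HierarchicalCaptioner()(torch.randn(2, 4, 16, 512), torch.randint(0, 1000, (2, 12)))
```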

Human Gait Recognition Based on Spatio-Temporal Deep Convolutional Neural Network for Identification

  • Zhang, Ning;Park, Jin-ho;Lee, Eung-Joo
    • Journal of Korea Multimedia Society / v.23 no.8 / pp.927-939 / 2020
  • Gait recognition can identify people from a long distance, which is very important for improving the intelligence of monitoring systems. Among many human features, gait has the advantages of being remotely observable, robust, and secure. Traditional gait feature extraction, shaped by the development of behavior recognition, relies on manual feature design and cannot meet the needs of fine-grained gait recognition. The emergence of deep convolutional neural networks has freed researchers from complex feature engineering: usable features can be learned automatically from data, and such networks are now widely used. In this paper, we conduct feature metric learning in three-dimensional space by combining the 3D convolutional features of the gait sequence with a Siamese structure. This method captures information along both the spatial and temporal dimensions of the continuous, periodic gait sequence and further improves the accuracy and practicality of gait recognition.
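
A minimal sketch of the general idea, combining a 3D-convolutional embedding with a Siamese-style contrastive loss; the layer sizes, clip shape, and margin are assumptions for illustration rather than the authors' network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Gait3DEmbedding(nn.Module):
    """Shared 3D-CNN branch of a Siamese network for gait sequences."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(32, emb_dim),
        )

    def forward(self, clip):               # clip: (B, 1, T, H, W) silhouette sequence
        return F.normalize(self.net(clip), dim=1)

def contrastive_loss(z1, z2, same, margin=1.0):
    """Pull same-identity pairs together, push different identities apart."""
    d = F.pairwise_distance(z1, z2)
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()

enc = Gait3DEmbedding()
a, b = torch.randn(4, 1, 16, 64, 44), torch.randn(4, 1, 16, 64, 44)
loss = contrastive_loss(enc(a), enc(b), same=torch.tensor([1., 0., 1., 0.]))
```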

Deep Learning based Music Classification System (딥러닝 기반의 음원검색 및 분류 시스템)

  • Lee, Sei-Hoon;Jeong, Ui-Jung
    • Proceedings of the Korean Society of Computer Information Conference / 2018.07a / pp.119-120 / 2018
  • This paper proposes a music classification system that listens to a piece of music and recognizes and identifies which piece it is, implemented with deep learning. The proposed system trains a deep artificial neural network on features extracted from audio files by several audio feature extraction models, so that it can capture and recognize the distinctive vocal and accompaniment characteristics of each track. Unlike existing fingerprint-based database search systems, this approach is designed to be closer to the way people remember music, improving adaptability and flexibility and allowing the system to be used in a variety of application fields.
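
A hedged sketch of the general pipeline, using librosa to extract timbre and harmony descriptors and a small dense network as the classifier; the audio file, feature choice, and class count are placeholder assumptions.

```python
import librosa
import numpy as np
import torch
import torch.nn as nn

# Assumed sketch: summarise a track with MFCC and chroma statistics, then classify
# with a small dense network. 'song.wav' and the class count are placeholders.
y, sr = librosa.load('song.wav', duration=30.0)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)          # timbre / vocal colour
chroma = librosa.feature.chroma_stft(y=y, sr=sr)            # harmonic / accompaniment
feat = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                       chroma.mean(axis=1)])                 # fixed-length descriptor

classifier = nn.Sequential(nn.Linear(feat.shape[0], 64), nn.ReLU(),
                           nn.Linear(64, 10))                # 10 assumed classes
logits = classifier(torch.tensor(feat, dtype=torch.float32))
```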


ADD-Net: Attention Based 3D Dense Network for Action Recognition

  • Man, Qiaoyue;Cho, Young Im
    • Journal of the Korea Society of Computer and Information / v.24 no.6 / pp.21-28 / 2019
  • In recent years, with the development of artificial intelligence and the success of deep models, they have been deployed in all fields of computer vision. Action recognition, as an important branch of human perception and computer vision research, has attracted more and more attention. It is a challenging task due to the complexity of human movement: the same movement can look different across individuals. Because human actions appear as continuous image frames in video, action recognition requires more computational power than processing static images, and simply using a CNN cannot achieve the desired results. Recently, attention models have achieved good results in computer vision and natural language processing. In particular, for video action classification, adding an attention model makes it easier to focus on motion features and improves performance; it also intuitively explains which part the model attends to when making a particular decision, which is very helpful in real applications. In this paper, we propose ADD-Net, a 3D dense convolutional network based on an attention mechanism, for recognizing human motion in video.
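
A short sketch of channel attention applied to 3D (spatio-temporal) convolutional features, in the spirit of the attention mechanism described above; the layer it attaches to and all sizes are illustrative assumptions, not the ADD-Net definition.

```python
import torch
import torch.nn as nn

class ChannelAttention3D(nn.Module):
    """Squeeze-and-excitation style attention over 3D (spatio-temporal) features."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                        # x: (B, C, T, H, W)
        w = self.fc(x.mean(dim=(2, 3, 4)))       # global pooling -> channel weights
        return x * w.view(*w.shape, 1, 1, 1)     # re-weight motion-relevant channels

# Attention applied to the output of one 3D convolution stage
features = nn.Conv3d(3, 32, kernel_size=3, padding=1)(torch.randn(2, 3, 16, 56, 56))
attended = ChannelAttention3D(32)(features)
```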

Proper Noun Embedding Model for the Korean Dependency Parsing

  • Nam, Gyu-Hyeon;Lee, Hyun-Young;Kang, Seung-Shik
    • Journal of Multimedia Information System / v.9 no.2 / pp.93-102 / 2022
  • Dependency parsing is the problem of deciding the syntactic relations between words in a sentence. Recently, deep learning models have been used for dependency parsing based on word representations in a continuous vector space. However, this causes a mislabeling problem for proper nouns that rarely appear in the training corpus, because it is difficult to express out-of-vocabulary (OOV) words in a continuous vector space. To solve the OOV problem in dependency parsing, we explore proper noun embedding methods according to the embedding unit. Before representing words in a continuous vector space, we replace proper nouns with a special token and train their contextual features using a multi-layer bidirectional LSTM. Two models, syllable-based and morpheme-based, are proposed for proper noun embedding, and dependency parsing performance improves more with an ensemble of the two than with either the syllable or the morpheme embedding model alone. Experimental results show that the ensemble model improves UAS by 1.69%p and LAS by 2.17%p over a Malt parser based on the same arc-eager approach.
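
A minimal sketch of the proper-noun replacement and contextual encoding steps, assuming integer token ids, an 'NNP' POS tag for proper nouns, and illustrative embedding and hidden sizes.

```python
import torch
import torch.nn as nn

# Assumed sketch: proper nouns are mapped to one shared token id before embedding,
# and a multi-layer bidirectional LSTM provides contextual features.
PROPER_NOUN_ID, VOCAB = 1, 5000

def replace_proper_nouns(token_ids, pos_tags):
    """Replace tokens POS-tagged as proper nouns (e.g. 'NNP') with a special id."""
    return [PROPER_NOUN_ID if tag == 'NNP' else tid
            for tid, tag in zip(token_ids, pos_tags)]

embed = nn.Embedding(VOCAB, 100)
encoder = nn.LSTM(100, 200, num_layers=2, bidirectional=True, batch_first=True)

ids = replace_proper_nouns([421, 87, 1930], ['NNP', 'JKS', 'VV'])    # toy sentence
ctx, _ = encoder(embed(torch.tensor([ids])))   # (1, seq_len, 400) contextual features
```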

TVM-based Performance Optimization for Image Classification in Embedded Systems (임베디드 시스템에서의 객체 분류를 위한 TVM기반의 성능 최적화 연구)

  • Cheonghwan Hur;Minhae Ye;Ikhee Shin;Daewoo Lee
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.3 / pp.101-108 / 2023
  • Optimizing the performance of deep neural networks on embedded systems is a challenging task that requires efficient compilers and runtime systems. We propose a TVM-based approach consisting of three steps: quantization, auto-scheduling, and ahead-of-time compilation. Our approach reduces the computational complexity of models without a significant loss of accuracy and generates optimized code for various hardware platforms. We evaluate the approach on three representative CNNs with the ImageNet dataset on an NVIDIA Jetson AGX Xavier board and show that it outperforms baseline methods in terms of processing speed.
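
A hedged sketch of the quantize-then-compile flow with TVM's Relay API; auto-scheduling and the ahead-of-time executor options are omitted for brevity, the model file and input shape are placeholders, and exact option names can differ across TVM releases.

```python
import onnx
import tvm
from tvm import relay

# Assumed sketch of the quantization and compilation steps with TVM Relay;
# 'model.onnx' and the input shape are placeholders.
mod, params = relay.frontend.from_onnx(onnx.load('model.onnx'),
                                       shape={'input': (1, 3, 224, 224)})

# 1) Post-training quantization to shrink compute without retraining
with relay.quantize.qconfig(calibrate_mode='global_scale', global_scale=8.0):
    mod = relay.quantize.quantize(mod, params)

# 2) Build for the target device (auto-scheduling would tune the schedules first)
target = tvm.target.Target('cuda')            # e.g. Jetson GPU; 'llvm' for CPU
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
lib.export_library('model_opt.so')            # deployable artifact for the board
```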

Real-Time Fire Detection Method Using YOLOv8 (YOLOv8을 이용한 실시간 화재 검출 방법)

  • Tae Hee Lee;Chun-Su Park
    • Journal of the Semiconductor & Display Technology / v.22 no.2 / pp.77-80 / 2023
  • Since fires in uncontrolled environments pose serious risks to society and individuals, many researchers have investigated technologies for the early detection of fires that occur in everyday life. Recently, with the development of deep learning vision technology, research on fire detection models using neural network backbones such as Transformers and Convolutional Neural Networks has been actively conducted. Vision-based fire detection systems can solve many of the problems of physical sensor-based systems. This paper proposes a fire detection method that improves on existing methods by using the latest YOLOv8. The proposed method builds a system that detects sparks and smoke in input images by training a YOLOv8 model on a general-purpose fire detection dataset. We also demonstrate the superiority of the proposed method through experiments comparing it with existing methods.
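
A short sketch of training and inference with the ultralytics YOLOv8 API; the dataset file 'fire.yaml' (with spark/smoke classes), the model size, and the hyperparameters are placeholder assumptions.

```python
from ultralytics import YOLO

# Assumed sketch using the ultralytics YOLOv8 API; 'fire.yaml' (a dataset config
# with spark/smoke classes) and the hyperparameters are placeholders.
model = YOLO('yolov8n.pt')                       # pretrained backbone
model.train(data='fire.yaml', epochs=100, imgsz=640)

# Inference on a single frame from a camera or video stream
results = model.predict(source='cctv_frame.jpg', conf=0.4)
for box in results[0].boxes:
    print(box.cls, box.conf, box.xyxy)           # class id, score, coordinates
```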


GAN-based camouflage pattern generation parameter optimization system for improving assimilation rate with environment (야생 환경과의 동화율 개선을 위한 GAN 알고리즘 기반 위장 패턴 생성 파라미터 최적화 시스템)

  • Park, JunHyeok;Park, Seungmin;Cho, Dae-Soo
    • Proceedings of the Korean Society of Computer Information Conference / 2022.07a / pp.511-512 / 2022
  • Animal markings play an important role in surviving predators in the wild, depending on the habitat, because one of their functions is camouflage that hides the animal from predators' eyes in natural and wild environments. In this paper, we propose a GAN-based camouflage pattern generation model to improve existing camouflage patterns. Existing models simply use color and blur the outlines of the camouflage pattern to obscure human observation; to overcome this simplicity, the proposed model uses Deep Dream, an application technique of the GAN algorithm, to adjust the filter values of a specific layer through gradient ascent, so that distinctive patterns can be generated for desired regions. By blending not only colors but also animal markings that serve a camouflage function, the model aims to generate camouflage patterns with a higher rate of assimilation into natural and wild environments.
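
A minimal sketch of the Deep Dream step, gradient ascent on one filter of a pretrained CNN so that its pattern emerges in an image; the backbone, layer and filter indices, step size, and iteration count are illustrative assumptions.

```python
import torch
import torchvision.models as models

# Assumed sketch of the Deep Dream step: gradient ascent amplifies one filter's
# response so its texture emerges in the image. All indices and settings are
# illustrative placeholders.
cnn = models.vgg16(weights='DEFAULT').features.eval()
layer, filter_idx = 10, 3                       # which conv layer / filter to amplify

img = torch.rand(1, 3, 224, 224, requires_grad=True)   # start from a base texture
opt = torch.optim.Adam([img], lr=0.05)
for _ in range(50):
    opt.zero_grad()
    x = img
    for i, m in enumerate(cnn):
        x = m(x)
        if i == layer:
            break
    loss = -x[0, filter_idx].mean()             # ascend the filter's mean activation
    loss.backward()
    opt.step()
pattern = img.detach().clamp(0, 1)              # candidate camouflage texture
```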
