• Title/Summary/Keyword: Deep Learning System


Active Vision from Image-Text Multimodal System Learning (능동 시각을 이용한 이미지-텍스트 다중 모달 체계 학습)

  • Kim, Jin-Hwa;Zhang, Byoung-Tak
    • Journal of KIISE / v.43 no.7 / pp.795-800 / 2016
  • In image classification, recent CNNs rival human performance. However, they remain limited in more general recognition tasks. Here we deal with indoor images, which contain too much information to be processed directly and require information reduction before recognition. To reduce the amount of data processing, variational inference or variational Bayesian methods are typically suggested for object detection. However, these methods suffer from the difficulty of marginalizing over the given space. In this study, we propose an image-text integrated recognition system using active vision based on Spatial Transformer Networks. The system attempts to efficiently sample a partial region of a given image according to the given language information. Our experimental results demonstrate a significant improvement over traditional approaches. We also discuss the results of a qualitative analysis of the sampled images, the model's characteristics, and its limitations.

A Study on the Implementation of Real-Time Marine Deposited Waste Detection AI System and Performance Improvement Method by Data Screening and Class Segmentation (데이터 선별 및 클래스 세분화를 적용한 실시간 해양 침적 쓰레기 감지 AI 시스템 구현과 성능 개선 방법 연구)

  • Wang, Tae-su;Oh, Seyeong;Lee, Hyun-seo;Choi, Donggyu;Jang, Jongwook;Kim, Minyoung
    • The Journal of the Convergence on Culture Technology / v.8 no.3 / pp.571-580 / 2022
  • Marine deposited waste is a major cause of problems such as extensive damage and a growing estimated volume of garbage, as ghost fishing leads to abandoned fishing grounds. In this paper, we implement a real-time marine deposited waste detection artificial intelligence system to understand the actual conditions of waste fishing gear usage, distribution, loss, and recovery, and study methods for improving its performance. The system was implemented using the YOLOv5 model, a high-performing model for real-time object detection, and a 'data screening process' and a 'class segmentation' method for the training data were applied as performance improvement methods. In conclusion, the object detection results on datasets that screen out unnecessary data, or that do not subdivide similar items according to characteristics and uses, are better than the object recognition results on unscreened datasets and on datasets in which classes are subdivided.

Enhanced Sound Signal Based Sound-Event Classification (향상된 음향 신호 기반의 음향 이벤트 분류)

  • Choi, Yongju;Lee, Jonguk;Park, Daihee;Chung, Yongwha
    • KIPS Transactions on Software and Data Engineering / v.8 no.5 / pp.193-204 / 2019
  • The explosion of data due to improvements in sensor technology and computing performance has become the basis for analyzing conditions in industrial fields, and attempts to detect events from such data have been increasing recently. In particular, sound signals collected from sensors serve as important information for classifying events in various application fields, since they allow field information to be collected efficiently at relatively low cost. However, the performance of sound-event classification in the field cannot be guaranteed if noise cannot be removed. That is, to implement a system that can be applied in practice, robust performance must be guaranteed even under various noise conditions. In this study, we propose a system that classifies sound events after generating an enhanced sound signal with a deep learning algorithm. Specifically, to remove noise from the sound signal itself, noise-robust enhanced sound data is generated using SEGAN applied to the GAN with a VAE technique. Then, an end-to-end sound-event classification system is designed that uses the enhanced sound signal as the input to a CNN structure without a data conversion process. The performance of the proposed method was verified experimentally using sound data obtained from industrial fields, and F1 scores of 99.29% (railway industry) and 97.80% (livestock industry) were confirmed.

An Auto-Labeling based Smart Image Annotation System (자동-레이블링 기반 영상 학습데이터 제작 시스템)

  • Lee, Ryong;Jang, Rae-young;Park, Min-woo;Lee, Gunwoo;Choi, Myung-Seok
    • The Journal of the Korea Contents Association / v.21 no.6 / pp.701-715 / 2021
  • The rapid advance of recent deep learning technologies depends heavily on training datasets, which are essential for models to learn by themselves with less human effort. Compared with the work of designing deep learning models, preparing datasets is a long haul; at present, in the domain of vision intelligence, datasets are still made by hand, requiring a great deal of time and effort, with workers labeling each image directly, usually with GUI-based labeling tools. In this paper, we give an overview of the current status of vision datasets, focusing on what datasets are being shared and how they are prepared with various labeling tools. In particular, to relieve the repetitive and tiring labeling work, we present an interactive smart image annotation system with which annotation work can shift from direct, human-only manual labeling to correction after checking, supported by automatic labeling. In an experiment, we show that automatic labeling can greatly improve the productivity of dataset creation, especially by reducing the time and effort needed to specify the regions of objects found in images. Finally, we discuss critical issues encountered in the experiment with our annotation system and describe future work to raise the productivity of image dataset creation for accelerating AI technology.

Construction Method of ECVAM using Land Cover Map and KOMPSAT-3A Image (토지피복지도와 KOMPSAT-3A위성영상을 활용한 환경성평가지도의 구축)

  • Kwon, Hee Sung;Song, Ah Ram;Jung, Se Jung;Lee, Won Hee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.5 / pp.367-380 / 2022
  • In this study, a periodic and simplified way of updating and producing the ECVAM (Environmental Conservation Value Assessment Map) is presented, based on the classification of environmental values using KOMPSAT-3A satellite imagery and a land cover map. ECVAM is a map that evaluates the environmental value of the country in five stages based on 62 legal evaluation items and 8 environmental and ecological evaluation items, and is provided at two scales: 1:25000 and 1:5000. However, the 1:5000 scale environmental assessment map is produced and serviced on a slow renewal cycle of one year because of various constraints such as the absence of reference materials and differing production years. Therefore, this study used a deep learning technique, KOMPSAT-3A satellite imagery, SI (Spectral Indices), and a land cover map to examine the feasibility of constructing an environmental assessment map. As a result, the accuracy was calculated to be 87.25% and 85.88%, respectively. The results confirm the feasibility of constructing an environmental assessment map using satellite imagery, spectral indices, and land cover classification.

Development of a Acoustic Acquisition Prototype device and System Modules for Fire Detection in the Underground Utility Tunnel (지하 공동구 화재재난 감지를 위한 음향수집 프로토타입 장치 및 시스템 모듈 개발)

  • Lee, Byung-Jin;Park, Chul-Woo;Lee, Mi-Suk;Jung, Woo-Sug
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.5 / pp.7-15 / 2022
  • Since the direct and indirect damage caused by a fire in an underground utility tunnel would cause great harm to society as a whole, efforts are needed to prevent and control it in advance. Most fires that occur in cables are caused by short circuits, earth leakage, ignition due to over-current, overheating of conductor connections, and ignition due to sparks caused by insulator breakdown. To find the cause of a fire at an early stage, given the characteristics of underground utility tunnels, and to prevent disasters and safety accidents, the tunnels are continuously managed with detection systems based on image analysis. Among these, a fire detection system using CCTV-based deep learning image analysis technology has been reported. However, CCTV needs to be supplemented because it has blind spots. Therefore, we aim to develop a high-performance acoustic-based deep learning model that can prevent fires by detecting spark sounds before a fire occurs. In this study, we propose a method for collecting sound in underground utility tunnel environments using a microphone sensor, through the development and testing of a prototype module. After installing an acoustic sensor in an underground utility tunnel with heavy condensation, we verify whether data can be collected in real time without malfunction.

Overseas Address Data Quality Verification Technique using Artificial Intelligence Reflecting the Characteristics of Administrative System (국가별 행정체계 특성을 반영한 인공지능 활용 해외 주소데이터 품질검증 기법)

  • Jin-Sil Kim;Kyung-Hee Lee;Wan-Sup Cho
    • The Journal of Bigdata / v.7 no.2 / pp.1-9 / 2022
  • In the global era, the importance of imported food safety management is increasing. Address information of overseas food companies is key information for imported food safety management, and must be verified to enable prompt response and follow-up management in the event of a food risk. However, because each country's address system is different, a single verification system cannot verify the addresses of all countries. The purpose of address verification may also differ depending on the field of use. In this paper, we deal with the problem of classifying a given overseas food business address into the administrative district levels of its country. This is because, in the event of harm caused by imported food, it is necessary to find the administrative district level from the address of the relevant company and, based on this, trace the food distribution route or take measures to ban imports. However, in some countries the administrative district level name is omitted from the address, and the same place name is used repeatedly at several administrative district levels, so it is not easy to accurately classify the administrative district level from the address. In this study we propose a deep learning-based administrative district level classification model suited to this case, and verify it on actual address data of overseas food companies. Specifically, the multi-label classification model is trained using a label powerset. To verify the proposed method, accuracy was measured on the addresses of overseas manufacturing companies in Ecuador and Vietnam registered with the Ministry of Food and Drug Safety; accuracy improved by 28.1% and 13%, respectively, compared to the existing classification model.
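
The label powerset idea mentioned in this abstract can be illustrated with a minimal sketch: each distinct combination of labels is treated as one class, which turns a multi-label problem into an ordinary multi-class one. The function names and the toy labels ("province", "city", "district") are hypothetical and chosen only for illustration; the abstract does not describe the authors' actual label scheme or model.

```python
def to_powerset_classes(label_sets):
    """Map every distinct label combination to a single class id.

    label_sets: iterable of sets of label names (one set per sample).
    Returns (class_ids, combo_to_id).
    """
    combo_to_id = {}
    class_ids = []
    for labels in label_sets:
        # Sort so that {"a", "b"} and {"b", "a"} map to the same class.
        combo = tuple(sorted(labels))
        if combo not in combo_to_id:
            combo_to_id[combo] = len(combo_to_id)
        class_ids.append(combo_to_id[combo])
    return class_ids, combo_to_id


def from_powerset_class(class_id, combo_to_id):
    """Recover the original label set from a predicted class id."""
    id_to_combo = {cid: combo for combo, cid in combo_to_id.items()}
    return set(id_to_combo[class_id])


# Hypothetical toy data: which administrative levels appear in each address.
samples = [
    {"province", "city"},
    {"city"},
    {"city", "province"},
    {"province", "city", "district"},
]
class_ids, mapping = to_powerset_classes(samples)
# class_ids is [0, 1, 0, 2]: samples 0 and 2 share one combined class.
```

A single-label classifier can then be trained on `class_ids`. The trade-off of this transformation is that only label combinations seen during training can ever be predicted.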

Developing a Deep Learning-based Restaurant Recommender System Using Restaurant Categories and Online Consumer Review (레스토랑 카테고리와 온라인 소비자 리뷰를 이용한 딥러닝 기반 레스토랑 추천 시스템 개발)

  • Haeun Koo;Qinglong Li;Jaekyeong Kim
    • Information Systems Review / v.25 no.1 / pp.27-46 / 2023
  • Research on restaurant recommender systems has been conducted in response to the development of the food service industry and the increasing demand for restaurants. Existing restaurant recommendation studies extracted consumer preference information from quantitative information or sentiment analysis of online reviews, but they are limited in that they cannot reflect consumers' semantic preference information. In addition, there is a lack of recommendation research that reflects the detailed attributes of restaurants. To solve this problem, this study proposes a model that learns the interaction between consumer preferences and restaurant attributes by applying deep learning techniques. First, a convolutional neural network is applied to online reviews to extract consumers' semantic preference information, and embedding techniques are applied to restaurant information to extract detailed restaurant attributes. Finally, the interaction between consumer preference and restaurant attributes is learned through the element-wise product to predict the consumer's preference rating. Experiments using online reviews from Yelp.com confirmed that the proposed model shows excellent recommendation performance. By proposing a customized restaurant recommendation system using big data from the restaurant industry, this study is expected to provide various academic and practical implications.
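
As a rough illustration of the element-wise product interaction described in this abstract, the sketch below multiplies a consumer preference vector and a restaurant attribute vector component by component and maps the result to a rating. The vectors, the linear output layer, the sigmoid squashing, and the 1 to 5 rating range are all assumptions made for this example; the abstract does not specify the layers that follow the interaction.

```python
import math


def elementwise_product(u, v):
    """Component-wise product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [a * b for a, b in zip(u, v)]


def predict_rating(user_vec, item_vec, weights, bias=0.0, low=1.0, high=5.0):
    """Toy rating predictor (assumed form, not the paper's architecture).

    The interaction vector is reduced with a weighted sum, then squashed
    into the [low, high] rating range with a sigmoid.
    """
    interaction = elementwise_product(user_vec, item_vec)
    score = sum(w * x for w, x in zip(weights, interaction)) + bias
    return low + (high - low) / (1.0 + math.exp(-score))


# Hypothetical 3-dimensional features.
user_pref = [0.9, 0.1, 0.5]    # e.g. output of a review encoder
restaurant = [0.8, 0.7, 0.2]   # e.g. category embedding
rating = predict_rating(user_pref, restaurant, weights=[1.0, 1.0, 1.0])
```

In a trained model the weights would be learned jointly with the encoders; here they are fixed only so the example runs on its own.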

Analysis of the application of image quality assessment method for mobile tunnel scanning system (이동식 터널 스캐닝 시스템의 이미지 품질 평가 기법의 적용성 분석)

  • Chulhee Lee;Dongku Kim;Donggyou Kim
    • Journal of Korean Tunnelling and Underground Space Association / v.26 no.4 / pp.365-384 / 2024
  • The development of scanning technology is accelerating to enable automated inspection that is safer and more efficient than human-based inspection. Research on automatically detecting facility damage from images collected using computer vision technology is also increasing. The pixel size, quality, and quantity of images can affect the performance of deep learning or image processing for automatic damage detection. This is a basic study on acquiring high-quality raw image data and evaluating the camera performance of a mobile tunnel scanning system for deep learning-based automatic damage detection, and it proposes a method to quantitatively evaluate image quality. A test chart was attached to a panel device capable of simulating a moving speed of 40 km/h, and an indoor test was performed using the international standard ISO 12233 method. Existing image quality evaluation methods were applied to evaluate the quality of the images obtained in the indoor experiments. The shutter speed of the camera was found to be closely related to the motion blur that occurs in the image. The modulation transfer function (MTF), one of the image quality evaluation methods, can objectively evaluate image quality and was judged to be consistent with visual observation.
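
To make the MTF idea concrete, here is a simplified sketch: the edge profile across a dark-to-bright transition is differentiated into a line spread function, and the MTF is the magnitude of its Fourier transform normalized to 1 at zero frequency. This is only a one-dimensional illustration under assumed toy data; it omits the slanted-edge oversampling used in practical ISO 12233 measurements and is not the procedure the authors ran.

```python
import cmath


def line_spread(esf):
    """Differentiate an edge spread function (ESF) into a line spread function."""
    return [b - a for a, b in zip(esf, esf[1:])]


def mtf(lsf):
    """Normalized DFT magnitude of the LSF for frequencies 0..N/2."""
    n = len(lsf)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(lsf))
        spectrum.append(abs(s))
    if spectrum[0] == 0:
        raise ValueError("LSF sums to zero; no edge contrast to normalize by")
    return [m / spectrum[0] for m in spectrum]


# Hypothetical pixel intensities sampled across an edge.
sharp_edge = [0, 0, 0, 0, 1, 1, 1, 1, 1]               # ideal step
blurred_edge = [0, 0, 0.25, 0.5, 0.75, 1, 1, 1, 1]     # motion-blurred ramp

mtf_sharp = mtf(line_spread(sharp_edge))
mtf_blurred = mtf(line_spread(blurred_edge))
```

For the ideal step the MTF stays at 1 for every frequency, while the blurred ramp loses contrast as frequency rises, which is the kind of degradation that motion blur from a slow shutter would produce.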

A Study of CNN-based Super-Resolution Method for Remote Sensing Image (원격 탐사 영상을 활용한 CNN 기반의 초해상화 기법 연구)

  • Choi, Yeonju;Kim, Minsik;Kim, Yongwoo;Han, Sanghyuck
    • Korean Journal of Remote Sensing / v.36 no.3 / pp.449-460 / 2020
  • Super-resolution is a technique for reconstructing a high-resolution image from a low-resolution one. Recently, deep learning-based super-resolution has become mainstream, and these methods are widely applied in the remote sensing field. In this paper, we propose a super-resolution method based on the deep back-projection network model to improve satellite image resolution by a factor of four. In the process, we customized the loss function with an edge loss to produce more detailed features along the boundary of each object, and used a generative adversarial network based on the Wasserstein distance loss to improve the stability of model training. We also applied a detail-preserving image down-scaling method to enhance the naturalness of the training output. Finally, by including modified residual learning with a panchromatic feature in the final step of the training process, the proposed method is able to reconstruct fine features and high-frequency information. Comparing our results with those of other methods, we find that the proposed super-resolution method improves the sharpness and clarity of WorldView-3 and KOMPSAT-2 images.