• Title/Summary/Keyword: Deep Learning


Semantic Segmentation of Drone Images Based on Combined Segmentation Network Using Multiple Open Datasets (개방형 다중 데이터셋을 활용한 Combined Segmentation Network 기반 드론 영상의 의미론적 분할)

  • Ahram Song
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.5_3
    • /
    • pp.967-978
    • /
    • 2023
  • This study proposed and validated a combined segmentation network (CSN) designed to effectively train on multiple drone image datasets and enhance the accuracy of semantic segmentation. CSN shares the entire encoding domain to accommodate the diversity of the three drone datasets, while the decoding domains are trained independently. During training, the segmentation accuracy of CSN was lower than that of U-Net and the pyramid scene parsing network (PSPNet) on single datasets because it considers the loss values of all datasets simultaneously. However, when applied to domestic autonomous drone images, CSN demonstrated the ability to classify pixels into appropriate classes without requiring additional training, outperforming PSPNet. This research suggests that CSN can serve as a valuable tool for effectively training on diverse drone image datasets and improving object recognition accuracy in new regions.
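
The abstract gives no implementation detail, but the core idea of a shared encoding domain with dataset-specific decoding heads can be sketched in a few lines of PyTorch. The class names, layer sizes, and per-dataset class counts below are illustrative assumptions, not the authors' actual CSN implementation.

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Single encoder trained on images from every dataset (illustrative)."""
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.features(x)

class CombinedSegNet(nn.Module):
    """Shared encoding domain with one independent decoding head per dataset."""
    def __init__(self, classes_per_dataset):
        super().__init__()
        self.encoder = SharedEncoder()
        self.decoders = nn.ModuleList(
            [nn.Conv2d(64, n_cls, kernel_size=1) for n_cls in classes_per_dataset]
        )

    def forward(self, x, dataset_idx):
        return self.decoders[dataset_idx](self.encoder(x))

# Training step: losses from batches of all datasets are summed, so the shared
# encoder sees every dataset while each decoder is trained independently.
model = CombinedSegNet(classes_per_dataset=[6, 8, 12])  # hypothetical class counts
criterion = nn.CrossEntropyLoss()
batches = [(torch.randn(2, 3, 64, 64), torch.randint(0, 6, (2, 64, 64)), 0)]
loss = sum(criterion(model(img, idx), mask) for img, mask, idx in batches)
loss.backward()
```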

State-of-the-Art Knowledge Distillation for Recommender Systems in Explicit Feedback Settings: Methods and Evaluation (익스플리싯 피드백 환경에서 추천 시스템을 위한 최신 지식증류기법들에 대한 성능 및 정확도 평가)

  • Hong-Kyun Bae;Jiyeon Kim;Sang-Wook Kim
    • Smart Media Journal
    • /
    • v.12 no.9
    • /
    • pp.89-94
    • /
    • 2023
  • Recommender systems provide users with the most favorable items by analyzing users' explicit or implicit feedback on items. Recently, as the size of deep-learning-based models employed in recommender systems has increased, many studies have focused on reducing inference time while maintaining high recommendation accuracy. As part of this effort, research on recommender systems with knowledge distillation (KD) techniques is being actively conducted. In KD, a small model (i.e., the student) is trained with knowledge extracted from a large model (i.e., the teacher), and the trained student is then used as the recommendation model. Existing studies on KD for recommender systems have mainly targeted implicit feedback settings. In this paper, we therefore investigate the performance and accuracy of these techniques when applied to explicit feedback settings. To this end, we leveraged a total of five state-of-the-art KD methods and three real-world datasets for recommender systems.
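
The five evaluated KD methods are not reproduced here, but the generic teacher-student setup they share can be illustrated for explicit (rating) feedback: a small student is trained on both the observed ratings and the ratings predicted by a large teacher. The model structure, embedding sizes, and the loss weight alpha below are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class MFRatingModel(nn.Module):
    """Matrix-factorization rating predictor; the teacher uses a larger embedding."""
    def __init__(self, n_users, n_items, dim):
        super().__init__()
        self.user = nn.Embedding(n_users, dim)
        self.item = nn.Embedding(n_items, dim)

    def forward(self, u, i):
        return (self.user(u) * self.item(i)).sum(dim=-1)

n_users, n_items = 1000, 500
teacher = MFRatingModel(n_users, n_items, dim=128)   # large "teacher"
student = MFRatingModel(n_users, n_items, dim=16)    # small "student"
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
mse = nn.MSELoss()
alpha = 0.5  # balance between ground-truth loss and distillation loss (assumed)

users = torch.randint(0, n_users, (64,))
items = torch.randint(0, n_items, (64,))
ratings = torch.randint(1, 6, (64,)).float()  # explicit 1-5 star feedback

with torch.no_grad():
    soft_targets = teacher(users, items)       # teacher's predicted ratings
pred = student(users, items)
loss = alpha * mse(pred, ratings) + (1 - alpha) * mse(pred, soft_targets)
loss.backward()
optimizer.step()
```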

Diagnosis of the Rice Lodging for the UAV Image using Vision Transformer (Vision Transformer를 이용한 UAV 영상의 벼 도복 영역 진단)

  • Hyunjung Myung;Seojeong Kim;Kangin Choi;Donghoon Kim;Gwanghyeong Lee;Hyung geun Ahn;Sunghwan Jeong;Byoungjun Kim
    • Smart Media Journal
    • /
    • v.12 no.9
    • /
    • pp.28-37
    • /
    • 2023
  • The main factor behind declines in rice yield is damage caused by localized heavy rains or typhoons. Existing methods of analyzing rice lodging areas rely on visual inspection and judgment during field surveys of the affected area, which makes objective results difficult to obtain and requires a great deal of time and money. In this paper, we propose a method for estimating and diagnosing rice lodging areas using a Vision Transformer-based Segformer on RGB images captured by unmanned aerial vehicles. The proposed method estimates the lodging, normal, and background areas with the Segformer model, and the lodging rate is diagnosed according to the rice field inspection criteria in the Seed Industry Act. The diagnosis result can be used to map the distribution of rice lodging areas, to show lodging trends, and to support the government's quality management of certified seed. The proposed rice lodging area estimation achieves a mean accuracy of 98.33% and an mIoU of 96.79%.
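
A hedged sketch of the diagnosis step: once a Segformer-style model has produced a per-pixel map of lodging, normal, and background classes, a lodging rate can be computed from pixel counts and compared against the inspection criteria. The label ids and the rate definition below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

# Hypothetical label ids for the three estimated regions.
BACKGROUND, NORMAL, LODGING = 0, 1, 2

def lodging_rate(pred_mask: np.ndarray) -> float:
    """Ratio of lodged rice pixels to all rice pixels in a predicted class map."""
    lodged = np.count_nonzero(pred_mask == LODGING)
    normal = np.count_nonzero(pred_mask == NORMAL)
    rice_total = lodged + normal
    return 0.0 if rice_total == 0 else lodged / rice_total

# Example: a 512x512 class map predicted by the segmentation model.
mask = np.random.randint(0, 3, size=(512, 512))
rate = lodging_rate(mask)
print(f"lodging rate: {rate:.2%}")  # would be checked against the inspection criteria
```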

Attention-Based Heart Rate Estimation using MobilenetV3

  • Yeo-Chan Yoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.12
    • /
    • pp.1-7
    • /
    • 2023
  • The advent of deep learning technologies has led to the development of various medical applications, making healthcare services more convenient and effective. Among these applications, heart rate estimation is considered a vital method for assessing an individual's health. Traditional methods, such as photoplethysmography through smartwatches, have been widely used but are invasive and require additional hardware. Recent advancements allow for contactless heart rate estimation through facial image analysis, providing a more hygienic and convenient approach. In this paper, we propose a lightweight methodology capable of accurately estimating heart rate in mobile environments, using a specialized 2-channel network structure based on 2D convolution. Our method considers both subtle facial movements and color changes resulting from blood flow and muscle contractions. The approach comprises two major components: an encoder for analyzing image features and a regression layer for estimating the blood volume pulse. By incorporating both features simultaneously, our methodology delivers more accurate results even in computing environments with limited resources. The proposed approach is expected to offer a more efficient way to monitor heart rate without invasive technology, and is particularly well suited to mobile devices.
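
The abstract describes a 2-channel, 2D-convolution structure that looks at facial motion and color change together before a regression layer outputs the blood volume pulse. A minimal sketch of such a two-branch network follows; the branch design, feature sizes, and fusion by concatenation are assumptions, not the paper's exact MobileNetV3-based architecture.

```python
import torch
import torch.nn as nn

def conv_branch(in_ch):
    """Small 2D-convolutional encoder branch (illustrative)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

class TwoChannelBVPNet(nn.Module):
    """One branch for frame-to-frame motion, one for skin-color changes,
    fused into a regression layer that outputs a blood volume pulse value."""
    def __init__(self):
        super().__init__()
        self.motion_branch = conv_branch(3)      # frame-difference input
        self.appearance_branch = conv_branch(3)  # raw face-crop input
        self.regressor = nn.Linear(32 + 32, 1)

    def forward(self, motion, appearance):
        fused = torch.cat(
            [self.motion_branch(motion), self.appearance_branch(appearance)], dim=1
        )
        return self.regressor(fused)

model = TwoChannelBVPNet()
motion = torch.randn(8, 3, 64, 64)      # batch of frame-difference crops
appearance = torch.randn(8, 3, 64, 64)  # batch of face crops
bvp = model(motion, appearance)         # shape: (8, 1)
```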

Convolutional Neural Network Model Using Data Augmentation for Emotion AI-based Recommendation Systems

  • Ho-yeon Park;Kyoung-jae Kim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.12
    • /
    • pp.57-66
    • /
    • 2023
  • In this study, we propose a novel research framework for recommendation systems that can estimate the user's emotional state and reflect it in the recommendation process by applying deep learning techniques and emotion AI (artificial intelligence). To this end, we build an emotion classification model that classifies each of seven emotions (angry, disgust, fear, happy, sad, surprise, and neutral) and propose a model that can reflect this result in the recommendation process. However, in typical emotion classification data, the class distribution is highly imbalanced, which makes generalized classification results difficult to obtain. Since classes such as disgust are often underrepresented in emotion image data, this imbalance is corrected through augmentation in this study. Lastly, we propose a method to reflect the emotion prediction model, trained on the augmented image data, in the recommendation system.
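
A minimal sketch of the class-balancing idea: underrepresented emotion classes such as disgust are padded with augmented copies before the CNN classifier is trained. The specific torchvision transforms, image size, and target counts are assumptions for illustration, not the paper's exact augmentation recipe.

```python
from torchvision import transforms
from PIL import Image

# Augmentation pipeline for underrepresented emotion classes (e.g., "disgust").
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=10),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.RandomResizedCrop(size=48, scale=(0.9, 1.0)),
])

def oversample(images, target_count):
    """Generate augmented copies until the minority class reaches target_count."""
    out = list(images)
    i = 0
    while len(out) < target_count:
        out.append(augment(images[i % len(images)]))
        i += 1
    return out

# Example: pad the "disgust" class up to the size of the largest class.
disgust_images = [Image.new("RGB", (48, 48)) for _ in range(200)]  # placeholder data
balanced = oversample(disgust_images, target_count=1000)
```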

Deep Learning-Based Personalized Recommendation Using Customer Behavior and Purchase History in E-Commerce (전자상거래에서 고객 행동 정보와 구매 기록을 활용한 딥러닝 기반 개인화 추천 시스템)

  • Hong, Da Young;Kim, Ga Yeong;Kim, Hyon Hee
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.6
    • /
    • pp.237-244
    • /
    • 2022
  • In this paper, we present a VAE-based recommendation method that uses online behavior logs and purchase history to overcome data sparsity and cold start. To generate a purchase-history variable, embedding and dimensionality reduction are applied to each customer's purchase history. Variational autoencoders are then applied to the online behavior and purchase-history data. A total of 12 variables are used, and nDCG is chosen for performance evaluation. Our experimental results show that the proposed VAE-based recommendation outperforms SVD-based recommendation, and that the generated purchase-history variable improves recommendation performance.
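
A minimal sketch of a variational autoencoder over a customer's interaction vector, in the spirit of VAE-based recommenders: the layer sizes, KL weight, and ranking step below are assumptions, not the paper's exact model, which additionally incorporates the separately embedded purchase-history variable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecVAE(nn.Module):
    """Minimal VAE over a user's item-interaction vector (illustrative)."""
    def __init__(self, n_items, latent_dim=32):
        super().__init__()
        self.enc = nn.Linear(n_items, 128)
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_items))

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.dec(z), mu, logvar

n_items = 2000
model = RecVAE(n_items)
x = torch.zeros(4, n_items)
x[:, torch.randint(0, n_items, (20,))] = 1.0  # sparse behavior/purchase vectors

recon, mu, logvar = model(x)
recon_loss = F.binary_cross_entropy_with_logits(recon, x)
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon_loss + 0.2 * kl  # KL weight is an assumed hyperparameter
loss.backward()

# Unseen items with the highest reconstructed scores become the recommendations,
# which would then be ranked and evaluated with a metric such as nDCG.
scores = recon.masked_fill(x.bool(), float("-inf"))
top_k = scores.topk(10, dim=1).indices
```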

A Study on Object Detection and Warning Model for the Prevention of Right Turn Car Accidents (우회전 차량 사고 예방을 위한 객체 탐지 및 경고 모델 연구)

  • Sang-Joon Cho;Seong-uk Shin;Myeong-Jae Noh
    • Journal of Digital Policy
    • /
    • v.2 no.4
    • /
    • pp.33-39
    • /
    • 2023
  • With the continuing occurrence of right-turn traffic accidents at intersections, there is an increasing demand for measures to address these incidents. In response, a technology has been developed that detects the presence of pedestrians through object detection in CCTV footage of right-turn areas and displays warning messages on a screen to alert drivers. The YOLO (You Only Look Once) model, a type of object detection model, was employed to assess detection performance, and an algorithm was devised to address misidentification issues and generate warning messages when pedestrians are detected. The accuracy of recognizing pedestrians or objects and outputting warning messages was measured at approximately 82%, suggesting a potential contribution to preventing right-turn accidents.
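
A hedged sketch of the detect-and-warn loop, using the open-source Ultralytics YOLO package as a stand-in for the paper's YOLO model; the confidence threshold, video source, and on-screen warning overlay are illustrative assumptions rather than the authors' misidentification-handling algorithm.

```python
import cv2
from ultralytics import YOLO  # open-source YOLO implementation (stand-in)

model = YOLO("yolov8n.pt")   # pretrained COCO weights; "person" is one of its classes
CONF_THRESHOLD = 0.5         # assumed threshold to reduce misidentification

cap = cv2.VideoCapture("right_turn_cctv.mp4")  # hypothetical CCTV clip
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, verbose=False)[0]
    pedestrian_found = any(
        model.names[int(box.cls[0])] == "person" and float(box.conf[0]) >= CONF_THRESHOLD
        for box in results.boxes
    )
    if pedestrian_found:
        # Overlay a warning message on the frame for the right-turning driver.
        cv2.putText(frame, "WARNING: PEDESTRIAN", (30, 60),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.5, (0, 0, 255), 3)
    cv2.imshow("right-turn monitor", frame)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break
cap.release()
```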

Convolutional neural networks for automated tooth numbering on panoramic radiographs: A scoping review

  • Ramadhan Hardani Putra;Eha Renwi Astuti;Aga Satria Nurrachman;Dina Karimah Putri;Ahmad Badruddin Ghazali;Tjio Andrinanti Pradini;Dhinda Tiara Prabaningtyas
    • Imaging Science in Dentistry
    • /
    • v.53 no.4
    • /
    • pp.271-281
    • /
    • 2023
  • Purpose: The objective of this scoping review was to investigate the applicability and performance of various convolutional neural network (CNN) models in tooth numbering on panoramic radiographs, achieved through classification, detection, and segmentation tasks. Materials and Methods: An online search was performed of the PubMed, Science Direct, and Scopus databases. Based on the selection process, 12 studies were included in this review. Results: Eleven studies utilized a CNN model for detection tasks, 5 for classification tasks, and 3 for segmentation tasks in the context of tooth numbering on panoramic radiographs. Most of these studies revealed high performance of various CNN models in automating tooth numbering. However, several studies also highlighted limitations of CNNs, such as the presence of false positives and false negatives in identifying decayed teeth, teeth with crown prosthetics, teeth adjacent to edentulous areas, dental implants, root remnants, wisdom teeth, and root canal-treated teeth. These limitations can be overcome by ensuring both the quality and quantity of datasets, as well as optimizing the CNN architecture. Conclusion: CNNs have demonstrated high performance in automated tooth numbering on panoramic radiographs. Future development of CNN-based models for this purpose should also consider different stages of dentition, such as the primary and mixed dentition stages, as well as the presence of various tooth conditions. Ultimately, an optimized CNN architecture can serve as the foundation for an automated tooth numbering system and for further artificial intelligence research on panoramic radiographs for a variety of purposes.

Research on Artificial Intelligence Based Shipping Container Loading Safety Management System (인공지능 기반 컨테이너 적재 안전관리 시스템 연구)

  • Kim Sang Woo;Oh Se Yeong;Seo Yong Uk;Yeon Jeong Hum;Cho Hee Jeong;Youn Joosang
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.12 no.9
    • /
    • pp.273-282
    • /
    • 2023
  • Recently, various technologies, such as logistics automation and automated port operations based on ICT, are being developed to build smart ports. However, technology development for port safety and safety accident prevention is still lacking. This paper proposes an AI-based shipping container loading safety management system for preventing safety accidents at container loading yards in ports. The system consists of an AI-based function that classifies and stores the risk of container-loading safety accidents, and a real-time safety accident monitoring function. It monitors the accident risk at the site in real time and can prevent container collapse accidents. The proposed system was developed as a prototype and evaluated by direct application in a port.

Multicontents Integrated Image Animation within Synthesis for High Quality Multimodal Video (고화질 멀티 모달 영상 합성을 통한 다중 콘텐츠 통합 애니메이션 방법)

  • Jae Seung Roh;Jinbeom Kang
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.4
    • /
    • pp.257-269
    • /
    • 2023
  • There is currently a burgeoning demand for image synthesis from photos and videos using deep learning models. Existing video synthesis models solely extract motion information from the provided video to generate animation effects on photos. However, these synthesis models encounter challenges in achieving accurate lip synchronization with the audio and maintaining the image quality of the synthesized output. To tackle these issues, this paper introduces a novel framework based on an image animation approach. Within this framework, upon receiving a photo, a video, and audio input, it produces an output that not only retains the unique characteristics of the individuals in the photo but also synchronizes their movements with the provided video, achieving lip synchronization with the audio. Furthermore, a super-resolution model is employed to enhance the quality and resolution of the synthesized output.
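
The framework is described as three stages: transferring motion from the driving video onto the photo, synchronizing the lips with the audio, and sharpening the result with a super-resolution model. The sketch below only shows how such a pipeline could be composed; the three functions are hypothetical placeholders for separate pretrained models, not the authors' framework or any published API.

```python
from typing import List

# Placeholder stand-ins for three separate pretrained models; in a real pipeline
# each would load and run its own network. These names are hypothetical.
def animate_from_video(photo, driving_video) -> List:
    return [photo for _ in driving_video]   # motion-transfer model goes here

def sync_lips_to_audio(frames, audio) -> List:
    return frames                           # audio-driven lip-sync model goes here

def upscale(frame):
    return frame                            # super-resolution model goes here

def synthesize(photo, driving_video, audio):
    """Three-stage pipeline sketched from the abstract: animate, lip-sync, upscale."""
    animated = animate_from_video(photo, driving_video)
    lip_synced = sync_lips_to_audio(animated, audio)
    return [upscale(f) for f in lip_synced]

frames = synthesize(photo="face.jpg", driving_video=["frame0", "frame1"], audio="speech.wav")
```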