Title/Summary/Keyword: multimodal data

Error Estimation Based on the Bhattacharyya Distance for Classifying Multimodal Data (Multimodal 데이터에 대한 분류 에러 예측 기법)

  • Choe, Ui-Seon;Kim, Jae-Hui;Lee, Cheol-Hui
    • Journal of the Institute of Electronics Engineers of Korea SP, v.39 no.2, pp.147-154, 2002
  • In this paper, we propose an error estimation method based on the Bhattacharyya distance for multimodal data. First, we find the empirical relationship between the classification error and the Bhattacharyya distance. Then, we investigate the possibility of deriving an error estimation equation based on the Bhattacharyya distance for multimodal data. We assume that the distribution of multimodal data can be approximated as a mixture of several Gaussian distributions. Experimental results with remotely sensed data showed that there is a strong relationship between the Bhattacharyya distance and the classification error, and that it is possible to predict the classification error from the Bhattacharyya distance for multimodal data.
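As a quick reference for the quantity this entry builds on, below is a minimal NumPy sketch of the closed-form Bhattacharyya distance between two Gaussian class densities; the Gaussian-mixture extension for multimodal classes described in the abstract is not reproduced here.

```python
import numpy as np

def bhattacharyya_gaussian(mu1, S1, mu2, S2):
    """Closed-form Bhattacharyya distance between N(mu1, S1) and N(mu2, S2)."""
    S = 0.5 * (S1 + S2)
    diff = mu2 - mu1
    term_mean = 0.125 * diff @ np.linalg.solve(S, diff)
    term_cov = 0.5 * np.log(np.linalg.det(S) /
                            np.sqrt(np.linalg.det(S1) * np.linalg.det(S2)))
    return term_mean + term_cov

# For two equiprobable classes, the classic bound on the Bayes error is
# error <= 0.5 * exp(-D_B), which is what makes D_B useful for error prediction.
```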

Multimodal Supervised Contrastive Learning for Crop Disease Diagnosis (멀티 모달 지도 대조 학습을 이용한 농작물 병해 진단 예측 방법)

  • Hyunseok Lee;Doyeob Yeo;Gyu-Sung Ham;Kanghan Oh
    • IEMEK Journal of Embedded Systems and Applications, v.18 no.6, pp.285-292, 2023
  • With the wide spread of smart farms and advances in IoT technology, it is easy to obtain additional data beyond crop images. Consequently, deep learning-based crop disease diagnosis research utilizing multimodal data has become important. This study proposes a crop disease diagnosis method using multimodal supervised contrastive learning, extending multimodal self-supervised learning. The RandAugment method was used to augment crop images and time series of environmental data. The augmented data passed through an encoder and a projection head for each modality, yielding low-dimensional features. The proposed multimodal supervised contrastive loss then pulls features from the same class closer together while pushing apart those from different classes. Finally, the pretrained model was fine-tuned for crop disease diagnosis. t-SNE visualizations and comparative assessments of crop disease diagnosis performance substantiate that the proposed method outperforms multimodal self-supervised learning.
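For orientation, here is a minimal PyTorch sketch of a supervised contrastive loss in the SupCon style the abstract describes (same-class embeddings pulled together, different-class embeddings pushed apart); the paper's exact multimodal formulation, encoders, and projection heads are not reproduced.

```python
import torch
import torch.nn.functional as F

def supcon_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of embeddings (N, D)."""
    z = F.normalize(features, dim=1)                 # unit-norm embeddings
    sim = z @ z.T / temperature                      # pairwise similarity logits
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)           # exclude self-comparisons
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Average log-likelihood of same-class (positive) pairs per anchor.
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_mask.sum(dim=1).clamp(min=1)
    return loss[pos_mask.any(dim=1)].mean()
```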

Estimation of Classification Error Based on the Bhattacharyya Distance for Data with Multimodal Distribution (Multimodal 분포 데이터를 위한 Bhattacharyya distance 기반 분류 에러예측 기법)

  • Choe, Ui-Seon;Lee, Cheol-Hui
    • Proceedings of the IEEK Conference, 2000.06d, pp.85-87, 2000
  • In pattern classification, the Bhattacharyya distance has been used as a class separability measure and provides useful information for feature selection and extraction. In this paper, we propose a method to predict the classification error for multimodal data based on the Bhattacharyya distance. In our approach, we first approximate the pdf of the multimodal distribution with a Gaussian mixture model and then find the Bhattacharyya distance and the classification error. Experimental results showed that there is a strong relationship between the Bhattacharyya distance and the classification error for multimodal data.
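A minimal sketch of the pipeline the abstract outlines, assuming scikit-learn's GaussianMixture and a Monte Carlo estimate of the Bhattacharyya coefficient; the paper's exact estimator may differ.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def bhattacharyya_gmm(X1, X2, n_components=3, n_mc=100_000, seed=0):
    """Approximate D_B between two multimodal class densities via GMM fits.

    Uses rho = E_p[sqrt(q/p)] = integral of sqrt(p*q), estimated by sampling
    from the first fitted mixture; then D_B = -ln(rho)."""
    g1 = GaussianMixture(n_components, random_state=seed).fit(X1)
    g2 = GaussianMixture(n_components, random_state=seed).fit(X2)
    samples, _ = g1.sample(n_mc)
    log_p = g1.score_samples(samples)
    log_q = g2.score_samples(samples)
    rho = np.mean(np.exp(0.5 * (log_q - log_p)))
    return -np.log(rho)
```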

Estimating Suitable Probability Distribution Function for Multimodal Traffic Distribution Function

  • Yoo, Sang-Lok;Jeong, Jae-Yong;Yim, Jeong-Bin
    • Journal of the Korean Society of Marine Environment & Safety, v.21 no.3, pp.253-258, 2015
  • The purpose of this study is to find a suitable probability distribution function for complex, multimodal distribution data. The normal distribution is broadly used as the assumed probability distribution function. However, complex multimodal data are very hard to estimate using a normal distribution alone, and errors can arise when other single distribution functions, including the normal, are used. In this study, we experimented to find the best-fitting probability distribution function for multimodal data, using AIS (Automatic Identification System) observation data gathered in Mokpo port during the year 2013. By the chi-squared statistic, the Gaussian mixture model (GMM) fit better than the other candidate distribution functions, such as the extreme value, generalized extreme value, logistic, and normal distributions. The GMM was thus found to be the appropriate model for the multimodal data of maritime traffic flow distribution, which should allow probability density functions for collision probability and traffic flow distribution to be calculated much more precisely in the future.
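To make the model comparison concrete, here is a small sketch of a chi-squared goodness-of-fit comparison between a normal fit and a two-component GMM on synthetic bimodal data; the AIS preprocessing and the paper's full candidate set are omitted.

```python
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

def chi2_gof(data, log_pdf, n_bins=30, n_params=0):
    """Chi-squared statistic of a fitted density against binned data."""
    counts, edges = np.histogram(data, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    expected = len(data) * np.exp(log_pdf(centers)) * np.diff(edges)
    ok = expected > 5                        # usual validity rule for the test
    chi2 = np.sum((counts[ok] - expected[ok]) ** 2 / expected[ok])
    return chi2, ok.sum() - 1 - n_params     # statistic and degrees of freedom

rng = np.random.default_rng(0)               # toy bimodal stand-in for AIS data
data = np.concatenate([rng.normal(-2.0, 0.5, 600), rng.normal(1.5, 0.8, 400)])

gmm = GaussianMixture(n_components=2, random_state=0).fit(data.reshape(-1, 1))
chi2_gmm, _ = chi2_gof(data, lambda x: gmm.score_samples(x.reshape(-1, 1)), n_params=5)
mu, sd = stats.norm.fit(data)
chi2_norm, _ = chi2_gof(data, lambda x: stats.norm.logpdf(x, mu, sd), n_params=2)
print(f"chi-squared: GMM {chi2_gmm:.1f} vs normal {chi2_norm:.1f} (lower is better)")
```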

Multimodal layer surveillance map based on anomaly detection using multi-agents for smart city security

  • Shin, Hochul;Na, Ki-In;Chang, Jiho;Uhm, Taeyoung
    • ETRI Journal, v.44 no.2, pp.183-193, 2022
  • Smart cities are expected to provide residents with convenience via various agents such as CCTV, delivery robots, security robots, and unmanned shuttles. Environmental data collected by these agents can be used for various purposes, including advertising and security monitoring. This study suggests a surveillance map data framework for efficient and integrated multimodal data representation from multiple agents. The suggested surveillance map is a multilayered global information grid integrated from the multimodal data of each agent. To validate it, we collected surveillance map data for 4 months and analyzed the behavior patterns of humans and vehicles and the distribution changes of elevation and temperature. Moreover, we present a two-stage anomaly detection algorithm based on the surveillance map for security services. With it, abnormal situations such as unusual crowds and pedestrians, vehicle movement, unusual objects, and temperature changes were detected. Because the surveillance map enables efficient and integrated processing of large multimodal data from multiple agents, the suggested data framework can be used for various applications in the smart city.
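As an illustration only, a minimal sketch of a multilayer grid map with a simple first-stage anomaly screen; the grid shape, layer names, and update rule are assumptions, and the paper's second detection stage is not reproduced.

```python
import numpy as np

class SurveillanceMap:
    """Multilayer global grid; each layer holds one modality's cell values."""
    def __init__(self, shape=(500, 500),
                 layers=("human", "vehicle", "elevation", "temperature")):
        self.grid = {name: np.zeros(shape) for name in layers}

    def update(self, layer, row, col, value, alpha=0.1):
        # Exponential moving average so cells track recent agent reports.
        cell = self.grid[layer][row, col]
        self.grid[layer][row, col] = (1 - alpha) * cell + alpha * value

    def screen_anomalies(self, layer, z_thresh=3.0):
        # First-stage screening: cells deviating strongly from the layer mean.
        g = self.grid[layer]
        z = (g - g.mean()) / (g.std() + 1e-9)
        return np.argwhere(np.abs(z) > z_thresh)
```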

Multimodal Media Content Classification using Keyword Weighting for Recommendation (추천을 위한 키워드 가중치를 이용한 멀티모달 미디어 콘텐츠 분류)

  • Kang, Ji-Soo;Baek, Ji-Won;Chung, Kyungyong
    • Journal of Convergence for Information Technology, v.9 no.5, pp.1-6, 2019
  • As the mobile market expands, a variety of platforms are available to provide multimodal media content. Because multimodal media content contains heterogeneous data, users need much time and effort to select their preferred content. Therefore, in this paper we propose multimodal media content classification using keyword weighting for recommendation. The proposed method extracts the keywords that best represent the content through keyword weighting over the text data of multimodal media content. Based on the extracted data, genre classes with subclasses are generated and the multimodal media content is classified appropriately. In addition, the user's preference evaluation is performed for personalized recommendation, and multimodal content is recommended based on the results of the user's content preference analysis. The performance evaluation verifies the superiority of the recommendation results in terms of accuracy and satisfaction. The recommendation accuracy is 74.62% and the satisfaction rate is 69.1%, because content is recommended considering the user's favorite keywords as well as the genre.
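The abstract does not spell out its weighting scheme here; TF-IDF is a common stand-in for scoring which keywords best represent each content item, sketched below with illustrative toy documents.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["space adventure with robots",          # toy stand-ins for content text
        "romantic drama set in Seoul",
        "robot uprising action thriller"]
vec = TfidfVectorizer()
weights = vec.fit_transform(docs)               # (n_docs, n_terms) keyword weights
terms = np.array(vec.get_feature_names_out())
for i, doc in enumerate(docs):
    row = weights[i].toarray().ravel()
    top = terms[np.argsort(row)[::-1][:2]]      # two highest-weighted keywords
    print(doc, "->", list(top))
```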

Development of Gas Type Identification Deep-learning Model through Multimodal Method (멀티모달 방식을 통한 가스 종류 인식 딥러닝 모델 개발)

  • Seo Hee Ahn;Gyeong Yeong Kim;Dong Ju Kim
    • KIPS Transactions on Software and Data Engineering, v.12 no.12, pp.525-534, 2023
  • Gas leak detection systems are key to minimizing the loss of life caused by the explosiveness and toxicity of gas. Most leak detection systems rely on gas sensors or thermal imaging cameras. To improve on such single-modal methods, this paper proposes a multimodal approach that combines gas sensor data and thermal camera data in developing a gas type identification model. MultimodalGasData, a multimodal open dataset, is used to compare the four models developed through the multimodal approach to gas sensors and thermal cameras with existing models. As a result, the 1D CNN and GasNet models show the highest performance, at 96.3% and 96.4%. The combined early-fusion model of 1D CNN and GasNet reached 99.3%, 3.3% higher than the existing model. We hope that further damage caused by gas leaks can be minimized through the gas leak detection system proposed in this study.
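A minimal sketch of feature-level early fusion of a 1D sensor branch and a thermal-image branch; the channel counts, layer sizes, and class count are illustrative, not the paper's 1D CNN or GasNet architectures.

```python
import torch
import torch.nn as nn

class EarlyFusionGasClassifier(nn.Module):
    def __init__(self, sensor_channels=7, n_classes=4):
        super().__init__()
        self.sensor_branch = nn.Sequential(      # 1D CNN over sensor time series
            nn.Conv1d(sensor_channels, 32, kernel_size=5), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))
        self.thermal_branch = nn.Sequential(     # 2D CNN over thermal frames
            nn.Conv2d(1, 32, kernel_size=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(64, n_classes)     # classify on fused features

    def forward(self, sensors, thermal):
        s = self.sensor_branch(sensors).flatten(1)    # (B, 32)
        t = self.thermal_branch(thermal).flatten(1)   # (B, 32)
        return self.head(torch.cat([s, t], dim=1))    # fuse by concatenation
```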

Danger detection technology based on multimodal and multilog data for public safety services

  • Park, Hyunho;Kwon, Eunjung;Byon, Sungwon;Shin, Won-Jae;Jung, Eui-Suk;Lee, Yong-Tae
    • ETRI Journal, v.44 no.2, pp.300-312, 2022
  • Recently, public safety services have attracted significant attention for their ability to protect people from crime. Rapid detection of dangerous situations (that is, abnormal situations where someone may be harmed or killed) is required in public safety services to reduce the time needed to respond to them. This study proposes a novel danger detection technology based on multimodal data, which includes data from multiple sensors (for example, accelerometer, gyroscope, heart rate, air pressure, and global positioning system sensors), and multilog data, which includes contextual logs of humans and places (for example, logs of human activities and crime-ridden districts) over time. To recognize human activities (for example, walking, sitting, and punching), the proposed technology uses multimodal data analysis with an attitude heading reference system (AHRS) and long short-term memory (LSTM). The proposed technology also includes multilog data analysis for detecting whether the recognized human activities are dangerous. The proposed danger detection technology will benefit public safety services by improving danger detection capabilities.
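For the activity recognition step, a minimal LSTM classifier over windowed sensor sequences might look like the sketch below; the feature count, window length, and activity set are assumptions, and the AHRS preprocessing is omitted.

```python
import torch
import torch.nn as nn

class ActivityLSTM(nn.Module):
    """Classify a sensor window (batch, time, features) into an activity."""
    def __init__(self, n_features=9, hidden=64, n_activities=3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_activities)     # e.g. walk / sit / punch

    def forward(self, x):
        _, (h, _) = self.lstm(x)
        return self.fc(h[-1])                         # final hidden state -> logits

model = ActivityLSTM()
logits = model(torch.randn(8, 100, 9))                # 8 windows of 100 timesteps
```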

Multimodal Sentiment Analysis for Investigating User Satisfaction

  • Hwang, Gyo Yeob;Song, Zi Han;Park, Byung Kwon
    • The Journal of Information Systems, v.32 no.3, pp.1-17, 2023
  • Purpose: The proliferation of data on the internet has created a need for innovative methods to analyze user satisfaction data. Traditional survey methods are becoming inadequate for the increasing volume and diversity of data, and new methods using unstructured internet data are being explored. While numerous comment-based user satisfaction studies have been conducted, only a few have explored user satisfaction through video and audio data. Multimodal sentiment analysis, which integrates multiple modalities, has gained attention due to its high accuracy and broad applicability. Design/methodology/approach: This study uses multimodal sentiment analysis to analyze user satisfaction with iPhone and Samsung products through online videos. The research reveals that the combination model integrating multiple data sources showed the best performance. Findings: The findings also indicate that price is a crucial factor influencing user satisfaction, and users tend to exhibit more positive emotions when content with a product's price. The study highlights the importance of considering multiple factors when evaluating user satisfaction and provides valuable insights into the effectiveness of different data sources for sentiment analysis of product reviews.

Building Detection by Convolutional Neural Network with Infrared Image, LiDAR Data and Characteristic Information Fusion (적외선 영상, 라이다 데이터 및 특성정보 융합 기반의 합성곱 인공신경망을 이용한 건물탐지)

  • Cho, Eun Ji;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, v.38 no.6, pp.635-644, 2020
  • Object recognition, detection, and instance segmentation based on DL (Deep Learning) have been used in various practices, and mainly optical images are used as training data for DL models. The major objective of this paper is object segmentation and building detection by utilizing multimodal datasets as well as optical images for training Detectron2, one of the improved R-CNN (Region-based Convolutional Neural Network) models. For the implementation, infrared aerial images, LiDAR (Light Detection And Ranging) data, edges extracted from the images, and Haralick features, which represent statistical texture information, derived from the LiDAR data were generated. The performance of DL models depends not only on the amount and characteristics of the training data, but also on the fusion method, especially for multimodal data. Segmenting objects and detecting buildings by hybrid fusion, a mixed method of early fusion and late fusion, yields a 32.65% improvement in building detection rate compared to training with optical images only. The experiments demonstrated the complementary effect of training on multimodal data with unique characteristics, together with the fusion strategy.
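As a rough illustration of the fusion vocabulary used above, the sketch below contrasts early fusion (stacking modalities into one input) with late fusion (combining per-modality outputs) and a hybrid of the two; Detectron2 training itself is not reproduced, and the probability-map interface is an assumption.

```python
import numpy as np

def early_fusion(optical, infrared, lidar):
    """Stack co-registered rasters as channels of a single input tensor."""
    return np.stack([optical, infrared, lidar], axis=-1)

def late_fusion(prob_maps, weights=None):
    """Average per-modality building-probability maps into one decision map."""
    return np.average(np.stack(prob_maps), axis=0, weights=weights)

def hybrid_fusion(early_model_prob, per_modality_probs):
    """Hybrid: blend the early-fused model's output with per-modality outputs."""
    return late_fusion([early_model_prob, *per_modality_probs])
```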