• Title/Summary/Keyword: multi-modal learning

Machine learning-based Multi-modal Sensing IoT Platform Resource Management (머신러닝 기반 멀티모달 센싱 IoT 플랫폼 리소스 관리 지원)

  • Lee, Seongchan;Sung, Nakmyoung;Lee, Seokjun;Jun, Jaeseok
    • IEMEK Journal of Embedded Systems and Applications, v.17 no.2, pp.93-100, 2022
  • In this paper, we propose a machine learning-based method for supporting resource management of IoT software platforms in a multi-modal sensing scenario. We assume an IoT device running a oneM2M-compatible software platform that is connected to various sensors, such as PIR, sound, dust, ambient light, ultrasonic, and accelerometer sensors, through different embedded system interfaces such as general-purpose input/output (GPIO), I2C, SPI, and USB. Based on a collected dataset that includes CPU usage and user-defined priority, a machine learning model is trained to estimate the nice value that should be applied according to resource usage patterns. The proposed method is validated by comparison with a rule-based control strategy, demonstrating its practical capability in a multi-modal sensing scenario on IoT devices.
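
The abstract gives no implementation details, so the following is a minimal Python sketch of the idea under stated assumptions: a regressor trained on CPU usage and user-defined priority predicts a nice value, which is then applied with os.setpriority. The feature set, model choice, and training data here are illustrative assumptions, not the paper's code.

```python
# Illustrative sketch only: a regressor maps per-process resource features to a
# nice value, which is then applied with os.setpriority (Unix). Feature names,
# the model choice, and the training data are assumptions, not the paper's code.
import os
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical training data: [cpu_usage_percent, user_defined_priority] -> nice value
X_train = np.array([[5.0, 1], [35.0, 2], [80.0, 1], [60.0, 3]])
y_train = np.array([10, 0, -5, 5])  # target nice values, chosen for illustration

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

def adjust_process_priority(pid: int, cpu_usage: float, priority: int) -> int:
    """Predict a nice value for a sensor-handling process and apply it."""
    nice = int(round(model.predict([[cpu_usage, priority]])[0]))
    nice = max(-20, min(19, nice))               # clamp to the valid Linux range
    os.setpriority(os.PRIO_PROCESS, pid, nice)   # lowering nice may need privileges
    return nice
```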

FakedBits- Detecting Fake Information on Social Platforms using Multi-Modal Features

  • Dilip Kumar, Sharma;Bhuvanesh, Singh;Saurabh, Agarwal;Hyunsung, Kim;Raj, Sharma
    • KSII Transactions on Internet and Information Systems (TIIS), v.17 no.1, pp.51-73, 2023
  • Social media play a significant role in communicating information across the globe: connecting with loved ones, getting the news, sharing ideas, etc. However, some people use social media to spread fake information, which has a negative impact on society. Therefore, minimizing fake news and detecting it are the two primary challenges that need to be addressed. This paper presents a multi-modal deep learning technique to address these challenges. The proposed model can use and process visual and textual features, so it is able to detect fake information from both visual and textual data. We used EfficientNetB0 and a sentence transformer, respectively, for detecting counterfeit images and for textual learning. Feature embedding is performed in individual channels, while fusion is done at the last classification layer. Late fusion is applied intentionally to mitigate the noisy data generated by the multiple modalities. Extensive experiments are conducted, and performance is evaluated against state-of-the-art methods on three real-world benchmark datasets: MediaEval (Twitter), Weibo, and Fakeddit. The results reveal that the proposed model outperformed the state-of-the-art methods, achieving accuracies of 86.48%, 82.50%, and 88.80% on the MediaEval (Twitter), Weibo, and Fakeddit datasets, respectively.
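
As a rough illustration of the late-fusion design the abstract describes (per-channel embedding, fusion only at the final classification layer), here is a minimal PyTorch sketch. The feature dimensions (1280 for EfficientNetB0 pooled features, 384 for a MiniLM-style sentence embedding) and layer sizes are assumptions, not values from the paper.

```python
# Minimal late-fusion sketch (not the authors' code): image features from an
# EfficientNetB0 backbone and text features from a sentence transformer are
# embedded per channel and fused only at the final classification layer.
import torch
import torch.nn as nn

class LateFusionFakeNewsClassifier(nn.Module):
    def __init__(self, img_dim=1280, txt_dim=384, hidden=256, n_classes=2):
        super().__init__()
        # per-channel embedding, as described in the abstract
        self.img_branch = nn.Sequential(nn.Linear(img_dim, hidden), nn.ReLU())
        self.txt_branch = nn.Sequential(nn.Linear(txt_dim, hidden), nn.ReLU())
        # fusion happens only at the last classification layer (late fusion)
        self.classifier = nn.Linear(2 * hidden, n_classes)

    def forward(self, img_feat, txt_feat):
        z = torch.cat([self.img_branch(img_feat), self.txt_branch(txt_feat)], dim=-1)
        return self.classifier(z)

# Usage with precomputed backbone features (dimensions are assumptions:
# 1280 = EfficientNetB0 pooled output, 384 = MiniLM sentence embedding).
model = LateFusionFakeNewsClassifier()
logits = model(torch.randn(8, 1280), torch.randn(8, 384))
```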

Multi-Modal Wearable Sensor Integration for Daily Activity Pattern Analysis with Gated Multi-Modal Neural Networks (Gated Multi-Modal Neural Networks를 이용한 다중 웨어러블 센서 결합 방법 및 일상 행동 패턴 분석)

  • On, Kyoung-Woon;Kim, Eun-Sol;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices, v.23 no.2, pp.104-109, 2017
  • We propose a new machine learning algorithm that analyzes users' daily activity patterns from multi-modal wearable sensor data. The proposed model learns and extracts activity patterns from wearable device input in real time. Inspired by human cue integration, we constructed gated multi-modal neural networks that integrate wearable sensor inputs selectively using gate modules. For the experiments, sensory data were collected with multiple wearable devices in restaurant situations. As an experimental result, we first show that the proposed model performs well in terms of prediction accuracy. We then explain the possibility of constructing a knowledge schema automatically by analyzing the activation patterns in the middle layer of the proposed model.
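
A gate module of this kind could look like the following PyTorch sketch, which is an interpretation of the abstract rather than the authors' implementation: a learned gate, conditioned on all sensor inputs, weights each sensor's embedding before the streams are combined.

```python
# Sketch of a gate module for selective multi-modal integration (an
# interpretation of the abstract, not the authors' implementation).
import torch
import torch.nn as nn

class GatedMultiModalUnit(nn.Module):
    def __init__(self, dims, hidden=64):
        super().__init__()
        self.encoders = nn.ModuleList(nn.Linear(d, hidden) for d in dims)
        # one gate weight per modality, conditioned on all modalities
        self.gate = nn.Linear(sum(dims), len(dims))

    def forward(self, inputs):  # inputs: list of per-sensor feature tensors
        weights = torch.softmax(self.gate(torch.cat(inputs, dim=-1)), dim=-1)
        encoded = [enc(x) for enc, x in zip(self.encoders, inputs)]
        # weighted sum: the gate decides how much each sensor contributes
        return sum(w.unsqueeze(-1) * h
                   for w, h in zip(weights.unbind(dim=-1), encoded))

# e.g., three wearable sensors with feature sizes 12, 8, and 20 (assumed)
gmu = GatedMultiModalUnit([12, 8, 20])
fused = gmu([torch.randn(4, 12), torch.randn(4, 8), torch.randn(4, 20)])
```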

Deep Learning based Emotion Classification using Multi Modal Bio-signals (다중 모달 생체신호를 이용한 딥러닝 기반 감정 분류)

  • Lee, JeeEun;Yoo, Sun Kook
    • Journal of Korea Multimedia Society, v.23 no.2, pp.146-154, 2020
  • Negative emotions cause stress and a lack of concentration, so classifying negative emotion is important for recognizing risk factors. To classify emotional status, methods such as questionnaires and interviews are commonly used, but their results can vary with subjective judgment. To address this problem, we acquire multi-modal bio-signals such as electrocardiogram (ECG), skin temperature (ST), and galvanic skin response (GSR), and extract features from them. A neural network (NN), a deep neural network (DNN), and a deep belief network (DBN) are designed on the multi-modal bio-signals to analyze emotional status. As a result, the DBN based on features extracted from ECG, ST, and GSR shows the highest accuracy (93.8%): 5.7% higher than the NN, 1.4% higher than the DNN, and 12.2% higher than using only a single bio-signal (GSR). Multi-modal bio-signal acquisition and the deep learning classifier both play an important role in classifying emotion.
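
As a minimal sketch of the multi-modal pipeline (the feature sizes and layer widths below are assumptions, and the paper's DBN is not reproduced), features extracted from ECG, ST, and GSR can be concatenated and fed to a small feed-forward classifier:

```python
# Minimal sketch (assumed architecture, not the paper's exact network): features
# extracted from ECG, ST, and GSR are concatenated and fed to a small deep
# feed-forward classifier for emotion status.
import torch
import torch.nn as nn

ECG_DIM, ST_DIM, GSR_DIM = 16, 4, 8   # per-signal feature sizes (assumptions)

classifier = nn.Sequential(
    nn.Linear(ECG_DIM + ST_DIM + GSR_DIM, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 2),                  # e.g., negative vs. non-negative emotion
)

features = torch.cat([torch.randn(1, ECG_DIM),   # ECG features
                      torch.randn(1, ST_DIM),    # skin temperature features
                      torch.randn(1, GSR_DIM)],  # galvanic skin response features
                     dim=-1)
logits = classifier(features)
```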

A Study on the Intention to Use a Robot-based Learning System with Multi-Modal Interaction (멀티모달 상호작용 중심의 로봇기반교육 콘텐츠를 활용한 r-러닝 시스템 사용의도 분석)

  • Oh, Junseok;Cho, Hye-Kyung
    • Journal of Institute of Control, Robotics and Systems, v.20 no.6, pp.619-624, 2014
  • This paper introduces a robot-based learning system designed to teach multiplication to children. In addition to a small humanoid and a smart device delivering educational content, we employ a type of mixed-initiative operation that provides enhanced multi-modal cognition to the r-learning system through human intervention. To investigate the major factors that influence people's intention to use the r-learning system, and to see how multi-modality affects these connections, we performed a user study based on the Technology Acceptance Model (TAM). The results support the view that system quality and natural interaction are key factors in the r-learning system's adoption, and they also reveal interesting implications related to human behavior.

Multi-modal Emotion Recognition using Semi-supervised Learning and Multiple Neural Networks in the Wild (준 지도학습과 여러 개의 딥 뉴럴 네트워크를 사용한 멀티 모달 기반 감정 인식 알고리즘)

  • Kim, Dae Ha;Song, Byung Cheol
    • Journal of Broadcast Engineering, v.23 no.3, pp.351-360, 2018
  • Human emotion recognition is a research topic receiving continuous attention in the computer vision and artificial intelligence domains. This paper proposes a method for classifying human emotions through multiple neural networks based on multi-modal signals consisting of image, landmark, and audio data in a wild environment. The proposed method has the following features. First, the learning performance of the image-based network is greatly improved by employing both multi-task learning and semi-supervised learning that exploit the spatio-temporal characteristics of videos. Second, a model for converting one-dimensional (1D) facial landmark information into two-dimensional (2D) images is newly proposed, along with a CNN-LSTM network based on that model for better emotion recognition. Third, based on the observation that audio signals are often very effective for specific emotions, we propose an audio deep learning mechanism robust to those specific emotions. Finally, so-called emotion-adaptive fusion is applied to enable synergy among the multiple networks. The proposed network improves emotion classification performance by appropriately integrating existing supervised and semi-supervised learning networks. On the fifth attempt on the given test set of the EmotiW2017 challenge, the proposed method achieved a classification accuracy of 57.12%.
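
The landmark-to-image conversion can be illustrated with a short sketch; the canvas size and rasterization scheme below are assumptions, since the abstract only states that 1D landmark coordinates are converted into 2D images for a CNN-LSTM to consume.

```python
# Illustrative sketch of the 1D-landmark-to-2D-image idea (details such as the
# canvas size are assumptions): facial landmark coordinates are rasterized into
# an image so a CNN-LSTM can consume them like video frames.
import numpy as np

def landmarks_to_image(landmarks: np.ndarray, size: int = 64) -> np.ndarray:
    """landmarks: (N, 2) array of (x, y) in [0, 1]; returns a (size, size) image."""
    img = np.zeros((size, size), dtype=np.float32)
    pts = np.clip((landmarks * (size - 1)).astype(int), 0, size - 1)
    img[pts[:, 1], pts[:, 0]] = 1.0   # mark each landmark as a bright pixel
    return img

frame = landmarks_to_image(np.random.rand(68, 2))  # 68-point face landmarks
```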

Classification of Breast Cancer using Explainable A.I. and Deep learning (딥러닝과 설명 가능한 인공지능을 이용한 유방암 판별)

  • Ha, Soo-Hee;Yoo, Jae-Chern
    • Proceedings of the Korean Society of Computer Information Conference, 2022.07a, pp.99-100, 2022
  • In this paper, we propose an artificial intelligence that classifies breast cancer using a multi-modal architecture trained on breast ultrasound images. The trained AI classifies breast cancer and, at the same time, indicates the location of the tumor by combining an explainable-AI technique with a region of interest (ROI). Because it visually presents the grounds for its decision, confidence in the AI's judgment is increased.
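
The abstract does not name the specific explainable-AI method, so the following sketch assumes Grad-CAM, a common choice for localizing the image region behind a CNN's decision; the backbone, input, and class index are placeholders, not the paper's setup.

```python
# Sketch of one possible explainability step (Grad-CAM is an assumption; the
# abstract does not name the XAI method): a heatmap over an ultrasound image
# highlights the region driving the malignant/benign decision.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18  # stand-in backbone, not the paper's model

model = resnet18(num_classes=2).eval()
feats, grads = {}, {}
layer = model.layer4
layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

x = torch.randn(1, 3, 224, 224)   # placeholder for a preprocessed ultrasound image
score = model(x)[0, 1]            # score for the "malignant" class (index assumed)
score.backward()

w = grads["a"].mean(dim=(2, 3), keepdim=True)        # channel importance weights
cam = F.relu((w * feats["a"]).sum(dim=1))            # weighted feature-map sum
cam = F.interpolate(cam[None], size=(224, 224), mode="bilinear")[0, 0]
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalized heatmap
```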

Speech and Textual Data Fusion for Emotion Detection: A Multimodal Deep Learning Approach (감정 인지를 위한 음성 및 텍스트 데이터 퓨전: 다중 모달 딥 러닝 접근법)

  • Edward Dwijayanto Cahyadi;Mi-Hwa Song
    • Proceedings of the Korea Information Processing Society Conference, 2023.11a, pp.526-527, 2023
  • Speech emotion recognition (SER) is one of the interesting topics in the machine learning field, and building a multi-modal SER system brings numerous benefits. This paper explains how BERT, as the text recognizer, is fused with a CNN, as the speech recognizer, to build a multi-modal SER system.
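
A minimal fusion sketch follows; everything beyond "BERT for text, CNN for speech" (the spectrogram shape, embedding sizes, and number of emotion classes) is an assumption, not a detail from this short paper.

```python
# Minimal BERT + CNN fusion sketch (architecture details are assumptions):
# BERT's [CLS] text embedding is concatenated with a CNN embedding of the
# audio spectrogram before the emotion classifier.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

speech_cnn = nn.Sequential(                     # toy CNN over a mel spectrogram
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
classifier = nn.Linear(768 + 16, 4)             # e.g., 4 emotion classes (assumed)

tokens = tokenizer("I am so happy today", return_tensors="pt")
text_emb = bert(**tokens).last_hidden_state[:, 0]      # [CLS] embedding, (1, 768)
audio_emb = speech_cnn(torch.randn(1, 1, 64, 128))     # placeholder spectrogram
logits = classifier(torch.cat([text_emb, audio_emb], dim=-1))
```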

3D Cross-Modal Retrieval Using Noisy Center Loss and SimSiam for Small Batch Training

  • Yeon-Seung Choo;Boeun Kim;Hyun-Sik Kim;Yong-Suk Park
    • KSII Transactions on Internet and Information Systems (TIIS), v.18 no.3, pp.670-684, 2024
  • 3D Cross-Modal Retrieval (3DCMR) is a task that retrieves 3D objects regardless of modality, such as images, meshes, and point clouds. One of the most prominent methods for 3DCMR is the Cross-Modal Center Loss Function (CLF), which applies the conventional center loss strategy to 3D cross-modal search and retrieval. Because CLF is based on center loss, its center features are susceptible to subtle changes in hyperparameters and external interference; for instance, performance degradation is observed when the batch size is too small. Furthermore, the Mean Squared Error (MSE) used in CLF cannot adapt to changes in batch size and, because it relies on a simple Euclidean distance between multi-modal features, is vulnerable to data variations that occur during actual inference. To address the problems that arise from small-batch training, we propose a Noisy Center Loss (NCL) method to estimate the optimal center features. In addition, we apply the simple Siamese representation learning method (SimSiam) during optimal center feature estimation to compare projected features, making the proposed method robust to changes in batch size and variations in data. As a result, the proposed approach demonstrates improved performance on the ModelNet40 dataset compared to conventional methods.
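
One plausible reading of a "noisy" center loss is sketched below; the exact noise model is an assumption drawn from the abstract, not the paper's formulation. The idea is that perturbing the class centers during training makes the center estimates less sensitive to small batches.

```python
# Interpretation sketch of a noisy center loss (the Gaussian noise model is an
# assumption): noise perturbs the learned class centers so center estimation is
# less sensitive to small-batch training.
import torch
import torch.nn as nn

class NoisyCenterLoss(nn.Module):
    def __init__(self, num_classes, feat_dim, noise_std=0.1):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.noise_std = noise_std

    def forward(self, features, labels):
        centers = self.centers[labels]
        if self.training:  # perturb centers during training only
            centers = centers + self.noise_std * torch.randn_like(centers)
        return ((features - centers) ** 2).sum(dim=1).mean()

# Usage: pull image/mesh/point-cloud embeddings of the same class to one center.
loss_fn = NoisyCenterLoss(num_classes=40, feat_dim=256)  # 40 = ModelNet40 classes
loss = loss_fn(torch.randn(8, 256), torch.randint(0, 40, (8,)))
```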

Multi Modal Sensor Training Dataset for the Robust Object Detection and Tracking in Outdoor Surveillance (MMO (Multi Modal Outdoor) Dataset) (실외 경비 환경에서 강인한 객체 검출 및 추적을 위한 실외 멀티 모달 센서 기반 학습용 데이터베이스 구축)

  • Noh, DongKi;Yang, Wonkeun;Uhm, Teayoung;Lee, Jaekwang;Kim, Hyoung-Rock;Baek, SeungMin
    • Journal of Korea Multimedia Society, v.23 no.8, pp.1006-1018, 2020
  • Datasets are becoming ever more important for developing learning-based algorithms, since the quality of an algorithm depends heavily on its dataset. We therefore introduce a new dataset of over 200,000 fully labeled multi-modal sensor images. The proposed dataset was designed and constructed for researchers who want to develop detection, tracking, and action classification in outdoor surveillance scenarios. It includes a variety of images and multi-modal sensor data under different weather and lighting conditions, so we hope it will be very helpful for developing more robust algorithms for systems equipped with different kinds of sensors in outdoor applications. Case studies with the proposed dataset are also discussed in this paper.