• Title/Summary/Keyword: 멀티모달정보 (multimodal information)

Search Result 187, Processing Time 0.023 seconds

Driver Drowsiness Detection Model using Image and PPG data Based on Multimodal Deep Learning (이미지와 PPG 데이터를 사용한 멀티모달 딥 러닝 기반의 운전자 졸음 감지 모델)

  • Choi, Hyung-Tak;Back, Moon-Ki;Kang, Jae-Sik;Yoon, Seung-Won;Lee, Kyu-Chul
    • Database Research / v.34 no.3 / pp.45-57 / 2018
  • Drowsiness while driving is a very dangerous driver condition that can lead directly to a major accident. Traditional drowsiness detection methods exist for assessing the driver's condition, but they are limited in providing generalized recognition that accounts for the individual characteristics of drivers. In recent years, deep learning based state recognition studies have been proposed for recognizing the driver's condition. Deep learning has the advantage that features are extracted by the machine rather than by humans, which yields a more generalized recognition model. In this study, we propose a state recognition model that is more accurate than existing deep learning methods by learning from image and PPG data simultaneously. This paper examines the effect of driver image and PPG data on drowsiness detection and tests experimentally whether using them together improves the performance of the learned model. We confirmed an accuracy improvement of around 3% when using image and PPG data together compared with using images alone. In addition, the multimodal deep learning based model, which classifies the driver's condition into three categories, showed a classification accuracy of 96%.

Query processing for multi-modal sensor network (멀티 모달 센서 네트워크를 위한 질의 처리)

  • 이미정;정유나;황인준
    • Proceedings of the Korean Information Science Society Conference / 2004.10b / pp.64-66 / 2004
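  • Recently, advances in communication technology and sensor devices have spurred active research on sensor networks. In particular, techniques for collecting and processing data through sensor nodes have emerged as an important issue. Existing papers, however, assume that information is collected from only one type of node, whereas in practice information may need to be collected from several types of sensor nodes. This paper therefore proposes a sensor network architecture that accounts for multiple types of sensor nodes. It also presents various query processing methods that operate on the proposed architecture.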

Wearable Multi-modal Remote Control (착용형 멀티모달 제어 리모콘)

  • Lee, Dong-Woo;SunWoo, John;Cho, Il-Yeon;Lee, Cheol-Hoon
    • Proceedings of the Korea Information Processing Society Conference / 2008.05a / pp.169-170 / 2008
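  • Remote controls are commonly used to operate home appliances, but the proliferation of remotes around the house is inconvenient. This paper introduces a wearable system that serves as a single remote control, like a URC, and proposes a method for controlling multiple home appliances in a uniform way using various modalities such as voice and gesture.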

Activity Recognition based on Multi-modal Sensors using Dynamic Bayesian Networks (동적 베이지안 네트워크를 이용한 멀티모달센서기반 사용자 행동인식)

  • Yang, Sung-Ihk;Hong, Jin-Hyuk;Cho, Sung-Bae
    • Journal of KIISE:Computing Practices and Letters / v.15 no.1 / pp.72-76 / 2009
  • Recently, as interest in ubiquitous computing has grown, there has been much research on recognizing human activities in order to provide services in such environments. In mobile environments in particular, much of this research is sensor based, in contrast to conventional vision based recognition. In this paper, we propose recognizing the user's activity with multi-modal sensors using hierarchical dynamic Bayesian networks. The dynamic Bayesian networks are trained with the OVR (One-Versus-Rest) strategy. The inference stage reduces computation cost by selecting the activity with the higher probability in the output of a simpler Bayesian network. In the experiment, we used an accelerometer and a physiological sensor to recognize eight kinds of activities and obtained an accuracy of 97.4%.
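
The two-stage inference described in this abstract can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function `recognize`, the `top_k` cutoff, the activity names, and the stand-in scorers are all hypothetical, and plain Python callables take the place of the Bayesian networks. The idea shown is that a cheap model ranks the activities first, and only the one-versus-rest models for the top-ranked candidates are evaluated.

```python
from typing import Callable, Dict, List, Sequence

Features = Sequence[float]

def recognize(
    features: Features,
    coarse_model: Callable[[Features], Dict[str, float]],
    ovr_models: Dict[str, Callable[[Features], float]],
    top_k: int = 2,
) -> str:
    """Pick an activity while evaluating only top_k one-versus-rest models."""
    # Stage 1: the cheap model scores every activity.
    coarse = coarse_model(features)
    candidates = sorted(coarse, key=coarse.get, reverse=True)[:top_k]
    # Stage 2: run the expensive per-activity models for the candidates only.
    scores = {name: ovr_models[name](features) for name in candidates}
    return max(scores, key=scores.get)

# Stand-in scorers that record which models were actually evaluated.
calls: List[str] = []

def make_scorer(name: str, weight: float) -> Callable[[Features], float]:
    def score(x: Features) -> float:
        calls.append(name)
        return weight * x[0]
    return score

ovr_models = {
    "walk": make_scorer("walk", 0.9),
    "run": make_scorer("run", 0.5),
    "sit": make_scorer("sit", 0.1),
}

def coarse_model(x: Features) -> Dict[str, float]:
    # Hypothetical fixed ranking standing in for the simpler network.
    return {"walk": 0.5, "run": 0.4, "sit": 0.1}

result = recognize([1.0], coarse_model, ovr_models)
print(result, calls)
```

In this toy run the model for "sit" is never evaluated, which is where the saving in computation would come from.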

Deep Learning Music genre automatic classification voting system using Softmax (소프트맥스를 이용한 딥러닝 음악장르 자동구분 투표 시스템)

  • Bae, June;Kim, Jangyoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.1 / pp.27-32 / 2019
  • Research that uses deep learning algorithms to implement classification, one of the outstanding human abilities, includes unimodal models, multi-modal models, and multi-modal methods using music videos. In this study, results improved with a proposed system that splits each song's spectrum into short samples, analyzes them, and votes on the results. Among deep learning algorithms, CNN outperformed RNN in music genre classification, and performance improved further when CNN and RNN were applied together. The system that applies deep learning to short music samples and votes over the individual CNN results outperformed the previous model, and the model with an added Softmax layer performed best. With the explosive growth of digital media and the number of streaming services, the need for automatic music genre classification is increasing. Future research will need to reduce the proportion of unclassified songs and develop algorithms for the final classification of those songs.
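
Voting over short samples can be illustrated with a small sketch. The abstract does not say how the votes are combined, so both variants below are assumptions: `hard_vote` counts each sample's top genre, while `soft_vote` averages softmax probabilities across samples. The function names and the logits are hypothetical.

```python
import math
from collections import Counter
from typing import List

def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one sample's class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_vote(sample_logits: List[List[float]]) -> int:
    """Each sample votes for its highest-scoring genre; the most common vote wins."""
    votes = [max(range(len(row)), key=row.__getitem__) for row in sample_logits]
    return Counter(votes).most_common(1)[0][0]

def soft_vote(sample_logits: List[List[float]]) -> int:
    """Average the per-sample softmax probabilities, then take the best genre."""
    probs = [softmax(row) for row in sample_logits]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)

# Hypothetical scores for three short samples of one song over three genres.
samples = [
    [2.0, 1.9, 0.1],  # narrowly prefers genre 0
    [0.2, 3.0, 0.1],  # strongly prefers genre 1
    [2.1, 2.0, 0.0],  # narrowly prefers genre 0
]
print(hard_vote(samples), soft_vote(samples))
```

On these scores the two rules disagree: two samples narrowly favor genre 0, but the one confident sample tips the averaged probabilities toward genre 1.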

AI Multimodal Sensor-based Pedestrian Image Recognition Algorithm (AI 멀티모달 센서 기반 보행자 영상인식 알고리즘)

  • Seong-Yoon Shin;Seung-Pyo Cho;Gwanghung Jo
    • Proceedings of the Korean Society of Computer Information Conference / 2023.01a / pp.407-408 / 2023
  • This paper aims to develop a multimodal algorithm that achieves recognition performance of over 95% in daytime illumination environments and over 90% in bad weather (rain and snow) and nighttime illumination environments.

A Study on Method for User Gender Prediction Using Multi-Modal Smart Device Log Data (스마트 기기의 멀티 모달 로그 데이터를 이용한 사용자 성별 예측 기법 연구)

  • Kim, Yoonjung;Choi, Yerim;Kim, Solee;Park, Kyuyon;Park, Jonghun
    • The Journal of Society for e-Business Studies / v.21 no.1 / pp.147-163 / 2016
  • Gender information of a smart device user is essential for providing personalized services, and multi-modal data obtained from the device is useful for predicting the user's gender. However, the way each kind of multi-modal data should be used for gender prediction differs according to the characteristics of the data. Therefore, this study proposes an ensemble method that predicts the gender of a smart device user with three classifiers that take text, application, and acceleration data as inputs, respectively. To alleviate the privacy issues that arise when text data generated on a smart device are sent outside, a classification method is used that scans the text data only on the device and classifies the user's gender by matching the text against predefined sets of words. An application based classifier assigns gender labels to executed applications and predicts the user's gender by comparing the label ratio. Acceleration data are used with a Support Vector Machine to classify user gender. The proposed method was evaluated on actual smart device log data collected from an Android application. The experimental results showed that the proposed method outperformed the compared methods.
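
The application based classifier and the ensemble step might look something like the sketch below. It is illustrative only: the label table `APP_LABELS`, the app names, the abstain-on-tie behavior, and the use of simple majority voting to combine the three classifiers are assumptions, since the abstract does not specify them.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional

# Hypothetical table assigning a gender label to each known application.
APP_LABELS: Dict[str, str] = {"app_a": "F", "app_b": "M", "app_c": "M"}

def predict_from_apps(
    executed_apps: Iterable[str],
    app_labels: Dict[str, str] = APP_LABELS,
) -> Optional[str]:
    """Compare how many executed apps carry each label; abstain on a tie."""
    counts = Counter(app_labels[a] for a in executed_apps if a in app_labels)
    if counts["F"] == counts["M"]:
        return None
    return "F" if counts["F"] > counts["M"] else "M"

def majority_ensemble(predictions: List[Optional[str]]) -> Optional[str]:
    """Combine classifier outputs by majority vote, ignoring abstentions."""
    votes = Counter(p for p in predictions if p is not None)
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no clear majority
    return ranked[0][0]

app_vote = predict_from_apps(["app_a", "app_b", "app_c", "unknown_app"])
# The text and acceleration outputs here are placeholder values.
final = majority_ensemble(["F", app_vote, "M"])
print(app_vote, final)
```

Letting a classifier abstain keeps a tied or uninformative modality from forcing a guess into the vote.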

Multimodal Sentiment Analysis Using Review Data and Product Information (리뷰 데이터와 제품 정보를 이용한 멀티모달 감성분석)

  • Hwang, Hohyun;Lee, Kyeongchan;Yu, Jinyi;Lee, Younghoon
    • The Journal of Society for e-Business Studies / v.27 no.1 / pp.15-28 / 2022
  • With the recent expansion of online markets such as clothing, utilizing customer reviews has become a major marketing measure. User reviews have been used as a tool for analyzing customer sentiment. Sentiment analysis can be broadly divided into machine learning-based and lexicon-based methods. The machine learning-based approach trains a classification model on reviews and their labels. As sentiment analysis research has advanced, multi-modal models trained on image and video data in reviews have been studied. The characteristics of words in reviews differ depending on the product and customer categories. In this paper, sentiment is analyzed by considering review data together with product and user metadata. Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Self Attention-based Multi-head Attention, and Bidirectional Encoder Representations from Transformers (BERT) models are used in this study. The same Multi-Layer Perceptron (MLP) model is used for all product information. This paper proposes a multi-modal sentiment analysis model that simultaneously considers user reviews and product meta-information.

A Model to Automatically Generate Non-verbal Expression Information for Korean Utterance Sentence (한국어 발화 문장에 대한 비언어 표현 정보를 자동으로 생성하는 모델)

  • Jaeyoon Kim;Jinyea Jang;San Kim;Minyoung Jung;Hyunwook Kang;Saim Shin
    • Annual Conference on Human and Language Technology / 2023.10a / pp.91-94 / 2023
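  • To develop artificial intelligence agents capable of natural interaction, nonverbal as well as verbal expression must be considered. This paper introduces a study on generating motion, a form of nonverbal expression, from Korean utterances. We built a dataset from YouTube videos and ran experiments with a model implemented using T2M-GPT, an existing Text to Motion model, and the language encoder of VL-KE-T5, which was jointly trained on heterogeneous-modality data. In the experiments, the motion generated for Korean utterance text achieved an FID score of 0.11, demonstrating the feasibility of generating nonverbal expression information from Korean utterances.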
