• Title/Summary/Keyword: Multimodal Information

The Effect of AI Agent's Multi Modal Interaction on the Driver Experience in the Semi-autonomous Driving Context : With a Focus on the Existence of Visual Character (반자율주행 맥락에서 AI 에이전트의 멀티모달 인터랙션이 운전자 경험에 미치는 효과 : 시각적 캐릭터 유무를 중심으로)

  • Suh, Min-soo; Hong, Seung-Hye; Lee, Jeong-Myeong
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.92-101 / 2018
  • As interactive AI speakers become popular, voice recognition is regarded as an important vehicle-driver interaction method in autonomous driving situations. The purpose of this study is to confirm whether multimodal interaction, in which feedback is delivered both aurally and through a visual AI character on screen, optimizes user experience more effectively than an auditory-only mode. Participants performed music selection and adjustment tasks through the AI speaker while driving, and we measured information and system quality, presence, perceived usefulness and ease of use, and continuance intention. The analysis showed no multimodal effect of the visual character on most user experience factors, nor on continuance intention. Rather, the auditory-only mode proved more effective than the multimodal mode on the information quality factor. In the semi-autonomous driving stage, which demands the driver's cognitive effort, multimodal interaction is thus not effective for optimizing user experience compared with single-mode interaction.

Development of Gas Type Identification Deep-learning Model through Multimodal Method (멀티모달 방식을 통한 가스 종류 인식 딥러닝 모델 개발)

  • Seo Hee Ahn; Gyeong Yeong Kim; Dong Ju Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.12 / pp.525-534 / 2023
  • Gas leak detection systems are key to minimizing loss of life, given the explosiveness and toxicity of gas. Most leak detection systems rely on gas sensors or thermal imaging cameras alone. To improve on such single-modal methods, this paper proposes a multimodal approach that combines gas-sensor data and thermal-camera data in a gas type identification model. MultimodalGasData, an open multimodal dataset, is used to compare four models developed through the multimodal approach against existing models. The 1D CNN and GasNet models show the highest performance, at 96.3% and 96.4%, and an early fusion model combining the 1D CNN and GasNet reaches 99.3%, 3.3% higher than the existing model. We hope that the gas leak detection system proposed in this study can minimize further damage caused by gas leaks.
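
For readers unfamiliar with the fusion idea, the PyTorch sketch below shows the general shape of such a model: a 1D CNN encodes the gas-sensor time series, a small 2D CNN stands in for GasNet on the thermal image, and the two feature vectors are concatenated before classification. All layer sizes, channel counts, and the thermal branch itself are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch of feature-level (early) fusion of gas-sensor and
# thermal-camera inputs. Shapes and layers are assumptions for illustration.
import torch
import torch.nn as nn

class SensorBranch(nn.Module):
    """1D CNN over multichannel gas-sensor time series."""
    def __init__(self, in_channels=7, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, feat_dim))
    def forward(self, x):          # x: (batch, channels, time)
        return self.net(x)

class ThermalBranch(nn.Module):
    """Small 2D CNN over thermal images (a stand-in for GasNet)."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, feat_dim))
    def forward(self, x):          # x: (batch, 1, H, W)
        return self.net(x)

class EarlyFusionClassifier(nn.Module):
    """Concatenate the branch features, then classify the gas type."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.sensor, self.thermal = SensorBranch(), ThermalBranch()
        self.head = nn.Linear(64 + 64, n_classes)
    def forward(self, sensor_x, thermal_x):
        fused = torch.cat([self.sensor(sensor_x), self.thermal(thermal_x)], dim=1)
        return self.head(fused)

model = EarlyFusionClassifier()
logits = model(torch.randn(8, 7, 100), torch.randn(8, 1, 64, 64))  # dummy batch
print(logits.shape)  # torch.Size([8, 4])
```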

Design and Implementation of a Character Agent based Multimodal Presentation Authoring Tool (캐릭터 에이전트 기반 멀티모달 프리젠테이션 저작도구 설계 및 구현)

  • 정성태; 정석태
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.5 / pp.941-948 / 2003
  • The character-agent-based Multimodal Presentation Markup Language (MPML) was developed to increase the efficiency of computer-based presentations. However, authoring a presentation with MPML is not simple, because MPML describes only the behavior of the character agent and specifies the presentation background by importing HTML documents. This paper proposes EMPML (Extended MPML), which describes both the behavior of the character agent and the presentation background, and presents the design and implementation of an authoring tool for EMPML. By integrating the editing of the presentation background and the character agent's behavior, the proposed tool supports WYSIWYG (What You See Is What You Get) design. Using it, users can create a multimodal presentation without knowing the details of EMPML.

Emotion Recognition Method Based on Multimodal Sensor Fusion Algorithm

  • Moon, Byung-Hyun; Sim, Kwee-Bo
    • International Journal of Fuzzy Logic and Intelligent Systems / v.8 no.2 / pp.105-110 / 2008
  • Humans recognize emotion by fusing information from another person's speech, facial expression, gesture, and bio-signals, and computers need technologies that do the same with combined information. In this paper, we recognize five emotions (normal, happiness, anger, surprise, sadness) from speech signals and facial images, and propose a multimodal method that fuses the two recognition results into a single emotion decision. Emotion recognition on both the speech signal and the facial image uses Principal Component Analysis (PCA), and the multimodal fusion combines the two results with a fuzzy membership function. In our experiments, the average emotion recognition rate was 63% using speech signals and 53.4% using facial images; that is, the speech signal gives a better recognition rate than the facial image. To raise the recognition rate further, we propose a decision-fusion method using an S-type membership function. With the proposed method, the average recognition rate is 70.4%, showing that decision fusion outperforms either the facial image or the speech signal alone.
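
The fuzzy decision fusion described here can be sketched as follows: each modality produces per-class confidence scores, each score passes through a standard S-shaped membership function, and the weighted memberships are combined before taking the argmax. The membership parameters and the modality weights below are assumptions for illustration, not the paper's exact formulation.

```python
# NumPy sketch of decision-level fusion with an S-type fuzzy membership.
import numpy as np

EMOTIONS = ["normal", "happiness", "anger", "surprise", "sadness"]

def s_membership(x, a=0.2, b=0.8):
    """Standard S-shaped fuzzy membership: 0 below a, 1 above b,
    with a smooth quadratic transition in between."""
    m = (a + b) / 2.0
    y = np.zeros_like(x)
    rise = (x >= a) & (x < m)
    fall = (x >= m) & (x < b)
    y[rise] = 2.0 * ((x[rise] - a) / (b - a)) ** 2
    y[fall] = 1.0 - 2.0 * ((x[fall] - b) / (b - a)) ** 2
    y[x >= b] = 1.0
    return y

def fuse_decisions(speech_scores, face_scores):
    """Map each modality's class scores through the S-membership and
    combine them; the stronger modality (speech) gets more weight here."""
    fused = (0.6 * s_membership(speech_scores)   # weights are an assumption
             + 0.4 * s_membership(face_scores))
    return EMOTIONS[int(np.argmax(fused))]

speech = np.array([0.10, 0.65, 0.10, 0.05, 0.10])  # per-class confidences
face   = np.array([0.20, 0.30, 0.35, 0.05, 0.10])
print(fuse_decisions(speech, face))  # -> "happiness"
```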

Combining Feature Fusion and Decision Fusion in Multimodal Biometric Authentication (다중 바이오 인증에서 특징 융합과 결정 융합의 결합)

  • Lee, Kyung-Hee
    • Journal of the Korea Institute of Information Security & Cryptology / v.20 no.5 / pp.133-138 / 2010
  • We present a new multimodal biometric authentication method that performs both feature-level and decision-level fusion. After training a support vector machine on new feature vectors formed by integrating face and voice features, the final authentication decision is made by integrating the decisions of the face SVM classifier, the voice SVM classifier, and the integrated-features SVM classifier. We justify the proposal by comparing our method with traditional ones in experiments on the XM2VTS multimodal database; the experiments show that our multilevel fusion algorithm gives a higher recognition rate than existing schemes.
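
A minimal scikit-learn sketch of the multilevel idea follows: one SVM per modality, one SVM on the concatenated (feature-fused) vectors, and a majority vote over the three decisions. The synthetic data and the voting rule are assumptions; the paper's exact decision-integration rule may differ.

```python
# Combining feature-level fusion (concatenated vectors) with decision-level
# fusion (majority vote over three SVM classifiers). Illustrative sketch.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
face = rng.normal(size=(n, 20))                  # stand-in face features
voice = rng.normal(size=(n, 12))                 # stand-in voice features
y = (face[:, 0] + voice[:, 0] > 0).astype(int)   # synthetic accept/reject label

face_svm = SVC().fit(face, y)
voice_svm = SVC().fit(voice, y)
fused_svm = SVC().fit(np.hstack([face, voice]), y)   # feature-level fusion

def authenticate(f, v):
    """Decision-level fusion: majority vote of the three classifiers."""
    votes = (face_svm.predict(f.reshape(1, -1))[0]
             + voice_svm.predict(v.reshape(1, -1))[0]
             + fused_svm.predict(np.hstack([f, v]).reshape(1, -1))[0])
    return votes >= 2   # accept if at least two classifiers accept

print(authenticate(face[0], voice[0]))
```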

A Virtual Reality System for the Cognitive and Behavioral Assessment of Schizophrenia (정신분열병 환자의 인지적/행동적 특성평가를 위한 가상현실시스템 구현)

  • Cho, Won-Geun; Kim, Ho-Sung; Ku, Jung-Hun; Kim, Jae-Hun; Kim, Byoung-Nyun; Lee, Jang-Han; Kim, Sun I.
    • Proceedings of the Korean Society for Emotion and Sensibility Conference / 2003.05a / pp.94-100 / 2003
  • Patients with schizophrenia exhibit thinking disorders such as delusion or hallucination because they have a deficit in the ability to systematize and integrate information; consequently, they cannot integrate or systematize visual, auditory, and tactile stimuli. In this study, we propose a virtual reality system for assessing the cognitive ability of schizophrenia patients, based on the brain's multimodal integration model. The system presents multimodal stimuli, such as visual and auditory stimuli, to the patient, and evaluates the patient's multimodal integration and working-memory integration abilities by having the patient interpret and react to multimodal stimuli that must be remembered for a given period of time. A clinical study showed that the results of the virtual reality program are comparable to those of the WCST and the SPM.

Multimodal layer surveillance map based on anomaly detection using multi-agents for smart city security

  • Shin, Hochul; Na, Ki-In; Chang, Jiho; Uhm, Taeyoung
    • ETRI Journal / v.44 no.2 / pp.183-193 / 2022
  • Smart cities are expected to provide residents with convenience via various agents, such as CCTV, delivery robots, security robots, and unmanned shuttles. Environmental data collected by these agents can be used for various purposes, including advertising and security monitoring. This study proposes a surveillance-map data framework for efficient, integrated representation of multimodal data from multiple agents. The proposed surveillance map is a multilayered global information grid integrated from the multimodal data of each agent. To validate it, we collected surveillance-map data for four months and analyzed the behavior patterns of humans and vehicles and the distribution changes of elevation and temperature. Moreover, we present a two-stage anomaly detection algorithm based on the surveillance map for security services, which detected abnormal situations such as unusual crowds and pedestrians, vehicle movement, unusual objects, and temperature changes. Because the surveillance map enables efficient, integrated processing of large multimodal data from multiple agents, the proposed data framework can be used for various applications in the smart city.
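
As a rough illustration of a two-stage check over such a layered grid, the NumPy sketch below flags cells that deviate strongly from a layer's historical statistics (stage 1), then suppresses isolated flags that lack flagged neighbors (stage 2). The layer names, grid size, and thresholds are assumptions, not the paper's algorithm.

```python
# Two-stage anomaly check over a multilayer surveillance grid (sketch).
import numpy as np

H, W = 100, 100
layers = {                                   # one 2D grid per modality
    "crowd_density": np.random.rand(H, W),
    "temperature":   20 + 5 * np.random.randn(H, W),
}
history = {k: (v.mean(), v.std()) for k, v in layers.items()}  # per-layer stats

def detect_anomalies(layers, history, z_thresh=3.0, min_cluster=4):
    """Stage 1: flag cells deviating strongly from historical statistics.
    Stage 2: keep only flagged cells with enough flagged neighbors,
    suppressing isolated sensor noise."""
    anomalies = {}
    for name, grid in layers.items():
        mu, sigma = history[name]
        flags = np.abs(grid - mu) > z_thresh * sigma           # stage 1
        neighbor_count = sum(np.roll(flags, s, axis=a)         # stage 2
                             for a in (0, 1) for s in (-1, 1)) # (wraps at edges)
        anomalies[name] = flags & (neighbor_count + flags >= min_cluster)
    return anomalies

for name, mask in detect_anomalies(layers, history).items():
    print(name, "anomalous cells:", int(mask.sum()))
```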

Multimodal Biometric Using a Hierarchical Fusion of a Person's Face, Voice, and Online Signature

  • Elmir, Youssef; Elberrichi, Zakaria; Adjoudj, Reda
    • Journal of Information Processing Systems / v.10 no.4 / pp.555-567 / 2014
  • Improving biometric performance is a challenging task. In this paper, a hierarchical fusion strategy for a multimodal biometric system is presented. The strategy combines several biometric traits through a multi-level fusion hierarchy that includes pre-classification fusion with optimal feature selection and post-classification fusion based on the maximum of the matching scores. The proposed solution enhances biometric recognition performance through suitable feature selection and reduction, such as principal component analysis (PCA) and linear discriminant analysis (LDA), since not all feature-vector components contribute to the performance improvement.
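
The two fusion levels can be illustrated with scikit-learn: PCA-reduced modality features concatenated before classification (pre-classification fusion), and per-modality matching scores fused by taking their maximum (post-classification fusion). The feature dimensions, the cosine matching score, and the data below are illustrative assumptions.

```python
# Sketch of pre- and post-classification fusion for a multimodal biometric.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
face, voice, signature = (rng.normal(size=(50, d)) for d in (100, 60, 30))

def reduce(X, n_components=10):
    """Dimensionality reduction before fusion (PCA here; LDA is the
    supervised alternative when class labels are available)."""
    return PCA(n_components=n_components).fit_transform(X)

# Pre-classification fusion: concatenate the reduced feature vectors.
fused_features = np.hstack([reduce(face), reduce(voice), reduce(signature)])

def cosine_score(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Post-classification fusion: fuse per-modality matching scores by maximum.
probe, gallery = 0, 1   # compare sample 0 against sample 1 in each modality
scores = [cosine_score(m[probe], m[gallery]) for m in (face, voice, signature)]
print("fused score (max rule):", max(scores))
```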

Single-Cell Molecular Barcoding to Decode Multimodal Information Defining Cell States

  • Ik Soo Kim
    • Molecules and Cells / v.46 no.2 / pp.74-85 / 2023
  • Single-cell research has provided a breakthrough in biology for understanding heterogeneous cell groups, such as tissues and organs, in development and disease. Molecular barcoding and subsequent sequencing insert a single-cell barcode into isolated single cells, allowing data to be separated cell by cell. Given that multimodal information from a cell defines precise cellular states, recent technical advances focus on simultaneously extracting multimodal data recorded in different biological materials (DNA, RNA, protein, etc.). This review summarizes recently developed single-cell multiomics approaches that pair genome, epigenome, and protein profiles with the transcriptome. In particular, we focus on how to anchor or tag molecules from a cell, improve throughput with sample multiplexing, and record lineages, and we further discuss future developments of the technology.
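
The core barcoding idea can be shown with a toy demultiplexing loop: each read carries a cell barcode that lets downstream analysis group molecules cell by cell. The barcode length and read layout below are assumptions; real pipelines additionally correct barcode errors and collapse UMIs.

```python
# Toy demultiplexing of reads by cell barcode (illustrative sketch).
from collections import defaultdict

BARCODE_LEN = 16   # assumed barcode length; real protocols vary

reads = [
    "AAACCCAAGAAACACT" + "TTGGCA",   # cell barcode + molecule sequence
    "AAACCCAAGAAACACT" + "GGATTC",
    "TTTGGTTTCTTTACGT" + "CCTAAG",
]

cells = defaultdict(list)
for read in reads:
    barcode, molecule = read[:BARCODE_LEN], read[BARCODE_LEN:]
    cells[barcode].append(molecule)    # group molecules by cell of origin

for barcode, molecules in cells.items():
    print(barcode, "->", len(molecules), "molecules")
```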

A multisource image fusion method for multimodal pig-body feature detection

  • Zhong, Zhen; Wang, Minjuan; Gao, Wanlin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.11 / pp.4395-4412 / 2020
  • Multisource image fusion has become an active topic in recent years owing to its higher segmentation rate. To enhance the accuracy of multimodal pig-body feature segmentation, a multisource image fusion method was employed; however, conventional multisource fusion methods cannot preserve strong contrast and abundant detail in the fused image. To better segment shape features and detect temperature features, a new multisource image fusion method, named NSST-GF-IPCNN, is presented. First, the multisource images are decomposed into a range of multiscale and multidirectional subbands by the Nonsubsampled Shearlet Transform (NSST). Then, to better describe fine-scale texture and edge information, an even-symmetric Gabor filter and an Improved Pulse-Coupled Neural Network (IPCNN) are used to fuse the low- and high-frequency subbands, respectively. Next, the fused coefficients are reconstructed into a fusion image using the inverse NSST. Finally, the shape feature is extracted with an automatic thresholding algorithm and refined by morphological operations, and the highest pig-body temperature is obtained from the segmentation results. Experiments show that the presented fusion algorithm achieves a 2.102-4.066% higher average accuracy rate than traditional algorithms while also improving efficiency.
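
Since NSST implementations are not commonly available in standard Python libraries, the sketch below substitutes a wavelet decomposition (PyWavelets) for NSST and simple fusion rules for the paper's Gabor-filter and IPCNN rules: low-frequency subbands are averaged, and the larger-magnitude coefficient is kept in each high-frequency subband. It illustrates the multiscale decompose-fuse-reconstruct pipeline, not the paper's method.

```python
# Simplified multiscale image fusion (wavelets standing in for NSST).
import numpy as np
import pywt

def fuse_images(visible, thermal, wavelet="db2", level=2):
    c1 = pywt.wavedec2(visible, wavelet, level=level)
    c2 = pywt.wavedec2(thermal, wavelet, level=level)
    fused = [(c1[0] + c2[0]) / 2.0]                 # low frequency: average
    for d1, d2 in zip(c1[1:], c2[1:]):              # high-frequency details
        fused.append(tuple(np.where(np.abs(a) > np.abs(b), a, b)
                           for a, b in zip(d1, d2)))  # keep stronger coefficient
    return pywt.waverec2(fused, wavelet)

visible = np.random.rand(64, 64)   # stand-in visible-light image
thermal = np.random.rand(64, 64)   # stand-in thermal image
fused = fuse_images(visible, thermal)
print(fused.shape)  # (64, 64)
```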