• Title/Abstract/Keywords: multimodal interaction

Search results: 59 items

A Full Body Gumdo Game with an Intelligent Cyber Fencer using a Multi-modal (3D Vision and Speech) Interface (멀티모달 인터페이스(3차원 시각과 음성)를 이용한 지능적 가상검객과의 전신 검도게임)

  • 윤정원;김세환;류제하;우운택
    • Journal of KIISE: Computing Practices and Letters, Vol. 9, No. 4, pp. 420-430, 2003
  • This paper presents an immersive multimodal Gumdo simulation game that allows a user to experience whole-body interaction with an intelligent cyber fencer. The proposed system consists of three modules: (i) a non-distracting multimodal interface with 3D vision and speech, (ii) an intelligent cyber fencer, and (iii) immersive feedback from a big screen and sound. First, the multimodal interface with 3D vision and speech allows the user to move around and shout freely without distraction. Second, the intelligent cyber fencer provides intelligent interactions through perception and reaction modules built from an analysis of real Gumdo matches. Finally, immersive audio-visual feedback from a big screen and sound effects helps the user experience an immersive interaction. The proposed system thus provides an immersive Gumdo experience with whole-body movement, and it can be applied to areas such as education, exercise, and art performance.
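The abstract above describes a three-module pipeline (multimodal input, fencer perception/reaction, audio-visual feedback). A minimal Python sketch of that control flow follows; the attack labels, counter rules, and feedback call are invented for illustration and are not taken from the paper.

```python
# Minimal sketch of the three-module loop described in the abstract
# (multimodal input -> cyber fencer perception/reaction -> feedback).
from dataclasses import dataclass

@dataclass
class MultimodalInput:
    sword_pose: str      # e.g. "head_cut", "wrist_cut" from the 3D vision module
    shout: bool          # speech module detected a shout

def fencer_react(obs: MultimodalInput) -> str:
    """Perception/reaction: map the observed attack to a counter move (illustrative rules)."""
    counters = {"head_cut": "block_high", "wrist_cut": "step_back"}
    move = counters.get(obs.sword_pose, "guard")
    # Treat a shout as signalling a committed attack and answer with a counterattack.
    return move + "_and_counter" if obs.shout else move

def render_feedback(move: str) -> None:
    """Immersive feedback module: drive the big screen and sound effects."""
    print(f"[screen] cyber fencer performs {move}  [sound] clash.wav")

render_feedback(fencer_react(MultimodalInput("head_cut", shout=True)))
```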

HomeN manager system based on multimodal context-aware middleware (멀티모달 상황인지 미들웨어 기반의 홈앤(HomeN) 매니저 시스템)

  • Ahn, Se-Yeol;Park, Sung-Chan;Park, Seong-Soo;Koo, Myung-Wan;Jeong, Yeong-Joon;Kim, Myung-Sook
    • Proceedings of the KSPS Conference, 2006 Fall Conference, pp. 120-123, 2006
  • Personalized user interfaces for mobile devices are expected to serve devices with a wide variety of capabilities and interaction modalities. In this paper, we implemented a multimodal context-aware middleware incorporating XML-based languages such as XHTML, VoiceXML, and SCXML. SCXML uses parallel states to invoke both XHTML and VoiceXML content, as well as to gather composite multimodal inputs and synchronize modalities through man-machine I/O. We developed a home networking service named "HomeN" based on our middleware framework. It demonstrates that users can carry out multimodal scenarios in a clear, concise, and consistent manner across various interactions.
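The abstract relies on SCXML parallel states to keep an XHTML (GUI) region and a VoiceXML (voice) region active at the same time. The toy document below only illustrates that pattern; the state names, events, and markup are assumptions, not the HomeN middleware's actual markup.

```python
# Illustrative only: a toy SCXML document with a <parallel> state holding a GUI
# region and a voice region, i.e. the synchronization pattern the abstract describes.
import xml.etree.ElementTree as ET

SCXML = """
<scxml initial="home" xmlns="http://www.w3.org/2005/07/scxml">
  <parallel id="home">
    <state id="gui">      <!-- renders XHTML content -->
      <transition event="gui.select_lamp" target="gui"/>
    </state>
    <state id="voice">    <!-- runs a VoiceXML dialog -->
      <transition event="voice.turn_on" target="voice"/>
    </state>
  </parallel>
</scxml>
"""

root = ET.fromstring(SCXML)
ns = {"s": "http://www.w3.org/2005/07/scxml"}
for region in root.find("s:parallel", ns).findall("s:state", ns):
    print("active region:", region.get("id"))
```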


Convergence evaluation method using multisensory and matching painting and music using deep learning based on imaginary soundscape (Imaginary Soundscape 기반의 딥러닝을 활용한 회화와 음악의 매칭 및 다중 감각을 이용한 융합적 평가 방법)

  • Jeong, Hayoung;Kim, Youngjun;Cho, Jundong
    • Journal of the Korea Convergence Society, Vol. 11, No. 11, pp. 175-182, 2020
  • In this study, we introduce a deep-learning technique for matching classical music to a painting in order to design a soundscape that helps the viewer appreciate the work, and we propose an evaluation index for how well the painting and music match. The evaluation combines a suitability rating on a 5-point Likert scale with an evaluation in the multimodal aspect. For the 13 test participants, the suitability score of the best deep-learning-based painting-music match was 3.74/5.0, and the average cosine similarity in the multimodal evaluation was 0.79. We expect multimodal evaluation to serve as an index for measuring a new kind of user experience. In addition, this study aims to improve the experience of multisensory artworks by proposing an interaction between the visual and auditory senses. The proposed painting-music matching method can be used in multisensory artwork exhibitions and, furthermore, will increase the accessibility of artworks for visually impaired people.
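As a rough sketch of the cosine-similarity part of the multimodal evaluation index mentioned above, the snippet below compares each participant's painting and music rating vectors and averages the similarities; the rating axes and scores are invented for illustration.

```python
# Cosine similarity between a participant's ratings of the painting and of the
# matched music, averaged over participants (illustrative data).
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Likert-style responses per participant: (painting vector, music vector)
responses = [([4, 2, 5], [5, 2, 4]), ([3, 4, 4], [3, 5, 4])]
avg = sum(cosine(p, m) for p, m in responses) / len(responses)
print(f"average cosine similarity: {avg:.2f}")
```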

Automatic Human Emotion Recognition from Speech and Face Display - A New Approach (인간의 언어와 얼굴 표정에 통하여 자동적으로 감정 인식 시스템 새로운 접근법)

  • Luong, Dinh Dong;Lee, Young-Koo;Lee, Sung-Young
    • Proceedings of the Korean Information Science Society Conference, 2011 Korea Computer Congress, Vol. 38, No. 1(B), pp. 231-234, 2011
  • Audiovisual human emotion recognition can be considered a good approach for multimodal human-computer interaction. However, optimal multimodal information fusion remains a challenge. In order to overcome these limitations and bring robustness to the interface, we propose a framework for an automatic human emotion recognition system based on speech and face display. In this paper, we develop a new approach for fusing information at the model level, based on the relationship between speech and facial expression, to detect temporal segments automatically and perform multimodal information fusion.
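The paper proposes model-level fusion of speech and face cues. As a simpler stand-in, and explicitly not the authors' method, the sketch below fuses per-modality emotion posteriors with fixed weights; the class list and weights are assumptions.

```python
# Score-level fusion of speech and face emotion posteriors (illustrative stand-in
# for the model-level fusion described in the abstract).
EMOTIONS = ["happy", "angry", "sad", "neutral"]

def fuse(speech_probs, face_probs, w_speech=0.4, w_face=0.6):
    """Weighted combination of per-modality class probabilities, renormalized."""
    fused = [w_speech * s + w_face * f for s, f in zip(speech_probs, face_probs)]
    total = sum(fused)
    return [p / total for p in fused]

speech = [0.2, 0.5, 0.2, 0.1]   # e.g. from an audio emotion classifier
face   = [0.1, 0.7, 0.1, 0.1]   # e.g. from a facial-expression classifier
fused = fuse(speech, face)
print(EMOTIONS[max(range(len(fused)), key=fused.__getitem__)], fused)
```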

An Experimental Multimodal Command Control Interface for Car Navigation Systems

  • Kim, Kyungnam;Ko, Jong-Gook;Choi, SeungHo;Kim, Jin-Young;Kim, Ki-Jung
    • Proceedings of the IEEK Conference, ITC-CSCC 2000, Vol. 1, pp. 249-252, 2000
  • An experimental multimodal system combining natural input modes such as speech, lip movement, and gaze is proposed in this paper. It benefits from novel human-computer interaction (HCI) modalities and from multimodal integration to tackle the HCI bottleneck. The system allows the user to select menu items on the screen by employing speech recognition, lip reading, and gaze tracking components in parallel; face tracking serves as a supplementary component to gaze tracking and lip movement analysis. These key components are reviewed, and preliminary results from multimodal integration and user testing on the prototype system are shown. Notably, the system equipped with gaze tracking and lip reading is very effective in noisy environments, where the speech recognition rate is low and unstable. Our long-term interest is to build a user interface embedded in a commercial car navigation system (CNS).
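A hedged sketch of the kind of parallel integration the abstract describes: speech, lip reading, and gaze each score menu items, and the visual modalities dominate when speech confidence drops in a noisy cabin. The menu items, scores, and weighting rule are illustrative, not from the paper.

```python
# Combine per-modality menu-item scores, down-weighting speech when it is unreliable.
def select_menu_item(speech, lips, gaze, speech_conf):
    """Each argument maps menu item -> score in [0, 1]; speech_conf in [0, 1]."""
    w_speech = speech_conf            # trust speech less when recognition is uncertain
    w_visual = 1.0 - speech_conf
    items = set(speech) | set(lips) | set(gaze)
    scores = {i: w_speech * speech.get(i, 0)
                 + w_visual * 0.5 * (lips.get(i, 0) + gaze.get(i, 0))
              for i in items}
    return max(scores, key=scores.get)

print(select_menu_item(
    speech={"destination": 0.4, "radio": 0.35},
    lips={"destination": 0.8},
    gaze={"destination": 0.9, "zoom": 0.1},
    speech_conf=0.3))   # noisy cabin: rely mostly on lips and gaze
```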


Multimodal based Storytelling Experience Using Virtual Reality in Museum (가상현실을 이용한 박물관 내 멀티모달 스토리텔링 경험 연구)

  • Lee, Ji-Hye
    • The Journal of the Korea Contents Association, Vol. 18, No. 10, pp. 11-19, 2018
  • This paper is about a multimodal storytelling experience that applies virtual reality technology in museums. Specifically, the research argues that virtual reality supports both an intuitive understanding of history and a multimodal experience of the exhibition space. The research investigates cases of virtual reality use in the museum sector. As a research method, the paper conducts a literature review of multimodal experience and of examples applying virtual-reality-related technologies in museums, and examines the necessary concepts together with related cases. Based on this investigation, the paper proposes the constituent elements of VR-based multimodal storytelling. Ultimately, it suggests elements for building VR storytelling in which dynamic audio-visual and interaction modes are combined with historical resources for diverse audiences.

Design of the emotion expression in multimodal conversation interaction of companion robot (컴패니언 로봇의 멀티 모달 대화 인터랙션에서의 감정 표현 디자인 연구)

  • Lee, Seul Bi;Yoo, Seung Hun
    • Design Convergence Study, Vol. 16, No. 6, pp. 137-152, 2017
  • This research aims to develop a companion robot experience design for the elderly in Korea, based on a needs-function deployment matrix of the robot and on research into robot emotion expression in multimodal interaction. First, elderly users' main needs were categorized into four groups based on ethnographic research. Second, the robot's functional elements and physical actuators were mapped to user needs in a function-needs deployment matrix. The final UX design prototype was implemented as a robot with a verbal, non-touch multimodal interface and emotional facial expressions based on Ekman's Facial Action Coding System (FACS). The prototype was validated in a user test session analyzing the influence of the robot interaction on users' cognition and emotion, using a Story Recall Test and facial emotion analysis software (Emotion API), under two conditions: when the robot's facial expression matched the emotion of the information it delivered, and when the robot initiated the interaction cycle voluntarily. The group with the emotional robot showed a relatively high recall rate in the delayed recall test, and in the facial expression analysis the robot's facial expression and interaction initiation affected the emotion and preference of the elderly participants.
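The prototype's facial expressions are based on Ekman's FACS. The sketch below maps a message's emotion to common FACS action-unit prototypes; the express() function and the robot-facing API shape are assumptions, not the study's implementation.

```python
# Map an emotion label to FACS action units (common prototypes for joy and sadness).
FACS_PROTOTYPES = {
    "joy":     [6, 12],        # cheek raiser + lip corner puller
    "sadness": [1, 4, 15],     # inner brow raiser + brow lowerer + lip corner depressor
    "neutral": [],
}

def express(emotion: str) -> str:
    """Return a description of the face the robot should show (illustrative API)."""
    aus = FACS_PROTOTYPES.get(emotion, [])
    return f"robot face: AUs {aus}" if aus else "robot face: neutral"

# The robot matches its expression to the emotion of the information it reads aloud.
print(express("sadness"))
```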

The Individual Discrimination Location Tracking Technology for Multimodal Interaction at the Exhibition (전시 공간에서 다중 인터랙션을 위한 개인식별 위치 측위 기술 연구)

  • Jung, Hyun-Chul;Kim, Nam-Jin;Choi, Lee-Kwon
    • Journal of Intelligence and Information Systems, Vol. 18, No. 2, pp. 19-28, 2012
  • After the Internet era, we are moving toward a ubiquitous society. People are now interested in multimodal interaction technology, which enables an audience to interact naturally with the computing environment at exhibition venues such as galleries, museums, and parks. There are also attempts to provide additional services based on the audience's location information, or to improve interaction between exhibits and audience by analyzing people's usage patterns. In order to provide multimodal interaction services to the audience at an exhibition, it is important to distinguish individuals and trace their location and route. For outdoor location tracking, GPS is widely used; it can obtain the real-time location of fast-moving subjects, so it is one of the key technologies in fields requiring location tracking. However, because GPS relies on satellite signals, it cannot be used indoors, where those signals cannot be received. For this reason, studies on indoor location tracking use very-short-range communication technologies such as ZigBee, UWB, and RFID, as well as mobile communication networks and wireless LAN. These technologies have shortcomings: the audience needs to carry an additional sensor device, and the system becomes difficult and expensive as the density of the target area increases. In addition, the usual exhibition environment contains many obstacles for the network, which degrades system performance. Above all, the biggest problem is that interaction methods based on such devices cannot provide a natural service to users; because the system relies on sensor recognition, every user must carry a device, which limits the number of users that can use the system simultaneously. To make up for these shortcomings, this study proposes a technology that obtains exact user location information through location mapping using the Wi-Fi of smartphones and 3D cameras. We use the signal strength of wireless LAN access points to build an indoor location tracking system at lower cost: an AP is cheaper than the devices used in other tracking techniques, and by installing software on the user's mobile device, the device itself serves as the tracking client. We used the Microsoft Kinect sensor as the 3D camera. Kinect can discriminate depth and human information within its field of view, so it is suitable for extracting the user's body, movement vector, and acceleration information at low cost. We confirm the location of the audience using the cell ID obtained from the Wi-Fi signal. By using smartphones as the base device for the location service, we remove the need for an additional tag and provide an environment in which multiple users can receive the interaction service simultaneously. The 3D cameras located in each cell area obtain exact location and status information about the users; they are connected to a Camera Client that calculates the mapping information aligned to each cell and obtains exact location, status, and pattern information about the audience. The Camera Client's location mapping technique decreases the error rate of indoor location services, increases the accuracy of individual discrimination in the area through body-based identification, and establishes a foundation for multimodal interaction technology at exhibitions. The calculated data enable users to receive the appropriate interaction service through the main server.
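A minimal sketch of the two-stage localization the abstract outlines: a coarse cell is chosen from Wi-Fi access-point signal strength, then the 3D camera assigned to that cell refines the position. AP names, cell layout, and coordinate offsets are invented for illustration.

```python
# Stage 1: coarse cell from Wi-Fi RSSI; stage 2: camera-local coordinates mapped
# into exhibition-wide coordinates for that cell (all values illustrative).
def coarse_cell(rssi_by_ap, ap_to_cell):
    """Pick the cell of the access point with the strongest signal."""
    best_ap = max(rssi_by_ap, key=rssi_by_ap.get)
    return ap_to_cell[best_ap]

def refine(cell, camera_xy, cell_origin):
    """Map camera-local coordinates into exhibition-wide coordinates."""
    ox, oy = cell_origin[cell]
    return ox + camera_xy[0], oy + camera_xy[1]

rssi = {"AP-entrance": -62, "AP-hall": -48, "AP-exit": -75}      # dBm
cell = coarse_cell(rssi, {"AP-entrance": "A", "AP-hall": "B", "AP-exit": "C"})
print(cell, refine(cell, camera_xy=(1.2, 0.8),
                   cell_origin={"A": (0, 0), "B": (10, 0), "C": (20, 0)}))
```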

A Study on Interaction between Multimodal Feedback Setting and Portable Patterns through Behavior Study of Mobile Phone User in Mobile Environment (모바일 환경 내 휴대폰 사용자 행동연구를 통한 다중양식 피드백 설정과 휴대패턴의 상호영향 연구)

  • Baek, Young-Mi;Myung, Ro-Hae;Yim, Jin-Ho
    • Proceedings of the HCI Society of Korea Conference 2006, Part 1, pp. 579-586, 2006
  • When using a mobile phone in a mobile environment, situations in which a call is unconsciously missed (a missed call) occur frequently. Even though mobile phones provide visual, auditory, and tactile multimodal feedback by default, there are likely many different causes for such missed calls. To identify these causes, this study conducted a behavioral study of mobile phone users in a mobile environment, analyzed general carrying patterns, and examined the mutual influence between carrying patterns and the alert modes that users mainly set in relation to missed calls. The results show that the ability to detect an incoming call in a mobile environment can be affected by cognitive-psychological (sensation- and attention-related), environmental, and behavioral factors. In addition, when the vibration mode, the alert mode most commonly used in mobile environments, was set, detection satisfaction for incoming calls decreased when other factors were present in combination in the user's environment.
