• Title/Summary/Keyword: Multimodal Information

255 search results

Hybrid feature extraction of multimodal images for face recognition

  • Cheema, Usman;Moon, Seungbin
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2018.10a
    • /
    • pp.880-881
    • /
    • 2018
  • Recent technological advances have made visible, infrared, and thermal imaging systems readily available for security and access control. The growing use of facial recognition in these applications has been accompanied by emerging spoofing techniques. To overcome the challenges of occlusion, replay attacks, and disguise, researchers have proposed using multiple imaging modalities. Using infrared and thermal modalities alongside visible imaging helps overcome the shortcomings of visible imaging alone. In this paper we review and propose hybrid feature extraction methods that combine data from multiple imaging systems simultaneously.

Speech and Textual Data Fusion for Emotion Detection: A Multimodal Deep Learning Approach (감정 인지를 위한 음성 및 텍스트 데이터 퓨전: 다중 모달 딥 러닝 접근법)

  • Edward Dwijayanto Cahyadi;Mi-Hwa Song
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.11a
    • /
    • pp.526-527
    • /
    • 2023
  • Speech emotion recognition (SER) is one of the interesting topics in the machine learning field. Developing a multi-modal speech emotion recognition system offers numerous benefits. This paper explains how to fuse BERT as the text recognizer with a CNN as the speech recognizer to build a multi-modal SER system.
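The text-plus-speech fusion described in the abstract above can be illustrated with a minimal late-fusion sketch. This is a hedged illustration, not the paper's implementation: the emotion classes, logit values, and 50/50 weights are invented; only the idea of combining a text model's and a speech model's per-class scores follows the abstract.

```python
import math

# Hypothetical late fusion of per-class emotion scores from a text model
# (e.g. BERT) and a speech model (e.g. a CNN). All numbers are illustrative.

def softmax(scores):
    """Convert raw scores (logits) to probabilities."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def late_fusion(text_logits, speech_logits, w_text=0.5, w_speech=0.5):
    """Weighted sum of the two models' class probabilities."""
    p_text = softmax(text_logits)
    p_speech = softmax(speech_logits)
    return [w_text * t + w_speech * s for t, s in zip(p_text, p_speech)]

# Example: 4 emotion classes (angry, happy, neutral, sad)
fused = late_fusion([2.0, 0.1, 0.5, 0.2], [0.3, 1.8, 0.4, 0.1])
predicted = max(range(len(fused)), key=fused.__getitem__)
```

Because each model's probabilities sum to 1 and the weights sum to 1, the fused vector is itself a probability distribution, so the argmax can be read directly as the fused prediction.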

The Individual Discrimination Location Tracking Technology for Multimodal Interaction at the Exhibition (전시 공간에서 다중 인터랙션을 위한 개인식별 위치 측위 기술 연구)

  • Jung, Hyun-Chul;Kim, Nam-Jin;Choi, Lee-Kwon
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.19-28
    • /
    • 2012
  • After the internet era, we are moving toward a ubiquitous society. Nowadays people are interested in multimodal interaction technology, which enables audiences to interact naturally with the computing environment at exhibition venues such as galleries, museums, and parks. There are also attempts to provide additional services based on the location information of the audience, or to improve and deploy interaction between exhibits and audience by analyzing people's usage patterns. In order to provide multimodal interaction services to the audience at an exhibition, it is important to distinguish individuals and trace their locations and routes. For outdoor location tracking, GPS is widely used: it can obtain the real-time location of fast-moving subjects, making it one of the key technologies in fields that require location tracking services. However, because GPS relies on satellites, it cannot be used indoors, where the satellite signal cannot be received. For this reason, studies on indoor location tracking are ongoing, using short-range communication technologies such as ZigBee, UWB, and RFID, as well as mobile communication networks and wireless LAN. However, these technologies require the audience to carry additional sensor devices, and they become difficult and expensive as the density of the target area increases. In addition, a typical exhibition environment contains many obstacles to the network, which degrades system performance. Above all, the biggest problem is that interaction methods based on these older technologies cannot provide natural service to users. Moreover, because the system relies on sensor recognition, every user must be equipped with a device, which limits the number of users who can use the system simultaneously. 
To make up for these shortcomings, in this study we propose a technology that obtains the exact location of users through a location-mapping technique combining the Wi-Fi signal of users' smartphones with 3D cameras. We applied the signal strength of wireless-LAN access points to develop a lower-cost indoor location tracking system. An AP is cheaper than the devices used in other tracking techniques, and by installing software on the user's mobile device, the phone itself can serve as the tracking device. We used the Microsoft Kinect sensor as the 3D camera. Kinect is equipped with functions that discriminate depth and human information within the shooting area, making it appropriate for extracting users' body, vector, and acceleration information at low cost. We confirm the location of the audience using the cell ID obtained from the Wi-Fi signal. By using smartphones as the basic device for the location service, we eliminate the need for additional tagging devices and provide an environment in which multiple users can receive the interaction service simultaneously. The 3D cameras located in each cell area obtain the exact location and status information of the users. The 3D cameras are connected to the Camera Client, which calculates the mapping information aligned to each cell, obtains exact user information, and captures the status and behavior patterns of the audience. The location-mapping technique of the Camera Client decreases the error rate of the indoor location service, increases the accuracy of individual discrimination within an area through body-information-based identification, and establishes the foundation of multimodal interaction technology at exhibitions. The calculated data and information enable users to receive the appropriate interaction service through the main server.
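The Wi-Fi cell-ID step described above can be sketched as a strongest-AP assignment: the visitor is placed in the cell whose access point shows the strongest received signal on the visitor's smartphone. This is a hedged illustration under assumed data: the AP names, cell labels, and RSSI readings are invented, not from the paper.

```python
# Hypothetical cell-ID localization: map each access point to an
# exhibition cell, then assign the visitor to the cell of the AP with
# the strongest RSSI (dBm values closer to 0 are stronger).

def locate_cell(rssi_readings, ap_to_cell):
    """rssi_readings: dict AP name -> RSSI in dBm; returns a cell label."""
    strongest_ap = max(rssi_readings, key=rssi_readings.get)
    return ap_to_cell[strongest_ap]

ap_to_cell = {"AP-01": "cell-A", "AP-02": "cell-B", "AP-03": "cell-C"}
readings = {"AP-01": -71, "AP-02": -48, "AP-03": -83}
cell = locate_cell(readings, ap_to_cell)
```

In the study's design, this coarse cell assignment only narrows the search area; the 3D camera assigned to that cell then refines the position and identifies the individual.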

Implementation of a Multimodal Controller Combining Speech and Lip Information (음성과 영상정보를 결합한 멀티모달 제어기의 구현)

  • Kim, Cheol;Choi, Seung-Ho
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.6
    • /
    • pp.40-45
    • /
    • 2001
  • In this paper, we implemented a multimodal system combining speech and lip information and evaluated its performance. We designed a speech recognizer using speech information and a lip recognizer using image information, both based on an HMM recognition engine. As the combining method we adopted late integration, with a weighting ratio of 8:2 for speech and lip information. Our multimodal recognition system was then ported onto the DARC system; that is, it was used to control Comdio of DARC, with the interface between DARC and our system implemented over a TCP/IP socket. The experimental results of controlling Comdio showed that lip recognition can serve as an auxiliary means for the speech recognizer by improving the recognition rate. We also expect that the multimodal system can be successfully applied to traffic information systems and car navigation systems (CNS).

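The 8:2 late-integration rule in the entry above can be sketched as a weighted combination of per-word recognizer scores. Only the 8:2 weighting follows the abstract; the command vocabulary and the log-likelihood values below are invented for illustration.

```python
# Hypothetical late integration: each recognizer scores every candidate
# command word (e.g. HMM log-likelihoods), and the fused score weights
# speech 0.8 and lip 0.2, as in the paper's stated ratio.

SPEECH_WEIGHT, LIP_WEIGHT = 0.8, 0.2

def integrate(speech_scores, lip_scores):
    """Return the command whose weighted combined score is highest."""
    fused = {
        word: SPEECH_WEIGHT * speech_scores[word] + LIP_WEIGHT * lip_scores[word]
        for word in speech_scores
    }
    return max(fused, key=fused.get)

speech = {"play": -12.0, "stop": -15.5, "next": -14.1}  # invented scores
lip    = {"play": -20.0, "stop": -9.0,  "next": -18.0}
command = integrate(speech, lip)
```

Note how the heavy speech weight lets the lip channel act only as an auxiliary vote: here the lip recognizer prefers "stop", but the fused decision still follows the speech recognizer's "play".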

Deep Multimodal MRI Fusion Model for Brain Tumor Grading (뇌 종양 등급 분류를 위한 심층 멀티모달 MRI 통합 모델)

  • Na, In-ye;Park, Hyunjin
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.05a
    • /
    • pp.416-418
    • /
    • 2022
  • Glioma is a type of brain tumor that occurs in glial cells and is classified into two types: high grade glioma, with a poor prognosis, and low grade glioma. Magnetic resonance imaging (MRI), a non-invasive method, is widely used in glioma diagnosis research. Studies are being conducted to obtain complementary information by combining multiple modalities, overcoming the incomplete-information limitation of a single modality. In this study, we developed a 3D CNN-based model that applies input-level fusion to MRI of four modalities (T1, T1Gd, T2, T2-FLAIR). The trained model showed classification performance of 0.8926 accuracy, 0.9688 sensitivity, 0.6400 specificity, and 0.9467 AUC on the validation data. This confirmed that the grade of glioma was effectively classified by learning the internal relationships between the modalities.

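The input-level fusion used in the entry above amounts to stacking the four MRI modalities along a channel axis before they enter the 3D CNN. The sketch below illustrates only that stacking step with toy-sized constant volumes; the CNN itself, the volume sizes, and the helper names are not from the paper.

```python
# Hypothetical input-level fusion: stack T1, T1Gd, T2 and T2-FLAIR
# volumes into one multi-channel input of shape [C][D][H][W].

def stack_modalities(volumes):
    """volumes: dict modality name -> 3D list [D][H][W]."""
    order = ["T1", "T1Gd", "T2", "T2-FLAIR"]  # fixed channel order
    return [volumes[name] for name in order]

def toy_volume(value, d=2, h=2, w=2):
    """A d x h x w volume filled with a single value, for illustration."""
    return [[[value] * w for _ in range(h)] for _ in range(d)]

names = ["T1", "T1Gd", "T2", "T2-FLAIR"]
volumes = {name: toy_volume(i) for i, name in enumerate(names)}
fused = stack_modalities(volumes)  # 4 channels, each 2x2x2
```

Keeping the channel order fixed matters in practice: a model trained with one ordering will misread inputs stacked in another.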

Effect of Multimodal cues on Tactile Mental Imagery and Attitude-Purchase Intention Towards the Product (다중 감각 단서가 촉각적 심상과 제품에 대한 태도-구매 의사에 미치는 영향)

  • Lee, Yea Jin;Han, Kwanghee
    • Science of Emotion and Sensibility
    • /
    • v.24 no.3
    • /
    • pp.41-60
    • /
    • 2021
  • The purpose of this research was to determine whether multimodal cues in an online shopping environment could enhance consumers' tactile mental imagery, purchase intentions, and attitudes towards an apparel product. One limitation of online retail is that consumers are unable to physically touch the items. However, as tactile information plays an important role in consumer decisions, especially for apparel products, this study investigated the effects of multimodal cues on overcoming the lack of tactile stimuli. In experiment 1, participants were randomly assigned to four conditions for exploring the product (picture only, video without sound, video with corresponding sound, and video with discordant sound), after which tactile mental imagery vividness, ease of imagination, attitude, and purchase intentions were measured. The video with discordant sound had the lowest average scores on all dependent variables. A within-participants design was used in experiment 2, in which all participants explored the same product in the four conditions in random order. They were told that they were visiting four different brands on a price-comparison website. After the same variables as in experiment 1, including the need for touch, were measured, the repeated-measures ANCOVA results revealed that, compared to the other conditions, the video with corresponding sound significantly enhanced tactile mental imagery vividness, attitude, and purchase intentions, whereas the discordant condition yielded significantly lower attitudes and purchase intentions. The dual mediation analysis also revealed that the multimodal cue conditions significantly predicted attitudes and purchase intentions by sequentially mediating imagery vividness and ease of imagination. 
In sum, vivid tactile mental imagery triggered by audio-visual stimuli can have a positive effect on consumer decision making by making it easier to imagine a situation in which consumers can touch and use the product.

Multimodal Digital Photographic Imaging System for Total Diagnostic Analysis of Skin Lesions: DermaVision-Pro (다모드 디지털 사진 영상 시스템을 이용한 피부 손상의 진단적 분석에 대한 연구 : DermaVision-Pro)

  • Bae, Young-Woo;Kim, Eun-Ji;Jung, Byung-Jo
    • Proceedings of the KIEE Conference
    • /
    • 2008.10b
    • /
    • pp.153-154
    • /
    • 2008
  • Digital photographic analysis is currently considered a routine procedure in the clinic because periodic follow-up examinations can provide meaningful information for diagnosis. However, it is impractical to separately evaluate all suspicious lesions with conventional digital photographic systems, whose imaging characteristics vary with environmental conditions. To address this issue, conventional systems must be integrated to support total diagnostic evaluation in the clinic. Previously, a multimodal digital photographic imaging system, which provides a conventional color image, parallel- and cross-polarization color images, and a fluorescent color image, was developed for objective evaluation of facial skin lesions. Based on our previous study, we introduce a commercial product, "DermaVision-PRO," for routine clinical use in dermatology. We characterize the system and describe the image analysis methods for objective evaluation of skin lesions. To demonstrate the validity of the system in dermatology, sample images were obtained from subjects with various skin disorders, and the image analysis methods were applied for objective evaluation of those lesions.


Multimodal Biometric Recognition System using Real Fuzzy Vault (실수형 퍼지볼트를 이용한 다중 바이오인식 시스템)

  • Lee, Dae-Jong;Chun, Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.23 no.4
    • /
    • pp.310-316
    • /
    • 2013
  • Biometric techniques have been widely used in various areas, including criminal identification, due to their reliability. However, they have some drawbacks when the biometric information is disclosed to unauthorized users. This paper proposes a multimodal biometric system using a real fuzzy vault based on RN-ECC to protect fingerprint and face templates. The proposed method has the advantage of being able to regenerate a key value, unlike face- or fingerprint-based verification systems, which are non-regenerative by nature, and of implementing an advanced biometric verification system by fusing fingerprint and face recognition. From various experiments, we found that the proposed method shows high recognition rates compared with conventional methods.
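The key-regeneration property mentioned above comes from the fuzzy vault construction. The sketch below shows the classical polynomial fuzzy vault over exact rationals, not the paper's real-valued RN-ECC variant: a secret is encoded in a polynomial, genuine biometric features are stored as points on it among chaff points, and a matching query regenerates the secret by interpolation. All names, feature values, and chaff points are invented.

```python
from fractions import Fraction

def lock(secret_coeffs, genuine_x, chaff_points):
    """Build a vault: genuine points lie on the secret polynomial,
    chaff points deliberately do not."""
    def p(x):
        return sum(c * x ** i for i, c in enumerate(secret_coeffs))
    points = [(x, p(x)) for x in genuine_x] + list(chaff_points)
    return sorted(points)  # ordering hides which points are genuine

def unlock(vault, query_x, degree):
    """Recover p(0) (the secret's constant term) by Lagrange
    interpolation over vault points whose x matches a query feature."""
    matches = [(x, y) for x, y in vault if x in query_x][:degree + 1]
    if len(matches) < degree + 1:
        return None  # not enough matching features to interpolate
    total = Fraction(0)
    for i, (xi, yi) in enumerate(matches):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(matches):
            if i != j:
                term *= Fraction(-xj, xi - xj)  # Lagrange basis at x = 0
        total += term
    return total

# Secret polynomial p(x) = 5 + 3x; features {2, 7} are genuine.
vault = lock([5, 3], genuine_x=[2, 7], chaff_points=[(4, 99), (9, 1)])
secret = unlock(vault, query_x={2, 7, 5}, degree=1)
```

A query sharing enough genuine features regenerates the secret exactly, while too few matches (or matches on chaff) yield nothing or garbage; this is what makes the key regenerable from fresh biometric readings, unlike a stored template.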

Interface Modeling for Digital Device Control According to Disability Type in Web

  • Park, Joo Hyun;Lee, Jongwoo;Lim, Soon-Bum
    • Journal of Multimedia Information System
    • /
    • v.7 no.4
    • /
    • pp.249-256
    • /
    • 2020
  • Learning methods using various assistive and smart devices have been developed to enable independent learning by the disabled. Pointer control is the most important consideration for disabled users when controlling a device and the contents of an existing graphical user interface (GUI) environment; however, depending on the disability type, difficulties can arise when using a pointer. Although the specific difficulties differ among blind, low-vision, and upper-limb-disabled users, problems with the accuracy of object selection and execution arise in common. We present a multimodal interface pilot solution that enables people with various disability types to control web interactions more easily. First, we classify web interaction types using digital devices and derive the essential web interactions among them. Second, to solve the problems that occur when performing web interactions for each disability type, we present the technology needed for each type's characteristics. Finally, we propose a pilot solution for the multimodal interface for each disability type. We identified three disability types and developed solutions for each: a remote-control voice interface for blind people, a voice output interface applying a selective focusing technique for low-vision people, and a gaze-tracking and voice-command interface for GUI operations for people with upper-limb disabilities.

International Multimodal Transport Route Development from Korea to Mongolia

  • Nyamjav, Tsenskhuu;Ha, Min-Ho
    • Journal of Navigation and Port Research
    • /
    • v.46 no.5
    • /
    • pp.419-426
    • /
    • 2022
  • This study aimed to identify new routes for transporting automobiles from Korea to Mongolia by comparing them with the existing route. At present, a route from the Incheon Port through the Tianjin Port to Zamiin-Uud is commonly used to transport containerized cargo from Korea to Mongolia. This study examined five possible logistics routes from Korea to Mongolia using a time/cost-distance methodology based on real data. Through consecutive discussions with importers and freight forwarders in Mongolia, the potential routes were selected, and their costs, distances, and lead times were evaluated to provide additional route options for automobile logistics from Korea to Mongolia. The results indicated that the routes could be ranked in terms of total cost, while the lead time for all options in the present COVID-19 period is 2-4 months, with no difference among the routes. In addition, although the confidence index of all routes was not impressive, route 3 was the most preferred option, followed by route 1. However, the study results cannot answer the question of which route is more attractive for transporting automobiles from Korea to Mongolia. This limitation notwithstanding, the study provides real information on the critical factors of distance, cost, and lead time for the selected transportation routes, so that importers and exporters can compare the routes by the priority of each factor in an uncertain logistics environment.
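The cost-based ranking step in the time/cost-distance comparison above can be sketched as follows. The figures below are invented toy data, not the study's measurements; only the idea of ranking routes by total cost while keeping distance and lead time alongside follows the abstract.

```python
# Hypothetical route table: total cost, distance, and lead time per
# candidate route (all values invented for illustration).
routes = {
    "route 1": {"cost_usd": 4200, "distance_km": 2100, "lead_time_days": 95},
    "route 2": {"cost_usd": 5100, "distance_km": 1900, "lead_time_days": 110},
    "route 3": {"cost_usd": 3900, "distance_km": 2400, "lead_time_days": 90},
}

# Rank by total cost, the study's primary discriminating factor, since
# lead times were found to be indistinguishable across routes.
ranked = sorted(routes, key=lambda r: routes[r]["cost_usd"])
```

With lead time effectively tied across options, a single-key sort on cost suffices; if lead times diverged, the key could become a tuple such as `(cost, lead_time)` to break ties.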