• Title/Summary/Keyword: 멀티모달정보 (multimodal information)

Deep Multimodal MRI Fusion Model for Brain Tumor Grading (뇌 종양 등급 분류를 위한 심층 멀티모달 MRI 통합 모델)

  • Na, In-ye;Park, Hyunjin
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.416-418 / 2022
  • Glioma is a type of brain tumor that occurs in glial cells and is classified into two types: high grade glioma, which has a poor prognosis, and low grade glioma. Magnetic resonance imaging (MRI), a non-invasive method, is widely used in glioma diagnosis research. Studies are being conducted to obtain complementary information by combining multiple modalities, overcoming the limited information available from a single modality. In this study, we developed a 3D CNN-based model that applies input-level fusion to MRI of four modalities (T1, T1Gd, T2, T2-FLAIR); an illustrative fusion sketch follows this entry. The trained model showed classification performance of 0.8926 accuracy, 0.9688 sensitivity, 0.6400 specificity, and 0.9467 AUC on the validation data. This confirmed that the grade of glioma was effectively classified by learning the internal relationships between the modalities.

  • PDF
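
As a rough illustration of the input-level fusion described above, the following Python sketch (assuming PyTorch) stacks the four MRI modalities as channels of a single 3D CNN input. The layer sizes and depth are arbitrary assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class InputLevelFusion3DCNN(nn.Module):
    """Toy 3D CNN that fuses four MRI modalities (T1, T1Gd, T2, T2-FLAIR)
    at the input level by stacking them as channels."""
    def __init__(self, num_classes=2):  # low-grade vs. high-grade glioma
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(4, 16, kernel_size=3, padding=1),  # 4 channels = 4 modalities
            nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, t1, t1gd, t2, flair):
        # Input-level fusion: concatenate modalities along the channel axis
        x = torch.stack([t1, t1gd, t2, flair], dim=1)  # (B, 4, D, H, W)
        return self.classifier(self.features(x).flatten(1))

# Example: batch of 2 volumes, 64^3 voxels per modality
volumes = [torch.randn(2, 64, 64, 64) for _ in range(4)]
logits = InputLevelFusion3DCNN()(*volumes)  # shape: (2, 2)
```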

The Interactive Learning Experience by Integrating Educational Robots into the Augmented Reality (교육용 로봇과 증강 현실 결합을 통한 인터랙티브 학습 경험)

  • Yu, Jeong Su
    • Journal of The Korean Association of Information Education / v.16 no.4 / pp.419-427 / 2012
  • This paper presents the effect of an interactive learning experience and students' responses to its technological components. We developed an interactive learning environment and learning model for lessons relying on educational robots and augmented reality in the school classroom. The developed learning model is based on the problem-based learning model. The experiments were conducted with 18 students, 5th and 6th graders of an elementary school, over 8 weeks using the developed system. We found that the interactive learning experiences influence the creative ability of children, and that students who scored lower on the school exam scored higher on creativity than top students when learning with educational robots and augmented reality.

  • PDF

Smart Affect Jewelry based on Multi-modal (멀티 모달 기반의 스마트 감성 주얼리)

  • Kang, Yun-Jeong
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.7 / pp.1317-1324 / 2016
  • The Arduino platform is used to build jewelry whose colors express emotions. For emotional color expression, Plutchik's Wheel of Emotions model is applied to map the similarity between emotions and colors. The system receives readings from the temperature, light, sound, pulse, and gyro sensors of a smart jewelry piece that is easily accessible from a smartphone, and recognizes the wearer's emotion by applying ontology-based inference rules. The emotion inferred from the context is matched to its corresponding color and applied to the smart LED jewelry, so that the emotion-color combination extracted from the sensors' contextual information is reflected in the built-in LEDs according to the wearer's emotional state. Smart jewelry that adds light to emotion can thus represent the emotions of a situation and serve as a tool of self-expression.
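
As a loose illustration of the emotion-to-color rule, the Python sketch below maps Plutchik's eight primary emotions to RGB values approximating the wheel's conventional colors; the inference stub and its thresholds are invented placeholders for the paper's ontology-based rules.

```python
# Hypothetical sketch: Plutchik's eight primary emotions mapped to LED colors.
# RGB values approximate the colors conventionally shown on Plutchik's wheel.

PLUTCHIK_COLORS = {
    "joy":          (255, 255,   0),   # yellow
    "trust":        (  0, 200,   0),   # green
    "fear":         (  0, 100,   0),   # dark green
    "surprise":     (  0, 180, 255),   # light blue
    "sadness":      (  0,   0, 255),   # blue
    "disgust":      (160,   0, 200),   # purple
    "anger":        (255,   0,   0),   # red
    "anticipation": (255, 140,   0),   # orange
}

def infer_emotion(pulse_bpm: float, sound_db: float) -> str:
    """Invented stand-in for the paper's ontology-based inference rules."""
    if pulse_bpm > 100 and sound_db > 70:
        return "anger"
    if pulse_bpm > 100:
        return "fear"
    return "joy"

def led_color_for(pulse_bpm: float, sound_db: float) -> tuple:
    return PLUTCHIK_COLORS[infer_emotion(pulse_bpm, sound_db)]

print(led_color_for(110, 80))  # -> (255, 0, 0): red for anger
```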

Development of Context Awareness and Service Reasoning Technique for Handicapped People (멀티 모달 감정인식 시스템 기반 상황인식 서비스 추론 기술 개발)

  • Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.1 / pp.34-39 / 2009
  • As a subjective recognition effect, human emotion has an impulsive character and unconsciously expresses intentions and needs. These carry contextual information about users of ubiquitous computing environments or intelligent robot systems. Indicators from which a user's emotion can be recognized include facial images, voice signals, and biological signal spectra. In this paper, we generate separate facial and voice emotion recognition results from facial images and voice to increase the convenience and efficiency of emotion recognition. We also extract the best-fitting features from the image and sound to improve the emotion recognition rate, and implement a multi-modal emotion recognition system based on feature fusion. Finally, using the emotion recognition results, we propose a ubiquitous computing service reasoning method based on a Bayesian network and a ubiquitous context scenario.
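
The feature-fusion step can be pictured as concatenating per-modality feature vectors before a single classifier sees them. The minimal sketch below assumes arbitrary feature dimensions and omits the paper's actual extractors and Bayesian-network reasoning.

```python
import numpy as np

def fuse_features(face_feat: np.ndarray, voice_feat: np.ndarray) -> np.ndarray:
    """Feature-level (early) fusion: concatenate per-modality vectors."""
    return np.concatenate([face_feat, voice_feat], axis=-1)

face = np.random.rand(128)   # e.g., a facial-expression descriptor (assumed size)
voice = np.random.rand(64)   # e.g., a prosodic/spectral descriptor (assumed size)
fused = fuse_features(face, voice)
print(fused.shape)           # (192,) -> input to a single emotion classifier
```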

Text Augmentation Using Hierarchy-based Word Replacement

  • Kim, Museong;Kim, Namgyu
    • Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.57-67 / 2021
  • Recently, multi-modal deep learning techniques that combine heterogeneous data for deep learning analysis have been widely utilized. In particular, studies on text-to-image synthesis, which automatically generates images from text, are being actively conducted. Deep learning for image synthesis requires a vast amount of data consisting of pairs of images and text describing each image, so various data augmentation techniques have been devised to generate large datasets from small ones. A number of text augmentation techniques based on synonym replacement have been proposed, but they share a common limitation: replacing a noun with a synonym may generate text inconsistent with the content of the image. In this study, we propose a text augmentation method that replaces noun words using word hierarchy information. Additionally, we performed experiments on MSCOCO data to evaluate the performance of the proposed methodology.
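
The hierarchy-based replacement idea can be sketched with WordNet as an assumed stand-in for the paper's word hierarchy: instead of a synonym, a noun is swapped for one of its hypernyms or hyponyms.

```python
import random
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)

def replace_noun_with_hierarchy(sentence: str, noun: str) -> str:
    """Replace a noun with a word from its hierarchy (hypernyms/hyponyms)
    rather than a flat synonym. WordNet is an assumed stand-in for the
    hierarchy used in the paper."""
    synsets = wn.synsets(noun, pos=wn.NOUN)
    if not synsets:
        return sentence
    neighbors = synsets[0].hypernyms() + synsets[0].hyponyms()
    if not neighbors:
        return sentence
    replacement = random.choice(neighbors).lemmas()[0].name().replace("_", " ")
    return sentence.replace(noun, replacement)

print(replace_noun_with_hierarchy("a dog plays with a ball", "dog"))
# e.g., "a canine plays with a ball" (hypernym) or a specific breed (hyponym)
```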

Text Mining Analysis of Customer Reviews on Public Service Robots: With a focus on the Guide Robot Cases (텍스트 마이닝을 활용한 공공기관 서비스 로봇에 대한 사용자 리뷰 분석 : 안내로봇 사례를 중심으로)

  • Hyorim Shin;Junho Choi;Changhoon Oh
    • The Journal of the Convergence on Culture Technology / v.9 no.1 / pp.787-797 / 2023
  • The use of service robots, particularly guide robots, is becoming increasingly prevalent in public institutions. However, there has been limited research into the interactions between users and guide robots. To explore the customer experience with the guide robot, we selected 'QI', the robot that has served customers the longest, and collected all reviews posted since the service was launched in public institutions. Using text mining techniques, we identified the main keywords and user experience factors and examined factors that hinder the user experience. As a result, the guide robot's functionality, appearance, interaction methods, and role as a cultural commentator and helper were the key factors that influenced the user experience. After identifying the hindrance factors, we suggested solutions such as improved interaction design, multimodal interface service design, and content development. This study contributes to the understanding of user experience with guide robots and provides practical suggestions for improvement.
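
As a hint of what such a text-mining pass might look like, the sketch below extracts top TF-IDF keywords from toy reviews. TF-IDF is an assumed stand-in, not necessarily the paper's exact pipeline, and the review texts are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy reviews standing in for the collected guide-robot reviews.
reviews = [
    "The robot guided me to the exhibit and explained its history",
    "Voice recognition failed in the noisy lobby",
    "Friendly appearance but the touch screen froze",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(reviews)
scores = X.sum(axis=0).A1                  # aggregate TF-IDF weight per term
terms = vec.get_feature_names_out()
top = sorted(zip(terms, scores), key=lambda t: -t[1])[:5]
print(top)                                 # candidate "main keywords"
```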

Multicontents Integrated Image Animation within Synthesis for High Quality Multimodal Video (고화질 멀티 모달 영상 합성을 통한 다중 콘텐츠 통합 애니메이션 방법)

  • Jae Seung Roh;Jinbeom Kang
    • Journal of Intelligence and Information Systems / v.29 no.4 / pp.257-269 / 2023
  • There is currently a burgeoning demand for image synthesis from photos and videos using deep learning models. Existing video synthesis models extract only motion information from the provided video to generate animation effects on photos, and they struggle to achieve accurate lip synchronization with the audio and to maintain the image quality of the synthesized output. To tackle these issues, this paper introduces a novel framework based on an image animation approach. Given a photo, a video, and audio, the framework produces output that retains the unique characteristics of the individuals in the photo while synchronizing their movements with the provided video and their lips with the audio. A super-resolution model is then employed to enhance the quality and resolution of the synthesized output.
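
The framework's three stages can be summarized structurally. In the sketch below every function is a hypothetical placeholder (no real library API), shown only to make the data flow concrete.

```python
# Purely structural sketch of the described pipeline; all functions are
# hypothetical placeholders, not real library calls.

def animate(photo, driving_video):
    """Image animation: transfer motion from the driving video to the photo."""
    ...

def lip_sync(animated_frames, audio):
    """Align mouth movements in the frames with the audio."""
    ...

def super_resolve(frames):
    """Upscale frames to restore detail lost during synthesis."""
    ...

def synthesize(photo, video, audio):
    frames = animate(photo, video)   # motion from the video
    frames = lip_sync(frames, audio) # lips from the audio
    return super_resolve(frames)     # quality/resolution enhancement
```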

Performance Analysis for Accuracy of Personality Recognition Models based on Setting of Margin Values at Face Region Extraction (얼굴 영역 추출 시 여유값의 설정에 따른 개성 인식 모델 정확도 성능 분석)

  • Qiu Xu;Gyuwon Han;Bongjae Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.1 / pp.141-147 / 2024
  • Recently, there has been growing interest in personalized services tailored to an individual's preferences, which has led to ongoing research on recognizing and leveraging personality traits. Among various methods for personality assessment, the OCEAN model stands out as a prominent approach. When using OCEAN for personality recognition, a multi-modal artificial intelligence model incorporating linguistic, paralinguistic, and non-linguistic information is often employed. This paper examines how the margin value used when extracting facial regions from video data affects the accuracy of a personality recognition model that determines OCEAN traits from facial expressions. The study employed personality recognition models based on 2D Patch Partition, R2plus1D, 3D Patch Partition, and Video Swin Transformer. Setting the facial area extraction margin to 60 resulted in the highest 1-MAE performance, 0.9118. These findings indicate the importance of selecting an optimal margin value to maximize the performance of personality recognition models.
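
The margin-based face cropping the paper evaluates can be illustrated in a few lines of Python: the helper below expands a detected bounding box by a fixed pixel margin and clamps it to the image bounds, which is a common interpretation of such a margin value (the paper's exact semantics are an assumption).

```python
import numpy as np

def crop_face_with_margin(image: np.ndarray, box: tuple, margin: int) -> np.ndarray:
    """Crop a detected face box expanded by `margin` pixels on every side,
    clamped to the image bounds."""
    x1, y1, x2, y2 = box
    h, w = image.shape[:2]
    x1 = max(0, x1 - margin)
    y1 = max(0, y1 - margin)
    x2 = min(w, x2 + margin)
    y2 = min(h, y2 + margin)
    return image[y1:y2, x1:x2]

frame = np.zeros((480, 640, 3), dtype=np.uint8)       # stand-in video frame
face = crop_face_with_margin(frame, (200, 150, 320, 290), margin=60)
print(face.shape)  # (260, 240, 3): the box grown by 60 px on each side
```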

NUI/NUX framework based on intuitive hand motion (직관적인 핸드 모션에 기반한 NUI/NUX 프레임워크)

  • Lee, Gwanghyung;Shin, Dongkyoo;Shin, Dongil
    • Journal of Internet Computing and Services / v.15 no.3 / pp.11-19 / 2014
  • The natural user interface/experience (NUI/NUX) provides a natural motion interface without devices or tools such as mice, keyboards, pens, and markers. Until now, typical motion recognition methods used markers, receiving the coordinates of each marker as relative data and storing the coordinate values in a database. However, recognizing motion accurately requires more markers, and much time is spent attaching markers and processing the data. Moreover, because NUI/NUX frameworks have been developed without the most important quality, intuitiveness, usability problems arise and users are forced to learn the usage of many different frameworks. To address these problems, we implemented a marker-free system that anyone can operate. We designed a multi-modal NUI/NUX framework that controls voice, body motion, and facial expression simultaneously, and propose a new mouse-operation algorithm that recognizes intuitive hand gestures and maps them onto the monitor, so that users can operate this "hand mouse" easily and intuitively.
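
A minimal sketch of the core of such a hand mouse: mapping a tracked hand position, given in normalized camera coordinates, onto monitor pixels. The [0, 1] coordinate convention and the screen size are assumptions, not details from the paper.

```python
def hand_to_screen(hand_x: float, hand_y: float,
                   screen_w: int = 1920, screen_h: int = 1080) -> tuple:
    """Map a hand position in normalized camera coordinates ([0, 1],
    assumed) onto monitor pixels. Mirrors x so that moving the hand
    right moves the cursor right."""
    x = int((1.0 - hand_x) * screen_w)
    y = int(hand_y * screen_h)
    # Clamp in case the tracker reports values slightly outside [0, 1]
    return max(0, min(screen_w - 1, x)), max(0, min(screen_h - 1, y))

print(hand_to_screen(0.25, 0.5))  # -> (1440, 540)
```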

A Multi Modal Interface for Mobile Environment (모바일 환경에서의 Multi Modal 인터페이스)

  • Seo, Yong-Won;Lee, Beom-Chan;Lee, Jun-Hun;Kim, Jong-Phil;Ryu, Je-Ha
    • Proceedings of the HCI Society of Korea Conference / 2006.02a / pp.666-671 / 2006
  • A "multi-modal interface" is a method of human-machine communication that uses voice, a keyboard, and a pen. Recently, as mobile devices have become widespread, smaller, and more intelligent, and as their applications have diversified, expectations have grown for input methods that users can operate more conveniently and easily. Currently, the only input devices available on a mobile device are its buttons or a touch pad (in the case of a PDA). However, people with disabilities have difficulty using buttons or touch pads, playing games on mobile devices is also hard, and this hinders the development of new games and applications. To overcome these problems, this paper presents a new multi-modal interface for mobile devices. Using a PDA (Personal Digital Assistant), we developed a multi-modal interface that offers greater fun and realism: sensors let users control the device with the wrist, providing a convenient and novel input method (a tilt-mapping sketch follows this entry). If voice recognition is added in the future, machines could communicate through voice and gestures, as humans do, instead of through traditional keyboards or buttons. In addition, by adding tactile feedback through a vibrator, information can be provided to visually impaired and elderly users who have so far been excluded from multimodal interfaces; in fact, people respond much faster to touch than to sight or hearing. Applied to games, this system lets users participate actively and enjoy a more realistic experience. In special situations it can convey discreet information, and it can be used in mobile application services to be developed in the future.

  • PDF
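
A minimal sketch of the wrist-control idea, assuming the sensor reports tilt as accelerometer readings in g units; the threshold and the four-way mapping are invented for illustration.

```python
def tilt_to_direction(ax: float, ay: float, threshold: float = 0.3) -> str:
    """Map accelerometer tilt (g units, assumed) to a 4-way game input,
    the kind of wrist control described above."""
    if abs(ax) < threshold and abs(ay) < threshold:
        return "neutral"
    if abs(ax) >= abs(ay):
        return "right" if ax > 0 else "left"
    return "down" if ay > 0 else "up"

print(tilt_to_direction(0.6, -0.1))  # -> "right"
```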