• Title/Summary/Keyword: multimodal information analysis (멀티모달 정보분석)

A Method of Comparing Risk Similarities Based on Multimodal Data (멀티모달 데이터 기반 위험 발생 유사성 비교 방법)

  • Kwon, Eun-Jung;Shin, WonJae;Lee, Yong-Tae;Lee, Kyu-Chul
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.510-512 / 2019
  • Recently, there have been growing requirements in the public safety sector to ensure safety through the detection of hazardous situations and preemptive prediction. With the spread of mobile devices, a wide variety of sensor data can now be collected and analyzed, offering many advantages for safety and security. An effective modeling technique is therefore needed to combine the sensor data generated by smartphones and wearable devices, analyze users' movement and behavioral patterns, and ensure public safety by fusing these with the location-based crime risk data provided.

A Study on the Analysis of Educational Objectives of 'Library and Information Life' Textbooks Based on the Eisner Curriculum (아이즈너 교육과정에 의한 '도서관과 정보생활' 교과서 교육목표 분석에 관한 연구)

  • Byeong-Kee Lee
    • Journal of the Korean Society for Library and Information Science / v.58 no.2 / pp.57-80 / 2024
  • Eisner emphasized the importance of problem-solving objectives and expressive objectives in addition to behavioral objectives, and of communication through multiple modalities, including linguistic, visual, aural, spatial, and gestural modes. This study analyzes 'Library and Information Life,' a textbook developed for information literacy instruction, by classifying its educational objectives by type (behavioral, problem-solving, expressive) and by multimodal mode (linguistic, visual, auditory, spatial, and gestural), and seeks to derive implications for setting educational objectives for information literacy instruction and for developing textbooks. The textbook has four volumes, for lower elementary, upper elementary, middle school, and high school levels. Educational objectives were extracted from the textbooks, and three librarian teachers took part in analyzing them. The main findings and implications are as follows. First, regarding the types of educational objectives, the proportion of behavioral objectives was found to be excessively high, and the proportions of problem-solving and expressive objectives need to be strengthened. Second, problem-solving objectives tend to overlap with behavioral objectives, indicating a need to develop problem-solving objectives with clearly defined conditions and solution requirements. Third, expressive objectives are concentrated in specific units and need to be distributed more evenly across other units. Fourth, in the case of multimodal modes, the proportion of the linguistic mode should be reduced, the proportions of the visual, auditory, spatial, and gestural modes increased, and educational objectives set that reflect the distinct characteristics of each mode.

Design of the emotion expression in multimodal conversation interaction of companion robot (컴패니언 로봇의 멀티 모달 대화 인터랙션에서의 감정 표현 디자인 연구)

  • Lee, Seul Bi;Yoo, Seung Hun
    • Design Convergence Study / v.16 no.6 / pp.137-152 / 2017
  • This research develops a companion robot experience design for the elderly in Korea based on a needs-function deployment matrix for the robot and on research into robot emotion expression in multimodal interaction. First, elderly users' main needs were categorized into four groups based on ethnographic research. Second, the functional elements and physical actuators of the robot were mapped to user needs in the needs-function deployment matrix. The final UX design prototype was implemented as a robot with a verbal, non-touch multimodal interface and emotional facial expressions based on Ekman's Facial Action Coding System (FACS). The prototype was validated in a user test session that analyzed the influence of the robot's interaction on users' cognition and emotion, using a story recall test and face emotion analysis software (Emotion API), under two conditions: when the robot changed its facial expression to match the emotion of the information it delivered, and when the robot initiated the interaction cycle voluntarily. The group interacting with the emotional robot showed a relatively high recall rate in the delayed recall test, and in the facial expression analysis the robot's facial expression and interaction initiation affected the emotion and preference of the elderly participants.
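
As an illustration only (not taken from the paper), the sketch below shows one simple way such a FACS-based expression module could be organized: a lookup from Ekman's six basic emotions to commonly cited action-unit combinations, queried with the emotion of the message the robot is about to deliver. The AU combinations and function names are assumptions, not the authors' design.

```python
# Hypothetical sketch: emotion -> FACS action units for a robot face.
# The AU combinations below are commonly cited pairings for Ekman's basic
# emotions, not the mapping used in the paper.
EMOTION_TO_AUS = {
    "happiness": [6, 12],
    "sadness":   [1, 4, 15],
    "surprise":  [1, 2, 5, 26],
    "fear":      [1, 2, 4, 5, 7, 20, 26],
    "anger":     [4, 5, 7, 23],
    "disgust":   [9, 15],
}

def expression_for(message_emotion: str) -> list[int]:
    """Return the action units the robot face should activate for a message."""
    return EMOTION_TO_AUS.get(message_emotion, [])  # empty list = neutral face

print(expression_for("happiness"))  # [6, 12]
```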

Social Network Analysis of TV Drama via Location Knowledge-learned Deep Hypernetworks (장소 정보를 학습한 딥하이퍼넷 기반 TV드라마 소셜 네트워크 분석)

  • Nan, Chang-Jun;Kim, Kyung-Min;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.22 no.11 / pp.619-624 / 2016
  • Social-aware video displays not only the relationships between characters but also diverse information on topics such as economics, politics, and culture as a story unfolds. In particular, the speaking habits and behavioral patterns of people in different situations are very important for the analysis of social relationships. However, when dealing with such dynamic multimodal data, it is difficult for a computer to analyze the drama data effectively. To solve this problem, previous studies employed the deep concept hierarchy (DCH) model to automatically construct and analyze social networks in a TV drama. Nevertheless, since location knowledge was not included, they could only analyze the social network of the story as a whole. In this research, we include location knowledge and analyze the social relations in different locations. We use approximately 4,400 minutes of the TV drama Friends as our dataset. We perform face recognition on the characters using a convolutional-recursive neural network model and use a bag-of-features model to classify scenes. Then, for the different scenes, we build the social network between the characters using the deep concept hierarchy model and analyze how the social network changes as the story unfolds.
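
A minimal sketch (not the authors' deep concept hierarchy model) of the final aggregation step the abstract describes: once face recognition and scene classification have produced per-scene character lists and locations, a per-location social network can be built from co-occurrences. The scene records, character names, and edge weighting below are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations
import networkx as nx

# Hypothetical output of the face recognizer and scene classifier:
# (location label, set of characters recognized in the scene)
scenes = [
    ("apartment",   {"Ross", "Rachel", "Monica"}),
    ("coffee_shop", {"Ross", "Rachel"}),
    ("coffee_shop", {"Chandler", "Joey", "Rachel"}),
]

graphs = defaultdict(nx.Graph)           # one social network per location
for location, characters in scenes:
    g = graphs[location]
    for a, b in combinations(sorted(characters), 2):
        if g.has_edge(a, b):
            g[a][b]["weight"] += 1       # repeated co-appearance strengthens the tie
        else:
            g.add_edge(a, b, weight=1)

for location, g in graphs.items():
    print(location, sorted(g.edges(data="weight")))
```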

Literature Review of AI Hallucination Research Since the Advent of ChatGPT: Focusing on Papers from arXiv (챗GPT 등장 이후 인공지능 환각 연구의 문헌 검토: 아카이브(arXiv)의 논문을 중심으로)

  • Park, Dae-Min;Lee, Han-Jong
    • Informatization Policy / v.31 no.2 / pp.3-38 / 2024
  • Hallucination is a significant barrier to the utilization of large language models and multimodal models. In this study, we collected 654 computer science papers with "hallucination" in the abstract from arXiv, from December 2022 (following the advent of ChatGPT) to January 2024, and conducted frequency analysis, knowledge network analysis, and a literature review to explore the latest trends in hallucination research. The results showed that research was active in the fields of "Computation and Language," "Artificial Intelligence," "Computer Vision and Pattern Recognition," and "Machine Learning." We then analyzed the research trends in these four major fields by focusing on the main authors and dividing the work into data, hallucination detection, and hallucination mitigation. The main research trends included hallucination mitigation through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), inference enhancement via chain of thought (CoT), and growing interest in hallucination mitigation in the domain of multimodal AI. This study provides insights into the latest developments through a technology-oriented literature review and is expected to help subsequent research in engineering as well as in the humanities and social sciences by clarifying the latest trends in hallucination research.

A News Video Mining based on Multi-modal Approach and Text Mining (멀티모달 방법론과 텍스트 마이닝 기반의 뉴스 비디오 마이닝)

  • Lee, Han-Sung;Im, Young-Hee;Yu, Jae-Hak;Oh, Seung-Geun;Park, Dai-Hee
    • Journal of KIISE: Databases / v.37 no.3 / pp.127-136 / 2010
  • With the rapid growth of information and computer communication technologies, the number of digital documents, including multimedia data, has exploded. In particular, news video databases and news video mining have become the subject of extensive research aimed at developing effective and efficient tools for manipulating and analyzing news videos, because of their information richness. However, most research focuses on the browsing, retrieval, and summarization of news videos; discovering and analyzing the abundant latent semantic knowledge in news videos is still at a relatively early stage. In this paper, we propose a news video mining system based on a multimodal approach and text mining, which uses the visual and textual information of news video clips and their scripts. The proposed system automatically constructs a taxonomy of news video stories with a hierarchical clustering algorithm, one of the standard text mining methods. It then analyzes the topics of news video stories from multiple angles by means of a time-cluster trend graph, a weighted cluster growth index, and network analysis. To demonstrate the validity of our approach, we analyzed news videos on the Second Summit of South and North Korea in 2007.
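
As a rough illustration of the taxonomy-construction step (the paper's exact features and clustering settings are not given here), the sketch below clusters a few toy news scripts with TF-IDF features and agglomerative hierarchical clustering; the scripts and the number of clusters are invented for the example.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-ins for news video scripts (the real system works on Korean scripts).
scripts = [
    "South and North Korea summit agenda announced",
    "Leaders meet for the second inter-Korean summit",
    "Stock market reacts to the summit news",
    "Heavy rain expected across the peninsula this weekend",
]

X = TfidfVectorizer().fit_transform(scripts).toarray()
labels = AgglomerativeClustering(n_clusters=2, linkage="ward").fit_predict(X)
for label, script in zip(labels, scripts):
    print(label, script)   # stories with similar vocabulary fall into one cluster
```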

R3 : Open Domain Question Answering System Using Structure Information of Tables (R3 : 테이블의 구조 정보를 활용한 오픈 도메인 질의응답 시스템)

  • Deokhyung Kang;Gary Geunbae Lee
    • Annual Conference on Human and Language Technology / 2022.10a / pp.455-460 / 2022
  • In open-domain question answering, an answer is obtained by retrieving documents relevant to the question and then analyzing the retrieved documents that may contain the answer. Although tables within documents may also be relevant to the question, previous work has focused mainly on retrieving only the text portion of documents. Studies on question answering that consider both tables and text have followed, but they have limitations such as losing the structural information of tables. In this work we propose R3, an open-domain question answering system that exploits the structural information of tables through additional embeddings in the model. R3 was trained and evaluated on NQ-Open-Multi, a new dataset based on the open-domain question answering dataset NQ, and showed better performance than a system that does not use the structural information of tables.
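
A minimal sketch (not the R3 implementation) of the general idea of exploiting table structure through additional embeddings: row and column index embeddings are added to the token embeddings of flattened table cells before they reach the encoder. The vocabulary size, dimensions, and flattening scheme are assumptions for illustration.

```python
import torch
import torch.nn as nn

class TableEmbedding(nn.Module):
    """Token + row + column embeddings for the tokens of a flattened table."""
    def __init__(self, vocab_size=30522, dim=256, max_rows=64, max_cols=32):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, dim)
        self.row = nn.Embedding(max_rows, dim)   # row index of the cell a token came from
        self.col = nn.Embedding(max_cols, dim)   # column index of that cell

    def forward(self, token_ids, row_ids, col_ids):
        return self.tok(token_ids) + self.row(row_ids) + self.col(col_ids)

emb = TableEmbedding()
tokens = torch.tensor([[101, 2054, 2003, 1996, 102]])   # made-up token ids
rows   = torch.tensor([[0,   1,    1,    2,    0]])     # row 0 reserved for headers/specials
cols   = torch.tensor([[0,   0,    1,    0,    0]])
print(emb(tokens, rows, cols).shape)                    # torch.Size([1, 5, 256])
```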

Non-linear brain image registration based on moment and free-form deformation (모멘트 및 free-form 변형기반 비선형 뇌영상 정합)

  • 김민정;최유주;김명희
    • Proceedings of the Korea Multimedia Society Conference / 2004.05a / pp.271-274 / 2004
  • Among medical image analysis methods based on image registration, linear multimodality registration for the same patient is widely used. In practice, however, it is often difficult to acquire several kinds of images for a patient, or anatomical image information is lost. This paper proposes a method for registering a patient's functional brain image to an anatomical brain image of a normal subject with a standard shape. First, moment information of the two images is matched and an initial linear transformation is performed; then, non-linear registration using a free-form deformation technique based on 3D Bézier functions minimizes the shape difference between the registered images. The proposed method can be applied not only to the anatomical analysis of patients' functional images but can also be extended to image-guided procedures through pre- and intra-operative image registration.
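
The moment-matching step can be pictured with the small sketch below (an illustration under simplifying assumptions, not the paper's implementation): centroids and principal axes of two binary brain masks are computed from image moments and used to derive an initial linear transform, which a free-form deformation stage would then refine.

```python
import numpy as np

def moments(mask):
    """Centroid and principal axes (covariance eigenvectors) of a binary mask."""
    coords = np.argwhere(mask > 0).astype(float)
    centroid = coords.mean(axis=0)
    _, axes = np.linalg.eigh(np.cov((coords - centroid).T))
    return centroid, axes

def initial_alignment(moving, fixed):
    """Linear transform taking the moving mask's frame onto the fixed one.
    (Eigenvector sign/order would need extra care in a real pipeline.)"""
    c_m, a_m = moments(moving)
    c_f, a_f = moments(fixed)
    rotation = a_f @ a_m.T
    translation = c_f - rotation @ c_m
    return rotation, translation

fixed  = np.zeros((32, 32, 32)); fixed[8:24, 10:22, 12:20] = 1
moving = np.zeros((32, 32, 32)); moving[6:22, 8:20, 10:18] = 1
R, t = initial_alignment(moving, fixed)
print(R.shape, t)   # 3x3 rotation-like matrix and a 3-vector offset
```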

Nonparametric Bayesian Approach for Multichannel based Semantic Segmentation of TV Dramas (멀티채널 기반 드라마 동영상 의미 분절화를 위한 비모수 베이지안 방법)

  • Seok, Ho-Sik;Lee, Ba-Do;Zhang, Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference / 2012.06b / pp.474-476 / 2012
  • This paper introduces a multichannel nonparametric Bayesian method for the semantic segmentation of TV drama videos. Existing approaches attempt segmentation using only a very limited set of features, or analyze the data with methods that are valid only for a single channel such as the image or audio channel, which makes them difficult to apply to stream data with unpredictable changes such as TV dramas. To overcome these limitations, we split a given video into single-modality channels, segment each channel separately, and dynamically combine the per-channel segmentation results to approximate the semantic segmentation of the video. The proposed method was applied to the semantic segmentation of real TV videos, and its performance was verified by comparison with the semantic change intervals identified by human evaluators.
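
A minimal sketch of the combination idea only (the paper's nonparametric Bayesian segmentation of each channel is not reproduced here): per-channel boundary times are merged into consensus boundaries, keeping those that enough channels agree on within a tolerance. The boundary values, tolerance, and support threshold are invented for illustration.

```python
def combine_boundaries(channel_boundaries, tolerance=2.0, min_support=2):
    """Merge per-channel segment boundaries (seconds) into consensus boundaries."""
    candidates = sorted(t for bounds in channel_boundaries for t in bounds)
    merged = []
    for t in candidates:
        support = sum(
            any(abs(t - b) <= tolerance for b in bounds)
            for bounds in channel_boundaries
        )
        if support >= min_support and (not merged or t - merged[-1] > tolerance):
            merged.append(t)
    return merged

image_channel = [12.0, 47.5, 90.1]   # hypothetical per-channel segmentations
audio_channel = [11.3, 60.0, 91.0]
text_channel  = [13.0, 46.8]
print(combine_boundaries([image_channel, audio_channel, text_channel]))
# boundaries near 12 s, 47 s and 90 s survive; the audio-only one at 60 s does not
```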

Multimodal biosignal measurement sensor and analysis system (멀티모달 바이오신호 측정센서 및 분석 시스템)

  • Jeong, Kwanmoon;Moon, Chanki;Nam, Yunyoung;Lee, Jinsook
    • Proceedings of the Korea Information Processing Society Conference / 2015.04a / pp.1049-1050 / 2015
  • Biosignals measured with an e-Health board are transmitted wirelessly to a PC in real time via Bluetooth. The data received on the PC are saved as text, and a C# application displays the five resulting signals as graphs: body temperature, electrocardiogram (ECG), electromyogram (EMG), galvanic skin response (GSR), and respiration.
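
For illustration only, the sketch below logs such a five-signal stream on the PC side, assuming the board sends one comma-separated sample per line over a Bluetooth serial port (port name, baud rate, and line format are assumptions; the paper's own viewer is written in C#).

```python
import serial  # pyserial

SIGNALS = ["temperature", "ecg", "emg", "gsr", "respiration"]

# Assumed Bluetooth serial port and baud rate; adjust to the actual device.
with serial.Serial("COM5", 9600, timeout=1) as port, \
        open("biosignals.txt", "w") as log:
    for _ in range(1000):                                   # capture 1000 samples
        line = port.readline().decode(errors="ignore").strip()
        values = line.split(",")
        if len(values) != len(SIGNALS):
            continue                                        # skip malformed packets
        log.write(line + "\n")                              # raw text log for later plotting
        print(dict(zip(SIGNALS, values)))
```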