Search | Korea Science

Research on Generative AI for Korean Multi-Modal Montage App (한국형 멀티모달 몽타주 앱을 위한 생성형 AI 연구)

Lim, Jeounghyun;Cha, Kyung-Ae;Koh, Jaepil;Hong, Won-Kee
- Journal of Service Research and Studies / v.14 no.1 / pp.13-26 / 2024
Multi-modal generation produces results from several kinds of input, such as text, images, and audio. As AI technology advances rapidly, a growing number of multi-modal systems synthesize different types of data to produce results. In this paper, we present an AI system that uses speech and text recognition to take a description of a person and generate a montage image. Whereas existing montage generation technology is based on the appearance of Westerners, the system developed in this paper trains a model on Korean facial features. It can therefore create more accurate and effective montage images of Koreans from multi-modal voice and text input in Korean. Because the developed app can produce a draft montage, it can dramatically reduce the manual labor of montage production personnel. For this purpose, we used the persona-based virtual person montage data provided by AI-Hub of the National Information Society Agency. AI-Hub is an integrated AI platform that aims to provide a one-stop service by building the training data needed to develop AI technology and services. The image generation system was implemented using VQGAN, a deep learning model for generating high-resolution images, and KoDALLE, a Korean-language image generation model. The trained model was confirmed to create a montage image of a face very similar to the one described by voice and text. To verify the practicality of the app, 10 testers used it, and more than 70% responded that they were satisfied. The montage generator can be used in various fields, such as criminal investigation, to describe facial features and render them as images.
https://doi.org/10.18807/jsrs.2024.14.1.013

Eye Tracking Using Neural Network and Mean-shift (신경망과 Mean-shift를 이용한 눈 추적)

Kang, Sin-Kuk;Kim, Kyung-Tai;Shin, Yun-Hee;Kim, Na-Yeon;Kim, Eun-Yi
- Journal of the Institute of Electronics Engineers of Korea CI / v.44 no.1 / pp.56-63 / 2007
This paper presents an eye tracking method that uses a neural network (NN) and the mean-shift algorithm to accurately detect and track a user's eyes against a cluttered background. To deal with rigid head motion, the facial region is first obtained using a skin-color model and connected-component analysis. The eye regions are then localized using an NN-based texture classifier that divides the facial region into an eye class and a non-eye class, which enables the method to detect users' eyes accurately even when they wear glasses. Once the eye regions are localized, they are tracked continuously by the mean-shift algorithm. To assess the validity of the proposed method, it was applied to an interface system driven by eye movement and tested with a group of 25 users playing 'aligns games.' The results show that the system processes more than 30 frames/sec on a PC for $320 \times 240$ input images and provides user-friendly and convenient access to a computer in real time.
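The tracking step described above can be illustrated with a minimal mean-shift sketch. It is not the authors' implementation: the function name `mean_shift`, the parameters `half_size`, `max_iter`, and `eps`, and the use of a per-pixel weight map are all assumptions made for illustration. The weight map stands in for whatever eye-likelihood score the texture classifier yields, which the abstract does not specify.

```python
def mean_shift(weights, center, half_size, max_iter=20, eps=0.5):
    """Move a square window toward the weighted centroid of `weights`.

    weights   : 2D list of non-negative scores (hypothetical eye-likelihood map)
    center    : (row, col) starting position of the window
    half_size : half the window side length, in pixels
    Returns the converged (row, col) center as floats.
    """
    rows, cols = len(weights), len(weights[0])
    cy, cx = float(center[0]), float(center[1])
    for _ in range(max_iter):
        ry, rx = int(round(cy)), int(round(cx))
        total = sum_y = sum_x = 0.0
        # Accumulate the weighted centroid inside the current window,
        # clipping the window to the image borders.
        for y in range(max(0, ry - half_size), min(rows, ry + half_size + 1)):
            for x in range(max(0, rx - half_size), min(cols, rx + half_size + 1)):
                w = weights[y][x]
                total += w
                sum_y += w * y
                sum_x += w * x
        if total == 0:
            break  # no evidence inside the window; keep the last position
        ny, nx = sum_y / total, sum_x / total
        shift = ((ny - cy) ** 2 + (nx - cx) ** 2) ** 0.5
        cy, cx = ny, nx
        if shift < eps:
            break  # converged: the window barely moved
    return cy, cx
```

In a tracker of this kind, the center found in one frame would typically seed the search in the next frame, so only a small window has to be scanned per frame.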

ERF Components Patterns of Causal Question Generation during Observation of Biological Phenomena : A MEG Study (생명현상 관찰에서 나타나는 인과적 의문 생성의 ERF 특성 : MEG 연구)

Kwon, Suk-Won;Kwon, Yong-Ju
- Journal of Science Education / v.33 no.2 / pp.336-345 / 2009
The purpose of this study is to analyze the ERF component patterns of causal questions generated during the observation of biological phenomena. First, a system that presents pictures evoking causal questions about biological phenomena (the evoked picture system) was developed on the basis of cognitive psychology. The ERF patterns of causal question generation, following the time course of brain processing, were then observed using MEG. The evoked picture system was developed with an R&D method by a group of science education experts and researchers. Tasks were classified into animal (A), microbe (M), and plant (P) tasks according to biological species, and into interaction (I), all (A), and part (P) tasks based on the interaction between different species. The MEG task paradigm was developed in collaboration with the MEG team at Seoul National University Hospital. MEG data on the generation of scientific questions were collected from 5 female graduate students. To examine the unique characteristics of causal questions, MEG ERF components were analyzed. As a result, a total of 100 pictures were produced for the evoked picture system, and 4 ERF components were found: M1 (100~130 ms), M2 (220~280 ms), M3 (320~390 ms), and M4 (460~520 ms). The present study could guide personalized teaching-learning methods through the development and application of scientific question learning programs.
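The four latency windows reported above can be written down as a small lookup, shown here as a sketch. The names `ERF_WINDOWS_MS` and `component_for_latency` are hypothetical, and treating the window boundaries as inclusive is an assumption; the abstract gives only the ranges themselves.

```python
# Latency windows (ms) for the four ERF components, as listed in the abstract.
ERF_WINDOWS_MS = {
    "M1": (100, 130),
    "M2": (220, 280),
    "M3": (320, 390),
    "M4": (460, 520),
}


def component_for_latency(latency_ms):
    """Return the ERF component whose window contains `latency_ms`, else None.

    Boundaries are treated as inclusive (an assumption, not stated in the source).
    """
    for name, (start, end) in ERF_WINDOWS_MS.items():
        if start <= latency_ms <= end:
            return name
    return None
```

A peak at 250 ms after stimulus onset would fall in the M2 window under this mapping, while a peak at 200 ms would fall in none of the reported windows.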
