• Title/Summary/Keyword: visual model

Lip and Voice Synchronization Using Visual Attention (시각적 어텐션을 활용한 입술과 목소리의 동기화 연구)

  • Dongryun Yoon;Hyeonjoong Cho
    • The Transactions of the Korea Information Processing Society / v.13 no.4 / pp.166-173 / 2024
  • This study explores lip-sync detection, that is, determining whether lip movements and voices in a video are synchronized. Typically, lip-sync detection techniques crop the facial area of a given video and use the lower half of the cropped box as input to a visual encoder that extracts visual features. To place more emphasis on the articulatory region of the lips for more accurate lip-sync detection, we propose using a pre-trained visual attention-based encoder. The Visual Transformer Pooling (VTP) module, originally designed for the lip-reading task of predicting the script from visual information alone, without audio, is employed as the visual encoder. Our experimental results demonstrate that, despite having fewer learnable parameters, the proposed method outperforms the latest model, VocaList, on the LRS2 dataset, achieving a lip-sync detection accuracy of 94.5% with five context frames. Moreover, our approach outperforms VocaList by approximately 8% in lip-sync detection accuracy even on Acappella, a dataset not used for training.

Development of Visual Tools for Strut-Tie Model (스트럿 타이 모델개발을 위한 시각화 도구 개발)

  • Kim, Nam-Hee;Hong, Sung-Gul;Yeo, Deok-Hyun
    • Proceedings of the Computational Structural Engineering Institute Conference / 2008.04a / pp.596-601 / 2008
  • This paper presents the development of visual design tools for constructing strut-and-tie models (STMs). STMs show internal force flows for dimensioning and proportioning the D-regions of reinforced concrete structures. To select an appropriate strut-and-tie model, interactive graphic tools are needed that help designers compare alternatives by changing the geometry of an initial STM. This study proposes using force polygons that represent the equilibrium state of the STM. As the STM geometry changes, the force polygon dynamically shows the resulting change in force magnitudes. Once the geometry of the STM is determined, the detailing design process follows as the next step.
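The force-polygon idea in this abstract can be illustrated with a minimal sketch. It assumes a single planar node where a known load meets two members; the function names `member_forces` and `force_polygon` and the geometry in the usage note are hypothetical and are not taken from the paper. Node equilibrium fixes both member forces, and the three force vectors then close into a polygon (their sum is zero).

```python
import numpy as np

def _unit(v):
    """Return v scaled to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def member_forces(load, dir_a, dir_b):
    """Solve f_a*u_a + f_b*u_b + load = 0 for the two axial member forces.

    load         : (2,) external force vector acting on the node
    dir_a, dir_b : (2,) member directions pointing away from the node
    Sign convention (an assumption of this sketch):
    positive = tension (tie), negative = compression (strut).
    """
    A = np.column_stack([_unit(dir_a), _unit(dir_b)])
    f_a, f_b = np.linalg.solve(A, -np.asarray(load, dtype=float))
    return float(f_a), float(f_b)

def force_polygon(load, dir_a, dir_b):
    """Return the three force vectors acting on the node, one per row.

    In equilibrium the rows sum to zero, so drawing them head to tail
    gives a closed polygon.
    """
    f_a, f_b = member_forces(load, dir_a, dir_b)
    return np.vstack([np.asarray(load, dtype=float),
                      f_a * _unit(dir_a),
                      f_b * _unit(dir_b)])
```

For example, calling `member_forces([0, -100], [-1, 0], [-1, -1])` and then `member_forces([0, -100], [-1, 0], [-1, -2])` shows how steepening the inclined member changes both force magnitudes while the polygon stays closed.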

A Study on HMM-Based Segmentation Method for Traffic Monitoring (HMM 분할에 기반한 교통모니터링에 관한 연구)

  • Hwang, Suen-Ki;Kang, Yong-Seok;Kim, Tae-Woo;Kim, Hyun-Yul;Park, Young-Cheol;Bae, Cheol-Soo
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.5 no.1 / pp.1-6 / 2012
  • In this paper, we propose an HMM (Hidden Markov Model)-based segmentation method that models shadows as well as foreground and background regions. The shadows of moving objects often hinder visual tracking. The proposed HMM-based segmentation method classifies each object in real time. Experiments on traffic monitoring videos demonstrate the effectiveness of the proposed method.
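As a rough illustration of HMM-based labeling into background, shadow, and foreground, the sketch below runs Viterbi decoding over a one-dimensional observation sequence with Gaussian emissions. Everything specific here is an assumption for illustration: the abstract does not state the observation features, the emission model, or the parameters, and the function name `viterbi` is hypothetical.

```python
import numpy as np

STATES = ("background", "shadow", "foreground")

def viterbi(obs, log_start, log_trans, means, stds):
    """Most likely state path for a 1-D sequence under a Gaussian-emission HMM.

    obs       : (T,) observations, e.g. pixel brightness relative to a
                background model (an assumed feature, not from the paper)
    log_start : (S,) log initial state probabilities
    log_trans : (S, S) log transition matrix, row = from, column = to
    means/stds: (S,) Gaussian emission parameters per state
    """
    obs = np.asarray(obs, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # Log Gaussian density of every observation under every state: (T, S).
    log_emit = (-0.5 * ((obs[:, None] - means) / stds) ** 2
                - np.log(stds * np.sqrt(2.0 * np.pi)))
    T, S = log_emit.shape
    score = log_start + log_emit[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans  # cand[i, j]: leave i, enter j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    # Trace the best path backwards from the best final state.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [STATES[s] for s in reversed(path)]
```

A "sticky" transition matrix (large diagonal entries) discourages label flicker between neighboring samples, which is one reason a sequence model can separate shadow from foreground more stably than per-sample thresholding.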

A Study on the Development of the Air Pollution-Health Risk Model : The case of Seoul, Korea. (都市大氣汚染이 市民健康에 미치는 危險性 評價 模型의 開發에 관한 硏究)

  • 김귀곤;김명진;성현찬
    • Journal of Korean Society for Atmospheric Environment / v.5 no.2 / pp.30-35 / 1989
  • To effectively develop and evaluate air pollution control measures, health risk rates due to air pollution must be identified. This article describes the application of a visual analysis and an air pollution-health risk model for determining the impacts of carbon monoxide (CO) exposure on angina pectoris patients in a metropolitan area. The procedures used for analyzing the relationship between CO exposure and the related increase in angina attacks for stable angina pectoris patients are described through a case study in the city of Seoul, Korea. The findings show that the air pollution-health risk model and visual analysis can be effective tools for environmental decision-makers, allowing air pollution control scenarios to be developed and evaluated for environmental protection. One of the features of this study is to provide a methodology for translating clinical findings into estimates of the relative contributions of air pollution to all causes of a particular disease. Therefore, there must be appropriate recognition of the uncertainties involved in the study.

Enhanced Representation for Object Tracking (물체 추적을 위한 강화된 부분공간 표현)

  • Yun, Frank;Yoo, Haan-Ju;Choi, Jin-Young
    • Proceedings of the IEEK Conference / 2009.05a / pp.408-410 / 2009
  • We present an efficient and robust measurement model for visual tracking. This approach builds on and extends work on subspace representations of the measurement model. Subspace-based tracking algorithms have been part of the visual tracking literature for a decade and show considerable tracking performance due to their robustness in matching. However, the measures used in their measurement models are often restricted to a few approaches. We propose a novel measure of object matching, Angle In Feature Space, which aims to improve the discriminability of matching in the subspace. As a result, our tracking algorithm can distinguish the target from similar background clutter, which often causes erroneous drift with the conventional Distance From Feature Space measure. Experiments demonstrate the effectiveness of the proposed tracking algorithm under severely cluttered backgrounds.
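The contrast between the two measures can be sketched as follows. The abstract does not define Angle In Feature Space precisely, so this sketch assumes it is the angle between a mean-centered sample and its projection onto the learned subspace; the function name `subspace_measures` is hypothetical. Distance From Feature Space is taken in its usual sense, the norm of the residual after projection.

```python
import numpy as np

def subspace_measures(x, mean, basis):
    """Return (distance_from_feature_space, angle_in_radians) for sample x.

    x, mean : (d,) sample and subspace mean
    basis   : (d, k) matrix with orthonormal columns spanning the subspace
    The angle definition is an assumption of this sketch, not the paper's.
    """
    centered = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    projection = basis @ (basis.T @ centered)   # component inside the subspace
    residual = centered - projection            # component outside the subspace
    dffs = float(np.linalg.norm(residual))
    # arctan2 is numerically steadier than arccos when the angle is near zero.
    angle = float(np.arctan2(dffs, np.linalg.norm(projection)))
    return dffs, angle
```

Under this assumed definition, scaling a sample (for instance through a contrast change) scales its distance from the subspace but leaves the angle unchanged, which is one plausible reason an angle-based measure could discriminate differently from a distance-based one.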

An Investigation of the Learning Styles of South Korean Business Students

  • Naik, Bijayananda;Girish, V.G.
    • Asia-Pacific Journal of Business / v.3 no.1 / pp.1-9 / 2012
  • The Index of Learning Styles (ILS) instrument, based on the Felder-Silverman Learning Style Model, was used to determine the distribution of learning styles of 125 South Korean business students enrolled in a South Korean institution of higher education. Results show that a greater proportion of the South Korean business students surveyed in this study prefer sensing over intuitive, visual over verbal, reflective over active, and global over sequential learning styles. The majority of business students have a balanced learning style in all four dimensions of the Felder-Silverman model. Among the students who do not have a balanced learning style, those with sensing, visual, reflective, and global learning styles dominate. Gender difference in learning style preference was not statistically significant for any of the four dimensions.

An Experimental Study on the Optimal Number of Cameras used for Vision Control System (비젼 제어시스템에 사용된 카메라의 최적개수에 대한 실험적 연구)

  • 장완식;김경석;김기영;안힘찬
    • Transactions of the Korean Society of Machine Tool Engineers / v.13 no.2 / pp.94-103 / 2004
  • The vision system model used in this study involves six parameters that provide a kind of adaptability, in that the relationship between the camera-space location of manipulable visual cues and the vector of robot joint coordinates is estimated in real time. This vision control method also requires a number of cameras to transform the 3-D physical space into the 2-D camera plane, and it can be used irrespective of camera location as long as the visual cues are displayed in the same camera plane. This study therefore investigates the optimal number of cameras for the developed vision control system by varying the number of cameras. The study proceeds in two parts: (a) the effectiveness of the vision system model and (b) the optimal number of cameras. The results provide evidence of the adaptability of the developed vision control method when the optimal number of cameras is used.

An Efficient Virtual Teeth Modeling for Dental Training System

  • Kim, Lae-Hyun;Park, Se-Hyung
    • International Journal of CAD/CAM / v.8 no.1 / pp.41-44 / 2009
  • This paper describes an implementation of virtual teeth modeling for a haptic dental simulation. The system allows dental students to practice dental procedures with realistic tactile sensations. It requires fast and stable haptic rendering and volume modeling techniques that operate on the virtual tooth. In our implementation, a volumetric implicit surface is used for haptic rendering and for intuitive shape modification without topological constraints. The volumetric implicit surface is generated from an input geometric model by using a closest point transformation algorithm. For visual rendering, we apply an adaptive polygonization method to convert the volumetric teeth model to a geometric model. We improve our previous system with a new octree design that reduces memory requirements while increasing performance and visual quality.

Incipient Cavitation in a Bulb Turbine: Model Test and CFD Calculation

  • Necker, Jorg;Aschenbrenner, Thomas
    • International Journal of Fluid Machinery and Systems / v.4 no.1 / pp.140-149 / 2011
  • For a certain operating point of a horizontal shaft bulb turbine (i.e. volume flow, net head, blade angle, guide vane angle), the efficiency at different pressure levels (i.e. different Thoma coefficients $\sigma$) is calculated using a commercial Computational Fluid Dynamics (CFD) code that includes two-phase flow and a cavitation model. The results are compared with experimental results obtained on a closed-loop test rig for model turbines. The experimentally and numerically obtained efficiencies and the visual impression of the cavitation show good agreement. In particular, the drop in efficiency is calculated with satisfactory accuracy. This drop in efficiency, in combination with the visual impression, is of high practical importance since it helps determine the admissible cavitation in a bulb turbine. It is seen that incipient cavitation in Kaplan-type turbines is of no major importance in determining this admissible amount of cavitation.
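For orientation, the Thoma coefficient mentioned in this abstract is commonly defined as the ratio of net positive suction head to net head. The form below is the usual textbook definition, given here as an assumption; the abstract does not state which reference level or reference section the authors use.

$$
\sigma = \frac{\mathrm{NPSH}}{H},
\qquad
\mathrm{NPSH} = \frac{p_{\mathrm{abs}} - p_{v}}{\rho g} + \frac{c^{2}}{2g}
$$

Here $H$ is the net head, $p_{\mathrm{abs}}$ and $c$ are the absolute static pressure and mean velocity at the low-pressure reference section (taken relative to the turbine reference level), $p_{v}$ is the vapor pressure of water, $\rho$ is the water density, and $g$ is the gravitational acceleration. Lowering the pressure level at a fixed operating point lowers $\sigma$ and makes cavitation more likely.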

Quantitative Image Quality Assessment for Block-based DCT Image Coder using Human Visual Characteristics (인간시각특성을 이용한 블록기반 DCT 영상 부호화기의 정량적 화질 평가)

  • Chung, Tae-Yun
    • Journal of the Korean Institute of Intelligent Systems / v.12 no.5 / pp.424-431 / 2002
  • This paper proposes a new quantitative image assessment model, which is essential for verifying the performance of block-based DCT coding. Using an HVS-based visual model, the proposed model considers not only global distortions, accounting for frequency sensitivity and channel masking, but also several local distortions caused by block-based coding.