• Title/Summary/Keyword: Visual Recognition

Search results: 826

Constructing a Noise-Robust Speech Recognition System using Acoustic and Visual Information (청각 및 시각 정보를 이용한 강인한 음성 인식 시스템의 구현)

  • Lee, Jong-Seok;Park, Cheol-Hoon
    • Journal of Institute of Control, Robotics and Systems, v.13 no.8, pp.719-725, 2007
  • In this paper, we present an audio-visual speech recognition system for noise-robust human-computer interaction. Unlike usual speech recognition systems, our system utilizes the visual signal containing the speaker's lip movements along with the acoustic signal to obtain speech recognition performance that is robust against environmental noise. The procedures of acoustic speech processing, visual speech processing, and audio-visual integration are described in detail. Experimental results demonstrate that, by exploiting the complementary nature of the two signals, the constructed system significantly enhances recognition performance in noisy conditions compared to acoustic-only recognition.

Utilization of Visual Context for Robust Object Recognition in Intelligent Mobile Robots (지능형 이동 로봇에서 강인 물체 인식을 위한 영상 문맥 정보 활용 기법)

  • Kim, Sung-Ho;Kim, Jun-Sik;Kweon, In-So
    • The Journal of Korea Robotics Society, v.1 no.1, pp.36-45, 2006
  • In this paper, we introduce visual contexts, in terms of their types and utilization methods, for robust object recognition with intelligent mobile robots. One of the core technologies for intelligent robots is visual object recognition. Robust techniques are strongly required since there are many sources of visual variation, such as geometric and photometric changes and noise. To meet these requirements, we define spatial context, hierarchical context, and temporal context. Appropriate visual contexts can be selected according to the object recognition domain. We also propose a unified framework which can utilize all of these contexts, and we validate it in a real working environment. Finally, we discuss future research directions for object recognition technologies for intelligent robots.


Semantic Visual Place Recognition in Dynamic Urban Environment (동적 도시 환경에서 의미론적 시각적 장소 인식)

  • Arshad, Saba;Kim, Gon-Woo
    • The Journal of Korea Robotics Society, v.17 no.3, pp.334-338, 2022
  • In visual simultaneous localization and mapping (vSLAM), the correct recognition of a place benefits relocalization and improves map accuracy. However, its performance is significantly affected by environmental conditions such as variations in light, viewpoint, and season, and the presence of dynamic objects. This research addresses the problem of feature occlusion caused by interference from dynamic objects, which leads to poor performance of visual place recognition algorithms. To overcome this problem, this research analyzes the role of scene semantics in the correct detection of a place in challenging environments and presents a semantics-aided visual place recognition method. Semantics, being invariant to viewpoint changes and dynamic environments, can improve the overall performance of the place matching method. The proposed method is evaluated on two benchmark datasets with dynamic environments and seasonal changes. Experimental results show the improved performance of the visual place recognition method for vSLAM.
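The core idea of this entry can be sketched in a few lines. The following toy example (hypothetical data and class names, not the paper's implementation) shows the semantics-aided step: features that fall on dynamic-object classes are discarded before place matching, so only viewpoint-stable scene elements participate.

```python
# Hypothetical set of dynamic semantic classes; in practice these labels
# would come from a semantic segmentation of each frame.
DYNAMIC_CLASSES = {"car", "pedestrian", "bicycle"}

def filter_static_features(features):
    """Keep only features whose semantic label is not a dynamic class.

    `features` is a list of (descriptor, semantic_label) pairs.
    """
    return [(desc, label) for desc, label in features
            if label not in DYNAMIC_CLASSES]

frame = [((0.1, 0.9), "building"), ((0.4, 0.2), "car"),
         ((0.8, 0.3), "road"), ((0.5, 0.5), "pedestrian")]
static = filter_static_features(frame)
print([label for _, label in static])  # → ['building', 'road']
```

The surviving static-class features would then feed the usual place-matching pipeline unchanged.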

Visual Location Recognition Using Time-Series Streetview Database (시계열 스트리트뷰 데이터베이스를 이용한 시각적 위치 인식 알고리즘)

  • Park, Chun-Su;Choeh, Joon-Yeon
    • Journal of the Semiconductor & Display Technology, v.18 no.4, pp.57-61, 2019
  • Nowadays, portable digital cameras such as smartphone cameras are popularly used for entertainment and visual information recording. Given a database of geo-tagged images, a visual location recognition system can determine the place depicted in a query photo. One of the most common visual location recognition approaches is the bag-of-words method, where local image features are clustered into visual words. In this paper, we propose a new bag-of-words-based visual location recognition algorithm using a time-series streetview database. The proposed algorithm selects only a small subset of image features to be used in the image retrieval process. By reducing the number of features used, the proposed algorithm reduces the memory requirement of the image database and accelerates the retrieval process.
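The bag-of-words retrieval step mentioned above can be illustrated with a minimal sketch (all vocabularies, descriptors, and place names below are hypothetical, and real systems use large vocabularies with tf-idf weighting): each local descriptor is quantized to its nearest visual word, each image becomes a word histogram, and the query is matched to the database by cosine similarity.

```python
import math

def quantize(descriptor, vocabulary):
    """Index of the nearest visual word (squared Euclidean distance)."""
    return min(range(len(vocabulary)),
               key=lambda i: sum((d - v) ** 2
                                 for d, v in zip(descriptor, vocabulary[i])))

def histogram(descriptors, vocabulary):
    """Bag-of-words histogram: count of descriptors per visual word."""
    h = [0] * len(vocabulary)
    for d in descriptors:
        h[quantize(d, vocabulary)] += 1
    return h

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

vocabulary = [(0.0, 0.0), (1.0, 1.0)]             # two toy visual words
database = {"place_A": [(0.1, 0.0), (0.0, 0.2)],  # geo-tagged images
            "place_B": [(0.9, 1.1), (1.0, 0.8)]}
query = [(0.05, 0.1)]
q_hist = histogram(query, vocabulary)
best = max(database,
           key=lambda name: cosine(q_hist, histogram(database[name], vocabulary)))
print(best)  # → place_A
```

The paper's contribution of selecting a small feature subset would shrink the descriptor lists before the `histogram` step, cutting both memory and matching time.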

A Bio-Inspired Modeling of Visual Information Processing for Action Recognition (생체 기반 시각정보처리 동작인식 모델링)

  • Kim, JinOk
    • KIPS Transactions on Software and Data Engineering, v.3 no.8, pp.299-308, 2014
  • Recent literature in computing and information processing has presented research inspired by the remarkable human ability to recognize and categorize very complex visual patterns such as body motions and facial expressions. Building on this outstanding perceptual ability, the classification of visual sequences without context information is an especially crucial task for computer vision, for understanding both the coding and the retrieval of spatio-temporal patterns. This paper presents a biologically based action recognition model for computer vision, inspired by the visual information processing the human brain performs for action recognition in visual sequences. The proposed model employs the structure of neural fields of bio-inspired visual perception for detecting motion sequences and discriminating visual patterns as in the human brain. Experimental results show that the proposed recognition model not only takes into account several biological properties of visual information processing, but is also tolerant of time warping. Furthermore, the model allows robust temporal evolution of classification compared to other action recognition research. The presented model contributes to implementing bio-inspired visual processing systems such as intelligent robot agents.

A Study on Visual Contextual Awareness in Ubiquitous Computing (유비쿼터스 환경에서의 시각문맥정보인식에 대한 연구)

  • Han, Dong-Ju;Kim, Jong-Bok;Lee, Sang-Hoon;Suh, Il-Hong
    • Proceedings of the KIEE Conference, 2004.11c, pp.19-21, 2004
  • In many cases, human visual recognition depends on contextual information. Effective feature information is needed to perform place recognition that is robust to illumination, noise, etc. Existing approaches that use edges, color, and similar cues do not cope effectively with real environments. To solve this problem, we use natural markers to improve the efficiency of place recognition.


Robust Feature Extraction Based on Image-based Approach for Visual Speech Recognition (시각 음성인식을 위한 영상 기반 접근방법에 기반한 강인한 시각 특징 파라미터의 추출 방법)

  • Gyu, Song-Min;Pham, Thanh Trung;Min, So-Hee;Kim, Jing-Young;Na, Seung-You;Hwang, Sung-Taek
    • Journal of the Korean Institute of Intelligent Systems, v.20 no.3, pp.348-355, 2010
  • In spite of developments in speech recognition technology, speech recognition under noisy environments is still a difficult task. To solve this problem, researchers have proposed methods that use visual information in addition to audio information for speech recognition. However, visual information suffers from visual noise just as audio information suffers from acoustic noise, and this visual noise causes degradation in visual speech recognition. Therefore, how to extract visual feature parameters that enhance visual speech recognition performance is a field of interest. In this paper, we propose a method for visual feature parameter extraction based on an image-based approach for enhancing the recognition performance of an HMM-based visual speech recognizer. For the experiments, we constructed an audio-visual database consisting of 105 speakers, each of whom uttered 62 words. We applied histogram matching, lip folding, RASTA filtering, a linear mask, DCT, and PCA. The experimental results show that the recognition performance of our proposed method improved by about 21% over the baseline method.
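Of the transforms listed above, the DCT step is the easiest to sketch. The following toy example (hypothetical pixel values; the paper's exact pipeline and scaling conventions are not reproduced) applies a plain 1-D DCT-II to a row of lip-region intensities and keeps only the low-frequency coefficients as a compact visual feature, the kind of reduced representation that PCA would then compress further.

```python
import math

def dct2(signal):
    """Plain (unnormalized) DCT-II of a 1-D signal."""
    n = len(signal)
    return [sum(signal[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i in range(n))
            for k in range(n)]

# Hypothetical row of lip-region pixel intensities (bright lips on skin).
row = [120, 122, 180, 185, 180, 122, 120, 119]
coeffs = dct2(row)
feature = coeffs[:4]   # keep low-frequency terms only
print(len(feature))    # → 4
```

Truncating to the first few coefficients is what makes the representation robust to small pixel-level perturbations: high-frequency visual noise lands mostly in the discarded terms.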

Visual Touch Recognition for NUI Using Voronoi-Tessellation Algorithm (보로노이-테셀레이션 알고리즘을 이용한 NUI를 위한 비주얼 터치 인식)

  • Kim, Sung Kwan;Joo, Young Hoon
    • The Transactions of The Korean Institute of Electrical Engineers, v.64 no.3, pp.465-472, 2015
  • This paper presents visual touch recognition for an NUI (Natural User Interface) using the Voronoi-tessellation algorithm. The proposed method consists of three parts: hand region extraction, hand feature point extraction, and visual touch recognition. To improve the robustness of hand region extraction, we propose using the RGB/HSI color model, the Canny edge detection algorithm, and spatial frequency information. In addition, to improve the accuracy of hand feature point extraction, we propose using the Douglas-Peucker algorithm. Also, to recognize the visual touch, we propose using the Voronoi-tessellation algorithm. Finally, we demonstrate the feasibility and applicability of the proposed algorithms through experiments.
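The Voronoi-tessellation step can be reduced to nearest-seed assignment. This minimal sketch (hypothetical coordinates, not the paper's implementation) labels a touch point by the closest hand feature point, i.e. by the Voronoi cell it falls in:

```python
def voronoi_label(point, seeds):
    """Index of the seed whose Voronoi cell contains `point`.

    A point lies in the Voronoi cell of its nearest seed, so explicit
    cell construction is unnecessary for point-location queries.
    """
    px, py = point
    return min(range(len(seeds)),
               key=lambda i: (seeds[i][0] - px) ** 2 + (seeds[i][1] - py) ** 2)

fingertips = [(10, 10), (50, 10), (30, 40)]   # assumed hand feature points
touch = (45, 12)
print(voronoi_label(touch, fingertips))  # → 1  (closest to (50, 10))
```

Real systems build the tessellation explicitly when cell boundaries themselves are needed, but for deciding which feature point a touch belongs to, the nearest-seed query is equivalent.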

A Novel Integration Scheme for Audio Visual Speech Recognition

  • Pham, Than Trung;Kim, Jin-Young;Na, Seung-You
    • The Journal of the Acoustical Society of Korea, v.28 no.8, pp.832-842, 2009
  • Automatic speech recognition (ASR) has been successfully applied to many real human-computer interaction (HCI) applications; however, its performance tends to decrease significantly under noisy environments. Audio-visual speech recognition (AVSR), which uses an acoustic signal together with lip motion, has recently attracted more attention due to its noise robustness. In this paper, we describe our novel integration scheme for AVSR based on a late integration approach. First, we introduce a robust reliability measurement for the audio and visual modalities using model-based information and signal-based information. The model-based source measures the confusability of the vocabulary, while the signal-based source is used to estimate the noise level. Second, the output probabilities of the audio and visual speech recognizers are each normalized before the final integration step, which uses the normalized output space and the estimated weights. We evaluate the performance of our proposed method via a Korean isolated word recognition system. The experimental results demonstrate the effectiveness and feasibility of our proposed system compared to conventional systems.
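The late integration step described above can be sketched as a weighted combination of normalized recognizer outputs. In this toy version (assumed scores and a fixed weight; the paper estimates the weight from noise level and vocabulary confusability), per-word log-likelihoods from each modality are softmax-normalized and then fused in the log domain:

```python
import math

def normalize(log_scores):
    """Softmax-normalize log-likelihoods into posteriors over the vocabulary."""
    m = max(log_scores.values())
    exp = {w: math.exp(s - m) for w, s in log_scores.items()}
    z = sum(exp.values())
    return {w: e / z for w, e in exp.items()}

def fuse(audio_scores, visual_scores, lam):
    """Weighted product rule in the log domain: lam*audio + (1-lam)*visual."""
    pa, pv = normalize(audio_scores), normalize(visual_scores)
    return max(pa, key=lambda w: lam * math.log(pa[w])
                                 + (1 - lam) * math.log(pv[w]))

audio = {"yes": -10.0, "no": -9.5}    # noisy audio slightly prefers "no"
visual = {"yes": -3.0, "no": -8.0}    # lip motion clearly says "yes"
print(fuse(audio, visual, lam=0.3))   # → yes
```

With a low audio weight (here 0.3, as a reliability estimator might choose under heavy noise), the confident visual stream overrides the marginal audio preference, which is exactly the complementary behavior late integration is designed to exploit.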

A Study on the Artificial Recognition System on Visual Environment of Architecture (건축의 시각적 환경에 대한 지능형 인지 시스템에 관한 연구)

  • Seo, Dong-Yeon;Lee, Hyun-Soo
    • KIEAE Journal, v.3 no.2, pp.25-32, 2003
  • This study investigates the recognition structure for the architectural environment and its reconstruction by artificial intelligence. To test the possibility of such a reconstruction, the recognition structure for the architectural environment is analyzed and each step of the structure is matched with a computational method. Edge detection and a neural network were selected as the methods matching the steps of the recognition process. The visual perception system established with these methods is trained and tested, and its results are compared with those of human experiments. Assuming that the artificial system resembles the human process of recognizing the architectural environment, does the system give responses similar to a human's? The results show that it is possible to establish an artificial visual perception system that gives responses similar to a human's when it is modeled after the human recognition structure and process.
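The edge-detection stage of a pipeline like the one above is commonly realized with Sobel gradients. This toy sketch (hypothetical 2-D intensity grid; the study's actual operator is not specified in the abstract) computes an edge-magnitude response at interior pixels, the kind of low-level map that a neural-network stage could consume:

```python
# 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [(-1, 0, 1), (-2, 0, 2), (-1, 0, 1)]
SOBEL_Y = [(-1, -2, -1), (0, 0, 0), (1, 2, 1)]

def edge_magnitude(img, y, x):
    """Gradient magnitude at an interior pixel via 3x3 Sobel convolution."""
    gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
             for j in range(3) for i in range(3))
    gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
             for j in range(3) for i in range(3))
    return (gx * gx + gy * gy) ** 0.5

image = [[0, 0, 0, 255, 255],
         [0, 0, 0, 255, 255],
         [0, 0, 0, 255, 255],
         [0, 0, 0, 255, 255]]   # vertical edge between columns 2 and 3
print(edge_magnitude(image, 1, 2) > edge_magnitude(image, 1, 1))  # → True
```

The response is large only where intensity changes, so a flat region (column 1) scores zero while the boundary column scores strongly.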