• Title/Summary/Keyword: Scene Recognition

193 search results

Text Extraction in HIS Color Space by Weighting Scheme

  • Le, Thi Khue Van;Lee, Gueesang
    • Smart Media Journal
    • /
    • v.2 no.1
    • /
    • pp.31-36
    • /
    • 2013
  • Robust and efficient text extraction is very important for the accuracy of Optical Character Recognition (OCR) systems. Natural scene images with degradations such as uneven illumination, perspective distortion, complex backgrounds and multi-colored text pose many challenges to computer vision tasks, especially text extraction. In this paper, we propose a method for extracting the text in signboard images based on a combination of the mean shift algorithm and a weighting scheme of hue and saturation in the HSI color space for clustering. The number of clusters is determined automatically by mean shift-based density estimation, in which local clusters are estimated by repeatedly searching for higher-density points in the feature vector space. The weighting scheme of hue and saturation is used to formulate a new distance measure in cylindrical coordinates for text extraction. Experimental results on various natural scene images are presented to demonstrate the effectiveness of our approach.

  • PDF
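The weighted cylindrical distance the abstract describes can be sketched minimally as follows; the weight values and the exact fusion of the hue, saturation and intensity terms are illustrative assumptions, not the paper's published formulation:

```python
import numpy as np

def weighted_hsi_distance(p, q, w_c=0.7, w_i=0.3):
    """Distance between two HSI pixels p = (h, s, i) and q = (h, s, i)
    in cylindrical coordinates, with hue/saturation (chroma) weighted
    against intensity. Hue is in radians; weights are assumed values."""
    h1, s1, i1 = p
    h2, s2, i2 = q
    # Angular hue difference with wraparound at 2*pi
    dh = min(abs(h1 - h2), 2 * np.pi - abs(h1 - h2))
    # Chord length between the two points in the hue/saturation plane
    d_chroma = np.sqrt(s1 ** 2 + s2 ** 2 - 2 * s1 * s2 * np.cos(dh))
    d_int = abs(i1 - i2)
    return np.sqrt(w_c * d_chroma ** 2 + w_i * d_int ** 2)
```

The wraparound term matters because hue is an angle: two hues on either side of 0 are perceptually close even though their raw difference is near 2π.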

Brain Dynamics and Interactions for Object Detection and Basic-level Categorization (물체 탐지와 범주화에서의 뇌의 동적 움직임 추적)

  • Kim, Ji-Hyun;Kwon, Hyuk-Chan;Lee, Yong-Ho
    • Proceedings of the Korean Society for Emotion and Sensibility Conference
    • /
    • 2009.05a
    • /
    • pp.219-222
    • /
    • 2009
  • Rapid object recognition is one of the mainstream research themes aiming to reveal how humans recognize objects and interact with the environment in the natural world. This field of study is of consequence because, from an evolutionary perspective, it is highly important to quickly perceive external objects and judge their characteristics in order to plan reactions. In this study, we investigated how humans detect natural scene objects and categorize them in a limited time frame. We applied magnetoencephalography (MEG) while participants were performing detection (e.g. object vs. texture) or basic-level categorization (e.g. cars vs. dogs) tasks to track the dynamic interactions in the human brain during the rapid object recognition process. The results revealed that detection and categorization involve different temporal and functional connections that together correlate with successful recognition. These results imply that dynamics in the brain are important for our interaction with the environment. The implications of this study can be further extended to investigate the effect of subconscious emotional factors on the dynamics of brain interactions during the rapid recognition process.

  • PDF

Aerial Scene Labeling Based on Convolutional Neural Networks (Convolutional Neural Networks기반 항공영상 영역분할 및 분류)

  • Na, Jong-Pil;Hwang, Seung-Jun;Park, Seung-Je;Baek, Joong-Hwan
    • Journal of Advanced Navigation Technology
    • /
    • v.19 no.6
    • /
    • pp.484-491
    • /
    • 2015
  • The supply of aerial images has greatly increased owing to the growth of digital optical imaging technology and the development of UAVs. Aerial images have been used for extraction of ground properties, classification, change detection, image fusion and mapping. In particular, deep learning algorithms have introduced a new paradigm in image analysis, overcoming limitations of traditional pattern recognition. This paper presents the possibility of applying deep learning (ConvNet) to the segmentation and classification of aerial scenes across a wider range of fields. We build a four-class image database of 3,000 images in total, consisting of Road, Building, Yard and Forest. Each class has a distinct pattern, so the resulting feature vector maps differ. Our system consists of feature extraction, classification and training. Feature extraction is built up of two layers based on ConvNet, and classification is then performed using a multilayer perceptron and logistic regression.
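The pipeline outlined above, two convolutional feature layers followed by a simple classifier, can be sketched minimally; the kernel sizes, pooling choice and layer widths here are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def conv2d(img, kernels):
    """Valid 2-D convolution of an (H, W) image with (K, kh, kw) kernels,
    followed by ReLU. A naive loop implementation for clarity."""
    K, kh, kw = kernels.shape
    H, W = img.shape
    out = np.empty((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[k, i, j] = np.sum(img[i:i + kh, j:j + kw] * kernels[k])
    return np.maximum(out, 0)  # ReLU

def features(img, k1, k2):
    """Two ConvNet-style layers, then global average pooling per map,
    yielding a fixed-length feature vector."""
    maps = conv2d(img, k1)                               # layer 1
    maps = np.stack([conv2d(m, k2)[0] for m in maps])    # layer 2
    return maps.mean(axis=(1, 2))                        # global average pool

def softmax_classify(x, W, b):
    """Logistic-regression-style softmax over class scores."""
    z = W @ x + b
    e = np.exp(z - z.max())
    return e / e.sum()
```

In practice the convolution kernels and classifier weights would be learned from the labeled database; here they are left as inputs so the data flow stays visible.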

Contextual In-Video Advertising Using Situation Information (상황 정보를 활용한 동영상 문맥 광고)

  • Yi, Bong-Jun;Woo, Hyun-Wook;Lee, Jung-Tae;Rim, Hae-Chang
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.11 no.8
    • /
    • pp.3036-3044
    • /
    • 2010
  • With the rapid growth of video data services, demand for advertisements or additional information relevant to a particular video scene is increasing. However, direct automated visual analysis or speech recognition on videos has practical limitations at the current level of technology, and the metadata of a video, such as its title, category information or summary, does not reflect the content of continuously changing scenes. This work presents a new video contextual advertising system that serves relevant advertisements for a given scene by leveraging the scene's situation information inferred from video scripts. Experimental results show that the use of situation information extracted from scripts leads to better performance and the display of more relevant advertisements to the user.
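The matching step such a system needs, pairing a scene's inferred situation terms with ad keywords, can be sketched as a simple bag-of-words cosine ranking; the term representation and the ad format here are hypothetical stand-ins for the paper's actual retrieval model:

```python
from collections import Counter

def situation_vector(terms):
    """Bag-of-words vector for a scene's situation terms (a hypothetical
    representation; the paper infers situations from video scripts)."""
    return Counter(terms)

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (sum(v * v for v in a.values()) ** 0.5
           * sum(v * v for v in b.values()) ** 0.5)
    return num / den if den else 0.0

def rank_ads(scene_terms, ads):
    """Rank candidate ads (name -> keyword list) by similarity to the
    scene's situation vector, most relevant first."""
    s = situation_vector(scene_terms)
    return sorted(ads, key=lambda ad: cosine(s, situation_vector(ads[ad])),
                  reverse=True)
```

A scene scripted around a dinner conversation would then surface food- or wine-related ads ahead of unrelated ones.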

A Study on Acting Approaches based on Characteristics of Zoom Theater - Focused on the Production Process of Project, Hong-Do 2020 (줌(Zoom)연극의 특성에 따른 배우의 연기 접근 방법 연구 - 프로젝트, 홍도(2020)의 제작 과정을 중심으로)

  • Jung, Eunyoung
    • The Journal of the Korea Contents Association
    • /
    • v.21 no.12
    • /
    • pp.842-854
    • /
    • 2021
  • Performing arts industries in Korea and abroad have been attempting a wide range of artistic experiments utilizing online platforms ever since the Covid-19 pandemic. Accordingly, this study sheds light on the functional characteristics of Zoom when used as a creative tool for theater performances. After examining theater performances presented in Korea and abroad using Zoom and their characteristics, the production of the Zoom play is analyzed by dividing it into the following stages: a research-based pre-production stage, a scene workshop stage that composes each scene based on the script, a recording stage filming each scene on Zoom, and a streaming stage for presenting the show. Furthermore, the actor's approaches to acting in this production process were identified as separation of gaze, re-recognition of space, utilization of expressive gestures, and reaction as an active action. As a result, the study proposes the possibility of ongoing development of theatrical work using Zoom and the evolution of actors' approaches to acting in theatrical work via Zoom.

Image Based Human Action Recognition System to Support the Blind (시각장애인 보조를 위한 영상기반 휴먼 행동 인식 시스템)

  • Ko, ByoungChul;Hwang, Mincheol;Nam, Jae-Yeal
    • Journal of KIISE
    • /
    • v.42 no.1
    • /
    • pp.138-143
    • /
    • 2015
  • In this paper we develop a novel human action recognition system, based on communication between an ear-mounted Bluetooth camera and an action recognition server, to aid scene recognition for the blind. First, when a blind user captures an image of a specific location using the ear-mounted camera, the captured image is transmitted to the recognition server via a smartphone that is synchronized with the camera. The recognition server sequentially performs human detection, object detection and action recognition by analyzing human poses. The recognized action information is retransmitted to the smartphone, and the user can hear it through text-to-speech (TTS). Experiments with the proposed system showed 60.7% action recognition performance on test data captured in indoor and outdoor environments.

An Approach for Localization Around Indoor Corridors Based on Visual Attention Model (시각주의 모델을 적용한 실내 복도에서의 위치인식 기법)

  • Yoon, Kook-Yeol;Choi, Sun-Wook;Lee, Chong-Ho
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.17 no.2
    • /
    • pp.93-101
    • /
    • 2011
  • For a mobile robot, recognizing its current location is essential for autonomous navigation. In particular, loop-closure detection, in which the robot recognizes a location it has visited before, is a key problem in localization. A considerable amount of research has been conducted on appearance-based loop-closure detection and localization, because vision sensors have cost advantages and permit various approaches to this problem. In scenes that consist of repeated structures, such as corridors, perceptual aliasing, in which two different locations are recognized as the same, occurs frequently. In this paper, we propose an improved method for recognizing locations in scenes with similar structures. We extract salient regions from images using a visual attention model and calculate weights using the distinctive features in each salient region. This makes it possible to emphasize the unique features of a scene and distinguish similar-looking locations. In corridor recognition experiments, the proposed method showed improved recognition performance: 78.2% accuracy for single-floor corridor recognition and 71.5% for multi-floor corridors.
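The salient-region weighting idea can be sketched with a crude center-surround saliency map standing in for the full visual attention model; the histogram descriptor and the weighting scheme below are illustrative assumptions:

```python
import numpy as np

def saliency_map(img):
    """Crude center-surround saliency for a grayscale image in [0, 1]:
    absolute deviation of each pixel from its 3x3 local mean, normalized.
    A stand-in for a full visual attention model."""
    H, W = img.shape
    pad = np.pad(img, 1, mode="edge")
    local = np.stack([pad[i:i + H, j:j + W]
                      for i in range(3) for j in range(3)]).mean(axis=0)
    s = np.abs(img - local)
    return s / (s.max() + 1e-9)

def weighted_histogram(img, bins=8):
    """Intensity histogram in which each pixel votes with its saliency
    weight, so distinctive regions dominate the descriptor when comparing
    similar-looking corridor locations."""
    w = saliency_map(img)
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0), weights=w)
    return hist / (hist.sum() + 1e-9)
```

Location recognition then reduces to nearest-neighbor matching between the query descriptor and the stored descriptors of previously visited places.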

A Novel Text Sample Selection Model for Scene Text Detection via Bootstrap Learning

  • Kong, Jun;Sun, Jinhua;Jiang, Min;Hou, Jian
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.2
    • /
    • pp.771-789
    • /
    • 2019
  • Text detection has been a popular research topic in the field of computer vision. It is difficult for prevalent text detection algorithms to avoid dependence on datasets. To overcome this problem, we propose a novel unsupervised text detection algorithm inspired by bootstrap learning. Firstly, text candidates in a novel superpixel form are generated to improve the text recall rate through image segmentation. Secondly, we propose a unique text sample selection model (TSSM) to extract text samples from the current image and eliminate database dependency. Specifically, to improve the precision of the samples, we combine maximally stable extremal regions (MSERs) and the saliency map to generate sample reference maps with a double-threshold scheme. Finally, a multiple kernel boosting method is developed to generate a strong text classifier by combining multiple single-kernel SVMs based on the samples selected by the TSSM. Experimental results on standard datasets demonstrate that our text detection method is robust to complex backgrounds and multilingual text and shows stable performance across different standard datasets.
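The double-threshold sample selection can be sketched as follows; the fusion of the MSER map and the saliency map is assumed here to be a simple average, which may differ from the paper's actual reference-map construction:

```python
import numpy as np

def select_text_samples(mser_map, sal_map, t_high=0.7, t_low=0.3):
    """Double-threshold sample selection over a fused reference map.
    Pixels confidently text-like become positive samples, clearly
    non-text pixels become negatives, and the rest stay unlabeled.
    Both input maps are per-pixel scores in [0, 1]; the averaging
    fusion and threshold values are assumptions for illustration."""
    ref = 0.5 * (mser_map + sal_map)   # sample reference map
    pos = ref >= t_high                # positive (text) samples
    neg = ref <= t_low                 # negative (background) samples
    return pos, neg
```

The unlabeled middle band is what makes the scheme conservative: only high-confidence pixels feed the downstream SVM training, which is the point of selecting samples from the current image rather than a fixed database.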

Three-dimensional object recognition using efficient indexing:Part II-generation and verification of object hypotheses (효율적인 인덱싱 기법을 이용한 3차원 물체인식:Part II-물체에 대한 가설의 생성과 검증)

  • 이준호
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.34C no.10
    • /
    • pp.76-88
    • /
    • 1997
  • Based on the principles described in Part I, we have implemented a working prototype vision system using a feature structure called an LSG (local surface group) for generating object hypotheses. In order to verify an object hypothesis, we estimate the view of the hypothesized model object and render the model object for the computed view. The object hypothesis is then verified by finding additional features in the scene that match those present in the rendered image. Experimental results on synthetic and real range images show the effectiveness of the indexing scheme.

  • PDF

Three Dimensional Object Recognition using PCA and KNN (PCA와 KNN을 이용한 3차원 물체인식)

  • Lee, Kee-Jun
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.8
    • /
    • pp.57-63
    • /
    • 2009
  • Object recognition technologies using PCA (principal component analysis) recognize objects by deciding representative features of objects in the model images, extracting feature vectors from objects in an image, and measuring the distance between them and the object representations. Given the frequent recognition problems associated with the point-to-point distance approach, this study adopted the k-nearest neighbor (class-to-class) technique, in which a group of object models of the same class is used as the recognition unit for continuously inputted images. However, the robustness of recognition strategies using PCA depends on several factors, including illumination. When scene constancy is not secured due to varying illumination conditions, the learning performance of the feature detector can be compromised, undermining recognition quality. This paper proposes a new PCA recognition method in which objects in the database can be detected under illumination conditions that differ between the input images and the model images.
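The class-to-class k-NN over PCA projections described above can be sketched as follows; the SVD-based PCA and the mean-of-k-nearest distance rule are illustrative choices, assumptions rather than the paper's exact formulation:

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA on training data (rows = flattened model images) via SVD
    of the mean-centered matrix."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(x, mean, components):
    """Project one flattened image into the PCA subspace."""
    return components @ (x - mean)

def classify_class_to_class(x, models, mean, components, k=3):
    """Class-to-class k-NN: each class keeps a group of projected model
    views; the query is assigned to the class with the smallest mean
    distance over its k nearest model views, rather than to the single
    nearest point."""
    q = project(x, mean, components)
    best_cls, best_d = None, np.inf
    for cls, views in models.items():
        d = np.sort([np.linalg.norm(q - v) for v in views])[:k].mean()
        if d < best_d:
            best_cls, best_d = cls, d
    return best_cls
```

Averaging over the k nearest views of a class is what makes the decision class-to-class: a single outlier view no longer decides the label, which is the motivation the abstract gives for moving away from point-to-point distances.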