• Title/Summary/Keyword: Scene Recognition

Accurate Location Identification by Landmark Recognition

  • Jian, Hou;Tat-Seng, Chua
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.164-169 / 2009
  • As one of the most interesting kinds of scene, landmarks make up a large share of the scene images available on the web. At the same time, a specific landmark usually has characteristics that distinguish it from surrounding scenes and from other landmarks. These two observations make accurately estimating geographic information from a landmark image both necessary and feasible. In this paper, we propose a method that identifies landmark location by means of landmark recognition under significant viewpoint, illumination, and temporal variations. We use GPS-based clustering to group the images in the dataset by landmark, so that the images in each group cover the possible views of the corresponding landmark fairly completely. We then match query images to database images using a combination of edge and color histograms. Initial experiments with the ZuBuD database and our own collection of landmark images show that the approach is feasible.

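The histogram matching step described in this abstract can be illustrated with a minimal sketch. It covers only the color half of the edge-plus-color combination, and it assumes histogram intersection as the similarity measure, which the abstract does not specify. The names `color_histogram`, `histogram_intersection`, and `best_match`, the bin count, and the pixel-list image format are all hypothetical.

```python
from collections import Counter

def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram of (r, g, b) pixels with values 0-255.

    Assumes a non-empty pixel list; the bin count is an illustrative choice.
    """
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {key: n / total for key, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the two normalized histograms are identical."""
    return sum(min(value, h2.get(key, 0.0)) for key, value in h1.items())

def best_match(query_pixels, database):
    """Return the label of the database image most similar to the query.

    `database` maps a landmark label to that image's pixel list (assumed format).
    """
    query_hist = color_histogram(query_pixels)
    return max(
        database,
        key=lambda label: histogram_intersection(
            query_hist, color_histogram(database[label])
        ),
    )
```

In a fuller version, an edge histogram would be scored the same way and the two similarities combined, for example by a weighted sum; the weights would be a further assumption.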

Object Recognition using 3D Depth Measurement System. (3차원 거리 측정 장치를 이용한 물체 인식)

  • Gim, Seong-Chan;Ko, Su-Hong;Kim, Hyong-Suk
    • Proceedings of the IEEK Conference / 2006.06a / pp.941-942 / 2006
  • A depth measurement system that recognizes the 3D shape of objects using a single camera, a line laser, and a rotating mirror has been investigated. The camera and the light source are fixed, facing the rotating mirror. The laser light is reflected by the mirror and projected onto the scene objects whose locations are to be determined. The camera detects the position of the laser light on the object surfaces through the same mirror. The area to be measured is scanned by rotating the mirror. The segmentation step of object recognition is performed on the depth data of the restored 3D data. The object recognition domain can be reduced by separating the objects of interest from a complex background.

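The idea of separating objects of interest from the background using depth data can be sketched as a simple depth-range threshold. This is an assumption about one plausible segmentation rule, not the method of the paper, and it does not model the laser and mirror geometry. The function names and the nested-list depth map format are hypothetical.

```python
def segment_by_depth(depth_map, near, far):
    """Return a binary mask marking pixels whose depth lies in [near, far]."""
    return [[1 if near <= d <= far else 0 for d in row] for row in depth_map]

def bounding_box(mask):
    """Return (top, left, bottom, right) of the foreground pixels, or None if empty."""
    coords = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)
```

Restricting later recognition to the returned bounding box is one way the search domain could be reduced.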

Synthetic hit-miss transform for optical recognition of a moving target (이동물체의 광학적 인식을 위한 합성 HMT)

  • 김종찬;김정우;이하운;도양회;김수중
    • Journal of the Korean Institute of Telematics and Electronics D / v.35D no.3 / pp.82-90 / 1998
  • A hit-miss transform (HMT) using synthetic structuring elements (SEs) for optical recognition of a moving target is proposed. A moving target which was obtained from a fixed view point has objects. In the proposed HMT, the SEs are synthesized with the synthetic discriminant function (SDF) algorithm for efficient recognition of various shapes of true-class objects in noisy and cluttered scenes. The synthetic hit SE and the synthetic miss SE are composed of the SDFs of the hit SEs and miss SEs for each true-class object. Simulation results show that the proposed method can recognize various shapes of the true class with a single HMT operation.

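For readers unfamiliar with the underlying operation, a plain binary hit-miss transform can be sketched as follows. This is the standard digital HMT with ordinary structuring elements; it does not implement the SDF-synthesized elements or the optical setup of the paper. The function name, the offset-list representation of the structuring elements, and the choice to treat out-of-image pixels as background are assumptions.

```python
def hit_miss(image, hit, miss):
    """Binary hit-miss transform.

    `image` is a list of rows of 0/1 values. `hit` and `miss` are lists of
    (row_offset, col_offset) pairs. An output pixel is 1 when every hit offset
    lands on foreground and every miss offset lands on background.
    Pixels outside the image are treated as background (an assumption).
    """
    rows, cols = len(image), len(image[0])

    def pixel(r, c):
        return image[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            hits_ok = all(pixel(r + dr, c + dc) == 1 for dr, dc in hit)
            miss_ok = all(pixel(r + dr, c + dc) == 0 for dr, dc in miss)
            out[r][c] = 1 if hits_ok and miss_ok else 0
    return out

# Example structuring elements: detect isolated foreground pixels.
ISOLATED_HIT = [(0, 0)]
ISOLATED_MISS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
```

Recognizing several shapes with this plain form would take one transform per shape, which is the cost the synthetic structuring elements are meant to avoid.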

A New Matching Strategy for SNI-based 3-D Object Recognition (면 법선 영상 기반형 3차원 물체인식에서의 새로운 매칭 기법)

  • 박종훈;최종수
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.7 / pp.59-69 / 1993
  • In this paper, a new matching strategy for 3-D object recognition based on Surface Normal Images (SNIs) is proposed. The matching strategy that uses the similarity decision function [9,10] loses efficiency and reliability, because all features of the models in the model base must be compared with the features of the scene object, and the weights of the feature attributes are assigned heuristically. The proposed matching strategy addresses these problems with a new approach: the model base is searched to select a model object whose features fully match the features of the scene object. In this paper, the model base is constructed for a total of 26 objects, and synthetic and real range images are used to test the operation of the system. Experiments are performed to show that this strategy can be used effectively for SNI-based recognition.


CASA-based Front-end Using Two-channel Speech for the Performance Improvement of Speech Recognition in Noisy Environments (잡음환경에서의 음성인식 성능 향상을 위한 이중채널 음성의 CASA 기반 전처리 방법)

  • Park, Ji-Hun;Yoon, Jae-Sam;Kim, Hong-Kook
    • Proceedings of the IEEK Conference / 2007.07a / pp.289-290 / 2007
  • To improve the performance of a speech recognition system in the presence of noise, we propose a noise-robust front-end that uses two-channel speech signals and separates speech from noise based on computational auditory scene analysis (CASA). The main cues for the separation are the interaural time difference (ITD) and the interaural level difference (ILD) between the two channel signals. As a result, 39 cepstral coefficients are extracted from the separated speech components. Speech recognition experiments show that the proposed front-end outperforms the ETSI front-end with single-channel speech.

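The two separation cues named in this abstract can be illustrated with a minimal sketch that estimates them over a whole signal. It assumes ITD is taken as the lag that maximizes the cross-correlation and ILD as the energy ratio in decibels; CASA systems typically compute such cues per time-frequency unit after an auditory filterbank, which is not modeled here. The function names and parameters are hypothetical.

```python
import math

def interaural_time_difference(left, right, sample_rate, max_lag):
    """Estimate ITD in seconds as the lag that maximizes the cross-correlation.

    A positive value means the right channel lags the left channel.
    """
    def cross_correlation(lag):
        return sum(
            left[n] * right[n + lag]
            for n in range(len(left))
            if 0 <= n + lag < len(right)
        )

    best_lag = max(range(-max_lag, max_lag + 1), key=cross_correlation)
    return best_lag / sample_rate

def interaural_level_difference(left, right):
    """ILD in dB: 10 * log10(left energy / right energy).

    Assumes both channels have non-zero energy.
    """
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10(energy_left / energy_right)
```

A time-frequency unit whose ITD and ILD are close to the values expected for the target direction would be assigned to the speech source; the decision rule itself is not given in the abstract.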

3D Res-Inception Network Transfer Learning for Multiple Label Crowd Behavior Recognition

  • Nan, Hao;Li, Min;Fan, Lvyuan;Tong, Minglei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.3 / pp.1450-1463 / 2019
  • Crowd behavior recognition in a seriously clustered scene is extremely challenging because of variable, non-uniform scales. This paper proposes a crowd behavior classification framework based on a transferred hybrid network that blends a 3D ResNet with Inception-v3. First, the 3D Res-Inception network is presented to learn augmented visual features from UCF 101. Then the target dataset is used to fine-tune the network parameters in order to classify behavior in densely crowded scenes. Finally, a transferred entropy function is used to calculate the probability of multiple labels from these features. Experimental results show that the proposed method can greatly improve the accuracy of crowd behavior recognition and of multiple-label classification.

View Variations and Recognition of 2-D Objects (화상에서의 각도 변화를 이용한 3차원 물체 인식)

  • Whangbo, Taeg-Keun
    • The Transactions of the Korea Information Processing Society / v.4 no.11 / pp.2840-2848 / 1997
  • Recognition of 3D objects using computer vision is complicated by the fact that geometric features vary with view orientation. An important factor in designing recognition algorithms for such situations is understanding how certain critical features vary. The features selected in this paper are the angles between landmarks in a scene. In a class of polyhedral objects, the angles at certain vertices may form a distinct and characteristic alignment of faces. For many other classes of objects, it may be possible to identify distinctive spatial arrangements of some readily identifiable landmarks. In this paper, given an isotropic view orientation and an orthographic projection, the two-dimensional joint density function of two angles in a scene is derived. The joint density of all defining angles of a polygon in an image is also derived. The analytic expressions for the densities are useful in determining statistical decision rules for recognizing surfaces and objects. Experiments that evaluate the usefulness of the proposed methods are reported, and the results indicate that the method is useful and powerful.

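How an angle between landmarks changes under orthographic projection with an isotropic view orientation can be explored numerically. The sketch below is a Monte Carlo stand-in under stated assumptions, not the analytic density derived in the paper: it projects two 3D edge vectors onto the plane normal to a random unit view direction and measures the projected angle. All function names are hypothetical.

```python
import math
import random

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(v, d):
    """Orthographic projection of v onto the image plane normal to unit vector d."""
    k = _dot(v, d)
    return tuple(x - k * y for x, y in zip(v, d))

def projected_angle(u, v, d):
    """Angle in degrees between the projections of u and v, or None if degenerate."""
    pu, pv = project(u, d), project(v, d)
    norm = math.sqrt(_dot(pu, pu) * _dot(pv, pv))
    if norm < 1e-12:
        return None  # a vector is parallel to the view direction
    cos_angle = max(-1.0, min(1.0, _dot(pu, pv) / norm))
    return math.degrees(math.acos(cos_angle))

def random_view_direction(rng):
    """Isotropic unit vector obtained by normalizing a 3D Gaussian sample."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(_dot(g, g))
        if n > 1e-12:
            return tuple(x / n for x in g)

def sample_projected_angles(u, v, n, seed=0):
    """Draw n projected angles for edge vectors u and v under random views."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        angle = projected_angle(u, v, random_view_direction(rng))
        if angle is not None:
            samples.append(angle)
    return samples
```

A histogram of the returned samples gives an empirical estimate of the projected-angle distribution, which could be compared against an analytic density.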

Korean Text Image Super-Resolution for Improving Text Recognition Accuracy (텍스트 인식률 개선을 위한 한글 텍스트 이미지 초해상화)

  • Junhyeong Kwon;Nam Ik Cho
    • Journal of Broadcast Engineering / v.28 no.2 / pp.178-184 / 2023
  • Finding text in general scene images and recognizing its content is an important task that can serve as a basis for robot vision, visual assistance, and similar applications. In low-resolution text images, however, degradations such as noise and blur are more noticeable, which severely degrades text recognition accuracy. In this paper, we propose a new Korean text image super-resolution method based on a Transformer-based model, a class of models that generally shows higher performance than convolutional neural networks. In the experiments, we show that text recognition accuracy for Korean text images improves when our proposed text image super-resolution method is used. We also propose a new Korean text image dataset for training our model, which contains a large number of HR-LR Korean text image pairs.

Application of Shape Analysis Techniques for Improved CASA-Based Speech Separation (CASA 기반 음성분리 성능 향상을 위한 형태 분석 기술의 응용)

  • Lee, Yun-Kyung;Kwon, Oh-Wook
    • MALSORI / no.65 / pp.153-168 / 2008
  • We propose a new method to apply shape analysis techniques to a computational auditory scene analysis (CASA)-based speech separation system. The conventional CASA-based speech separation system extracts speech signals from a mixture of speech and noise signals. In the proposed method, we complement the missing speech signals by applying the shape analysis techniques such as labelling and distance function. In the speech separation experiment, the proposed method improves signal-to-noise ratio by 6.6 dB. When the proposed method is used as a front-end of speech recognizers, it improves recognition accuracy by 22% for the speech-shaped stationary noise condition and 7.2% for the two-talker noise condition at the target-to-masker ratio than or equal to -3 dB.

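The labelling technique mentioned in this abstract can be illustrated by connected-component labelling of a binary time-frequency mask. This is a minimal sketch assuming 4-connectivity and a nested-list mask; how the paper uses the labels and the distance function to complement missing speech is not described in the abstract and is not reproduced here. The function name is hypothetical.

```python
from collections import deque

def label_regions(mask):
    """Label 4-connected foreground regions of a binary mask.

    Returns (labels, count), where `labels` has the same shape as `mask`,
    background cells are 0, and regions are numbered 1..count in scan order.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                # Breadth-first flood fill of the region containing (r, c).
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (
                            0 <= ny < rows
                            and 0 <= nx < cols
                            and mask[ny][nx]
                            and not labels[ny][nx]
                        ):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

With regions labelled, each one can be analyzed on its own, for example by size or by distance to neighboring regions.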

Isolating vehicle license plate area using the known information (사전정보를 이용한 차량번호판 영역의 분리)

  • 문기주;신영석;최효돈
    • Korean Management Science Review / v.13 no.2 / pp.1-11 / 1996
  • Two different methods have been used to extract the license plate area of a vehicle for automatic recognition. One uses a color vision system and the other uses an edge detection operator. The color vision system has problems when the colors of the license plate and the vehicle body are similar, and the variety of plate colors in Korea also degrades system performance. The edge detection operator is problematic for real-time processing because it operates on every pixel of the scene. In this paper, a method that uses a gray-level vision system and previously known information about license plates is suggested. The suggested procedure searches for the lower boundary of the plate by counting high-contrast points between neighboring pixels, starting from the bottom line of the scene. Once the lower boundary is found, the upper boundary is obtained by adding the number plate height. The left and right boundaries are found by similar processes.

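The bottom-up boundary search described in this abstract can be sketched as follows. This is a minimal illustration under assumptions: the image is a list of gray-level rows, a high-contrast point is a pair of horizontally adjacent pixels whose difference reaches a threshold, and the plate height in pixels is known in advance. The function names, threshold, and minimum count are hypothetical values, not taken from the paper, and the left and right boundary search is omitted.

```python
def high_contrast_count(row, threshold):
    """Count adjacent pixel pairs in a row whose gray-level difference is large."""
    return sum(1 for a, b in zip(row, row[1:]) if abs(a - b) >= threshold)

def plate_vertical_bounds(image, plate_height, threshold=100, min_count=4):
    """Find (upper_row, lower_row) of the plate by scanning up from the bottom.

    The first row from the bottom with enough high-contrast points is taken as
    the lower boundary; the upper boundary follows from the known plate height.
    Returns None when no row qualifies.
    """
    for r in range(len(image) - 1, -1, -1):
        if high_contrast_count(image[r], threshold) >= min_count:
            upper = max(0, r - plate_height + 1)
            return upper, r
    return None
```

Because the scan stops at the first qualifying row, only the rows below the plate are examined, which is the saving over applying an edge operator to every pixel.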