• 제목/요약/키워드: Local-feature

검색결과 933건 처리시간 0.023초

멀티미디어 검색을 위한 shot 경계 및 대표 프레임 추출 (Shot boundary Frame Detection and Key Frame Detection for Multimedia Retrieval)

  • 강대성;김영호
    • 융합신호처리학회논문지
    • /
    • 제2권1호
    • /
    • pp.38-43
    • /
    • 2001
  • 본 논문에서는 MPEG 비디오 스트림을 분석하여 DCT DC 계수를 추출하고 이들로 구성된 DC 이미지로부터 제안하는 robust feature를 이용하여 shot 검출을 수행한 후 각 feature들의 통계적 특성을 이용하여 스트림의 특징에 따라 weight를 부가하여 구해진 characterizing value의 시간 변화량을 구한다. 추해진 변화량의 local maxima와 local minima는 비디오 스트림에서 각각 가장 특징적인 frame과 평균적인 frame을 나타낸다. 이 순간의 shot을 구함으로서 효과적이고 빠른 시간 내에 key frame을 추출한다. 추출되어진 key frame에 대하여 원영상을 복원한 후, 색인을 위하여 다수의 parameter를 구하고, 사용자가 질의한 영상에 대해서 이들 파라메터를 구하여 key frame들과 가장 유사한 대표영상들을 검색한다. 실험결과 일반적인 방법보다 더 나은 결과를 보였고, 높은 검색율을 보였다.

  • PDF

Global Feature Extraction and Recognition from Matrices of Gabor Feature Faces

  • Odoyo, Wilfred O.;Cho, Beom-Joon
    • Journal of information and communication convergence engineering
    • /
    • 제9권2호
    • /
    • pp.207-211
    • /
    • 2011
  • This paper presents a method for facial feature representation and recognition from the Covariance Matrices of the Gabor-filtered images. Gabor filters are a very powerful tool for processing images that respond to different local orientations and wave numbers around points of interest, especially on the local features on the face. This is a very unique attribute needed to extract special features around the facial components like eyebrows, eyes, mouth and nose. The Covariance matrices computed on Gabor filtered faces are adopted as the feature representation for face recognition. Geodesic distance measure is used as a matching measure and is preferred for its global consistency over other methods. Geodesic measure takes into consideration the position of the data points in addition to the geometric structure of given face images. The proposed method is invariant and robust under rotation, pose, or boundary distortion. Tests run on random images and also on publicly available JAFFE and FRAV3D face recognition databases provide impressively high percentage of recognition.

Enhanced SIFT Descriptor Based on Modified Discrete Gaussian-Hermite Moment

  • Kang, Tae-Koo;Zhang, Huazhen;Kim, Dong W.;Park, Gwi-Tae
    • ETRI Journal
    • /
    • 제34권4호
    • /
    • pp.572-582
    • /
    • 2012
  • The discrete Gaussian-Hermite moment (DGHM) is a global feature representation method that can be applied to square images. We propose a modified DGHM (MDGHM) method and an MDGHM-based scale-invariant feature transform (MDGHM-SIFT) descriptor. In the MDGHM, we devise a movable mask to represent the local features of a non-square image. The complete set of non-square image features are then represented by the summation of all MDGHMs. We also propose to apply an accumulated MDGHM using multi-order derivatives to obtain distinguishable feature information in the third stage of the SIFT. Finally, we calculate an MDGHM-based magnitude and an MDGHM-based orientation using the accumulated MDGHM. We carry out experiments using the proposed method with six kinds of deformations. The results show that the proposed method can be applied to non-square images without any image truncation and that it significantly outperforms the matching accuracy of other SIFT algorithms.

Region Division for Large-scale Image Retrieval

  • Rao, Yunbo;Liu, Wei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제13권10호
    • /
    • pp.5197-5218
    • /
    • 2019
  • Large-scale retrieval algorithm is problem for visual analyses applications, along its research track. In this paper, we propose a high-efficiency region division-based image retrieve approaches, which fuse low-level local color histogram feature and texture feature. A novel image region division is proposed to roughly mimic the location distribution of image color and deal with the color histogram failing to describe spatial information. Furthermore, for optimizing our region division retrieval method, an image descriptor combining local color histogram and Gabor texture features with reduced feature dimensions are developed. Moreover, we propose an extended Canberra distance method for images similarity measure to increase the fault-tolerant ability of the whole large-scale image retrieval. Extensive experimental results on several benchmark image retrieval databases validate the superiority of the proposed approaches over many recently proposed color-histogram-based and texture-feature-based algorithms.

Stroke Width-Based Contrast Feature for Document Image Binarization

  • Van, Le Thi Khue;Lee, Gueesang
    • Journal of Information Processing Systems
    • /
    • 제10권1호
    • /
    • pp.55-68
    • /
    • 2014
  • Automatic segmentation of foreground text from the background in degraded document images is very much essential for the smooth reading of the document content and recognition tasks by machine. In this paper, we present a novel approach to the binarization of degraded document images. The proposed method uses a new local contrast feature extracted based on the stroke width of text. First, a pre-processing method is carried out for noise removal. Text boundary detection is then performed on the image constructed from the contrast feature. Then local estimation follows to extract text from the background. Finally, a refinement procedure is applied to the binarized image as a post-processing step to improve the quality of the final results. Experiments and comparisons of extracting text from degraded handwriting and machine-printed document image against some well-known binarization algorithms demonstrate the effectiveness of the proposed method.

Action Recognition with deep network features and dimension reduction

  • Li, Lijun;Dai, Shuling
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제13권2호
    • /
    • pp.832-854
    • /
    • 2019
  • Action recognition has been studied in computer vision field for years. We present an effective approach to recognize actions using a dimension reduction method, which is applied as a crucial step to reduce the dimensionality of feature descriptors after extracting features. We propose to use sparse matrix and randomized kd-tree to modify it and then propose modified Local Fisher Discriminant Analysis (mLFDA) method which greatly reduces the required memory and accelerate the standard Local Fisher Discriminant Analysis. For feature encoding, we propose a useful encoding method called mix encoding which combines Fisher vector encoding and locality-constrained linear coding to get the final video representations. In order to add more meaningful features to the process of action recognition, the convolutional neural network is utilized and combined with mix encoding to produce the deep network feature. Experimental results show that our algorithm is a competitive method on KTH dataset, HMDB51 dataset and UCF101 dataset when combining all these methods.

Attention-based for Multiscale Fusion Underwater Image Enhancement

  • Huang, Zhixiong;Li, Jinjiang;Hua, Zhen
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제16권2호
    • /
    • pp.544-564
    • /
    • 2022
  • Underwater images often suffer from color distortion, blurring and low contrast, which is caused by the propagation of light in the underwater environment being affected by the two processes: absorption and scattering. To cope with the poor quality of underwater images, this paper proposes a multiscale fusion underwater image enhancement method based on channel attention mechanism and local binary pattern (LBP). The network consists of three modules: feature aggregation, image reconstruction and LBP enhancement. The feature aggregation module aggregates feature information at different scales of the image, and the image reconstruction module restores the output features to high-quality underwater images. The network also introduces channel attention mechanism to make the network pay more attention to the channels containing important information. The detail information is protected by real-time superposition with feature information. Experimental results demonstrate that the method in this paper produces results with correct colors and complete details, and outperforms existing methods in quantitative metrics.

RGB Contrast 영상에서의 Local Binary Pattern Variance를 이용한 연기검출 방법 (Smoke Detection Method Using Local Binary Pattern Variance in RGB Contrast Imag)

  • 김정한;배성호
    • 한국멀티미디어학회논문지
    • /
    • 제18권10호
    • /
    • pp.1197-1204
    • /
    • 2015
  • Smoke detection plays an important role for the early detection of fire. In this paper, we suggest a newly developed method that generated LBPV(Local Binary Pattern Variance)s as special feature vectors from RGB contrast images can be applied to detect smoke using SVM(Support Vector Machine). The proposed method rearranges mean value of the block from each R, G, B channel and its intensity of the mean value. Additionally, it generates RGB contrast image which indicates each RGB channel’s contrast via smoke’s achromatic color. Uniform LBPV, Rotation-Invariance LBPV, Rotation-Invariance Uniform LBPV are applied to RGB Contrast images so that it could generate feature vector from the form of LBP. It helps to distinguish between smoke and non smoke area through SVM. Experimental results show that true positive detection rate is similar but false positive detection rate has been improved, although the proposed method reduced numbers of feature vector in half comparing with the existing method with LBP and LBPV.

Binary Hashing CNN Features for Action Recognition

  • Li, Weisheng;Feng, Chen;Xiao, Bin;Chen, Yanquan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제12권9호
    • /
    • pp.4412-4428
    • /
    • 2018
  • The purpose of this work is to solve the problem of representing an entire video using Convolutional Neural Network (CNN) features for human action recognition. Recently, due to insufficient GPU memory, it has been difficult to take the whole video as the input of the CNN for end-to-end learning. A typical method is to use sampled video frames as inputs and corresponding labels as supervision. One major issue of this popular approach is that the local samples may not contain the information indicated by the global labels and sufficient motion information. To address this issue, we propose a binary hashing method to enhance the local feature extractors. First, we extract the local features and aggregate them into global features using maximum/minimum pooling. Second, we use the binary hashing method to capture the motion features. Finally, we concatenate the hashing features with global features using different normalization methods to train the classifier. Experimental results on the JHMDB and MPII-Cooking datasets show that, for these new local features, binary hashing mapping on the sparsely sampled features led to significant performance improvements.

Face Recognition Based on the Combination of Enhanced Local Texture Feature and DBN under Complex Illumination Conditions

  • Li, Chen;Zhao, Shuai;Xiao, Ke;Wang, Yanjie
    • Journal of Information Processing Systems
    • /
    • 제14권1호
    • /
    • pp.191-204
    • /
    • 2018
  • To combat the adverse impact imposed by illumination variation in the face recognition process, an effective and feasible algorithm is proposed in this paper. Firstly, an enhanced local texture feature is presented by applying the central symmetric encode principle on the fused component images acquired from the wavelet decomposition. Then the proposed local texture features are combined with Deep Belief Network (DBN) to gain robust deep features of face images under severe illumination conditions. Abundant experiments with different test schemes are conducted on both CMU-PIE and Extended Yale-B databases which contain face images under various illumination condition. Compared with the DBN, LBP combined with DBN and CSLBP combined with DBN, our proposed method achieves the most satisfying recognition rate regardless of the database used, the test scheme adopted or the illumination condition encountered, especially for the face recognition under severe illumination variation.