• Title/Summary/Keyword: shift invariance

A Study on Keyword Recognition in Continuous Speech Using Neural Networks (신경회로망을 이용한 연속음성중 키워드(keyword)인식에 관한 연구)

  • 최관선;한민홍
    • Proceedings of the Korean Operations and Management Science Society Conference / 1993.04a / pp.275-281 / 1993
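  • This presentation describes a method for recognizing keywords in continuous speech using a neural network. We developed a heuristic algorithm that identifies waveform segments and syllables in continuous speech, and extracted features syllable by syllable through spectral analysis (linear prediction) of the waveform segments. The syllable features were learned by a Kohonen neural network. Keyword recognition in continuous speech first recognizes syllables to find words and then checks whether each recognized word matches a keyword. The significance of this work is that the waveform-segment and syllable identification algorithms solve the problems of scaling invariance, time invariance (time warping and time-shift invariance), and redundancy removal, and that training the neural network lays the foundation for building a speaker-independent continuous speech recognition system. The speech recognition model is at the deployment stage as a campus telephone number guidance system and is planned to be used for address guidance as well. It can also be applied to voice commands for driver assistance and navigation systems; example commands include "steering wheel 20 degrees left", "drive to City Hall", and "show map to City Hall". The driver assistance system currently runs as a simulation on a computer screen. The speech recognition model achieved recognition rates of over 90% in speaker-dependent tests and 70% in speaker-independent tests.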

Binary clustering network for recognition of keywords in continuous speech (연속음성중 키워드(Keyword) 인식을 위한 Binary Clustering Network)

  • 최관선;한민홍
    • 제어로봇시스템학회:학술대회논문집 / 1993.10a / pp.870-876 / 1993
  • This paper presents a binary clustering network (BCN) and a heuristic pitch detection algorithm for recognizing keywords in continuous speech. To classify nonlinear patterns, the BCN separates patterns into binary clusters hierarchically and links identical patterns at the root level, using both supervised and unsupervised learning. The BCN has many desirable properties, such as a flexible dynamic structure, high classification accuracy, short learning time, and short recall time. The pitch detection algorithm is a heuristic model that addresses difficulties such as scaling invariance, time warping, time-shift invariance, and redundancy. The recognition algorithm has shown recognition rates as high as 95% in both speaker-dependent and multispeaker-dependent tests.

Shift-Invariant uHMT Estimation for Wavelet-based Image Denoising (웨이블렛 기반 영상 잡음제거를 위한 천이 불변 uHMT 추정)

  • 윤근수;정원용
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2001.06a / pp.221-224 / 2001
  • In this paper we propose a shift-invariant uHMT estimation for wavelet-based image denoising. The proposed estimation has just nine meta-parameters (independent of the size of the image and the number of wavelet scales) and requires no training. It also removes the visual artifacts that result from the lack of shift invariance in the DWT. The experimental results show that the proposed estimation outperforms other wavelet-based denoising methods by 0.5-1 dB (PSNR) and runs in O(n log n) time.
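
The lack of shift invariance in the decimated DWT mentioned in this abstract can be seen in a few lines. The sketch below is only an illustration: it uses a one-level Haar transform (chosen here for brevity, not the transform or the uHMT model of the paper), and the names `haar_dwt` and `energy` are hypothetical helpers.

```python
import math

def haar_dwt(signal):
    """One level of the decimated Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail

def energy(values):
    """Sum of squared coefficients."""
    return sum(v * v for v in values)

# A step edge that falls exactly between two sample pairs.
step = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
# The same signal circularly shifted right by one sample.
shifted = step[-1:] + step[:-1]

_, detail_step = haar_dwt(step)
_, detail_shifted = haar_dwt(shifted)

# The edge is invisible to the detail band in one case and carries
# all of the detail energy in the other.
detail_energy_step = energy(detail_step)
detail_energy_shifted = energy(detail_shifted)
print(detail_energy_step, detail_energy_shifted)
```

A one-sample shift moves the detail-band energy from zero to about one, so any coefficient-wise processing such as thresholding treats the two inputs differently. This is the kind of behavior that shift-invariant schemes aim to avoid.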

Windowed Wavelet Stereo Matching Using Shift ability (이동성(shift ability)을 이용한 윈도우 웨이블릿 스테레오 정합)

  • 신재민;이호근;하영호
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.1C / pp.56-63 / 2003
  • This paper proposes a wavelet-based stereo matching algorithm that obtains an accurate disparity map in the wavelet-transformed domain by using the shiftability property, a modified wavelet transform, the similarities between sub-bands, and a hierarchical structure. Because translation invariance cannot be expected of a system based on convolution and sub-sampling, the approach instead uses the translation-variant results of the sub-bands in the wavelet-transformed domain as feature information for matching. After similarity matching on each sub-band, the optimal matched points are easy to find because the sub-bands of a shifted signal look distinctly different from those of the original, unshifted signal.

ELS FTF algorithm for ARMA spectral estimation (ARMA스펙트럼 추정을 위한 ELS FTF 알고리즘)

  • 이철희;장영수;남현도;양홍석
    • 제어로봇시스템학회:학술대회논문집 / 1989.10a / pp.427-430 / 1989
  • For on-line ARMA spectral estimation, the fast transversal filter algorithm of the extended least squares method (ELS FTF) is presented. The projection operator, a key tool of the geometric approach, is used to derive the algorithm. ELS FTF is a fast time-update recursion based on the fact that the correlation matrix of the ARMA model satisfies the shift invariance property in each block, and thus it takes 10N+31 MADPR.

DFT integration for Face Detection (DFT를 이용한 Face Detection)

  • Han, Seok-Min;Choi, Jin-Young
    • Proceedings of the KIEE Conference / 2006.04a / pp.117-119 / 2006
  • In this work, we suggest another method to localize the DFT in the spatial domain. This enables the DFT algorithm to be used for local pattern matching. Once calculated, the cost of computing a localized DFT is the same regardless of the size or position of the local region in the spatial domain. We applied this method to the face detection problem and obtained results that demonstrate the utility of our method.
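
The abstract does not say how the constant cost is achieved. One possible reading, offered here purely as an assumption and not as the authors' method, is a running-sum ("integral") table of the DFT terms: once the table is built, the coefficient of any window is a difference of two entries. The one-dimensional sketch below uses hypothetical names (`prefix_terms`, `local_coefficient`, `direct_coefficient`).

```python
import cmath

def prefix_terms(samples, k, period):
    """Running sums of samples[t] * exp(-2*pi*i*k*t/period).

    prefix[m] covers samples 0 .. m-1, so prefix has len(samples) + 1 entries.
    """
    prefix = [0j]
    for t, value in enumerate(samples):
        prefix.append(prefix[-1] + value * cmath.exp(-2j * cmath.pi * k * t / period))
    return prefix

def local_coefficient(prefix, start, stop):
    """Frequency-k coefficient of the window [start, stop): one subtraction."""
    return prefix[stop] - prefix[start]

def direct_coefficient(samples, k, period, start, stop):
    """Same quantity computed term by term, for comparison only."""
    return sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / period)
               for t in range(start, stop))

samples = [2.0, 1.0, 0.0, 3.0, 5.0, 4.0, 1.0, 2.0, 0.0, 1.0, 3.0, 2.0]
prefix = prefix_terms(samples, k=1, period=4)

# A short window and a long window cost the same after the table is built.
small = local_coefficient(prefix, 2, 6)
large = local_coefficient(prefix, 1, 11)
print(small, large)
```

Under this assumption the per-window cost is independent of window size and position, at the price of one table per frequency index. A two-dimensional version would use a summed-area table in the same way.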

Planar integrated optics for performing fractional correlation operation (평판 집적 광학계를 이용한 분수차 상관기 구현)

  • 박선택;김필수;오차환;송석호
    • Korean Journal of Optics and Photonics / v.8 no.2 / pp.154-160 / 1997
  • On the basis of the fractional Fourier transform (FRT), which is known as a generalized form of the conventional Fourier transform, the fractional correlation has been implemented. The shift-variance property of the fractional correlation has been evaluated and compared with the shift invariance of the conventional correlation. The fractional correlation operation has been implemented using a planar optics configuration that integrates all of the optical components on a single glass substrate. Good agreement between the experimental and calculated results has been obtained.
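
The shift invariance of conventional correlation, which this abstract uses as its baseline, means that moving the input only moves the correlation peak and leaves its height unchanged. The sketch below illustrates that property with a circular correlation on a short sequence; it does not implement the fractional correlation, and the helper names are hypothetical.

```python
def circular_correlation(scene, template):
    """Correlate template against scene at every circular lag."""
    n = len(scene)
    return [sum(scene[(lag + t) % n] * template[t] for t in range(len(template)))
            for lag in range(n)]

def place(template, position, length):
    """Embed the template in an all-zero scene starting at position."""
    scene = [0.0] * length
    for t, value in enumerate(template):
        scene[(position + t) % length] = value
    return scene

def peak(values):
    """Return (index, value) of the largest entry."""
    index = max(range(len(values)), key=lambda i: values[i])
    return index, values[index]

template = [1.0, 2.0, 1.0]

# Same pattern at two different positions in a 12-sample scene.
peak_a = peak(circular_correlation(place(template, 2, 12), template))
peak_b = peak(circular_correlation(place(template, 7, 12), template))
print(peak_a, peak_b)
```

The peak location follows the input position while the peak value stays the same. The abstract reports that the fractional correlation, by contrast, is shift-variant.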

Shift-invariant face recognition based on the Karhunen-Loeve approximation of amplitude spectra of Fourier-transformed faces (Fourier 변환된 얼굴의 진폭스펙트럼의 karhunen-loeve 근사 방법에 기초한 변위불변적 얼굴인식)

  • 심영미;장주석;김종규
    • Journal of the Korean Institute of Telematics and Electronics C / v.35C no.3 / pp.97-107 / 1998
  • In face recognition based on the Karhunen-Loeve approximation, amplitude spectra of Fourier-transformed facial images were used. We found that the use of amplitude spectra gives not only the shift-invariance property but also some improvement in recognition rate. This is because the distance between the varying faces of one person, compared with that between different persons, [...]. We performed computer experiments on face recognition with varying facial images obtained from a total of 55 males and 25 females. We confirmed that the use of amplitude spectra of Fourier-transformed facial images gives a better recognition rate for a variety of varying facial images, including shifted ones, than the use of direct facial images does.
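
The shift-invariance property of amplitude spectra mentioned in this abstract follows from the Fourier shift theorem: a circular shift multiplies each DFT coefficient by a unit-magnitude phase factor, so the magnitudes do not change. The sketch below checks this on a single row of samples with a plain DFT; a face image would use the two-dimensional transform, and the data here are made up for illustration.

```python
import cmath

def dft(x):
    """Direct O(n^2) discrete Fourier transform of a short sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

row = [0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 0.0, 0.0]
# Circular shift to the right by three samples.
shifted_row = row[-3:] + row[:-3]

amplitude = [abs(c) for c in dft(row)]
amplitude_shifted = [abs(c) for c in dft(shifted_row)]

# Largest difference between the two representations of the same pattern.
max_amplitude_diff = max(abs(a - b) for a, b in zip(amplitude, amplitude_shifted))
max_raw_diff = max(abs(a - b) for a, b in zip(row, shifted_row))
print(max_amplitude_diff, max_raw_diff)
```

The raw samples differ substantially after the shift, while the amplitude spectra agree up to floating-point rounding. The invariance is exact only for circular shifts; for a pattern translated over a uniform background the two cases coincide.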

Digital Image Processing Using Non-separable High Density Discrete Wavelet Transformation (비분리 고밀도 이산 웨이브렛 변환을 이용한 디지털 영상처리)

  • Shin, Jong Hong
    • Journal of Korea Society of Digital Industry and Information Management / v.9 no.1 / pp.165-176 / 2013
  • This paper introduces the high density discrete wavelet transform using quincunx sampling, which combines the high density discrete wavelet transform with a non-separable processing method, each of which has its own characteristics and advantages. The high density discrete wavelet transform expands an N-point signal to M transform coefficients with M > N. It is a new dyadic wavelet transform with two generators, and its construction provides denser sampling in both time and frequency. The new transform is approximately shift-invariant and has intermediate scales. In two dimensions, it outperforms the standard discrete wavelet transform in terms of shift invariance. However, although the transform uses more wavelets, the higher sampling rate is costly, and some of the wavelets lack a dominant spatial orientation, which prevents them from isolating those directions. A non-separable method is one solution to this problem. The quincunx lattice is a non-separable sampling scheme used in image processing that treats the different directions more homogeneously than separable two-dimensional schemes. The proposed wavelet transform can generate sub-images that are versions rotated by multiple angles, and therefore performs well in image processing applications.

The Digital Image Processing Method Using Triple-Density Discrete Wavelet Transformation (3중 밀도 이산 웨이브렛 변환을 이용한 디지털 영상처리 기법)

  • Shin, Jong Hong
    • Journal of Korea Society of Digital Industry and Information Management / v.8 no.3 / pp.133-145 / 2012
  • This paper describes the high density discrete wavelet transform, which expands an N-point signal to M transform coefficients with M > N. The double-density discrete wavelet transform is one such transform. It employs one scaling function and two distinct wavelets, which are designed to be offset from one another by one half, and it is nearly shift-invariant. Similarly, the triple-density discrete wavelet transform is a new dyadic wavelet transform with two generators. Its construction provides denser sampling in both time and frequency. Specifically, the spectrum of the first wavelet is concentrated halfway between the spectrum of the second wavelet and the spectrum of its dilated version. In addition, the second wavelet is translated by half-integers rather than whole integers in the frame construction. This arrangement leads to a high density wavelet transform. The new transform is approximately shift-invariant and has intermediate scales. In two dimensions, it outperforms the standard and double-density discrete wavelet transforms in handling multiple directions. As a result, the proposed wavelet transform performs well in image and video processing applications.
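
The link between redundancy (M > N) and shift invariance can be illustrated with a much simpler expansive transform than the ones in this abstract. The sketch below uses an undecimated one-level Haar transform with circular boundaries as a stand-in; it is not the double-density or triple-density construction, and `undecimated_haar` is a hypothetical helper.

```python
import math

def undecimated_haar(signal):
    """One-level Haar transform without downsampling (circular boundary).

    An N-point input yields N approximation and N detail coefficients,
    so M = 2N > N.
    """
    n = len(signal)
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[(i + 1) % n]) for i in range(n)]
    detail = [s * (signal[i] - signal[(i + 1) % n]) for i in range(n)]
    return approx, detail

step = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
shifted = step[-1:] + step[:-1]  # circular shift right by one sample

approx, detail = undecimated_haar(step)
_, detail_shifted = undecimated_haar(shifted)

coefficient_count = len(approx) + len(detail)
# Shifting the input simply shifts the coefficients by the same amount.
detail_rolled = detail[-1:] + detail[:-1]
print(coefficient_count, detail_shifted == detail_rolled)
```

Here the coefficients of the shifted signal are exactly the shifted coefficients of the original, so no information moves between bands. The high density transforms described above sit between this fully redundant case and the critically sampled DWT, which is why they are described as only approximately shift-invariant.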