Search | Korea Science

The Local Path Constraint for the Recognition of Speech (음성 인식을 위한 소구간 경로 제약)

Ann, Tae-Ock;Kim, Soon-Hyob
- The Journal of the Acoustical Society of Korea / v.8 no.4 / pp.60-64 / 1989
In this paper, a local path constraint is proposed to increase the speech recognition rate. The input speech signal is analyzed using autocorrelation and LPC coefficients as feature parameters. The proposed local path constraint was compared with five conventional types. The speech data used in this study are subway station names: 13 different words consisting of 11 syllable characters, each pronounced 10 times for 130 utterances, spoken by two male speakers and one female speaker. As a result, the proposed type performed best among the types compared, with a recognition rate of $94.6\%$.
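As a rough illustration of what a local path constraint does in DTW-based word recognition, the sketch below uses the basic textbook step pattern, where each cell is reached from its left, lower, or diagonal neighbour. This is an assumption for illustration only; it is not the constraint proposed in the paper, and the function names `dtw_distance` and `recognize` are hypothetical.

```python
import numpy as np

def dtw_distance(ref, test):
    """Accumulated DTW distance between two feature sequences.

    ref, test: arrays of shape (frames, dims).
    Local path constraint (assumed, textbook form): cell (i, j) may be
    reached from (i-1, j), (i, j-1), or (i-1, j-1).
    """
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    n, m = len(ref), len(test)
    # Frame-to-frame Euclidean distances.
    local = np.linalg.norm(ref[:, None, :] - test[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = local[i - 1, j - 1] + best_prev
    return acc[n, m]

def recognize(templates, test):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(templates[label], test))
```

Changing the set of allowed predecessors in the inner loop (and any slope weights attached to them) is what distinguishes one local path constraint from another.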

Efficient quantization of LPC parameters for vocoder of mobile communications (이동통신 음성 부화화기를 위한 선형 예측 계수(LPC)의 효율적 양자화 방법)

이인성;우홍채
- Journal of the Korean Institute of Telematics and Electronics S / v.34S no.4 / pp.50-56 / 1997
In this paper, efficient quantization methods for line spectrum pairs (LSP), which offer good performance with low complexity and memory requirements, are proposed for vocoders in mobile communication systems. An adaptive quantization method that exploits the ordering property of LSP parameters is used in a scalar quantizer and in a vector-scalar hybrid quantizer. The proposed scalar quantization algorithm needs 31 bits/frame to maintain transparent speech quality. The improved vector-scalar quantizer achieves an average spectral distortion of 1 dB using 26 bits/frame. The proposed methods are evaluated under channel errors, and the predictor structure is changed to maintain robustness to channel errors.
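Spectral distortion figures such as the 1 dB quoted above are commonly computed as the RMS difference between the log power spectra of the original and quantized LPC filters, averaged over frames. The sketch below shows the per-frame measure under stated assumptions: the full band from 0 to pi is used, the convention is A(z) = 1 + a1 z^-1 + ... + ap z^-p, and the function names are hypothetical. The exact band limits used in the paper are not given in the abstract.

```python
import numpy as np

def lpc_log_spectrum_db(a, n_fft=512):
    """Log power spectrum 10*log10(1/|A(e^jw)|^2) in dB.

    a: [1, a1, ..., ap], the coefficients of A(z) (assumed convention).
    """
    A = np.fft.rfft(np.asarray(a, dtype=float), n_fft)
    return -20.0 * np.log10(np.abs(A))

def spectral_distortion_db(a_ref, a_quant, n_fft=512):
    """RMS log-spectral difference (dB) between two LPC filters for one frame."""
    diff = lpc_log_spectrum_db(a_ref, n_fft) - lpc_log_spectrum_db(a_quant, n_fft)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Averaging `spectral_distortion_db` over all frames of a test set gives the average spectral distortion that quantizer papers usually report.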

A Study on Speech Recognition by One Stage MSVQ/DP (One stage MSVQ/DP를 이용한 음성 인식에 관한연구)

Jeoung, Eui-Bung
- The Journal of the Acoustical Society of Korea / v.13 no.2 / pp.5-12 / 1994
This paper proposes a One Stage MSVQ/DP method for a word recognition system. University administration branch names are selected for the recognition experiment, and 10 LPC cepstrum coefficients are used as the feature parameters. In addition to the speech recognition experiments with the proposed method, experiments on the same data with Level Building DTW (LBDTW) and the One Stage DP method are performed for comparison. The recognition rates with LBDTW and the One Stage DP method are $83.3\%$ and $87.5\%$, respectively, whereas the recognition rate with the proposed method is $91.6\%$.
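LPC cepstrum coefficients like the 10 used here are usually obtained from the LPC coefficients by a short recursion rather than by an explicit Fourier transform. The sketch below assumes the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p with the all-pole model 1/A(z) and unit gain; the function name is hypothetical and the paper's own convention is not stated in the abstract.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c_1..c_n of the all-pole model 1/A(z).

    a: [a1, ..., ap] for A(z) = 1 + sum_k a_k z^-k (assumed convention).
    Recursion: c_n = -a_n - (1/n) * sum_{k=1}^{n-1} k * c_k * a_{n-k},
    with a_n taken as 0 for n > p.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # Only terms with 1 <= n-k <= p contribute to the sum.
        acc = sum(k * c[k - 1] * a[n - k - 1] for k in range(1, n) if n - k <= p)
        a_n = a[n - 1] if n <= p else 0.0
        c.append(-a_n - acc / n)
    return c
```

For a single pole at r, that is a = [-r], the recursion gives c_n = r^n / n, which matches the series expansion of -ln(1 - r z^-1).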

Speech Enhancement by Reconstruction of Cosine Table for LSF Roots According to the Voiced/Unvoiced Decision (유무성음 판정에 따른 LSF 코사인테이블 재구성에 의한 음질향상)

Choi SeongYoung;BAE MyungJin
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.3-6 / 2000
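Among CELP-family vocoders, the G.723.1 vocoder, developed for Internet telephony and video conferencing, converts LPC to LSP by forming polynomials from the LPC coefficients and then searching for their roots. The root search proceeds sequentially through a cosine table with equally spaced entries within the range 256/pi. Because LSF roots appear in the bands where formants exist, they are concentrated in the low-frequency band for voiced sounds and in the high-frequency band for unvoiced sounds. However, the cosine table used in G.723.1 takes equally spaced values without considering the characteristics of the speech signal, which can degrade speech quality. In this paper, speech quality was improved by reconstructing the cosine table to reflect the characteristics of speech, and a subjective MOS test showed an average quality improvement of about 1.8.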

A Study on the Parameter Extraction for Performance Comparison of LSP transformation Time (LSP 변환 알고리즘들의 비교 평가에 관한 연구)

Lim, Ji-Sun
- Proceedings of the KAIS Fall Conference / 2010.05a / pp.249-252 / 2010
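Methods for converting LPC coefficients to LSP include the complex root, real root, ratio filter, Chebyshev series, and adaptive sequential LMS (least mean square) methods. Among these, the real root method, which is the one mainly used in speech coders, has the drawback of long computation time because it searches the frequency domain sequentially to find the roots. This paper proposes four fast algorithms for LPC-to-LSP conversion. The first applies a mel scale to the search interval, and the second adjusts the search order using the distribution of the odd-numbered LSP parameters. The third and fourth reduce computation time by using vowel characteristics and by using LSP distribution characteristics and resolution, respectively. All four methods reduce the LSP conversion time by 35~50%. The experimental results also analyze the distinctive characteristics of each algorithm.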
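The sequential real-root search described above can be sketched as follows: form the symmetric and antisymmetric polynomials P(z) and Q(z) from A(z), evaluate them on a frequency grid, and refine each sign change. This is a minimal sketch assuming a uniform grid over (0, pi), bisection refinement, and a minimum-phase A(z); it is not the G.723.1 cosine-table procedure and not any of the four fast algorithms of the paper. The name `lpc_to_lsf` and the parameter `n_grid` are hypothetical.

```python
import numpy as np

def lpc_to_lsf(a, n_grid=1024):
    """Line spectral frequencies (radians, ascending) by sequential grid search.

    a: [1, a1, ..., ap] for a minimum-phase A(z) (assumed).
    """
    a = np.asarray(a, dtype=float)
    p = len(a) - 1
    ext = np.concatenate([a, [0.0]])
    P = ext + ext[::-1]          # symmetric:     A(z) + z^-(p+1) A(1/z)
    Q = ext - ext[::-1]          # antisymmetric: A(z) - z^-(p+1) A(1/z)
    k = np.arange(p + 2)
    center = (p + 1) / 2.0

    # With the linear phase removed, P is real and Q is purely imaginary
    # on the unit circle, so each reduces to a real function of w.
    def f_p(w):
        return np.sum(P * np.cos(w * (center - k)))

    def f_q(w):
        return np.sum(Q * np.sin(w * (center - k)))

    # Open interval: the trivial roots at w = 0 and w = pi are skipped.
    grid = np.linspace(0.0, np.pi, n_grid + 1)[1:-1]
    lsf = []
    for f in (f_p, f_q):
        vals = np.array([f(w) for w in grid])
        for i in np.where(vals[:-1] * vals[1:] < 0)[0]:
            lo, hi = grid[i], grid[i + 1]
            for _ in range(40):  # bisection refinement of the bracketed root
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            lsf.append(0.5 * (lo + hi))
    return np.sort(np.array(lsf))
```

The cost is dominated by the `n_grid` function evaluations per polynomial, which is why non-uniform grids or reordered searches of the kind the abstract mentions can save time.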

Robust Speech Recognition for Emotional Variation (감정 변화에 강인한 음성 인식)

Kim, Won-Gu
- Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.11a / pp.431-434 / 2007
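This paper studies feature parameters for a speech recognition system that are less affected by variation in human emotion. To this end, a speech database containing various emotions was first used to study the effect of emotional variation on recognition performance and to identify feature parameters that are less affected by such variation. LPC cepstrum coefficients, mel-cepstrum coefficients, root-cepstrum coefficients, PLP coefficients, RASTA-processed mel-cepstrum coefficients, and speech energy were used. In addition, CMS and SBR were used as methods for removing the bias contained in speech, and their performance was compared. In experiments with an HMM-based speaker-independent word recognizer, the best performance was obtained when RASTA mel-cepstrum and delta cepstrum were used with CMS as the signal bias removal method. This corresponds to an error reduction of about 59% compared with the baseline system using mel-cepstrum.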
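CMS (cepstral mean subtraction) removes a constant bias because a fixed convolutional effect on the signal becomes an additive constant in the cepstral domain. The sketch below assumes per-utterance mean removal over a (frames, coefficients) array; the function name is hypothetical, and the paper's exact CMS variant is not described in the abstract.

```python
import numpy as np

def cepstral_mean_subtraction(cepstra):
    """Subtract the per-coefficient mean over all frames of one utterance.

    cepstra: array of shape (frames, coeffs).
    """
    cepstra = np.asarray(cepstra, dtype=float)
    return cepstra - cepstra.mean(axis=0, keepdims=True)
```

Adding the same offset vector to every frame leaves the output unchanged, which is the bias-removal property the method relies on.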

A Method For Improvement Of Split Vector Quantization Of The ISF Parameters Using Adaptive Extended Codebook (적응적인 확장된 코드북을 이용한 분할 벡터 양자화기 구조의 ISF 양자화기 개선)

Lim, Jong-Ha;Jeong, Gyu-Hyeok;Hong, Gi-Bong;Lee, In-Sung
- The Journal of the Acoustical Society of Korea / v.30 no.1 / pp.1-8 / 2011
This paper presents a method for improving the performance of an ISF coefficient quantizer by using the ordering property of ISF coefficients to compensate for the shortcomings of split vector quantization, and designs an ISF coefficient quantizer for a wideband speech codec using the proposed method. The wideband speech codec uses a split vector quantizer to reduce complexity and codebook size, but this structure cannot fully exploit the correlation between ISF coefficients. The proposed algorithm uses the ordering property of ISF coefficients to overcome this shortcoming. The ordering property makes it possible to identify the redundant part of the codebook. This redundant part is replaced by an adaptive-extended codebook, built using the ordering property, ISF coefficient prediction, and interpolation of the existing codebook, to improve quantizer performance. As a result, the adaptive-extended codebook algorithm achieves a gain of about 2 bits in terms of spectral distortion compared with the existing split-structure ISF quantizer of AMR-WB (G.722.2).
https://doi.org/10.7776/ASK.2011.30.1.001

A Study on the Signal Processing for Content-Based Audio Genre Classification (내용기반 오디오 장르 분류를 위한 신호 처리 연구)

윤원중;이강규;박규식
- Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.6 / pp.271-278 / 2004
In this paper, we propose a content-based audio genre classification algorithm that automatically classifies query audio into five genres (Classic, Hiphop, Jazz, Rock, and Speech) using a digital signal processing approach. From a 20-second query audio file, the audio signal is segmented into 23 ms frames with a non-overlapping Hamming window, and a 54-dimensional feature vector, including Spectral Centroid, Rolloff, Flux, LPC, and MFCC, is extracted from each query audio. For classification, k-NN, Gaussian, and GMM classifiers are used. To choose the optimum features from the 54-dimensional feature vector, the SFS (Sequential Forward Selection) method is applied to select a 10-dimensional optimum feature set, which is then used for genre classification. The experimental results show that the proposed method provides a success rate near 90% for genre classification, an improvement of 10%∼20% over previous methods. For an actual user system environment, the feature vector is extracted from a random interval of the query audio, and the overall success rate is 80%, except in the extreme cases of the beginning and ending portions of the query audio file.
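Sequential Forward Selection, as used above to reduce 54 features to 10, is a greedy procedure: start from an empty set and repeatedly add the feature that most improves a score. The sketch below is generic; the `score` callback (for example, cross-validated classifier accuracy on a feature subset) is an assumption, since the abstract does not state the selection criterion, and the function name is hypothetical.

```python
def sequential_forward_selection(n_features, score, n_select):
    """Greedy SFS: return the indices of the selected features, in pick order.

    score: callable taking a list of feature indices and returning a number
           to maximise (assumed, e.g. validation accuracy).
    """
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < n_select:
        # Try each unused feature and keep the one giving the best score.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

With 54 candidate features and 10 to select, this evaluates the score 54 + 53 + ... + 45 = 495 times, far fewer than an exhaustive search over all 10-feature subsets.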

A Method of Adaptive ISF Split Vector Quantization Using Normalized Codebook (정규화 코드북을 이용한 분할 벡터 구조의 ISF 적응적 양자화 기법)

Piao, Zhigang;Lim, Jong-Ha;Hong, Gi-Bong;Lee, In-Sung
- The Journal of the Acoustical Society of Korea / v.30 no.5 / pp.265-272 / 2011
In most ISF (or LSF) based real-time speech codecs, the SVQ (split vector quantization) method is used to reduce quantizer complexity and codebook memory size. However, it has the drawback that the correlation between code vectors cannot be exploited across vector splits. This paper presents a new method of adaptive ISF vector quantization that compensates for the drawbacks of the SVQ-structured quantizer for wideband speech codecs. In each frame, the proposed method makes use of the correlation between split vectors by adaptively changing the codebook distribution according to the ordering property of ISF. The algorithm is evaluated in AMR-WB and shows an improvement of about 1.5 bits per frame.
https://doi.org/10.7776/ASK.2011.30.5.265

A Study on Classification of Four Emotions using EEG (뇌파를 이용한 4가지 감정 분류에 관한 연구)

강동기;김동준;김흥환;고한우
- Proceedings of the Korean Society for Emotion and Sensibility Conference / 2001.11a / pp.87-90 / 2001
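In this study, emotion classification experiments were carried out with three EEG parameters to find the parameter best suited to an emotion evaluation system. The EEG parameters were linear predictor coefficients and the band-wise cross-correlation coefficients of the FFT spectrum and of the AR spectrum, and the emotions were set to relaxation, joy, sadness, and irritation. EEG data were collected from four students of a university drama club, and the electrode positions Fp1, Fp2, F3, F4, T3, T4, P3, P4, O1, and O2 were used. After preprocessing, feature parameters were extracted from the collected EEG data and fed into a neural network used as the pattern classifier to classify the emotions. In the emotion classification experiments, linear predictor coefficients performed better than the other two parameters.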
