• Title/Summary/Keyword: Data Codebook


Automatic Speaker Identification by Sustained Vowel Phonation (지속적으로 발성한 모음에 의한 화자인식)

  • Bae, Geon-Seong
    • The Journal of the Acoustical Society of Korea / v.11 no.1 / pp.35-41 / 1992
  • A speaker identification scheme using a speaker-based VQ codebook of a sustained vowel is proposed and tested. With the pitch-synchronous LPC vector of the sustained vowel /i/ as the feature vector, a VQ codebook size of 4 was found to be suitable for characterizing each speaker's feature space. For 40 normal speakers (20 male, 20 female), we achieved a correct identification rate of 99.4% on the training data set and 89.4% on the test data set, using speech samples of only 50 pitch periods.
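
The codebook-per-speaker idea in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes plain k-means (Lloyd iterations) for codebook training and squared Euclidean distortion for matching, and it uses synthetic 2-D vectors in place of pitch-synchronous LPC features. The names `train_codebook`, `distortion`, `identify`, and `toy_features` are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_codebook(vectors, size=4, iters=20, seed=0):
    """Train a VQ codebook with Lloyd (k-means) iterations; returns `size` codewords."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(size), key=lambda i: sq_dist(v, codebook[i]))
            cells[nearest].append(v)
        for i, cell in enumerate(cells):
            if cell:  # keep the old codeword if its cell is empty
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook

def distortion(vectors, codebook):
    """Average distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)

def identify(vectors, codebooks):
    """Return the speaker whose codebook quantizes the vectors with least distortion."""
    return min(codebooks, key=lambda name: distortion(vectors, codebooks[name]))

def toy_features(center, n, rng):
    """Synthetic stand-in for LPC feature vectors (assumption for this sketch)."""
    return [[c + rng.gauss(0, 0.1) for c in center] for _ in range(n)]

rng = random.Random(1)
train = {"A": toy_features((0.0, 0.0), 50, rng),
         "B": toy_features((1.0, 1.0), 50, rng)}
codebooks = {name: train_codebook(vs) for name, vs in train.items()}
probe = toy_features((1.0, 1.0), 50, rng)  # 50 vectors, echoing the 50 pitch periods
print(identify(probe, codebooks))  # prints "B"
```

Each speaker gets a codebook of four codewords, and an unknown utterance is assigned to the speaker whose codebook yields the lowest average quantization distortion.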


A study on the speech recognition by HMM based on multi-observation sequence (다중 관측열을 토대로한 HMM에 의한 음성 인식에 관한 연구)

  • 정의봉
    • Journal of the Korean Institute of Telematics and Electronics S / v.34S no.4 / pp.57-65 / 1997
  • This paper proposes an HMM (hidden Markov model) based on multi-observation sequences for isolated word recognition. The proposed model generates the MSVQ codebook by dividing each word into several sections and then dividing the training data into several sections. We then obtain the multi-observation sequence for each section by weighting the distance vectors in order from lower values to higher ones. During recognition, the sequence with a high probability value is then used. The names of 146 DDD areas are selected as the recognition vocabulary, and 10 LPC cepstrum coefficients are used as the feature parameters. In addition to the speech recognition experiments with the proposed model, experiments with DP, MSVQ, and a general HMM are carried out for comparison, using the same data under the same conditions. The experimental results show that the proposed HMM based on multi-observation sequences is superior to the methods using DP, MSVQ, and the general HMM in both recognition rate and recognition time.


A vehicle detection and tracking algorithm for supervision of illegal parking (불법 주정차 차량 단속을 위한 차량 검지 및 추적 기법)

  • Kim, Seung-Kyun;Kim, Hyo-Kak;Zhang, Dongni;Park, Sang-Hee;Ko, Sung-Jea
    • Journal of IKEEE / v.13 no.2 / pp.232-240 / 2009
  • This paper presents a robust vehicle detection and tracking algorithm for supervision of illegal parking. The proposed algorithm is composed of four parts. First, a vehicle detection algorithm based on an improved codebook object detection algorithm segments moving vehicles from the input sequence. Second, a preprocessing technique using the geometric characteristics of vehicles is employed to exclude non-vehicle objects. Then, the detected vehicles are tracked by an object tracker that combines a histogram tracking method with a Kalman filter. To make the tracking results more accurate, the histogram tracking results are used as measurement data for the Kalman filter. Finally, a Real Stop Counter (RSC) is introduced to make stopped-vehicle detection reliable and accurate. Experimental results show that the proposed algorithm can track multiple vehicles simultaneously and detect stopped vehicles successfully in a complicated street environment.
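
The step of feeding tracker output into a Kalman filter as measurement data can be illustrated with a scalar filter. This is only a sketch under stated assumptions: the abstract does not give the state model, so a one-dimensional random-walk model is assumed here, and the function name `kalman_1d` and the noise parameters `q` and `r` are hypothetical.

```python
def kalman_1d(measurements, q=1e-3, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter with an assumed random-walk state model.

    measurements: noisy positions (e.g. one coordinate reported by a tracker)
    q: process noise variance, r: measurement noise variance (both assumed)
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q              # predict: uncertainty grows by the process noise
        k = p / (p + r)        # Kalman gain
        x = x + k * (z - x)    # update the state with the measurement residual
        p = (1.0 - k) * p      # update the estimate variance
        estimates.append(x)
    return estimates

# A tracker that keeps reporting position 5.0 pulls the estimate toward 5.0.
estimates = kalman_1d([5.0] * 20)
print(round(estimates[-1], 2))
```

A real tracker would use a multi-dimensional state (position and velocity per axis); the scalar case only shows how each measurement corrects the prediction.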


Speech Recognition Based on VQ/NN using Fuzzy (Fuzzy를 이용한 VQ/NN에 기초를 둔 음성 인식)

  • Ann, Tae-Ock
    • The Journal of the Acoustical Society of Korea / v.15 no.6 / pp.5-11 / 1996
  • This paper studies speaker-independent recognition of single vowels and proposes a speech recognition method using VQ (Vector Quantization)/NN (Neural Network). The method builds a VQ codebook, which is used to obtain the observation sequence, then calculates probability values by comparing each codeword with the data, and finally uses these probability values as the input to the neural network. Korean single vowels are selected for the recognition experiment, in which ten male speakers pronounced eight single vowels ten times each. We compare the performance of our method with that of fuzzy VQ/HMM and conventional VQ/NN. According to the experimental results, the recognition rate is 92.3% for VQ/NN, 93.8% for fuzzy VQ/HMM, and 95.7% for fuzzy VQ/NN. The recognition rate of fuzzy VQ/NN is therefore better than those of fuzzy VQ/HMM and conventional VQ/NN, owing to its excellent learning ability.


Performance Evaluation of Beamforming Scheme in Millimeter Wave Wireless Communication System (밀리미터파 무선통신 시스템에서의 빔포밍 기법 성능 평가)

  • Nguyen, Thanh Ngoc;Jeon, Taehyun
    • Journal of Satellite, Information and Communications / v.11 no.3 / pp.133-137 / 2016
  • Millimeter wave wireless communication systems, especially those targeting indoor high-rate data transfer, have a strong requirement for a high-quality wireless link. Unfortunately, in this frequency band the electromagnetic wave suffers high propagation loss caused by the smaller wavelengths. In this scenario, beamforming, which enhances link quality by focusing the radiated power in one direction, becomes one of the most important techniques in millimeter wave wireless communication. In recent years, much research has been conducted on beamforming to improve the performance of wireless systems. In this paper, we evaluate the performance of a simplified codebook-based beamforming scheme that is based on a multiple-procedure, three-state beam selection. The simplified scheme significantly reduces beamforming setup time compared to exhaustive searching, the two-level searching adopted in the IEEE 802.15.3c standard, and the conventional multi-level scheme.
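
For reference, the exhaustive-search baseline mentioned in this abstract can be sketched as below. This is not the simplified scheme evaluated in the paper. The sketch assumes a uniform linear array with half-wavelength spacing, a codebook of steering vectors on a fixed angular grid, and a single-path channel; `steering_vector`, `beam_gain`, and `exhaustive_search` are hypothetical names.

```python
import cmath
import math

def steering_vector(n_antennas, angle_rad):
    """Unit-norm response of a half-wavelength-spaced uniform linear array."""
    return [cmath.exp(1j * math.pi * k * math.sin(angle_rad)) / math.sqrt(n_antennas)
            for k in range(n_antennas)]

def beam_gain(weights, channel):
    """Beamforming gain |w^H h|^2 for weight vector w and channel h."""
    return abs(sum(w.conjugate() * h for w, h in zip(weights, channel))) ** 2

def exhaustive_search(codebook, channel):
    """Try every beam in the codebook and return (best index, best gain)."""
    gains = [beam_gain(w, channel) for w in codebook]
    best = max(range(len(codebook)), key=gains.__getitem__)
    return best, gains[best]

N = 8
angles_deg = list(range(-60, 61, 15))            # 9 candidate beam directions
codebook = [steering_vector(N, math.radians(a)) for a in angles_deg]
channel = steering_vector(N, math.radians(30))   # assumed single path at 30 degrees
best, gain = exhaustive_search(codebook, channel)
print(angles_deg[best])  # prints 30
```

Exhaustive search costs one measurement per codebook entry, which is the setup time that multi-level and simplified schemes aim to reduce.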

A Novel LTE Downlink Codebook for Rician Fading Channels (Rician 페이딩 채널에 적합한 새로운 LTE 하향링크 코드북)

  • Yan, Zhi Fei;Kim, Young-Ju
    • Journal of the Institute of Electronics Engineers of Korea TC / v.48 no.1 / pp.70-76 / 2011
  • LTE Rel-10 aims at peak data rates of 1 Gbit/s for the downlink and 500 Mbit/s for the uplink, which can be accomplished not only by a wide spectrum but also by advanced MIMO techniques such as precoded MIMO and cooperative relays. Considering that some relays can have more direct signal paths than mobile stations do, LoS components are examined to build more efficient codebooks for Rician channels. The proposed codebooks perform better than the existing LTE codebooks as the LoS criterion, the K-factor, increases. While conserving the advantages and the max-min chordal distance of the existing LTE codebooks, the proposed ones also maximize the minimum chordal distance between codewords over Rician fading channels. Link-level simulations with LTE system parameters confirm the performance improvements as the value of K increases.
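
The minimum chordal distance criterion named in this abstract can be computed as in the sketch below. It assumes rank-1, unit-norm codewords, for which the chordal distance reduces to sqrt(1 - |w1^H w2|^2). The four 2-antenna vectors are chosen for illustration only and are not the codebook proposed in the paper; the function names are hypothetical.

```python
import math

def chordal_distance(w1, w2):
    """Chordal distance between two unit-norm vectors: sqrt(1 - |w1^H w2|^2)."""
    inner = sum(a.conjugate() * b for a, b in zip(w1, w2))
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))  # clamp rounding error

def min_chordal_distance(codebook):
    """Smallest pairwise chordal distance over all distinct codeword pairs."""
    return min(chordal_distance(codebook[i], codebook[j])
               for i in range(len(codebook))
               for j in range(i + 1, len(codebook)))

s = 1 / math.sqrt(2)
# Illustrative 2-antenna codebook with second-element phases 0, 180, 90, 270 degrees.
codebook = [[s, s], [s, -s], [s, s * 1j], [s, -s * 1j]]
d_min = min_chordal_distance(codebook)
print(round(d_min, 4))  # prints 0.7071
```

A codebook design that maximizes this minimum keeps the codewords as far apart as possible, which is the max-min property the abstract refers to.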

A Study on METS Design Using DDI Metadata (DDI 메타데이터를 활용한 METS 설계에 관한 연구)

  • Park, Jin Ho
    • Journal of the Korean Society for Information Management / v.38 no.4 / pp.153-171 / 2021
  • This study suggests a method of using METS based on DDI metadata to manage, preserve, and service datasets. DDI is a standard for statistical data processing, and it currently exists in two versions, DDI Codebook (DDI-C) and DDI Lifecycle (DDI-L). This study mainly uses the main elements of DDI-C. First, the structures and elements of METS and DDI-C were analyzed. Next, the major elements of METS and DDI-C were mapped to each other. METS was ultimately adopted as the base standard, that is, the format in which the result is expressed. Since METS and DDI-C do not map perfectly 1:1, the DDI-C element that best matches each element of the standard METS was selected. As a result, a new METS-based standard for dataset management and transmission, using DDI-C metadata elements, was designed and presented.

An Improvement of LVQ3 Learning Using SVM (SVM을 이용한 LVQ3 학습의 성능개선)

  • 김상운
    • Proceedings of the IEEK Conference / 2001.06c / pp.9-12 / 2001
  • Learning vector quantization (LVQ) is a supervised learning technique that uses class information to move the vector quantizer slightly, so as to improve the quality of the classifier decision regions. In this paper we propose a method of selecting initial codebook vectors for learning vector quantization (LVQ3) using support vector machines (SVM). The method is tested on artificial and real design data sets and compared with the conventional condensed nearest neighbor (CNN) method and its modifications (mCNN). The experiments show that the proposed method performs better than the conventional ones and can therefore be used efficiently for designing nonparametric classifiers.
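
The idea of using class information to move codebook vectors can be illustrated with the basic LVQ1 update, shown below. This is a deliberate simplification: the paper concerns LVQ3, which updates two prototypes inside a window, and its SVM-based initialization is not reproduced here. The function name `lvq1_step` and the learning rate are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def lvq1_step(prototypes, labels, x, y, lr=0.1):
    """One LVQ1 update: move the nearest prototype toward x if its label
    matches y, away from x otherwise. Updates in place; returns the index."""
    i = min(range(len(prototypes)), key=lambda k: sq_dist(prototypes[k], x))
    sign = 1.0 if labels[i] == y else -1.0
    prototypes[i] = [p + sign * lr * (xi - p) for p, xi in zip(prototypes[i], x)]
    return i

prototypes = [[0.0, 0.0], [1.0, 1.0]]
labels = ["a", "b"]

# Same class as the nearest prototype: prototype 0 is attracted to the sample.
lvq1_step(prototypes, labels, x=[0.2, 0.0], y="a")
# Different class from the nearest prototype: prototype 1 is repelled.
lvq1_step(prototypes, labels, x=[0.8, 1.0], y="a")
print([[round(v, 2) for v in p] for p in prototypes])  # [[0.02, 0.0], [1.02, 1.0]]
```

Because each step only nudges the nearest prototype, the final classifier depends heavily on where the prototypes start, which is why initial codebook selection matters.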


The Research of Reducing the Fixed Codebook Search Time of G.723.1 MP-MLQ (잡음 환경에서의 전송율 감소를 위한 G.723.1 VAD 성능개선에 관한 연구)

  • 김정진;박영호;배명진
    • Proceedings of the IEEK Conference / 2000.06d / pp.98-101 / 2000
  • Among CELP-type vocoders, the G.723.1 6.3 kbps/5.3 kbps dual-rate speech codec, developed for Internet phone and videoconferencing, uses VAD (Voice Activity Detection) and CNG (Comfort Noise Generator) to reduce the bit rate during silence periods. To reduce the bit rate effectively, in this paper we first set a boundary condition on the energy threshold to avoid unnecessary processing time, and then use three decision rules, based on energy, pitch gain, and LSP distance, to detect active frames. To evaluate the performance of the proposed algorithm, we use silence-inserted speech data at SNRs of 0, 5, 10, and 20 dB. As a result, when the SNR is above 5 dB, the bit rate is reduced by up to about 40% without speech degradation, and the processing time also decreases.


Wideband Speech Reconstruction Using Modular Neural Networks (모듈화한 신경 회로망을 이용한 광대역 음성 복원)

  • Woo Dong Hun;Ko Charm Han;Kang Hyun Min;Jeong Jin Hee;Kim Yoo Shin;Kim Hyung Soon
    • MALSORI / no.48 / pp.93-105 / 2003
  • Since the telephone channel has band-limited frequency characteristics, speech transmitted over it shows degraded quality. In this paper, we propose an algorithm that uses neural networks to reconstruct wideband speech from its narrowband version. Although a single neural network is a good tool for direct mapping, it is difficult to train on vast and complicated data. To alleviate this problem, we modularize the neural networks based on an appropriate clustering of the acoustic space. We also introduce fuzzy computing to compensate for probable misclassification at the cluster boundaries. In our simulations, the proposed algorithm showed improved performance over the single neural network and the conventional codebook mapping method in both objective and subjective evaluations.
