Search | Korea Science

A New Pruning Method for Synthesis Database Reduction Using Weighted Vector Quantization

Kim, Sanghun;Lee, Youngjik;Hirose, Keikichi
- The Journal of the Acoustical Society of Korea
- v.20 no.4E
- pp.31-38
- 2001
A large-scale synthesis database for unit-selection-based synthesis usually retains redundant synthesis unit instances that contribute nothing to synthetic speech quality. In this paper, we propose a new pruning method called weighted vector quantization (WVQ) to eliminate those instances from the synthesis database. WVQ reflects the relative importance of each synthesis unit instance when clustering similar instances with the vector quantization (VQ) technique. The proposed method was compared with two conventional pruning methods through objective and subjective evaluations of synthetic speech quality: one that simply limits the maximum number of instances, and one based on normal VQ-based clustering. The proposed method showed the best performance at reduction rates below 50%. At reduction rates above 50%, the synthetic speech quality is degraded perceptibly, though not seriously. Using the proposed method, the synthesis database can be reduced efficiently without serious degradation of synthetic speech quality.
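
The pruning idea in this abstract can be illustrated with a minimal sketch. It assumes that WVQ behaves like a k-means style VQ whose centroid update is a weighted mean, and that one instance is retained per cluster; the function name `weighted_vq_prune`, the feature vectors, and the importance weights are hypothetical, since the abstract does not specify how importance is computed or which instance is kept.

```python
import numpy as np

def weighted_vq_prune(features, weights, n_clusters, n_iter=20, seed=0):
    """Cluster instances with a weighted k-means and return the indices to keep.

    Assumption: all weights are strictly positive.
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Initialise centroids from distinct, randomly chosen instances.
    start = rng.choice(len(features), n_clusters, replace=False)
    centroids = features[start].copy()
    for _ in range(n_iter):
        # Assign each instance to its nearest centroid (squared Euclidean distance).
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = labels == k
            if members.any():
                # Weighted mean: important instances pull the centroid harder.
                centroids[k] = np.average(
                    features[members], axis=0, weights=weights[members]
                )
    # Keep, for each non-empty cluster, the instance closest to its centroid.
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    kept = []
    for k in range(n_clusters):
        idx = np.flatnonzero(labels == k)
        if idx.size:
            kept.append(int(idx[dists[idx, k].argmin()]))
    return sorted(kept)
```

With uniform weights this reduces to ordinary VQ-based clustering, which corresponds to the second baseline mentioned in the abstract.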

Design of EVRC LSP Codebooks with Korean (한국어에 의한 EVRC LSP 코드북 설계)

이진걸
- The Journal of the Acoustical Society of Korea
- v.21 no.2
- pp.167-172
- 2002
The EVRC (Enhanced Variable Rate Codec) is currently in service as a speech codec in digital cellular systems in North America and Korea. In the EVRC, the LSP (Line Spectral Pairs) parameters, which are related to the energy distribution of speech signals in the frequency domain, are coded by weighted split vector quantization. Considering that the LSP codebooks may have been trained on English or on the language of the country where they were developed, codebooks trained on Korean are expected to improve performance for communication in Korean. In this paper, the EVRC LSP codebooks are designed with Korean speech using LBG-algorithm-based vector quantization, and the resulting improvements in vector quantization performance and in speech quality are demonstrated by spectral distortion and by SNR and SegSNR measurements, respectively.
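
As a rough illustration of LBG-based codebook training, the sketch below grows a codebook by repeated splitting and refinement. It is plain, unweighted, unsplit LBG on generic vectors, not the EVRC weighted split quantizer; the function name `lbg_codebook` and the parameters `eps` and `n_iter` are assumptions made for this example.

```python
import numpy as np

def lbg_codebook(training, size, eps=0.01, n_iter=20):
    """Train a codebook by LBG splitting.

    Assumption: `size` is a power of two, because each pass doubles the codebook.
    """
    training = np.asarray(training, dtype=float)
    # Start from a single codeword: the centroid of all training vectors.
    codebook = training.mean(axis=0, keepdims=True)
    while len(codebook) < size:
        # Split every codeword into a slightly perturbed pair.
        codebook = np.vstack([codebook + eps, codebook - eps])
        for _ in range(n_iter):
            # Nearest-codeword assignment, then centroid update.
            dists = ((training[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            nearest = dists.argmin(axis=1)
            for k in range(len(codebook)):
                members = training[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
    return codebook
```

Training vectors would be LSP vectors extracted from Korean speech in the setting the abstract describes; any numeric array of shape (vectors, dimensions) works for the sketch.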

Nearest-Neighbors Based Weighted Method for the BOVW Applied to Image Classification

Xu, Mengxi;Sun, Quansen;Lu, Yingshu;Shen, Chenming
- Journal of Electrical Engineering and Technology
- v.10 no.4
- pp.1877-1885
- 2015
This paper presents a new nearest-neighbors-based weighted representation for images and a weighted K-Nearest-Neighbors (WKNN) classifier to improve the precision of image classification with Bag of Visual Words (BOVW) based models. Scale-invariant feature transform (SIFT) features are first extracted from the images. Then, the K-means++ algorithm is adopted in place of the conventional K-means algorithm to generate a more effective visual dictionary. Furthermore, the histogram of visual words is made more expressive by the proposed weighted vector quantization (WVQ). Finally, the WKNN classifier is applied to improve classification between images that contain similar levels of background noise. Average precision and absolute change degree are calculated to assess the classification performance and the stability of the K-means++ algorithm, respectively. Experimental results on three diverse datasets (Caltech-101, Caltech-256, and PASCAL VOC 2011) show that the proposed WVQ and WKNN methods further improve classification performance.
https://doi.org/10.5370/JEET.2015.10.4.1877
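
The K-means++ step mentioned in this abstract differs from conventional K-means only in how the initial centres are chosen. A minimal sketch of that seeding rule follows; the function name `kmeans_pp_init` is hypothetical, and the input stands in for SIFT descriptors.

```python
import numpy as np

def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centres with the K-means++ seeding rule.

    Assumption: the input contains at least k distinct points.
    """
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # The first centre is drawn uniformly at random.
    centers = [points[rng.integers(len(points))]]
    for _ in range(k - 1):
        chosen = np.array(centers)
        # Squared distance from every point to its closest chosen centre.
        d2 = ((points[:, None, :] - chosen[None, :, :]) ** 2).sum(axis=2).min(axis=1)
        # Far-away points are proportionally more likely to become a centre.
        probs = d2 / d2.sum()
        centers.append(points[rng.choice(len(points), p=probs)])
    return np.array(centers)
```

The returned centres would then be refined by ordinary K-means iterations to form the visual dictionary.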

Automatic Music Summarization Using Similarity Measure Based on Multi-Level Vector Quantization (다중레벨 벡터양자화 기반의 유사도를 이용한 자동 음악요약)

Kim, Sung-Tak;Kim, Sang-Ho;Kim, Hoi-Rin
- The Journal of the Acoustical Society of Korea
- v.26 no.2E
- pp.39-43
- 2007
Music summarization refers to a technique that automatically extracts the most important and representative segments of music content. In this paper, we propose and evaluate a technique that provides the repeated part of the music content as the music summary. To extract a repeated segment, the proposed algorithm uses a weighted sum of similarity measures based on multi-level vector quantization, for either a fixed-length or an optimal-length summary. Two similarity measures are proposed: a count-based measure and a distance-based measure. The count-based measure uses the number of identical codewords at the same position in two segments, and the distance-based measure uses the Mahalanobis distance between features that have the same codeword at the same position. The fixed-length music summary is evaluated by measuring the overlap ratio between manually identified repeated parts and automatically generated ones. The optimal-length music summary is evaluated by calculating how much of the repeated parts of the music content the automatically generated summary includes. Experiments showed that the optimal-length summary captures the repeated parts more effectively, in terms of summary length, than the fixed-length summary.
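
The count-based measure and the weighted sum over quantization levels can be sketched as follows. Normalising the count by the segment length, the function names, and the example level weights are assumptions for illustration; the abstract does not give the exact formula, and the distance-based (Mahalanobis) measure is not shown.

```python
def count_similarity(codes_a, codes_b):
    """Fraction of positions at which two codeword sequences carry the same codeword."""
    if len(codes_a) != len(codes_b) or len(codes_a) == 0:
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

def multilevel_similarity(levels_a, levels_b, level_weights):
    """Weighted sum of count-based similarities, one term per VQ level."""
    return sum(
        w * count_similarity(a, b)
        for w, a, b in zip(level_weights, levels_a, levels_b)
    )
```

Each entry of `levels_a` and `levels_b` is the codeword index sequence of one segment at one codebook resolution.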

Fuzzy Classifier and Bispectrum for Invariant 2-D Shape Recognition (2차원 불변 영상 인식을 위한 퍼지 분류기와 바이스펙트럼)

Han, Soo-Whan;Woo, Young-Woon
- Journal of Korea Multimedia Society
- v.3 no.3
- pp.241-252
- 2000
In this paper, a translation-, rotation-, and scale-invariant system for recognizing closed 2-D images is derived using the bispectrum of a contour sequence and a weighted fuzzy classifier, and is compared with recognition using a competitive neural algorithm called LVQ (Learning Vector Quantization). The bispectrum, based on third-order cumulants, is applied to the contour sequence of an image to extract fifteen feature vectors for each planar image. These bispectral feature vectors, which are invariant to shape translation, rotation, and scale transformation, can be used to represent two-dimensional planar images and are fed into a weighted fuzzy classifier. Experiments with eight different shapes of aircraft images are presented to illustrate the relatively high performance of the proposed recognition system.
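
To make the bispectral feature idea more concrete, here is a minimal sketch of a direct, unnormalised FFT-based bispectrum estimate of a 1-D sequence. It is not the authors' feature extractor: how the contour sequence is built and how the fifteen features are selected are not stated in the abstract, and the function name `bispectrum` is hypothetical.

```python
import numpy as np

def bispectrum(sequence):
    """Direct estimate B(f1, f2) = X(f1) * X(f2) * conj(X(f1 + f2))."""
    x = np.asarray(sequence, dtype=float)
    x = x - x.mean()  # remove the mean so the estimate reflects cumulants
    spectrum = np.fft.fft(x)
    n = len(x)
    f = np.arange(n)
    # Frequency indices are taken modulo n (the DFT is periodic).
    f_sum = (f[:, None] + f[None, :]) % n
    return spectrum[:, None] * spectrum[None, :] * np.conj(spectrum[f_sum])
```

One property the sketch does show: a circular shift of the input leaves the bispectrum unchanged, because the phase factors of the three terms cancel. If the contour sequence is sampled around a closed boundary, a change of starting point is such a shift.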

Fuzzy Mean Method with Bispectral Features for Robust 2D Shape Classification

Woo, Young-Woon;Han, Soo-Whan
- Proceedings of the Korea Intelligent Information System Society Conference
- 1999.10a
- pp.313-320
- 1999
In this paper, a translation-, rotation-, and scale-invariant system for classifying closed 2D images is derived using the bispectrum of a contour sequence and the weighted fuzzy mean method, and is compared with classification using a competitive neural algorithm called LVQ (Learning Vector Quantization). The bispectrum, based on third-order cumulants, is applied to the contour sequences of the images to extract fifteen feature vectors for each planar image. These bispectral feature vectors, which are invariant to shape translation, rotation, and scale transformation, can be used to represent two-dimensional planar images and are fed into a classifier that uses the weighted fuzzy mean method. Experiments with eight different shapes of aircraft images are presented to illustrate the high performance of the proposed classifier.

Representative Feature Extraction of Objects using VQ and Its Application to Content-based Image Retrieval (VQ를 이용한 영상의 객체 특징 추출과 이를 이용한 내용 기반 영상 검색)

Jang, Dong-Sik;Jung, Seh-Hwan;Yoo, Hun-Woo;Sohn, Yong-Jun
- Journal of KIISE:Computing Practices and Letters
- v.7 no.6
- pp.724-732
- 2001
In this paper, a new method is proposed that uses vector quantization (VQ) to extract features of the major objects that represent an image. The principal features of an image used in a content-based image retrieval system are the color, texture, shape, and spatial positions of objects. Representative color and texture features are extracted from the given image using a VQ clustering algorithm together with a general color and texture feature extraction method. Since these features are used for content-based image retrieval and are searched by object, desired images can be retrieved regardless of the position, rotation, and size of objects. The experimental results show that VQ greatly reduces the representative feature extraction time, that the highest retrieval rate is obtained when the weights of color and texture are both set to 0.5, and that the proposed method provides up to 90% precision and recall for 'person' query images.
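
The role of the color and texture weights can be shown with a small sketch that combines two feature distances into one retrieval score. The use of Euclidean distance, the function name `combined_distance`, and the feature layout are assumptions; only the equal weights of 0.5 come from the abstract.

```python
import numpy as np

def combined_distance(color_q, texture_q, color_c, texture_c,
                      w_color=0.5, w_texture=0.5):
    """Weighted sum of color and texture distances between a query and a candidate."""
    d_color = np.linalg.norm(np.asarray(color_q, float) - np.asarray(color_c, float))
    d_texture = np.linalg.norm(np.asarray(texture_q, float) - np.asarray(texture_c, float))
    # Smaller values mean the candidate is more similar to the query.
    return float(w_color * d_color + w_texture * d_texture)
```

Ranking the database images by this value in ascending order gives the retrieval result for a query.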

An Adaptive Finite State Vector Quantization Method Using a New Side Match Distortion Function for Image Coding (영상 부호화를 위한 새로운 사이드 매치 왜곡 함수를 이용한 적응 유한상태 벡터 양자화 기법)

Lee, Sang-Un;Lee, Doo-Soo;Lim, In-Chil
- Journal of the Korean Institute of Telematics and Electronics S
- v.35S no.10
- pp.118-125
- 1998
We introduce an adaptive finite state vector quantization method using a new side match distortion function. The conventional side match distortion function makes the gray-level transition across block boundaries as smooth as possible and yields proper state codebooks in flat areas, where spatial correlation is high. However, it cannot yield proper codebooks in edge areas, where spatial correlation is low. The proposed distortion function adds variances, which represent the image characteristics, to the conventional side match distortion function as weighted values, so it can select better state codebooks than the conventional function. In addition, if the proposed quantizer predicts a wrong state, it can correct the state. As a result, satisfactory image quality is obtained.
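
For orientation, the conventional side match distortion that this abstract builds on can be sketched as below: a candidate codeword is scored by how smoothly it continues the already decoded upper and left neighbour blocks. The variance-weighted term of the proposed function is not reproduced, because the abstract does not give its form; the function names and `state_size` are hypothetical.

```python
import numpy as np

def side_match_distortion(candidate, upper_block, left_block):
    """Squared gray-level mismatch along the top and left borders of a candidate block."""
    candidate = np.asarray(candidate, dtype=float)
    upper_block = np.asarray(upper_block, dtype=float)
    left_block = np.asarray(left_block, dtype=float)
    # Top row of the candidate against the bottom row of the upper neighbour.
    top = ((candidate[0, :] - upper_block[-1, :]) ** 2).sum()
    # Left column of the candidate against the right column of the left neighbour.
    left = ((candidate[:, 0] - left_block[:, -1]) ** 2).sum()
    return float(top + left)

def select_state_codebook(codebook, upper_block, left_block, state_size):
    """Indices of the state_size codewords with the smallest side match distortion."""
    scores = [side_match_distortion(c, upper_block, left_block) for c in codebook]
    return np.argsort(scores, kind="stable")[:state_size]
```

In flat regions the best-matching codewords continue the neighbouring gray levels, which is the case where the abstract says the conventional function works well.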

Unit Generation Based on Phrase Break Strength and Pruning for Corpus-Based Text-to-Speech

Kim, Sang-Hun;Lee, Young-Jik;Hirose, Keikichi
- ETRI Journal
- v.23 no.4
- pp.168-176
- 2001
This paper discusses two important issues of corpus-based synthesis: synthesis unit generation based on phrase break strength information, and pruning of redundant synthesis unit instances. First, a new sentence set for recording was designed to make an efficient synthesis database that reflects the characteristics of the Korean language. To obtain prosodic-context-sensitive units, we graded major prosodic phrases into 5 distinctive levels according to pause length and then discriminated intra-word triphones using the levels. Using the synthesis units with phrase break strength information, synthetic speech was generated and evaluated subjectively. Second, a new pruning method based on weighted vector quantization (WVQ) was proposed to eliminate redundant synthesis unit instances from the synthesis database. WVQ takes the relative importance of each instance into account when clustering similar instances using the vector quantization (VQ) technique. The proposed method was compared with two conventional pruning methods through objective and subjective evaluations of synthetic speech quality: one that simply limits the maximum number of instances, and one based on normal VQ-based clustering. For the same reduction rate in the number of instances, the proposed method showed the best performance. Synthetic speech at a 45% reduction rate had almost no perceptible degradation compared with synthetic speech without instance reduction.
