• Title/Abstract/Keyword: feature transformation

391 search results

단위 선택 기반의 음성 변환 (Feature Selection-based Voice Transformation)

  • 이기승
    • 한국음향학회지 / Vol. 31, No. 1 / pp.39-50 / 2012
  • A voice transformation (VT) method that can make the utterance of a source speaker mimic that of a target speaker is described. Speaker individuality transformation is achieved by altering three feature parameters: the LPC cepstrum, the pitch period, and the gain. The main objective of this study is the construction of an optimal sequence of features selected from a target speaker's database, maximizing both the correlation probabilities between the transformed and the source features and the likelihood of the transformed features with respect to the target model. A set of two-pass conversion rules is proposed, in which the feature parameters are first selected from the database and the optimal sequence of feature parameters is then constructed in the second pass. The conversion rules were developed using a statistical approach based on a maximum likelihood criterion. In constructing the optimal sequence of features, a hidden Markov model (HMM) was employed to find the most likely combination of features with respect to the target speaker's model. The effectiveness of the proposed transformation method was evaluated using objective tests and informal listening tests. We confirmed that the proposed method leads to perceptually preferred results compared with conventional methods.
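
A minimal sketch of the two-pass unit-selection idea in the entry above, not the authors' implementation: pass 1 keeps, for each source frame, the K most similar frames from a hypothetical target-speaker database; pass 2 runs a Viterbi-style dynamic program that trades off a target cost against a concatenation cost, standing in for the paper's maximum-likelihood criterion. All distances and weights are illustrative assumptions.

```python
import numpy as np

def select_feature_sequence(src, db, k=10, w_concat=1.0):
    """src: (T, D) source features, db: (N, D) target-speaker features."""
    T = len(src)
    # Pass 1: candidate preselection by Euclidean distance.
    cand = np.array([np.argsort(np.linalg.norm(db - f, axis=1))[:k] for f in src])  # (T, k)
    target_cost = np.array([np.linalg.norm(db[cand[t]] - src[t], axis=1) for t in range(T)])
    # Pass 2: dynamic programming over the candidate indices.
    acc = target_cost[0].copy()
    back = np.zeros((T, k), dtype=int)
    for t in range(1, T):
        # Concatenation cost between every candidate pair at t-1 and t.
        cc = np.linalg.norm(db[cand[t - 1]][:, None, :] - db[cand[t]][None, :, :], axis=2)
        total = acc[:, None] + w_concat * cc          # (k, k)
        back[t] = np.argmin(total, axis=0)
        acc = total[back[t], np.arange(k)] + target_cost[t]
    # Backtrack the optimal path of candidates.
    path = [int(np.argmin(acc))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    path.reverse()
    return db[[cand[t][j] for t, j in enumerate(path)]]   # (T, D) converted features
```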

수중에서의 특징점 매칭을 위한 CNN기반 Opti-Acoustic변환 (CNN-based Opti-Acoustic Transformation for Underwater Feature Matching)

  • 장혜수;이영준;김기섭;김아영
    • 로봇학회논문지 / Vol. 15, No. 1 / pp.1-7 / 2020
  • In this paper, we introduce a methodology that uses a deep learning-based front-end to enhance underwater feature matching. Both optical cameras and sonar are widely used sensors in underwater research; however, each has its own weaknesses, such as lighting conditions and turbidity for the optical camera, and noise for sonar. To overcome these problems, we propose an opti-acoustic transformation method. Since feature detection in sonar images is challenging, we convert the sonar image into an optical-style image. While maintaining the main contents of the sonar image, the CNN-based style transfer method changes the style of the image in a way that facilitates feature detection. Finally, we verify our result using cosine similarity comparison and feature matching against the original optical image.
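
A hedged sketch of the general idea, not the authors' network: a minimal convolutional encoder-decoder in PyTorch that maps a single-channel sonar patch to an optical-style patch while keeping its content. Layer sizes, the paired-data assumption, and the L1 reconstruction loss are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn

class SonarToOptic(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(                 # compress sonar content
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(                 # render an optical-style image
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Toy forward/backward pass with random tensors standing in for paired data.
model = SonarToOptic()
sonar = torch.rand(4, 1, 128, 128)      # batch of sonar patches
optic = torch.rand(4, 1, 128, 128)      # corresponding optical patches (assumed paired)
loss = nn.functional.l1_loss(model(sonar), optic)
loss.backward()
```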

구조 기반 BPMN 모델의 Feature 모델로 변환 기법 (A mechanism for Converting BPMN model into Feature model based on syntax)

  • 송치양;김철진
    • 한국산학기술학회논문지 / Vol. 17, No. 1 / pp.733-744 / 2016
  • Existing methods for converting a BPMN model into a feature model rely on the intuition of a domain analyst, which makes automated conversion difficult and hinders feature-oriented development linked to business modeling. This paper presents a method for converting a structure-based BPMN business model into a feature domain model. For conversion between the heterogeneous BPMN (Business Process Modeling Notation) and FM (Feature Model) notations, a grouping technique based on the structure of activities is defined, and conversion rules and a conversion method between the two models are established based on their common constructs: elements (representing business functions) and structures (relationships between elements and processes). An application case is shown for an online shopping mall system. This enables a mechanical, or automated, structural conversion from a BPMN model into a feature model.
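
A hedged sketch of what such a structural conversion could look like, not the paper's rule set: grouped BPMN activities are mapped onto feature-model nodes. The specific rules below (exclusive gateway branches become alternative features, parallel and sequential activities become mandatory sub-features) are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    name: str
    variability: str = "mandatory"          # mandatory | optional | alternative
    children: list = field(default_factory=list)

def convert_group(name, kind, activities):
    """Convert one structural group of BPMN activities into a feature subtree."""
    if kind == "xor":                        # exclusive gateway -> alternative group
        kids = [Feature(a, "alternative") for a in activities]
    else:                                    # sequence or parallel -> mandatory sub-features
        kids = [Feature(a, "mandatory") for a in activities]
    return Feature(name, "mandatory", kids)

# Toy example: a fragment of an online shopping process.
payment = convert_group("Payment", "xor", ["PayByCard", "PayByBankTransfer"])
order = Feature("Order", children=[Feature("SelectItems"), payment, Feature("Ship")])
```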

Robust Histogram Equalization Using Compensated Probability Distribution

  • Kim, Sung-Tak;Kim, Hoi-Rin
    • 대한음성학회지:말소리 / Vol. 55 / pp.131-142 / 2005
  • A mismatch between the training and the test conditions often causes a drastic decrease in the performance of speech recognition systems. In this paper, non-linear transformation techniques based on histogram equalization in the acoustic feature space are studied for reducing this mismatch. The purpose of histogram equalization (HEQ) is to convert the probability distribution of the test speech into the probability distribution of the training speech. Conventional histogram equalization methods consider only the probability distribution of the test speech, but for noise-corrupted test speech this probability distribution is itself distorted. The transformation function obtained from such a distorted probability distribution may mis-transform the feature vectors, which degrades the performance of histogram equalization. Therefore, this paper proposes a new method of calculating a noise-removed probability distribution, based on the assumption that the CDF of noisy speech feature vectors consists of a speech feature vector component and a noise feature vector component, and this compensated probability distribution is used in the HEQ process. In the AURORA-2 framework, the proposed method reduced the error rate by over 44% in the clean training condition compared to the baseline system. For the multi-condition training case, the proposed methods are also better than the baseline system.
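
A minimal sketch of conventional histogram equalization on one feature dimension, the baseline the entry above builds on: each test value is mapped through the test CDF and then through the inverse training CDF (a quantile-to-quantile mapping). The paper's contribution, compensating the test CDF for the noise component before this mapping, is only indicated here by the optional `test_cdf` argument; its construction is an assumption left out of this sketch.

```python
import numpy as np

def histogram_equalize(test_feats, train_feats, test_cdf=None):
    """Map test feature values so their distribution matches the training data."""
    train_sorted = np.sort(train_feats)
    if test_cdf is None:
        # Empirical CDF of the (possibly noisy) test features.
        test_sorted = np.sort(test_feats)
        test_cdf = lambda x: np.searchsorted(test_sorted, x, side="right") / len(test_sorted)
    u = np.clip(test_cdf(test_feats), 0.0, 1.0)          # rank of each test value
    # Inverse training CDF evaluated at those ranks (quantile lookup).
    idx = np.minimum((u * (len(train_sorted) - 1)).astype(int), len(train_sorted) - 1)
    return train_sorted[idx]

# Toy usage: a shifted, scaled test distribution is pulled back toward training statistics.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 5000)
test = rng.normal(2.0, 3.0, 1000)
print(histogram_equalize(test, train).mean(), histogram_equalize(test, train).std())
```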


Z-index와 주파수 분석을 이용한 유도전동기 고장진단과 분류 (Fault Detection and Classification of Faulty Induction Motors using Z-index and Frequency Analysis)

  • 이상혁
    • 한국안전학회지 / Vol. 20, No. 3 / pp.64-70 / 2005
  • In this paper, fault detection and classification of faulty induction motors are carried out using a Z-index and frequency analysis, where the frequency analysis refers to the Fourier transform and the wavelet transform. The Z-index is defined in a form similar to an energy function, and faulty and healthy conditions are classified through the Z-index. For detection and classification, feature extraction for the fault detection of an induction motor is carried out using information from the stator current. Fourier and wavelet transforms are applied to detect the characteristics under healthy and various faulty conditions. Feature vectors are obtained from the two transformations, and the results illustrate that the feature vectors are complementary to each other.
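
A hedged sketch, not the paper's procedure: building a frequency-domain and a wavelet-domain feature vector from a simulated stator-current signal, plus a simple energy-style score standing in for the Z-index (whose exact definition the abstract does not give). The sideband locations, signal parameters, and wavelet choice are illustrative assumptions.

```python
import numpy as np
import pywt

fs = 10_000                                     # sampling rate [Hz]
t = np.arange(0, 1.0, 1 / fs)
current = np.sin(2 * np.pi * 60 * t)            # 60 Hz supply component
current += 0.05 * np.sin(2 * np.pi * 54 * t)    # assumed fault sideband
current += 0.01 * np.random.randn(t.size)       # measurement noise

# Fourier-domain features: magnitude at the supply frequency and nearby sidebands.
spec = np.abs(np.fft.rfft(current)) / t.size
freqs = np.fft.rfftfreq(t.size, 1 / fs)
fft_feats = [float(spec[np.argmin(np.abs(freqs - f))]) for f in (54, 60, 66)]

# Wavelet-domain features: energy of each decomposition sub-band.
coeffs = pywt.wavedec(current, "db4", level=5)
wav_feats = [float(np.sum(c ** 2)) for c in coeffs]

# Energy-style score over the combined feature vector (placeholder for the Z-index).
z_score = float(np.sum(np.square(fft_feats + wav_feats)))
print(fft_feats, wav_feats[:3], z_score)
```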

청각 모델에 기초한 음성 특징 추출에 관한 연구 (A study on the speech feature extraction based on the hearing model)

  • 김바울;윤석현;홍광석;박병철
    • 전자공학회논문지B / Vol. 33B, No. 4 / pp.131-140 / 1996
  • In this paper, we propose a method that extracts speech features using a hearing model through signal processing techniques. The proposed method includes the following procedure: normalization of the short-time speech block by its maximum value, multi-resolution analysis using the discrete wavelet transform and re-synthesis using the discrete inverse wavelet transform, differentiation after analysis and synthesis, full-wave rectification, and integration. In order to verify the performance of the proposed speech feature in speech recognition tasks, Korean digit recognition experiments were carried out using both DTW and the VQ-HMM. The results showed that, in the case of DTW, the recognition rates were 99.79% and 90.33% for the speaker-dependent and speaker-independent tasks respectively, and, in the case of the VQ-HMM, the rates were 96.5% and 81.5% respectively. This indicates that the proposed speech feature has the potential to serve as a simple and efficient feature for recognition tasks.
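
A minimal sketch of the processing chain listed in the entry above, not the authors' code: normalize a short-time block, split it into wavelet sub-bands, re-synthesize each band, then differentiate, full-wave rectify and integrate per band to obtain one feature value per sub-band. The wavelet family, decomposition depth, and block length are assumptions.

```python
import numpy as np
import pywt

def hearing_model_features(block, wavelet="db4", level=4):
    block = block / (np.max(np.abs(block)) + 1e-12)       # normalize by the block maximum
    coeffs = pywt.wavedec(block, wavelet, level=level)    # multi-resolution analysis
    feats = []
    for i in range(len(coeffs)):
        # Re-synthesize one sub-band by zeroing every other coefficient array.
        only_i = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
        band = pywt.waverec(only_i, wavelet)
        band = np.diff(band)                               # differentiation
        band = np.abs(band)                                # full-wave rectification
        feats.append(float(np.sum(band)))                  # integration
    return np.array(feats)

# Toy usage on a random 30 ms block at 16 kHz.
print(hearing_model_features(np.random.randn(480)))
```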


Speech Feature Extraction Based on the Human Hearing Model

  • Chung, Kwang-Woo;Kim, Paul;Hong, Kwang-Seok
    • 대한음성학회:학술대회논문집 / 대한음성학회 1996년도 10월 학술대회지 / pp.435-447 / 1996
  • In this paper, we propose a method that extracts speech features using a hearing model through signal processing techniques. The proposed method includes the following procedure: normalization of the short-time speech block by its maximum value, multi-resolution analysis using the discrete wavelet transform and re-synthesis using the discrete inverse wavelet transform, differentiation after analysis and synthesis, full-wave rectification, and integration. In order to verify the performance of the proposed speech feature in speech recognition tasks, Korean digit recognition experiments were carried out using both DTW and the VQ-HMM. The results showed that, in the case of DTW, the recognition rates were 99.79% and 90.33% for the speaker-dependent and speaker-independent tasks respectively, and, in the case of the VQ-HMM, the rates were 96.5% and 81.5% respectively. This indicates that the proposed speech feature has the potential for use as a simple and efficient feature for recognition tasks.


The extension of the largest generalized-eigenvalue based distance metric D_ij^(1) in arbitrary feature spaces to classify composite data points

  • Daoud, Mosaab
    • Genomics & Informatics / Vol. 17, No. 4 / pp.39.1-39.20 / 2019
  • Analyzing patterns in data points embedded in linear and non-linear feature spaces is considered one of the common research problems across different research areas, for example data mining, machine learning, pattern recognition, and multivariate analysis. In this paper, data points are heterogeneous sets of biosequences (composite data points). A composite data point is a set of ordinary data points (e.g., a set of feature vectors). We theoretically extend the derivation of the largest generalized-eigenvalue based distance metric $D_{ij}^{(1)}$ to any linear and non-linear feature space. We prove that $D_{ij}^{(1)}$ is a metric under any linear and non-linear feature transformation function. We show the sufficiency and efficiency of using the decision rule $\bar{\delta}_{\Xi_i}$ (i.e., the mean of $D_{ij}^{(1)}$) in the classification of heterogeneous sets of biosequences, compared with the decision rules $\min_{\Xi_i}$ and $\mathrm{median}_{\Xi_i}$. We analyze the impact of linear and non-linear transformation functions on classifying/clustering collections of heterogeneous sets of biosequences. The impact of the sequence length in a heterogeneous sequence set generated by simulation on the classification and clustering results in linear and non-linear feature spaces is shown empirically in this paper. We propose a new concept: the limiting dispersion map of the existing clusters in heterogeneous sets of biosequences embedded in linear and non-linear feature spaces, based on the limiting distribution of nucleotide compositions estimated from real data sets. Finally, empirical conclusions and scientific evidence are deduced from the experiments to support the theoretical side stated in this paper.
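
A hedged sketch of the decision-rule comparison described above, with a placeholder distance standing in for $D_{ij}^{(1)}$ (whose derivation is in the paper, not reproduced here): a query composite data point, i.e. a set of feature vectors, is assigned to the class whose aggregated distances (mean, min, or median) are smallest. The centroid-based placeholder distance and the simulated data are assumptions.

```python
import numpy as np

def placeholder_distance(set_a, set_b):
    """Stand-in for D_ij^(1): Euclidean distance between set centroids."""
    return float(np.linalg.norm(set_a.mean(axis=0) - set_b.mean(axis=0)))

def classify(query_set, classes, rule="mean"):
    """classes: {label: [composite points]}, each composite point an (n_i, D) array."""
    agg = {"mean": np.mean, "min": np.min, "median": np.median}[rule]
    scores = {
        label: agg([placeholder_distance(query_set, member) for member in members])
        for label, members in classes.items()
    }
    return min(scores, key=scores.get)

# Toy usage with two classes of simulated composite data points.
rng = np.random.default_rng(1)
classes = {
    "A": [rng.normal(0, 1, (20, 4)) for _ in range(5)],
    "B": [rng.normal(3, 1, (20, 4)) for _ in range(5)],
}
query = rng.normal(0, 1, (15, 4))
print(classify(query, classes, rule="mean"), classify(query, classes, rule="min"))
```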

거리변환법에 의한 한글패턴의 특징분류 (Feature Classification of Hanguel Patterns by Distance Transformation method)

  • 고찬;이대영
    • 한국통신학회논문지 / Vol. 14, No. 6 / pp.650-662 / 1989
  • This paper proposes a new feature extraction and classification algorithm for Hangul character patterns. An input pattern is classified into one of the six basic Hangul types, grapheme separation is performed, and curvature feature points are then extracted according to the position of each grapheme. The content of the input character is defined by these feature points and organized into an indexed-sequential file, and recognition is carried out by searching this file against a standard dictionary file. The simple algorithm shortens processing time and makes the software easy to write. The experimental results show the feature extraction and classification outcomes for the input patterns. The proposed algorithm is characterized by applying a distance transformation within the minimum bounding rectangle of a character to extract curvature features and by using their relative position information; experiments showed a recognition rate of 97%.
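
A hedged sketch illustrating the kind of processing the entry above describes, not the authors' algorithm: crop a binary character pattern to its minimum bounding rectangle, apply a distance transform inside it, and keep the relative positions and values of the strongest responses as simple shape features. Using the Euclidean distance transform and a fixed number of feature points are assumptions.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def distance_transform_features(binary_img, n_points=8):
    ys, xs = np.nonzero(binary_img)
    box = binary_img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]    # bounding rectangle
    dist = distance_transform_edt(box)                                 # distance to background
    flat = np.argsort(dist, axis=None)[::-1][:n_points]                # strongest interior points
    rows, cols = np.unravel_index(flat, dist.shape)
    h, w = box.shape
    # Relative positions inside the box plus the distance value at each point.
    return np.column_stack([rows / h, cols / w, dist[rows, cols]])

# Toy usage on a filled rectangle standing in for a character stroke pattern.
img = np.zeros((32, 32), dtype=bool)
img[8:24, 10:22] = True
print(distance_transform_features(img))
```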


PCA 기반 변환을 통한 다해상도 피처 맵 압축 방법 (A Feature Map Compression Method for Multi-resolution Feature Map with PCA-based Transformation)

  • 박승진;이민훈;최한솔;김민섭;오승준;김연희;도지훈;정세윤;심동규
    • 방송공학회논문지 / Vol. 27, No. 1 / pp.56-68 / 2022
  • In this paper, we propose a compression method for multi-resolution feature maps for VCM (Video Coding for Machines). The proposed method removes redundancy across the channels and resolution layers of the multi-resolution feature map through a PCA-based transformation, and compresses the basis vectors and the mean vector used in the transformation, as well as the transform coefficients obtained from it, with a VVC-based codec and DeepCABAC according to the characteristics of each. To evaluate the performance of the proposed method, object detection performance is measured on OpenImageV6 and the COCO 2017 validation set, and bpp and mAP are compared in terms of BD-rate against the MPEG-VCM anchor and the feature map compression anchor proposed in this paper. Experimental results show that the proposed method achieves a 25.71% BD-rate gain over the feature map compression anchor on OpenImageV6 and, in particular, up to a 43.72% BD-rate gain over the MPEG-VCM anchor for large objects in the COCO 2017 validation set.
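
A hedged sketch of the PCA stage only, not the paper's full pipeline: the channels of one resolution layer's feature map are treated as samples, redundancy is removed with a truncated PCA, and the coefficients are coarsely quantized as a stand-in for the VVC/DeepCABAC coding mentioned above. The number of kept components and the quantization step are illustrative assumptions.

```python
import numpy as np

def pca_compress(feature_map, n_components=16, q_step=0.05):
    """feature_map: (C, H, W) tensor from one resolution layer."""
    C, H, W = feature_map.shape
    X = feature_map.reshape(C, H * W)
    mean = X.mean(axis=0)                               # mean vector (transmitted)
    Xc = X - mean
    # PCA basis from the SVD of the centered channel matrix.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    basis = Vt[:n_components]                           # basis vectors (transmitted)
    coeffs = Xc @ basis.T                               # (C, n_components) transform coefficients
    coeffs_q = np.round(coeffs / q_step)                # crude stand-in for entropy coding input
    return mean, basis, coeffs_q

def pca_decompress(mean, basis, coeffs_q, shape, q_step=0.05):
    C, H, W = shape
    X_hat = (coeffs_q * q_step) @ basis + mean
    return X_hat.reshape(C, H, W)

# Toy usage: compress and reconstruct a random 256-channel feature map.
fmap = np.random.randn(256, 32, 32).astype(np.float32)
mean, basis, cq = pca_compress(fmap)
rec = pca_decompress(mean, basis, cq, fmap.shape)
print(float(np.mean((fmap - rec) ** 2)))               # reconstruction error
```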