• Title/Abstract/Keywords: Acoustic feature

238 search results; processing time 0.055 seconds

Improvement and Evaluation of the Korean Large Vocabulary Continuous Speech Recognition Platform (ECHOS)

  • 권석봉;윤성락;장규철;김용래;김봉완;김회린;유창동;이용주;권오욱
    • 대한음성학회지:말소리 / No. 59 / pp.53-68 / 2006
  • We report the evaluation results of the Korean speech recognition platform called ECHOS. The platform has an object-oriented and reusable architecture so that researchers can easily evaluate their own algorithms. It includes all the intrinsic modules needed to build a large vocabulary speech recognizer: noise reduction, end-point detection, feature extraction, hidden Markov model (HMM)-based acoustic modeling, cross-word modeling, n-gram language modeling, n-best search, word graph generation, and Korean-specific language processing. The platform supports both lexical search trees and finite-state networks. It performs word-dependent n-best search with a bigram in the forward search stage, and rescores the lattice with a trigram in the backward stage. In an 8000-word continuous speech recognition task, the platform with a lexical tree increases word errors by 40% but reduces recognition time by 50% compared with the HTK platform with a flat lexicon. ECHOS reduces recognition errors by 40% through the incorporation of cross-word modeling. With the number of Gaussian mixtures increased to 16, it yields word accuracy comparable to that of the previous lexical tree-based platform, Julius.
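
The word-error comparisons in this abstract are typically expressed through the word error rate (WER). The following is a minimal sketch, assuming the standard edit-distance definition of WER and assuming that the quoted 40% figures are relative changes; the abstract does not state how the ECHOS scores were computed. The function names and the sample sentences are illustrative, not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_change(baseline: float, new: float) -> float:
    """Relative change of `new` with respect to `baseline` (0.4 means +40%)."""
    return (new - baseline) / baseline


# Illustrative sentences only: one substitution and one deletion in six words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Under this reading, a recognizer whose WER rises from 0.10 to 0.14 shows a 40% relative increase in word errors, which is how `relative_change(0.10, 0.14)` would report it.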

Stress Detection and Classification of Laying Hens by Sound Analysis

  • Lee, Jonguk;Noh, Byeongjoon;Jang, Suin;Park, Daihee;Chung, Yongwha;Chang, Hong-Hee
    • Asian-Australasian Journal of Animal Sciences / Vol. 28, No. 4 / pp.592-598 / 2015
  • Stress adversely affects the wellbeing of commercial chickens, and comes with an economic cost to the industry that cannot be ignored. In this paper, we first develop an inexpensive and non-invasive, automatic online-monitoring prototype that uses sound data to notify producers of a stressful situation in a commercial poultry facility. The proposed system is structured hierarchically with three binary-classifier support vector machines. First, it selects an optimal acoustic feature subset from the sound emitted by the laying hens. The detection and classification module detects the stress from changes in the sound and classifies it into subsidiary sound types, such as physical stress from changes in temperature, and mental stress from fear. Finally, an experimental evaluation was performed using real sound data from an audio-surveillance system. The accuracy in detecting stress approached 96.2%, and the classification model was validated, confirming that the average classification accuracy was 96.7%, and that its recall and precision measures were satisfactory.

Condition Monitoring of an LCD Glass Transfer Robot Based on Wavelet Packet Transform and Artificial Neural Network for Abnormal Sound

  • 김의열;이상권;장지욱
    • 대한기계학회논문집A / Vol. 36, No. 7 / pp.813-822 / 2012
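  • Abnormal operating sound radiated from a process robot on an LCD production line is used to detect faults in the robot. The advantage of this signal is that, despite its relatively low sensitivity, multiple faults can be identified using only a microphone. This paper uses the wavelet packet transform (WPT) to extract fault features and an artificial neural network (ANN) to classify the faults. The results show that abnormal operating sound can be used effectively for fault diagnosis of machine elements.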
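
As an illustration of the feature-extraction step, the sketch below computes sub-band energies from a full wavelet packet tree. It assumes a Haar wavelet and plain Python lists for simplicity; the abstract does not name the wavelet, the decomposition depth, or the features fed to the ANN, so `haar_split`, `wavelet_packet_energies`, and the test signal are hypothetical.

```python
import math


def haar_split(signal):
    """One orthonormal Haar step: return (approximation, detail), each half as long."""
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def wavelet_packet_energies(signal, levels):
    """Split every node at every level and return the energy of each leaf node.

    Leaves are returned in natural (filter-bank) order, not sorted by frequency.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return [sum(x * x for x in node) for node in nodes]


# Hypothetical microphone frame: a sampled sinusoid of 64 samples.
frame = [math.sin(0.3 * n) for n in range(64)]
band_energies = wavelet_packet_energies(frame, levels=3)  # 8 sub-band energies
```

Because the Haar step is orthonormal, the sub-band energies sum to the energy of the original frame, so the resulting vector can be normalized and passed to a classifier as a fixed-length feature.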

Implementation of HMM-Based Speech Recognizer Using TMS320C6711 DSP

  • Bae Hyojoon;Jung Sungyun;Son Jongmok;Kwon Hongseok;Kim Siho;Bae Keunsung
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2004 ICEIC The International Conference on Electronics Informations and Communications / pp.391-394 / 2004
  • This paper focuses on the DSP implementation of an HMM-based, speaker-independent speech recognizer that can handle a vocabulary of several hundred words. First, we develop an HMM-based speech recognition system on the PC that operates frame by frame, with feature extraction and Viterbi decoding processed in parallel to keep the processing delay as small as possible. Techniques such as linear discriminant analysis, state-based Gaussian selection, and a phonetic tied mixture model are employed to reduce the computational burden and memory size. The system is then optimized and compiled on the TMS320C6711 DSP for real-time operation. The implemented system uses 486 kbytes of memory for data and acoustic models, and 24.5 kbytes for program code. A maximum processing time of 29.2 ms per 32 ms frame of speech confirms that the implemented system operates in real time.
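
The real-time claim can be checked with simple arithmetic. The sketch below assumes that a new frame arrives every 32 ms, so processing must finish within that budget; if the recognizer used overlapping frames, the budget would be the frame shift instead, which the abstract does not state. The function name is illustrative.

```python
def real_time_factor(processing_ms: float, frame_ms: float) -> float:
    """Ratio of worst-case processing time to the time budget of one frame."""
    return processing_ms / frame_ms


# Figures quoted in the abstract: 29.2 ms of processing per 32 ms frame.
rtf = real_time_factor(29.2, 32.0)
headroom_ms = 32.0 - 29.2          # time left over per frame
runs_in_real_time = rtf < 1.0      # True when processing keeps up with the input
```

A factor below 1.0 means the decoder finishes each frame before the next one is due; here the margin is under 3 ms per frame.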

Condition Monitoring of Micro Endmill using C-means Algorithm

  • 권동희;정연식;강익수;김전하;김정석
    • 한국공작기계학회:학술대회논문집 / 한국공작기계학회 2005 Spring Conference Proceedings / pp.162-167 / 2005
  • Recently, advanced industries that use micro parts have been growing rapidly. Micro endmilling is a prominent technology with a wide range of applications, from macro to micro parts. Micro-grooving by micro endmilling is also widely used owing to its many merits, but tool wear and tool fracture cause problems with the precision and quality of products. This study deals with condition monitoring using the acoustic emission (AE) signal in micro-grooving. First, features of the AE signal that are directly related to the machining process are extracted. Then, the micro endmill state corresponding to each tool condition is classified using the fuzzy C-means algorithm, a method for recognizing data patterns. The results show that AE sensing is an effective method for monitoring the micro endmill state and can be expected to be applicable to micro machining processes in the future.
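
The classification step can be sketched with a textbook fuzzy C-means loop. This is a minimal illustration, assuming a fuzzifier of m = 2, Euclidean distance, and fixed initial centers; the abstract does not give the AE features, the number of clusters, or the parameters actually used, so the two-dimensional feature vectors and all names below are hypothetical.

```python
import math


def memberships_for(points, centers, m=2.0, eps=1e-12):
    """Membership u[k][i] of point k in cluster i; each row sums to 1."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clamp distances so a point sitting exactly on a center does not divide by zero.
        dists = [max(math.dist(p, c), eps) for c in centers]
        rows.append([1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists])
    return rows


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = [list(c) for c in initial_centers]
    dims = len(points[0])
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        for i in range(len(centers)):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            centers[i] = [sum(w * p[d] for w, p in zip(weights, points)) / total
                          for d in range(dims)]
    return centers, memberships_for(points, centers, m)


# Hypothetical AE feature vectors (for example, two normalized signal statistics)
# for a tool in a "normal" state and a "worn" state.
features = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, memberships = fuzzy_c_means(features, initial_centers=[features[0], features[3]])
```

Unlike hard clustering, each sample keeps a degree of membership in every cluster, so a tool in transition between conditions shows up as a sample with split memberships rather than being forced into one state.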

Space Charge Behavior of Oil-Impregnated Paper Insulation Aging at AC-DC Combined Voltages

  • Li, Jian;Wang, Yan;Bao, Lianwei
    • Journal of Electrical Engineering and Technology / Vol. 9, No. 2 / pp.635-642 / 2014
  • The space charge behavior of oil-paper insulation affects the stability and security of oil-filled converter transformers used in traditional and new energy systems. This paper presents the results of the electrical aging of oil-impregnated paper under AC-DC combined voltages, obtained with the pulsed electro-acoustic technique. Based on the experimental results in the first stage, data mining and feature extraction were performed to examine the influence of electrical aging on charge dynamics. Characteristic parameters such as total charge injection and apparent charge mobility were calculated. The influence of electrical aging on the trap energy distribution of an oil-paper insulation system was analyzed and discussed. A longer electrical aging time would increase the depth and energy density of charge traps, which reduces the apparent charge mobility and increases the probability of hot electron formation. This mechanism would accelerate damage to the cellulose and the formation of discharge channels, intensify the electric field distortion, and shorten insulation lifetime under AC-DC combined voltages.

The Prosodic Characteristics of Korean Read Sentences in Discourse Context

  • 성철재
    • 대한음성학회지:말소리 / Nos. 35-36 / pp.1-12 / 1998
  • This study investigates the prosodic characteristics of Korean discourse sentences, focusing on the initial and final parts of a sentence. Fifty discourse sentences were read in two styles: sentence by sentence, and continuously as a single passage. We first derived two kinds of ratios from the acoustic measurements: the ratio of the final syllable to the initial syllable in the first word of a sentence, and the ratio of the final syllable to the initial syllable in the last word of a sentence. We then calculated statistics for these ratios, including the mean, standard deviation, minimum, maximum, and p-values from t-tests. With respect to duration, there was little difference between the two styles. If anything, there was a slight durational irregularity at the beginning of sentences in continuous reading, that is, some deviation from the norm. For F0, there was a prominent statistical difference between the ratios of the last words in the two styles. This difference might serve as a prosodic feature. Energy appears to follow a pattern similar to that of F0. The results showed that the final syllable of the last word was pronounced at about 85% of the level of the initial syllable in the same context, and that the last words in continuous speech were strongly articulated compared with those in sentence-by-sentence reading.

Efficacy of intensive treatment of dysarthria for people with multiple system atrophy

  • 박영미
    • 말소리와 음성과학 / Vol. 10, No. 4 / pp.163-171 / 2018
  • Mixed dysarthria with combinations of hypokinetic, ataxic, and spastic components is a common clinical feature of multiple system atrophy (MSA). Because dysarthria progresses rapidly after diagnosis, people with MSA experience difficulty with verbal communication, which eventually harms their quality of life. In this study, SPEAK OUT!®, an intensive 1:1 treatment of dysarthria for improving functional communicative ability, was provided to twelve people with MSA. To evaluate the efficacy of SPEAK OUT!® in people with MSA, aerodynamic, acoustic, and perceptual analyses were conducted. Pre- and post-therapy data included maximum phonation time, vocal intensity, and fundamental frequency during sustained /a/ phonation and passage reading; frequency range between high /a/ and low /a/ phonation; jitter, shimmer, and HNR for vocal quality; speech rate during passage reading; and perceptual evaluation scores for articulation precision and intonation. The participants achieved statistically significant improvement in vocal intensity, pitch range, vocal quality, speech rate, and speech intelligibility. In conclusion, SPEAK OUT!® is a feasible treatment that can effectively improve the speech ability of people with MSA.
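
Two of the vocal-quality measures named in this abstract, jitter and shimmer, can be illustrated with a short calculation. The sketch assumes the common "local" variants (mean absolute difference between consecutive cycles divided by the mean value); the abstract does not say which variant or which analysis software was used, and the period and amplitude values below are made up for illustration.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    mean_abs_diff = sum(abs(b - a) for a, b in zip(values[:-1], values[1:])) / (len(values) - 1)
    mean_value = sum(values) / len(values)
    return mean_abs_diff / mean_value


def jitter_local(periods_s):
    """Cycle-to-cycle variation of the glottal period (a fraction; multiply by 100 for %)."""
    return local_perturbation(periods_s)


def shimmer_local(peak_amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (a fraction; multiply by 100 for %)."""
    return local_perturbation(peak_amplitudes)


# Hypothetical measurements from four consecutive cycles of a sustained /a/.
periods = [0.0100, 0.0102, 0.0098, 0.0101]   # seconds
amplitudes = [0.80, 0.82, 0.79, 0.81]        # arbitrary linear units
jitter = jitter_local(periods)
shimmer = shimmer_local(amplitudes)
```

Lower values indicate a steadier voice, so an improvement in vocal quality after therapy would appear as a decrease in both measures.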

Cosmological constraints using BAO - From spectroscopic to photometric catalogues

  • Sridhar, Srivatsan
    • 천문학회보 / Vol. 44, No. 1 / pp.56.2-56.2 / 2019
  • Measurement of the location of the baryon acoustic oscillation (BAO) feature in the clustering of galaxies has proven to be a robust and precise method to measure the expansion of the Universe. The best constraints so far have come from spectroscopic surveys because the errors on the redshift obtained from spectroscopy are minimal. This in turn means that the errors along the line-of-sight are reduced, so one can expect constraints on both the angular diameter distance $D_A$ and the expansion rate $H^{-1}$. However, future surveys will probe a larger part of the sky and go to deeper redshifts, which corresponds to a larger number of galaxies. Analysing each galaxy using spectroscopy, which is a time-consuming task, will not be practical. Photometry will therefore be the most convenient way to measure redshifts for future surveys such as LSST and Euclid. The advantage of photometry is that it measures the redshifts of a vast number of galaxies in a single exposure; the disadvantage is the error associated with the measured redshifts. Using a wedge approach, wherein the clustering is split into different wedges along the line-of-sight $\pi$ and across the line-of-sight $\sigma$, we show that the BAO information can be recovered even for photometric catalogues with errors along the line-of-sight. This means that we can obtain cosmological distance constraints even without spectroscopic information.

Automatic detection and severity prediction of chronic kidney disease using machine learning classifiers

  • 문지현;김선희;김명주;류지원;김세중;정민화
    • 말소리와 음성과학 / Vol. 14, No. 4 / pp.45-56 / 2022
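  • This paper proposes an optimal methodology for automatically detecting chronic kidney disease and predicting its severity from patients' voices. The voices of patients with chronic kidney disease change owing to factors such as weakened respiratory muscles and vocal fold edema. Previous studies have analyzed these patients' voices phonetically, but none has attempted to classify them. In this paper, the voices were classified using three utterance types (sustained vowels, sentences consisting of voiced sounds, and general sentences), three feature sets (a handcrafted feature set, eGeMAPS, and CNN-extracted features), and two machine learning classifiers (SVM and XGBoost). A total of 1,523 utterances, amounting to 3 hours 26 minutes 25 seconds, were used in the experiments. The F1-score was 0.93 for automatic detection, 0.89 for three-class severity prediction, and 0.84 for five-class severity prediction, and in every task the combination of general sentence utterances, the handcrafted feature set, and XGBoost performed best. This suggests that general sentence utterances, which can reflect all of a speaker's speech characteristics, together with an appropriate feature set extracted from them, are effective for the automatic classification of chronic kidney disease speech.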
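
The F1-scores reported in this abstract combine precision and recall. The sketch below shows the per-class calculation and a macro average for the multi-class case; the abstract does not state which averaging the authors used, so macro averaging is an assumption, and the labels and predictions are invented for illustration.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical detection task with five utterances.
truth = ["ckd", "ckd", "ckd", "healthy", "healthy"]
predicted = ["ckd", "ckd", "healthy", "healthy", "ckd"]
ckd_f1 = f1_for_label(truth, predicted, "ckd")
overall = macro_f1(truth, predicted)
```

The same functions apply unchanged to the three-class and five-class severity tasks, since `macro_f1` simply averages over whatever labels appear in the data.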