• Title/Summary/Keyword: Acoustic Feature


Improvement and Evaluation of the Korean Large Vocabulary Continuous Speech Recognition Platform (ECHOS) (한국어 음성인식 플랫폼(ECHOS)의 개선 및 평가)

  • Kwon, Suk-Bong; Yun, Sung-Rack; Jang, Gyu-Cheol; Kim, Yong-Rae; Kim, Bong-Wan; Kim, Hoi-Rin; Yoo, Chang-Dong; Lee, Yong-Ju; Kwon, Oh-Wook
    • MALSORI / no.59 / pp.53-68 / 2006
  • We report the evaluation results of the Korean speech recognition platform called ECHOS. The platform has an object-oriented and reusable architecture so that researchers can easily evaluate their own algorithms. The platform provides all the modules needed to build a large vocabulary speech recognizer: noise reduction, end-point detection, feature extraction, hidden Markov model (HMM)-based acoustic modeling, cross-word modeling, n-gram language modeling, n-best search, word graph generation, and Korean-specific language processing. The platform supports both lexical search trees and finite-state networks. It performs word-dependent n-best search with a bigram in the forward search stage, and rescores the lattice with a trigram in the backward stage. In an 8000-word continuous speech recognition task, the platform with a lexical tree increases word errors by 40% but reduces recognition time by 50% compared to the HTK platform with a flat lexicon. ECHOS reduces recognition errors by 40% through the incorporation of cross-word modeling. With the number of Gaussian mixtures increased to 16, it yields word accuracy comparable to the previous lexical tree-based platform, Julius.
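
The word-error comparisons in this abstract rest on a word error rate (WER) style metric. The abstract does not say how scoring was done, so the following is only a minimal sketch that assumes the usual edit-distance definition; the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Word accuracy is then commonly reported as 1 - WER.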


Stress Detection and Classification of Laying Hens by Sound Analysis

  • Lee, Jonguk; Noh, Byeongjoon; Jang, Suin; Park, Daihee; Chung, Yongwha; Chang, Hong-Hee
    • Asian-Australasian Journal of Animal Sciences / v.28 no.4 / pp.592-598 / 2015
  • Stress adversely affects the wellbeing of commercial chickens, and comes with an economic cost to the industry that cannot be ignored. In this paper, we first develop an inexpensive and non-invasive, automatic online-monitoring prototype that uses sound data to notify producers of a stressful situation in a commercial poultry facility. The proposed system is structured hierarchically with three binary-classifier support vector machines. First, it selects an optimal acoustic feature subset from the sound emitted by the laying hens. The detection and classification module detects the stress from changes in the sound and classifies it into subsidiary sound types, such as physical stress from changes in temperature, and mental stress from fear. Finally, an experimental evaluation was performed using real sound data from an audio-surveillance system. The accuracy in detecting stress approached 96.2%, and the classification model was validated, confirming that the average classification accuracy was 96.7%, and that its recall and precision measures were satisfactory.

Condition Monitoring of an LCD Glass Transfer Robot Based on Wavelet Packet Transform and Artificial Neural Network for Abnormal Sound (LCD 라인의 음향 특성신호에 웨이브렛 변환과 인경신경망회로를 적용한 공정로봇의 건정성 감시 연구)

  • Kim, Eui-Youl; Lee, Sang-Kwon; Jang, Ji-Uk
    • Transactions of the Korean Society of Mechanical Engineers A / v.36 no.7 / pp.813-822 / 2012
  • Abnormal operating sounds radiated from a moving transfer robot in LCD (liquid crystal display) production lines are used for fault detection of the robot, instead of other source signals such as vibrations, acoustic emissions, and electrical signals. The advantage of sound as a source signal is that multiple faults can be monitored using only a microphone, despite its relatively low sensitivity. The wavelet packet transform is employed for feature extraction and an artificial neural network for fault classification. The results indicate that the abnormal operating sound is sufficiently useful as a source signal for the fault diagnosis of mechanical components, as are the other source signals.
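
To illustrate wavelet-packet feature extraction in general terms, here is a minimal sketch. It assumes the Haar wavelet and uses the energy of each terminal node as the feature vector; the abstract specifies neither choice, and the function names are hypothetical.

```python
import math

def haar_split(signal):
    """One Haar analysis step: return (approximation, detail) at half length."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def wavelet_packet_energies(signal, levels):
    """Split every node (not only the approximation) `levels` times
    and return the energy of each terminal node."""
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return [sum(x * x for x in node) for node in nodes]
```

Because the Haar transform is orthonormal, the node energies sum to the energy of the input signal, so the vector describes how that energy is distributed across sub-bands. A classifier such as a neural network could take this vector as input.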

Implementation of HMM-Based Speech Recognizer Using TMS320C6711 DSP

  • Bae Hyojoon; Jung Sungyun; Son Jongmok; Kwon Hongseok; Kim Siho; Bae Keunsung
    • Proceedings of the IEEK Conference / summer / pp.391-394 / 2004
  • This paper focuses on the DSP implementation of an HMM-based speech recognizer that handles a vocabulary of several hundred words and is speaker independent. First, we develop an HMM-based speech recognition system on the PC that operates on a frame-by-frame basis, with feature extraction and Viterbi decoding processed in parallel to keep the processing delay as small as possible. Techniques such as linear discriminant analysis, state-based Gaussian selection, and a phonetic tied mixture model are employed to reduce the computational burden and memory size. The system is then optimized and compiled for the TMS320C6711 DSP for real-time operation. The implemented system uses 486 kbytes of memory for data and acoustic models, and 24.5 kbytes for program code. A maximum required time of 29.2 ms for processing a 32 ms frame of speech validates the real-time operation of the implemented system.
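
The frame-synchronous Viterbi decoding mentioned in this abstract can be sketched as follows. This is a generic log-domain Viterbi pass over precomputed per-frame state log-likelihoods, not the authors' DSP code; the function name and argument layout are assumptions made for illustration.

```python
def viterbi(log_init, log_trans, log_emit):
    """Return (best_state_path, best_log_score).

    log_init[s]     : log prior of starting in state s
    log_trans[p][s] : log probability of moving from state p to state s
    log_emit[t][s]  : log-likelihood of frame t under state s
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backpointers = []
    # Process one frame at a time, as a frame-based recognizer would.
    for frame in log_emit[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s] + frame[s])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    best_score = score[state]
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, best_score
```

With the figures reported in the abstract, the worst-case real-time factor is 29.2 / 32 ≈ 0.91, which is below 1.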


Condition Monitoring of Micro Endmill using C-means Algorithm (C-means 알고리즘을 이용한 마이크로 엔드밀의 상태 감시)

  • Kwon Dong-Hee; Jeong Yun-Shick; Kang Ik-Soo; Kim Jeon-Ha; Kim Jeong-Suk
    • Proceedings of the Korean Society of Machine Tool Engineers Conference / 2005.05a / pp.162-167 / 2005
  • Advanced industries that use micro parts have been growing rapidly. Micro endmilling is a prominent technology with a wide range of applications, from macro to micro parts. Micro-grooving by micro endmilling is also widely used because of its many merits, but tool wear and tool fracture cause problems with product precision and quality. This study deals with condition monitoring using the acoustic emission (AE) signal in micro-grooving. First, features of the AE signal that are directly related to the machining process are extracted. Then, the micro endmill state corresponding to each tool condition is classified using the fuzzy C-means algorithm, a method for recognizing data patterns. The results indicate that AE sensing is an effective method for monitoring the micro endmill state and can be expected to be applicable to micro machining processes in the future.
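
As a rough illustration of the fuzzy C-means step, the sketch below clusters a single scalar feature. The abstract does not list the AE features used, so the 1-D feature, the fuzzifier `m=2.0`, the caller-supplied initial centers, and the function names are all assumptions.

```python
def membership_row(x, centers, m):
    """Fuzzy memberships of point x in each cluster (values sum to 1)."""
    dists = [abs(x - c) for c in centers]
    if 0.0 in dists:
        # The point coincides with one or more centers: share membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]

def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """1-D fuzzy C-means. Returns (centers, memberships)."""
    centers = list(initial_centers)
    for _ in range(iterations):
        memberships = [membership_row(x, centers, m) for x in points]
        # Each center is the mean of the points weighted by membership**m.
        centers = [
            sum((u[k] ** m) * x for u, x in zip(memberships, points))
            / sum(u[k] ** m for u in memberships)
            for k in range(len(centers))
        ]
    memberships = [membership_row(x, centers, m) for x in points]
    return centers, memberships
```

Unlike hard clustering, each sample receives a graded membership in every cluster, which suits tool states that change gradually, such as progressive wear.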


Space Charge Behavior of Oil-Impregnated Paper Insulation Aging at AC-DC Combined Voltages

  • Li, Jian; Wang, Yan; Bao, Lianwei
    • Journal of Electrical Engineering and Technology / v.9 no.2 / pp.635-642 / 2014
  • The space charge behaviors of oil-paper insulation affect the stability and security of oil-filled converter transformers of traditional and new energies. This paper presents the results of the electrical aging of oil-impregnated paper under AC-DC combined voltages by the pulsed electro-acoustic technique. Data mining and feature extractions were performed on the influence of electrical aging on charge dynamics based on the experiment results in the first stage. Characteristic parameters such as total charge injection and apparent charge mobility were calculated. The influences of electrical aging on the trap energy distribution of an oil-paper insulation system were analyzed and discussed. Longer electrical aging time would increase the depth and energy density of charge trap, which decelerates the apparent charge mobility and increases the probability of hot electron formation. This mechanism would accelerate damage to the cellulose and the formation of discharge channels, enhance the acceleration of the electric field distortion, and shorten insulation lifetime under AC-DC combined voltages.

The Prosodic Characteristics of Korean Read Sentences in Discourse Context (한국어 낭독체 담화문의 운율적 특징 - 단독발화문과 연속발화문의 비교를 통하여 -)

  • Seong Cheol-Jae
    • MALSORI / no.35_36 / pp.1-12 / 1998
  • This study investigates the prosodic characteristics of Korean discourse sentences, focusing on the initial and final parts of a sentence. Fifty discourse sentences were read in two different styles: sentence by sentence, and continuously as a single passage of all 50. From the acoustic results we obtained two kinds of ratios: first, the ratio of the final syllable to the initial syllable in the first word of a sentence; second, the ratio of the final syllable to the initial syllable in the last word of a sentence. We then calculated statistics of the ratios, including mean, standard deviation, minimum, maximum, and p-values from t-tests. With respect to duration, there was little difference between the two styles. If anything, the sentence-initial position in continuous reading showed a slight durational irregularity; that is, some deviation from the standard pattern could be observed. For F0, there was a prominent statistical difference between the ratios of the last words in the two styles. This difference might serve as a prosodic feature. Energy seems to show a pattern similar to that of F0. The results showed that the final syllable in the last word was pronounced at about 85% of the initial syllable in the same context, and that the last words in continuous speech were strongly articulated compared with those in sentence-by-sentence reading.


Efficacy of intensive treatment of dysarthria for people with multiple system atrophy (다계통위축증 환자를 대상으로 한 마비말장애 집중 치료의 효과)

  • Park, Youngmi
    • Phonetics and Speech Sciences / v.10 no.4 / pp.163-171 / 2018
  • Mixed dysarthria with combinations of hypokinetic, ataxic, and spastic components is a common clinical feature of multiple system atrophy (MSA). Because dysarthria progresses rapidly after diagnosis, people with MSA experience difficulty with verbal communication, which eventually affects their quality of life negatively. In this study, SPEAK OUT!®, an intensive 1:1 treatment of dysarthria for improving functional communicative ability, was provided to twelve people with MSA. To evaluate the efficacy of SPEAK OUT!® in people with MSA, aerodynamic, acoustic, and perceptual analyses were conducted. Pre- and post-therapy data included maximum phonation time, vocal intensity, and fundamental frequency during sustained /a/ phonation and passage reading; frequency range between high /a/ and low /a/ phonation; jitter, shimmer, and HNR for vocal quality; speech rate during passage reading; and perceptual evaluation scores for articulation precision and intonation. The participants achieved statistically significant improvement in vocal intensity, pitch range, vocal quality, speech rate, and speech intelligibility. In conclusion, SPEAK OUT!® is a feasible treatment for people with MSA to efficaciously improve their speech ability.
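
Jitter and shimmer, used here as vocal quality measures, quantify cycle-to-cycle variation in pitch period and amplitude. The abstract does not state which variants were computed. The sketch below assumes the common "local" definition (mean absolute difference between consecutive cycles, relative to the mean), and its function names are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values,
    relative to the mean value, expressed in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value

def jitter_percent(periods_s):
    """Local jitter from consecutive glottal cycle durations (seconds)."""
    return local_perturbation(periods_s)

def shimmer_percent(peak_amplitudes):
    """Local shimmer from the peak amplitude of consecutive cycles."""
    return local_perturbation(peak_amplitudes)
```

Both functions expect cycle-level measurements that have already been extracted from a sustained vowel; the pitch-marking step itself is not shown.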

Cosmological constraints using BAO - From spectroscopic to photometric catalogues

  • Sridhar, Srivatsan
    • The Bulletin of The Korean Astronomical Society / v.44 no.1 / pp.56.2-56.2 / 2019
  • Measurement of the location of the baryon acoustic oscillation (BAO) feature in the clustering of galaxies has proven to be a robust and precise method to measure the expansion of the Universe. The best constraints so far have come from spectroscopic surveys, because the errors on redshifts obtained from spectroscopy are minimal. This in turn means that the errors along the line-of-sight are reduced, so one can expect constraints on both the angular diameter distance $D_A$ and the expansion rate $H^{-1}$. However, future surveys will probe a larger part of the sky and go to deeper redshifts, which means a larger number of galaxies. Analysing each galaxy using spectroscopy, which is a time-consuming task, will not be practical. Photometry will therefore be the most convenient way to measure redshifts for future surveys such as LSST and Euclid. The advantage of photometry is that it measures the redshifts of a vast number of galaxies in a single exposure; the disadvantage is the errors associated with the measured redshifts. Using a wedge approach, wherein the clustering is split into different wedges along the line-of-sight $\pi$ and across the line-of-sight $\sigma$, we show that the BAO information can be recovered even for photometric catalogues with errors along the line-of-sight. This means that we can obtain cosmological distance constraints even without spectroscopic information.
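
For readers unfamiliar with the $\pi$ and $\sigma$ decomposition, the sketch below shows one standard way to split a galaxy-pair separation into components along and across the line of sight, and to assign the pair to a wedge. The midpoint line-of-sight convention, the two-wedge split at `mu_split=0.5`, and the function names are assumptions; the abstract does not give the wedge boundaries used.

```python
import math

def pi_sigma(r1, r2):
    """Separation of two galaxies along (pi) and across (sigma) the line of
    sight, given comoving position vectors relative to the observer."""
    s = [a - b for a, b in zip(r1, r2)]
    los = [(a + b) / 2.0 for a, b in zip(r1, r2)]  # midpoint line of sight
    los_norm = math.sqrt(sum(c * c for c in los))
    if los_norm == 0.0:
        raise ValueError("pair midpoint coincides with the observer")
    pi_ = abs(sum(a * b for a, b in zip(s, los))) / los_norm
    s_squared = sum(c * c for c in s)
    sigma = math.sqrt(max(s_squared - pi_ * pi_, 0.0))
    return pi_, sigma

def wedge(r1, r2, mu_split=0.5):
    """Assign a pair to a wedge using mu = pi / s."""
    pi_, sigma = pi_sigma(r1, r2)
    s = math.hypot(pi_, sigma)
    if s == 0.0:
        raise ValueError("the two positions are identical")
    mu = pi_ / s
    return "line-of-sight" if mu > mu_split else "transverse"
```

Photometric redshift errors mainly smear the $\pi$ component, which is why treating the wedges separately can help preserve the BAO signal in the transverse direction.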


Automatic detection and severity prediction of chronic kidney disease using machine learning classifiers (머신러닝 분류기를 사용한 만성콩팥병 자동 진단 및 중증도 예측 연구)

  • Jihyun Mun; Sunhee Kim; Myeong Ju Kim; Jiwon Ryu; Sejoong Kim; Minhwa Chung
    • Phonetics and Speech Sciences / v.14 no.4 / pp.45-56 / 2022
  • This paper proposes an optimal methodology for automatically diagnosing chronic kidney disease (CKD) and predicting its severity from patients' utterances. In patients with CKD, the voice changes due to the weakening of the respiratory and laryngeal muscles and vocal fold edema. Previous studies have phonetically analyzed the voices of patients with CKD, but no studies have attempted to classify them. In this paper, the utterances of patients with CKD were classified using a variety of utterance types (sustained vowel, sentence, general sentence), feature sets [handcrafted features, extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS), CNN-extracted features], and classifiers (SVM, XGBoost). A total of 1,523 utterances, amounting to 3 hours, 26 minutes, and 25 seconds, were used. F1-scores of 0.93 for automatic diagnosis of the disease, 0.89 for a 3-class problem, and 0.84 for a 5-class problem were achieved. The highest performance was obtained with the combination of general sentence utterances, the handcrafted feature set, and XGBoost. The results suggest that a general sentence utterance, which can reflect all of a speaker's speech characteristics, together with an appropriate feature set extracted from it, is adequate for the automatic classification of CKD patients' utterances.
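
The F1-scores reported here combine precision and recall into a single number. The abstract does not say how the multi-class scores were averaged, so the sketch below computes per-class F1 and an unweighted (macro) average as one plausible reading; the function names and example labels are hypothetical.

```python
def f1_scores(y_true, y_pred):
    """Per-class F1 = 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_scores(y_true, y_pred)
    return sum(scores.values()) / len(scores)
```

Macro averaging weights every class equally, which matters when severity classes have very different numbers of utterances.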