Search | Korea Science

Robust Speech Recognition by Utilizing Class Histogram Equalization (클래스 히스토그램 등화 기법에 의한 강인한 음성 인식)

Suh, Yung-Joo; Kim, Hor-Rin; Lee, Yun-Keun
- MALSORI / no.60 / pp.145-164 / 2006
This paper proposes class histogram equalization (CHEQ) to compensate for noisy acoustic features in robust speech recognition. CHEQ aims to compensate for the acoustic mismatch between training and test environments and to reduce the limitations of conventional histogram equalization (HEQ). In contrast to HEQ, CHEQ adopts multiple class-specific distribution functions for the training and test environments and equalizes the features using their class-specific training and test distributions. Depending on how the class information is extracted, CHEQ takes one of two forms: hard-CHEQ, based on vector quantization, and soft-CHEQ, based on a Gaussian mixture model. Experiments on the Aurora 2 database confirmed the effectiveness of CHEQ, which produced a relative word error reduction of 61.17% over the baseline mel-cepstral features and 19.62% over conventional HEQ.
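As a rough illustration of the conventional HEQ that the abstract takes as its starting point, the sketch below maps each test feature value to the reference (training) value that has the same empirical cumulative probability. It is a minimal one-dimensional sketch, not the authors' implementation: the function name `equalize` and the sample data are hypothetical, and the class-specific distributions that define CHEQ are not modeled here.

```python
from bisect import bisect_right


def equalize(test_values, reference_values):
    """Conventional histogram equalization on a 1-D feature (illustrative only).

    Each test value is replaced by the reference value that sits at the same
    empirical CDF rank, i.e. y = F_ref^{-1}(F_test(x)).
    """
    ref_sorted = sorted(reference_values)
    test_sorted = sorted(test_values)
    n_test, n_ref = len(test_sorted), len(ref_sorted)
    equalized = []
    for x in test_values:
        # Empirical CDF of x under the test distribution, in (0, 1].
        cdf = bisect_right(test_sorted, x) / n_test
        # Inverse reference CDF: the reference order statistic at that rank.
        idx = min(n_ref - 1, max(0, int(round(cdf * n_ref)) - 1))
        equalized.append(ref_sorted[idx])
    return equalized


# Hypothetical data: test features are shifted and scaled relative to training.
reference = [0, 1, 2, 3, 4]
noisy = [10, 30, 20, 50, 40]
restored = equalize(noisy, reference)
```

A class-based variant in the spirit of the abstract would first assign each feature to a class and then apply the same mapping with that class's training and test distributions; how that assignment is done is not specified beyond vector quantization (hard) or a Gaussian mixture model (soft).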

Acoustic scene classification using recurrence quantification analysis (재발량 분석을 이용한 음향 상황 인지)

Park, Sangwook; Choi, Woohyun; Ko, Hanseok
- The Journal of the Acoustical Society of Korea / v.35 no.1 / pp.42-48 / 2016
Since a variety of sounds occur in the same place and similar sounds occur in different places, the performance of acoustic scene classification is not guaranteed when training data are insufficient. A Bag of Words (BOW) based histogram feature is regarded as one way to overcome this problem. However, because the histogram feature is built from a feature distribution, the ordering of the feature sequence is ignored. Temporal information such as periodicity and stationarity is also important for acoustic scene classification. In this paper, temporal features describing periodicity and stationarity are extracted using recurrence quantification analysis. In the experiments, the proposed method performs better than the baseline methods.
https://doi.org/10.7776/ASK.2016.35.1.042
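To make the idea of recurrence quantification concrete, the sketch below builds a recurrence matrix for a scalar series and computes the recurrence rate, one of the simplest recurrence measures. It is a simplified sketch under stated assumptions: the threshold `eps`, the function names, and the use of a raw scalar series without any embedding are illustrative choices, and the abstract does not say which recurrence measures the authors actually extract.

```python
def recurrence_matrix(series, eps):
    """Return R where R[i][j] = 1 if samples i and j are within eps of each other."""
    n = len(series)
    return [
        [1 if abs(series[i] - series[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


def recurrence_rate(matrix):
    """Fraction of recurrent points in the recurrence matrix."""
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * n)


# A strictly periodic toy series: every other sample recurs.
periodic = [0, 1, 0, 1]
rate = recurrence_rate(recurrence_matrix(periodic, eps=0.1))
```

Unlike a histogram of feature values, the recurrence matrix depends on the order of the samples, which is why measures derived from it can carry periodicity and stationarity information.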

Combining multi-task autoencoder with Wasserstein generative adversarial networks for improving speech recognition performance (음성인식 성능 개선을 위한 다중작업 오토인코더와 와설스타인식 생성적 적대 신경망의 결합)

Kao, Chao Yuan; Ko, Hanseok
- The Journal of the Acoustical Society of Korea / v.38 no.6 / pp.670-677 / 2019
Because background noise in an acoustic signal degrades the performance of speech or acoustic event recognition, extracting noise-robust acoustic features from a noisy signal remains challenging. In this paper, we propose a combined structure of a Wasserstein Generative Adversarial Network (WGAN) and a MultiTask AutoEncoder (MTAE), a deep learning architecture that integrates the respective strengths of MTAE and WGAN so that it estimates not only noise but also speech features from a noisy acoustic source. The proposed MTAE-WGAN structure estimates the speech signal and the residual noise by employing a gradient penalty and a weight initialization method for Leaky Rectified Linear Unit (LReLU) and Parametric ReLU (PReLU). With the adopted gradient penalty loss function, the proposed MTAE-WGAN structure enhances the speech features and subsequently achieves substantial Phoneme Error Rate (PER) improvements over the stand-alone Deep Denoising Autoencoder (DDAE), MTAE, Redundant Convolutional Encoder-Decoder (R-CED), and Recurrent MTAE (RMTAE) models for robust speech recognition.
https://doi.org/10.7776/ASK.2019.38.6.670
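The abstract does not state the gradient penalty loss explicitly. For orientation only, the critic loss commonly used with a gradient-penalized WGAN is shown below. The symbols are assumptions and may differ from the authors' formulation: $D$ is the critic, $x$ a clean target feature, $\tilde{x}$ the network's estimate, $\hat{x}$ a random interpolation between the two, and $\lambda$ the penalty weight.

$$L_D = \mathbb{E}_{\tilde{x}}\left[D(\tilde{x})\right] - \mathbb{E}_{x}\left[D(x)\right] + \lambda\, \mathbb{E}_{\hat{x}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right], \qquad \hat{x} = \epsilon x + (1-\epsilon)\tilde{x},\ \epsilon \sim U[0,1]$$

The last term pushes the norm of the critic's gradient toward 1 at the interpolated points, which is how this formulation enforces the Lipschitz constraint without weight clipping.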

An Acoustic Investigation of Post-Obstruent Tensification Phenomena

Ahn, Hyun-Kee
- Speech Sciences / v.11 no.4 / pp.223-232 / 2004
This study investigated and compared the acoustic characteristics of the Korean stop sound [k'] in three different phonological environments: the tensified lenis stop [k'] as observed in /prek+kaci/, the fortis stop /k'/ as in /pre+k'aci/, and the fortis stop /k'/ following an obstruent as in /prek+k'aci/. The specific research question was whether the tensified lenis stop shares all its acoustic features with the other two kinds of fortis stops. The acoustic measures adopted in this study were H1*-H2*, VOT, length of stop closure, and $F_0$. The major finding was that the three stops showed no significant difference in any of the acoustic measures except the length of stop closure. The fortis stop /k'/ following an obstruent showed a significantly longer stop closure than the other two stops, which did not differ significantly from each other. Based on these phonetic results, this study argued that, for a proper phonological description of post-obstruent tensification, the phonological feature [slack vocal folds] of a lenis stop should be changed into [stiff vocal folds, constricted glottis], the specification that the fortis stops should have.

Damage progression study in fibre reinforced concrete using acoustic emission technique

Banjara, Nawal Kishor; Sasmal, Saptarshi; Srinivas, V.
- Smart Structures and Systems / v.23 no.2 / pp.173-184 / 2019
The main objective of this study is to evaluate the true fracture energy and monitor the damage progression in steel fibre reinforced concrete (SFRC) specimens using acoustic emission (AE) features. A four-point bending test is carried out on pre-notched plain and fibre reinforced (0.5% and 1% volume fraction) concrete under monotonic loading. AE sensors are affixed at different locations on the specimens, and AE parameters such as rise time, AE energy, hits, counts, amplitude, and duration are obtained. Using the captured and processed AE event data, the fracture process zone is identified and the true fracture energy is evaluated. The AE data are also employed to trace the damage progression in plain and fibre reinforced concrete, using both parametric and signal-based techniques. The Hilbert-Huang transform (HHT) is used in signal-based processing to evaluate the instantaneous frequency of the acoustic events. It is found that appropriately processed and carefully analyzed acoustic data can provide vital information on the progression of damage in different types of concrete.
https://doi.org/10.12989/sss.2019.23.2.173
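For readers unfamiliar with the AE parameters named in the abstract, the sketch below derives a few of them (amplitude, counts, duration, rise time) from a single sampled waveform using a fixed detection threshold. It follows the conventional threshold-based definitions rather than the authors' processing chain; the function name, the threshold, the sampling rate, and the sample waveform are all hypothetical.

```python
def ae_hit_parameters(signal, threshold, sample_rate):
    """Threshold-based AE parameters for one hit (illustrative definitions).

    amplitude: peak absolute value within the hit
    counts:    number of upward crossings of the threshold
    duration:  time from the first to the last sample at or above threshold
    rise_time: time from the first threshold exceedance to the peak
    """
    above = [i for i, v in enumerate(signal) if abs(v) >= threshold]
    if not above:
        return None  # no hit detected at this threshold
    first, last = above[0], above[-1]
    peak_index = max(range(first, last + 1), key=lambda i: abs(signal[i]))
    counts = sum(
        1
        for i in range(1, len(signal))
        if signal[i - 1] < threshold <= signal[i]
    )
    return {
        "amplitude": abs(signal[peak_index]),
        "counts": counts,
        "duration": (last - first) / sample_rate,
        "rise_time": (peak_index - first) / sample_rate,
    }


# Hypothetical burst sampled at 10 Hz with a 0.5 detection threshold.
burst = [0.0, 0.2, 0.6, -0.4, 1.0, -0.7, 0.55, -0.2, 0.1, 0.0]
params = ae_hit_parameters(burst, threshold=0.5, sample_rate=10)
```

In practice the threshold is set above the background noise level, and real AE systems apply additional timing rules to decide where one hit ends and the next begins; those rules are omitted here.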

Sonographic Appearance of Steatocystoma: An Analysis of 14 Pathologically Confirmed Lesions (지선낭종의 초음파 소견: 조직학적으로 진단된 14개 병변의 분석)

Hyeyoung Yoon; Yusuhn Kang; Hwiryong Park; Joong Mo Ahn; Eugene Lee; Joon Woo Lee; Heung Sik Kang
- Journal of the Korean Society of Radiology / v.82 no.2 / pp.382-392 / 2021
Purpose: To evaluate the ultrasonographic characteristics of steatocystomas, focusing on the features that aid in differentiating them from epidermal inclusion cysts and lipomas. Materials and Methods: The ultrasonographic findings of 14 histologically proven steatocystomas in 10 patients were retrospectively reviewed. The following features were assessed: the layer of involvement, shape, margin, echogenicity, posterior acoustic features, and the presence of a visible wall or intralesional striations. The findings were compared with those of subcutaneous lipomas and epidermal inclusion cysts to identify those findings that aid in the differential diagnosis of steatocystomas. Results: The majority of steatocystomas appeared as a subcutaneous mass (n = 6, 42.9%) or a mass involving both the dermal and subcutaneous layers (n = 6, 42.9%). Steatocystomas exhibited a well-defined smooth margin (n = 12, 85.7%) and homogeneous echogenicity (n = 9, 64.3%), and showed no specific posterior acoustic features (n = 9, 64.3%). The most important features that differentiated steatocystomas from epidermal inclusion cysts were a homogeneous internal echotexture (p = 0.009) and absent or less prominent posterior acoustic enhancement (p < 0.001). The features that distinguished steatocystomas from lipomas were the margin (p < 0.001), echogenicity (p = 0.034), internal echotexture (p = 0.004), and the absence of intralesional striations (p < 0.001). Conclusion: Steatocystomas appeared as well-defined homogeneous masses with mild or absent posterior acoustic enhancement.
https://doi.org/10.3348/jksr.2019.0200

Scanning acoustic microscopy for material evaluation

Hyunung Yu
- Applied Microscopy / v.50 / pp.25.1-25.11 / 2020
Scanning acoustic microscopy (SAM) or Acoustic Micro Imaging (AMI) is a powerful, non-destructive technique that can detect hidden defects in elastic and biological samples as well as non-transparent hard materials. By monitoring the internal features of a sample in three-dimensional integration, this technique can efficiently find physical defects such as cracks, voids, and delamination with high sensitivity. In recent years, advanced techniques such as ultrasound impedance microscopy, ultrasound speed microscopy, and scanning acoustic gigahertz microscopy have been developed for applications in industries and in the medical field to provide additional information on the internal stress, viscoelastic, and anisotropic, or nonlinear properties. X-ray, magnetic resonance, and infrared techniques are the other competitive and widely used methods. However, they have their own advantages and limitations owing to their inherent properties such as different light sources and sensors. This paper provides an overview of the principle of SAM and presents a few results to demonstrate the applications of modern acoustic imaging technology. A variety of inspection modes, such as vertical, horizontal, and diagonal cross-sections have been presented by employing the focus pathway and image reconstruction algorithm. Images have been reconstructed from the reflected echoes resulting from the change in the acoustic impedance at the interface of the material layers or defects. The results described in this paper indicate that the novel acoustic technology can expand the scope of SAM as a versatile diagnostic tool requiring less time and having a high efficiency.
https://doi.org/10.1186/s42649-020-00045-4

A Study for Acoustic Features of Benign Laryngeal Disease (양성 성대 점막 질환의 음향학적 특성에 관한 연구)

Lee, Jae Seok; Kim, Jin Pyeong; Park, Jeong Je; Kwon, Oh Jin; Woo, Seung Hoon
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.24 no.1 / pp.47-50 / 2013
Background and Objectives: The purpose of this study is to identify acoustic features, and the parameters most useful for distinguishing laryngeal diseases, from a large set of acoustic variables. Materials and Methods: The subjects were 125 male patients who had been diagnosed with vocal nodule, vocal polyp, vocal cyst, Reinke's edema, or leukoplakia. To characterize the acoustic features of each disease, 34 parameters were measured using MDVP. Results: The variables $F_0$, $MF_0$, $T_0$, Fhi, Flo, and PER were significant in distinguishing laryngeal diseases (p<.05). This means that variables related to fundamental frequency are important for predicting which group will be diagnosed with Reinke's edema and leukoplakia. Among the amplitude perturbation parameters, vAm yielded a significant result and is useful for distinguishing laryngeal polyp/cyst from other laryngeal diseases (p<.05). Among the tremor parameters, ATRI yielded a significant result and is useful for distinguishing laryngeal polyp from other laryngeal diseases (p<.05). Conclusion: $F_0$, $MF_0$, $T_0$, Fhi, Flo, PER, vAm, and ATRI may be meaningful parameters for distinguishing pathologic from benign laryngeal diseases. In particular, vAm and ATRI are important factors when predicting which group would be diagnosed with vocal polyp.

Bearing Faults Identification of an Induction Motor using Acoustic Emission Signals and Histogram Modeling (음향 방출 신호와 히스토그램 모델링을 이용한 유도전동기의 베어링 결함 검출)

Jang, Won-Chul; Seo, Jun-Sang; Kim, Jong-Myon
- Journal of the Korea Society of Computer and Information / v.19 no.11 / pp.17-24 / 2014
This paper proposes a fault detection method for low-speed rolling element bearings of an induction motor using acoustic emission signals and histogram modeling. The proposed method performs envelope modeling of the histogram of normalized fault signals. It then extracts significant features of each fault using partial autocorrelation coefficients and selects among them using a distance evaluation technique. Finally, using the extracted features as inputs, support vector regression (SVR) classifies the bearing's inner, outer, and roller faults. To obtain optimal classification performance, we evaluate the proposed method while varying an adjustable parameter of the Gaussian radial basis function of the SVR from 0.01 to 1.0 and the number of features from 2 to 150. Experimental results show that the proposed fault identification method, with the adjustable parameter set to 0.64-0.65 and 75 features, achieves 91% classification performance and outperforms conventional fault diagnosis methods.
https://doi.org/10.9708/jksci.2014.19.11.017

Voice Coding Using Only the Features of the Face Image

Cho, Youn-Soo; Jang, Jong-Whan
- The Journal of the Acoustical Society of Korea / v.18 no.3E / pp.26-29 / 1999
In this paper, we propose a new voice coding method that uses only features of the face image, such as mouth height (H), width (W), ratio (R = W/H), area (S), and an ellipse feature (P). Because only face image features are used for speech, the method provides high security and is not affected by acoustic noise. With the proposed algorithm, the mean recognition rate for vowels ranges from approximately 70% to 96% over many tests.
