Integrated Search | Korea Science

Intra-and Inter-frame Features for Automatic Speech Recognition

Lee, Sung Joo;Kang, Byung Ok;Chung, Hoon;Lee, Yunkeun
- ETRI Journal / Vol. 36, No. 3 / pp. 514-517 / 2014
In this paper, alternative dynamic features for speech recognition are proposed. The goal of this work is to improve speech recognition accuracy by deriving a representation of distinctive dynamic characteristics from the speech spectrum. The work is motivated by two temporal dynamics of a speech signal: the highly non-stationary nature of speech and the inter-frame change of the speech spectrum. We use a sub-frame spectrum analyzer to capture very rapid spectral changes within a speech analysis frame. In addition, we measure spectral fluctuations in a more complex manner than traditional dynamic features such as delta or double-delta. To evaluate the proposed features, speech recognition tests in smartphone environments were conducted. The experimental results show that simply combining the existing feature streams with the proposed features improves the recognition accuracy of a hidden Markov model-based speech recognizer.
https://doi.org/10.4218/etrij.14.0213.0181
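
The delta and double-delta features that this abstract uses as its point of comparison are commonly computed as a regression over neighboring frames. The following is a minimal sketch of that conventional baseline only, not of the paper's proposed intra- and inter-frame features. The function name `delta`, the window half-width `n_ctx=2`, and the edge handling (repeating the first and last frame) are assumptions made for illustration.

```python
def delta(frames, n_ctx=2):
    """Regression-based delta of a sequence of feature vectors.

    frames: list of equal-length lists (one feature vector per frame).
    n_ctx:  assumed half-width of the regression window.
    """
    denom = 2 * sum(n * n for n in range(1, n_ctx + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        vec = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, n_ctx + 1):
                ahead = frames[min(t + n, last)][k]   # repeat last frame at the end
                behind = frames[max(t - n, 0)][k]     # repeat first frame at the start
                acc += n * (ahead - behind)
            vec.append(acc / denom)
        out.append(vec)
    return out


# A one-dimensional ramp: its delta is constant away from the edges.
ramp = [[2.0 * t] for t in range(12)]
d1 = delta(ramp)   # delta stream
d2 = delta(d1)     # double-delta: the same operator applied to the deltas
```

Because the operator is a fixed linear filter over a few frames, it can only describe smooth frame-to-frame trends, which is the limitation the abstract points to.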

음성신호 적응분할방법에 의한 특징분석 (Features Analysis of Speech Signal by Adaptive Dividing Method)

장승관;최성연;김창석
- 음성과학 (Speech Sciences) / Vol. 5, No. 1 / pp. 63-80 / 1999
This paper proposes an adaptive method that divides a speech signal into the initial, medial, and final sounds of an utterance by evaluating the extreme values of the short-term energy and autocorrelation functions. The method was applied to speech signals composed of a consonant, a vowel, and a consonant; each signal was divided into an initial, a medial, and a final sound, and the features of each segment were analyzed by LPC. Spectrum analysis of each period showed spectral features of a consonant in the initial period, of a vowel in the medial period, and of both in the final sound. Moreover, when a variety of words were adaptively divided into three periods with the proposed method, initial sounds with the same consonant and medial sounds with the same vowel had the same spectral characteristics, whereas the final sound showed different spectral characteristics even when it had the same consonant as the initial sound.
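
The two quantities this abstract relies on, short-term energy and short-term autocorrelation, can be sketched per frame as below. This is an illustration under stated assumptions: the frame length, hop size, and toy signal are invented, and the paper's actual rule for turning extreme values into segment boundaries is not given in the abstract, so it is not implemented here.

```python
def frame_signal(x, frame_len, hop):
    """Split x into frames of frame_len samples; an incomplete tail is dropped."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def short_term_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def autocorr(frame, lag):
    """Unnormalized short-term autocorrelation of one frame at the given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))


# Toy signal: quiet, loud, quiet (a stand-in for consonant-vowel-consonant).
signal = [0.1, -0.1] * 4 + [1.0, -1.0] * 4 + [0.1, -0.1] * 4
frames = frame_signal(signal, frame_len=4, hop=4)
energies = [short_term_energy(f) for f in frames]
lag1 = [autocorr(f, 1) for f in frames]
```

In this toy signal the energy rises sharply in the middle frames, which is the kind of contrast a boundary detector based on energy extrema would look for.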

Knowledge-driven speech features for detection of Korean-speaking children with autism spectrum disorder

Seonwoo Lee;Eun Jung Yeo;Sunhee Kim;Minhwa Chung
- 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 15, No. 2 / pp. 53-59 / 2023
Speech-based detection of children with autism spectrum disorder (ASD) has relied on predefined feature sets because of their ease of use and their speech analysis capabilities. However, such sets may not adequately capture clinical impressions because of the broad range and large number of features they include. This paper demonstrates that knowledge-driven speech features (KDSFs) specifically tailored to the speech traits of ASD are more effective and efficient at distinguishing the speech of ASD children from that of children with typical development (TD) than a predefined feature set, the extended Geneva Minimalistic Acoustic Standard Parameter Set (eGeMAPS). The KDSFs encompass speech characteristics related to frequency, voice quality, speech rate, and spectral features that have been identified as corresponding to distinctive attributes of ASD speech. The speech dataset used in the experiments consists of 63 ASD children and 9 TD children. To alleviate the imbalance in the number of training utterances, a data augmentation technique was applied to the TD children's utterances. A support vector machine (SVM) classifier trained with the KDSFs achieved an accuracy of 91.25%, surpassing the 88.08% obtained with the predefined set. This result underscores the importance of incorporating domain knowledge in the development of speech technologies for individuals with disorders.
https://doi.org/10.13064/KSSS.2023.15.2.053

The Classification of Music Styles on the Basis of Spectral Contrast Features

Wang, Yan-bing
- 한국컴퓨터정보학회논문지 (Journal of the Korea Society of Computer and Information) / Vol. 22, No. 1 / pp. 9-14 / 2017
In this paper, we propose using octave-based spectral contrast features to represent the spectral characteristics of music clips. These features capture the relative spectral distribution rather than the average spectrum. Experiments show that spectral contrast features perform well in classifying music styles. A further comparative experiment shows that they distinguish music styles better than the MFCC features commonly used in earlier music style classification systems.
https://doi.org/10.9708/jksci.2017.22.01.009
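
The idea of a relative, rather than average, spectral description can be sketched as a peak-minus-valley measure per octave-spaced sub-band. The sketch below is one common way to define spectral contrast and is not taken from the paper: the function names, the neighborhood fraction `alpha=0.2`, and the band-splitting rule are assumptions.

```python
import math


def octave_bands(spectrum, n_bands):
    """Split a magnitude spectrum at octave-spaced upper edges.

    Assumes len(spectrum) >= 2 ** n_bands so that no band is empty.
    """
    edges = [len(spectrum) >> (n_bands - i) for i in range(n_bands + 1)]
    edges[0] = 0  # the lowest band starts at bin 0
    return [spectrum[edges[i]:edges[i + 1]] for i in range(n_bands)]


def spectral_contrast(band, alpha=0.2, eps=1e-10):
    """Log peak minus log valley for one sub-band of magnitudes.

    alpha is the assumed fraction of bins averaged at each extreme.
    """
    mags = sorted(band)
    k = max(1, int(alpha * len(mags)))
    valley = math.log(sum(mags[:k]) / k + eps)
    peak = math.log(sum(mags[-k:]) / k + eps)
    return peak - valley


flat = [1.0] * 16            # noise-like band: no contrast
peaky = [0.1] * 16
peaky[10] = 5.0              # one strong harmonic in the top band
flat_contrast = [spectral_contrast(b) for b in octave_bands(flat, 3)]
peaky_contrast = [spectral_contrast(b) for b in octave_bands(peaky, 3)]
```

A flat band yields zero contrast while a band with a dominant harmonic yields a large value, which an average-spectrum feature would blur together.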

Laver Farm Feature Extraction From Landsat ETM+ Using Independent Component Analysis

Han J. G.;Yeon Y. K.;Chi K. H.;Hwang J. H.
- 대한원격탐사학회:학술대회논문집 (Korean Society of Remote Sensing: Conference Proceedings) / Korean Society of Remote Sensing, 2004, Proceedings of ISRS 2004 / pp. 359-362 / 2004
The ICA-based feature extraction algorithm proposed in this paper detects a target feature in multi-dimensional imagery. Each pixel's reflectance spectrum is assumed to be a linear mixture of the spectrum spheres of different materials (the target feature and background features). A Landsat ETM+ satellite image has a multi-dimensional data structure in which the target feature to be extracted is mixed with various background features. To eliminate the background features (tidal flat, seawater, etc.) around the target feature (laver farm) effectively, the pixel spectrum sphere of the target feature is projected onto the spectrum sphere orthogonal to that of the background features. What remains of the pixel after projection can be taken as the target feature with the background removed. To verify the proposed ICA-based method, we applied it to laver farm extraction from a Landsat ETM+ satellite image. We also compared it with the maximum-likelihood method, the most widely used conventional approach, in terms of extraction accuracy and the noise remaining after extraction. The results show that the proposed method effectively eliminates the background features from the mixed spectrum sphere and extracts the target feature with excellent detection efficiency.
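
The projection step described in this abstract, removing the background contribution from a linearly mixed pixel, can be illustrated with a plain orthogonal-complement projection. This sketch covers only that step and does not implement ICA. The 4-band spectra are invented values, and the function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis spanning the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


def remove_background(pixel, background_spectra):
    """Project a pixel spectrum onto the orthogonal complement of the background."""
    residual = list(pixel)
    for q in orthonormal_basis(background_spectra):
        c = dot(residual, q)
        residual = [ri - c * qi for ri, qi in zip(residual, q)]
    return residual


# Hypothetical 4-band spectra; the numbers are made up for illustration.
seawater = [0.10, 0.08, 0.05, 0.02]
tidal_flat = [0.20, 0.25, 0.30, 0.35]
target = [0.05, 0.10, 0.40, 0.30]
pixel = [0.5 * s + 0.3 * f + 0.2 * t
         for s, f, t in zip(seawater, tidal_flat, target)]
residual = remove_background(pixel, [seawater, tidal_flat])
```

After projection the residual has no component along either background spectrum, and its size scales with the target's share of the pixel (0.2 in this toy mixture).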

Fast Spectrum Sensing with Coordinate System in Cognitive Radio Networks

Lee, Wilaiporn;Srisomboon, Kanabadee;Prayote, Akara
- ETRI Journal / Vol. 37, No. 3 / pp. 491-501 / 2015
Spectrum sensing is an elementary function in cognitive radio designed to monitor the existence of a primary user (PU). To achieve a high rate of detection, most techniques rely on knowledge of prior spectrum patterns, with a trade-off between high computational complexity and long sensing time. On the other hand, blind techniques ignore pattern matching processes to reduce processing time, but their accuracy degrades greatly at low signal-to-noise ratios. To achieve both a high rate of detection and short sensing time, we propose fast spectrum sensing with coordinate system (FSC) - a novel technique that decomposes a spectrum with high complexity into a new coordinate system of salient features and that uses these features in its PU detection process. Not only is the space of a buffer that is used to store information about a PU reduced, but also the sensing process is fast. The performance of FSC is evaluated according to its accuracy and sensing time against six other well-known conventional techniques through a wireless microphone signal based on the IEEE 802.22 standard. FSC gives the best performance overall.
https://doi.org/10.4218/etrij.15.0114.0675
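
For context on the blind techniques this abstract contrasts with pattern-based sensing, the simplest widely known example is an energy detector, sketched below. It is not the proposed FSC technique. The threshold value and the sample data are assumptions; in practice the threshold would be set from the noise level and a target false-alarm rate.

```python
def energy_detect(samples, threshold):
    """Return (decision, statistic); decision is True if mean power > threshold."""
    stat = sum(abs(s) ** 2 for s in samples) / len(samples)
    return stat > threshold, stat


# Made-up samples: noise alone, and the same noise with a unit-level signal added.
noise_only = [0.1, -0.2, 0.15, -0.05, 0.1, -0.1]
with_signal = [n + 1.0 for n in noise_only]

idle, idle_stat = energy_detect(noise_only, threshold=0.5)
busy, busy_stat = energy_detect(with_signal, threshold=0.5)
```

The detector needs no prior knowledge of the primary user's spectrum, but as the signal weakens the two statistics converge, which is the low-SNR degradation the abstract mentions.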

Dimension-Reduced Audio Spectrum Projection Features for Classifying Video Sound Clips

Kim, Hyoung-Gook
- The Journal of the Acoustical Society of Korea / Vol. 25, No. 3E / pp. 89-94 / 2006
For audio indexing and targeted search of specific audio or corresponding visual content, the MPEG-7 standard has adopted a sound classification framework in which dimension-reduced Audio Spectrum Projection (ASP) features are used to train continuous hidden Markov models (HMMs) for classifying various sounds. MPEG-7 employs Principal Component Analysis (PCA) or Independent Component Analysis (ICA) for the dimensional reduction. Other well-established techniques include Non-negative Matrix Factorization (NMF), Linear Discriminant Analysis (LDA), and the Discrete Cosine Transform (DCT). In this paper, we compare the performance of different dimensional reduction methods with Gaussian mixture models (GMMs) and HMMs in classifying video sound clips.
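
Of the reduction methods listed in this abstract, the DCT is the only one with a fixed, data-independent basis, so it can be shown in a few lines. The sketch below assumes the input is one frame of a log-scaled spectrum and keeps the first `n_keep` coefficients. The function names and the sample frame are illustrative, and the MPEG-7 ASP pipeline itself (which learns its basis with PCA or ICA) is not reproduced.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a real sequence."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]


def reduce_dim(log_spectrum, n_keep):
    """Keep the first n_keep DCT coefficients of one log-spectrum frame."""
    return dct2(log_spectrum)[:n_keep]


# Made-up 8-bin log-spectrum frame, reduced to 3 dimensions.
frame = [2.0, 2.1, 1.9, 1.5, 1.0, 0.6, 0.4, 0.3]
reduced = reduce_dim(frame, n_keep=3)
```

The low-order coefficients describe the coarse spectral envelope, so truncation discards fine detail first. PCA, ICA, NMF, and LDA instead need training data to estimate their projection.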

Small-scale Features of Thermal Inflation: CMB Distortion, Substructure Abundance, and 21cm Power Spectrum

홍성욱;조희승;안경진;조기현
- 천문학회보 (The Bulletin of The Korean Astronomical Society) / Vol. 42, No. 2 / pp. 78.4-79 / 2017
Thermal inflation is an additional inflationary mechanism that operates before big bang nucleosynthesis; it solves the moduli problem and naturally provides a plausible dark matter candidate. Thermal inflation leaves a slight enhancement followed by a large suppression, by a factor of ~50, in the curvature and matter power spectra, which can be expressed in terms of a single characteristic scale $k_b$. Here we describe the observability of the small-scale features of thermal inflation in various observations, such as CMB distortion, satellite galaxy abundance in Milky-Way-sized galaxies, and the 21-cm power spectrum before the epoch of reionization.

Random PWM 기법에 의한 전도노이즈 (Conducted Noise Reduction in Random Pulse Width Modulation)

정동효;김상남
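- 대한전기학회:학술대회논문집 (Korean Institute of Electrical Engineers: Conference Proceedings) / Korean Institute of Electrical Engineers, 2002 Conference Proceedings, 전문대학교육위원 / pp. 98-101 / 2002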
Switching-mode power converters are widely used because of their high efficiency, light weight, and small size. These features result from the ON-OFF operation of semiconductor switching devices. However, this switching operation causes surges and electromagnetic interference (EMI), which degrade the reliability of the converter itself and of the entire electronic system. Surge and noise are among the most serious difficulties in AC-to-DC converters. In the case of carrier frequency selection, the output voltage in the steady state and the transient state is fully regulated. A random PWM (RPWM) control method is proposed to smooth the switching noise spectrum and reduce its level. The method is verified experimentally on a converter operating at 300 V/1 kW with a 5% to 30% white noise input. Spectrum analysis is performed on the phase current and the CM noise voltage: the former is measured with a current probe and the latter with a USN, each connected to the spectrum analyzer.

Heart Sound Localization in Respiratory Sounds Based on Singular Spectrum Analysis and Frequency Features

Molaie, Malihe;Moradi, Mohammad Hassan
- ETRI Journal
- /
- 제37권4호
- /
- pp.824-832
- /
- 2015
Heart sounds are the main obstacle in lung sound analysis. To tackle this obstacle, we propose a diagnosis algorithm that uses singular spectrum analysis (SSA) and frequency features of heart and lung sounds. In particular, we introduce a frequency coefficient that shows the frequency difference between heart and lung sounds. The proposed algorithm is applied to a synthetic mixture of heart and lung sounds. The results show that heart sounds can be extracted successfully and that the first and second heart sounds are localized remarkably well. An error analysis of the localization results shows that the proposed algorithm makes fewer errors than the SSA method, which is one of the most powerful methods for localizing heart sounds. The presented algorithm is also applied to respiratory sounds recorded from the chest walls of five healthy subjects. The efficiency of the algorithm in extracting heart sounds from the recorded breathing sounds is verified with power spectral density evaluations and listening. Most studies have used only normal respiratory sounds, whereas we additionally use abnormal breathing sounds to validate the strength of our results.
https://doi.org/10.4218/etrij.15.0114.1447

409 search results; processing time 0.031 s
