• Title/Abstract/Keywords: Acoustic Feature

Search results: 238 items (processing time: 0.028 s)

A Study on the Categorization of Context-dependent Phoneme using Decision Tree Modeling

  • 이선정
    • 한국컴퓨터산업학회논문지 / Vol. 2, No. 2 / pp. 195-202 / 2001
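  • This paper studies how to model each Korean phoneme when its pronunciation varies with the phonemes to its left and right. To this end, it compares a unit reduction algorithm with a method based on decision trees. The unit reduction algorithm relies on statistical properties alone, whereas decision tree modeling classifies context-dependent phonemes using both Korean phonological knowledge and statistical information. The paper describes in particular detail how context-dependent phonemes are classified with decision trees. Finally, it evaluates experimentally the performance of the context-dependent phonemes classified with the decision tree.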
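The abstract above does not give the splitting criterion, so the following is only a minimal sketch of how a phonetic question might be chosen when clustering context-dependent phonemes with a decision tree. It assumes each unit is summarized by one scalar feature and scores a question by the reduction in summed squared deviation, a simplification of the likelihood-gain criteria often used in practice. The phone labels, question names, and feature values are hypothetical.

```python
def sum_sq_dev(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_question(units, questions):
    """Pick the phonetic question whose yes/no split reduces the spread the most.

    units: list of (left_context_phone, feature_value) pairs.
    questions: dict mapping a question name to the set of phones answering "yes".
    """
    parent = sum_sq_dev([value for _, value in units])
    best_name, best_gain = None, 0.0
    for name, phones in questions.items():
        yes = [value for left, value in units if left in phones]
        no = [value for left, value in units if left not in phones]
        if not yes or not no:
            continue  # a question that does not split the data is useless
        gain = parent - sum_sq_dev(yes) - sum_sq_dev(no)
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


# Hypothetical data: one phoneme observed after five different left contexts.
units = [("m", 1.0), ("n", 1.2), ("p", 3.0), ("t", 3.1), ("k", 2.9)]
questions = {"left_is_nasal": {"m", "n"}, "left_is_velar": {"k"}}
name, gain = best_question(units, questions)
```

Applying the same selection recursively to the "yes" and "no" subsets would grow the tree. A purely statistical method would omit the linguistically motivated question sets.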


Malay Syllables Speech Recognition Using Hybrid Neural Network

  • Ahmad, Abdul Manan;Eng, Goh Kia
    • 제어로봇시스템학회: Conference Proceedings / 제어로봇시스템학회 2005 ICCAS / pp. 287-289 / 2005
  • This paper presents a hybrid neural network system that uses a Self-Organizing Map and a Multilayer Perceptron (MLP) for Malay syllable speech recognition. The novel idea in this system is the use of a two-dimensional self-organizing feature map as a sequential mapping function that transforms the phonetic similarities or acoustic vector sequences of the speech frames into trajectories in a square matrix whose elements take binary values. This property simplifies the classification task. An MLP is then used to classify the trajectories that correspond to each syllable in the vocabulary. System performance was evaluated on the recognition of 15 common Malay syllables. The overall recognition performance was 91.8%.
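The trajectory idea in this abstract can be illustrated with a minimal sketch: each speech frame is assigned to its closest map node, and the visited nodes are marked in a binary square matrix. The tiny 2x2 codebook and the frame vectors are hypothetical; a trained self-organizing map would supply the weights, and the paper's exact mapping may differ.

```python
def best_matching_unit(frame, codebook):
    """Return the (row, col) of the map node whose weight vector is closest to frame."""
    best, best_dist = None, float("inf")
    for pos, weights in codebook.items():
        dist = sum((f - w) ** 2 for f, w in zip(frame, weights))
        if dist < best_dist:
            best, best_dist = pos, dist
    return best


def trajectory_matrix(frames, codebook, size):
    """Mark every map node visited by the frame sequence with 1, all others with 0."""
    matrix = [[0] * size for _ in range(size)]
    for frame in frames:
        row, col = best_matching_unit(frame, codebook)
        matrix[row][col] = 1
    return matrix


# Hypothetical 2x2 map with 2-dimensional weight vectors.
codebook = {
    (0, 0): [0.0, 0.0],
    (0, 1): [0.0, 1.0],
    (1, 0): [1.0, 0.0],
    (1, 1): [1.0, 1.0],
}
frames = [[0.1, 0.1], [0.2, 0.9], [0.9, 0.8]]
binary_map = trajectory_matrix(frames, codebook, size=2)
```

Flattening `binary_map` row by row would give a fixed-length binary input vector for a classifier such as an MLP, regardless of how many frames the syllable contains.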


Sound Reinforcement Based on Context Awareness for Hearing Impaired

  • 최재훈;장준혁
    • 대한전자공학회논문지SP / Vol. 48, No. 5 / pp. 109-114 / 2011
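  • This paper proposes a sound reinforcement algorithm for hearing-impaired people that uses acoustic data and is built on a context-awareness system based on a Gaussian Mixture Model (GMM). Mel-Frequency Cepstral Coefficient (MFCC) feature vectors are extracted from the acoustic signal to build the GMM, and sound reinforcement is applied when the context-awareness result indicates a dangerous sound. Experimental results show that the proposed context-awareness-based sound reinforcement algorithm performs well in a variety of acoustic environments.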
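As a rough illustration of GMM-based context classification, the sketch below scores a feature vector under one diagonal-covariance mixture per class and picks the class with the highest log-likelihood. The two-dimensional "MFCC" vectors, the mixture parameters, the class labels, and the fixed gain applied to dangerous sounds are all assumptions made for the example; the paper's trained models and reinforcement rule are not given in the abstract.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, components):
    """Log-likelihood of x under a mixture given as (weight, mean, var) tuples."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in components]
    peak = max(logs)
    # log-sum-exp keeps the computation numerically stable
    return peak + math.log(sum(math.exp(value - peak) for value in logs))


def classify(x, models):
    """Return the label of the model with the highest log-likelihood for x."""
    return max(models, key=lambda label: gmm_log_likelihood(x, models[label]))


# Hypothetical class models; real systems train these on many MFCC frames.
models = {
    "danger": [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [6.0, 5.0], [1.0, 1.0])],
    "normal": [(1.0, [0.0, 0.0], [1.0, 1.0])],
}
label = classify([5.0, 4.5], models)
# Assumed reinforcement rule: amplify only sounds classified as dangerous.
gain = 2.0 if label == "danger" else 1.0
```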

A Study on the Prediction and Database Program of Ship Noise

  • 박종현;김동해
    • 한국소음진동공학회: Conference Proceedings / 한국소음진동공학회 2001 Spring Conference Proceedings / pp. 149-154 / 2001
  • Ship owners are demanding quieter vessels because crews have become more sensitive to their acoustic environment. Accordingly, shipyard designers need to respond intelligently to the challenging requirement of delivering a quiet vessel. In the early design stage, a statistical approach to predicting shipboard noise is preferred to other methods because of its simplicity. However, since the noise characteristics of ships vary continuously with their environments, the prediction formula must be updated using a database management system. This paper describes the features of the database program together with the prediction method. Database management programs with a GUI are deployed on an intranet system accessible to all users. A statistical approach based on multiple regression analysis is used to predict the A-weighted noise level in ship cabins. The noise levels in ship cabins are mainly affected by the deadweight, the type of ship, the relative location of engines and cabins, the type of deckhouse, and similar parameters. Verification shows that the formulas are accurate to within 3 dB in 83% of cabins.
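The two quantitative steps named in this abstract, fitting a multiple regression and checking how many predictions fall within a dB tolerance, can be sketched as follows. The two predictors (a deadweight figure and a cabin-to-engine distance), all data values, and the helper names are hypothetical; the abstract does not give the actual regression formula.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_least_squares(rows, targets):
    """Fit coefficients (intercept first) from the normal equations X^T X b = X^T y."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)


def fraction_within(predicted, measured, tolerance_db=3.0):
    """Share of cabins whose prediction error is at most tolerance_db."""
    hits = sum(abs(p - m) <= tolerance_db for p, m in zip(predicted, measured))
    return hits / len(measured)


# Hypothetical training data: (deadweight figure, decks between cabin and engine).
rows = [(5.0, 1.0), (10.0, 2.0), (15.0, 1.0), (20.0, 3.0), (8.0, 4.0)]
levels = [64.5, 59.0, 59.5, 51.0, 54.0]  # made-up noise levels in dB(A)
coeffs = fit_least_squares(rows, levels)

# Hypothetical verification set: predicted versus measured levels for four cabins.
share = fraction_within([60.0, 55.0, 50.0, 65.0], [61.0, 59.0, 50.5, 64.0])
```

Categorical parameters such as ship type or deckhouse type would need to be encoded numerically (for example as indicator variables) before entering such a fit.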


Performance Comparison of Guitar Chords Classification Systems Based on Artificial Neural Network

  • 박선배;유도식
    • 한국멀티미디어학회논문지 / Vol. 21, No. 3 / pp. 391-399 / 2018
  • In this paper, we construct and compare various guitar chord classification systems that use perceptron neural networks and convolutional neural networks, with no pre-processing other than the Fourier transform, to identify the optimal chord classification system. For better feature extraction, conventional guitar chord classification schemes use computationally demanding pre-processing techniques, such as stochastic analysis employing a hidden Markov model or acoustic data filtering, and are therefore burdensome for real-time chord classification. For this reason, we construct various perceptron neural networks and convolutional neural networks that use only the Fourier transform for data pre-processing and compare them on a dataset obtained by playing an electric guitar. According to our comparison, convolutional neural networks provide the best performance when both chord classification accuracy and processing time are considered. In particular, convolutional neural networks exhibit robust performance even when only a small fraction of the low-frequency components of the data is used.
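A minimal sketch of Fourier-only pre-processing that keeps just the low-frequency part of the spectrum is shown below. It uses a direct O(n^2) discrete Fourier transform to stay dependency-free; the frame length, the test tone, and the `keep_fraction` parameter are assumptions for illustration, not values from the paper.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude spectrum of a real signal for the non-negative frequency bins."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * cmath.pi * k * t / n) for t, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def low_frequency_features(samples, keep_fraction):
    """Keep only the lowest keep_fraction of the non-negative frequency bins."""
    spectrum = dft_magnitudes(samples)
    keep = max(1, int(len(spectrum) * keep_fraction))
    return spectrum[:keep]


# Hypothetical test tone: 3 full cycles across a 64-sample frame.
frame = [math.sin(2 * math.pi * 3 * t / 64) for t in range(64)]
features = low_frequency_features(frame, keep_fraction=0.25)
peak_bin = features.index(max(features))
```

The truncated magnitude vector would serve as the network input. A practical system would use an FFT routine rather than this direct transform.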

SPEECH TRAINING TOOLS BASED ON VOWEL SWITCH/VOLUME CONTROL AND ITS VISUALIZATION

  • Ueda, Yuichi;Sakata, Tadashi
    • 한국방송∙미디어공학회: Conference Proceedings / 한국방송공학회 2009 IWAIT / pp. 441-445 / 2009
  • We have developed a real-time software tool that extracts a speech feature vector whose time sequences consist of three groups of components: phonetic/acoustic features such as formant frequencies, phonemic features produced as neural network outputs, and distances between Japanese phonemes. Because the phoneme distances for the five Japanese vowels can express vowel articulation, we have designed a switch, a volume control, and a color representation that are operated by pronouncing vowel sounds. As examples of these vowel interfaces, we have developed speech training tools for aurally or vocally handicapped children that display an image character or a rolling color ball and that control a cursor's movement. In this paper, we introduce the functions and principles of these systems.


The Initial Voiced Stops in Japanese

  • 김선희
    • 음성과학 / Vol. 9, No. 4 / pp. 201-214 / 2002
  • In Japanese, voiced and voiceless stops contrast phonologically in both initial and non-initial positions. In Korean, however, voiced sounds do not occur in initial position. As a result, initial voiced sounds are expected to be difficult for Korean speakers to pronounce. In this study, I analyzed voiced/voiceless minimal pairs produced by Japanese and Korean speakers and conducted a perception experiment in which Japanese listeners heard the Korean speakers' pronunciations. The Japanese speakers' pronunciations showed distinct acoustic differences between voiced and voiceless stops, especially in VOT. Vowels after voiced stops were longer than vowels after voiceless stops, and vowel pitch was higher after voiceless stops. The Korean speakers, on the other hand, showed three patterns for voiced sounds: -VOT values like those of native speakers, +VOT values, and prenasalized stops, before which a nasal formant tended to occur. The Korean speakers pronounced voiceless sounds as strongly aspirated, unaspirated, or tense sounds. Finally, the Japanese listeners judged as voiced not only sounds with -VOT values and prenasalized sounds but also sounds with +VOT values. This suggests that VOT should not be regarded as the sole feature of voicing and that other phonetic characteristics, such as lengthening of the following vowel, should also be considered.


Effect of Radiation Therapy on Voice Parameters in Early Glottic Cancer and Normal Larynx

  • 김민식;박한종;선동일;박영학;조승호
    • 대한후두음성언어의학회지 / Vol. 7, No. 1 / pp. 32-38 / 1996
  • The preservation of the voice-producing mechanism is an important feature in the management of laryngeal cancer by radiotherapy. However, radiation therapy has certain side effects, such as mucositis, tissue edema, necrosis, and fibrosis, that could affect normal voice production. Several subjective studies that used questionnaires and auditory perceptual judgements of voice have been interpreted to mean that radiation results in a normal or near-normal voice. Objective evidence of the status of vocal function after radiation treatment, however, is still lacking. We analyzed the changes that occur in voice parameters in a group of patients undergoing radiation therapy in order to determine the effect of radiation on voice quality. In this study, acoustic and aerodynamic measures of vocal function were used to determine the characteristics of voice production. We found that voice parameters in early glottic cancer changed meaningfully compared with those of the normal larynx, with or without radiation, and that radiation therapy has little effect on the normal larynx.


Computer-Based Fluency Evaluation of English Speaking Tests for Koreans

  • 장병용;권오욱
    • 말소리와 음성과학 / Vol. 6, No. 2 / pp. 9-20 / 2014
  • In this paper, we propose an automatic fluency evaluation algorithm for English speaking tests. In the proposed algorithm, acoustic features are extracted from an input spoken utterance, and a fluency score is then computed using support vector regression (SVR). We estimate the parameters of the feature models and the SVR from the speech signals and the corresponding scores assigned by human raters. Correlation analysis shows that speech rate, articulation rate, and mean length of runs are the best features for fluency evaluation. Experimental results show that the correlation between the human score and the SVR score is 0.87 for 3 speaking tests, which suggests that the proposed algorithm could serve as a secondary fluency evaluation tool.
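The three features highlighted in this abstract can be computed from a pause-segmented utterance, as in the sketch below. It assumes commonly used definitions (speech rate counts pauses in the duration, articulation rate excludes them, and a run is a stretch of speech between pauses); the paper's exact definitions and units may differ, and the syllable counts and durations are made up.

```python
def fluency_features(runs, pauses):
    """Compute three temporal fluency features.

    runs: list of (syllable_count, duration_seconds) for each stretch of speech.
    pauses: list of pause durations in seconds between the runs.
    """
    total_syllables = sum(count for count, _ in runs)
    phonation_time = sum(duration for _, duration in runs)
    total_time = phonation_time + sum(pauses)
    return {
        # syllables per second over the whole utterance, pauses included
        "speech_rate": total_syllables / total_time,
        # syllables per second while actually speaking
        "articulation_rate": total_syllables / phonation_time,
        # average number of syllables produced between pauses
        "mean_length_of_runs": total_syllables / len(runs),
    }


# Hypothetical utterance: three runs separated by two half-second pauses.
runs = [(6, 1.5), (4, 1.0), (10, 2.5)]
pauses = [0.5, 0.5]
features = fluency_features(runs, pauses)
```

A feature vector like this, paired with human rater scores, is the kind of input a regression model such as SVR would be trained on.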

Cosmic Web traced by ELGs and LRGs from the Multidark Simulation

  • 김도일
    • 천문학회보 / Vol. 41, No. 1 / pp. 72.1-72.1 / 2016
  • Current and planned large-volume surveys such as the Sloan Digital Sky Survey extended Baryon Oscillation Spectroscopic Survey (SDSS IV-eBOSS) or the Dark Energy Spectroscopic Instrument (DESI) will use Luminous Red Galaxies (LRGs) and Emission Line Galaxies (ELGs) to map the cosmic web up to z~1.7, and will allow one to accurately constrain cosmological models and obtain crucial information on the nature of dark energy and the expansion history of the Universe in novel epochs - particularly by measuring the Baryon Acoustic Oscillation (BAO) feature with improved accuracy. To this end, we present here a study of the spatial distribution and clustering of a sample of LRGs and ELGs obtained from a sub-volume of the MultiDark simulation complemented by different semi-analytic prescriptions, and investigate how these two different populations trace the cosmic web at different redshift intervals - along with their synergy. This is the first step towards the interpretation of upcoming ELG and LRG data.
