• Title/Summary/Keyword: 발성특성 (phonation characteristics)

Word Recognition Using VQ and Fuzzy Theory (VQ와 Fuzzy 이론을 이용한 단어인식)

  • Kim, Ja-Ryong;Choi, Kap-Seok
    • The Journal of the Acoustical Society of Korea / v.10 no.4 / pp.38-47 / 1991
  • Frequency variation among speakers is one of the problems in speech recognition. This paper applies fuzzy theory to the variation of frequency features. Reference patterns are expressed as fuzzified patterns produced from the peak frequency and peak energy extracted from codebooks, which are generated from training words uttered by several speakers so that they include the common features of the speech signals. Words are recognized by fuzzy inference using the certainty factor between the reference patterns and the test fuzzified patterns, which are produced from the peak frequency and peak energy extracted from the power spectrum of the input speech signal. To reduce memory capacity and computation requirements when computing the certainty factor, we propose a new equation that calculates an improved certainty factor using only the difference between two fuzzy values. Experiments on Korean digits show that this word recognition method, using the new equation, can solve the variation problem of frequency features while reducing memory capacity and computation requirements.

The suppression of noise-induced speech distortions for speech recognition (음성인식을 위한 잡음하의 음성왜곡제거)

  • Chi, Sang-Mun;Oh, Yung-Hwan
    • Journal of the Korean Institute of Telematics and Electronics S / v.35S no.12 / pp.93-102 / 1998
  • In noisy environments, human speech production is influenced by the noise (the Lombard effect), and the speech signal itself is contaminated. These distortions dramatically reduce the performance of speech recognition systems. This paper proposes a method of Lombard effect compensation and noise suppression to improve speech recognition performance in noisy environments. To estimate the intensity of the Lombard effect, which is a nonlinear distortion that depends on the ambient noise level, the speaker, and the phonetic unit, we formulate a measure of the Lombard effect level based on the acoustic speech signal and use it to compensate for the Lombard effect. The distortions of speech in noisy environments are cancelled out as follows. First, spectral subtraction and band-pass filtering are used to cancel out noise. Second, energy normalization is proposed to cancel out the variation in vocal intensity caused by the Lombard effect. Finally, the Lombard effect level controls the transform that converts the Lombard speech cepstrum to the clean speech cepstrum. The proposed method was validated on a 50-word Korean recognition task. Average recognition rates at SNRs of 0, 10, and 20 dB were 82.6%, 95.7%, and 97.6% with the proposed method, compared with 46.3%, 75.5%, and 87.4% without any compensation.
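
    The first two steps named in the abstract above (spectral subtraction, then energy normalization) can be illustrated with a minimal sketch. This is a generic textbook form, not the authors' implementation: the function names, the non-overlapping framing, the `floor` parameter, and RMS-based normalization are all assumptions made for illustration, and the band-pass filtering and Lombard-level-controlled cepstral transform are not covered. The noise-only segment is assumed to contain at least one full frame.

```python
import numpy as np

def spectral_subtract(noisy, noise_only, frame_len=256, floor=0.01):
    """Magnitude spectral subtraction on non-overlapping frames (illustrative).

    noisy      : 1-D array, speech plus noise
    noise_only : 1-D array, a noise-only stretch (at least one frame long)
    floor      : fraction of the noisy magnitude kept as a lower bound
    """
    noisy = np.asarray(noisy, dtype=float)
    noise_only = np.asarray(noise_only, dtype=float)

    # Average magnitude spectrum over the noise-only frames.
    n_noise = len(noise_only) // frame_len
    noise_frames = noise_only[: n_noise * frame_len].reshape(n_noise, frame_len)
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)

    # Split the noisy signal into frames (any trailing partial frame is dropped).
    n = len(noisy) // frame_len
    frames = noisy[: n * frame_len].reshape(n, frame_len)
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Subtract the noise estimate, never going below a small spectral floor.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)

    # Rebuild the waveform using the phase of the noisy signal.
    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    return clean.reshape(-1)


def normalize_energy(signal, target_rms=1.0):
    """Scale a signal to a fixed RMS level (one simple form of energy normalization)."""
    signal = np.asarray(signal, dtype=float)
    rms = np.sqrt(np.mean(signal ** 2))
    return signal if rms == 0 else signal * (target_rms / rms)
```

    A call such as `normalize_energy(spectral_subtract(noisy, noise_only))` would then feed a feature extractor. Practical systems usually use overlapping windowed frames with overlap-add; that is omitted here to keep the sketch short.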

The Perceptual Evaluation and Aerodynamic Analysis of Spasmodic Dysphonia (연축성발성장애의 청지각적 평가 및 공기역학적 특성)

  • Park, Sun-Young;Kim, Jae-Ock;Lim, Sung-Eun;Nam, Do-Hyun;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.19 no.1 / pp.38-42 / 2008
  • Background and Objectives: This study investigated the perceptual and aerodynamic characteristics of adductor spasmodic dysphonia and the relation between vocal efficiency and the severity of strained voice. Materials and Methods: Thirteen female patients with adductor spasmodic dysphonia were examined and compared with a control group of 10 normal females. MPT, MFR, Psub, sound intensity, and VE (vocal efficiency) were obtained using the PAS (Phonatory Aerodynamic System). The GRBA(S) scale was used for perceptual evaluation. Results: Psub (subglottic pressure) in the SD group was significantly higher than in the normal group. MPT, MFR, sound intensity, and VE did not differ significantly between the two groups. The correlation between VE and 'S' (strained) was not significant. Conclusion: The results of this study show that certain aerodynamic parameters (Psub) distinguish adductor spasmodic dysphonia from normal voice.

Aerodynamic Characteristics of Young and Elderly Adult Patients with Voice Disorders during Continuous Speech (젊은 성인 및 노인 음성장애 환자의 연속발화시 공기역학적 특성 비교)

  • Pyo, Hwa-young
    • The Journal of the Korea Contents Association / v.19 no.12 / pp.270-278 / 2019
  • This study compared the aerodynamic characteristics of young and elderly adult male patients with voice disorders during continuous speech. Aerodynamic measurements were obtained after 12 young male patients and 9 elderly male patients read a paragraph. The elderly group showed longer duration and lower airflow rate and air volume than the younger group, but the differences were not significant except for phonation time. Therefore, when interpreting the aerodynamic measures of elderly patients with voice disorders in terms of airflow and air volume, various conditions (e.g., reading materials, pulmonary function) should be taken into account as well as age.

The Speaker Recognition System using the Pitch Alteration (피치변경을 이용한 화자인식 시스템)

  • Jung JongSoon;Bae MyungJin
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.115-118 / 2002
  • Parameters used in a speaker recognition system should fully express the speaker's characteristics contained in the speech. That is, a characteristic whose inter-speaker variance is larger than its intra-speaker variance is useful for distinguishing between speakers. To minimize errors between speakers, improved recognition technology is required in addition to distinguishing characteristics. Recent simulation results show that more accurate performance is obtained by using dynamic characteristics together with the constant characteristics of a speaking habit. We therefore propose the following approach. Prosodic information is used as a feature vector of speech. The feature vector generally used in speaker recognition systems models spectral information and performs well in noise-free conditions. However, this feature vector is distorted in noisy conditions, which reduces the recognition rate. In this paper, we alter the pitch contour, divided into segments, so that a dynamic characteristic can be estimated, and use it as a recognition feature. Simulation confirmed that the dynamic characteristic is very robust in noisy conditions. Acceptance or rejection is decided by comparison with the test pattern, and the recognition rate of the proposed algorithm improves on that obtained with spectral and prosodic information. In particular, the simulation shows that a stable recognition rate can be obtained in noisy conditions.

Study on behavioral change of estrus in Hanwoo (Korean native cattle) (한우 발정기 행동변화에 대한 연구)

  • Cheon, Si Nae;Yoo, Geum Zoo;Kim, Chan Ho;Jung, Ji Yeon;Kim, Dong Hun;Jeon, Jung Hwan
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.11 / pp.825-832 / 2020
  • The detection of estrus is very important for reproductive efficiency in cattle. This has prompted the development of electronic estrus detection techniques based on the characterization of estrus behavior. The objective of this study was to investigate the changes in physical activity, mounting behavior, and vocalization during estrus in Hanwoo (Korean native cattle). Bio-telemetry devices were attached to 4 multiparous Hanwoo, and physical activity, mounting behavior, and vocalization were compared for 6 days (from 2 days before the day of estrus to 3 days after the day of estrus). Physical activity increased rapidly on the day of estrus (p<0.001) and was frequently observed at night. Mounting behavior increased gradually from 2 days before the day of estrus and reached its highest level on the day of estrus (p<0.01). The circadian rhythm showed irregularities during this entire period (p>0.05). There was no significant difference in vocalization during the experimental period (p>0.05). In conclusion, mounting behavior appears to be an early indicator of estrus in Hanwoo, and considering mounting behavior and physical activity together would make it possible to detect estrus with a higher probability. Further studies with more information from different sources on measuring estrus in Hanwoo are needed.

A Study on Micro-Battery using Beta Source (베타선원을 이용한 마이크로전지 구현연구)

  • Lee, Nam-Ho;Jung, Hyun-Kyu;Jung, Seung-Ho
    • Proceedings of the KIEE Conference / 2006.04b / pp.369-371 / 2006
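  • When a semiconductor is irradiated with ionizing radiation, electron-hole pairs (EHPs) are generated inside the device. Depending on the radiation source, this generated charge can serve as a semi-permanent power source. As part of research toward developing a radiation battery that uses the beta-emitting radioisotope Ni-63, whose half-life is about 100 years, and a PN-junction semiconductor device, this paper examines an efficient radiation conversion method, a current generation method, and the implementation of the battery circuit. In addition, the beta spectrum of Ni-63 radiation was calculated, and the response of a silicon PN-junction diode to the corresponding optical-input characteristics was verified using solar-cell simulation software.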

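Memory Characteristics of Organic Bistable Devices with Eco-Friendly CuInS2 Nanoparticles Dispersed in a Polymer Thin-Film Layer (친환경 CuInS2 나노입자가 고분자 박막층 안에 분산된 유기 쌍안정성 소자의 메모리 특성)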

  • Ju, Ang;Yun, Dong-Yeol;Kim, Tae-Hwan
    • Proceedings of the Korean Vacuum Society Conference / 2014.02a / pp.393.1-393.1 / 2014
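  • Organic nonvolatile memory devices fabricated from organic/inorganic nanocomposites are being studied extensively because of their simple processing and their potential application in flexible devices. Although the electrical properties of nonvolatile memory devices based on organic/inorganic nanocomposites have been widely studied, few studies have addressed the memory characteristics of organic bistable devices fabricated with eco-friendly CuInS2 (CIS) nanoparticles dispersed in a poly(N-vinylcarbazole) (PVK) thin film. In this study, memory devices using CIS nanoparticles dispersed in a PVK thin-film layer were fabricated and their electrical characteristics were investigated. Chemically synthesized CIS nanoparticles were added to PVK dissolved in toluene, and the two materials were mixed uniformly with an ultrasonic agitator. The mixed solution of CIS nanoparticles and PVK was spin-coated onto a glass substrate on which indium-tin-oxide, serving as the electrode, had been grown, and the solvent was then removed by heating to form a nanocomposite thin film with CIS nanoparticles dispersed in PVK. Al was thermally evaporated onto the nanocomposite thin film as the top electrode to complete the nonvolatile memory device. Current-voltage measurements of the fabricated device showed memory characteristics, and comparison with a device without CIS nanoparticles confirmed that the CIS nanoparticles are responsible for the memory behavior of the nonvolatile memory device. Current-time measurements showed that the ON/OFF current ratio was maintained without significant change over more than 1,000 cycles, confirming the stability of the device's electrical memory states.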

Personal Credit Evaluation System through Telephone Voice Analysis: By Support Vector Machine

  • Park, Hyungwoo
    • Journal of Internet Computing and Services / v.19 no.6 / pp.63-72 / 2018
  • The human voice is one of the easiest means of transmitting information between people. Voice characteristics vary from person to person and include the speed of speech, the form and function of the vocal organs, pitch, speech habits, and gender. The human voice is a key element of human communication. In the era of the Fourth Industrial Revolution, voice is also a major means of communication between humans, between humans and machines, and between machines. For that reason, people try to communicate their intentions to others clearly, and in the process the voice carries various additional information along with the linguistic information, such as emotional state, health status, degree of trust, presence of a lie, and changes due to drinking. This linguistic and non-linguistic information appears in various parameters through voice analysis and can therefore be used to evaluate an individual's creditworthiness. In particular, it can be obtained by analyzing the relationship between the characteristics of the fundamental frequency (basic tonality) of the vocal cords and the characteristics of the resonance frequency of the vocal tract. Previous research studied the necessity of various credit evaluation methods and the change in voice characteristics according to the change in credit status. In this study, we propose a personal credit discriminator based on machine learning with parameters extracted from the voice.

Voice range profile in premutation, mutation, and postmutation of men (변성이전, 변성 및 변성이후 남성의 발성범위 프로파일)

  • Kim, Jaeock;Lee, Seung Jin
    • Phonetics and Speech Sciences / v.13 no.4 / pp.89-100 / 2021
  • This study compared voice range profiles (VRPs) obtained with the glissando and simplified VRP methods in 57 men in the premutation (8-13 years), mutation (11-16 years), and postmutation (10-24 years) stages. The difference between the modal and falsetto areas measured with the two VRP methods was also compared. The average fundamental frequency (F0) was in the order premutation > mutation > postmutation. The maximum F0 (F0max), the range of F0 (F0range), the maximum intensity (Imax), and the range of intensity (Irange) were lowest in the mutation stage, and these variables were higher in the falsetto area than in the modal area with both methods. In addition, most VRP variables were higher with glissando than with the simplified VRP, but the differences were not significant. This study showed that in men in the mutation stage, because of temporary anatomical and physiological changes of the larynx, the mechanism of vocal fold vibration changes and the VRP shows a different pattern from that of other age groups. Both the glissando VRP and the simplified VRP are suitable for clinical practice when performed by experienced examiners. It is also necessary to measure not only the falsetto area but also the modal area when measuring the VRP.