• Title/Summary/Keyword: 음성 노력 (voice, effort)

Search Result 148, Processing Time 0.026 seconds

A Research on the state of the utilization of the stock-information-retrieval-service (KT 증권정보 서비스 이용 실태 및 인식 결과 조사)

  • 최영재
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06c / pp.63-66 / 1998
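  • Korea Telecom trial-operated a speech-recognition stock information service on five channels from November 1995 to early 1998 using a PC-based prototype system, and has developed a system that can serve 120 users simultaneously for commercial service. To identify the overall problems of the developed system, a 30-channel trial service has been offered to the general public with it since March 16, 1998. To make speech-recognition telephone information services far more widely used than they are now, we analyze how the service is used to determine which parts should be improved and how, and we are revising the service scenario so that even novice users can use it easily. This paper analyzes various usage factors of users, especially first-time users. It also examines the recognition rate by date before and after the official launch of the speech-recognition stock information service and, for cases in which a user utters the same target word consecutively, the recognition rate for that word. The survey revealed problems that fall into four categories, and efforts to resolve them are under way.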


A Study on Cockpit Voice Command System for Fighter Aircraft (전투기용 음성명령 시스템에 대한 연구)

  • Kim, Seongwoo;Seo, Mingi;Oh, Yunghwan;Kim, Bonggyu
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.41 no.12 / pp.1011-1017 / 2013
  • The human voice is the most natural means of communication, and the need for speech recognition technology is growing as a way to ease the human-machine interface. As digital technology advances, the functions of avionics equipment are becoming more varied and complex, so the workload of fighter pilots increases: they must not only concentrate on the attack function but also operate complicated avionics equipment. Accordingly, if speech recognition technology is applied in the cockpit to operate the avionics equipment, pilots can devote their time and effort to the mission of the fighter aircraft. In this paper, a cockpit voice command system applicable to fighter aircraft has been developed, and the function and performance of the system have been verified.

A Study on Spam Protection Technology for Secure VoIP Service in Broadband convergence Network Environment (BcN 환경에서 안전한 VoIP 서비스를 위한 스팸대응 기술 연구)

  • Sung, Kyung;Kim, Seok-Hun
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.4 / pp.670-676 / 2008
  • Because VoIP service is built on Internet technology, it inherits the security threats that occur in Internet networks, and its real-time service characteristics make it difficult to protect without adjusting or changing existing security solutions. Because voice and data are provided as an integrated service over a single network, the security threats possible in IP networks are inherent, and the effort and cost of protection are relatively greater and more complicated than for protecting a data network alone. This paper defines VoIP spam, analyzes VoIP spam techniques, and examines various countermeasures that can prevent VoIP spam.

The Effect of FIR Filtering and Spectral Tilt on Speech Recognition with MFCC (FIR 필터링과 스펙트럼 기울이기가 MFCC를 사용하는 음성인식에 미치는 효과)

  • Lee, Chang-Young
    • The Journal of the Korea institute of electronic communication sciences / v.5 no.4 / pp.363-371 / 2010
  • In an effort to enhance the quality of feature vector classification and thereby reduce the recognition error rate in speaker-independent speech recognition, we study the effect of applying a spectral tilt to the Fourier magnitude spectrum en route to the extraction of MFCC. The effect of FIR filtering of the speech signal on speech recognition is investigated in parallel. The proposed methods are evaluated in two independent ways: by the Fisher discriminant objective function and by a speech recognition test using a hidden Markov model with fuzzy vector quantization. In the experiments, an appropriate choice of the tilt factor yields a relative improvement of about 10% in recognition error rate over the conventional method.
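
The two operations named in this abstract, FIR filtering of the waveform and a tilt applied to the Fourier magnitude spectrum before MFCC extraction, can be illustrated with a minimal pure-Python sketch. The abstract gives neither the filter taps nor the form of the tilt, so the first-order pre-emphasis taps `[1.0, -0.97]` and the linear weighting in the hypothetical `apply_tilt` function are assumptions chosen only for illustration, not the author's method.

```python
import cmath
import math


def fir_filter(signal, taps):
    """Apply a causal FIR filter: y[n] = sum_k taps[k] * x[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2, computed with a naive O(N^2) DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)
        )
        mags.append(abs(total))
    return mags


def apply_tilt(mags, tilt):
    """Hypothetical linear tilt (an assumption): weight bin k by 1 + tilt * k / (K - 1)."""
    last = max(len(mags) - 1, 1)  # avoid division by zero for a single bin
    return [m * (1.0 + tilt * k / last) for k, m in enumerate(mags)]


if __name__ == "__main__":
    # Toy frame: a short sampled sinusoid, not real speech data.
    frame = [math.sin(2 * math.pi * 2 * i / 16) for i in range(16)]
    emphasized = fir_filter(frame, [1.0, -0.97])  # assumed pre-emphasis taps
    tilted = apply_tilt(magnitude_spectrum(emphasized), tilt=0.5)
    print([round(v, 3) for v in tilted])
```

In a full front end the tilted magnitudes would then pass through the mel filter bank, logarithm, and DCT that produce the MFCC; those stages are omitted here.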

Service Quality Criteria for Voice Services over a WiBro Network (와이브로 네트워크를 통한 음성 서비스의 측정 기반 품질 기준 수립)

  • Kim, Beom-Joon
    • The Journal of the Korea institute of electronic communication sciences / v.6 no.6 / pp.823-829 / 2011
  • This paper covers the service quality of packet-based voice service provided over a wireless broadband (WiBro) network. Using measurement software developed in the course of preparing an advanced service quality management scheme for packet-based voice service over a wireless network[2][3], a large-scale experiment is conducted to measure the actual quality of the voice service. Our analysis of the measurement results indicates that the service quality of the voice service is quite good over WiBro networks. In addition, another experiment, which investigates how degraded wireless transmission conditions affect the service quality of the voice service, identifies the values of the wireless service metrics at which the mean opinion score (MOS) starts to decrease.

Service Quality Criteria for Voice Services over a HSDPA System (HSDPA 시스템을 통한 음성 서비스의 측정 기반 품질 기준 수립)

  • Kim, Beom-Joon
    • The Journal of the Korea institute of electronic communication sciences / v.7 no.2 / pp.249-255 / 2012
  • This paper covers the service quality of packet-based voice service provided over a high speed downlink packet access (HSDPA) system. Using the measurement software developed in the course of preparing an advanced service quality management scheme for packet-based voice service over a wireless network[2][3], a large-scale experiment is conducted to measure the actual quality of the voice service. Our analysis of the measurement results indicates that the service quality of the voice service is quite good over the HSDPA system. In addition, another experiment, which investigates how degraded wireless transmission conditions affect the service quality of the voice service, identifies the values of the wireless service metrics at which the mean opinion score (MOS) starts to decrease.

Identification of Voice Features for Recently Voice Fishing by Voice Analysis (음성 분석을 통한 최근 보이스피싱의 음성 특징 규명)

  • Lee, Bum Joo;Cho, Dong Uk;Jeong, Yeon Man
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.10 / pp.1276-1283 / 2016
  • The scale of financial damage from voice phishing has not decreased despite national and social efforts to reduce it. One reason is the sophisticated, vernacular speech style that makes the offenders difficult to recognize. Furthermore, young people are now being deceived in large numbers, not only by this sophisticated and vernacular speech style, which imitates employees of real public offices, but also by the offenders' use of obtained personal information. This has led directly to financial damage among younger people, who have stronger judgment than older people. We therefore compared and analyzed the voices of voice phishing criminals and of younger people of the same generation in order to identify voice features. The experiment examined pitch, pitch bandwidth, energy, speech speed, and voice color to find differences in voice characteristics between the two groups, covering the period since 2011. The experimental results show a significant difference in energy and speech speed between voice phishing criminals and younger people of the same generation.

Combining multi-task autoencoder with Wasserstein generative adversarial networks for improving speech recognition performance (음성인식 성능 개선을 위한 다중작업 오토인코더와 와설스타인식 생성적 적대 신경망의 결합)

  • Kao, Chao Yuan;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / v.38 no.6 / pp.670-677 / 2019
  • Because background noise in an acoustic signal degrades the performance of speech or acoustic event recognition, extracting noise-robust acoustic features from noisy signals remains challenging. In this paper, we propose a combined structure of Wasserstein Generative Adversarial Network (WGAN) and MultiTask AutoEncoder (MTAE) as a deep learning architecture that integrates the strengths of both, such that it estimates not only noise but also speech features from a noisy acoustic source. The proposed MTAE-WGAN structure is used to estimate the speech signal and the residual noise by employing a gradient penalty and a weight initialization method for Leaky Rectified Linear Unit (LReLU) and Parametric ReLU (PReLU). The proposed MTAE-WGAN structure with the adopted gradient penalty loss function enhances the speech features and subsequently achieves substantial Phoneme Error Rate (PER) improvements over the stand-alone Deep Denoising Autoencoder (DDAE), MTAE, Redundant Convolutional Encoder-Decoder (R-CED) and Recurrent MTAE (RMTAE) models for robust speech recognition.
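
For orientation, the gradient penalty mentioned in this abstract is commonly written as below in the WGAN literature. This is the standard critic loss with gradient penalty, shown as a sketch. The abstract does not state the exact loss the authors use, so the symbols here (critic $D$, real distribution $P_r$, generator distribution $P_g$, penalty weight $\lambda$) are assumptions rather than the paper's notation.

```latex
% Standard WGAN critic loss with gradient penalty (assumed form, not taken from the paper)
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\bigl[D(\tilde{x})\bigr]
    - \mathbb{E}_{x \sim P_r}\bigl[D(x)\bigr]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Bigl[\bigl(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\bigr)^2\Bigr],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The last term pushes the critic's gradient norm toward 1 at points interpolated between real and generated samples, which is how the gradient penalty enforces the Lipschitz constraint without weight clipping.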

Classification of muscle tension dysphonia (MTD) female speech and normal speech using cepstrum variables and random forest algorithm (켑스트럼 변수와 랜덤포레스트 알고리듬을 이용한 MTD(근긴장성 발성장애) 여성화자 음성과 정상음성 분류)

  • Yun, Joowon;Shim, Heejeong;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.12 no.4 / pp.91-98 / 2020
  • This study investigated the acoustic characteristics of the sustained vowel /a/ and sentence utterances produced by patients with muscle tension dysphonia (MTD) using cepstrum-based acoustic variables. Thirty-six women diagnosed with MTD and the same number of women with normal voices participated in the study, and the data were recorded and measured with ADSV™. The results demonstrated that, among all of the variables, cepstral peak prominence (CPP) and CPP_F0 were statistically significantly lower than those of the control group. On the GRBAS scale, overall severity (G) was most prominent in the voice quality of MTD patients, followed in order by roughness (R), breathiness (B), and strain (S). As these characteristics increased, a statistically significant negative correlation with CPP was observed. We then tried to classify the MTD and control groups using the CPP and CPP_F0 variables. Statistical modeling with a Random Forest machine learning algorithm yielded much higher classification accuracy (100% on training data and 83.3% on test data) in the sentence reading task, with CPP proving to play the more crucial role in both the vowel and sentence reading tasks.

Phonetic Tied-Mixture Syllable Model for CSR (연속 음성 인식을 위한 PTM 음절 모델)

  • Kim Bong-Wan;Lee Yong-Ju
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.33-36 / 2004
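  • Efforts to use the syllable as the recognition unit have recently been reported as a way to improve performance in continuous speech recognition. Compared with phonemes, however, syllables are harder to train and require a large number of models, which makes context-dependent modeling at syllable boundaries difficult. To overcome these drawbacks, this paper proposes a method that synthesizes syllable models from monophones and triphones. The proposed model improves recognition speed by an average of 55% compared with triphones and by an average of 13% compared with PTM, and at the same speed its ERR improves by about 8% relative to both the PTM and triphone models.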
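
The abstract does not expand ERR. Assuming it denotes a relative error reduction rate, the quantity can be computed as in the minimal sketch below. The function name and the error rates are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction by which the error rate fell relative to the baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical word error rates in percent; illustrative values only.
    baseline, proposed = 10.0, 9.2
    reduction = relative_error_reduction(baseline, proposed)
    print(f"relative error reduction: {reduction:.1%}")
```

With these illustrative numbers, an absolute drop of 0.8 percentage points corresponds to a relative reduction of roughly 8%, which shows why relative and absolute improvements should not be confused.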
