• Title/Summary/Keyword: Voice recognition rate


Voice Recognition Module for Multi-functional Electric Wheelchair (다기능 전동휠체어의 음성인식 모듈에 관한 연구)

  • 류홍석;김정훈;강성인;강재명;이상배
    • Proceedings of the IEEK Conference / 2002.06c / pp.83-86 / 2002
  • This paper aims to provide convenience to disabled people who have lost the use of their limbs through voice recognition technology. The voice recognition part of the system uses DTW (Dynamic Time Warping), which is the most widely used method in speaker-dependent systems. In particular, the S/N ratio was improved with a Wiener filter in the preprocessing phase to account for real environmental conditions; in the feature extraction process, a 12th-order feature pattern per frame is extracted using LPC and cepstrum for use by the DTW algorithm. Furthermore, miniaturization is pursued by using the TMS320C32, TI's floating-point DSP, for the hardware part. Currently, 90% of the hardware porting has been completed, and running the DTW algorithm on a PC confirmed a recognition rate of 96%.

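As a rough illustration of the DTW matching that the abstract above relies on, the following Python sketch aligns two sequences of per-frame feature vectors and picks the closest stored template. It is a textbook DTW under stated assumptions, not the authors' implementation: the function names (`frame_distance`, `dtw_distance`, `recognize`), the Euclidean frame distance, and the unconstrained step pattern are illustrative choices, and the Wiener filtering and LPC/cepstrum feature extraction are assumed to have happened beforehand.

```python
import math


def frame_distance(x, y):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def dtw_distance(a, b):
    """Accumulated DTW distance between two sequences of feature vectors.

    cost[i][j] holds the minimal accumulated distance for aligning
    a[:i] with b[:j]; each step may advance a, b, or both.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the template closest to the utterance.

    `templates` is a hypothetical dict mapping a command word to one
    reference feature sequence recorded by the same speaker.
    """
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))
```

Because the alignment may repeat frames, an utterance spoken at half the speed of its template still matches it with zero accumulated distance, which is the property that makes DTW attractive for small speaker-dependent command sets.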

The Study on the Quality Assessment Model of Aircraft Voice Recognition Software (항공기 음성인식 소프트웨어 품질 평가 모델 연구)

  • Lee, Seung-Mok
    • Journal of Software Assessment and Valuation / v.15 no.2 / pp.73-83 / 2019
  • Voice recognition has recently improved with AI (Artificial Intelligence), which has greatly reduced the false recognition rate and provides an effective and efficient Human Machine Interface (HMI). This trend has also reached the defense industry, particularly aviation, as in the F-35. However, quality evaluation of voice recognition in the defense industry, especially for aircraft, requires measurable quantitative models. This paper proposes a quantitative evaluation model for applying voice recognition to aircraft. The evaluation items are identified from voice recognition technology and from the ISO/IEC 25000 (SQuaRE) quality attributes. Using these two perspectives, the quantitative evaluation model is proposed for aircraft operating conditions, and the evaluation results are confirmed.

Speech Intelligibility of Alaryngeal Voices and Pre/Post Operative Evaluation of Voice Quality using the Speech Recognition Program(HUVOIS) (음성인식프로그램을 이용한 무후두 음성의 말 명료도와 병적 음성의 수술 전후 개선도 측정)

  • Kim, Han-Su;Choi, Seong-Hee;Kim, Jae-In;Lee, Jae-Yol;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.15 no.2 / pp.92-97 / 2004
  • Background and Objectives: The purpose of this study was to objectively examine pre- and postoperative voice quality and the intelligibility of alaryngeal voice using the speech recognition program HUVOIS. Materials and Methods: Two laryngologists and one speech pathologist rated 'G', 'R', and 'B' on the GRBAS scale and rated speech intelligibility on the NTID rating scale from a standard paragraph. Acoustic estimates such as jitter, shimmer, and HNR were also obtained from Lx Speech Studio. Results: The speech recognition rate did not differ significantly between pre- and postoperative pathological voice samples, although voice quality (G, B) and acoustic values (jitter, HNR) improved significantly after the operation. Among alaryngeal voices, the reed-type electrolarynx 'Moksori' scored highest in both speech intelligibility and speech recognition rate, whereas esophageal speech scored lowest. A correlation between speech intelligibility and speech recognition rate was found in alaryngeal voices, but not in pathological voices. Conclusion: The current study did not show that the speech recognition program HUVOIS, used in a telephone program, is an objective and efficient method for assisting the subjective GRBAS scale.


A Study on Emotion Recognition of Chunk-Based Time Series Speech (청크 기반 시계열 음성의 감정 인식 연구)

  • Hyun-Sam Shin;Jun-Ki Hong;Sung-Chan Hong
    • Journal of Internet Computing and Services / v.24 no.2 / pp.11-18 / 2023
  • Recently, in the field of Speech Emotion Recognition (SER), many studies have been conducted to improve accuracy using voice features and modeling. In addition to modeling studies that aim to improve the accuracy of existing voice emotion recognition, various studies using voice features are being conducted. In this paper, voice files are split by time interval as a time series, focusing on the fact that voice emotions are related to the flow of time. After the voice files are split, we propose a model that classifies the emotions of speech data by extracting the speech features Mel, Chroma, zero-crossing rate (ZCR), root mean square (RMS), and mel-frequency cepstral coefficients (MFCC) and applying them to a recurrent neural network model used for sequential data processing. In the proposed method, voice features were extracted from all files using the 'librosa' library and applied to neural network models. The experiments compared and analyzed the performance of recurrent neural network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU) models using the English-language Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
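To make the chunk-based idea in the abstract above more concrete, here is a minimal Python sketch that splits a signal into fixed-length time chunks and computes two of the listed features, ZCR and RMS, for each chunk. It deliberately uses only the standard library rather than 'librosa', so the feature definitions are simplified textbook versions; the function names and the `chunk_seconds` parameter are assumptions made for illustration and do not come from the paper.

```python
import math


def split_into_chunks(samples, sample_rate, chunk_seconds):
    """Split a list of samples into consecutive chunks of chunk_seconds each.

    The final chunk may be shorter than the others.
    """
    size = int(sample_rate * chunk_seconds)
    if size <= 0:
        raise ValueError("chunk length must cover at least one sample")
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def zero_crossing_rate(chunk):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(chunk) < 2:
        return 0.0
    crossings = sum(
        1 for x, y in zip(chunk, chunk[1:]) if (x >= 0) != (y >= 0)
    )
    return crossings / (len(chunk) - 1)


def rms(chunk):
    """Root mean square amplitude of a chunk."""
    if not chunk:
        return 0.0
    return math.sqrt(sum(x * x for x in chunk) / len(chunk))


def chunk_features(samples, sample_rate, chunk_seconds):
    """Return one (zcr, rms) pair per chunk, in time order."""
    chunks = split_into_chunks(samples, sample_rate, chunk_seconds)
    return [(zero_crossing_rate(c), rms(c)) for c in chunks]
```

The resulting list of per-chunk feature tuples is the kind of ordered sequence that a recurrent model such as an RNN, LSTM, or GRU consumes one time step at a time.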

Implementation of Motorized Wheelchair using Speaker Independent Voice Recognition Chip and Wireless Microphone (화자 독립 방식의 음성 인식 칩 및 무선 마이크를 이용한 전동 휄체어의 구현)

  • Song, Byung-Seop;Lee, Jung-Hyun;Park, Jung-Jae;Park, Hee-Joon;Kim, Myoung-Nam
    • Journal of Sensor Science and Technology / v.13 no.1 / pp.20-26 / 2004
  • A motorized wheelchair activated by a speaker-independent voice recognition module was implemented for disabled people who cannot use their limbs. A wireless voice transfer device was designed and employed for user convenience. The wheelchair was designed so that the user can select voice or keypad operation, allowing keypad control when necessary. The speaker-independent method was chosen for the voice recognition module so that anyone can operate the wheelchair when assistance is needed. Tests of the performance and motion of the implemented wheelchair showed a voice recognition rate of over 97% and proper movement.

A Study on Motion Control of the Pet-Robot using Voice-Recognition (음성인식을 이용한 반려 로봇의 모션제어에 대한 연구)

  • Ye-Jin, Cho;Hyun-Seok, Kim;Tae-Sung, Bae;Su-Haeng, Lee;Jin-Hyean, Kim;Jae-Wook, Kim
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.6 / pp.1089-1094 / 2022
  • This paper studies a companion robot that coexists with humans, can communicate with people in daily life, and can help alleviate the shortage of care personnel. Based on a voice recognition module, servo motors, and an Arduino board, a companion robot was built and tested with a robot arm control function using voice recognition, a position movement function using RC cars, and a voice recognition function. In the experiments, speech recognition by distance showed the best recognition rate at a distance of 5 to 30 cm, and speech recognition by gender showed a higher recognition rate for the first tone, a monotonous tone. The results of these motion experiments confirmed that a companion robot could be made.

A Study on Voice Recognition using Noise Cancel DTW for Noise Environment (잡음환경에서의 Noise Cancel DTW를 이용한 음성인식에 관한 연구)

  • Ahn, Jong-Young;Kim, Sung-Su;Kim, Su-Hoon;Koh, Si-Young;Hur, Kang-In
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.11 no.4 / pp.181-186 / 2011
  • In this paper, we propose Noise Cancel DTW (NCDTW), which uses a kind of feature compensation. Instead of estimated noise, the method uses real-life environmental noise data for voice recognition. We applied this noise-contaminated data to build a recognition reference model suited to noisy environments: NCDTW combines the reference pattern with surrounding noise when the pattern is generated. Using NCDTW improved the voice recognition rate in a mobile environment.
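The step in the abstract above where surrounding noise is combined with the reference pattern could look roughly like the following Python sketch, which adds a recorded noise segment to a clean reference signal. The abstract does not say how the noise level is chosen, so scaling the noise to a target SNR is an assumption made here for illustration, as are the names `power` and `mix_at_snr`; features for the DTW reference would then be extracted from the mixed signal.

```python
import math


def power(signal):
    """Mean squared amplitude of a non-empty signal."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Add recorded noise to a clean reference at a target SNR in dB.

    Assumption: the noise is scaled by a gain g chosen so that
    power(clean) / power(g * noise) equals 10 ** (snr_db / 10).
    """
    if not clean:
        raise ValueError("clean signal must not be empty")
    if len(noise) < len(clean):
        raise ValueError("noise recording is shorter than the clean signal")
    noise = noise[:len(clean)]
    p_clean, p_noise = power(clean), power(noise)
    if p_noise == 0:
        return list(clean)  # silent noise recording: nothing to add
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]
```

Building the reference from noisy rather than clean speech means the template and the incoming utterance are degraded in a similar way, which is the intuition behind matching against a noise-adapted reference model.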

A study on Voice Recognition using Model Adaptation HMM for Mobile Environment (모델적응 HMM을 이용한 모바일환경에서의 음성인식에 관한 연구)

  • Ahn, Jong-Young;Kim, Sang-Bum;Kim, Su-Hoon;Hur, Kang-In
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.11 no.3 / pp.175-179 / 2011
  • In this paper, we propose the MA (Model Adaptation) HMM, which uses speech enhancement and feature compensation. Voice reference data normally does not account for real noise data. Instead of estimated noise, this method uses real-life environmental noise data. We applied this noise-contaminated data to build a recognition reference model suited to noisy environments: MAHMM combines the reference pattern with surrounding noise when the pattern is generated. Using MAHMM improved the voice recognition rate in a mobile environment.

Voice Recognition Performance Improvement using a convergence of Voice Energy Distribution Process and Parameter (음성 에너지 분포 처리와 에너지 파라미터를 융합한 음성 인식 성능 향상)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence / v.13 no.10 / pp.313-318 / 2015
  • Traditional speech enhancement methods distort the sound spectrum depending on how the residual noise is estimated, or leave invalid noise, which lowers speech recognition performance. In this paper, we propose a speech detection method that combines sound energy distribution processing with sound energy parameters. The proposed method reduces the influence of noise so as to maximize voice energy. In addition, among the feature parameters of the speech signal, the log energy features of intervals with small log energy values are made, relative to regions with large energy, similar in size to the log energy features of a voice signal containing noise, which reduces the mismatch between the training and recognition environments. Recognition experiments confirmed improved recognition performance compared with the conventional method. In a car noise environment, the pause hit rate showed accuracies of 97.1% and 97.3% in the low-SNR region (0 dB and 5 dB) and 98.3% and 98.6% in the high-SNR region (10 dB and 15 dB).

Performance Evaluation of Real-time Voice Traffic over IEEE 802.15.4 Beacon-enabled Mode (IEEE 802.15.4 비컨 가용 방식에 의한 실시간 음성 트래픽 성능 평가)

  • Hur, Yun-Kang;Kim, You-Jin;Huh, Jae-Doo
    • IEMEK Journal of Embedded Systems and Applications / v.2 no.1 / pp.43-52 / 2007
  • The IEEE 802.15.4 specification, which defines the low-rate wireless personal area network (LR-WPAN), has applications in home and building automation, remote control and sensing, intelligent management, environmental monitoring, and so on. Recently, it has been considered as an alternative technology for providing multimedia services such as automation via voice recognition, wireless headsets, and wireless surveillance cameras. To evaluate the capability of the IEEE 802.15.4 LR-WPAN to carry voice traffic, we considered two scenarios: voice traffic only, and coexistence of voice and sensing traffic. For both cases we examined delay and packet loss rate with and without acknowledgement, and for various beacon periods determined by the beacon order and superframe order values. In an LR-WPAN with voice devices only, a total of 5 voice devices could be supported; in the other case, where voice and sensor devices coexist, one voice device was able to coexist with about 60 sensor devices.

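The dependence of the beacon period on the beacon order and superframe order mentioned in the abstract above can be worked through with a short Python sketch. It uses the standard IEEE 802.15.4 relations BI = aBaseSuperframeDuration × 2^BO and SD = aBaseSuperframeDuration × 2^SO, with aBaseSuperframeDuration = 960 symbols. The 16 µs symbol time assumes the 2.4 GHz PHY, which the abstract does not specify, and the function name `beacon_timing` is hypothetical.

```python
A_BASE_SUPERFRAME_DURATION = 960  # symbols (60-symbol slots x 16 slots)
SYMBOL_SECONDS = 16e-6            # assumption: 2.4 GHz PHY, 62.5 ksymbol/s


def beacon_timing(beacon_order, superframe_order):
    """Return (beacon interval s, superframe duration s, duty cycle).

    Valid beacon-enabled settings satisfy 0 <= SO <= BO <= 14.
    """
    if not 0 <= superframe_order <= beacon_order <= 14:
        raise ValueError("require 0 <= SO <= BO <= 14")
    base = A_BASE_SUPERFRAME_DURATION * SYMBOL_SECONDS
    beacon_interval = base * 2 ** beacon_order
    superframe_duration = base * 2 ** superframe_order
    duty_cycle = superframe_duration / beacon_interval
    return beacon_interval, superframe_duration, duty_cycle
```

Under these assumptions the shortest beacon interval is 15.36 ms (BO = 0), and each increment of BO doubles it, which is why the choice of BO and SO directly bounds the delay that periodic voice packets can experience.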