• Title, Summary, Keyword: Voice Recognition

Development of a Voice-Command-Based Mobile Application for Improving Document Editing Accessibility (문서 편집 접근성 향상을 위한 음성 명령 기반 모바일 어플리케이션 개발)

  • Park, Joo Hyun;Park, Seah;Lee, Muneui;Lim, Soon-Bum
    • Journal of Korea Multimedia Society / v.21 no.11 / pp.1342-1352 / 2018
  • Voice command systems are an important means of ensuring accessibility to digital devices in situations where both hands are occupied and for people with disabilities. Interest in services using speech recognition technology has been increasing. In this study, we developed a mobile writing application that uses voice recognition and voice command technology to help people create and edit documents easily. The application minimizes touch input on the screen and lets users write memos by voice. We systematically designed a mode that distinguishes voice writing from voice commands, so that writing and command execution can be used together in a single voice interface. It also provides a shortcut function for controlling the cursor by voice, which makes document editing more convenient. This allows people to access writing applications by voice under both physical and environmental constraints.

A Study on Voice Web Browsing in JAVA Beans Component Architecture Automatic Speech Recognition Application System. (JAVABeans Component 구조를 갖는 음성인식 시스템에서의 Voice Web Browsing에 관한 연구)

  • 장준식;윤재석
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / pp.273-276 / 2003
  • In this study, an Automatic Speech Recognition Application System is designed and implemented to support the transition from the present GUI-centered web services to VUI-centered web services. Because ASP restricts speed and implementation on the web, an Automatic Speech Recognition Application System with a JavaBeans component architecture is devised and studied. Voice web browsing that can transfer voice and graphic information simultaneously is also studied using Remote AWT (Abstract Window Toolkit).

Implementation of Motorized Wheelchair using Speaker Independent Voice Recognition Chip and Wireless Microphone (화자 독립 방식의 음성 인식 칩 및 무선 마이크를 이용한 전동 휄체어의 구현)

  • Song, Byung-Seop;Lee, Jung-Hyun;Park, Jung-Jae;Park, Hee-Joon;Kim, Myoung-Nam
    • Journal of Sensor Science and Technology / v.13 no.1 / pp.20-26 / 2004
  • A motorized wheelchair activated by a speaker-independent voice recognition module was implemented for disabled persons who cannot use their limbs. A wireless voice transfer device was designed and used for user convenience. The wheelchair was designed to operate by either voice or keypad, at the user's selection, so that it can be manipulated with the keypad if necessary. The speaker-independent method was chosen for the voice recognition module so that anyone can operate the wheelchair when assistance is needed. The performance and motion of the implemented wheelchair were examined: it achieved a voice recognition rate of over 97% and moved properly.

Voice Activity Detection in Noisy Environment using Speech Energy Maximization and Silence Feature Normalization (음성 에너지 최대화와 묵음 특징 정규화를 이용한 잡음 환경에 강인한 음성 검출)

  • Ahn, Chan-Shik;Choi, Ki-Ho
    • Journal of Digital Convergence / v.11 no.6 / pp.169-174 / 2013
  • In speech recognition, performance degrades because of the mismatch between the model training environment and the recognition environment. Silence feature normalization is used as a way to reduce this mismatch. At low signal-to-noise ratios, however, existing silence feature normalization raises the energy level of silence intervals, which lowers the accuracy of speech/non-speech classification and degrades recognition performance. This paper proposes a speech detection method that is robust in noisy environments, using silence feature normalization and speech energy maximization. At high signal-to-noise ratios, the proposed method maximizes speech energy so that the extracted features are less affected by noise. At low signal-to-noise ratios, it uses the cepstral feature distribution of speech and non-speech to improve recognition performance. In recognition experiments, recognition performance improved compared to the conventional method.
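
The abstract above relies on separating speech from silence by frame energy. As a rough illustration only, the following Python sketch marks frames as speech when their log energy exceeds an estimated noise floor by a margin. It is a generic energy-threshold baseline, not the authors' method; the function names, the `margin` parameter, and the use of the minimum log energy as the noise floor are all assumptions made for this example.

```python
import math


def frame_log_energy(samples, frame_len):
    """Log of the mean squared amplitude for each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # A small constant avoids log(0) on all-zero frames.
        energies.append(math.log(energy + 1e-12))
    return energies


def detect_speech(samples, frame_len, margin=2.0):
    """Return one boolean per frame: True if the frame looks like speech.

    Assumption for this sketch: the quietest frame approximates the noise
    floor, and speech is anything more than `margin` (in natural-log units)
    above it.
    """
    log_e = frame_log_energy(samples, frame_len)
    if not log_e:
        return []
    noise_floor = min(log_e)
    return [e > noise_floor + margin for e in log_e]


if __name__ == "__main__":
    # Hypothetical signal: quiet frame, loud frame, quiet frame.
    signal = [0.01] * 4 + [1.0] * 4 + [0.01] * 4
    print(detect_speech(signal, frame_len=4))
```

With a fixed margin like this, rising silence energy at low signal-to-noise ratios narrows the gap between the two classes, which is the kind of failure the abstract attributes to conventional approaches.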

The Recognition of Korean Syllables using Parameter Based on Principal Component Analysis (PCA 기반 파라메타를 이용한 숫자음 인식)

  • 박경훈;표창수;김창근;허강인
    • Proceedings of the Korea Institute of Convergence Signal Processing / pp.181-184 / 2000
  • A new feature extraction method is proposed that, unlike conventional voice feature extraction methods, takes the statistical features of the human voice into account. The method applies PCA (Principal Component Analysis), which finds the axis direction with the greatest variance in the input dimensions and then removes redundancy in the data. The new method is applied to real voice recognition to assess its performance. When the number recognition results in this paper are compared with those of the conventional Mel-cepstrum voice feature parameter, the recognition rates differ by 0.5%. A better recognition rate is expected for word or sentence recognition, since extracting voice features takes less convergence time than the conventional method. A better recognition rate is also expected when the optimum vector is chosen from the statistical features of the data.
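
To make the "axis direction with the greatest variance" concrete, here is a minimal Python sketch of PCA restricted to 2-D points, where the principal axis has a closed form. This is an illustrative simplification, not the paper's implementation: real voice features have many more dimensions, and the function names and sample points are hypothetical.

```python
import math


def principal_axis(points):
    """Return (axis, mean) for 2-D points.

    `axis` is a unit vector along the direction of greatest variance,
    found from the 2x2 covariance matrix in closed form.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    # Angle of the dominant eigenvector of [[cxx, cxy], [cxy, cyy]].
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    return (math.cos(theta), math.sin(theta)), (mean_x, mean_y)


def project(points, axis, mean):
    """Reduce each 2-D point to one coordinate along the principal axis."""
    return [
        (x - mean[0]) * axis[0] + (y - mean[1]) * axis[1]
        for x, y in points
    ]


if __name__ == "__main__":
    # Hypothetical, perfectly correlated data: one axis carries everything.
    data = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
    axis, mean = principal_axis(data)
    print(axis, project(data, axis, mean))
```

Projecting onto the principal axis keeps the coordinate that explains the most variance and discards the correlated remainder, which is the sense in which PCA removes redundancy.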

A Study on Improved Method of Voice Recognition Rate (음성 인식률 개선방법에 관한 연구)

  • Kim, Young-Po;Lee, Han-Young
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.1 / pp.77-83 / 2013
  • In this paper, we propose and study a method for improving the voice recognition rate. In general, voices are detected by applying the most widely used method, the HMM (Hidden Markov Model) algorithm. For voice detection, the zero-crossing ratio was calculated per unit of voice before the existence of data was identified. For voice recognition, the patterns shown by the forms of voices were analyzed and then compared to patterns that had already been learned. In the experiment, the existing HMM algorithm showed a recognition rate of 80%, while the proposed algorithm, based on recognizing the patterns shown by the forms of voices, showed a recognition rate of 92%, an improvement of about 12 percentage points.
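
The zero-crossing ratio mentioned above can be sketched in a few lines of Python. This is a generic textbook definition (the fraction of adjacent sample pairs whose signs differ), not the authors' code; the frame length and function names are assumptions for illustration.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(samples, frame_len):
    """Zero-crossing rate of each non-overlapping frame of `frame_len` samples."""
    return [
        zero_crossing_rate(samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]


if __name__ == "__main__":
    # Hypothetical samples: an alternating frame followed by a constant one.
    print(frame_zcr([1, -1, 1, -1, 1, 1, 1, 1], frame_len=4))
```

A high rate typically points to noise-like or unvoiced segments and a low rate to voiced speech, so the measure is often paired with frame energy when deciding whether data is present.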

Development of Portable Conversation-Type English Learner (대화식 휴대용 영어학습기 개발)

  • Yoo, Jae-Tack;Yoon, Tae-Seob
    • Proceedings of the KIEE Conference / pp.147-149 / 2004
  • Although most people have studied English for a long time, their English conversation ability is low. Portable conversation-type English learners built with computer and information processing technology can enhance that ability through conversation exercises. The core technology for developing such a learner is a voice recognition and synthesis module that runs in an embedded environment. This paper deals with voice recognition and synthesis, and with a prototype of the learner module that uses a DSP (Digital Signal Processing) chip for voice processing and provides a voice playback function, a flash memory file system, a PC download function using USB ports, an English conversation text function using SMC (Smart Media Card) flash memory, an LCD display function, an MP3 music listening function, etc. The prototype, equipped with these functions, has a wide range of applications, e.g. portable language learners, amusement devices, children's toys, voice control, and voice-based security.

A study on the vowel extraction from the word using the neural network (신경망을 이용한 단어에서 모음추출에 관한 연구)

  • 이택준;김윤중
    • Proceedings of the Korea Society for Industrial Systems Conference / pp.721-727 / 2003
  • This study designed and implemented a system that extracts vowels from a word. The system comprises a voice feature extraction module and a neural network module. The voice feature extraction module uses an LPC (Linear Prediction Coefficient) model to extract voice features from a word. The neural network module comprises a learning module and a voice recognition module. The learning module sets up learning patterns and builds a neural network to train. Using the information in the trained neural network, the voice recognition module extracts vowels from a word. To test the performance of the implemented vowel extraction system, a neural network was trained on selected vowels (a, eo, o, e, i). The experiment confirmed that the speech recognition module extracts vowels from 4 words.
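
The abstract does not say how the LPC features are computed. One common approach, shown here purely as an assumed illustration, is the autocorrelation method solved with the Levinson-Durbin recursion. The function names, the predictor convention x[n] ≈ Σ a[k]·x[n−k], and the sample frame are all choices made for this sketch.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Predictor coefficients a[1..order] via Levinson-Durbin.

    Convention assumed here: x[n] is approximated by
    a[1]*x[n-1] + ... + a[order]*x[n-order].
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)
    error = r[0]
    if error == 0.0:
        return a[1:]  # silent frame: nothing to predict
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this order
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
        if error <= 0.0:
            break  # frame is already perfectly predicted
    return a[1:]


if __name__ == "__main__":
    # Hypothetical decaying frame used only to exercise the function.
    print(lpc([1.0, 0.5, 0.25, 0.125, 0.0625], order=2))
```

The resulting coefficient vector per frame is the kind of compact feature that could be fed to a neural network such as the one the abstract describes.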

A Study on Formants of Vowels for Speaker Recognition (화자 인식을 위한 모음의 포만트 연구)

  • Ahn Byoung-seob;Shin Jiyoung;Kang Sunmee
    • MALSORI / no.51 / pp.1-16 / 2004
  • The aim of this paper is to analyze vowels in voice imitation and disguised voice, and to find the invariable phonetic features of the speaker. We examined the formants of the monophthongs /a, u, i, o, $\omega$, $\varepsilon$, $\Lambda$/. The results of the present study are as follows: (1) Speakers change their vocal tract features. (2) Vowels /a, $\varepsilon$, i/ appear to be suitable for speaker recognition since they show invariable acoustic features during voice modulation. (3) F1 does not change easily compared to higher formants. (4) F3-F2 appears to be a constituent for speaker identification in vowels /a/ and /$\varepsilon$/, and F4-F2 in vowel /i/. (5) According to the F-ratio, differences between formants were more useful for speaker recognition than the individual formants of a vowel.
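
The abstract does not define the F-ratio it uses. A common definition in speaker recognition is the variance of the per-speaker means divided by the average within-speaker variance, so a larger value means the feature separates speakers better. The Python sketch below assumes that definition and applies it to a formant-difference feature such as F3-F2; the speaker labels and formant values are invented for illustration and are not data from the paper.

```python
import math
from statistics import mean, pvariance


def f_ratio(values_by_speaker):
    """Between-speaker variance of means over mean within-speaker variance."""
    groups = list(values_by_speaker.values())
    between = pvariance([mean(g) for g in groups])
    within = mean([pvariance(g) for g in groups])
    if within == 0:
        return math.inf  # no within-speaker spread at all
    return between / within


def formant_difference(tokens, high, low):
    """Per-token difference between two formants, e.g. F3 - F2, in Hz."""
    return [t[high] - t[low] for t in tokens]


if __name__ == "__main__":
    # Hypothetical formant measurements (Hz) for two speakers, two tokens each.
    tokens = {
        "speaker_A": [{"F2": 1200.0, "F3": 2500.0}, {"F2": 1250.0, "F3": 2560.0}],
        "speaker_B": [{"F2": 1300.0, "F3": 2900.0}, {"F2": 1280.0, "F3": 2870.0}],
    }
    f3_minus_f2 = {
        name: formant_difference(t, "F3", "F2") for name, t in tokens.items()
    }
    print(f_ratio(f3_minus_f2))
```

Comparing this value for a difference feature against the value for a single formant is one way to express the comparison reported in result (5).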

Voice Recognition Performance Improvement using a convergence of Voice Energy Distribution Process and Parameter (음성 에너지 분포 처리와 에너지 파라미터를 융합한 음성 인식 성능 향상)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence / v.13 no.10 / pp.313-318 / 2015
  • Traditional speech enhancement methods distort the sound spectrum as a result of estimating residual noise, or leave invalid noise, which lowers speech recognition performance. In this paper, we propose a speech detection method that combines sound energy distribution processing with sound energy parameters. The proposed method maximizes voice energy so that the extracted features are less affected by noise. In addition, among the feature parameters of the speech signal, the log energy features of intervals with small log energy values are made similar, relative to regions with large energy, to the log energy features of the noisy voice signal, which reduces the mismatch between the training and recognition environments. Recognition experiments confirmed improved recognition performance compared to the conventional method. In a car noise environment, the Pause Hit Rate was 97.1% and 97.3% in the low-SNR region (0 dB and 5 dB) and 98.3% and 98.6% in the high-SNR region (10 dB and 15 dB).