• Title/Summary/Keyword: 음성인식알고리즘

Search Result 449, Processing Time 0.031 seconds

Comparison of Characteristic Vector of Speech for Gender Recognition of Male and Female (남녀 성별인식을 위한 음성 특징벡터의 비교)

  • Jeong, Byeong-Goo;Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.7
    • /
    • pp.1370-1376
    • /
    • 2012
  • This paper proposes a gender recognition algorithm which classifies a male or female speaker. In this paper, characteristic vectors for the male and female speaker are analyzed, and recognition experiments for the proposed gender recognition by a neural network are performed using these characteristic vectors for the male and female. Input characteristic vectors of the proposed neural network are 10 LPC (Linear Predictive Coding) cepstrum coefficients, 12 LPC cepstrum coefficients, 12 FFT (Fast Fourier Transform) cepstrum coefficients and 1 RMS (Root Mean Square), and 12 LPC cepstrum coefficients and 8 FFT spectrum. The proposed neural network trained by 20-20-2 network are especially used in this experiment, using 12 LPC cepstrum coefficients and 8 FFT spectrum. From the experiment results, the average recognition rates obtained by the gender recognition algorithm is 99.8% for the male speaker and 96.5% for the female speaker.

HMM with Global Path constraint in Viterbi Decoding for Insolated Word Recognition (전체 경로 제한 조건을 갖는 HMM을 이용한 단독음 인식)

  • Kim, Weon-Goo;Ahn, Dong-Soon;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea
    • /
    • v.13 no.1E
    • /
    • pp.11-19
    • /
    • 1994
  • Hidden Markov Models (HMM's) with explicit state duration density (HMM/SD) can represent the time-varying characteristics of speech signals more accurately. However, such an advantage is reduced in relatively smooth state duration densities or ling bounded duration. To solve this problem, we propose HMM's with global path constraint (HMM/GPC) where the transition between states occur only within prescribed time slots. HMM/GPC explicitly limits state durations and accurately describes the temproal structure of speech simply and efficiently. HMM's formed by combining HMM/GPC with HMM/SD are also presented (HMM/SD+GPC) and performances are compared. HMM/GPC can be implemented with slight modifications to the conventional Viterbi algorithm. HMM/GPC and HMM/SD_GPC not only show superior performance than the conventional HMM and HMM/SD but also require much less computation. In the speaket independent isolated word recognition experiments, the minimum recognition eror rate of HMM/GPC(1.6%) is 1.1% lower than the conventional HMM's and the required computation decreased about 57%.

  • PDF

Personalized Smart Mirror using Voice Recognition (음성인식을 이용한 개인맞춤형 스마트 미러)

  • Dae-Cheol, Kang;Jong-Seok, Lim;Gil-Ho, Lee;Beom-Hee, Lee;Hyoung-Keun, Park
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.6
    • /
    • pp.1121-1128
    • /
    • 2022
  • Information about the present invention is made available for business use. You are helping to use the LCD, you can't use the LCD screen. During software configuration, Raspbian was used to provide the system environment. We made our way through the menu and made our financial through play. It provides various information such as weather, weather, apps, streamer music, and web browser search function, and it can be charged. Currently, the 'Google Assistant' will be provided through the GUI within a predetermined time.

Development of Street Crossing Assistive Embedded System for the Visually-Impaired Using Machine Learning Algorithm (머신러닝을 이용한 시각장애인 도로 횡단 보조 임베디드 시스템 개발)

  • Oh, SeonTaek;Jeong, Kidong;Kim, Homin;Kim, Young-Keun
    • Journal of the HCI Society of Korea
    • /
    • v.14 no.2
    • /
    • pp.41-47
    • /
    • 2019
  • In this study, a smart assistive device is designed to recognize pedestrian signal and to provide audio instructions for visually impaired people in crossing streets safely. Walking alone is one of the biggest challenges to the visually impaired and it deteriorates their life quality. The proposed device has a camera attached on a pair of glasses which can detect traffic lights, recognize pedestrian signals in real-time using a machine learning algorithm on GPU board and provide audio instructions to the user. For the portability, the dimension of the device is designed to be compact and light but with sufficient battery life. The embedded processor of device is wired to the small camera which is attached on a pair of glasses. Also, on inner part of the leg of the glasses, a bone-conduction speaker is installed which can give audio instructions without blocking external sounds for safety reason. The performance of the proposed device was validated with experiments and it showed 87.0% recall and 100% precision for detecting pedestrian green light, and 94.4% recall and 97.1% precision for detecting pedestrian red light.

English Phoneme Recognition using Segmental-Feature HMM (분절 특징 HMM을 이용한 영어 음소 인식)

  • Yun, Young-Sun
    • Journal of KIISE:Software and Applications
    • /
    • v.29 no.3
    • /
    • pp.167-179
    • /
    • 2002
  • In this paper, we propose a new acoustic model for characterizing segmental features and an algorithm based upon a general framework of hidden Markov models (HMMs) in order to compensate the weakness of HMM assumptions. The segmental features are represented as a trajectory of observed vector sequences by a polynomial regression function because the single frame feature cannot represent the temporal dynamics of speech signals effectively. To apply the segmental features to pattern classification, we adopted segmental HMM(SHMM) which is known as the effective method to represent the trend of speech signals. SHMM separates observation probability of the given state into extra- and intra-segmental variations that show the long-term and short-term variabilities, respectively. To consider the segmental characteristics in acoustic model, we present segmental-feature HMM(SFHMM) by modifying the SHMM. The SFHMM therefore represents the external- and internal-variation as the observation probability of the trajectory in a given state and trajectory estimation error for the given segment, respectively. We conducted several experiments on the TIMIT database to establish the effectiveness of the proposed method and the characteristics of the segmental features. From the experimental results, we conclude that the proposed method is valuable, if its number of parameters is greater than that of conventional HMM, in the flexible and informative feature representation and the performance improvement.

Speaker Verification System Based on HMM Robust to Noise Environments (잡음환경에 강인한 HMM기반 화자 확인 시스템에 관한 연구)

  • 위진우;강철호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.7
    • /
    • pp.69-75
    • /
    • 2001
  • Intra-speaker variation, noise environments, and mismatch between training and test conditions are the major reasons for the speaker verification system unable to use it practically. In this study, we propose robust end-point detection algorithm, noise cancelling with the microphone property compensation technique, and inter-speaker discriminate technique by weighting cepstrum for robust speaker verification system. Simulation results show that the average speaker verification rate is improved in the rate of 17.65% with proposed end-point detection algorithm using LPC residue and is improved in the rate of 36.93% with proposed noise cancelling and microphone property compensation algorithm. The proposed weighting function for discriminating inter-speaker variations also improves the average speaker verification rate in the rate of 6.515%.

  • PDF

The Remote HMI System Control Using the Transformed Successive State Splitting Algorithm (변형된 상태분할 알고리즘을 이용한 원격 HMI 시스템 제어)

  • Lee, Jong-Woock;Lee, Jeong-Bae;Hwang, Yeong-Seop;Nam, Ji-Eun
    • Convergence Security Journal
    • /
    • v.8 no.4
    • /
    • pp.135-143
    • /
    • 2008
  • Currently, The HMI system is being used on the network is limited in the ability. In this paper, an Industrial HMI applied the transformed state splitting algorithm. this study suggests by applying a transformed the Successive state splitting algorithm, for the modeling in the questions of the expected data. So, you can save time and reliable and precise as high as 98.15 percent repregented recognition rate. HMI system applied to the voice of industrial equipment the man can not act directly in the industry environment was able to drive devices. Optimize the performance of the engine was the voice of HMI system.

  • PDF

Optimal Wavelet Selection for AR Model Parameter Identification of Nonstationary Time-Varying Signal (비정상 시변신호의 AR모델 파라메터 인식을 위한 최적의 웨이브렛 선택)

  • Shin, D.H.;Kim, S.H.
    • The Journal of the Acoustical Society of Korea
    • /
    • v.15 no.4
    • /
    • pp.50-57
    • /
    • 1996
  • In this paper, we proposed the method of optimal wavelet selection and wavelet expansion of AR(autoregressive) parameters by selected wavelet using F-test. A cost function is introduced as a wavelet selection method. Using this cost function, wavelets (D4 to D20) are tested to the synthesized signal. With this selected wavelet, we get the wavelet coefficients of AR parameters to both synthesized signal and real speech signal. To evaluate the proposed method, this wavelet based algorithm is compared with the Kalman filering algorithm. As a results, the proposed method shows a better performance by about 5-10dB than the Kalman filter.

  • PDF

Features for Figure Speech Recognition in Noise Environment (잡음환경에서의 숫자음 인식을 위한 특징파라메타)

  • Lee, Jae-Ki;Koh, Si-Young;Lee, Kwang-Suk;Hur, Kang-In
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • v.9 no.2
    • /
    • pp.473-476
    • /
    • 2005
  • This paper is proposed a robust various feature parameters in noise. Feature parameter MFCC(Mel Frequency Cepstral Coefficient) used in conventional speech recognition shows good performance. But, parameter transformed feature space that uses PCA(Principal Component Analysis)and ICA(Independent Component Analysis) that is algorithm transformed parameter MFCC's feature space that use in old for more robust performance in noise is compared with the conventional parameter MFCC's performance. The result shows more superior performance than parameter and MFCC that feature parameter transformed by the result ICA is transformed by PCA.

  • PDF

Object Tracking Method using Deep Learning and Kalman Filter (딥 러닝 및 칼만 필터를 이용한 객체 추적 방법)

  • Kim, Gicheol;Son, Sohee;Kim, Minseop;Jeon, Jinwoo;Lee, Injae;Cha, Jihun;Choi, Haechul
    • Journal of Broadcast Engineering
    • /
    • v.24 no.3
    • /
    • pp.495-505
    • /
    • 2019
  • Typical algorithms of deep learning include CNN(Convolutional Neural Networks), which are mainly used for image recognition, and RNN(Recurrent Neural Networks), which are used mainly for speech recognition and natural language processing. Among them, CNN is able to learn from filters that generate feature maps with algorithms that automatically learn features from data, making it mainstream with excellent performance in image recognition. Since then, various algorithms such as R-CNN and others have appeared in object detection to improve performance of CNN, and algorithms such as YOLO(You Only Look Once) and SSD(Single Shot Multi-box Detector) have been proposed recently. However, since these deep learning-based detection algorithms determine the success of the detection in the still images, stable object tracking and detection in the video requires separate tracking capabilities. Therefore, this paper proposes a method of combining Kalman filters into deep learning-based detection networks for improved object tracking and detection performance in the video. The detection network used YOLO v2, which is capable of real-time processing, and the proposed method resulted in 7.7% IoU performance improvement over the existing YOLO v2 network and 20 fps processing speed in FHD images.