• Title/Summary/Keyword: automatic speaker recognition (자동 화자 인식)

Search Result 48, Processing Time 0.023 seconds

The Local Path Constraint for the Recognition of Speech (음성 인식을 위한 소구간 경로 제약)

  • Ann, Tae-Ock;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.8 no.4 / pp.60-64 / 1989
  • This paper proposes a local path constraint to increase the speech recognition rate. The input speech signal is analyzed using autocorrelation and LPC coefficients as parameters. The proposed local path constraint was compared with five conventional types. The speech data used in this research are subway station names: 130 utterances (13 different words, consisting of 11 syllable characters, each pronounced 10 times) by 2 male speakers and 1 female speaker were tested. The proposed type performed best among the types compared, with a recognition rate of 94.6%.


Speaker Adapted Real-time Dialogue Speech Recognition Considering Korean Vocal Sound System (한국어 음운체계를 고려한 화자적응 실시간 단모음인식에 관한 연구)

  • Hwang, Seon-Min;Yun, Han-Kyung;Song, Bok-Hee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.6 no.4 / pp.201-207 / 2013
  • Voice recognition techniques have been developed and actively applied to various information devices such as smartphones and car navigation systems. However, basic research on speech recognition is largely based on results for English. Producing lip sync generally requires tedious manual work by animators, which seriously affects the production cost and development period of high-quality lip animation. In this research, a real-time automatic lip-sync algorithm for virtual characters in digital content is studied, taking the Korean phonological system into account. The proposed algorithm helps produce natural lip animation at lower production cost and in a shorter development period.

Optimal Feature Parameters Extraction for Speech Recognition of Ship's Wheel Orders (조타명령의 음성인식을 위한 최적 특징파라미터 검출에 관한 연구)

  • Moon, Serng-Bae;Chae, Yang-Bum;Jun, Seung-Hwan
    • Journal of the Korean Society of Marine Environment & Safety / v.13 no.2 s.29 / pp.161-167 / 2007
  • The goal of this paper is to develop a speech recognition system that can control a ship's autopilot. Feature parameters that predict the speaker's intention were extracted from sample wheel orders written in the SMCP (IMO Standard Marine Communication Phrases). We designed a post-recognition procedure based on these parameters that makes a final decision from the list of candidate words. To evaluate the effectiveness of the parameters and the procedure, a basic experiment was conducted with a total of 525 wheel orders. In the experimental results, the proposed pattern recognition procedure improved on the pre-recognition procedure by about 42.3%.


Application Example of Forensic Speaker Analysis Method for Voice-phishing Speech Files (보이스피싱 음성 파일에 대한 법과학적 화자 분석 방법의 적용 사례)

  • 박남인;이중;전옥엽;김태훈
    • Journal of Digital Forensics / v.13 no.1 / pp.35-44 / 2019
  • Voice phishing induces victims to send money using only voice calls, exploiting illegally obtained personal information. The damage caused by voice phishing continues to increase every year, and it has become a social problem. Recently, the Financial Supervisory Service (FSS) of the Republic of Korea has been collecting recordings of voice-phishing scammers from victims. In this paper, we describe an effective forensic speaker analysis method for finding voices of the same person among large-scale speech files stored in a database (DB), and we apply the method to the voice-phishing speech files collected from victims. First, an i-vector is extracted from each speech file in the DB; then a cosine similarity matrix over all speech files is generated from the cosine distances among the extracted i-vectors. In other words, speaker analysis is performed by grouping sets of candidates whose i-vectors have high mutual similarity across all speech files in the DB. In an EER (Equal Error Rate) measurement on 6,724 speech files from 82 speakers, the i-vector-based method achieved a better EER than the GMM-based method. Finally, in a comparison of the 2,327 voice-phishing speech files collected by the FSS, some of the speech files with similar voice features were grouped together.
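
The similarity-matrix step described in this abstract can be illustrated with a minimal sketch. It assumes the i-vectors have already been extracted and are given as plain lists of floats. The grouping rule (connected components over pairs whose similarity reaches a threshold) and the names `similarity_matrix` and `group_candidates` are assumptions made for illustration, not the procedure used in the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Full pairwise cosine similarity matrix."""
    n = len(vectors)
    return [[cosine_similarity(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


def group_candidates(sim, threshold):
    """Assumed grouping rule: files are linked when similarity >= threshold,
    and each connected component becomes one candidate group."""
    n = len(sim)
    seen = set()
    groups = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        group = []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if j not in seen and sim[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(group))
    return groups


# Toy 2-dimensional stand-ins for i-vectors (real i-vectors are much longer).
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
toy_groups = group_candidates(similarity_matrix(toy_vectors), threshold=0.9)
```

With these toy vectors, files 0 and 1 fall into one group and files 2 and 3 into another. The threshold is a free parameter that would need tuning on labeled data.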

Improving the Performance of a Speech Recognition System in a Vehicle by Distinguishing Male/Female Voice (성별 구별방법에 의한 자동차 내 음성 인식 성능 향상)

  • Yang, Jin-Woo;Kim, Sun-Hyeop
    • Journal of KIISE:Software and Applications / v.27 no.12 / pp.1174-1182 / 2000
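  • This paper proposes a system that allows always-on voice input and output without operating any auxiliary switch, in order to secure both driver safety and convenience in a moving vehicle. To obtain noise-robust threshold values, the reference energy and zero-crossing rate were updated every 1.5 seconds, and endpoint detection was performed automatically and accurately in real time in two stages (primary and secondary) using a band-pass filter. In addition, male and female speakers were distinguished by pitch detection to select the model, and, to use the model best suited to the vehicle's speed, separate male and female models were prepared for the Idle-40km, 40-80km, and 80-100km ranges. The speech feature vector was 13th-order PLP, and the recognition algorithm was OSDP (One-Stage Dynamic Programming). Experiments were conducted by speed range on Seoul city streets and the inner ring road. Speaker-independent recognition achieved 96.8% for male and 95.1% for female speakers at 40-80km, and 91.6% and 90.6% respectively at 80-100km. Speaker-dependent recognition achieved 98% for male and 96% for female speakers at 40-80km, and 96% and 94% respectively at 80-100km. These high recognition rates demonstrate the effectiveness of the system.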


A Study on the Automatic Speech Control System Using DMS model on Real-Time Windows Environment (실시간 윈도우 환경에서 DMS모델을 이용한 자동 음성 제어 시스템에 관한 연구)

  • 이정기;남동선;양진우;김순협
    • The Journal of the Acoustical Society of Korea / v.19 no.3 / pp.51-56 / 2000
  • In this paper, we study an automatic speech control system for a real-time Windows environment using voice recognition. The reference pattern is the variable DMS model, which is proposed to speed up execution, and the one-stage DP algorithm using this model serves as the recognition algorithm. The recognition vocabulary consists of control command words frequently used in the Windows environment. An automatic speech period detection algorithm for online voice processing in the Windows environment is implemented. The proposed variable DMS model uses a variable number of sections depending on the duration of the input signal. Unnecessary recognition target words are sometimes generated, so the model is reconstructed online to handle this efficiently. Perceptual Linear Predictive analysis is applied to generate feature vectors from the extracted voice features. According to the experimental results, recognition is faster with the proposed model because of its small computational load. The multi-speaker-independent and multi-speaker-dependent recognition rates are 99.08% and 99.39%, respectively. In a noisy environment the recognition rate is 96.25%.


Enhancement of Ship's Wheel Order Recognition System using Speaker's Intention Predictive Parameters (화자의도예측 파라미터를 이용한 조타명령 음성인식 시스템의 개선)

  • Moon, Serng-Bae
    • Journal of Advanced Marine Engineering and Technology / v.32 no.5 / pp.791-797 / 2008
  • At sea, the officer of the deck (OOD) may sometimes have to keep lookout as well as operate the autopilot without a quartermaster. The purpose of this paper is to develop a ship's autopilot control module using speech recognition in order to reduce the potential risk of a one-man bridge system. Feature parameters that predict the OOD's intention were extracted from sample wheel orders written in the SMCP (IMO Standard Marine Communication Phrases). We designed a pre-recognition procedure that generates candidate words using the DTW (Dynamic Time Warping) algorithm, and a post-recognition procedure that makes a final decision from the candidate words using the feature parameters. To evaluate the effectiveness of these procedures, an experiment was conducted with 500 wheel orders.
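
As a rough illustration of a DTW-based pre-recognition step that produces candidate words, the sketch below ranks stored templates by DTW distance to a query and returns the closest few. It assumes one-dimensional feature sequences, an absolute-difference local cost, and the basic symmetric step pattern. The function names and template labels are hypothetical and are not taken from the paper.

```python
def dtw_distance(a, b):
    """DTW distance between two 1-D sequences using the basic step pattern
    (insertion, deletion, match) and absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def top_candidates(query, templates, k=2):
    """Return the k template labels closest to the query under DTW."""
    scored = sorted((dtw_distance(query, seq), label)
                    for label, seq in templates.items())
    return [label for _, label in scored[:k]]


# Hypothetical toy templates; real systems would use frame-level feature vectors.
toy_templates = {
    "port": [1.0, 2.0, 3.0],
    "midships": [1.0, 2.0, 4.0],
    "starboard": [5.0, 5.0, 5.0],
}
candidates = top_candidates([1.0, 2.0, 2.0, 3.0], toy_templates, k=2)
```

A post-recognition step, such as the feature-parameter decision described in the abstract, would then choose among `candidates`. That step is not sketched here because the abstract does not specify the parameters.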

Real-Time Implementation of Speaker Dependent Speech Recognition Hardware Module Using the TMS320C32 DSP : VR32 (TMS320C32 DSP를 이용한 실시간 화자종속 음성인식 하드웨어 모듈(VR32) 구현)

  • Chung, Ik-Joo;Chung, Hoon
    • The Journal of the Acoustical Society of Korea / v.17 no.4 / pp.14-22 / 1998
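  • In this study, we developed a real-time speaker-dependent speech recognition hardware module (VR32) using the TMS320C32, a low-cost floating-point digital signal processor (DSP) from Texas Instruments. The hardware module consists of a 40 MHz TMS320C32 DSP, a TLC32044 14-bit codec (or an 8-bit μ-law PCM codec), memory such as EPROM and SRAM, and logic circuits for the host interface. We also developed a PC interface board and software for evaluating the module on a PC. In the recognition algorithm, endpoint detection based on energy and ZCR and 10th-order weighted LPC cepstrum analysis are performed in real time. Dynamic Time Warping (DTW) then determines the most similar word, and a verification step follows before the final recognition result is produced. Endpoint detection uses an adaptive threshold, which makes it robust to noise, and the DTW algorithm was optimized in C and assembly to greatly improve computation speed. The current recognition rate is above 95% for a 30-word vocabulary of the kind typically used for speed dialing in an ordinary office environment, and the module also works well in noisy environments such as background music or car noise.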
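
Endpoint detection based on short-time energy and zero-crossing rate (ZCR) can be sketched as follows. This is a simplified illustration, not the VR32 implementation. It assumes the first few frames contain only background noise and derives fixed thresholds from them, whereas the module described above adapts its threshold over time. All function names and parameter defaults are assumptions.

```python
def frame_features(signal, frame_len):
    """Per-frame (mean energy, zero-crossing rate) over non-overlapping frames."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:])
                        if (a >= 0) != (b >= 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def detect_endpoints(signal, frame_len=10, noise_frames=2,
                     energy_factor=3.0, zcr_margin=0.2, energy_floor=1e-6):
    """Return (first, last) active frame indices, or None if no speech found.

    Thresholds come from the leading `noise_frames` frames, which are
    assumed to contain background noise only.
    """
    feats = frame_features(signal, frame_len)
    if len(feats) <= noise_frames:
        return None
    noise = feats[:noise_frames]
    mean_energy = sum(e for e, _ in noise) / noise_frames
    mean_zcr = sum(z for _, z in noise) / noise_frames
    e_thr = max(energy_floor, energy_factor * mean_energy)
    z_thr = mean_zcr + zcr_margin
    # A frame is active if it is clearly loud, or moderately loud with a
    # high ZCR (a crude way to keep weak, noise-like consonants).
    active = [i for i, (e, z) in enumerate(feats)
              if e > e_thr or (e > 0.5 * e_thr and z > z_thr)]
    if not active:
        return None
    return active[0], active[-1]


# Toy signal: silence, then an alternating burst, then silence again.
toy_signal = ([0.0] * 40
              + [1.0 if i % 2 == 0 else -1.0 for i in range(40)]
              + [0.0] * 40)
endpoints = detect_endpoints(toy_signal)
```

For the toy signal, the burst occupies frames 4 through 7, and those are the endpoints returned. A real-time variant would update `e_thr` and `z_thr` periodically from recent non-speech frames.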


Performance Improvement of Voice Dialing System using Post-Processing (후처리를 이용한 음성 다이얼링 시스템의 성능향상)

  • 김원구
    • The Journal of the Acoustical Society of Korea / v.19 no.5 / pp.9-12 / 2000
  • A voice dialing system recognizes the speaker's command and dials the destination phone number automatically. Such a system is useful for wireless handsets and portable communication devices. In a personal voice dialing system, the HMMs for speech recognition are trained on all the commands, which are owner-selected phrases. Its implementation requires much less memory and computation than a speaker-independent system. Because only two or three training utterances per command are used, it is difficult to estimate the state duration distribution accurately enough to improve recognition performance. A post-processor is therefore presented to improve performance. Experiments using a database collected over telephone lines showed that the proposed post-processor improves the performance of the recognition system.


Improved Automatic Lipreading by Multiobjective Optimization of Hidden Markov Models (은닉 마르코프 모델의 다목적함수 최적화를 통한 자동 독순의 성능 향상)

  • Lee, Jong-Seok;Park, Cheol-Hoon
    • The KIPS Transactions:PartB / v.15B no.1 / pp.53-60 / 2008
  • This paper proposes a new multiobjective optimization method for discriminative training of hidden Markov models (HMMs) used as the recognizer for automatic lipreading. Whereas the conventional Baum-Welch algorithm for training HMMs aims to maximize the probability of a class's data under the corresponding HMM, we define a new training criterion composed of two minimization objectives and develop a global optimization method for this criterion based on simulated annealing. A speaker-dependent recognition experiment shows that the proposed method achieves a relative error reduction of about 8% compared with the Baum-Welch algorithm.