• Title/Summary/Keyword: Speech improvement

The Speaker Recognition System using the Pitch Alteration (피치변경을 이용한 화자인식 시스템)

  • Jung JongSoon;Bae MyungJin
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.115-118 / 2002
  • Parameters used in a speaker recognition system should fully express the speaker's characteristics contained in speech. That is, a feature is useful for distinguishing speakers when its inter-speaker variance is larger than its intra-speaker variance. To minimize errors between speakers, improved recognition technology is required in addition to distinguishing features. Recent simulation results show that more accurate performance is obtained by using both dynamic characteristics and static characteristics that arise from speaking habits. We therefore propose using prosodic information as a feature vector of speech. The feature vectors generally used in speaker recognition systems model spectral information and perform well in noise-free conditions. However, these feature vectors are distorted in noisy conditions, which lowers the recognition rate. In this paper, we alter the pitch contour segment by segment so that a dynamic characteristic can be estimated, and we use it as a recognition feature. Simulation confirmed that this dynamic characteristic is very robust in noisy conditions. The accept/reject decision is made by comparing test patterns, and the recognition rate of the proposed algorithm improves on that obtained with spectral and prosodic information. In particular, the simulation shows that a stable recognition rate can be obtained in noisy conditions.


Performance Comparison of Out-Of-Vocabulary Word Rejection Algorithms in Variable Vocabulary Word Recognition (가변어휘 단어 인식에서의 미등록어 거절 알고리즘 성능 비교)

  • 김기태;문광식;김회린;이영직;정재호
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.27-34 / 2001
  • Utterance verification is used in variable vocabulary word recognition to reject words that are not in the vocabulary or that were not recognized correctly. It is an important technology for designing a user-friendly speech recognition system. We propose a new utterance verification algorithm for a no-training utterance verification system based on the minimum verification error. First, using a PBW (Phonetically Balanced Words) DB of 445 words, we create no-training anti-phoneme models that include many PLUs (Phoneme-Like Units), so that the anti-phoneme models have the minimum verification error. Then, for OOV (Out-Of-Vocabulary) rejection, the phoneme-based confidence measure, which uses the likelihood between the phoneme model (null hypothesis) and the anti-phoneme model (alternative hypothesis), is normalized by the null hypothesis, so that it tends to be more robust for OOV rejection. The word-based confidence measure, which is built from the phoneme-based confidence measure, has been shown to improve the detection of near-misses in speech recognition and to discriminate better between in-vocabulary words and OOVs. Using the proposed anti-model and confidence measure, we achieve a significant performance improvement: CA (Correctly Accept for In-Vocabulary) is about 89% and CR (Correctly Reject for OOV) is about 90%, an improvement of about 15-21% in ERR (Error Reduction Rate).
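The confidence measure described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything here is an assumption: the phoneme score is taken to be a log-likelihood ratio divided by the magnitude of the null-hypothesis log-likelihood, the word score is a plain average of phoneme scores, and the function names and threshold are hypothetical.

```python
def phoneme_confidence(ll_phone: float, ll_anti: float) -> float:
    """Log-likelihood ratio between the phoneme model (null hypothesis)
    and the anti-phoneme model (alternative hypothesis), normalized by
    the magnitude of the null-hypothesis log-likelihood (assumed form)."""
    if ll_phone == 0.0:
        raise ValueError("null-hypothesis log-likelihood must be non-zero")
    return (ll_phone - ll_anti) / abs(ll_phone)


def word_confidence(phoneme_scores: list[float]) -> float:
    """Word-level score: average of the phoneme-level scores (assumed)."""
    if not phoneme_scores:
        raise ValueError("need at least one phoneme score")
    return sum(phoneme_scores) / len(phoneme_scores)


def accept(word_score: float, threshold: float = 0.0) -> bool:
    """Accept the word as in-vocabulary if its score reaches the threshold."""
    return word_score >= threshold


# Hypothetical log-likelihoods for a two-phoneme word.
scores = [phoneme_confidence(-50.0, -60.0), phoneme_confidence(-50.0, -45.0)]
decision = accept(word_confidence(scores))
```

In practice the threshold would be tuned on held-out data to trade off correct acceptance against correct rejection.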


Effects of Communication Improvement on Caregivers Education and Training on Aphasia (보호자 교육과 경험학습 훈련이 실어증 환자의 의사소통 개선에 미치는 효과)

  • Park, Hee-June;Chang, Hyun-Jin
    • Therapeutic Science for Rehabilitation / v.8 no.2 / pp.79-88 / 2019
  • Objective: Aphasia interferes with communication between the patient and the conversation partner. Adequate communication therefore calls for education and training not only for the patient but also for the caregiver. Method: This study examined the benefits of caregiver education and group training in improving communication between six aphasic patients and their caregivers (family members). Caregiver education provided caregivers with information on stroke and aphasia, and group training was conducted according to the experiential learning cycle. Result: Communication increased in terms of sending and receiving messages and of interactive communication. Furthermore, the questionnaire analysis showed that caregivers learned more about aphasia and gained confidence in using facilitation strategies. Conclusion: Giving educational opportunities to patients and caregivers improves caregivers' knowledge and has a positive effect on interaction.

Reliability of OperaVOX™ against Multi-Dimensional Voice Program to Assess Voice Quality before and after Laryngeal Microsurgery in Patient with Vocal Polyp (성대 용종 환자의 후두미세수술 전후 음성 평가에서 OperaVOX™와 Multi-Dimensional Voice Program 간의 신뢰도 연구)

  • Kim, Sun Woo;Kim, So Yean;Cho, Jae Kyung;Jin, Sung Min;Lee, Sang Hyuk
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.31 no.2 / pp.71-77 / 2020
  • Background and Objectives: OperaVOX™ (Oxford Wave Research Ltd.) is a portable voice analysis software package designed for use with iOS devices. As a relatively cheap, portable and easily accessible form of acoustic analysis, OperaVOX™ may be more clinically useful than laboratory-based software in many situations. The aim of this study was to evaluate the agreement between OperaVOX™ and the Multi-Dimensional Voice Program (MDVP; Computerized Speech Lab) in assessing voice quality before and after laryngeal microsurgery in patients with vocal polyps. Materials and Method: Twenty patients who had undergone laryngeal microsurgery for vocal polyps were enrolled in this study. Preoperative and postoperative voices were assessed by acoustic analysis using MDVP and OperaVOX™. A five-second recording of the vowel /a/ was used to measure fundamental frequency (F0), jitter, shimmer and noise-to-harmonic ratio (NHR). Results: Several acoustic parameters related to short-term variability showed significant improvement in both MDVP and OperaVOX™. In MDVP, the preoperative values of F0, jitter, shimmer and NHR were 155.75 Hz (male: 125.37 Hz, female: 183.37 Hz), 2.20%, 6.28% and 0.16, and the postoperative values were 164.34 Hz (male: 129.42 Hz, female: 199.26 Hz), 2.15%, 5.18% and 0.14. In OperaVOX™, the preoperative values were 168.26 Hz (male: 135.16 Hz, female: 201.37 Hz), 2.27%, 6.95% and 0.26, and the postoperative values were 162.72 Hz (male: 128.267 Hz, female: 197.18 Hz), 1.71%, 5.36% and 0.20. Intersoftware agreement for F0, jitter and shimmer was high as measured by the intraclass correlation coefficient. Conclusion: Our results showed that the short-term variability of acoustic parameters in both MDVP and OperaVOX™ was useful for the objective assessment of voice quality in patients who received laryngeal microsurgery. OperaVOX™ is comparable to MDVP and has high intersoftware reliability with MDVP in measuring F0, jitter and shimmer.
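As a rough illustration of the short-term variability measures named in this abstract, the sketch below computes jitter and shimmer using the common "local" definition: the mean absolute difference between consecutive cycles divided by the mean value, as a percentage. This is an assumption for illustration only; the exact algorithms inside MDVP and OperaVOX™ are not described in the abstract and may differ. The function names and the sample values are hypothetical.

```python
def relative_perturbation(values: list[float]) -> float:
    """Mean absolute cycle-to-cycle difference divided by the mean value,
    in percent. Applied to glottal periods it gives local jitter; applied
    to peak amplitudes it gives local shimmer."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def mean_f0(periods_s: list[float]) -> float:
    """Mean fundamental frequency in Hz from glottal periods in seconds."""
    return len(periods_s) / sum(periods_s)


# Hypothetical cycle measurements taken from a sustained /a/.
periods = [0.0080, 0.0082, 0.0080, 0.0082]   # seconds per cycle
amplitudes = [10.0, 12.0, 10.0, 12.0]        # arbitrary units

jitter_percent = relative_perturbation(periods)
shimmer_percent = relative_perturbation(amplitudes)
f0_hz = mean_f0(periods)
```

Extracting the per-cycle periods and amplitudes from a recording is the hard part in practice and is deliberately left out of this sketch.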

Phonetic improvement by adjusting the shape of the anterior palate of the maxillary complete denture: a case report (상악 총의치 전방 구개 부위 형태 조정을 통한 발음개선 증례)

  • Yoon, Myeong Ah;Lee, HagYoung;Kim, Jee Hwan
    • The Journal of Korean Academy of Prosthodontics / v.60 no.1 / pp.37-43 / 2022
  • Patients tend to return to normal pronunciation patterns after new dentures are fitted. However, some patients take a long time to adapt to a new complete denture. In this case, the patient presented with the chief complaint of wanting the dentures remade because of wear and tear. After diagnosis through clinical and radiological examination, the maxillary complete denture and the mandibular removable partial denture were remade. At the first check-up after placement of the new denture, the patient complained of a whistling /s/ sound. The anterior palatal area of the polished surface of the new maxillary complete denture was concave compared with the old denture, and this was the cause of the whistling /s/ sound. A tissue conditioning material was applied to the maxillary complete denture, and the patient produced the /s/ sound. The tissue conditioning material was then replaced with self-curing denture base resin, and the patient was immediately satisfied with the clear /s/ sound. For objective assessment, a palatogram and speech analysis software were used. In this case, a patient who complained of difficulty in pronunciation after denture treatment underwent immediate denture repair, which resulted in patient satisfaction and in improved pronunciation as confirmed by objective evaluation.

A Convergence Study for Development of Psychological Language Analysis Program: Comparison of Existing Programs and Trend Analysis of Related Literature (심리학적 언어분석 프로그램 개발을 위한 융합연구: 기존 프로그램의 비교와 관련 문헌의 동향 분석)

  • Kim, Youngjun;Choi, Wonil;Kim, Tae Hoon
    • Journal of the Korea Convergence Society / v.12 no.11 / pp.1-18 / 2021
  • Because content word-based frequency analysis has obvious limitations in handling intentional deception or irony, KLIWC has evolved toward functional word analysis and KrKwic has evolved as a way to visualize co-occurrence frequencies. However, after more than 10 years of development, several issues still need improvement. We therefore set out to develop a new psychological language analysis program by analyzing KLIWC and KrKwic. First, the two programs were analyzed. In particular, the morpheme classifications of KLIWC and of Korean morpheme analyzers were compared in order to enhance functional word analysis, and the psychological dictionary was analyzed in order to strengthen psychological analysis. The analysis showed that the Hannanum part-of-speech analyzer was the most finely subdivided overall, but KLIWC was more finely subdivided for personal pronouns and KKMA for endings, which suggests using multiple part-of-speech analyzers together to strengthen functional word analysis. Second, we analyzed the research trends of studies that used these programs to analyze texts. The analysis showed that the two programs were used in various academic fields, including Interdisciplinary Studies. In particular, KrKwic was used frequently for the analysis of papers and reports, and KLIWC was used frequently for comparative studies of writers' thoughts, emotions, and personality. Based on these results, the necessity and direction of development of a new psychological language analysis program were suggested.

Development of smartphone-based voice therapy program (스마트폰기반 음성치료 프로그램 개발연구)

  • Lee, Ha-Na;Park, Jun-Hee;Yoo, Jae-Yeon
    • Phonetics and Speech Sciences / v.11 no.1 / pp.51-61 / 2019
  • The purpose of this study was to develop a smartphone-based voice therapy program for patients with voice disorders. The content of the voice therapy was gathered through an analysis of mobile content related to voice therapy in Korea and through demand surveys of experts and users, and the program was developed using Android Studio. A user satisfaction evaluation of the application was conducted with five patients with functional voice disorders. The results showed that mobile content related to voice therapy in Korea was mostly related to breathing, followed by voice and singing, but only 13 applications were of practical use for voice therapy. The expert and user demand surveys showed that both patients and therapists had a high need for content that could provide voice training in places other than the treatment room. Based on this analysis, 'Home Voice Trainer', a smartphone-based voice therapy program, was developed. Home Voice Trainer is an application for voice therapy and management on Android smartphones. It is designed for practicing at home the voice therapy activities that were learned in offline sessions. In addition, patients' voice training records are managed online so that patients can maintain their voice improvement through continuous voice consulting even after voice therapy ends. User evaluations show that patients are satisfied with the difficulty and content of the voice therapy programs provided by Home Voice Trainer, but that parts of the user interface, such as the home button and the transitions between screens, are lacking. Further study should examine the clinical application of Home Voice Trainer to patients with voice disorders. It is expected that development studies and clinical applications of smart content related to voice therapy will be actively conducted.

Performance Improvement Methods of a Spoken Chatting System Using SVM (SVM을 이용한 음성채팅시스템의 성능 향상 방법)

  • Ahn, HyeokJu;Lee, SungHee;Song, YeongKil;Kim, HarkSoo
    • KIPS Transactions on Software and Data Engineering / v.4 no.6 / pp.261-268 / 2015
  • In spoken chatting systems, users' spoken queries are converted to text queries by automatic speech recognition (ASR) engines. If the top-1 results of the ASR engines are incorrect, the errors propagate to the spoken chatting systems. To improve the top-1 accuracy of ASR engines, we propose a post-processing model that rearranges the top-n outputs of ASR engines using a ranking support vector machine (RankSVM). Separately, a large number of chatting sentences are needed to train chatting systems. If new chatting sentences are not frequently added to the training data, the responses of the chatting systems soon become outdated. To resolve this problem, we propose a data collection model that automatically selects chatting sentences from TV and movie scenarios using a support vector machine (SVM). In the experiments, the post-processing model showed precision 4.4% higher and recall 6.4% higher than the baseline model (without post-processing). The data collection model showed a high precision of 98.95% and a recall of 57.14%.
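The post-processing step in this abstract, re-ordering the top-n ASR outputs with a learned ranker, can be sketched as follows. A RankSVM with a linear kernel yields a weight vector, and ranking then reduces to sorting by a weighted sum of features. The sketch shows only that scoring step; the features, the weight values and the example sentences are made up for illustration and are not taken from the paper, and no training is performed.

```python
def rerank(hypotheses, weights):
    """Sort (text, features) pairs by a linear score w . x, best first.
    `weights` stands in for a weight vector that a linear RankSVM
    would learn from pairwise preferences."""
    def score(features):
        return sum(w * x for w, x in zip(weights, features))
    return sorted(hypotheses, key=lambda h: score(h[1]), reverse=True)


# Hypothetical top-2 ASR outputs with two made-up features each:
# [ASR confidence, language-model plausibility].
top_n = [
    ("I want to sea a movie", [0.90, 0.20]),
    ("I want to see a movie", [0.85, 0.80]),
]
weights = [1.0, 1.0]  # illustrative values, not trained

best_text = rerank(top_n, weights)[0][0]
```

Here the second hypothesis overtakes the ASR engine's original top-1 result because its combined score is higher.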

Improvement of AMR Data Compression Using the Context Tree Weighting Method (Context Tree Weighting을 이용한 AMR 음성 데이터 압축 성능 개선)

  • Lee, Eun-su;Oh, Eun-ju;Yoo, Hoon
    • Journal of Internet Computing and Services / v.21 no.4 / pp.35-41 / 2020
  • This paper proposes an algorithm that improves the compression performance of adaptive multi-rate (AMR) speech coding using the context tree weighting (CTW) method. AMR is the voice encoding standard adopted by IMT-2000 and supports 8 transmission rates from 4.75 kbit/s to 12.2 kbit/s to cope with changes in channel conditions. CTW, a form of arithmetic coding, uses a variable-order Markov model. Considering that CTW operates bit by bit, we propose an algorithm that re-orders AMR data and compresses them with CTW. To verify the validity of the proposed algorithm, an experiment compares it with existing compression methods, including ZIP, in terms of compression ratio. Experimental results indicate that the average additional compression rate on AMR data is about 3.21% with ZIP and about 9.10% with the proposed algorithm. Thus our algorithm improves the compression performance on AMR data by about 5.89 percentage points.
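The arithmetic behind the reported figures can be made explicit with a small Python sketch. It assumes that "additional compression rate" means the percentage of bytes saved relative to the already-encoded input, which the abstract does not define. It uses the standard-library `zlib` module as a stand-in for ZIP on synthetic data, and it does not implement CTW or process real AMR frames.

```python
import zlib


def saving_percent(original: bytes, compressed: bytes) -> float:
    """Percentage of bytes saved by compression (assumed definition of
    the 'additional compression rate' in the abstract)."""
    if not original:
        raise ValueError("original data must be non-empty")
    return 100.0 * (1.0 - len(compressed) / len(original))


# Synthetic, highly repetitive data; real AMR payloads compress far less.
sample = bytes(range(32)) * 64
zlib_saving = saving_percent(sample, zlib.compress(sample, 9))

# Figures quoted in the abstract, in percent.
reported_zip = 3.21
reported_ctw = 9.10
gain_points = round(reported_ctw - reported_zip, 2)  # percentage points
```

The difference of the two reported rates is 5.89 percentage points, which matches the improvement stated in the abstract.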

Vocabulary Recognition Performance Improvement using a convergence of Bayesian Method for Parameter Estimation and Bhattacharyya Algorithm Model (모수 추정을 위한 베이시안 기법과 바타차랴 알고리즘을 융합한 어휘 인식 성능 향상)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence / v.13 no.10 / pp.353-358 / 2015
  • A vocabulary recognition system built by recognizing a standard vocabulary shows a decline in recognition when it encounters words that are outside the standard or similar to one another. In such cases, one way to solve the problem is to reconstruct the system so as to add to or extend the vocabulary. This paper proposes a speech recognition learning model that combines the Bhattacharyya algorithm with Bayesian methods, whose parameter estimation supports scalable model configuration. The standard model is corrected on the basis of phoneme characteristics, using Bayesian methods for parameter estimation on the phoneme data and the Bhattacharyya algorithm for similar models. The recognition model configured with the Bhattacharyya algorithm was then evaluated for recognition performance. Applying the proposed method gave a recognition rate of 97.3% and a learning curve of 1.2 seconds.
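To give a concrete sense of how a Bhattacharyya measure can flag similar models, the sketch below computes the Bhattacharyya distance between two univariate Gaussian distributions using the standard closed form. Treating each phoneme model as a single one-dimensional Gaussian is an assumption made for illustration; the abstract does not say how the measure is applied, and the parameter values are hypothetical.

```python
import math


def bhattacharyya_gaussian(mu1: float, var1: float,
                           mu2: float, var2: float) -> float:
    """Bhattacharyya distance between N(mu1, var1) and N(mu2, var2).
    Zero means identical distributions; larger means less overlap."""
    if var1 <= 0.0 or var2 <= 0.0:
        raise ValueError("variances must be positive")
    var_sum = var1 + var2
    mean_term = 0.25 * (mu1 - mu2) ** 2 / var_sum
    var_term = 0.5 * math.log(var_sum / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


# Hypothetical one-dimensional phoneme models.
same = bhattacharyya_gaussian(0.0, 1.0, 0.0, 1.0)
shifted = bhattacharyya_gaussian(0.0, 1.0, 2.0, 1.0)
```

A small distance between two models indicates that they are easily confused, which is the situation the abstract describes for similar words.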