• Title/Summary/Keyword: Non-speech

Search Results: 470

Development of Automatic Creating Web-Site Tool for the Blind (시각장애인용 웹사이트 자동생성 툴 개발)

  • Baek, Hyeun-Ki;Ha, Tai-Hyun
    • Journal of Digital Contents Society
    • /
    • v.8 no.4
    • /
    • pp.467-474
    • /
    • 2007
  • This paper documents the design and implementation of a tool that automatically creates web sites, enabling the blind to build their own homepages as easily as the non-disabled by combining voice recognition with voice output technology. Using the tool, blind users can manage voice mail, schedules, address lists and bookmarks, and its information management system also facilitates communication with the non-disabled. The tool maps basic commands to voice recognition and provides text-to-speech voice output. Ultimately, the tool aims to reduce the social isolation of the blind, allowing them to participate in the information age like the non-disabled.


An Acoustic Study of English Non-Phoneme Schwa and the Korean Full Vowel /e/

  • Ahn, Soo-Woong
    • Speech Sciences
    • /
    • v.7 no.4
    • /
    • pp.93-105
    • /
    • 2000
  • The English schwa has special characteristics that distinguish it from other vowels: it is non-phonemic and occurs only in unstressed syllables. In contrast, the Korean /e/ is a full vowel with phonemic contrast. This paper had three aims: first, to see whether there is any relationship between English full vowels and their reduced schwa variants; second, to see whether there is a possible target for English schwas derived from different full vowels; and third, to compare the English non-phonemic schwa and the Korean full vowel /e/ in terms of articulatory position and duration. The results showed no relationship between each full vowel and its schwa. The schwas tended to converge toward a possible target at F1 456 Hz and F2 1560 Hz. The Korean /e/ appeared to have a distinct, speaker-specific position different from the neutral tongue position. Evidence that the Korean /e/ is a back vowel was supported by the Seoul dialect speaker. In duration, the English schwa was much shorter than the full vowels, whereas there was no significant difference in length between the Korean /e/ and other Korean vowels.
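
The convergence claim can be made concrete with a small sketch: measure each schwa token's distance to the reported target in the F1-F2 plane. The token values below are illustrative assumptions, not the study's data; only the target (F1 456 Hz, F2 1560 Hz) comes from the abstract.

```python
import math

# Hypothetical F1/F2 measurements (Hz) of schwas derived from different
# full vowels; the values are invented for illustration.
schwa_tokens = {
    "from_/i/": (430, 1650),
    "from_/a/": (480, 1500),
    "from_/o/": (460, 1520),
}

TARGET = (456, 1560)  # convergence target reported in the study

def distance_to_target(f1, f2, target=TARGET):
    """Euclidean distance in the F1-F2 plane to the schwa target."""
    return math.hypot(f1 - target[0], f2 - target[1])

for label, (f1, f2) in schwa_tokens.items():
    print(f"{label}: {distance_to_target(f1, f2):.1f} Hz from target")
```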


Prediction of Break Indices in Korean Read Speech (국어 낭독체 발화의 운율경계 예측)

  • Kim Hyo Sook;Kim Chung Won;Kim Sun Ju;Kim Seoncheol;Kim Sam Jin;Kwon Chul Hong
    • MALSORI
    • /
    • no.43
    • /
    • pp.1-9
    • /
    • 2002
  • This study aims to model Korean prosodic phrasing using the CART (classification and regression tree) method. Our data are limited to Korean read speech: 400 sentences drawn from editorials, essays, novels and news scripts, read by a professional radio actress for about two hours. We used the K-ToBI transcription system; for technical reasons, the original break indices 1 and 2 were merged into AP, so unlike the original K-ToBI we use three break indices: Zero, AP and IP. The linguistic information selected for this study is as follows: the number of syllables in an 'Eojeol', the location of the 'Eojeol' in the sentence, and the part of speech (POS) of adjacent 'Eojeol's. We trained a CART tree using this information as variables. Average accuracy in predicting Non-IP (Zero and AP) versus IP was 90.4% on training data and 88.5% on test data; average accuracy in predicting Zero versus AP was 79.7% on training data and 78.7% on test data.
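
The core of CART is an exhaustive search for the feature/threshold split that minimizes weighted Gini impurity. A minimal sketch of that step, with invented 'Eojeol' features (syllable count, normalized sentence position) and break labels standing in for the paper's corpus:

```python
# The core CART step: pick the (feature, threshold) split minimizing
# weighted Gini impurity. Data are illustrative, not the study's corpus.

def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(X, y):
    """Exhaustively search (feature, threshold) pairs as CART does."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

# [syllables in Eojeol, normalized position in sentence] -> break label
X = [[2, 0.1], [3, 0.3], [4, 0.6], [3, 0.9], [2, 1.0]]
y = ["Zero", "AP", "AP", "IP", "IP"]
score, feature, threshold = best_split(X, y)
print(feature, threshold)  # → 1 0.6 (position separates IP boundaries)
```

A full tree repeats this split recursively on each side; scikit-learn's `DecisionTreeClassifier` implements the same CART procedure.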


Boll's Spectral Subtraction Algorithm by New Voice Activity Detection (새로운 음성 활동 검출법에 의한 Boll의 스펙트럼 차감 알고리즘)

  • 류종훈;김대경;박장식;손경식
    • Journal of Korea Multimedia Society
    • /
    • v.4 no.1
    • /
    • pp.46-55
    • /
    • 2001
  • In this paper, a new voice activity detection (VAD) method is proposed that estimates the SNR of speech enhanced with extended spectral subtraction (ESS). Voice activity detection is performed by placing a second Wiener filter behind the Wiener filter used in ESS to estimate the speech and noise power of the first Wiener filter's output. The proposed VAD method requires little computation and performs well at low input SNR. Boll's spectral subtraction algorithm with the proposed VAD was compared to ESS under several noise environments with different time-frequency distributions. During both speech and non-speech activity, the performance of Boll's spectral subtraction algorithm with the proposed VAD is superior to that of ESS.
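
The role the VAD plays in Boll's method is to identify non-speech frames from which the noise spectrum can be estimated. A minimal magnitude spectral-subtraction sketch, with a synthetic tone as "speech" and frames assumed (in place of a real VAD decision) to be noise-only:

```python
# Magnitude spectral subtraction in the spirit of Boll's method:
# average the magnitude spectrum over noise-only frames, subtract it
# from each frame, and floor negative magnitudes at zero.
import numpy as np

rng = np.random.default_rng(0)
frame_len = 256
noise = 0.1 * rng.standard_normal((20, frame_len))   # noise-only frames
tone = np.sin(2 * np.pi * 8 * np.arange(frame_len) / frame_len)
noisy_speech = tone + 0.1 * rng.standard_normal(frame_len)

# Noise spectrum estimate: mean magnitude over the non-speech frames.
noise_mag = np.abs(np.fft.rfft(noise, axis=1)).mean(axis=0)

spec = np.fft.rfft(noisy_speech)
mag = np.abs(spec)
clean_mag = np.maximum(mag - noise_mag, 0.0)  # half-wave rectification
# Reuse the noisy phase, as spectral subtraction does.
enhanced = np.fft.irfft(clean_mag * np.exp(1j * np.angle(spec)), frame_len)

noisy_mse = float(np.mean((noisy_speech - tone) ** 2))
enhanced_mse = float(np.mean((enhanced - tone) ** 2))
print(noisy_mse, enhanced_mse)  # enhancement lowers the error vs. the tone
```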


Development of Joystick & Speech Recognition Moving Machine Control System (조이스틱 및 음성인식 겸용 이동기제어시스템 개발)

  • Lee, Sang-Bae;Kang, Sung-In
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.17 no.1
    • /
    • pp.52-57
    • /
    • 2007
  • This paper presents the design of an intelligent moving machine control system using real-time speech recognition. The proposed system is composed of four separate modules: a main control module, a speech recognition module, a servo motor driving module and a sensor module. In the main control module, built around an 80C196KC microprocessor, fuzzy logic, a branch of artificial intelligence, was applied to the proposed intelligent control system. To handle the non-linear characteristics that depend on the user's weight and a variable environment, encoders attached to the servo motors were used for feedback control. The proposed system was tested using nine words for control of the mobile robot, and the performance of the mobile robot under voice and joystick commands was also evaluated.
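
A fuzzy feedback loop of the kind described typically maps the encoder speed error to a motor correction through membership functions and centroid defuzzification. The rules and constants below are illustrative assumptions, not the authors' design:

```python
# Minimal fuzzy-control sketch: triangular memberships over the speed
# error, one crisp output per rule, centroid (weighted-average) defuzz.

def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_correction(error):
    # (membership degree, crisp correction for that rule)
    rules = [
        (tri(error, -10, -5, 0), -3.0),  # error negative -> slow down
        (tri(error, -5, 0, 5), 0.0),     # error near zero -> hold
        (tri(error, 0, 5, 10), 3.0),     # error positive -> speed up
    ]
    num = sum(w * out for w, out in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.0

print(fuzzy_correction(0.0))  # → 0.0
print(fuzzy_correction(5.0))  # → 3.0
print(fuzzy_correction(2.5))  # → 1.5 (blend of "hold" and "speed up")
```

On a microcontroller such as the 80C196KC the same logic would run in fixed point inside the encoder-feedback loop.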

Recent advances in genetic studies of stuttering

  • Kang, Changsoo
    • Journal of Genetic Medicine
    • /
    • v.12 no.1
    • /
    • pp.19-24
    • /
    • 2015
  • Speech and language are uniquely human traits that contributed to humans becoming the predominant species on Earth. Disruptions of human speech and language function can result in diverse disorders, including stuttering, aphasia, articulation disorder, spasmodic dysphonia, verbal dyspraxia, dyslexia and specific language impairment. Among these, stuttering is the most common speech disorder, characterized by disruptions in the normal flow of speech. Twin, adoption and family studies have suggested that genetic factors contribute to susceptibility to stuttering. Over several decades, multiple genetic studies, including linkage analyses, have been performed to link causative genes to stuttering, and several have revealed associations between specific gene mutations and stuttering. One notable discovery came from genetic studies of consanguineous Pakistani families, which suggested that mutations in lysosomal enzyme-targeting pathway genes (GNPTAB, GNPTG and NAGPA) are associated with non-syndromic persistent stuttering. Although these studies provide some clues to the genetic causes of stuttering, only a small fraction of patients are affected by these genes. Here, we summarize recent advances and future challenges in the effort to understand the genetic causes underlying stuttering.

A Study on the Reconstruction of a Frame Based Speech Signal through Dictionary Learning and Adaptive Compressed Sensing (Adaptive Compressed Sensing과 Dictionary Learning을 이용한 프레임 기반 음성신호의 복원에 대한 연구)

  • Jeong, Seongmoon;Lim, Dongmin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.37A no.12
    • /
    • pp.1122-1132
    • /
    • 2012
  • Compressed sensing has been applied to many fields, such as images, speech signals and radar. It has mainly been applied to stationary signals, and reconstruction error can grow as the compression ratio is increased by reducing the number of measurements. To address this problem, speech signals are divided into frames and processed in parallel. The frames are made sparse by dictionary learning, and adaptive compressed sensing is applied, which adapts the compressed sensing reconstruction matrix using the difference between the sparse coefficient vector and its reconstruction. With the proposed method, we show that fast and accurate reconstruction of non-stationary signals is possible with compressed sensing.
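
The pipeline's core idea is that a frame which is sparse in a learned dictionary can be recovered from few random measurements. In the sketch below a random dictionary stands in for a learned one, and orthogonal matching pursuit (a standard greedy solver, not necessarily the authors' adaptive scheme) plays the role of the sparse reconstruction step:

```python
# Sparse frame recovery from compressed measurements via OMP.
# Dictionary, sensing matrix and coefficients are synthetic stand-ins.
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 64, 24, 3              # frame length, measurements, sparsity
D = rng.standard_normal((n, n))  # stand-in for a learned dictionary
D /= np.linalg.norm(D, axis=0)   # unit-norm atoms

x_true = np.zeros(n)
x_true[[3, 17, 40]] = [1.5, -2.0, 1.0]         # sparse coefficients
frame = D @ x_true
Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # sensing matrix
y = Phi @ frame                                 # compressed measurements

A = Phi @ D                      # effective dictionary seen by the solver
support, residual = [], y.copy()
for _ in range(k):               # greedy atom selection (OMP)
    support.append(int(np.argmax(np.abs(A.T @ residual))))
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    residual = y - A[:, support] @ coef

x_hat = np.zeros(n)
x_hat[support] = coef
print(sorted(support), float(np.linalg.norm(residual)))
```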

Artificial Intelligence for Clinical Research in Voice Disease (후두음성 질환에 대한 인공지능 연구)

  • Seok, Jungirl;Kwon, Tack-Kyun
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • /
    • v.33 no.3
    • /
    • pp.142-155
    • /
    • 2022
  • Diagnosis using the voice is non-invasive and can be implemented with a variety of voice recording devices, so it can serve as a screening or diagnostic assistant for laryngeal voice disease, helping clinicians. The development of artificial intelligence algorithms such as machine learning, led by the latest deep learning technology, began with binary classification distinguishing normal from pathological voices and has since improved the accuracy of multi-class classification of various types of pathological voice. However, no conclusions that can be applied in the clinical field have yet been reached. Most studies on pathological voice classification have used the sustained short vowel /ah/, which is easier to work with than continuous or running speech. However, continuous speech has the potential to yield more accurate results, since additional information can be obtained from changes in the voice signal over time. In this review, terms related to artificial intelligence research are explained, the latest trends in machine learning and deep learning algorithms are reviewed, and the latest research results and their limitations are introduced to provide future directions for researchers.
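
The normal-versus-pathological binary setup typically starts from perturbation features of a sustained /ah/. A sketch of one standard feature, local jitter (mean absolute period-to-period difference over the mean period), using hypothetical glottal period lengths in milliseconds:

```python
# Local jitter from a sequence of glottal period lengths (ms).
# Period values are invented; pathological voices show higher jitter.

def local_jitter(periods):
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

steady = [8.0, 8.02, 7.98, 8.01, 7.99]    # near-periodic voice
perturbed = [8.0, 8.9, 7.2, 8.6, 7.4]     # irregular voice

print(round(local_jitter(steady), 4), round(local_jitter(perturbed), 4))
```

Features like this (jitter, shimmer, harmonics-to-noise ratio) are what the binary classifiers described above consume before deep models made raw-signal input practical.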

The Effect of Strong Syllables on Lexical Segmentation in English Continuous Speech by Korean Speakers (강음절이 한국어 화자의 영어 연속 음성의 어휘 분절에 미치는 영향)

  • Kim, Sunmi;Nam, Kichun
    • Phonetics and Speech Sciences
    • /
    • v.5 no.2
    • /
    • pp.43-51
    • /
    • 2013
  • English native listeners tend to treat strong syllables in a speech stream as the potential initial syllables of new words, since the majority of lexical words in English have word-initial stress. The current study investigates whether Korean (L1) - English (L2) late bilinguals perceive strong syllables in English continuous speech as word onsets, as English native listeners do. In Experiment 1, word-spotting was slower when the word-initial syllable was strong, indicating that Korean listeners do not perceive strong syllables as word onsets. Experiment 2 was conducted to rule out the possibility that the results of Experiment 1 were due to the strong-initial targets themselves being slower to recognize than the weak-initial targets. We employed the gating paradigm in Experiment 2 and measured the Isolation Point (IP, the point at which participants correctly identify a word without subsequently changing their minds) and the Recognition Point (RP, the point at which participants correctly identify the target with 85% or greater confidence) for the targets excised from the non-words in the two conditions of Experiment 1. Both the mean IPs and the mean RPs were significantly earlier for the strong-initial targets, which means that the results of Experiment 1 reflect the difficulty of segmentation when the initial syllable of a word was strong. These results are consistent with Kim & Nam (2011), indicating that strong syllables are not perceived as word onsets by Korean listeners and interfere with lexical segmentation in English running speech.
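
The two gating measures can be stated precisely in code. A small helper, with invented gate-by-gate responses (the 85% confidence threshold is the one given in the abstract):

```python
# Isolation Point (IP): first gate from which the response stays correct.
# Recognition Point (RP): first gate from which the response stays correct
# with confidence >= 0.85. Gate data below are invented for illustration.

def isolation_point(responses, target):
    """First gate index from which every later response equals the target."""
    for i in range(len(responses)):
        if all(r == target for r in responses[i:]):
            return i
    return None

def recognition_point(responses, confidences, target, threshold=0.85):
    for i in range(len(responses)):
        if all(r == target and c >= threshold
               for r, c in zip(responses[i:], confidences[i:])):
            return i
    return None

responses = ["cap", "cat", "cat", "cat"]   # one guess per gate
confidences = [0.3, 0.6, 0.9, 0.95]
print(isolation_point(responses, "cat"))                  # → 1
print(recognition_point(responses, confidences, "cat"))   # → 2
```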

Speech Quality Estimation Algorithm using a Harmonic Modeling of Reverberant Signals (반향 음성 신호의 하모닉 모델링을 이용한 음질 예측 알고리즘)

  • Yang, Jae-Mo;Kang, Hong-Goo
    • Journal of Broadcast Engineering
    • /
    • v.18 no.6
    • /
    • pp.919-926
    • /
    • 2013
  • The acoustic signal from a distant sound source in an enclosed space often produces reverberant sound that varies with the room impulse response. Estimating the level of reverberation, or the quality of the observed signal, is important because it provides valuable information about the system's operating environment, and it is also useful for designing a dereverberation system. This paper proposes a speech quality estimation method based on the harmonicity of the received signal, a distinctive characteristic of voiced speech. First, we show that harmonic signal modeling of a reverberant signal is reasonable. Then, the ratio between the harmonically modeled signal and the estimated non-harmonic signal is used as an estimate of a standard room-acoustical parameter related to speech clarity. Experimental results show that the proposed method successfully estimates speech quality as the reverberation time varies from 0.2 s to 1.0 s. Finally, we confirm the superiority of the proposed method in both background-noise and reverberant environments.
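
The harmonic-modeling step can be sketched as a least-squares fit of sinusoids at multiples of F0: the fit is the harmonic part, and everything left over is the non-harmonic part whose energy ratio serves as the quality measure. The F0 and test signal below are synthetic assumptions, not the paper's data:

```python
# Fit cos/sin pairs at harmonics of F0 by least squares; the residual
# approximates the non-harmonic (noise/reverberation) component.
import numpy as np

fs, f0, n_harm = 8000, 200, 5
t = np.arange(400) / fs                      # 50 ms frame
clean = sum(np.cos(2 * np.pi * f0 * (h + 1) * t) / (h + 1)
            for h in range(n_harm))          # synthetic voiced frame
rng = np.random.default_rng(2)
observed = clean + 0.05 * rng.standard_normal(t.size)  # corrupted frame

# Design matrix: cosine and sine at each harmonic frequency.
cols = []
for h in range(1, n_harm + 1):
    cols.append(np.cos(2 * np.pi * f0 * h * t))
    cols.append(np.sin(2 * np.pi * f0 * h * t))
H = np.stack(cols, axis=1)

coef, *_ = np.linalg.lstsq(H, observed, rcond=None)
harmonic = H @ coef                          # modeled harmonic part
residual = observed - harmonic               # estimated non-harmonic part

hnr_db = 10 * np.log10(np.sum(harmonic ** 2) / np.sum(residual ** 2))
print(round(float(hnr_db), 1))               # harmonic-to-non-harmonic ratio, dB
```

In the paper's setting this ratio would be tracked against a room-acoustical clarity parameter rather than used directly; the sketch only shows the decomposition.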