• Title/Summary/Keyword: High Vowel

Combined Artificial Bee Colony for Data Clustering (융합 인공벌군집 데이터 클러스터링 방법)

  • Kang, Bum-Su;Kim, Sung-Soo
    • Journal of Korean Society of Industrial and Systems Engineering / v.40 no.4 / pp.203-210 / 2017
  • Data clustering is one of the most difficult and challenging problems and can be formally considered a particular kind of NP-hard grouping problem. The K-means algorithm is one of the most popular and widely used clustering methods because it is easy to implement and very efficient. However, it is prone to becoming trapped in local optima, and on large data sets its solutions vary widely with different initializations. Therefore, efficient computational intelligence methods are needed to find the globally optimal solution of the data clustering problem within limited computational time. The objective of this paper is to propose a combined artificial bee colony (CABC) method that uses K-means for initialization and finalization to find effective solutions to the data clustering optimization problem. The artificial bee colony (ABC) is an algorithm motivated by the intelligent behavior exhibited by honeybees when searching for food. The performance of ABC is better than or similar to that of other population-based algorithms, with the added advantage of employing fewer control parameters. Our proposed CABC method is able to provide near-optimal solutions within reasonable time by balancing converged and diversified searches. The experiments and analysis of clustering problems demonstrate that CABC is competitive with previous partitioning approaches with respect to solution quality. We validate the performance of CABC on the Iris, Wine, Glass, Vowel, and Cloud datasets from the UCI machine learning repository, comparing against previous studies. Our proposed KABCK (K-means+ABC+K-means) is better than ABCK (ABC+K-means), KABC (K-means+ABC), ABC, and K-means in our simulations.
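
The K-means + ABC + K-means pipeline described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the function names (`sse`, `kmeans`, `abc_refine`, `kabck`), the parameter values, and the omission of the onlooker-bee phase are all assumptions made for brevity. Each "food source" is one candidate set of cluster centers, scored by the sum of squared errors.

```python
import random

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sum((p - c) ** 2 for p, c in zip(pt, ctr)) for ctr in centers)
               for pt in points)

def kmeans(points, centers, iters=20):
    """Lloyd's iterations starting from the given centers."""
    centers = [list(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for pt in points:
            j = min(range(len(centers)),
                    key=lambda i: sum((p - c) ** 2 for p, c in zip(pt, centers[i])))
            groups[j].append(pt)
        for i, g in enumerate(groups):
            if g:  # keep the old center when a cluster is empty
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def abc_refine(points, sources, cycles=50, limit=10, rng=random):
    """Simplified ABC search (employed and scout bees only; needs >= 2 sources)."""
    trials = [0] * len(sources)
    best = min(sources, key=lambda s: sse(points, s))
    for _ in range(cycles):
        for i in range(len(sources)):
            # Employed bee: perturb one coordinate relative to a random neighbor.
            k = rng.choice([j for j in range(len(sources)) if j != i])
            cand = [list(c) for c in sources[i]]
            ci, d = rng.randrange(len(cand)), rng.randrange(len(cand[0]))
            cand[ci][d] += rng.uniform(-1, 1) * (cand[ci][d] - sources[k][ci][d])
            if sse(points, cand) < sse(points, sources[i]):
                sources[i], trials[i] = cand, 0   # greedy acceptance
            else:
                trials[i] += 1
            # Scout bee: abandon a source that has stopped improving.
            if trials[i] > limit:
                sources[i] = [list(rng.choice(points)) for _ in sources[i]]
                trials[i] = 0
        current = min(sources, key=lambda s: sse(points, s))
        if sse(points, current) < sse(points, best):
            best = [list(c) for c in current]
    return best

def kabck(points, k, n_sources=6, seed=0):
    """K-means initialization, ABC search, then K-means finalization."""
    rng = random.Random(seed)
    sources = [kmeans(points, rng.sample(points, k), iters=5) for _ in range(n_sources)]
    return kmeans(points, abc_refine(points, sources, rng=rng))
```

In this sketch the short K-means runs seed the colony with reasonable food sources, and the final K-means pass polishes the best source found, so the result is never worse (in SSE) than that source.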

Voice Analysis before and after Radioactive Iodine Ablation in Patients with Total Thyroidectomy (적갑상선 전절제술 환자의 방사성 동위원소치료 전.후 음성의 변화에 대한 연구)

  • Hong, Ki Hwan;Seo, Eun Ji;Lee, Hyun Doo;Yoon, Yun Sub;Lim, Seok Tae
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.24 no.1 / pp.33-40 / 2013
  • Background and Objectives: This study objectively compares and analyzes the acoustic changes in patients with total thyroidectomy before and after RI therapy. Subjects and Methods: A total of 50 patients with total thyroidectomy participated as subjects. Voice samples were obtained post-operation (Post-OP), before high-dose radioactive iodine therapy (Pre-RIT), and after high-dose radioactive iodine therapy (Post-RIT). Acoustic analysis, the maximum phonation time, and the K-VHI (Korean-Voice Handicap Index) were used for subjective evaluation. Results: In the comparison of the three periods, mFo (Hz) was significantly reduced in both vowels /a/ and /i/ as the hormone was discontinued. This may be related to the reduction in vocal range. As thyroid hormone was discontinued, Shim (%) and APQ (%) values, which are the parameters related to the degree of aggressiveness, showed a significant increase in the middle vowel /a/. As thyroid hormone was discontinued, the emotional index of the VHI (Voice Handicap Index) decreased significantly. Conclusion: These results suggest that thyroid hormone suspension is related to increased changes in vocal intensity, increased noise, and reduced vocal range. Emotionally, these data suggest that the responsive factors of one's own voice disorders were significantly decreased in the patients with vocal handicap.

A study on the clinical utility of voiced sentences in acoustic analysis for pathological voice evaluation (장애음성의 음향학적 분석에서 유성음 문장의 임상적 유용성에 관한 연구)

  • Ji-sung Kim
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.298-303 / 2023
  • This study aimed to investigate the clinical utility of voiced sentence tasks for voice evaluation. To this end, we analyzed the correlation between perturbation-based acoustic measurements [jitter percent (jitter), shimmer percent (shimmer), Noise to Harmonic Ratio (NHR)] obtained from sustained vowel phonation and cepstrum-based acoustic measurements [Cepstral Peak Prominence (CPP), Low/High spectral ratio (L/H ratio)] obtained from voiced sentences. In data collected from 65 patients with voice disorders, CPP correlated significantly with jitter (r = -.624, p = .000), shimmer (r = -.530, p = .000), and NHR (r = -.469, p = .000). This suggests that cepstral measurement of voiced sentences can serve as an alternative that addresses limitations in the analysis of pathological voice, such as cases where perturbation-based acoustic measurement is not possible and results that differ depending on the analysis section.
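
The correlation coefficients reported in this abstract are Pearson's r. A minimal sketch of that computation is given below; the function name `pearson_r` and the sample values are hypothetical and are not data from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical CPP (dB) and jitter (%) values, for illustration only.
cpp = [12.1, 10.4, 8.9, 7.5, 6.2]
jitter = [0.6, 1.1, 1.9, 2.4, 3.8]
print(round(pearson_r(cpp, jitter), 3))  # negative: higher CPP goes with lower jitter
```

A negative r, as in the abstract, means that voices with a more prominent cepstral peak tend to show less perturbation.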

Acoustic Voice Tremor index in the measurement of voice tremor: Development and validation (음성 떨림 측정을 위한 AVTI(Acoustic Voice Tremor index)의 개발과 검증)

  • Geun-Hyo Kim;Yeon-Woo Lee
    • Phonetics and Speech Sciences / v.16 no.2 / pp.91-97 / 2024
  • The aim of this study was to develop and validate the Acoustic Voice Tremor index (AVTI) for the acoustic measurement of voice tremor. A total of 71 normal adults and 41 patients with voice tremor participated in the study. The vowel /a/ was recorded for at least five seconds. A stable three-second portion of each vowel was extracted, and 18 variables related to voice tremor were measured using a Praat script. These variables and the overall severity (OS) from auditory-perceptual assessment were used to design the AVTI by linear regression analysis. The linear regression analysis identified four of the 18 variables as significant, and a regression equation was constructed. Furthermore, internal and external validity studies demonstrated high correlations, with an average of over 0.8. The AVTI demonstrated a high correlation of 0.841 with OS. The AVTI was found to be capable of predicting voice tremor. Further studies should include a larger number of voice samples and a complementary Praat script for further analysis.
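
A regression-based index of this kind generally takes the form below. This is only the generic shape of a four-predictor linear regression: the abstract does not name the four variables or give their coefficients, so the symbols $x_i$ and $\beta_i$ and the presence of an intercept $\beta_0$ are assumptions.

```latex
\mathrm{AVTI} = \beta_0 + \sum_{i=1}^{4} \beta_i x_i
```

Here each $x_i$ would be one of the four significant acoustic variables, and the $\beta_i$ would be fitted so that the index tracks the perceptual OS rating.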

Speech Synthesis Based on CVC Speech Segments Extracted from Continuous Speech (연속 음성으로부터 추출한 CVC 음성세그먼트 기반의 음성합성)

  • 김재홍;조관선;이철희
    • The Journal of the Acoustical Society of Korea / v.18 no.7 / pp.10-16 / 1999
  • In this paper, we propose a concatenation-based speech synthesizer using CVC (consonant-vowel-consonant) speech segments extracted from an undesigned continuous speech corpus. Natural synthetic speech can be generated by properly modeling coarticulation effects between phonemes and by using natural prosodic variations. In general, the CVC synthesis unit shows smaller acoustic degradation of speech quality because its concatenation points are located in the consonant region, and it can properly model the coarticulation of vowels that are affected by surrounding consonants. We analyze the characteristics and the number of required synthesis units for four types of speech synthesis methods that use CVC synthesis units. Furthermore, we compare the speech quality of the four types and propose a new synthesis method based on the most promising type in terms of speech quality and implementability. We then implement the method using the speech corpus and synthesize various examples. The CVC speech segments that are not in the speech corpus are substituted by demonstrate speech segments. Experiments demonstrate that CVC speech segments extracted from a continuous speech corpus of about 100 Mbytes can produce high-quality synthetic speech.

Speaker Adapted Real-time Dialogue Speech Recognition Considering Korean Vocal Sound System (한국어 음운체계를 고려한 화자적응 실시간 단모음인식에 관한 연구)

  • Hwang, Seon-Min;Yun, Han-Kyung;Song, Bok-Hee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.6 no.4 / pp.201-207 / 2013
  • Voice recognition techniques have been developed and actively applied to various information devices such as smartphones and car navigation systems. However, the basic research related to speech recognition is based on results for English. Lip-sync production generally requires tedious manual work by animators, which seriously affects the production cost and development period needed to obtain high-quality lip animation. In this research, a real-time automatic lip-sync algorithm for virtual characters in digital content is studied by considering the Korean vocal sound system. The suggested algorithm contributes to producing natural lip animation at a lower production cost and with a shorter development period.

Assembling Disjoint Korean Syllables Using Two-Step Rules (2단계 규칙을 이용한 해체된 한글 음절의 결합)

  • Lee, Joo-Ho;Kim, Hark-Soo
    • Korean Journal of Cognitive Science / v.19 no.3 / pp.283-295 / 2008
  • With the increasing use of messengers and SMS, many young people habitually write a new style of sentence with intentionally disjoint Korean syllables. To develop a natural language interface system for these environments, we should first develop a technique that converts a sequence of disjoint Korean syllables into a correct sentence. Therefore, we propose a method that assembles a sequence of disjoint Korean syllables into a correct sentence by using two-step rules. In the first step, the proposed method assembles CVC (consonant-vowel-consonant) forms of simple-disjoint Korean syllables by using manual heuristic rules. In the second step, the proposed method assembles CCVCC forms of double-disjoint Korean syllables by using a mapping table and a transformation-based learning technique. In the experiment, the proposed method showed a perfect precision of 100% in assembling simple-disjoint Korean syllables and a high precision of 99.98% in assembling double-disjoint Korean syllables.
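
To illustrate what assembling CVC forms involves, the sketch below composes Hangul syllables from separately typed jamo using the standard Unicode composition formula (0xAC00 + (lead × 21 + vowel) × 28 + tail). The greedy rule in `assemble` is a simple stand-in for illustration and is not the paper's heuristic rule set; it handles only single consonants and simple vowels, not the double-disjoint (CCVCC) case.

```python
# Compatibility jamo in Unicode syllable-composition order.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 leading consonants
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"     # 21 vowels
JONGSEONG = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ", "ㄻ", "ㄼ", "ㄽ",
             "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ", "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ",
             "ㅍ", "ㅎ"]                                      # 28 finals ("" = none)

def compose(lead, vowel, tail=""):
    """Combine one lead consonant, one vowel, and an optional final consonant."""
    index = (CHOSEONG.index(lead) * 21 + JUNGSEONG.index(vowel)) * 28
    return chr(0xAC00 + index + JONGSEONG.index(tail))

def assemble(jamo):
    """Greedy CV(C) assembly: a consonant becomes a final only if the
    character after it is not a vowel (otherwise it starts the next syllable)."""
    out, i = [], 0
    while i < len(jamo):
        if i + 1 < len(jamo) and jamo[i] in CHOSEONG and jamo[i + 1] in JUNGSEONG:
            lead, vowel, tail = jamo[i], jamo[i + 1], ""
            i += 2
            next_is_vowel = i + 1 < len(jamo) and jamo[i + 1] in JUNGSEONG
            if i < len(jamo) and jamo[i] in JONGSEONG and not next_is_vowel:
                tail = jamo[i]
                i += 1
            out.append(compose(lead, vowel, tail))
        else:
            out.append(jamo[i])  # pass through anything that cannot be assembled
            i += 1
    return "".join(out)

print(assemble("ㅎㅏㄴㄱㅡㄹ"))  # → 한글
```

The lookahead on the following character is what decides whether a consonant closes the current syllable or opens the next one, which is the core ambiguity any such rule set has to resolve.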

A Study on the Inputting Method of English Pronunciation for a Computer by the Combining Diacritical Mark (조합분음기호에 의한 영어 발음기호의 컴퓨터 입력방법에 관한 연구)

  • Lee Hyun-Chang
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.4 s.310 / pp.31-38 / 2006
  • In this paper, a method of inputting English pronunciation symbols into a computer using combining diacritical marks is studied. The English pronunciation system and its notation methods are investigated, and the conditions for inputting English pronunciation easily are analyzed. Based on this, an input method that supports 3- and 4-level stress as well as 2-level stress is presented. With this method, English pronunciation can be entered in spreadsheets, databases, and presentations as well as word processors, and the data of each application program remain compatible. In the experiments, all data were compatible across all application programs, and input speed was considerably higher than with the individual vowel method, which is itself faster than using the pre-existing functions of word processors.
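
The idea of attaching stress to a pronunciation symbol with a combining diacritical mark can be sketched with Unicode combining characters. Which marks the paper actually assigns to each stress level is not stated in the abstract; the choice of acute for primary and grave for secondary stress below, and the helper name `mark_stress`, are assumptions for illustration.

```python
import unicodedata

PRIMARY = "\u0301"    # COMBINING ACUTE ACCENT (assumed here for primary stress)
SECONDARY = "\u0300"  # COMBINING GRAVE ACCENT (assumed here for secondary stress)

def mark_stress(symbols, index, mark):
    """Insert a combining mark after the symbol at `index` so it renders on it."""
    return symbols[:index + 1] + mark + symbols[index + 1:]

# A plain Latin vowel has a precomposed accented form, so NFC merges the pair.
print(unicodedata.normalize("NFC", mark_stress("a", 0, PRIMARY)) == "\u00e1")  # True

# A phonetic symbol such as schwa (U+0259) has no precomposed accented form,
# so the combining mark stays a separate code point.
stressed_schwa = unicodedata.normalize("NFC", mark_stress("\u0259", 0, PRIMARY))
print(len(stressed_schwa))  # 2
```

Because the result is ordinary Unicode text, it can be stored and exchanged between applications without a special font encoding, which is consistent with the compatibility claim in the abstract.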

Development of Neck-Type Electrolarynx Blueton and Acoustic Characteristic Analysis (경부형 전기인공후두 Blueton의 개발과 음향학적 성능 분석)

  • Choi, Seong-Hee;Park, Young-Jae;Park, Young-Kwan;Kim, Tae-Jung;Nam, Do-Hyun;Lim, Sung-Eun;Lee, Sung-Eun;Kim, Han-Soo;Choi, Hong-Shik;Kim, Kwang-Moon
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.15 no.1 / pp.37-42 / 2004
  • The electrolarynx (EL), a battery-operated vibrator held against the neck and controlled by an on-off button, has been widely used as a method of verbal communication among post-laryngectomy patients. EL speech can be produced easily without any additional surgery or special training and can be used together with other methods. This institute developed a neck-type EL named "Blueton" in cooperation with the EL company Linkus; it consists of three parts: a vibrator part, a control part, and a battery part. In this study, we evaluated the acoustic characteristics of voices produced with Blueton compared with Servox-inton using MDVP. Three EL users (two full-time users and one part-time user) participated. The results revealed that NHR was higher with Servox than with Blueton and that intensity was higher with Blueton than with Servox. The spectra of vowels produced by EL speakers are mixed signals combining the talkers' vocal output and electrolarynx noise. The spectral pattern was similar for the two ELs. The high SPI and the vowel spectra from MDVP demonstrated characteristics of both electrolarynxes related to the noise signal. This finding suggests that Blueton provides a useful rehabilitation option for post-laryngectomy patients.

Sentence design for speech recognition database

  • Zu Yiqing
    • Proceedings of the KSPS conference / 1996.10a / pp.472-472 / 1996
  • The material of a database for speech recognition should include as many phonetic phenomena as possible. At the same time, such material should be phonetically compact with low redundancy [1, 2]. The phonetic phenomena in continuous speech are the key problem in speech recognition. This paper describes the processing of a set of sentences collected from the 1993 and 1994 database of "People's Daily" (a Chinese newspaper), which consists of news, politics, economics, arts, sports, etc. These sentences include both phonetic phenomena and sentence patterns. In continuous speech, phonemes always appear in the form of allophones, which result in co-articulatory effects. The task of designing a speech database should be concerned with both intra-syllabic and inter-syllabic allophone structures. In our experiments, there are 404 syllables, 415 inter-syllabic diphones, 3050 merged inter-syllabic triphones, and 2161 merged final-initial structures in read speech. Statistics on the "People's Daily" database give an evaluation of all of the possible phonetic structures. In this sentence set, we first consider the phonetic balance among syllables, inter-syllabic diphones, inter-syllabic triphones, and semi-syllables with their junctures. The syllabic balance ensures coverage of intra-syllabic phenomena such as phonemes, initial/final, and consonant/vowel; the rest describe the inter-syllabic juncture. The 1560 sentences cover 96% of syllables without tones (the absent syllables are used only in spoken language), 100% of inter-syllabic diphones, and 67% of inter-syllabic triphones (87% of which appear in "People's Daily"). There are roughly 17 kinds of sentence patterns in our sentence set. By taking the transitions between syllables into account, Chinese speech recognition systems have achieved significantly higher recognition rates [3, 4]. The following figure shows the process of collecting sentences.
[People's Daily Database] -> [segmentation of sentences] -> [segmentation of word groups] -> [translate the text into Pinyin] -> [compute statistics on phonetic phenomena & select useful paragraphs] -> [modify the selected sentences by hand] -> [phonetically compact sentence set]
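
The "select useful paragraphs" step aims at high phonetic coverage with low redundancy. One common way to approach that goal is greedy set cover, sketched below. The abstract does not say which selection procedure the author used, so this is an assumed stand-in; the names `units` and `select_sentences` are hypothetical, and adjacent syllable pairs are used as a simplified proxy for inter-syllabic diphones.

```python
def units(syllables):
    """Adjacent syllable pairs, a simplified proxy for inter-syllabic diphones."""
    return set(zip(syllables, syllables[1:]))

def select_sentences(corpus):
    """Greedily pick sentences until every unit in the corpus is covered."""
    remaining = set().union(*(units(s) for s in corpus))
    chosen = []
    while remaining:
        # Take the sentence that covers the most still-uncovered units.
        best = max(corpus, key=lambda s: len(units(s) & remaining))
        gain = units(best) & remaining
        if not gain:
            break
        chosen.append(best)
        remaining -= gain
    return chosen

# Hypothetical Pinyin-transcribed sentences, for illustration only.
corpus = [
    ["ni", "hao", "ma"],
    ["ni", "hao"],
    ["hao", "ma"],
    ["wo", "hen", "hao"],
]
for sentence in select_sentences(corpus):
    print(" ".join(sentence))
```

In this toy corpus the second and third sentences add no new syllable transitions once the first is chosen, so they are left out, which is the kind of redundancy reduction the abstract describes.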