• Title/Summary/Keyword: speech analysis

Acoustic Analysis and Auditory-Perceptual Assessment for Diagnosis of Functional Dysphonia (기능성 음성장애의 진단을 위한 음향학적, 청지각적 평가)

  • Kim, Geun-Hyo;Lee, Yeon-Yoo;Bae, In-Ho;Lee, Jae-Seok;Lee, Chang-Yoon;Park, Hee-June;Lee, Byung-Joo;Kwon, Soon-Bok
    • Journal of Clinical Otolaryngology Head and Neck Surgery / v.29 no.2 / pp.212-222 / 2018
  • Background and Objectives: The purpose of this study was to compare acoustic and auditory-perceptual measures between normal and functional dysphonia (FD) groups. Materials and Methods: A total of 102 subjects with FD and 59 subjects with normal voices participated in this study. The mid-vowel portion of the sustained vowel /a/ and two sentences of 'Sanchaek' were edited, concatenated, and analyzed with a Praat script. Auditory-perceptual (AP) rating was then completed by three listeners. Results: The FD group showed higher acoustic voice quality index version 2.02 and version 3.01 (AVQIv2 and AVQIv3), slope, Hammarberg index (HAM), grade (G), and overall severity (OS) values than the normal group. Additionally, smoothed cepstral peak prominence in Praat (PraatCPPS), tilt, the ratio of low- to high-frequency spectral band energies (L/H ratio), and long-term average spectrum (LTAS) values were lower in the FD group than in the normal voice group. The correlations among the measured values ranged from -0.250 to 0.960. In ROC curve analysis, the cutoff values of AVQIv2, AVQIv3, PraatCPPS, slope, tilt, L/H ratio, HAM, and LTAS were 3.270, 2.013, 13.838, -22.286, -9.754, 369.043, 27.912, and 34.523, respectively; the AUC was over 0.890 for AVQIv2, AVQIv3, and PraatCPPS, over 0.731 for HAM, tilt, and slope, and over 0.605 for LTAS and L/H ratio. Conclusions: AVQI and CPPS showed the highest predictive power for distinguishing between normal and FD groups. Acoustic analyses and AP rating, as noninvasive examinations, can reinforce screening for FD and help establish an efficient diagnosis and treatment plan for FD.
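The ROC analysis reported in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the cutoffs were chosen, so the use of Youden's J (sensitivity + specificity - 1) is an assumption here. The scores and labels are made-up toy values, not data from the study. For measures that are lower in the FD group, such as PraatCPPS, the comparison direction would need to be reversed.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Return the threshold t that maximizes Youden's J, where a case
    is called positive when score >= t. (Assumed selection rule.)"""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        j = tp / n_pos - fp / n_neg  # sensitivity - (1 - specificity)
        if j > best_j:
            best_t, best_j = t, j
    return best_t


# Hypothetical index values; label 1 = dysphonic, 0 = normal.
toy_scores = [1.2, 2.0, 2.5, 3.1, 3.4, 4.0, 4.8, 5.5]
toy_labels = [0, 0, 0, 1, 1, 0, 1, 1]

print(auc(toy_scores, toy_labels))
print(youden_cutoff(toy_scores, toy_labels))
```

In practice a library routine would normally replace the quadratic pairwise loop, but the small version makes the definition of AUC explicit.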

Korean Morphological Analysis Method Based on BERT-Fused Transformer Model (BERT-Fused Transformer 모델에 기반한 한국어 형태소 분석 기법)

  • Lee, Changjae;Ra, Dongyul
    • KIPS Transactions on Software and Data Engineering / v.11 no.4 / pp.169-178 / 2022
  • Morphemes are the most primitive units in a language; they lose their original meaning when segmented into smaller parts. In Korean, a sentence is a sequence of eojeols (words) separated by spaces, and each eojeol comprises one or more morphemes. Korean morphological analysis (KMA) divides the eojeols in a given Korean sentence into morpheme units and assigns appropriate part-of-speech (POS) tags to the resulting morphemes. KMA is one of the most important tasks in Korean natural language processing (NLP), and improving its performance is closely related to improving the performance of other Korean NLP tasks. Recent research on KMA has begun to adopt the approach of machine translation (MT) models. MT converts a sequence (sentence) of units in one domain into a sequence (sentence) of units in another domain. Neural machine translation (NMT) refers to MT approaches that exploit neural network models. From the perspective of MT, KMA transforms an input sequence of units belonging to the eojeol domain into a sequence of units in the morpheme domain. In this paper, we propose a deep learning model for KMA. The backbone of our model is the BERT-fused model, which has been shown to achieve high performance on NMT. The BERT-fused model utilizes Transformer, a representative model employed in NMT, and BERT, a language representation model that has enabled significant advances in NLP. The experimental results show that our model achieves an F1 score of 98.24.
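As a rough illustration of how an F1 score for morphological analysis can be computed, the sketch below compares predicted and gold (morpheme, POS tag) pairs as multisets. The abstract does not describe the paper's exact evaluation protocol, so this matching rule is an assumption. The sentence and its Sejong-style tags are illustrative only.

```python
from collections import Counter


def morpheme_f1(gold, pred):
    """F1 over (morpheme, tag) pairs, treating each analysis as a multiset.
    This matching rule is an assumption, not the paper's stated protocol."""
    gold_counts, pred_counts = Counter(gold), Counter(pred)
    overlap = sum((gold_counts & pred_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(gold_counts.values())
    return 2 * precision * recall / (precision + recall)


# Illustrative analysis of "나는 학교에 간다" (three eojeols, six morphemes).
gold = [("나", "NP"), ("는", "JX"), ("학교", "NNG"),
        ("에", "JKB"), ("가", "VV"), ("ㄴ다", "EF")]
# Hypothetical system output with one wrong tag on the final ending.
pred = [("나", "NP"), ("는", "JX"), ("학교", "NNG"),
        ("에", "JKB"), ("가", "VV"), ("ㄴ다", "EC")]

print(round(morpheme_f1(gold, pred), 4))
```

Five of the six predicted pairs match the gold analysis, so precision and recall are both 5/6 in this toy case.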

Development of a Web-based Presentation Attitude Correction Program Centered on Analyzing Facial Features of Videos through Coordinate Calculation (좌표계산을 통해 동영상의 안면 특징점 분석을 중심으로 한 웹 기반 발표 태도 교정 프로그램 개발)

  • Kwon, Kihyeon;An, Suho;Park, Chan Jung
    • The Journal of the Korea Contents Association / v.22 no.2 / pp.10-21 / 2022
  • Few automated methods, other than observation by colleagues or professors, exist for improving presentation attitude in formal settings such as job interviews and presentations of project results at a company. Previous studies have reported that a speaker's stable speech and gaze handling affect how effectively a presentation is delivered. Other studies show that proper feedback on one's presentation increases the presenter's ability to present. In this paper, considering these positive effects of correction, we developed a program that intelligently corrects the poor presentation habits and attitudes of college students through facial analysis of videos, and we analyzed the proposed program's performance. The proposed program was developed as a web-based tool that checks for the use of redundant words, performs facial recognition, and converts the presentation content to text. To this end, an artificial intelligence model for classification was developed, and after the video object was extracted, facial feature points were recognized based on their coordinates. Then, using 4,000 facial data items, the performance of the proposed algorithm was compared with that of facial recognition using Teachable Machine. The program can help presenters by correcting their presentation attitude.

Analysis of Generative AI Technology Trends Based on Patent Data (특허 데이터 기반 생성형 AI 기술 동향 분석)

  • Seongmu Ryu;Taewon Song;Minjeong Lee;Yoonju Choi;Soonuk Seol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.17 no.1 / pp.1-9 / 2024
  • This paper analyzes the trends in generative AI technology based on patent application documents. To achieve this, we selected 5,433 generative AI-related patents filed in South Korea, the United States, and Europe from 2003 to 2023, and analyzed the data by country, technology category, year, and applicant, presenting it visually to find insights and understand the flow of technology. The analysis shows that patents in the image category account for 36.9%, the largest share, with a continuous increase in filings, while filings in the text/document and music/speech categories have either decreased or remained stable since 2019. Although the company with the highest number of filings is a South Korean company, four out of the top five filers are U.S. companies, and all companies have filed the majority of their patents in the U.S., indicating that generative AI is growing and competing centered around the U.S. market. The findings of this paper are expected to be useful for future research and development in generative AI, as well as for formulating strategies for acquiring intellectual property.

Acoustic Analysis and Melodization of Korean Intonation for Language Rehabilitation (언어재활을 위한 한국어의 음향적 분석과 선율화)

  • Choi, Jin Hee;Park Jeong Mi
    • Journal of Music and Human Behavior / v.21 no.1 / pp.49-68 / 2024
  • This study aims to acoustically analyze Korean language characteristics and convert these findings into musical elements, providing foundational data for evidence-based music-language rehabilitation. We collected voice data from thirty men and thirty women aged 19-25, each providing six-syllable prosodic units composed of two accentual phrases, including both declarative and interrogative sentences. Analyzing these data with Praat, we extracted syllabic acoustic properties and conducted statistical analyses based on acoustic properties, sentence type, gender, and particle presence. Significant differences were found in syllable frequency and duration based on accentual phrases and prosodic units (p < .001), with interrogatives showing higher frequencies and declaratives longer durations (p < .001). Females' frequencies were significantly higher than males' (p < .001), and their durations were longer (p < .001). Particle syllables also showed significantly stronger intensities (p < .001). Finally, we presented melodies converted from these acoustic properties into musical scores based on pitch, duration, and accent. The insights from this analysis of six-syllable Korean sentences will guide further research on developing a system for melodizing large-scale Korean speech data, which is expected to be crucial in music-based language rehabilitation.
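One simple way to turn a measured syllable frequency into a notated pitch is the standard equal-temperament mapping from hertz to MIDI note numbers. The sketch below assumes that mapping and an A4 reference of 440 Hz. The abstract does not state which conversion the authors used, and the function names are hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_midi(f_hz, a4_hz=440.0):
    """Nearest MIDI note number for a frequency in Hz (12-tone equal
    temperament; the A4 = 440 Hz reference is an assumption)."""
    return round(69 + 12 * math.log2(f_hz / a4_hz))


def midi_to_name(midi_note):
    """Scientific pitch name, with MIDI note 60 written as C4."""
    return f"{NOTE_NAMES[midi_note % 12]}{midi_note // 12 - 1}"


# Hypothetical mean F0 values (Hz) for three syllables.
for f0 in (220.0, 261.63, 440.0):
    print(f0, midi_to_name(hz_to_midi(f0)))
```

Duration and intensity would need their own mappings (for example, to note values and accents), which the abstract mentions but does not specify.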

Characteristic MR Imaging Features and Serial Changes in Adult-Onset Alexander Disease: A Case Report (성인형 알렉산더병의 자기공명영상 소견 및 추적 관찰상의 변화: 증례 보고)

  • Ha Yun Oh;Ra Gyoung Yoon;Ji Ye Lee;Ohyun Kwon;Woong-Woo Lee
    • Journal of the Korean Society of Radiology / v.84 no.3 / pp.736-744 / 2023
  • Adult-onset Alexander Disease (AOAD) is a rare genetically determined leukoencephalopathy that presents with ataxia, spastic paraparesis, or brain stem signs including speech abnormalities, swallowing difficulties, and frequent vomiting. The diagnosis of AOAD is frequently proposed based on MRI findings. We present two cases of AOAD (a 37-year-old female and a 61-year-old female) with characteristic imaging findings and changes on follow-up MRI; the diagnoses were confirmed via glial fibrillary acidic protein (GFAP) mutation analysis. On MRI, the typical tadpole-like brainstem atrophy and periventricular white matter abnormalities were noted. The presumptive diagnoses were made based on the typical MRI appearances and subsequently confirmed via GFAP mutation analysis. Follow-up MRI demonstrated progression of atrophy in the medulla and upper cervical spinal cord. Our report could help raise awareness of the characteristic MRI findings of AOAD, thus helping clinicians use GFAP analysis to confirm an AOAD diagnosis.

A Study on the Development Trend of Artificial Intelligence Using Text Mining Technique: Focused on Open Source Software Projects on Github (텍스트 마이닝 기법을 활용한 인공지능 기술개발 동향 분석 연구: 깃허브 상의 오픈 소스 소프트웨어 프로젝트를 대상으로)

  • Chong, JiSeon;Kim, Dongsung;Lee, Hong Joo;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.1-19 / 2019
  • Artificial intelligence (AI) is one of the main driving forces leading the Fourth Industrial Revolution. The technologies associated with AI have already shown abilities equal to or better than those of people in many fields, including image and speech recognition. In particular, because AI technologies can be utilized in a wide range of fields including medicine, finance, manufacturing, services, and education, many efforts have been made to identify current technology trends and analyze their development directions. Major platforms for developing complex AI algorithms for learning, reasoning, and recognition have been opened to the public as open source projects. As a result, technologies and services that utilize them have increased rapidly, and this has been confirmed as one of the major reasons for the fast development of AI technologies. Additionally, the spread of the technology owes much to open source software (OSS), developed by major global companies, that supports natural language recognition, speech recognition, and image recognition. Therefore, this study aimed to identify the practical trend of AI technology development by analyzing AI-related OSS projects, which have been developed through the online collaboration of many parties. This study searched for and collected a list of major AI-related projects created on GitHub from 2000 to July 2018. It then examined the development trends of major technologies in detail by applying text mining techniques to the topic information, which indicates the characteristics and technical fields of the collected projects. The results of the analysis showed that the number of software development projects was less than 100 per year until 2013. However, it increased to 229 projects in 2014 and 597 projects in 2015. In particular, the number of AI-related open source projects increased rapidly in 2016 (2,559 OSS projects). The number of projects initiated in 2017 was 14,213, almost four times the total number of projects created from 2009 to 2016 (3,555 projects). The number of projects initiated from January to July 2018 was 8,737. The development trend of AI-related technologies was evaluated by dividing the study period into three phases. The appearance frequency of topics indicates the technology trends of AI-related OSS projects. The results showed that natural language processing remained at the top in all years, implying that OSS had been developed continuously. Until 2015, the programming languages Python, C++, and Java were among the ten most frequently appearing topics. After 2016, however, programming languages other than Python disappeared from the top ten topics. In their place, platforms supporting the development of AI algorithms, such as TensorFlow and Keras, showed high appearance frequency. Additionally, reinforcement learning algorithms and convolutional neural networks, which have been used in various fields, appeared frequently. The results of topic network analysis showed that the most important topics by degree centrality were similar to those by appearance frequency. The main difference was that visualization and medical imaging topics were found at the top of the list, although they were not at the top of the list from 2009 to 2012. The results indicated that OSS was developed in the medical field in order to utilize AI technology. Moreover, although computer vision was in the top 10 of the appearance frequency list from 2013 to 2015, it was not in the top 10 by degree centrality. The topics at the top of the degree centrality list were similar to those at the top of the appearance frequency list, with the ranks of convolutional neural networks and reinforcement learning changing slightly. The trend of technology development was examined using the appearance frequency of topics and degree centrality. The results showed that machine learning had the highest frequency and the highest degree centrality in all years. Moreover, it is noteworthy that, although the deep learning topic showed a low frequency and a low degree centrality between 2009 and 2012, its rank rose abruptly between 2013 and 2015. It was confirmed that in recent years both technologies had high appearance frequency and degree centrality. TensorFlow first appeared during the 2013-2015 phase, and its appearance frequency and degree centrality soared between 2016 and 2018, placing it at the top of the lists after deep learning and Python. Computer vision and reinforcement learning did not show an abrupt increase or decrease, and they had relatively low appearance frequency and degree centrality compared with the above-mentioned topics. Based on these results, it is possible to identify the fields in which AI technologies are actively developed. The results of this study can be used as a baseline dataset for more empirical analysis of future technology trends that may converge.
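The two measures used in the abstract above, appearance frequency and degree centrality, can be sketched on a toy topic network. The abstract does not say how the network was built, so linking two topics whenever they tag the same project is an assumption, and the project list is invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def appearance_frequency(projects):
    """Count how many projects carry each topic."""
    return Counter(topic for topics in projects for topic in set(topics))


def degree_centrality(projects):
    """Normalized degree centrality on a topic co-occurrence graph.
    Two topics are linked if they appear on the same project (assumed rule)."""
    neighbors = defaultdict(set)
    for topics in projects:
        unique = sorted(set(topics))
        for topic in unique:
            neighbors[topic]  # register the node even if it has no links
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n <= 1:
        return {topic: 0.0 for topic in neighbors}
    return {topic: len(linked) / (n - 1) for topic, linked in neighbors.items()}


# Hypothetical topic lists for three repositories.
toy_projects = [
    ["machine-learning", "deep-learning", "tensorflow"],
    ["machine-learning", "python"],
    ["machine-learning", "computer-vision", "deep-learning"],
]

print(appearance_frequency(toy_projects).most_common(2))
print(degree_centrality(toy_projects))
```

A topic can rank differently on the two lists: frequency counts projects, while degree centrality counts how many distinct topics it co-occurs with.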

Anthropometric Analysis of Unilateral Cleft Lip Patient (편측성 구순열 환아의 안모 계측 연구)

  • Koh, Kwang-Moo;Leem, Dae-Ho;Baek, Jin-A;Ko, Seung-O;Shin, Hyo-Keun
    • Maxillofacial Plastic and Reconstructive Surgery / v.33 no.5 / pp.392-400 / 2011
  • Purpose: Cleft lip and palate is one of the most frequent hereditary deformities of the maxillofacial region and can give rise to facial and jaw abnormalities as well as malocclusion and speech problems. In particular, unilateral cleft lip and palate is characterized by midface deformity resulting in maxillary anterior nasal septal deviation and nasal deformity. The aim of this study was to analyze the facial deformity of untreated unilateral cleft lip patients as a contribution to primary cheiloplasty. Methods: Impressions were taken from thirty-three patients with unilateral cleft lip and palate before operation, and facial casts were made. The casts were classified into complete and incomplete cleft lip groups, and each group was divided into the affected side and the normal side. Anthropometric reference points and lines were set up, and the relationships between the points and lines were analyzed. Results and Conclusion: The obtained results were as follows: 1. The intercanthal width showed no significant difference between the incomplete and complete cleft lip groups. 2. Cleft width and alar base width were greater in the complete group, and nasal tip protrusion was greater in the incomplete group. 3. Involved alar width and nostril width were greater in the complete group, and in both the complete and incomplete groups, involved alar width and nostril width were greater than on the non-involved side. 4. The lateral deviation of the subnasale was greater in the complete group on both the involved and non-involved sides. 5. The nasal laterale was placed inferiorly in both cleft groups. 6. The subnasale was deviated to the non-involved side in both cleft groups. 7. The nose tip was deviated to the non-involved side in both cleft groups and had greater lateral deviation in the complete cleft group. 8. The midpoint of Cupid's bow showed no vertical difference between the complete and incomplete groups but had a greater lateral deviation in the complete group. 9. In the complete cleft group, correlations among the differences in cleft width, nostril width, and columella height were obtained.

Development and Effects of Intelligent CCTV Algorithm Creative Education Program Using Rich Picture Technique (리치픽처 기법을 적용한 지능형 CCTV 알고리즘 창의교육 프로그램 개발 및 효과)

  • Jung, Yu-Jin;Kim, Jin-Su;Park, Nam-Je
    • Journal of the Korea Convergence Society / v.11 no.4 / pp.125-131 / 2020
  • As technology advances, the importance of software education is increasing. Accordingly, interest in informatics subjects is increasing, but presenting algorithms to elementary learners only as specialized IT skills could dampen that interest. In this paper, an educational program for elementary school students was developed in four stages: analysis of the 2015 revised curriculum, preparation of a program development and operation plan, application of the program to the target students, and analysis and evaluation of the results. The program uses the Rich Picture technique, which employs various tools such as pictures and speech-bubble symbols, so that learners can express the intelligent CCTV algorithm freely and easily and fully understand how intelligent CCTV uses artificial intelligence to extract faces from subjects. The proposed educational program can help learners grasp the principle of the algorithm by using a flowchart. Building on the modification and development of the proposed program, we will conduct research on IT creative education that can be applied in various areas.

A Study Model Analysis of Complete Unilateral Cleft Lip & Palate Patients (편측성 완전 구순 구개열 환자의 구개열 형태 및 치궁의 분석)

  • Leem Dae-Ho;Kim Seung-Young;Shin Hyo-Keun
    • Korean Journal of Cleft Lip And Palate / v.2 no.1_2 / pp.5-14 / 1999
  • The aim of treatment of cleft lip and palate is to correct the cleft and associated problems surgically and thus hide the anomaly so that patients can lead normal lives. This correction involves surgically producing a face that does not attract attention, a vocal apparatus that permits intelligible speech, and a dentition that allows optimal function and esthetics. In the neonatal period, gross distortion of the tissues surrounding the cleft requires considerable effort and time because of postoperative functional defects and scarring, and it induces milk feeding problems, malocclusion of the deciduous or permanent dentition, congenitally missing teeth, and skeletal dysplasia. The occurrence of a cleft deformity is a source of considerable shock to the parents of an afflicted baby, so choosing the most appropriate approach is very important. Thus, we analyzed the dental arch and the shape and size of the deformity in cleft patients. The results were as follows. 1. Cast measurements of UCLP subjects at the first visit showed a mean alveolar cleft width of 9.29 mm, an anterior cleft width of 11.7 mm, and a posterior cleft width of 14 mm. 2. In a comparison of the UCLP group at the first visit and at the time of lip surgery, the older group showed an insignificant reduction in the width of the cleft in the alveolar, canine, and tuberosity regions. 3. The maxillary casts of the UCLP group at 6 months differed significantly from those at 3 months in both length and width; however, there was no statistical difference except in the anterior ridge length of the non-cleft side. 4. Between 6 months and 18 months, there was a greater change in the alveolar cleft width, intercanine width, and anterior cleft width. The maxillary arch became wider at both the canine and intertuberosity regions. The posterior anteroposterior (AP) length also increased, but the anterior AP length decreased from 8.1 mm to 7.7 mm. There was a meaningful increase in intertuberosity length; however, a significant reduction in width t-t'
