• Title/Summary/Keyword: Korean language model (한국어 언어 모델)

AI-based stuttering automatic classification method: Using a convolutional neural network (인공지능 기반의 말더듬 자동분류 방법: 합성곱신경망(CNN) 활용)

  • Jin Park;Chang Gyun Lee
    • Phonetics and Speech Sciences / v.15 no.4 / pp.71-80 / 2023
  • This study aimed primarily to develop an automated stuttering identification and classification method using artificial intelligence technology. In particular, it aimed to develop a deep learning identification model based on the convolutional neural network (CNN) algorithm for Korean speakers who stutter. To this end, speech data were collected from 9 adults who stutter and 9 normally fluent speakers. The data were automatically segmented at the phrase level using Google Cloud speech-to-text (STT), and labels such as 'fluent', 'blockage', 'prolongation', and 'repetition' were assigned to them. Mel-frequency cepstral coefficients (MFCCs) and a CNN-based classifier were used to detect and classify each type of stuttered disfluency. In the case of prolongation, however, only five samples were found, so this type was excluded from the classifier model. Results showed that the accuracy of the CNN classifier was 0.96, and the F1-scores for classification performance were as follows: 'fluent' 1.00, 'blockage' 0.67, and 'repetition' 0.74. Although the effectiveness of the automatic classifier using CNNs to detect stuttered disfluencies was validated, its performance was found to be inadequate, especially for the blockage and prolongation types. Consequently, building a large speech database organized by type of stuttered disfluency was identified as a necessary foundation for improving classification performance.
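The MFCC-plus-CNN pipeline described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes librosa for MFCC extraction and PyTorch for the classifier, and the label set, feature shapes, and file name are placeholders.

```python
# Minimal sketch of an MFCC + CNN disfluency classifier (illustrative only).
# Assumes librosa and PyTorch; labels, shapes, and the wav file are hypothetical.
import librosa
import numpy as np
import torch
import torch.nn as nn

LABELS = ["fluent", "blockage", "repetition"]  # 'prolongation' excluded, as in the abstract

def extract_mfcc(wav_path: str, sr: int = 16000, n_mfcc: int = 40,
                 max_frames: int = 200) -> np.ndarray:
    """Load a phrase-level segment and return a fixed-size MFCC matrix."""
    y, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)         # (n_mfcc, frames)
    mfcc = librosa.util.fix_length(mfcc, size=max_frames, axis=1)  # pad/trim along time
    return mfcc.astype(np.float32)

class DisfluencyCNN(nn.Module):
    """Small 2-D CNN over the MFCC 'image' (1 x n_mfcc x frames)."""
    def __init__(self, n_classes: int = len(LABELS)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Usage: x = torch.from_numpy(extract_mfcc("segment.wav")).unsqueeze(0).unsqueeze(0)
#        predicted = LABELS[DisfluencyCNN()(x).argmax(dim=1).item()]
```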

Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

  • Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.59-83 / 2018
  • With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In deep-learning-based sentiment analysis of English texts, the natural language sentences in the training and test datasets are usually converted into sequences of word vectors before being fed to the models. In this case, word vectors generally refer to vector representations of words obtained by splitting a sentence on space characters. There are several ways to derive word vectors; one of them is Word2Vec, which was used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data. These vectors have been widely used in studies of sentiment analysis of reviews from various fields such as restaurants, movies, laptops, and cameras. Unlike in English, morphemes play an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with well-developed postpositions and endings. A morpheme can be defined as the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, the word '예쁘고' consists of the morphemes '예쁘' (adjective stem) and '고' (connective ending). Reflecting the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use 'morpheme vectors' as input to a deep learning model rather than the 'word vectors' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word-vector derivation mechanism to sentences divided into their constituent morphemes. Several questions arise here. What is the desirable range of POS (part-of-speech) tags when deriving morpheme vectors so as to improve the classification accuracy of a deep learning model? Is it appropriate to apply a typical word vector model, which relies primarily on the form of words, to Korean with its high ratio of homonyms? Does text preprocessing, such as correcting spelling or spacing errors, affect classification accuracy, especially when drawing morpheme vectors from Korean product reviews that contain many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which are among the first encountered when applying deep learning models to Korean texts. As a starting point, we summarize them in three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors derived from grammatically correct texts of a domain other than the analysis target, or morpheme vectors derived from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean with respect to the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can a satisfactory level of classification accuracy be achieved when applying deep learning to Korean sentiment analysis? To address these research questions, we generate various types of morpheme vectors reflecting them and then compare classification accuracy using a non-static CNN (convolutional neural network) model that takes the morpheme vectors as input. As training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used.
To derive the morpheme vectors, we use data both from the same domain as the analysis target and from another domain: about 2 million Naver Shopping cosmetics product reviews and 520,000 Naver News articles, the latter arguably corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ on three criteria. First, they come from two data sources: Naver News, with high grammatical correctness, and Naver Shopping cosmetics product reviews, with low grammatical correctness. Second, they differ in the degree of data preprocessing, namely sentence splitting only versus additional spelling and spacing corrections after sentence separation. Third, they vary in the form of input fed to the word vector model: the morphemes alone, or the morphemes with their POS tags attached. The morpheme vectors further vary in the range of POS tags considered, the minimum frequency of morphemes included, and the random initialization range. All morpheme vectors are derived with the CBOW (continuous bag-of-words) model using a context window of 5 and a vector dimension of 300. The results suggest that using text from the same domain even with lower grammatical correctness, performing spelling and spacing corrections in addition to sentence splitting, and including morphemes of all POS tags, even the 'incomprehensible' category, lead to better classification accuracy. POS tag attachment, which was devised for the high proportion of homonyms in Korean, and the minimum frequency threshold for a morpheme to be included do not appear to have any definite influence on classification accuracy.
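A rough illustration of the morpheme-vector derivation this abstract describes (CBOW, context window 5, 300 dimensions) is sketched below. It assumes KoNLPy's Okt tagger and gensim; the tiny review list, the min_count value, and the stemming/POS-attachment choices are illustrative assumptions, not the paper's exact setup.

```python
# Sketch: derive morpheme vectors with CBOW (window 5, 300 dimensions).
from konlpy.tag import Okt
from gensim.models import Word2Vec

okt = Okt()

def to_morphemes(sentence: str, attach_pos: bool = False) -> list[str]:
    """Split a review into morphemes, optionally attaching POS tags
    (the POS-attachment variant compared in the study)."""
    tagged = okt.pos(sentence, stem=True)
    return [f"{m}/{t}" if attach_pos else m for m, t in tagged]

# Placeholder reviews standing in for the ~2 million Naver Shopping cosmetics reviews.
reviews = [
    "색도 예쁘고 가격도 착해서 만족해요",
    "발색이 예쁘지만 배송이 너무 느려요",
]
sentences = [to_morphemes(r) for r in reviews]

# CBOW (sg=0) with context window 5 and 300-dimensional vectors, as in the abstract.
model = Word2Vec(sentences, vector_size=300, window=5, sg=0, min_count=1)
print(model.wv.most_similar("예쁘다", topn=3))   # nearest morphemes to '예쁘다'
```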

Korean Speech Act Tagging using Previous Sentence Features and Following Candidate Speech Acts (이전 문장 자질과 다음 발화의 후보 화행을 이용한 한국어 화행 분석)

  • Kim, Se-Jong;Lee, Yong-Hun;Lee, Jong-Hyeok
    • Journal of KIISE:Software and Applications / v.35 no.6 / pp.374-385 / 2008
  • Speech act tagging, which recognizes the speaker's intention expressed in a natural language utterance, is an important step in various dialogue applications. Previous approaches, such as rule-based and statistics-based methods, use the speech acts of previous utterances and the sentence features of the current utterance. This paper proposes a method that determines the speech act of the current utterance using the speech acts of the following utterances as well as the previous ones. Using features of the following utterances yields an accuracy of 95.27%, improving on previous methods by 3.65%. Moreover, sentence features of the previous utterances are employed to make maximal use of the information available for the current utterance. By applying an appropriate probability model for each speech act, a final accuracy of 97.97% is achieved.
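A toy sketch of the idea of combining previous-utterance features with candidate speech acts of the following utterance is given below. The feature names, the logistic-regression classifier, and the miniature dialogue are invented for illustration; the paper itself applies per-speech-act probability models rather than this classifier.

```python
# Illustrative feature construction for speech act tagging using previous-utterance
# cues and following-utterance candidate acts (assumed setup, not the paper's model).
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def utterance_features(dialogue, i, following_candidates):
    """Features of utterance i: its own words, the previous utterance's speech act
    and words, and candidate speech acts of the following utterance."""
    feats = {f"cur_word={w}": 1 for w in dialogue[i]["words"]}
    if i > 0:
        prev = dialogue[i - 1]
        feats[f"prev_act={prev['act']}"] = 1
        feats.update({f"prev_word={w}": 1 for w in prev["words"]})
    for act in following_candidates:          # candidate acts of the next utterance
        feats[f"next_cand={act}"] = 1
    return feats

# Hypothetical two-turn dialogue with gold speech act labels.
dialogue = [
    {"words": ["몇", "시에", "출발해요", "?"], "act": "wh-question"},
    {"words": ["아홉", "시요"], "act": "response"},
]
X = [utterance_features(dialogue, 0, following_candidates=["response", "ack"]),
     utterance_features(dialogue, 1, following_candidates=["ack"])]
y = [u["act"] for u in dialogue]

clf = make_pipeline(DictVectorizer(), LogisticRegression()).fit(X, y)
print(clf.predict(X))                         # predicted tags for both utterances
```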

Analyzing Emotions in Literature by Extracting Emotion Terms (텍스트의 정서 단어 추출을 통한 문학 작품의 정서 분석)

  • Ham, Jun-Seok;Rhee, Shin-Young;Ko, Il-Ju
    • Science of Emotion and Sensibility / v.14 no.2 / pp.257-268 / 2011
  • We define a 'dominant emotion' as the emotion that acts dominantly during a unit of time, and propose a methodology for automatically extracting the dominant emotion from a literary work. Because of the nature of the Korean language, the meaning of a word can be altered or even reversed by its ending. Even so, it is possible to extract a dominant emotion from a relatively short text such as a piece of fiction or an essay. The process of extracting the dominant emotion from a literary work is as follows. First, morphemes are extracted from the whole text. Words carrying emotional meaning are then identified by matching them against an emotion-term database. The matched terms are mapped onto an affective circumplex model and associated with basic emotions. Finally, the dominant emotion is analyzed from the matched basic emotions. We applied this methodology to two literary works: the modern short story 'A lucky day' by Jingeon, Hyun and the essay 'An old man who shave a bat' by Woyoung, Yun. As a result, it was possible to grasp the flow of dominant emotions.
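The pipeline this abstract outlines (morpheme extraction, emotion-term matching, mapping to basic emotions, and windowed aggregation) might look roughly like the sketch below. The KoNLPy tagger, the tiny emotion lexicon, and the window size are placeholders, not the authors' actual database or circumplex coordinates.

```python
# Rough sketch of dominant-emotion extraction per window of morphemes (illustrative).
from collections import Counter
from konlpy.tag import Okt

# Hypothetical emotion-term database: stemmed morpheme -> basic emotion.
EMOTION_DB = {"기쁘다": "joy", "슬프다": "sadness", "무섭다": "fear"}

okt = Okt()

def dominant_emotions(text: str, window: int = 50) -> list[str]:
    """Return the dominant basic emotion for every window of `window` morphemes."""
    morphs = okt.morphs(text, stem=True)          # stemmed morphemes of the whole text
    result = []
    for start in range(0, len(morphs), window):
        hits = [EMOTION_DB[m] for m in morphs[start:start + window] if m in EMOTION_DB]
        result.append(Counter(hits).most_common(1)[0][0] if hits else "neutral")
    return result

print(dominant_emotions("비가 와서 슬프고 무섭지만, 집에 돌아오니 기쁘다."))
```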

Implementation of the Automatic Segmentation and Labeling System (자동 음성분할 및 레이블링 시스템의 구현)

  • Sung, Jong-Mo;Kim, Hyung-Soon
    • The Journal of the Acoustical Society of Korea / v.16 no.5 / pp.50-59 / 1997
  • In this paper, we implement an automatic speech segmentation and labeling system that automatically marks phone boundaries for constructing a Korean speech database. We specified and implemented the system based on conventional speech segmentation and labeling techniques, and also developed a graphical user interface (GUI) in the Hangul Motif™ environment so that users can examine the automatically aligned boundaries and refine them easily. The developed system is applied to speech sampled at 16 kHz, and the labeling unit consists of 46 phoneme-like units (PLUs) and silence. The system accepts both phonetic and orthographic transcriptions as input methods for linguistic information. Hidden Markov models (HMMs) are employed for pattern matching. Each phoneme model is trained on a manually segmented database of 445 phonetically balanced words (PBW). To evaluate the performance of the system, we tested it on another database consisting of sentence-type speech. In our experiment, 74.7% of phoneme boundaries fell within 20 ms of the true boundary and 92.8% within 40 ms.
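The reported 20 ms / 40 ms boundary accuracy can be computed with a simple tolerance check, sketched below with made-up boundary values; in practice the automatic boundaries would come from HMM-based forced alignment as described in the abstract.

```python
# Sketch: share of automatically aligned phone boundaries within a tolerance of the
# manually labeled boundaries. Boundary lists here are hypothetical examples.

def boundary_accuracy(auto_bounds: list[float], ref_bounds: list[float], tol: float) -> float:
    """Fraction of reference boundaries (in seconds) that have an automatic boundary
    within +/- tol seconds."""
    hits = sum(1 for r in ref_bounds if any(abs(r - a) <= tol for a in auto_bounds))
    return hits / len(ref_bounds)

ref = [0.12, 0.31, 0.55, 0.78]    # hand-labeled phone boundaries (s)
auto = [0.13, 0.29, 0.60, 0.79]   # boundaries from the automatic aligner (s)
for tol_ms in (20, 40):
    print(f"within {tol_ms} ms: {boundary_accuracy(auto, ref, tol_ms / 1000):.1%}")
```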

A Comparative Study of Second Language Acquisition Models: Focusing on Vowel Acquisition by Chinese Learners of Korean (중국인 학습자의 한국어 모음 습득에 대한 제2언어 습득 모델 비교 연구)

  • Kim, Jooyeon
    • Phonetics and Speech Sciences / v.6 no.4 / pp.27-36 / 2014
  • This study provides a longitudinal examination of Chinese learners' acquisition of Korean vowels. Specifically, I examined the Korean monophthongs /i, e, ɨ, ʌ, a, u, o/ produced by Chinese learners at 1 month and at 12 months, and attempted to verify empirically how they acquire Korean vowels in relation to their mother tongue, in terms of the Perceptual Assimilation Model (henceforth PAM) of Best (Best, 1993; 1994; Best & Tyler, 2007) and the Speech Learning Model (henceforth SLM) of Flege (Flege, 1987; Bohn & Flege, 1992; Flege, 1995). Most of the present results can be explained similarly by the PAM and the SLM; the only discrepancy between the two models is found for the 'similar' category of sounds between the learners' native language and the target language. Specifically, the acquisition pattern of Korean /u/ and /o/ is well accounted for by the PAM, but not by the SLM. The SLM does not explain why the Chinese learners had difficulty acquiring the Korean vowel /u/, because according to the SLM the vowel /u/ in Chinese (the native language) is matched either to /u/ or to /o/ in Korean (the target language); that is, there is only a one-to-one matching relationship between the native language and the target language. In contrast, the Chinese learners' difficulty with the Korean vowel /u/ is well accounted for by the PAM, in that the Chinese vowel /u/ is matched to the Korean vowel pair /o, u/ rather than to a single vowel /o/ or /u/.

3D Graphic Nursery Contents Developed by Mobile AR Technology (모바일 기반 증강현실 기술을 활용한 3D전래동화 콘텐츠 연구)

  • Park, Young-sook;Park, Dea-woo
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.11 / pp.2125-2130 / 2016
  • In this paper, we investigated the merits of 3D graphic nursery contents developed with mobile AR technology. AR technology is currently attracting attention for its potential to become core content of the future ICT industry. We applied the AR nursery contents to children's language education, with subtitles selectable in Korean, Chinese, and English. Each original fairy tale, consisting of 6~8 scenes, was adapted and translated for 3D content production. Dubbing was performed by native speakers using standard pronunciation, and the sound effects were edited separately to fit each scene. After composing the scenario, building the 3D models, implementing the interactions and sound effects, and creating the content metadata, a project was created in the Unity 3D game engine and written as scripts. The result delivers traditional fairy tales in a fun and informative way, with abundant content that incorporates ICT, supports advanced technology-based education, and gives children opportunities to encounter software in daily life.

A Study on Performance Evaluation of Hidden Markov Network Speech Recognition System (Hidden Markov Network 음성인식 시스템의 성능평가에 관한 연구)

  • 오세진;김광동;노덕규;위석오;송민규;정현열
    • Journal of the Institute of Convergence Signal Processing / v.4 no.4 / pp.30-39 / 2003
  • In this paper, we evaluate the performance of an HM-Net (Hidden Markov Network) speech recognition system on Korean speech databases. Acoustic models were constructed using HM-Nets, a modification of HMMs (hidden Markov models), which are widely used statistical modeling methods. HM-Nets perform state splitting in the contextual and temporal domains with the PDT-SSS (Phonetic Decision Tree-based Successive State Splitting) algorithm, a modification of the original SSS algorithm. In particular, a phonetic decision tree is adopted in contextual state splitting to effectively express context information that does not appear in the training speech data. In temporal state splitting, states are split to effectively represent the duration information of each phoneme, and an optimal network of triphone-type models is then constructed. Speech recognition was performed using a one-pass Viterbi beam search algorithm with a phone-pair/word-pair grammar for phoneme/word recognition, respectively, and a multi-pass search algorithm with n-gram language models for sentence recognition. A tree-structured lexicon was used to decrease the number of nodes by sharing the same prefixes among words. The performance of the HM-Net speech recognition system was evaluated under various recognition conditions, and the experiments verified that it achieves recognition performance superior to the previously introduced recognition system.
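The prefix-sharing (tree-structured) lexicon mentioned in this abstract can be sketched as a nested dictionary; the phone sequences below are hypothetical romanizations, not the paper's phone set.

```python
# Sketch of a tree-structured lexicon: words sharing phone prefixes share nodes,
# reducing the number of nodes in the search network.

def build_prefix_tree(lexicon: dict[str, list[str]]) -> dict:
    """lexicon maps a word to its phone sequence; returns a nested-dict prefix tree
    whose leaves record the completed word."""
    root: dict = {}
    for word, phones in lexicon.items():
        node = root
        for p in phones:
            node = node.setdefault(p, {})
        node["#word"] = word              # word-end marker at the leaf
    return root

lexicon = {
    "학교": ["h", "a", "k", "k", "yo"],
    "학생": ["h", "a", "k", "s", "e", "ng"],
    "하나": ["h", "a", "n", "a"],
}
tree = build_prefix_tree(lexicon)
# All three words share the nodes for "h" and "a"; the first two also share "k".
print(tree["h"]["a"].keys())              # dict_keys(['k', 'n'])
```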

An Approach to the Reorganization of University Libraries in the 21st Century

  • 홍현진;이병목
    • Journal of Korean Library and Information Science Society / v.29 / pp.443-464 / 1998
  • Entering the 21st century, university libraries face a rapidly changing environment: the introduction of information technology, changes in the nature of their work, and shifting user demands. This study analyzes the current organizational structure of Korean university libraries and, drawing on various organizational theories and changes in the information environment, proposes a conceptual organizational model for revitalizing library organizations. For almost ten years, Korean university libraries have seen little change beyond the introduction of automated systems, the appointment of deputy library directors, and attempts to merge the library with the computer center, owing to legal constraints and environmental limitations inside and outside the organization. The typical Korean university library is organized into acquisitions, technical services, and circulation and reference service units. If acquisitions is counted as part of technical services, 95 (82.5%) of the 114 university libraries studied are organized in the traditional form of technical services and public services. As a conceptual model for the 21st-century university library that can overcome the problems of the traditional structure, this study proposes four divisions as basic conceptual components: a service division, a service support division, a technical support division, and an integration and coordination division. However, there is no single ideal organizational structure suitable for every library's services and work processes, and reorganization varies widely according to a library's type, purpose, and workflows. Reorganization will therefore be a continuous process in response to environmental change, and the success of a library organization depends on the capacity of individuals and the organization to adapt to such change.

GenAI(Generative Artificial Intelligence) Technology Trend Analysis Using Bigkinds: ChatGPT Emergence and Startup Impact Assessment (빅카인즈를 활용한 GenAI(생성형 인공지능) 기술 동향 분석: ChatGPT 등장과 스타트업 영향 평가)

  • Lee, Hyun Ju;Sung, Chang Soo;Jeon, Byung Hoon
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.18 no.4 / pp.65-76 / 2023
  • In the field of technology entrepreneurship and startups, the development of artificial intelligence (AI) has emerged as a key topic for business model innovation. As a result, venture firms are making various efforts centered on AI to secure competitiveness (Kim & Geum, 2023). The purpose of this study is to analyze the relationship between the development of GenAI technology and the startup ecosystem by analyzing domestic news articles, in order to identify trends in the technology startup field. Using BIG Kinds, this study examined changes in GenAI-related Korean news articles, major issues, and trends from 1990 to August 10, 2023, focusing on the period before and after the emergence of ChatGPT, and visualized their relatedness through network analysis and keyword visualization. The results showed that mentions of GenAI gradually increased in articles from 2017 to 2023. In particular, OpenAI's ChatGPT service based on GPT-3.5 was highlighted as a major issue, indicating the popularization of language-model-based GenAI technologies such as OpenAI's DALL-E, Google's MusicLM, and VoyagerX's Vrew. This demonstrates the usefulness of GenAI in various fields, and since the launch of ChatGPT, Korean companies have been actively developing Korean language models. Startups such as Ritten Technologies are also utilizing GenAI to expand their scope in the technology startup field. This study confirms the connection between GenAI technology and startup entrepreneurship activities, suggesting that GenAI can support the construction of innovative business strategies, and this connection is expected to continue to shape the development of GenAI technology and the growth of the startup ecosystem. Further research is needed to explore international trends, additional analysis methods, and the applicability of GenAI in practice. These efforts are expected to contribute to the development of GenAI technology and the growth of the startup ecosystem.
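A keyword co-occurrence network of the kind used in this study's network analysis and keyword visualization can be sketched with networkx; the articles and keywords below are invented placeholders rather than BIG Kinds data.

```python
# Sketch: build a keyword co-occurrence network from per-article keyword lists and
# rank keywords by weighted degree as a simple prominence measure (illustrative only).
from itertools import combinations
import networkx as nx

articles = [
    ["생성형 AI", "ChatGPT", "OpenAI", "스타트업"],
    ["ChatGPT", "한국어", "언어 모델", "스타트업"],
    ["생성형 AI", "DALL-E", "OpenAI"],
]  # keywords already extracted per article (hypothetical)

G = nx.Graph()
for keywords in articles:
    for a, b in combinations(sorted(set(keywords)), 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1        # co-occurrence count as edge weight
        else:
            G.add_edge(a, b, weight=1)

# Rank keywords by weighted degree centrality as a proxy for prominence.
centrality = sorted(G.degree(weight="weight"), key=lambda kv: kv[1], reverse=True)
print(centrality[:5])
```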
