• Title/Summary/Keyword: 한국어 어절 (Korean eojeol)

The Method of Using the Automatic Word Clustering System for the Evaluation of Verbal Lexical-Semantic Network (동사 어휘의미망 평가를 위한 단어클러스터링 시스템의 활용 방안)

  • Kim Hae-Gyung;Yoon Ae-Sun
    • Journal of the Korean Society for Library and Information Science / v.40 no.3 / pp.175-190 / 2006
  • In recent years, there has been much interest in lexical-semantic networks. However, it is very difficult to evaluate their effectiveness and correctness and to devise methods for applying them to various problem domains. To offer fundamental ideas about how to evaluate and utilize lexical-semantic networks, we developed two automatic word clustering systems, called system A and system B. Both systems were trained on 68,455,856 words. We compared the clustering results of system A with those of system B, which is extended by the lexical-semantic network: its feature vectors are reconstructed using the elements of the lexical-semantic network of 3,656 '-ha' verbs. The target data is the 'multilingual Word Net-CoroNet'. System B achieved an accuracy of 46.6%, which is better than the 45.3% of system A.
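
The idea of extending a verb's feature vector with elements of a lexical-semantic network can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the features, the similarity measure, or the clustering algorithm, so the toy co-occurrence counts, the `SEM:` feature prefix, the romanized verb names, and the use of cosine similarity are all hypothetical.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def extend(vector, word, network):
    """Return a copy of `vector` with one extra feature per network concept."""
    extended = Counter(vector)
    for concept in network.get(word, ()):
        extended["SEM:" + concept] += 1  # hypothetical feature naming
    return extended

# Toy co-occurrence vectors for two '-ha' verbs (invented data).
cooc = {
    "gongbu-ha": Counter({"book": 3, "school": 2}),
    "yeongu-ha": Counter({"paper": 3, "lab": 2}),
}
# Toy semantic network: both verbs share one concept (invented data).
network = {"gongbu-ha": ["learning"], "yeongu-ha": ["learning"]}

# "System A" style: corpus features only.
base = cosine(cooc["gongbu-ha"], cooc["yeongu-ha"])
# "System B" style: corpus features plus network features.
ext = cosine(
    extend(cooc["gongbu-ha"], "gongbu-ha", network),
    extend(cooc["yeongu-ha"], "yeongu-ha", network),
)
print(base, ext)
```

In this toy case the two verbs share no corpus context, so only the extended vectors give them a nonzero similarity, which is the kind of effect a clustering step could then exploit.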

The perceptual span during reading Korean sentences (우리글 읽기에서 지각 폭 연구)

  • Choi, So-Young;Koh, Sung-Yrong
    • Korean Journal of Cognitive Science / v.20 no.4 / pp.573-601 / 2009
  • The present study investigated the perceptual span during the reading of Korean, using the moving-window display change technique introduced by McConkie and Rayner (1975). Eight window sizes were used in Experiment 1: 3, 5, 7, 9, 11, 13, and 15 characters, and the full line. Reading rate, number of fixations, saccadic distance, and fixation duration were compared between each window-size condition and the full-line condition. The reading rate was no higher in the full-line condition than in the 15-character condition but was higher than in the other conditions. The number of fixations was no larger in the full-line condition than in the 15-character condition, tended to be larger than in the 13-character condition, and was larger than in the other conditions. The result pattern for saccadic distance measured in characters was the same as that for reading rate, and the pattern for saccadic distance measured in pixels was the same as that for the number of fixations. Similarly, for fixation duration, there were no differences between the full-line condition and the 15-, 13-, and 11-character conditions. Fixation duration tended to be shorter in the 9-character condition, and was shorter in the 7-, 5-, and 3-character conditions, than in the full-line condition. In Experiment 2, based on the asymmetry of the perceptual span, six window sizes (0, 1, 2, 3, and 4 characters, and the full line) were used. For reading rate, number of fixations, and fixation duration, there was a difference only between the 0 condition and the other conditions. Considering the pattern of eye-movement measures above, the perceptual span of Korean readers extends about 6-7 characters to the right of fixation and 1 character to the left of fixation.
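
The moving-window technique can be sketched as a function that masks every character outside a window around the fixated character. This is only an illustration under stated assumptions: the function name, the `x` mask character, the choice to keep spaces visible, and the asymmetric window of 1 character to the left and 6 to the right (taken from the span estimate in the abstract) are not details confirmed by the study.

```python
def moving_window(line, fixation, left, right, mask="x"):
    """Return `line` as displayed while the eye fixates index `fixation`.

    Characters from `fixation - left` through `fixation + right` stay
    visible; all other non-space characters are replaced by `mask`.
    """
    start = max(0, fixation - left)
    end = min(len(line), fixation + right + 1)
    shown = []
    for i, ch in enumerate(line):
        visible = start <= i < end
        # Assumption: spaces are preserved outside the window.
        shown.append(ch if visible or ch == " " else mask)
    return "".join(shown)

line = "우리글 읽기에서 지각 폭 연구"
# Fixating the character at index 4, with 1 character visible to the
# left and 6 to the right.
print(moving_window(line, fixation=4, left=1, right=6))
# → xxx 읽기에서 지각 x xx
```

In an actual experiment the display is redrawn on every fixation using eye-tracker data; here the fixation index is simply passed in by hand.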

A Model for evaluating the efficiency of inputting Hangul on a telephone keyboard (전화기 자판의 한글 입력 효율성 평가 모형)

  • Koo, Min-Mo;Lee, Mahn-Young
    • The KIPS Transactions: Part D / v.8D no.3 / pp.295-304 / 2001
  • Standards for a telephone Hangul keyboard should be decided in terms of objective factors: the number of strokes and the fingers' moving distance. Many designers will agree on these factors because they can be calculated objectively. We therefore developed a model that evaluates the efficiency of inputting Hangul on a telephone keyboard in terms of these two factors. Compared with other models, the major features of this model are as follows: (1) it calculates the number of strokes rather than typing time; (2) it directly uses co-occurrence frequencies counted on the KOREA-1 Corpus; (3) it uses a total set of 67 consonants and vowels; and (4) it can evaluate keyboards that use a syllabic function key (the complete key, the null key, or the final consonant key) as well as keyboards that adopt no syllabic function key. However, many other factors also bear on the efficiency of inputting Hangul on a telephone keyboard. To estimate the efficiency of a telephone Hangul keyboard more accurately, both logical data and experimental data must be considered.
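
A stroke-count comparison of this kind can be sketched as a frequency-weighted average. The sketch below is a simplification and not the paper's model: it uses single-jamo frequencies rather than the co-occurrence frequencies mentioned in the abstract, ignores finger-moving distance and syllabic function keys, and the frequencies and the two layouts are invented for illustration.

```python
def expected_strokes(strokes, freq):
    """Average number of key presses per jamo, weighted by frequency.

    strokes: jamo -> key presses needed on a given layout
    freq:    jamo -> corpus count
    """
    total = sum(freq.values())
    return sum(freq[jamo] * strokes[jamo] for jamo in freq) / total

# Invented counts for three jamo (not KOREA-1 Corpus data).
freq = {"ㄱ": 50, "ㅋ": 10, "ㅏ": 40}

# Two hypothetical layouts that differ in which consonant needs two presses.
layout_a = {"ㄱ": 1, "ㅋ": 2, "ㅏ": 1}
layout_b = {"ㄱ": 2, "ㅋ": 1, "ㅏ": 1}

print(expected_strokes(layout_a, freq))  # frequent jamo on a single press
print(expected_strokes(layout_b, freq))  # frequent jamo on a double press
```

With these toy numbers, layout A averages fewer presses because the more frequent consonant costs a single press.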

AI-based stuttering automatic classification method: Using a convolutional neural network (인공지능 기반의 말더듬 자동분류 방법: 합성곱신경망(CNN) 활용)

  • Jin Park;Chang Gyun Lee
    • Phonetics and Speech Sciences / v.15 no.4 / pp.71-80 / 2023
  • This study primarily aimed to develop an automated stuttering identification and classification method using artificial intelligence technology. In particular, it aimed to develop a deep learning-based identification model utilizing the convolutional neural network (CNN) algorithm for Korean speakers who stutter. To this end, speech data were collected from 9 adults who stutter and 9 normally-fluent speakers. The data were automatically segmented at the phrasal level using Google Cloud speech-to-text (STT), and labels such as 'fluent', 'blockage', 'prolongation', and 'repetition' were assigned to the segments. Mel frequency cepstral coefficients (MFCCs) and the CNN-based classifier were used to detect and classify each type of stuttered disfluency. However, only five instances of prolongation were found, so this type was excluded from the classifier model. Results showed that the accuracy of the CNN classifier was 0.96, and the F1-scores for classification performance were as follows: 'fluent' 1.00, 'blockage' 0.67, and 'repetition' 0.74. Although the effectiveness of the automatic classification identifier was validated using CNNs to detect stuttered disfluencies, the performance was found to be inadequate, especially for the blockage and prolongation types. Consequently, the establishment of a large speech database for collecting data based on the types of stuttered disfluencies was identified as a necessary foundation for improving classification performance.
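
The gap between a high overall accuracy and lower per-class F1-scores can be illustrated with a small sketch. The label lists below are invented toy data, not the study's predictions, and the helper names are hypothetical; only the three F1-scores in `reported` come from the abstract.

```python
def accuracy(y_true, y_pred):
    """Fraction of segments whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_per_class(y_true, y_pred):
    """Per-label F1 computed from true/false positives and false negatives."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores

# Toy, imbalanced data: most segments are fluent (invented).
y_true = ["fluent"] * 8 + ["blockage"] * 2
y_pred = ["fluent"] * 8 + ["blockage", "fluent"]

print(accuracy(y_true, y_pred))      # dominated by the majority class
print(f1_per_class(y_true, y_pred))  # the rare class scores lower

# Unweighted mean of the three F1-scores reported in the abstract.
reported = {"fluent": 1.00, "blockage": 0.67, "repetition": 0.74}
macro_f1 = sum(reported.values()) / len(reported)
print(round(macro_f1, 2))
```

Because fluent segments dominate the toy data, one missed blockage barely moves accuracy but pulls the blockage F1 down sharply, which is why per-class scores are the more informative figure for rare disfluency types.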