Search | Korea Science

A Statistical Model for Korean Text Segmentation Using Syllable-Level Bigrams (음절단위 bigram정보를 이용한 한국어 단어인식모델)

Shin, Joong-Ho;Park, Hyuk-Ro
- Annual Conference on Human and Language Technology / 1997.10a / pp.255-260 / 1997
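In Korean, the eojeol, the unit delimited by spaces, is generally used as the input unit for morphological analysis. However, text used in real domains frequently contains ungrammatical forms such as spacing errors. A step that recognizes appropriate units for morphological analysis must therefore precede the analysis itself. This study proposes an eojeol recognition method for morphological analysis that exploits the syllable characteristics of Korean. Because the proposed method does not rely on a dictionary but instead extracts the necessary syllable and lexical information from a raw corpus, it can robustly analyze sentences containing errors and does not require dictionary construction and maintenance, which demand considerable time and effort. For Korean eojeol recognition, this paper proposes three probabilistic models and a recognition algorithm based on dynamic programming. When applied to the spacing-error problem and the Korean compound-noun analysis problem, the proposed models achieved a recognition accuracy of about 82-85%.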

A Method of Dictionary Search for Typographical Error (사용자 입력오류를 고려한 사전 검색 방법)

Jeong, Hyoung-Il;Seon, Choong-Nyoung;Seo, Jung-Yun
- Annual Conference on Human and Language Technology / 2010.10a / pp.183-185 / 2010
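Advances in digital devices are increasing both the demand for dictionary search and the need for robust search techniques. Existing dictionary search techniques were designed solely for search optimization, without considering user input errors. This paper proposes a typo-robust dictionary search method that uses language-model keywords and grapheme-category keywords. For user input words containing errors, the proposed method showed accuracy and search speed high enough for practical use.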

A Study on the improvement of English writing by applying error indication function in word processor (워드프로세서의 영어문장 어법오류 인식개선을 통한 영어구문작성 향상방안에 대한 연구)

Yi, Jae-Il
- Journal of Digital Convergence / v.18 no.2 / pp.285-290 / 2020
This study focuses on improving language proficiency in users' written text. To improve accuracy in writing, Computer Assisted Language Learning (CALL) can be used as one of the most efficient tools. This study proposes an English grammar checking application that can improve accuracy over current applications. The proposed system is capable of distinguishing between a noun and a noun phrase, which is critical in improving grammatical accuracy for those who write English as a foreign language.
https://doi.org/10.14400/JDC.2020.18.2.285

Analysis of Error Characteristics and Usabilities for Korean Consonant Perception Test (한국자음지각검사의 오류특성 및 유용성 분석)

Kim, Dong Chang;Kim, Jin Sook;Lee, Kyoung Won
- 재활복지 / v.18 no.4 / pp.295-314 / 2014
The purpose of this study was to provide baseline data for auditory rehabilitation in the field by examining the error types and error rates of phonemes that people with hearing impairment find difficult to discriminate. Thirty participants with sensorineural hearing loss listened to the KCPT lists recorded by a male and a female talker, yielding data on error types and on KCPT scores by talker gender. In the initial consonant test list, /ㄷ/, /ㅂ/, /ㅃ/, /ㅉ/, and /ㅌ/ showed error rates above 30%, while /ㄱ/ and /ㄷ/ did so in the final consonant test list. The most common error types were initial consonant substitution and final consonant substitution for the initial and final consonant test lists, respectively. The effect of talker gender was not significant: scores obtained with the male voice and the female voice showed no statistical difference. This means that the KCPT can be used in clinics regardless of the talker's gender.
https://doi.org/10.16884/JRR.2014.18.4.295

Word Sense Disambiguation using Multiple Feature Decision Lists (다중 자질 결정 목록을 이용한 단어 의미 중의성 해결)

서희철;임해창
- Journal of KIISE:Software and Applications / v.30 no.7_8 / pp.659-671 / 2003
This paper proposes a method of disambiguating word senses using decision lists, which consist of rules with confidence values. Each rule in a decision list is composed of a boolean function (the precondition) and a class (the sense). A decision list classifies an instance using the matching rule with the highest confidence value. Previous work disambiguated senses using single feature decision lists, whose boolean function was composed of only one feature. However, this approach can be affected more severely by the data sparseness problem and by preprocessing errors. Hence, we propose multiple feature decision lists, whose boolean functions consist of more than one feature, to identify the senses of words. Experiments were performed with one sense-tagged corpus in Korean and five sense-tagged corpora in English. The experimental results show that multiple feature decision lists are more effective than single feature decision lists in disambiguating senses.
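The classification step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Rule`, `classify`, and `has_all`, the toy senses, and every confidence value are hypothetical, chosen only to show how a multiple feature precondition can outrank single feature rules.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical representation: the set of features observed around the target word.
Context = set[str]


@dataclass(frozen=True)
class Rule:
    precondition: Callable[[Context], bool]  # boolean function over the context
    sense: str                               # class assigned when the rule fires
    confidence: float                        # illustrative value, not from the paper


def has_all(*features: str) -> Callable[[Context], bool]:
    """Build a precondition that holds when every listed feature is present."""
    required = frozenset(features)
    return lambda context: required <= context


def classify(rules: list[Rule], context: Context,
             default: Optional[str] = None) -> Optional[str]:
    """Return the sense of the matching rule with the highest confidence."""
    matching = [rule for rule in rules if rule.precondition(context)]
    if not matching:
        return default
    return max(matching, key=lambda rule: rule.confidence).sense


# Toy rules for the English word "bank" (made up for illustration only).
RULES = [
    Rule(has_all("river"), "bank/shore", 2.1),             # single feature
    Rule(has_all("money"), "bank/finance", 2.4),           # single feature
    Rule(has_all("river", "water"), "bank/shore", 3.7),    # multiple features
    Rule(has_all("money", "deposit"), "bank/finance", 3.9),
]

if __name__ == "__main__":
    print(classify(RULES, {"river", "water", "money"}))
```

In this toy setup the context contains both "river" and "money", so the two single feature rules disagree; the multiple feature rule for "river" plus "water" matches with a higher confidence and decides the sense.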

The Syllable Type and Token Frequency Effect in Naming Task (명명 과제에서 음절 토큰 및 타입 빈도 효과)

Kwon, Youan
- Korean Journal of Cognitive Science / v.25 no.2 / pp.91-107 / 2014
The syllable frequency effect is the inhibitory effect whereby words starting with a high-frequency syllable produce longer lexical decision latencies and higher error rates than words starting with a low-frequency syllable. Researchers agree that this inhibition arises from interference, at the lexical level, from syllable neighbors that share the target's first syllable, and that the degree of interference correlates with the number of syllable neighbors or with the presence of stronger syllable neighbors that have a higher word frequency. However, although syllable frequency can be divided into syllable type frequency and syllable token frequency, previous studies of visual word recognition have used syllable frequency without this distinction. Recently, Conrad, Carreiras, & Jacobs (2008) demonstrated that syllable type frequency might reflect a sub-lexical processing level, including the mapping from letters to syllables, whereas syllable token frequency might reflect competition between a target and higher-frequency syllable neighbors at the whole-word lexical processing level. The present study therefore tested their proposal using word naming tasks, which are generally more sensitive to sub-lexical processing. Accordingly, the study expected a facilitative effect of high syllable type frequency and a null effect of high syllable token frequency. In Experiment 1, words starting with a syllable of high type frequency were named faster than words starting with a syllable of low type frequency when syllable token frequency was held constant. In Experiment 2, high syllable token frequency also produced shorter naming times than low syllable token frequency when syllable type frequency was held constant. For that reason, we rejected the proposal of Conrad et al. and suggested that both type and token syllable frequency could relate to sub-lexical processing.

The Postprocessor of Automatic Segmentation for Synthesis Unit Generation (합성단위 자동생성을 위한 자동 음소 분할기 후처리에 대한 연구)

박은영;김상훈;정재호
- The Journal of the Acoustical Society of Korea / v.17 no.7 / pp.50-56 / 1998
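This paper studies postprocessing to compensate for the phoneme boundary errors of an automatic phoneme segmenter. It is part of research motivated by the need for large amounts of phoneme-level segmented and phoneme-labeled data in current phonetic and linguistic research for speech synthesis, prosody modeling, and automatic synthesis-unit generation. Because manual segmentation and labeling is hard to keep consistent and takes a long time, automatic segmentation technology is becoming increasingly important. This paper therefore proposes a postprocessor that can reduce the error range of automatically segmented boundaries, and describes how it allows automatic segmentation results to be used directly as synthesis units and is useful for building large-scale prosody databases for synthesis. The proposed postprocessor trains a multi-layer perceptron (MLP) on feature vectors of manually adjusted data, and then extracts new phoneme boundaries using the automatic segmentation results from the spoken language translation system developed at ETRI (Electronics and Telecommunication Research Institute) together with the MLP postprocessor. On a synthesis database of words uttered in isolation, the segmentation results corrected by the postprocessor showed about a 25% improvement over the segmentation rate of the spoken language translation system, and the absolute error (|hand label position - auto label position|) improved by about 39%. This shows that an MLP-based postprocessor can reduce the range of automatic segmentation errors and can be used for building large-scale prosody databases for synthesis and for automatic generation of synthesis units.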

Template Constrained Sequence to Sequence based Conversational Utterance Error Correction Method (문장틀 기반 Sequence to Sequence 구어체 문장 문법 교정기)

Jeesu Jung;Seyoun Won;Hyein Seo;Sangkeun Jung;Du-Seong Chang
- Annual Conference on Human and Language Technology / 2022.10a / pp.553-558 / 2022
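Recently, natural language processing applications for conversational data have been increasing. Because of the way such communication takes place, conversational sentences are unrefined and inevitably contain various grammatical errors such as spacing errors and sentence distortion. An automatic grammar corrector is used as a preprocessing and first-pass cleaning tool for such conversational data. As research on sentence generation based on pretrained transformers has become active, automatic grammar correctors that use them are also being studied. In transformer-based sentence correction, errors arise when the model misjudges whether a correction is needed. These errors are mostly caused by the appearance of words that confuse the context. To mitigate the errors of transformer-based grammar correctors, this paper proposes input and output sentence templates in which proper nouns, morphemes that are not needed for correction, are masked, and shows that performance improves when the proper nouns are restored into these templates.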

Extraversion and Recognition for Emotional Words: Effects of Valence, Frequency, and Task-difficulty (외향성과 정서단어의 재인 기억: 정서가, 빈도, 과제 난이도 효과)

Kang, Eunjoo
- Korean Journal of Cognitive Science / v.25 no.4 / pp.385-416 / 2014
In this study, memory for emotional words was compared between extraverts and introverts, employing signal detection analysis to distinguish differences in discriminative memory and response bias. Subjects were presented with a study list of emotional words in an encoding session, followed by a recognition session. Effects of task difficulty were examined by varying the nature of the encoding task and the intervals between study and test. For an easy task, with a retention interval of 5 minutes (Study I), introverts exhibited better memory (i.e., higher d') than extraverts, particularly for low-frequency words, and response biases did not differ between these two groups. For a difficult task, with a one-month retention period (Study II), performance was poor overall, and only high-frequency words were remembered; also extraverts adopted a more liberal criterion for 'old' responses (i.e., more hits and more false alarms) for positive emotional-valence words. These results suggest that as task difficulty drives down performance, effects of internal control processes become more apparent, revealing differences in response biases for positive words between extraverts and introverts. These results show that extraversion can distort memory performance for words, depending on their emotional valence.
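The two signal detection measures this abstract refers to, discriminability (d') and response criterion, can be computed from hit and false-alarm rates with the standard equal-variance Gaussian formulas. The sketch below is illustrative only: the function name `sdt_measures` is hypothetical, and the rates in the usage lines are made up rather than taken from the study.

```python
from statistics import NormalDist


def sdt_measures(hit_rate: float, false_alarm_rate: float) -> tuple[float, float]:
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1; NormalDist.inv_cdf raises
    StatisticsError otherwise.
    """
    z = NormalDist().inv_cdf               # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa                 # larger value: better old/new discrimination
    criterion = -0.5 * (z_hit + z_fa)      # negative value: liberal bias toward "old"
    return d_prime, criterion


if __name__ == "__main__":
    # Made-up rates: a responder with more hits and more false alarms.
    d_prime, criterion = sdt_measures(0.9, 0.4)
    print(f"d' = {d_prime:.3f}, c = {criterion:.3f}")
```

A responder who produces both more hits and more false alarms, as the abstract describes for extraverts on positive words, gets a negative criterion under these formulas, while d' separates that bias from actual memory discrimination.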

Spoken Dialogue Management System based on Word Spotting (단어추출을 기반으로 한 음성 대화처리 시스템)

Song, Chang-Hwan;Yu, Ha-Jin;Oh, Yung-Hwan
- Annual Conference on Human and Language Technology / 1994.11a / pp.313-317 / 1994
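This study implemented a spoken dialogue system between a human and a computer. In particular, we studied a semantic analysis method suited to speech recognition by word spotting, and a method of generating system responses based on rules in table form. When speech is recognized by word spotting, it is difficult to analyze the user's utterance intention through morphological and syntactic analysis, so a new semantic analysis method is needed. This study proposes a new semantic analysis method that identifies the user's utterance intention using fuzzy relations. In addition, as a way to produce system responses appropriate to the user's intention and to manage response content efficiently, we created response rules that depend on the current state and the user's intention. These rules are implemented in table form, which makes them easy to update and extend. The dialogue domain was restricted to train-ticket booking, cancellation, and inquiries, and to tourist-site information. To cope appropriately with errors caused by speech misrecognition, the system's responses include confirmation and correction steps. In experiments with both text input and speech input, users were able to achieve their intended goals with the system's help.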

Search Result 213, Processing Time 0.025 seconds
