• Title/Abstract/Keywords: 저빈도(低频度) [low frequency]

Search Result 72, Processing Time 0.04 seconds

The Effect of Word Frequency on Noun Definitions (단어빈도가 명사정의하기에 미치는 효과)

  • Lee, Chan-Jong
    • The Journal of the Acoustical Society of Korea / v.27 no.6 / pp.303-308 / 2008
  • The purpose of the present study is to investigate whether word frequency has a significant influence on noun definitions in Korean. The participants were 80 students from elementary school, middle school, high school, and university. They rated the familiarity of nouns and wrote definitions for them. The definitions were analyzed using semantic categories such as "use/purpose," "description," "association/relation," "partial explanation," "explanation," "error," "partial explanation-attribute," "partial explanation-specific class," "partial explanation-nonspecific class," "explanation-specific class," and "explanation-nonspecific class." Participants showed greater familiarity with high-frequency nouns. "EXPL" categories, which use class terms or critical attributes, were used more often in definitions of high-frequency nouns than in definitions of low-frequency nouns. Use of these categories increased with age, and errors decreased with age. Word frequency had a significant influence on noun definitions.

A Study of Development for Korean Phonotactic Probability Calculator (한국어 음소결합확률 계산기 개발연구)

  • Lee, Chan-Jong;Lee, Hyun-Bok;Choi, Hun-Young
    • The Journal of the Acoustical Society of Korea / v.28 no.3 / pp.239-244 / 2009
  • This paper develops the Korean Phonotactic Probability Calculator (KPPC), which estimates phonotactic probability in Korean. KPPC calculates positional segment frequency, position-specific biphone frequency, and position-specific triphone frequency. It also calculates neighborhood density, the number of words that sound similar to a target word. The Phonotactic Calculator developed at the University of Kansas takes computer-readable phonemic transcription as input and calculates positional frequency and position-specific biphone frequency derived from 20,000 dictionary words. KPPC, in contrast, calculates positional frequency, positional biphone frequency, positional triphone frequency, and neighborhood density, and it accepts input in either the Korean alphabet or computer-readable phonemic transcription. KPPC can thus identify high and low phonotactic probability and high and low neighborhood density.
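The two simplest quantities named in this abstract can be sketched as follows. This is a minimal illustration, not KPPC's implementation: it assumes words are tuples of phoneme symbols, uses unweighted proportions for positional segment frequency, and assumes the common one-phoneme definition of a neighbor (one substitution, deletion, or addition), which the abstract does not specify. The toy lexicon and all function names are hypothetical.

```python
from collections import Counter

def positional_segment_frequency(lexicon):
    """Map (position, segment) to the share of words that have that
    segment at that position, among words long enough to have the position."""
    counts = Counter()
    position_totals = Counter()
    for word in lexicon:
        for pos, seg in enumerate(word):
            counts[(pos, seg)] += 1
            position_totals[pos] += 1
    return {key: n / position_totals[key[0]] for key, n in counts.items()}

def is_neighbor(a, b):
    """True if b differs from a by exactly one substitution, deletion,
    or addition of a phoneme (assumed definition of 'sounds similar')."""
    if a == b:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    # Dropping one phoneme from the longer word must give the shorter word.
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))

def neighborhood_density(target, lexicon):
    """Count the lexicon words that are neighbors of the target."""
    return sum(is_neighbor(target, word) for word in lexicon)

# Hypothetical toy lexicon of phoneme tuples, for illustration only.
LEXICON = [
    ("k", "a", "m"),
    ("k", "a", "n"),
    ("k", "a"),
    ("p", "a", "m"),
    ("k", "a", "m", "i"),
    ("s", "o", "n"),
]

FREQ = positional_segment_frequency(LEXICON)
DENSITY = neighborhood_density(("k", "a", "m"), LEXICON)
```

Biphone and triphone frequencies would follow the same pattern with keys built from two or three adjacent segments instead of one.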

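A Study of Errors in Chinese-Korean Dictionary Headwords Using a Chinese Corpus and the Internet: Focusing on F2-1 (중국 코퍼스와 인터넷을 이용한 중한사전 표제어의 오류 연구 - F2-1을 중심으로)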

  • Baek, Jong-In
    • 중국학논총 / no.63 / pp.47-64 / 2019
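  • Chinese-Korean dictionaries currently circulating in Korea contain a large number of entries, yet on almost any page one can find baffling words, some of which do not appear even in the Hanyu Da Cidian (汉语大词典). We found that these words are often defined incorrectly, and this paper examines such incorrectly defined words. To do so, we first selected from a Chinese-Korean dictionary the words that occur fewer than ten times in a modern Chinese corpus. We assumed that these words are likely to be nonstandard or seldom used today. To verify this assumption more accurately, the author also searched Baidu for usage examples, and found that these words do have various problems and that their incorrect definitions need correction. This paper surveys the appropriateness of 1,500 entries in section F2-1. The study found 348 low-frequency entries in section F2-1, of which 45 have various problems. Notably, the dictionary's explanations of these low-frequency entries contain many errors, and many of the words are not suitable for inclusion in a dictionary at all. We divide the erroneous words into three groups for discussion: (1) incorrect definitions, (2) missing senses, and (3) other errors. We will continue to study the entries in other sections, and we hope this research will be of help to compilers of Chinese-Korean dictionaries.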

A Study on Feature Selection for kNN Classifier using Document Frequency and Collection Frequency (문헌빈도와 장서빈도를 이용한 kNN 분류기의 자질선정에 관한 연구)

  • Lee, Yong-Gu
    • Journal of Korean Library and Information Science Society / v.44 no.1 / pp.27-47 / 2013
  • This study investigated the classification performance of a kNN classifier using feature selection methods based on document frequency (DF) and collection frequency (CF). The results of the experiments, which used HKIB-20000 data, were as follows. First, the feature selection methods that kept high-frequency terms and removed low-frequency terms by the CF criterion achieved better classification performance than those using the DF criterion. Second, neither the DF nor the CF method performed well when low-frequency terms were selected first in the feature selection process. Finally, combining the CF and DF criteria did not yield better classification performance than using either DF or CF alone as the feature selection criterion.
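The difference between the two criteria in this abstract can be sketched as follows, taking DF as the number of documents that contain a term and CF as the total number of occurrences of the term in the collection. This is a minimal sketch of the selection step only, not the study's procedure: the kNN classifier is omitted, and the toy documents, the alphabetical tie-breaking, and the function names are assumptions.

```python
from collections import Counter

def document_frequency(docs):
    """Count, for each term, the number of documents containing it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # each document counts a term once
    return df

def collection_frequency(docs):
    """Count, for each term, its total occurrences across all documents."""
    cf = Counter()
    for tokens in docs:
        cf.update(tokens)
    return cf

def select_top_terms(freq, k):
    """Keep the k highest-frequency terms; ties are broken alphabetically
    (an arbitrary choice made here for reproducibility)."""
    ranked = sorted(freq.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]

# Hypothetical tokenized documents, for illustration only.
DOCS = [
    ["rain", "rain", "rain", "flood"],
    ["rain", "dam"],
    ["dam", "flood"],
    ["dam", "word"],
]

TOP_BY_DF = select_top_terms(document_frequency(DOCS), 2)
TOP_BY_CF = select_top_terms(collection_frequency(DOCS), 2)
```

In this toy collection "rain" is repeated within one document, so it ranks first by CF but not by DF, which is the kind of divergence that makes the two criteria select different feature sets.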

The Generation Methods of Composition Noun For Efficient Index Term Extraction (고빈도어를 이용한 복합명사 색인어 추출 방안)

  • Kim, Mi-Jin;Park, Mi-Seong;Jang, Hyeok-Chang;Choi, Jae-Hyeok;Lee, Sang-Jo
    • Annual Conference on Human and Language Technology / 1998.10c / pp.121-129 / 1998
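  • In information retrieval and automatic indexing systems, the accurate extraction of index terms determines system performance, so extracting accurate index terms is very important. To help retrieve more accurate documents during information retrieval, this paper proposes a method of generating compound nouns for efficient index term extraction using high-frequency words. To this end, composition and decomposition rules are applied to the nouns that occur most frequently within a document, namely the top 30% to 40% of high-frequency nouns, to extract compound-noun index terms. In addition, to verify the validity of compounding the top 30% to 40% of high-frequency nouns, an appropriate noun-compounding frequency is determined. Applying the proposed method showed that, for short documents of fewer than 300 eojeol (space-delimited word units), compounding nouns up to the top 30% by frequency led to few omissions of low-frequency terms, and for documents of 300 eojeol or more, compounding up to the top 40% considerably reduced such omissions. As a result, the total number of index terms was reduced and the precision of the index terms was increased.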


Hangul Word-Frequency in Semantic Categorization Task (범주화 과제에서의 한글단어 빈도효과)

  • Cho, Jeung-Ryeul
    • Annual Conference on Human and Language Technology / 1999.10e / pp.351-358 / 1999
  • Two experiments were conducted to investigate the effects of word frequency on the semantic processing of Hangul. Stimuli were two-syllable words; exemplars and target words differed in the final consonant of the second syllable in Exp 1 and in the final consonant of the first syllable in Exp 2. In Exp 1, subjects made more errors on low-frequency target words and took longer on high-frequency exemplars than on controls. In Exp 2, subjects took longer in the condition pairing high-frequency exemplars with low-frequency target words than on controls. These results support the predictions of dual process models and suggest that the use of phonological and visual information depends on word frequency. Phonological activation appears to be an optional rather than obligatory process.


Application of a large-scale climate ensemble simulation database for estimating the extreme rainfall (확률강우량 산정을 위한 대규모 기후 앙상블 모의자료의 적용)

  • Kim, Youngkyu;Son, Minwoo
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.333-333 / 2022
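  • This study was conducted to apply d4PDF (Data for Policy Decision Making for Future Change), generated from large-scale climate ensemble simulations, to the estimation of low-frequency, high-intensity probable rainfall. In addition, by comparing the probable rainfall estimated using d4PDF with that estimated from observational data and frequency analysis, the uncertainty arising from the application of frequency analysis was analyzed. The study covered the Geumsan, Imsil, Jeonju, and Jangsu observation stations located at Yongdam Dam. The d4PDF data consist of 50 ensemble members, each providing 60 years of meteorological data, so 3,000 annual maximum daily rainfall values could be collected and used at each site. Based on this feature of d4PDF, the study did not apply frequency analysis; instead, the 3,000 annual maximum daily rainfall values were ranked by magnitude following a non-parametric approach to estimate probable rainfall with return periods from 10 to 1,000 years. Deviations from the probable rainfall estimated from observational data with the Gumbel and GEV (generalized extreme value) distributions were then calculated. The deviation increased as the difference between the return period and the observation period grew, which means that a short observation period combined with frequency analysis yields probable rainfall estimates that become less reliable as the return period increases. In contrast, d4PDF minimized this uncertainty by using a large sample and produced reasonable estimates of low-frequency, high-intensity probable rainfall.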

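The non-parametric ranking step described in the d4PDF abstract above can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' procedure: it assumes the T-year value is read off as the (N/T)-th largest of N annual maxima, whereas the abstract does not say which plotting position was used. The function name is hypothetical, and the sample holds placeholder integers rather than rainfall data.

```python
def empirical_return_level(annual_maxima, return_period):
    """Return the value exceeded about once every `return_period` years,
    read directly from the ranked sample with no fitted distribution.

    Assumption: with N annual maxima, the T-year level is the
    (N / T)-th largest value.
    """
    n = len(annual_maxima)
    rank = round(n / return_period)
    if rank < 1:
        raise ValueError("return period is too long for this sample size")
    ordered = sorted(annual_maxima, reverse=True)  # largest first
    return ordered[rank - 1]

# Placeholder sample of 3,000 values (50 members x 60 years in the abstract);
# these integers are NOT rainfall data and only make the ranking easy to check.
SAMPLE = list(range(1, 3001))

LEVELS = {t: empirical_return_level(SAMPLE, t) for t in (10, 100, 1000)}
```

With 3,000 values, the 1,000-year level is the third-largest value, so it is read from the sample rather than extrapolated. A 60-year observed record cannot be ranked this way for such return periods, which is why it depends on a fitted Gumbel or GEV distribution.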

The Effects of Working Memory Load on Word Frequency (작업기억 부하가 단어빈도에 미치는 효과)

  • Lee, Chang-Hoan;Oh, Ji-Hyang;Pyun, Sung-Bom;Lim, Heui-Seok
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.3 / pp.567-571 / 2009
  • This study was conducted to investigate the role of working memory in word recognition. As a preliminary step in tackling this topic, word frequency and working memory load were manipulated in a naming task. The results showed that word frequency interacted significantly with working memory load. The effects of working memory load were greater in low-frequency word processing than in high-frequency word processing. These results indicate that working memory is more involved in the processing of low-frequency words. The implications for teaching children at the early reading acquisition stage are discussed.