• Title/Summary/Keyword: 띄어쓰기 변이 (spacing variant)


Automatic English MeSH keywords assignment to Korean medical documents - spacing variant effect (한국어 의학 문서에 대한 영문 MeSH 키워드의 자동 부여 - 띄어쓰기 변이 처리 효과를 중심으로)

  • Lee, Jae-Sung;Kim, Mi-Suk;Lee, Young-Sung
    • Annual Conference on Human and Language Technology / 2004.10d / pp.82-89 / 2004
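  • This paper introduces a system that automatically suggests English MeSH keywords from the abstracts of Korean medical papers, and proposes a method for handling the spacing variant problem. A spacing variant is a term whose word spacing differs from standard Korean orthography. To handle such variants, the thesaurus stores only the maximally spaced phrase for each term instead of every spacing variant that could be generated, and syllable-level substring search is used to find K-MeSH terms in a document. With this method, K-MeSH terms were extracted from the abstracts of Korean medical papers and ranked with a TF-IDF ranking function; when the top 10 keywords were compared with the English keywords chosen by the authors, 58% matched. Compared with the existing method, the thesaurus size was reduced by about 42% and the recall of English MeSH keyword recommendation within the top 10 increased by about 7.8%, showing that the method is effective.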
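
The abstract does not give the matching algorithm in detail, so the following is only a minimal sketch of one way spacing-insensitive lookup could work: each thesaurus entry is stored once in its maximally spaced form, and both the entry and the document are reduced to their syllable sequence before substring search. The names `THESAURUS`, `squeeze`, and `find_mesh_terms` are hypothetical, the Korean entries are illustrative and not taken from K-MeSH, and the text is assumed to use precomposed Hangul syllables (one code point per syllable).

```python
from collections import Counter

# Hypothetical entries for illustration only (not taken from K-MeSH).
# Each key is a Korean term written with the maximum number of spaces;
# each value is the English MeSH heading it maps to.
THESAURUS = {
    "만성 폐쇄성 폐 질환": "Pulmonary Disease, Chronic Obstructive",
    "급성 신 부전": "Acute Kidney Injury",
}


def squeeze(text: str) -> str:
    """Remove all whitespace so only the syllable sequence remains."""
    return "".join(text.split())


def find_mesh_terms(text: str, thesaurus: dict[str, str]) -> Counter:
    """Count thesaurus terms in text, ignoring how the text is spaced."""
    squeezed_text = squeeze(text)
    counts: Counter = Counter()
    for spaced_term, english in thesaurus.items():
        key = squeeze(spaced_term)
        start = squeezed_text.find(key)
        while start != -1:
            counts[english] += 1
            # Continue after this match so occurrences do not overlap.
            start = squeezed_text.find(key, start + len(key))
    return counts


abstract = "만성폐쇄성 폐질환 환자에서 급성신부전이 발생하였다. 급성 신부전은 드물지 않다."
print(find_mesh_terms(abstract, THESAURUS))
```

Because one stored form covers every spacing of a term, the thesaurus does not need a separate entry per variant, which is the kind of size reduction the abstract reports. A drawback of this simplified version is that removing all spaces can produce spurious matches across word boundaries; the paper's method may constrain matches in ways this sketch does not.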


Automatic Korean to English Cross Language Keyword Assignment Using MeSH Thesaurus (MeSH 시소러스를 이용한 한영 교차언어 키워드 자동 부여)

  • Lee Jae-Sung;Kim Mi-Suk;Oh Yong-Soon;Lee Young-Sung
    • The KIPS Transactions:PartB / v.13B no.2 s.105 / pp.155-162 / 2006
  • The medical thesaurus MeSH (Medical Subject Headings) has long been used as a controlled vocabulary for indexing English medical papers. In this paper, we propose an automatic cross-language keyword assignment method that assigns English MeSH index terms to the abstract of a Korean medical paper. We compare its performance with the indexing performance of human indexers and of the authors. Index terms are assigned by first extracting Korean MeSH terms from the text, converting them into the corresponding English MeSH terms, and then calculating the importance of each term to select the highest-ranked terms as keywords. As part of this process, an effective method for solving the spacing variant problem is proposed. Experiments showed that the method solved the spacing variant problem and reduced the thesaurus size by about 42%. The experiments also showed that the performance of automatic keyword assignment is much lower than that of human indexers but comparable to that of the authors.
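
The ranking step of the procedure can be sketched as follows, assuming the Korean terms have already been extracted and converted to English MeSH headings with per-document counts. The abstract does not state how term importance is calculated, so this sketch uses a common smoothed TF-IDF weighting as a stand-in; `rank_keywords`, `recall`, and the toy corpus are hypothetical and are not the authors' implementation or data.

```python
import math
from collections import Counter


def rank_keywords(doc_counts, corpus_counts, top_k=10):
    """Return the top_k headings of one document, ranked by TF-IDF."""
    n_docs = len(corpus_counts)
    doc_freq = Counter()
    for counts in corpus_counts:
        doc_freq.update(set(counts))  # each heading counted once per document
    scores = {}
    for term, tf in doc_counts.items():
        # Smoothed IDF (an assumption): avoids division by zero.
        idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
        scores[term] = tf * idf
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]


def recall(suggested, reference_keywords):
    """Fraction of reference keywords that appear among the suggestions."""
    reference = set(reference_keywords)
    if not reference:
        return 0.0
    return len(set(suggested) & reference) / len(reference)


# Toy corpus: heading counts per abstract (made-up numbers).
corpus = [
    {"Hypertension": 2, "Humans": 1},
    {"Humans": 3},
    {"Humans": 1, "Stroke": 1},
]
top = rank_keywords(corpus[0], corpus, top_k=10)
print(top)
print(recall(top, ["Hypertension", "Obesity"]))
```

Here `recall` is one plausible way to compare suggested keywords with those chosen by authors or human indexers; the paper's exact evaluation measure may differ.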

COMPARATIVE STUDY UPON THE CHARACTERISTICS OF WRITING BETWEEN THE PATIENTS WITH WRITING DISABILITIES AND NORMAL ELEMENTARY SCHOOL STUDENTS (쓰기 장애 환자와 정상 초등학교 학생의 쓰기 특성 비교)

  • Cho, Soo-Churl;Shin, Sung-Woong
    • Journal of the Korean Academy of Child and Adolescent Psychiatry / v.12 no.1 / pp.51-70 / 2001
  • Characteristics of handwriting were investigated and compared between patients with writing disabilities and normal elementary school pupils. Generally, the letters of the patients were significantly taller than those of normal children, and the patients’ letters were more sparsely distributed than those of controls. The distances between words were significantly reduced in the patients’ writing, which indicated that the patients had many more problems with space-leaving than normal pupils. Letter height differences were significant across all grades in the patients and normal controls. Letter heights decreased as the children grew older, and the slope of the decrease was steeper in normal girls (r=-0.45) than in girls with writing disabilities (r=-0.16). Sex differences were found in letter spacing in the low grades (grades 1 and 2): the distances between letters were significantly narrower in male patients than in normal boys in these grades, the differences were almost indistinguishable in grades 3 through 5, and finally, in sixth grade, letter spacings were significantly broader in normal boys than in male dysgraphics. In girls, letter spacings were significantly broader in the patients across all grades. These findings support the hypothesis that male and female writing is qualitatively different and that distinct mechanisms operate in boys and girls with dysgraphia. Across all grades and sexes, the spaces between words of the patients were significantly broader than those of normal pupils, which suggested that space-leaving between words is important in Korean writing. There was a trend for letter spacings and word spacings to decrease across grades, but in girls, no correlation between letter spacing and grade was found. Correlation analyses revealed that letter heights and letter spacings had a mild correlation (r=0.11-0.15), and that letter spacings and word spacings had a robust correlation (r=0.99).
Phonological errors were mostly found in the final phoneme (Jong-seong), especially in double phonemes (ㄳ, ㄵ, ㄶ, ㄺ, ㄻ, ㄼ, ㄾ, ㄿ, ㅀ, ㅄ), and in cases where the sound values changed due to assimilation of phonemes. Semantic errors were rare in both groups. Space-leaving errors were correlated with phonological errors and were more frequent in boys than in girls. In conclusion, significant differences existed in letter heights, letter spacings, word spacings, and the frequencies of phonological errors and space-leaving errors between the patients with writing disabilities and normal pupils. The characteristics of writing changed across grades, and the developmental profiles were somewhat quantitatively different between the groups. The differences became obvious from the second to third grades.
