• Title/Summary/Keyword: 음역어 (transliterated word)

Search Result 17

A Retrieval System Using the Automatic Transition of the English-Adopted Words into Transliterations (영어외래어의 음역어 자동변환을 이용한 검색 시스템)

  • Lee, Mi-Ran;Kim, Yang-Taek;Jeun, Hong-Tee;Youn, Sung-Dae
    • Annual Conference of KIPS
    • /
    • 2002.04b
    • /
    • pp.1073-1076
    • /
    • 2002
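  • When a query term is a loanword, retrieval recall drops sharply. This happens because transliterations of loanwords are inconsistent, and because an English loanword and its Hangul transliteration are not treated as the same index term. In this paper, English loanwords are therefore converted automatically into Hangul transliterations: the pronunciation value of each English phoneme is converted into Hangul phonemes, which are then combined. The combined transliterations are stored in an equivalence-class DB, and at search time the query term is expanded to the index terms of its equivalence class. Recall was measured to evaluate the performance of the proposed retrieval system.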
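The pipeline described in this abstract (phoneme conversion, combination, equivalence-class storage, query expansion) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the phoneme table `PHONEME_CANDIDATES` is hypothetical and hand-written for a single word, and the function names are assumptions. Only the syllable composition formula (code point `0xAC00 + (lead * 21 + vowel) * 28 + tail`) comes from the Unicode standard.

```python
from itertools import product

# Standard Unicode orderings of Hangul jamo, used to compose syllables.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
TAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def compose(lead, vowel, tail=""):
    """Combine jamo into one precomposed Hangul syllable."""
    index = (LEADS.index(lead) * 21 + VOWELS.index(vowel)) * 28 + TAILS.index(tail)
    return chr(0xAC00 + index)


# Hypothetical phoneme table (an assumption for illustration): each sound of
# the English word maps to one or more candidate (lead, vowel) jamo pairs.
PHONEME_CANDIDATES = {
    "data": [[("ㄷ", "ㅔ")], [("ㅇ", "ㅣ")], [("ㅌ", "ㅓ"), ("ㅌ", "ㅏ")]],
}


def transliterate(word):
    """Return every Hangul spelling obtainable from the candidate jamo."""
    slots = PHONEME_CANDIDATES[word.lower()]
    return ["".join(compose(*jamo) for jamo in combo) for combo in product(*slots)]


def build_equivalence_db(word, variants):
    """Map the English word and each variant to one shared equivalence class."""
    members = frozenset([word.lower(), *variants])
    return {member: members for member in members}


def expand_query(term, db):
    """Expand a query term to all index terms in its equivalence class."""
    return sorted(db.get(term.lower(), {term}))


variants = transliterate("data")
db = build_equivalence_db("data", variants)
print(expand_query("데이타", db))
```

Storing the whole class under every member means a query in either script, or in either Hangul spelling, expands to the same set of index terms, which is the property the abstract relies on to recover recall.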


A Study on the Mismatch between the Spoken and Written of Chinese Language and the Use of the Phonetic Loans (중국의 언문(言文) 부조화와 음역어의 활용)

  • 김태은
    • Language Facts and Perspectives
    • /
    • v.44
    • /
    • pp.99-124
    • /
    • 2018
  • This study examines the mismatch between spoken and written Chinese. In the past, many Chinese intellectuals insisted on abolishing Chinese characters, arguing that they are too difficult for ordinary people to learn, write, and remember. However, Chinese characters remain the only official script in China and are unlikely to be abolished in the future. On the other hand, problems often arise because Chinese characters, although they are ideograms, are used like phonetic symbols to transcribe foreign sounds. The most important part of a character as an ideogram is its meaning, but the meaning is sometimes ignored in order to represent foreign sounds phonetically. Chinese phonetic loans illustrate this situation well. This study therefore discusses the various types of Chinese phonetic loans, the problems caused by their variants, and ways to overcome those problems.

Alleviating Semantic Term Mismatches in Korean Information Retrieval (한국어 정보 검색에서 의미적 용어 불일치 완화 방안)

  • Yun, Bo-Hyun;Park, Sung-Jin;Kang, Hyun-Kyu
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.12
    • /
    • pp.3874-3884
    • /
    • 2000
  • An information retrieval system has to retrieve all and only the documents that are relevant to a user query, even if index terms and query terms do not match exactly. However, term mismatches between index terms and query terms have been a serious obstacle to improving retrieval performance. In this paper, we discuss automatic term normalization between words in text corpora and its application to a Korean information retrieval system. We perform two types of term normalization to alleviate semantic term mismatches: equivalence classes and co-occurrence clusters. First, transliterations, spelling errors, and synonyms are normalized into equivalence classes by using contextual similarity. Second, context-based terms are normalized by using a combination of mutual information and word context to establish word similarities. Next, unsupervised clustering is performed with the K-means algorithm, and co-occurrence clusters are identified. These normalized terms are then used in query expansion to alleviate semantic term mismatches. In other words, we use the two kinds of term normalization, equivalence classes and co-occurrence clusters, to expand users' queries with new terms, in an attempt to make the queries more comprehensive (adding transliterations) or more specific (adding specializations). For query expansion, we employ two complementary methods: term suggestion and term relevance feedback. The experimental results show that the proposed system can alleviate semantic term mismatches and can also provide appropriate similarity measurements. As a result, the system can improve the retrieval efficiency of an information retrieval system.
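The abstract names mutual information as one ingredient of its word-similarity measure but does not give a formula. As a hedged illustration only, the sketch below computes pointwise mutual information (PMI), a common variant, from document-level co-occurrence counts. The function name `pmi_table`, the toy documents, and the choice of document-level counting are assumptions, not details taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(docs):
    """Return PMI (log base 2) for every term pair that co-occurs in a document.

    Probabilities are estimated as document frequencies divided by the
    number of documents (an assumption made for this sketch).
    """
    n = len(docs)
    term_df = Counter()
    pair_df = Counter()
    for doc in docs:
        terms = sorted(set(doc))  # count each term once per document
        term_df.update(terms)
        pair_df.update(combinations(terms, 2))
    return {
        (a, b): math.log2((count / n) / ((term_df[a] / n) * (term_df[b] / n)))
        for (a, b), count in pair_df.items()
    }


# Toy corpus, invented for illustration.
docs = [
    ["retrieval", "index"],
    ["retrieval", "index"],
    ["retrieval", "query"],
    ["font", "ink"],
]
table = pmi_table(docs)
print(table[("font", "ink")])
```

Pairs with high PMI co-occur more often than their individual frequencies would predict, so scores like these could serve as features for a clustering step such as K-means.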


A Study on Keyword Extraction and Expansion for Web Text Retrieval (웹 문서 검색을 위한 검색어 추출과 확장에 관한 연구)

  • Yoon, Sung-Hee
    • Journal of the Korea Computer Industry Society
    • /
    • v.5 no.9
    • /
    • pp.1111-1118
    • /
    • 2004
  • A natural language query is the best user interface for users of web text retrieval systems. This paper proposes a retrieval system that expands keywords taken from the syntactically analyzed structure of the user's natural language query, using natural language processing techniques. By combining or splitting compound nouns through syntactic tree traversal, and by expanding variant or abbreviated forms of a keyword into multiple keywords, the system is shown to improve retrieval precision and correctness.


A Study on User Satisfaction with CJK Romanization in the OCLC WorldCat System (도서관 서지정보의 한중일 로마자표기법에 대한 이용자 만족도 연구)

  • Ha, Yoo-Jin
    • Journal of the Korean Society for Information Management
    • /
    • v.27 no.2
    • /
    • pp.95-115
    • /
    • 2010
  • The purpose of this study is to investigate how individuals assess Chinese, Japanese, and Korean (CJK) transliterated bibliographic information in current library catalogs. Two separate studies, a survey and an experiment, were conducted using the WorldCat system. Users noted that Romanization has many issues that can inhibit a user's ability to understand transliterated bibliographic information, even when it is in the person's own native language and even when the individual has extensive experience with transliteration systems. The experimental results supported these findings: participants achieved better results and higher satisfaction when looking for information written in English than when searching for transliterated information written in their native language. The implications for future research suggest a need to investigate user preferences for translation versus transliteration of bibliographic information. This study proposes considering English translation as a parallel link alongside CJK Romanization for bibliographic information.

Interaction of native language interference and universal language interference on L2 intonation acquisition: Focusing on the pitch range variation (L2 억양에서 나타나는 모국어 간섭과 언어 보편적 간섭현상의 상호작용: 피치대역을 중심으로)

  • Yune, Youngsook
    • Phonetics and Speech Sciences
    • /
    • v.13 no.4
    • /
    • pp.35-46
    • /
    • 2021
  • In this study, we examined the interaction between pitch reduction, which is considered a language-universal phenomenon, and native language interference in the production of L2 intonation by Chinese learners of Korean. To investigate this interaction, we conducted an acoustic analysis using measures such as pitch span, pitch level, pitch dynamic quotient, skewness, and kurtosis. In addition, the correlation between text comprehension and pitch was examined. The analyzed material consisted of four Korean discourses containing five and seven sentences of varying difficulty. Seven Korean native speakers and thirty Chinese learners who differed in their Korean proficiency participated in the production test. The comparison by language showed that Chinese had a more expanded pitch span and a higher pitch level than Korean. The analysis between groups showed that pitch reduction was prominent at the beginner and intermediate levels, i.e., their Korean was characterized by a compressed pitch span, a low pitch level, and less sentence-internal pitch variation. In contrast, the pitch use of advanced speakers was the most similar to that of Korean native speakers. There was no significant correlation between text difficulty and pitch use. Through this study, we observed that pitch reduction was more pronounced than native language interference in the phonetic layer.

A Study on the Standardization of Printing Terminology (4): Commonly Misused Ink Terms (인쇄용어 통일에 관한 연구(4)-틀리기 쉬운 잉크용어)

  • Park, Do-Yeong
    • Printing Korea (프린팅코리아)
    • /
    • s.30
    • /
    • pp.164-167
    • /
    • 2004
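  • Print-industry media contain a surprising number of words written in Japanese. Typical examples in wide use include 하리코미, 베타, 고마, 돔보, 도지, 구와에, 돈땡, 모루동, 후렉소, 아지로, 도무송, 싸바리, and 단보루. In addition, terms transliterated from Japanese that do not exist in Korean, such as 견당, 습수, 정합, 노광, 타발, 사양, 중철, 소부, 돗판, 매엽, 하지, and 상지, continue to be used. This series compiles and serializes the study on standardizing printing terminology prepared by Park Do-Yeong, a former member of the textbook review committee of the Ministry of Education and Human Resources Development.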


A Study on the Standardization of Printing Terminology (1): Commonly Misused Platemaking Terms (인쇄용어 통일에 관한 연구(1)-틀리기 쉬운 제판용어)

  • Park, Do-Yeong
    • Printing Korea (프린팅코리아)
    • /
    • s.27
    • /
    • pp.114-119
    • /
    • 2004
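  • Print-industry media contain a surprising number of words written in Japanese. Typical examples in wide use include 하리코미, 베타, 고마, 돔보, 도지, 구와에, 돈땡, 모루동, 후렉소, 아지로, 도무송, 싸바리, and 단보루. In addition, terms transliterated from Japanese that do not exist in Korean, such as 견당, 습수, 정합, 노광, 타발, 사양, 중철, 소부, 돗판, 매엽, 하지, and 상지, continue to be used. This series compiles and serializes the study on standardizing printing terminology prepared by Park Do-Yeong, a former member of the textbook review committee of the Ministry of Education and Human Resources Development.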


A Study on the Standardization of Printing Terminology (2): Commonly Misused Platemaking Terms (인쇄용어 통일에 관한 연구(2)-틀리기 쉬운 제판용어)

  • Park, Do-Yeong
    • Printing Korea (프린팅코리아)
    • /
    • s.28
    • /
    • pp.120-123
    • /
    • 2004
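  • Print-industry media contain a surprising number of words written in Japanese. Typical examples in wide use include 하리코미, 베타, 고마, 돔보, 도지, 구와에, 돈땡, 모루동, 후렉소, 아지로, 도무송, 싸바리, and 단보루. In addition, terms transliterated from Japanese that do not exist in Korean, such as 견당, 습수, 정합, 노광, 타발, 사양, 중철, 소부, 돗판, 매엽, 하지, and 상지, continue to be used. This series compiles and serializes the study on standardizing printing terminology prepared by Park Do-Yeong, a former member of the textbook review committee of the Ministry of Education and Human Resources Development.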
