• Title/Abstract/Keywords: Lexicon

Search results: 272 items (processing time: 0.028 seconds)

Bilingual lexicon induction through a pivot language

  • Kim, Jae-Hoon;Seo, Hyeong-Won;Kwon, Hong-Seok
    • Journal of Advanced Marine Engineering and Technology
    • /
    • Vol. 37, No. 3
    • /
    • pp.300-306
    • /
    • 2013
  • This paper presents a new method for constructing bilingual lexicons through a pivot language. The proposed method is adapted from the context-based approach, known as the standard approach, which is widely used for building bilingual lexicons from comparable corpora. The main difference between the two lies in how context vectors are represented: the standard approach represents them in the target language, whereas the proposed method represents them in a pivot language. As a result, the proposed method is considerably simpler than the standard approach. It is also more accurate, because it uses parallel corpora instead of comparable corpora. The experiments are conducted on one language pair, Korean and Spanish. Our experimental results show that the proposed method is attractive when a parallel corpus directly between the source and target languages is unavailable, but both source-pivot and pivot-target parallel corpora are available.
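
The idea of representing context vectors in a pivot language can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the association measure, the similarity function, or the pivot language, so raw co-occurrence counts, cosine similarity, and English as the pivot are all assumed here, and the function names and toy sentences are hypothetical.

```python
from collections import Counter
from math import sqrt


def pivot_context_vectors(aligned_pairs):
    """Map each word to counts of pivot-language words in its aligned sentences.

    aligned_pairs: iterable of (sentence_tokens, pivot_sentence_tokens).
    """
    vectors = {}
    for tokens, pivot_tokens in aligned_pairs:
        for word in set(tokens):
            vectors.setdefault(word, Counter()).update(pivot_tokens)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def induce_lexicon(source_pivot, target_pivot):
    """Pair each source word with the target word whose pivot vector is closest."""
    src_vecs = pivot_context_vectors(source_pivot)
    tgt_vecs = pivot_context_vectors(target_pivot)
    return {
        s: max(tgt_vecs, key=lambda t: cosine(sv, tgt_vecs[t]))
        for s, sv in src_vecs.items()
    }


# Hypothetical toy corpora: Korean-English and Spanish-English sentence pairs.
source_pivot = [
    (["개", "달리다"], ["dog", "runs"]),
    (["개", "먹다"], ["dog", "eats"]),
    (["고양이", "먹다"], ["cat", "eats"]),
]
target_pivot = [
    (["perro", "corre"], ["dog", "runs"]),
    (["perro", "come"], ["dog", "eats"]),
    (["gato", "come"], ["cat", "eats"]),
]
lexicon = induce_lexicon(source_pivot, target_pivot)
```

Because both sets of vectors live in the same pivot vocabulary, they can be compared directly, with no seed dictionary needed to translate one side's vectors.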

Toward A Bilingual Legal Term Glossary from Context Profiles

  • Kwong, Oi-Yee
    • Korean Society for Language and Information: Conference Proceedings
    • /
    • Korean Society for Language and Information, 2002, Language, Information, and Computation: Proceedings of the 16th Pacific Asia Conference
    • /
    • pp.249-258
    • /
    • 2002
  • We propose an algorithm for the automatic acquisition of a bilingual lexicon in the legal domain. We make use of a parallel corpus of bilingual court judgments, aligned at the sentence level, and analyse the bilingual context profiles to extract corresponding legal terms in both languages. Our method differs from those in past studies in that it does not require any prior knowledge source and extends naturally to multi-word terms in either language. A pilot test was done with a sample of ten legal terms, each with ten or more occurrences in the data. Encouraging results of about 75% average accuracy were obtained. This figure reflects not only the effectiveness of the method for bilingual lexicon acquisition, but also its potential for bilingual alignment at the word or expression level.
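
A rough sketch of extracting a translation candidate from a sentence-aligned corpus is given below. It is not the algorithm from the paper: the abstract does not define how context profiles are built or compared, so a plain Dice co-occurrence score over aligned sentences is substituted here. The function name, the English-French language pair, and the toy sentences are all hypothetical, and multi-word terms are not handled.

```python
from collections import Counter


def best_translation(term, aligned_sentences):
    """Return the target word with the highest Dice score for `term`.

    aligned_sentences: iterable of (source_tokens, target_tokens) pairs.
    Returns None if `term` never occurs on the source side.
    """
    term_count = 0              # aligned pairs whose source side contains `term`
    target_counts = Counter()   # aligned pairs containing each target word
    joint_counts = Counter()    # pairs containing both `term` and the target word
    for source_tokens, target_tokens in aligned_sentences:
        has_term = term in source_tokens
        term_count += has_term
        for word in set(target_tokens):
            target_counts[word] += 1
            if has_term:
                joint_counts[word] += 1
    if not joint_counts:
        return None
    return max(
        joint_counts,
        key=lambda w: 2 * joint_counts[w] / (term_count + target_counts[w]),
    )


# Hypothetical toy corpus of sentence-aligned judgments.
corpus = [
    (["the", "plaintiff", "appealed"], ["le", "demandeur", "a", "fait", "appel"]),
    (["the", "plaintiff", "won"], ["le", "demandeur", "a", "gagné"]),
    (["the", "court", "adjourned"], ["le", "tribunal", "a", "ajourné"]),
]
candidate = best_translation("plaintiff", corpus)
```

Frequent function words such as "le" co-occur with every term, but the Dice score penalizes them because they also appear in sentences that do not contain the term.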

The Interface between Syntax and Morphology: Taiwanese Verbal Complexes

  • Lin, Huei-Ling
    • Korean Society for Language and Information: Conference Proceedings
    • /
    • Korean Society for Language and Information, 2002, Language, Information, and Computation: Proceedings of the 16th Pacific Asia Conference
    • /
    • pp.308-319
    • /
    • 2002
  • Taiwanese abounds with verbal complexes. Among them, phasal complexes, resultative complexes, and directional complexes are alike in that their second component denotes some sort of result. Moreover, they behave similarly in that they can occur in V-ho-Y, V-e/be-Y, and V-bo-V forms. Despite these similarities, they differ from one another in several respects, such as whether objects are allowed inside or after the verbal complex and whether infixing changes their basic meaning. This paper examines their individual properties carefully and proposes that the three types of complexes differ from one another in their formation, which accounts for the differences in their syntactic behavior. Directional complexes are syntactic phrases, resultative complexes are compounds derived in syntax, and while some phasal complexes are also syntactically derived compounds, others are compounds formed in the lexicon. This paper argues that words (or compounds in this case) can be formed in syntax as well as in the lexicon.

The Semantic Structure and Argument Realization of Korean Passive Verbs

  • 김윤신;이정민;강범모;남승호
    • Korean Journal of Cognitive Science
    • /
    • Vol. 11, No. 1
    • /
    • pp.25-32
    • /
    • 2000
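  • In Korean, a passive verb is either derived from its corresponding active verb by adding a suffix, or formed by attaching a construction made up of an ending and an auxiliary verb to the stem of the corresponding active verb. A passive verb can therefore be assumed to share lexical information with its active counterpart. This paper examines how the arguments of passive verbs are realized, focusing on case alternation, and aims to establish their semantic structure on the basis of Pustejovsky's (1995) Generative Lexicon theory.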

Recognizing Chord Symbols in Printed Korean Musical Images Using Lexicon-Driven Approach

  • Dinh, Minh;Yang, Hyung-Jeong;Lee, Guee-Sang;Kim, Soo-Hyung;Na, In-Seop
    • Korea Contents Association: Conference Proceedings
    • /
    • Korea Contents Association, Proceedings of the 2015 Spring Conference
    • /
    • pp.53-54
    • /
    • 2015
  • Optical music recognition (OMR) systems have been developed in recent years. However, chord symbols, which play a role in sheet music, have so far been disregarded. We therefore aimed to develop an approach for recognizing these chord symbols. First, we divide the image of a chord symbol horizontally into small segments using a method based on vertical projection. Then, the optimal combination of these segments is found using a lexicon-driven word scoring technique and a nearest neighbor classifier. The word that corresponds to the optimal combination is the recognition result. The experiment yields an accuracy of 97.32%.
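
The first step, segmentation by vertical projection, can be illustrated with a minimal sketch. It assumes a binarized image stored as rows of 0/1 pixels and cuts only at fully blank columns; the paper's actual cut criterion is not given in the abstract, and the lexicon-driven scoring and nearest neighbor classification steps are not shown. The function names and the toy image are hypothetical.

```python
def vertical_projection(image):
    """Count ink pixels (value 1) in each column of a binary image."""
    return [sum(column) for column in zip(*image)]


def split_on_gaps(image):
    """Return (start, end) column ranges, end exclusive, separated by blank columns."""
    segments, start = [], None
    for x, ink in enumerate(vertical_projection(image)):
        if ink and start is None:
            start = x                      # a new segment begins
        elif not ink and start is not None:
            segments.append((start, x))    # a blank column closes the segment
            start = None
    if start is not None:
        segments.append((start, len(image[0])))
    return segments


# Hypothetical 3x8 binary image containing three separate strokes.
image = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
segments = split_on_gaps(image)  # [(0, 2), (4, 5), (6, 8)]
```

A recognizer in the spirit of the abstract would then try different groupings of adjacent segments and keep the grouping whose classified characters best match an entry in the chord-symbol lexicon.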

The Neighborhood Effects in Korean Word Recognition Using Computation Model

  • 박기남;권유안;임희석;남기춘
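    • Phonetic Society of Korea: Conference Proceedings
    • /
    • Phonetic Society of Korea, Proceedings of the 2007 Joint Conference with the Korean Association of Speech Sciences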
    • /
    • pp.295-297
    • /
    • 2007
  • This study proposes a computational model for investigating two questions in language processing: what roles phonological and orthographic information play in visual word recognition, and in what form the mental lexicon is represented. The model reproduced the phonological and orthographic neighborhood effects observed in Korean word recognition, and provided evidence suggesting that the mental lexicon is represented as phonological information during Korean word recognition.

Lexical Mismatches between English and Korean: with Particular Reference to Polysemous Nouns and Verbs

  • Lee, Yae-Sheik
    • Journal of the Korean Society for Language and Information: Language and Information
    • /
    • Vol. 4, No. 1
    • /
    • pp.43-65
    • /
    • 2000
  • Along with the flourishing development of computational linguistics, research on the meanings of individual words has resumed. Polysemous words in particular have come into focus, since their multiple senses pose a real challenge to linguists and computer scientists. This paper addresses three questions regarding the treatment of polysemous nouns and verbs in English and Korean. First, what types of information should be represented in individual lexical entries for polysemous words? Second, how do corresponding polysemous lexical entries differ between the two languages? Third, what does a mental lexicon look like with regard to polysemous lexical entries? For the first and second questions, Pustejovsky's (1995) Generative Lexicon Theory (hereafter GLT) is discussed in detail, with the main focus on developing an alternative way of representing (polysemous) lexical entries. For the third question, a brief discussion is given of the mapping between concepts and their lexicalizations. Furthermore, a conceptual graph around the concept 'bake' is depicted in terms of Sowa (2000).

Computerized Sound Dictionary of Korean and English

  • Kim, Jong-Mi
    • Speech Sciences
    • /
    • Vol. 8, No. 1
    • /
    • pp.33-52
    • /
    • 2001
  • A bilingual sound dictionary in Korean and English has been created for a broad range of sound reference to cross-linguistic, dialectal, native language (L1)-transferred biological and allophonic variations. The paper demonstrates that the pronunciation dictionary of the lexicon is inadequate for sound reference due to the preponderance of unmarked sounds. The audio registry consists of a three-way comparison of 1) English speech from native English speakers, 2) Korean speech from Korean speakers, and 3) English speech from Korean speakers. Several sub-dictionaries have been created as foundation research for independent development: 1) a pronunciation dictionary of the Korean lexicon in a keyboard-compatible phonetic transcription, 2) a sound dictionary of L1-interfered language, and 3) an audible dictionary of Korean sounds. The dictionary was designed to facilitate the exchange of the speech signal and its corresponding text data on various media, particularly CD-ROM. The methodology and findings of the construction are discussed.

Lexicon Analysis Method for Basic Lexicon Construction included 7th Mother Language Text Books of Element School

  • 채영숙;채영희
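    • Korean Institute of Information Scientists and Engineers, Language Engineering Research Group: Conference Proceedings (Hangul and Korean Information Processing)
    • /
    • Korean Institute of Information Scientists and Engineers, Language Engineering Research Group, 14th Conference on Hangul and Korean Information Processing, 2002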
    • /
    • pp.98-102
    • /
    • 2002
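  • To assess the level of the vocabulary used in elementary school textbooks, this study aims to identify the factors that influence that assessment, including how frequently each word is used in the textbooks, to establish the relationships among those factors, and to suggest a direction for selecting educational vocabulary. We examine the items related to Korean vocabulary education in the elementary school textbooks of the 7th National Curriculum to check whether the learning level of each stage has been taken into account. We also examine whether the hierarchy of vocabulary-meaning education set out in the level-differentiated curriculum is built on detailed and precise level appropriateness. Through an analysis of the problems in selecting elementary school educational vocabulary, we explain the need for appropriate criteria for classifying core vocabulary (기본 어휘) and basic vocabulary (기초 어휘), and the need to distinguish, in learning activities, vocabulary ability as language-use competence from vocabulary ability within the language system.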

Modeling Cross-morpheme Pronunciation Variation for Korean LVCSR

  • 이경님;정민화
    • Phonetic Society of Korea: Conference Proceedings
    • /
    • Phonetic Society of Korea, Proceedings of the May 2003 Conference
    • /
    • pp.75-78
    • /
    • 2003
  • In this paper, we describe a cross-morpheme pronunciation variation model that is especially useful for constructing a morpheme-based pronunciation lexicon for Korean LVCSR. Many pronunciation variations occur at morpheme boundaries in continuous speech. Since phonemic context, together with morphological category and morpheme boundary information, affects Korean pronunciation variations, we have distinguished pronunciation variation rules according to where they apply: within a morpheme, across a morpheme boundary in a compound noun, across a morpheme boundary in an eojeol, and across an eojeol boundary. In a 33K-morpheme Korean CSR experiment, an absolute improvement of 1.16% in WER over the baseline performance of 23.17% WER is achieved by modeling cross-morpheme pronunciation variations with a context-dependent multiple pronunciation lexicon.
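
For readers comparing results across papers, the reported figures can be turned into a resulting WER and a relative reduction. The short calculation below uses only the two numbers in the abstract; the derived values are not stated in the source, and the variable names are illustrative.

```python
baseline_wer = 23.17   # baseline word error rate in percent, from the abstract
absolute_gain = 1.16   # absolute WER improvement in percentage points, from the abstract

# WER after modeling cross-morpheme pronunciation variation.
improved_wer = baseline_wer - absolute_gain

# Relative WER reduction with respect to the baseline, in percent.
relative_reduction = absolute_gain / baseline_wer * 100

print(f"{improved_wer:.2f}")        # 22.01
print(f"{relative_reduction:.1f}")  # 5.0
```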
