• Title/Summary/Keyword: 한국용어


Intelligent information retrieval using latent semantic analysis on the Internet (인터넷에서 잠재적 의미 분석을 이용한 지능적 정보 검색)

  • 임재현;김영찬
    • The Journal of Korean Institute of Communications and Information Sciences / v.22 no.8 / pp.1782-1789 / 1997
  • Most systems that retrieve distributed information on the Internet have difficulty retrieving relevant information because they cannot reflect the exact semantics of the queries that users submit. In this paper, we propose an automatic query expansion method based on term distribution, which reflects the semantics of retrieval terms, to enhance the performance of information retrieval. We compute a weight for each term that indicates its overall importance in the document collection and in the user's query, and we use the SVD technique of LSI to measure the distribution of terms that appear similar to the query. We also measure similarity by comparing the resulting numerical values with the query terms. In addition, we investigated a method to reduce the number of additional terms automatically and evaluated the performance of the proposed method.


A Study on the Connecting Method of Query and Legal Cases Using Doc2Vec Document Embedding (Doc2Vec 문서 임베딩을 이용한 질의문과 판례 자동 연결 방안 연구)

  • Kang, Ye-Jee;Kang, Hye-Rin;Park, Seo-Yoon;Jang, Yeon-Ji;Kim, Han-Saem
    • Annual Conference on Human and Language Technology / 2020.10a / pp.76-81 / 2020
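  • For people without legal expertise to search legal information successfully, a search that uses everyday terms should still retrieve statutory information written in technical terms. However, the current legal case search system does not support search by user preference, and searching with everyday terms makes it difficult to retrieve the specialized materials the user wants. This paper therefore aims to connect queries written in everyday terms automatically with legal cases written in technical terms. To connect a query with related cases, we apply Doc2Vec, which performs well in document classification, to expert answers that use technical terms. Using the Doc2Vec document embedding technique, we suggested answers similar to the expert answers that use technical terms and grouped answers on similar topics together. We also suggested legal cases with high similarity to the expert answers, thereby automatically connecting each query to the corresponding cases.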


An Experimental Study on Feature Selection Using Wikipedia for Text Categorization (위키피디아를 이용한 분류자질 선정에 관한 연구)

  • Kim, Yong-Hwan;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.29 no.2 / pp.155-171 / 2012
  • In text categorization, core terms of an input document are hardly selected as classification features if they do not occur in a training document set. Besides, synonymous terms with the same concept are usually treated as different features. This study aims to improve text categorization performance by integrating synonyms into a single feature and by replacing input terms not in the training document set with the most similar term occurring in training documents using Wikipedia. For the selection of classification features, experiments were performed in various settings composed of three different conditions: the use of category information of non-training terms, the part of Wikipedia used for measuring term-term similarity, and the type of similarity measures. The categorization performance of a kNN classifier was improved by 0.35~1.85% in $F_1$ value in all the experimental settings when non-learning terms were replaced by the learning term with the highest similarity above the threshold value. Although the improvement ratio is not as high as expected, several semantic as well as structural devices of Wikipedia could be used for selecting more effective classification features.
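The replacement step described in the abstract above can be illustrated with a minimal sketch. The similarity table, the threshold of 0.5, and the function name `replace_unseen_terms` are hypothetical; the study derives term-term similarity from Wikipedia, which is not modelled here, and dropping terms with no sufficiently similar match is an assumption of this sketch.

```python
def replace_unseen_terms(doc_terms, training_vocab, similarity, threshold=0.5):
    """Map terms missing from the training vocabulary to their most similar training term.

    similarity: dict mapping (unseen_term, training_term) -> score (hypothetical values).
    Unseen terms with no candidate above the threshold are dropped (an assumption).
    """
    result = []
    for term in doc_terms:
        if term in training_vocab:
            result.append(term)  # already a usable classification feature
            continue
        # Collect training terms that have a similarity score for this unseen term
        candidates = [
            (score, train_term)
            for (unseen, train_term), score in similarity.items()
            if unseen == term and train_term in training_vocab
        ]
        if candidates:
            best_score, best_term = max(candidates)
            if best_score > threshold:
                result.append(best_term)
    return result


training_vocab = {"car", "engine", "road"}
similarity = {
    ("automobile", "car"): 0.9,
    ("automobile", "road"): 0.3,
    ("banana", "road"): 0.1,
}
features = replace_unseen_terms(["automobile", "engine", "banana"], training_vocab, similarity)
print(features)
```

Here "automobile" is replaced by "car", "engine" is kept unchanged, and "banana" is discarded because no training term is similar enough.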

Is Knowledge Ascription Sensitive at all?: A Critique of Contextualist or Subject-sensitivist Semantic Approaches to 'know' (지식귀속은 민감하게 이뤄지는가? :'안다'에 대한 맥락주의 및 주체-민감주의 의미론 비판)

  • Han, Seong-Il
    • Korean Journal of Logic / v.8 no.2 / pp.109-141 / 2005
  • In this paper, I raise an objection to "sensitivism" about "know", according to which knowledge ascription is sensitive to contexts of utterance or subjects. While Peter Unger once proposed insensitivism about "know" in terms of insensitivism with respect to absolute terms, David Lewis provided sensitivism about "know" in terms of sensitivism with respect to absolute terms, on the common ground that "know" belongs to a class of absolute terms. On the one hand, I object to Unger-style insensitivism about "know", for, I claim, we have reason to opt for sensitivism rather than insensitivism with respect to absolute terms in virtue of the maxim that I call the "semantic razor". On the other hand, I also object to sensitivist approaches to "know", for, on reflection, there is such a deep difference between "know" and absolute terms (or sensitive terms altogether) that "know" cannot be taken to be sensitive to contexts, unlike absolute terms (or sensitive terms altogether). These claims jointly indicate that "know" should be thought of as an insensitive term even though sensitivism has enjoyed wide acceptance in many other cases.


Analyzing Architectural History Terminologies by Text Mining and Association Analysis (텍스트 마이닝과 연관 관계 분석을 이용한 건축역사 용어 분석)

  • Kim, Min-Jeong;Kim, Chul-Joo
    • Journal of Digital Convergence / v.15 no.1 / pp.443-452 / 2017
  • Architectural history traces the changes in architecture through various traditions, regions, overarching stylistic trends, and dates. This study used text mining and association analysis to identify terminologies in architectural history in terms of their proximity and frequency. It explored terminologies by investigating articles published in the "Journal of Architectural History", the sole journal for architectural history studies. First, frequently appearing key terminologies were extracted from papers that had titles, keywords, and abstracts. Then, we analyzed typical and specific key terminologies that appear frequently or only partially, depending on the research area. Finally, association analysis was used to find frequent patterns among the key terminologies. This research can serve as fundamental data for understanding issues and trends in architectural history.

Use of SNOMED CT to Represent Traditional Korean Medicine Concepts : A Semantic Characterization of Migraine-Related Concepts from Korean Medicine Clinical Practice Guideline (SNOMED CT를 활용한 한의약 개념 매핑 : 한의임상진료지침에서 도출된 편두통 관련 개념의 의미론적 표현)

  • Ahjung Byun;Hyeoun-Ae Park;Byung-Kwan Seo;EunYong Lee;Hyeoneui Kim
    • Journal of Society of Preventive Korean Medicine / v.28 no.2 / pp.85-97 / 2024
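  • Objectives: This study aimed to investigate whether terms used in traditional Korean medicine can be mapped to SNOMED CT and to propose ways to improve the existing SNOMED CT ontology so that it can represent Korean medicine terms. Methods: We adapted the seven-step process presented in the mapping guideline of a previous study: defining the purpose and scope of the mapping, extracting terms, extracting concepts, preparing source terms for mapping, searching for SNOMED CT concepts, classifying mapping relationships, and validating the mapping. The purpose of the mapping was to evaluate SNOMED CT as a standard terminology for representing clinical ideas in Korean medicine, and the scope covered assessment, diagnosis, treatment, and prevention in the management of migraine patients. Results: A total of 546 terms were extracted. After duplicates were removed, 271 concepts were used for SNOMED CT mapping. Of these, 43.2% (117 concepts) were mapped to semantically equivalent SNOMED CT concepts, and 39.1% (106 concepts) were mapped to SNOMED CT concepts with a broader meaning. Of the 106 Korean medicine concepts mapped to broader SNOMED CT concepts, 19 could be expressed with semantic equivalence by using SNOMED CT post-coordination. The remaining 17.7% of the Korean medicine concepts could not be mapped to SNOMED CT. Conclusions: This study standardized Korean medicine terminology by mapping concepts used in Korean medicine to SNOMED CT. Based on the results, we propose adding new concepts and attributes to SNOMED CT so that terms used in Korean medicine can be expressed in standard medical terminology.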
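As a quick consistency check, the percentages reported in the abstract above can be recomputed from the stated counts (271 concepts, 117 equivalent, 106 broader). The variable and function names below are illustrative only and do not come from the study.

```python
total_concepts = 271  # concepts used for SNOMED CT mapping after removing duplicates
equivalent = 117      # mapped to a semantically equivalent SNOMED CT concept
broader = 106         # mapped to a SNOMED CT concept with a broader meaning
unmapped = total_concepts - equivalent - broader  # concepts that could not be mapped


def share(count, total=total_concepts):
    """Return the percentage of `total` represented by `count`, rounded to one decimal."""
    return round(100 * count / total, 1)


print(share(equivalent), share(broader), share(unmapped))
```

The three shares reproduce the reported 43.2%, 39.1%, and 17.7%, which implies 48 unmapped concepts.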

Alleviating Semantic Term Mismatches in Korean Information Retrieval (한국어 정보 검색에서 의미적 용어 불일치 완화 방안)

  • Yun, Bo-Hyun;Park, Sung-Jin;Kang, Hyun-Kyu
    • The Transactions of the Korea Information Processing Society / v.7 no.12 / pp.3874-3884 / 2000
  • An information retrieval system has to retrieve all and only the documents that are relevant to a user query, even if index terms and query terms do not match exactly. However, mismatches between index terms and query terms have been a serious obstacle to improving retrieval performance. In this paper, we discuss automatic term normalization between words in text corpora and its application to a Korean information retrieval system. We perform two types of term normalization to alleviate semantic term mismatches: equivalence classes and co-occurrence clusters. First, transliterations, spelling errors, and synonyms are normalized into equivalence classes by using contextual similarity. Second, context-based terms are normalized by using a combination of mutual information and word context to establish word similarities. Next, unsupervised clustering is performed with the K-means algorithm, and co-occurrence clusters are identified. These normalized terms are then used in query expansion to alleviate semantic term mismatches. In other words, we use the two kinds of term normalization, equivalence classes and co-occurrence clusters, to expand users' queries with new terms, in an attempt to make the queries more comprehensive (adding transliterations) or more specific (adding specializations). For query expansion, we employ two complementary methods: term suggestion and term relevance feedback. The experimental results show that the proposed system can alleviate semantic term mismatches and can also provide appropriate similarity measurements. As a result, the system can improve the retrieval efficiency of the information retrieval system.

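The abstract above mentions mutual information as one ingredient of word similarity. A minimal sketch of that idea follows, using pointwise mutual information over document-level co-occurrence. This formulation, the toy documents, and the function name `pmi_scores` are assumptions for illustration; the paper's exact measure and its combination with word context are not given in the abstract.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(documents):
    """Compute pointwise mutual information (base 2) for word pairs sharing a document."""
    n_docs = len(documents)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for doc in documents:
        words = sorted(set(doc))  # sort so each pair has one canonical order
        word_df.update(words)
        pair_df.update(combinations(words, 2))
    scores = {}
    for (a, b), joint in pair_df.items():
        p_ab = joint / n_docs
        p_a = word_df[a] / n_docs
        p_b = word_df[b] / n_docs
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


docs = [
    ["computer", "pc", "software"],
    ["computer", "pc"],
    ["software", "license"],
    ["weather", "rain"],
]
scores = pmi_scores(docs)
for pair, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(value, 2))
```

Pairs that always appear together, such as "computer" and "pc" in this toy data, score higher than pairs that co-occur only as often as chance would predict; such scores could then feed a clustering step like the K-means stage the abstract describes.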

A Development of Reference Terminology Subset Editor for effective adaption of Clinical Vocabulary (임상용어의 효율적 적용을 위한 참조용어 Subset 에디터의 개발)

  • Cho, Hune;Kim, Hyung-Hoi;Choi, Byung-Guan;Choi, Young-Yeon;Kim, Hwa-Sun;Hong, Hae-Sook
    • Journal of Korea Multimedia Society / v.11 no.3 / pp.364-372 / 2008
  • Because a single medical terminology system cannot cover all medical concepts, it is highly useful in an actual clinical setting to apply appropriate medical terms to every area of the electronic medical record (EMR) and to link them effectively. To use standardized terms conveniently and efficiently, they need to be categorized according to the purposes of individual departments or physicians, so that organized subsets of the terms most likely to be used can be developed. In addition, such a subset must allow the standardized terminology system to be changed or corrected, and it must continue to be developed and upgraded to meet users' changing demands. In this paper, data including chief complaints, symptoms, diagnoses, operations, and histories of previous treatments were collected for analysis from the discharge summaries of patients in the Department of Neurosurgery at Busan National University Hospital. A subset database was then created; for terms that needed to be added, physicians performed the mapping directly through a connection to the reference terminology server, and a subset editor was developed for creating new subset databases. This approach is expected to serve as a practical and effective management method that reduces the problems and inefficiency caused by the existing vast terminology systems.


An Analysis of Korean Language Learners' Understanding According to the Types of Terms in School Mathematics (수학과 용어 유형에 따른 한국어학습자의 이해 분석)

  • Do, Joowon;Chang, Hyewon
    • Communications of Mathematical Education / v.36 no.3 / pp.335-353 / 2022
  • The purpose of this study is to identify the characteristics and types of errors in Korean language learners' conceptual images according to the types of mathematical terms that underlie the solving of mathematical word problems, and to prepare basic data for effective teaching and learning methods for Korean language learners solving word problems. To this end, a case study was conducted with four Korean language learners to analyze their specific conceptual images of terms registered in the curriculum and of terms not registered in the curriculum but used in textbooks. The results suggest the following. First, Korean language learners should be guided with sufficient visualization material so that they can form appropriate conceptual definitions of terms in school mathematics. Second, it is necessary to understand the specific relationship between the language used in the homes of Korean language learners and their conceptual images of terms in school mathematics. Third, attention should be paid to passive terms, whose meaning is harder to understand than that of active terms. Fourth, even Korean language learners who have no difficulty in daily communication need instruction on everyday words that are not registered in the curriculum but are used in mathematics textbooks. Fifth, terms in school mathematics should be taught with consideration of the types of errors, shown in learners' explanations of terms, that reflect the linguistic characteristics of Korean language learners. These findings are expected to help in teaching word problem solving to Korean language learners with different linguistic backgrounds.

Optimizing the Weight of Added Terms in Query Expansion (질의확장 검색에서의 추가용어 가중치 최적화)

  • 정영미;이재윤
    • Proceedings of the Korean Society for Information Management Conference / 2002.08a / pp.241-246 / 2002
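  • In global query expansion retrieval that uses co-occurrence-based similarity between words, the similarity to the query is generally used as the search weight assigned to each term added to the query. However, it is questionable whether similarity is really the optimal search weight. We examined the characteristics of the similarity between added terms and the query and compared similarity-based weighting with the assignment of a fixed weight. We also tested several search weight formulas to find an optimized weight for added terms that is less affected by the test collection and the expansion range.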

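The two weighting schemes contrasted in the abstract above, similarity-based weights and a fixed weight for added terms, can be sketched as follows. The similarity values, the `top_k` cutoff, the fixed weight of 0.5, and the function name `expand_query` are hypothetical and are not taken from the paper.

```python
def expand_query(query_terms, term_similarity, top_k=2, fixed_weight=None):
    """Return {term: weight} for the original query plus up to top_k added terms.

    term_similarity: dict mapping candidate term -> similarity to the query
                     (hypothetical values, e.g. derived from co-occurrence).
    fixed_weight: if given, every added term receives this weight instead of
                  its similarity to the query.
    """
    weights = {term: 1.0 for term in query_terms}  # original terms keep full weight
    candidates = sorted(
        ((sim, term) for term, sim in term_similarity.items() if term not in weights),
        reverse=True,  # most similar candidates first
    )
    for sim, term in candidates[:top_k]:
        weights[term] = sim if fixed_weight is None else fixed_weight
    return weights


similarity_to_query = {"retrieval": 0.8, "search": 0.6, "index": 0.2}
by_similarity = expand_query(["information"], similarity_to_query)
by_fixed = expand_query(["information"], similarity_to_query, fixed_weight=0.5)
print(by_similarity)
print(by_fixed)
```

Both calls add the same two terms; only the weights differ, which is the variable the study examines when searching for a weighting formula that is robust to the test collection and the expansion range.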