Resolving the Ambiguities in Word Sense by using Automatic Keyword Network in Information Retrieval

Kim, Jung-Sae; Jang, Duk-Sung

한국정보처리학회논문지 (The Transactions of the Korea Information Processing Society)

Vol. 7, No. 12 / pp. 3855-3865 / 2000 / pISSN 1226-9190

한국정보처리학회 (Korea Information Processing Society)

정보검색에서의 어의 중의성 해소를 위한 자동 키워드망의 이용

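Kim, Jung-Sae (Researcher, Spoken Language Team, Electronics and Telecommunications Research Institute);
Jang, Duk-Sung (School of Computer and Electronic Engineering, Keimyung University)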

Published: 2000.12.01

Abstract

Automatic indexing is an essential part of a text retrieval system. However, automatic indexing alone cannot guarantee that the most relevant texts are ranked at the top, nor can it prevent irrelevant texts that merely contain a homonym of a query term from being ranked highly. In this paper, we propose a two-level retrieval system that addresses these problems and improves retrieval effectiveness by using an Automatic Keyword Network (AKN) at the second level. The first-level search is carried out with an inverted index file generated by automatic indexing. The second-level search exploits the AKN, which is built from the degree of association between terms. We developed several formulas for re-ranking texts at the second level, compared them, and evaluated how effectively they resolve the word sense ambiguity caused by homonyms.
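
The two-level flow described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's association measure or its re-ranking formulas, so everything below is an assumption made for illustration: the function names are hypothetical, term association is approximated by positive pointwise mutual information over document co-occurrence, and the re-ranking formula simply adds the summed association between query terms and document terms to the first-level score.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def build_inverted_index(docs):
    """First-level structure: map each term to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, tokens in docs.items():
        for term in tokens:
            index[term].add(doc_id)
    return index


def build_keyword_network(docs):
    """Link co-occurring terms, weighted by positive PMI (an assumed measure)."""
    n_docs = len(docs)
    term_df, pair_df = Counter(), Counter()
    for tokens in docs.values():
        terms = sorted(set(tokens))
        term_df.update(terms)
        pair_df.update(combinations(terms, 2))
    network = defaultdict(dict)
    for (a, b), n_ab in pair_df.items():
        pmi = math.log2(n_ab * n_docs / (term_df[a] * term_df[b]))
        if pmi > 0:  # keep only pairs that co-occur more often than chance
            network[a][b] = network[b][a] = pmi
    return network


def first_level_search(index, query):
    """Score each document by how many query terms it contains."""
    scores = Counter()
    for term in query:
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    return scores


def second_level_rerank(docs, network, query, first_scores, weight=1.0):
    """Re-rank first-level hits by adding query-to-document term association."""
    final = {}
    for doc_id, base in first_scores.items():
        doc_terms = dict.fromkeys(docs[doc_id])  # unique terms, original order
        assoc = sum(network.get(q, {}).get(t, 0.0) for q in query for t in doc_terms)
        final[doc_id] = base + weight * assoc
    # Highest score first; ties are broken by document id.
    return sorted(final, key=lambda d: (-final[d], d))


# Toy corpus (hypothetical): document 2 uses "bank" in its riverside sense.
docs = {
    1: ["bank", "loan", "interest"],
    2: ["bank", "river", "fishing"],
    3: ["money", "loan", "interest"],
    4: ["money", "loan"],
    5: ["river", "fishing", "boat"],
}
query = ["bank", "money"]
index = build_inverted_index(docs)
network = build_keyword_network(docs)
first = first_level_search(index, query)
ranking = second_level_rerank(docs, network, query, first)
print(ranking)
```

In this toy corpus, documents 1 and 2 both match only "bank" and therefore tie at the first level. The second level places document 2 below document 1, because "river" and "fishing" have no association with "money", whereas "loan" and "interest" do.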
