Search | Korea Science

LATENT SEMANTIC INDEXING AND LINEAR RELEVANCE FEEDBACK IN TEXT INFORMATION RETRIEVAL THEORY

Yang, Ki-Choon
- Journal of the Korean Mathematical Society / v.36 no.3 / pp.609-619 / 1999
We give a mathematically rigorous description of the recently popular latent semantic indexing (LSI) method in text information retrieval theory. Also, a related problem of finding a document ranking function in linear relevance feedback is discussed.

Semantic and syntactic relationships of indexing languages (색인언어의 어의적 관계 및 구문적 관계)

윤구호
- Journal of Korean Library and Information Science Society / v.22 / pp.1-26 / 1995
Indexes, especially subject indexes, are major tools for information retrieval. To enhance the retrieval effectiveness of subject indexes, the semantic and syntactic relationships of indexing languages are very important elements. This paper examines these relationships based purely on the syntax and semantics of the Korean language. The study is outlined as follows: 1. The characteristics and usages of controlled vocabularies, particularly subject headings lists and thesauri, are reviewed. 2. The semantic relationships, such as equivalence, hierarchical and associative relationships, are defined, and their categories are investigated in detail. Accordingly, the usages of 'See' and 'See also' references are suggested circumstantially. 3. The syntactic relationships are also examined. In particular, for the syntactic relationships of multiword indexing terms, two kinds of subject entry formats are compared. Since it is the more rational choice for subject headings organized by the principle of context dependency, the two-line entry format is recommended for subject indexes. 4. Computerized production techniques for 'See' and 'See also' references representing the semantic relationships of indexing terms are presented. 5. Computerized production techniques for subject indexes representing the syntactic relationships of indexing terms are also presented.

A Study on Semantic Based Indexing and Fuzzy Relevance Model (의미기반 인덱스 추출과 퍼지검색 모델에 관한 연구)

Kang, Bo-Yeong;Kim, Dae-Won;Gu, Sang-Ok;Lee, Sang-Jo
- Proceedings of the Korean Information Science Society Conference / 2002.04b / pp.238-240 / 2002
If an information retrieval (IR) system comprehends the semantic content of documents and knows the preferences of its users, it can search for information on the Internet more effectively and improve IR performance. We therefore propose an IR model that combines semantic-based indexing with a fuzzy relevance model. In addition to the statistical approach, we adopt a semantic approach to indexing, lexical chains, because we assume it will improve the performance of index term extraction. Furthermore, we combine the semantic-based indexing with the fuzzy model, which finds the exact relevance between user preferences and index terms. The proposed system works as follows. First, it indexes documents with an efficient index term extraction method based on lexical chains. Then, when a user retrieves information from the indexed document collection, the extended IR model calculates and ranks the relevance of the user query, user preferences and index terms by some metrics. In experiments on each module, semantic-based indexing and the extended fuzzy model, both gave noticeable results. The combination of these modules is expected to improve information retrieval performance.

A Mobile P2P Semantic Information Retrieval System with Effective Updates

Liu, Chuan-Ming;Chen, Cheng-Hsien;Chen, Yen-Lin;Wang, Jeng-Haur
- KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.5 / pp.1807-1824 / 2015
As technologies advance, mobile peer-to-peer (MP2P) networks or systems become one of the major ways to share resources and information. On such a system, information retrieval (IR), including the development of scalable infrastructures for indexing, becomes more complicated due to a huge increase in the amount of information and rapid information change. To keep the systems on MP2P networks reliable and consistent, the index structures need to be updated frequently. For a semantic IR system, the index structure is even more complicated than in a classic IR system and generally has a higher update cost. The most well-known indexing technique used in semantic IR systems is Latent Semantic Indexing (LSI), whose index structure is generated by singular value decomposition (SVD). Although LSI performs well, updating the index structure is difficult and time-consuming. In an MP2P environment, which is fully distributed and dynamic, the update becomes more challenging. In this work, we consider how to update the semantic index generated by LSI and keep the index consistent across the whole MP2P network. The proposed Concept Space Update (CSU) protocol, based on a distributed two-phase locking strategy, can effectively achieve the objectives in terms of two measurements: coverage speed and update cost. Using the proposed synchronization mechanism with efficient updates on the SVD, re-computing the whole index on the P2P overlay can be avoided and consistency can be achieved. Simulated experiments are also performed to validate our analysis of the proposed CSU protocol. The experimental results indicate that CSU is effective at updating the concept space with the LSI/SVD index structure in MP2P semantic IR systems.
https://doi.org/10.3837/tiis.2015.05.014
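
For readers unfamiliar with how an LSI index is generated by SVD, the following is a minimal sketch in Python with NumPy. It is not the CSU protocol from the paper above and does not address index updates; it only illustrates the standard rank-k truncation and query fold-in. The function names (`lsi_fit`, `fold_in_query`, `rank_documents`) and the toy term-by-document matrix are assumptions made up for illustration.

```python
import numpy as np

def lsi_fit(term_doc, k):
    """Rank-k LSI index: truncated SVD of a term-by-document matrix."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in_query(query_vec, U_k, s_k):
    """Project a term-space query into the k-dimensional concept space."""
    return (query_vec @ U_k) / s_k

def rank_documents(query_vec, U_k, s_k, Vt_k):
    """Rank documents by cosine similarity to the query in concept space."""
    q = fold_in_query(query_vec, U_k, s_k)
    docs = Vt_k.T  # one row per document
    sims = docs @ q / (np.linalg.norm(docs, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims), sims

# Hypothetical toy data: rows are terms, columns are documents.
A = np.array([
    [1, 1, 0, 0],  # "index"
    [1, 0, 0, 0],  # "retrieval"
    [0, 1, 0, 0],  # "semantic"
    [0, 0, 1, 1],  # "video"
    [0, 0, 1, 0],  # "soccer"
], dtype=float)

U_k, s_k, Vt_k = lsi_fit(A, k=2)
query = np.array([1, 1, 0, 0, 0], dtype=float)  # "index retrieval"
order, sims = rank_documents(query, U_k, s_k, Vt_k)
```

In this toy example the second document shares only one term with the query, yet it lands in the same concept as the first, which is the effect LSI is meant to capture. Recomputing the full SVD whenever documents change is the cost that the abstract above describes as difficult and time-consuming.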

An Experimental Study on Opinion Classification Using Supervised Latent Semantic Indexing(LSI) (지도적 잠재의미색인(LSI)기법을 이용한 의견 문서 자동 분류에 관한 실험적 연구)

Lee, Ji-Hye;Chung, Young-Mee
- Journal of the Korean Society for information Management / v.26 no.3 / pp.451-462 / 2009
The aim of this study is to apply latent semantic indexing (LSI) techniques to the efficient automatic classification of opinionated documents. For the experiments, we collected 1,000 opinionated documents such as reviews and news, with 500 of them labelled as positive documents and the remaining 500 as negative. Sets of content words and sentiment words were extracted using a POS tagger in order to identify the optimal feature set for opinion classification. The findings showed that LSI techniques were more effective than a term indexing method for sentiment classification. The best performance was achieved by a supervised LSI technique.
https://doi.org/10.3743/KOSIM.2009.26.3.451

Theory and practice of alphabetical subject indexing (주제색인의 이론과 실제)

윤구호
- Journal of Korean Library and Information Science Society / v.10 / pp.95-131 / 1983
An index is a systematic guide to items contained in, or concepts derived from, a collection. Thus, it is represented as a set of pairs of index terms (t) and documents (D): I = {(t, D) | t ∈ V, D ∈ W}, where V is the index vocabulary and W is the document collection. Indexing is the process of analysing the informational content of records of knowledge and expressing that content in the language of the indexing system. It involves: 1) selecting indexable concepts in a document; and 2) expressing these concepts in the language of the indexing system (as index entries) and an ordered list. The indexing process involves technical, semantic and syntactic problems. Technical problems are related to the accuracy of indexing, which is primarily governed by the indexer's ability to analyse the subject, identify indexable concepts, and code them. The proper levels of indexing exhaustivity and index language specificity are also significant factors affecting the quality of an index. Semantic problems are related to the choice of index terms and the form in which they should be used. Equivalent, hierarchical and affinitive/associative relationships of index terms are involved. Syntactic problems are largely related to the coordination of index terms. This process of coordination arises from the need to be able to search for the intersection of two or more classes defined by terms denoting distinct concepts. Finally, the most valuable aspects of alphabetical subject indexing theories and practices are derived from those of Cutter, Kaiser, Ranganathan, Coates, Lynch and Austin, and are discussed in detail.

Semantic Indexing for Soccer Videos Using Web-Extracted Information (웹에서 축출된 정보를 이용한 축구 경기의 시맨틱 인덱싱)

Hirata, Issao;Kim, Myeong-Hoon;Sull, Sang-Hoon
- Proceedings of the Korean Information Science Society Conference / 2007.10c / pp.41-45 / 2007
The rapid growth of video content production makes it necessary to develop more complex indexing systems that allow efficient searching, retrieval and presentation of the desired video segments. This paper presents a method for indexing soccer videos through automatic extraction of information from the Internet. The paper defines a metadata structure to formally represent the knowledge of soccer matches and provides an automatic method to extract semantic information from web sites. This approach improves the capability to extract more reliable and richer semantic information for soccer videos. Experimental results demonstrate that the proposed method performs efficiently.

A Semantic-based Video Retrieval System Using the Automatic Indexing Agent (자동 인덱싱 에이전트를 이용한 의미기반 비디오 검색 시스템)

Kim Sam-Keun;Lee Jong-Hee;Yoon Sun-Hee;Lee Keun-Soo;Seo Jeong-Min
- Journal of Korea Multimedia Society / v.9 no.1 / pp.127-137 / 2006
To process video data effectively, the content information of the video data must be loaded into a database, and a semantic-based retrieval method must be available for the various queries of users. Existing content-based video retrieval systems search with a single method, such as annotation-based or feature-based retrieval; they show low search efficiency and, because automatic processing is incomplete, require considerable effort from the system administrator or annotator. In this paper, we propose a semantic-based video retrieval system that supports semantic retrieval for various users through feature-based and annotation-based retrieval of massive video data. From the user's basic query and the selection of an image from the key frames extracted for that query, the automatic indexing agent specifies the annotation of the extracted key frame in more detail. In addition, the key frame selected by the user becomes a query image, and the system searches for the most similar key frame with the proposed feature-based retrieval method. We therefore propose a system that can increase the retrieval efficiency of video data through semantic-based retrieval.

Latent Semantic Indexing Analysis of K-Means Document Clustering for Changing Index Terms Weighting (색인어 가중치 부여 방법에 따른 K-Means 문서 클러스터링의 LSI 분석)

Oh, Hyung-Jin;Go, Ji-Hyun;An, Dong-Un;Park, Soon-Chul
- The KIPS Transactions:PartB / v.10B no.7 / pp.735-742 / 2003
In an information retrieval system, document clustering provides user convenience and visual effects by rearranging the retrieved documents according to specific topics. In this paper, we cluster documents using the K-Means algorithm and present the effect of the index term weighting scheme on document clustering. To verify the experiment, we applied the Latent Semantic Indexing approach to illustrate the clustering results and analyzed them in 2-dimensional space. The experimental results showed that when local weighting, global weighting and a normalization factor are applied, the density of clustering is higher than that of similar or identical weighting schemes in 2-dimensional space. In particular, the logarithm of local and global weighting is noticeable.
https://doi.org/10.3745/KIPSTB.2003.10B.7.735
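
As a rough illustration of what combining local weighting, global weighting and a normalization factor can look like, here is a small Python sketch. The specific formulas (log(1 + tf) as the local weight, log(N/df) as the global weight, cosine normalization per document) are common textbook choices assumed for this example and are not necessarily the schemes evaluated in the paper above; the function name `log_weighted` and the sample counts are also hypothetical.

```python
import numpy as np

def log_weighted(term_doc):
    """Apply log local weight x log global (IDF) weight, then
    cosine-normalize each document column."""
    tf = np.asarray(term_doc, dtype=float)
    n_docs = tf.shape[1]
    local = np.log1p(tf)                             # local: log(1 + tf)
    df = np.count_nonzero(tf, axis=1)                # document frequency per term
    global_w = np.log(n_docs / np.maximum(df, 1))    # global: log(N / df)
    w = local * global_w[:, None]
    norms = np.linalg.norm(w, axis=0)
    norms[norms == 0] = 1.0                          # leave all-zero documents as zeros
    return w / norms

# Hypothetical raw counts: rows are terms, columns are documents.
counts = [[2, 0, 1],
          [1, 1, 1],
          [0, 3, 0]]
W = log_weighted(counts)
```

The columns of `W` could then be passed to any K-Means implementation. Note that a term occurring in every document (the second row here) receives a global weight of zero under this particular choice and so contributes nothing to the clustering.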

Text-based Image Indexing and Retrieval using Formal Concept Analysis

Ahmad, Imran Shafiq
- KSII Transactions on Internet and Information Systems (TIIS) / v.2 no.3 / pp.150-170 / 2008
In recent years, the main focus of research on image retrieval techniques has been content-based image retrieval. Text-based image retrieval schemes, on the other hand, provide semantic support and efficient retrieval of matching images. In this paper, we propose a new image indexing and retrieval technique based on Formal Concept Analysis (FCA). The proposed scheme uses keywords and textual annotations and provides semantic support with fast retrieval of images. Retrieval efficiency in this scheme is independent of the number of images in the database and depends only on the number of attributes. The scheme provides dynamic support for adding new images to the database and can be adopted to find images with any number of matching attributes.
https://doi.org/10.3837/tiis.2008.03.002
