• Title/Summary/Keyword: automatic indexing


Hierarchic Document Clustering in OPAC (OPAC에서 자동분류 열람을 위한 계층 클러스터링 연구)

  • 노정순
    • Journal of the Korean Society for Information Management / v.21 no.1 / pp.93-117 / 2004
  • This study develops a hierarchic clustering model for document classification and browsing in OPAC systems. Two automatic indexing techniques (with and without controlled terms), two term weighting methods (based on term frequency and binary weight), five similarity coefficients (Dice, Jaccard, Pearson, Cosine, and Squared Euclidean), and three hierarchic clustering algorithms (Between Average Linkage, Within Average Linkage, and Complete Linkage) were tested on a document collection of 175 books and theses on library and information science. The best document clusters resulted from the Between Average Linkage or Complete Linkage method with the Jaccard or Dice coefficient, applied to automatic indexing with controlled terms in binary vectors. The clusters from Between Average Linkage with Jaccard more closely resembled a decimal classification structure.
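The Jaccard and Dice coefficients named in this abstract can be illustrated on binary term vectors. The following is a minimal sketch, assuming each document is represented as a set of index terms (equivalent to a binary vector); the function names and sample terms are hypothetical and are not taken from the paper.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a: set, b: set) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


# Hypothetical index terms for two documents (binary weighting:
# a term is either present in the set or absent).
doc1 = {"indexing", "clustering", "opac"}
doc2 = {"indexing", "classification", "opac", "browsing"}

jaccard_score = jaccard(doc1, doc2)  # 2 shared terms out of 5 distinct terms
dice_score = dice(doc1, doc2)        # 2 * 2 shared terms over 3 + 4 terms
```

Both coefficients ignore terms absent from both documents, which suits sparse binary index vectors; Dice weights the shared terms more heavily than Jaccard does.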

A Study on Automatic Text Categorization of Web-Based Query Using Synonymy List (유사어 사전을 이용한 웹기반 질의문의 자동 범주화에 관한 연구)

  • Nam, Young-Joon;Kim, Gyu-Hwan
    • Journal of Information Management / v.35 no.4 / pp.81-105 / 2004
  • In this study, a method for automatic text categorization of web-based queries was implemented. Chi-square (χ²) methods based on the Support Vector Machine were used to test the efficiency of text categorization on queries. The test was carried out with a model using the Synonymy List; 713 synonyms were extracted manually from the tested documents. As a result, the precision ratio and the recall ratio changed by -0.01% and 8.53%, respectively, depending on whether the synonyms were assigned. The F1 measure increased by 4.58%, and the standard deviation between the recall and precision ratios improved by 18.39%.
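For reference, the F1 measure reported in this abstract is the harmonic mean of precision and recall. The following is a minimal sketch; the function name and the input values are hypothetical and are not results from the paper.

```python
def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values chosen only to illustrate the formula.
score = f1_measure(0.5, 0.25)
```

Because it is a harmonic mean, F1 stays low unless precision and recall are both reasonably high, which is why a gain in recall can raise F1 even when precision barely moves.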

Object segmentation and object-based surveillance video indexing

  • Kim, Jin-Woong;Kim, Mun-Churl;Lee, Kyu-Won;Kim, Jae-Gon;Ahn, Chie-Teuk
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1999.06a / pp.165.1-170 / 1999
  • Object segmentation from natural video scenes has recently become a very active research topic due to the object-based video coding standard MPEG-4. Object detection and isolation are also useful for object-based indexing and search of video content, which is a goal of the emerging standard MPEG-7. In this paper, an automatic method for segmenting moving objects in image sequences is presented that is applicable to multimedia content authoring for MPEG-4, and two segmentation approaches suitable for surveillance applications are addressed, in the raw data domain and the compressed bitstream domain. We also propose an object-based video description scheme based on object segmentation for video indexing purposes.

Face Detection and Matching for Video Indexing (비디오 인덱싱을 위한 얼굴 검출 및 매칭)

  • Islam Mohammad Khairul;Lee Sun-Tak;Yun Jae-Yoong;Baek Joong-Hwan
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2006.06a / pp.45-48 / 2006
  • This paper presents an approach to temporal indexing of video sequences based on visual information. The objective of this work is to integrate automatic face detection and a matching system for video indexing. Face detection is done using color information. The matching stage is based on Principal Component Analysis (PCA) followed by the Minimax Probability Machine (MPM). Using PCA, one feature vector is calculated for each face detected in the previous stage, and MPM is applied to these feature vectors to match them against training faces that were manually indexed after extraction from video sequences. The integration of the two stages gives good results. The rate of 86.3% correctly classified frames shows the efficiency of our system.


Video Indexing using Motion vector and brightness features (움직임 벡터와 빛의 특징을 이용한 비디오 인덱스)

  • 이재현;조진선
    • Journal of the Korea Society of Computer and Information / v.3 no.4 / pp.27-34 / 1998
  • In this paper we present a method for automatic video indexing and retrieval based on motion vectors and brightness. We extract a representative frame (R-frame) from each shot and compute motion-vector and brightness features. For each R-frame we compute the optical flow field, from which motion vector features are derived; a block matching algorithm (BMA) is used to find the motion vectors, and the brightness features are related to cut detection using a brightness histogram. A video database provides content-based access to video, which is achieved by organizing or indexing video data based on some set of features. In this paper the feature index is based on a B+ search tree, which consists of internal and leaf nodes stored on a direct-access storage device. This paper defines the problem of video indexing based on video data models.
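The block matching algorithm (BMA) mentioned in this abstract can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD). This is only an illustration under stated assumptions: frames are small grayscale grids stored as lists of rows, the cost function is SAD, and the block size, search range, and function names are hypothetical. The abstract does not specify the authors' actual variant.

```python
def sad(prev, cur, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `prev`."""
    return sum(
        abs(cur[by + y][bx + x] - prev[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def block_match(prev, cur, bx, by, size=2, search=1):
    """Return the displacement (dx, dy) into `prev` that best matches
    the block at (bx, by) in `cur`, searching +/- `search` pixels."""
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate blocks that fall outside the previous frame.
            inside_y = 0 <= by + dy and by + dy + size <= height
            inside_x = 0 <= bx + dx and bx + dx + size <= width
            if not (inside_x and inside_y):
                continue
            cost = sad(prev, cur, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


# A bright 2x2 patch moves one pixel to the right between frames.
prev_frame = [[0, 0, 0, 0], [9, 9, 0, 0], [9, 9, 0, 0], [0, 0, 0, 0]]
cur_frame = [[0, 0, 0, 0], [0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0]]

# The vector points from the current block back to its source in prev_frame.
vector = block_match(prev_frame, cur_frame, bx=1, by=1)
```

In this sketch the returned vector points backward in time, so a patch that moved right yields a negative horizontal component.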


The Development of an Automatic Indexing System based on a Thesaurus (시소러스를 기반으로 하는 자동색인 시스템에 관한 연구)

  • 임형묵;정상철
    • Korean Journal of Cognitive Science / v.4 no.1 / pp.213-242 / 1993
  • During the past decades, several automatic indexing systems have been developed, such as single-term indexing, phrase indexing, and thesaurus-based indexing systems. Among these, single-term indexing has been known to be superior to the others despite its simplicity in extracting meaningful terms. Thesaurus-based indexing, on the other hand, has been regarded as producing a low retrieval rate, mainly because thesauri usually do not have enough index terms, so much of the text fails to be indexed when it does not match any index term in the thesaurus. This paper develops a thesaurus-based indexing system, THINS, that yields a higher retrieval rate than other systems by syntactically analyzing text and partially matching it against index terms in the thesaurus. First, the system analyzes the input text syntactically using the machine translation system MATES/EK and extracts noun phrases. After deleting stop words from the noun phrases and stemming the remaining words, it tries to index them with similar index terms in the thesaurus as far as possible. We conduct an experiment with the CACM data set that compares the retrieval effectiveness of THINS with that of single-term indexing under HYKIS, a thesaurus-based information retrieval system. THINS yields about 10 percent higher precision than single-term indexing, while showing 8 to 9 percent lower recall. This is a considerable improvement over previous thesaurus-based systems, which yielded 25 to 30 percent lower precision than single-term indexing. We also argue that the relatively lower recall is caused by the fact that CRCS, the thesaurus included in the CACM data set, is very incomplete, having only slightly more than one thousand terms; THINS is therefore expected to achieve a much higher rate if it is paired with a currently available large thesaurus.
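The pipeline described in this abstract (stop-word removal, stemming, then partial matching against thesaurus terms) can be sketched roughly as follows. This is not the THINS implementation: the stop-word list, the crude suffix-stripping stemmer, the "share at least one stemmed word" matching rule, and the sample thesaurus are all assumptions made for illustration, and the syntactic analysis step that extracts noun phrases is omitted.

```python
# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "of", "a", "an", "for", "in", "and"}


def stem(word: str) -> str:
    """Crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(phrase: str) -> set:
    """Lowercase, drop stop words, and stem the remaining words."""
    return {stem(w) for w in phrase.lower().split() if w not in STOP_WORDS}


def partial_matches(noun_phrase: str, thesaurus: list) -> list:
    """Return thesaurus terms sharing at least one stemmed word
    with the noun phrase (the assumed partial-match rule)."""
    tokens = normalize(noun_phrase)
    return [term for term in thesaurus if tokens & normalize(term)]


# Hypothetical thesaurus terms and input noun phrase.
thesaurus = ["information retrieval", "operating systems", "sorting"]
matches = partial_matches("the retrieval of documents", thesaurus)
```

Partial matching of this kind lets a phrase be indexed even when no thesaurus term matches it exactly, which is the gap the abstract attributes to thesauri with too few index terms.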

An Experimental Study on Opinion Classification Using Supervised Latent Semantic Indexing(LSI) (지도적 잠재의미색인(LSI)기법을 이용한 의견 문서 자동 분류에 관한 실험적 연구)

  • Lee, Ji-Hye;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.26 no.3 / pp.451-462 / 2009
  • The aim of this study is to apply latent semantic indexing (LSI) techniques to the efficient automatic classification of opinionated documents. For the experiments, we collected 1,000 opinionated documents such as reviews and news, with 500 labelled as positive and the remaining 500 as negative. Sets of content words and sentiment words were extracted using a POS tagger in order to identify the optimal feature set for opinion classification. The findings showed that LSI techniques were more effective than a term indexing method in sentiment classification. The best performance was achieved by a supervised LSI technique.

A Practical Digital Video Database based on Language and Image Analysis

  • Liang, Yiqing
    • Proceedings of the Korea Database Society Conference / 1997.10a / pp.24-48 / 1997
  • Supported by: DARPA's Image Understanding (IU) program under the "Video Retrieval Based on Language and Image Analysis" project; DARPA's Computer Assisted Education and Training Initiative (CAETI) program. Objective: develop practical systems for automatic understanding and indexing of video sequences using both audio and video tracks (omitted)


Scene Change Detection Using Cumulative Histogram and Edge Information (누적 히스토그램과 에지 정보를 이용한 장면 전환 검출)

  • 황두선;이종설;조위덕;문영식
    • Proceedings of the IEEK Conference / 2002.06c / pp.211-214 / 2002
  • Automatic video partitioning is the first step for content-based indexing and retrieval of video data. In this paper, an efficient algorithm for scene change detection is proposed, where cumulative histogram and edge information are utilized. Experimental results have shown the effectiveness of the proposed algorithm.
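The cumulative-histogram part of this approach might be sketched as below. The abstract gives no algorithmic detail, so everything here is an assumption: frames are flat lists of 8-bit gray levels, consecutive frames are compared by the summed absolute difference of their normalized cumulative histograms, and the bin count, threshold, and function names are hypothetical. The edge-information component is not modelled.

```python
from itertools import accumulate


def cumulative_histogram(frame, bins=4, max_value=256):
    """Normalized cumulative histogram of a non-empty frame of gray levels."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    total = len(frame)
    return [c / total for c in accumulate(counts)]


def scene_changes(frames, threshold=0.5):
    """Indices of frames whose cumulative histogram differs from the
    previous frame's by more than `threshold`."""
    changes = []
    prev = cumulative_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = cumulative_histogram(frames[i])
        distance = sum(abs(a - b) for a, b in zip(prev, cur))
        if distance > threshold:
            changes.append(i)
        prev = cur
    return changes


# Two dark frames followed by two bright frames: one cut at index 2.
dark = [10, 20, 30, 40]
bright = [200, 210, 220, 230]
cuts = scene_changes([dark, dark, bright, bright])
```

Comparing cumulative rather than plain histograms makes the distance grow with how far the brightness distribution shifts, not just with how many pixels change bins.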


A Study on the Indexing Standard and Automatic Generation of Back-of-Book Indexes (도서권말색인의 작성지침과 자동생성에 관한 연구)

  • 김효열;정영미
    • Proceedings of the Korean Society for Information Management Conference / 1995.08a / pp.7-10 / 1995
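  • To address various problems with Korean back-of-book indexes, this paper analyzes existing back-of-book indexes and develops guidelines for creating Korean back-of-book indexes. It also designs and implements a system that automatically generates index headings, so that indexing takes less time while still producing exhaustive headings.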
