• Title/Summary/Keyword: Contents-based retrieval

Improving the Performance of the User Creative Contents Retrieval Using Content Reputation and User Reputation (콘텐츠 명성 및 사용자 명성 평가를 이용한 UCC 검색 품질 개선)

  • Bae, Won-Sik;Cha, Jeong-Won
    • Journal of the Korea Society for Simulation / v.19 no.1 / pp.83-90 / 2010
  • We describe a novel method for improving UCC retrieval using content reputation and user reputation. UCC retrieval is a branch of information retrieval: whereas an information retrieval system aims to find the documents users want, a UCC retrieval system aims to find the UCCs themselves. Unlike a document, a UCC carries little textual information, so we use content reputation and user reputation, both derived from non-textual information, to improve retrieval performance. We evaluate content reputation using information about the UCC itself and the social activities among users related to it. We evaluate user reputation using the social activities between users, or between users and UCCs: we build a network of users and UCCs from these activities and derive user reputation from the network with graph algorithms. We collect user and UCC information from YouTube, implement two systems, one based on content reputation and one based on user reputation, and compare them. The experimental results show that the system using content reputation outperforms the system using user reputation. We expect these results to be useful for UCC retrieval in the future.
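
The abstract says only that user reputation is derived from the user/UCC network "by graph algorithms" and does not name one. As a rough illustration, the sketch below assumes PageRank as that algorithm; the edge list, node names, and damping factor are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


# Hypothetical social activities as (actor, recipient) pairs, for example
# "user subscribes to user" or "user favorites a video".
activities = [
    ("user_a", "user_b"),
    ("user_c", "user_b"),
    ("user_b", "video_1"),
    ("user_a", "video_1"),
    ("user_c", "video_2"),
]
scores = pagerank(activities)
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
```

In this toy network, nodes that receive activity from well-regarded users end up with higher scores, which is the intuition behind a network-derived reputation; how the paper combines such a score with text relevance is not stated in the abstract.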

A Study on Layout Extraction from Internet Documents Through Xpath (Xpath에 의한 인터넷 문서의 레이아웃 추출 방법에 관한 연구)

  • Han Kwang-Rok;Sun Bok-Keun
    • The Journal of the Korea Contents Association / v.5 no.4 / pp.237-244 / 2005
  • Most Internet documents, including news pages, are currently built from predefined templates, but these templates are usually designed only for the main data and do not help information retrieval distinguish indexes, advertisements, headers, and similar elements. Such templates are therefore not appropriate when Internet documents serve as data for information retrieval. To process Internet documents in the various areas of information retrieval, it is necessary to detect additional information such as advertisements and page indexes. This study therefore proposes a method for detecting the layout of web pages by identifying the characteristics and structure of the block tags that affect page layout and by calculating distances between web pages. In our experiment, we successfully extracted 640 documents out of 1,000 samples, a recall rate of 64%. The method is intended to reduce the cost and improve the efficiency of automatic web document processing by serving as a preprocessing step for information retrieval tasks such as data extraction and document summarization.

CBIR-based Data Augmentation and Its Application to Deep Learning (CBIR 기반 데이터 확장을 이용한 딥 러닝 기술)

  • Kim, Sesong;Jung, Seung-Won
    • Journal of Broadcast Engineering / v.23 no.3 / pp.403-408 / 2018
  • Deep learning generally requires a large data set for training. Because large data sets are not easy to create, many techniques enlarge small data sets through data augmentation such as rotation, flipping, and filtering. These simple techniques, however, are limited in how far they can extend a data set, because the generated samples can hardly escape the features the original data already possess. To solve this problem, we propose a method that acquires new image data by using the existing data: existing images serve as queries to content-based image retrieval (CBIR), and the similar images retrieved are added to the data set. Finally, we compare the performance of the base model with that of the model trained using CBIR.
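
The retrieval step described above can be sketched as a nearest-neighbor search over image feature vectors. The abstract does not state which features or similarity measure the authors use, so the sketch assumes precomputed feature vectors and cosine similarity; the function names, the `min_similarity` threshold, and the toy data are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def augment_with_cbir(labeled, pool, top_k=1, min_similarity=0.9):
    """Use each labeled image as a query and return (pool_id, label) pairs
    for the most similar unlabeled pool images."""
    added = {}
    for features, label in labeled:
        ranked = sorted(
            pool.items(),
            key=lambda item: cosine_similarity(features, item[1]),
            reverse=True,
        )
        for image_id, candidate in ranked[:top_k]:
            # Only accept retrieved images that are close enough to the query.
            if cosine_similarity(features, candidate) >= min_similarity:
                added.setdefault(image_id, label)
    return sorted(added.items())


# Toy 3-dimensional feature vectors standing in for real image descriptors.
labeled = [([0.9, 0.1, 0.0], "cat"), ([0.0, 0.2, 0.9], "dog")]
pool = {
    "img_001": [0.8, 0.2, 0.1],
    "img_002": [0.1, 0.1, 0.95],
    "img_003": [0.3, 0.9, 0.2],
}
new_samples = augment_with_cbir(labeled, pool)
print(new_samples)
```

Unlike rotation or flipping, the retrieved images are genuinely new samples, which is the motivation the abstract gives; the similarity threshold is one simple way to limit label noise from poor matches.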

Textile image retrieval integrating contents, emotion and metadata (내용, 감성, 메타데이터의 결합을 이용한 텍스타일 영상 검색)

  • Lee, Kyoung-Mi;Park, U-Chang;Lee, Eun-Ok;Kwon, Hye-Young;Cha, Eun-MI
    • Journal of Internet Computing and Services / v.9 no.5 / pp.99-108 / 2008
  • This paper proposes an image retrieval system that integrates metadata, contents, and emotions in textile images. The proposed system first searches images using metadata. Among the images found, it then retrieves similar images based on color histogram, color sketch, and emotion histogram. To extract emotion features, this paper uses the emotion colors proposed by H. Nagumo for 160 emotion words. For the user's convenience, the system also provides additional functions such as enlarging an image and viewing its color histogram, color sketch, and repeated patterns.
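
The two-stage search described above (metadata filter first, then content similarity) can be sketched as follows. Only the color histogram stage is shown; the color sketch and emotion histogram are omitted. Histogram intersection is assumed as the similarity measure, and the catalog layout, field names, and bin count are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram of an iterable of (r, g, b) pixels (0-255 per channel)."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[index] += 1.0
        count += 1
    return [value / count for value in hist] if count else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def search(catalog, query_pixels, **metadata):
    """Stage 1: keep items whose metadata matches. Stage 2: rank by color similarity."""
    query_hist = color_histogram(query_pixels)
    candidates = [
        item for item in catalog
        if all(item["meta"].get(key) == value for key, value in metadata.items())
    ]
    scored = [
        (histogram_intersection(query_hist, color_histogram(item["pixels"])), item["name"])
        for item in candidates
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical textile catalog with tiny pixel lists standing in for images.
catalog = [
    {"name": "red_stripe", "meta": {"pattern": "stripe"},
     "pixels": [(250, 10, 10)] * 3 + [(255, 255, 255)]},
    {"name": "blue_stripe", "meta": {"pattern": "stripe"},
     "pixels": [(10, 10, 250)] * 4},
    {"name": "red_floral", "meta": {"pattern": "floral"},
     "pixels": [(250, 10, 10)] * 4},
]
results = search(catalog, [(240, 20, 20)] * 4, pattern="stripe")
print(results)
```

Filtering by metadata before comparing histograms keeps the more expensive content comparison limited to a small candidate set.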

A Comparison of Commands in Online Information Retrieval Systems (온라인 정보검색 시스템의 명령어비교)

  • 김정현
    • Journal of Korean Library and Information Science Society / v.15 / pp.207-243 / 1988
  • This study attempts to provide data that help users make maximum use of online information retrieval systems, based on a comparative analysis of the commands and operating terms employed by DIALOG, ORBIT, and BRS, the three largest information retrieval systems in the world. The second chapter first gives an overview of the historical development, service contents, and retrieval functions of the three systems; on that basis, the concrete functions and contents of their commands and operating terms are compared and analyzed. Commands are divided into basic and special ones. The basic commands are subdivided into 44 items and presented in a diagram (Table 1). The special commands are first subdivided into search commands, output-related commands, and supplementary commands; 14 items among them are then comparatively analyzed and presented as a chart (Table 2). In addition, the stand-alone commands employed by each system are analyzed and presented as a chart (Table 3). The operating terms other than commands are subdivided into 16 items and presented as a chart (Table 4). Passwords, inverted files, search step numbers within a search session, and similar topics are explained in detail. The evident conclusion is that a thorough understanding of the commands is essential to making maximum use of all the functions of each of the three systems.

Multi-class Support Vector Machines Model Based Clustering for Hierarchical Document Categorization in Big Data Environment (빅 데이터 환경에서 계층적 문서 유형 분류를 위한 클러스터링 기반 다중 SVM 모델)

  • Kim, Young Soo;Lee, Byoung Yup
    • The Journal of the Korea Contents Association / v.17 no.11 / pp.600-608 / 2017
  • Data volumes have recently been growing exponentially with the rapid expansion of the Internet. Because users need only part of all this information, they carry a heavy workload in examining and discovering the contents they need. Information retrieval must therefore provide hierarchical class information and an examination priority by evaluating the similarity between the query and the documents. In this paper, we propose a clustering-based multi-class support vector machine (SVM) model for hierarchical document categorization that makes semantic search possible by taking word co-occurrence measures into account. Combining hierarchical document categorization with an SVM classifier gives high performance in the analytical classification of web documents, whose number increases exponentially as the document hierarchy is extended. We expect more information retrieval systems to adopt the proposed model in their development and thereby provide an accurate and rapid information retrieval service.

Conjoined Audio Fingerprint based on Interhash and Intra hash Algorithms

  • Kim, Dae-Jin;Choi, Hong-Sub
    • International Journal of Contents / v.11 no.4 / pp.1-6 / 2015
  • In practice, the most important performance parameters for a music information retrieval (MIR) service are the robustness of the fingerprint in real noise environments and the recognition accuracy when the obtained query clips are matched with an entry in the database. To satisfy these conditions, we propose a conjoined fingerprint algorithm for use in a massive MIR service. The conjoined fingerprint scheme uses interhash and intrahash algorithms to produce a fingerprint that is robust in real noise environments. Because the interhash and intrahash algorithms are masked in the predominant pitch estimation, a compact fingerprint can be produced through their relationship. Experimental performance comparisons showed that our algorithm was superior to existing algorithms, namely the sub-mask and Philips algorithms, in real noise environments.

Image Retrieval using Contents and Location of Multiple Region-of-Interest (다중 관심영역의 내용과 위치를 이용한 이미지 검색)

  • Lee, Jong-Won
    • Proceedings of the Korean Society of Computer Information Conference / 2011.06a / pp.355-358 / 2011
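  • This paper proposes an image retrieval method that jointly considers the feature values representing the content of the regions of interest (ROIs) a user selects in an image and the locations of those regions. The proposed method divides each target image into blocks of a fixed size and then selects the blocks whose features are closest to the multiple ROIs chosen by the user. The feature value of a block is given by the MPEG-7 dominant color descriptor. The feature values and positions of the user-selected blocks are measured and then compared with the feature values and positions of the blocks in each target image to compute similarity. In the experiments, the proposed method, which considers both the content and the location of multiple ROIs, outperformed both global image retrieval and a comparison restricted to blocks at the same position.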
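
The block-based matching described above can be sketched as follows. The mean block color is used here as a simplified stand-in for the MPEG-7 dominant color descriptor, and the way color and position distances are combined (a weighted sum controlled by a hypothetical `position_weight`) is an assumption, since the abstract does not give the similarity formula.

```python
import math

# Largest possible distance between two RGB colors, used to normalize color distance.
MAX_COLOR_DISTANCE = math.dist((0, 0, 0), (255, 255, 255))


def block_features(image, block_size):
    """Map (block_row, block_col) to the mean RGB color of that block.

    The mean color is a simplified stand-in for the MPEG-7 dominant color descriptor.
    """
    height, width = len(image), len(image[0])
    features = {}
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            pixels = [
                image[y][x]
                for y in range(top, min(top + block_size, height))
                for x in range(left, min(left + block_size, width))
            ]
            mean = tuple(sum(p[c] for p in pixels) / len(pixels) for c in range(3))
            features[(top // block_size, left // block_size)] = mean
    return features


def roi_distance(query_rois, target_blocks, position_weight=0.5):
    """Sum, over the query ROIs, of the best combined color and position
    distance to any block of the target image (lower means more similar)."""
    total = 0.0
    for position, color in query_rois.items():
        total += min(
            (1 - position_weight) * math.dist(color, target_color) / MAX_COLOR_DISTANCE
            + position_weight * math.dist(position, target_position)
            for target_position, target_color in target_blocks.items()
        )
    return total


RED, BLUE, GRAY = (255, 0, 0), (0, 0, 255), (128, 128, 128)
query_image = [[RED, GRAY], [GRAY, BLUE]]
same_layout = [[RED, GRAY], [GRAY, BLUE]]
swapped_layout = [[BLUE, GRAY], [GRAY, RED]]

# The user selects two ROIs: the top-left and bottom-right blocks of the query image.
query_blocks = block_features(query_image, block_size=1)
query_rois = {pos: query_blocks[pos] for pos in [(0, 0), (1, 1)]}

d_same = roi_distance(query_rois, block_features(same_layout, 1))
d_swapped = roi_distance(query_rois, block_features(swapped_layout, 1))
print(d_same, d_swapped)
```

Both target images contain the same colors, so a purely global color comparison could not tell them apart; the position term is what ranks the image with the matching layout first.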

The Noise Robust Algorithm to Detect the Starting Point of Music for Content Based Music Retrieval System (노이즈에 강인한 음악 시작점 검출 알고리즘)

  • Kim, Jung-Soo;Sung, Bo-Kyung;Koo, Kwang-Hyo;Ko, Il-Ju
    • Journal of the Korea Society of Computer and Information / v.14 no.9 / pp.95-104 / 2009
  • This paper proposes a noise-robust algorithm for detecting the starting point of music. In a content-based music retrieval system, detecting the starting point is necessary to avoid wasted computation and to solve the comparison problem caused by inconsistent input data. Such detection is even more necessary for the time-sequential retrieval method, which compares data in temporal order. The time-sequential method has the advantage that retrieval is fast, since it performs a simple comparison in temporal order, but the drawback that the data being compared must start at the same time. Digitized music, however, cannot guarantee identical starting times after bit rate conversion. This paper therefore detects the starting point of the music in the pre-processing stage of retrieval, so that the recognition rate does not decrease even when high-speed retrieval is performed with the time-sequential method. Starting point detection uses a minimum wave model that can detect effective sound and, for robustness against noise, the noise present in silent segments is swapped. The proposed algorithm performed about 38% better than retrieval without starting point detection and was verified to be robust against noise.
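
The abstract does not describe the minimum wave model in enough detail to reproduce it. As a generic illustration of starting point detection only, the sketch below swaps in a simple short-time energy threshold relative to an estimated noise floor; the frame size, threshold ratio, and sample values are hypothetical.

```python
def detect_start(samples, frame_size=4, threshold_ratio=4.0, noise_frames=2):
    """Return the sample index where the first frame clearly louder than the
    leading noise begins, or None if no such frame exists.

    This is a plain energy-threshold detector, not the paper's minimum wave model.
    """
    frames = [
        samples[i:i + frame_size]
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    energies = [sum(s * s for s in frame) / frame_size for frame in frames]
    if len(energies) <= noise_frames:
        return None

    # Estimate the noise floor from the first few frames, assumed to contain no music.
    noise_floor = sum(energies[:noise_frames]) / noise_frames
    threshold = max(noise_floor * threshold_ratio, 1e-12)

    for index, energy in enumerate(energies[noise_frames:], start=noise_frames):
        if energy > threshold:
            return index * frame_size
    return None


# Low-level noise before the music, followed by a much louder signal.
noisy_lead_in = [0.01, -0.02, 0.01, 0.0, -0.01, 0.02, 0.0, -0.01, 0.01, -0.01, 0.02, 0.0]
music = [0.5, -0.6, 0.7, -0.4, 0.6, -0.5, 0.4, -0.7]
start = detect_start(noisy_lead_in + music)
print(start)
```

Trimming each signal to its detected start gives a time-sequential comparison a common origin, which is the role the abstract assigns to starting point detection in the pre-processing stage.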

Content Based Image Retrieval using 8AB Representation of Spatial Relations between Objects (객체 위치 관계의 8AB 표현을 이용한 내용 기반 영상 검색 기법)

  • Joo, Chan-Hye;Chung, Chin-Wan;Park, Ho-Hyun;Lee, Seok-Lyong;Kim, Sang-Hee
    • Journal of KIISE:Databases / v.34 no.4 / pp.304-314 / 2007
  • Content-Based Image Retrieval (CBIR) stores and retrieves images using feature descriptions of image contents. To support more accurate image retrieval, it has become necessary to develop features that can effectively describe image contents. The commonly used low-level features, such as color, texture, and shape, may not map directly to human visual perception. In addition, such features cannot effectively describe a single image that contains multiple objects of interest. As a result, research on feature descriptions has shifted toward higher-level features, such as spatial relationships between objects, which support representations closer to human visual perception. Nevertheless, prior work on the representation of spatial relations still has shortcomings, particularly with respect to supporting rotational invariance. Rotational invariance is a key requirement for a feature description to provide robust and accurate retrieval of images. This paper proposes a high-level feature named 8AB (8 Angular Bin) that effectively describes the spatial relations of objects in an image while providing rotational invariance. A similarity calculation and a retrieval technique based on this representation are also proposed. In addition, this paper proposes a search-space pruning technique that supports efficient image retrieval using the 8AB feature. The 8AB feature is incorporated into a CBIR system, and experiments on both real and synthetic image sets show the effectiveness of 8AB as a high-level feature and the efficiency of the pruning technique.