• Title/Summary/Keyword: Map Retrieval

A Content-based Audio Retrieval System Supporting Efficient Expansion of Audio Database (음원 데이터베이스의 효율적 확장을 지원하는 내용 기반 음원 검색 시스템)

  • Park, Ji Hun;Kang, Hyunchul
    • Journal of Digital Contents Society / v.18 no.5 / pp.811-820 / 2017
  • Content-based audio retrieval, one of the main functions of an audio service, widely relies on extracting fingerprints from the audio source and storing and indexing them in a database. However, as fingerprints of new audio sources are continually inserted into the database, both space efficiency and retrieval performance gradually deteriorate. Techniques are therefore needed that support efficient expansion of the audio database without periodic reorganization, which would increase the system's operating cost. In this paper, we design a content-based audio retrieval system that solves this problem by using MapReduce and a NoSQL database in a cluster computing environment, based on Shazam's fingerprinting algorithm, and evaluate its performance through a detailed set of experiments on real-world audio data.
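
As a rough illustration of the fingerprint indexing that this abstract refers to, the sketch below pairs spectrogram peaks into hashes and matches a query by voting on time offsets, in the general style of Shazam-like fingerprinting. It is an in-memory toy, not the paper's MapReduce/NoSQL system. Peak extraction is assumed to have been done already, and the names `fingerprints`, `add_track`, `best_match`, the `fan_out` parameter, and the peak data are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Tuple

Peak = Tuple[int, int]            # (time_frame, frequency_bin), assumed pre-extracted
Hash = Tuple[int, int, int]       # (anchor_freq, target_freq, time_delta)
Index = Dict[Hash, List[Tuple[str, int]]]


def fingerprints(peaks: List[Peak], fan_out: int = 3) -> List[Tuple[Hash, int]]:
    """Pair each anchor peak with the next few peaks in time order."""
    ordered = sorted(peaks)
    result = []
    for i, (t1, f1) in enumerate(ordered):
        for t2, f2 in ordered[i + 1 : i + 1 + fan_out]:
            result.append(((f1, f2, t2 - t1), t1))
    return result


def add_track(index: Index, track_id: str, peaks: List[Peak]) -> None:
    """Insert every hash of a track together with its anchor time."""
    for h, t in fingerprints(peaks):
        index[h].append((track_id, t))


def best_match(index: Index, query_peaks: List[Peak]) -> Optional[str]:
    """Vote on (track, time offset); a true match piles votes on one offset."""
    votes: Counter = Counter()
    for h, t_query in fingerprints(query_peaks):
        for track_id, t_db in index.get(h, []):
            votes[(track_id, t_db - t_query)] += 1
    if not votes:
        return None
    (track_id, _offset), _count = votes.most_common(1)[0]
    return track_id


# Hypothetical toy data: two tracks and a query cut from song_a at frame 2.
index: Index = defaultdict(list)
add_track(index, "song_a", [(0, 10), (1, 20), (2, 15), (3, 30), (4, 25)])
add_track(index, "song_b", [(0, 11), (1, 21), (2, 16), (3, 31)])
query = [(0, 15), (1, 30), (2, 25)]
print(best_match(index, query))
```

New tracks are added by appending to the posting lists, which hints at why a store with cheap appends suits a continually growing database.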

A Wikipedia-based Query Expansion Method for In-depth Blog Distillation (주제를 깊이 있게 다루는 블로그 피드 검색을 위한 위키피디아 기반 질의 확장 방법)

  • Song, Woo-Sang;Lee, Ye-Ha;Lee, Jong-Hyeok;Yang, Gi-Joo
    • Journal of KIISE:Computing Practices and Letters / v.16 no.11 / pp.1121-1125 / 2010
  • This paper proposes a Wikipedia-based feedback method for in-depth blog distillation, whose goal is to find blogs that offer in-depth thoughts or analysis on a given query. The proposed method uses Wikipedia articles that are relevant to the query. The TREC Blogs08 collection, a large-scale blog corpus, and an English Wikipedia dump were used for the experiments. The proposed method significantly improved retrieval performance, including MAP (mean average precision), over the conventional post-based feedback method.
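
For readers unfamiliar with the MAP figure mentioned here, the following minimal sketch computes mean average precision in its standard form: the mean, over queries, of the precision values at each rank where a relevant item appears, divided by the number of relevant items. The function names and the toy runs and judgments are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against the relevant ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    total = sum(average_precision(runs.get(q, []), rel) for q, rel in qrels.items())
    return total / len(qrels)


# Hypothetical toy data: two queries with made-up blog feed ids.
runs = {"q1": ["b1", "b2", "b3"], "q2": ["b4", "b5"]}
qrels = {"q1": {"b1", "b3"}, "q2": {"b5"}}
print(mean_average_precision(runs, qrels))
```

In the toy data, q1 scores (1/1 + 2/3) / 2 and q2 scores 1/2, so the mean is 2/3.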

PDFindexer: Distributed PDF Indexing system using MapReduce

  • Murtazaev, JAziz;Kihm, Jang-Su;Oh, Sangyoon
    • International Journal of Internet, Broadcasting and Communication / v.4 no.1 / pp.13-17 / 2012
  • Indexing converts a raw document collection into an easily searchable representation. Web search engines such as Google and Yahoo provide subsecond response times, made possible by efficient indexing of pages across the entire Web. Indexing becomes challenging as the scale grows. Parallel techniques such as the MapReduce framework can assist in efficient large-scale indexing. In this paper we propose PDFindexer, a system for indexing scientific papers in PDF using the MapReduce programming model. Unlike Web search engines, our target domain is scientific papers, which have a pre-defined structure such as title, abstract, sections, and references. The proposed system parses scientific papers in PDF, recreates their structure, and performs efficient distributed indexing with the MapReduce framework on a cluster of nodes. We give an overview of the system, its components, and the interactions among them. We discuss some issues related to the design of the system and the use of MapReduce in parsing and indexing large document collections.
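
To make the MapReduce indexing idea concrete, here is a minimal single-process sketch of building an inverted index with explicit map, shuffle, and reduce steps. It only imitates the programming model in plain Python and does not reproduce PDFindexer itself. PDF parsing is assumed to have happened already, and the function names and sample documents are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(doc_id: str, text: str) -> Iterator[Tuple[str, str]]:
    """Emit one (term, doc_id) pair per distinct term in a document."""
    for term in set(text.lower().split()):
        yield term, doc_id


def shuffle(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group values by key, as a framework does between map and reduce."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for term, doc_id in pairs:
        grouped[term].append(doc_id)
    return grouped


def reduce_phase(term: str, doc_ids: List[str]) -> Tuple[str, List[str]]:
    """Build the sorted, de-duplicated posting list for one term."""
    return term, sorted(set(doc_ids))


def build_index(docs: Dict[str, str]) -> Dict[str, List[str]]:
    """Run map, shuffle, and reduce in sequence over all documents."""
    pairs = (p for doc_id, text in docs.items() for p in map_phase(doc_id, text))
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())


# Hypothetical input: text already extracted from two PDF papers.
docs = {
    "paper1": "distributed indexing with mapreduce",
    "paper2": "indexing scientific papers",
}
index = build_index(docs)
print(index["indexing"])
```

On a real cluster the map calls run in parallel across nodes and the shuffle is handled by the framework. A structure-aware variant could emit keys such as (section, term) instead of the bare term.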

Optimization Driven MapReduce Framework for Indexing and Retrieval of Big Data

  • Abdalla, Hemn Barzan;Ahmed, Awder Mohammed;Al Sibahee, Mustafa A.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.5 / pp.1886-1908 / 2020
  • With technical advances, the amount of big data is increasing day by day, to the point that traditional software tools struggle to handle it. Additionally, the presence of imbalanced data in big data is a major concern in research. To manage big data effectively and to deal with imbalanced data, this paper proposes a new indexing algorithm for retrieving big data in the MapReduce framework. In the mappers, data clustering is done with the Sparse Fuzzy-c-means (Sparse FCM) algorithm. The reducer combines the clusters generated by the mappers and again clusters the data with Sparse FCM. Two-level query matching then determines the requested data: the first level identifies the cluster, and the second level accesses the requested data within it. The data are ranked using the proposed Monarch chaotic whale optimization algorithm (M-CWOA), which combines Monarch butterfly optimization (MBO) [22] and the chaotic whale optimization algorithm (CWOA) [21]. Here, the Parametric Enabled-Similarity Measure (PESM) is adapted for measuring the similarity between two datasets. The proposed M-CWOA outperformed other methods, with a maximal precision of 0.9237, recall of 0.9371, and F1-score of 0.9223.
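
The two-level query matching described in this abstract can be sketched as follows. This is only an illustration of the control flow: plain Euclidean distance stands in for the paper's PESM, the clusters are given by hand rather than produced by Sparse FCM, and the M-CWOA ranking step is omitted. All names and data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance, used here in place of the paper's PESM."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def two_level_match(
    query: Vector,
    centroids: Dict[str, Vector],
    members: Dict[str, List[Tuple[str, Vector]]],
) -> Tuple[str, str]:
    """Return (cluster id, record id) of the closest record to the query."""
    # Level 1: choose the cluster whose centroid is nearest to the query.
    cluster = min(centroids, key=lambda c: distance(query, centroids[c]))
    # Level 2: search only the records inside that cluster.
    record_id, _ = min(members[cluster], key=lambda rec: distance(query, rec[1]))
    return cluster, record_id


# Hypothetical toy clusters standing in for the reducer output.
centroids = {"c0": (0.0, 0.0), "c1": (10.0, 10.0)}
members = {
    "c0": [("r1", (0.0, 1.0)), ("r2", (1.0, 0.0))],
    "c1": [("r3", (9.0, 9.0)), ("r4", (11.0, 10.0))],
}
print(two_level_match((10.5, 10.0), centroids, members))
```

The benefit of the first level is that the second level compares the query against one cluster's records rather than the whole dataset.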

Dominant Color Based Image Retrieval using Saliency Map (Saliency Map을 이용한 대표 색상 기반의 영상 검색)

  • An, Jae-Hyun;Lee, Sang-Hwa;Cho, Nam-Ik
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2013.11a / pp.213-216 / 2013
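  • For object-oriented color image retrieval, this paper proposes a method that uses the saliency map of an image to generate an object-centered image, and then exploits the statistical characteristics and spatial distribution of the dominant colors in the object and its surrounding region. First, the saliency map is binarized to segment the image into object and background, and a fixed-size image centered on the object with a constant object-to-background ratio is generated. Dominant colors are extracted from the generated image, and a binary spatial distribution map is built for each color to describe how it is distributed over the image. Images are then retrieved by comparing, for each dominant color, the difference between the binary spatial distributions of two images, giving a feature that reflects both the statistical characteristics and the spatial distribution of color. The proposed dominant color based image retrieval method using a saliency map outperforms existing dominant color based image retrieval.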

A Fast and Scalable Image Retrieval Algorithms by Leveraging Distributed Image Feature Extraction on MapReduce (MapReduce 기반 분산 이미지 특징점 추출을 활용한 빠르고 확장성 있는 이미지 검색 알고리즘)

  • Song, Hwan-Jun;Lee, Jin-Woo;Lee, Jae-Gil
    • Journal of KIISE / v.42 no.12 / pp.1474-1479 / 2015
  • With mobile devices showing marked improvement in performance in the age of the Internet of Things (IoT), there is demand for rapid processing of the extensive amount of multimedia big data. However, because research on image searching is focused mainly on increasing accuracy despite environmental changes, the development of fast processing of high-resolution multimedia data queries is slow and inefficient. Hence, we suggest a new distributed image search algorithm that ensures both high accuracy and rapid response by using feature extraction of distributed images based on MapReduce, and solves the problem of memory scalability based on BIRCH indexing. In addition, we conducted an experiment on the accuracy, processing time, and scalability of this algorithm to confirm its excellent performance.

Improved SIM Algorithm for Contents-based Image Retrieval (내용 기반 이미지 검색을 위한 개선된 SIM 방법)

  • Kim, Kwang-Baek
    • Journal of Intelligence and Information Systems / v.15 no.2 / pp.49-59 / 2009
  • Contents-based image retrieval methods are in general more objective and effective than text-based image retrieval algorithms, since they search on color and texture and avoid the need to annotate every image. SIM (Self-organizing Image browsing Map) is a contents-based image retrieval algorithm that uses only the browsable mapping results obtained by a SOM (Self Organizing Map). However, a SOM may select the wrong BMU (best matching unit) in the learning phase if there are similar nodes whose color information is distorted by light intensity or by the movement of objects in the image. Such images may be mapped to other grouping nodes, which can lower the search rate. In this paper, we propose an improved SIM that uses the HSV color model with color quantization to extract image features. To avoid the learning error described above, our SOM consists of two layers. In the learning phase, SOM layer 1 takes the color feature vectors as input. After SOM layer 1 has been trained, its connection weights become the input of SOM layer 2 and learning is repeated. With this multi-layered SOM learning, we can avoid mapping errors among similar nodes with different color information. In search, we feed the query image vector into SOM layer 2 and select the nodes of SOM layer 1 that are connected to the chosen BMU of SOM layer 2. In experiments, we verified that the proposed SIM outperformed the original SIM and avoided mapping errors effectively.
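
The search step over the two SOM layers can be sketched as below. Training is omitted and the weights are given by hand. The sketch assumes that a layer-1 node counts as "connected" to a layer-2 node when its weight vector has that layer-2 node as its BMU, which is one reading of the abstract rather than a confirmed detail of the paper. The function names and feature vectors are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(vector: Vector, weights: List[Vector]) -> int:
    """Index of the node whose weight vector is closest to the input."""
    return min(range(len(weights)), key=lambda i: squared_distance(vector, weights[i]))


def search(query: Vector, layer1: List[Vector], layer2: List[Vector]) -> List[int]:
    """Return the layer-1 nodes attached to the query's BMU in layer 2."""
    # Assumption: each layer-1 node belongs to the layer-2 node that is its BMU,
    # because layer-1 weights were the inputs used to train layer 2.
    assignment = [best_matching_unit(w, layer2) for w in layer1]
    target = best_matching_unit(query, layer2)
    return [i for i, bmu in enumerate(assignment) if bmu == target]


# Hypothetical trained weights over a 3-dimensional color feature.
layer1 = [(0.9, 0.1, 0.1), (0.8, 0.2, 0.1), (0.1, 0.1, 0.9)]
layer2 = [(0.85, 0.15, 0.1), (0.1, 0.1, 0.9)]
print(search((0.95, 0.1, 0.05), layer1, layer2))
```

Here the two reddish layer-1 nodes share a layer-2 BMU, so a query close to either of them retrieves both.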

A Content Retrieval Method Using Pictures Taken from a Display Robust to Partial Luminance Change (부분 휘도 변화에 강인한 영상 촬영 기반 콘텐츠 검색 방법)

  • Lee, Joo-Young;Kim, Youn-Hee;Nam, Je-Ho
    • Journal of Broadcast Engineering / v.16 no.3 / pp.427-438 / 2011
  • In this paper, we propose a content retrieval system that uses pictures taken from a display, aimed at more intelligent mobile services. We focus on search robustness by minimizing the influence of photographing conditions such as changes in illumination intensity. For efficient search and precise detection as well as robustness, we use a two-step comparison method based on indexing features and on a binary map built from the luminance and chrominance differences between adjacent blocks. We also evaluate the proposed algorithm by comparing it with existing algorithms, and we present the content retrieval system that we implemented using the proposed algorithm.
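
One simple way to picture a binary map built from differences between adjacent blocks is sketched below. It handles luminance only and compares each block with its right-hand neighbour, which is an assumed simplification and not the paper's exact feature. The block means, function names, and the brightness offset are hypothetical.

```python
from typing import List


def binary_map(block_means: List[List[float]]) -> List[int]:
    """Emit 1 where a block is brighter than its right-hand neighbour, else 0."""
    bits = []
    for row in block_means:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return bits


def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions at which two binary maps differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical mean luminance per block for a stored frame and a photo of it
# taken under brighter lighting (every block shifted by the same amount).
stored = [[50.0, 80.0, 60.0], [120.0, 90.0, 100.0]]
photo = [[value + 30.0 for value in row] for row in stored]

print(binary_map(stored))
print(hamming(binary_map(stored), binary_map(photo)))
```

Because only the sign of each neighbouring difference is kept, a brightness shift that affects adjacent blocks equally leaves the map unchanged, which is the kind of robustness the abstract is after.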

Information Strategy Planning for Digital Infrastructure Building with Geo-based Nonrenewable Resources Information in Korea: Conceptual Modeling Units

  • Chi, Kwang-Hoon;Yeon, Young-Kwang;Park, No-Wook;Lee, Ki-Won
    • Proceedings of the KSRS Conference / 2002.10a / pp.191-196 / 2002
  • This year KIGAM, one of Korea's government-supported research institutes, started a new national program for building digital geologic and natural resources infrastructure. The goal of the program is to prepare digitally oriented infrastructure for practical digital database building, management, and public services covering numerous types of paper maps related to geo-scientific resources or geologic thematic map sets: hydro-geologic maps, applied geologic maps, geo-chemical maps, airborne radiometric/magnetic maps, coal geologic maps, off-shelf bathymetry maps, and so forth. The research issues for this digital infrastructure comprise ISP (Information Strategy Planning), geo-framework modeling of each map set, pilot database building, a cyber geo-mineral directory service system, and an upgrade of the web-based geologic information retrieval system that serves Korean digital geologic maps at 1:50K scale. Among these, this study mainly discusses UML (Unified Modeling Language)-based data modeling of the geo-data sets produced and held by KIGAM, and presents its results from the viewpoint of digital geo-modeling ISP. The model is expected to be developed further as guidance or a modeling framework for geologic thematic mapping and practical database building, as well as for building other types of national thematic map databases.

e-Catalogue Image Retrieval Using Vectorial Combination of Color Edge (컬러에지의 벡터적 결합을 이용한 e-카탈로그 영상 검색)

  • Hwang, Yei-Seon;Park, Sang-Gun;Chun, Jun-Chul
    • The KIPS Transactions:PartB / v.9B no.5 / pp.579-586 / 2002
  • The edge descriptor proposed by the MPEG-7 standard is a representative approach to contents-based image retrieval using edge information. In the edge descriptor, the edge information is an edge histogram derived from a gray-level image. This paper proposes a new method that extracts color edge information from color images, and a new approach to contents-based image retrieval based on the color edge histogram. The proposed method and technique are applied to image retrieval for an e-catalogue. For evaluation, retrieval results using the proposed approach are compared with those using the MPEG-7 edge descriptor, and the statistics show the efficiency of the proposed method. The proposed color edge model is built by combining the R, G, and B channel components vectorially and by characterizing the vector norm of the edge map. The color edge histogram, which uses the direction of the color edge model, is subsequently used for contents-based image retrieval.