• Title/Summary/Keyword: Information retrieval techniques (정보검색기법)

Search Result 2,281, Processing Time 0.03 seconds

Bit-Vector-Based Space Partitioning Indexing Scheme for Improving Node Utilization and Information Retrieval (노드 이용률과 검색 속도 개선을 위한 비트 벡터 기반 공간 분할 색인 기법)

  • Yeo, Myung-Ho;Seong, Dong-Ook;Yoo, Jae-Soo
    • Journal of KIISE: Computing Practices and Letters / v.16 no.7 / pp.799-803 / 2010
  • The KDB-tree is a traditional indexing scheme for retrieving multidimensional data. Research on the KDB-tree family frequently identifies low storage utilization and insufficient retrieval performance as its two bottlenecks. These bottlenecks arise from the many unnecessary splits caused by data insertion order and data skew. In this paper, we propose a novel index structure, called the $KDB_{CS}^+$-tree, to process skewed data efficiently and improve retrieval performance. The $KDB_{CS}^+$-tree increases node fan-out by using bit-vectors to represent splitting information and by eliminating pointers. It also improves storage utilization by representing the entries of each internal node as a hierarchical structure.

Efficient Inverted List Search Technique using Bitmap Filters (비트맵 필터를 이용한 효율적인 역 리스트 탐색 기법)

  • Kwon, In-Teak;Kim, Jong-Ik
    • The KIPS Transactions: Part D / v.18D no.6 / pp.415-422 / 2011
  • Finding similar strings is an important operation because textual data can, by nature, contain errors, duplications, and inconsistencies. Many algorithms have been developed for approximate string search, and most of them use inverted lists to find similar strings. These algorithms essentially perform merge operations on inverted lists. In this paper, we develop a bitmap representation of an inverted list and propose an efficient search algorithm that uses bitmap filters to skip unnecessary inverted lists without searching them. Experimental results show that the proposed technique consistently improves search performance.
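The abstract gives no algorithmic detail, so the following is only a minimal Python sketch of the general idea of a bitmap filter over inverted lists, not the authors' algorithm. The names `qgrams`, `bitmap_of`, `build_index`, and `count_matches`, the 2-gram tokenization, and the 64-bit filter width are all assumptions made for illustration. Each inverted list carries a small bitmap summary of its string IDs; a list whose bitmap shares no bit with the candidates' bitmap cannot contain any candidate and is skipped without being scanned.

```python
from collections import defaultdict

BITMAP_BITS = 64  # assumed filter width; not taken from the paper


def qgrams(s, q=2):
    """Return the set of overlapping q-grams of a string."""
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def bitmap_of(ids):
    """Summarise integer IDs as a fixed-width bitmap (bit = id mod width)."""
    bits = 0
    for i in ids:
        bits |= 1 << (i % BITMAP_BITS)
    return bits


def build_index(strings, q=2):
    """Map each q-gram to (ascending ID list, bitmap summary of that list)."""
    lists = defaultdict(list)
    for sid, s in enumerate(strings):
        for gram in qgrams(s, q):
            lists[gram].append(sid)
    return {g: (ids, bitmap_of(ids)) for g, ids in lists.items()}


def count_matches(index, query, candidates, q=2):
    """Count how many query q-grams each candidate ID shares.

    A list is skipped without being scanned when its bitmap has no bit in
    common with the candidates' bitmap, since it cannot contain any of them.
    """
    cand_bits = bitmap_of(candidates)
    counts = {c: 0 for c in candidates}
    skipped = 0
    for gram in qgrams(query, q):
        if gram not in index:
            continue
        ids, bits = index[gram]
        if bits & cand_bits == 0:
            skipped += 1  # filter proves the list holds no candidate
            continue
        for sid in ids:
            if sid in counts:
                counts[sid] += 1
    return counts, skipped
```

Bit collisions can make the filter pass a list that holds no candidate, which costs an unnecessary scan, but the filter never skips a list that does hold one. For example, `count_matches(build_index(["apple", "apply", "maple", "banana"]), "apple", [3])` skips every list, because none of the 2-grams of "apple" occurs in "banana".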

A Research on Signature File Methods for Korean Text Retrieval (한글 텍스트 검색을 위한 요약 화일 기법에 관한 연구)

  • Song, Byoung-Ho;Lee, Suk-Ho
    • Annual Conference on Human and Language Technology / 1991.10a / pp.231-237 / 1991
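  • As a content-based retrieval technique for text, the signature file method is very useful when an inverted file is not permitted. However, unlike English, Korean forms word phrases (eojeol) in complex ways and has no fixed spacing conventions, so existing word-oriented signature file methods designed for English cannot be applied as they are. This paper therefore proposes a method that ignores spacing, extracts overlapping two-syllable patterns, and builds and searches a signature file from them. The method can also be applied to other languages with similar problems, such as Japanese and Chinese.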
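The two-syllable approach described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's implementation: the signature width (`SIG_BITS`), the number of bits set per pattern (`BITS_PER_PATTERN`), the SHA-256-based hashing, and all function names are hypothetical, and the text is assumed to use precomposed Hangul so that one syllable is one character.

```python
import hashlib

SIG_BITS = 256         # assumed signature width
BITS_PER_PATTERN = 3   # assumed number of bits set per two-syllable pattern


def strip_spaces(text):
    """Remove all whitespace so that spacing variations are ignored."""
    return "".join(text.split())


def bigrams(text):
    """Return the overlapping two-syllable patterns of the unspaced text."""
    s = strip_spaces(text)
    return {s[i:i + 2] for i in range(len(s) - 1)}


def pattern_bits(pattern):
    """Hash one pattern to a few bit positions (superimposed coding)."""
    digest = hashlib.sha256(pattern.encode("utf-8")).digest()
    bits = 0
    for k in range(BITS_PER_PATTERN):
        pos = int.from_bytes(digest[2 * k:2 * k + 2], "big") % SIG_BITS
        bits |= 1 << pos
    return bits


def signature(text):
    """OR together the bit patterns of every two-syllable pattern."""
    sig = 0
    for p in bigrams(text):
        sig |= pattern_bits(p)
    return sig


def search(docs, query):
    """Return indexes of documents that contain the query, ignoring spacing."""
    sigs = [signature(d) for d in docs]  # the "signature file"
    qsig = signature(query)
    qtext = strip_spaces(query)
    hits = []
    for i, doc in enumerate(docs):
        # Signature test: every query bit must be set in the document.
        if sigs[i] & qsig == qsig:
            # The test can yield false positives, so verify on the text.
            if qtext in strip_spaces(doc):
                hits.append(i)
    return hits
```

With `docs = ["정보 검색 기법", "요약 화일 기법", "역화일 구조"]`, the query "검색기법" matches only the first document even though the document is spaced differently, and "화일" matches the second and third. The signature test never rejects a true match, because every pattern of the query is also a pattern of any document that contains it.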

Contents-based Image Retrieval Using Regression of Shape Features (모양 정보의 회귀추정에 의한 내용 기반 이미지 검색 기법)

  • Song Jun-Kyu;Choi Hwang-Kyu
    • Journal of Digital Contents Society / v.2 no.2 / pp.157-166 / 2001
  • In this paper, we propose a feature vector extraction technique that uses regression of shape features for content-based image retrieval systems. The proposed technique reduces the number of dimensions of a feature vector by converting the extracted high-dimensional feature vector into a specific n-dimensional feature vector. The paper shows how reducing the dimensionality of the feature vector addresses the 'curse of dimensionality' and shows that the technique is more efficient than conventional techniques for practical image retrieval.

The Ecology of the Scientific Literature and Information Retrieval (I)

  • Jeong, Jun-Min
    • Journal of the Korean Society for Information Management / v.2 no.2 / pp.3-37 / 1985
  • This research deals with the problems encountered in designing systems for more efficient and effective information retrieval amid the proliferation of literature. It was designed to develop and test 1) the partitioning of a large bibliographic database into quality-oriented subsets (quality filtering), and 2) a system for effective and efficient information retrieval within subsets of the database (relevance). To accomplish this partitioning, the 'kernel' technique of graph theory was applied. In addition, a method of quality filtering that uses the 'epidemic' theory and the 'obsolescence' of scientific literature was developed.

The Ecology of the Scientific Literature and Information Retrieval (II)

  • Jeong, Jun-Min
    • Journal of the Korean Society for Information Management / v.3 no.1 / pp.3-16 / 1986
  • This research deals with the problems encountered in designing systems for more efficient and effective information retrieval amid the proliferation of literature. It was designed to develop and test 1) the partitioning of a large bibliographic database into quality-oriented subsets (quality filtering), and 2) a system for effective and efficient information retrieval within subsets of the database (relevance). To accomplish this partitioning, the 'kernel' technique of graph theory was applied. In addition, a method of quality filtering that uses the 'epidemic' theory and the 'obsolescence' of scientific literature was developed.

Web Contents Mining System for Real-Time Monitoring of Opinion Information based on Web 2.0 (웹2.0에서 의견정보의 실시간 모니터링을 위한 웹 콘텐츠 마이닝 시스템)

  • Kim, Young-Choon;Joo, Hae-Jong;Choi, Hae-Gill;Cho, Moon-Taek;Kim, Young-Baek;Rhee, Sang-Yong
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.1 / pp.68-79 / 2011
  • This paper presents a system that extracts and analyzes opinion information through Web mining, based on statistics collected from Web content. Users' opinions, which are scattered across several websites, can be extracted and analyzed automatically. The system provides an opinion search service that lets users search for positive and negative opinions in real time and check the related statistics. Users can also search and monitor other opinion information in real time by entering keywords. Comparative experiments with other techniques indicate that the proposed technique performs well. The evaluation covers the extraction of positive and negative opinion information, the dynamic window and tokenizer techniques applied to multilingual information retrieval, and the extraction of exact multilingual phonetic translations. The experiments use typical movie review sentences and Wikipedia experiment data as application examples, and the results are analyzed.

College Students' Preferences of Web-based OPAC Retrieval Techniques and their Blood Types: An Empirical Study (대학생들의 웹 기반 OPAC 검색기법 선호도와 혈액형에 대한 실험적 연구)

  • Kim, Hee-Sop
    • Journal of the Korean Society for Library and Information Science / v.44 no.3 / pp.81-102 / 2010
  • The purpose of this study was to investigate, through an empirical survey, the relationship between college students' preferences for Web-based OPAC retrieval techniques and their ABO blood types. Data were collected through a self-designed questionnaire answered by a total of 101 undergraduate students from the College of Social Sciences. The collected data were analyzed using descriptive statistics and one-way ANOVA. The results show that 'title' was the most preferred access point, 'AND' was the most preferred Boolean operator, 'publication year' and 'subject' were the most favored techniques for limiting the scope of retrieval, and 'record number limit per page' was the most frequently used option for displaying retrieval results. The results also show that there were few (3 out of 22, i.e. 13.6%) statistically significant differences between the college students' preferences for Web-based OPAC techniques and their blood types.

Ontology Based Semantic Search System Using Inference (온톨로지를 통한 추론형 시멘틱 검색 시스템에 관한 연구)

  • 하상범;박영택
    • Proceedings of the Korean Information Science Society Conference / 2004.04b / pp.625-627 / 2004
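  • With the advent of the Semantic Web, ontologies make it possible to create documents whose meaning (semantics) agents can understand. The Semantic Web can be extended to information retrieval systems based on semantic search as a way to increase business efficiency and thereby maximize profit. Existing systems, which store documents in a database and retrieve them with database queries or with general keyword-based information retrieval techniques, have been studied extensively in many fields. This paper presents research results on an ontology-based semantic search system that applies inference, with a focus on document retrieval. The proposed approach shows that, through the ontology and inference, the system can retrieve documents that cannot be found with conventional database queries or with the simple keyword matching of information management systems. Its search coverage is similar to that of search based on natural language processing. This is possible because retrieval does not rely only on keyword similarity; instead, the system reasons over metadata generated from the meanings predefined in an ontology built on Description Logic. In addition, the system adopts the database-backed query-answering approach used in existing information management systems and integrates with a DQL interface that can answer queries over the ontology representation language, which maximizes the system's speed and efficiency.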
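To illustrate why inference can find documents that keyword matching misses, here is a deliberately small Python sketch. It is not the system described in the abstract: it uses a plain dictionary instead of a Description Logic reasoner or DQL, and the class hierarchy (`SUBCLASS_OF`), the document annotations, and the function names are all hypothetical. It covers only one kind of inference, the transitive subclass relation.

```python
# Hypothetical ontology fragment: each class maps to its direct superclass.
SUBCLASS_OF = {
    "Sedan": "Car",
    "Car": "Vehicle",
    "Bicycle": "Vehicle",
}


def ancestors(cls):
    """Return every superclass reachable from cls (transitive closure)."""
    seen = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        if cls in seen:  # guard against cycles in a malformed hierarchy
            break
        seen.append(cls)
    return seen


def semantic_search(annotations, query_class):
    """Return documents annotated with query_class or any of its subclasses.

    annotations maps a document ID to the ontology class in its metadata.
    """
    return sorted(
        doc for doc, cls in annotations.items()
        if cls == query_class or query_class in ancestors(cls)
    )
```

Given `annotations = {"doc1": "Sedan", "doc2": "Bicycle", "doc3": "Recipe"}`, a query for "Vehicle" returns `doc1` and `doc2`, although neither document's metadata contains the word "Vehicle"; a keyword match on the annotation would return nothing.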

An Efficient-keyword-searching Technique over Encrypted data on Smartphone Database (스마트폰 데이터베이스 환경에서 암호화된 데이터에 대한 효율적인 키워드검색 기법)

  • Kim, Jong-Seok;Choi, Won-Suk;Park, Jin-Hyung;Lee, Dong-Hoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.24 no.4 / pp.739-751 / 2014
  • We use smartphones for business as well as in our daily lives. Consequently, users' private data and company secrets are stored on smartphones. However, because all data is stored in plaintext in the database, data saved in a smartphone database can be exposed to a malicious attacker when a malicious app is installed on the smartphone or when a user loses the smartphone. To prevent this disclosure of personal information, a database encryption method is needed. Encrypting a database, however, degrades performance. For example, to search for specific data in an encrypted database, we must either decrypt all the data stored in the database or search sequentially for the data we want, with the accompanying overhead[1]. In this paper, we propose an efficient searchable encryption method that uses a variable-length Bloom filter for resource-limited environments (e.g., a smartphone). We compare it with existing searchable symmetric encryption schemes. We also implemented the proposed method on an Android smartphone and evaluated its performance. The implementation results confirm that our method improves search speed by over 50% compared with the simple search method on an encrypted database, and saves over 70% of the space compared with a fixed-length Bloom filter at the same false positive rate.
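The abstract does not specify the construction, so the sketch below only illustrates the general idea of a keyed, variable-length Bloom filter in Python; it is not the authors' scheme and has not been analyzed for security. It assumes the textbook Bloom filter sizing formula m = -n ln p / (ln 2)^2 and HMAC-SHA256 as the keyed hash. The function names, the 32-byte key, and the per-record layout are hypothetical.

```python
import hashlib
import hmac
import math


def num_hashes(fp_rate):
    """Number of hash functions for a target false positive rate."""
    return max(1, math.ceil(-math.log2(fp_rate)))


def filter_bits(n_keywords, fp_rate):
    """Filter length in bits, grown with the number of keywords stored."""
    return max(1, math.ceil(-n_keywords * math.log(fp_rate) / math.log(2) ** 2))


def trapdoor(key, word, k):
    """Keyed hash values for a keyword; only the key holder can compute them."""
    return [
        int.from_bytes(
            hmac.new(key, f"{i}:{word}".encode("utf-8"), hashlib.sha256).digest()[:8],
            "big",
        )
        for i in range(k)
    ]


def build_filter(key, keywords, fp_rate):
    """Build a per-record filter whose length depends on the keyword count."""
    k = num_hashes(fp_rate)
    m = filter_bits(len(keywords), fp_rate)
    bits = 0
    for word in keywords:
        for t in trapdoor(key, word, k):
            bits |= 1 << (t % m)
    return m, bits


def matches(flt, td):
    """Test a trapdoor against a filter without decrypting the record."""
    m, bits = flt
    return all((bits >> (t % m)) & 1 for t in td)
```

A record with few keywords gets a short filter and a record with many gets a longer one, so space is not wasted on padding small records to a fixed length. For instance, at a 1% false positive rate `filter_bits(2, 0.01)` is 20 bits while `filter_bits(20, 0.01)` is 192 bits. A stored keyword always matches its own trapdoor; an absent keyword may match with roughly the configured false positive probability.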