• Title/Summary/Keyword: Bloom filter (블룸 필터)


Introduction to Method of Space-efficient Bloom Filtering (공간 효율적인 블룸 필터링 방법의 소개)

  • Kang, Boo-Joong;Ro, In-Woo;Im, Eul-Gyu
    • Proceedings of the Korean Information Science Society Conference / 2008.06d / pp.1-4 / 2008
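  • A Bloom filter is a simple, space-efficient data structure. It represents a data set probabilistically and supports membership queries, which test whether a given item belongs to the set. Such queries can produce false positives, but the false positive rate can be minimized by tuning the filter's parameters. Bloom filters are widely used in distributed systems where data is physically spread across the whole system because it must be shared, in systems that use databases to handle large volumes of data, and in systems that must answer membership queries in real time. This paper reviews the Bloom filter and introduces variants that have been adapted in various ways to suit the purpose of the system.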
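
The membership query and parameter tuning described in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the class name `BloomFilter`, the SHA-256 double-hashing scheme, and the textbook sizing formulas for a target false positive rate are all assumptions chosen for illustration.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter: an m-bit array probed by k hash positions."""

    def __init__(self, expected_items: int, fp_rate: float) -> None:
        # Textbook sizing (assumed here): m = -n ln p / (ln 2)^2, k = (m / n) ln 2.
        self.m = max(1, math.ceil(-expected_items * math.log(fp_rate) / math.log(2) ** 2))
        self.k = max(1, round(self.m / expected_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str) -> list:
        # Double hashing: derive k positions from two halves of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, so never zero
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        # Set every probed bit for this item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


bf = BloomFilter(expected_items=1000, fp_rate=0.01)
bf.add("alice")
print("alice" in bf)  # an inserted item is always reported as present
```

A lower `fp_rate` makes the filter allocate more bits and use more hash positions, which is the trade-off between space and false positives that the abstract refers to.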


Ternary Bloom Filter Improving Counting Bloom Filter (카운팅 블룸필터를 개선하는 터너리 블룸필터)

  • Byun, Hayoung;Lee, Jungwon;Lim, Hyesook
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.1 / pp.3-10 / 2017
  • Counting Bloom filters (CBFs) are widely used in network algorithms and applications for membership queries on dynamic sets, since CBFs support delete operations, which a standard 1-bit-vector Bloom filter does not. However, because of its counting function, a CBF can overflow and consequently produce false negatives. CBFs composed of 4-bit counters are generally used, but a 4-bit CBF wastes memory by allocating 4 bits to every counter. In this paper, we propose a simple alternative to the 4-bit CBF, named the ternary Bloom filter (TBF). In the proposed TBF structure, if two or more elements are mapped to a counter during programming, that counter is no longer used for insertion or deletion operations. Simulation shows that, when the TBF consumes the same amount of memory as a 4-bit CBF, the TBF provides a better false positive rate than the CBF and does not generate false negatives.
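
One way to read the counter rule in this abstract is sketched below. This is an interpretation, not the authors' implementation: the class name `TernaryBloomFilter`, the `FROZEN` marker, the hashing scheme, and the query rule (report a positive when no probed cell is 0) are assumptions, and the paper's exact query semantics and cell encoding may differ.

```python
import hashlib

FROZEN = 2  # assumed marker: "two or more elements mapped here"; never modified again


class TernaryBloomFilter:
    """Illustrative filter whose cells take only the values 0, 1, or FROZEN."""

    def __init__(self, num_cells: int, num_hashes: int) -> None:
        self.m = num_cells
        self.k = num_hashes
        self.cells = [0] * num_cells  # one small integer per cell, for clarity

    def _positions(self, item: str) -> set:
        # A set, so one item never touches the same cell twice.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return {(h1 + i * h2) % self.m for i in range(self.k)}

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            if self.cells[pos] < FROZEN:  # 0 -> 1, 1 -> FROZEN, FROZEN stays
                self.cells[pos] += 1

    def remove(self, item: str) -> None:
        # Only meaningful for items that were previously added.
        for pos in self._positions(item):
            if self.cells[pos] == 1:  # sole owner: safe to clear
                self.cells[pos] = 0
            # FROZEN cells are left untouched, so no other item loses a cell.

    def __contains__(self, item: str) -> bool:
        return all(self.cells[pos] != 0 for pos in self._positions(item))
```

Because a cell is cleared only when exactly one element maps to it, deleting one element can never zero a cell that another stored element depends on, which is the intuition behind the "no false negatives" claim. The cost is that frozen cells can keep reporting a deleted element as present.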

On Message Length Efficiency of Two Security Schemes using Bloom Filter (블룸필터를 사용하는 두 보안기법에 대한 메시지 길이의 효율성에 대하여)

  • Maeng, Young-Jae;Kang, Jeon-Il;Nyang, Dae-Hun;Lee, Kyung-Hee
    • The KIPS Transactions:PartC / v.19C no.3 / pp.173-178 / 2012
  • Two recent security schemes showed that a Bloom filter can reduce the message length required to represent multiple MACs. The schemes, however, compared message lengths without considering the security level. Since a MAC is intended for security, a message length comparison is meaningful only when the multiple MACs and the Bloom filter have the same level of security. In this paper, we analyze the message length efficiency of the Bloom filter, the compressed Bloom filter, and multiple MACs at the same security level.

The Construction of A Parallel type Bloom Filter (병렬 구조의 블룸필터 설계)

  • Jang, Young-dal;Kim, Ji-hong
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.6 / pp.1113-1120 / 2017
  • As data volumes grow ever larger with improvements in telecommunication techniques, building and processing databases becomes a major issue. The Bloom filter, used to look up a particular element in a given set, is a very useful structure because of its space efficiency. In this paper, we analyze the main factors behind false positives and propose a new parallel-type Bloom filter that minimizes the false positives caused by other hash functions. The proposed method uses as much memory as the conventional Bloom filter but can improve processing speed through parallel processing. In addition, if a perfect hash function is used, insertion and deletion become possible in the proposed Bloom filter.

An SSD-Based Directory Parsing with the Counting Bloom Filter (카운팅 블룸필터를 이용한 SSD 기반의 디렉토리 탐색 기법)

  • Kim, Man-Yun;Youn, Hee-Yong
    • Proceedings of the Korean Society of Computer Information Conference / 2014.07a / pp.347-349 / 2014
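  • The explosive growth of data has brought us into the era of big data. A big-data file system contains countless directories and files organized in a very large tree structure. Finding the directory or file a user requests in such a large tree is a very difficult task. We therefore propose a directory search scheme using a counting Bloom filter. SDP (SSD-based Directory Parsing) is an SSD-based cache that holds the metadata of recently or frequently accessed directories and files. In a large-scale file system, when a user requests a file, the file system accesses the storage device several times to look up metadata. To prevent such inefficient accesses to the SSD, we propose a scheme that uses a counting Bloom filter to look up metadata quickly and efficiently.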


Multiple Hashing Architecture using Bloom Filter for IP Address Lookup (IP 주소 검색에서 블룸 필터를 사용한 다중 해싱 구조)

  • Park, Kyong-Hye;Lim, Hye-Sook
    • Journal of KIISE:Databases / v.36 no.2 / pp.84-98 / 2009
  • Various algorithms and architectures for IP address lookup have been studied to improve forwarding performance in Internet routers. The previous IP address lookup architecture using Bloom filters requires a separate Bloom filter and a separate hash table for each prefix length, and hence is not efficient in terms of implementation complexity. To reduce the number of hash tables, it applies controlled prefix expansion, but prefix duplication is then inevitable. The previous parallel multiple-hashing architecture shows very good search performance, since it searches the tables constructed for each prefix length in parallel. However, it also has high implementation complexity because of the parallel search structure. In this paper, we propose a new IP address lookup architecture using an all-length Bloom filter and an all-length multiple hash table, in which prefixes of various lengths are accommodated in a single Bloom filter and a single multiple hash table. Hence the proposed architecture performs well in terms of both implementation complexity and search performance. Simulation results using actual backbone routing tables with 15,000 to 220,000 prefixes show that the proposed architecture requires 1.04 to 1.17 memory accesses on average for an IP address lookup.
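
The idea of holding prefixes of every length in one Bloom filter and one hash table can be sketched as follows. This is a rough illustration under stated assumptions, not the architecture from the paper: the class `PrefixTable`, the longest-to-shortest probing order, the key encoding, and the use of a Python `dict` in place of a multiple-hashing table are all hypothetical.

```python
import hashlib
import ipaddress


class PrefixTable:
    """One Bloom filter and one table shared by IPv4 prefixes of all lengths."""

    def __init__(self, m: int = 4096, k: int = 3) -> None:
        self.m, self.k = m, k
        self.bits = bytearray((m + 7) // 8)
        self.routes = {}          # stands in for the off-chip hash table
        self.table_accesses = 0   # counts how often the table is consulted

    @staticmethod
    def _key(addr: int, length: int) -> bytes:
        # Keep the top `length` bits and tag the key with the prefix length.
        masked = addr >> (32 - length) << (32 - length)
        return masked.to_bytes(4, "big") + bytes([length])

    def _positions(self, key: bytes) -> list:
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, cidr: str, next_hop: str) -> None:
        net = ipaddress.ip_network(cidr)
        key = self._key(int(net.network_address), net.prefixlen)
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)
        self.routes[key] = next_hop

    def lookup(self, address: str):
        addr = int(ipaddress.ip_address(address))
        for length in range(32, -1, -1):  # longest prefix first
            key = self._key(addr, length)
            if all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key)):
                self.table_accesses += 1  # only Bloom positives reach the table
                hop = self.routes.get(key)
                if hop is not None:       # a false positive simply falls through
                    return hop
        return None


table = PrefixTable()
table.add("10.0.0.0/8", "A")
table.add("10.1.0.0/16", "B")
table.add("0.0.0.0/0", "default")
print(table.lookup("10.1.2.3"), table.table_accesses)
```

Since the Bloom filter screens out most non-matching lengths, the table is consulted roughly once per lookup plus the occasional false positive, which is the kind of saving the reported average access counts reflect.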

A Bloom filter-based Sentiment-aware Web Crawling Algorithm (블룸 필터를 이용한 감성 웹 문서 크롤링 알고리즘)

  • Na, Chul-Won;On, Byung-Won
    • Annual Conference on Human and Language Technology / 2018.10a / pp.69-74 / 2018
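  • With recent advances in big data and artificial intelligence, research on sentiment analysis has become increasingly active. At the same time, there is a growing need to collect text documents rich in positive and negative vocabulary for sentiment analysis. This paper proposes a solution to the problems of existing methods for collecting such documents. If an existing method first stores all URLs and then filters them to collect documents rich in positive and negative vocabulary, memory and time are wasted on storing unnecessary text documents and on the filtering process. We aim to solve this waste of memory and time by applying the Bloom filter data structure to the existing collection method.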


An Analysis on the Error Probability of A Bloom Filter (블룸필터의 오류 확률에 대한 분석)

  • Kim, SungYong;Kim, JiHong
    • Journal of the Korea Institute of Information Security & Cryptology / v.24 no.5 / pp.809-815 / 2014
  • As data volumes grow ever larger with improvements in telecommunication techniques, building and processing databases becomes a major issue. The Bloom filter, used to look up a particular element in a given set, is a very useful structure because of its space efficiency. In this paper, we introduce the error probabilities of the Bloom filter. In particular, we derive revised false positive rates of the Bloom filter experimentally. Finally, we analyze and compare the original false positive probability of the Bloom filter used until now with the false decision probability proposed in this paper.
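
For reference, the "original false positive probability" mentioned here presumably refers to the classic approximation, stated below under that assumption. The symbols are not from the abstract: $m$ is the number of bits, $n$ the number of inserted elements, and $k$ the number of hash functions. The revised rate derived in the paper is not given in the abstract and is not reproduced here.

```latex
p_{\mathrm{fp}}
  = \left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k}
  \approx \left(1 - e^{-kn/m}\right)^{k},
\qquad
k_{\mathrm{opt}} = \frac{m}{n}\,\ln 2 .
```

For example, with $m/n = 10$ bits per element, $k_{\mathrm{opt}} \approx 6.9$, so $k = 7$ gives $p_{\mathrm{fp}} \approx (1 - e^{-0.7})^{7} \approx 0.008$.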

A Packet Classification Algorithm Using Bloom Filter Pre-Searching on Area-based Quad-Trie (영역 분할 사분 트라이에 블룸 필터 선 검색을 사용한 패킷 분류 알고리즘)

  • Byun, Hayoung;Lim, Hyesook
    • Journal of KIISE / v.42 no.8 / pp.961-971 / 2015
  • The area-based quad-trie (AQT), a representative area-decomposition algorithm, has a search performance problem. Even when a matching rule is encountered at a node, the search must continue along the path to its end, because a higher-priority matching rule may still exist. A leaf-pushing AQT improves the search performance of the AQT by ensuring that each search path contains a single rule node. This paper proposes a new algorithm to further improve the search performance of the leaf-pushing AQT. The proposed algorithm implements a leaf-pushing AQT using a hash table and an on-chip Bloom filter. By sequentially querying the Bloom filter, the algorithm first identifies the level of the rule node in the leaf-pushing AQT. Only then is the rule database, which is usually stored in off-chip memory, accessed. Simulation results show that packet classification can be performed with a single hash table access using a reasonably sized Bloom filter. The proposed algorithm is compared with existing algorithms in terms of memory requirement and search performance.

A Hybrid In-network Join Strategy using Bloom Filter in Sensor Network (센서 네트워크에서 블룸 필터를 이용한 하이브리드 인-네트워크 조인 기법)

  • Song, Im-Young;Kim, Kyung-Chang
    • Journal of KIISE:Databases / v.37 no.3 / pp.165-170 / 2010
  • This paper proposes SBJ (Semi & Bloom Join), an efficient in-network join strategy for sensor networks that minimizes communication cost. SBJ is a hybrid join strategy that reduces energy consumption by using a Bloom filter to shrink the data that must be sent or received in the sensor network. The key to reducing communication cost in SBJ is to eliminate data not involved in the join result in the early stages of join processing. Through simulation, the paper shows that SBJ reduces communication cost more effectively than other join strategies for sensor networks, resulting in a significant reduction in battery consumption.
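
The early-elimination idea can be illustrated with a generic Bloom-filter join. This is a sketch of the general technique, not the SBJ protocol itself: the function names, the 256-bit filter size, and the two-node setup (one node holding relation R, another holding S) are assumptions made for the example.

```python
import hashlib


def positions(key: str, m: int, k: int) -> list:
    """Derive k bit positions for a join key by double hashing."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % m for i in range(k)]


def build_filter(keys, m: int = 256, k: int = 3) -> int:
    """Pack the filter into one integer; only these m bits would be transmitted."""
    bits = 0
    for key in keys:
        for pos in positions(key, m, k):
            bits |= 1 << pos
    return bits


def might_contain(bits: int, key: str, m: int = 256, k: int = 3) -> bool:
    return all((bits >> pos) & 1 for pos in positions(key, m, k))


def bloom_join(r_rows, s_rows):
    # Step 1: the node holding R summarizes its join keys and ships the filter.
    bits = build_filter(key for key, _ in r_rows)
    # Step 2: the node holding S drops rows that cannot match before sending.
    candidates = [(key, val) for key, val in s_rows if might_contain(bits, key)]
    # Step 3: an exact join removes any false positives that slipped through.
    r_index = {}
    for key, val in r_rows:
        r_index.setdefault(key, []).append(val)
    joined = [(key, rv, sv) for key, sv in candidates for rv in r_index.get(key, [])]
    return joined, len(candidates)


readings = [("s1", 20.5), ("s2", 21.0)]
locations = [("s2", "room-A"), ("s3", "room-B"), ("s1", "room-C")]
result, shipped = bloom_join(readings, locations)
print(sorted(result), shipped)
```

The saving comes from step 2: rows whose keys fail the filter are never transmitted, and a false positive costs only one extra row that the exact join later discards.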