• Title/Summary/Keyword: Prefix Filtering


Efficient Similarity Joins by Adaptive Prefix Filtering (맞춤 접두 필터링을 이용한 효율적인 유사도 조인)

  • Park, Jong Soo
    • KIPS Transactions on Software and Data Engineering / v.2 no.4 / pp.267-272 / 2013
  • The similarity join, which finds all pairs of records in a dataset whose similarity is above a given threshold, is an important and challenging operation with many applications such as data cleaning and duplicate detection. We propose a new algorithm that uses the prefix filtering principle as a strong constraint on the generation of candidate pairs for fast similarity joins. A candidate pair is generated only when the current prefix token of a probing record matches a prefix token of an indexing record within the prefix tokens constrained by the principle. This generation method does not need to compute an upper bound on the overlap between two records, which reduces execution time. Experimental results show that our algorithm significantly outperforms previous prefix filtering-based algorithms on real datasets.
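
The prefix filtering principle that this abstract builds on can be sketched as follows. This is a minimal Python illustration of the basic principle for Jaccard similarity, not the adaptive algorithm of the paper. The function name `prefix_filter_join`, the global token ordering by ascending frequency, and the prefix-length formula are assumptions taken from the standard textbook formulation, not from the abstract.

```python
import math
from collections import Counter, defaultdict


def prefix_filter_join(records, threshold):
    """Return (i, j, jaccard) for all record pairs with Jaccard >= threshold.

    Illustrative sketch of basic prefix filtering (hypothetical helper,
    not the paper's adaptive algorithm).
    """
    # Order tokens globally by ascending frequency so prefixes hold rare tokens.
    freq = Counter(tok for rec in records for tok in set(rec))
    canon = [sorted(set(rec), key=lambda t: (freq[t], t)) for rec in records]

    index = defaultdict(set)  # token -> ids of records whose prefix contains it
    results = []
    for i, rec in enumerate(canon):
        # Prefix length for Jaccard threshold t: |x| - ceil(t * |x|) + 1.
        # The small epsilon guards against floating-point round-up.
        plen = len(rec) - math.ceil(threshold * len(rec) - 1e-9) + 1
        prefix = rec[:plen]

        # Candidates are earlier records sharing at least one prefix token.
        candidates = set()
        for tok in prefix:
            candidates |= index[tok]

        # Verify each candidate with the exact Jaccard similarity.
        a = set(rec)
        for j in sorted(candidates):
            b = set(canon[j])
            sim = len(a & b) / len(a | b)
            if sim >= threshold:
                results.append((j, i, sim))

        # Index this record's prefix tokens for later probes.
        for tok in prefix:
            index[tok].add(i)
    return results
```

Pairs that share no prefix token are never verified, which is where the saving comes from; the exact Jaccard check afterwards removes false positives.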

Fast URL Lookup Using URL Prefix Hash Tree (URL Prefix 해시 트리를 이용한 URL 목록 검색 속도 향상)

  • Park, Chang-Wook; Hwang, Sun-Young
    • Journal of KIISE:Information Networking / v.35 no.1 / pp.67-75 / 2008
  • In this paper, we propose an efficient URL lookup algorithm for URL list-based web content filtering systems. The proposed algorithm converts a URL list into URL prefix form, builds a hash tree representation of the prefixes, and performs tree searches for URL lookups. This eliminates the redundant searches of the hash table method. Experimental results show that the proposed algorithm is 62% to 210% faster than the conventional hash table method, depending on the number of segments.
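
One way to picture a hash tree over URL prefixes is the sketch below. The abstract does not give the data structure in detail, so everything here is an assumption for illustration: the class name `UrlPrefixTree`, splitting URLs on "/" into segments, and using one Python dict (a hash table) per tree node.

```python
_END = object()  # sentinel key marking "a listed URL prefix ends here"


class UrlPrefixTree:
    """Hypothetical tree of per-segment hash tables for URL prefix lookup."""

    def __init__(self):
        self.root = {}

    @staticmethod
    def _segments(url):
        # Drop the scheme if present, then split "host/path/..." on "/".
        rest = url.split("://", 1)[-1]
        return [seg for seg in rest.split("/") if seg]

    def add(self, url):
        """Insert a listed URL prefix, one tree level per segment."""
        node = self.root
        for seg in self._segments(url):
            node = node.setdefault(seg, {})
        node[_END] = True

    def matches(self, url):
        """Return True if any listed prefix covers the given URL."""
        node = self.root
        for seg in self._segments(url):
            node = node.get(seg)
            if node is None:
                return False  # no listed prefix continues along this path
            if _END in node:
                return True  # a listed prefix ends at this segment
        return False
```

A single walk down the tree performs one hash lookup per segment, whereas a flat hash table of URLs would need a separate probe for every candidate prefix of the requested URL.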

Joint Compensation of Transmitter and Receiver IQ Imbalance in OFDM Systems Based on Selective Coefficient Updating

  • Rasi, Jafar; Tazehkand, Behzad Mozaffari; Niya, Javad Musevi
    • ETRI Journal / v.37 no.1 / pp.43-53 / 2015
  • In this paper, a selective coefficient updating (SCU) approach is applied at each branch of the per-tone equalization (PTEQ) structure for the case of insufficient cyclic prefix (CP) length. SCU is proposed because of the high number of adaptive filters and their complex adaptation process in the PTEQ structure. Using this method reduces the computational complexity, while the performance remains almost unchanged. Moreover, set-membership filtering with variable step size is proposed for the sufficient CP case to increase convergence speed and decrease the average number of calculations. Simulation results show that the aforementioned algorithms perform similarly to conventional algorithms while reducing the number of calculations necessary. In addition, these algorithms can compensate for both the channel effect and the transmitter/receiver in-phase/quadrature-phase imbalances.

The Mechanism to Bypass Ingress Filtering for Multihomed Mobile Networks (멀티호밍 모바일 네트워크를 위한 인그레스 필터링 우회 메커니즘)

  • Ryu, Ji-Ho; Choi, Nak-Jung; Kwon, Tae-Kyoung; Choi, Yang-Hee; Paik, Eun-Kyoung
    • Proceedings of the Korean Information Science Society Conference / 2006.10d / pp.283-287 / 2006
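  • In this paper, we present a solution to the ingress filtering problem, one of the various issues that arise in multihomed mobile networks. We first propose a 'prefix neighbor' relationship between neighboring mobile routers in a mobile network environment with multiple mobile routers. Using this relationship, we also propose an ingress filtering bypass technique that lets nodes served by a mobile router receive relay service through a neighboring mobile router without changing their own addresses. We implement the proposed techniques in the ns-2 simulator and run simulations to verify their performance improvement.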

A Keyword-based Filtering Technique of Document-centric XML using NFA Representation (NFA 표현을 사용한 문서-중심적 XML의 키워드 기반 필터링 기법)

  • Lee, Kyoung-Han; Park, Seog
    • Journal of KIISE:Databases / v.33 no.5 / pp.437-452 / 2006
  • In this paper, we propose an extended XPath specification that includes the special matching character '%' used in the LIKE operation of SQL, because some queries that filter element contents are difficult to write with the previous XPath specification. We also present Pfilter, a novel technique for filtering a collection of document-centric XML documents that exploits the extended XPath specification. By sharing the common prefix characters of the operands in value-based predicates, Pfilter improves the performance of processing those predicates. We present several performance studies that compare Pfilter with YFilter in terms of efficiency and scalability, using multi-query processing time (MQPT), and report results for inserting, deleting, and processing value-based predicates. In conclusion, our approach provides a core algorithm for evaluating the contains() function of XPath queries in previous XML filtering research, and a foundation for building XML-based distributed information systems.

A Filtering Technique of Streaming XML Data based Postfix Sharing for Partial matching Path Queries (부분매칭 경로질의를 위한 포스트픽스 공유에 기반한 스트리밍 XML 데이타 필터링 기법)

  • Park, Seog; Kim, Young-Soo
    • Journal of KIISE:Databases / v.33 no.1 / pp.138-149 / 2006
  • With the emergence of sensor networks and ubiquitous computing, there is growing demand for handling continuous, fast data such as streaming data. As work on streaming data has begun, so has work on managing streaming data in Publish-Subscribe systems. The recent emergence of XML as a standard for information exchange on the Internet has led to more interest in Publish-Subscribe systems. Existing Publish-Subscribe systems filter streaming XML data using automata-based schemes, of which YFilter is a very popular one. YFilter exploits commonality among path queries by sharing the common prefixes of the paths so that they are processed at most once, using a top-down approach. However, because partial matching path queries interrupt common prefix sharing and are not evaluated from the root, the throughput of YFilter decreases. We therefore share commonality among path queries through the common postfixes of the paths and use a bottom-up approach instead of the top-down approach. We call this filtering technique PoSFilter, and we verify it by comparing its throughput with that of YFilter.
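
The difference between prefix sharing and postfix sharing can be illustrated with a plain trie over path steps. This is only a toy sketch: it is neither YFilter's automaton nor PoSFilter itself, and the helper names `build_trie` and `count_nodes` as well as the sample queries are made up for the illustration.

```python
def build_trie(paths):
    """Build a nested-dict trie in which equal leading steps share nodes."""
    root = {}
    for steps in paths:
        node = root
        for step in steps:
            node = node.setdefault(step, {})
    return root


def count_nodes(trie):
    """Count trie nodes; fewer nodes means more sharing among queries."""
    return sum(1 + count_nodes(child) for child in trie.values())


# Hypothetical partial-matching queries that differ at the front
# but end in the same steps "b/c".
queries = [["a", "b", "c"], ["x", "b", "c"], ["y", "z", "b", "c"]]

# Top-down (prefix) sharing: nothing is shared because the first steps differ.
prefix_nodes = count_nodes(build_trie(queries))

# Bottom-up (postfix) sharing: reverse each path so common endings are shared.
postfix_nodes = count_nodes(build_trie([q[::-1] for q in queries]))
```

For these three queries the prefix trie needs 10 nodes while the postfix trie needs 6, because the common ending is stored once.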

A Subsequence Matching Technique that Supports Time Warping Efficiently (타임 워핑을 지원하는 효율적인 서브시퀀스 매칭 기법)

  • Park, Sang-Hyun; Kim, Sang-Wook; Cho, June-Suh; Lee, Hoen-Gil
    • Journal of Industrial Technology / v.21 no.A / pp.167-179 / 2001
  • This paper discusses index-based subsequence matching that supports time warping in large sequence databases. Time warping enables finding sequences with similar patterns even when they are of different lengths. In earlier work, we suggested an efficient method for whole matching under time warping. This method constructs a multidimensional index on a set of feature vectors, which are invariant to time warping, from data sequences. For filtering in feature space, it also applies a lower-bound function, which consistently underestimates the time warping distance and satisfies the triangular inequality. In this paper, we incorporate the prefix-querying approach based on sliding windows into the earlier approach. For indexing, we extract a feature vector from every subsequence inside a sliding window and construct a multidimensional index using the feature vector as indexing attributes. For query processing, we perform a series of index searches using the feature vectors of qualifying query prefixes. Our approach provides effective and scalable subsequence matching even for large databases. We also prove that our approach does not incur false dismissal. To verify the superiority of our method, we perform extensive experiments. The results reveal that our method achieves significant speedup with real-world S&P 500 stock data and with very large synthetic data.

Spectrally encapsulated OFDM: Vectorized structure with minimal complexity

  • Kim, Myungsup; Kwak, Do Young; Jung, Jiwon; Kim, Ki-Man
    • ETRI Journal / v.43 no.4 / pp.660-673 / 2021
  • To use frequency resources efficiently, the next (6th) generation of mobile communication technology must solve the problem of out-of-band emission (OoBE) of cyclic prefix (CP) orthogonal frequency division multiplexing (OFDM), which is not solved in 5th generation technology. This study describes a new zero insertion technique that replaces an existing filtering scheme to solve this internal problem in OFDM signals. In the development of the proposed scheme, a precoder with a two-dimensional structure is first designed by generating a two-dimensional mapper and using the specialty of each matrix. A spectral shaping technique based on zero insertion instead of a long filter is proposed, so it can be applied not only to long OFDM symbols but also to very short ones. With the proposed method, the transmitted signal is completely blocked at the bandwidth boundaries of signals according to the current standards, and the proposed scheme is confirmed to be ideal with respect to bit error rate (BER) performance because its BER is the same as that of CP-OFDM. In addition, the proposed scheme can be transformed into a real-time structure through a vectorizing process with minimal complexity.

Bandwidth Efficient Summed Area Table Generation for CUDA (CUDA를 이용한 효율적인 합산 영역 테이블의 생성 방법)

  • Ha, Sang-Won; Choi, Moon-Hee; Jun, Tae-Joon; Kim, Jin-Woo; Byun, Hye-Ran; Han, Tack-Don
    • Journal of Korea Game Society / v.12 no.5 / pp.67-78 / 2012
  • A summed area table allows filtering of arbitrary-width box regions for every pixel in constant time per pixel. This characteristic makes it beneficial in image processing applications where the sum or average of the surrounding pixel intensities is required. Although calculating the summed area table of image data is primarily a memory-bound job consisting of row-wise or column-wise summation, previous works had to endure excessive access to the high-latency global memory in order to exploit data parallelism. In this paper, we propose an efficient algorithm for generating the summed area table in the GPGPU environment, where the input is decomposed into square sub-images with intermediate data that are propagated between them. By doing so, the global memory access is almost halved compared to the previous methods, making efficient use of the available memory bandwidth. The results show a substantial increase in performance.
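
The constant-time box sum mentioned in this abstract can be sketched as below. This is a plain sequential Python illustration of what a summed area table is, not the paper's CUDA algorithm; the function names `summed_area_table` and `box_sum` and the extra zero row and column are choices made for this sketch.

```python
def summed_area_table(img):
    """Return a (h+1) x (w+1) table where sat[y][x] = sum of img[:y][:x]."""
    h = len(img)
    w = len(img[0]) if h else 0
    # An extra zero row and column removes boundary checks in box_sum.
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of img[y][x] for top <= y < bottom and left <= x < right."""
    return (sat[bottom][right] - sat[top][right]
            - sat[bottom][left] + sat[top][left])


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = summed_area_table(image)
lower_right_2x2 = box_sum(table, 1, 1, 3, 3)  # 5 + 6 + 8 + 9
```

Each `box_sum` call reads four table entries regardless of the box size, which is why the filter cost per pixel is constant.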

An Index-Based Approach for Subsequence Matching Under Time Warping in Sequence Databases (시퀀스 데이터베이스에서 타임 워핑을 지원하는 효과적인 인덱스 기반 서브시퀀스 매칭)

  • Park, Sang-Hyeon; Kim, Sang-Uk; Jo, Jun-Seo; Lee, Heon-Gil
    • The KIPS Transactions:PartD / v.9D no.2 / pp.173-184 / 2002
  • This paper discusses index-based subsequence matching that supports time warping in large sequence databases. Time warping enables finding sequences with similar patterns even when they are of different lengths. In earlier work, Kim et al. suggested an efficient method for whole matching under time warping. This method constructs a multidimensional index on a set of feature vectors, which are invariant to time warping, from data sequences. For filtering in feature space, it also applies a lower-bound function, which consistently underestimates the time warping distance and satisfies the triangular inequality. In this paper, we incorporate the prefix-querying approach based on sliding windows into the earlier approach. For indexing, we extract a feature vector from every subsequence inside a sliding window and construct a multidimensional index using the feature vector as indexing attributes. For query processing, we perform a series of index searches using the feature vectors of qualifying query prefixes. Our approach provides effective and scalable subsequence matching even for large databases. We also prove that our approach does not incur false dismissal. To verify the superiority of our approach, we perform extensive experiments. The results reveal that our approach achieves significant speedup with real-world S&P 500 stock data and with very large synthetic data.
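
The filter-and-refine idea in this abstract (a cheap lower bound on feature vectors, followed by the exact time warping distance) can be sketched as follows. The abstract does not say which features are used, so the 4-tuple of first, last, greatest, and smallest element is an assumption made here, as are the function names and the use of |x - y| as the base cost. The sketch scans sequences linearly and omits the multidimensional index and the sliding-window prefix querying.

```python
def time_warping_distance(a, b):
    """Dynamic-programming time warping distance with |x - y| as base cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping steps.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def features(seq):
    """Assumed warping-invariant features: first, last, greatest, smallest."""
    return (seq[0], seq[-1], max(seq), min(seq))


def lower_bound(a, b):
    """Largest feature difference; never exceeds the distance above."""
    return max(abs(x - y) for x, y in zip(features(a), features(b)))


def range_search(query, database, eps):
    """Return sequences within eps of the query, filtering before refining."""
    hits = []
    for seq in database:
        if lower_bound(query, seq) > eps:
            continue  # safe to skip: the true distance is at least this large
        if time_warping_distance(query, seq) <= eps:
            hits.append(seq)
    return hits
```

Because the lower bound never overestimates the true distance, skipping a sequence in the filter step cannot cause a false dismissal; it only avoids the quadratic-time distance computation for sequences that cannot qualify.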