Search | Korea Science

Optimal Construction of Multiple Indexes for Time-Series Subsequence Matching (시계열 서브시퀀스 매칭을 위한 최적의 다중 인덱스 구성 방안)

Lim, Seung-Hwan;Kim, Sang-Wook;Park, Hee-Jin
- Journal of KIISE:Databases / v.33 no.2 / pp.201-213 / 2006
A time-series database is a set of time-series data sequences, each of which is a list of the changing values of an object over a given period of time. Subsequence matching is an operation that searches a time-series database for data subsequences whose changing patterns are similar to a query sequence. This paper addresses a performance issue of time-series subsequence matching. First, we quantitatively examine the performance degradation caused by the window size effect, and then show that the performance of subsequence matching with a single index is not satisfactory in real applications. We argue that index interpolation is useful for resolving this problem. Index interpolation performs subsequence matching by selecting the most appropriate index from multiple indexes, each built on windows of its own size. For index interpolation, we first decide the sizes of the windows for the multiple indexes to be built. In this paper, we solve the problem of selecting optimal window sizes from the perspective of physical database design. To this end, given a set of query sequences to be performed in a target time-series database and a set of window sizes for building multiple indexes, we devise a formula that estimates the cost of all the subsequence matchings. Based on this formula, we propose an algorithm that determines the optimal window sizes for maximizing the performance of all the subsequence matchings. We formally prove the optimality as well as the effectiveness of the algorithm. Finally, we perform a series of extensive experiments with a real-life stock data set and a large synthetic data set. The results reveal that the proposed approach improves on the previous one by 1.5 to 7.8 times.
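
To make the operation concrete, the following is a minimal sketch of subsequence matching by sequential scan, i.e. the index-free baseline rather than the index interpolation method of the paper. The Euclidean distance, the tolerance `eps`, and the function name `subsequence_matches` are assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def subsequence_matches(series, query, eps):
    """Return the start offsets of every window of len(query) in series
    whose Euclidean distance to query is at most eps (assumed measure)."""
    n, m = len(series), len(query)
    hits = []
    for start in range(n - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, query)))
        if dist <= eps:
            hits.append(start)
    return hits


# Hypothetical data: the query pattern occurs exactly at offset 1
# and approximately at offset 5.
series = [1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.1, 2.0]
query = [2.0, 3.0, 2.0]
print(subsequence_matches(series, query, eps=0.2))
```

A scan like this touches every window of the data sequence, which is the cost that window-based indexes are meant to avoid.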

Hippocratic XML Databases: A Model and Access Control Mechanism (히포크라테스 XML 데이터베이스: 모델 및 액세스 통제 방법)

Lee Jae-Gil;Han Wook-Shin;Whang Kyu-Young
- Journal of KIISE:Databases / v.31 no.6 / pp.684-698 / 2004
The Hippocratic database model recently proposed by Agrawal et al. incorporates privacy protection capabilities into relational databases. Since the Hippocratic database is based on the relational database, it needs extensions to be adapted for XML databases. In this paper, we propose the Hippocratic XML database model, an extension of the Hippocratic database model for XML databases, and present an efficient access control mechanism under this model. In contrast to relational data, XML data have tree-like hierarchies. Thus, in order to manage these hierarchies of XML data, we extend and formally define the concepts presented in the Hippocratic database model, such as privacy preferences, privacy policies, privacy authorizations, and usage purposes of data records. Next, we present a new mechanism, which we call the authorization index, that is used in the access control mechanism. This authorization index, which is implemented using a multi-dimensional index, allows us to efficiently search for authorizations implied by the authorization granted on the nearest ancestor, using the nearest neighbor search technique. Using synthetic and real data, we have performed extensive experiments comparing query processing time with those of existing access control mechanisms. The results show that the proposed access control mechanism improves the wall clock time by up to 13.6 times over the top-down access control strategy and by up to 20.3 times over the bottom-up access control strategy. The major contributions of our paper are 1) extending the Hippocratic database model into the Hippocratic XML database model and 2) proposing an efficient access control mechanism that uses the authorization index and the nearest neighbor search technique under this model.

A Cache Consistency Control for B-Tree Indices in a Database Sharing System (데이타베이스 공유 시스템에서 B-트리 인덱스를 위한 캐쉬 일관성 제어)

On, Gyeong-O;Jo, Haeng-Rae
- The KIPS Transactions:PartD / v.8D no.5 / pp.593-604 / 2001
A database sharing system (DSS) refers to a system for high performance transaction processing. In the DSS, the processing nodes are coupled via a high speed network and share a common database at the disk level. Each node has a local memory and a separate copy of the operating system. To reduce the number of disk accesses, each node caches data pages and index pages in its memory buffer. In general, B-tree index pages are accessed more often, and thus cached at more processing nodes, than their corresponding data pages. The B-tree also has complicated operations such as Fetch, Fetch Next, Insertion, and Deletion. Therefore, an efficient cache consistency scheme supporting a high level of concurrency is required. In this paper, we propose cache consistency schemes that use the identifiers of index pages and the page_LSN of leaf pages. The proposed schemes can improve system throughput by reducing the message traffic required between nodes and the number of index re-traversals.

Development of a File-based Moving Objects Storage Component for Efficient Storage and Retrieval of Moving Objects (이동 객체의 효율적인 저장과 검색을 위한 화일 기반 이동 객체 저장 컴포넌트의 개발)

장유정;김동오;홍동숙;한기준
- Proceedings of the Korean Information Science Society Conference / 2004.04b / pp.118-120 / 2004
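With the recent growth in the number of wireless Internet users, interest is rising sharply in location-based services and telematics, which provide a variety of services using the location data of moving objects. To provide diverse application services in the fields of location-based services and telematics, a moving object database system that can store and retrieve large volumes of location data quickly and accurately is essential. However, when an existing database system is used to process large volumes of location data, storage and retrieval performance degrades because of the increase in transaction operations. To solve this problem, in this paper we developed a file-based moving objects storage component for efficiently storing and retrieving the location data of moving objects, and evaluated its performance. The file-based moving objects storage component consists of a multi-connection manager, a simple query processor, an index manager, a data file manager, an index file manager, a metadata manager, a log manager, an OLE DB data provider, and an administration tool.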

Delay Operation Techniques for Efficient MR-Tree on Nand Flash Memory (낸드 플래시 메모리 상에서 효율적인 MR-트리 동작을 위한 지연 연산 기법)

Lee, Hyun-Seung;Song, Ha-Yoon;Kim, Kyung-Chang
- Journal of KIISE:Computing Practices and Letters / v.14 no.8 / pp.758-762 / 2008
Embedded systems usually use flash memory, which offers non-volatility, low access time, low power consumption, and so on. For multimedia database systems, the R-tree is an indexing tree well suited to multimedia access. The MR-tree, an upgraded version of the R-tree, has shown better performance than the R-tree in search, insert, and delete operations. Flash memory uses sectors and blocks as the units of read, write, and delete operations. In particular, deletion is done in units of 512-byte blocks and takes a very long time, and read and write operations in units of blocks are known to match the caching nature of the MR-tree. Our research optimizes MR-tree operations in units of flash memory blocks. This adjustment leads to better indexing performance in database accesses. With the MR-tree on 512B block units, we achieved fast search times for database indexing thanks to the low height of the MR-tree, as well as faster update times thanks to the best fit with flash memory blocks. Thus the MR-tree with optimized operations shows good characteristics as a database index scheme on any system with flash memory.

Efficient Indexing for Large DNA Sequence Databases (대용량 DNA 시퀀스 데이타베이스를 위한 효율적인 인덱싱)

Won Jung-Im;Yoon Jee-Hee;Park Sang-Hyun;Kim Sang-Wook
- Journal of KIISE:Databases / v.31 no.6 / pp.650-663 / 2004
In molecular biology, DNA sequence searching is one of the most crucial operations. Since DNA databases contain a huge volume of sequences, a fast indexing mechanism is essential for efficient processing of DNA sequence searches. In this paper, we first identify the problems of the suffix tree in terms of storage overhead, search performance, and integration with DBMSs. Then, we propose a new index structure that solves those problems. The proposed index consists of two parts: the primary part represents the trie as bit strings without any pointers, and the secondary part supports fast access to the leaf nodes of the trie that need to be accessed for post-processing. We also suggest an efficient algorithm based on that index for DNA sequence searching. To verify the superiority of the proposed approach, we conducted a performance evaluation via a series of experiments. The results revealed that the proposed approach, which requires less storage space, achieves a 13 to 29 times performance improvement over the suffix tree.

Video Index Generation and Search using Trie Structure (Trie 구조를 이용한 비디오 인덱스 생성 및 검색)

현기호;김정엽;박상현
- Journal of KIISE:Software and Applications / v.30 no.7_8 / pp.610-617 / 2003
Similarity matching in video databases is of growing importance in many new applications such as video clustering and digital video libraries. In order to provide efficient access to relevant data in large databases, there have been many research efforts in video indexing with diverse spatial and temporal features. However, most of the previous works relied on sequential matching methods or memory-based inverted file techniques, making them unsuitable for large video databases. In order to resolve this problem, this paper proposes an effective and scalable indexing technique using a trie, originally proposed for string matching, as an index structure. For building an index, we convert each frame into a symbol sequence using a window order heuristic and build a disk-resident trie from a set of symbol sequences. For query processing, we perform a depth-first search on the trie and execute a temporal segmentation. To verify the superiority of our approach, we perform several experiments with real and synthetic data sets. The results reveal that our approach consistently outperforms the sequential scan method, and the performance gain is maintained even with large video databases.
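
As a rough illustration of the trie idea, the sketch below builds an in-memory trie over symbol sequences and answers a query with a depth-first search. It is not the paper's structure: the trie here is not disk-resident, the window order heuristic and temporal segmentation are omitted, and the mismatch budget `max_mismatches` and all names are assumptions made for the example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # symbol -> TrieNode
        self.ids = []       # ids of the sequences that end at this node


def insert(root, symbols, seq_id):
    """Add one symbol sequence to the trie under the given id."""
    node = root
    for sym in symbols:
        node = node.children.setdefault(sym, TrieNode())
    node.ids.append(seq_id)


def search(root, query, max_mismatches=0):
    """Depth-first search that prunes any branch whose mismatch count
    exceeds the budget; returns ids of stored sequences of the same
    length as the query that differ in at most max_mismatches symbols."""
    results = []
    stack = [(root, 0, 0)]  # (node, depth, mismatches so far)
    while stack:
        node, depth, mismatches = stack.pop()
        if depth == len(query):
            results.extend(node.ids)
            continue
        for sym, child in node.children.items():
            cost = mismatches + (sym != query[depth])
            if cost <= max_mismatches:
                stack.append((child, depth + 1, cost))
    return sorted(results)


# Hypothetical symbol sequences standing in for converted video frames.
root = TrieNode()
for seq_id, symbols in enumerate(["ABCA", "ABCB", "ABDA", "BBCA"]):
    insert(root, symbols, seq_id)

print(search(root, "ABCA"))                    # exact match only
print(search(root, "ABCA", max_mismatches=1))  # tolerate one differing symbol
```

Because sequences with a common prefix share a path, the search examines that prefix once and abandons a whole subtree as soon as the budget is exceeded.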

Performance Comparison of the Time-Series Pattern Index Using Data of Various Distributions (다양한 분포의 데이터를 이용한 시계열 패턴 인덱스의 성능 비교)

김영인
- Proceedings of the Korea Society for Industrial Systems Conference / 1998.10a / pp.791-805 / 1998
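Applications such as speech databases and image databases need an index structure for efficiently processing multidimensional time-series patterns. The time-series pattern index (9) was proposed as such an index structure. In this paper, to determine whether the time-series pattern index is applicable to real applications, we compare its performance through experiments using large volumes of data with various distributions. The experiments show that storage performance was good for the uniform distribution. Query processing produced good candidate selection results for all distributions.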

Implementation and Evaluation of Time Interval Partitioning Algorithm in Temporal Databases (시간 데이타베이스에서 시간 간격 분할 알고리즘의 구현 및 평가)

Lee, Kwang-Kyu;Shin, Ye-Ho;Ryu, Keun-Ho;Kim, Hong-Gi
- Journal of KIISE:Computing Practices and Letters / v.8 no.1 / pp.9-16 / 2002
The join operation has a great effect on system performance in temporal databases, as it does in relational databases. For the temporal join in particular, the optimization of interval partitioning determines query processing performance. In this paper, to improve the efficiency of parallel join queries in temporal databases, I propose the Minimum Interval Partition (MIP) scheme for time interval partitioning. The validity of the MIP algorithm, which decides the minimum breakpoints of the partition, is demonstrated with an example scenario, and I confirm improved efficiency compared with the existing partition algorithm.

An Effective Similarity Search Technique supporting Time Warping in Sequence Databases (시퀀스 데이타베이스에서 타임 워핑을 지원하는 효과적인 유사 검색 기법)

Kim, Sang-Wook;Park, Sang-Hyun
- Journal of KIISE:Databases / v.28 no.4 / pp.643-654 / 2001
This paper discusses effective processing of similarity search that supports time warping in large sequence databases. Time warping enables finding sequences with similar patterns even when they are of different lengths. Previous methods fail to employ multi-dimensional indexes without false dismissal since the time warping distance does not satisfy the triangular inequality. They have to scan the entire database and thus suffer from serious performance degradation in large databases. Another method, which employs the suffix tree, also shows poor performance due to the large tree size. In this paper, we propose a novel method for similarity search that supports time warping. Our primary goal is to improve search performance in large databases without false dismissal. To attain this goal, we devise a new distance function $D_{tw-lb}$ that consistently underestimates the time warping distance and also satisfies the triangular inequality. $D_{tw-lb}$ uses a 4-tuple feature vector extracted from each sequence and is invariant to time warping. For efficient processing, we employ a distance function. We prove that our method does not incur false dismissal. To verify the superiority of our method, we perform extensive experiments. The results reveal that our method achieves a significant speedup of up to 43 times with real-world S&P 500 stock data and up to 720 times with very large synthetic data.
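
The lower-bounding idea can be sketched as follows. The abstract does not say which four features are used or how the time warping distance is defined, so this example makes assumptions: the feature vector is taken to be (first, last, greatest, smallest), the lower bound is the largest absolute difference between corresponding features, and the time warping distance is the classic dynamic-programming form with `|a - b|` as the step cost. All function names are hypothetical.

```python
def time_warping_distance(s, q):
    """Classic dynamic-programming time warping distance, summing
    |a - b| over the aligned element pairs (assumed definition)."""
    inf = float("inf")
    n, m = len(s), len(q)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - q[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def feature_vector(seq):
    """Assumed 4-tuple: first, last, greatest, and smallest element."""
    return (seq[0], seq[-1], max(seq), min(seq))


def lower_bound(s, q):
    """Largest absolute difference between corresponding features.
    It is a maximum-norm distance on 4-tuples, so it satisfies the
    triangular inequality and can serve as an index distance."""
    return max(abs(a - b) for a, b in zip(feature_vector(s), feature_vector(q)))


# Hypothetical sequences of different lengths.
s = [1, 2, 3, 4]
q = [1, 1, 2, 3, 3, 5]
print(lower_bound(s, q), time_warping_distance(s, q))

# Repeating elements (warping the time axis) leaves the features unchanged.
print(feature_vector([1, 1, 2, 2, 3, 4, 4]) == feature_vector(s))
```

Under these assumptions, a sequence whose lower bound already exceeds the search tolerance can be discarded without computing the quadratic-time warping distance, and no true match is lost because the bound never exceeds the actual distance.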
