• Title/Summary/Keyword: Data Locality

HA-PVFS : A PVFS File System supporting High Data Availability Adaptive to Temporal Locality (HA-PVFS : 시간적 지역성에 적응적인 데이터 고가용성을 지원하는 PVFS 파일 시스템)

  • Sim Sang-Man;Han Sae-Young;Park Sung-Yong
    • The KIPS Transactions:PartA / v.13A no.3 s.100 / pp.241-252 / 2006
  • In cluster file systems, file availability has traditionally been provided by replicating entire files or by generating parities on parity servers. However, these methods incur very large temporal and spatial costs and cannot cope with massive failure situations in the file system. We therefore propose HA-PVFS, a cluster file system that supports high data availability adaptively to temporal locality. HA-PVFS restricts replication and parity generation to a set of important files; to identify them, it employs an efficient algorithm that estimates file access patterns from limited information. Moreover, to minimize performance degradation of the file system, it uses a delayed update method and relay replication.
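
A minimal sketch of the kind of access-pattern estimation the abstract alludes to: an exponentially decayed per-file access score marks "important" files as replication candidates. The class name, half-life parameter, and scoring rule are illustrative assumptions, not the algorithm used by HA-PVFS.

```python
import time
from collections import defaultdict

class HotFileEstimator:
    """Toy estimator of file access patterns based on temporal locality.

    Keeps an exponentially decayed access score per file and reports the
    top-scoring files as candidates for replication or parity generation.
    Illustrative only; not the estimation algorithm used by HA-PVFS.
    """

    def __init__(self, half_life_s=300.0):
        self.half_life_s = half_life_s
        self.scores = defaultdict(float)   # file -> decayed access score
        self.last_seen = {}                # file -> time of last access

    def record_access(self, path, now=None):
        now = time.time() if now is None else now
        prev = self.last_seen.get(path, now)
        decay = 0.5 ** ((now - prev) / self.half_life_s)
        self.scores[path] = self.scores[path] * decay + 1.0
        self.last_seen[path] = now

    def hot_files(self, k=10):
        """Files most worth replicating under the current access pattern."""
        return sorted(self.scores, key=self.scores.get, reverse=True)[:k]

# Example: only the frequently re-accessed file becomes a replication candidate.
est = HotFileEstimator()
for _ in range(50):
    est.record_access("/data/hot.log")
est.record_access("/data/cold.dat")
print(est.hot_files(k=1))   # ['/data/hot.log']
```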

An Efficient Data Centric Storage Scheme with Non-uniformed Density of Wireless Sensor Networks (센서의 불균일한 배포밀도를 고려한 효율적인 데이터 중심 저장기법)

  • Seong, dong-ook;Lee, seok-jae;Song, seok-il;Yoo, jae-soo
    • Proceedings of the Korea Contents Association Conference / 2007.11a / pp.135-139 / 2007
  • Recently, Data Centric Storage (DCS) schemes have been studied for various applications such as natural environment investigation, military systems, and environmental change monitoring. In a DCS scheme, data are stored at nodes within the network according to their name. Existing schemes have several drawbacks: first, range query processing is inefficient because the locality of the storage point is not considered; second, the storage load of the sensors becomes unbalanced when the sensor distribution density is non-uniform. In this paper, we propose a novel data-centric storage scheme that accounts for the sensor distribution density while preserving the locality of the data storage location. The scheme divides the whole sensor network area into a grid and distributes a density bitmap that holds the sensor density information of each cell; sensors use the density bitmap both for storing and for searching data. We evaluate our scheme against existing schemes and show improved load balancing and more efficient range query processing in environments where sensors are distributed non-uniformly.
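
A toy illustration of density-aware data-centric storage: the field is divided into grid cells, and a key's storage cell is chosen with probability proportional to the cell's sensor density, so per-sensor storage load stays balanced. The grid layout, density values, and hash-based mapping are assumptions for illustration, not the paper's exact scheme.

```python
import hashlib

GRID = {                      # cell id -> number of sensors in that cell (assumed)
    (0, 0): 2, (0, 1): 10,
    (1, 0): 4, (1, 1): 20,
}

def _key_hash(name: str) -> int:
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

def storage_cell(data_name: str):
    """Pick the grid cell that stores `data_name`, weighted by sensor density."""
    total = sum(GRID.values())
    ticket = _key_hash(data_name) % total
    for cell, density in sorted(GRID.items()):
        if ticket < density:
            return cell
        ticket -= density
    raise AssertionError("unreachable")

# Dense cells receive proportionally more keys, so each sensor stores
# roughly the same amount of data.
for name in ("temp:17", "humidity:3", "light:42"):
    print(name, "->", storage_cell(name))
```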

CORE-Dedup: IO Extent Chunking based Deduplication using Content-Preserving Access Locality (CORE-Dedup: 내용보존 접근 지역성 활용한 IO 크기 분할 기반 중복제거)

  • Kim, Myung-Sik;Won, You-Jip
    • Journal of the Korea Society of Computer and Information / v.20 no.6 / pp.59-76 / 2015
  • The recent spread of embedded devices and the growth of broadband communication technology have led to a rapid increase in the volume of data being created and managed. As a result, data centers must expand their storage capacity cost-effectively to store the created data. Data deduplication is one way to save storage space by removing redundant data. This work proposes an IO extent based deduplication scheme, called CORE-Dedup, that exploits content-preserving access locality. We acquire IO traces from the block device layer of a virtual machine host and compare the deduplication performance of fixed-size and IO extent based chunking. For a workload of ten users' compile jobs in a virtual machine environment, 4 KB fixed-size chunking and IO extent based chunking use 14,500 and 1,700 chunk index entries, respectively, while their deduplication rates are 60.4% and 57.6%, respectively.
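
A rough sketch contrasting fixed-size chunking with IO-extent-based chunking for deduplication, which is the comparison the abstract evaluates. The trace format, the 4 KB block size, and the toy data are assumptions; this is not the authors' implementation.

```python
import hashlib

BLOCK = 4096  # assumed fixed chunk size (4 KB)

def fixed_chunk_index(extents):
    """Index of unique 4 KB chunks when every extent is cut into fixed blocks."""
    index = set()
    for data in extents:
        for i in range(0, len(data), BLOCK):
            index.add(hashlib.sha1(data[i:i + BLOCK]).hexdigest())
    return index

def extent_chunk_index(extents):
    """Index of unique chunks when each IO extent is treated as one chunk."""
    return {hashlib.sha1(data).hexdigest() for data in extents}

# A 16 KB extent with four distinct blocks, rewritten many times, plus a
# uniform 8 KB extent: the extent-level index stays much smaller.
extent_a = b"".join(bytes([i]) * BLOCK for i in range(4))
extent_b = bytes([99]) * (2 * BLOCK)
trace = [extent_a] * 10 + [extent_b] * 3

print("fixed-size index entries  :", len(fixed_chunk_index(trace)))   # 5
print("extent-based index entries:", len(extent_chunk_index(trace)))  # 2
```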

A Block Allocation Policy to Enhance Wear-leveling in a Flash File System (플래시 파일시스템에서 wear-leveling 개선을 위한 블록 할당 정책)

  • Jang, Si-Woong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2007.10a / pp.574-577 / 2007
  • While a disk can be overwritten when data are updated, flash memory cannot; updated data must be written to a new area. If data are updated frequently, garbage collection, performed by erasing blocks, is needed to reclaim free space. Because the number of erase operations is limited by the characteristics of flash memory, every block should be written and erased evenly. However, when data with access locality are handled by the cost-benefit algorithm with separation of hot and cold blocks, processing performance is high but wear-leveling is uneven. In this paper, we propose the CB-MB (Cost Benefit between Multi Bank) algorithm, in which hot data are allocated to one bank and cold data to another, and the roles of the hot and cold banks are exchanged every period. CB-MB shows performance similar to that of other methods for uniform workloads, but provides much better performance for workloads with access locality.
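
A minimal sketch of the CB-MB idea as described in the abstract: hot data go to one bank, cold data to the other, and the banks swap roles periodically so erase wear spreads across both. The period, the two-bank setup, and the interface are illustrative assumptions.

```python
class CBMBAllocator:
    """Hot data to one bank, cold data to the other; roles swap each period."""

    def __init__(self, period=1000):
        self.hot_bank, self.cold_bank = "bank0", "bank1"
        self.period = period
        self.writes = 0

    def allocate(self, is_hot: bool) -> str:
        """Return the bank a new block should be allocated from."""
        self.writes += 1
        if self.writes % self.period == 0:
            # Swap roles so hot traffic does not wear out a single bank.
            self.hot_bank, self.cold_bank = self.cold_bank, self.hot_bank
        return self.hot_bank if is_hot else self.cold_bank

alloc = CBMBAllocator(period=4)
print([alloc.allocate(h) for h in (True, True, False, True, True, False, True, True)])
```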

An Area Efficient Low Power Data Cache for Multimedia Embedded Systems (멀티미디어 내장형 시스템을 위한 저전력 데이터 캐쉬 설계)

  • Kim Cheong-Ghil;Kim Shin-Dug
    • The KIPS Transactions:PartA / v.13A no.2 s.99 / pp.101-110 / 2006
  • One of the most effective ways to improve cache performance is to exploit both the temporal and spatial locality exhibited by a program's execution characteristics. This paper proposes a data cache with a small footprint that achieves low power yet high performance for multimedia applications. The basic architecture is a split cache consisting of a direct-mapped cache with a small block size and a fully associative buffer with a large block size. To overcome the disadvantage of the small cache space, two mechanisms are added that exploit the operational behavior of multimedia applications: adaptive multi-block prefetching, which initiates fetches of various sizes, and efficient block filtering, which removes rarely reused data. Simulations on MediaBench show that the proposed 5 KB cache provides equivalent performance and reduces energy consumption by up to 40% compared with a 16 KB 4-way set-associative cache.
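
A rough functional sketch of the split-cache organization the abstract describes: a small direct-mapped cache with small blocks backed by a fully associative buffer with large blocks that captures spatial locality. The sizes, the LRU policy, and the simplified miss handling are assumptions, not the paper's configuration; the adaptive prefetching and block filtering are omitted.

```python
from collections import OrderedDict

DM_SETS, DM_BLOCK = 64, 16          # direct-mapped part: 64 sets of 16-byte blocks
FA_ENTRIES, FA_BLOCK = 8, 256       # fully associative buffer: 8 entries of 256 bytes

dm_tags = [None] * DM_SETS
fa_lru = OrderedDict()              # large-block tag, kept in LRU order

def access(addr: int) -> bool:
    """Return True on a hit in either structure, filling both on a miss."""
    dm_index = (addr // DM_BLOCK) % DM_SETS
    dm_tag = addr // (DM_BLOCK * DM_SETS)
    fa_tag = addr // FA_BLOCK

    if dm_tags[dm_index] == dm_tag or fa_tag in fa_lru:
        if fa_tag in fa_lru:
            fa_lru.move_to_end(fa_tag)   # refresh LRU position
        return True

    # Miss: fill the direct-mapped block and bring in the large block.
    dm_tags[dm_index] = dm_tag
    fa_lru[fa_tag] = True
    if len(fa_lru) > FA_ENTRIES:
        fa_lru.popitem(last=False)       # evict least recently used large block
    return False

# Streaming accesses within one 256-byte region hit in the buffer after the first miss.
hits = sum(access(a) for a in range(0, 256, 16))
print(f"{hits}/16 hits")  # 15/16
```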

A Cache-Conscious Compression Index Based on the Level of Compression Locality (압축 지역성 수준에 기반한 캐쉬 인식 압축 색인)

  • Kim, Won-Sik;Yoo, Jae-Jun;Lee, Jin-Soo;Han, Wook-Shin
    • Journal of Korea Multimedia Society / v.13 no.7 / pp.1023-1043 / 2010
  • As main memory gets cheaper, it becomes increasingly affordable to load the entire index of a DBMS into main memory and access it there. Since the speed gap between the CPU and main memory keeps growing, much research aims to reduce the cost of main memory accesses. Cache-conscious trees are one such approach: by compressing the data in each node they reduce the number of cache misses and thus the cost of main memory access. Existing cache-conscious trees use a single fixed compression technique without considering the properties of the data in a node. This paper first proposes the DC-tree, which uses various compression techniques and changes the data layout within a node according to the properties of its data in order to reduce cache misses. Second, it defines the level of compression locality, a formula that characterizes the properties of the data in a node. Third, it proposes Forced Partial Decomposition (FPD), which further reduces the number of cache misses. In terms of the number of cache misses, DC-trees outperform the B+-tree by 1.7X, the simple prefix B+-tree by 1.5X, and the pkB-tree by 1.3X. Since the proposed DC-trees can be adopted in commercial main memory database systems, we believe they are a practical result.
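
The paper's "level of compression locality" formula is not reproduced in the abstract; as a stand-in, the sketch below uses the average common-prefix share of a node's keys to decide whether prefix compression is worthwhile for that node. The metric, the threshold, and the layout names are purely illustrative and are not the paper's definitions.

```python
import os

def compression_locality(keys):
    """Average share of each key covered by the node-wide common prefix (toy metric)."""
    prefix = os.path.commonprefix(keys)
    return sum(len(prefix) / len(k) for k in keys) / len(keys)

def choose_layout(keys, threshold=0.5):
    """Pick a node layout: prefix compression pays off only for clustered keys."""
    return "prefix-compressed" if compression_locality(keys) >= threshold else "plain"

print(choose_layout(["user:1001", "user:1002", "user:1003"]))  # prefix-compressed
print(choose_layout(["alpha", "kiwi", "zebra"]))               # plain
```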

A Local Buffer Allocation Scheme for Multimedia Data on Linux (리눅스 상에서 멀티미디어 데이타를 고려한 지역 버퍼 할당 기법)

  • 신동재;박성용;양지훈
    • Journal of KIISE:Computing Practices and Letters / v.9 no.4 / pp.410-419 / 2003
  • The buffer cache of general-purpose operating systems such as Linux manages file data using a global block replacement policy and read-ahead. As a result, multimedia data, which have low locality of reference and various consumption rates, show a low cache hit ratio and consume additional buffers because of read-ahead. In this paper, we design and implement a new buffer allocation algorithm for multimedia data on Linux. Our approach keeps one read-ahead cache per opened multimedia file and dynamically changes the read-ahead group size based on the buffer consumption rate of the file. This distributes resources fairly and optimizes buffer consumption. We compare the system performance with that of Linux 2.4.17 in terms of buffer consumption and buffer hit ratio.
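
A small sketch of a per-file read-ahead window that grows or shrinks with the stream's consumption rate, in the spirit of the abstract. The doubling/halving rule and the page bounds are assumptions, not the actual Linux 2.4.17 modification.

```python
class ReadAheadWindow:
    """Per-file read-ahead group whose size tracks the stream's consumption rate."""

    def __init__(self, min_pages=4, max_pages=128):
        self.size = min_pages
        self.min_pages, self.max_pages = min_pages, max_pages

    def on_stream_progress(self, pages_consumed_since_last_readahead: int) -> int:
        """Grow the window if the stream drained it, shrink it otherwise."""
        if pages_consumed_since_last_readahead >= self.size:
            self.size = min(self.size * 2, self.max_pages)   # fast consumer
        else:
            self.size = max(self.size // 2, self.min_pages)  # slow consumer
        return self.size

win = ReadAheadWindow()
print([win.on_stream_progress(n) for n in (4, 8, 16, 2, 2)])  # [8, 16, 32, 16, 8]
```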

A method for improving wear-leveling of flash file systems in workload of access locality (접근 지역성을 가지는 작업부하에서 플래시 파일시스템의 wear-leveling 향상 기법)

  • Jang, Si-Woong
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.1 / pp.108-114 / 2008
  • Since flash memory cannot be overwritten, updated data must be written to a new area. If data are updated frequently, garbage collection, performed by erasing blocks, is needed to reclaim free space. Because the number of erase operations is limited by the characteristics of flash memory, every block should be written and erased evenly. However, when data with access locality are handled by the cost-benefit algorithm with separation of hot and cold blocks, processing performance is high but wear-leveling is uneven. In this paper, we propose the CB-MB (Cost Benefit between Multi Bank) algorithm, in which hot data are allocated to one bank and cold data to another, and the roles of the hot and cold banks are exchanged every period. CB-MB performs about 30% better than the cost-benefit algorithm with hot/cold block separation, and the standard deviation of its wear-leveling is about one third of that algorithm's.
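
The abstract reports wear-leveling evenness as the standard deviation of per-block erase counts; the snippet below simply computes that metric on invented erase-count data (the numbers are illustrative, not from the paper).

```python
import statistics

def wear_leveling_stddev(erase_counts):
    """Smaller standard deviation of erase counts means more even wear."""
    return statistics.pstdev(erase_counts)

baseline = [120, 30, 115, 25, 118, 28]   # hot/cold separation concentrated on few blocks
cb_mb    = [72, 70, 75, 69, 74, 71]      # bank roles swapped each period (hypothetical)
print(round(wear_leveling_stddev(baseline), 1), round(wear_leveling_stddev(cb_mb), 1))
```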

k-NN Join Based on LSH in Big Data Environment

  • Ji, Jiaqi;Chung, Yeongjee
    • Journal of information and communication convergence engineering / v.16 no.2 / pp.99-105 / 2018
  • k-Nearest neighbor join (k-NN Join) is a computationally intensive algorithm that is designed to find k-nearest neighbors from a dataset S for every object in another dataset R. Most related studies on k-NN Join are based on single-computer operations. As the data dimensions and data volume increase, running the k-NN Join algorithm on a single computer cannot generate results quickly. To solve this scalability problem, we introduce the locality-sensitive hashing (LSH) k-NN Join algorithm implemented in Spark, an approach for high-dimensional big data. LSH is used to map similar data onto the same bucket, which can reduce the data search scope. In order to achieve parallel implementation of the algorithm on multiple computers, the Spark framework is used to accelerate the computation of distances between objects in a cluster. Results show that our proposed approach is fast and accurate for high-dimensional and big data.
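
A minimal single-machine sketch of how LSH narrows the k-NN join search scope: random-hyperplane signatures bucket similar vectors, and each vector in R is compared only with the S vectors in its own bucket. The Spark parallelization is omitted, and the signature length, dimensionality, and data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_PLANES = 32, 8
planes = rng.standard_normal((N_PLANES, DIM))   # one random hyperplane per signature bit

def signature(v):
    """Sign pattern of the vector against the random hyperplanes."""
    return tuple((planes @ v > 0).astype(int))

def knn_join(R, S, k=3):
    """For each vector in R, return the k nearest S indices from its LSH bucket."""
    buckets = {}
    for idx, s in enumerate(S):
        buckets.setdefault(signature(s), []).append(idx)
    result = {}
    for i, r in enumerate(R):
        candidates = buckets.get(signature(r), [])
        candidates.sort(key=lambda j: np.linalg.norm(S[j] - r))
        result[i] = candidates[:k]
    return result

S = rng.standard_normal((1000, DIM))
R = S[:5] + 0.01 * rng.standard_normal((5, DIM))  # slightly perturbed copies of S[0..4]
print(knn_join(R, S))   # each R[i] should find S[i] among its nearest candidates
```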

New record of Codium lucasii (Bryopsidales, Chlorophyta) in Korea

  • An, Jae Woo;Nam, Ki Wan
    • Journal of Ecology and Environment / v.38 no.4 / pp.647-654 / 2015
  • A prostrate species of Codium (Bryopsidales, Chlorophyta) was collected from Daejin on the eastern coast of Korea. This alga is morphologically characterized by a prostrate, adherent or pulvinate, dark green thallus that is tightly attached to the substratum. The utricles are strongly grouped and cylindrical to slightly clavate. Their apex is rounded to capitate and frequently bears an alveolate ornament. Hair scars are found in the upper portion of the utricle. The gametangia grow on a short pedicel in the upper part of the utricle. In the phylogenetic tree based on molecular data, this alga is placed in the same clade as C. mozambiquense in the UPGMA analysis, and nests in a sister clade of C. lucasii subsp. capense and C. mozambiquense in the ML and NJ analyses. However, the genetic distance between the Korean alga and these two species is 1.3-1.9%, whereas that between the Korean alga and C. lucasii from Japan is 1.1%, within the intraspecific range. The divergence between the Korean alga and C. lucasii from the type locality (Australia) is 2.7%, which is considered to be within the interspecific range. Based on this genetic divergence, the Korean alga together with Japanese C. lucasii can be separated from genuine C. lucasii from the type locality. However, the Korean alga is identified as C. lucasii until those entities are morphologically characterized at the species level. This is the first record of C. lucasii in Korea.