• Title/Summary/Keyword: 중복제거기법 (deduplication techniques)

Search Results: 221

Data Backup System Exploiting De-duplication TAR Scheme (중복제거 TAR 기법을 적용한 백업 시스템)

  • Kang, Sung-Woon;Jung, Ho-Min;Lee, Jeong-Gun;Ko, Young-Woong
    • Proceedings of the Korean Information Science Society Conference / 2011.06a / pp.539-542 / 2011
  • Archive formats such as TAR provide no file-level deduplication, so systems that store data version by version, such as Linux distribution mirrors, waste disk space. This study proposes DTAR, a TAR-style compressed format extended with file deduplication, and DTM, a utility that controls it. The key idea is that when the client builds a DTAR it adds SHA1 hash information to the header; the DTM utility builds a red-black tree whose nodes are these SHA1 hashes, compares it against the hash information stored on the server, and selectively compresses, backs up, and manages only the files in the DTAR that are not duplicates. Experiments show that as duplicate data accumulates, backup through DTM gives DTAR markedly better performance than tar.gz, in both storage space and in the transmission time of backup data packets.
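The hash-then-compare flow can be illustrated with a minimal Python sketch; a plain set stands in for the paper's SHA1-keyed red-black tree, the function names are hypothetical, and DTAR's actual header layout is not reproduced:

```python
import hashlib
import tarfile

def sha1_of(path, chunk=1 << 20):
    """SHA1 fingerprint of a file, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        while buf := f.read(chunk):
            h.update(buf)
    return h.hexdigest()

def backup(paths, archive, server_hashes: set):
    """Archive only files whose SHA1 the server has not seen; the set
    stands in for the R-B tree keyed by SHA1 described in the paper."""
    manifest = {}                          # per-file hash, header-style record
    with tarfile.open(archive, "w:gz") as tar:
        for p in paths:
            digest = sha1_of(p)
            manifest[p] = digest
            if digest not in server_hashes:
                tar.add(p)                 # unique content: back it up
                server_hashes.add(digest)
    return manifest
```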

Deduplication Technique for Smartphone Application Update Scenario (스마트폰의 어플리케이션 업데이트 패턴을 고려한 데이터 중복제거 기법 연구)

  • Park, Dae-Jun;Choi, Dong-Soo;Shin, Dong-Kun
    • Proceedings of the Korean Information Science Society Conference / 2012.06a / pp.364-366 / 2012
  • As the application ecosystem has grown, smartphone applications have multiplied and their updates have become frequent. An application update deletes the previous version from NAND flash memory and issues writes for the new version. The user therefore experiences degraded smartphone performance from NAND flash memory's relatively slow writes, and the repeated erase/write operations shorten the memory's lifetime. Observing that updated application data differs little from the previous version, this paper proposes a technique that improves update performance and extends NAND flash memory lifetime through data deduplication, and experimentally measures the deduplication rate for real applications.
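A minimal sketch of the underlying idea, assuming fixed-position block comparison between the old and new application images; the abstract does not specify the paper's actual chunking policy, so the block size and names here are illustrative:

```python
import hashlib

BLOCK = 4096  # assumed NAND-page-sized comparison unit

def changed_blocks(old: bytes, new: bytes):
    """Compare the new application image to the old one block by block
    and yield only the blocks that actually changed; unchanged blocks
    need no flash write, saving slow programs and erase cycles."""
    for i in range(0, len(new), BLOCK):
        nb = new[i:i + BLOCK]
        ob = old[i:i + BLOCK]
        if hashlib.sha1(nb).digest() != hashlib.sha1(ob).digest():
            yield i // BLOCK, nb
```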

Efficient Generation of 3-D Video Holograms Using Temporal-Spatial Redundancy of 3-D Moving Images (3차원 동영상의 시ㆍ공간적 정보 중복성을 이용한 효과적인 3차원 비디오 홀로그램의 생성)

  • Kim, Dong-Wook;Koo, Jung-Sik;Kim, Seung-Cheol;Kim, Eun-Soo
    • The Journal of Korean Institute of Communications and Information Sciences / v.37C no.10 / pp.859-869 / 2012
  • In this paper, a new method to efficiently generate 3-D (three-dimensional) video holograms for 3-D moving scenes, called here the TSR-N-LUT method, is proposed through the combined use of the temporal-spatial redundancy (TSR) of 3-D video images and the novel look-up table (N-LUT) technique. In the proposed scheme, the differential pulse-code modulation (DPCM) algorithm first removes temporally redundant data between the frames of a 3-D video, and then removes spatially redundant data between the lines within each frame. Experimental results show that the proposed method reduces the number of calculated object points and the calculation time per object point by 23.72% and 19.55% on average, respectively, compared to the conventional method. Good experimental results with 3-D test moving pictures confirm the feasibility of the proposed method for fast generation of CGH patterns of 3-D video images.
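A rough sketch of the two DPCM passes, assuming frames are NumPy intensity/depth arrays; the N-LUT hologram computation itself is omitted and the function names are hypothetical:

```python
import numpy as np

def dpcm_changed_points(prev_frame, curr_frame, eps=0):
    """Inter-frame DPCM: difference the current frame against the
    previous one and keep only the object points that changed; the
    hologram is then updated for these points alone."""
    diff = curr_frame.astype(np.int32) - prev_frame.astype(np.int32)
    ys, xs = np.nonzero(np.abs(diff) > eps)
    return list(zip(ys.tolist(), xs.tolist()))

def dpcm_changed_lines(frame, eps=0):
    """Inter-line DPCM within one frame: a line is recomputed only if
    it differs from the line directly above it."""
    d = np.abs(np.diff(frame.astype(np.int32), axis=0))
    changed = np.any(d > eps, axis=1)
    return [0] + [i + 1 for i, c in enumerate(changed) if c]
```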

Design and Implementation of Inline Data Deduplication in Cluster File System (클러스터 파일 시스템에서 인라인 데이터 중복제거 설계 및 구현)

  • Kim, Youngchul;Kim, Cheiyol;Lee, Sangmin;Kim, Youngkyun
    • KIISE Transactions on Computing Practices / v.22 no.8 / pp.369-374 / 2016
  • The growing demand for virtual computing and storage resources in the cloud computing environment has led to deduplication in storage systems for effective reduction and utilization of storage space. In particular, a large reduction in storage space is possible on virtual desktop infrastructure by preventing identical content, such as virtual desktop images, from being stored more than once. However, in order to provide reliable virtual desktop services, the storage system must handle the variety of workloads that virtual desktops generate, such as the performance overhead of deduplication, periodic data I/O storms, and frequent random I/O operations. In this paper, we design and implement a clustered file system that supports virtual desktop and storage services in a cloud computing environment. The proposed clustered file system keeps storage consumption low by means of inline deduplication of virtual desktop images. In addition, it reduces performance overhead by running the deduplication process on the data server rather than on the virtual hosts where the virtual desktops run.
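A minimal sketch of server-side inline deduplication, assuming content-addressed chunks with reference counting; the class and method names are hypothetical and the actual cluster file system protocol is not shown:

```python
import hashlib

class DedupDataServer:
    """Inline deduplication at the data server: each incoming chunk is
    fingerprinted before it reaches disk, so duplicate chunks are never
    written and the virtual hosts never pay the hashing cost."""

    def __init__(self):
        self.store = {}    # fingerprint -> chunk bytes
        self.refcnt = {}   # fingerprint -> number of references

    def write_chunk(self, chunk: bytes) -> str:
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in self.store:           # first copy: store it once
            self.store[fp] = chunk
        self.refcnt[fp] = self.refcnt.get(fp, 0) + 1
        return fp                          # client keeps this handle

    def read_chunk(self, fp: str) -> bytes:
        return self.store[fp]

    def delete_chunk(self, fp: str):
        self.refcnt[fp] -= 1
        if self.refcnt[fp] == 0:           # last reference gone
            del self.store[fp], self.refcnt[fp]
```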

Data De-duplication and Recycling Technique in SSD-based Storage System for Increasing De-duplication Rate and I/O Performance (SSD 기반 스토리지 시스템에서 중복률과 입출력 성능 향상을 위한 데이터 중복제거 및 재활용 기법)

  • Kim, Ju-Kyeong;Lee, Seung-Kyu;Kim, Deok-Hwan
    • Journal of the Institute of Electronics and Information Engineers / v.49 no.12 / pp.149-155 / 2012
  • An SSD is a storage device with a high-performance controller and cache buffer, built from many NAND flash memories. Because NAND flash memory does not support in-place updates, valid pages are invalidated when update and erase operations are issued by the file system, and invalid pages are later deleted completely by garbage collection. Garbage collection, however, performs many long-latency erase operations, which reduces I/O performance and accelerates wear in the SSD. In this paper, we propose a new method that deduplicates valid data and recycles invalid data: deduplicating against recycled invalid data improves the deduplication ratio, and the resulting reduction in writes and garbage collection increases I/O performance and reduces wear in the SSD. Experimental results show that it can reduce the number of garbage collections by up to 20% and I/O latency by 9% compared to the baseline.
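A toy FTL-style sketch of the recycling idea, assuming a fingerprint index over pages still physically present on flash; page allocation and garbage collection are elided, and all names are hypothetical:

```python
import hashlib

class DedupFTL:
    """Toy mapping layer: a write whose content fingerprint matches a
    page already on flash -- valid OR invalidated but not yet erased --
    is served by remapping instead of programming a new page."""

    def __init__(self):
        self.map = {}        # lpn -> ppn
        self.page_fp = {}    # ppn -> fingerprint (page still on flash)
        self.by_fp = {}      # fingerprint -> ppn
        self.next_ppn = 0
        self.programs = 0    # count of actual flash writes

    def write(self, lpn, data: bytes):
        fp = hashlib.sha1(data).hexdigest()
        ppn = self.by_fp.get(fp)
        if ppn is None:                    # unseen content: program a page
            ppn = self.next_ppn
            self.next_ppn += 1
            self.page_fp[ppn] = fp
            self.by_fp[fp] = ppn
            self.programs += 1
        self.map[lpn] = ppn                # dedup or recycle: remap only
        return ppn
```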

A Clustering File Backup Server Using Multi-level De-duplication (다단계 중복 제거 기법을 이용한 클러스터 기반 파일 백업 서버)

  • Ko, Young-Woong;Jung, Ho-Min;Kim, Jin
    • Journal of KIISE:Computing Practices and Letters / v.14 no.7 / pp.657-668 / 2008
  • Traditional off-the-shelf file servers have several potential drawbacks in storing data blocks. The first is the lack of practical deduplication when storing data blocks, which wastes storage capacity. The second is the requirement for a high-performance computer system to process large volumes of data blocks. To address these problems, this paper proposes a clustering backup system that exploits a file-fingerprinting mechanism for block-level deduplication. Our approach differs from traditional file server systems in two ways. First, we avoid data redundancy through multi-level file-fingerprint technology, which lets us use storage capacity efficiently. Second, we apply cluster technology to the I/O subsystem, which effectively reduces data I/O time and network bandwidth usage. Experimental results show that both the storage capacity requirement and I/O performance are noticeably improved.
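A minimal two-level fingerprinting sketch, assuming SHA1 file- and block-level digests with in-memory indexes; the paper's cluster I/O subsystem is not modeled and the names are illustrative:

```python
import hashlib

def fingerprint(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

def multilevel_dedup(file_bytes, file_index: set, block_index: set, block=8192):
    """Level 1: whole-file fingerprint; an identical file costs a single
    lookup. Level 2: block fingerprints, consulted only for files that
    are new at level 1, so only unseen blocks ever reach disk."""
    ffp = fingerprint(file_bytes)
    if ffp in file_index:
        return []                         # whole file already stored
    file_index.add(ffp)
    new_blocks = []
    for i in range(0, len(file_bytes), block):
        b = file_bytes[i:i + block]
        bfp = fingerprint(b)
        if bfp not in block_index:
            block_index.add(bfp)
            new_blocks.append((bfp, b))   # only these are written
    return new_blocks
```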

Improving Efficiency of Encrypted Data Deduplication with SGX (SGX를 활용한 암호화된 데이터 중복제거의 효율성 개선)

  • Koo, Dongyoung
    • KIPS Transactions on Computer and Communication Systems / v.11 no.8 / pp.259-268 / 2022
  • With the widespread use of cloud services to improve management efficiency amid the explosive increase in data volume, various cryptographic techniques are applied to preserve data privacy. Despite the vast computing resources of cloud systems, the decrease in storage efficiency caused by redundancy in data outsourced from multiple users significantly reduces service efficiency. Among several approaches to privacy-preserving deduplication over encrypted data, this paper analyzes recent USENIX ATC results on improving the efficiency of encrypted data deduplication using a trusted execution environment (TEE), in terms of the security and efficiency of the participating entities. We present a way to improve the stability of the key-managing server by integrating it with individual clients, achieving secure deduplication without independent key servers. Experimental results show that the communication efficiency of the proposed approach improves by about 30%, with the effect of a distributed key server, while providing security guarantees as robust as those of the previous work.
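For orientation, the classic baseline this line of work builds on is message-locked (convergent) encryption, where the key is derived from the content itself so identical plaintexts deduplicate. The sketch below shows only that baseline, not the paper's SGX/TEE-integrated key management, and assumes the third-party `cryptography` package:

```python
import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def convergent_encrypt(plaintext: bytes):
    """Message-locked encryption: the key is derived from the content,
    so identical plaintexts from different users produce identical
    ciphertexts and deduplicate server-side without exposing data."""
    key = hashlib.sha256(plaintext).digest()    # content-derived key
    nonce = hashlib.sha256(key).digest()[:12]   # deterministic nonce
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)
    index_tag = hashlib.sha256(ciphertext).hexdigest()  # dedup lookup handle
    return key, nonce, index_tag, ciphertext
```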

Design and Implementation of Multiple Filter Distributed Deduplication System Applying Cuckoo Filter Similarity (쿠쿠 필터 유사도를 적용한 다중 필터 분산 중복 제거 시스템 설계 및 구현)

  • Kim, Yeong-A;Kim, Gea-Hee;Kim, Hyun-Ju;Kim, Chang-Geun
    • Journal of Convergence for Information Technology / v.10 no.10 / pp.1-8 / 2020
  • Techniques for storing, managing, and retrieving data have emerged as key to business success as technologies based on the data generated by business activities have matured. Existing big data platform systems must ingest the large amounts of unstructured data generated in real time without delay, and must manage storage space efficiently by deduplicating redundant data across different storages. In this paper, we propose a multi-layer distributed deduplication system that applies similarity based on the Cuckoo hashing filter technique, taking the characteristics of big data into account. Similarity between virtual machines is captured with Cuckoo hashing, individual storage nodes improve performance through deduplication efficiency, and a multi-layer Cuckoo filter is applied to reduce processing time. Experimental results show that the proposed method shortens processing time by 8.9% and increases the deduplication rate by 10.3%.
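A minimal cuckoo filter, the membership structure the paper layers and distributes; parameters and names are illustrative, and the bucket count must be a power of two so the partial-key relocation identity holds:

```python
import hashlib
import random

class CuckooFilter:
    """Minimal cuckoo filter: short fingerprints, two candidate buckets
    per item, eviction (kicking) on collision. nbuckets must be a power
    of two so that relocating by i XOR hash(fp) is self-inverse."""

    def __init__(self, nbuckets=1024, bucket_size=4, max_kicks=500):
        self.n = nbuckets
        self.bs = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(nbuckets)]

    def _fp(self, item: bytes) -> int:
        return int.from_bytes(hashlib.sha1(item).digest()[:2], "big") or 1

    def _index(self, item: bytes) -> int:
        return int.from_bytes(hashlib.md5(item).digest()[:4], "big") % self.n

    def _alt(self, i: int, fp: int) -> int:
        return (i ^ (fp * 0x5bd1e995)) % self.n   # partner bucket

    def insert(self, item: bytes) -> bool:
        fp = self._fp(item)
        i1 = self._index(item)
        i2 = self._alt(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bs:
                self.buckets[i].append(fp)
                return True
        i = random.choice((i1, i2))               # both full: start kicking
        for _ in range(self.max_kicks):
            victim = random.randrange(len(self.buckets[i]))
            fp, self.buckets[i][victim] = self.buckets[i][victim], fp
            i = self._alt(i, fp)
            if len(self.buckets[i]) < self.bs:
                self.buckets[i].append(fp)
                return True
        return False                              # filter considered full

    def contains(self, item: bytes) -> bool:
        fp = self._fp(item)
        i1 = self._index(item)
        return fp in self.buckets[i1] or fp in self.buckets[self._alt(i1, fp)]
```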

Eliminating Redundant Alarms of Buffer Overflow Analysis Using Context Refinements (분석 문맥 조절 기법을 이용한 버퍼 오버플로우 분석의 중복 경보 제거)

  • Kim, You-Il;Han, Hwan-Soo
    • Journal of KIISE:Software and Applications / v.37 no.12 / pp.942-945 / 2010
  • In order to reduce the effort needed to inspect the alarms reported by a static buffer overflow analyzer, we present an effective method to filter out redundant alarms. In static analysis, a sequence of multiple alarms frequently arises from a single cause in the code. In such a case it is sufficient and reasonable for programmers to examine the first alarm instead of every alarm in the sequence. Based on this observation, we devise a buffer overflow analysis that filters out redundant alarms with our context refinement technique. Our experiments with several open source programs show that the method reduces the reported alarms by 23% on average.
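The filtering policy reduces to keeping one representative per cause chain; a minimal sketch, assuming the analyzer has already attached a context-refined `cause` identifier to each alarm (that field name is hypothetical):

```python
def filter_redundant_alarms(alarms):
    """Keep only the first alarm of each cause chain: alarms the
    analyzer traces back to the same origin are grouped, and a single
    representative per group is reported for inspection."""
    seen = set()
    kept = []
    for alarm in alarms:              # assumed ordered by program point
        if alarm["cause"] not in seen:
            seen.add(alarm["cause"])
            kept.append(alarm)
    return kept
```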

An Efficient Data Aggregation Method in Clustered Sensor Network (클러스터 구조의 센서 네트워크에서 효율적인 데이터 모음 기법)

  • Jee, Jae-Kyoung;Ha, Rhan
    • Proceedings of the Korean Information Science Society Conference / 2005.11a / pp.220-222 / 2005
  • To keep battery-powered wireless sensor networks, which sense and process information, running for a long time, many techniques for using their limited resources efficiently have been proposed. Among them, organizing clusters or performing data aggregation, which compresses duplicate data into a single packet to cut the number of transmissions, saves energy. This paper proposes a technique for eliminating the duplicate data that arises when two or more clusters in a clustered sensor network sense overlapping regions. The proposed technique negotiates in advance using meta-data, preventing the same information from being sent by different clusters and thus saving energy. It also proposes a scheme that aggregates the remaining readings generated within a cluster into a single packet using a time-delay technique. Performance evaluation confirms that the proposed algorithm is more efficient than existing techniques in both delay time and energy consumption.
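A minimal sketch of the negotiation step, assuming each reading carries meta-data identifying the sensed region; the time-delay timer that triggers the flush is elided and all names are hypothetical:

```python
def suppress_and_aggregate(advertised: set, readings):
    """Meta-data negotiation plus aggregation: a reading whose region
    another cluster has already advertised is suppressed; survivors are
    packed into a single packet (one radio transmission, not many)."""
    packet = []
    for meta, value in readings:
        if meta in advertised:
            continue                  # overlap: another cluster reports it
        advertised.add(meta)
        packet.append((meta, value))  # held until the delay timer fires
    return packet
```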
