• Title/Summary/Keyword: data deduplication

Client-Side Deduplication to Enhance Security and Reduce Communication Costs

  • Kim, Keonwoo; Youn, Taek-Young; Jho, Nam-Su; Chang, Ku-Young
    • ETRI Journal / v.39 no.1 / pp.116-123 / 2017
  • Message-locked encryption (MLE) is a widespread cryptographic primitive that enables the deduplication of encrypted data stored in the cloud. Practical client-side MLE constructions, however, are vulnerable to a poison attack, and server-side MLE schemes require large bandwidth consumption. In this paper, we propose a new client-side secure deduplication method that prevents the poison attack, reduces the amount of traffic transmitted over the network, and requires fewer cryptographic operations to execute the protocol. The proposed primitive is analyzed in terms of security, communication costs, and computational requirements, and is compared with existing MLE schemes.
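
The deduplication property of MLE is easiest to see in its classic instance, convergent encryption, where the key is derived from the message itself so that equal plaintexts always produce equal ciphertexts and tags. The sketch below is illustrative only (a stdlib hash-based stream cipher, not the paper's scheme); it also shows where the poison attack lives: nothing in the basic construction forces an uploaded ciphertext to be a valid encryption of the file its tag claims.

```python
import hashlib

def _keystream(key: bytes, length: int) -> bytes:
    """Deterministic keystream from SHA-256 in counter mode (sketch only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def ce_encrypt(message: bytes):
    key = hashlib.sha256(message).digest()        # K = H(M): key from content
    stream = _keystream(key, len(message))
    cipher = bytes(m ^ s for m, s in zip(message, stream))
    tag = hashlib.sha256(cipher).hexdigest()      # T = H(C): server dedup index
    return key, cipher, tag

# Equal plaintexts yield equal ciphertexts and tags, so the server can
# deduplicate without ever seeing the plaintext; note, though, that nothing
# here proves C encrypts the file the uploader claims (the poison attack).
_, c1, t1 = ce_encrypt(b"same chunk of data")
_, c2, t2 = ce_encrypt(b"same chunk of data")
assert c1 == c2 and t1 == t2
```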

Secure and Efficient Storage of Video Data in a CCTV Environment

  • Kim, Won-Bin; Lee, Im-Yeong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.6 / pp.3238-3257 / 2019
  • Closed-circuit television (CCTV) technology continuously captures and stores video streams. Users are typically required by policy to store all captured video for a certain period, so increasing the number of CCTV operation cycles and recording positions expands the amount of data to be stored. However, expanding the available storage space for video data incurs increased costs. In recent years, this problem has been addressed with cloud storage solutions, which enable multiple users and devices to access and store data simultaneously. Even so, the large amount of data to be stored demands a vast storage space, and cloud storage administrators need a way to store data more efficiently. To save storage space, deduplication has been proposed to prevent duplicate storage of the same data. At the same time, because cloud storage is hosted on remote servers, data encryption must be applied to address data exposure issues. Although deduplication techniques for encrypted data have been studied, various security vulnerabilities remain. We attempt to solve these problems by addressing issues such as poison attacks, property forgery, and ownership management while removing redundant data and handling the data more securely.
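
As a hedged illustration of the server-side bookkeeping such a scheme needs (all names hypothetical, not the paper's design): a content-addressed index that verifies the deduplication tag against the uploaded ciphertext, which blocks the naive poison attack of registering bogus data under someone else's tag, plus a per-entry owner set for ownership management.

```python
import hashlib

class DedupIndex:
    """Toy server-side index: tag -> (ciphertext, owner set)."""
    def __init__(self):
        self.entries = {}

    def upload(self, user: str, tag: str, ciphertext: bytes) -> bool:
        # Recompute the tag from the uploaded data; a mismatched (poisoned)
        # entry would otherwise be served to every later owner of the tag.
        if hashlib.sha256(ciphertext).hexdigest() != tag:
            return False                      # naive poison attempt rejected
        _, owners = self.entries.setdefault(tag, (ciphertext, set()))
        owners.add(user)                      # ownership management
        return True

    def owners_of(self, tag: str) -> set:
        return self.entries[tag][1]
```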

CORE-Dedup: IO Extent Chunking based Deduplication using Content-Preserving Access Locality (CORE-Dedup: 내용보존 접근 지역성 활용한 IO 크기 분할 기반 중복제거)

  • Kim, Myung-Sik; Won, You-Jip
    • Journal of the Korea Society of Computer and Information / v.20 no.6 / pp.59-76 / 2015
  • The recent spread of embedded devices and the growth of broadband communication technology have led to a rapid increase in the volume of data created and managed. As a result, data centers have to increase their storage capacity cost-effectively to store the created data. Data deduplication is one way to save storage space by removing redundant data. This work proposes an IO extent-based deduplication scheme, called CORE-Dedup, that exploits content-preserving access locality. We acquire IO traces from the block device layer of a virtual machine host and compare the deduplication performance of fixed-size and IO extent-based chunking. On a multi-user workload of ten users compiling in a virtual machine environment, 4 KB fixed-size chunking and IO extent-based chunking use 14,500 and 1,700 chunk index entries, respectively, while the deduplication rates are 60.4% and 57.6%, respectively.
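
A rough sketch of the contrast being measured, under simplified assumptions (in-memory trace, SHA-256 fingerprints; the real CORE-Dedup operates at the block device layer): fixed-size chunking splits every write into 4 KB blocks, while IO-extent chunking treats each IO request's extent as one chunk, so a recurring extent costs one index entry instead of many.

```python
import hashlib
import os

def fixed_size_chunks(trace, size=4096):
    """Split the payload of every IO request into fixed-size blocks."""
    for _offset, data in trace:
        for i in range(0, len(data), size):
            yield data[i:i + size]

def io_extent_chunks(trace):
    """Treat each IO request's whole extent as a single chunk."""
    for _offset, data in trace:
        yield data

def index_entries(chunks) -> int:
    return len({hashlib.sha256(c).hexdigest() for c in chunks})

# Toy trace: ten "users" writing the same 16 KB extent.
extent = os.urandom(16384)
trace = [(0, extent)] * 10
print(index_entries(fixed_size_chunks(trace)))  # 4 entries (four 4 KB blocks)
print(index_entries(io_extent_chunks(trace)))   # 1 entry for the whole extent
```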

Design of a Deduplication User File System for Flash-SSD (Flash-SSD 데이터 중복 제거를 위한 사용자 파일 시스템 설계)

  • Myeong, Jae-hui; Kwon, Oh-young
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.322-325 / 2017
  • Due to the rapid increase in data, various studies are being conducted on managing data efficiently. By 2025, the total amount of data is expected to exceed 163 ZB, and more than a quarter of it will be real-time data. As mass storage devices change from HDDs to SSDs, SSDs need their own way to manage their data effectively. In this paper, we study the SSD system structure and the deduplication-based data management methods related to Flash-SSD. We also propose an application-level user file system that uses deduplication. It is anticipated that the proposed system saves storage capacity and minimizes the performance degradation caused by unnecessary traffic.
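
The abstract does not detail the design, so as a hedged illustration of what "application-level deduplication" can mean, here is a toy content-addressed store with reference counts that a user-level file system could layer beneath its write path (all names hypothetical). Writing a duplicate file costs one hash lookup and a metadata update instead of a full copy, which is also how redundant traffic to the SSD is avoided.

```python
import hashlib

class UserDedupFS:
    """Toy user-level dedup layer: path -> fingerprint -> single stored blob."""
    def __init__(self):
        self.blobs = {}   # fingerprint -> bytes, stored once
        self.refs = {}    # fingerprint -> reference count
        self.files = {}   # path -> fingerprint

    def write(self, path: str, data: bytes) -> None:
        fp = hashlib.sha256(data).hexdigest()
        if fp not in self.blobs:
            self.blobs[fp] = data              # unique content: store it
        self.refs[fp] = self.refs.get(fp, 0) + 1
        self.files[path] = fp                  # duplicate: metadata only

    def read(self, path: str) -> bytes:
        return self.blobs[self.files[path]]
```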

Design and Implementation of Inline Data Deduplication in Cluster File System (클러스터 파일 시스템에서 인라인 데이터 중복제거 설계 및 구현)

  • Kim, Youngchul; Kim, Cheiyol; Lee, Sangmin; Kim, Youngkyun
    • KIISE Transactions on Computing Practices / v.22 no.8 / pp.369-374 / 2016
  • The growing demand for virtual computing and storage resources in the cloud computing environment has led to deduplication in storage systems for effective reduction and utilization of storage space. In particular, a large reduction in storage space is possible by preventing data with identical content, such as virtual desktop images, from being stored more than once in a virtual desktop infrastructure. However, in order to provide reliable virtual desktop services, the storage system must handle a variety of virtual desktop workloads, such as the performance overhead of deduplication, periodic data I/O storms, and frequent random I/O operations. In this paper, we design and implement a clustered file system to support virtual desktop and storage services in a cloud computing environment. The proposed clustered file system achieves low storage consumption by means of inline deduplication of virtual desktop images. In addition, it reduces the performance overhead by running the deduplication process in the data server rather than on the virtual hosts where the virtual desktops run.
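
The defining property of inline deduplication is that the fingerprint lookup happens on the write path, before a chunk ever reaches disk, in contrast to post-process deduplication, which scrubs duplicates afterwards. A minimal sketch under assumed interfaces (in the paper's system this step runs in the data server so the virtual hosts are not burdened):

```python
import hashlib

def inline_write(chunk: bytes, index: dict, store: dict) -> str:
    """Dedup decision taken before the chunk is written (inline)."""
    fp = hashlib.sha256(chunk).hexdigest()
    if fp in index:
        index[fp] += 1        # duplicate: record a reference, skip the write
    else:
        store[fp] = chunk     # unique: write the chunk exactly once
        index[fp] = 1
    return fp                 # caller maps (file, offset) -> fingerprint
```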

Design of Adaptive Deduplication Algorithm Based on File Type and Size (파일 유형과 크기에 따른 적응형 중복 제거 알고리즘 설계)

  • Hwang, In-Cheol; Kwon, Oh-Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.149-157 / 2020
  • Today, the increase in user data causes a large amount of data duplication, and various deduplication studies have been conducted. However, research on personal storage is relatively scarce. Personal storage, unlike high-performance computers, needs to perform deduplication while keeping CPU and memory usage low. In this paper, we propose an adaptive algorithm that selectively applies fixed size chunking (FSC) and whole file chunking (WFH) according to file type and size, in order to maintain the deduplication rate while minimizing the load on personal storage. Experimental results show that the proposed file system is more than 1.3 times slower on the first write but reduces memory usage by up to three times compared to LessFS, and is 2.5 times faster on rewrites.
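
A hedged sketch of the selection step (the extension list and size cutoff are invented for illustration; the paper derives its own policy from measurements): already-compressed multimedia rarely shares sub-file content, so a single whole-file fingerprint avoids FSC's per-chunk CPU and index cost, while mutable document-like files keep FSC's finer-grained deduplication.

```python
import os

# Illustrative policy only: the types and threshold below are assumptions.
COMPRESSED_TYPES = {".jpg", ".png", ".mp3", ".mp4", ".zip"}
SMALL_FILE = 4096  # bytes

def choose_chunking(path: str, size: int) -> str:
    ext = os.path.splitext(path)[1].lower()
    if ext in COMPRESSED_TYPES or size <= SMALL_FILE:
        return "WFH"  # one whole-file fingerprint: tiny CPU/index footprint
    return "FSC"      # fixed-size chunking: finds duplicates inside files

assert choose_chunking("movie.mp4", 700 << 20) == "WFH"
assert choose_chunking("report.doc", 2 << 20) == "FSC"
```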

Privacy-Preserving Source-Based Deduplication Method (프라이버시 보존형 소스기반 중복제거 기술 방법 제안)

  • Nam, Seung-Soo; Seo, Chang-Ho; Lee, Joo-Young; Kim, Jong-Hyun; Kim, Ik-Kyun
    • Smart Media Journal / v.4 no.4 / pp.33-38 / 2015
  • Cloud storage servers cannot detect duplicates among conventionally encrypted data. To solve this problem, convergent encryption has been proposed. Recently, various client-side deduplication technologies have been proposed as well; however, these proposals still do not solve the security problem. In this paper, we suggest a secure source-based deduplication technology that encrypts data to ensure the confidentiality of sensitive data and applies a proofs-of-ownership protocol to control access to the data, protecting it from a curious cloud server and malicious users.
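
A minimal sketch of the proofs-of-ownership step under simplified assumptions: the server, which already stores the file, challenges a client claiming ownership to hash randomly chosen blocks of it. Real PoW protocols (Merkle-tree based, for example) are stronger; this only conveys the idea that knowing a short hash of a file is not enough to claim the whole file.

```python
import hashlib
import secrets

BLOCK = 4096

def pow_challenge(num_blocks: int, k: int = 4):
    """Server picks k random block indices the claimant must prove it holds."""
    return [secrets.randbelow(num_blocks) for _ in range(k)]

def pow_response(data: bytes, indices):
    """Claimant answers with the hashes of the challenged blocks."""
    return [hashlib.sha256(data[i * BLOCK:(i + 1) * BLOCK]).digest()
            for i in indices]

def pow_verify(stored: bytes, indices, proof) -> bool:
    """Server recomputes the expected answers from its own stored copy."""
    return pow_response(stored, indices) == proof
```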

Privacy-Preserving Source-Based Deduplication Method (프라이버시 보존형 소스기반 중복제거 방법)

  • Nam, Seung-Soo; Seo, Chang-Ho
    • Journal of Digital Convergence / v.14 no.2 / pp.175-181 / 2016
  • Cloud storage servers cannot detect duplicates among conventionally encrypted data. To solve this problem, convergent encryption has been proposed. Recently, various client-side deduplication technologies have been proposed as well; however, these proposals still do not solve the security problem. In this paper, we suggest a secure source-based deduplication technology that encrypts data to ensure the confidentiality of sensitive data and applies a proofs-of-ownership protocol to control access to the data, protecting it from a curious cloud server and malicious users.

Analysis and Elimination of Side Channels during Duplicate Identification in Remote Data Outsourcing (원격 저장소 데이터 아웃소싱에서 발생하는 중복 식별 과정에서의 부채널 분석 및 제거)

  • Koo, Dongyoung
    • Journal of the Korea Institute of Information Security & Cryptology / v.27 no.4 / pp.981-987 / 2017
  • The proliferation of cloud computing services reduces maintenance and management costs by allowing data to be outsourced to dedicated third-party remote storage. At the same time, the majority of storage service providers have adopted data deduplication for efficient utilization of storage resources. When a hash tree is employed for duplicate identification as part of the deduplication process, the size of the attested data and partial information about the tree can be deduced by eavesdropping. To mitigate such side channels, this paper presents a new duplicate identification method that exploits a multiset hash function.
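
One way to see why a multiset hash helps: an additive construction commits to the whole set of chunk fingerprints with a single aggregate value, so an eavesdropper observes neither a hash-tree shape nor per-node values. The sketch below is the unkeyed textbook form (in the spirit of additive multiset hashes such as MSet-Add-Hash); deployed constructions add keying, which is omitted here.

```python
import hashlib

MODULUS = 1 << 256

def _h(chunk: bytes) -> int:
    return int.from_bytes(hashlib.sha256(chunk).digest(), "big")

def multiset_hash(chunks) -> int:
    """Order-independent, incrementally updatable digest of a multiset."""
    return sum(_h(c) for c in chunks) % MODULUS

# Incremental update: adding one chunk does not require rehashing the rest,
# and the single aggregate reveals no tree structure to an eavesdropper.
digest = multiset_hash([b"a", b"b"])
assert (digest + _h(b"c")) % MODULUS == multiset_hash([b"a", b"b", b"c"])
assert multiset_hash([b"b", b"a"]) == digest   # order does not matter
```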

A Study of Method to Restore Deduplicated Files in Windows Server 2012 (윈도우 서버 2012에서 데이터 중복 제거 기능이 적용된 파일의 복원 방법에 관한 연구)

  • Son, Gwancheol; Han, Jaehyeok; Lee, Sangjin
    • Journal of the Korea Institute of Information Security & Cryptology / v.27 no.6 / pp.1373-1383 / 2017
  • Deduplication is a function for managing data effectively and improving the efficiency of storage space. When deduplication is applied to a system, stored files are divided into chunks and only unique chunks are stored, which makes it possible to use the storage space efficiently. However, commercial digital forensic tools do not support analysis of such file systems, and the original files extracted by these tools cannot be executed or opened. Therefore, in this paper, we analyze the chunk generation process of a Windows Server 2012 system with deduplication enabled and the structure of the resulting files (the Chunk Storage). We also analyze the case, not covered in previous studies, in which chunks are compressed. Based on these results, we propose a method to collect deduplicated data and reconstruct the original files for digital forensic investigation.
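
A hedged sketch of the reconstruction step such a method implies, with hypothetical field names rather than the actual on-disk layout the paper analyzes: a deduplicated file's stream map lists its chunk references in order, so rebuilding means fetching each chunk from its container in the chunk store, decompressing the ones flagged as compressed (Windows uses an LZ-family codec; a caller-supplied callback stands in for it here), and concatenating.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ChunkRef:
    """Hypothetical shape of one stream-map entry."""
    container: int      # which chunk-store container file holds the chunk
    offset: int         # byte offset of the chunk within that container
    length: int         # stored (possibly compressed) chunk length
    compressed: bool    # whether the stored bytes must be decompressed

def rebuild_file(stream_map: List[ChunkRef],
                 read_chunk: Callable[[int, int, int], bytes],
                 decompress: Callable[[bytes], bytes]) -> bytes:
    """Reassemble the original file by walking the stream map in order."""
    out = bytearray()
    for ref in stream_map:
        raw = read_chunk(ref.container, ref.offset, ref.length)
        out += decompress(raw) if ref.compressed else raw
    return bytes(out)
```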