• Title/Summary/Keyword: Disk Cluster Allocation

Search Result 4, Processing Time 0.017 seconds

An Arbitrary Disk Cluster Manipulating Method for Allocating Disk Fragmentation of Filesystem (파일시스템의 클러스터를 임의로 할당하여 디스크를 단편화하기 위한 방법)

  • Cho, Gyu-Sang
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.16 no.2
    • /
    • pp.11-25
    • /
    • 2020
  • This study proposes a method to manipulate fragmentation of disks by arbitrarily allocating and releasing the status of a disk cluster in the NTFS file system. This method allows experiments to be performed in several studies related to fragmentation problems on disk cluster. Typical applicable research examples include testing the performance of disk defragmentation tools according to the state of fragmentation, establishing an experimental environment for fragmented file carving methods for digital forensics, setting up cluster fragmentation for testing the robustness of data hiding methods within directory indexes, and testing the file system's disk allocation methods according to the various version of Windows. This method suggests how a single file occupies a cluster and presents an algorithm with a flowchart. It raises three tricky problems to solve the method, and we propose solutions to the problems. Experiments for allocating the disk cluster to be fragmented to the maximum extent possible, it then performs a disk defragmentation experiment to prove the proposed method is effective.

An Efficient Snapshot Technique for Shared Storage Systems supporting Large Capacity (대용량 공유 스토리지 시스템을 위한 효율적인 스냅샷 기법)

  • 김영호;강동재;박유현;김창수;김명준
    • Journal of KIISE:Databases
    • /
    • v.31 no.2
    • /
    • pp.108-121
    • /
    • 2004
  • In this paper, we propose an enhanced snapshot technique that solves performance degradation when snapshot is initiated for the storage cluster system. However, traditional snapshot technique has some limits adapted to large amount storage shared by multi-hosts in the following aspects. As volume size grows, (1) it deteriorates crucially the performance of write operations due to additional disk access to verify COW is performed. (2) Also it increases excessively the blocking time of write operation performed during the snapshot creation time. (3)Finally, it deteriorates the performance of write operations due to additional disk I/O for mapping block caused by the verification of COW. In this paper, we propose an efficient snapshot technique for large amount storage shared by multi-hosts in SAN Environments. We eliminate the blocking time of write operation caused by freezing while a snapshot creation is performing. Also to improve the performance of write operation when snapshot is taken, we introduce First Allocation Bit(FAB) and Snapshot Status Bit(SSB). It improves performance of write operation by reducing an additional disk access to volume disk for getting snapshot mapping block. We design and implement an efficient snapshot technique, while the snapshot deletion time, improve performance by deallocation of COW data block using SSB of original mapping entry without snapshot mapping entry obtained mapping block read from the shared disk.

A Development of LDA Topic Association Systems Based on Spark-Hadoop Framework

  • Park, Kiejin;Peng, Limei
    • Journal of Information Processing Systems
    • /
    • v.14 no.1
    • /
    • pp.140-149
    • /
    • 2018
  • Social data such as users' comments are unstructured in nature and up-to-date technologies for analyzing such data are constrained by the available storage space and processing time when fast storing and processing is required. On the other hand, it is even difficult in using a huge amount of dynamically generated social data to analyze the user features in a high speed. To solve this problem, we design and implement a topic association analysis system based on the latent Dirichlet allocation (LDA) model. The LDA does not require the training process and thus can analyze the social users' hourly interests on different topics in an easy way. The proposed system is constructed based on the Spark framework that is located on top of Hadoop cluster. It is advantageous of high-speed processing owing to that minimized access to hard disk is required and all the intermediately generated data are processed in the main memory. In the performance evaluation, it requires about 5 hours to analyze the topics for about 1 TB test social data (SNS comments). Moreover, through analyzing the association among topics, we can track the hourly change of social users' interests on different topics.

Design and Implementation of a Metadata Structure for Large-Scale Shared-Disk File System (대용량 공유디스크 파일 시스템에 적합한 메타 데이타 구조의 설계 및 구현)

  • 이용주;김경배;신범주
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.30 no.1
    • /
    • pp.33-49
    • /
    • 2003
  • Recently, there have been large storage demands for manipulating multimedia data. To solve the tremendous storage demands, one of the major researches is the SAN(Storage Area Network) that provides the local file requests directly from shared-disk storage and also eliminates the server bottlenecks to performance and availability. SAN also improve the network latency and bandwidth through new channel interface like FC(Fibre Channel). But to manipulate the efficient storage network like SAN, traditional local file system and distributed file system are not adaptable and also are lack of researches in terms of a metadata structure for large-scale inode object such as file and directory. In this paper, we describe the architecture and design issues of our shared-disk file system and provide the efficient bitmap for providing the well-formed block allocation in each host, extent-based semi flat structure for storing large-scale file data, and two-phase directory structure of using Extendible Hashing. Also we describe a detailed algorithm for implementing the file system's device driver in Linux Kernel and compare our file system with the general file system like EXT2 and shard disk file system like GFS in terms of file creation, directory creation and I/O rate.