• Title/Summary/Keyword: Cluster File System

Implementation of Data processing of the High Availability for Software Architecture of the Cloud Computing (클라우드 서비스를 위한 고가용성 대용량 데이터 처리 아키텍쳐)

  • Lee, Byoung-Yup;Park, Junho;Yoo, Jaesoo
    • The Journal of the Korea Contents Association
    • /
    • v.13 no.2
    • /
    • pp.32-43
    • /
    • 2013
  • A growing number of IT research institutions foresee cloud services as the predominant IT service in the near future, and several leading IT vendors already offer them. Regardless of the physical location of the service or the system environment, a cloud service can provide users with storage, data, and software. Cloud services also face challenges, however: even though they allow IT resources to be used freely without the constraints of specific hardware, availability remains a problem to be solved. This paper therefore addresses the prerequisites of cloud computing for distributed file systems, the open source Hadoop distributed file system, in-memory database technology, and high-availability database systems. The authors also present a high-availability architecture for managing massive distributed data from the cloud service perspective, using distributed file systems currently deployed in the cloud computing market.
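
The abstract above centers on keeping large distributed data highly available. As a purely illustrative aside (not the architecture the paper proposes), the following minimal Python sketch shows the basic replication idea behind HDFS-style availability: each block is written to several storage nodes, and a read succeeds as long as any node holding a replica is still up. All names (StorageNode, the replication factor of 3, the block id) are hypothetical.

```python
# Toy illustration of replication-based availability (not the paper's design).
class StorageNode:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.blocks = {}                        # block_id -> data

def put_block(nodes, block_id, data, replicas=3):
    """Write the block to up to `replicas` live nodes (hypothetical policy)."""
    targets = [n for n in nodes if n.up][:replicas]
    for n in targets:
        n.blocks[block_id] = data
    return [n.name for n in targets]

def get_block(nodes, block_id):
    """A read succeeds as long as any live node still holds a replica."""
    for n in nodes:
        if n.up and block_id in n.blocks:
            return n.blocks[block_id]
    raise IOError("no live replica for " + block_id)

nodes = [StorageNode(f"dn{i}") for i in range(4)]
put_block(nodes, "blk_0001", b"payload")
nodes[0].up = False                             # one node fails
assert get_block(nodes, "blk_0001") == b"payload"
```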

An Implementation and Evaluation of Large-Scale Dynamic Hashing Directories (대규모 동적 해싱 디렉토리의 구현 및 평가)

  • Kim, Shin-Woo;Lee, Yong-Kyu
    • Journal of Korea Multimedia Society
    • /
    • v.8 no.7
    • /
    • pp.924-942
    • /
    • 2005
  • Recently, large-scale directories have been developed for Linux cluster file systems to store and retrieve huge amounts of data. One of them, the GFS directory, has attracted much attention because it is based on extendible hashing, a dynamic hashing technique, to support fast access to files. One distinctive feature of the GFS directory is its flat structure, in which all the leaf nodes are located at the same level of the tree. One disadvantage of this node structure, however, is that the height of the node tree has to be increased to keep the tree flat after an entry is inserted into a full tree that cannot accommodate it. Thus, a single addition makes the height of the whole node tree grow, and each data block of the new tree needs one more link access than the old one. Another dynamic hashing technique that can be used for directories is linear hashing, and several studies have shown that it can achieve better file access times than extendible hashing. In this research, we have designed and implemented an extendible hashing directory and a linear hashing directory for large-scale Linux cluster file systems and compared their performance. We used the semi-flat structure, which is known to have better access performance than the flat structure. According to the performance evaluation, the linear hashing directory shows slightly better performance for file inserts and accesses in most cases, whereas the extendible hashing directory is somewhat better in space utilization.

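Since this entry turns on how an extendible hashing directory behaves on inserts, here is a minimal, self-contained Python sketch of the classic extendible hashing scheme (directory doubling plus bucket splits). It illustrates the general technique only; the paper's semi-flat, on-disk layout is not reproduced, and the bucket size of 4 is an arbitrary assumption.

```python
# Minimal in-memory extendible hashing (illustrative only).
BUCKET_SIZE = 4                                  # arbitrary assumption

class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.items = {}                          # key -> value

class ExtendibleHash:
    def __init__(self):
        self.global_depth = 1
        self.dir = [Bucket(1), Bucket(1)]        # directory of bucket pointers

    def _index(self, key):
        # use the low `global_depth` bits of the hash as the directory index
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        return self.dir[self._index(key)].items.get(key)

    def put(self, key, value):
        bucket = self.dir[self._index(key)]
        if key in bucket.items or len(bucket.items) < BUCKET_SIZE:
            bucket.items[key] = value
            return
        self._split(bucket)
        self.put(key, value)                     # retry after the split

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            self.dir = self.dir + self.dir       # double the directory
            self.global_depth += 1
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        high_bit = 1 << (bucket.local_depth - 1)
        # repoint the directory slots that referenced `bucket` and have the new bit set
        for i, b in enumerate(self.dir):
            if b is bucket and (i & high_bit):
                self.dir[i] = sibling
        # redistribute the entries between the two buckets
        old, bucket.items = bucket.items, {}
        for k, v in old.items():
            target = sibling if (hash(k) & high_bit) else bucket
            target.items[k] = v

d = ExtendibleHash()
for i in range(100):
    d.put(f"file{i}", i)
assert d.get("file42") == 42
```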

Efficient Load Balancing Scheme using Resource Information in Web Server System (웹 서버 시스템에서의 자원 정보를 이용한 효율적인 부하분산 기법)

  • Chang Tae-Mu;Myung Won-Shig;Han Jun-Tak
    • The KIPS Transactions:PartA
    • /
    • v.12A no.2 s.92
    • /
    • pp.151-160
    • /
    • 2005
  • The exponential growth in the number of Web users requires Web servers with high expandability and reliability, and leads to excessive transmission traffic and system overload. To solve these problems, cluster systems have been widely studied. In conventional cluster systems, when requests are large, as with multimedia and CGI content, the load and response time of particular servers tend to increase even if the overall load is distributed evenly. In this paper, a cluster system is proposed in which each Web server holds different contents and the load is distributed efficiently using Web server resource information such as CPU, memory, and disk utilization. Web servers holding different contents are mutually connected and managed with a network file system to maintain the information consistency required to support resource information updates, deletions, and additions. Load imbalance among content groups caused by the distribution of contents can be alleviated by reassigning Web servers. Simulation results show that the proposed method improves average throughput and processing time by up to 50% compared to systems using the LC and RR methods.
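
The scheme above selects servers using resource information rather than request counts alone. The following is a minimal sketch of that general idea, with a made-up weighting of CPU, memory, and disk utilization; the paper's exact formula is not given in the abstract, so all weights and sample numbers here are assumptions.

```python
# Pick the web server with the lowest weighted resource utilization.
# The weights and the 0..1 utilization fields are illustrative assumptions.
def pick_server(servers, w_cpu=0.5, w_mem=0.3, w_disk=0.2):
    def load(s):
        return w_cpu * s["cpu"] + w_mem * s["mem"] + w_disk * s["disk"]
    return min(servers, key=load)

servers = [
    {"name": "web1", "cpu": 0.82, "mem": 0.40, "disk": 0.35},
    {"name": "web2", "cpu": 0.31, "mem": 0.55, "disk": 0.20},
    {"name": "web3", "cpu": 0.47, "mem": 0.25, "disk": 0.60},
]
print(pick_server(servers)["name"])   # -> web2 with these sample numbers
```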

The study of striping size according to the amount of storage nodes in the Parallel Media Stream Server (병렬 미디어 스트림 서버에서 저장노드수의 변화에 따른 스트라이핑 크기 결정에 관한 연구)

  • Kim, Seo-Gyun;Nam, Ji-Seung
    • The KIPS Transactions:PartC
    • /
    • v.8C no.6
    • /
    • pp.765-774
    • /
    • 2001
  • In this paper, we propose a striping policy for the storage nodes of a Linux-based parallel media stream server. We developed a new storage clustering architecture, named the system RAID architecture, in which many storage cluster nodes are grouped to operate as a single server. The system uses its own striping policy to distribute multimedia files across the parallel storage nodes. When a service request arrives, each storage cluster node transmits its striped files concurrently to the clients, which provides a fair distribution of the preprocessing load across all storage cluster nodes. The key feature of the system is a relative striping policy that chooses the striping size based on file type, service type, and the number of storage nodes to provide the best service.

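The paper's contribution is choosing a striping size relative to file type, service type, and the number of storage nodes; the exact rule is not given in the abstract, so the sketch below is only a hypothetical heuristic showing the general shape of such a policy. All thresholds and factors are invented.

```python
# Hypothetical striping-size heuristic; none of these numbers come from the paper.
def stripe_size(file_bytes, num_nodes, media_bitrate_kbps=1500):
    base = file_bytes // num_nodes                 # spread the file evenly
    # keep each stripe within a few seconds of playback, but never tiny
    one_second = media_bitrate_kbps * 1000 // 8    # bytes per second of media
    return max(64 * 1024, min(base, 8 * one_second))

print(stripe_size(700 * 1024 * 1024, num_nodes=8))   # -> 1500000 with these assumptions
```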

Application of Group Master Cache for the Integrated Environment of SAN and NAS (Group Master Cache를 활용한 SAN과 NAS의 통합 방안)

  • Lee, Won-Bok;Park, Jin-Won
    • Journal of the Korea Society for Simulation
    • /
    • v.16 no.2
    • /
    • pp.9-15
    • /
    • 2007
  • As the Internet grows and mass multimedia data become popular, storage systems are migrating from DAS, where the storage and the server are directly connected, to SAN and NAS. SAN connects the storage devices through a separate network, while NAS provides only file services and connects the storage through an IP network. However, SAN and NAS cannot fulfill the needs of companies when used separately, and thus need to be integrated. In this research, we propose an efficient data sharing method that employs the concept of a GMC, Group Master Cache, for the integrated environment of SAN and NAS. GMC is based on MCI, Metadata server and Cluster system Integration, but tries to solve the high expansion cost problem of MCI. We introduce the basic concept of GMC and compare the performance of GMC with that of MCI using computer simulation.

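Since the abstract only names the GMC idea, the sketch below is a speculative illustration of the lookup order one might expect from such a design: check the local cache, then the group's master cache, then fall back to the metadata server. Every class and method name here is hypothetical, not the paper's API.

```python
# Speculative illustration of a group-master-cache lookup path.
class MetadataServer:
    def __init__(self):
        self.table = {}                        # path -> metadata

    def lookup(self, path):
        return self.table.get(path)

class GroupMasterCache:
    """One node per group caches metadata on behalf of the whole group."""
    def __init__(self, mds):
        self.mds = mds
        self.cache = {}

    def lookup(self, path):
        if path not in self.cache:
            self.cache[path] = self.mds.lookup(path)   # miss: go to the MDS
        return self.cache[path]

class ClusterNode:
    def __init__(self, gmc):
        self.gmc = gmc
        self.local = {}

    def lookup(self, path):
        if path in self.local:
            return self.local[path]            # local hit
        meta = self.gmc.lookup(path)           # otherwise ask the group master
        self.local[path] = meta
        return meta
```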

A Pattern Summary System Using BLAST for Sequence Analysis

  • Choi, Han-Suk;Kim, Dong-Wook;Ryu, Tae-W.
    • Genomics & Informatics
    • /
    • v.4 no.4
    • /
    • pp.173-181
    • /
    • 2006
  • Pattern finding is one of the important tasks in protein or DNA sequence analysis, and alignment is the most widely used technique for finding patterns. BLAST (Basic Local Alignment Search Tool) is one of the most popular bioinformatics tools for exploring available DNA or protein sequence databases. BLAST may generate a huge output for large sequence data containing various sequence patterns, but it provides no tool to summarize and analyze the patterns or matched alignments in its output file, and it lacks general and robust parsing tools to extract the essential information from its output. This paper presents a pattern summary system, a comprehensive tool for discovering pattern structures in the huge amounts of sequence data in BLAST output. The system can identify clusters of patterns, extract the cluster pattern sequences from the subject database of BLAST, and display the clusters graphically to show their distribution in the subject database.
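
The summary system parses BLAST output and groups matched alignments. As a rough stand-in (the sketch assumes BLAST tabular output, legacy `-m 8` / BLAST+ `-outfmt 6`, rather than whatever report format the paper parsed, and the file name is a placeholder), this shows how hits can be grouped by subject sequence to see which database entries attract clusters of matches.

```python
# Group BLAST tabular hits (-m 8 / -outfmt 6) by subject sequence.
# Column layout: query, subject, %identity, aln_len, mismatches, gap_opens,
#                q_start, q_end, s_start, s_end, e-value, bit_score
from collections import defaultdict

def hits_by_subject(path, max_evalue=1e-5):
    groups = defaultdict(list)
    with open(path) as fh:
        for line in fh:
            f = line.rstrip("\n").split("\t")
            if len(f) < 12:
                continue                      # skip comments / malformed lines
            query, subject = f[0], f[1]
            ident, evalue = float(f[2]), float(f[10])
            if evalue <= max_evalue:
                groups[subject].append((query, ident, evalue))
    return groups

# Print the ten subjects with the most significant hits ("blast.tab" is a placeholder).
for subject, hits in sorted(hits_by_subject("blast.tab").items(),
                            key=lambda kv: -len(kv[1]))[:10]:
    print(subject, len(hits), "hits")
```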

Comparison of Directory Structures for SAN Based Very Large File Systems (SAN 환경 대용량 파일 시스템을 위한 디렉토리 구조 비교)

  • 김신우;이용규
    • The Journal of Society for e-Business Studies
    • /
    • v.9 no.1
    • /
    • pp.83-104
    • /
    • 2004
  • Recently, information systems that require the storage and retrieval of huge amounts of data have come into wide use. Accordingly, research efforts have been made to develop Linux cluster file systems in the SAN environment, in which clients themselves can manage metadata and access data directly, and a semi-flat directory structure based on extendible hashing has been proposed to support fast retrieval of files [1]. In this research, we have designed and implemented the semi-flat extendible hash directory under the Linux system. In order to evaluate its practicality, we have also implemented a B+-tree based directory and compared the performance of the two. According to the performance comparison, the extendible hash directory performs better for insert, delete, and search operations, while the B+-tree directory is better at listing files in sorted order.

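The trade-off in the last sentence of the abstract (the hash directory wins on insert/delete/search, the B+-tree wins on sorted output) comes down to the fact that a hash directory stores entries in hash order and must sort on demand, while a tree keeps keys ordered. A tiny stdlib-only illustration of that difference, with a dict standing in for the hash directory and a `bisect`-maintained list standing in for the ordered index:

```python
import bisect

# Hash-style directory: fast insert/lookup, but a sorted listing needs a sort.
hash_dir = {}
# Ordered index (stand-in for a B+-tree leaf chain): inserts keep keys sorted.
ordered_keys = []

for name in ["zeta.dat", "alpha.txt", "movie.avi", "beta.log"]:
    hash_dir[name] = object()              # metadata placeholder
    bisect.insort(ordered_keys, name)      # O(n) here; O(log n) in a real tree

print(sorted(hash_dir))    # must sort every time a listing is requested
print(ordered_keys)        # already in order, like a B+-tree range scan
```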

Design and Implementation of The High-Speed Communication Module for a Linux Cluster File System Using M-VIA (리눅스 클러스터 파일 시스템을 위한 M-VIA 기반 고속 통신 모듈의 설계 및 구현)

  • 박의수;최현호;유찬곤;유관종
    • Proceedings of the Korea Multimedia Society Conference
    • /
    • 2003.11a
    • /
    • pp.461-465
    • /
    • 2003
  • A cluster file system distributes original files across multiple nodes to maximize data I/O bandwidth, improve efficiency, and spread the I/O burden evenly over the nodes. Distributing files over the nodes in this way requires efficient inter-node data communication, and within each node the cluster file system must also support an efficient dedicated data exchange mechanism with applications. To this end, we adopted VIA, a user-level communication protocol, to reduce the bottleneck caused by data copies between network layers due to operating system intervention. In this paper, we design and implement a communication module for inter-node data communication using M-VIA, and through actual performance tests we evaluate it against a conventional socket-based TCP/IP communication module.

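The evaluation compares the M-VIA module against a conventional socket-based TCP/IP module. M-VIA's own API is not shown in the abstract, so the sketch below only illustrates the TCP/IP side of such a comparison: a minimal block transfer between nodes, whose per-send kernel copies are exactly the overhead a user-level protocol like VIA is meant to avoid. The framing scheme, host, and port are placeholders, not the paper's protocol.

```python
import socket

def send_block(host, port, block: bytes):
    """Send one striped block over plain TCP (the baseline being compared)."""
    with socket.create_connection((host, port)) as s:
        s.sendall(len(block).to_bytes(8, "big"))   # simple length prefix
        s.sendall(block)                           # each send crosses the kernel

def recv_block(conn) -> bytes:
    """Receive one length-prefixed block from an accepted connection."""
    size = int.from_bytes(_recv_exact(conn, 8), "big")
    return _recv_exact(conn, size)

def _recv_exact(conn, n):
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return buf
```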

A Network Load Sensitive Block Placement Strategy of HDFS

  • Meng, Lingjun;Zhao, Wentao;Zhao, Haohao;Ding, Yang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.9 no.9
    • /
    • pp.3539-3558
    • /
    • 2015
  • This paper investigates and analyzes the default block placement strategy of HDFS, a representative distributed file system designed to stream vast amounts of data at high bandwidth to user applications. The default HDFS block placement policy assumes that all nodes in the cluster are homogeneous and places blocks with a simple round-robin strategy, without considering any node's resource characteristics, which decreases the self-adaptability of the system. The primary contribution of this paper is a network-load-sensitive block placement strategy. We have implemented our algorithm and justified it through extensive simulations and comparison with similar existing studies. The results indicate that our approach not only achieves a better data distribution but also improves write performance more significantly than the others.
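
The proposed strategy replaces the load-oblivious choice with one that accounts for each datanode's current network load. The abstract does not spell out the scoring function, so the sketch below is only a hypothetical version of the idea: prefer the replica targets with the lowest measured network utilization.

```python
# Hypothetical network-load-aware block placement (not the paper's exact algorithm).
def place_block(datanodes, replicas=3):
    """datanodes: list of dicts like {"name": "dn1", "net_load": 0.37}."""
    ranked = sorted(datanodes, key=lambda d: d["net_load"])
    return [d["name"] for d in ranked[:replicas]]

datanodes = [
    {"name": "dn1", "net_load": 0.72},
    {"name": "dn2", "net_load": 0.15},
    {"name": "dn3", "net_load": 0.48},
    {"name": "dn4", "net_load": 0.33},
]
print(place_block(datanodes))   # -> ['dn2', 'dn4', 'dn3'] for these sample loads
```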

Improving Access Performance of the Linux Cluster File System for Multimedia Service (멀티미디어 서비스를 위한 리눅스 클러스터 파일 시스템의 접근 성능 개선)

  • 홍재연;김형식
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2003.04a
    • /
    • pp.22-24
    • /
    • 2003
  • Because the cluster architecture satisfies high availability and fault tolerance requirements and scales well, cluster file systems are well suited to multimedia services. User-level cluster file systems [1, 2] provide features specialized for multimedia service and a single system image that allows files and directories to be accessed regardless of where they are stored, but access times vary depending on the actual storage location. This paper proposes methods to improve the performance of a user-level cluster file system using a metadata cache and a system buffer, and analyzes the degree of improvement from each. The metadata cache stores frequently referenced metadata of remote nodes in a local storage structure, while the system buffer not only improves write performance for data blocks but also improves read performance through prefetching.

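The first of the two proposed improvements, caching frequently referenced remote metadata locally, can be pictured with a small LRU cache. The sketch below is a generic illustration only; the capacity and the fetch callable are assumptions, not the paper's implementation.

```python
from collections import OrderedDict

class MetadataCache:
    """Tiny LRU cache for remote-node metadata (illustrative only)."""
    def __init__(self, fetch_remote, capacity=1024):
        self.fetch_remote = fetch_remote        # callable: path -> metadata
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, path):
        if path in self.entries:
            self.entries.move_to_end(path)      # mark as recently used
            return self.entries[path]
        meta = self.fetch_remote(path)          # cache miss: ask the remote node
        self.entries[path] = meta
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # evict the least recently used
        return meta

# Usage: wrap whatever remote lookup the file system provides.
cache = MetadataCache(fetch_remote=lambda path: {"path": path, "size": 0})
print(cache.lookup("/videos/a.mpg"))
```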