• Title/Summary/Keyword: Cluster failure

Failure Recovery in the Linux Cluster File System SANique™ (리눅스 클러스터 화일 시스템 SANique™의 오류 회복 기법)

  • Lee, Gyu-Ung
    • The KIPS Transactions: Part A / v.8A no.4 / pp.359-366 / 2001
  • This paper overviews the design of SANique™, a shared file system for Linux clusters in a SAN environment. SANique™ can transfer user data from network-attached SAN disks directly to client applications without the control of a centralized file server. The paper also presents the characteristics of each SANique™ subsystem: CFM (Cluster File Manager), CVM (Cluster Volume Manager), CLM (Cluster Lock Manager), CBM (Cluster Buffer Manager), and CRM (Cluster Recovery Manager). Under the SANique™ design layout, the "split-brain" syndrome in shared file system environments is then described and defined. The work first generalizes and illustrates the possible situations in which a shared file system environment may split into two or more separate brains. Finally, the work describes the SANique™ approach to the "split-brain" problem, which uses a SAN disk named "split-brain", and develops the overall recovery procedure for shared file systems.

Cluster and information entropy analysis of acoustic emission during rock failure process

  • Zhang, Zhenghu; Hu, Lihua; Liu, Tiexin; Zheng, Hongchun; Tang, Chun'an
    • Geomechanics and Engineering / v.25 no.2 / pp.135-142 / 2021
  • This study provides a new research perspective for processing and analyzing acoustic emission (AE) data to evaluate rock failure. A cluster method and information entropy theory were introduced to investigate the temporal and spatial correlation of AE events during the rock failure process. Laboratory experiments on granite subjected to compression were carried out, accompanied by real-time AE monitoring. The cumulative length and dip angle curves of single links were fitted with different distribution models, and the distribution functions of link length and directionality were determined. The spatial scale and directionality of the AE event distribution, characterized by two parameters (spatial correlation length and spatial correlation directionality), were studied against the normalized applied stress. The entropies of link length and link directionality were also discussed. The results show that the distributions of cumulative link length and directionality obey the Weibull distribution. Spatial correlation length shows an upward trend preceding rock failure, while spatial correlation directionality shows no remarkable upward or downward trend. The entropies of link length and directionality show obvious downward trends. This research could enrich the mathematical methods for processing AE data and facilitate early warning of geological disasters related to rock failure.
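
As a rough illustration of the entropy measure mentioned in this abstract, the sketch below computes the Shannon entropy of link lengths grouped into fixed-width bins. The abstract does not give the paper's exact entropy definition or binning, so the function name `shannon_entropy`, the `bin_width` parameter, and the sample values are all assumptions made for illustration, not data or methods from the paper.

```python
import math
from collections import Counter


def shannon_entropy(values, bin_width):
    """Shannon entropy (in nats) of values grouped into fixed-width bins.

    Assumption: a plain histogram estimate; the paper may define it differently.
    """
    if not values:
        raise ValueError("values must be non-empty")
    # Count how many values fall into each bin of width bin_width.
    counts = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    # H = -sum(p * ln p) over the occupied bins.
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical link lengths in millimetres (not measurements from the paper).
scattered = [1.0, 12.0, 23.0, 34.0]   # each value lands in a different bin
clustered = [10.5, 11.0, 11.5, 12.0]  # all values land in the same bin

print(shannon_entropy(scattered, bin_width=10.0))
print(shannon_entropy(clustered, bin_width=10.0))
```

Under this estimate, values spread evenly over many bins give a high entropy and values concentrated in one bin give zero, which is one way to read a downward entropy trend as increasing concentration of link lengths.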

Bonded-cluster simulation of tool-rock interaction using advanced discrete element method

  • Liu, Weiji; Zhu, Xiaohua; Zhou, Yunlai; Li, Tao; Zhang, Xiangning
    • Structural Engineering and Mechanics / v.72 no.4 / pp.469-477 / 2019
  • Understanding the tool-rock interaction mechanism is essential for improving rock breaking efficiency and optimizing drilling parameters in mechanical rock breaking. In this study, tool-rock interaction models of indentation and cutting are built with the discrete element method (DEM) to examine the failure modes of rocks of varying brittleness and the critical indentation and cutting depths at which the failure mode transitions from ductile to brittle. The results show that the cluster size and the ratio of inter-cluster to intra-cluster bond strength are the key factors influencing the magnitude of the uniaxial compressive strength (UCS) and the ratio of UCS to Brazilian tensile strength (BTS). The UCS to BTS ratio can be raised to a more realistic value with a clustered rock model, so that the characteristics of real rocks are better represented. The critical indentation and cutting depths decrease as rock brittleness increases, and the rate of decrease drops sharply with increasing brittleness. This effort may lead to a better understanding of rock breaking mechanisms in mechanical excavation and may contribute to improving the design of rock excavation machines and the determination of the related parameters.

A Token Based Protocol for Mutual Exclusion in Mobile Ad Hoc Networks

  • Sharma, Bharti; Bhatia, Ravinder Singh; Singh, Awadhesh Kumar
    • Journal of Information Processing Systems / v.10 no.1 / pp.36-54 / 2014
  • Resource sharing is a major advantage of distributed computing. However, a distributed computing system may have some physical or virtual resource that can be accessed by only one process at a time. The mutual exclusion problem is to ensure that no more than one process at a time is allowed to access such a shared resource. The article proposes a token-based mutual exclusion algorithm for clustered mobile ad hoc networks (MANETs). The mechanism used to handle token passing at the inter-cluster level is different from that at the intra-cluster level. This makes our algorithm message efficient and thus suitable for MANETs. In the interest of efficiency, we implemented a centralized token passing scheme at the intra-cluster level. Centralized schemes are inherently failure prone. Thus, we have presented an intra-cluster token passing scheme that is able to tolerate a failure. In order to enhance reliability, we applied a distributed token circulation scheme at the inter-cluster level. More importantly, the message complexity of the proposed algorithm is independent of N, the total number of nodes in the system. Also, under a heavy load, it turns out to be inversely proportional to n, the average number of nodes per cluster. We substantiated our claim with a correctness proof, complexity analysis, and simulation results. Finally, we present a simple approach to make our protocol fault tolerant.
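
The idea of centralized token passing inside one cluster can be sketched as follows. This is a minimal single-process illustration, not the authors' protocol: the class name `ClusterHead`, its methods, and the FIFO ordering are assumptions, and the sketch models neither inter-cluster token circulation nor failure handling.

```python
from collections import deque


class ClusterHead:
    """Hypothetical coordinator that hands one token to one node at a time."""

    def __init__(self):
        self.holder = None      # node currently holding the token, if any
        self.waiting = deque()  # pending requests in arrival order (assumed FIFO)

    def request(self, node_id):
        """Grant the token immediately if it is free, otherwise queue the request."""
        if self.holder is None:
            self.holder = node_id
            return True
        self.waiting.append(node_id)
        return False

    def release(self, node_id):
        """Return the token and pass it to the next waiting node, if any."""
        if self.holder != node_id:
            raise ValueError(f"{node_id!r} does not hold the token")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


head = ClusterHead()
print(head.request("a"))  # token is free, so "a" gets it
print(head.request("b"))  # "b" has to wait
print(head.release("a"))  # token moves on to "b"
```

Because only the holder may enter the critical section, at most one node in the cluster accesses the shared resource at a time; the coordinator is also the single point whose failure a fault-tolerant variant would have to cover.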

On the Handling of Node Failures: Energy-Efficient Job Allocation Algorithm for Real-time Sensor Networks

  • Karimi, Hamid; Kargahi, Mehdi; Yazdani, Nasser
    • Journal of Information Processing Systems / v.6 no.3 / pp.413-434 / 2010
  • Wireless sensor networks are usually characterized by dense deployment of energy constrained nodes. Due to the usage of a large number of sensor nodes in uncontrolled hostile or harsh environments, node failure is a common event in these systems. Another common reason for node failure is the exhaustion of their energy resources and node inactivation. Such failures can have adverse effects on the quality of the real-time services in Wireless Sensor Networks (WSNs). To avoid such degradations, it is necessary that the failures be recovered in a proper manner to sustain network operation. In this paper we present a dynamic Energy efficient Real-Time Job Allocation (ERTJA) algorithm for handling node failures in a cluster of sensor nodes with the consideration of communication energy and time overheads besides the nodes' characteristics. ERTJA relies on the computation power of cluster members for handling a node failure. It also tries to minimize the energy consumption of the cluster by minimum activation of the sleeping nodes. The resulting system can then guarantee the Quality of Service (QoS) of the cluster application. Further, when the number of sleeping nodes is limited, the proposed algorithm uses the idle times of the active nodes to engage a graceful QoS degradation in the cluster. Simulation results show significant performance improvements of ERTJA in terms of the energy conservation and the probability of meeting deadlines compared with the other studied algorithms.

A Recovery Scheme of a Cluster Head Failure for Underwater Wireless Sensor Networks (수중 무선 센서 네트워크를 위한 클러스터 헤드 오류 복구 기법)

  • Heo, Jun-Young; Min, Hong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.11 no.4 / pp.17-22 / 2011
  • Underwater environments are quite different from terrestrial ones in terms of the communication channel and constraints. In underwater wireless sensor networks, the probability of node failure is high because sensor nodes are deployed in harsher environments than ground-based networks and are moved by waves and currents. Existing research considers the underwater communication environment in order to improve data transmission throughput. In this paper, we present a checkpointing scheme for cluster heads that recovers quickly from a cluster head failure. Experimental results show that the proposed scheme enhances the reliability of the network and is more efficient in terms of energy consumption and recovery latency than operation without checkpointing.

A study on high availability of the linux clustering web server (리눅스 클러스터링 웹 서버의 고가용성에 대한 연구)

  • 박지현; 이상문; 홍태화; 김학배
    • Institute of Control, Robotics and Systems: Conference Proceedings / 2000.10a / pp.88-88 / 2000
  • As more and more critical commercial applications move onto the Internet, providing highly available servers becomes increasingly important. One of the advantages of a clustered system is that it has hardware and software redundancy. High availability can be provided by detecting node or daemon failure and reconfiguring the system appropriately so that the workload can be taken over by the remaining nodes in the cluster. This paper presents how to guarantee high availability of a clustered web server. The load balancer is a single point of failure for the whole system. To guard against failure of the load balancer, we set up a backup server using heartbeat, fake, mon, and a checkpointing fault-tolerance method. For high availability of the file servers in the cluster, we set up the Coda file system. Coda is an advanced, fault-tolerant distributed network file system.

A Study On Balance Factors for the Sustainable Growth of Technology-Based Companies: Focusing on the Case of Daedeok Cluster Successful and Unsuccessful Companies (기술기반기업의 지속적 성장을 위한 균형요인 연구: 대덕클러스터 성공실패 사례 중심으로)

  • Kyeongsik Yoo; Heungsik Kang; Jaeman Yoon; Taekeun Kim
    • Industry Promotion Research / v.9 no.3 / pp.87-100 / 2024
  • This study examined the factors that affect companies' sustainable growth after overcoming the valley of death that follows starting a business, from the perspectives of technology, market, location, cluster, and the INC model, and conducted a case study of successful and failed companies in the Daedeok Cluster to explore the adequacy and suitability of the main factors identified in previous studies. The results suggest that collaboration between innovators based on location accessibility is important for the growth of companies in the cluster, and that balanced growth across innovative ideas, market needs, and the capability to meet those needs is necessary for companies' products and services to create innovative value in the market. This study is valuable in that it presents factors for sustainable corporate growth through analyses of success and failure cases among companies in the Daedeok Cluster. In addition, it proposes a policy support plan based on collaboration among companies to foster companies in the cluster.

Recovery Management of Split-Brain Group in Highly Available Cluster File System SANique™ (고가용성 클러스터 파일 시스템 SANique™의 분할그룹 탐지 및 회복 기법)

  • 이규웅
    • Journal of Korea Multimedia Society / v.7 no.4 / pp.505-517 / 2004
  • This paper overviews the design details of the cluster file system SANique™ in a SAN environment. SANique™ can transfer user data from a shared SAN disk to client applications without the control of a centralized file server. We focus in particular on the characteristics and functions of CRM, the recovery manager of SANique™. The process component for failure detection and its overall procedure are described. We define the split-brain problem, which cannot be easily detected in cluster file systems, and propose a recovery management method based on a SAN disk to detect and resolve the split-brain situation.

QoS Guarantee in Partial Failure of Clustered VOD Server (클러스터 VOD 서버의 부분적 장애에서 QoS 보장)

  • Lee, Joa-Hyoung; Jung, In-Bum
    • The KIPS Transactions: Part C / v.16C no.3 / pp.363-372 / 2009
  • For large-scale VOD service, cluster servers are in the spotlight for their high performance and low cost. A cluster server usually consists of a front-end node and multiple back-end nodes. Although increasing the number of back-end nodes allows more QoS streams to be served to clients, the possibility of a back-end node failure increases proportionally. Such a failure causes not only the stop of all streaming services but also the loss of the current playing positions. In this paper, recovery mechanisms are studied that support uninterrupted streaming service when a back-end node fails. For an actual VOD service environment, we implement a cluster-based VOD server composed of general-purpose PCs and adopt parallel processing for MPEG movies. On the implemented VOD server, a video block recovery mechanism is designed based on parity algorithms. However, applying the basic technique without considering the architecture of the cluster-based VOD server causes a performance bottleneck in the internal network during recovery and also results in inefficient CPU usage on the back-end nodes. To address these problems, we propose a new failure recovery mechanism based on the pipeline computing concept.
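
The basic parity technique that this abstract refers to can be illustrated with byte-wise XOR, a common way to rebuild one lost block from the surviving blocks plus a parity block. The sketch assumes a simple single-parity layout; the function name `xor_blocks` and the sample data are hypothetical, and the paper's pipelined recovery mechanism is not modelled here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a non-empty list of equally sized byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Hypothetical video blocks stored on three back-end nodes.
blocks = [b"node", b"fail", b"over"]
parity = xor_blocks(blocks)  # parity block kept on a separate node

# Suppose the node holding blocks[1] fails: XOR the survivors with the parity.
recovered = xor_blocks([blocks[0], blocks[2], parity])
print(recovered == blocks[1])
```

Rebuilding one block requires reading every surviving block in the parity group, which is consistent with the abstract's point that a naive application of the technique loads the internal network and the back-end CPUs during recovery.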