• Title/Summary/Keyword: Software fault-tolerance


Design Technique and Application for Distributed Recovery Block Using the Partitioning Operating System Based on Multi-Core System (멀티코어 기반 파티셔닝 운영체제를 이용한 분산 복구 블록 설계 기법 및 응용)

  • Park, Hansol
    • Journal of IKEEE / v.19 no.3 / pp.357-365 / 2015
  • Embedded systems such as aircraft and automobiles are increasingly developed with a modular architecture instead of a federated architecture because of SWaP (Size, Weight and Power) constraints. In addition, partitioning operating systems that support multiple logical nodes based on the partition concept have recently appeared. The distributed recovery block is a fault-tolerance design scheme applicable to mission-critical real-time systems; it supports real-time takeover through real-time synchronization between the participating nodes. Because of this real-time synchronization, a single-core computer is not suitable for a partition-based distributed recovery block design. A multi-core, AMP (Asymmetric Multi-Processing) based partition architecture is required to apply the distributed recovery block design scheme. In this paper, we propose a design scheme for the distributed recovery block on a partitioning operating system with a multi-core, supervised-AMP architecture. We implement a flight control simulator for avionics to check the feasibility of the design scheme.
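
The abstract does not give the design itself, so the following is only a minimal sketch of the classic single-node recovery block that the distributed scheme builds on: try blocks run in order until one passes an acceptance test. The names `recovery_block`, `primary_sqrt`, `alternate_sqrt`, and `close_enough` are hypothetical, and the node-to-node synchronization and takeover described in the paper are not modeled.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class AllTryBlocksFailed(Exception):
    """Raised when no try block produces an acceptable result."""


def recovery_block(
    x: T,
    try_blocks: Sequence[Callable[[T], R]],
    acceptance_test: Callable[[T, R], bool],
) -> R:
    """Run the try blocks in order and return the first accepted result."""
    for block in try_blocks:
        try:
            result = block(x)
        except Exception:
            continue  # a crash counts as a failed try block
        if acceptance_test(x, result):
            return result
    raise AllTryBlocksFailed(f"no acceptable result for input {x!r}")


# Hypothetical try blocks: a faulty primary and a simple alternate.
def primary_sqrt(v: float) -> float:
    return v / 2  # deliberately wrong for most inputs


def alternate_sqrt(v: float) -> float:
    return v ** 0.5


def close_enough(v: float, r: float) -> bool:
    """Acceptance test: the result squared must reproduce the input."""
    return abs(r * r - v) < 1e-9
```

Calling `recovery_block(9.0, [primary_sqrt, alternate_sqrt], close_enough)` rejects the primary's result and falls back to the alternate.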

Term Clustering and Duplicate Distribution for Efficient Parallel Information Retrieval (효율적인 병렬정보검색을 위한 색인어 군집화 및 분산저장 기법)

  • 강재호;양재완;정성원;류광렬;권혁철;정상화
    • Journal of KIISE:Software and Applications / v.30 no.1_2 / pp.129-139 / 2003
  • The PC cluster architecture is considered as a cost-effective alternative to the existing supercomputers for realizing a high-performance information retrieval (IR) system. To implement an efficient IR system on a PC cluster, it is essential to achieve maximum parallelism by having the data appropriately distributed to the local hard disks of the PCs in such a way that the disk I/O and the subsequent computation are distributed as evenly as possible to all the PCs. If the terms in the inverted index file can be classified to closely related clusters, the parallelism can be maximized by distributing them to the PCs in an interleaved manner. One of the goals of this research is the development of methods for automatically clustering the terms based on the likelihood of the terms' co-occurrence in the same query. Also, in this paper, we propose a method for duplicate distribution of inverted index records among the PCs to achieve fault-tolerance as well as dynamic load balancing. Experiments with a large corpus revealed the efficiency and effectiveness of our method.
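
As a rough illustration of the idea in this abstract, the sketch below deals the terms of each cluster round-robin across nodes and stores every term on more than one node, so that a query can avoid a failed node and choose the less loaded replica. The placement rule (replicas on the following nodes) and the function names are assumptions, not the method evaluated in the paper.

```python
from collections import defaultdict


def distribute_terms(clusters, num_nodes, copies=2):
    """Assign each term to `copies` distinct nodes, interleaving clusters.

    Terms of one cluster are dealt round-robin, so terms that are likely
    to co-occur in a query land on different nodes.
    """
    if not 1 <= copies <= num_nodes:
        raise ValueError("copies must be between 1 and num_nodes")
    placement = {}
    slot = 0
    for cluster in clusters:
        for term in cluster:
            # First copy on node `slot`, duplicates on the following nodes.
            placement[term] = [(slot + k) % num_nodes for k in range(copies)]
            slot += 1
    return placement


def nodes_for_query(query_terms, placement, failed=frozenset()):
    """Pick, for each query term, the least-loaded live node holding it."""
    load = defaultdict(int)
    choice = {}
    for term in query_terms:
        live = [n for n in placement[term] if n not in failed]
        if not live:
            raise RuntimeError(f"no live replica holds {term!r}")
        node = min(live, key=lambda n: (load[n], n))
        load[node] += 1
        choice[term] = node
    return choice
```

With three nodes and two copies per term, a three-term query from one cluster is spread over all three nodes, and it can still be served when one node is marked as failed.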

Integrating Resilient Tier N+1 Networks with Distributed Non-Recursive Cloud Model for Cyber-Physical Applications

  • Okafor, Kennedy Chinedu;Longe, Omowunmi Mary
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.7 / pp.2257-2285 / 2022
  • Cyber-physical systems (CPS) have been growing exponentially due to improved cloud-datacenter infrastructure-as-a-service (CDIaaS). Incremental expandability (scalability), Quality of Service (QoS) performance, and reliability are currently the automation focus on healthy Tier 4 CDIaaS. However, stable QoS is yet to be fully addressed in Cyber-physical data centers (CP-DCS). Also, balanced agility and flexibility for the application workloads need urgent attention. There is a need for a resilient and fault-tolerant scheme in terms of CPS routing service, including Pod cluster reliability analytics, that meets QoS requirements. Motivated by these concerns, our contributions are fourfold. First, a Distributed Non-Recursive Cloud Model (DNRCM) is proposed to support cyber-physical workloads for remote lab activities. Second, an efficient QoS stability model with Routh-Hurwitz criteria is established. Third, the CDIaaS DCN topology is evaluated for handling large-scale traffic workloads. Network Function Virtualization (NFV) with Floodlight SDN controllers was adopted for the implementation of DNRCM with embedded rule-base in Open vSwitch engines. Fourth, QoS evaluation is carried out experimentally. Considering the non-recursive queuing delays with SDN isolation (logical), a lower queuing delay (19.65%) is observed. Without logical isolation, the average queuing delay is 80.34%. Without logical resource isolation, the fault tolerance yields 33.55%, while with logical isolation, it yields 66.44%. In terms of throughput, DNRCM, recursive BCube, and DCell offered 38.30%, 36.37%, and 25.53% respectively. Similarly, the DNRCM had an improved incremental scalability profile of 40.00%, while BCube and Recursive DCell had 33.33% and 26.67% respectively. In terms of service availability, the DNRCM offered 52.10% compared with recursive BCube and DCell, which yielded 34.72% and 13.18% respectively. The average delays obtained for DNRCM, recursive BCube, and DCell are 32.81%, 33.44%, and 33.75% respectively. Finally, workload utilization for DNRCM, recursive BCube, and DCell yielded 50.28%, 27.93%, and 21.79% respectively.

An Approach to Software Analysis and Design based on Distributed Components (분산 컴포넌트 기반의 소프트웨어 분석 및 설계 방법)

  • Choi, You-Hee;Yeom, Keun-Hyuk
    • Journal of KIISE:Software and Applications / v.28 no.12 / pp.896-909 / 2001
  • Recently, more than 50 percent of software has been developed on distributed application platforms, and technologies such as EJB (Enterprise Java Beans) [1], COM (Component Object Model) [2], and CORBA (Common Object Request Broker Architecture) [3] have advanced distributed component-based software development. A systematic development process is therefore necessary for developing component-based applications on distributed application platforms. However, most component-based software development processes do not define concrete flows between tasks or the relationships among the artifacts of each task. Distribution issues are also not considered explicitly in most component-based software development. In this paper, we present an approach to analyzing and designing software based on distributed components. The approach provides systematic guidelines for developing software based on the Unified Process and defines the relationships among the artifacts produced. We also explicitly consider distribution issues such as performance, fault tolerance, security, and distributed transactions in CORBA environments.


IMMORTAL : Fault Tolerant Distributed Middleware System based on Remote Method Invocation (IMMORTAL : 원격 메쏘드 호출에 기반한 결함허용 분산 미들웨어 시스템)

  • Hyun, Mu-Yong;Kim, Shik;Kim, Myung-Jun;Yamakita, Jiro
    • Journal of KIISE:Information Networking / v.29 no.5 / pp.562-572 / 2002
  • Distributed object technologies have become popular in developing distributed systems. Although middleware platforms such as DSOM, DCOM, CORBA and Java RMI ease the development of distributed applications, they do not directly improve the reliability and availability of these applications. Because developing fault-tolerance techniques for distributed object paradigms is often complicated and error-prone, there is a great need for a development toolkit that enhances the reliability and availability of distributed objects. In this paper, we propose a fault-tolerant distributed middleware system based on RMI, called IMMORTAL. We use a log-based rollback-recovery mechanism to support reliable distributed computing. Through a series of experiments, we observe that benchmark applications on IMMORTAL tolerate hardware and software failures, and we evaluate its performance and scalability.
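
The abstract names log-based rollback recovery without giving the protocol. The general idea can be sketched as follows, assuming a toy object whose method calls are written to a log file before they are applied, so that a restarted instance rebuilds its state by replaying the log. `LoggedCounter` and its log format are hypothetical and do not come from IMMORTAL.

```python
import json
import os
import tempfile


class LoggedCounter:
    """Toy stateful object whose method calls are logged before they run."""

    METHODS = ("add", "reset")

    def __init__(self, log_path):
        self.log_path = log_path
        self.value = 0
        self._replay()

    def _replay(self):
        """Rebuild state by re-applying every logged call in order."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self._apply(record["method"], record["arg"])

    def _apply(self, method, arg):
        if method == "add":
            self.value += arg
        else:  # "reset"
            self.value = 0

    def invoke(self, method, arg=0):
        """Write the call to stable storage first, then apply it."""
        if method not in self.METHODS:
            raise ValueError(f"unknown method {method!r}")
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"method": method, "arg": arg}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self._apply(method, arg)
        return self.value


def demo():
    """Simulate a crash and recover the state from the log."""
    path = os.path.join(tempfile.mkdtemp(), "calls.log")
    first = LoggedCounter(path)
    first.invoke("add", 5)
    first.invoke("add", 7)
    del first  # the in-memory state is lost
    recovered = LoggedCounter(path)
    return recovered.value
```

Logging before applying means a call that reached the log is never lost on restart; a real system would also need checkpoints to bound replay time.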

Connector for Dynamic Composition of Aspects Based on AOSD (AOSD기반에서 Aspect의 동적결합을 위한 Connector)

  • Kim Tae-Woong;Kim Tae-Gong
    • The KIPS Transactions:PartD / v.13D no.2 s.105 / pp.251-258 / 2006
  • Aspect-Oriented Software Development (AOSD) is a new software development method. It has many advantages for software performance, maintenance, and repair. It also offers a way to modularize secondary functions such as security and fault tolerance in an existing programming language. The present problem, however, is that a new aspect-oriented programming language has to be used. Furthermore, when aspects are applied to a legacy system, the source code has to be recompiled to build an AOSD-based software system. In this paper, we propose and design a Connector that can be composed with Aspects in a legacy system dynamically. To this end, we use information about the operations of the Core and the Aspects, together with pointcut information described in XML. Through a case study, we validate that the proposed Connector requires no new compiler, no recompilation, and no modification of the legacy system.

Efficient Fault Tolerance Method in CAN (CAN 통신에서의 효율적인 메시지 전송 오류 복구 방법)

  • Shin, Chang-Min
    • Proceedings of the Korean Information Science Society Conference / 2012.06a / pp.60-62 / 2012
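  • In CAN communication on the AUTOSAR software platform, an open standard for automotive embedded software platforms, a sending node that needs to deliver a large message splits it into several frames, and the receiving node reassembles the frames into a single message. A transmission error can prevent a frame from reaching the receiving node, yet the existing CAN modules specified by the AUTOSAR software platform do not define a retransmission technique for handling such errors. This paper concerns a message retransmission method for transmission errors that can occur in CAN communication on the AUTOSAR software platform. In the proposed method, only the frames in which a transmission error occurred are sent again, so that retransmission is carried out efficiently.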
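
The selective retransmission described above can be sketched roughly as follows, assuming each frame carries a sequence number and the sender learns after each round which frames are still missing. The frame layout, the 7-byte payload size, and the `lossy_send` callback are illustrative assumptions, not the AUTOSAR frame format or the paper's protocol.

```python
def segment(message: bytes, payload_size: int = 7):
    """Split a message into (sequence number, payload) frames."""
    return [
        (seq, message[i:i + payload_size])
        for seq, i in enumerate(range(0, len(message), payload_size))
    ]


def missing_frames(received: dict, total: int):
    """Sequence numbers the receiver has not got yet."""
    return [seq for seq in range(total) if seq not in received]


def transfer(message, lossy_send, payload_size=7, max_rounds=5):
    """Send all frames, then resend only the ones that were lost.

    `lossy_send(seq, payload)` is a hypothetical channel that returns
    True when the frame arrives. Returns the reassembled message and
    the number of frames put on the bus.
    """
    frames = segment(message, payload_size)
    received = {}
    pending = [seq for seq, _ in frames]
    sent = 0
    for _ in range(max_rounds):
        for seq in pending:
            sent += 1
            if lossy_send(seq, frames[seq][1]):
                received[seq] = frames[seq][1]
        pending = missing_frames(received, len(frames))
        if not pending:
            data = b"".join(received[s] for s in range(len(frames)))
            return data, sent
    raise TimeoutError(f"frames {pending} still missing after {max_rounds} rounds")
```

For a three-frame message with one lost frame, this puts four frames on the bus, whereas resending the whole message would put six.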

A Design of Fault Tolerance JavaRMI Object (고장 감내 자바 RMI 객체 설계)

  • Lee, Min-Seok;Yun, Tae-Jin;Ahn, Kwang-Seon
    • Proceedings of the Korea Information Processing Society Conference / 2000.10b / pp.1215-1218 / 2000
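  • Distributed object technologies such as CORBA, DCOM, and JavaRMI do not directly improve the reliability of distributed applications. Adding fault tolerance to these technologies requires object-level replica group management and mechanisms for fault detection and recovery. To develop fault-tolerant JavaRMI objects, this paper designs a group manager and remote interfaces for fault detection and group management, and defines a fault-tolerance class. It also extends the Naming class and RMIRegistry so that fault-tolerant objects can join a group transparently. By inheriting from the fault-tolerance class, application developers can build fault-tolerant application objects simply and without outside help.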
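
One way to picture the group manager's role is the sketch below, which assumes heartbeat-based failure detection and a simple "first live member is primary" rule. Both are assumptions, since the abstract does not say how faults are detected or how a primary is chosen. The sketch is in Python, not Java RMI, and `GroupManager` is a hypothetical name. Time is passed in explicitly to keep the example deterministic.

```python
class GroupManager:
    """Tracks replicas of one object and treats silent ones as failed."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_seen = {}  # replica id -> time of last heartbeat

    def join(self, replica_id: str, now: float) -> None:
        """Register a replica as a group member."""
        self.last_seen[replica_id] = now

    def heartbeat(self, replica_id: str, now: float) -> None:
        """Record that a registered replica is still alive."""
        if replica_id not in self.last_seen:
            raise KeyError(f"{replica_id!r} has not joined the group")
        self.last_seen[replica_id] = now

    def live_members(self, now: float) -> list:
        """Members whose last heartbeat is within the timeout."""
        return sorted(
            rid for rid, seen in self.last_seen.items()
            if now - seen <= self.timeout
        )

    def primary(self, now: float) -> str:
        """Pick the first live member; clients would be routed to it."""
        live = self.live_members(now)
        if not live:
            raise RuntimeError("no live replica in the group")
        return live[0]
```

If replica `a` stops sending heartbeats, `primary` moves to the next live replica once the timeout has passed, which is the failover a client-side stub would rely on.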


A Study on the Evolvable Hardware Design (EHW) (진화형하드웨어 설계에 관한 연구)

  • Kim, Jong-O;Kim, Duck-Soo;Lee, Won-Seok
    • Proceedings of the IEEK Conference / 2007.07a / pp.449-450 / 2007
  • Evolvable hardware (EHW) is a dynamic field that brings together reconfigurable hardware, artificial intelligence, fault tolerance and autonomous systems. This paper gives an introduction to the field. The features that can be used to identify and classify evolvable hardware are the evolutionary algorithm, the implementation and the genotype representation. EHW is hardware that can change its own circuit structure by genetic learning to achieve maximum adaptation to the environment. In conventional EHW, the learning is executed by software on a computer.


Performance Analysis of Error Classification System on Distributed Multimedia Environment (분산 멀티미디어 환경에서 실행되는 오류 분류 시스템의 성능 분석)

  • Ko Eung-Nam
    • Journal of Digital Contents Society / v.4 no.2 / pp.181-189 / 2003
  • Distributed multimedia applications require sophisticated QoS (quality of service) management. In distributed multimedia systems, the most important categories of quality of service are timeliness, volume, and reliability. In this paper, we discuss a method for increasing reliability through fault tolerance. We describe the design and implementation of ECA running in a distributed multimedia environment. ECA is a system that can automatically classify software errors in a distributed multimedia environment. This paper presents a performance analysis of the error classification system running in a distributed multimedia environment, using rule-based DEVS modeling and simulation techniques. In DEVS, a system has a time base, inputs, states, outputs, and functions.
