• Title/Summary/Keyword: Rollback And Re-Execution

Search Result 4, Processing Time 0.016 seconds

Optimizing Checkpoint Intervals for Real-Time Multi-Tasks with Arbitrary Periods (임의 주기를 가지는 실시간 멀티 태스크를 위한 체크포인트 구간 최적화)

  • Kwak, Seong-Woo;Yang, Jung-Min
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.60 no.1
    • /
    • pp.193-200
    • /
    • 2011
  • This paper presents an optimal checkpoint strategy for fault-tolerance in real-time systems. In our environment, multiple real-time tasks with arbitrary periods are scheduled in the system by Rate Monotonic (RM) algorithm, and checkpoints are inserted at a constant interval in each task while the width of interval is different with respect to the task. We propose a method to determine the optimal checkpoint interval for each task so that the probability of completing all the tasks is maximized. Whenever a fault occurs to a checkpoint interval of a task, the execution time of the task would be prolonged by rollback and re-execution of checkpoints. Our scheme includes the schedulability test to examine whether a task can be completed with an extended execution time. A numerical experiment is conducted to demonstrate the applicability of the proposed scheme.

Determination of Optimal Checkpoint Interval for RM Scheduled Real-time Tasks (RM 스케줄링된 실시간 태스크에서의 최적 체크 포인터 구간 선정)

  • Kwak, Seong-Woo;Jung, Young-Joo
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.56 no.6
    • /
    • pp.1122-1129
    • /
    • 2007
  • For a system with multiple real-time tasks of different deadlines, it is very difficult to find the optimal checkpoint interval because of the complexity in considering the scheduling of tasks. In this paper, we determine the optimal checkpoint interval for multiple real-time tasks that are scheduled by RM(Rate Monotonic) algorithm. Faults are assumed to occur with Poisson distribution. Checkpoints are inserted in the execution of task with equal distance in the same task, but different distances in other tasks. When faults occur, rollback to the latest checkpoint and re-execute task after the checkpoint. We derive the equation of maximum slack time for each task, and determine the number of re-executable checkpoint intervals for fault recovery. The equation to check the schedulibility of tasks is also derived. Based on these equations, we find the probability of all tasks executed within their deadlines successfully. Checkpoint intervals which make the probability maximum is the optimal.

Determination of the Optimal Checkpoint and Distributed Fault Detection Interval for Real-Time Tasks on Triple Modular Redundancy Systems (삼중구조 시스템의 실시간 태스크 최적 체크포인터 및 분산 고장 탐지 구간 선정)

  • Seong Woo Kwak;Jung-Min Yang
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.3
    • /
    • pp.527-534
    • /
    • 2023
  • Triple modular redundancy (TMR) systems can continue their mission by virtue of their structural redundancy even if one processor is attacked by faults. In this paper, we propose a new fault tolerance strategy by introducing checkpoints into the TMR system in which data saving and fault detection processes are separated while they corporate together in the conventional checkpoints. Faults in one processor are tolerated by synchronizing the state of three processors upon detecting faults. Simultaneous faults occurring to more than one processor are tolerated by re-executing the task from the latest checkpoint. We propose the checkpoint placement and fault detection strategy to maximize the probability of successful execution of a task within the given deadline. We develop the Markov chain model for the TMR system having the proposed checkpoint strategy, and derive the optimal fault detection and checkpoint interval.

Determination of Optimal Checkpoint Intervals for Real-Time Tasks Using Distributed Fault Detection (분산 고장 탐지 방식을 이용한 실시간 태스크에서의 최적 체크포인터 구간 선정)

  • Kwak, Seong Woo;Yang, Jung-Min
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.26 no.3
    • /
    • pp.202-207
    • /
    • 2016
  • Checkpoint placement is an effective fault tolerance technique against transient faults in which the task is re-executed from the latest checkpoint when a fault is detected. In this paper, we propose a new checkpoint placement strategy separating data saving and fault detection processes that are performed together in conventional checkpoints. Several fault detection processes are performed in one checkpoint interval in order to decrease the latency between the occurrence and detection of faults. We address the placement method of fault detection processes to maximize the probability of successful execution of a task within the given deadline. We develop the Markov chain model for a real-time task having the proposed checkpoints, and derive the optimal fault detection and checkpoint interval.