Search | Korea Science

Kwak, Seong-Woo; Yang, Jung-Min
- The Transactions of The Korean Institute of Electrical Engineers, v.60 no.1, pp.193-200, 2011
This paper presents an optimal checkpoint strategy for fault tolerance in real-time systems. In our environment, multiple real-time tasks with arbitrary periods are scheduled by the Rate Monotonic (RM) algorithm, and checkpoints are inserted at a constant interval within each task, while the interval length may differ from task to task. We propose a method to determine the optimal checkpoint interval for each task so that the probability of completing all the tasks is maximized. Whenever a fault occurs in a checkpoint interval of a task, the execution time of the task is prolonged by rollback and re-execution of that interval. Our scheme includes a schedulability test to examine whether a task can still be completed with the extended execution time. A numerical experiment is conducted to demonstrate the applicability of the proposed scheme.
https://doi.org/10.5370/KIEE.2011.60.1.193
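
The idea of testing schedulability with a fault-extended execution time can be illustrated with the classical response-time analysis for RM. This is a minimal sketch and not the test derived in the paper: the function names, the integer time units, and the assumption that each fault repeats exactly one interval (work plus checkpoint overhead) are all hypothetical.

```python
def extended_execution_time(n_intervals, interval, overhead, n_faults):
    # Assumption: every interval costs `interval` units of work plus `overhead`
    # units to save a checkpoint, and each fault repeats one such interval.
    return (n_intervals + n_faults) * (interval + overhead)


def rm_response_time(tasks, i):
    """Worst-case response time of task i under RM, or None if it misses its period.

    tasks: list of (execution_time, period) in integer time units, sorted by
    ascending period so that index order equals RM priority order.
    """
    c_i, t_i = tasks[i]
    r = c_i
    while True:
        # Interference from every higher-priority task released within r.
        # -(-r // t_j) is an integer ceiling of r / t_j.
        nxt = c_i + sum(-(-r // t_j) * c_j for c_j, t_j in tasks[:i])
        if nxt > t_i:
            return None  # deadline (= period) would be missed
        if nxt == r:
            return r  # fixed point reached
        r = nxt


# Hypothetical task set: two fault-free high-priority tasks and one
# low-priority task made of 3 unit-length intervals with zero overhead.
high = [(1, 4), (2, 6)]
for faults in range(4):
    c_low = extended_execution_time(3, 1, 0, faults)
    print(faults, rm_response_time(high + [(c_low, 12)], 2))
```

In this made-up task set the lowest-priority task stays schedulable for up to two re-executed intervals and misses its deadline at three. Faults in the higher-priority tasks are ignored here to keep the sketch short.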

Kwak, Seong-Woo; Jung, Young-Joo
- The Transactions of The Korean Institute of Electrical Engineers, v.56 no.6, pp.1122-1129, 2007
For a system with multiple real-time tasks with different deadlines, it is very difficult to find the optimal checkpoint interval because the scheduling of the tasks must be taken into account. In this paper, we determine the optimal checkpoint interval for multiple real-time tasks that are scheduled by the RM (Rate Monotonic) algorithm. Faults are assumed to occur according to a Poisson distribution. Checkpoints are inserted at equal distances within a task, but the distance may differ between tasks. When a fault occurs, the task rolls back to the latest checkpoint and re-executes the portion after that checkpoint. We derive the equation for the maximum slack time of each task and determine the number of checkpoint intervals that can be re-executed for fault recovery. The equation to check the schedulability of the tasks is also derived. Based on these equations, we find the probability that all tasks are executed successfully within their deadlines. The checkpoint intervals that maximize this probability are optimal.
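
To make the link between Poisson faults, slack, and the optimal interval concrete, the following sketch works through a single task in isolation. It is a simplification and not the paper's derivation: the function names, the parameter values, and the assumptions that a fault is detected at the end of its interval and that re-executed intervals can fail again are all hypothetical.

```python
import math


def completion_prob(n, k, p):
    """Probability that n intervals all succeed with at most k re-executions.

    Each attempt at an interval succeeds independently with probability p
    (negative binomial sum over the number j of failed attempts). Requires n >= 1.
    """
    q = 1.0 - p
    return sum(math.comb(n + j - 1, j) * p**n * q**j for j in range(k + 1))


def best_interval_count(work, deadline, overhead, rate, max_n=50):
    """Scan the number of checkpoint intervals and return (probability, n)."""
    best = (0.0, None)
    for n in range(1, max_n + 1):
        seg = work / n + overhead      # one interval plus its checkpoint
        slack = deadline - n * seg     # time left for re-executions
        if slack < 0:
            break                      # more checkpoints no longer fit
        k = int(slack // seg)          # re-executable intervals
        p = math.exp(-rate * seg)      # Poisson: no fault during one interval
        prob = completion_prob(n, k, p)
        if prob > best[0]:
            best = (prob, n)
    return best


# Hypothetical numbers: 10 units of work, deadline 20, checkpoint cost 0.5,
# fault rate 0.05 per time unit.
print(best_interval_count(10.0, 20.0, 0.5, 0.05))
```

The scan shows the trade-off the abstract describes: more checkpoints shorten each interval and leave room for more re-executions, but their overhead eats into the slack.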

Seong Woo Kwak; Jung-Min Yang
- The Journal of the Korea institute of electronic communication sciences, v.18 no.3, pp.527-534, 2023
Triple modular redundancy (TMR) systems can continue their mission by virtue of their structural redundancy even if one processor is affected by faults. In this paper, we propose a new fault tolerance strategy by introducing checkpoints into the TMR system in which the data saving and fault detection processes are separated, whereas they are performed together in conventional checkpoints. Faults in one processor are tolerated by synchronizing the state of the three processors upon detecting faults. Simultaneous faults occurring in more than one processor are tolerated by re-executing the task from the latest checkpoint. We propose a checkpoint placement and fault detection strategy to maximize the probability of successful execution of a task within the given deadline. We develop a Markov chain model for the TMR system with the proposed checkpoint strategy, and derive the optimal fault detection and checkpoint intervals.
https://doi.org/10.13067/JKIECS.2023.18.3.527
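
The two recovery paths described in this abstract can be sketched as a majority vote over the three processor states. This is an illustrative sketch only: the function name, the action labels, and the assumption that two faulty processors never agree on the same wrong state are hypothetical and not taken from the paper.

```python
from collections import Counter


def tmr_vote(states):
    """Compare three processor states at a fault detection point.

    Returns (action, agreed_state):
      "ok"       - all three agree, continue
      "resync"   - one processor disagrees, copy the majority state to it
      "rollback" - no majority, re-execute from the latest checkpoint
    """
    value, count = Counter(states).most_common(1)[0]
    if count == 3:
        return "ok", value
    if count == 2:
        return "resync", value
    return "rollback", None


print(tmr_vote([7, 7, 7]))
print(tmr_vote([7, 9, 7]))
print(tmr_vote([7, 9, 3]))
```

States must be hashable (for example, a checksum of the processor's memory image) for `Counter` to compare them.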

Kwak, Seong Woo; Yang, Jung-Min
- Journal of the Korean Institute of Intelligent Systems, v.26 no.3, pp.202-207, 2016
Checkpoint placement is an effective fault tolerance technique against transient faults, in which the task is re-executed from the latest checkpoint when a fault is detected. In this paper, we propose a new checkpoint placement strategy that separates the data saving and fault detection processes, which are performed together in conventional checkpoints. Several fault detection processes are performed within one checkpoint interval in order to decrease the latency between the occurrence and detection of faults. We address how to place the fault detection processes so as to maximize the probability of successful execution of a task within the given deadline. We develop a Markov chain model for a real-time task with the proposed checkpoints, and derive the optimal fault detection and checkpoint intervals.
https://doi.org/10.5391/JKIIS.2016.26.3.202
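
The effect of running several fault detections inside one checkpoint interval can be shown with a small calculation. This is a sketch under stated assumptions, not the placement derived in the paper: detection points are assumed to be equally spaced, detection itself is assumed to cost no time, and the function names are hypothetical.

```python
import math


def detection_time(interval, n_detections, fault_time):
    """Time since the last checkpoint at which a fault is detected.

    The interval is split into n_detections equal sub-intervals with a
    detection at the end of each; requires 0 < fault_time <= interval.
    """
    sub = interval / n_detections
    return math.ceil(fault_time / sub) * sub


def expected_wasted_time(interval, n_detections):
    """Mean work discarded on rollback for one fault uniform over the interval."""
    return interval * (n_detections + 1) / (2 * n_detections)


print(detection_time(12, 3, 5))
print(expected_wasted_time(12, 1), expected_wasted_time(12, 3))
```

With a single detection at the checkpoint itself, a fault always costs the whole interval; with more detections the expected loss approaches half the interval under these assumptions.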