Design and Cost Analysis for a Fault-Tolerant Distributed Shared Memory System

  • Received : 2015.12.03
  • Accepted : 2016.06.20
  • Published : 2016.08.31

Abstract

Algorithms implementing distributed shared memory (DSM) were developed to ensure consistency. The performance of DSM algorithms depends on system and usage parameters. However, making these algorithms tolerate faults remains an open research problem. In this study, we propose a fault-tolerant scheme for DSM systems and analyze its reliability and fault-tolerance overhead. Using this analysis, a suitable DSM algorithm can be chosen for an error-prone environment.
