An Efficient On-the-fly Repairing System of Order Violation Errors for Health Management of Airborne Software

  • Kim, Tae-Hyung (Department of Informatics, Gyeongsang National University) ;
  • Choi, Eu-Teum (Department of Informatics, Gyeongsang National University) ;
  • Jun, Yong-Kee (Department of Informatics, Gyeongsang National University)
  • Received : 2020.06.05
  • Accepted : 2020.09.16
  • Published : 2020.10.01

Abstract

A health management system for airborne software repairs errors that occur at runtime, which provides safety and reduces maintenance cost. Because it is difficult to eliminate every order violation error during development, such errors must be repaired on the fly during operation. Previous work, called Repairing Atomicity Violations (Repairing-AV), diagnoses order violations by comparing the execution order of accesses immediately before each access event executes. As a result, Repairing-AV incurs a time overhead proportional to the number of access events to the shared variable. This paper presents On-the-fly Repairing System (ORS), a technique that repairs order violations at the level of the object methods (functions) that contain the access events. ORS diagnoses an order violation by using the correct order of those methods, and treats it by stalling the thread in which the error is about to occur. To evaluate ORS, both Repairing-AV and ORS were applied to five synthetic programs containing order violation errors, and their time overheads were measured. The results show that ORS is more efficient than Repairing-AV when the number of accesses is about sixty or more.
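The idea of diagnosing at method granularity and treating a violation by stalling the offending thread can be illustrated with a small sketch. This is not the authors' ORS implementation: the class `MethodOrderGuard`, the method names `produce` and `consume`, and the helper `run_demo` are hypothetical names chosen for illustration, and the correct method order is assumed to be known in advance. The point it shows is that the order check runs once per method call, not once per access to the shared variable.

```python
import threading


class MethodOrderGuard:
    """Stalls a thread that enters a method before its required predecessor has finished.

    Hypothetical illustration only; not the ORS implementation from the paper.
    """

    def __init__(self, must_precede):
        # Maps a method name to the method that has to complete first
        # (the "correct order" is assumed to be given).
        self._must_precede = dict(must_precede)
        self._done = set()
        self._cond = threading.Condition()
        self.checks = 0  # number of order checks performed

    def enter(self, name):
        required = self._must_precede.get(name)
        with self._cond:
            self.checks += 1  # one diagnosis per method call
            if required is not None:
                # Treatment: stall this thread until the predecessor has completed.
                self._cond.wait_for(lambda: required in self._done)

    def leave(self, name):
        with self._cond:
            self._done.add(name)
            self._cond.notify_all()


def run_demo(accesses=100):
    guard = MethodOrderGuard({"consume": "produce"})
    shared = []
    result = []

    def produce():
        guard.enter("produce")
        for i in range(accesses):  # many accesses, but no per-access check
            shared.append(i)
        guard.leave("produce")

    def consume():
        guard.enter("consume")  # stalls here if produce() has not finished
        result.append(sum(shared))
        guard.leave("consume")

    # Start the consumer first, so the wrong order is likely without the guard.
    t_consumer = threading.Thread(target=consume)
    t_producer = threading.Thread(target=produce)
    t_consumer.start()
    t_producer.start()
    t_consumer.join()
    t_producer.join()
    return result[0], guard.checks


if __name__ == "__main__":
    total, checks = run_demo(100)
    print(total, checks)
```

In this sketch the number of order checks stays at two (one per guarded method call) however large `accesses` is, whereas a per-access scheme would perform a check for each of the shared-variable accesses.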
