CPL - Chalmers Publication Library

A Process Health Status Service for Safety Related Systems Using TT/ET Communication Scheduling

Carl Bergenhem ; Johan Karlsson (Department of Computer Science and Engineering, Networks and Systems (Chalmers))
Proc. IEEE 14th Pacific Rim International Symposium on Dependable Computing (PRDC 2008), pp. 122-131 (2008)
[Conference paper, peer-reviewed]

This paper describes a health status protocol for distributed real-time systems that use TTCAN, FlexRay, or other networks that support both time-triggered and event-triggered communication. The protocol allows a group of co-operating processes to establish a consistent view of each other’s health status over time. It extends the instantaneous view of each process’s operational status that a process group membership protocol provides. The health status and membership protocols are intended for systems where processes (not nodes) are considered the smallest unit of failure, and where process failures can be detected and recovered locally by the host node. Such systems require a decision function that determines whether a process failure is temporary (the process is being recovered by the host node) or permanent (local recovery is not possible or was unsuccessful). Our protocol ensures that such decisions are made consistently among correct nodes despite symmetrical and asymmetrical omission failures.
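
The abstract does not give the decision function itself, so the following is only a minimal illustrative sketch, not the authors' protocol. It assumes a hypothetical rule: a process absent from more than `max_missed_rounds` consecutive membership views is declared permanently failed, and any shorter absence is treated as temporary. The names `HealthTracker`, `Health`, and `max_missed_rounds` are invented for illustration, and the sketch assumes that an underlying membership protocol already delivers the same sequence of views to every correct node.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Health(Enum):
    OPERATIONAL = "operational"
    RECOVERING = "recovering"  # temporary failure: host node may still recover it
    FAILED = "failed"          # permanent failure: treated as final in this sketch


@dataclass
class HealthTracker:
    """Hypothetical per-node tracker; not taken from the paper."""
    max_missed_rounds: int  # assumed threshold, chosen by the system designer
    missed: Dict[str, int] = field(default_factory=dict)
    status: Dict[str, Health] = field(default_factory=dict)

    def update(self, membership_view: Dict[str, bool]) -> Dict[str, Health]:
        """Fold one agreed membership view (process id -> present?) into the status."""
        for pid, present in membership_view.items():
            if self.status.get(pid) is Health.FAILED:
                continue  # a permanent verdict is never revised here
            if present:
                self.missed[pid] = 0
                self.status[pid] = Health.OPERATIONAL
            else:
                self.missed[pid] = self.missed.get(pid, 0) + 1
                if self.missed[pid] > self.max_missed_rounds:
                    self.status[pid] = Health.FAILED
                else:
                    self.status[pid] = Health.RECOVERING
        return dict(self.status)
```

Because the verdict is a deterministic function of the view history, two nodes that receive identical views reach identical verdicts. Making the views identical despite omission failures is the part the protocol in the paper addresses, and this sketch does not model it.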

Keywords: fault tolerance, redundancy management, diagnosis, membership protocols



This record was created 2009-01-09.
CPL Pubid: 84828