CPL - Chalmers Publication Library

On the Design and Validation of Fault Containment Regions in Distributed Communication Systems

Håkan Sivencrona (Department of Computer Engineering)
Göteborg : Chalmers University of Technology, 2004. ISBN: 91-7291-378-9. 193 pp.
[Doctoral thesis]

This thesis has a twofold focus. The first is an evaluation of an implementation of a time-triggered communication protocol, TTP-C1, which was stressed by means of heavy-ion fault injection. The results revealed a type of failure previously known only from theory, so-called slightly-out-of-specification faults, which manifested as Byzantine failures and led to system inconsistency. The second focus is the design and simulation of algorithms for RedCAN switches that dynamically reconfigure a CAN bus.

The thesis is divided into three parts, which deal with the design, validation and analysis of dependable communication with respect to the two focus areas above.

Part I deals with the design of mechanisms for fault containment. We propose one algorithm to handle slightly-out-of-specification faults in the time domain, as they manifested during heavy-ion fault injection in the TTP-C1 implementation. We present a novel simulation tool to test and execute the scenarios that lead to serious communication degradation, using two different algorithms.

The second algorithm that we propose provides distributed recovery after permanent bus and node failures in a CAN communication system using RedCAN switches.
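To illustrate why a slightly-out-of-specification fault in the time domain can appear as a Byzantine failure, the following minimal sketch models receivers whose acceptance windows differ marginally. It is not taken from the thesis: the `Receiver` class, the `classify` function, and all window values and time units are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    """A node with its own (slightly different) frame acceptance window.

    The window bounds are hypothetical values in arbitrary time units.
    """
    name: str
    window_start: float
    window_end: float

    def accepts(self, arrival: float) -> bool:
        # A frame is accepted only if it arrives inside this node's window.
        return self.window_start <= arrival <= self.window_end


def classify(arrival: float, receivers: list[Receiver]) -> tuple[str, dict[str, bool]]:
    """Return a summary of how the receivers judge one frame arrival time."""
    verdicts = {r.name: r.accepts(arrival) for r in receivers}
    if all(verdicts.values()):
        return "accepted by all", verdicts
    if not any(verdicts.values()):
        return "rejected by all", verdicts
    # Some nodes accept and some reject: the nodes now hold different views.
    return "inconsistent", verdicts


receivers = [
    Receiver("A", 0.0, 10.0),
    Receiver("B", 0.0, 10.2),  # marginally more tolerant than A
    Receiver("C", 0.0, 9.9),   # marginally stricter than A
]

for arrival in (5.0, 10.1, 12.0):
    outcome, verdicts = classify(arrival, receivers)
    print(arrival, outcome, verdicts)
```

A frame that is clearly on time or clearly late is judged identically by every node. Only the marginal arrival time splits the receivers, which is the kind of disagreement that a fault containment mechanism has to resolve.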

Part II presents results from heavy-ion fault injection experiments in a TTP-C1 cluster consisting of four to nine synchronized nodes. The experiments were carried out to evaluate the performance of the time-triggered protocol and to assess the efficiency of the implemented dependability-increasing mechanism in the presence of faults.

Part II furthermore presents performance results for different RedCAN recovery algorithms, collected with a novel simulation tool that we have designed, the RedCAN simulation manager. Part II additionally includes validation results for a proposed solution that isolates asymmetric faults in a time-triggered system by means of an active star coupler.

Part III contains analyses of the presented validation results and of real-world experiences. In particular, Byzantine faults in a distributed communication system are discussed.

Keywords: time-triggered protocol, fault containment regions, fault injection, validation, slightly-out-of-specification faults, Byzantine faults, asymmetric faults, fault handling and membership agreement



This record was created 2006-08-25. Last modified 2013-09-25.
CPL Pubid: 92

 

Departments (Chalmers)

Department of Computer Engineering (2002-2004)

Subject areas

Information Technology

Part of series

Technical report D - School of Computer Science and Engineering, Chalmers University of Technology 23


Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie 2060