Design of Hardened Embedded Systems on Multi-FPGA Platforms
Abstract
The aim of this article is the definition of a reliability-aware methodology for the design of embedded systems
on multi-FPGA platforms. The designed system must be able to detect the occurrence of faults globally and
autonomously, in order to recover from them or to mitigate their effects. Two categories of faults are identified, based
on their impact on the device elements: (i) recoverable faults, transient problems that can be fixed without
causing a lasting effect, and (ii) nonrecoverable faults, those that cause a permanent problem, making
the affected portion of the fabric unusable. While some aspects can be taken from previous solutions available in the
literature, several open issues exist. In fact, no complete design methodology handling all the peculiar issues
of the considered scenario has been proposed yet, a gap we aim to fill with our work. The resulting system
exhibits reliability properties that increase its overall lifetime and availability.