
Invasive Tightly Coupled Processor Arrays: On-Demand Fault Tolerance on Massively Parallel Processor Arrays

Publisher: Springer Singapore
Copyright: © Springer Science+Business Media Singapore 2016
ISBN: 978-981-10-1057-6
Pages: 115–144
DOI: 10.1007/978-981-10-1058-3_4

Abstract

In this chapter, we present for the first time (a) a systematic and holistic method to realise on-demand fault tolerance support on Tightly Coupled Processor Arrays (TCPAs) rather than single processors. We propose (b) different levels of replication, i.e., no replication, Dual Modular Redundancy (DMR), and Triple Modular Redundancy (TMR), with different error-handling capabilities for TCPAs. A major contribution is to (c) apply these individual replication schemes based on our novel reliability calculus for each of the proposed schemes and on environmental conditions such as monitored Soft Error Rates (SERs) on the system. The strength of our reliability analysis is its use of application execution characteristics that we derive from the compilation process. This will guide a system to transparently adopt suitable fault tolerance techniques according to application needs.
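
To make the three replication levels concrete, the following is a minimal Python sketch of how replica outputs are generally combined under each scheme: a single copy offers no detection, DMR detects a mismatch between two copies, and TMR masks one faulty copy by majority vote. It is not the chapter's reliability calculus or its TCPA hardware mechanism. The names `Replication`, `combine`, and `choose_scheme`, and the two SER thresholds, are hypothetical and introduced only for illustration.

```python
from collections import Counter
from enum import Enum


class Replication(Enum):
    NONE = 1  # single copy: errors pass through undetected
    DMR = 2   # two copies: a mismatch is detected but cannot be corrected
    TMR = 3   # three copies: majority voting masks one faulty copy


def combine(outputs):
    """Combine 1, 2, or 3 replica outputs into (value, error_detected).

    value is None when an error is detected but cannot be corrected.
    """
    if len(outputs) == 1:
        return outputs[0], False
    if len(outputs) == 2:
        first, second = outputs
        if first != second:
            return None, True  # DMR: detection only
        return first, False
    if len(outputs) == 3:
        value, votes = Counter(outputs).most_common(1)[0]
        if votes >= 2:
            return value, votes < 3  # TMR: corrected if exactly one copy differs
        return None, True  # all three disagree: no majority
    raise ValueError("expected 1, 2, or 3 replica outputs")


def choose_scheme(ser, dmr_threshold, tmr_threshold):
    """Pick a replication level from a monitored soft error rate.

    Assumption: simple threshold comparison with caller-supplied,
    hypothetical thresholds. The chapter instead derives the choice
    from a reliability analysis that also uses compile-time
    application characteristics.
    """
    if ser >= tmr_threshold:
        return Replication.TMR
    if ser >= dmr_threshold:
        return Replication.DMR
    return Replication.NONE
```

For example, `combine([7, 8, 7])` yields the majority value `7` with the error flag set, whereas `combine([7, 8])` can only report that the two copies disagree.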

Published: Jul 9, 2016

Keywords: Fault Tolerance; Soft Error; Processor Array; Static Random Access Memory; Error Handling
