Resilience for Exascale Enabled Multigrid Methods
January 29, 2015 · Declared Dead · arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Markus Huber, Björn Gmeiner, Ulrich Rüde, Barbara Wohlmuth
arXiv ID
1501.07400
Category
cs.CE: Computational Engineering
Cross-listed
cs.DC
Citations
6
Venue
arXiv.org
Last Checked
2 months ago
Abstract
With the increasing number of components and further miniaturization, the mean time between faults in supercomputers will decrease. System-level fault tolerance techniques are expensive and cost energy, since they are often based on redundancy. Classical checkpoint-restart techniques also reach their limits when the time for storing the system state to backup memory becomes excessive. Therefore, algorithm-based fault tolerance mechanisms can become an attractive alternative. This article investigates the solution process for elliptic partial differential equations that are discretized by finite elements. Faults that occur in the parallel geometric multigrid solver are studied in various model scenarios. In a standard domain partitioning approach, the failure of a core or a node affects one or several subdomains. Different strategies are developed to compensate for the effect of such a failure algorithmically. The recovery is achieved by solving a local subproblem with Dirichlet boundary conditions using local multigrid cycling algorithms. Additionally, we propose a superman strategy in which extra compute power is employed to minimize the time of the recovery process.
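The recovery idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 1D Poisson problem with a finite-difference stencil in place of finite elements, and it swaps the local multigrid cycling for a direct tridiagonal (Thomas) solve. The names `solve_dirichlet`, `u_faulty`, and `u_rec`, and the choice of lost index range, are hypothetical and chosen only for illustration.

```python
import math

def solve_dirichlet(f, h, left, right):
    """Solve -u'' = f on the interior nodes of an interval.

    `left` and `right` are the Dirichlet values just outside the interval.
    Assumption: a direct Thomas solve stands in for local multigrid cycles.
    """
    m = len(f)
    rhs = [fi * h * h for fi in f]
    rhs[0] += left      # known neighbour value moves to the right-hand side
    rhs[-1] += right
    diag = [2.0] * m    # off-diagonal entries are all -1
    for i in range(1, m):           # forward elimination
        w = -1.0 / diag[i - 1]
        diag[i] += w
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * m
    u[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):  # back substitution
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return u

# Global model problem on (0, 1) with homogeneous Dirichlet boundary values.
n = 63
h = 1.0 / (n + 1)
x = [(i + 1) * h for i in range(n)]
f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
u_ref = solve_dirichlet(f, h, 0.0, 0.0)

# Simulated fault: the values owned by one subdomain are lost.
lo, hi = 20, 36
u_faulty = list(u_ref)
u_faulty[lo:hi] = [0.0] * (hi - lo)

# Recovery: local subproblem with Dirichlet data taken from intact neighbours.
u_rec = list(u_faulty)
u_rec[lo:hi] = solve_dirichlet(f[lo:hi], h, u_faulty[lo - 1], u_faulty[hi])

fault_error = max(abs(a - b) for a, b in zip(u_faulty, u_ref))
recovery_error = max(abs(a - b) for a, b in zip(u_rec, u_ref))
```

In this sketch the surviving values are already the converged discrete solution, so the local Dirichlet solve restores the lost values up to rounding. When the fault hits mid-iteration, the boundary data are only approximate and the recovered subdomain inherits that error.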
Similar Papers
In the same crypt: Computational Engineering
A Probabilistic Graphical Model Foundation for Enabling Predictive Digital Twins at Scale · Ghosted
Temporal Attention augmented Bilinear Network for Financial Time-Series Data Analysis · Ghosted
Linked Component Analysis from Matrices to High Order Tensors: Applications to Biomedical Data · Ghosted
Deep Dynamical Modeling and Control of Unsteady Fluid Flows · Ghosted
Design and Optimization of Conforming Lattice Structures · Ghosted
Died the same way: Ghosted
Language Models are Few-Shot Learners · Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library · Ghosted
XGBoost: A Scalable Tree Boosting System · Ghosted