NDLI: Asynchronous and Multiprecision Linear Solvers - Scalable and Fault-Tolerant Numerics for Energy Efficient High Performance Computing

Content Provider	Semantic Scholar
Author	Anzt, Hartwig
Copyright Year	2012
Abstract	The development of modern technology is characterized by simulations that are no longer performed as physical experiments but through mathematical modeling with partial differential equations. Numerical algorithms have often replaced the need for laboratories, but they typically demand immense computing power. While hardware manufacturers try to keep up with the demand for petaflops, the suitability of the numerical methods employed in the simulation algorithms decreases constantly. This often stems from a gap between the design and parallelism of the numerical algorithms forming the simulation code and the parallelism and complexity provided by today's and future hardware platforms, which impacts performance, dependability and resource efficiency. In a nutshell, three main challenges can be identified when aiming for exascale simulation algorithms: scalability, reliability and energy efficiency. As future hardware architectures are expected to consist of several million up to billions of processing units located in different devices connected via different communication technologies, algorithms that are to run efficiently on these machines must scale to this immense number of processors, which implies reducing communication to a minimum. Furthermore, as the high number of processors also implies a significant failure rate, a high tolerance to hardware errors is essential to ensure that simulations complete. While checkpointing strategies are widely used in today's implementations, algorithms will no longer be able to rely on this technique as soon as the hardware complexity induces a mean time to failure of the full system that is smaller than the time needed for checkpointing and restarting. Finally, the power demand of the computing facilities handling the simulation experiments can be identified as a major hurdle.
Already today, energy costs often exceed acquisition costs after a few years, posing an economic challenge; moreover, the power demand and the ecological footprint call the resource efficiency of computer simulations into question. While future hardware is expected to reduce the power demand by featuring efficient accelerator technology and energy-saving mechanisms, conventional software usually ignores this issue by allowing only very limited use of these techniques. In many simulation applications, generating solution approximations of discretized partial differential equations is the computationally most expensive part of the algorithm, particularly since traditional numerical methods require both communication and synchronization, which limit efficient hardware usage. In this thesis we target the inevitable question of how numerical solvers can be adapted to future computing facilities by proposing unconventional methods suitable for the highly parallel and hybrid hardware platforms that are expected in the near future. In particular, we address the topic of hardware-adapted methods by aiming for synchronization-free linear solvers that minimize idle times by removing synchronization barriers and therefore allow the efficient use of computer systems consisting of components with different hardware characteristics. The implied high tolerance with respect to communication latencies improves the fault tolerance of the simulation method. As asynchronous methods also enable the use of the power- and energy-saving mechanisms provided by the hardware, they address all the challenges we identified for numerical methods in the exascale era and combine the most important characteristics required for hardware-efficient simulation algorithms.
From the theoretical point of view, we investigate the derived methods with respect to their convergence properties and analyze the potential of adapting them to a specific problem by accounting for the discretization method or the matrix characteristics. We also provide a comprehensive study revealing excellent performance, scalability and fault-tolerance properties as well as remarkable energy efficiency of block-asynchronous iteration on different hardware architectures.
File Format	PDF HTM / HTML
Alternate Webpage(s)	https://d-nb.info/1029764689/34
Language	English
Access Restriction	Open
Content Type	Text
Resource Type	Article
