Dagstuhl Seminar 21441
Adaptive Resource Management for HPC Systems
(Nov 01 – Nov 05, 2021)
Organizers
- Michael Gerndt (TU München, DE)
- Masaaki Kondo (Keio University - Yokohama, JP)
- Barton P. Miller (University of Wisconsin-Madison, US)
- Tapasya Patki (LLNL - Livermore, US)
Contact
- Andreas Dolzmann (for scientific matters)
- Jutka Gasiorowski (for administrative matters)
Today's supercomputers have very static resource management. Jobs are submitted via batch scripts to the resource manager and then scheduled on the machine with a fixed set of nodes. Other resources, such as power, network bandwidth, and storage, are not actively managed and are provided only on a best-effort basis. This inflexible, node-centric, and static resource management will have to change for several reasons, some of which are listed below.
First, applications are becoming increasingly dynamic. Techniques such as adaptive mesh refinement, used for example in tsunami simulations, lead to scalability changes over the application's execution. Furthermore, only some application phases might profit from specialized accelerators, and I/O phases might even run best with a limited number of compute resources.
Additionally, the execution environment of applications is also becoming dynamic. Modern processors change the clock frequency according to the instruction mix as well as power and thermal envelopes. For example, heavy use of the vector units can lower the clock frequency to stay within the thermal and power budget.
As an independent concern, the sheer number of components means that failure rates are expected to increase, slowing down computation or even leading to a higher number of node failures.
Finally, upcoming machines will be power constrained, which means that power will have to be carefully distributed among all running applications. The resulting power capping will impact an application's performance through clock-frequency adaptation and manufacturing variability. These challenges in HPC will only be solvable with a more adaptive resource management approach. For example, compute nodes need to be redistributed among running applications to adapt to changes in an application's resource requirements, whether due to a varying number of grid points or to interspersed algorithmic phases that profit from certain accelerators; network and I/O bandwidth will have to be assigned to applications to avoid interference caused by contention between concurrent communication and I/O phases; and power needs to be dynamically redistributed both within an application and across applications to increase efficiency. Dynamic redistribution of resources will also give the resource manager more flexibility to schedule jobs on the available resources and thus reduce idle times and efficiency-lowering contention, e.g., when large jobs are waiting for execution.
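As an illustration of what dynamic power redistribution across applications could mean in practice, the following is a minimal sketch, not any existing resource manager's API: a cluster-wide power budget is split among running jobs in proportion to their measured demand. The job names, wattages, and the `redistribute_power` helper are purely hypothetical.

```python
# Minimal sketch (not a real resource manager's API): redistribute a
# cluster-wide power budget among running jobs in proportion to their
# measured power demand. Job names and wattages are made up for illustration.

CLUSTER_POWER_BUDGET_W = 10_000.0  # total power budget for the machine, in watts


def redistribute_power(measured_demand_w):
    """Return a power cap per job, scaled so the sum stays within the budget.

    measured_demand_w: dict mapping job id -> currently measured power draw (W).
    """
    total_demand = sum(measured_demand_w.values())
    if total_demand <= CLUSTER_POWER_BUDGET_W:
        # Enough headroom: every job may keep its current draw.
        return dict(measured_demand_w)
    # Over budget: scale every job's cap down proportionally.
    scale = CLUSTER_POWER_BUDGET_W / total_demand
    return {job: demand * scale for job, demand in measured_demand_w.items()}


if __name__ == "__main__":
    # Hypothetical telemetry snapshot (job id -> watts currently drawn).
    demand = {"job-A": 6000.0, "job-B": 3000.0, "job-C": 2500.0}
    for job, cap in redistribute_power(demand).items():
        print(f"{job}: capped at {cap:.0f} W")
```

A production power manager would of course also account for job priorities, node-level power models, manufacturing variability, and the non-linear relation between power caps and performance; the proportional rule above is only the simplest possible policy.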
This Dagstuhl Seminar will investigate a holistic, layered approach to adaptive resource management. It starts with the resource management layer, which is responsible for scheduling applications on the machine and dynamically allocating resources to the running applications. At the programming level, applications need to be written in a resource-aware style so that they can adapt to resource changes and make the most efficient use of the resources. On top of the programming interfaces, programming tools have to be available that allow application developers to analyze and tune their applications for varying amounts of available resources. At the application level, applications have to be redesigned to enable significant gains in efficiency and throughput; adaptive mesh refinement, approximate computing, and power-aware algorithms are a few examples.
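To make the notion of resource-aware programming more concrete, here is a minimal sketch under entirely hypothetical names: the `ResourceManager` class and its `current_allocation` call stand in for whatever interface a malleable runtime would actually provide, and no existing programming model is implied. The application loop checks for allocation changes and repartitions its grid accordingly.

```python
# Minimal sketch of a resource-aware application loop. The ResourceManager
# class below is a hypothetical stand-in for whatever interface a malleable
# runtime or resource manager would expose; no existing API is implied.

import random


class ResourceManager:
    """Toy stand-in that occasionally grows or shrinks the job's allocation."""

    def __init__(self, nodes):
        self.nodes = nodes

    def current_allocation(self):
        # Simulate the resource manager adding or removing a node now and then.
        self.nodes = max(1, self.nodes + random.choice([-1, 0, 0, 0, 1]))
        return self.nodes


def partition(num_cells, num_nodes):
    """Split num_cells grid cells as evenly as possible across num_nodes."""
    base, extra = divmod(num_cells, num_nodes)
    return [base + (1 if i < extra else 0) for i in range(num_nodes)]


def run(steps, num_cells):
    rm = ResourceManager(nodes=4)
    nodes = rm.current_allocation()
    work = partition(num_cells, nodes)
    for step in range(steps):
        new_nodes = rm.current_allocation()
        if new_nodes != nodes:
            # Resource change: repartition the grid before the next timestep.
            nodes = new_nodes
            work = partition(num_cells, nodes)
            print(f"step {step}: now {nodes} nodes, partition {work}")
        # ... compute one timestep on the current partition ...


if __name__ == "__main__":
    run(steps=20, num_cells=1000)
```

In a real malleable application, the repartitioning step would additionally involve migrating data between processes and re-establishing communicators, which is where most of the engineering effort lies.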
The outcomes of this seminar will be a list of challenges and a roadmap identifying the next steps for implementing adaptive resource management in HPC systems, including languages, message-passing libraries, resource managers, tools, and runtimes. A report will be published after the seminar.
The discussions during the seminar led to a joint summary presenting the state of the art, the techniques required on each of these layers of the HPC software stack, and the foreseen advantages of adaptive resource management.
- Eishi Arima (TU München, DE) [dblp]
- Eduardo César (Autonomous University of Barcelona, ES) [dblp]
- Isaías Alberto Comprés Ureña (TU München, DE) [dblp]
- Michael Gerndt (TU München, DE) [dblp]
- Jophin John (TU München, DE) [dblp]
- Matthias Maiterth (TU München, DE) [dblp]
- Barton P. Miller (University of Wisconsin-Madison, US) [dblp]
- Bernd Mohr (Jülich Supercomputing Centre, DE) [dblp]
- Frank Mueller (North Carolina State University - Raleigh, US) [dblp]
- Santiago Narvaez Rivas (TU München, DE)
- Mirko Rahn (Fraunhofer ITWM - Kaiserslautern, DE) [dblp]
- Lubomir Riha (VSB-Technical University of Ostrava, CZ) [dblp]
- Martin Schulz (TU München, DE) [dblp]
- Anna Sikora (Autonomous University of Barcelona, ES) [dblp]
- Ondrej Vysocky (VSB-Technical University of Ostrava, CZ)
- Felix Wolf (TU Darmstadt, DE) [dblp]
- Dong Ahn (LLNL - Livermore, US)
- Andrea Bartolini (University of Bologna, IT) [dblp]
- Pete Beckman (Argonne National Laboratory - Lemont, US) [dblp]
- Mohak Chadha (TU München, DE)
- Julita Corbalan (Barcelona Supercomputing Center, ES)
- Balazs Gerofi (RIKEN - Kobe, JP) [dblp]
- Toshihiro Hanawa (University of Tokyo, JP) [dblp]
- Shantenu Jha (Rutgers University - Piscataway, US) [dblp]
- Rashawn Knapp (Intel - Hillsboro, US) [dblp]
- Masaaki Kondo (Keio University - Yokohama, JP) [dblp]
- Daniel John Milroy (LLNL - Livermore, US)
- Tapasya Patki (LLNL - Livermore, US) [dblp]
- Barry L. Rountree (LLNL - Livermore, US) [dblp]
- Roxana Rusitoru (Arm - Cambridge, GB) [dblp]
- Ryuichi Sakamoto (Tokyo Institute of Technology, JP) [dblp]
- Wolfgang Schröder-Preikschat (Universität Erlangen-Nürnberg, DE) [dblp]
Classification
- modelling / simulation
- operating systems
- optimization / scheduling
Keywords
- High Performance Computing
- Programming Tools
- Power Management
- Resource Management