October 24, 2022 | Detroit, Michigan

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2022 - Detroit, MI + Virtual and add this Co-Located event to your registration to participate in these sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is displayed in Eastern Daylight Time (EDT), UTC -4.

The schedule is subject to change.
Monday, October 24 • 11:20am - 11:30am
⚡ Lightning Talk: Fluence: Approaching a Converged Computing Environment - Daniel Milroy, Lawrence Livermore National Laboratory & Claudia Misale, IBM T.J. Watson Research Center

Adoption of cloud technologies by high performance computing (HPC) is accelerating, and HPC users want their applications to perform well everywhere. While container orchestration provides resiliency, elasticity, and declarative management, it is not designed to deliver application performance the way HPC schedulers are. In particular, Kube-scheduler is not suited to scheduling emerging HPC workflows that depend on advantageous pod placement. In response to interest in scheduling flexibility, the Kubernetes (K8s) community developed the Scheduling Framework to integrate new policies and schedulers. KubeFlux, a Scheduling Framework plugin based on the Fluxion open-source HPC scheduler, provides HPC scheduling capability in K8s. We detail our improvements to the MPI Operator and demonstrate that the improved operator scales to 16,384 ranks. Using the improved operator, we compare the performance of HPC benchmark apps scheduled by Kube-scheduler and by KubeFlux. We conclude that KubeFlux makes pod placements that enable much higher app performance than Kube-scheduler. KubeFlux is an example of the rich capability that can be added to K8s, and it paves the way to converged computing environments with the best capabilities of HPC and cloud.

Speakers
Daniel Milroy

Computer Scientist, Lawrence Livermore National Laboratory
Daniel Milroy is a Computer Scientist at the Center for Applied Scientific Computing at the Lawrence Livermore National Laboratory. His research focuses on graph-based scheduling and resource representation and management for high performance computing (HPC) and cloud converged environments...
Claudia Misale

Staff Research Scientist, IBM T.J. Watson Research Center
Claudia Misale is a Staff Research Scientist in the Hybrid Cloud Infrastructure Software group at IBM T.J. Watson Research Center (NY). Her research is focused on Kubernetes for IBM Public Cloud, and also targets porting HPC applications to the cloud by enabling batch scheduling alternatives...



Monday October 24, 2022 11:20am - 11:30am EDT
Room 252 AB