October 24, 2022 | Detroit, Michigan

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2022 - Detroit, MI + Virtual and add this Co-Located event to your registration to participate in these sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is displayed in Eastern Daylight Time (EDT), UTC -4.

The schedule is subject to change.
Monday, October 24 • 1:25pm - 1:55pm
Make Kubernetes Networking Ready for World Class AI and HPC Workloads - Sunyanan Choochotkaew, IBM & Gaurav Singh, Red Hat

While the use of Kubernetes for various services is growing rapidly, it still lags behind in the world of HPC and AI clusters. Part of the reason is the lack of support for advanced features available in HPC/AI systems, such as multiple 100G networks. The vast majority of AI systems offered by hyperscalers such as IBM Cloud, AWS, Azure, and Oracle Cloud come with two to eight 100G network interfaces on their A100 GPU nodes. By default, however, a Kubernetes pod has only one network interface, even though attaching multiple interfaces is often a requirement in these scenarios. Multus unlocks the potential of multi-networking in Kubernetes, but challenges remain in usability, manageability, and scalability. We present Multi-NIC CNI, a new open-source project that aims to democratize multi-interface capability for everyone. This CNI relieves users of concerns about environment heterogeneity and of the need to acquire CNI-specific knowledge. This talk will introduce the architecture, use cases, and performance of the CNI, then show how beneficial it is for HPC/AI. We will demonstrate the CNI on a large-scale GPU cluster consisting of over 1,400 GPUs and two 100G network interfaces that we built in IBM Cloud.

Speakers
Sunyanan Choochotkaew

Research Scientist, IBM
Sunyanan Choochotkaew is a research scientist at IBM Research - Tokyo, specializing in research on distributed computing and performance acceleration on Cloud platforms. She received her Ph.D. in information and computer sciences from Osaka University, Japan. She served as a program...
Gaurav Singh

Product Manager, Red Hat
Gaurav Singh is a product manager for Red Hat OpenShift. He is responsible for core OpenShift components such as the scheduler, kubelet, and pod autoscaling. Prior to Red Hat, he worked as a product manager for Siemens, Hitachi Vantara, and Dell.



Monday October 24, 2022 1:25pm - 1:55pm EDT
Room 252 AB