Mohan Kalkunte, VP of Architecture and Technology at Broadcom, discusses Broadcom's Ethernet fabric solutions for AI. He emphasizes the requirements that set AI networking apart: traffic consists of relatively few high-bandwidth, bursty flows, and minimizing tail latency is critical to improving job completion time. He identifies challenges such as flow collisions, link failures, and incast, and proposes solutions including network telemetry, packet spraying, cognitive routing, zero-impact failure recovery, and receiver-side credit control. Kalkunte also notes that front-end and back-end network requirements are converging in cloud environments, and introduces Broadcom's scheduled fabric solutions: Jericho3-AI for switch scheduling and Tomahawk 5 for endpoint scheduling.