OpenAI describes how it rearchitected its WebRTC stack to deliver low-latency voice interactions at scale for ChatGPT voice and the Realtime API. The team addressed three infrastructure constraints: one-port-per-session media termination, stateful ICE/DTLS session ownership, and global routing latency. They implemented a split relay plus transceiver architecture that preserves standard WebRTC client behavior while optimizing internal packet routing.
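The "one-port-per-session" constraint mentioned above is a common scaling limit for media relays: binding a fresh UDP port for every session exhausts ports and complicates load balancing. A standard alternative is terminating many sessions on one port and demultiplexing by remote address. The sketch below illustrates that general technique only; the class and field names are hypothetical, and this is not OpenAI's actual relay implementation.

```python
import socket

class SingleSocketRelay:
    """Illustrative sketch: many media sessions share one local UDP port,
    distinguished by the remote (ip, port) tuple rather than by a
    dedicated local port per session."""

    def __init__(self, port=0):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind(("127.0.0.1", port))
        self.sessions = {}  # remote (ip, port) -> session id

    def handle(self):
        data, addr = self.sock.recvfrom(2048)
        # A previously unseen remote address starts a new session,
        # with no additional local port allocated.
        sid = self.sessions.setdefault(addr, len(self.sessions))
        return sid, data
```

In production WebRTC relays this demultiplexing is typically paired with ICE attributes (e.g. the ufrag) so sessions survive client address changes, but address-based routing conveys the core idea.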
How OpenAI delivers low-latency voice AI at scale
Monday, May 4, 2026, 12:00 PM UTC · 2 min read · Source: Hacker News
Tags
infrastructure