From Custom to Open: Scalable Network Probing and HTTP/3 Readiness with Prometheus
The Problem: Legacy Tooling and Its Limitations
Currently, Slack utilizes a hybrid approach to network measurement, incorporating both internal (such as traffic between AWS Availability Zones) and external (monitoring traffic from the public internet into Slack’s infrastructure) solutions. These tools comprise a combination of commercial SaaS offerings and custom-built network testing solutions developed by our…

Slack Faces Challenges with Legacy Network Monitoring Tools as It Embraces HTTP/3
Slack, the popular communication platform, has long relied on a hybrid approach to network measurement that combines internal and external solutions: a mix of commercial SaaS offerings and custom-built network testing tools developed by internal teams over time. This approach met the company's needs until it rolled out HTTP/3 support at the edge, where it ran into significant challenges.
The core issue stemmed from a lack of client-side observability. HTTP/3, built on the QUIC transport protocol, uses UDP instead of the traditional TCP. This shift to a new transport protocol meant that existing monitoring tools and SaaS solutions were unable to probe HTTP/3 endpoints for metrics. At the time, there was a major gap in the market: none of the SaaS observability tools Slack investigated supported HTTP/3 probing out of the box.
Prometheus Blackbox Exporter (BBE), a cornerstone of Slack's monitoring infrastructure, also lacked native support for QUIC. Without a way to probe the hundreds of thousands of HTTP/3 endpoints in its new infrastructure, Slack could not get the client-side visibility it needed to detect regressions to HTTP/2 or to obtain accurate round-trip measurements.
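To make those two signals concrete, here is a minimal Go sketch of how a probe result might record the negotiated HTTP version and flag a fallback from HTTP/3. It is not BBE's actual code: the `ProbeResult` type and its methods are hypothetical names chosen for illustration, and the recorded duration is simply whatever elapsed time the caller measured, which only approximates a round trip.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// ProbeResult is a hypothetical record of the two signals discussed above:
// the HTTP version that was actually negotiated and the time the request took.
type ProbeResult struct {
	ProtoMajor int           // 3 for HTTP/3, 2 for HTTP/2, 1 for HTTP/1.x
	Duration   time.Duration // elapsed time measured by the caller
}

// FellBack reports whether a probe that expected HTTP/3 ended up on an
// older HTTP version.
func (r ProbeResult) FellBack() bool { return r.ProtoMajor < 3 }

// String renders the result in a compact key=value form for logging.
func (r ProbeResult) String() string {
	return fmt.Sprintf("proto_major=%d fell_back=%t duration=%s",
		r.ProtoMajor, r.FellBack(), r.Duration)
}

// resultFrom builds a ProbeResult from a response and the elapsed time.
// The standard library exposes the negotiated version as resp.ProtoMajor.
func resultFrom(resp *http.Response, elapsed time.Duration) ProbeResult {
	return ProbeResult{ProtoMajor: resp.ProtoMajor, Duration: elapsed}
}
```

A caller would time the request, pass the response to `resultFrom`, and alert or export a metric whenever `FellBack()` returns true for an endpoint that is expected to serve HTTP/3.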
Enter Sebastian Feliciano, an intern who took on the challenge of addressing this gap. Feliciano scoped, implemented, and ultimately open-sourced QUIC support for Prometheus BBE. The first step was selecting a QUIC-capable HTTP client. After careful consideration, quic-go was chosen as the foundation for the new functionality, based on its wide adoption across other open source technologies and its first-class support for creating HTTP clients in Go.
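One way to picture why the client choice matters: Go's standard `http.Client` accepts any `http.RoundTripper`, so a prober written against that interface can be pointed at a QUIC-capable transport without changing the probing logic. The sketch below assumes that design and is not BBE's implementation. `probe`, `fakeH3Transport`, and `probeFake` are hypothetical, and the fake transport only returns a canned HTTP/3 response so the example runs with the standard library alone.

```go
package main

import (
	"io"
	"net/http"
	"strings"
	"time"
)

// probe issues one GET through whatever transport the client carries and
// returns the negotiated major HTTP version plus the elapsed time.
func probe(client *http.Client, url string) (int, time.Duration, error) {
	start := time.Now()
	resp, err := client.Get(url)
	if err != nil {
		return 0, 0, err
	}
	defer resp.Body.Close()
	// Drain the body so the timing covers the full response.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return 0, 0, err
	}
	return resp.ProtoMajor, time.Since(start), nil
}

// fakeH3Transport is a hypothetical stand-in for a QUIC-capable transport.
// It performs no network I/O and always answers with a canned HTTP/3 response.
type fakeH3Transport struct{}

func (fakeH3Transport) RoundTrip(req *http.Request) (*http.Response, error) {
	return &http.Response{
		Status:     "200 OK",
		StatusCode: http.StatusOK,
		Proto:      "HTTP/3.0",
		ProtoMajor: 3,
		Header:     make(http.Header),
		Body:       io.NopCloser(strings.NewReader("ok")),
		Request:    req,
	}, nil
}

// probeFake wires the fake transport into a client and runs one probe.
// The URL is a placeholder; the fake transport never dials it.
func probeFake() (int, error) {
	client := &http.Client{Transport: fakeH3Transport{}}
	major, _, err := probe(client, "https://example.invalid/")
	return major, err
}
```

In a real prober the fake would be replaced by a transport that actually speaks QUIC, such as the `http.RoundTripper` implementation in quic-go's http3 package, while `probe` itself stays unchanged.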
Integrating quic-go into BBE's codebase was the key milestone: it extended BBE to probe QUIC-based HTTP/3 endpoints. The work addressed Slack's immediate needs and, because it was open-sourced, made QUIC support available to other organizations facing similar challenges.
With QUIC support in Prometheus BBE, Slack has the client-side observability its HTTP/3 infrastructure requires. It can now monitor its large fleet of HTTP/3 endpoints and keep watch over the stability and performance of its services.
Feliciano's work underscores the value of open source contributions in closing tooling gaps. Contributing QUIC support to Prometheus BBE solved Slack's problem and gave other organizations transitioning to HTTP/3 a tool they can use for the same purpose.










