
Launching Cloudflare’s Gen 13 servers: trading cache for cores for 2x edge compute performance

Cloudflare’s Gen 13 servers double our compute throughput by rethinking the balance between cache and cores. Moving to high-core-count AMD EPYC™ Turin CPUs, we traded large L3 cache for raw compute density. By running our new Rust-based FL2 stack, we completely mitigated the latency penalty to unlock twice the performance.

6 April 2026 at 07:25 pm

Cloudflare has announced the launch of its Gen 13 server fleet, which delivers a significant boost in edge compute performance by rethinking the balance between cache and cores. The new architecture, built on AMD EPYC™ Turin CPUs, prioritizes raw compute density over large L3 cache, resulting in a 2x increase in compute throughput. This breakthrough was made possible by the transition to a Rust-based FL2 stack, which eliminated the latency penalty and unlocked the full potential of the high-core-count processors.

Two years ago, Cloudflare introduced its Gen 12 server fleet, powered by AMD EPYC™ Genoa-X processors with their massive 3D V-Cache. That cache-heavy architecture suited FL1, the request handling layer in use at the time. When the company evaluated next-generation hardware, however, it faced a challenge: the CPUs offering the largest throughput gains came with substantially less cache. The legacy software stack depended on that cache, so on the new CPUs the throughput gains were offset by higher latency.

The solution came with FL2, a Rust-based rewrite of Cloudflare's core request handling layer. FL2 removed the dependency on the larger cache, allowing performance to scale with the number of cores while still meeting service-level agreements (SLAs). That change proved crucial to the Gen 13 architecture, which is now live and runs FL2 on 5th Gen AMD EPYC™ Turin servers.

AMD's 5th Generation EPYC™ Turin processors offer more than a core count increase. The architecture improves on several dimensions that matter for Cloudflare's servers. With up to 192 cores, and simultaneous multithreading (SMT) providing 384 threads, Turin doubles the 96 cores of Gen 12. Zen 5's architectural improvements also deliver better instructions-per-cycle (IPC) performance than Zen 4.

In addition to the higher core count and improved IPC, Turin processors are more power-efficient: they consume up to 32% fewer watts per core than Genoa-X processors. This efficiency is particularly valuable at the edge, where power consumption is a critical constraint.
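To put the per-core figure in context, the short Python sketch below works out what it could mean for total CPU power when the core count doubles. It is a back-of-the-envelope estimate, not a measurement: the function name is hypothetical, the 32% figure is treated as the best case (the article says "up to"), and the calculation assumes the saving applies uniformly to every core.

```python
def relative_cpu_power(core_ratio: float, per_core_reduction: float) -> float:
    """Total CPU power of the new part relative to the old one.

    core_ratio: new core count divided by old core count.
    per_core_reduction: fractional drop in watts per core (0.32 means 32%).
    """
    return core_ratio * (1.0 - per_core_reduction)


GEN12_CORES = 96               # Genoa-X core count, from the article
GEN13_CORES = 192              # Turin core count, from the article
BEST_CASE_REDUCTION = 0.32     # "up to 32%" per core; assumed to hold for every core

ratio = relative_cpu_power(GEN13_CORES / GEN12_CORES, BEST_CASE_REDUCTION)
print(f"{ratio:.2f}x")         # prints 1.36x
```

Under these assumptions, twice the cores would draw roughly 1.36 times the CPU power in the best case rather than twice as much. Actual server power depends on workload and on the rest of the system, which this estimate ignores.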

The Turin processors also support DDR5-6400 memory, providing higher memory bandwidth to feed the increased core count. However, the high-density OPNs (ordering part numbers, AMD's term for specific processor models) make a deliberate tradeoff: they prioritize throughput over per-core cache. On the highest-density Turin OPN, for example, 192 cores share 384MB of L3 cache, or 2MB per core, which is far less than the per-core cache that the 3D V-Cache gives Gen 12's Genoa-X processors.
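The cache tradeoff is easiest to see per core. The Python sketch below divides total L3 by core count for both generations. The Turin numbers come from the article. The Genoa-X total of 1152MB is not stated here; it is taken from AMD's published specification for the 96-core EPYC 9684X and should be read as an assumption about the Gen 12 part. The function name is hypothetical.

```python
def l3_per_core_mb(total_l3_mb: float, cores: int) -> float:
    """Average L3 cache per core in MB, assuming L3 is split evenly across cores."""
    return total_l3_mb / cores


# Turin figures are from the article.
TURIN_CORES = 192
TURIN_L3_MB = 384

# Genoa-X figures: 96 cores is from the article; 1152MB of L3 is an
# ASSUMPTION based on AMD's public spec for the EPYC 9684X.
GENOA_X_CORES = 96
GENOA_X_L3_MB = 1152

turin = l3_per_core_mb(TURIN_L3_MB, TURIN_CORES)
genoa_x = l3_per_core_mb(GENOA_X_L3_MB, GENOA_X_CORES)

print(f"Turin:   {turin:.0f} MB of L3 per core")
print(f"Genoa-X: {genoa_x:.0f} MB of L3 per core (assumed total)")
print(f"Genoa-X has {genoa_x / turin:.0f}x the L3 per core")
```

If the assumed Genoa-X total is right, each Turin core has about one-sixth of the L3 available to a Gen 12 core, which is the gap the software stack has to tolerate.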

By shifting the focus from cache to cores and optimizing the software stack with FL2, Cloudflare has achieved a 2x edge compute performance boost. This breakthrough not only demonstrates the potential of high-core-count architectures but also highlights the importance of software optimization in fully leveraging new hardware capabilities. As Cloudflare continues to innovate and expand its edge network, the Gen 13 servers will play a crucial role in delivering fast, reliable, and secure connectivity to users around the globe.
