KernelEvolve: How Meta’s Ranking Engineer Agent Optimizes AI Infrastructure
This is the second post in the Ranking Engineer Agent blog series exploring the autonomous AI capabilities accelerating Meta’s Ads Ranking innovation. The previous post introduced Ranking Engineer Agent’s ML exploration capability, which autonomously designs, executes, and analyzes ranking model experiments. This post covers how to optimize the low-level infrastructure that makes those models run [...] Originally published on Engineering at Meta.

A key part of Meta's Ranking Engineer Agent effort is optimizing the infrastructure its models run on. This article covers KernelEvolve, an agentic kernel authoring system built to make AI models execute efficiently at scale.
Meta operates a large fleet of heterogeneous hardware: NVIDIA GPUs, AMD GPUs, Meta's custom MTIA silicon, and CPUs. Each hardware type needs specialized software, known as optimized kernels, to translate high-level model operations into efficient, chip-specific instructions. Traditionally, human experts hand-tuned these kernels for each new chip generation and ML model architecture, a slow and labor-intensive process.
As the number of models grows and hardware types and generations diversify, this manual approach has become unsustainable. To address this, Meta developed KernelEvolve, an autonomous agent that optimizes performance for AI models. KernelEvolve compresses weeks of expert engineering time (profiling, optimizing, and cross-hardware debugging) into hours of automated search and evaluation, which also frees human engineers to focus on other critical tasks.
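To make "automated search and evaluation" concrete, here is a minimal sketch of one such loop. It is not KernelEvolve's implementation, which the post does not describe at this level. Plain Python functions stand in for hardware kernels, and every name (`reference_kernel`, `candidate_comprehension`, `candidate_buggy`, `is_correct`, `measure_seconds`, `search`) is hypothetical. The sketch shows only the general shape: reject candidates that disagree with a trusted reference, time the survivors, and keep the fastest.

```python
import time
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in: a "kernel" here is just a Python function over a list.
Kernel = Callable[[Sequence[float]], List[float]]


def reference_kernel(xs: Sequence[float]) -> List[float]:
    """Trusted baseline: scale by 2 and add 1, one element at a time."""
    out: List[float] = []
    for x in xs:
        out.append(x * 2.0 + 1.0)
    return out


def candidate_comprehension(xs: Sequence[float]) -> List[float]:
    """Candidate rewrite that computes the same values in a different form."""
    return [x * 2.0 + 1.0 for x in xs]


def candidate_buggy(xs: Sequence[float]) -> List[float]:
    """Deliberately wrong candidate (drops the +1) to exercise the check."""
    return [x * 2.0 for x in xs]


def is_correct(candidate: Kernel, reference: Kernel,
               inputs: Sequence[float], tol: float = 1e-9) -> bool:
    """Compare candidate output against the reference within a tolerance."""
    got, want = candidate(inputs), reference(inputs)
    return len(got) == len(want) and all(
        abs(g - w) <= tol for g, w in zip(got, want)
    )


def measure_seconds(kernel: Kernel, inputs: Sequence[float],
                    repeats: int = 5) -> float:
    """Return the best wall-clock time over several runs to reduce noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(inputs)
        best = min(best, time.perf_counter() - start)
    return best


def search(candidates: Dict[str, Kernel], reference: Kernel,
           inputs: Sequence[float]) -> Tuple[str, float, List[str]]:
    """Evaluate each candidate: correctness first, then speed."""
    best_name = "reference"
    best_time = measure_seconds(reference, inputs)
    rejected: List[str] = []
    for name, kernel in candidates.items():
        if not is_correct(kernel, reference, inputs):
            rejected.append(name)  # never benchmark a wrong kernel
            continue
        elapsed = measure_seconds(kernel, inputs)
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name, best_time, rejected


inputs = [float(i) for i in range(10_000)]
candidates = {
    "comprehension": candidate_comprehension,
    "buggy": candidate_buggy,
}
best_name, best_time, rejected = search(candidates, reference_kernel, inputs)
print(f"selected={best_name} best_time={best_time:.6f}s rejected={rejected}")
```

Checking correctness before timing is the main design choice: a fast kernel that returns wrong values should never win the search. A real system would presumably generate candidates automatically and measure them on the target hardware. Here the candidates are hard-coded for illustration.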
KernelEvolve also delivers substantial performance improvements: over 60% higher inference throughput for the Andromeda Ads model on NVIDIA GPUs, and over 25% higher training throughput for an ads model on Meta's custom MTIA chips. Gains like these matter for the efficiency and scalability of Meta's AI infrastructure, which powers a wide range of services and applications.
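The throughput figures above can also be read as time saved. The small calculation below assumes throughput means work completed per unit time on otherwise unchanged hardware, so time per unit of work is its inverse. The function names are hypothetical, and because the post reports the gains as "over" 60% and 25%, the results are lower bounds rather than measured values.

```python
def speedup_from_improvement(improvement_pct: float) -> float:
    """Convert a throughput improvement in percent to a speedup factor."""
    return 1.0 + improvement_pct / 100.0


def time_reduction_pct(improvement_pct: float) -> float:
    """Percent less time per unit of work implied by a throughput gain."""
    return (1.0 - 1.0 / speedup_from_improvement(improvement_pct)) * 100.0


for label, pct in [("inference, NVIDIA GPUs", 60.0), ("training, MTIA", 25.0)]:
    print(
        f"{label}: {speedup_from_improvement(pct):.2f}x throughput, "
        f"about {time_reduction_pct(pct):.1f}% less time per unit of work"
    )
```

Under that assumption, a 60% throughput gain means roughly 37.5% less time for the same work, and a 25% gain means roughly 20% less.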
KernelEvolve is not limited to the Ranking Engineer Agent use case. It is a general-purpose system that applies to a variety of AI models and hardware configurations. Automating kernel optimization helps make fuller use of diverse hardware, shortens development cycles, and improves performance.
KernelEvolve is a significant step in Meta's effort to optimize its AI infrastructure. With autonomous agents like it, Meta can scale its AI capabilities while keeping models performant across a diverse range of hardware platforms. Beyond Ads Ranking, it demonstrates how automated, agentic systems can improve performance and efficiency in complex, large-scale AI environments.










