How Meta Used AI to Map Tribal Knowledge in Large-Scale Data Pipelines
AI coding assistants are powerful but only as good as their understanding of your codebase. When we pointed AI agents at one of Meta’s large-scale data processing pipelines (spanning four repositories, three languages, and over 4,100 files), we quickly found that they weren’t making useful edits quickly enough. We fixed this by building [...] The post How Meta Used AI to Map Tribal Knowledge in Large-Scale Data Pipelines appeared first on Engineering at Meta.

At Meta, we've long recognized the potential of AI to enhance our development processes, but we quickly encountered a challenge when trying to leverage these tools for large-scale data pipelines. The issue was simple yet profound: AI coding assistants are only as effective as their understanding of the codebase. When we directed AI agents to one of our sprawling data processing pipelines, which spanned four repositories, three languages, and over 4,100 files, we found that they were unable to make useful edits quickly enough.
To address this, we devised a solution that involved building a pre-compute engine. This engine consisted of a swarm of 50+ specialized AI agents that methodically read every file in the pipeline. Their task was to produce 59 concise context files that encoded the "tribal knowledge" previously residing solely in the minds of engineers. This tribal knowledge refers to the tacit understanding of the codebase, the underlying design choices, and the relationships between different components that aren't immediately apparent from the code itself.
The outcome of this initiative was transformative. By providing AI agents with structured navigation guides, we achieved 100% coverage of our code modules, up from 5% previously. The guides cover all 4,100+ files across three repositories. In the process, we also documented over 50 non-obvious patterns: design choices and relationships between components that are not apparent from the code itself.
Preliminary tests have shown that this new approach results in a 40% reduction in the number of AI agent tool calls per task. This efficiency is a direct result of the structured knowledge layer that the pre-compute engine provides. Crucially, this system is model-agnostic, meaning it can work with most leading AI models.
Another key aspect of our solution is its self-maintenance. Every few weeks, automated jobs validate file paths, detect coverage gaps, re-run quality checks, and auto-fix stale references. This keeps the knowledge layer accurate and up to date, so the AI agents can keep relying on it.
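
To make the maintenance checks concrete, here is a minimal sketch of two of them: path validation and coverage-gap detection. It is not the production implementation. It assumes, purely for illustration, that context files cite repo-relative paths in backticks, that a "module" is the first path component, and that a listing of the repository's files is available as a set. The names `PATH_PATTERN`, `referenced_paths`, and `audit_context_file` are hypothetical.

```python
import re

# Assumption: context files cite repo-relative paths in backticks.
# The extension list is illustrative, not a documented convention.
PATH_PATTERN = re.compile(r"`([\w./-]+\.(?:py|cpp|h|php))`")


def referenced_paths(context_text: str) -> set[str]:
    """Return every backtick-quoted file path mentioned in a context file."""
    return set(PATH_PATTERN.findall(context_text))


def audit_context_file(
    context_text: str,
    known_files: set[str],
    modules: set[str],
) -> tuple[list[str], list[str]]:
    """Return (stale references, uncovered modules) for one context file.

    known_files: repo-relative paths that currently exist (hypothetical input).
    modules: module names that are expected to be documented.
    """
    refs = referenced_paths(context_text)
    # A reference is stale when the cited path no longer exists in the repo.
    stale = sorted(refs - known_files)
    # Assumption: a module is identified by the first path component.
    covered = {path.split("/", 1)[0] for path in refs & known_files}
    gaps = sorted(modules - covered)
    return stale, gaps


if __name__ == "__main__":
    text = "Routing lives in `routing/router.cpp`; see `configs/registry.py`."
    stale, gaps = audit_context_file(
        text,
        known_files={"configs/registry.py", "dag/compose.py"},
        modules={"configs", "routing", "dag"},
    )
    print("stale references:", stale)
    print("uncovered modules:", gaps)
```

A scheduled job could run a check like this over every context file and hand the stale references to an auto-fix step; how that fix step works is not described here, so it is left out of the sketch.
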
The problem we faced was rooted in the lack of a map for our AI tools. Our pipeline is config-as-code, with Python configurations, C++ services, and Hack automation scripts working in unison across multiple repositories. A single data field onboarding task involves coordinating six subsystems: configuration registries, routing logic, DAG composition, validation rules, C++ code generation, and automation scripts. These components must stay in sync to function properly.
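
The "must stay in sync" requirement can be illustrated with a small consistency check. This is a hedged sketch, not how the pipeline actually verifies onboarding: it assumes each of the six subsystems can report the set of data fields it knows about, and the function name `missing_subsystems` and the input mapping are hypothetical.

```python
# The six subsystems named above; the order is only for readable output.
SUBSYSTEMS = (
    "configuration registries",
    "routing logic",
    "DAG composition",
    "validation rules",
    "C++ code generation",
    "automation scripts",
)


def missing_subsystems(
    field_name: str,
    fields_by_subsystem: dict[str, set[str]],
) -> list[str]:
    """List the subsystems that do not yet know about field_name.

    fields_by_subsystem maps a subsystem name to the data fields it currently
    handles (a hypothetical input; how it is collected is not specified here).
    An empty result means the field is onboarded consistently everywhere.
    """
    return [
        subsystem
        for subsystem in SUBSYSTEMS
        if field_name not in fields_by_subsystem.get(subsystem, set())
    ]


if __name__ == "__main__":
    state = {
        "configuration registries": {"user_region"},
        "routing logic": {"user_region"},
        "DAG composition": {"user_region"},
        "validation rules": set(),
        "C++ code generation": {"user_region"},
    }
    print(missing_subsystems("user_region", state))
```

An agent without a map of the codebase has to discover this list of six touchpoints on its own for every task, which is the gap the context files are meant to close.
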
We had already built AI-powered systems for operational tasks, such as scanning dashboards, pattern-matching against historical incidents, and suggesting mitigations. However, when we attempted to extend these capabilities to development tasks, the AI struggled due to the absence of a clear map of the codebase.
By implementing the pre-compute engine, we've equipped our AI tools with the necessary knowledge to navigate and contribute effectively to our large-scale data pipelines. This approach not only enhances the efficiency of our development processes but also ensures that the AI is not just a consumer of the infrastructure; it's the engine that drives it.
In conclusion, Meta's journey with AI in large-scale data pipelines highlights the importance of addressing the limitations of these tools. By building a pre-compute engine that systematically maps tribal knowledge, we've unlocked new possibilities for AI-driven development. This solution not only improves the speed and accuracy of AI edits but also ensures that the knowledge layer remains dynamic and up-to-date, allowing the AI to continue evolving alongside the codebase.










