AI Agents in Data Engineering
Concrete enterprise patterns for AI agents solving data engineering problems, from Meta's 50-agent pipeline mapper to Microsoft's governance framework.
Meta: Mapping Tribal Knowledge at Scale
Source: Meta Engineering Blog
The most concrete enterprise example of AI agents applied to data engineering:
| Metric | Before | After |
|---|---|---|
| Pipeline documentation coverage | 5% | 100% |
| Research time for pipeline questions | 2 days | 30 minutes |
| Files mapped | Manual | 4,100 automatically |
Architecture: 50 Specialised Agents
Not one generalist agent, but 50 agents that each do one thing well:
- Lineage mapping
- Documentation generation
- Ownership tracking
- Dependency analysis
- Freshness monitoring
Each agent has a narrower scope and, as a result, deeper reliability.
Why this pattern matters: Most people try to build one "smart" agent. Meta's results suggest that specialised swarms outperform generalists at enterprise scale.
Blueprint for Your Org
Your dbt project likely has similar tribal knowledge issues:
- Undocumented models
- Unclear dependencies
- Orphaned transforms
- Missing ownership
Prototype: Use dbt MCP + Claude/GPT API to build a "pipeline knowledge agent". Start with your most complex dbt project and assign one agent per concern (lineage, docs, ownership, freshness).
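The "one agent per concern" layout can be sketched as a registry of narrow functions that each inspect the same project metadata. This is a minimal illustration, not Meta's implementation: the `Manifest` shape is a hypothetical, trimmed-down stand-in for dbt metadata, the `exposed` flag is invented for the example, and the deterministic checks stand in for what would be LLM-backed agents in a real prototype.

```python
from typing import Callable, Dict, List

# Hypothetical, simplified project metadata: model name -> attributes.
# A real prototype would read this from dbt artifacts or the dbt MCP server.
Manifest = Dict[str, dict]


def docs_agent(manifest: Manifest) -> List[str]:
    """Flag models with an empty description."""
    return sorted(n for n, node in manifest.items() if not node.get("description"))


def ownership_agent(manifest: Manifest) -> List[str]:
    """Flag models with no owner recorded under meta."""
    return sorted(
        n for n, node in manifest.items() if not node.get("meta", {}).get("owner")
    )


def lineage_agent(manifest: Manifest) -> List[str]:
    """Flag orphans: no model depends on them and they are not marked as exposed."""
    referenced = {dep for node in manifest.values() for dep in node.get("depends_on", [])}
    return sorted(
        n for n, node in manifest.items()
        if n not in referenced and not node.get("exposed", False)
    )


# One entry per concern; adding a concern means adding one narrow function.
AGENTS: Dict[str, Callable[[Manifest], List[str]]] = {
    "docs": docs_agent,
    "ownership": ownership_agent,
    "lineage": lineage_agent,
}


def run_agents(manifest: Manifest) -> Dict[str, List[str]]:
    """Run every specialised agent against the same manifest."""
    return {concern: agent(manifest) for concern, agent in AGENTS.items()}


example_manifest: Manifest = {
    "stg_orders": {"description": "Staged orders", "meta": {"owner": "data-eng"}, "depends_on": []},
    "fct_orders": {"description": "", "meta": {}, "depends_on": ["stg_orders"], "exposed": True},
    "tmp_backfill": {"description": "", "meta": {"owner": "data-eng"}, "depends_on": ["stg_orders"]},
}
report = run_agents(example_manifest)
```

Keeping each agent behind the same small signature makes it easy to test one concern in isolation and to swap a rule-based check for a model-backed one later.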
Microsoft Agent Governance Toolkit
First toolkit addressing all 10 OWASP Agentic AI risks. MIT-licensed.
Seven Packages
- Policy engine: deterministic enforcement
- Identity/credential governance: prevent credential sprawl
- Tool call interception: inspect and block dangerous tool calls
- SRE-patterned failure handling: circuit breakers, SLOs
- Multi-agent isolation: prevent cascading failures
- Memory/session governance: prevent memory poisoning
- Compliance reporting: EU AI Act, Colorado AI Act
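To make "SRE-patterned failure handling" concrete, here is a minimal circuit breaker wrapped around an agent's tool calls. It is a generic sketch of the pattern, not the toolkit's API: `CircuitBreaker`, `CircuitOpenError`, and their parameters are hypothetical names chosen for this example.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Stop calling a failing tool for a cool-down period (hypothetical helper)."""

    def __init__(
        self,
        max_failures: int = 3,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, tool: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; tool call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = tool(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()  # trip the breaker
            raise
        self._failures = 0  # any success closes the circuit fully
        return result
```

The injectable `clock` is only there so the behaviour can be tested without sleeping; in normal use the default monotonic clock is enough.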
Key Specs
- <0.1 ms p99 latency, which makes it production-viable
- Framework integrations: LangChain, CrewAI, LlamaIndex, OpenAI SDK, Google ADK, PydanticAI, Haystack
- Languages: Python, TypeScript, Rust, Go, .NET
- Install: `pip install agent-governance-toolkit[full]`
OWASP Agentic AI Top 10 (Dec 2025)
Standard vocabulary for agent security. Knowing these is table stakes:
1. Goal hijacking
2. Tool misuse
3. Identity abuse
4. Memory poisoning
5. Cascading failures
6. Rogue agents
7. Data exfiltration
8. Prompt injection
9. Supply chain
10. Compliance violations
Career Application
If you are building AI agents that query Snowflake or modify dbt models:
- Governance policies prevent agents from dropping tables or exposing PII
- OWASP knowledge signals seniority in interviews
- Regulatory readiness (EU AI Act Aug 2026, Colorado June 2026) is a differentiator
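A deterministic policy that stops an agent from dropping tables or reading PII can be sketched as a check that runs before any SQL reaches the warehouse. The `Policy` class, `check_sql` function, and the column names are assumptions made for this illustration, not part of any toolkit; the token matching is deliberately naive, and a production check would need a real SQL parser.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: which statement types and columns an agent may not touch."""
    blocked_statements: Tuple[str, ...] = ("DROP", "TRUNCATE", "DELETE")
    pii_columns: FrozenSet[str] = frozenset({"email", "ssn", "phone_number"})


def check_sql(sql: str, policy: Policy = Policy()) -> Tuple[bool, str]:
    """Return (allowed, reason) for a single SQL statement.

    Naive by design: it looks only at the first keyword and at bare
    identifiers, so it will miss multi-statement strings and aliases.
    """
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", sql)
    if not tokens:
        return False, "empty statement"
    first = tokens[0].upper()
    if first in policy.blocked_statements:
        return False, f"{first} statements are blocked"
    touched = sorted({t.lower() for t in tokens} & policy.pii_columns)
    if touched:
        return False, "PII columns referenced: " + ", ".join(touched)
    return True, "ok"
```

Because the check is plain code rather than a prompt instruction, the same input always produces the same decision, which is the point of deterministic enforcement.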
Agentic Engineering Patterns
Source: Simon Willison
Practical prompt design for coding agents: the patterns that make agent output reliable.
The Three-Pattern Stack
- Reference codebase > detailed specs: give the agent an existing codebase to read, not a description of what to build. "Imitate how X works" beats describing X's logic yourself.
- Self-testing loops: give the agent a way to validate its output, e.g. `python -m http.server` plus a browser comparison. Agents that can verify their own work produce better results.
- Short, context-rich prompts: three sentences, each carrying rich context, beat long specifications.
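The self-testing loop can be sketched as a generate, validate, retry cycle in which the validator's feedback is passed back into the next attempt. The function and parameter names here are hypothetical, and `fake_generate` stands in for a call to whatever model or agent you use.

```python
from typing import Callable, Optional, Tuple


def self_testing_loop(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Regenerate until the validator passes; return (output, attempts used)."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)          # feedback is None on the first try
        ok, feedback = validate(candidate)      # validator explains any failure
        if ok:
            return candidate, attempt
    raise RuntimeError(
        f"no valid output after {max_attempts} attempts; last feedback: {feedback}"
    )


def fake_generate(feedback: Optional[str]) -> str:
    """Stand-in for an agent: produces a typo first, fixes it once given feedback."""
    return "SELECT 1" if feedback else "SELEC 1"


def fake_validate(candidate: str) -> Tuple[bool, str]:
    """Stand-in validator: a real one would compile, run tests, or diff output."""
    if candidate.startswith("SELECT"):
        return True, ""
    return False, "statement must start with SELECT"


result, attempts = self_testing_loop(fake_generate, fake_validate)
```

The cap on attempts matters: without it, an agent that cannot satisfy the validator would loop indefinitely.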
The /tmp Pattern
Clone reference repos to /tmp so the agent can read the existing codebase without risk of contaminating your project. This keeps a clean separation of concerns.
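One way to script this is a small helper that makes a fresh temporary directory and shallow-clones the reference repo into it. The helper name, the directory prefix, and the injectable `run` parameter are assumptions for this sketch; it also assumes `git` is on the PATH and that the system temp directory is /tmp or equivalent.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, List


def _run(cmd: List[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(cmd, check=True)


def clone_reference_repo(url: str, run: Callable[[List[str]], None] = _run) -> Path:
    """Shallow-clone a reference repo into a fresh temp directory and return its path.

    The clone lives outside the working project, so the agent can read it
    freely without touching project files.
    """
    dest = Path(tempfile.mkdtemp(prefix="agent-ref-"))
    run(["git", "clone", "--depth", "1", url, str(dest)])
    return dest
```

The returned path can then be handed to the agent as read-only context, and deleted once the session ends.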
Application to AE
- Reference: existing dbt model instead of describing grain/keys
- Validation: `dbt compile` + `dbt test` instead of "check it looks right"
- Imitation: "Build this like the fct_orders model" instead of "Build a fact table with these columns"
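The validation step above can be wired up as a function that runs `dbt compile` and then `dbt test` for one model and returns the failure output as feedback for the agent. This is a sketch under assumptions: the function names and the injectable `run` parameter are hypothetical, and it expects the dbt CLI to be installed and invoked from inside a dbt project.

```python
import subprocess
from typing import Callable, List, Tuple


def dbt_validation_commands(model: str) -> List[List[str]]:
    """Commands to run, in order, for a single model."""
    return [
        ["dbt", "compile", "--select", model],
        ["dbt", "test", "--select", model],
    ]


def _run(cmd: List[str]) -> Tuple[int, str]:
    """Default runner: return (exit code, combined stdout and stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def validate_model(
    model: str, run: Callable[[List[str]], Tuple[int, str]] = _run
) -> Tuple[bool, str]:
    """Return (ok, feedback); stop at the first failing command."""
    for cmd in dbt_validation_commands(model):
        code, output = run(cmd)
        if code != 0:
            return False, f"{' '.join(cmd)} failed:\n{output}"
    return True, ""
```

The `(ok, feedback)` return shape lets the agent receive the failing command's output verbatim rather than a vague "it looks wrong".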
NVIDIA Dynamo: Full-Stack Agentic Inference
Source: NVIDIA Blog
Optimisation framework for running agentic inference at scale, relevant when deploying agents in production data pipelines:
- Disaggregated serving (separate prefill/decode)
- KV cache routing between workers
- GPU-to-GPU transfer without CPU round-trip
- Runs on top of inference engines such as vLLM
Practical upshot: when you're running 50 agents against your dbt project, Dynamo-pattern infrastructure keeps latency low and GPU utilisation high.
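The idea behind KV cache routing can be shown with a toy router that sends a request to the worker already holding the longest matching token prefix, falling back to the least-loaded worker on a tie. This is not Dynamo's router or API; the function names and data shapes are invented for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def shared_prefix_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of leading tokens two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def route(
    prompt_tokens: Sequence[int],
    cached_prefixes: Dict[str, List[Sequence[int]]],
    load: Dict[str, int],
) -> str:
    """Pick the worker with the most reusable cache; break ties by lower load."""

    def score(worker: str) -> Tuple[int, int]:
        best = max(
            (shared_prefix_len(prompt_tokens, p) for p in cached_prefixes.get(worker, [])),
            default=0,
        )
        return (best, -load.get(worker, 0))

    return max(load, key=score)


cached = {"worker-a": [[1, 2, 3]], "worker-b": [[1, 2, 3, 4, 5]]}
load = {"worker-a": 1, "worker-b": 5}
warm_choice = route([1, 2, 3, 4, 9], cached, load)  # worker-b shares 4 tokens
cold_choice = route([7, 8], cached, load)           # no cache hit; lower load wins
```

Agents that share a long system prompt or tool schema benefit most from this kind of routing, since the shared prefix only needs to be prefilled once per worker.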
Key Takeaways
- Specialised swarms > generalist agents: Meta's 50-agent approach is the proven enterprise pattern
- Governance is non-optional: Microsoft AGT + OWASP Top 10 are the baseline
- Self-validation is the reliability multiplier: agents that can test their own work are dramatically more useful
- MCP is the integration standard: how agents talk to dbt, Snowflake, your tools
- Career framing: "I build governed agentic data workflows" not "I write SQL"
Related
- [[Agentic-Analytics-Engineering]]: the career transition and architecture
- [[AI-Model-Landscape-April-2026]]: open vs closed, self-hosted agents
- [[Data-Principles-for-Analytics-Engineering]]: foundational data architecture principles
See also
- [[Agentic-Engineering-Patterns]]