Agentic Analytics Engineering — The 2026 Shift¶
The discipline is moving from "writing SQL" to "building systems that enable agentic data workflows at scale." This is the defining career transition of 2026.
The Core Tension¶
AI is accelerating data work faster than governance can keep up. The dbt Labs 2026 State of AE report quantifies it:
- 72% of teams now use AI-assisted coding
- 83% prioritise trust in data (up from 66% in 2025 — fastest-growing priority)
- 71% fear bad data reaching stakeholders
- 77% of leaders are pushing teams to improve productivity with AI
The discipline isn't being replaced — it's bifurcating. Those who build the systems that govern AI output will thrive. Those who only use AI as autocomplete will compete with AI itself.
"There's a real tension between moving fast and building trust, and you can't optimize for both without intention. That's where discipline in modeling, validation, and ownership becomes a requirement, not a best practice." — Pooja Crahen, Sr Manager AE at Okta
The Architecture of Agentic AE¶
Stephen Robb (dbt Labs) published the first complete blueprint: dbt + Google AI Agents. Three layers:
1. Local Validation Loop¶
- `dbt compile` via Fusion engine validates SQL before warehouse
- Agent generates SQL → validates locally → detects issues → auto-iterates
- Key insight: Fusion's deterministic compiler means AI no longer has to "hope" its SQL is correct
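The generate → validate → iterate loop above can be sketched as follows. This is a minimal illustration, not the blueprint's implementation: `generate_sql` and `validate` are hypothetical callables (the first stands in for the agent, the second for a local compile step such as shelling out to `dbt compile` and checking the exit code), and `ValidationResult` is a made-up container.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ValidationResult:
    """Outcome of a local validation step (hypothetical shape)."""
    ok: bool
    error: Optional[str] = None


def validation_loop(
    generate_sql: Callable[[str, Optional[str]], str],
    validate: Callable[[str], ValidationResult],
    request: str,
    max_attempts: int = 3,
) -> str:
    """Generate SQL, validate it locally, and retry with the error as feedback."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        sql = generate_sql(request, feedback)
        result = validate(sql)
        if result.ok:
            return sql
        # Pass the validator's error message into the next generation attempt.
        feedback = result.error
    raise RuntimeError(f"no valid SQL after {max_attempts} attempts: {feedback}")
```

Keeping the validator injectable means the loop can be exercised with a stub before wiring it to a real compiler, and nothing reaches the warehouse until validation passes.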
2. Cloud Operations via MCP¶
- dbt MCP server exposes full project metadata as agent tools
- Agent can inspect models, run tests, understand dependencies — "like an AE would"
- MCP is the critical integration layer: [[AI-Agents-in-Data-Engineering]]
3. Specialised Subagents¶
- One agent per concern (modeling, documentation, testing, lineage)
- Narrower scope = deeper analysis = more reliable output
- Pattern matches Meta's approach (see [[AI-Agents-in-Data-Engineering]])
Starter repo: github.com/StephenR-DBT/dbt-gemini-agent-starter
Anthropic self-service analytics pattern (June 2026)¶
Anthropic published a production account of using Claude for self-service business analytics: How Anthropic enables self-service data analytics with Claude. The headline claim is stark: 95% of business analytics queries automated via Claude, with ~95% aggregate accuracy.
The durable point is not "point Claude at the warehouse". Anthropic says that creates a false sense of precision. Their framing is that analytics accuracy is a context and verification problem, not a code-generation problem.
Failure modes¶
- Concept/entity ambiguity — user language like "active users" must map to one governed definition, grain, population, and exclusion set.
- Staleness — schemas, definitions, source tables, and business context rot quickly.
- Retrieval failure — the answer may exist somewhere, but unstructured search over dashboards/SQL/query history does not reliably route the agent to it.
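The first failure mode can be made concrete with a small sketch: a user phrase either resolves to exactly one governed definition or the agent stops and asks. Everything here is hypothetical and for illustration only (`MetricDefinition`, `AmbiguousTerm`, `resolve_term`, and the registry shape are not from Anthropic's write-up).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MetricDefinition:
    """One governed definition: name, grain, population, and exclusions."""
    name: str
    grain: str
    population: str
    exclusions: Tuple[str, ...] = ()


class AmbiguousTerm(Exception):
    """Raised when a phrase maps to more than one governed definition."""


def resolve_term(
    term: str, registry: Dict[str, List[MetricDefinition]]
) -> MetricDefinition:
    """Return the single governed definition for a phrase, or fail loudly."""
    candidates = registry.get(term.strip().lower(), [])
    if len(candidates) == 1:
        return candidates[0]
    if not candidates:
        raise KeyError(f"no governed definition for {term!r}")
    names = ", ".join(c.name for c in candidates)
    raise AmbiguousTerm(f"{term!r} could mean: {names}")
```

Failing on ambiguity rather than picking the first match is the design choice that matters: a clarifying question is cheaper than a confidently wrong number.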
Stack pattern¶
- Data foundations: dimensional modelling, tests, freshness/completeness checks, governed model tiers, and metadata treated as a first-class product.
- Sources of truth: semantic layer first; lineage/transformation graph second; query corpus distilled into curated docs rather than dumped raw into retrieval; business context/knowledge graph for ambiguous references.
- Skills: pairwise skill design — a knowledge/router skill narrows the search space, while a procedural analysis skill encodes the senior analyst workflow.
- Maintenance: skill docs colocated with transformation code; PR hooks flag model changes that do not update related skill/reference docs. Anthropic reports offline accuracy drifting from ~95% to ~65% over a month before they treated skill maintenance as engineering work.
- Validation: offline evals, fixed ablation tests, adversarial review, provenance footers, passive monitoring, and correction harvesting.
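The maintenance bullet describes PR hooks that flag model changes without matching doc updates. A minimal sketch of that check is below; the function name, the `docs_for_model` mapping, and the file paths are assumptions for illustration, and how the changed-file list is obtained (for example from the CI system) is left out.

```python
from typing import Dict, Iterable, List, Set


def models_missing_doc_updates(
    changed_files: Iterable[str],
    docs_for_model: Dict[str, Set[str]],
) -> List[str]:
    """Return changed model files whose related docs were not touched in the same PR."""
    changed = set(changed_files)
    flagged = []
    for model_path, doc_paths in docs_for_model.items():
        model_changed = model_path in changed
        any_doc_changed = bool(doc_paths & changed)
        if model_changed and not any_doc_changed:
            flagged.append(model_path)
    return sorted(flagged)
```

A hook built on this could fail the PR or post a warning comment when the returned list is non-empty; colocating docs with transformation code keeps the mapping easy to maintain.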
Implications for Adam¶
This is directly aligned with the Personal Context Management shift. The winning system is not "more stored context"; it is curated, routed, maintained context with provenance and evals.
For LocalStack/Claude usage, the pack strategy should copy the pattern: compact router instructions, domain-specific reference docs, explicit gotchas, freshness/provenance footers, and correction harvesting from real work sessions.
For analytics engineering positioning, this becomes a strong narrative: the future AE value is building governed agentic analytics systems — semantic layers, docs, skills, evals, provenance, and correction loops — rather than just writing SQL faster.
Context layer / ktx pattern (June 2026)¶
SSP's write-up on Kaelio's open-source ktx context layer sharpens the architecture: reliable analytics agents need both hard semantics and soft semantics in one reviewed place.
- Hard semantics: warehouse schema, joins, grain, approved metric definitions, BI/dashboard logic, dbt metadata, and executable YAML/SQL.
- Soft semantics: business documentation, Notion/wiki/Markdown notes, methodology explanations, exceptions, and correction history.
- Governance loop: auto-ingest/draft where useful, but commit to git, review like code, validate before runtime, and keep ownership/freshness explicit.
- Runtime loop: status → discover metric/context → validate/compile SQL → inspect → execute read-only/limited rows → capture corrections.
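The "execute read-only/limited rows" step of the runtime loop could be approximated with a guard like the one below. This is a deliberately naive sketch and not how ktx does it: `guard_read_only` is a hypothetical helper, the keyword check is conservative (it will also reject a forbidden keyword that appears inside a string literal), and real enforcement belongs in warehouse permissions, for example a read-only role.

```python
import re

# Statements must start with SELECT or WITH to be considered read-only.
_READ_ONLY_START = re.compile(r"^\s*(select|with)\b", re.IGNORECASE)
# Whole-word match, so column names like updated_at or created_at still pass.
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|truncate|grant)\b",
    re.IGNORECASE,
)


def guard_read_only(sql: str, max_rows: int = 100) -> str:
    """Reject anything that is not a single read-only statement, then cap rows."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise ValueError("multiple statements are not allowed")
    if not _READ_ONLY_START.match(statement):
        raise ValueError("only SELECT/WITH statements are allowed")
    if _FORBIDDEN.search(statement):
        raise ValueError("statement contains a write or DDL keyword")
    # Wrap the query so the row limit applies regardless of its own LIMIT clause.
    return f"SELECT * FROM ({statement}) AS guarded LIMIT {int(max_rows)}"
```

Wrapping the query in a subselect assumes a dialect that accepts an aliased subquery with `LIMIT`; other dialects would need a different row cap.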
This complements the Anthropic pattern: semantic layer first, but not semantic layer only. The differentiated AE skill is building the context layer that lets Claude/Codex answer analytics questions through governed definitions instead of hallucinating field names against raw warehouse tables.
Commercially, this is an AI-enablement wedge: most companies do not need another dashboard; they need their existing metrics, docs, lineage, and business rules made agent-readable, testable, and maintainable.
The Career Play¶
What's changing¶
| Before | After |
|---|---|
| Write SQL transforms | Design systems that validate AI-generated transforms |
| Manual code review | Govern agentic output with policy + testing |
| One analyst, one model | 50 specialised agents per concern |
| "Modern data stack" | "Open data infrastructure" (dbt+Fivetran framing) |
How to position¶
- "Agentic data workflows" is becoming industry-standard language — align your narrative
- Governance as a skill, not just dbt — agent governance (OWASP, Microsoft AGT) is career-differentiating
- MCP fluency — understanding how agents safely interact with dbt is table stakes for 2026+
- Open data infrastructure replaces "modern data stack" as the organising concept after the dbt+Fivetran merger
Immediate actions¶
- Clone `dbt-gemini-agent-starter` and experiment with the three-component architecture
- Set up dbt MCP server locally (docs: quickstart)
- Try `adk web` (Google ADK) for agent experimentation
- Install Microsoft Agent Governance Toolkit: `pip install agent-governance-toolkit[full]`
- Attend dbt State of AE virtual event (April 29)
- Consider dbt Summit registration (Sept 15-18, Las Vegas, early bird saves $1,100)
The Governance Layer¶
Microsoft released the Agent Governance Toolkit (MIT-licensed):
- Sub-millisecond policy enforcement (<0.1ms p99)
- Covers all 10 OWASP Agentic AI Risks
- Framework-agnostic: LangChain, CrewAI, LlamaIndex, OpenAI SDK, Google ADK, PydanticAI
- Regulatory alignment: EU AI Act (Aug 2026), Colorado AI Act (June 2026)
Career signal: Understanding agent governance + OWASP agentic risks is rare and positions you for senior AI engineering roles.
Key Metrics for Making the Case¶
When justifying AI engineering investment to leadership:
| Metric | Source | Use For |
|---|---|---|
| 5% → 100% documentation coverage | Meta pipeline agents | ROI of agentic workflows |
| 2 days → 30 min research time | Meta pipeline agents | Time savings from AI agents |
| 72% AI-assisted coding adoption | dbt State of AE 2026 | "Everyone is doing this" |
| 71% fear bad data | dbt State of AE 2026 | Justifying governance investment |
Related¶
- [[AI-Ready-Analytics-Foundations-2026]] — synthesis of dbt/Snowflake/MIT 2026 research on trust, governance, AI readiness, and cost controls
- [[Data-Principles-for-Analytics-Engineering]] — foundational principles
- [[AI-Agents-in-Data-Engineering]] — Meta, governance, enterprise patterns
- [[AI-Model-Landscape-April-2026]] — open vs closed, deployment control
- [[Agentic-Engineering-Patterns]] — Simon Willison's prompt design patterns
- [[Context-Layer-for-Enterprise-AI]] — broader enterprise context-layer thesis
Data Integrity before Prompt Engineering (June 2026)¶
A recurring analytics-agent failure mode is trying to repair unreliable data with better prompts. The durable rule: prompt engineering cannot compensate for broken source data, unclear grain, stale definitions, or untested transformations.
For agentic analytics systems, the verification stack must start below the LLM:
- source freshness, completeness, and validity checks
- clear semantic definitions and grains
- deterministic tests before narrative generation
- provenance on generated answers
- correction capture when users find data issues
This reinforces the Anthropic self-service analytics pattern: reliable AI analytics is a context, governance, and data-quality problem first; prompt quality is downstream.
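A minimal sketch of "verification below the LLM": deterministic checks run first, and narrative generation only happens if all of them pass, with a provenance footer attached to the result. The names (`Answer`, `answer_if_verified`) and the footer format are hypothetical, and the check functions stand in for real freshness, completeness, or test runs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Answer:
    """Generated text plus a footer recording where it came from."""
    text: str
    provenance: str


def answer_if_verified(
    checks: Dict[str, Callable[[], bool]],
    generate_narrative: Callable[[], str],
    sources: List[str],
) -> Answer:
    """Run deterministic checks before any narrative is generated."""
    failed = [name for name, check in checks.items() if not check()]
    if failed:
        # Stop here: a better prompt cannot repair failing data checks.
        raise RuntimeError("blocked by failed checks: " + ", ".join(failed))
    footer = (
        "Sources: " + ", ".join(sources)
        + " | checks passed: " + ", ".join(checks)
    )
    return Answer(text=generate_narrative(), provenance=footer)
```

The ordering is the point of the sketch: the model is never invoked when a check fails, so a polished answer cannot paper over stale or incomplete data.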
2026-06-13 — AI OSS tool repo goes archived over night after raising $7.3M Seed¶
- Source: unknown
- Domains: agentic_engineering, analytics_engineering
- Why it was promoted: High-signal durable technical story suitable for synthesis into the context library.
Summary: AI tool project archived immediately post-funding.
Reusable context:
- What happened: The TensorZero AI OSS tool repository was unexpectedly archived on June 12, 2026, shortly after raising $7.3M in seed funding. This sudden move is speculated to be a strategic pivot to a closed-source model or an acquisition.
- Why it matters: This incident underscores the inherent risks for analytics engineers and data platforms relying on venture-backed open-source AI/ML tooling. It highlights the potential for rapid discontinuation or privatization of critical LLMOps infrastructure, which can disrupt development and impact long-term operational stability.
- What to do: Prioritize evaluation of AI/ML and LLMOps tools based on their long-term sustainability, community governance, and clear support commitments, rather than solely on recent funding or initial hype.
2026-06-13 — https://www.ssp.sh/blog/how-to-use-ai-with-de-wes-mckinney/¶
- Source: slack-intake
- Domains: agentic_engineering, analytics_engineering
- Why it was promoted: High-signal durable technical story suitable for synthesis into the context library.
- Raw source: /home/adam/.hermes/context-inbox/raw/intake/2026-06-13/https-www-ssp-sh-blog-how-to-use-ai-with-de-wes-mckinney-18658954d088.md
Reusable context: This series interviews real practitioners to extract the patterns behind how they actually use AI in their data work today. This is the second interview in ‘How to use AI with DE’, and this time we have none other than Wes McKinney.
Creator of Pandas, probably the most widely used data analysis library for Python, Wes has shaped the era of data and is co-creator of Apache Arrow. He also created Ibis to address these issues with a different approach to Python dataframe libraries, by decoupling the dataframe API from the backend implementation.
The article is structured in four parts: (1) how to trust the outcome, (2) knowing what not to build, factoring in cost-per-token among others, (3) accountability of agents and the code they generate, and (4) philosophizing about the future of agentic engineering.
Besides creating the most popular dataframe libraries used by most data people, Wes McKinney now focuses full time on agentic engineering with his newly founded company Kenn Software, which focuses on the promise of building a new stack of development and knowledge systems for the agentic era. He’s also doing AI and Python at Posit, where they work on a data science IDE. He’s a part-time investor in various startups.
Wes has been running Claude Code, Codex, and Gemini CLI for months. Thousands of sessions, hundreds of thousands of messages. He has released multiple tools that help the agentic work (more on this later), and he is at the forefront of what’s going on with his recent blog posts about “Why he uses programming languages built for agents, not humans” and Mythical Agent Month, with his recent insights into how to work with agents. Find all his takes at Wes McKinney.com.
I had the pleasure of asking Wes more about these topics, and we’ll go into more details, plus many other things. Let’s get started.
We started the interview with a critical question that stands above all others in the current AI landscape, and I asked him: “Can we trust the outcome?”. What if we need
2026-06-13 — Larger Context Windows Don’t Fix RAG — So I Built a System That Does¶
- Source: unknown
- Domains: agentic_engineering, analytics_engineering, hermes_system
- Why it was promoted: High-signal durable technical story suitable for synthesis into the context library.
Summary: Critique of context windows vs improved RAG systems with proposed alternatives.
Reusable context:
- What happened: The article highlights a key failure in current RAG systems: larger context windows do not resolve accuracy issues, particularly for analytical queries. It proposes a "QueryRouter" system that intelligently routes queries based on intent ("Computation" or "Retrieval") to address this "Error Observability Collapse."
- Why it matters: This is critical for analytics engineers and those working with AI/ML tooling, as it underscores that LLMs are not reliable computational engines for aggregations. Relying solely on RAG for analytical questions leads to polished but incorrect results.
- What to do: Evaluate implementing a query classification layer (like the proposed QueryRouter) in your AI/analytics stack to direct computational queries to deterministic engines (e.g., dbt, Snowflake) and factual retrieval queries to RAG.
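A query classification layer of the kind described could look roughly like the sketch below. This is not the article's QueryRouter, whose internals are not given here: the keyword heuristic, the `Intent` enum, and the `classify`/`route` functions are all assumptions for illustration, and a production classifier would likely need something more robust than a regex.

```python
import re
from enum import Enum
from typing import Callable


class Intent(Enum):
    COMPUTATION = "computation"
    RETRIEVAL = "retrieval"


# Hypothetical hint words suggesting the user wants an aggregate computed.
_COMPUTATION_HINTS = re.compile(
    r"\b(sum|total|average|avg|count|how many|median|max|min|growth|percent(age)?)\b",
    re.IGNORECASE,
)


def classify(query: str) -> Intent:
    """Label a query as computation if it contains an aggregation hint."""
    if _COMPUTATION_HINTS.search(query):
        return Intent.COMPUTATION
    return Intent.RETRIEVAL


def route(
    query: str,
    run_deterministic: Callable[[str], str],
    run_retrieval: Callable[[str], str],
) -> str:
    """Send computational queries to a deterministic engine, the rest to retrieval."""
    if classify(query) is Intent.COMPUTATION:
        return run_deterministic(query)
    return run_retrieval(query)
```

The handlers are injected so the deterministic path can point at a warehouse or semantic layer while the retrieval path points at RAG, without the router knowing about either.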
2026-06-13 — Megathread Summary: I Asked Multiple Reddit Communities How to Build a Living Memory /Context Engine for Business. Here's what everyone had to say.¶
- Source: unknown
- Domains: agentic_engineering, analytics_engineering, hermes_system
- Why it was promoted: High-signal durable technical story suitable for synthesis into the context library.
Reusable context:
- What happened: A Reddit megathread summarized community discussions on building a "living memory" or context engine for businesses, focusing on design philosophies like "Query-First Design," architectural choices such as append-only event logs and hybrid search, and memory management strategies including significance scoring.
- Why it matters: This research directly informs the development of advanced AI tooling and agent frameworks by providing practical insights into managing and synthesizing enterprise knowledge, which is critical for analytics engineers integrating AI with data platforms and orchestration tools.
- What to do: Evaluate hybrid search (vector + relational/graph) solutions and append-only event log architectures for future knowledge management systems within your data stack.
2026-06-13 — Mistral Seeks $3.5 Billion to Build European AI Infrastructure - PYMNTS.com¶
- Source: PYMNTS.com
- Domains: analytics_engineering
- Why it was promoted: High-signal durable story with actionable implications for the context library.
Summary: Mistral seeks $3.5B for AI infrastructure.
2026-06-14 — Anthropic Offers Mythos Upgrade for Cyber Partners and a ‘Safe’ Version for the Rest of You - WIRED¶
- Source: WIRED
- Domains: agentic_engineering, analytics_engineering
- Why it was promoted: High-signal durable story with actionable implications for the context library.
Summary: Anthropic releases a 'Mythos' model upgrade for partners alongside a safety-focused version for general users.
Reusable context:
- What happened: Anthropic released new "Mythos-class" AI models, offering an unrestricted "Mythos 5" to cyber security partners and a "Fable 5" with aggressive safety guardrails for general public and developer use. Fable 5 routes high-risk queries to a less capable model, though both show strong performance in coding and analytical tasks.
- Why it matters: This bifurcated release demonstrates a growing trend of specialized AI models and controlled access based on use-case, which impacts the capabilities available for AI/ML tooling and developer productivity within data platforms. The strong analytical and coding benchmarks of Fable 5 suggest immediate utility for analytics engineers.
- What to do: Evaluate Claude Fable 5 for its potential to automate or enhance complex analytical tasks and coding within existing data workflows.
2026-06-14 — Claude Fable Blocked - 11 Quiet Details on What’s Next¶
- Source: unknown
- Domains: agentic_engineering, analytics_engineering
- Why it was promoted: High-signal durable technical story suitable for synthesis into the context library.
Summary: Report on the blocking of 'Claude Fable' and details on future model development.
Reusable context:
- What happened: Anthropic's "Fable 5" model was blocked, reportedly due to its advanced capabilities raising "difficulty" and government influence, rather than purely technical flaws. This highlights growing regulatory scrutiny in LLM development.
- Why it matters: Increased regulatory intervention and safety concerns will directly impact the availability, capabilities, and ethical considerations of integrating LLMs like Claude into data platforms and AI/ML tooling. This influences adoption timelines and feature sets for analytics engineers.
- What to do: Evaluate new Claude releases with a focus on their compliance posture and capabilities for enterprise use, especially for sensitive data processing or automated decisioning.
2026-06-14 — I indexed 669 GB of my GoPro videos using my M1 Max computer and local ML models¶
- Source: unknown
- Domains: agentic_engineering, analytics_engineering
- Why it was promoted: High-signal durable story with actionable implications for the context library.
Summary: Developer indexes 669 GB of media using local ML on Apple Silicon.
Reusable context:
- What happened: A developer used local ML models on an M1 Max to index 669 GB of GoPro video, allowing for semantic search and automated clip extraction for video editing.
- Why it matters: This showcases the growing power of local AI/ML tooling and personal compute for processing large, unstructured datasets, offering a cost-effective alternative to cloud-based solutions for data platforms and enhancing developer productivity by automating complex tasks.
- What to do: Evaluate local ML frameworks (e.g., MLX, ONNX Runtime) for personal data processing workflows and integrate them into existing data pipelines where cost or privacy are concerns.
2026-06-14 — Linux 7.1¶
- Source: unknown
- Domains: agentic_engineering, analytics_engineering
- Why it was promoted: High-signal durable story with actionable implications for the context library.
Summary: Announcement of Linux 7.1 kernel.
Reusable context:
- What happened: Linux Kernel 7.1 was officially released on June 14, 2026, introducing a rewritten in-kernel NTFS driver for improved performance, initial hardware enablement for AMD Zen 6 and Intel Panther Lake processors, and a new policy to manage AI-generated bug reports.
- Why it matters: The improved NTFS driver can enhance data processing efficiency on Linux for analytics engineers working with Windows filesystems. Hardware support for upcoming CPUs directly benefits the performance of data platforms and AI/ML workloads. The AI bug report policy signifies AI's increasing role in developer workflows, impacting AI tooling and productivity.
- What to do: Evaluate the new in-kernel NTFS driver in Linux Kernel 7.1 for potential performance improvements in data pipeline operations involving Windows filesystems.
2026-06-14 — [NEW FAMILY OF MODELS] Supra1.5 family just released!¶
- Source: Reddit r/LocalLLaMA
- Domains: agentic_engineering, analytics_engineering
- Why it was promoted: High-signal durable technical story suitable for synthesis into the context library.
Reusable context:
- What happened: SupraLabs, under Project Chimera, released the Supra1.5 family of "Small Language Models" (SLMs). These models boast approximately 50 million parameters and are designed for high performance and efficient local inference, representing a significant advancement in ultra-compact AI.
- Why it matters: The Supra1.5 models are relevant to analytics engineering and data platforms due to their capability for "edge" data tasks like local SQL generation and privacy-preserving analytics. Their instant inference and improved tool-calling features can enhance developer productivity for agentic CLI assistants and offline AI applications, seamlessly integrating with existing AI/ML tooling.
- What to do: Evaluate Supra1.5 models for embedded AI applications within data pipelines or CLI tools, particularly for privacy-sensitive data processing and real-time developer assistance.
2026-06-14 — The Verifier Tax: Horizon-Dependent Safety–Success Tradeoffs in Tool-Using LLM Agents [R]¶
- Source: Reddit r/MachineLearning
- Domains: agentic_engineering, analytics_engineering
- Why it was promoted: High-signal durable story with actionable implications for the context library.
Reusable context:
- What happened: New research introduces the "Verifier Tax," demonstrating that runtime safety checks in tool-using LLM agents consistently reduce task success rates and rarely lead to genuinely "Safe Success," especially over interaction horizons of 15-30 turns. Agents struggle significantly with recovery after a blocked action due to safety interventions.
- Why it matters: This is critical for AI/ML tooling and data platforms, revealing a fundamental safety-performance tradeoff in LLM agents. The "Verifier Tax" implies that current safety mechanisms often break agent reasoning, impacting the reliability and efficiency of LLM-powered automation in data workflows.
- What to do: When designing or evaluating LLM agent systems for analytics, prioritize frameworks that enable grounded identity verification and robust post-intervention reasoning to effectively recover from safety blocks.
2026-06-15 — Ask HN: Has anyone replaced Claude/GPT with a local model for daily coding?¶
- Source: unknown
- Domains: agentic_engineering, analytics_engineering
- Why it was promoted: High-signal durable technical story suitable for synthesis into the context library.
Summary: Community discussion on the feasibility of replacing cloud-based coding LLMs with local models.
Reusable context:
- What happened: Hacker News discussion reveals that while some developers are experimenting, local models like Code Llama or Phind-70B are generally not yet replacing cloud models (Claude, GPT) for daily coding tasks due to significantly lower inference speeds (e.g., 0.7 tokens/sec) and inferior performance on complex optimizations.
- Why it matters: This directly impacts the immediate adoption strategy for integrating LLMs into analytics engineering workflows, suggesting that cloud-based solutions remain dominant for productivity-critical tasks. It also highlights the current limitations of local inferencing on commodity hardware for computationally intensive coding assistance.
- What to do: Continue to prioritize cloud-based LLM integrations for developer tooling, while monitoring local model performance advancements and hardware capabilities for future on-premise deployment considerations.