Here’s a scenario that plays out frequently in enterprises across the globe.
A business leader requests what appears to be a “simple” report. The data team logs a Jira ticket, discusses it during sprint planning, and then disappears into weeks of back‑and‑forth clarifications. From a business perspective, the request feels like a straightforward query against a clean, well-defined table. For the data team, it resembles an archaeological excavation.
The business wonders why it takes so long. The data team struggles to articulate why.
The reality is this: data work isn’t slow because coding is difficult; it’s slow because the context required to write that code is difficult.
What most organizations label as technical delay is, in truth, a failure of data context management.
The Anatomy of a “Simple” Request
Why a simple request takes so long is best understood through an example. Imagine working for an ecommerce company. Ahead of a board meeting, leadership wants a concise narrative:
- What’s moving?
- What’s emerging?
- What should we pay attention to next?
The request for a weekly report highlighting top movers looks like this: “Show products/categories with the largest week‑over‑week change in sales, demand signals, and conversion. Include ‘emerging categories’ where demand is rising for multiple consecutive weeks.”
The business sees clarity. The data team sees a minefield of unresolved questions. This gap is one of the most persistent data engineering bottlenecks in modern enterprises.
The Definition Crisis
What does “Sales” mean? Is it gross or net? Is it recorded on order date or shipment date? Does it include returns or cancellations? Each choice maps to a different table – ‘orders’, ‘shipments’, or ‘financial_transactions’.
Next, think about how you would parse “week-over-week.” It sounds precise, but is it a calendar week or a fiscal week? Do we compare against the previous week or the same week last year? Are we focused on absolute change or percentage change?
These aren’t technical problems; they are interpretation problems.
Then there are terms like “demand signals” and “emerging categories,” which are especially dangerous. Do signals mean page views, searches, or add-to-carts? Does “emerging” mean growth for two weeks or four? To build this, the team must invent a definition. Once encoded in SQL, that invention becomes “truth.” If it’s wrong, the report gets questioned, but the real culprit – the hidden definition – goes unexamined.
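To make the ambiguity concrete, here is a minimal sketch of how two reasonable readings of the same request diverge. The order records, amounts, and field names are purely illustrative; the point is that each combination of choices (gross vs. net, order date vs. shipment date) is a different metric wearing the same name.

```python
from datetime import date

# Toy order records: (order_date, ship_date, gross_amount, refund_amount).
# All values are illustrative.
orders = [
    (date(2024, 6, 3),  date(2024, 6, 5),  100.0, 0.0),
    (date(2024, 6, 7),  date(2024, 6, 10), 200.0, 50.0),
    (date(2024, 6, 10), date(2024, 6, 12), 300.0, 100.0),
    (date(2024, 6, 14), date(2024, 6, 17), 150.0, 150.0),
]

def weekly_sales(records, *, use_ship_date=False, net=False):
    """Total 'sales' per ISO week under one specific set of definitional choices."""
    totals = {}
    for order_date, ship_date, gross, refund in records:
        day = ship_date if use_ship_date else order_date
        amount = gross - refund if net else gross
        week = day.isocalendar()[1]
        totals[week] = totals.get(week, 0.0) + amount
    return totals

# Definition A: gross sales, recorded on order date.
print(weekly_sales(orders))                                # {23: 300.0, 24: 450.0}
# Definition B: net sales, recorded on shipment date.
print(weekly_sales(orders, use_ship_date=True, net=True))  # {23: 100.0, 24: 350.0, 25: 0.0}
```

On this toy data, definition A says week 24 is up 50% week-over-week; definition B says it is up 250%. Same orders, same question, two different “truths.”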
Without structured metadata and shared semantic layers, teams repeatedly recreate meaning from scratch instead of relying on metadata-driven data engineering practices that make definitions explicit and reusable.
Why This Takes Weeks
Look at the problem from the data team’s point of view. None of this work involves exotic infrastructure or complex Python. The time is spent reconstructing intent.
One system records gross sales at order placement. Another records net sales at shipment. Promotions are applied differently across channels.
Engineers read old dashboards for clues. They argue over definitions in Slack. They decipher pipelines named sales_final_v3_OLD. Every question requires finding someone who remembers why a decision was made years ago.
The logic, definitions, business rules, and conventions – in other words, the context – are difficult to ascertain. They live in old Jira tickets, half-remembered conversations, and frequently in the heads of a handful of experts who “know how it works.”
In the end, a pipeline gets built by stitching together these disparate fragments of context. But the final pipeline encodes choices that were never written down, and the next team will repeat the same exercise.
This cycle—reconstruct, encode, forget, repeat—is the invisible drag that slows AI in data engineering workflows, even when automation tools promise acceleration.
AI Changes the Equation—But There is a Catch
AI has changed the economics of data work. Code can now be generated, refactored, translated, and optimized in minutes.
And yet, most data teams do not feel dramatically faster.
AI exposes what was always true but easier to ignore when the code itself was hard. Our attention was naturally directed toward hiring engineers who understood SQL well, knew the latest Python frameworks, and had experience building production-grade pipelines with CI/CD practices. A lack of that knowledge could slow a team down before; in the AI era, it rarely does.
Modern AI systems excel at execution. Given clear intent, structured rules, and unambiguous constraints, they perform remarkably well. Where they fail is when the problem is underspecified, inconsistent, or buried in human memory.
AI does not struggle with code; it struggles with context.
Ask an AI to write a data pipeline from Salesforce to Snowflake, and it will happily comply. But if business rules are implicit and edge cases were decided in meetings years ago, the output will look correct while being fundamentally wrong. Not because the model is weak, but because intent was never made explicit.
This mirrors what human engineers have always faced. Senior data engineers do not spend their time typing. They spend it asking questions, validating assumptions, and reconstructing decisions. AI simply makes the gap impossible to ignore.
The real competitive advantage for data teams is institutionalized context: shared definitions, explicit assumptions, documented decisions, and living business logic that is visible, queryable, and evolvable. In practice, this is what effective data context management looks like at scale.
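What “queryable business logic” might look like in its simplest form is sketched below: a metric registry where each definition carries its choices and its rationale. The class, field names, and example metric are hypothetical illustrations, not a real system, but they show how a definition can be looked up instead of excavated.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MetricDefinition:
    """One explicit, shared definition of a business metric.
    All field names and values here are illustrative assumptions."""
    name: str
    formula: str          # human-readable logic, mirrored in the semantic layer
    grain: str            # which date the metric is recorded on
    includes_returns: bool
    rationale: str        # why this choice was made, so no one rediscovers it

# A tiny in-memory registry; in practice this would live in a catalog.
REGISTRY = {
    "net_sales_wow": MetricDefinition(
        name="net_sales_wow",
        formula="sum(gross_amount - refund_amount), week-over-week % change",
        grain="shipment_date",
        includes_returns=True,
        rationale="Finance reports revenue at shipment; board decks must match.",
    ),
}

def describe(metric_name: str) -> str:
    """Answer 'what does this metric mean?' without archaeology."""
    m = REGISTRY[metric_name]
    return f"{m.name}: {m.formula} (grain={m.grain}; returns included: {m.includes_returns})"

print(describe("net_sales_wow"))
```

The design choice that matters is not the data structure; it is that the rationale travels with the definition, so the “why” survives team turnover.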
Until organizations treat context as a first-class asset, every “simple” request will continue to feel like archaeology.
The Context Dividend: Moving From Execution to Intent
The winners of the AI era won’t be the ones who process more petabytes of data; they will be the ones who build durable understanding. This means:
- Platforms must actively discover organizational context from different systems and embed it in the workflows of data teams.
- Processes must archive intent, rationale, exceptions, and definitions, so they never have to be “rediscovered.”
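Archiving intent can start very simply. The sketch below shows one hypothetical shape for a decision record: every definitional choice is logged with its rationale and exceptions. The function, field names, and example decision are assumptions for illustration only.

```python
import json

def record_decision(log, *, question, decision, rationale, decided_on, exceptions=()):
    """Append one definitional choice, with its 'why', to an append-only log."""
    log.append({
        "question": question,
        "decision": decision,
        "rationale": rationale,
        "exceptions": list(exceptions),
        "decided_on": decided_on,
    })

decisions = []
record_decision(
    decisions,
    question="Does 'emerging category' mean 2 or 4 weeks of rising demand?",
    decision="4 consecutive weeks of rising add-to-carts",
    rationale="2 weeks proved too noisy in a backtest shared at quarterly planning.",
    decided_on="2024-03-12",
    exceptions=["Categories less than 8 weeks old are excluded."],
)

print(json.dumps(decisions, indent=2))
```

Even a log this plain changes the next request: the answer to “why four weeks?” becomes a lookup instead of a hunt for whoever was in the room.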
Organizations that embrace metadata-driven data engineering and formalize AI in data engineering workflows will reduce ambiguity even before a single line of code is written.
We have reached a fork in the road. Data teams can continue to rebuild logic from memory, chasing intent through a trail of Jira tickets and GitHub repositories and risking falling behind their business priorities, or they can treat context as infrastructure and use it to accelerate their output.
Teams that cling to the old way will feel slow despite their high-tech tools. Teams that adopt context-rich platforms and processes will finally make “simple” requests actually simple.
Execution is no longer a bottleneck. Context is—and it is the constraint organizations can no longer afford to ignore.