Add a source by asking
"Add this source, add this fact" becomes a request your AI fulfills, drafting models, tests, and docs for human review.
Example request: “Add Salesforce opportunities as a daily fact.” The AI drafts stg_opportunities.sql, fct_opportunities.sql, schema tests, and docs for a human to review.
For as long as there have been pipelines, adding a source or a fact has meant hand-coding it. Someone writes the model, someone writes the tests, someone writes the docs, and the request waits in a backlog until that someone has time. The work is rarely hard. It’s just slow, repetitive, and gated on a scarce person.
Now you ask. “Add the payouts source and a daily payouts fact” becomes a request your AI fulfills, drafting the model, the tests, and the documentation, and handing back a pull request for a human to read. Text-to-pipeline turns a sentence into a reviewable change. The work that used to fill an afternoon arrives as a draft in minutes.
Why now
- AI-assisted coding is already the most adopted use of AI on data teams. Engineers reach for an assistant to write transformations, fix queries, and explain unfamiliar code as a matter of routine. The muscle is built; pointing it at whole pipeline changes is the next step, not a leap.
- Agents can scaffold a model and its tests from one request. Given a source and a target, an agent drafts the staging model, the fact, the tests that guard them, and the docs, all in your project’s conventions. Grounded in your company’s memory of how you name things and how your team builds, the draft reads as if a teammate wrote it rather than a generic tool.
What it looks like
An analyst messages the team’s agent: “add the Stripe payouts source and a daily payouts fact.”
A naive tool would dump a generic SQL snippet and leave the wiring to you. This does what a careful colleague would. It inspects the source, drafts a staging model that conforms to your naming and existing definitions, builds a daily payouts fact on top, and writes the tests that protect both: uniqueness on the grain, not-null on the keys, and a freshness check on the load. It drafts the docs so the new fact shows up described, not orphaned. Then it opens a pull request.
A data engineer reviews that PR as they would a peer’s. The model is mostly right. The freshness window is too tight, so they loosen it; a column name breaks convention, so they fix it. They approve and merge. An afternoon of writing became fifteen minutes of reading and directing.
What counts isn’t that the AI wrote code; it’s that it produced a complete, reviewable change, model and tests and docs together, so the human’s time goes to judgment rather than typing.
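To make the three tests in that walkthrough concrete, here is a minimal sketch of what each one checks, written as plain Python over in-memory rows. It is an illustration, not how any particular tool implements its tests: the column names (payout_date, account_id, loaded_at), the sample rows, and the 24-hour and 48-hour windows are all assumptions made up for the example.

```python
from datetime import date, datetime, timedelta, timezone

# Hypothetical rows from a daily payouts fact. Column names are assumptions.
rows = [
    {"payout_date": date(2024, 4, 30), "account_id": "acct_1", "amount": 120.0,
     "loaded_at": datetime(2024, 5, 1, 6, 0, tzinfo=timezone.utc)},
    {"payout_date": date(2024, 5, 1), "account_id": "acct_1", "amount": 80.0,
     "loaded_at": datetime(2024, 5, 2, 6, 0, tzinfo=timezone.utc)},
    {"payout_date": date(2024, 5, 1), "account_id": "acct_2", "amount": 45.5,
     "loaded_at": datetime(2024, 5, 2, 6, 0, tzinfo=timezone.utc)},
]


def check_unique(rows, grain):
    """Return the grain keys that appear more than once."""
    seen, dupes = set(), set()
    for row in rows:
        key = tuple(row[col] for col in grain)
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return sorted(dupes)


def check_not_null(rows, columns):
    """Return (row_index, column) pairs where a key column is missing."""
    return [(i, col) for i, row in enumerate(rows)
            for col in columns if row[col] is None]


def check_freshness(rows, loaded_at_column, max_age, now):
    """Return True when the newest load is no older than max_age."""
    newest = max(row[loaded_at_column] for row in rows)
    return now - newest <= max_age


grain = ["payout_date", "account_id"]  # one row per account per day (assumed)
now = datetime(2024, 5, 3, 12, 0, tzinfo=timezone.utc)  # fixed for repeatability

duplicates = check_unique(rows, grain)
nulls = check_not_null(rows, grain)
fresh_tight = check_freshness(rows, "loaded_at", timedelta(hours=24), now)
fresh_loose = check_freshness(rows, "loaded_at", timedelta(hours=48), now)
```

With these sample rows the newest load is 30 hours old, so the 24-hour window fails and the 48-hour window passes. That is the kind of threshold a reviewer loosens in the pull request rather than something the draft can be expected to get right on its own.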
Where it’s heading
Toward the data engineer as reviewer and director rather than typist. As drafts get more reliable, the day shifts from producing boilerplate to deciding what should be built, checking the drafts, and handling the genuinely novel cases the agent can’t handle. The “can you just add this source” backlog becomes a queue of small reviews. And because every change ships with its own tests, the same drafting that adds a source feeds the self-healing pipelines that keep it running, so the system that builds the pipeline is the system that maintains it. The role doesn’t shrink; it moves up.
How we think about it
The AI drafts; the human reviews and owns the merge. A draft is not a deployment. Every text-to-pipeline change arrives as a pull request, with its tests and docs, so a person can read it, correct it, and decide whether it belongs in production. The agent is fast and tireless and occasionally wrong, which is exactly why the merge stays human. Speed on the draft, judgment on the merge.
Strike that balance and adding a source is no longer a backlog item. It’s a sentence you say and a change you review, with a person firmly in the loop where it counts.
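The rule above can be sketched as a small gate: a draft is mergeable only when it carries its model, tests, and docs, and a person has approved it. This is an illustrative sketch, not a description of any real product’s internals. The DraftChange class, its fields, and the two helper functions are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DraftChange:
    """Hypothetical record of an agent-drafted pull request."""
    request: str
    models: List[str] = field(default_factory=list)
    tests: List[str] = field(default_factory=list)
    docs: List[str] = field(default_factory=list)
    approved_by: Optional[str] = None  # filled in by a human reviewer only


def missing_parts(change):
    """List which of models, tests, and docs the draft lacks."""
    return [part for part in ("models", "tests", "docs")
            if not getattr(change, part)]


def can_merge(change):
    """A draft merges only if it is complete and a human has approved it."""
    return not missing_parts(change) and change.approved_by is not None


draft = DraftChange(
    request="Add Salesforce opportunities as a daily fact.",
    models=["stg_opportunities.sql", "fct_opportunities.sql"],
    tests=["schema tests"],
    docs=["docs"],
)

ready_before_review = can_merge(draft)  # complete, but nobody has approved it
```

The draft here is complete, yet it stays unmergeable until approved_by is set. Completeness is the agent’s job; approval is the human’s.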
Add a source by asking, in short.
Does this mean the data engineer stops writing code?
No, they stop writing the boilerplate. The engineer reviews, corrects, and directs the draft, then owns the merge. The hard parts, deciding what a fact should mean and whether the logic is right, are still human work. The typing is what goes away.
What if the AI drafts the wrong model?
The human catches it in review, the same place you'd catch a colleague's mistake. The draft arrives as a pull request, not a live change. Nothing reaches production until a person reads the model, checks the tests, and approves the merge.
How is this different from a code generator?
A code generator emits a snippet you then wire in. This drafts the whole change in context: the model, the tests that guard it, the docs that explain it, grounded in your conventions and existing definitions, ready to review as one coherent unit.
Keep exploring
Self-healing pipelines
Pipelines that detect, diagnose, and remediate routine failures, so the 2am page is for the genuinely novel only.
A data-quality copilot
An agent that proposes tests, triages incidents, and explains root cause. Quality becomes a continuous conversation, not a quarterly cleanup.
Natural-language writeback
Correcting and enriching data in plain language, with governed, audited write paths back to the warehouse.
Where could this take your BI?
If this is the direction you want to head, we should talk.