A perspective from Plainsight
Building blocks

The semantic layer

The shared definition of metrics and entities, and the single biggest lever for AI accuracy. Answers come from governed definitions, not raw-schema guesswork.

Figure: the semantic layer, defined once and governed, grounds a plain-language question in the raw tables and sources.

A question, in plain language:
Why did EU margin drop last quarter?

The semantic layer:
Revenue = sum(net_amount)
Margin = (revenue - cost) / revenue
EU region = country in DE, FR, NL, ...
Active customer = ordered in 90 days

Raw tables and sources:
orders, invoices, crm_accounts, fx_rates, product_dim, events

The same governed definitions feed every chart, agent, and answer across the business.

For years, every tool decided for itself what “revenue” meant. The dashboard had one definition, the spreadsheet another, finance a third, and reconciling them was someone’s quarterly headache.

The semantic layer is one governed definition that everything shares. A metric is defined once, certified, and every tool, report, and language model resolves to it. It stops being an internal modeling detail and becomes the thing that decides whether an AI answer is trustworthy at all.
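
As a rough illustration of "defined once, resolved everywhere", here is a minimal Python sketch built on the Revenue and Margin definitions from the diagram earlier on this page. It is not how any particular product implements a semantic layer. The `Metric` class, the `REGISTRY`, the `resolve` function, and the `cost_amount` column are all hypothetical names introduced for the example; the source does not say how cost is calculated.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    compute: Callable[[List[Row]], float]
    certified: bool = False


def revenue(rows: List[Row]) -> float:
    # Revenue = sum(net_amount), as in the diagram.
    return sum(r["net_amount"] for r in rows)


def cost(rows: List[Row]) -> float:
    # Assumption: the article does not define cost; cost_amount is hypothetical.
    return sum(r["cost_amount"] for r in rows)


def margin(rows: List[Row]) -> float:
    # Margin = (revenue - cost) / revenue, as in the diagram.
    rev = revenue(rows)
    if rev == 0:
        raise ValueError("margin is undefined when revenue is zero")
    return (rev - cost(rows)) / rev


# The single place definitions live. Nothing else re-implements them.
REGISTRY: Dict[str, Metric] = {
    m.name: m
    for m in (
        Metric("revenue", "sum(net_amount)", revenue, certified=True),
        Metric("margin", "(revenue - cost) / revenue", margin, certified=True),
    )
}


def resolve(name: str) -> Metric:
    """Look up a certified metric by name; unknown or uncertified names fail."""
    metric = REGISTRY.get(name)
    if metric is None or not metric.certified:
        raise KeyError(f"no certified metric named {name!r}")
    return metric


# Hypothetical sample rows.
orders = [
    {"net_amount": 100.0, "cost_amount": 60.0},
    {"net_amount": 300.0, "cost_amount": 240.0},
]

# A dashboard and an agent both resolve the same definition.
dashboard_value = resolve("margin").compute(orders)
agent_value = resolve("margin").compute(orders)
```

Because both consumers go through `resolve`, they cannot drift apart, and a request for a metric nobody has certified fails loudly instead of being improvised.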

Why now

The semantic layer used to be a convenience. It’s now the single biggest lever for trustworthy generative BI.

Point a model at a raw schema and ask for SQL, and it’s right maybe 17 to 40% of the time. Ground it in a semantic layer that defines the metrics and entities, and accuracy climbs to roughly 85 to 95%. The figures move with every paper, but the shape holds: grounding roughly triples how often the answer is right. The model isn’t smarter; it’s no longer guessing what you mean by “active customers.” It looks the definition up instead of improvising one. That’s the difference between chat with your data being a party trick and something finance relies on.
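
To make the grounding step concrete, here is a hedged Python sketch of the two kinds of context a model might receive. It calls no model and measures no accuracy; it only shows what "looking the definition up" could mean in practice. The definitions and table names are copied from the diagram on this page, while `build_raw_prompt`, `build_grounded_prompt`, `matching_terms`, and the prompt wording are assumptions made for illustration.

```python
from typing import Dict, List

# Governed definitions, copied from the semantic-layer examples on this page.
DEFINITIONS: Dict[str, str] = {
    "revenue": "sum(net_amount)",
    "margin": "(revenue - cost) / revenue",
    "eu region": "country in DE, FR, NL, ...",
    "active customer": "ordered in 90 days",
}

RAW_TABLES: List[str] = [
    "orders", "invoices", "crm_accounts", "fx_rates", "product_dim", "events",
]


def build_raw_prompt(question: str) -> str:
    """Ungrounded: the model sees table names only and must guess meanings."""
    return f"Tables: {', '.join(RAW_TABLES)}\nQuestion: {question}\nWrite SQL."


def matching_terms(question: str) -> List[str]:
    """Naive substring matching; a real system would need something sturdier."""
    q = question.lower()
    return [term for term in DEFINITIONS if term in q]


def build_grounded_prompt(question: str) -> str:
    """Grounded: each matched term arrives with its governed definition."""
    lines = [f"- {t} = {DEFINITIONS[t]}" for t in matching_terms(question)]
    return (
        "Use only these governed definitions:\n"
        + "\n".join(lines)
        + f"\nQuestion: {question}\nWrite SQL."
    )


question = "How many active customers are there in the EU region?"
print(build_grounded_prompt(question))
```

In the raw prompt, the meaning of "active" appears nowhere, so any answer has to invent one. In the grounded prompt, the agreed definition travels with the question.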

What it looks like

A marketing lead asks how many active customers there are in the EU.

A tool over raw tables has to infer what “active” means: last 30 days? ever purchased? not yet churned? It picks one silently and returns a number that may not match what finance reported last week. Two teams, two truths, no idea why.

Grounded in the semantic layer, it doesn’t infer. “Active customers” is a certified metric defined once, perhaps in a Power BI model or a cube.js project, with the exact logic the business agreed on. Finance and marketing ask the same question and get the same answer, because they’re reading the same definition.
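
The gap between the two situations can be sketched in a few lines of Python. The customers, dates, and reporting date below are invented sample data, and the EU filter is left out to keep the example short. The only part taken from this page is the certified rule itself, "ordered in 90 days"; treating the 90-day boundary as inclusive is an assumption.

```python
from datetime import date, timedelta
from typing import Dict, Optional

AS_OF = date(2024, 6, 30)  # hypothetical reporting date

# Hypothetical sample data: customer -> most recent order date (None = never ordered).
LAST_ORDER: Dict[str, Optional[date]] = {
    "acme": date(2024, 6, 20),
    "birch": date(2024, 5, 1),
    "cobalt": date(2023, 11, 15),
    "dune": None,
}


def ordered_within(days: int) -> int:
    """Count customers whose last order falls on or after the cutoff date."""
    cutoff = AS_OF - timedelta(days=days)
    return sum(1 for d in LAST_ORDER.values() if d is not None and d >= cutoff)


def ever_purchased() -> int:
    """Count customers with at least one order, however old."""
    return sum(1 for d in LAST_ORDER.values() if d is not None)


# Three plausible readings a tool over raw tables might pick silently.
guesses = {
    "last 30 days": ordered_within(30),
    "last 90 days": ordered_within(90),
    "ever purchased": ever_purchased(),
}


def active_customers() -> int:
    """The certified definition from this page: ordered in 90 days."""
    return ordered_within(90)


print(guesses)
print(active_customers())
```

On this sample data the three guesses give three different counts, which is the "two teams, two truths" problem in miniature. The certified function gives one count to everyone who asks.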

Where it’s heading

The semantic layer is becoming the contract between humans, AI, and data: humans agree what the metrics mean, the AI answers within that agreement, the warehouse holds the raw material. Increasingly it’s something agents read from and propose changes to, not a static artifact a modeller hand-builds.

As more analytical work shifts to machines, that contract gets more load-bearing, not less. A human can quietly compensate for a fuzzy definition; an agent at scale cannot. The clearer the semantic layer, the more work can be trusted to AI, which makes investing in it one of the highest-leverage things a data team can do.

How we think about it

No semantic layer, no trustworthy generative BI. Everything else on this site, the conversations, the generated reports, the proactive insights, rests on the system knowing what your numbers mean. Without it, AI is a confident guess generator pointed at your most important decisions; with it, the same models become dependable, which is as much a question of trust and governance as of clever modeling.

So we treat it as the foundation, not the finishing touch. Get the definitions right and shared, and the rest of generative BI has something solid to stand on.

Questions

The semantic layer, in short.

Is the semantic layer just a data dictionary?

No. A dictionary describes columns. A semantic layer defines metrics and entities as logic the system computes, so "revenue" isn't a note about a field but a certified calculation every tool and model resolves to the same way.

Do we need a new product to have one?

Often you already have the start of one. A well-built Power BI model or a cube.js project is a semantic layer. The work is less about buying a tool than agreeing on the definitions and making them the single place answers come from.

Why does this matter so much for AI specifically?

A model pointed at raw tables has to guess what your business means. Pointed at governed definitions, it doesn't guess; it looks the answer up. That single change is the difference between a demo and an answer you can defend.

Where could this take your BI?

If this is the direction you want to head, we should talk.

Talk to us