A perspective from Plainsight
For end users

Alerts that find the right person

Metrics watched continuously; anomalies and threshold breaches reach the right person with context, not noise.

Proactive alerting

A metric, watched continuously

Daily orders -23% vs forecast

09:14 likely cause: checkout error on EU site

routed to: Ops on-call

Anomalies reach the right person with context, not a wall of noise.

For years, alerting meant drawing a line and waiting. You picked a threshold (revenue below this, errors above that) and the system pinged a shared channel whenever it was crossed. The lines were guesses, blind to seasonality, and fired so often that people muted the channel. Alert fatigue was the predictable result of static rules shouting at a crowd.

The emerging approach watches metrics continuously, learns what normal looks like for each, and routes the genuinely unusual to the one person who owns it, with the likely cause attached. The shift is from a fixed line and a shared channel to a learned baseline and a named owner. An alert stops being a broadcast and becomes a message to someone who can act.

Why now

  • Anomaly detection over the semantic layer, not raw tables. When the system knows this number is “refunds,” scoped to a region, with a known weekly rhythm, it learns a baseline that respects context. A spike that looks alarming against a flat threshold is unremarkable on a Monday; a small move at an odd hour is the one that matters.
  • Routing. Knowing something is wrong is half the job; knowing who owns it is the other half. Once metrics carry ownership, an alert goes to the regional lead, not a channel of forty people who each assume someone else has it. Specific detection plus specific routing turns an alert back into a signal.
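The two ideas above can be sketched together. This is a minimal illustration, not a description of any particular product: the `SeasonalBaseline` class, the `check` function, the ownership map, the cutoff of 3.5, and the sample numbers are all hypothetical. It assumes a baseline kept per metric, region, weekday, and hour, scored with a median-based robust z-score, and it stays silent when history is thin or no owner is known.

```python
from collections import defaultdict
from statistics import median


class SeasonalBaseline:
    """Keeps history per (metric, region, weekday, hour) slot."""

    def __init__(self, min_samples=4):
        self.min_samples = min_samples
        self.history = defaultdict(list)

    def observe(self, metric, region, weekday, hour, value):
        self.history[(metric, region, weekday, hour)].append(value)

    def score(self, metric, region, weekday, hour, value):
        """Robust z-score against the slot's history, or None if history is thin."""
        past = self.history.get((metric, region, weekday, hour), [])
        if len(past) < self.min_samples:
            return None
        center = median(past)
        mad = median([abs(x - center) for x in past])
        # 1.4826 rescales the MAD to match a standard deviation for normal data;
        # fall back to 1.0 when the history has no spread at all.
        scale = 1.4826 * mad if mad > 0 else 1.0
        return (value - center) / scale


def check(baseline, owners, metric, region, weekday, hour, value, cutoff=3.5):
    """Return an alert dict addressed to the metric's owner, or None to stay silent."""
    z = baseline.score(metric, region, weekday, hour, value)
    if z is None or abs(z) < cutoff:
        return None  # not enough history, or within normal range for this slot
    owner = owners.get((metric, region))
    if owner is None:
        return None  # nobody owns it, so nothing is sent
    return {"to": owner, "metric": metric, "region": region, "z": round(z, 1)}


baseline = SeasonalBaseline()
for past_value in [18, 22, 20, 21, 19]:  # made-up Tuesday 14:00 refund counts
    baseline.observe("refunds", "EU", weekday=1, hour=14, value=past_value)

owners = {("refunds", "EU"): "eu-regional-lead"}  # hypothetical ownership map
quiet = check(baseline, owners, "refunds", "EU", 1, 14, 21)  # normal for the slot
alert = check(baseline, owners, "refunds", "EU", 1, 14, 60)  # far above the slot
```

Here `quiet` is `None` and `alert` is addressed to the single owner. The same value of 60 in a slot with no history would also return `None`, which is the "stay silent" default rather than a guess.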

What it looks like

Refunds in one region jump on a Tuesday afternoon.

A threshold system would either miss it, because the daily total still sits under the line, or have fired so often that week that nobody is looking. The learned baseline sees that refunds for that region, that day, that hour are running well above normal, and the move is too sharp to be ordinary variation.

So one message goes out, to the regional owner, not to everyone. Refunds are up roughly 3x against the regional baseline since early afternoon, concentrated in a single product line, starting shortly after a price change went live. The message links straight to the breakdown, so the owner can confirm in seconds. No firehose, no triage meeting, just the right person looking at the right thing with the cause in hand.

Where it’s heading

Toward alerts that propose what to do, not just what happened. The refund alert wouldn’t stop at naming the price change; it would offer to roll it back, draft the note to the pricing team, or open the ticket, and wait for a human to approve. The alert becomes the opening line of a response rather than the end of a report, so the owner spends attention on the decision, not on assembling context.
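One way to picture the approval step is a proposal that carries its action but cannot run it until a person signs off. The sketch below is an assumption about how such a gate might look: `ProposedAction`, its fields, and the stand-in actions (which only return strings) are hypothetical, not an existing API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProposedAction:
    label: str
    run: Callable[[], str]   # the action itself; never called before approval
    status: str = "pending"  # pending -> approved or rejected
    approved_by: str = ""
    result: str = ""

    def approve(self, approver: str) -> str:
        """Execute the action once, on behalf of a named human approver."""
        if self.status != "pending":
            raise ValueError(f"action already {self.status}")
        self.result = self.run()
        self.status = "approved"
        self.approved_by = approver
        return self.result

    def reject(self) -> None:
        """Discard the proposal without running it."""
        if self.status != "pending":
            raise ValueError(f"action already {self.status}")
        self.status = "rejected"


# Stand-in actions for the refund example; real ones would call other systems.
proposals = [
    ProposedAction("Roll back the price change", lambda: "price change rolled back"),
    ProposedAction("Draft a note to the pricing team", lambda: "note drafted"),
    ProposedAction("Open a ticket", lambda: "ticket opened"),
]
rollback = proposals[0]
```

Nothing happens when the proposals are created. Only a call such as `rollback.approve("regional-owner")` runs the action, and a second approval attempt raises an error instead of repeating it.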

How we think about it

An alert nobody owns is noise. Route it with context to a named owner, or stay silent. A broadcast to a shared channel isn’t an alert; it’s a way of avoiding responsibility for one, and it trains people to ignore the channel. Every alert should answer who, what, and why before it fires, carrying enough of the automated insight that the owner can act without hunting. When the system can’t meet that bar, it should send nothing. That restraint is as much a matter of trust and governance as of detection.
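The "who, what, and why, or nothing" rule can be expressed as a gate in front of the send step. This is a sketch under stated assumptions: the `Alert` fields, the `dispatch` function, the message format, and the 0.8 confidence floor are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    owner: Optional[str]         # who should act
    what: Optional[str]          # what moved
    likely_cause: Optional[str]  # why it probably moved
    confidence: float = 0.0      # detector's own confidence, 0 to 1


def dispatch(alert: Alert, min_confidence: float = 0.8) -> Optional[str]:
    """Return the message to send, or None when the alert should stay silent."""
    if not (alert.owner and alert.what and alert.likely_cause):
        return None  # missing who, what, or why: send nothing
    if alert.confidence < min_confidence:
        return None  # not sure enough to interrupt a person
    return f"@{alert.owner}: {alert.what}. Likely cause: {alert.likely_cause}."


complete = Alert(
    owner="regional-owner",
    what="Refunds roughly 3x the regional baseline since early afternoon",
    likely_cause="price change went live shortly before",
    confidence=0.9,
)
unowned = Alert(owner=None, what=complete.what,
                likely_cause=complete.likely_cause, confidence=0.9)
unsure = Alert(owner="regional-owner", what=complete.what,
               likely_cause=complete.likely_cause, confidence=0.4)

sent = dispatch(complete)
```

Only `complete` produces a message. `dispatch(unowned)` and `dispatch(unsure)` both return `None`, so the default under doubt is silence rather than a broadcast.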

Honor that, and the alert channel is no longer the thing everyone mutes. It’s the one people read, because when it pings, it’s for them, and it already tells them what to do.

Questions

Alerts that find the right person, in short.

How is this different from the threshold alerts we already have?

Static thresholds fire on a fixed line you set once and forget. A learned baseline knows what normal looks like for this metric, this region, this day of the week, so it flags the genuinely unusual, not the merely large. Fewer alerts, and the ones that fire are worth reading.

Will this just create more noise?

Less, not more. The goal is fewer, better alerts. An alert that goes to everyone goes to no one. Routing each one to a single owner with the likely cause attached is what turns a firehose back into a signal people trust.

What happens when the system is not sure?

It says so, or stays quiet, rather than paging someone on a hunch. An alert nobody can act on is noise, and crying wolf is how teams learn to mute the channel. When confidence is low, silence is the safer default.

Where could this take your BI?

If this is the direction you want to head, we should talk.

Talk to us