
Metric overload as a product bug: dashboards that look "rigorous" but reduce decision accuracy

TL;DR

A dashboard with too many metrics is not a measurement system. It is a decision-avoidance system: it creates the appearance of a data-driven culture while increasing cognitive load, deferring choices, and spreading accountability so thinly that nobody is responsible for acting.

Key takeaways

  • Limit any operational dashboard to 5–7 metrics max. If you can’t make that cut, you haven’t decided what you’re optimising for — and a 47-metric dashboard is a symptom of that upstream failure, not a solution to it.
  • Distinguish leading indicators from lagging indicators from diagnostic metrics. Only the first two belong on the primary dashboard. Diagnostics belong in a drill-down layer that you pull when something is wrong, not a layer you scan every morning.
  • When every team member is watching different metrics, you don’t have shared situational awareness — you have competing narratives. Agree on the 3 numbers that define “good week” before you instrument anything else.
  • The correct test for a new metric is not “could this be useful?” but “what decision would change if this number moved?” If you can’t name the decision, the metric is noise; the sketch after this list shows one way to enforce that test.
  • Metric proliferation is often an organisational proxy for distrust — more dashboards get built when leadership doesn’t trust teams to make judgment calls. Fixing the dashboard without fixing the trust dynamic will not hold.
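
One way to make the decision test bite is to refuse to register a metric unless its owner names the decision it drives and the tier it belongs to. Here is a minimal sketch in Python; `Metric`, `METRICS`, and the example entries are hypothetical names for illustration, not any analytics vendor’s API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Metric:
    name: str
    tier: str      # "north_star" | "leading" | "diagnostic"
    decision: str  # the decision that changes if this number moves

    def __post_init__(self) -> None:
        # a metric nobody can attach a decision to is noise by definition
        if not self.decision.strip():
            raise ValueError(f"{self.name}: no decision named; refuse to track")

METRICS = [
    Metric("weekly_active_teams", "north_star",
           "re-plan the quarter if flat for four weeks"),
    Metric("activation_rate", "leading",
           "re-prioritise onboarding work if it falls"),
    Metric("p95_page_load_ms", "diagnostic",
           "investigate only when a primary metric moves"),
]

# only north-star and leading indicators reach the primary dashboard
primary = [m for m in METRICS if m.tier in ("north_star", "leading")]
```

The point of validating at construction time is that the decision field becomes mandatory when the metric is created, not a wiki page someone fills in later.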

The dashboard that proves you’re serious

At some point in most product organisations, someone builds a dashboard and calls it the “source of truth.” It has MRR, ARR, churn, NPS, DAU, WAU, MAU, activation rate, time-to-value, feature adoption by cohort, support ticket volume, P1 count, P2 count, page load times, error rates, funnel drop-off at each step, and seventeen more things that seemed important when someone requested them six months ago.

This dashboard gets presented in every all-hands. It gets linked in every weekly update. It is treated as evidence of rigour.

It is making your decisions worse.

Not because the data is wrong. Because the human brain cannot hold 47 numbers in working memory and arrive at a coherent priority. The research on this is not subtle: Hick’s Law quantifies how decision time grows logarithmically with the number of options, and George Miller’s foundational 1956 paper “The Magical Number Seven, Plus or Minus Two” (Psychological Review) puts the functional limit of working memory at 7 ± 2 items. Your 47-metric dashboard is not a measurement system; it is a decision-paralysis machine dressed up as analytical maturity.
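
For reference, Hick’s Law in its standard form, where n is the number of equally likely options and a and b are empirically fitted constants:

$$T = a + b \log_2(n + 1)$$

Growth is logarithmic, so each doubling of the options adds a roughly constant increment to decision time; a 47-item scan is not 47 times slower than a 5-item one, but it is reliably slower, every single morning.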

Narrow vs Wide Dashboard

| Criteria | 5-metric dashboard | 47-metric dashboard |
| --- | --- | --- |
| Decision speed | Fast | Slow |
| Accountability clarity | Strong | Weak |
| Cognitive load | Low | High |
| Metric ownership | Clear | Diffuse |

How dashboards get fat

Metric proliferation follows a predictable path. A product team launches. Someone asks “how do we know if it’s working?” and the reasonable answer is: track a few things. Then a stakeholder asks why bounce rate isn’t on the dashboard. Then engineering wants latency. Then customer success wants CSAT. Then a VP wants to see it broken down by plan tier.

None of these requests are unreasonable in isolation. The problem is that nobody ever removes a metric once added. Removing a metric feels like saying it doesn’t matter, and that feels like a political statement.

So the dashboard accumulates. Each addition has an advocate. No removal ever does.

The same dynamic appears in OKR reviews. Teams with 12 key results per quarter don’t achieve 12 things — they achieve 3 things and retroactively reframe the other 9. Google’s own internal guidance for OKRs recommends 3–5 objectives with 3–5 key results each — and that’s at a company with enormous coordination infrastructure. Most startups I’ve worked with or observed have OKR lists that are longer than that at the team level, with no agreed hierarchy between items.

Rigour is not the same as comprehensiveness

The confusion at the root of metric overload is conflating rigour with coverage. Rigour means your measurement is accurate and your reasoning from it is sound. Coverage means you have measured many things. These are not the same property, and coverage actively degrades rigour past a certain point.

Here’s the mechanism: when you have 47 metrics, any given week will show some of them up and some of them down. A narrative can always be constructed from the ones that moved favourably. This is not deliberate dishonesty — it is the inevitable result of giving humans a large enough dataset and asking them to summarise it. The summary will always foreground what confirms the story they wanted to tell.

A “calibrated” belief is one where your uncertainty matches the actual evidence. A 47-metric dashboard produces the opposite of calibration. You feel maximally informed because you have maximally many numbers, while your actual ability to predict outcomes has not improved — and may have decreased because you’re now tracking inputs, outputs, and proxies all at once with no hierarchy between them.

The teams I’ve seen make consistently good product decisions share a pattern: they have a small, agreed-upon set of north-star metrics (usually 2–3) and a clearly separated diagnostic layer they only pull when something anomalous happens. Mixpanel, Amplitude, and PostHog all support this architecture natively — a “home” dashboard with the numbers that define good vs. bad, plus drill-down dashboards that are tools for investigation, not ambient monitoring.
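
As a sketch of what that separation can look like in configuration — this is vendor-agnostic, not Mixpanel’s, Amplitude’s, or PostHog’s actual schema, and every name here is illustrative:

```python
# Two-layer architecture: a small ambient "home" surface that defines
# good vs. bad, plus drill-down layers pulled only on an explicit trigger.
DASHBOARDS = {
    "home": {
        "ambient": True,  # scanned daily
        "metrics": ["weekly_active_teams", "net_revenue_retention",
                    "activation_rate"],
    },
    "activation_drilldown": {
        "ambient": False,  # investigation tool, never ambient monitoring
        "metrics": ["signup_to_first_action", "onboarding_step_dropoff",
                    "time_to_value_p50"],
        "entry_condition": "activation_rate down week-over-week",
    },
}
```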

Metric Hierarchy

  • North-star metric: track daily; defines success
  • Leading indicators: track weekly
  • Diagnostic metrics: pull on anomaly only

The legitimate case for more metrics — and when it falls apart

The strongest version of the counter-position is: complex systems require complex measurement. A logistics company optimising thousands of routes genuinely cannot describe system health with three numbers. A healthcare platform monitoring patient outcomes is legally and ethically obligated to track a wide range of signals.

This is true. But it conflates two different activities: monitoring for safety and anomaly detection (which should be comprehensive) and decision-making dashboards (which should be minimal).

A nuclear plant has hundreds of sensors. The control room interface that operators actually use to make decisions shows a handful of critical indicators with clear alert thresholds. The full sensor telemetry is recorded and queryable, but it is not displayed on the decision surface by default. This distinction — between the measurement system and the decision interface — is exactly what most product dashboards collapse.

PagerDuty and similar on-call systems learnt this the hard way: alert fatigue is a patient-safety-level problem in healthcare alerting and a production-reliability-level problem in software ops. The fix is not fewer sensors — it’s fewer alerts on the primary surface, with the rest available on demand. The same architecture applies to product metrics.

Treating it as a product bug

If metric overload were a bug report, it would read: “Users exposed to the feature are slower to make decisions and more likely to defer action. Root cause: information architecture does not match cognitive capacity. Reproduce: add 40+ metrics to any dashboard and observe week-over-week decision velocity.”

The fix is not a data team project. It’s a product leadership conversation with a specific output: what are the 3–5 numbers that, if they moved in the right direction for 90 days, would constitute success? Write those down. Put only those on the primary dashboard. Move everything else to a diagnostics layer with a clear entry condition (“pull this when metric X drops more than 15%”).
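
That entry condition is worth making executable rather than tribal. A minimal sketch; the function name and the 15% threshold mirror the example in the paragraph above, not any particular tool:

```python
def should_pull_diagnostics(current: float, prior: float,
                            drop_threshold: float = 0.15) -> bool:
    """True when the primary metric fell more than the agreed threshold
    week-over-week: the entry condition for opening the diagnostics layer."""
    if prior <= 0:
        return False  # no meaningful baseline to compare against
    return (prior - current) / prior > drop_threshold

# example: activation rate fell from 42% to 33% week-over-week (~21% drop)
assert should_pull_diagnostics(current=0.33, prior=0.42)
```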

This will feel reductive. Someone will argue you’re ignoring important signals. They are right that those signals exist. They are wrong that ambient monitoring of 47 things is the correct response to that.

You cannot manage what you cannot focus on. The dashboard that looks most rigorous is often the one that has most thoroughly distributed responsibility for outcomes across enough metrics that no one person owns any of them. That’s not a measurement culture. That’s a way of making accountability optional while maintaining the aesthetic of seriousness.
