Opinion · 7 min read

The hidden cost of "one more metric": cognitive load and analysis paralysis as measurable waste

TL;DR

Adding metrics past a team's cognitive throughput ceiling creates measurable decision latency and shipping slowdowns — the correct response to a bad decision is rarely another chart, and the ROI on removing three metrics is almost always higher than adding one.

Key takeaways

  • Cap dashboards at 7–10 active metrics per team — beyond that, attention distributes so thinly that no metric is actually monitored, and anomalies go unactioned for days instead of hours.
  • Audit each metric by asking: 'What decision has this changed in the last 90 days?' If the answer is none, it is decoration, not measurement — archive it.
  • When a decision stalls on 'we need more data', diagnose whether the missing data would actually change the decision before commissioning a new metric; in practice it rarely would.
  • Assign ownership to every metric — one named person accountable for acting when it breaches a threshold. Unowned metrics are unread metrics.
  • Treat 'let's add a metric for that' as a budget request, not a free action: it costs engineering time, maintenance overhead, and a slice of everyone's attention budget going forward.

The dashboard nobody reads is costing you real decisions

There is a pattern I have seen in every product team I have worked with that reaches a certain size. The first dashboard gets built. It has five metrics. It is useful. Decisions get made from it.

Then something goes wrong that the dashboard did not surface. So a sixth metric gets added. Then a seventh. Then someone asks about cohorts, so a filter gets added. Then a new initiative launches and it gets its own section. Eighteen months later there is a dashboard with 40 metrics, four sub-dashboards, and a Monday ritual where everyone scrolls through it for forty minutes and leaves the meeting with no clearer sense of what to do next.

This is not a tooling problem. Amplitude, Mixpanel, Looker, Grafana — they all enable the same accumulation. It is a measurement philosophy problem, and it has a measurable cost that teams almost never quantify.

Why we keep adding and rarely remove

Metrics accumulate because adding one is a low-friction, high-status action. You identify a gap in visibility, you instrument it, you feel like you have improved the system. The cost is diffuse and deferred — everyone's attention gets a little thinner, but no single person feels responsible for that.

Removing a metric is the opposite: high friction, low status, and politically charged. Whoever added it may feel criticised. The team that owns it may feel devalued. So metrics compound like technical debt, but with less tooling to make the debt visible.

George Miller's 1956 paper put working memory capacity at 7 ± 2 chunks (Miller, G.A., “The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information”, Psychological Review, 1956). Subsequent cognitive load research (Sweller, Paas, van Merriënboer) has refined this significantly — the number of items you can actively hold in decision-relevant working memory during a complex task is closer to 4. A dashboard with 40 metrics is not a decision-support tool. It is a catalogue.

The consequence is not that people ignore the dashboard. They skim it. Skimming produces a different class of decision error than not measuring at all: it produces false confidence. The team believes they are data-informed because they looked at the numbers. They are not, because they absorbed about 10% of what was there.

Metric Count vs Decision Quality

  • 20+ metrics — dashboard theatre
  • 10–20 metrics — skim and hope
  • 5–9 metrics — scanning, selective attention
  • 1–4 metrics — active memory, decisions made

Cognitive load is waste in the accounting sense

Lean manufacturing distinguishes value-adding work from waste (muda). Waste is any activity that consumes resources without producing customer value. In knowledge work, attention is the resource. Any metric that consumes attention without producing a changed decision is waste by definition.

This is not a metaphor. You can put rough numbers to it.

Assume a product team of eight people reviews a dashboard weekly for 45 minutes. If 60% of the metrics on that dashboard have not influenced a decision in the past quarter — a conservative estimate based on my own audits — then roughly 27 minutes per person per week is spent processing signal that produces no action. Across eight people, that is 3.6 hours per week of waste, or approximately 180 hours per year. At a blended senior IC rate of $120/hr, that is $21,600 per year in attention cost for one team reviewing one dashboard.

That number is not meant to be precise. It is meant to establish that the cost is real and not trivial — and that it scales with team size, dashboard count, and review frequency.
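The back-of-envelope calculation above can be sketched in a few lines, so the assumptions are explicit and easy to swap for your own team's numbers. All figures here are the illustrative ones from the text, not measured values.

```python
# Attention cost of reviewing metrics that change no decisions.
# Every input below is an assumption from the worked example in the text.
team_size = 8             # people in the weekly dashboard review
review_minutes = 45       # length of the weekly review
dead_metric_share = 0.60  # fraction of metrics with no decision in the past quarter
weeks_per_year = 50       # working weeks
hourly_rate = 120         # blended senior IC rate, USD

wasted_min_per_person = review_minutes * dead_metric_share      # 27.0 min/week
wasted_hours_per_week = team_size * wasted_min_per_person / 60  # 3.6 hours/week
wasted_hours_per_year = wasted_hours_per_week * weeks_per_year  # 180.0 hours/year
annual_cost = wasted_hours_per_year * hourly_rate               # 21600.0 USD

print(f"${annual_cost:,.0f} per year")  # → $21,600 per year
```

Changing any one input (say, a 30-minute review or a 40% dead-metric share) scales the result linearly, which is exactly why the cost compounds across teams and dashboards.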

The second cost is decision latency. When a team has 40 metrics and something genuinely anomalous happens, finding the signal in the noise takes longer than when they have 10. I have watched incident post-mortems where the relevant metric was present in the dashboard the whole time, but the team did not notice the drop for three days because it was on slide seven of twelve. That delay had downstream effects.

The legitimate case for more measurement

The strongest objection to this argument is survivorship bias: the metrics you removed might be the ones that would have caught the next incident.

This is true. Removing measurement creates blind spots. The response is not 'therefore add freely' — it is 'therefore choose deliberately and accept that you cannot measure everything.'

There is also a legitimate case for adding metrics during specific phases. Early product development, where you genuinely do not know what matters, warrants broader instrumentation. A new initiative with no prior baseline needs temporary measurement breadth until the signal-to-noise ratio is understood. The error is treating these temporary expansions as permanent infrastructure.

The third objection is compliance and audit requirements. Some industries require tracking specific metrics regardless of decision utility. Finance, healthcare, and infrastructure companies have regulatory obligations that override measurement ROI. These are real constraints. They do not invalidate the principle — they define the floor below which you cannot cut, while the discretionary stack above the floor still accumulates the same way.

What a measurement audit actually looks like

The most effective intervention I have seen is a quarterly metric audit with a single question per metric: 'What decision has this changed in the last 90 days?'

Not 'is this metric interesting' or 'does this metric reflect something real'. The question is specifically about decisions — past, concrete, documented. If the answer is 'none', the metric is a candidate for archiving. Archive means it still gets collected (so you can restore it cheaply if needed), but it no longer appears in the active dashboard.
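The audit question is mechanical enough to run as a script against a metric register. Here is a minimal sketch of that keep-or-archive split, assuming you log the dates on which each metric changed a documented decision; the `Metric` structure and field names are hypothetical, not any particular tool's schema.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class Metric:
    name: str
    owner: str                       # one named person accountable for acting on it
    decision_dates: list = field(default_factory=list)  # dates it changed a decision

def audit(metrics, today, window_days=90):
    """Split metrics by the question: what decision has this changed in 90 days?"""
    cutoff = today - timedelta(days=window_days)
    keep, archive = [], []
    for m in metrics:
        changed_a_decision = any(d >= cutoff for d in m.decision_dates)
        (keep if changed_a_decision else archive).append(m)
    return keep, archive

metrics = [
    Metric("activation_rate", "asha", [date(2024, 5, 1)]),
    Metric("logo_click_rate", "unowned", []),  # no decisions: decoration
]
keep, archive = audit(metrics, today=date(2024, 6, 1))
print([m.name for m in archive])  # → ['logo_click_rate']
```

Archived metrics keep flowing into storage; only their dashboard slot is reclaimed, so restoring one is cheap if a blind spot turns out to matter.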

The process works best when it is owned by a named individual — not a committee — who has explicit authority to archive without full consensus. Consensus-based metric removal almost never happens because the political cost is distributed across too many stakeholders.

For teams building AI-assisted products specifically: LLM evaluation is a domain where metric sprawl reaches its most extreme form fastest. It is trivially easy to instrument 30 different quality signals across faithfulness, relevance, toxicity, length, latency, cost, hallucination rate, citation accuracy, and tone. Many of these matter. Tracking all of them actively at once means none of them drive decisions. Pick the three that most directly reflect user outcomes and let the rest run as background audit checks with threshold alerts, not weekly review obligations.
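The active-versus-background split for LLM evaluation can be as simple as a config plus one threshold check. The metric names and limits below are illustrative placeholders, not recommended values.

```python
# Hypothetical split for an LLM-backed product: three actively reviewed
# metrics, everything else demoted to alert-only background checks.
ACTIVE_METRICS = ["faithfulness", "task_completion", "p95_latency_ms"]

BACKGROUND_THRESHOLDS = {      # no weekly review obligation; alert on breach only
    "toxicity_rate": 0.001,
    "hallucination_rate": 0.02,
    "cost_per_request_usd": 0.05,
}

def breached(observed: dict) -> list:
    """Return names of background metrics that crossed their threshold."""
    return [name for name, limit in BACKGROUND_THRESHOLDS.items()
            if observed.get(name, 0.0) > limit]

print(breached({"toxicity_rate": 0.002, "hallucination_rate": 0.01}))
# → ['toxicity_rate']
```

The design point is that a breach promotes a background metric to attention temporarily; it does not earn the metric a permanent dashboard slot.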

The discipline is the same as any other engineering constraint: you cannot have everything, so choose what you are optimising for and accept the tradeoffs.
