From Story Points to Outcomes: Coaching Teams to Measure What Matters
If you have been coaching agile teams for more than a few years, you have had this conversation: a team proudly reports that their velocity has climbed from 32 to 58 story points per sprint. Everyone in the room nods approvingly. But when you ask which user behaviors changed as a result of what the team shipped in those high-velocity sprints, the room goes quiet. Velocity measures how much a team produces. It says nothing about what that production accomplishes. Teams that optimize for velocity are, in the best case, building valuable things faster. In the worst case, they are getting very good at building the wrong things efficiently.
Coaching teams out of story-point fixation is one of the most persistent challenges in agile practice today. The fixation is understandable: points are unambiguous, comparable across sprints, and satisfy the organizational need for a productivity metric. Outcomes are harder to define, slower to measure, and require the team to take a position on what user behavior they are trying to change — which is a vulnerable and uncomfortable place to be. But this discomfort is exactly where valuable work lives. Jeff Gothelf and Josh Seiden's Lean UX framework gives coaches a practical vocabulary and toolset for making this shift from output to outcome measurement stick.
Retrospectives that include outcome review create a direct feedback loop between what teams build and what users do.
Why Velocity Metrics Create Perverse Incentives
Story points were never intended to be a management metric. They were designed as a team-internal planning tool — a way for the people doing the work to estimate relative complexity among tasks using a shared reference system. The moment velocity becomes a target that managers track and compare, it loses its utility as a planning tool and gains a new, damaging role as a performance proxy. Teams respond to incentives. When velocity is the number being watched, teams learn to write stories that are easy to complete, estimate conservatively to ensure they hit their commitment, and avoid complex discovery work that does not generate point completions.
The result is what some coaches call 'story point theater': the appearance of productivity without the substance of value creation. Teams move fast, burn down their backlogs, and meet their sprint goals — while the product fails to drive the user behaviors that would justify its existence. The antidote is not to abandon agile planning tools but to add a second measurement layer that captures impact, not activity. Lean UX does this through the concept of outcome-based goals: specific, measurable changes in user behavior that the team commits to driving, independent of how many stories they complete to get there.
Introducing 'Who Does What By How Much' in Sprint Planning
The framework Jeff Gothelf and Josh Seiden describe as 'Who Does What By How Much' is a three-part structure for defining behavioral outcomes that teams can commit to and measure. 'Who' specifies the user segment being targeted. 'Does What' defines the precise behavior change being sought. 'By How Much' sets a measurable threshold — a quantitative target that tells the team unambiguously whether the outcome has been achieved. This structure forces specificity that vague goal-setting frameworks avoid. A goal like 'improve the onboarding experience' is not a Who-Does-What-By-How-Much outcome. A goal like 'first-time users who complete the setup wizard in the first session will increase from 41% to 60% within six weeks' is.
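The three parts of the framework can be expressed as plain data, which makes the required specificity concrete. The sketch below models the onboarding example from the text; the class and method names are illustrative, not part of Gothelf and Seiden's framework.

```python
from dataclasses import dataclass

@dataclass
class OutcomeHypothesis:
    """One 'Who Does What By How Much' commitment, expressed as data."""
    who: str           # the user segment being targeted
    does_what: str     # the behavior change being sought
    baseline: float    # current rate (e.g. 0.41 = 41%)
    target: float      # the 'by how much' threshold
    horizon_weeks: int # how long the team has to move the metric

    def is_achieved(self, observed: float) -> bool:
        """True once the observed rate meets or exceeds the target."""
        return observed >= self.target

    def progress(self, observed: float) -> float:
        """Fraction of the baseline-to-target gap closed so far."""
        return (observed - self.baseline) / (self.target - self.baseline)

# The example from the text: 41% -> 60% within six weeks.
onboarding = OutcomeHypothesis(
    who="first-time users",
    does_what="complete the setup wizard in the first session",
    baseline=0.41,
    target=0.60,
    horizon_weeks=6,
)

print(onboarding.is_achieved(0.52))         # False: 52% is short of 60%
print(round(onboarding.progress(0.52), 2))  # 0.58: over half the gap closed
```

Notice that a vague goal like 'improve the onboarding experience' cannot even be instantiated here: the constructor forces the team to supply a segment, a behavior, a baseline, and a threshold.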
As a coach, introducing this structure in sprint planning has immediate diagnostic value. Teams that cannot answer all three parts of the framework for a given sprint goal are, by definition, building features without a hypothesis connecting their work to value. The inability to specify 'by how much' is particularly revealing — it almost always signals that the team has not agreed on what success looks like, which means they have no basis for evaluating whether what they shipped was worth building. Making this explicit in planning, rather than discovering it in retrospective when the sprint is already done, is where coaching intervention has its highest leverage.
Pairing delivery metrics with behavioral data transforms velocity from a proxy into context.
Running an Outcome-Based Retrospective
Most sprint retrospectives focus on process: what went well, what could improve, what to try next. These are valuable conversations, but they leave the most important question unasked: Did what we shipped this sprint change user behavior? Adding an outcome review segment to your retrospective format surfaces this question in a structured way. Before the retrospective begins, pull the behavioral metrics associated with the sprint's committed outcomes. Share them with the team at the start of the session. Then structure the first segment of the retrospective around three outcome-focused questions: Which of our committed outcomes showed measurable movement? Which showed no movement? And for the ones that moved, how confident are we that our work caused it?
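The first two questions amount to a simple triage of the sprint's committed outcomes. A minimal sketch of that triage, assuming a fixed noise floor for 'measurable movement' (in practice you would use a proper significance test rather than a hand-picked delta):

```python
def classify_outcomes(results, min_movement=0.02):
    """Split committed outcomes into the two buckets the outcome-review
    segment opens with: measurable movement vs. no movement.

    `results` maps an outcome name to (baseline, observed) rates.
    `min_movement` is an assumed noise floor below which a change is
    treated as flat -- a placeholder for a real statistical test.
    """
    moved, flat = [], []
    for name, (baseline, observed) in results.items():
        if abs(observed - baseline) >= min_movement:
            moved.append(name)
        else:
            flat.append(name)
    return moved, flat

# Hypothetical metrics pulled before the retrospective begins.
sprint_results = {
    "setup wizard completion": (0.41, 0.52),
    "weekly report exports":   (0.18, 0.19),
}
moved, flat = classify_outcomes(sprint_results)
print(moved)  # ['setup wizard completion']
print(flat)   # ['weekly report exports']
```

Preparing this split before the session keeps the retrospective focused on interpretation rather than on arguing about what the numbers are.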
This last question is the hardest and most important. Correlation between a feature release and a metric movement is not causation. Teams that celebrate metric improvements without interrogating causality will eventually make bad decisions based on false confidence. Teaching teams to ask 'how do we know we caused this?' — and to design their experiments with enough control that they can answer the question honestly — is some of the most valuable coaching work available in an agile context. It connects the sprint cycle to the scientific method in a way that story-point velocity never can.
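One concrete way to give a team 'enough control to answer the question honestly' is a controlled rollout compared with a standard two-proportion z-test. The sketch below uses only the pooled-proportion formula; the scenario and numbers are hypothetical.

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-statistic: did group A's conversion rate
    differ from group B's by more than chance plausibly allows?"""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical rollout: new setup wizard (treatment) vs. old (control).
z = two_proportion_z(312, 600, 246, 600)  # 52% vs. 41% completion
print(round(z, 2))  # |z| > 1.96 is significant at the 5% level
```

Even this minimal test changes the conversation: a team that shipped to everyone at once and watched the dashboard move simply cannot compute it, which makes the gap in their experimental design visible.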
Coaching the Transition: From Resistance to Ownership
Most teams initially resist outcome-based measurement. The resistance takes several forms. Some teams argue that they cannot control outcomes — that user behavior is influenced by too many external factors to be a fair performance measure. Others argue that outcomes are too slow to measure within a sprint cycle and therefore cannot inform sprint-level decisions. A third group, usually the most politically experienced, recognizes that committing to behavioral outcomes requires the team to take a position that managers can evaluate, which feels riskier than committing to story deliverables that are entirely within the team's control.
Each of these objections deserves a genuine coaching response rather than dismissal. The control objection is addressed by framing outcomes as hypotheses rather than guarantees — the team is committing to running an experiment designed to drive a behavior change, not to producing the behavior change through will alone. The speed objection is addressed by choosing shorter-cycle proxies: if the full behavioral outcome takes six weeks to measure, identify a leading indicator that can be observed within two weeks. The political risk objection is the trickiest because it reflects a real organizational dynamic. Teams that commit to outcomes and miss them need psychological safety — and that safety is a leadership challenge that coaching alone cannot solve. Part of the coach's role is to create the conditions at the organizational level where outcome commitment is rewarded for its learning value, not punished for its honest accounting of failure.
The Bottom Line
The shift from story-point velocity to outcome-based measurement is not a metrics change. It is a culture change. It requires teams to become comfortable with uncertainty, take positions on what success means, and accept that building the right thing slowly is more valuable than building the wrong thing fast. As an agile coach, your leverage is in making the discomfort of this transition visible and productive — turning the question 'Did we change user behavior?' from a threatening evaluation into a team's most useful learning mechanism. When teams internalize that question, they stop needing you to ask it.
Related Posts from Sense & Respond Learning
Why 'Velocity' Is a Vanity Metric (And What to Measure Instead)
Writing Better User Stories: Why You Need 'Hypothesis Statements' Instead
The Two-Week Learning Cycle: Running Discovery and Delivery in Parallel
Fixing Broken Standups: How to Run a Daily Sync That Actually Surfaces Blockers
Further Reading & External Resources
Who Does What By How Much? — Jeff Gothelf & Josh Seiden — The source framework for outcome-based team measurement
Lean UX — Gothelf & Seiden (O'Reilly) — The definitive guide to integrating UX into agile development
Want to go deeper? This post is part of the Sense & Respond Learning resource library — practical frameworks for product managers, transformation leads and executives who want to lead with outcomes, not outputs.
Explore the full library at https://www.senseandrespond.co/blog