
The Strategy-Execution Gap in Growth Teams: Why OKRs Fail and How Input Metrics Fix Them

Your Q1 OKR was 'increase activation rate by 15%.' It's March and you're at 3%. The problem isn't execution — it's that activation rate is an output. You can't execute on an output. Input metrics bridge the gap between strategy and daily action.

Murat Ova

TL;DR: Roughly 70% of OKRs are abandoned by end of Q2 because most key results are output metrics that teams influence but do not directly control: you cannot execute on "increase activation rate by 15%." Decomposing output metrics into controllable input metrics (actions a team can take on any given Tuesday) bridges the strategy-execution gap and transforms OKRs from aspirational scorecards into operational playbooks.


The Seventy Percent Failure Rate

There is a statistic that circulates through management consulting circles: roughly 70% of OKRs are abandoned, deprioritized, or quietly forgotten by the end of Q2. The number is difficult to pin down precisely because failure in goal-setting frameworks is not the kind of thing organizations measure rigorously. But the pattern is consistent across industries, company sizes, and levels of leadership commitment.

Quantive (formerly Gtmhub), which processes OKR data across thousands of organizations, reports that average OKR attainment rates hover around 60-70% of target. That figure sounds respectable until you examine what it actually means. Most organizations set OKRs with the implicit understanding that hitting 70% is "good." Which means the targets are calibrated to be partially missed. Which means the planning process has already internalized failure as a feature.

The real problem is not the attainment rate. The real problem is what happens between the day an OKR is written and the day it is reviewed. In that gap — usually 90 days — something breaks. The objective that seemed clear in a planning offsite becomes abstract in daily work. The key result that looked measurable turns out to be measurable only in retrospect. The growth team that committed to "increase activation rate by 15%" discovers, around week six, that they do not actually know what to do on any given Tuesday morning that connects to that number. Quantifying product-market fit faces the same challenge — the output metric (retention, NPS, usage frequency) is observable but not directly actionable without decomposition into the inputs that drive it.

This is the strategy-execution gap. And it is not a discipline problem, a talent problem, or a prioritization problem. It is a measurement architecture problem.

The Core Diagnosis

OKRs fail not because teams lack ambition or discipline. They fail because most key results are output metrics — outcomes that a team influences but does not directly control. You cannot execute on an output. You can only execute on inputs, and most OKR frameworks never decompose outputs into the controllable inputs that drive them.

The Anatomy of an Abandoned OKR

Consider a growth team at a mid-stage SaaS company. Q1 planning produces the following OKR:

Objective: Accelerate product-led growth
Key Result 1: Increase free-to-paid conversion rate from 4.2% to 5.5%
Key Result 2: Reduce time-to-first-value from 8 days to 4 days
Key Result 3: Increase monthly active users by 20%

This looks reasonable. The objective is directional. The key results are quantified. Everyone leaves the planning meeting feeling aligned. And then the following happens:

Week 1-2. The team reviews the metrics. Free-to-paid conversion is 4.2%. Time-to-first-value is 8.1 days. Monthly active users are flat. Fine — the quarter just started. They brainstorm initiatives: onboarding flow improvements, activation email sequences, a new feature tutorial.

Week 3-4. Work begins on the onboarding redesign. The activation email sequence is drafted. Meanwhile, the metrics have not moved. This is expected — interventions take time to propagate.

Week 5-6. The onboarding redesign ships. The email sequence launches. The team checks the numbers. Free-to-paid conversion has moved from 4.2% to 4.3%. Time-to-first-value dropped to 7.6 days. Monthly active users grew 2%. Progress, but incremental. At this rate, they will not hit any of their key results.

Week 7-9. A sense of drift sets in. The OKR review meeting becomes a status update about projects rather than a discussion about outcomes. The team is busy — shipping features, running experiments, writing content. But the connection between daily work and quarterly targets feels tenuous. The key results start to feel like weather: something you observe but do not control.

Week 10-12. The quarter ends. Free-to-paid conversion is 4.5%. Time-to-first-value is 6.8 days. Monthly active users grew 9%. Against a standard linear OKR grading rubric, these score roughly 0.2, 0.3, and 0.45 respectively. The team grades the OKR as "partially achieved" and moves on to Q2 planning, where they will set equally ambitious output targets and begin the cycle again.
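
For readers who want the grading rule made explicit: the standard linear rubric scores each key result as the fraction of the baseline-to-target gap that was actually closed. A minimal sketch, using the numbers from this example:

```python
def okr_grade(baseline: float, target: float, actual: float) -> float:
    """Linear OKR grade: fraction of the baseline-to-target gap closed.

    Works for "increase" and "decrease" key results alike, because the
    sign of (target - baseline) cancels out. Clamped to [0, 1].
    """
    progress = (actual - baseline) / (target - baseline)
    return max(0.0, min(1.0, progress))

# The three key results from the quarter above.
print(round(okr_grade(4.2, 5.5, 4.5), 2))   # free-to-paid conversion -> 0.23
print(round(okr_grade(8.0, 4.0, 6.8), 2))   # time-to-first-value     -> 0.3
print(round(okr_grade(0.0, 20.0, 9.0), 2))  # MAU growth (% points)   -> 0.45
```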

[Chart: Typical OKR Attainment Trajectory — Growth Team Key Result]

The chart tells a story that is familiar to anyone who has worked in a growth team. The target line represents where the team needs to be. The linear pace line represents where they would be if progress were evenly distributed across the quarter. The actual line shows what happens in practice: a slow, halting crawl that falls further behind each week. The gap between aspiration and reality widens, not because the team stops working, but because the metric they are tracking does not respond predictably to the work they are doing.

This is not a failure of effort. It is a failure of metric selection.

Output Metrics vs. Input Metrics: The Critical Distinction

The distinction between output metrics and input metrics is the single most important concept in bridging the strategy-execution gap. It is also the concept most consistently absent from OKR training, goal-setting workshops, and planning templates.

Output metrics measure results that a team influences but does not directly control. Conversion rate. Revenue. Monthly active users. Churn rate. Activation rate. Net promoter score. These are the numbers that executives care about, that boards ask about, and that OKRs are almost always built around.

Input metrics measure activities that a team directly controls. The number of experiments launched per week. The number of onboarding emails tested. The number of user interviews conducted. The number of high-intent SEO articles published. The number of A/B tests reaching statistical significance per month. The number of activation touchpoints redesigned.

The relationship between the two is causal but not deterministic. Publishing 12 SEO articles targeting high-intent keywords does not guarantee a 20% traffic increase. But it is an action a team can execute, measure, and adjust. "Increase organic traffic by 20%" is something a team can only watch and hope for.

Output Metrics vs. Input Metrics: Comparative Properties

| Property | Output Metrics | Input Metrics |
|---|---|---|
| Controllability | Low — influenced by team actions and external factors | High — directly determined by team actions |
| Feedback Speed | Slow — weeks to months to see movement | Fast — measurable daily or weekly |
| Actionability | Low — does not tell you what to do today | High — directly translates to tasks and priorities |
| Strategic Alignment | High — directly represents business value | Moderate — requires a model connecting inputs to outputs |
| Motivational Effect | Demoralizing when stagnant despite effort | Energizing because effort creates visible progress |
| Risk of Gaming | Low — hard to manipulate real business outcomes | Moderate to high — can be gamed if disconnected from outputs |

The trouble with output-only OKRs is not that the outputs are wrong. The outputs are exactly what the business needs to improve. The trouble is that they are useless as operational instruments. Telling a growth team to "increase conversion rate by 30%" is like telling a pilot to "increase altitude by 10,000 feet" without mentioning that the mechanism for doing so involves the throttle, the control column, and the flaps. The altitude is the result. The cockpit instruments — the input metrics — are what you actually operate.

Caution

"Increase revenue by 20%" is not a plan. It is a wish expressed in numerical form. The teams that consistently hit growth targets are not the ones with the most ambitious OKRs. They are the ones that have decomposed their output targets into input metrics they can control, measure, and optimize on a weekly or daily cycle.

Andy Grove, who developed the OKR framework at Intel, understood this intuitively. His original formulation paired ambitious objectives with key results that were often activity-based, not purely outcome-based. Somewhere in the translation from Intel's semiconductor operations to Silicon Valley's adoption of the framework, the emphasis shifted from "what can we measure and control?" to "what outcome do we want?" The framework lost its operational teeth.

The Input Metric Tree: A Decomposition Framework

The Input Metric Tree is a framework for converting an output metric into a hierarchy of controllable input metrics. It is part causal model, part operational plan, and part diagnostic tool. The logic is straightforward: start with the output you care about and recursively ask, "what actions directly influence this number, and which of those actions do we control?"

Consider a growth team whose output target is to increase monthly recurring revenue (MRR) by $200K.

At the first level of decomposition, MRR growth can be broken into three components:

$$\Delta MRR = MRR_{new} + MRR_{expansion} - MRR_{churn}$$

Each of these can be further decomposed. New customer revenue is a function of qualified leads multiplied by close rate multiplied by average deal size: formally, $MRR_{new} = L_{qualified} \times r_{close} \times \overline{ACV}$. Qualified leads are in turn a function of website traffic, visitor-to-signup conversion, and signup-to-qualified-lead rate:

$$L_{qualified} = \left(\sum_{c \in \text{channels}} V_c\right) \times r_{signup} \times r_{SQL}$$

Website traffic is a function of organic search visitors, paid acquisition visitors, referral visitors, and direct visitors.

Now we are approaching input metrics. Organic search visitors are influenced by the number of SEO-optimized pages published, the number of backlinks acquired, and the technical health of the site. These are controllable. A team can commit to publishing 12 high-intent keyword articles per month, conducting 20 backlink outreach campaigns per week, and maintaining a Core Web Vitals score above 90.
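
To make the decomposition concrete, here it is as runnable arithmetic. All figures below are hypothetical and exist only to show how the leaf-level inputs roll up to the output:

```python
# Hypothetical monthly figures for the MRR decomposition above.
visitors_by_channel = {"organic": 40_000, "paid": 15_000, "referral": 5_000, "direct": 10_000}
r_signup = 0.03      # visitor -> signup
r_sql = 0.20         # signup  -> qualified lead
r_close = 0.25       # qualified lead -> customer
acv_monthly = 250.0  # average contract value per month

# L_qualified = (sum of channel visitors) * r_signup * r_SQL
qualified_leads = sum(visitors_by_channel.values()) * r_signup * r_sql
# MRR_new = L_qualified * r_close * mean ACV
mrr_new = qualified_leads * r_close * acv_monthly

mrr_expansion = 30_000.0  # hypothetical
mrr_churn = 22_000.0      # hypothetical
delta_mrr = mrr_new + mrr_expansion - mrr_churn
print(f"qualified leads: {qualified_leads:.0f}, new MRR: ${mrr_new:,.0f}, ΔMRR: ${delta_mrr:,.0f}")
```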

[Chart: Input Metric Tree — MRR Growth Decomposition]

The chart illustrates a principle that is central to the Input Metric Tree framework: controllability decreases as you move from inputs to outputs. The metrics at the bottom of the tree — the ones closest to daily action — are the ones a team can most directly influence. The metrics at the top — the business outcomes — are the emergent result of all those inputs interacting with external conditions the team does not control.

The decomposition exercise forces three things that output-only OKRs do not:

Causal reasoning. You cannot build an Input Metric Tree without articulating your theory of how inputs produce outputs. A metric ontology formalizes this theory into a queryable structure, ensuring that the relationships between inputs, leading indicators, and outcomes are documented and testable across the organization. If your tree says that publishing more SEO content leads to more traffic leads to more signups leads to more revenue, you have a testable hypothesis. If the links break — if more traffic does not produce more signups — you know where the model is wrong and can redirect effort.

Prioritization clarity. Once you see the full tree, you can identify which branches offer the highest leverage. If your close rate is 40% and your signup-to-qualified-lead rate is 3%, the constraint is probably upstream of the sales team. The tree reveals where the bottleneck is.

Operational translation. Each leaf node in the tree is something a team member can do this week. The tree converts a quarterly ambition into a weekly work plan.
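
A tree like this is simple enough to encode directly. The sketch below is an illustrative data structure, not a prescribed tool: each node records whether the team can move it directly, and the leaves fall out as the weekly work plan.

```python
from dataclasses import dataclass, field

@dataclass
class MetricNode:
    """One node in an Input Metric Tree. Children are the metrics
    hypothesized to drive this one; leaves should be controllable inputs."""
    name: str
    controllable: bool = False          # can the team move this directly?
    children: list["MetricNode"] = field(default_factory=list)

    def leaves(self) -> list["MetricNode"]:
        """The leaf nodes: the weekly work plan implied by the tree."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]

# A fragment of the MRR tree from the decomposition above.
tree = MetricNode("ΔMRR", children=[
    MetricNode("New MRR", children=[
        MetricNode("Qualified leads", children=[
            MetricNode("SEO articles published (high-intent)", controllable=True),
            MetricNode("Backlink outreach campaigns sent", controllable=True),
        ]),
        MetricNode("Close rate"),
    ]),
])

for leaf in tree.leaves():
    status = "input" if leaf.controllable else "needs decomposition"
    print(f"{leaf.name}: {status}")
```

A leaf that is not controllable is a signal that the decomposition is not finished: "Close rate" above still needs to be broken down into the activities that drive it.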

Input Metric Tree Example — Increasing Activation Rate

| Level | Metric | Type | Owner | Cadence |
|---|---|---|---|---|
| L0 (Output) | Activation rate (trial users reaching aha moment) | Output | Growth Lead | Monthly review |
| L1 | Onboarding completion rate | Intermediate | Product | Bi-weekly |
| L1 | Time-to-first-value | Intermediate | Product | Bi-weekly |
| L1 | Activation email open rate | Intermediate | Growth Marketing | Weekly |
| L2 (Input) | Number of onboarding A/B tests shipped | Input | Product | Weekly |
| L2 (Input) | Number of activation email variants tested | Input | Growth Marketing | Weekly |
| L2 (Input) | Number of user interviews conducted on drop-off points | Input | Research | Weekly |
| L2 (Input) | Number of in-app tooltip experiments launched | Input | Product | Weekly |
| L2 (Input) | Number of friction audit sessions completed | Input | Design | Bi-weekly |

Leading Indicators and Lagging Indicators

The input-output distinction maps onto a related but subtly different framework: leading versus lagging indicators. Lagging indicators tell you what already happened. Leading indicators tell you what is likely to happen.

Revenue is the quintessential lagging indicator. By the time you see a revenue decline, the causes occurred weeks or months earlier — a drop in lead quality, a product regression, a competitive shift. Cohort-based unit economics provides the analytical framework for connecting these lagging revenue signals back to the acquisition and activation inputs that caused them, with the time dimension preserved. Lagging indicators are useful for accountability and reporting. They are nearly useless for steering.

Leading indicators predict future changes in lagging indicators. A decline in qualified pipeline this month predicts a revenue shortfall next quarter. A drop in onboarding completion rate this week predicts lower activation metrics next month. An increase in customer support ticket volume today predicts elevated churn in sixty days.

The relationship between input metrics and leading indicators is close but not identical. All input metrics are leading indicators, but not all leading indicators are input metrics. Customer sentiment scores, for instance, are a leading indicator of churn — but they are not an input metric because a team does not directly control customer sentiment. What a team controls are the inputs that influence sentiment: response time, resolution rate, product reliability, feature delivery cadence.

The practical consequence is this: a growth team needs three measurement layers operating simultaneously.

Layer 1: Input metrics. Measured daily or weekly. Directly controllable. This is what individual contributors track. "How many experiments did I ship this week?"

Layer 2: Leading indicators. Measured weekly or bi-weekly. These are the intermediate metrics that input activities are expected to influence. "Is the onboarding completion rate moving in the right direction?"

Layer 3: Lagging outcomes. Measured monthly or quarterly. These are the output metrics that executives and board members care about. "Did MRR grow by $200K?"

The Three-Layer Dashboard

Build your growth team's operational dashboard with all three layers visible. Input metrics at the top — these are the dials you turn. Leading indicators in the middle — these are the gauges that tell you whether turning those dials is working. Lagging outcomes at the bottom — these are the destination. If you can only see one layer, you are either flying blind (inputs only), detecting problems too late (lagging only), or aware of trouble without knowing what to do about it (leading only).
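
In practice the three layers can live in a single dashboard definition. The following sketch uses hypothetical metric names; the point is the layering and the cadences, not the specific metrics:

```python
# A minimal three-layer dashboard definition (metric names are illustrative).
DASHBOARD = {
    "inputs": [        # dials the team turns; reviewed weekly
        {"metric": "onboarding_ab_tests_shipped", "cadence": "weekly", "target": 2},
        {"metric": "activation_email_variants_tested", "cadence": "weekly", "target": 3},
    ],
    "leading": [       # gauges that should respond to the inputs
        {"metric": "onboarding_completion_rate", "cadence": "weekly"},
        {"metric": "time_to_first_value_days", "cadence": "biweekly"},
    ],
    "lagging": [       # the destination; reviewed monthly or quarterly
        {"metric": "activation_rate", "cadence": "monthly"},
        {"metric": "mrr", "cadence": "monthly"},
    ],
}

def render(dashboard: dict) -> None:
    """Print the dashboard top-down: inputs, then leading, then lagging."""
    for layer in ("inputs", "leading", "lagging"):
        print(layer.upper())
        for m in dashboard[layer]:
            target = f" (target {m['target']}/wk)" if "target" in m else ""
            print(f"  {m['metric']} [{m['cadence']}]{target}")

render(DASHBOARD)
```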

Amazon's Input Metric Obsession

No company illustrates the power of input metric thinking more clearly than Amazon. The company's operational culture is built on a principle that Jeff Bezos articulated repeatedly in shareholder letters: focus on controllable input metrics, and the output metrics will follow.

Amazon's leadership meetings are not organized around revenue or profit. They are organized around input metrics that the company believes drive revenue and profit as downstream consequences. The distinction is not semantic. It shapes how the company allocates attention, how it evaluates leaders, and how it makes investment decisions.

Colin Bryar and Bill Carr, former Amazon executives, documented this in their account of Amazon's internal practices. The company's weekly business review — the WBR — tracks hundreds of metrics, the vast majority of which are input metrics. Delivery speed. Selection breadth. Number of items available for same-day delivery. Page load time. Number of defects per million customer interactions. These are not outcomes. They are activities and conditions that Amazon controls and that, according to its causal model, produce the customer experience that generates revenue.

The most illustrative example is how Amazon approached its third-party marketplace. The output metric was gross merchandise volume (GMV) from third-party sellers. But Amazon did not set GMV targets and hope for the best. It decomposed GMV into input metrics: the number of active third-party sellers, the number of listed items per seller, the conversion rate of third-party listings, the fulfillment speed of third-party orders, the defect rate of third-party transactions.

Then it went further. The number of active third-party sellers is itself an output. What controls it? The number of outreach emails sent to prospective sellers. The conversion rate of the seller onboarding funnel. The time from application to first listing. The revenue a seller earns in their first 90 days. Each of these is more controllable than the last, and Amazon tracked them all.

[Chart: Amazon's Input-Output Metric Hierarchy (Illustrative)]

The result was that Amazon's marketplace team could walk into a weekly review and say: "GMV grew 3% below target, but we know why. Seller onboarding funnel completion dropped from 62% to 54% because of a regression in the listing tool. We have a fix shipping Thursday. The leading indicators suggest GMV will recover within two weeks." That level of diagnostic precision is impossible when your only metric is the output.

Bezos called this "working backwards from the customer experience." A more precise description is: working backwards from the output metric through a causal chain of intermediate and input metrics until you reach something a single team can control and improve in a single sprint.

The lesson is not that Amazon is uniquely brilliant. The lesson is that the company institutionalized a measurement architecture that most organizations leave to chance. The architecture has a name — the WBR — and a structure — the input metric hierarchy — and a cadence — weekly reviews focused on input metrics, not output aspirations.

Spotify's Squad Model and Its Measurement Problem

Spotify's squad model, popularized by Henrik Kniberg and Anders Ivarsson's 2012 whitepaper and widely adopted across the technology industry, offers a contrasting case study. The model is a framework for organizational design: autonomous squads own specific areas of the product, align through chapters and guilds, and are empowered to choose their own methods.

The squad model's strengths are real. Autonomy increases ownership. Small teams move faster. Decentralized decision-making reduces coordination overhead. But the model has a measurement gap that Spotify itself has acknowledged and that many companies adopting the model have amplified.

The gap is this: squads are typically given ownership of an output metric — a product area's engagement rate, a feature's adoption metric, a conversion funnel stage. The squad has autonomy over how they pursue the metric. But the metric itself is often a lagging output that the squad influences through a causal chain it may not fully understand or control.

A squad owning "podcast engagement" at Spotify might track minutes listened, unique listeners, retention rate, and skip rate. These are all outputs. The squad can ship features, adjust algorithms, change the UI — and then wait weeks to see if the output moved. If it did not, was the intervention wrong? Was the timeframe too short? Was an external factor overwhelming the signal? The output metric alone cannot answer these questions.

Spotify recognized this problem. In subsequent iterations of the model, the company placed greater emphasis on experimentation velocity — an input metric — as a proxy for progress. The reasoning: if a squad is running well-designed experiments at a high cadence, the probability of finding interventions that move the output metric increases. You cannot control whether any single experiment succeeds. You can control how many experiments you run.

This is an input metric in disguise. The output is podcast engagement. The input is experiments per sprint. The theory is that more experiments, designed against a hypothesis about what drives engagement, will eventually shift the output. The squad's weekly standup stops being about "did the engagement number go up?" and starts being about "did we ship three experiments this sprint and collect data on the last three?"

Insight

Spotify's evolution from pure output ownership to experiment-velocity tracking illustrates a broader principle: organizational autonomy without input metric discipline produces teams that feel empowered but cannot diagnose why their numbers are or are not moving. Autonomy tells a team they can choose the path. Input metrics tell them whether they are walking.

The adoption of the squad model across the industry often replicated the organizational structure without replicating the measurement evolution. Companies gave squads ownership of output metrics, gave them autonomy, and then expressed frustration when the outputs did not move. The missing piece was not the structure. It was the measurement architecture that converts output ownership into input discipline.

Building an Input-Output Metric Map

The Input-Output Metric Map is a practical artifact that every growth team should build and maintain. It is a document — a spreadsheet, a diagram, a Notion page, whatever your team uses — that explicitly connects output targets to the input metrics that are hypothesized to drive them.

The construction process has five steps:

Step 1: Identify the output metrics. These are your OKR key results, your north star metrics, your board-level KPIs. List them. For a growth team, this might be: activation rate, free-to-paid conversion, monthly active users, net revenue retention.

Step 2: Decompose each output into its mathematical components. Revenue equals users multiplied by average revenue per user. Activation rate equals users who complete the activation sequence divided by total new users. Conversion rate equals conversions divided by visitors. Break the output into its constituent parts.

Step 3: For each component, identify the leading indicators. What moves before this component moves? If signup rate changes, what preceded it? A change in landing page traffic composition? A change in the signup flow? A change in ad targeting?

Step 4: For each leading indicator, identify the controllable inputs. What can the team do this week that is expected to influence this leading indicator? These are your input metrics.

Step 5: Assign ownership, cadence, and targets to each input metric. Every input metric needs an owner (a person, not a team), a measurement cadence (daily or weekly for inputs), and a target that is calibrated to the output target.
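
The map itself is just structured rows plus a couple of validation rules from Steps 4 and 5. A minimal sketch with hypothetical rows; note how it flags an input measured monthly or owned by a team rather than a person:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MapRow:
    output_target: str
    leading_indicator: str
    input_metric: str
    owner: str          # a person, not a team (Step 5)
    cadence: str        # daily or weekly for inputs (Step 5)
    weekly_target: int

def validate(rows: list[MapRow]) -> list[str]:
    """Flag rows that violate the construction rules in Steps 4 and 5."""
    problems = []
    for r in rows:
        if r.cadence not in ("daily", "weekly"):
            problems.append(f"{r.input_metric}: input cadence must be daily or weekly")
        if not r.owner or " team" in r.owner.lower():
            problems.append(f"{r.input_metric}: owner must be a named person")
    return problems

rows = [
    MapRow("Activation rate: 4.2% -> 5.5%", "Onboarding completion rate",
           "Onboarding A/B tests shipped", "Sarah", "weekly", 2),
    MapRow("Organic traffic: +20%", "Pages ranking top-10",
           "SEO articles published", "Growth team", "monthly", 3),  # two violations
]
print(validate(rows))
```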

Input-Output Metric Map — Growth Team Example

| Output Target | Leading Indicator | Input Metric | Owner | Weekly Target |
|---|---|---|---|---|
| Activation rate: 4.2% → 5.5% | Onboarding completion rate | Onboarding A/B tests shipped | PM — Sarah | 2 per week |
| Activation rate: 4.2% → 5.5% | Time-to-first-value | User interview sessions on drop-off | Researcher — James | 4 per week |
| Conversion rate: 2.1% → 2.8% | Trial engagement score | Activation email variants tested | Growth — Maya | 3 per week |
| Conversion rate: 2.1% → 2.8% | Feature discovery rate | In-app nudge experiments launched | PM — Alex | 2 per week |
| Organic traffic: +20% | Indexed pages ranking top-10 | SEO articles published (high-intent keywords) | Content — Priya | 3 per week |
| Organic traffic: +20% | Domain authority | Backlink outreach campaigns sent | SEO — Daniel | 15 per week |
| Net revenue retention: 105% → 115% | Feature adoption depth | Customer health check calls completed | CS — Jordan | 20 per week |
| Net revenue retention: 105% → 115% | Expansion opportunity pipeline | Upsell demos scheduled | Sales — Kim | 8 per week |

The map serves three functions. First, it is a planning tool: during quarterly planning, the team can trace from output targets backward to the input commitments required. Second, it is a diagnostic tool: when an output metric stalls, the team can examine the corresponding input metrics and leading indicators to identify where the causal chain is breaking. Third, it is an accountability tool: each input metric has an owner, a target, and a cadence. The weekly review becomes a conversation about whether the team is executing on its controllable inputs, not a passive observation of whether outputs happened to move.

Weekly Execution Rhythms Tied to Inputs

The cadence at which a team reviews metrics matters as much as the metrics themselves. Most OKR implementations default to quarterly reviews with occasional monthly check-ins. This is too slow for input metric management. By the time a quarterly review reveals that an output is off track, ten weeks of input data — data that could have signaled the problem in week three — has passed unexamined.

The rhythm that works for growth teams operating on input metrics is a weekly cycle with four components:

Monday: Input metric review. The team examines the previous week's input metrics against targets. Did we ship three onboarding experiments? Did we publish three SEO articles? Did we complete 20 customer health checks? These are binary questions with clear answers. There is no ambiguity about whether an input was delivered (a minimal review sketch follows this list).

Monday: Leading indicator scan. Alongside the input review, the team checks the leading indicators. Are the intermediate metrics moving in the expected direction? If onboarding completion rate has not budged despite five weeks of onboarding experiments, the causal hypothesis may need revision.

Wednesday: Mid-week check. A brief (fifteen-minute) sync to identify blockers to hitting the current week's input targets. This is operational, not strategic. "The backlink outreach campaign is blocked because the prospect list is not ready. Who can unblock this by Thursday?"

Friday: Learning capture. What did this week's inputs teach us? Which experiments produced signal? Which content pieces gained traction? Which customer calls revealed new patterns? This is where the team updates its causal model — the connections between inputs and outputs — based on real data.
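
The Monday input review in particular is trivially automatable. A minimal sketch with hypothetical numbers; each check is binary, which is the point:

```python
# Monday input review: binary pass/fail against last week's targets.
# Figures are hypothetical; in practice they come from your analytics store.
targets = {
    "onboarding_ab_tests_shipped": 2,
    "seo_articles_published": 3,
    "customer_health_checks": 20,
}
actuals = {
    "onboarding_ab_tests_shipped": 2,
    "seo_articles_published": 1,
    "customer_health_checks": 23,
}

for metric, target in targets.items():
    actual = actuals.get(metric, 0)
    status = "HIT " if actual >= target else "MISS"
    print(f"{status} {metric}: {actual}/{target}")
```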

[Chart: Weekly Input Target Attainment — Growth Team (8-Week Period)]

The chart shows something that output-only dashboards never reveal: the team's execution consistency. Over eight weeks, the team hit its input targets with reasonable reliability. Some weeks were below target on individual metrics, but the overall pattern is one of sustained activity. This sustained activity is what produces compound effects over a quarter. No single week's inputs move the output metric. But twelve weeks of consistent input delivery — twenty-four onboarding experiments, thirty-six SEO articles, thirty-six email variants, forty-eight user interviews — will move the needle in ways that are invisible in any single weekly measurement.

The weekly rhythm also creates a faster feedback loop for adjusting strategy. If the team discovers in week four that its onboarding experiments are producing no signal while its SEO articles are driving significant traffic gains, it can reallocate effort. In a quarterly OKR review, that realization comes too late to act on. In a weekly input review, it comes in time to redirect the remaining eight weeks of the quarter.

When Input Metrics Become Goodhart's Law Victims

Charles Goodhart, an economist at the Bank of England, observed in 1975 that any observed statistical regularity tends to collapse once pressure is placed upon it for control purposes. The popular paraphrase, due to anthropologist Marilyn Strathern, is that when a measure becomes a target, it ceases to be a good measure. The principle, now known as Goodhart's Law, is the most significant risk in an input-metric-driven system.

The risk is real and it takes predictable forms.

Volume without quality. A team targeted on "number of experiments shipped per week" begins shipping trivial experiments — changing button colors, tweaking copy by a single word — to hit the target. The input metric is satisfied. The output does not move because the experiments are not testing meaningful hypotheses.

Activity without learning. A content team targeted on "articles published per month" begins producing thin, formulaic content that satisfies the count but does not rank, does not convert, and does not serve the reader. The input metric is green. The output — organic traffic — is red.

Proxy decay. An input metric that was once a valid proxy for the output gradually loses its connection as conditions change. "Number of outbound sales calls" was a valid input for pipeline generation when the market was greenfield. As the market matured and cold outreach response rates declined, the same input stopped producing the same output. But the team kept targeting the input because it was measurable and controllable.

Goodhart's Law in Practice

Input metrics must be paired with quality gates. "Experiments shipped" is insufficient without "experiments with a pre-registered hypothesis and a minimum detectable effect calculated in advance." "Articles published" is insufficient without "articles targeting keywords with search volume above 500 and commercial intent." The input metric provides the discipline of execution. The quality gate prevents that discipline from becoming theater.
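
One way to enforce a quality gate is to make the gated count, not the raw count, the metric of record. A sketch, assuming hypothetical experiment records:

```python
from dataclasses import dataclass

@dataclass
class Experiment:
    name: str
    has_preregistered_hypothesis: bool
    mde_calculated: bool  # minimum detectable effect computed in advance

def qualifying_experiments(shipped: list[Experiment]) -> int:
    """Count only experiments that pass the quality gate, so the input
    metric cannot be satisfied by trivial button-color tests."""
    return sum(1 for e in shipped
               if e.has_preregistered_hypothesis and e.mde_calculated)

week = [
    Experiment("new onboarding checklist", True, True),
    Experiment("button color tweak", False, False),
    Experiment("email subject rewrite", True, False),
]
print(f"{qualifying_experiments(week)} of {len(week)} shipped experiments count")
# -> 1 of 3
```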

The defense against Goodhart's Law in an input-metric system has three components:

First, pair every input metric with a quality criterion. Not just "how many" but "how many that meet this standard." This makes gaming harder because the quality criterion introduces a judgment that resists simple maximization.

Second, maintain the causal audit. Every month, review whether the input metrics are actually driving the leading indicators, and whether the leading indicators are actually driving the outputs. If the links have decayed, revise the Input Metric Tree. This is the learning component that prevents the system from becoming a bureaucratic exercise (a minimal audit check is sketched after this list).

Third, rotate input metrics periodically. If the same input metric persists for three or four quarters, it becomes entrenched. Teams optimize for it. The behavior it was meant to proxy becomes subordinated to the metric itself. Rotating inputs — shifting from "articles published" to "articles ranking in the top 10 within 60 days" — keeps the team focused on the outcome the input is supposed to serve rather than the input itself.
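
The causal audit can start as something very cheap: check whether weeks with more input activity are followed by movement in the leading indicator. Correlation is not causation, and eight data points prove nothing on their own, but a persistently flat relationship across several audits is a strong hint that a link in the tree has decayed. A sketch with hypothetical data:

```python
from statistics import correlation  # Python 3.10+

# Monthly causal audit (sketch): does last week's input volume move this
# week's leading indicator? Eight weeks of hypothetical data.
experiments_shipped   = [2, 3, 2, 4, 3, 2, 4, 3]
onboarding_completion = [0.41, 0.41, 0.43, 0.42, 0.46, 0.45, 0.44, 0.48]

# Compare inputs in week t against the leading indicator in week t+1.
lagged_inputs = experiments_shipped[:-1]
indicator_next_week = onboarding_completion[1:]

r = correlation(lagged_inputs, indicator_next_week)
print(f"lag-1 correlation: {r:.2f}")
# A correlation near zero, repeated across several monthly audits,
# suggests the causal link has decayed and the tree needs revision.
```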

Balancing Autonomy with Alignment

The tension between team autonomy and organizational alignment is one of the oldest problems in management. Input metrics offer a resolution that neither top-down command nor bottom-up empowerment can achieve alone.

In a top-down model, leadership sets output targets and prescribes the activities to achieve them. This eliminates the strategy-execution gap at the cost of autonomy. Teams do not own their methods. Learning is slow because the feedback from execution to strategy flows through a management hierarchy.

In a bottom-up model, teams are given output targets and full autonomy over how to achieve them. This maximizes autonomy at the cost of alignment and diagnostic clarity. When the output does not move, no one can diagnose why, because the inputs were never specified or tracked.

The input metric model occupies a middle position. Leadership defines the output targets and validates the causal model — the Input Metric Tree — that connects inputs to outputs. Teams choose which inputs to prioritize within the tree and own the execution of those inputs. The weekly review examines whether inputs were delivered and whether the causal model is holding. Leadership steers at the model level: "We need to revise our hypothesis about what drives activation." Teams steer at the input level: "We are going to shift from onboarding email experiments to in-app tooltip experiments because the data suggests tooltip interactions correlate more strongly with activation."

This separation of concerns creates a form of aligned autonomy. The alignment comes from the shared causal model. The autonomy comes from team-level control over which inputs to pursue and how to execute them. Neither party — leadership nor teams — operates without accountability. Leadership is accountable for the quality of the causal model. Teams are accountable for the consistency of input delivery.

Three Models of Goal-Setting: Comparing Autonomy and Alignment

| Dimension | Top-Down Prescription | Bottom-Up Empowerment | Input Metric Model |
|---|---|---|---|
| Who sets outputs? | Leadership | Leadership | Leadership |
| Who sets inputs? | Leadership | Team (implicit) | Team (explicit, validated) |
| Who owns execution? | Team (constrained) | Team (fully autonomous) | Team (autonomous within causal model) |
| Diagnostic clarity | High (but fragile) | Low | High (and adaptive) |
| Strategy-execution gap | Small (but innovation-killing) | Large | Small (and learning-oriented) |
| Feedback speed | Slow (quarterly) | Variable | Fast (weekly) |
| Risk of misalignment | Low | High | Moderate (requires model maintenance) |
| Goodhart vulnerability | High (prescribed inputs become rote) | Low (no inputs to game) | Moderate (mitigated by quality gates) |

The Uncomfortable Organizational Truth

The strategy-execution gap persists not because organizations lack frameworks. OKRs, KPIs, balanced scorecards, V2MOM, SMART goals — the alphabet soup of goal-setting methodologies is vast. The gap persists because most of these frameworks stop at the output level. They define what success looks like and leave the how — the causal chain from daily action to quarterly result — as an exercise for the team.

This is a design flaw, not a feature. And it produces a predictable set of organizational pathologies.

Leadership becomes frustrated because teams are "not executing." Teams become demoralized because they are executing furiously — shipping features, running campaigns, producing content — but the output metrics are not moving. The disconnect is not between people. It is between the measurement system and the operational reality.

Input metrics fix this by inserting a translation layer between strategic ambition and daily work. They do not replace OKRs. They complete them. An OKR that says "increase activation rate by 15%" is a statement of strategic intent. An Input Metric Tree that decomposes activation rate into onboarding experiments, user interviews, activation emails, and in-app guidance — with weekly targets, quality gates, and assigned owners — is a statement of operational commitment.

The organizations that bridge the strategy-execution gap are not the ones with the best strategic planning. They are the ones with the best measurement architecture. They know the difference between a metric they can watch and a metric they can work. They build their operational rhythms around the latter and let the former follow as a consequence.

This requires a specific form of intellectual honesty. It requires admitting that "increase revenue by 20%" is not a plan. It requires decomposing aspirations into controllable activities and accepting that some of those activities will turn out to be wrong — that the causal model will need revision, that inputs will need rotation, that the connection between what you do and what happens is probabilistic, not certain.

The alternative — setting ambitious output targets, reviewing them quarterly, and hoping the space between is filled by hard work and good intentions — is what produces the 70% abandonment rate. It is what produces the Q2 planning meeting where last quarter's misses are acknowledged, new targets are set, and the same structural gap between strategy and execution goes unaddressed.

Input metrics are not a silver bullet. They are plumbing. Mundane, essential, and invisible when working correctly. But without them, the water does not flow from the strategy reservoir to the execution tap. And the growth team stands in the kitchen, staring at a dry faucet, wondering why nothing comes out.


References

  • Doerr, J. (2018). Measure What Matters: How Google, Bono, and the Gates Foundation Rock the World with OKRs. Portfolio/Penguin.

  • Bryar, C., & Carr, B. (2021). Working Backwards: Insights, Stories, and Secrets from Inside Amazon. St. Martin's Press.

  • Grove, A. S. (1983). High Output Management. Random House.

  • Goodhart, C. A. E. (1984). Problems of monetary management: The UK experience. In Monetary Theory and Practice (pp. 91-121). Palgrave Macmillan.

  • Kniberg, H., & Ivarsson, A. (2012). Scaling Agile @ Spotify with Tribes, Squads, Chapters & Guilds. Spotify Engineering Whitepaper.

  • Cagan, M. (2018). Inspired: How to Create Tech Products Customers Love (2nd ed.). Wiley.

  • Wodtke, C. (2016). Radical Focus: Achieving Your Most Important Goals with Objectives and Key Results. Cucina Media.

  • Niven, P. R., & Lamorte, B. (2016). Objectives and Key Results: Driving Focus, Alignment, and Engagement with OKRs. Wiley.

  • Kaplan, R. S., & Norton, D. P. (1996). The Balanced Scorecard: Translating Strategy into Action. Harvard Business School Press.

  • Strathern, M. (1997). 'Improving ratings': Audit in the British University system. European Review, 5(3), 305-321.

  • Quantive. (2023). The State of OKRs: Annual Report on OKR Adoption and Attainment Rates. Quantive Research.

  • Mankins, M. C., & Steele, R. (2005). Turning great strategy into great performance. Harvard Business Review, 83(7), 64-72.