DORA, SPACE, DevEx, DX Core 4: Each Answers a Different Question

Software teams don't break down due to lack of metrics. They break down because they measure with conviction things they don't fully understand.

Series: Why Productive Teams Fail (3/8)

“Productivity,” “efficiency,” and “performance” are words used with excessive confidence in technical and executive discourse, as if they were stable, universal, and self-explanatory concepts. They’re not. In practice, they function more as rhetorical shortcuts than as precise definitions. Each person in the organization hears these words and projects a different expectation onto them — and yet, everyone agrees to measure them.

The problem of conceptual ambiguity

It’s in this nebulous space that metrics frameworks emerge. They appear as attempts to organize conceptual chaos, offering models, indicators, and common languages. The problem is that, when used without reflection, these approaches start being treated as ready-made answers, when in fact they’re just lenses.

And every lens magnifies some things while distorting or hiding others.

The question that precedes the metric

Before talking about metrics, therefore, we need to talk about the game. What kind of problem are we trying to solve?

Questions about the system
  • Are we delivering fast enough?
  • How do we reduce pipeline risks?
  • Is code quality adequate?
Questions about people
  • Is the work sustainable for the team?
  • Where is the cognitive strain?
  • What is being avoided or ignored?

The answer completely changes what makes sense to measure — and which model makes sense to use.

The four frameworks

Throughout this series, we’ll discuss four widely used approaches for analyzing productivity and efficiency in software teams. They don’t compete with each other, because they don’t try to answer the same question.

Different lenses, different insights

The common mistake is placing them side by side as alternatives, when in fact each starts from a different definition of what matters.

DORA: The delivery flow

DORA[2], the result of years of research on DevOps practices, looks at the team’s delivery capability as a flow system. Its focus is on understanding how frequently changes reach production, how long they take to get there, and how stable this process is.

Deployment Frequency
How often code reaches production. High frequency signals a reliable pipeline and low perceived risk.
Lead Time for Changes
How much time passes between commit and deployment to production. Measures pipeline speed.
Time to Restore Service
Speed of recovery after failures. Measures resilience and incident response capability.
Change Failure Rate
Proportion of deployments that cause problems in production. Measures delivery process quality.
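
To make these four definitions concrete, here is a minimal sketch in Python. It is not the official DORA tooling: the Deployment record, its field names, and the choice to measure restore time from the moment of the failed deployment are all assumptions made for illustration. Your own pipeline and incident data will likely need a different shape.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    commit_at: datetime                     # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production problem?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], window_days: int) -> dict:
    """Compute the four metrics over an observation window of `window_days`."""
    if not deployments:
        raise ValueError("need at least one deployment")

    lead_times = [d.deployed_at - d.commit_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    # Assumption: restore time is counted from the failed deployment itself.
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]

    return {
        "deployments_per_day": len(deployments) / window_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


# Made-up sample data: four deployments observed over two days.
t0 = datetime(2024, 1, 1, 9, 0)
deploys = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=6), failed=True,
               restored_at=t0 + timedelta(hours=7)),
    Deployment(t0, t0 + timedelta(hours=8)),
]
metrics = dora_metrics(deploys, window_days=2)
print(metrics)
```

Notice what the function returns: four numbers describing flow and stability, and nothing about why they look the way they do.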

The model categorizes teams into Elite, High, Medium, and Low performers. This classification is useful for initial diagnosis, but becomes problematic when treated as an end in itself.

Useful for: Predictability, reliability, and delivery pipeline efficiency.

Not designed for: Human experience, learning, or cognitive strain. When used outside this context, it starts generating conclusions it never set out to support.

What DORA doesn't measure

DORA doesn’t capture why the metrics are what they are. A team can have high deployment frequency because they automated well — or because they’re under constant pressure to deliver fast.

SPACE: The conceptual map

SPACE emerges almost as a reaction to the excessive simplification of the idea of productivity. Instead of offering a closed set of metrics, it proposes a conceptual map.

The multidimensional view

Productivity in knowledge work is multidimensional and cannot be reduced to a single number without serious losses.

Satisfaction and well-being
Feeling of satisfaction, fulfillment, and health at work. How developers feel about their work and environment.
Performance
Work outcome measured by quality and impact, not just volume of deliveries.
Activity
Counts of actions and outputs produced while working, always read in context: commits, PRs, and deployments are signals, not objectives.
Communication and collaboration
How people and teams exchange information and work together. Quality of interactions, not quantity of meetings.
Efficiency and flow
Ability to do work without friction or interruption. Measures obstacles that prevent sustained progress.

This model doesn’t prescribe specific metrics. It offers lenses to examine productivity from multiple angles and deliberately avoids single numbers or rankings. SPACE’s strength lies in forcing the uncomfortable question: why do you want to measure this and what are you leaving out by doing so? It’s less prescriptive and more philosophical — which is precisely its greatest value.
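
A small sketch can show what is lost when the five dimensions are collapsed into one number. SPACE itself prescribes no scores, so everything here is an assumption made for illustration: the SpaceProfile class, the 1 to 5 scale, and the two made-up teams.

```python
from dataclasses import dataclass
from statistics import mean

SPACE_DIMENSIONS = (
    "satisfaction",
    "performance",
    "activity",
    "communication",
    "efficiency",
)


@dataclass
class SpaceProfile:
    """Hypothetical per-team scores on a 1-5 scale, one per SPACE dimension."""
    team: str
    scores: dict[str, int]

    def single_number(self):
        """The tempting shortcut: average everything into one score."""
        return mean(self.scores[d] for d in SPACE_DIMENSIONS)

    def spread(self) -> int:
        """Gap between the strongest and weakest dimension."""
        values = [self.scores[d] for d in SPACE_DIMENSIONS]
        return max(values) - min(values)

    def weakest(self) -> str:
        """Dimension with the lowest score (first one wins on ties)."""
        return min(SPACE_DIMENSIONS, key=lambda d: self.scores[d])


team_a = SpaceProfile("A", {"satisfaction": 1, "performance": 4, "activity": 5,
                            "communication": 2, "efficiency": 3})
team_b = SpaceProfile("B", {d: 3 for d in SPACE_DIMENSIONS})

# Both teams collapse to the same average, yet their profiles differ sharply.
print(team_a.single_number(), team_b.single_number())
print(team_a.spread(), team_b.spread())
print(team_a.weakest())
```

In this made-up data the two teams are indistinguishable by average, while the profile shows that team A pairs very high activity with very low satisfaction. That is the kind of loss the single number hides.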

Developer Experience: Daily life

The Developer Experience (DevEx) perspective starts from a different point. Instead of beginning with flow or the abstract concept of productivity, it starts with the daily experience of those who write, test, review, and maintain code.

The central question isn’t “how fast are we delivering,” but “how difficult is it to do the right work in this environment”.

Feedback loops
How long it takes to see the result of a change: compilation, tests, deployment. Fast cycles accelerate learning.
Cognitive load
Complexity that needs to be kept in mind to work. High cognitive load exhausts decision-making capacity.
Flow state
Ability to enter and remain in deep concentration. Interruptions destroy real productivity.
Tooling
Quality, integration, and reliability of tools. Poor tools are constant friction.
Documentation
Clarity about how systems work and why decisions were made. Its absence creates dependency on tribal knowledge.
Onboarding
Time and difficulty for new members to become productive. Slow onboarding reveals systemic problems.

Excessive friction, poor tools, opaque processes, and constant interruptions don’t appear directly in classic delivery metrics, but they erode quality, motivation, and team sustainability over time.

The predictive power

While DORA measures outcome (what already happened), DevEx measures conditions (what’s about to happen). A team with poor DevEx can maintain high delivery for a while — until it can’t anymore.

DX Core 4: The pragmatic synthesis

Finally, DX Core 4 appears as an attempt at pragmatic synthesis. It doesn’t seek to explain everything or create a complete theoretical model. Its goal is to identify clear action levers: areas where investment generates consistent impact on experience and workflow.

Fast feedback loops
Speed of cycles (compilation, test, CI, deployment). The faster the feedback, the faster the learning and course correction. Long waits kill momentum.
Low cognitive load
Reduction of unnecessary complexity. Simple systems allow focus on business problems, not infrastructure. Accidental complexity drains mental energy.
Deep flow state
Protection against interruptions and context switching. Flow is where real productivity happens — but it's fragile. Environments that constantly break flow destroy creative capacity.
High developer satisfaction
Feeling of progress and fulfillment. Satisfaction isn't comfort — it's the sense that work matters and evolves. Dissatisfied teams produce less and with lower quality.

Unlike SPACE, which maps dimensions for reflection, DX Core 4 points directly to where to act. Each of these dimensions can be measured, but can also be immediately improved through concrete actions: improve tools, simplify architecture, protect focused time, listen to feedback.

It’s less ambitious conceptually, but much more direct in practice. When used well, it helps transform diffuse diagnoses into concrete decisions.

One of the most critical aspects of DX Core 4 is the emphasis on fast feedback loops. Long waits between action and result don’t just delay work — they kill momentum[1], that feeling of continuous progress that keeps teams engaged and productive.
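
If you want a rough first look at feedback loops, one option is to summarize how long developers wait for CI results. The sketch below assumes you can export run durations in seconds from your CI system. The function name, the sample durations, and the figure of eight runs per developer per day are hypothetical.

```python
from statistics import median, quantiles


def feedback_loop_summary(durations_s: list[float], runs_per_dev_per_day: int) -> dict:
    """Summarize waiting time for one feedback loop (for example, CI runs)."""
    if len(durations_s) < 2:
        raise ValueError("need at least two samples")

    typical = median(durations_s)
    # Last of the nine decile cut points, i.e. the 90th percentile.
    p90 = quantiles(durations_s, n=10, method="inclusive")[-1]

    return {
        "median_s": typical,
        "p90_s": p90,
        # Rough minutes spent waiting per developer per day at the typical duration.
        "daily_wait_min": typical * runs_per_dev_per_day / 60,
    }


# Made-up CI durations in seconds; two slow outliers dominate the tail.
ci_durations = [60, 60, 90, 90, 120, 120, 150, 180, 240, 600, 900]
summary = feedback_loop_summary(ci_durations, runs_per_dev_per_day=8)
print(summary)
```

The median describes the typical wait. The 90th percentile describes the runs most likely to break concentration, which is why looking at both tends to be more useful than an average.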

Practical action map

If you don’t know where to start improving Developer Experience, these four dimensions offer an immediate action map. They’re not the only things that matter, but they’re the ones that most frequently make a difference.

Each approach answers a different question. DORA measures delivery flow, SPACE forces multidimensional reflection, DevEx captures daily experience, and DX Core 4 identifies action levers. They're not alternatives — they're complementary lenses.

The Risk: When Frameworks Become Ideology

So far, we’ve presented four approaches with respect. Each has real value when used consciously. But there’s a risk that needs to be named before you encounter them in a degenerated state: models can become ideologies.

From tool to doctrine

When an approach stops being a lens and becomes absolute truth, it stops illuminating problems and starts hiding them under a veneer of technical rationality.

How degeneration happens

Phase 1: Honest adoption — A team or organization discovers the model, sees value, begins applying it consciously.

Phase 2: Normalization — The approach becomes common language. “Our DORA is good,” “we need to improve our SPACE satisfaction,” “poor DevEx here.”

Phase 3: Bureaucratization — Models become requirements, not tools. Metrics are collected because “that’s what you do,” not because they answer questions.

Phase 4: Ideology — The approach becomes unquestionable. Questioning it is seen as resistance to “modernity” or “technical maturity.”

Signs you’ve reached Phase 4:

  • Discussions about the model replace discussions about the real problem
  • Criticisms of the approach are treated as heresy, not contribution
  • Numbers become automatic justification for decisions (“because DORA says,” “because SPACE indicates”)
  • The organization stops asking “why do we measure this?” and starts treating measurements as axioms

As we saw in Article 2, choosing what to measure is choosing what to see and what to ignore. Choosing a model is choosing a narrative about what constitutes success. This choice is never purely technical — it reflects organizational priorities, leadership anxieties, power structures.

Well-used frameworks
  • Illuminate previously invisible aspects
  • Enable more precise conversations
  • Facilitate systematic diagnostics
  • Are reviewed and adjusted constantly
Frameworks as ideology
  • Hide complexity under a single number
  • Replace difficult conversations with dashboards
  • Become compliance rituals
  • Are treated as unquestionable truths

These approaches carry assumptions about what quality work is, who matters, and what can be sacrificed. These assumptions may be right or wrong — but they’re rarely explicit. When you choose to measure something, you need to know exactly what game you’re playing.

Metrics are not neutral

Because, in the end, metrics are not neutral. They shape behavior, direct investment, and define what the organization comes to call success.

Evaluation models as political instruments

Choosing an evaluation model is choosing a narrative about what constitutes well-done work. And this choice isn’t technical — it’s political.

When an organization decides to measure only DORA, it’s implicitly saying: “delivery is what matters.” When it adds SPACE, it recognizes there’s more at stake. When it invests in DevEx, it admits that human experience has weight. When it ignores all of them and focuses on lines of code or story points, it reveals exactly what game it’s playing, even if it doesn’t admit it.

Technical decisions (apparent)
  • Which framework to use?
  • What metrics to collect?
  • How to categorize teams?
Political decisions (real)
  • What counts as success?
  • Who will be rewarded?
  • What behavior do we want?

The danger of metric bureaucracy

Well-intentioned approaches often degenerate into empty rituals. What begins as an honest attempt to understand complex work becomes a compliance checklist.

Signs of degeneration:

  • Metrics are collected, but no one acts on them
  • Teams optimize to look good in numbers, not to actually improve
  • Discussions about measurement methods replace discussions about real problems
  • Models become compliance requirements, not diagnostic tools
  • Technical jargon replaces direct conversations about difficulties

When this happens, the model has stopped illuminating the problem and has become the problem.

The responsibility of those who measure

To measure is to intervene. There’s no neutral observation of social systems.

The starting point

And it all starts with a simple and rarely explicit choice: what game are we really trying to win?

If you don’t choose consciously, the system chooses for you. And systems, left to their own devices, tend to optimize for predictability, control, and absence of conflict — not for learning, quality, or sustainability.

Before adopting any approach, ask:

  • What problem am I trying to solve with this metric?
  • What behavior might it inadvertently encourage?
  • What does it leave invisible?
  • Who benefits if this metric improves?
  • Who might be harmed?
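
One way to keep these questions from being skipped is to make them part of how a metric gets proposed. This is only a sketch of that idea, not a recommended process: the MetricProposal class and its field names are hypothetical, and the answers in the example are invented.

```python
from dataclasses import dataclass, fields


@dataclass
class MetricProposal:
    """A proposed metric plus written answers to the five questions above."""
    name: str
    problem_to_solve: str = ""
    behavior_it_may_encourage: str = ""
    what_it_leaves_invisible: str = ""
    who_benefits: str = ""
    who_might_be_harmed: str = ""

    def unanswered(self) -> list[str]:
        """Return the questions that still have no written answer."""
        return [
            f.name
            for f in fields(self)
            if f.name != "name" and not getattr(self, f.name).strip()
        ]


proposal = MetricProposal(
    name="deployment frequency",
    problem_to_solve="Releases are batched and risky",
    who_benefits="Teams waiting on fixes",
)
print(proposal.unanswered())
```

A proposal with unanswered questions is not necessarily a bad metric. It is a metric whose assumptions have not yet been made explicit.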

These models are lenses. Useful when you know what you’re looking for and are aware of what you’re ignoring. Dangerous when treated as objective truths about human work.

The question isn’t which approach is better. The question is: do you know what game you’re playing?

Footnotes

  1. [1]

    Momentum, in the context of software development, refers to the psychological state of continuous progress and engagement. It’s the feeling of moving forward consistently, where each action generates visible results quickly, creating a virtuous cycle of motivation and productivity. When momentum is lost — through slow builds, bureaucratic processes, or prolonged waits — the cost isn’t just temporal: it’s cognitive and emotional. Teams lose focus, mental energy dissipates, and regaining context requires additional effort. Protecting momentum is protecting the team’s ability to work in a flow state.

  2. [2]

    Forsgren, Nicole; Humble, Jez; Kim, Gene. Accelerate: The Science of Lean Software and DevOps. IT Revolution Press, 2018. The book presents years of research from the DORA program, establishing the four key metrics (deployment frequency, lead time, time to restore, and change failure rate) as indicators of software delivery performance.
