
DevEx: Flow, Feedback, and the Load Nobody Measures

The DevEx model proposes that developer experience is a technical variable, not merely a subjective feeling. Three dimensions (flow, feedback, mental burden) capture what DORA and SPACE don't see.

42 min read


Series: Why Productive Teams Fail (6/8)

In previous articles, we traversed two approaches that changed how we measure productivity in software engineering. DORA gave us flow indicators — how fast and stable the system delivers. SPACE expanded the lens, forcing us to accept that productivity is multidimensional.

But one question remains unanswered: if we have good measurements and recognize the complexity, why do teams still suffer?

The missing perspective

DORA observes the system from outside — it measures what comes out of the pipeline. SPACE expands the dimensions, but is still a measurement model. Neither directly asks: what is it like to work inside this system?

This is the question that DevEx[1], the model published in 2023 by Abi Noda, Margaret-Anne Storey, Nicole Forsgren, and Michaela Greiler, attempts to answer.

What the DevEx approach proposes

The central thesis: experience is a technical variable

DevEx’s central idea is simple, but with profound implications: developer experience directly shapes technical outcomes. Not in an abstract or motivational way, but in a concrete and measurable manner.

When experience is bad
  • Systems hard to understand
  • Unstable tools
  • Opaque processes
  • Constant interruptions
What this produces
  • Defensive decisions
  • Shortcuts that become patterns
  • Silent rework
  • Fragmented reasoning

None of this shows up immediately in flow indicators, but all of it accumulates in the code, architecture, and product.

The conceptual shift

In SPACE[4], satisfaction is one dimension among five — the “S” in Satisfaction and well-being. In DevEx, experience isn’t a dimension: it’s the central variable. This approach argues that a cognitively hostile environment doesn’t just make people unhappy; it produces worse software.

What this model observes

Unlike DORA, which observes the system from outside, DevEx observes the system from within. It’s interested in the path developers take to accomplish tasks:

  • Creating a service: How many steps? How many approvals? How much implicit knowledge is required?
  • Running tests: How long does it take? Are tests reliable? Is feedback clear?
  • Understanding a codebase: Is the architecture evident or obscure? Are conventions clear?
  • Debugging an error: Are logs accessible? Does observability exist? Is reproducing the problem possible?
  • Deploying to production: Is the pipeline reliable? Is rollback safe?

Each friction in this path consumes mental energy. And mental energy is a finite resource.

Technical complexity ≠ experiential complexity

A crucial point of this approach: technical complexity is not the same as experiential complexity.

A system can be complex by nature and still offer a good experience, if its rules are clear, its tools reliable, and its boundaries well-defined. Similarly, an apparently simple system can be exhausting if it requires excessive memory, implicit decisions, and constant political navigation.

What DevEx makes visible

Visible costs

  • Bugs
  • Incidents
  • Deploy failures

Invisible costs (DevEx)

  • Context cost
  • Waiting cost
  • Uncertainty cost
  • Ambiguity cost

DevEx is interested in costs that don’t show up as bugs or incidents, but as suboptimal decisions made under pressure or fatigue.

Origin: Why DevEx Emerged in 2023

The frustration with metrics that don’t answer why

The DevEx paper was published in 2023 by Abi Noda, Margaret-Anne Storey, Nicole Forsgren, and Michaela Greiler in ACM Queue — not by coincidence. It was a direct response to a decade of frustration.

Since DORA established the four classic metrics in 2014, the industry had better metrics for measuring delivery. But it still couldn’t answer a basic question: why do teams with good metrics still break?

Flow metrics (DORA) showed what was happening. SPACE (2021) expanded to multiple dimensions, showing where to look. But no framework directly asked developers: what is it like to work inside this system?

The paradigm shift

The difference wasn’t technological — it was epistemological. The industry finally accepted that subjective experience isn’t noise in the data; it is data. That what people feel while working isn’t “soft” — it’s a technical variable that affects code, architecture, and product.

Who created it and why that matters

The four authors aren’t random names. Each brought a specific perspective:

Nicole Forsgren: Co-author of DORA. Deeply understood the limits of quantitative metrics and knew exactly what they couldn’t capture.

Margaret-Anne Storey: Software engineering researcher focused on cognition and developer experience. Decades of academic research on how developers actually work.

Abi Noda: Founder of DX (company focused on developer experience). Lived the problem in practice: organizations wanting to improve DevEx but without clear methodology to measure it.

Michaela Greiler: Consultant and researcher with experience at Microsoft, focused on engineering practices and developer productivity.

The combination wasn’t an accident. Academic research (Storey, Forsgren) met market pragmatism (Noda, Greiler). Scientific rigor met practical urgency.

Why 3 dimensions, not 5?

SPACE had five dimensions: Satisfaction and well-being, Performance, Activity, Communication and collaboration, and Efficiency and flow. It was comprehensive, but also hard to operationalize. Organizations looked at SPACE and asked: “Where do we start?”

DevEx made a deliberate choice: reduce to 3 dimensions that capture the essence of lived experience.

Flow: Captures the ability to work without artificial interruptions — the psychological state where real productivity happens.

Feedback: Captures how quickly the system responds — time between action and result, between change and confirmation.

Mental Burden: Captures the mental effort required to operate — how much of the brain is spent on the real problem versus navigating accidental complexity.

Strategic simplification

The 3 dimensions aren’t “the only ones that matter.” They’re the ones organizations can measure and act on without getting lost in complexity. This approach traded comprehensiveness for actionability.

The context: Between SPACE and pragmatism

DevEx was born from productive tension: SPACE had proven that productivity was multidimensional, but organizations still didn’t know what to do with it.

SPACE’s problem: Five dimensions are hard to track simultaneously. Collaboration, for example, is important, but how do you measure it without creating metrics that become games? Efficiency is crucial, but efficient at what, exactly?

DevEx’s bet: Focus on three dimensions that:

  1. Can be measured with a combination of perceptual metrics (surveys) and objective ones (systems)
  2. Are actionable — teams can intervene directly
  3. Capture the lived experience of working, not just output metrics

The trade-off was conscious. DevEx doesn’t try to be a universal productivity framework. It tries to answer a specific question: what makes development work cognitively sustainable?

The connection to DX company

There’s a detail that needs to be said: Abi Noda is the founder of DX, a commercial company focused on developer experience. The paper was published in an academic journal (ACM Queue) with scientific rigor, but it’s impossible to ignore that a commercial interest exists.

Does this invalidate the framework? No. But it contextualizes. DX sells tools to measure DevEx. The framework makes it easier to sell those tools. This doesn’t mean the framework is false — it means it solves problems that a company finds profitable to solve.

The politics of measurement

Every model carries interests. DORA emerged from Google. SPACE emerged from Microsoft and GitHub research. DevEx has an explicit commercial connection. This doesn’t invalidate them — but it reminds us that what we choose to measure reflects what someone finds important to measure.

What DevEx inherited from SPACE

DevEx didn’t emerge from nothing. It’s a direct descendant of SPACE — and two of the authors (Forsgren and Storey) were in both papers.

From SPACE, DevEx inherited:

  • The idea that productivity doesn’t fit in a single metric
  • The combination of perceptual metrics (surveys) with objective ones (systems)
  • Focus on satisfaction as a technical variable, not just “happiness”

What DevEx simplified:

  • Reduced 5 dimensions to 3
  • Focused on lived experience instead of general productivity
  • Prioritized actionability over comprehensiveness

SPACE asked: “How to measure productivity multidimensionally?” DevEx asks: “What is it like to work here, and can this be measured?”

The three dimensions of the approach

The DevEx approach is structured around three central dimensions — defined through empirical research as the factors that most impact real developer productivity.

[Diagram] The 3 DevEx dimensions form an interdependent system. Degradation in one dimension affects the others. Solid arrows = negative impact. Dashed arrows = positive synergy.

Flow: continuity, not just speed

Flow, here, isn’t just speed; it’s continuity. It’s the ability to work without artificial interruptions — to start a task and be able to finish it without being torn from context.

The concept comes from cognitive psychology: the “flow state” described by Mihaly Csikszentmihalyi[2] is that state of deep immersion where work flows naturally, concentration is total, and time seems to disappear. In software development, this state is where real productivity happens.

The problem: this state is extremely fragile. A single interruption can cost 15-25 minutes to rebuild. And in typical work environments, developers are interrupted every 10-15 minutes on average.

What systematically destroys flow

  • Fragmented meetings: It’s not the total meeting time that matters, but how they fragment the day.
  • Bureaucratic approvals: Each time a developer needs to stop and wait for approval, context is lost.
  • Unresolved dependencies: “I need to talk to team X before continuing” is a symptom of poorly designed architecture or process.
  • Unstable environments: When the development environment randomly breaks, flow becomes impossible.
  • Constant notifications: Each ping is a micro-interruption that accumulates cognitive cost.

Interruption economics

Each forced interruption costs more than lost time — it costs the effort of rebuilding mental context. A developer interrupted 8 times in a day hasn’t lost 8 moments; they’ve lost the ability to do deep work that entire day.

The measurement this model proposes: How many hours of focused, uninterrupted work can a developer have per day? If the answer is “less than 2”, there’s a structural problem.

Practical example: The real cost of interruptions

Scenario: Developer with 8 meetings of 30 minutes distributed throughout the day. Total: 4 hours.

Naive calculation: 4 hours left for technical work. Seems reasonable.

Reality:

  • Largest continuous block available: 1h30
  • Cost of rebuilding context after each interruption: ~15 minutes
  • 8 interruptions × 15 minutes = 2 hours lost in reconstruction
  • Real possible flow time: ~1h30 per day

What happens in 6 months:

  • Simple tasks become complex because there’s never time to understand them deeply
  • Refactorings are postponed indefinitely (require long flow)
  • Technical debt accumulates because fixing it properly requires concentration that doesn’t exist
  • Code quality drops — not due to incompetence, but due to structural impossibility of doing otherwise

The solution isn’t eliminating meetings; it’s consolidating them. The same 4 hours of meetings, grouped into two blocks (one in the morning, one at the end of the afternoon), leave 4-6 hours of possible flow instead of 1h30.
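
To make the arithmetic above concrete, here is a minimal sketch (a hypothetical helper, not part of the DevEx model) that estimates usable deep-work time from a day's meeting schedule, assuming an 8-hour day and a fixed 15-minute context-rebuild cost after every interruption. The exact numbers depend on how the meetings are distributed.

```typescript
// Hypothetical sketch: estimate usable deep-work time from a day's meetings.
// Assumes an 8h workday and ~15 min of context rebuilding after each interruption.

interface Meeting {
  start: number; // minutes from the start of the workday
  end: number;
}

const WORKDAY_MINUTES = 8 * 60;
const CONTEXT_REBUILD_MINUTES = 15;

function estimateDeepWork(meetings: Meeting[]): {
  largestFreeBlock: number;
  usableFocusTime: number;
} {
  const sorted = [...meetings].sort((a, b) => a.start - b.start);

  // Collect the free gaps between meetings (and at the edges of the day).
  const gaps: number[] = [];
  let cursor = 0;
  for (const m of sorted) {
    if (m.start > cursor) gaps.push(m.start - cursor);
    cursor = Math.max(cursor, m.end);
  }
  if (cursor < WORKDAY_MINUTES) gaps.push(WORKDAY_MINUTES - cursor);

  // Each gap pays a fixed context-rebuild tax before focused work resumes.
  const usable = gaps
    .map((g) => Math.max(0, g - CONTEXT_REBUILD_MINUTES))
    .reduce((a, b) => a + b, 0);

  return { largestFreeBlock: Math.max(0, ...gaps), usableFocusTime: usable };
}

// Eight 30-minute meetings spread across the day vs. the same 4 hours consolidated
// into a morning block and an end-of-afternoon block (illustrative numbers only).
const scattered = estimateDeepWork(
  [0, 1, 2, 3, 4, 5, 6, 7].map((i) => ({ start: i * 60, end: i * 60 + 30 }))
);
const consolidated = estimateDeepWork([
  { start: 0, end: 120 },
  { start: 360, end: 480 },
]);
console.log(scattered, consolidated);
```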

Flow vs. Speed: the crucial difference

Here’s a confusion that costs dearly: flow is not synonymous with speed.

Speed is quantitative. Flow is qualitative. A team can have high velocity (many tasks completed) and zero flow (no task done with deep attention).

High velocity, without flow
  • Many tasks completed quickly
  • Constant interruptions compensated with overtime
  • Code works, but nobody fully understands it
  • Technical debt in exponential growth
Moderate velocity, with flow
  • Fewer tasks, but each well understood
  • Protected time blocks for deep work
  • Code reflects clear thinking
  • Technical debt controlled

Concrete example: Team A delivers 40 story points per sprint but has 15 meetings per week and deploys that frequently break. Team B delivers 25 story points but has 5 meetings and stable deploys.

Throughput metrics favor Team A. But six months later, Team B is accelerating while Team A is rewriting entire modules because nobody can maintain the code anymore.

Flow sustains velocity. Velocity without flow is debt disguised as productivity.

How flow and feedback reinforce each other

Flow and feedback aren’t independent — they mutually amplify.

When feedback is fast (30-second build), developers can experiment without leaving the flow state. They test a hypothesis, see the result, adjust. The entire cycle happens within the same concentration session.

When feedback is slow (20-minute build), each iteration breaks flow. Developers make a change, wait, lose context, try to recover when the result arrives. Work becomes fragmented even without external interruptions.

Slow feedback destroys flow even in environments without meetings.

When “improving flow” makes everything worse

Not every intervention to “improve flow” works. Some worsen the system:

1. Eliminate all meetings without replacing them with another form of alignment

Result: Individual flow increases, but direction diverges. Each developer works focused — on the wrong thing. Three months later, nobody understands how the system became so incoherent.

2. Prohibit interruptions to the point of creating silos

“Don’t disturb developers in flow” becomes “don’t talk to anyone about anything”. Collaboration dies. Knowledge becomes fragmented. Individual flow increases; collective intelligence collapses.

3. Optimize tools without touching processes

Buying faster IDEs doesn’t solve if the code review process takes 3 days. The bottleneck isn’t technical — it’s organizational. Spending on tools while ignoring structure is improvement theater.

4. Create “flow days” that become days of solitude

“Friday is a no-meeting day!” becomes Friday where nobody talks. The problem wasn’t having meetings — it was having them poorly distributed and poorly conducted. Eliminating communication isn’t optimizing flow; it’s fragmenting the team.

Collaboration vs isolation

Flow doesn’t mean working alone. It means working without artificial and avoidable interruptions. Pair programming can generate flow. Well-facilitated meetings can generate flow. Well-used asynchronous messages preserve flow. The problem was never collaboration — it was unnecessary interruption.

Feedback: system response speed

Feedback isn’t just monitoring; it’s the speed at which the system responds to developer actions. It’s the time between “I made a change” and “I know if it worked”.

When the feedback cycle is short (seconds), developers experiment more. They try different approaches. They iterate rapidly. They learn from small errors before they become big errors.

When the cycle is long (minutes or hours), behavior changes. Developers avoid experimentation because each attempt costs too much time. They accumulate large changes to “make use of” the wait. They lose context between action and result.

The difference isn’t incremental — it’s qualitative.

The feedback cycles that matter

  • Compilation/build: Seconds is ideal. Minutes is already problematic.
  • Unit tests: Should run in seconds. If they take minutes, developers stop running them frequently.
  • Integration tests: Minutes is acceptable. Tens of minutes forces “blind” commits.
  • Deployment to test environment: If it takes hours, developers stop testing in realistic environments.
  • Production feedback: When something breaks, how long until someone knows?

Cycle dynamics

The longer the feedback cycle, the larger the changes developers make at once — and the greater the risks they take unknowingly. Slow feedback doesn’t just slow down work; it changes the nature of work for the worse.

Practical example: The 30-minute build

Scenario: Fintech company with monorepo. Complete build takes 30 minutes. “But it’s normal for a system this size,” they say.

Behavior observed after 6 months:

Developers stop testing locally:

  • “I’ll commit and let CI run, it’s faster”
  • “Blind” commits become standard
  • Broken build becomes routine (“someone broke main again”)

Accumulate large changes:

  • “Since it’s going to take 30 minutes anyway, I’ll do everything at once”
  • PRs with 2000+ lines become common
  • Code review becomes theater (nobody can review properly)

Avoid refactoring:

  • “I won’t refactor this now, I’d have to run build many times”
  • Technical debt accumulates
  • Fear of touching legacy code increases

Lose context between action and result:

  • Make change, go do something else while waiting
  • When build finishes (30 min later), they’ve forgotten the reasoning
  • Debugging becomes exponentially harder

The throughput metric was good. High velocity. Many PRs merged. But code quality was in free fall — not because the team was bad, but because the system made it impossible to work well.

Intervention: Parallelization + cache + incremental build. Build dropped to 3 minutes.

Result in 3 months:

  • Smaller and more frequent commits
  • Experimentation increased (refactorings that were “too expensive” became viable)
  • PRs decreased in size (average 200 lines vs 2000+)
  • Code review became real again
  • Technical debt started being paid

It wasn’t just “time savings”. It was structural behavioral change.

The anatomy of effective feedback

Not all feedback is equal. Effective feedback has four simultaneous properties:

1. Fast

  • Seconds for local compilation
  • Minutes for integration tests
  • Hours (at most) for deploy to staging environment

2. Clear

  • Error message points exactly where the problem is
  • Doesn’t require tribal knowledge to interpret
  • Stack traces are readable and relevant

3. Actionable

  • Feedback says not just “what broke” but “where” and “why”
  • Developers know the next step without consulting others
  • No ambiguous messages like “unexpected system error”

4. Reliable

  • Tests don’t fail randomly (no flaky tests)
  • Pipeline doesn’t break due to infrastructure problems
  • When something fails, it’s because there’s a real problem in the code

When any of these properties is missing, feedback becomes noise.

Examples of useless feedback:

  • Fast but not clear: Build fails in 5 seconds with “Error: Build failed”. Where? Why? Nobody knows.
  • Clear but not fast: Test points exactly to the problem… but took 2 hours to run.
  • Fast and clear but not reliable: Test fails today, passes tomorrow without code change. Developers stop trusting.
  • Fast, clear and reliable but not actionable: “Integration failure with service X” — but nobody documents how to debug service X.
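
For the last case, a hedged sketch of one way to make such feedback actionable: wrap the low-level failure with context and a next step at the point where it occurs. Service names, URLs, and the runbook reference are hypothetical, and `fetch` is assumed available (Node 18+ or a browser).

```typescript
// Hypothetical sketch: turning "Integration failure with service X" into
// actionable feedback by attaching context and a next step where the error happens.

class IntegrationError extends Error {
  constructor(
    public readonly service: string,
    public readonly endpoint: string,
    public readonly hint: string,
    cause: unknown
  ) {
    super(
      `Integration with ${service} failed at ${endpoint}. ` +
        `Next step: ${hint}. Underlying error: ${String(cause)}`
    );
    this.name = "IntegrationError";
  }
}

async function chargeCustomer(orderId: string): Promise<void> {
  try {
    // The URL below is hypothetical.
    const response = await fetch(`https://payments.internal/charge/${orderId}`, {
      method: "POST",
    });
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
  } catch (err) {
    throw new IntegrationError(
      "payment-service",
      `/charge/${orderId}`,
      "check payment-service logs for this orderId and the payments runbook",
      err
    );
  }
}
```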

Types of feedback and their cycles

Each feedback type has an expected cycle. When the cycle is longer than expected, developers change behavior:

| Feedback type | Ideal cycle | Problematic cycle | Resulting behavior |
| --- | --- | --- | --- |
| Syntax/compilation | < 5 seconds | > 1 minute | Avoid compiled languages, accumulate changes |
| Unit tests | < 10 seconds | > 2 minutes | Stop running tests locally |
| Integration tests | < 5 minutes | > 20 minutes | Commit without testing, cross fingers |
| Code review | < 2 hours | > 2 days | Accumulate large changes, decrease PRs |
| Staging deploy | < 30 minutes | > 4 hours | Stop testing in realistic environment |
| Production monitoring | < 1 minute | > 15 minutes | Problems escalate before being detected |

The progression of cycles matters as much as individual speed.

Fast feedback on compilation but slow on code review doesn’t help — the global bottleneck defines the behavior.

How feedback and mental burden interact

Ambiguous feedback increases mental burden. Each obscure error message consumes mental energy.

Scenario A — Clear feedback:

Error: Cannot find module './utils/formatDate'
at src/components/Dashboard.tsx:15:23

Developer sees, understands, fixes. Cognitive cost: minimal.

Scenario B — Obscure feedback:

Error: Build failed with exit code 1
See logs for details

Developer needs to:

  1. Find where the logs are
  2. Decipher 500 lines of output
  3. Guess which part is relevant
  4. Search Slack to see if anyone’s seen this
  5. Eventually discover it was the same import problem

Cognitive cost: enormous. And this happens dozens of times per day.

Bad feedback doesn’t just delay — it exhausts.

When fast feedback is counterproductive

Fast feedback doesn’t always help. There are scenarios where speed without quality worsens the system:

1. Fast but false feedback

CI that runs in 2 minutes but skips half the tests. Developers receive “✓ All good!” when it’s not.

Result: Bugs reach production. Trust in CI collapses. Teams stop believing in green builds.

2. Immediate but noisy feedback

Alert system that notifies every warning. Slack bombarded with messages. Developers desensitize.

Result: Important alerts get lost in noise. “Oh, it’s just another warning.”

3. Fast but contextless feedback

“Payment service error” appears in 5 seconds, but nobody knows what to do with it. No logs, no context, no documentation.

Result: Feedback speed doesn’t matter if it’s not actionable.

4. Optimize local feedback at the expense of global feedback

Local build in 10 seconds, but developer only discovers they broke integration 3 hours later in CI.

Result: Feeling of speed without real safety. Problems detected too late.

System-level thinking

Optimizing one feedback cycle without looking at the whole system creates hidden bottlenecks. Fast build with slow code review, or fast code review with slow deploy, or fast deploy with nonexistent monitoring — each bottleneck nullifies previous optimizations.

Mental burden: required mental effort

Mental burden is the mental effort required to understand, decide, and act within that environment. It’s everything the developer needs to keep in mind — simultaneously — to do their work.

The concept comes from John Sweller’s cognitive load theory[3], referred to here as mental burden: our working memory has limited capacity. When this capacity is consumed by accidental complexity, less space remains for essential complexity.

The three types of mental burden

Intrinsic load: The complexity inherent to the problem. Solving a machine learning algorithm is intrinsically complex. This is unavoidable.

Extraneous load: Complexity added by the environment, tools, or processes. Having to remember 47 commands to deploy is extraneous load. This is avoidable.

Germane load: Effort dedicated to learning and building useful mental models. Understanding the system architecture to contribute better. This is desirable.

The problem: most development environments are saturated with extraneous load — complexity that shouldn’t exist, but does due to accumulated decisions over the years.

Signs of high mental burden

  • Developers need to “remember” many things that should be automated or documented
  • Simple decisions require consulting multiple people because nobody has complete context
  • Onboarding new members takes months because knowledge is tribal
  • There are many “gotchas” that only those who’ve made mistakes know about
  • Senior developers are constantly interrupted because only they know how certain things work

Code as symptom

When the brain is overloaded, it economizes where it can — and usually economizes on long-term quality. Architectures become more rigid, tests more fragile, documentation more sparse. Not due to lack of competence, but due to mental survival.

The indicator this approach proposes: How much of a developer’s mental effort is spent on the real problem versus navigating accidental complexity? If the answer is “more than 50% on accidental complexity”, the system is stealing mental capacity that should be invested in value.

Practical example: The system nobody understands

Scenario: E-commerce company with 15 years of history. Main system has 40 microservices. Knowledge concentrated in 3 senior developers.

Visible symptoms:

Onboarding takes 6 months:

  • New developers receive repository access
  • “Read the code, you’ll understand”
  • Documentation exists, but has been outdated since 2019
  • Six months later, still afraid to touch critical modules

Simple changes take weeks:

  • “Add a field to the checkout API”
  • Seems simple. But which services need to change?
  • Developer spends 3 days mapping dependencies
  • Discovers 7 different services need updating
  • Each with different deploy process

Nobody wants to touch critical modules:

  • “The payment service works, don’t touch it”
  • Known bug for 2 years without fix
  • Not because it’s technically difficult
  • Because nobody has the mental model of the entire system

Seniors become bottlenecks:

  • Every decision needs to go through the 3 who “understand”
  • They spend the day answering questions instead of coding
  • If one leaves, the system is held hostage by the other two

Attempted solution: more documentation

Company hires tech writer. Spends 6 months documenting everything. Result:

  • 300 pages of documentation
  • Outdated in 3 months (nobody remembers to update alongside code)
  • Developers continue asking seniors (it’s faster than reading 300 pages)

The real problem wasn’t lack of documentation.

It was an architecture that requires implicit knowledge to operate. Each microservice was created by a different person, with different patterns, without shared conventions. The mental burden is in the structure, not in the absence of text.

How mental burden accumulates silently

Mental burden doesn’t appear suddenly. It accumulates in small decisions, none absurd in isolation:

Year 1: “Let’s use Kubernetes, everyone is using it”

  • +10% mental burden (learn K8s)

Year 2: “Let’s add Istio for service mesh”

  • +15% mental burden (learn Istio + how it interacts with K8s)

Year 3: “We need Prometheus + Grafana for observability”

  • +10% mental burden (learn Prometheus query language)

Year 4: “Let’s migrate to Terraform for infrastructure as code”

  • +15% mental burden (learn Terraform + HCL)

Year 5: “We’re adding Vault for secrets management”

  • +10% mental burden (learn Vault + access policies)

Result after 5 years:

  • No decision was stupid
  • Each tool solves a real problem
  • But the sum is 60% more mental burden than at the start

New developer needs to learn:

  • System programming languages (OK, expected)
  • Kubernetes (deploy platform)
  • Istio (service mesh)
  • Prometheus (metrics)
  • Grafana (dashboards)
  • Terraform (infrastructure)
  • Vault (secrets)

Six-month ramp to make a change to one endpoint.

Complexity creep

Each addition makes sense in isolation. The absurdity only becomes visible when trying to explain the entire system to someone new. Then someone says: “It’s complex, but that’s normal for systems this size.” No. It’s complex because each generation added tools without questioning accumulated mental cost.

Difference between essential and accidental complexity

Not all complexity is avoidable. The challenge is distinguishing:

Essential complexity (unavoidable):

  • Business rules that reflect domain reality
  • Tax calculation logic for 50 different countries
  • Financial transaction orchestration with regulatory requirements
  • ML recommendation algorithms

This is hard because the problem is hard. Simplifying would mean removing functionality or violating requirements.

Accidental complexity (avoidable):

  • Needing to know 7 different tools to deploy
  • Having 4 different ways to configure services (YAML, JSON, TOML, environment variables)
  • Processes requiring 12 manual steps because “it’s always been this way”
  • Architecture where each module follows different conventions

This is hard because someone chose for it to be hard (even unintentionally).

Example of confusion between the two:

Problem: Payment system with multiple integrations (PayPal, Stripe, PagSeguro, etc.)

Essential complexity: Each gateway has different API, different flows, different error handling. This can’t be simplified — it’s domain reality.

Accidental complexity added:

  • Each integration was implemented by a different person, with different patterns
  • PayPal uses callbacks, Stripe uses promises, PagSeguro uses observables
  • Three ways to handle retry (one with library X, another with library Y, another manual)
  • No shared abstraction

Result: Developer needs to learn not just gateway APIs (essential) but also three different internal ways of doing the same thing (accidental).

Essential complexity
  • Real business rules
  • Regulatory requirements
  • Intrinsically complex algorithms
  • Heterogeneous external integrations
Accidental complexity
  • Excessive tools
  • Lack of conventions
  • Unnecessary manual processes
  • Inconsistent architecture
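
As a minimal sketch of what removing the accidental part could look like (interface and class names are hypothetical, and real gateway calls are omitted), one shared contract and one retry policy replace the three competing internal styles, while each gateway's essential differences stay inside its adapter:

```typescript
// Hypothetical sketch: one internal contract for all payment gateways.
// Per-gateway complexity (different APIs, flows, errors) lives inside each
// adapter; the rest of the codebase learns a single shape.

interface ChargeRequest {
  orderId: string;
  amountCents: number;
  currency: string;
}

interface ChargeResult {
  status: "approved" | "declined" | "error";
  providerReference?: string;
}

interface PaymentGateway {
  charge(request: ChargeRequest): Promise<ChargeResult>;
}

// One retry policy for everyone, instead of three competing approaches.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

class StripeGateway implements PaymentGateway {
  async charge(request: ChargeRequest): Promise<ChargeResult> {
    return withRetry(() => this.callStripe(request));
  }
  private async callStripe(_request: ChargeRequest): Promise<ChargeResult> {
    // The actual Stripe-specific call is omitted; callers never see it.
    return { status: "approved", providerReference: "stripe_ref_123" };
  }
}
```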

When “simplifying” increases mental burden

Not every attempt to reduce complexity works. Sometimes, “simplifications” worsen mental burden:

1. Abstractions that hide too much

Framework that “does everything automatically” without making clear what’s happening underneath. When something breaks, developer doesn’t know where to start debugging.

Example: ORM that automatically generates SQL. Works 95% of the time. In the remaining 5% (slow query, N+1 problem), developer needs to understand not just SQL but also the ORM and also how the ORM translates to SQL.

Before: SQL complexity (high, but visible)
After: SQL + ORM + mapping complexity (even higher, and now invisible)

2. Microservices as “solution” to complex monolith

“Let’s break up the monolith, it’ll be simpler!”

Result:

  • Before: 1 complex system
  • After: 15 smaller systems + network complexity + orchestration complexity + distributed observability complexity

Complexity didn’t disappear — it changed location and multiplied.

3. “Convention over configuration” where nobody knows the conventions

Framework that promises simplicity via implicit conventions. But conventions aren’t documented, or are in 47 different places, or change between versions.

Result: Developers spend more time trying to discover “the convention” than if they had explicit configuration.

4. Tools that “automate” but require complex configuration

CI/CD that promises simplicity but requires 500 lines of YAML with obscure syntax and non-obvious behavior.

Before: Manual deploy (10 known steps)
After: “Automatic” deploy (1 button, but if it breaks, nobody knows how to fix without reading 50 pages of documentation)

True simplification

Real simplification reduces required mental effort. False simplification just moves complexity elsewhere (usually somewhere less visible, where it’s harder to debug). Hiding complexity is not the same as removing it.

How the three dimensions reinforce or destroy each other

Flow, Feedback, and Mental Burden aren’t independent — they form a system.

Virtuous cycle:

  • Low mental burden → Developers understand the system quickly
  • Fast understanding → Can work in flow (don’t need to constantly stop to ask)
  • Continuous flow → Make better decisions (brain isn’t fragmented)
  • Good decisions → System becomes clearer (documentation, consistent architecture)
  • Clear system → Mental burden stays low

Vicious cycle:

  • High mental burden → Developers don’t understand the system
  • Lack of understanding → Constantly interrupt to ask (destroys flow)
  • Fragmented flow → Rushed or defensive decisions
  • Bad decisions → System becomes more confusing (technical debt, inconsistencies)
  • Confusing system → Mental burden increases further

Feedback fits into this system:

Fast and clear feedback reduces mental burden:

  • Errors are identified immediately
  • Developer doesn’t need to “remember” what they were doing 30 minutes ago
  • Learning happens in real time

Slow and obscure feedback increases mental burden:

  • Developer needs to maintain mental context for long periods
  • Ambiguous error messages require investigation (more load)
  • Trial-and-error cycle becomes exhausting

Systemic interdependence

You can’t “improve DevEx” by optimizing one dimension while ignoring the others. Fast feedback doesn’t help if mental burden is too high to take advantage of it. Flow doesn’t help if feedback is too slow to sustain continuity. The three dimensions need to be addressed as an interdependent system.

Measurement methodology

The DevEx approach proposes a measurement methodology based on two complementary types of indicators:

Perceptual indicators

These are collected directly from developers, usually via periodic surveys. The goal is to capture subjective experience — something system data can’t reveal.

  • Structured surveys: Standardized questions applied regularly (quarterly or semi-annually) to measure perception of flow, satisfaction with feedback, and system clarity.
    • “How often can you work without interruptions for at least 2 hours?”
    • “How satisfied are you with the time it takes to receive CI feedback?”
  • Experience scale: Developers rate specific aspects on numerical scales, enabling comparison over time.
    • “From 1 to 5, how easy is it to understand the architecture of the system you work on?”
    • “From 1 to 5, how confident do you feel making changes to this code?”
  • Friction identification: Open questions reveal problems that automatic metrics don’t detect — like confusing processes or tribal knowledge.
    • “What is the biggest obstacle you face in completing your work?”
    • “What would you change about the development process if you could?”
  • Objective metrics validation: If build time is 5 minutes but developers report dissatisfaction, something is wrong that the numbers don’t show.
    • “Does the current build time negatively impact your work? Why?”
    • “Do the available tools meet your needs? What’s missing?”
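
A small sketch, under stated assumptions, of how such responses could be aggregated into per-dimension scores. The field names are hypothetical; the DevEx paper defines the dimensions, not this code.

```typescript
// Hypothetical sketch: aggregate 1-5 survey responses into an average per dimension.

type Dimension = "flow" | "feedback" | "mentalBurden";

interface SurveyResponse {
  dimension: Dimension;
  score: number; // 1 (worst) to 5 (best)
}

function averageByDimension(responses: SurveyResponse[]): Map<Dimension, number> {
  const totals = new Map<Dimension, { sum: number; count: number }>();
  for (const r of responses) {
    const entry = totals.get(r.dimension) ?? { sum: 0, count: 0 };
    entry.sum += r.score;
    entry.count += 1;
    totals.set(r.dimension, entry);
  }
  const averages = new Map<Dimension, number>();
  for (const [dimension, { sum, count }] of totals) {
    averages.set(dimension, sum / count);
  }
  return averages;
}

// Example: two respondents answering the flow question above.
const scores = averageByDimension([
  { dimension: "flow", score: 2 },
  { dimension: "flow", score: 3 },
  { dimension: "feedback", score: 4 },
]);
console.log(scores.get("flow")); // 2.5
```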

Workflow measurements

These are collected automatically from systems and tools. The goal is to have objective data that complements subjective perception.

  • Build and CI time: How long between commit and pipeline feedback? Measured directly from the CI/CD system.
    • “What is the average local build time?”
    • “What is the average CI pipeline time until first feedback?”
  • PR review time: How long does a pull request wait for review? Extracted from the version control system.
    • “What is the average time between PR opening and first comment?”
    • “What is the average time between PR opening and merge?”
  • Interruption frequency: How many meetings per day? How many context switches? Can be inferred from calendars and communication tools.
    • “How many meetings does a developer have on average per day?”
    • “What is the largest free time block in the average calendar?”
  • Code complexity: Metrics like cyclomatic complexity, coupling between modules, and documentation coverage — extracted via static analysis.
    • “What is the average cyclomatic complexity per module?”
    • “What percentage of code has up-to-date documentation?”
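
And a sketch of one workflow measurement: average time to first review computed from pull-request timestamps. The data shape here is hypothetical; in practice the events would come from the version control system's API.

```typescript
// Hypothetical sketch: average time between PR opening and first review comment.

interface PullRequest {
  openedAt: Date;
  firstReviewAt?: Date; // undefined if never reviewed
}

function averageHoursToFirstReview(prs: PullRequest[]): number | null {
  const reviewed = prs.filter((pr) => pr.firstReviewAt !== undefined);
  if (reviewed.length === 0) return null;
  const totalHours = reviewed.reduce((sum, pr) => {
    const ms = pr.firstReviewAt!.getTime() - pr.openedAt.getTime();
    return sum + ms / (1000 * 60 * 60);
  }, 0);
  return totalHours / reviewed.length;
}

const hours = averageHoursToFirstReview([
  { openedAt: new Date("2024-03-01T09:00:00Z"), firstReviewAt: new Date("2024-03-01T15:00:00Z") },
  { openedAt: new Date("2024-03-02T09:00:00Z") }, // never reviewed, excluded
]);
console.log(hours); // 6
```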

Why combine both types

For each dimension, the model suggests:

| Dimension | Perceptual metric | Workflow metric |
| --- | --- | --- |
| Flow | “How often can you enter a deep focus state?” | Number of meetings per day, time between context switches |
| Feedback | “How satisfied are you with build/CI time?” | Build time, test execution time, PR review time |
| Mental burden | “How easy is it to understand the codebase?” | Cyclomatic complexity, documentation coverage |

Dual perspective

Objective measurements alone can deceive. A 5-minute build seems fast — but if the developer needs to run it 10 times a day to debug, the experience is terrible. Perceptual indicators capture what numbers don’t show: lived reality.

DevEx, DORA, and SPACE: complementarity

So far, we’ve seen three approaches in the series. A natural question arises: how do they relate?

What each model sees

Each approach emerges from a different concern and, therefore, illuminates distinct aspects of the same system:

| Framework | Central question | What it observes | What it ignores |
| --- | --- | --- | --- |
| DORA | “Does the system deliver well?” | Pipeline flow (frequency, stability) | Human cost, experience |
| SPACE | “Are we measuring correctly?” | Multiple productivity dimensions | How dimensions feel |
| DevEx | “What’s it like to work here?” | Lived experience as technical variable | Delivery metrics, output |

Different perspectives

DORA looks at the house from outside: “How many people enter and leave? How often?” SPACE maps the rooms: “What dimensions exist?” DevEx asks those who live there: “What’s it like to live here?”

Different diagnoses for the same problem

To illustrate how the frameworks complement each other, consider a common scenario: team with high turnover and delayed deliveries.

What DORA would see:

  • Lead time increasing month over month
  • Deploy frequency decreasing
  • Failure rate stable or increasing
  • Diagnosis: “Pipeline is degrading. We need to improve automation and delivery processes.”

What SPACE would see:

  • Satisfaction declining
  • High activity but low performance
  • Fragmented collaboration
  • Diagnosis: “Multiple dimensions are deteriorating simultaneously. Something systemic is happening.”

What DevEx would see:

  • Flow state rarely achieved (constant interruptions)
  • Slow feedback (30-minute builds, PRs waiting days)
  • High mental burden (confusing architecture, nonexistent documentation)
  • Diagnosis: “The environment is cognitively hostile. People are leaving because working here is exhausting.”

Complementary diagnoses

The three diagnoses don’t contradict each other — they complement. DORA shows that something is wrong. SPACE shows where the problem manifests. DevEx shows why the problem exists in everyday experience.

How to use all three together

These approaches don’t compete — they complement. A mature organization can:

  1. Use DORA to monitor pipeline health (flow metrics)
  2. Use SPACE to ensure it’s not optimizing just one dimension at the expense of others
  3. Use DevEx to understand if metrics reflect real experience

Example of combined use

Situation: Platform team wants to improve product teams’ productivity.

Step 1 — DORA as baseline:

  • Measure deploy frequency, lead time, MTTR, failure rate
  • Identify pipeline bottlenecks
  • Establish benchmarks

Step 2 — SPACE for multidimensional view:

  • Check if optimizing delivery is hurting satisfaction
  • Verify if activity is high but performance low
  • Evaluate collaboration quality between teams

Step 3 — DevEx for deep diagnosis:

  • Measure mental burden (surveys + code complexity)
  • Evaluate feedback cycles (build time, PR review)
  • Identify flow destroyers (meetings, interruptions)

Step 4 — Triangulation:

  • Cross-reference objective data (DORA) with perception (DevEx)
  • Verify if improvements in one dimension (SPACE) hurt others
  • Prioritize interventions based on combined evidence
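
As an illustration of what this triangulation could look like in data terms, here is a hedged sketch that crosses objective delivery numbers with perceptual scores and flags divergences. Field names and thresholds are hypothetical, chosen only to show the idea of crossing the two kinds of evidence.

```typescript
// Hypothetical sketch: flag dimensions where objective data and perception diverge.

interface TeamSnapshot {
  // DORA-style objective metrics
  deploysPerWeek: number;
  leadTimeHours: number;
  // DevEx-style perceptual scores (1-5 survey averages; higher = better)
  flowScore: number;
  feedbackScore: number;
  mentalBurdenScore: number;
}

function findDivergences(snapshot: TeamSnapshot): string[] {
  const issues: string[] = [];
  // Pipeline looks healthy but people rate feedback poorly: a hidden cost.
  if (snapshot.leadTimeHours < 24 && snapshot.feedbackScore < 3) {
    issues.push("Lead time looks good, but developers rate feedback poorly: check local/CI cycles.");
  }
  // High deploy frequency with low flow: throughput may be riding on fragmentation.
  if (snapshot.deploysPerWeek > 10 && snapshot.flowScore < 3) {
    issues.push("High deploy frequency with low flow: velocity may be masking interruption load.");
  }
  if (snapshot.mentalBurdenScore < 3) {
    issues.push("Reported mental burden is high regardless of delivery metrics.");
  }
  return issues;
}
```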

Complementarity with DORA and SPACE

What DevEx adds

  • Experience as technical variable
  • Perceptual + objective metrics
  • Focus on mental burden and flow
  • Captures invisible costs

What DevEx doesn't replace

  • Delivery metrics (DORA)
  • Broad multidimensional view (SPACE)
  • Industry benchmarks
  • Output metrics

When to use each framework

| Situation | Recommended framework | Why |
| --- | --- | --- |
| Evaluate DevOps maturity | DORA | Standardized metrics, available benchmarks |
| Diagnose productivity decline | SPACE | Multidimensional view avoids myopic optimization |
| Investigate high turnover | DevEx | Focuses on lived experience that causes departures |
| Justify tooling investment | DORA + DevEx | Combines delivery metrics with perception |
| Redesign team processes | SPACE + DevEx | Balances dimensions with real experience |

Tensions between approaches

Despite complementarity, there are tensions that need to be recognized:

DORA vs DevEx: DORA may indicate high throughput while DevEx shows poor experience. Teams can deliver fast despite hostile systems — until they can’t anymore. An optimized pipeline doesn’t guarantee that working in it is sustainable.

SPACE vs DevEx: SPACE includes satisfaction as one dimension among five. DevEx argues that satisfaction isn’t a dimension — it’s a consequence of the three central dimensions (flow, feedback, load). SPACE treats satisfaction as a metric; DevEx treats it as an outcome.

DORA vs SPACE: DORA focuses on four specific delivery metrics. SPACE argues that productivity is irreducible to a fixed set of metrics. Using only DORA can create blind spots; using only SPACE can create paralysis from too many dimensions.

The risk of optimizing separately: Improving DevEx without looking at DORA can create “comfortable” environments that don’t deliver. Improving DORA without looking at DevEx can create fast pipelines that exhaust people. Improving SPACE without focus can dilute effort across too many dimensions.

Multiple lenses needed

No model is complete. Each illuminates different aspects of the same system. The value lies in using multiple lenses — not in choosing one and ignoring the others. Maturity lies in knowing which lens to use for which question.

The Structural Critique of DevEx

Like DORA and SPACE before it, DevEx is not a neutral tool. It carries assumptions, limits, and risks that need to be made explicit. An organization that adopts DevEx without understanding its limitations may end up reproducing exactly the problems it was trying to solve.

What DevEx doesn’t see: The organizational dimension

The DevEx model focuses on individual and team experience — how developers feel while working, how much flow they achieve, how fast they receive feedback. But it systematically ignores the organizational structures that produce this experience.

Concrete example:

Team reports high mental burden. DevEx survey confirms: developers spend 60% of their time navigating accidental complexity. Company invests in:

  • Better documentation
  • Faster tools
  • Architecture training

Six months later, mental burden remains high.

Real cause DevEx didn’t capture:

  • Approval structure requires sign-off from 4 hierarchical levels
  • Architectural decisions are made by committee that meets once per month
  • Teams have no autonomy to change tooling without centralized approval
  • Junior developers can’t question senior decisions

The problem wasn’t lack of documentation or bad tools. It was power structure that concentrates decisions, fragments authority, and transforms technical work into political navigation.

DevEx measures symptoms. It doesn’t ask who created the conditions that produce those symptoms.

Organizational blind spots

When an approach ignores organizational dimensions, it allows organizations to treat political problems as if they were technical. “Let’s improve DevEx” becomes a substitute for “let’s redistribute power and change incentives”. One is palatable. The other is threatening.

When DevEx becomes performative theater

There’s a predictable risk: DevEx becomes an HR or internal marketing initiative. Something the company “does” to appear to care about developers, without structurally changing anything.

Typical scenario:

Company announces “DevEx Program”:

  • Hires specialized consultancy
  • Applies quarterly surveys
  • Creates dashboards with the 3 dimensions
  • Presents results to leadership

Developers report:

  • Slow build: 20 minutes
  • High mental burden: confusing architecture
  • Fragmented flow: 12 meetings per week

Company responds:

  • Buys “more modern” tool licenses ($200k/year)
  • Promises to “review processes” (but changes nothing)
  • Creates “no-meeting day” (Friday, but nobody respects it)

Result in 1 year:

  • Metrics didn’t change
  • Developers more cynical (“it’s just theater”)
  • Company points to “DevEx investment” as evidence it “cares”

DevEx became a check-box. Something to show in all-hands and recruitment. The model was instrumentalized to perform care without actually exercising it.

Measurement without power

Measuring experience without giving autonomy to change it is worse than not measuring. It creates expectation that something will improve, followed by frustration when nothing changes. DevEx without authority to intervene is just corporate satisfaction research.

The political instrumentalization of DevEx

Productivity models are never neutral. They carry interests — from who created them, who funds them, who implements them. DevEx is no exception.

Question rarely asked: Who decides what is “good experience”?

Scenario A — Developers want:

  • 4-day week (studies show it reduces burnout and improves focus)
  • Permanent remote work (reduces interruptions, improves flow)
  • Less pressure from impossible deadlines (allows quality work)

Company responds:

  • “That’s not DevEx, that’s benefits”
  • “We need office presence for collaboration”
  • “Deadlines are business reality”

Company proposes “DevEx improvements”:

  • Faster machines (approved)
  • More monitoring tools (approved)
  • Deploy “gamification” with badges (approved)

What’s happening: Company defines DevEx as “what improves experience without changing power structure or reducing work extraction”. Developers want autonomy, flexibility, sustainable pace. Company offers tools.

DevEx became a bargain: “We’ll improve your tools. In exchange, we continue demanding the same unsustainable throughput.”

Definition as power

If “improving DevEx” only means optimizing tools and processes without touching workload, autonomy, or pressure, the approach becomes an instrument for maintaining the status quo disguised as progressive improvement.

Why DevEx can be used against developers

There’s a cruel paradox: improving DevEx can serve as justification to demand more.

Perverse logic:

Before: 20-minute build. Slow deploys. Bad tools.

  • Organization accepts moderate velocity (“can’t do more with this environment”)

Company invests in DevEx:

  • Build drops to 2 minutes
  • Automated deploy
  • Modern tools

After:

  • “Now that DevEx is good, why hasn’t velocity increased proportionally?”
  • “We invested $500k in tools. We expect ROI.”
  • Pressure for throughput increases

Developers now work in a better environment, but under greater pressure. The improvement was real — but was captured as justification for extracting more work.

Another scenario:

Company implements “excellent DevEx”: fast tools, clear processes, impeccable documentation. But:

  • On-call is 24/7 because “deploy is so easy you can do it at midnight”
  • Slack response expectation is <5 minutes because “tools are fast”
  • Vacations are interrupted because “you have remote access to everything”

DevEx optimized not for sustainable work, but for continuous extraction.

The intensification trap

If improving experience only serves to increase throughput without questioning total workload, DevEx becomes an instrument of intensification, not sustainability. The problem stops being friction and becomes infinite demand.

Who pays the cost of “good DevEx”?

Not everyone experiences DevEx the same way. And not everyone pays the same price when it’s “improved”.

Junior developers:

  • Benefit from clearer tools and better documentation
  • But: if DevEx becomes expectation of immediate high productivity, pressure on them increases
  • “With this environment, you should be delivering more”

Platform teams:

  • Responsible for building and maintaining infrastructure that improves other teams’ DevEx
  • Often don’t have good DevEx for themselves
  • Overburdened maintaining tools others use

Developers on remote teams:

  • May have better DevEx (fewer interruptions, greater flow)
  • But: lose political visibility, informal access to decisions, networking
  • Promotions and opportunities depend on physical presence in many companies

Developers from underrepresented groups:

  • Navigate already hostile systems (microaggressions, exclusion, tokenization)
  • DevEx focused on “productivity” ignores emotional and cognitive cost of this context
  • “Why aren’t you as productive as others?” ignores that others aren’t navigating the same barriers

Cost is distributed unequally. So is benefit.

Who usually gains from DevEx
  • Senior developers with autonomy
  • Product teams with resources
  • People in technical power positions
  • Organizations that can invest
Who usually pays the cost
  • Juniors under increased pressure
  • Overburdened platform teams
  • People in marginalized contexts
  • Companies that can't invest but compete

DevEx doesn’t distribute benefits equally. And models that ignore this perpetuate inequalities under a veneer of “improvement for all”.

When focusing on DevEx is escapism

There are contexts where investing in DevEx is avoiding the real problem.

Scenario 1: Startup with 6 months of runway

Company is dying. Product-market fit doesn’t exist. Revenue isn’t growing. But:

  • CTO decides to “improve DevEx” before focusing on product
  • Invests 2 months optimizing CI/CD, refactoring architecture, improving tools
  • Developers work in better environment
  • Company breaks 4 months later

DevEx was used as escape. Easier to optimize build than face that the product doesn’t work.

Scenario 2: Excellent technical team, mediocre product

Clean code. Robust tests. Automated deploy. Impeccable DevEx. But:

  • Product doesn’t solve real problem
  • Users don’t return
  • Growth stagnated

Team focused on internal excellence while ignoring external value.

Scenario 3: Structurally dysfunctional organization

Rigid hierarchy. Political decisions. Perverse incentives. But:

  • Company invests in “improving DevEx”
  • As if better tools compensated for broken structure
  • Developers continue leaving — not due to bad tools, but toxic context

DevEx became band-aid for deep organizational wound.

Purpose over process

Excellent technical environment doesn’t compensate for meaningless work, valueless product, or dysfunctional organization. Good tools facilitate work — but don’t create reason to do it.

When DevEx is not enough

Applicability limits

DevEx helps when

  • The problem is measurable technical friction
  • There's autonomy to change processes
  • The organization is willing to redistribute power
  • Local improvements are politically possible

DevEx doesn't solve when

  • The problem is organizational structure
  • Decisions are outside the team's reach
  • Organizational incentives contradict change
  • Architecture reflects immutable power structure
  • The product has no real value

The question the model doesn’t ask

DevEx measures flow, feedback, and mental burden. But it doesn’t ask: productivity for what?

If the goal is:

  • Maximize work extraction → DevEx becomes intensification tool
  • Create valuable software → DevEx is useful instrument
  • Keep developers engaged while organization is dysfunctional → DevEx is theater

The approach doesn’t answer “what is this productivity for?” It assumes productivity is intrinsically good. But productivity applied to a valueless product, sustained by exploitation, or used to maintain the status quo isn’t an achievement. It’s a problem.

Beyond technical solutions

DevEx offers lenses to see experience problems. But it doesn’t offer criteria to decide when experience matters more than other things, or when improving experience serves dubious purposes. This requires judgment that no model provides.

DevEx as a piece of the puzzle

The DevEx approach fills an important gap: it formalizes the intuition that systems difficult to live in produce worse software. Not as opinion, but as structured research with measurement methodology.

But — like DORA and SPACE before it — DevEx is a lens, not a complete answer.

The core contribution

DevEx’s central contribution isn’t just the three dimensions or the measurement methodology. It’s the assertion that experience is a technical variable — one that deserves the same analytical rigor we give to throughput, latency, or availability.

The question that remains

Throughout this series, we’ve accumulated approaches: DORA for flow, SPACE for multidimensionality, DevEx for lived experience. Each illuminates a different aspect of the same problem.

But having multiple lenses isn’t the same as knowing where to focus. When everything seems important, where exactly should we intervene?

Footnotes

[1] Noda, Abi; Storey, Margaret-Anne; Forsgren, Nicole; Greiler, Michaela. DevEx: What Actually Drives Productivity. ACM Queue, 2023. The paper proposes a framework based on three dimensions (flow, feedback cycles, and mental burden) for systematically measuring and improving developer experience.

[2] Csikszentmihalyi, Mihaly. Flow: The Psychology of Optimal Experience. Harper & Row, 1990. The book presents decades of research on deep concentration states and their conditions, and has become a fundamental reference for understanding productivity in creative and intellectual work.

[3] Sweller, John. Cognitive Load Theory. Springer, 2011. The theory proposes that learning is optimized when cognitive load is properly managed, distinguishing between intrinsic load (inherent to the material), extraneous load (imposed by instructional design), and germane load (dedicated to building mental schemas).

[4] Forsgren, Nicole; Storey, Margaret-Anne; Maddila, Chandra; Zimmermann, Thomas; Houck, Brian; Butler, Jenna. The SPACE of Developer Productivity. ACM Queue, 2021. The paper introduces five dimensions for measuring developer productivity: Satisfaction and well-being, Performance, Activity, Communication and collaboration, and Efficiency and flow.
