
No time? Jump straight to the conclusion.
Why Engineering Metrics Matter
In today’s competitive technology landscape, the ability to deliver high-quality software efficiently isn’t just a nice-to-have—it’s essential for business survival. Engineering metrics provide organizations with a structured approach to measuring, understanding, and improving their software development processes.
Good engineering metrics serve multiple crucial purposes:
- Decision making: They provide data-driven insights that help leadership make informed decisions about resource allocation, process improvements, and strategic priorities.
- Continuous improvement: By establishing baselines and tracking progress over time, teams can identify bottlenecks, inefficiencies, and opportunities for optimization.
- Team alignment: Metrics create a shared language and understanding of what success looks like, helping to align teams around common goals.
- Cultural transformation: The right metrics can drive cultural changes by emphasizing what the organization truly values—whether that’s quality, speed, developer experience, or customer satisfaction.
- Business impact: Ultimately, engineering metrics should connect technical work to business outcomes, demonstrating how development practices impact the bottom line.
With effective engineering metrics, leaders can answer pressing questions like: “Are we on track to meet our goals? How can we make developers more productive? Which processes slow us down? Which issues need immediate attention?”
However, not all metrics are created equal. Using the wrong metrics or focusing too narrowly on certain measurements can lead to unintended consequences—from gaming the system to burnout and decreased quality. This is why understanding the different approaches to engineering metrics is so important.
Early Days: Activity-Based Metrics
The journey of engineering metrics began with simplistic activity-based measurements: lines of code written, hours worked, or number of features shipped. While easy to measure, these metrics often incentivized quantity over quality and failed to capture the true impact of engineering work.
During this era, lines of code (LOC) served as the primary indicator of programmer productivity (LOC per programmer-month) and program quality (defects per KLOC). This simple count acted as a surrogate measure for very different aspects of software development, including effort, functionality, and complexity.
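To make the era's arithmetic concrete, here is a minimal sketch of the two LOC-based surrogates; all figures are invented for illustration:

```python
# Classic LOC-based surrogates from the 1960s/70s.
# All numbers below are invented for illustration.

def loc_per_programmer_month(total_loc: int, programmer_months: float) -> float:
    """Productivity surrogate: lines of code per programmer-month."""
    return total_loc / programmer_months

def defects_per_kloc(defects: int, total_loc: int) -> float:
    """Quality surrogate: defects per thousand lines of code."""
    return defects / (total_loc / 1000)

print(loc_per_programmer_month(12_000, 6))  # 2000.0
print(defects_per_kloc(30, 12_000))         # 2.5
```

Note how both numbers improve if developers simply write more lines, which is exactly the perverse incentive discussed below.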
By the mid-1970s, the limitations of LOC became apparent, particularly with the growing diversity of programming languages. A line of code in assembly language clearly represents different levels of effort, functionality, and complexity compared to a line in a high-level language. This recognition sparked a surge of interest in more nuanced measurements.
The year 1976 marked a milestone with the publication of the first dedicated book on software metrics by Tom Gilb. Shortly thereafter, researchers like Halstead (1977) and McCabe (1976) pioneered measures of software complexity, while Albrecht (1979) introduced function points as a language-independent measure of size.
Throughout the 1980s and 1990s, the field of software metrics continued to evolve, shifting focus from individual programmer productivity to team effectiveness and process efficiency. The rise of agile methodologies in the early 2000s introduced new metrics that emphasized delivery speed, adaptability, and customer satisfaction.
references
- History of software metrics: https://www.eecs.qmul.ac.uk/~norman/papers/new_directions_metrics/HelpFileHistory_of_software_metrics_as_a.htm
- Software metrics book: https://openlibrary.org/books/OL21710170M/Software_metrics
- Halstead complexity measures: https://en.wikipedia.org/wiki/Halstead_complexity_measures
- Cyclomatic complexity: https://en.wikipedia.org/wiki/Cyclomatic_complexity
- Function Point Analysis (International Function Point User Group): https://ifpug.org/ifpug-standards/fpa
The DevOps Revolution and DORA Metrics (2013-2021)
The DevOps Research and Assessment (DORA) team, led by Dr. Nicole Forsgren, Jez Humble, and Gene Kim, conducted groundbreaking research that established a clear connection between technical practices and organizational performance. Their work, published in the annual State of DevOps Report starting in 2014 and in the book Accelerate, introduced four key metrics that have become industry standards:
- Deployment Frequency: How often an organization successfully releases to production
- Lead Time for Changes: The time it takes from code commit to code running in production
- Mean Time to Recovery (MTTR): How long it takes to restore service after a failure
- Change Failure Rate: The percentage of deployments that cause a failure in production
- Reliability (added 2021): The ability of a system to consistently perform as expected, minimizing failures and downtime to ensure a seamless user experience.
These metrics resonated because they were empirically validated to predict organizational performance while being relatively straightforward to measure.
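As a sketch of how these throughput and stability numbers might be derived from raw deployment records; the field names and data below are hypothetical, not part of any DORA specification:

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical deployment log; field names are illustrative only.
deployments = [
    {"committed": datetime(2024, 5, 1, 9),  "deployed": datetime(2024, 5, 1, 15), "failed": False},
    {"committed": datetime(2024, 5, 2, 10), "deployed": datetime(2024, 5, 3, 10), "failed": True,
     "restored": datetime(2024, 5, 3, 12)},
    {"committed": datetime(2024, 5, 4, 8),  "deployed": datetime(2024, 5, 4, 20), "failed": False},
]

days_observed = 7
deployment_frequency = len(deployments) / days_observed            # deploys per day
lead_time = median(d["deployed"] - d["committed"] for d in deployments)
failures = [d for d in deployments if d["failed"]]
change_failure_rate = len(failures) / len(deployments)             # 1 out of 3
time_to_recovery = median(d["restored"] - d["deployed"] for d in failures)
```

In practice these records would come from a CI/CD system and an incident tracker rather than a hand-written list, and medians are often preferred over means to blunt outliers.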
references
- State of DevOps Report: https://dora.dev/research
- First issue (2014): https://dora.dev/research/2014/
- 5th key DORA metric added (2021): https://cloud.google.com/blog/products/devops-sre/announcing-dora-2021-accelerate-state-of-devops-report
- Accelerate book: https://www.goodreads.com/book/show/35747076-accelerate
SPACE Framework (2021)
In 2021, researchers from GitHub, Microsoft, and the University of Victoria introduced the SPACE framework in a paper titled “The SPACE of Developer Productivity.” This framework took a holistic view of productivity that expanded beyond just output:
- Satisfaction and wellbeing: How fulfilled developers feel about their work
- Performance: The outcome of a development process
- Activity: The actions and tasks completed by developers
- Communication and collaboration: How effectively team members work together
- Efficiency and flow: How smoothly work progresses
references
- The SPACE of Developer Productivity: https://queue.acm.org/detail.cfm?id=3454124
DevEx Framework by Netlify, GitHub & Google (2023)
In 2023, researchers from Netlify, GitHub, and Google published “The DevEx Framework,” which outlines three core dimensions:
- Feedback Loops: The speed and quality of feedback developers receive
- Cognitive Load: The mental effort required to complete tasks
- Flow State: The ability to achieve focused, uninterrupted work
This framework also emphasizes measuring DevEx through a combination of system metrics, surveys, and qualitative studies.
references
- Measuring Developer Experience with the DevEx Framework: https://shipyard.build/blog/devex-framework/
- DevEx Framework: A 3D Metric Approach to Developer Effectiveness: https://www.hatica.io/blog/devex-framework/
- The 19 Developer Experience Metrics to Measure in 2025: https://linearb.io/blog/developer-experience-metrics
DX Core 4 Framework (2024)
In late 2024, the software development intelligence platform DX introduced the DX Core 4 framework, which focuses on four key dimensions:
- Speed
- Effectiveness
- Quality
- Impact
This framework aims to simplify measurement for engineering leaders and to help them communicate more effectively with non-technical stakeholders. This evolution reflects a growing recognition that software development is a complex, multidimensional activity that cannot be adequately measured by any single metric or even a single framework.
references
- Introducing the DX Core 4: https://getdx.com/news/introducing-the-dx-core-4/
- DX Unveils New Framework for Measuring Developer Productivity: https://www.infoq.com/news/2025/01/dx-core-4-framework/
GitHub Engineering System Success Playbook (ESSP, 2025)
Most recently, GitHub introduced the GitHub Engineering System Success Playbook (ESSP), a three-step process for sustainably improving engineering performance. It argues that better business outcomes result from the harmonious interplay of quality, velocity, and developer satisfaction, and recommends strengthening these foundational areas to unlock the full potential of engineering teams. Influenced by frameworks such as SPACE and DORA, the playbook provides a holistic approach to identifying roadblocks, evaluating potential solutions, and implementing change, using lagging and leading metrics across the zones of quality, velocity, developer satisfaction, and business outcomes. It stresses the importance of team perspectives, careful use of metrics, and continuous improvement.
references
- ESSP introduction: https://resources.github.com/engineering-system-success-playbook
Core Components of Each Framework
DORA Metrics
The DORA metrics focus on four key indicators of delivery performance:
- Deployment frequency: How often code is deployed to production
- Lead time for changes: Time from code commit to production deployment
- Change failure rate: Percentage of deployments causing failures
- Mean time to recovery: How quickly service is restored after an incident
- Reliability: How consistently a system performs as expected
SPACE Framework
The SPACE framework assesses developer productivity through five dimensions:
- Satisfaction and well-being: Measures developer fulfillment, health, and engagement
- Performance: Evaluates output quality and quantity
- Activity: Tracks actions like completed tasks, commits, and pull requests
- Communication and collaboration: Assesses team interactions and knowledge sharing
- Efficiency and flow: Measures how effectively developers work without interruptions
DevEx Framework
The DevEx framework focuses on three dimensions of developer experience:
- Feedback loops: How quickly developers receive input on their actions and code
- Cognitive load: The mental effort required to understand and use tools and documentation
- Flow state: The ability to maintain focused, uninterrupted work
DX Core 4 Framework
The DX Core 4 framework simplifies measurement with four balanced dimensions:
- Speed: Measured through “diffs per engineer” rather than traditional lead time
- Effectiveness: How well engineering work achieves intended outcomes
- Quality: The reliability and stability of delivered code
- Impact: The business value generated by engineering efforts
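A minimal sketch of the Core 4 speed measure, assuming a merged-diff log keyed by author; the names and counts are invented:

```python
from collections import Counter

# Hypothetical authors of diffs merged during one month.
merged_diff_authors = ["alice", "bob", "alice", "carol", "alice", "bob"]

diffs_by_engineer = Counter(merged_diff_authors)   # alice: 3, bob: 2, carol: 1
diffs_per_engineer = len(merged_diff_authors) / len(diffs_by_engineer)

print(diffs_per_engineer)  # 2.0
```

Such counts are intended as team- or organization-level aggregates; using them to rank individual engineers invites the gaming problems discussed earlier.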
GitHub Engineering System Success Playbook
The ESSP focuses on two main concepts: zones and metrics.
Zones
ESSP identifies the following layered success zones, through which software engineering can deliver greater value to the business:
- Quality
- Velocity
- Software Engineers’ Happiness
- Business Outcomes
Metrics
Per zone, ESSP proposes the following metrics:
Quality
- (Median) Change failure rate
- (Median) Failed deployment recovery time
- (Median) Code security and maintainability
Velocity
- (Median) Lead time
- Deployment frequency
- (Mean) PRs merged per developer
Software Engineers’ Happiness
- (Median) Flow state experience
- (Median) Engineering tooling satisfaction
- (Median) Copilot satisfaction (in my view, this could cover any AI-based coding assistant, such as the Cursor IDE or others)
Business Outcomes
- (Percentage) AI leverage
- (Percentage) Engineering expenses to revenue
- (Percentage) Feature engineering expenses to total engineering expenses
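The three business-outcome percentages can be sketched as simple ratios. The exact ESSP definitions may differ, all inputs below are invented, and “AI leverage” is approximated here as the share of AI-assisted pull requests (an assumption, not ESSP’s definition):

```python
# Invented inputs for illustration; not real financials.
revenue = 10_000_000
engineering_expenses = 2_500_000
feature_engineering_expenses = 1_500_000   # spend on new features (vs. total engineering)
ai_assisted_prs, total_prs = 320, 800      # proxy for AI leverage (assumption)

engineering_expenses_to_revenue = 100 * engineering_expenses / revenue                    # 25.0 %
feature_to_total_engineering = 100 * feature_engineering_expenses / engineering_expenses  # 60.0 %
ai_leverage = 100 * ai_assisted_prs / total_prs                                           # 40.0 %
```

Tracking these as percentages rather than absolute amounts lets non-technical stakeholders compare periods and teams of different sizes directly.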
Key Similarities
- Holistic Approach: All frameworks recognize that traditional, single-dimensional metrics are inadequate for measuring modern software development.
- Balance of Technical and Human Factors: Each framework acknowledges that both technical performance and human factors contribute to overall productivity.
- Research-Backed Development: All frameworks were developed with input from industry experts and researchers, grounding them in both academic research and practical experience.
- Team Over Individual Focus: These frameworks generally shift emphasis from individual developer metrics to team-level outcomes.
- Beyond Output Measures: All frameworks move beyond simple output measures to include outcome and impact metrics that better reflect value delivery.
Notable Differences
- Scope and Focus:
  - SPACE offers the most comprehensive approach, with five dimensions spanning technical and human aspects
  - DevEx zeroes in specifically on developer experience and removing friction
  - DORA focuses primarily on delivery pipeline performance
  - DX Core 4 and ESSP attempt to balance technical metrics with business impact
- Measurement Approaches:
  - SPACE combines system data (quantitative) with perceptual data (qualitative)
  - DevEx relies heavily on surveying developer perceptions
  - DORA is primarily quantitative, focusing on measurable delivery outcomes
  - DX Core 4 and ESSP introduce metrics designed to be accessible to non-technical stakeholders
- Primary Audience:
  - SPACE targets engineering managers seeking comprehensive productivity improvements
  - DevEx focuses on improving day-to-day developer experience
  - DORA aims at DevOps teams optimizing delivery pipelines
  - DX Core 4 explicitly targets communication with business executives
  - ESSP focuses on executives
- Balancing Mechanisms:
  - SPACE uses five dimensions to provide a balanced view
  - DevEx uses three dimensions centered on developer experience
  - DORA balances throughput metrics with stability metrics
  - DX Core 4 uses “oppositional metrics” (speed vs. effectiveness, impact vs. quality) to create balance
  - ESSP proposes four zones with multiple metrics per zone
- Maturity and Adoption:
  - DORA is the most established and widely adopted
  - SPACE has gained significant traction since 2021
  - DevEx is newer but building momentum
  - DX Core 4 (late 2024) and ESSP (April 2025) are the newest and are still gaining adoption
Conclusion: Integrating the Approaches and Frameworks
While I’ve presented these frameworks as distinct approaches, the most effective engineering organizations recognize that they’re complementary rather than competitive. Each framework illuminates different aspects of engineering effectiveness, and together they provide a more complete picture than any single framework alone.
For example:
- DORA metrics can tell you how effectively your delivery pipeline is functioning
- SPACE can help you understand broader productivity factors including team dynamics
- DevEx, DX Core 4 and ESSP can highlight opportunities to make your software engineers’ daily experience more effective and satisfying
- DX Core 4 and ESSP balance engineering happiness and business impact
The key is to start with clear objectives:
- What specific challenges is your organization facing?
- What outcomes are you trying to improve?
- What impact are you striving for?
- What cultural changes do you want to drive?
Then, select metrics and frameworks that align with those objectives, being careful not to implement too many metrics at once. Begin with a small set of meaningful measurements, establish baselines, and expand as your measurement capability matures.
Remember that the purpose of these frameworks isn’t measurement for measurement’s sake, but rather to drive continuous improvement in ways that matter to your business, your customers, and your engineering teams:
“When a measure becomes a target, it ceases to be a good measure”
Goodhart’s law: https://en.wikipedia.org/wiki/Goodhart%27s_law