
Does measuring software engineering performance actually deliver value?

The concept of measuring the performance of software development teams is nothing new, but it recently returned to the public consciousness with a little controversy, thanks to a McKinsey article. Guest author Jason English shares his perspective on why everyone hasn't already jumped on the measurement bandwagon.

Jason English, Intellyx (Guest)
November 3, 2023

Every enterprise in the world wants to maximize performance: delivering for customers better, faster, and cheaper than the competition.

Further, software company executives love to repeat the mantra that “every company is a software company” as often as possible.

Therefore, it stands to reason that management consulting firms would seek to apply their MBA statistical models to maximize performance of the software-producing function of any enterprise.

The concept of measuring the performance of software development teams is nothing new, but it returned to the public consciousness with a little controversy thanks to this recent McKinsey piece titled “Yes, you can measure software developer productivity.”

Implement their methodology, the article says, and developers could realize a 20-to-30 percent reduction in customer-reported defects, a 20 percent improvement in employee experience scores, and a 60 percent improvement in customer satisfaction.

Sounds incredible! With results like that, why hasn’t everyone already jumped on their proposed measurement bandwagon?

Why measure developer productivity?

Compared to other process-oriented industries, the software industry has been rather undisciplined in its approach to measuring results. An ineffable ‘tiger team’ mentality arose, where we expected one genius developer or an expert team to lock themselves in the office with a couple pizzas and some Jolt Cola, and hammer out brilliant code.

This ‘code cowboy’ mentality predictably led to failure and heartbreak, as two-thirds of software projects consistently failed to meet budgets and timelines.

CEOs and CFOs were constantly frustrated by a lack of accountability. They wanted engineering orgs to take a page from the discipline of industrial supply chain optimization, so software development could realize the benefits of KPI measurements, Kanban-style workflows, and process automation that built everything else in our modern economy.

The DevOps movement evolved from Agile methodologies around 2008, and engineering organizations started looking at software delivery through a continuous improvement lens. We learned to empower dev teams to collaborate with empathy while ‘measuring what matters’ and ‘automating everything’ toward delivering customer value.

The release of The Phoenix Project book articulated the connection between DevOps and supply chain optimization, highlighting the Three Ways: flow/systems thinking, feedback loops, and a culture of continuous improvement reminiscent of the best-running Toyota car factories in Japan.

In an industrial supply chain scenario, planners could look for signals like supplier availability, work-in-process, and inventory turns as performance indicators. By comparison, software development deals with much less substantial signals — bits and bytes moving over the internet: the intellectual assets of ideas, requirements, and data.

If we are to achieve a new wave of industrialization in the software industry, clearly coming to grips with the data that feeds the software supply chain is our first priority.

Where measurements meet incentives

The McKinsey model was built atop two currently popular frameworks: DORA (DevOps Research and Assessment) metrics, popularized by Google and many other companies invested in the DevOps movement; and SPACE metrics (satisfaction and well-being, performance, activity, communication and collaboration, and efficiency and flow) added by GitHub and Microsoft.

On top of that, they added a set of new ‘opportunity focused’ metrics: Developer velocity benchmarks, contribution analysis, talent capability score, and inner/outer loop time spent.

Interestingly, their “inner/outer loop” metric uniquely prioritizes time spent in the “inner loop” of building software (coding and testing) over “outer loop” time spent on integration, integration testing, releasing, and deployment.

But what if that outer loop is a vitally important part of certain roles in the engineering org? To avoid technical debt, we need architects focused on system design, and SREs capable of tracking down root causes of issues in deployment.

This wonderfully vitriolic response from Kent Beck and Gergely Orosz in The Pragmatic Engineer offers a perfect example of how a measurement initiative that started with decent results eventually strayed:

“At Facebook we [Kent here] instituted the sorts of surveys McKinsey recommends. That was good for about a year. The surveys provided valuable feedback about the current state of developer sentiment.

Then folks decided that they wanted to make the survey results more legible so they could track trends over time. They computed an overall score from the survey. Very reasonable thing to do. That was good for another year. A 4.5 became a 4. What happened?

Then those scores started cropping up in performance reviews, just as a "and they are doing such a good job that their score is 4.5". That was good for another year.

Then those scores started getting rolled up. A manager’s score was the average of their reports’ scores. A director's score would be the average of their reporting managers’ scores.

Now things started getting unhinged. Directors put pressure on managers for better scores. Managers started negotiating with individual contributors for better survey scores. “Give me a 5 & I’ll make sure you get an ‘exceeds expectations’.” Directors started cutting managers & teams with poor scores, whether those cuts made organizational sense or not.”

Whoa. How orgs act upon development metrics is as important as the measurements themselves. Nobody wants to see performance improvement goals create a zero-sum game that disheartens valued technical talent.
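
To make the roll-up in that story concrete, here is a minimal sketch of the arithmetic Beck describes: a manager's score as the average of their reports' scores, and a director's score as the average of their managers' scores. The names and survey values are hypothetical and purely illustrative; nothing here comes from Facebook's actual system.

```python
from statistics import mean

# Hypothetical survey scores (1 to 5), grouped by manager under one director.
reports_by_manager = {
    "manager_a": [4.5, 4.0, 5.0],
    "manager_b": [3.0, 3.5],
}


def manager_score(report_scores):
    """A manager's score is the average of their reports' scores."""
    return mean(report_scores)


def director_score(teams):
    """A director's score is the average of their managers' scores."""
    return mean(manager_score(scores) for scores in teams.values())


manager_scores = {name: manager_score(s) for name, s in reports_by_manager.items()}
rolled_up = director_score(reports_by_manager)
```

One design point worth noticing: an average of averages weights every manager equally regardless of team size, so the rolled-up number can drift from the mean across all individual respondents. Each level of roll-up also hides more of the detail that made the original survey useful.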

On the positive side, McKinsey’s article can only spur more thought and discussion among the development community toward how engineering orgs can deliver more predictable metrics, like the ones CEOs and CFOs expect to see from other groups like sales and customer services.

Developer enablement metrics for success at Autodesk

You already know Autodesk—if you’ve ever seen a really cool modern building, or a hyper-realistic 3D animated film, chances are, their software was used by professionals to help design or create it.

Autodesk supports a suite of highly refined and specialized CAD and design tools, but as they started migrating to a common cloud-and-microservices-based architecture to improve scalability and automate deployment infrastructure, delivery time became unpredictable, with teams stymied by environment availability and service interdependencies.

“If ten teams are doing well and only one team is doing poorly, you are only as good as your weakest link,” said Ben Cochran, VP of the newly formed Developer Enablement team, reporting directly to the CTO.

With an eye to improving developer experience and morale across their system, rather than at an individual level, the team adopted DORA metrics, including deployment frequency, mean time to recovery (MTTR), lead time, and change failure rate (CFR) as Autodesk's foundation for productivity measurement.
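
As a rough illustration of what these four measurements involve, here is a minimal sketch that computes them from a list of deployment records at the team or system level. The `Deployment` record, its field names, and the simplified metric definitions (for example, recovery time measured from deployment to restoration) are assumptions made for this example, not Autodesk's or Faros AI's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape; real data would come from CI/CD and incident tools.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Compute simplified versions of the four DORA metrics for one period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    failures = [d for d in deployments if d.failed]
    recoveries = [hours(d.restored_at - d.deployed_at) for d in failures if d.restored_at]
    return {
        "deployment_frequency_per_day": len(deployments) / period_days,
        "lead_time_hours": mean(hours(d.deployed_at - d.committed_at) for d in deployments),
        "change_failure_rate": len(failures) / len(deployments),
        "mttr_hours": mean(recoveries) if recoveries else None,
    }


# Illustrative data only: four deployments over four days, one of which failed.
t0 = datetime(2023, 11, 1, 9, 0)
sample = [
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=8), failed=True,
               restored_at=t0 + timedelta(hours=10)),
    Deployment(t0 + timedelta(days=1), t0 + timedelta(days=1, hours=6)),
    Deployment(t0 + timedelta(days=2), t0 + timedelta(days=2, hours=6)),
]
metrics = dora_metrics(sample, period_days=4)
```

Note that the function takes a collection of deployments rather than anything keyed to a person, which matches the intent described above: the numbers describe how the delivery system is performing, not how an individual developer is performing.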

The output velocity and business outcomes of their software team improved, but in the macro view, what made all the difference was creating an environment of collaboration and shared learning that removes roadblocks, rather than taking punitive measures based on measurements.

The Intellyx Take

For engineers, too much emphasis on monitoring and metrics can feel like Big Brother is looking over your shoulder, inhibiting creative problem solving. Conversely, a lack of measurement also means that problems aren’t getting reliably solved.

Poor development performance metrics overlook the constant competitive imperative for achieving more productivity with fewer resources, and can eventually result in layoffs or draconian performance measures being put in place.

Success at measurement depends on a balancing act between innovation and efficiency, while aligning team members with high-value business outcomes and eliminating administrative toil from the development process.

Even if there’s healthy disagreement about the details of McKinsey’s developer performance model, it’s useful to get everyone talking about how to mature the discipline of software development.

Said Vitaly Gordon, CEO of Faros.ai, in a recent blog: “McKinsey speaks the language of the C-Suite well. If they can get executives to commit time and effort to removing friction from the engineering experience based on what the data is telling us, I am all for it.”

Image source: Mike G., Flickr CC2.0 license.

©2023 Intellyx LLC. Intellyx retains editorial control of this document. At the time of writing, Faros.ai is an Intellyx client. No AI was used in the writing of this story.
