How to Measure the Benefits of GitHub Copilot — Best Practices
Advice and benchmarks for converting GitHub Copilot benefits into meaningful ROI.
Neely Dunlap
October 22, 2024
After three to six months with GitHub Copilot up and running, leadership will come knocking with their big question: “How has the world’s most famous AI coding assistant increased our developer productivity?” To answer it, you need data that illustrates the impact of GitHub Copilot on engineering outcomes.
When framed within the Launch-Learn-Run framework, you’ve reached the Run phase.
- During the initial Launch phase, you focused on understanding organic adoption and usage.
- In the subsequent Learn phase, you gathered insights from developer surveys, ran A/B tests, and analyzed before-and-after metrics for early adopters and power users.
- Now, in the Run phase, you need to measure downstream impacts across the SDLC to ensure individual benefits of GitHub Copilot have resulted in collective productivity gains.
ROI metrics encompass more than just time savings and developer satisfaction; they must also reflect the primary business goals of delivering features to customers faster, maintaining high quality and reliability, and supporting business growth.
With so many of your developers using GitHub Copilot, you’ll be able to measure whether its adoption is moving the needle on collective KPIs, the lagging indicators of better and faster coding.
In the final article of our series, we explore best practices for measuring and communicating the full benefits of GitHub Copilot.
Measure Downstream Velocity KPIs
As your rollout expands and adoption grows, individual developers’ time savings and productivity gains should ultimately translate into faster end-to-end delivery and improved collective outcomes.
GitHub Copilot likely generated time savings for your developers, increasing their personal velocity as measured by PR Merge Rate. Once the code is merged, dependencies kick in—on reviewers, QA, and deployment processes.
To capture GitHub Copilot’s downstream benefits from faster coding, track the following velocity metrics:
- PR Cycle Time: Is the whole cycle getting faster or are the gains being erased?
- Task Cycle Time (or Lead Time, depending on your taxonomy and processes): Are tasks completing faster end to end?
- Task Throughput: Are developers completing more tasks?
Best practice: Look closely at teams where GitHub Copilot usage is high. Teams with low adoption won’t show measurable downstream impacts. Leverage your usage data to compare and contrast KPIs for teams (or teams of teams) that have achieved over 50% adoption, looking at their metrics before and after that threshold was crossed, as in the sketch below.
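Here’s a minimal sketch of that before-and-after comparison in Python. The PR records and the adoption date are illustrative; in practice you’d pull merge timestamps from your Git provider’s API and the adoption date from your Copilot usage data.

```python
# Minimal sketch: compare median PR cycle time before and after a team
# crosses the 50% Copilot adoption threshold. All data is illustrative.
from datetime import datetime
from statistics import median

# Hypothetical export: one record per merged PR for a single team.
prs = [
    {"created_at": datetime(2024, 3, 1, 9), "merged_at": datetime(2024, 3, 3, 17)},
    {"created_at": datetime(2024, 6, 10, 9), "merged_at": datetime(2024, 6, 11, 12)},
    # ... more PRs
]

# Date this team crossed 50% active Copilot usage (from your usage data).
adoption_date = datetime(2024, 5, 1)

def cycle_time_hours(pr):
    return (pr["merged_at"] - pr["created_at"]).total_seconds() / 3600

before = [cycle_time_hours(p) for p in prs if p["merged_at"] < adoption_date]
after = [cycle_time_hours(p) for p in prs if p["merged_at"] >= adoption_date]

if before and after:
    print(f"Median PR cycle time before: {median(before):.1f}h")
    print(f"Median PR cycle time after:  {median(after):.1f}h")
```

The same before-and-after pattern applies to Task Cycle Time and Task Throughput; only the source records change.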
Measure Downstream Quality KPIs
After several months of adoption, downstream impacts on quality may begin to surface. Proactively monitoring changes to quality KPIs will help you put the right guardrails in place. Conversely, if your metrics show quality is holding steady or even improving, your confidence to expand GitHub Copilot licenses will increase.
Gather data from quality, support, and incident management tools to observe the impact on metrics like the following (a computation sketch appears after the list):
- Bugs per developer
- Incidents per developer
- Vulnerabilities
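Normalization matters for these KPIs: absolute bug counts naturally rise as headcount grows. Here’s a minimal sketch, assuming a monthly export from your bug tracker; the field names and figures are illustrative.

```python
# Minimal sketch: bugs per developer per month, so growing headcount
# doesn't mask (or mimic) a quality regression. Data is illustrative.
from collections import defaultdict

bugs = [
    {"opened": "2024-04", "team": "payments"},
    {"opened": "2024-05", "team": "payments"},
    # ... one record per bug from your tracker
]
active_devs = {"2024-04": 12, "2024-05": 14}  # headcount per month

bugs_per_month = defaultdict(int)
for bug in bugs:
    bugs_per_month[bug["opened"]] += 1

for month in sorted(active_devs):
    rate = bugs_per_month[month] / active_devs[month]
    print(f"{month}: {rate:.2f} bugs per developer")
```

The incidents-per-developer calculation follows the same shape.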
Best practice: Continue evaluating and enhancing the quality, reliability, and security of AI-generated code. Define and adhere to business-approved coding standards to prevent avoidable future issues.
Identify Shifting Bottlenecks
Every organization and sub-organization will experience different immediate gains and downstream impacts, depending on its context and DevOps maturity. For example, a team with a fully continuous, feature-flag-controlled deployment process and extensive test automation may see gains in coding speed translate directly into faster end-to-end lead times and more frequent deployments. Other teams may have more work to do to get there.
Best practice: Monitor shifting bottlenecks. Visualize cycle times at each stage of delivery in your metrics dashboard to understand where work is slowing down. Comparing stage-level cycle times before and after Copilot adoption helps identify the constraints you need to tackle to capitalize on accelerated coding.
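As a simple illustration, here’s a sketch that decomposes one PR’s journey into stages. The timestamps and stage boundaries are illustrative; in practice you’d source them from your Git provider and deployment tooling.

```python
# Minimal sketch: break a PR's end-to-end cycle time into stages to see
# where the bottleneck sits. Timestamps are illustrative.
from datetime import datetime

pr = {
    "first_commit": datetime(2024, 6, 1, 9, 0),
    "pr_opened":    datetime(2024, 6, 2, 11, 0),
    "first_review": datetime(2024, 6, 3, 15, 0),
    "merged":       datetime(2024, 6, 4, 10, 0),
    "deployed":     datetime(2024, 6, 6, 9, 0),
}

stages = [
    ("coding",          "first_commit", "pr_opened"),
    ("review wait",     "pr_opened",    "first_review"),
    ("review + rework", "first_review", "merged"),
    ("deploy",          "merged",       "deployed"),
]

for name, start, end in stages:
    hours = (pr[end] - pr[start]).total_seconds() / 3600
    print(f"{name:>16}: {hours:5.1f}h")

# If "coding" shrinks after Copilot adoption while "review wait" grows,
# the bottleneck has shifted downstream to code review.
```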
Leverage Causal Analysis If Gains Don’t Materialize
At any given moment, multiple factors that influence developer productivity are at play. So if your metrics show no positive improvement, how can you tell whether Copilot is falling short or other factors are masking its gains?
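One standard building block for this kind of attribution is a difference-in-differences comparison: measure the change for adopting teams against the change for comparable non-adopting teams over the same period, netting out org-wide trends like reorgs, hiring freezes, or seasonality. Here’s a simplified sketch with illustrative numbers; this is not Faros AI’s actual method.

```python
# Simplified difference-in-differences sketch: the baseline change seen
# by non-adopting teams is subtracted from the adopters' change, so
# org-wide trends don't get credited to Copilot. Numbers are illustrative.
adopters     = {"before": 21.0, "after": 26.0}  # avg tasks/team/month
non_adopters = {"before": 20.0, "after": 22.0}

adopter_change  = adopters["after"] - adopters["before"]          # +5.0
baseline_change = non_adopters["after"] - non_adopters["before"]  # +2.0

effect = adopter_change - baseline_change
print(f"Estimated Copilot effect: {effect:+.1f} tasks/team/month")
# +3.0 of the +5.0 gain is attributable to Copilot; the remaining +2.0
# reflects a trend that affected all teams alike.
```

Real-world attribution has to control for far more confounders than a two-group comparison can.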
Tools like Faros AI use machine learning to conduct causal analysis across these metrics and can answer this question at scale. To find out more, contact us.
Maximizing the Long-Term Benefits of GitHub Copilot
Once you’ve measured downstream impact, you’ll be able to have meaningful, data-driven conversations with leadership that justify the tool’s continued use and expansion. Following the best practices in this series will set your engineering organization up to experience maximum benefits of GitHub Copilot.
Additional blogs in this series:
Overview: GitHub Copilot Best Practices: Launch-Learn-Run Framework
Phase 1: Launch: How to Increase GitHub Copilot Adoption and Usage
Phase 2: Learn: How to Capitalize on GitHub Copilot’s Advantages