How to Measure the Benefits of GitHub Copilot — Best Practices

Advice and benchmarks for converting GitHub Copilot benefits into meaningful ROI.

Neely Dunlap

October 16, 2024

After three to six months with GitHub Copilot up and running, leadership will be knocking at your door with their big question: “How has the world’s most famous AI coding assistant increased our developer productivity?” To answer it, you need data that illustrates the benefits of GitHub Copilot on engineering outcomes.

When framed within the Launch-Learn-Run framework, you’ve reached the Run phase.

  • During the initial Launch phase, you focused on understanding organic adoption and usage.
  • In the subsequent Learn phase, you gathered insights from developer surveys, ran A/B tests, and analyzed before-and-after metrics for early adopters and power users.
  • Now, in the Run phase, you need to measure downstream impacts across the SDLC to ensure individual benefits of GitHub Copilot have resulted in collective productivity gains.

ROI metrics encompass more than just time savings and developer satisfaction; they must also reflect the primary business goals of delivering features to customers faster, maintaining high quality and reliability, and supporting business growth.

With so many of your developers using GitHub Copilot, you’ll be able to measure whether its adoption is moving the needle on collective KPIs, the lagging indicators of better and faster coding.

In the final article of our series, we explore best practices for measuring and communicating the full benefits of GitHub Copilot.

Measure Downstream Velocity KPIs

As your rollout expands and adoption grows, individual developers' time savings and productivity gains should ultimately translate to faster end-to-end delivery and improved collective outcomes.

GitHub Copilot likely generated time savings for your developers, increasing their personal velocity as measured by PR Merge Rate. Once the code is merged, dependencies kick in—on reviewers, QA, and deployment processes.

To measure GitHub Copilot’s downstream benefits from faster coding, measure the following velocity metrics:

  • PR Cycle Time: Is the whole cycle getting faster or are the gains being erased?
  • Task Cycle Time (or Lead Time, depending on your taxonomy and processes): Are tasks completing faster end to end?
  • Task Throughput: Are developers completing more tasks?
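As a rough illustration, the three velocity metrics above can be computed from timestamped records. This is a minimal sketch, not a definitive implementation: the record fields (`opened`, `merged`, `started`, `done`) and the sample data are hypothetical, and your own tools may define cycle time boundaries differently.

```python
from datetime import datetime
from statistics import median

# Hypothetical records; in practice these would come from your
# source-control and task-tracking tools. Field names are assumptions.
pull_requests = [
    {"opened": "2024-09-02T09:00", "merged": "2024-09-03T15:00"},
    {"opened": "2024-09-04T10:00", "merged": "2024-09-04T18:00"},
    {"opened": "2024-09-05T08:00", "merged": "2024-09-08T08:00"},
]
tasks = [
    {"started": "2024-09-02T09:00", "done": "2024-09-06T09:00"},
    {"started": "2024-09-03T09:00", "done": "2024-09-05T09:00"},
]


def hours_between(start: str, end: str) -> float:
    """Elapsed hours between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600


def median_cycle_hours(records, start_key, end_key):
    """Median elapsed hours from start_key to end_key across records."""
    return median(hours_between(r[start_key], r[end_key]) for r in records)


def throughput_per_week(records, end_key, weeks):
    """Completed items per week over the observed period."""
    completed = sum(1 for r in records if r.get(end_key))
    return completed / weeks


pr_cycle = median_cycle_hours(pull_requests, "opened", "merged")
task_cycle = median_cycle_hours(tasks, "started", "done")
task_rate = throughput_per_week(tasks, "done", weeks=1)
print(pr_cycle, task_cycle, task_rate)
```

The median is used here rather than the mean because a single long-running PR or task can otherwise dominate the result; that is a design choice, not a requirement.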

Best practice: Look closely at teams where GitHub Copilot usage is high. Teams with low adoption will not have measurable downstream impacts. Leverage your usage data to compare and contrast KPIs for teams (or teams of teams) that have achieved over 50% adoption. Look at their metrics before and after the 50% usage threshold has been crossed.

[Figure: Bar graph depicting PR cycle time above 50% Copilot usage]
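One way to apply the 50% threshold is to split a team's weekly history at the first week where adoption exceeds it, then compare the KPI on either side. The sketch below assumes hypothetical weekly snapshots with made-up `adoption` and `pr_cycle_hours` fields; the numbers are illustrative only.

```python
from statistics import mean

ADOPTION_THRESHOLD = 0.5  # the 50% usage threshold from the best practice

# Hypothetical weekly snapshots for one team: the share of developers
# using Copilot and that week's median PR cycle time in hours.
weekly = [
    {"week": "2024-W30", "adoption": 0.20, "pr_cycle_hours": 40.0},
    {"week": "2024-W31", "adoption": 0.35, "pr_cycle_hours": 44.0},
    {"week": "2024-W32", "adoption": 0.55, "pr_cycle_hours": 36.0},
    {"week": "2024-W33", "adoption": 0.60, "pr_cycle_hours": 32.0},
]


def split_at_threshold(snapshots, threshold=ADOPTION_THRESHOLD):
    """Split at the first snapshot whose adoption exceeds the threshold."""
    for i, snapshot in enumerate(snapshots):
        if snapshot["adoption"] > threshold:
            return snapshots[:i], snapshots[i:]
    return snapshots, []


def before_after(snapshots, metric, threshold=ADOPTION_THRESHOLD):
    """Average a metric before and after the threshold is crossed."""
    before, after = split_at_threshold(snapshots, threshold)
    if not before or not after:
        return None  # threshold never crossed, or no baseline to compare
    b = mean(s[metric] for s in before)
    a = mean(s[metric] for s in after)
    return {"before": b, "after": a, "change_pct": (a - b) / b * 100}


result = before_after(weekly, "pr_cycle_hours")
print(result)
```

A function like this returns `None` for teams that never cross the threshold, which keeps low-adoption teams out of the comparison.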

Measure Downstream Quality KPIs

After several months of adoption, downstream impacts on quality may become apparent. Proactively monitoring changes to quality KPIs will help you put the right guardrails in place. Similarly, if your metrics show quality is holding steady or even improving, you can expand GitHub Copilot licenses with greater confidence.

Gather metrics from quality, support, and incident management tools to observe the impact on metrics like:

Best practice: Continue evaluating and enhancing the quality, reliability, and security of AI-generated code. Define and adhere to business-approved coding standards to prevent avoidable future issues.

Identify Shifting Bottlenecks

Every organization and sub-organization is going to experience different immediate gains and downstream impacts depending on their context and DevOps maturity. For example, a team with a feature-flag-controlled fully continuous deployment process and extensive test automation may see the gains in faster coding times directly translate to faster end-to-end lead times and more frequent deployments. Other teams may have more work to do to get there.

Best practice: Monitor shifting bottlenecks. Visualize the stage-by-stage cycle times within a metric to understand where the work is slowing down. Comparing cycle times before and after Copilot adoption helps identify the constraints you need to tackle to capitalize on accelerated coding.

[Figure: Bar graph depicting lead time bottlenecks before and after Copilot]
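To make the idea of a shifting bottleneck concrete, the sketch below breaks lead time into stages and compares them before and after adoption. The stage names and hour values are hypothetical; substitute whatever stages your own pipeline tracks.

```python
# Hypothetical average hours spent in each stage of lead time,
# before and after Copilot adoption. Values are illustrative only.
before = {"coding": 20.0, "review": 10.0, "qa": 8.0, "deploy": 6.0}
after = {"coding": 12.0, "review": 16.0, "qa": 8.0, "deploy": 6.0}


def bottleneck(stage_hours):
    """Return the stage that consumes the most time."""
    return max(stage_hours, key=stage_hours.get)


def stage_deltas(before_hours, after_hours):
    """Change in hours per stage; positive means the stage got slower."""
    return {stage: after_hours[stage] - before_hours[stage] for stage in before_hours}


print("Bottleneck before:", bottleneck(before))
print("Bottleneck after:", bottleneck(after))
print("Change per stage:", stage_deltas(before, after))
```

In this made-up data, coding gets faster but review absorbs the extra volume, so the bottleneck moves from coding to review while total lead time barely changes.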

Leverage Causal Analysis If Gains Don’t Materialize

At any given moment, multiple factors that influence developer productivity are at play. So if your metrics show no improvement, how can you be sure that is related to Copilot?

Tools like Faros AI use machine learning to conduct causal analysis of these metrics and can answer this question. To find out more, contact us.
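For intuition about why causal thinking matters here, consider a difference-in-differences comparison. To be clear, this is a much simpler textbook technique swapped in for illustration, not the machine-learning analysis the tooling performs, and the team data below is hypothetical. It compares the change in a metric for high-adoption teams against the change for comparable low-adoption teams over the same period.

```python
# Hypothetical median PR cycle times (hours) for two groups of teams
# measured over the same two periods. Values are illustrative only.
copilot_teams = {"before": 40.0, "after": 40.0}   # high adoption
control_teams = {"before": 40.0, "after": 46.0}   # low adoption


def difference_in_differences(treated, control):
    """Change in the treated group minus change in the control group.

    Assumes both groups would have followed the same trend without the
    treatment (the "parallel trends" assumption).
    """
    treated_change = treated["after"] - treated["before"]
    control_change = control["after"] - control["before"]
    return treated_change - control_change


effect = difference_in_differences(copilot_teams, control_teams)
print(f"Estimated effect on PR cycle time: {effect:+.1f} hours")
```

In this made-up example the Copilot teams look flat on their own, but the control teams slowed down by six hours over the same period. The estimate suggests something else (a reorganization, say, or a release crunch) pushed cycle times up, and the flat line may be hiding a real gain.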

Maximizing the Long-Term Benefits of GitHub Copilot

Once you’ve measured downstream impact, you’ll be able to have meaningful, data-driven conversations with leadership that justify the tool’s continued use and expansion. Following the best practices in this series will set your engineering organization up to experience maximum benefits of GitHub Copilot.
