How to Capitalize on GitHub Copilot’s Advantages — Best Practices

A guide to converting GitHub Copilot advantages into productivity gains.

Neely Dunlap

October 22, 2024

Once your team is a few weeks into GitHub Copilot adoption, it's time to begin observing and analyzing its impact on early adopters, so you can fully leverage GitHub Copilot’s advantages. When framed within the Launch-Learn-Run framework, you’re now squarely in the Learn phase.

Previously, during the initial Launch phase, the focus was on understanding organic adoption and usage. The Learn phase moves your program forward—it’s all about gathering insights from developer surveys, running A/B tests, and comparing the before-and-after metrics for developers using the tool.

While it’ll be too early to see downstream impacts materialize across the board, you can begin to understand the advantages of GitHub Copilot experienced by individual developers. These leading indicators signal the potential collective improvements you can expect down the road, and highlight the sources of friction you must address to get the biggest bang for your buck.

By harnessing your learnings and adapting your program, you'll be well on your way to demonstrating GitHub Copilot's advantages and showing its impact to leadership. This will pave the way for a broader rollout and, ultimately, higher ROI once you reach the Run phase.

In this article, we’ll detail how to conduct this critical Learn phase.

Conduct and Analyze Developer Surveys

Gather the Data

Developer surveys are essential for understanding how GitHub Copilot increases productivity because developers must self-report their time savings. (Time savings from GitHub Copilot cannot be automatically calculated for now.)

These surveys provide insights into time savings, the advantages of GitHub Copilot, and overall satisfaction with the tool.

There are two types of surveys to consider:

  1. Cadence-based surveys: These surveys periodically collect feedback from software developers, typically aligned with sprints, milestones, or quarters. They include questions about how often GitHub Copilot is used, what it is used for, how much time was saved and how it was reinvested, its perceived helpfulness, and overall satisfaction levels.
  2. PR surveys: These surveys are presented immediately after a developer submits a PR to capitalize on the information while it’s fresh in their mind. Similar questions are asked, but regarding this specific PR. They include questions like whether Copilot was used for this PR, what it was used for, the amount of time saved, plans for utilizing the saved time, and satisfaction rates.

Best Practice: Instrument the data. Use dashboards that clearly track time savings, the equivalent economic benefit, and developer satisfaction in one place. Report on these findings in monthly reviews and AI steering meetings.
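
As a rough illustration of the dashboard figures above, the sketch below turns self-reported survey answers into hours saved and an equivalent economic benefit. The `SurveyResponse` fields, the hourly cost, and the number of working days are all assumptions made for this example, not values from any particular survey tool.

```python
from dataclasses import dataclass


@dataclass
class SurveyResponse:
    # Hypothetical fields; adapt them to whatever your survey actually collects.
    developer: str
    minutes_saved_per_day: float
    satisfaction: int  # e.g. a 1-5 rating


def summarize_time_savings(responses, hourly_cost, working_days):
    """Aggregate self-reported savings over a reporting period.

    hourly_cost and working_days are assumptions supplied by the caller.
    """
    if not responses:
        raise ValueError("at least one survey response is required")
    total_daily_minutes = sum(r.minutes_saved_per_day for r in responses)
    hours_saved = total_daily_minutes * working_days / 60
    return {
        "respondents": len(responses),
        "avg_minutes_saved_per_day": total_daily_minutes / len(responses),
        "hours_saved": hours_saved,
        "economic_benefit": hours_saved * hourly_cost,
        "avg_satisfaction": sum(r.satisfaction for r in responses) / len(responses),
    }


summary = summarize_time_savings(
    [SurveyResponse("dev-a", 30, 4), SurveyResponse("dev-b", 60, 5)],
    hourly_cost=100,   # assumed fully loaded cost per developer hour
    working_days=20,   # assumed working days in the reporting period
)
print(summary)
```

Because the inputs are self-reported, the output is best read as an estimate to track over time rather than a precise measurement.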

charts illustrating time savings and satisfaction

Best Practice: Choose the survey type preferred by your dev teams. Developers typically prefer cadence-based surveys over PR surveys, but the timeliness of PR-triggered surveys can yield more accurate time-savings estimates. Space out the surveys so they don’t become burdensome. At the start of your program, run a survey every two weeks, then taper down to once or twice a quarter.

Best Practice: Include an NPS or CSAT question in your survey. This type of question is a high-level indicator of the developer experience with Copilot, and it’s easy for leaders to understand.
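
For reference, NPS is conventionally computed from 0-10 "how likely are you to recommend" answers as the percentage of promoters (9-10) minus the percentage of detractors (0-6). A minimal sketch, with a hypothetical function name and sample scores:

```python
def net_promoter_score(scores):
    """Return NPS (-100 to 100) from a list of 0-10 survey answers."""
    if not scores:
        raise ValueError("at least one score is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    # Passives (7-8) count toward the total but toward neither group.
    return 100.0 * (promoters - detractors) / len(scores)


# Illustrative answers only: 3 promoters, 1 passive, 1 detractor.
print(net_promoter_score([10, 10, 9, 8, 2]))
```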

Best Practice: Acknowledge the feedback. Developers expect that action will be taken to make necessary improvements. Your program champion should analyze the feedback and adjust subsequent rollout and training efforts to maximize GitHub Copilot’s advantages.

Analyze and Compare Differences Across Teams

Because individual developers and teams use GitHub Copilot differently, they’ll experience varying benefits. Time saved, the tasks Copilot is used for, and its perceived helpfulness can all differ, which may be related to the type of work, the programming language, and the team’s composition (e.g., some teams have many senior developers, while others are predominantly junior).

Benchmark: On average, we’ve observed that developers save 38 minutes per day, but this number varies widely between organizations and within groups.

Best Practice: Examine the data through the team lens. After looking at the overall data, slice and dice by team to understand where GitHub Copilot’s advantages are particularly powerful. For example, some teams may find it tremendously useful, while others may code in a language better suited to another coding assistant. Matching the tool to the task will help every team benefit from AI assistance.
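
One simple way to apply the team lens is to group survey answers by team and rank the averages. The sketch below assumes each response is a `(team, minutes_saved_per_day)` pair; the team names and numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_minutes_saved_by_team(responses):
    """Return {team: average minutes saved}, highest average first."""
    by_team = defaultdict(list)
    for team, minutes in responses:
        by_team[team].append(minutes)
    averages = {team: mean(values) for team, values in by_team.items()}
    # Sort so the teams reporting the largest savings appear first.
    return dict(sorted(averages.items(), key=lambda item: item[1], reverse=True))


by_team = average_minutes_saved_by_team(
    [("platform", 20), ("payments", 50), ("payments", 30)]
)
print(by_team)
```

Teams with only a handful of responses can swing the ranking, so it may be worth showing the response count next to each average.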

bar graph depicting development tasks assisted by Copilot

Thoughtfully Reinvest Time Savings

As your developers become more proficient with GitHub Copilot, they will use it more efficiently and save even more time on their tasks. Instead of just picking the next ticket, teams can capitalize on GitHub Copilot’s advantages by prioritizing their most important work. High-impact tasks and initiatives may include advancing existing projects, improving quality, developing new skills, and addressing technical debt.

Best Practice: Strategize in advance. Before the anticipated time savings materialize, your teams should agree on strategic priorities so they can make the most of the time gained from faster coding. Reinvesting the time savings in the right things drives value for the organization and creates the ROI for the tool.

a circle graph with responses indicating how developers plan to use their time saved

Conduct A/B Tests

Create Comparable Cohorts

Running A/B tests helps you understand the advantages gained by the developers with Copilot licenses versus their non-augmented peers. Since these are relatively early days, you should measure and compare the metrics that are most immediately impacted by the use of coding assistants, like PR Merge Rate, PR Size, Code Smells, Review Time, and Task Throughput.

Best Practice: Run the A/B test for 4-12 weeks.

Best Practice: Compare apples to apples. When setting up your cohorts, ensure that the A and B groups are similar in makeup and remain representative of your typical teams. Choose members of the same team who work on similar tasks or projects and have comparable seniority. Also, be sure to control for differences between teams (e.g., different tech stacks or processes) for the clearest picture of GitHub Copilot’s impact.
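
One way to build comparable cohorts is stratified random assignment: group developers by team and seniority, then split each group roughly in half. The sketch below is one possible approach, not a prescribed method; the tuple layout, cohort labels, and fixed seed are assumptions for the example.

```python
import random
from collections import defaultdict


def assign_cohorts(developers, seed=0):
    """Split developers into 'copilot' and 'control' within each stratum.

    developers: iterable of (name, team, seniority) tuples (hypothetical shape).
    Returns {name: cohort}. A fixed seed keeps the split reproducible.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for name, team, seniority in developers:
        strata[(team, seniority)].append(name)

    assignment = {}
    for key in sorted(strata):
        members = sorted(strata[key])
        rng.shuffle(members)
        half = len(members) // 2
        # In an odd-sized stratum the extra developer lands in the control group.
        for index, name in enumerate(members):
            assignment[name] = "copilot" if index < half else "control"
    return assignment


developers = [
    ("a", "payments", "senior"), ("b", "payments", "senior"),
    ("c", "payments", "senior"), ("d", "payments", "senior"),
    ("e", "platform", "junior"), ("f", "platform", "junior"),
]
cohorts = assign_cohorts(developers, seed=42)
print(cohorts)
```

Splitting within each stratum keeps team and seniority mix balanced across the two groups, which random assignment over the whole population does not guarantee for small cohorts.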

bar graph showing PR merge rate by cohort

Best Practice: Experiment with additional A/B tests. A/B tests go further than comparing those with GitHub Copilot and those without. If you’re trialing different coding assistants or different license tiers of the same tool, doing so in the Learn phase can equip you with answers for leadership inquiries surrounding the value of different products or features. For example, does the Enterprise license tier’s improved Copilot Chat skills and use of internal knowledge bases result in more time savings, higher velocity, and better quality? Do features like PR Summaries and text completion decrease PR Review Time, a known bottleneck for Copilot users?

Compare Differences in Velocity and Quality Metrics

During your A/B test, measure and compare the velocity and quality metrics most immediately affected by coding assistants, such as PR merge rate, review time, and task throughput.

Best Practice: Watch PR merge rate closely. This metric measures the throughput of pull requests merged per developer, on average, per month. Expect this metric to increase for developers with Copilot.
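
Following that definition, PR merge rate for a cohort can be sketched as merged PRs divided by the number of developers and the number of months observed. The input shape (a list of PR author names plus a list of cohort members) is an assumption for illustration; in practice the data would come from your source control system.

```python
def pr_merge_rate(merged_pr_authors, cohort, months):
    """Average merged PRs per developer per month for one cohort.

    merged_pr_authors: one author name per merged PR in the period.
    cohort: names of the developers in the cohort.
    """
    members = set(cohort)
    if not members or months <= 0:
        raise ValueError("cohort must be non-empty and months must be positive")
    merged = sum(1 for author in merged_pr_authors if author in members)
    # Developers with zero merged PRs still count in the denominator.
    return merged / (len(members) * months)


authors = ["ana", "ana", "bo", "cy"]  # illustrative data only
print(pr_merge_rate(authors, ["ana", "bo"], months=1))
print(pr_merge_rate(authors, ["cy", "di"], months=1))
```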

Best Practice: Prepare reviewers for increased workloads in advance. Many organizations see an unwelcome increase in PR Review Time. It may be helpful to revisit SLAs to ensure everyone is on the same page, and to set reminders for overdue code reviews. Qualitative feedback on AI-augmented changes can also provide valuable insights, so encourage reviewers to share their thoughts with program champions.

gauge showing GitHub Copilot Before and After Metrics: PR Review Time

Best Practice: Look beyond PR metrics. Introduce data from task management tools like Jira, Azure DevOps, or Asana to observe any notable differences in throughput and velocity between the two cohorts.

bar graph showing GitHub Copilot Before and After Metrics: Task Throughput

Best Practice: Balance speed and impact on quality. Use quality metrics from static code analysis tools, like SonarQube, or security findings from GitHub Advanced Security to track PR Test Coverage, Code Smells, and Number of Vulnerabilities for both cohorts.

Track Leading Indicators of Productivity Improvements

By analyzing data from the GitHub Copilot cohort, you can evaluate performance changes they’re experiencing over time. It’s essential to know which KPIs have increased, decreased, or stayed the same. This data can be used as benchmarks for future rollouts.

Benchmark: Organizations often see a significant decrease in PR size (up to 90%) and an increase in PR merge rate (up to 25%), while code reviews can become a bottleneck, with PR review time rising by as much as 20%.
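
To compare your own before-and-after numbers with benchmarks like these, each KPI can be expressed as a percent change and labeled as increased, decreased, or unchanged. The sketch below is illustrative: the metric names, sample values, and the 1% tolerance for "unchanged" are assumptions, not recommendations.

```python
def percent_change(before, after):
    """Percent change from before to after; before must be non-zero."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) * 100.0 / before


def classify_kpis(before, after, tolerance=1.0):
    """Label each KPI by direction of change; tolerance is in percent."""
    report = {}
    for name in before:
        change = percent_change(before[name], after[name])
        if abs(change) <= tolerance:
            label = "unchanged"
        elif change > 0:
            label = "increased"
        else:
            label = "decreased"
        report[name] = (change, label)
    return report


# Made-up sample values for the Copilot cohort.
before = {"pr_merge_rate": 8.0, "pr_review_hours": 10.0, "task_throughput": 20.0}
after = {"pr_merge_rate": 10.0, "pr_review_hours": 12.0, "task_throughput": 20.0}
kpi_report = classify_kpis(before, after)
print(kpi_report)
```

Whether an increase is good or bad depends on the metric: a higher merge rate is an improvement, while higher review hours signal a bottleneck.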

Best Practice: Pay extra attention to power users. When comparing before-and-after metrics, take a close look at power users, your heaviest Copilot adopters. Insights from how their productivity is changing can help project what to expect with higher general usage.

Learning to Run: Transforming Individual GitHub Copilot Advantages into Collective Impact

By implementing these best practices during the Learn phase, you’ll be capitalizing on the initial advantages gained from GitHub Copilot and amplifying the impact for teams across your organization.

Though you never really stop learning and iterating, after 3–6 months, you’ll enter the third stage of the Launch-Learn-Run framework. In our next article, we explore the Run stage, where you’ll examine downstream impacts and collective benefits of GitHub Copilot.

Continue to next blog:

Phase 3: Run: How to Measure the Benefits of GitHub Copilot

Additional blogs in this series:

Overview: GitHub Copilot Best Practices: Launch-Learn-Run Framework

Phase 1: Launch: How to Increase GitHub Copilot Adoption and Usage
