In our last guide, we wrote about how to measure development performance at the organizational level. Many of the same principles can be applied to measuring teams.
The key difference is that rather than using engineering metrics to maximize output, teams should use them to achieve more predictable and consistent results. Better business outcomes come from teams that ship features at a repeatable velocity.
So how do we measure velocity? It may be tempting to start with metrics from project management tools, such as story points and issues completed. While reasonable proxies, these metrics are abstractions of the actual engineering work being performed. Pull requests are a closer approximation of delivering working software, which is the primary measure of progress as set forth in the Agile Manifesto.
While there are hundreds, if not thousands, of metrics that teams can measure to improve productivity, you really only need a few metrics to improve engineering predictability. Focusing on a few key metrics is not only less complex to implement (requiring no more than a source control management tool, like GitHub or Bitbucket), but also allows for standardization across teams and unlocks organizational benchmarks.
Trust is critical for high-performing teams. It’s important that team managers avoid the natural tendency to use engineering metrics to evaluate or micromanage individuals. Not only does micromanagement create a toxic environment where engineers feel judged, but it can also lead to individuals changing their behaviors to game the metrics.
The first KPI that teams should track is changes delivered — or pull requests merged to the mainline branch — over time. If there are large fluctuations in this metric week-over-week, then that may be an indication that your team is performing inconsistently. Volatility can make planning more difficult and can impact collaboration between teams.
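One way to put this into practice is to bucket merged pull requests by week and compute a simple volatility measure, such as the coefficient of variation. This is a minimal sketch; the merge dates are hypothetical stand-ins for what you would pull from your source control tool's API:

```python
from collections import Counter
from datetime import date
from statistics import mean, pstdev

def weekly_merge_counts(merge_dates):
    """Bucket merged-PR dates by ISO (year, week), returning a count per week."""
    return Counter(d.isocalendar()[:2] for d in merge_dates)

def volatility(counts):
    """Coefficient of variation of weekly counts: higher means less consistent delivery."""
    values = list(counts.values())
    avg = mean(values)
    return pstdev(values) / avg if avg else 0.0

# Hypothetical merge dates spanning three weeks
merges = [
    date(2023, 5, 1), date(2023, 5, 2), date(2023, 5, 3),  # ISO week 18: 3 PRs
    date(2023, 5, 8),                                      # ISO week 19: 1 PR
    date(2023, 5, 15), date(2023, 5, 16),                  # ISO week 20: 2 PRs
]
counts = weekly_merge_counts(merges)
```

A rising coefficient of variation over several weeks is the kind of signal worth discussing in a retrospective, rather than a number to optimize in isolation.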
There are many potential explanations for inconsistency: inadequately scoped issues from the product team, unpredictable bottlenecks (e.g. a slow handoff from another team), and external constraints (e.g. a deadline that has been moved up). Another potential explanation is that quality has deteriorated to the point that it adds friction to the development process.
Engineering work can be broken down into three types: feature, churn, and refactor. A feature is any new work that extends the existing codebase. Churn is work that modifies code that was previously added or modified within 21 days, while refactor is work that modifies code that has been stable for at least 21 days.
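The classification above can be sketched as a small function. The input format here is an assumption, a simplified stand-in for the diff and blame data a real tool would extract from source control:

```python
from datetime import datetime, timedelta

CHURN_WINDOW = timedelta(days=21)

def classify_change(lines_touched, change_date):
    """Classify a change as 'feature', 'churn', or 'refactor'.

    lines_touched: list of (is_new_line, last_modified_or_None) tuples,
    where last_modified is when a pre-existing line was last touched.
    """
    if all(is_new for is_new, _ in lines_touched):
        return "feature"   # only new code: extends the existing codebase
    ages = [change_date - modified
            for is_new, modified in lines_touched if not is_new]
    if min(ages) < CHURN_WINDOW:
        return "churn"     # reworks code written or modified within 21 days
    return "refactor"      # reworks code that has been stable for 21+ days
```

In practice, the 21-day window is a convention rather than a law; what matters is applying the same threshold consistently so the trend is comparable week to week.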
Every team will have its own healthy balance between these three types of work. For instance, teams that are developing a new application will have a higher percentage of feature work, and teams that are maintaining a legacy application will spend more time refactoring code to burn down technical debt.
A high rate of churn and refactor impacts velocity. In particular, a gradual shift in engineering work from features toward churn and refactor may be a signal that your team should spend more time improving quality.
While you are your own best reference class, it is also important to understand how your team is performing relative to the rest of the organization. By measuring changes delivered per engineer and benchmarking your team against organization and global averages, you can get a better understanding of how your team is performing.
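Normalizing by team size makes these comparisons possible. The sketch below uses hypothetical team names and numbers to show the idea: compute PRs merged per engineer for each team, then index each team against the organization average:

```python
def per_engineer_rates(teams):
    """teams: {name: (prs_merged, engineer_count)} -> PRs merged per engineer."""
    return {name: prs / n for name, (prs, n) in teams.items()}

def relative_to_org(rates):
    """Index each team against the org average (1.0 = exactly at the average)."""
    org_avg = sum(rates.values()) / len(rates)
    return {name: rate / org_avg for name, rate in rates.items()}

# Hypothetical quarter: (PRs merged, engineers on the team)
teams = {"payments": (24, 6), "search": (18, 3)}
rates = per_engineer_rates(teams)        # payments: 4.0, search: 6.0
```

A team at 0.8 is not necessarily underperforming; as the next section notes, the comparison only makes sense between teams doing similar kinds of work.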
When benchmarking teams against the organization (or even against other teams), it's important to remember that all teams are different. For instance, a DevOps team shouldn't be compared to an app development team, and a quality assurance team shouldn't be compared to an SRE team. Even among product development teams, there are different types of applications at different stages of their life cycle, each with varying day-to-day requirements.
Measuring lead time helps you understand how long it takes to deliver changes. Teams should aim to keep lead time consistent each week.
Bottlenecks in the development process limit your ability to deliver changes consistently and prevent a smooth flow of work. For instance, if engineers are taking longer to perform code reviews, it can cause significant lead time delays. Or if deployments are manual, time-consuming, and infrequent, features will be delayed in reaching customers.
You can improve your team’s lead time by setting objectives for each stage. These guideposts can help teams maintain predictable outputs. They also lend themselves to creating reminders, where developers are alerted about changes that do not achieve the objective that was mutually agreed upon by the team.
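A minimal version of such a check might look like the following. The stage names and objective values are assumptions for illustration; each team would agree on its own:

```python
from datetime import timedelta

# Hypothetical per-stage objectives, mutually agreed upon by the team
OBJECTIVES = {
    "time_to_first_review": timedelta(hours=4),
    "time_in_review": timedelta(days=1),
    "time_to_deploy": timedelta(days=2),
}

def overdue_stages(pr_stage_durations):
    """Return the stages of a pull request that exceed the agreed objective,
    i.e. the ones a reminder should be sent about."""
    return [stage for stage, duration in pr_stage_durations.items()
            if duration > OBJECTIVES.get(stage, timedelta.max)]

# Example: this PR waited 6 hours for a first review but moved through review quickly
durations = {"time_to_first_review": timedelta(hours=6),
             "time_in_review": timedelta(hours=20)}
```

Wiring this into chat notifications or a bot is tool-specific; the point is that the objectives are explicit and checked automatically, not enforced by a manager.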
When measuring team performance, it is important to remember that teams are part of a much larger ecosystem. If you don’t solve the organizational bottlenecks, then it doesn’t matter how well your teams are performing. Companies that invest in improving the developer experience — whether it be improving developer onboarding, investing in productivity tooling, or enabling end-to-end continuous integration and deployment — better enable teams to ship features consistently.
A team can be a catalyst for organizational change by using data to highlight issues impacting the developer experience. For instance, if one team successfully adopts GitHub Copilot and sees an improvement in their metrics, then you can make the case for rolling out GitHub Copilot to the rest of the organization.
By empowering teams with data, organizations can not only improve the predictability and repeatability of their product releases, but also create a more satisfying, happier environment for developers.