Your team is probably using Copilot, Claude, Cursor, or some combination of all three. The question isn’t whether AI is writing code in your repos. It’s whether that code is any good, and how much of your output it actually accounts for.
AI Impact answers that. CompassHQ looks at every merged PR, figures out if AI was involved, and gives you a breakdown of throughput and quality so you can compare AI-generated work against human-written code.
How Detection Works
Every PR gets classified into one of three buckets:
| Classification | What It Means | Signal |
|---|---|---|
| Agent-Authored | An AI bot opened the PR | PR author is a known bot account (copilot[bot], claude-code[bot], devin-ai[bot], etc.) |
| AI-Assisted | A human wrote it with AI help | Commit messages have Co-Authored-By lines mentioning an AI tool |
| Human | No AI signals found | Neither of the above |
If the PR author is a bot, that wins. It’s agent-authored regardless of what’s in the commit messages.
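The precedence rule can be sketched roughly like this. The bot names and the substring matching are simplified assumptions for illustration, not CompassHQ's actual implementation:

```python
# Bot accounts treated as agent authors (subset from the list above)
AGENT_BOTS = {"copilot[bot]", "claude-code[bot]", "devin-ai[bot]"}

# Substrings that mark an AI tool in a Co-Authored-By trailer (illustrative)
AI_COAUTHOR_HINTS = ("copilot", "claude", "cursor")

def classify_pr(author: str, commit_messages: list[str]) -> str:
    """Classify a merged PR as agent-authored, ai-assisted, or human.

    A bot author always wins, regardless of what's in the commit messages.
    """
    if author.lower() in AGENT_BOTS:
        return "agent-authored"
    for message in commit_messages:
        for line in message.splitlines():
            if line.lower().startswith("co-authored-by:") and any(
                hint in line.lower() for hint in AI_COAUTHOR_HINTS
            ):
                return "ai-assisted"
    return "human"
```

Note that the bot check runs first and returns immediately, which is what makes the precedence rule hold.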
Which Tools Are Recognized
- GitHub Copilot — bot accounts and co-author tags
- Claude / Claude Code — bot accounts and co-author tags
- Cursor — co-author tags
- Devin — bot account
- CodeRabbit, Sweep, Sourcery, Ellipsis, Codex — bot accounts
All matching is case-insensitive. If you’re using a tool that isn’t on this list, those PRs will show up as human until we add support.
The Dashboard
Open AI Impact from the sidebar. You’ll see four sections.
Summary Cards
Three numbers at the top:
- AI PR Ratio — what percentage of your merged PRs had any AI involvement
- Agent-Authored PRs — the subset that were fully written by a bot
- AI Tools Used — how many distinct tools showed up, plus the top three by volume
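The three card values reduce to simple arithmetic over the classified PRs. A sketch, assuming each PR record carries a `classification` and an optional `tool` field (hypothetical field names):

```python
from collections import Counter

def summary_cards(prs: list[dict]) -> dict:
    """Compute the three summary-card numbers from classified PRs."""
    total = len(prs)
    ai = [p for p in prs if p["classification"] in ("agent-authored", "ai-assisted")]
    tools = Counter(p["tool"] for p in ai if p.get("tool"))
    return {
        # Percentage of merged PRs with any AI involvement
        "ai_pr_ratio": round(100 * len(ai) / total, 1) if total else 0.0,
        # Subset fully written by a bot
        "agent_authored": sum(1 for p in ai if p["classification"] == "agent-authored"),
        # Distinct tool count plus the top three by volume
        "tools_used": len(tools),
        "top_tools": [t for t, _ in tools.most_common(3)],
    }
```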
Throughput Over Time
A weekly chart with four lines: total PRs, AI-assisted, agent-authored, and human-only. This is where you can spot trends. If the AI-assisted line is climbing but your total output stays flat, AI might just be replacing work your team would’ve done anyway. If both lines rise together, that’s the capacity gain you’re looking for.
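The weekly bucketing behind the chart could look like this. It assumes each PR record has a `merged_at` date and a `classification` (hypothetical field names), and buckets by the Monday of each week:

```python
from collections import defaultdict
from datetime import date, timedelta

def weekly_throughput(prs: list[dict]) -> dict[date, dict[str, int]]:
    """Bucket merged PRs into weeks and count each of the four series."""
    buckets: dict[date, dict[str, int]] = defaultdict(
        lambda: {"total": 0, "ai-assisted": 0, "agent-authored": 0, "human": 0}
    )
    for pr in prs:
        d = pr["merged_at"]
        week_start = d - timedelta(days=d.weekday())  # Monday of that week
        buckets[week_start]["total"] += 1
        buckets[week_start][pr["classification"]] += 1
    return dict(buckets)
```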
Quality: AI vs Human
This is the section that matters most. Three metrics, side by side:
| Metric | What It Tells You |
|---|---|
| Rework Rate | How often the author had to push fixes after review. If AI PRs have a higher rate, reviewers are catching more problems in generated code. |
| Avg PR Size | Lines changed per PR. AI tools tend to produce bigger PRs. Worth watching — large PRs get worse reviews. |
| Test Failure Rate | How often CI checks fail. A higher failure rate on AI PRs suggests the generated code isn’t being tested well enough before it’s pushed. |
Each metric is color-coded: blue when AI does better, amber when it does worse. You’ll know at a glance where the problems are.
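The side-by-side comparison amounts to computing the same three rates for each cohort and flagging the worse side. A sketch, assuming per-PR flags `had_rework`, `lines_changed`, and `ci_failed` (hypothetical names), and treating lower as better for all three metrics:

```python
def compare_quality(ai_prs: list[dict], human_prs: list[dict]) -> dict:
    """Compute rework rate, average PR size, and test failure rate per cohort."""
    def metrics(prs: list[dict]) -> dict:
        n = len(prs) or 1  # avoid division by zero on an empty cohort
        return {
            "rework_rate": sum(p["had_rework"] for p in prs) / n,
            "avg_pr_size": sum(p["lines_changed"] for p in prs) / n,
            "test_failure_rate": sum(p["ci_failed"] for p in prs) / n,
        }

    ai, human = metrics(ai_prs), metrics(human_prs)
    # Lower is better for all three: blue = AI better or equal, amber = AI worse
    return {
        m: {"ai": ai[m], "human": human[m],
            "color": "blue" if ai[m] <= human[m] else "amber"}
        for m in ai
    }
```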
Tool Breakdown
A bar chart showing PR count by tool. Useful if you’re paying for multiple AI tools and want to know which ones your team actually uses.
Tagging Bugs Back to PRs
When you’re investigating an incident or bug, you can link it to the PR that introduced it. Open the issue and set the “Introduced by PR” field. If that PR was AI-assisted or agent-authored, the quality metrics update automatically.
This is manual, so it only works if your team does it consistently. Make it part of your incident review and you’ll build up real data on whether AI-generated code introduces more defects than human-written code.
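Once issues carry that link, computing a defect rate per classification is straightforward. A sketch, assuming issues hold an `introduced_by_pr` id and a map from PR id to classification (both hypothetical shapes):

```python
from collections import Counter

def defect_rates(issues: list[dict], prs: dict[str, str]) -> dict[str, float]:
    """Defect rate per classification from manually tagged issues.

    Untagged issues are skipped, which is why inconsistent tagging
    leaves gaps in the quality data.
    """
    introduced = Counter(
        prs[i["introduced_by_pr"]]
        for i in issues
        if i.get("introduced_by_pr") in prs
    )
    totals = Counter(prs.values())
    return {c: introduced.get(c, 0) / totals[c] for c in totals}
```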
Organizational Readiness
If you’ve got the Surveys module turned on, the bottom of the AI Impact page shows readiness scores pulled from the “AI Capabilities Assessment” survey. It covers seven areas:
- AI Strategy — does your org have a clear plan for AI adoption?
- Platform Quality — how mature is your internal tooling?
- Data Quality — can AI tools trust the data they’re working with?
- Developer Experience Focus — are you investing in developer productivity?
- Learning Culture — is the team open to trying new tools?
- Architecture — are your systems ready for AI-generated contributions?
- User Focus — does AI output align with what users need?
You’ll also see self-reported time saved (hours per week) and a team sentiment score. If nobody’s run the survey yet, you’ll get a link to launch it.
Time Period
There’s a period selector in the top right. You can switch between 30-, 60-, 90-, and 180-day windows. The default is 90 days.
Getting Useful Data Out of This
Give it a few weeks before you read too much into the numbers. You need a decent sample of merged PRs before the ratios stabilize.
Once you’ve got baseline data, pay attention to quality, not just volume. A 40% AI PR ratio sounds impressive, but not if those PRs have double the rework rate. If you’re seeing quality problems, tighten your review guidelines for AI-generated code rather than slowing down adoption.
Run the readiness survey once a quarter. The numbers are only useful as a trend — a single snapshot doesn’t tell you much. And tag your bugs. The “Introduced by PR” link is the only way to connect incidents back to AI-generated code, so if your team skips it, the quality data will have gaps.