Be Digital · Case study
A Copilot-only coverage workflow
A team on GitHub Copilot Business — running Claude models, split across VS Code and IntelliJ — needed to lift a Java service past a 91% SonarQube coverage gate without touching production code, and without a runaway token bill. This is how I built a slim, Copilot-only workflow that drives the work autonomously, enforces "tests only" at the pipeline, and keeps the context window small on purpose.
- Goal: Line coverage ≥ 91% (SonarQube gate)
- Hard constraint: Add tests only; never edit src/main or pom.xml
- Tooling: GitHub Copilot (Agent mode) · Claude model · Maven · JUnit 5
- IDEs: VS Code & IntelliJ IDEA
- Profile: Copilot-only; no Claude Code, no CLAUDE.md
- Enforcement: CI diff gate + SonarQube quality gate
The problem
Coverage gates are easy to state and easy to game. The honest version of the task has three tensions pulling against each other. It has to be autonomous enough to be worth it — nobody wants to hand-write the fortieth boilerplate test. It has to be safe — a coverage push must never quietly "fix" a failing assertion by editing the code under test. And it has to be affordable — an agent that re-reads a giant coverage report on every step turns a cheap task into an expensive one.
The team had also standardized on one AI tool: GitHub Copilot Business, with Claude selected in the model picker. So the full framework I'd normally reach for — which pairs Claude Code (with its runtime write-blocking hooks) and Copilot — was overkill. Half of it would never run. The job was to strip it down to exactly what Copilot uses, while keeping the one feature that makes the loop worth running: autonomy.
The approach: a slim, Copilot-only profile
Instead of the full dual-tool config (a CLAUDE.md, a .claude/
directory with deny-rules and a pre-tool hook, plus the Copilot files), the slim profile ships
only what Copilot actually reads — and leans on CI for the enforcement the hooks used to provide.
Kept (Copilot reads it)
- copilot-instructions.md: auto-loaded policy
- Two prompt files: /cover and /increase-coverage
- rank_coverage.py: token-saving ranker
- .vscode/ settings + tasks; .sdkmanrc
- CI templates: guardrail · sonar · base
Removed — Claude Code only
- CLAUDE.md: never read by Copilot; the policy lives in copilot-instructions.md instead
- .claude/settings.json deny-rules: a Claude Code mechanism; doesn't load under Copilot
- .claude/hooks/guard-paths.sh: the write block; hooks only fire inside Claude Code
- .claude/commands/: Claude Code slash commands; Copilot uses prompt files
- IntelliJ Claude-panel run config
One exception, and it's repo-specific. When this profile was
built, .claude/skills/ belonged here too. As of April 2026,
Copilot reads project-level skills committed to the repo — per
GitHub's docs,
.github/skills/, .claude/skills/, or
.agents/skills/ — so a repo-checked-in .claude/skills/
is no longer invisible to Copilot. The distinction is repo-level vs. personal:
project skills committed to the repository are cross-tool, while home-directory
skills (~/.copilot/skills, ~/.claude/skills) stay
tool-specific and aren't shared across agents. This workflow uses no skills
either way — the autonomous loop runs from a prompt file, not a model-invoked
skill — so nothing in the profile changed; only the rationale did.
Giving Copilot the autonomous loop
The feature worth preserving is the autonomous /increase-coverage loop. People
assume that's a Claude Code exclusive — it isn't. VS Code Copilot Agent mode can
take multiple steps on its own: edit files, run terminal commands, read the results, and try
again. A prompt file with agent: agent frontmatter is the native equivalent of a
slash command, so the same loop runs entirely inside Copilot.
--- .github/prompts/increase-coverage.prompt.md ---
agent: agent
description: Autonomously raise coverage toward 91% (tests only)
# body: seed report → rank worst 5 → for each class write tests,
#       run mvn -B test, self-correct (≤3) → stop at 91% → summarize
Safety without the hook
This is the honest trade-off of going Copilot-only. In the full framework, a
.claude/ PreToolUse hook intercepts and denies writes to src/main
and pom.xml — and, when it matches Bash as well as the edit tools,
it also catches the back door of mutating files via sed -i or cat >, which
a path-scoped deny-rule on Edit/Write alone would miss. Worth being precise:
it's an agent-level guard — a script Claude Code runs in its own tool loop — not a kernel file
permission (that's the separate /sandbox feature). And it only fires inside Claude Code;
Copilot never sees it. So on this profile, "tests only" is held by two softer layers and one hard one.
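As an illustration of the hard layer, here is a minimal sketch of a tests-only diff gate. It is not the guardrail script from this profile (that one is a bash script); the function names, the `origin/main` base ref, and the choice to match `src/main/` and `pom.xml` at any directory depth are all assumptions made for the example.

```python
import subprocess

# Assumption: the protected paths are the two named in this article.
FORBIDDEN_DIRS = ("src/main/",)
FORBIDDEN_FILES = ("pom.xml",)


def forbidden_changes(changed_paths):
    """Return the changed paths that violate the tests-only rule."""
    hits = []
    for path in changed_paths:
        normalized = path.strip().replace("\\", "/")
        if not normalized:
            continue
        # Match at any depth so multi-module layouts are covered too.
        in_main = any(
            normalized.startswith(d) or f"/{d}" in normalized
            for d in FORBIDDEN_DIRS
        )
        is_pom = normalized.split("/")[-1] in FORBIDDEN_FILES
        if in_main or is_pom:
            hits.append(normalized)
    return hits


def changed_files(base_ref):
    """List files changed between base_ref and HEAD (requires git)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base_ref}...HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


def main(base_ref="origin/main"):
    """Return a process exit code: 0 if the diff is tests-only, 1 otherwise."""
    bad = forbidden_changes(changed_files(base_ref))
    if bad:
        print("Tests-only gate failed; protected files changed:")
        for path in bad:
            print(f"  {path}")
        return 1
    print("Tests-only gate passed.")
    return 0
```

A pipeline step would call `sys.exit(main())` so that a non-zero exit code fails the job. Because the check reads only the diff, it gives the same answer whichever agent or IDE produced the commit.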
The hard layer is the pipeline diff gate: it is non-negotiable and tool-agnostic.
Keeping it cheap: token engineering
An autonomous loop is a token amplifier. Each step re-sends the whole conversation as input, and every tool result — every Maven log, every file read — gets appended and re-sent on the next step. Left alone, the cost curve bends the wrong way. Two moves kept it flat.
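To make the amplification concrete, here is a small arithmetic sketch. The function name and every number in it are hypothetical, chosen only to show the shape of the curve: each step is billed for the full history so far, and that step's tool output is appended before the next one.

```python
def cumulative_input_tokens(base_tokens, tokens_added_per_step, steps):
    """Total input tokens when every step re-sends the whole history."""
    total = 0
    history = base_tokens
    for _ in range(steps):
        total += history                   # this step re-sends everything so far
        history += tokens_added_per_step   # tool output is appended for next step
    return total


# Hypothetical figures: 2,000-token starting context, 10 agent steps.
verbose = cumulative_input_tokens(2000, 1500, 10)  # noisy build logs each step
quiet = cumulative_input_tokens(2000, 300, 10)     # trimmed tool output
print(verbose, quiet)  # 87500 33500
```

With these made-up numbers, cutting per-step tool output from 1,500 to 300 tokens reduces total input from 87,500 to 33,500 tokens. The per-step output term grows with the square of the step count, which is why shrinking it matters more as the loop runs longer.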
First, compress the inputs. The agent needs to know which classes are least
covered; that lived in a jacoco.xml report ~6,000 tokens long (roughly 1,500+ lines). Pasting it every
turn was the single biggest cost. A 100-line script parses it once and emits a compact ranked
table — same decision quality, a fraction of the tokens.
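The ranking step could look roughly like the sketch below. This is not the actual rank_coverage.py; it assumes the standard JaCoCo XML layout, in which each `<class>` element carries its own `<counter type="LINE" missed="…" covered="…"/>` child. The function names and the sample report are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature report, shaped like JaCoCo's XML output.
SAMPLE = """<report name="demo">
  <package name="com/example">
    <class name="com/example/OrderService">
      <method name="place" desc="()V" line="10">
        <counter type="LINE" missed="1" covered="1"/>
      </method>
      <counter type="LINE" missed="30" covered="10"/>
    </class>
    <class name="com/example/PriceCalculator">
      <counter type="LINE" missed="5" covered="45"/>
    </class>
  </package>
</report>"""


def rank_classes(jacoco_xml_text, worst_n=5):
    """Return the worst_n classes as (name, covered, total, pct), lowest pct first."""
    root = ET.fromstring(jacoco_xml_text)
    rows = []
    for cls in root.iter("class"):
        # findall() sees direct children only, so per-method counters are skipped.
        for counter in cls.findall("counter"):
            if counter.get("type") != "LINE":
                continue
            missed = int(counter.get("missed"))
            covered = int(counter.get("covered"))
            total = missed + covered
            if total:
                rows.append((cls.get("name"), covered, total, 100.0 * covered / total))
    # Lowest coverage first; ties broken by the larger number of missed lines.
    rows.sort(key=lambda r: (r[3], -(r[2] - r[1])))
    return rows[:worst_n]


def format_table(rows):
    """Render the ranking as a compact, pipe-separated table."""
    lines = ["class | covered/total | line%"]
    for name, covered, total, pct in rows:
        lines.append(f"{name} | {covered}/{total} | {pct:.1f}")
    return "\n".join(lines)


print(format_table(rank_classes(SAMPLE)))
```

On a real project the input would be the report file's text (JaCoCo's Maven plugin writes it to target/site/jacoco/jacoco.xml by default). The agent then reads a handful of table rows instead of the full report.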
Second, keep tool output off the meter. Maven runs go through quietly
(-q), tests are scoped to the class in play rather than the whole suite, and each
class gets a fresh context (new conversation) so five rounds of build logs don't pile into one ballooning history.
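A scoped, quiet Maven run can be assembled as in the sketch below. The helper name is hypothetical; `-q` and `-B` are Maven's quiet and batch-mode flags, and `-Dtest=` is the Surefire property that limits the run to one test class. Whether the real prompt builds its commands this way is an assumption.

```python
def scoped_test_command(test_class, quiet=True):
    """Build a Maven command that runs a single test class in batch mode."""
    cmd = ["mvn", "-B"]
    if quiet:
        cmd.append("-q")  # suppress routine build logging
    cmd += ["test", f"-Dtest={test_class}"]
    return cmd


# Hypothetical test class name, for illustration only.
print(" ".join(scoped_test_command("OrderServiceTest")))
# mvn -B -q test -Dtest=OrderServiceTest
```

Returning a list rather than a single string means the command can be passed straight to `subprocess.run` without shell quoting.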
Two IDEs, one OS caveat
The team was split across editors, so the profile had to work in both. In VS Code,
the prompt files surface as real slash commands (/cover, /increase-coverage)
and .vscode/settings.json turns on agent mode and auto-approves the repeating safe
commands so the loop doesn't stop to confirm every mvn run.
In IntelliJ, the JetBrains plugin does now support .prompt.md files
(added in Copilot 1.5.54), but with two caveats that made manual paste the pragmatic choice for
this team: (a) prompt-file support is gated behind editor_preview_features, which
Copilot Business orgs frequently leave disabled; (b) the plugin only reliably discovers prompt
files created through its own settings UI — repo-committed .github/prompts/*.prompt.md
files often aren't recognized. So on a standardized Business plan, pasting the prompt body into
Copilot's agent chat remains the dependable path. copilot-instructions.md still
auto-loads in both.
One toolchain lever covers both: a .sdkmanrc pins Java and Maven, so sdk env
switches versions in one place, and the CI image is built from JAVA_VERSION/MAVEN_VERSION
variables to match. The shell pieces (setup script, the guardrail check, SDKMAN) assume bash — on
Windows that means Git Bash or WSL, which this team already used.
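One way to keep the CI variables in step with .sdkmanrc is to derive them from the file. The sketch below assumes the usual `candidate=version` line format of .sdkmanrc; the function names and the version strings in the example are hypothetical, and this profile may wire the variables differently.

```python
def parse_sdkmanrc(text):
    """Parse candidate=version lines, skipping blanks and # comments."""
    versions = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        versions[key.strip()] = value.strip()
    return versions


def ci_variables(versions):
    """Map SDKMAN candidates to the JAVA_VERSION / MAVEN_VERSION variable names."""
    mapping = {"java": "JAVA_VERSION", "maven": "MAVEN_VERSION"}
    return {mapping[k]: v for k, v in versions.items() if k in mapping}


# Hypothetical file contents; real version identifiers will differ.
example = "# pinned toolchain\njava=17.0.10-tem\nmaven=3.9.6\n"
print(ci_variables(parse_sdkmanrc(example)))
# {'JAVA_VERSION': '17.0.10-tem', 'MAVEN_VERSION': '3.9.6'}
```

SDKMAN Java identifiers include a vendor suffix (here `-tem`), so a CI image that expects a bare version number would need that suffix stripped first.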
What I'd carry to the next repo
- Ship only what the running tool reads. A config file an AI never loads isn't neutral — it's maintenance debt and a false sense of safety. The slim profile is seven files, and every one earns its place.
- Put enforcement where it can't be bypassed. Natural-language rules are guidance; the CI diff gate is the contract. Make the hard check tool- and IDE-agnostic.
- Autonomy and frugality aren't opposites. The same loop that accelerates the work will inflate the bill unless you compress inputs and keep tool output out of history.
- "Good" coverage asserts behavior. The prompt forbids tests that execute lines without asserting anything — the number is only worth raising if the tests would catch a regression.