Be Digital · Field notes · Part two

Wiring the playbook: Claude Code across a polyglot platform

The first note was the mental model — keep the always-on layer lean, push everything else off the meter. This one is the wiring: how that model becomes a real setup across four related apps in different languages, and the one discipline that decides whether any of it holds up over a long session.

← Part one: Why Claude “Skills” quietly change the economics of AI work

The two memories you’re actually managing

Claude Code loads two memory systems at the start of every session — and the split matters, because you write one and the model writes the other. CLAUDE.md is yours: instructions, conventions, architecture, loaded in full every time. Auto memory is the model’s: build commands that worked, a Testcontainers quirk, a flaky-test workaround — accumulated on its own, with the first 200 lines reloaded each session.

The division of labor writes itself. Hard rules and routing go in CLAUDE.md, where they load whole and survive compaction. Incidental learnings go to auto memory. The trap is putting a must-follow rule in the auto file: past line 200 it silently drops, and you’re left wondering why the model forgot.
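A minimal sketch of that trap, assuming the auto-memory file is plain text and that a must-follow rule can be spotted by keywords such as MUST or NEVER. The function name and the keyword list are hypothetical, picked for illustration; nothing here is a Claude Code API. It applies the 200-line cutoff described above and reports any rule-like lines that would not be reloaded.

```python
RELOAD_LIMIT = 200  # lines reloaded each session, per the note above
RULE_MARKERS = ("MUST", "NEVER", "ALWAYS")  # hypothetical heuristic, not a Claude Code feature


def rules_past_cutoff(memory_text: str, limit: int = RELOAD_LIMIT) -> list[tuple[int, str]]:
    """Return (line_number, text) for rule-like lines beyond the reload limit."""
    dropped = []
    for number, line in enumerate(memory_text.splitlines(), start=1):
        if number > limit and any(marker in line for marker in RULE_MARKERS):
            dropped.append((number, line.strip()))
    return dropped


if __name__ == "__main__":
    # 205 incidental notes, then a hard rule that lands on line 206.
    notes = ["- gradle build worked with --no-daemon"] * 205
    notes.append("- NEVER commit secrets")
    for number, text in rules_past_cutoff("\n".join(notes)):
        print(f"line {number}: move to CLAUDE.md -> {text}")
```

Anything this flags is a candidate to move into CLAUDE.md, where it loads whole.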

Why per-app memory is almost free

Claude reads memory by walking up the tree — from wherever you’re working toward the root, concatenating every CLAUDE.md it passes. Files in subtrees below you don’t load until the model actually reads something there. That single mechanic is the whole reason a four-app setup stays cheap: open a session inside the grid service and you pay for the root file plus that one app’s file — never the other three.
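The walk-up can be sketched in a few lines. This is an illustration of the mechanic as described above, not Claude Code's actual loader: it assumes the walk stops at a repo root you pass in, and the app directory names (grid, edge, core, timeseries) are made up for the demo.

```python
from pathlib import Path


def memory_files_for(cwd: Path, root: Path) -> list[Path]:
    """Collect CLAUDE.md files from cwd up to root, returned root-first."""
    cwd, root = cwd.resolve(), root.resolve()
    if cwd != root and root not in cwd.parents:
        raise ValueError(f"{cwd} is not inside {root}")
    found = []
    for directory in [cwd, *cwd.parents]:
        candidate = directory / "CLAUDE.md"
        if candidate.is_file():
            found.append(candidate)
        if directory == root:
            break  # assumption: stop at the repo root
    return list(reversed(found))


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        (root / "CLAUDE.md").write_text("root rules\n")
        for app in ("grid", "edge", "core", "timeseries"):  # hypothetical layout
            (root / app).mkdir()
            (root / app / "CLAUDE.md").write_text(f"{app} rules\n")
        # A session inside grid/ never visits the three sibling directories.
        for path in memory_files_for(root / "grid", root):
            print(path.relative_to(root))
```

Because the loop only ever moves toward the root, sibling apps are never visited, which is the whole cost argument.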

Time-series API: QuestDB · ILP + SQL
In-memory grid: Hazelcast 5.x
Edge service: Node 20 · TypeScript
Core service: Spring Boot 3 · Java 21

Three JVM apps and a Node app — one quality project each. Polyglot is the interesting constraint: coverage tooling forks (JaCoCo on one side, c8 on the other), so the rules have to fork with it.

So where does each thing actually live?

The placement rule is the cost rule. Anything paid every session stays tiny; everything else gets pushed to a layer that loads only when it’s earned.

Always-on tax (CLAUDE.md + skill descriptions): The four-app map, build commands per stack, the handful of conventions that change behavior. Mine came in under 50 lines. Everything else moves down.
On the matched path (.claude/rules/*.md · paths globs): Language rules that load only when a matching file is touched. Java rules stay out of a Node session entirely, and vice versa. This is where the polyglot split lives.
On trigger (.claude/skills/*/SKILL.md): The procedures: raise-coverage, fix-vulnerability, triage, the cluster-upgrade and data-tier checks. Only the one-line description is resident; the body loads when the task calls for it.
Off the main thread (subagents): Heavy exploration runs in its own context window; only the final report returns. A cross-app security crawl comes back as a one-page list instead of a blown budget.
The org floor (managed-policy CLAUDE.md): In a regulated shop, the non-negotiables (no secret commits, no unreviewed prod config) go here. It can’t be excluded by an individual’s settings. That’s the point.
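To make the matched-path layer concrete, here is a rough sketch of the selection logic: given a touched file, which rule files would load. The rule file names and globs are hypothetical, and Python's fnmatch is only a stand-in for whatever glob semantics the tool actually uses (notably, fnmatch lets `*` cross directory separators).

```python
from fnmatch import fnmatch

# Hypothetical rule files and globs; the real ones would live in .claude/rules/*.md.
RULES = {
    "java.md": ["*.java", "*/pom.xml"],
    "typescript.md": ["*.ts", "*/package.json"],
}


def rules_for(touched_file: str, rules: dict[str, list[str]] = RULES) -> list[str]:
    """Return the rule files whose globs match the touched path, sorted by name."""
    return sorted(
        name
        for name, globs in rules.items()
        if any(fnmatch(touched_file, pattern) for pattern in globs)
    )


if __name__ == "__main__":
    for path in ("core/src/main/java/App.java", "edge/src/index.ts", "README.md"):
        print(path, "->", rules_for(path) or "no language rules loaded")
```

A Node session touching only `.ts` files never selects `java.md`, so the Java rules stay off the meter.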

Guarding the irreversible — cheaply

Two operations in this platform can’t be taken back: a rolling upgrade on the grid, and a retention or partition change on the time-series store. You don’t want those rules sitting in always-on memory diluting everything else — but you do want them to appear, loudly, at the exact moment they’re relevant.

Conditional priority blocks do that. Wrap the rule so it surfaces only on the matching condition:

<important if="rolling upgrade, member add/remove, or cluster restart">
Always dry-run first. One member at a time. Enforce the quorum stop rule: if removing the next member would drop active size below the minimum, stop.
</important>

Kept in a path-scoped rule file, the block costs nothing until the path matches. When it does, it’s the most prominent thing in context. The same pattern gates the data-tier work: name the partitions in scope, confirm the environment, and never issue a destructive change without an explicit go-ahead in the prompt.
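The quorum stop rule in that block is simple enough to write down. The sketch below is a dry-run planner only: it makes no Hazelcast calls, the names are hypothetical, and it assumes each upgraded member rejoins before the next one is taken out, so the active size is restored between steps.

```python
from dataclasses import dataclass


@dataclass
class Step:
    member: str
    action: str  # "upgrade" or "stop"
    reason: str


def plan_rolling_upgrade(members: list[str], active_size: int, min_size: int) -> list[Step]:
    """Dry-run plan: one member at a time, halting where quorum would break."""
    plan = []
    for member in members:
        # Taking this member out leaves active_size - 1 members serving.
        if active_size - 1 < min_size:
            plan.append(
                Step(member, "stop", f"removal would leave {active_size - 1} < {min_size}")
            )
            break
        # Assumption: the member rejoins before the next step, so active_size is unchanged.
        plan.append(Step(member, "upgrade", "quorum holds"))
    return plan


if __name__ == "__main__":
    for step in plan_rolling_upgrade(["m1", "m2", "m3"], active_size=3, min_size=2):
        print(step.member, step.action, "-", step.reason)
```

Running the planner before any real change is the "always dry-run first" half of the rule; the early `break` is the stop half.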

The discipline nobody mentions: the live budget

Keeping the files lean is only half of it. The other half is the running context window during a session — and there’s a well-worn set of thresholds for where quality starts to slip.

The thresholds are practitioner rules of thumb, not hard limits — but they hold up. Subagents are the highest-leverage move: keep the heavy work off the main thread.

This is also why the fix loops in the setup carry a hard three-cycle cap: scan, fix, re-scan, and if a gate still fails after three rounds, stop and report. It’s a quality control and a budget control at once — it stops the agent from grinding a stubborn gate straight into the dumb zone.
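A capped loop of this shape might look like the following. It is a sketch under assumptions: `scan` and `fix` are placeholder callables standing in for whatever scanner and fixer a given skill wires up, and the findings in the demo are invented.

```python
from typing import Callable

MAX_CYCLES = 3  # the hard cap named above


def run_fix_loop(
    scan: Callable[[], list[str]],
    fix: Callable[[list[str]], None],
    max_cycles: int = MAX_CYCLES,
) -> tuple[bool, list[str]]:
    """Scan, fix, re-scan; return (passed, remaining_findings)."""
    findings = scan()
    for _ in range(max_cycles):
        if not findings:
            return True, []
        fix(findings)
        findings = scan()
    # Cap reached: stop and report whatever is left instead of grinding on.
    return not findings, findings


if __name__ == "__main__":
    backlog = ["CVE-A", "CVE-B", "stubborn-gate"]  # made-up findings

    def scan() -> list[str]:
        return list(backlog)

    def fix(findings: list[str]) -> None:
        # Pretend each round fixes one item, but never the stubborn one.
        for item in findings:
            if item != "stubborn-gate":
                backlog.remove(item)
                break

    passed, remaining = run_fix_loop(scan, fix)
    print("passed" if passed else f"stopped at the cap, still failing: {remaining}")
```

Returning the remaining findings rather than raising keeps the report step explicit: the caller decides what to do with a gate that would not close.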

Under 50 lines in the root CLAUDE.md: the entire always-on tax
Skills: costing ~nothing until one fires
3-cycle cap on every scan-fix loop
<30% context in use: the high-stakes ceiling
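The high-stakes ceiling can be turned into a simple pre-flight check. This is a sketch only: the 30% figure is the rule of thumb quoted above, while the function names and the idea of passing token counts in by hand are assumptions made for illustration.

```python
HIGH_STAKES_CEILING = 0.30  # the "<30% context in use" rule of thumb above


def context_in_use(used_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window currently occupied."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens


def safe_for_high_stakes(used_tokens: int, window_tokens: int) -> bool:
    """True while usage is strictly below the high-stakes ceiling."""
    return context_in_use(used_tokens, window_tokens) < HIGH_STAKES_CEILING


if __name__ == "__main__":
    # Hypothetical numbers: 50k tokens used in a 200k-token window.
    if safe_for_high_stakes(50_000, 200_000):
        print("under the ceiling: fine to start the irreversible work")
    else:
        print("over the ceiling: hand off to a subagent or start a fresh session")
```

The point of the check is timing: run it before an irreversible operation starts, not halfway through one.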

The open question, answered

Last time I said I was still chewing on how far to push repo “maps.” Here’s where I landed. A full, hand-maintained map of four apps in always-on memory is a fossil — stale within a sprint, taxed every session, competing for attention with the rules that actually matter. So: keep a one-line-per-app “at a glance” in the root for routing, push the real map to an imported doc with a single home, and let auto memory plus on-demand file reads carry the deep structure. The model reading the actual code is more current than any map I’d maintain by hand. Enrich the at-a-glance lines only if it starts mis-routing — not before.

Three take-aways I’m keeping

Push facts down the tree. Memory loads upward, so per-app files are cheap — only the app you’re in gets paid for.
Guard the irreversible with conditional rules. They cost nothing until the path matches, then they’re the loudest thing in the room.
Manage the live budget, not just the files. Subagents and a fix cap keep a long session out of the dumb zone — that’s where the quality actually leaks.

