Wrapping a Harness Around Your AI Coding Agent
How I use CLAUDE.md, AGENTS.md, and their config directories to turn an AI coding tool into something I can actually trust, line by line, as a blind developer.
I use AI to write code, and I also happen to be blind. That second fact changes how I work with the first one. Most developers can glance at a block of generated code and feel like they have checked it. I cannot. Every line goes through a screen reader, one at a time, so I read what the model actually wrote instead of trusting the shape of it.
That constraint taught me something early that the rest of the industry is only now arguing about: you do not just prompt an AI, you harness it. The harness is the difference between a tool that helps you move faster and a tool that quietly buries assumptions in your codebase for you to find later.
This post is the in-depth version of that idea. I will show you exactly how I build a harness for the two agents I use most, Claude Code and OpenAI Codex, including the files and directories that make it work and the parts people get wrong.
What a harness actually is
When I started leaning on AI to write code, I noticed how much it assumed. It would look at the file in front of it and quietly decide the rest of the repository worked the same way, instead of tracing where the data came from or who depended on it. So I did what anyone does at first: I repeated myself. I told it over and over: do not assume, and confirm the facts before you change anything.
Repeating yourself does not scale, and it does not survive a fresh session. The fix is to stop typing the rules and start writing them into files the agent loads automatically, every single time. That is the harness. It has two halves:
- Instructions the agent reads at the start of every session, so your standards are the first thing it sees, before your prompt.
- Enforcement that runs on its own, so the standards are checked by your tooling rather than by your patience.
Both modern agents are built around exactly this pattern. They just use different filenames for it.
The Claude Code side: CLAUDE.md and the .claude directory
Claude Code reads a file called CLAUDE.md into context at the start of every session. Put it at the root of your project and commit it to git, and now every teammate, and every session, starts from the same rules. This is the heart of the harness.
Here is a trimmed version of my own CLAUDE.md:
# Project: codingblindtech.com
## Golden rules
- Never assume. If a fact is not in the code or the docs, ask or verify. Do not invent values or numbers.
- Confirm before you change. State what you are about to do and why, then do it.
- Trace data to its source. Before editing a function, find where its inputs come from and who calls it.
- Tests are not optional. Write the test first, then the code that satisfies it.
## Commands
- Install: `npm install`
- Test: `npm run test`
- Build: `npm run build`
## Boundaries
- Do not change the Astro `site` value or the Firebase or DNS config without asking.
A few things worth knowing about how this file loads, because the behavior is more useful than it first appears.
Claude Code does not read one file. It walks up the directory tree from wherever you launched it, collecting every CLAUDE.md it finds along the way and concatenating them. The collection is additive, not a precedence battle, so a rule in a parent folder and a rule in a subfolder both apply. Files nested below your current directory load lazily, only when Claude touches a file in that subtree. That means you can put repository-wide rules at the root and package-specific rules deeper in, and each layer shows up exactly where it is relevant.
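To make the upward walk concrete, here is a minimal sketch in TypeScript. It is a model of the behavior described above, not Claude Code's actual implementation: the function names are hypothetical, a Set of paths stands in for the filesystem, and the outermost-first ordering is an assumption. The lazy loading of nested files is not modeled.

```typescript
// Sketch: which CLAUDE.md files apply when a session starts in `cwd`.
// `existing` stands in for the filesystem so the logic stays easy to test.

/** Every ancestor of an absolute POSIX path, filesystem root first, `cwd` last. */
function ancestors(cwd: string): string[] {
  const dirs: string[] = ["/"];
  let current = "";
  for (const part of cwd.split("/")) {
    if (part.length === 0) continue; // skip the empty piece before the leading slash
    current += "/" + part;
    dirs.push(current);
  }
  return dirs;
}

/** CLAUDE.md files found between the filesystem root and `cwd`, outermost first. */
function collectInstructionFiles(cwd: string, existing: Set<string>): string[] {
  return ancestors(cwd)
    .map((dir) => (dir === "/" ? "/CLAUDE.md" : dir + "/CLAUDE.md"))
    .filter((file) => existing.has(file));
}

// Hypothetical repository layout.
const knownFiles = new Set<string>([
  "/repo/CLAUDE.md",
  "/repo/packages/api/CLAUDE.md",
  "/repo/packages/web/CLAUDE.md", // sibling subtree, not an ancestor of cwd
]);

const loaded = collectInstructionFiles("/repo/packages/api", knownFiles);
console.log(loaded);
```

The sibling package's file is left out because it is not on the path between the launch directory and the root, which is the additive, layer-by-layer behavior the paragraph above describes.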
There are also scope variants. A CLAUDE.local.md at the project root holds your private preferences for that project; create it by hand and add it to .gitignore so it stays off the team’s radar. A CLAUDE.md in your home directory under ~/.claude/ applies across all your projects, which is where personal habits live. On native Windows that home path resolves to %USERPROFILE%\.claude, and inside WSL it is your Linux home directory, which is the case for my own Windows plus WSL2 setup.
The .claude directory
CLAUDE.md is the instructions half. The enforcement half, plus everything more structured than a single file, lives in a .claude/ directory next to it. The pieces I lean on:
- CLAUDE.md at the project root: loaded every session and committed to git.
- CLAUDE.local.md at the project root: your private notes, kept out of git.
- .claude/settings.json: permissions, hooks, environment variables, and model defaults.
- .claude/rules/: topic-scoped instructions, optionally gated to certain file paths.
- .claude/skills/: reusable workflows you invoke by name.
- .claude/agents/: subagents, each with its own prompt and context.
- ~/.claude/: personal config that applies across all your projects.
The rules/ directory is for instructions that should only apply in certain places. A rules file can carry a paths: frontmatter block so it loads only when Claude is working on files that match. That keeps your TypeScript conventions out of the way when Claude is editing a config file, and vice versa:
---
paths:
- "src/**/*.ts"
---
# Verification rules for TypeScript
- Every new function ships with a Vitest test in the same change.
- No `any`. If a type is unknown, model it explicitly rather than papering over it.
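For intuition about what a paths pattern such as "src/**/*.ts" selects, here is a small TypeScript sketch of glob matching. It handles only `**`, `*`, and literal text, and it is an illustration under that assumption: the matcher Claude Code actually uses may support more syntax or differ in edge cases. Both function names are hypothetical.

```typescript
/** Convert a small glob subset (`**`, `*`, literal text) into an anchored RegExp. */
function globToRegExp(glob: string): RegExp {
  let pattern = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith("**/", i)) {
      pattern += "(?:.*/)?"; // zero or more whole directories
      i += 3;
    } else if (glob.startsWith("**", i)) {
      pattern += ".*"; // anything, including slashes
      i += 2;
    } else if (glob.charAt(i) === "*") {
      pattern += "[^/]*"; // anything inside a single path segment
      i += 1;
    } else {
      // Escape regex metacharacters so literal text matches literally.
      pattern += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp("^" + pattern + "$");
}

/** True when `file` matches at least one of the rule's path globs. */
function ruleApplies(paths: string[], file: string): boolean {
  return paths.some((glob) => globToRegExp(glob).test(file));
}

const tsRulePaths = ["src/**/*.ts"];
console.log(ruleApplies(tsRulePaths, "src/lib/date.ts")); // TypeScript under src/
console.log(ruleApplies(tsRulePaths, "astro.config.mjs")); // config file at the root
```

Under this sketch the TypeScript rules would load for files anywhere below src/ and stay out of the way for a root-level config file.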
The most important file for the enforcement half is .claude/settings.json, because that is where hooks live. A hook is a command that Claude Code runs automatically at a point in its lifecycle. This is the part of the harness that does not depend on the model choosing to behave. My favorite is a hook that runs the test suite after Claude edits or writes any file:
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "npm run test --silent" }
        ]
      }
    ]
  }
}
When that hook runs and the tests fail, the failure is fed straight back to Claude as part of the tool result, so it sees the broken test and fixes it before moving on. For someone who works test first, this is the harness made real: the rule in CLAUDE.md says write the test first, and the hook makes the green suite a condition of progress rather than a suggestion. If you want a hard stop instead of feedback, a PreToolUse hook can deny a tool call outright, which is how you block something like a destructive shell command before it ever executes.
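The hard-stop case can be sketched as a small decision function in TypeScript. Several things here are assumptions to check against the hooks reference for your version: that the hook receives JSON with tool_name and tool_input.command, that the shell tool is named Bash, and that exit code 2 is what blocks the call while the message on stderr is shown to the model. The deny list itself is hypothetical.

```typescript
// Assumed shape of the JSON a PreToolUse hook receives on stdin.
interface HookInput {
  tool_name: string;
  tool_input: { command?: string };
}

interface HookDecision {
  exitCode: 0 | 2; // assumption: 2 blocks the tool call, 0 lets it proceed
  message: string; // intended for stderr when blocking
}

// Hypothetical deny list; extend it for your own project.
const DESTRUCTIVE: RegExp[] = [
  /\brm\s+-(?=\w*r)(?=\w*f)\w+/, // rm with both r and f flags, e.g. -rf or -fr
  /\bgit\s+push\b.*--force/, // any forced push
  /\bgit\s+reset\s+--hard\b/, // discards local changes
];

/** Decide whether a tool call should be blocked before it runs. */
function decide(input: HookInput): HookDecision {
  if (input.tool_name !== "Bash") {
    return { exitCode: 0, message: "" }; // only shell commands are inspected
  }
  const command = input.tool_input.command ?? "";
  const hit = DESTRUCTIVE.some((re) => re.test(command));
  return hit
    ? { exitCode: 2, message: `Blocked destructive command: ${command}` }
    : { exitCode: 0, message: "" };
}

console.log(decide({ tool_name: "Bash", tool_input: { command: "rm -rf dist" } }));
console.log(decide({ tool_name: "Bash", tool_input: { command: "npm run test" } }));
```

A real hook script would read stdin, parse it into a HookInput, call decide, write the message to stderr, and exit with the returned code. Keeping the decision in a pure function means the deny list can be covered by ordinary unit tests.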
The same directory holds agents/, where you define subagents: focused helpers with their own prompt and their own slice of context. A reviewer subagent whose only job is to hunt for correctness, security, and test risk is a natural fit for the never assume rule, because it gives you a second independent pass over the work.
One security note, since I think about this for a living. The transcripts, prompt history, and file snapshots Claude Code keeps under ~/.claude/ are stored as plaintext. Anything that passes through a tool, including the contents of a .env file it reads, lands in that transcript on disk. The protection is your operating system’s file permissions and the cleanup window, nothing more. If you handle secrets, deny reads of credential files in your permission rules and keep that retention window short.
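As a rough illustration of that advice, this TypeScript sketch builds a settings fragment and prints it as JSON. The key names permissions.deny and cleanupPeriodDays, and the Read(...) rule syntax, are assumptions to confirm against the Claude Code settings reference. The paths and the seven-day value are hypothetical.

```typescript
// Assumed settings shape; confirm the key names against the settings reference.
interface HarnessSettings {
  permissions: { deny: string[] };
  cleanupPeriodDays: number;
}

const secretsSettings: HarnessSettings = {
  permissions: {
    // Hypothetical credential locations; list the ones your project really has.
    deny: ["Read(./.env)", "Read(./.env.*)", "Read(./secrets/**)"],
  },
  cleanupPeriodDays: 7, // hypothetical retention window, in days
};

// Print the fragment so it can be merged into .claude/settings.json by hand.
const settingsJson = JSON.stringify(secretsSettings, null, 2);
console.log(settingsJson);
```

Typing the fragment first catches a misspelled key at compile time, which matters for a rule whose whole job is to fail closed.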
The GPT side: AGENTS.md and the .codex directory
OpenAI Codex uses the same mental model with different names. The instructions file is AGENTS.md, a plain Markdown file that Codex concatenates into context at the start of every session. Think of it as a README written for the agent rather than for a human: a README explains what a project does, while AGENTS.md explains how the project should be worked on, what to run, and what to never touch.
# AGENTS.md
## Project
Static Astro blog deployed to Firebase Hosting on the free tier.
## Commands
- Install: `npm install`
- Test: `npm run test`
- Build: `npm run build`
## Conventions
- Never assume. Verify facts against the code, or ask, before proceeding.
- Trace inputs to their source before editing a function.
- Write the test first.
## Boundaries
- Do not modify deployment or DNS configuration without confirmation.
AGENTS.md is worth knowing about beyond Codex, because it is no longer one vendor’s idea. It started in Codex and has since been adopted as an open standard, now backed by the Agentic AI Foundation under the Linux Foundation and supported across Codex, Cursor, Gemini CLI, GitHub Copilot, Windsurf, and a long list of others. One instructions file, many tools. That is a real argument for putting your harness rules there even if Claude Code is your daily driver.
The discovery rules will feel familiar. Codex walks up from your working directory to the project root, which by default is the nearest folder containing a .git. It reads a global AGENTS.md from your home config first, then the repository root file, then any nested AGENTS.md closer to the work, with the closest file winning. You can override behavior per directory, and an AGENTS.override.md takes precedence over the regular file in the same folder.
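The load order described above can be sketched in TypeScript. This is a model under assumptions, not Codex's implementation: the function names are hypothetical, a Set of paths stands in for the filesystem, cwd is assumed to sit inside root, and the global layer is modeled as a single ~/.codex/AGENTS.md.

```typescript
/** Directories from `root` down to `cwd`, inclusive. Assumes `cwd` is inside `root`. */
function dirsFromRoot(root: string, cwd: string): string[] {
  const dirs = [root];
  let current = root;
  for (const part of cwd.slice(root.length).split("/")) {
    if (part.length === 0) continue;
    current += "/" + part;
    dirs.push(current);
  }
  return dirs;
}

/** In one directory, prefer AGENTS.override.md over AGENTS.md; null if neither exists. */
function pickInDir(dir: string, existing: Set<string>): string | null {
  for (const fileName of ["AGENTS.override.md", "AGENTS.md"]) {
    const candidate = dir + "/" + fileName;
    if (existing.has(candidate)) return candidate;
  }
  return null;
}

/** Instruction files in load order: global first, then root, then closer to the work. */
function codexInstructionChain(
  home: string,
  root: string,
  cwd: string,
  existing: Set<string>,
): string[] {
  const chain: string[] = [];
  const globalFile = home + "/.codex/AGENTS.md";
  if (existing.has(globalFile)) chain.push(globalFile);
  for (const dir of dirsFromRoot(root, cwd)) {
    const picked = pickInDir(dir, existing);
    if (picked !== null) chain.push(picked);
  }
  return chain;
}

// Hypothetical layout.
const agentFiles = new Set<string>([
  "/home/me/.codex/AGENTS.md",
  "/repo/AGENTS.md",
  "/repo/packages/api/AGENTS.md",
  "/repo/packages/api/AGENTS.override.md",
]);

const chain = codexInstructionChain("/home/me", "/repo", "/repo/packages/api", agentFiles);
console.log(chain);
```

Later entries sit closer to the working directory, so reading the chain from the end is a quick way to see which file has the final say on a conflicting rule.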
The .codex directory, and the .agents correction
Here is where I want to be precise, because there is a common misconception, and I would rather give you the real layout than a tidy one that is wrong.
There is no top-level .agents/ directory in the Codex world. The structure that actually exists is this:
- AGENTS.md at the project root: loaded every session, the README for agents.
- .codex/config.toml: the project-scoped config layer.
- .codex/agents/: project subagents, one TOML file per agent.
- ~/.codex/: personal config, including ~/.codex/AGENTS.md and a personal agents/ folder.
So the parallel to Claude Code’s .claude/ is .codex/, the parallel to CLAUDE.md is AGENTS.md, and subagents live in .codex/agents/, which is a subfolder of .codex, not a separate top-level folder. The string .agents.md does appear in the Codex docs, but only as an example of a custom fallback filename you could add to a list, not as a standard directory or file anyone is expected to use. If you went looking for a .agents/ folder to create, you would be building something the tool does not read.
The .codex/config.toml file is the config layer, the rough equivalent of settings.json. A couple of keys are directly useful for the harness. project_doc_max_bytes caps how many bytes of AGENTS.md content Codex reads before truncating, which matters if your rules grow long. project_doc_fallback_filenames lets Codex treat an existing file under a different name as an instructions file. And subagents are declared here too:
# ~/.codex/config.toml
project_doc_max_bytes = 65536
project_doc_fallback_filenames = ["TEAM_GUIDE.md"]
[agents.reviewer]
description = "Find correctness, security, and test risks before code is accepted."
config_file = "./agents/reviewer.toml"
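Since the byte limit only matters once your rules grow, a small check can warn you before you hit it. The TypeScript sketch below measures instruction text in UTF-8 bytes and compares it with a budget. It does not reproduce how Codex truncates; whether the limit applies per file or to the combined text is worth confirming in the Codex docs, so the sketch reports both sizes and treats the combined total as the conservative test. The names are hypothetical.

```typescript
/** UTF-8 size of a string in bytes. */
function byteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

interface BudgetReport {
  totalBytes: number; // combined size of all instruction files
  largestBytes: number; // size of the single biggest file
  overBudget: boolean; // conservative: true when the combined size exceeds the budget
}

/** Compare instruction file contents against a byte budget. */
function checkBudget(docs: string[], maxBytes: number): BudgetReport {
  const sizes = docs.map(byteLength);
  const totalBytes = sizes.reduce((sum, size) => sum + size, 0);
  const largestBytes = sizes.reduce((max, size) => Math.max(max, size), 0);
  return { totalBytes, largestBytes, overBudget: totalBytes > maxBytes };
}

// Bytes, not characters: the "ï" below takes two bytes in UTF-8.
const budgetReport = checkBudget(["# Rules\n", "naïve\n"], 65536);
console.log(budgetReport);
```

Run against the real file contents in a test or a pre-commit script, this gives a warning that the harness is outgrowing its budget before any rule quietly falls off the end.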
There is a security detail here too that mirrors the one above. If you mark a project as untrusted, Codex skips the project-scoped .codex/ layers entirely, including project-local config, hooks, and rules, while still loading your user and system config. That is a sensible default for cloning a repository you do not fully trust yet.
Where the two converge
Strip away the filenames and the two harnesses are the same shape, which is the real point.
Both put your instructions in a Markdown file the agent reads before your first prompt: CLAUDE.md and AGENTS.md. Both walk up the directory tree and concatenate what they find, so you layer general rules at the root and specific rules deeper in. Both separate committed team configuration from personal configuration that lives in your home directory. Both give you subagents for focused, independent passes. And with AGENTS.md becoming a cross-tool standard, the instructions half of your harness is increasingly portable: write the rules once, and Codex, Cursor, Gemini CLI, and others can read them.
The principle underneath
A harness is not a clever prompt. It is the decision to encode your standards as files and checks instead of retyping them and hoping. Instructions set the expectation: never assume, confirm the facts, trace the data to its source, write the test first. Enforcement makes the expectation real: a hook that runs your tests after every edit, a subagent that reviews for risk, a permission rule that keeps the agent away from your secrets.
I built this because I had no choice. I cannot eyeball generated code and move on, so I needed the assumptions to surface on their own instead of staying buried. But the constraint turned out to be an advantage worth sharing, because the harness does not care whether you can see the screen. It works the same whether the model is Claude or GPT, and it works the same for any developer who would rather trust their tooling than their luck. I have proven it on my own projects, and it is the first thing I set up now before I write a single prompt.