Correcting Instruction Debt

Here’s another AI-generated post, with Anand’s editor notes. But I’ve also added my own version of the post below.

I told my “find a free calendar slot” script to “Avoid weekends and holidays”. Wednesday vanished. Turns out it’s a Singapore holiday (Anand: It’s Eid al-Adha), irrelevant for the people I was meeting in other zones. I’d debugged my own helpful rule. (Anand: What? What does “debugged my own helpful rule” even mean?)

Annoying. But revealing. I went to ChatGPT — not to fix the script, but to think: (Anand: True. I had no clue what to do.)

Could you maybe interview me to figure out what direction I might want to take this train of thought in…? Just ask me two or three questions.

(The whole point was to not take on more work myself.)

Two questions in, it named it: instruction debt. (Anand: which is such a cool term that I’ll keep it.)

Not “bad instructions,” because the original instruction was reasonable. The debt is created when a rule that once reduced cognitive load later creates invisible work, missed options, brittle behavior, or debugging cost.

That hit. The script obeyed too literally. I got no warning. Worst of all, I’d scored a self-goal — given my future self an instruction that would bother me, while believing I was being helpful.

I asked it to research further — and to mine my own agent logs as evidence. (Local MCP runs bash; ChatGPT can read ~/.codex, ~/.claude, ~/.copilot and run ~/code/scripts/agentlog.py directly.) It came back with a taxonomy. I asked it to stress-test against more correction turns and discard what didn’t survive. (Anand: Basically, I said, analyze my logs.)

It did. The robust categories, each grounded in an actual correction I’d made:

Objective framing — “don’t base teachability on scores… base it on the pattern of errors.” Wrong proxy. (Anand: Oh, yeah, I was trying to find patterns of errors in student submissions.)
Evidence/modeling — Ticketmaster classifier overfit on venue_name. Predictive, not causal. (Anand: True. Stupid model said, “tickets in this stadium sell more” as if it were actionable.)
Constraint semantics — the Singapore holiday. Hard filter where a warning would do.
State/action — Darwinbox: “Click Clockin” clocked me out. No pre/post-state check. (Anand: The button said “Clock in OR out”. I was clocked in. It clicked, thinking that’ll clock me in, without seeing that the button was already pressed.)
Representation/path — blog migration: “ALL LINKS relative” broke nested URLs. (Anand: Yeah, relative links in my blog have been problematic for 20 years.)
Validation: OCBC PDF: row balances passed, totals failed by SGD 6.9M. (Anand: I’m nowhere near this rich. Codex just messed up badly.)
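The State/action item is the easiest one to picture in code. Here is a minimal sketch of a pre/post-state check around a toggle button. `FakeAttendance`, `ClockState`, and `ensure_clocked_in` are hypothetical names made up for illustration; this is not Darwinbox's API or what my agent actually ran.

```python
from enum import Enum


class ClockState(Enum):
    IN = "in"
    OUT = "out"


class FakeAttendance:
    """Hypothetical stand-in for a page with a single 'Clock in OR out' toggle."""

    def __init__(self, state: ClockState):
        self.state = state

    def read_state(self) -> ClockState:
        return self.state

    def press_toggle(self) -> None:
        # A toggle flips whatever the current state is.
        self.state = ClockState.OUT if self.state is ClockState.IN else ClockState.IN


def ensure_clocked_in(page: FakeAttendance) -> str:
    """Click only if needed (pre-check), then confirm the result (post-check)."""
    if page.read_state() is ClockState.IN:
        return "already clocked in; no click"
    page.press_toggle()
    if page.read_state() is not ClockState.IN:
        raise RuntimeError("clicked, but the state is not 'in'")
    return "clocked in"
```

The point of the sketch: the instruction becomes "make sure I am clocked in", not "click the button", so an already-pressed toggle is left alone.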

ChatGPT’s own self-critique was the best part:

“Lack of carefulness” should not be a category. It is not actionable. (Anand: No idea what this means!)

Then the pivot. It proposed a 60-line “Operating Contract” for my AGENTS.md. I pushed back:

The operating contract is WAY too long. I was thinking 1 line, not 60… fast and frugal heuristics that cover the majority of the scenarios, rather than hard-coding everything, is what we’re suggesting coding agents do in the first place.

It came back with three lines. I pasted them in verbatim:

For non-trivial tasks, define the user-visible invariant: "done means ___"; verify that invariant before claiming success.
Treat constraints as soft preferences unless safety, privacy, data loss, credentials, or the current request makes them hard; surface any constraint that filters, skips, blocks, or deletes.
Prefer simple, rerunnable changes: inspect real inputs/state first, use existing tools/libs, log counts/examples, and call out uncertainty.

Line 1 catches OCBC and clock-in. Line 2 catches the Singapore holiday. Line 3 catches the toil.
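For the bank statement case, "done means ___" could be something like "opening balance plus all parsed amounts equals the closing balance". A rough sketch of that check follows. The function name and the numbers are invented, and I am assuming the statement gives an opening and a closing balance; how the real parser checked things is not something I looked into.

```python
from decimal import Decimal


def verify_done(opening: Decimal, amounts: list[Decimal], closing: Decimal):
    """Check the statement-level invariant and report counts alongside it."""
    net = sum(amounts, Decimal("0"))
    ok = opening + net == closing
    # Log counts and totals so a mismatch is visible, not silently passed.
    report = f"{len(amounts)} rows, net {net}, expected {closing - opening}"
    return ok, report


# Made-up numbers, purely for illustration.
ok, report = verify_done(
    Decimal("100.00"),
    [Decimal("-20.00"), Decimal("5.50")],
    Decimal("85.50"),
)
print(ok, report)
```

Row-by-row checks can all pass while this one fails (for example, if rows were dropped), which is why the invariant is stated at the level the user cares about.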

Don’t add the rule to your head. Add it to the file the agent already reads. (Anand: Oh, so totally true!)

Actually, the first half of the above AI-generated post didn’t really resonate with me. So let me explain in my own words what I did.

I found that, for some reason, this Wednesday never appears when I ask it to find a free calendar slot.
I asked Codex, “Why on earth is this happening?” It said, because you told me to exclude holidays.
That got me thinking, where am I giving instructions that shoot me in the foot? And ChatGPT did a long, detailed analysis of my coding agent logs and came up with a bunch of examples and categorization.
I didn’t bother reading it. I told it in Henry Kissinger style: can you do better?
I didn’t bother reading it again. I told it, “Just tell me what to put into AGENTS.md”. I don’t want to do the work every time. YOU do the work. Automate it!
It gave me 60 lines. I said, “What rubbish! I can’t review 60. Just 3, max.”
I copied that into AGENTS.md.

For non-trivial tasks, define the user-visible invariant: “done means ___”; verify that invariant before claiming success.

Treat constraints as soft preferences unless safety, privacy, data loss, credentials, or the current request makes them hard; surface any constraint that filters, skips, blocks, or deletes.

Prefer simple, rerunnable changes: inspect real inputs/state first, use existing tools/libs, log counts/examples, and call out uncertainty.

The first makes total sense. Define “done”.
The second makes some sense - that’s exactly what I did wrong with the calendar.
The third is supposed to “handle my recurring style” - and kind of makes sense, so I’ll let it be.
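To make the second line concrete, here is a small sketch of what "soft by default, and surface what you skip" might look like for the calendar case. `find_days`, the `hard_holidays` flag, and the dates are all hypothetical; my actual script is not shown here and may look nothing like this.

```python
from datetime import date, timedelta


def find_days(start: date, num_days: int, holidays: dict, hard_holidays: bool = False):
    """Return (kept_days, notes). Holidays are flagged, not dropped, unless hard."""
    kept, notes = [], []
    for offset in range(num_days):
        day = start + timedelta(days=offset)
        if day.weekday() >= 5:  # Saturday is 5, Sunday is 6
            notes.append(f"skipped {day}: weekend")
            continue
        name = holidays.get(day)
        if name and hard_holidays:
            notes.append(f"skipped {day}: holiday ({name})")
            continue
        if name:
            notes.append(f"kept {day} despite holiday ({name})")
        kept.append(day)
    return kept, notes


# Made-up example: one holiday on a weekday in an arbitrary week.
kept, notes = find_days(date(2030, 1, 7), 7, {date(2030, 1, 9): "example holiday"})
for line in notes:
    print(line)
```

Either way, the holiday shows up in the notes, so a vanished Wednesday would at least come with an explanation.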
