March 8, 2026

Using game-playing agents to teach

After an early morning beach walk with a classmate, I realized I hadn’t taken my house keys. My daughter would be sleeping, so I wandered with my phone. This is when I get ideas - often a dangerous time for my students. In this case, the idea was a rambling conversation with Claude that roughly begins with: As part of my Tools in Data Science course, I plan to create a Cloudflare worker which allows students to play a game using an API. The aim is to help them learn how to build or use AI coding agents to interact with APIs to solve problems. ...

Leaked key sociology

It’s impressive how easy it is to find leaked API keys in public repositories. I asked Codex to run trufflehog on ~5,000 student GitHub accounts and (so far, after a few hours, 15% coverage) it found quite a few. Some are intended to be public, like Google Custom Search Engine keys.

    const GOOGLE_API_KEY = "AIza...";
    const GOOGLE_CX = "211a...";

Some are Gemini API keys.

    api_key1 = "AIza..."

But what’s really impressive is, when I ran: ...
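To illustrate why keys like these are so easy to find: they share a recognizable prefix, so even a plain regular expression can flag them. This is a minimal sketch, not trufflehog's implementation. The 35-character tail length and the helper name `find_google_keys` are assumptions made for the example.

```python
import re

# Assumed pattern: "AIza" followed by 35 URL-safe characters.
GOOGLE_KEY_RE = re.compile(r"AIza[0-9A-Za-z_\-]{35}")


def find_google_keys(text: str) -> list[str]:
    """Return candidate Google-style keys, masked so they are safe to print."""
    return [m.group(0)[:8] + "..." for m in GOOGLE_KEY_RE.finditer(text)]


# Synthetic sample: a fake key built from the prefix plus filler characters.
sample = 'const GOOGLE_API_KEY = "AIza' + "x" * 35 + '";'
print(find_google_keys(sample))
```

A real scanner would also verify whether a candidate key is live. This sketch only matches the shape.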

Gemini CLI harness is not good enough

I’ve long felt that while the Gemini 3 Pro model is fairly good, the Gemini CLI harness isn’t. I saw an example of this today.

Me: Tell me the GitHub IDs of all students in this directory.

Gemini CLI: SearchText 'github' within ./ Found 100 matches (limited)
Sending this message (14606686 tokens) might exceed the remaining context window limit (1037604 tokens).

Me: Only send the (small) required snippets of data. Write code as required. ...
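A minimal sketch of what "write code as required" could look like here: scan the files locally and return only the unique IDs, so the matching file contents never enter the context window. It assumes the IDs appear as `github.com/<username>` URLs in text files. The helper name `github_ids` and the demo files are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Assumed: IDs appear as github.com/<username> inside text files.
GITHUB_ID_RE = re.compile(r"github\.com/([A-Za-z0-9][A-Za-z0-9-]{0,38})")


def github_ids(root: Path) -> list[str]:
    """Scan files under root locally and return only the unique usernames."""
    ids = set()
    for path in root.rglob("*"):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            ids.update(m.group(1) for m in GITHUB_ID_RE.finditer(text))
    return sorted(ids)


# Demo on a throwaway directory with hypothetical student files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.md").write_text("Repo: https://github.com/alice/project\n")
    (root / "b.txt").write_text("See github.com/bob-99 and github.com/alice\n")
    demo_ids = github_ids(root)

print(demo_ids)
```

Only the short list of IDs goes back to the model, which is a few tokens instead of millions.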

Things I Learned - 08 Mar 2026

This week, I learned:

- IITM has launched a 4 year degree in management & data science.
- “Use AI to replace early-career mentorship: use AI-driven synthetic practice when traditional apprenticeship pathways collapse. AI can generate personalized coaching, replacing the missing junior loop with training environments.” Jack Clark
- Observability is more than logging. It’s agents watching feeds and signalling insights!
- The GPT 5.4 prompt guidance is a bit complex, but here’s what it’s broadly saying: (Gemini)
  - It’ll over-complicate answers and front-end design unless you tell it exactly how you want it
  - It’ll keep checking with you or give up (e.g. on errors) unless you tell it otherwise, e.g. with checklists or rules
- Claude Code supports 32K output tokens by default. Since I generate large data stories, I usually hit this limit and lose an entire session. Setting the environment variable CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000 (which is the maximum) reduces this problem.
- Google Workspace CLI lets you run npx -y @googleworkspace/cli as a single unified service for all Google Workspace APIs. It follows agent-friendly CLI practices which I turned into a SKILL.md.
- I’ve been using mise use -g ubi:owner/repo to install GitHub packages. The ubi backend is now deprecated in favor of the new github backend. This works fine for most repos, with edge cases like jtroo/kanata which still require ubi:jtroo/kanata as of now.
- On the margin, I’ll likely switch to just, the command runner, as my task runner. (Claude)
- With AI now writing almost all of my code, I don’t see much need to format it. Code formatters like ruff, dprint, biome, etc. are not relevant when AI will be reading and writing the code, not humans. I just format the prompts in Markdown.
- Salt is the duct tape of food ingredients. Lemon juice, vinegar, butter/oil, onion/garlic, etc. are runners-up. (Claude)
- Claude’s prompt to import memory from other AI providers doesn’t seem to work with Claude’s free account: “No memories or stored context found.”