Things I Learned - 02 Nov 2025

This week, I learned:

TVMaze API is an API for TV shows, episodes, cast, crew, etc. Useful for TV-related apps as well as learning APIs.
Awesome Skills is a curated list of prompts and skills for AI coding agents.
⭐ nokode is a API server that has no code: just LLMs responding. Interestingly, it is compliant. Just expensive, slow, forgetful and unreliable compared to code. All four are improving with time, indicating that coding may be transitional.
Notes from Vanya Seth’s keynote at OSAI HYD
- Superpowers of Gen AI to keep in mind when exploring AI coding agent use cases:
  - Translating. Requirements to code, code to code, language to queries, standard to standard.
  - Finding info just-in-time (in context). How does this work? What’s this error? What tools are permitted in my org? Who knows what? E.g. Atlassian Rovo queries across JIRA, Confluence, etc.
  - Brainstorming and ideation. Product ideation. Requirements. Testing gaps. Architecture review. Exploratory / scenario testing.
  - Summarizing and clustering. Change logs, incident management, research data, docs summary.
- Challenges in using AI coding agents:
  1. Adoption imbalance. Only certain roles are amplified by AI. Coding, QA, more than planning, maintenance, AI ops, etc. What’s the impact of this?
    - ⭐ Goldratt’s ToC implies that backlogs need to fill faster. Downstream becomes a bottleneck. Technical debt piles up.
    - ACTION: Use AI across entire value chain, from research to maintenance.
  2. Locality. enhances roles (nodes), not relationships (links). They optimize local work, not global flow. Workflow tools are missing.
    - Coordination overhead. Context Fragmentation. Translation problems.
    - ⭐ Expand productive roles to cover neighboring tasks. Productive developers shift left and build backlogs; shift right to reduce code review, maintenance tasks.
      - E.g. Move maintenance/production activities into development. Security, performance, monitoring, observability, cost, infrastructure.
      - We spend time on IDE, CI/CD, Jira, Confluence, Prod observability tools.
      - A typical Agent Development Platform (ADP) covers evals, guardrails, workflow builder, agent builder, observability, prompt management, AI gateway (LiteLLM), MCP servers, model fine-tuning, model serving, model repository, vector stores
      - We need ADP Agents covering delivery risk, continuous security, prod issues RCA, observability, performance, accessibility, product research, infra optiimzation, test data generation, anomaly detection, release management
      - ACTION: Share ADP photo with Patrick.
  - ACTION: ⭐ Centralize skills (“knowledge packs”) and MCPs and observe which gets used most. Allow people to use more.
  1. Lethal Trifecta. There’s growing demand for higher productivity with AI code assistants. But the lethal trifecta makes them an attack vector. It has access to sensitive information, exfiltrate data, and read and follow unsafe instructions.
    - Can lead to supply chain poisoning attacks.
    - Regulated industries cannot adopt.
  2. Technical debt growth. More productivity leads to poor code quality which will slow down future work.
    - See Software Engineering Excellence 2025
    - AI induced complacency.
    - Sunk-cost fallacy on AI-generated code hurts.
    - ACTION: Evaluate code quality continuously to reduce technical debt. Double-down on good engineering practices.
  3. Compliance.
    - Model residency. Self-hosting is required.
    - Data observability gaps. Data privacy, audit trails, etc. are concerns.
    - Token economics. $20/day happens in Thoughtworks. Token cost is subsidized.
    - Rogue AI usage. Use of dis-allowed tools; shadow IT.
    - ROI justification. Hard to quantify productivity gains.
  4. Adoption.
    - AI Literacy. Tap into organizational knowledge
    - Champions & communities of practice to support cross-pollination.
    - Use-case driven adoption. Teams identify based on AI superpowers.
    - AI playbook. Share what worked, what didn’t work.
AI automation is likely less if a high portion of work
- Has legal liability (e.g. pharmacist/judge vs shop attendant/lawyer)
- Is subjective (e.g. perfumer/auction appraiser vs lab chemist/insurance appraiser)
- Needs rapid contextual decisions (e.g. detective/fireman/ER vs parking enforcer)
- Via ChatGPT, Claude
parse-sse from Sindre Sorhus is a more standards-compliant, more likely-to-be-maintained alternative to my async-sse package.
Which is better: Comment A: 1 upvote, 0 downvotes (100% positive) or Comment B: 99 upvotes, 1 downvote (99% positive)? Use Wilson’s Lower Bound which measures “What % positive am I 95% confident of?” Claude
- Using this, we can measure metrics for tweets, like below. ChatGPT
- Popularity = (5 _ WLB(reposts / views) + 2 _ WLB(likes / views)) * Decay(half-life of 72 h)
- Memorability = (5 _ WLB(bookmarks / views) + 4 _ WLB(replies / views)) * Decay(half-life of 36 hours)
A nice visual “benchmark” of text-to-image and image editing models. Seadream 4, Gemini 2.5 Flash, and Qwen Image Edit lead. This includes examples like straightening te Tower of Pisa - which only Flux.1 and Seadream 4 do well on; or removing only the brown M&Ms - which only Qwen Image Edit manages to.
Arch is a pure LLM router. It supports multiple LLMs, flexible routing and observability but not auth.
From Codex docs
- Add custom prompts in ~/.codex/prompts/xyz.md and launch as /prompts:xyz. Optional: description: and argument-hint: in YAML front-matter. For example, create prompts to refactor, rewrite in a developer’s style, document AGENTS.md, identify re-usable code, etc.
- AGENTS.override.md overrides parent directory AGENTS.md. AGENTS.md appends to parent AGENTS.md. Fallback names are allowed.
- codex exec supports streaming JSON
- codex exec accepts a CODEX_API_KEY= environment variable. codex uses an OPENAI_API_KEY.
- You can configure which environment variables are passed to the shell
- Codex reads 32KB from AGENTS.md by default
Things that I currently follow and don’t follow from Peter Steinberger’s excellent Just Talk To It:
- Prefer Codex > Claude Code.
- Ask for options before executing
- Generate & review specs collaboratively
- You don’t need git worktrees
- Prefer subscriptions over API to reduce cost
- Store docs with code
- Give examples
- Use voice input
- Use Codex Web as a mobile inbox for ideas
- Prefer CLI over agentic platforms
- Prefer CLI tools over MCP
- Avoid ALL-CAPS for Codex. It follows instructions well
- Avoid sub-agents, RAG, etc.
- Iterate UI live. Watch changes
- Use 3-8 agents in parallel on a single repo.
- Make small, atomic commit checkpoints. Commit only what the agent touches
- Add ast-grep as a pre-commit hook to block rule violations.
- Keep custom prompts minimal (commit, automerge, massageprs, review, …). Just “commit” reduces context
- Cancel long tasks and ask what’s happening
- Prefer Medium over High reasoning. It decides level of thinking
- Share screenshots
- Use tmux to run CLIs persistently
- Schedule refactor time (20%). Use jscpd, knip, oxlint, …
- Don’t reset context. Cold start wastes time + tokens
- Write tests in the same context. Yields better tests, reveals bugs.
- Prototype in a separate folder / PR
- Queue continue messages** before stepping away
- Ask it to “Preserve intent and add comments at tricky spots”. Future you needs the WHY
- On hard problems, add “take your time”, “be comprehensive”, “read all related code”, “form hypotheses”, etc.
- Maintain an evolving AGENTS.md with product notes, naming, API patterns, test policy, ast-grep rules, etc. Delete stale guidelines
Fascinating implications from Quantifying Human-AI Synergy ChatGPT
- Models vary in ability to uplift humans. Don’t just use standalone model benchmarks.
- People vary in ability to work with AI. Don’t just measure solo skills. Reward AI collaboration ability (delegation, prompting, verification, revision, …)
- Train models to ask for missing Theory-of-Mind cues: goal, beliefs, constraints, audience, success test
- Train people by asking them to predict what the model will get right/wrong, and validate
- Design UI and models for synergy. UI: Surface/solicit assumptions, intent, uncertainty, constraints. Model: Infer & adapt to evolving user state.

OpenRouter image generation now includes GPT-5 Image Mini. An image costs about 1 cent. Here’s the code:

curl 'https://openrouter.ai/api/v1/chat/completions' \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    model: "openai/gpt-5-image-mini",
    messages: [{ role: "user", content: "Draw a cat" }],
    modalities: ["image"],
    image_config: { "aspect_ratio": "16:9" }
  }' | jq -r '.choices[0].message.images[0].image_url.url' | cut -c23- | base64 -d > cat.png

Related