This week, I learned:

  • There’s a very interesting HN discussion on the AI coding of the Cloudflare Workers OAuth Provider. My takeaways: #ai-coding
    • Write very comprehensive specs.
    • Use LLM to create the specs.
    • Reviewing is a skill we need to develop.
    • Understanding others’ code takes effort.
    • But LLM code is easier to review because it’s immediate and has no ego.
    • Unit tests are critical.
    • Use LLMs for well-understood specs, APIs, platforms and libraries to really save time.
    • Logic-less stuff like Markdown, JSON and HTML templates is a LOT easier to verify. Do more of that.
    • We can only make so many decisions in a day. AI coding saves us that effort.
    • Experts are not experts in every area. They benefit from LLMs in other areas.
    • LLMs are great for rubber ducking. Speaking and speccing really help.
    • LLMs make mistakes. So do most humans.
    • LLM speed makes coding more exhausting.
    • Use LLMs to understand codebases.
    • AI coding could reduce demand for developers. E.g. sysadmin demand plummeted with cloud infra and infrastructure-as-code.
    • But, niche use cases could grow, like how demand for photographers grew despite point-and-shoot cameras.
    • Transaction cost of hiring even 1 person is high and that will likely be a bottleneck. Plus people can use LLMs themselves, so that will dampen niche demand.
  • Google introduced Google Vids last year. It’s a video creator styled like PowerPoint. Looks promising.
  • FastMCP looks like an easy way to build MCPs. (Yet to try it)
  • O3 and, to a lesser extent, Claude Sonnet 4 are the models that can accurately summarize complex subjects and create a list of links without hallucinations. Ref
  • Claude Trace lets you record all interactions with Claude Code.
  • ElevenLabs now supports emotion and interruption. Ref
  • Thinking longer alone is not enough to scale intelligence. We need better models, too. Ref
  • Indian High Court judgements are now available as a public dataset on AWS and updated periodically. Ref
  • A few observations on AI code editors’ styles:
    • O3 is better at finding bugs than Jules, which tends to try and fix them rather than discover them.
    • Codex writes more minimal edits in PRs than Jules, which is more verbose.
    • Claude Code remains the best at faithfully creating and updating front-end apps.
  • Deep Research is great for fact-checking my notes! ChatGPT
  • Web-Bench evaluates LLMs in web development. Claude Sonnet remains ahead.
  • Vision language models heavily rely on past training and miss changes they don’t expect. Ref
  • Pure CSS tooltips are possible. Julia Evans
  • Google has an OAuth Playground which is a convenient way to get a temporary OAuth token.
  • At the moment, the best speech-to-text for Android appears to be ChatGPT’s transcription. The default Android speech-to-text (which I thought was good) no longer feels adequate. Gemini mishears and doesn’t wait till I’m done. Whisper ASR has poor noise cancellation and a 30-second limit.
  • anyascii is a better alternative to unidecode. It supports more characters and also supports transliteration. I use it to strip out non-ASCII in ChatGPT’s output. Commit
  • DeepWiki creates docs for humans from GitHub repos. Example. It’s verbose, human-facing, and does not understand the nuances of context and implications. Context7 creates llms.txt for LLMs. Example. It’s concise, example-oriented, and works only if relevant code snippets (e.g. API calls) can be generated from the codebase. It’s like creating an llms.txt automatically, e.g. https://context7.com/textualize/textual/llms.txt #ai-coding
  • We will move towards an organization structure where developers are embedded with business teams rather than working as a separate group. Sort of like embedded executive assistants instead of a central typing pool. Making AI Work
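
On the anyascii note above: here is a minimal sketch of how stripping non-ASCII from LLM output might look in Python. It assumes the third-party anyascii package is installed (pip install anyascii). The helper name to_ascii and the unicodedata fallback are my own illustration, not the code from the linked commit, and the fallback is much cruder than anyascii because it simply drops anything it cannot decompose.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Return an ASCII-only version of text (hypothetical helper)."""
    try:
        # Third-party package: pip install anyascii
        from anyascii import anyascii
    except ImportError:
        # Fallback (an assumption, not part of anyascii): decompose accented
        # characters, then drop whatever is still not ASCII.
        decomposed = unicodedata.normalize("NFKD", text)
        return decomposed.encode("ascii", "ignore").decode("ascii")
    # anyascii transliterates each character to an ASCII approximation.
    return anyascii(text)


if __name__ == "__main__":
    print(to_ascii("Café déjà vu"))
```

With anyascii installed, characters such as curly quotes are transliterated to ASCII equivalents; with the fallback they are removed, so the two paths can differ on anything other than accented Latin letters.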