This week, I learned:
- There’s a very interesting HN discussion on the AI-assisted coding of Cloudflare’s Workers OAuth Provider. My takeaways: #ai-coding
  - Write very comprehensive specs.
  - Use LLM to create the specs.
  - Reviewing is a skill we need to develop.
  - Understanding others’ code takes effort.
  - But LLM code is easier to review because it’s immediate and has no ego.
  - Unit tests are critical.
  - Use LLMs for well understood specs, APIs, platforms and libraries to really save time.
  - Logic-less stuff like Markdown, JSON and HTML templates are a LOT easier to verify. Do more of that.
  - We can only make so many decisions in a day. AI coding saves us that effort.
  - Experts are not experts in every area. They benefit from LLMs in other areas.
  - LLMs are great for rubber ducking. Speaking and speccing really help.
  - LLMs make mistakes. So do most humans.
  - LLM speed makes coding more exhausting.
  - Use LLMs to understand codebases.
  - AI coding could reduce demand for developers. E.g. Sysadmin demand plummeted with cloud infra and infrastructure-as-code.
  - But, niche use cases could grow, like how demand for photographers grew despite point-and-shoot cameras.
  - Transaction cost of hiring even 1 person is high and that will likely be a bottleneck. Plus people can use LLMs themselves, so that will dampen niche demand.
- Google introduced Google Vids last year. It’s a video creator styled like PowerPoint. Looks promising.
- FastMCP looks like an easy way to build MCPs. (Yet to try it)
- o3 and, to a lesser extent, Claude Sonnet 4 are the models that can accurately summarize complex subjects and create a list of links without hallucinations. Ref
- Claude Trace lets you record all interactions with Claude Code.
- ElevenLabs now supports emotion and interruption. Ref
- Thinking longer alone is not enough to scale intelligence. We need better models, too. Ref
- Indian High Court judgements are now available as a public dataset on AWS and updated periodically. Ref
- A few observations on AI code editors’ styles.
  - o3 is better at finding bugs than Jules, which tends to try and fix them rather than discover them.
  - Codex writes more minimal edits in PRs than Jules, which is more verbose.
  - Claude Code remains the best at faithfully creating and updating front-end apps.
- Deep Research is great for fact-checking my notes! ChatGPT
- Web bench evaluates LLMs in web development. Claude Sonnet remains ahead.
- Vision language models heavily rely on past training and miss changes they don’t expect. Ref
- Pure CSS tooltips are possible. Julia Evans
- Google has an OAuth Playground which is a convenient way to get a temporary OAuth token.
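
A minimal sketch of using a Playground token once it's copied out, assuming the token was granted the `openid` / `email` scopes and that Google's userinfo endpoint is the target. `build_request` and `fetch_json` are hypothetical helper names, and the token string is a placeholder.

```python
import json
import urllib.request

# Google's OpenID Connect userinfo endpoint (assumes the token carries openid/email scopes).
USERINFO_URL = "https://www.googleapis.com/oauth2/v3/userinfo"


def build_request(access_token: str, url: str = USERINFO_URL) -> urllib.request.Request:
    """Attach the temporary token as a standard Bearer credential."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
    )


def fetch_json(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Placeholder: paste the access token shown by the OAuth Playground here.
request = build_request("PASTE_TOKEN_FROM_PLAYGROUND")
print(request.full_url)
```

Calling `fetch_json(request)` with a real token returns the account's profile fields. Playground tokens are short-lived, so this suits quick experiments rather than anything long-running.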
- At the moment, the best speech-to-text for Android appears to be ChatGPT’s transcription. The default Android speech-to-text (which I thought was good) no longer feels adequate. Gemini mis-hears and doesn’t wait till I’m done. Whisper ASR has poor noise cancellation and a 30-second limit.
- anyascii is a better alternative to unidecode. It supports more characters and also supports transliteration. I use it to strip out non-ASCII in ChatGPT’s output. Commit
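
A rough sketch of the stripping step, not the code from the commit above. It assumes the `anyascii` package is installed (`pip install anyascii`) and falls back to a cruder standard-library approach when it isn't. `to_ascii` is a hypothetical helper name.

```python
import unicodedata

try:
    from anyascii import anyascii  # third-party: pip install anyascii
except ImportError:  # fall back to the standard library if it's missing
    anyascii = None


def to_ascii(text: str) -> str:
    """Return an ASCII-only version of text."""
    if anyascii is not None:
        # anyascii transliterates, so most characters map to a readable ASCII form.
        return anyascii(text)
    # Fallback: split accented letters into base + combining mark, then drop
    # whatever is still non-ASCII. Cruder: characters with no decomposition vanish.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("café déjà vu"))  # cafe deja vu
```

The fallback only handles accented Latin letters well. The point of anyascii is that it also covers punctuation like curly quotes and non-Latin scripts.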
- DeepWiki creates docs for humans from GitHub repos. Example. It’s verbose, human-facing, and does not understand the nuances of context and implications. Context7 creates llms.txt for LLMs. Example. It’s concise, example-oriented, and works only if relevant code snippets (e.g. API calls) can be generated from the codebase. It’s like generating an llms.txt automatically, e.g. https://context7.com/textualize/textual/llms.txt #ai-coding
- We will move towards an organization structure where developers are embedded with business teams rather than working as a separate group. Sort of like embedded executive assistants instead of a central typing pool. Making AI Work