<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>S Anand</title>
    <link>https://www.s-anand.net/blog/</link>
    <description>Recent content on S Anand</description>
    <generator>Hugo -- 0.156.0</generator>
    <language>en-us</language>
    <lastBuildDate>Sat, 13 Jun 2026 21:04:54 +0530</lastBuildDate>
    <atom:link href="https://www.s-anand.net/blog/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Make 5 new friends today</title>
      <link>https://www.s-anand.net/blog/make-5-new-friends-today/</link>
      <pubDate>Sat, 13 Jun 2026 21:04:54 +0530</pubDate>
      <guid>https://www.s-anand.net/blog/make-5-new-friends-today/</guid>
      <description>&lt;p&gt;Here&amp;rsquo;s my address to the graduating class of the &lt;a href=&#34;https://study.iitm.ac.in/&#34;&gt;IITM BS Program&lt;/a&gt; today:&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-13-make-5-new-friends-today.avif&#34;&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;Make five new friends today.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Your CGPA is what you&amp;rsquo;re most focused on. When I graduated, my CGPA helped me get a job. It didn&amp;rsquo;t help much after that.&lt;/p&gt;
&lt;p&gt;Getting that job helped me for the four years I worked there. That&amp;rsquo;s it.&lt;/p&gt;
&lt;p&gt;The things I learnt in class helped a bit longer. Maybe six years. There is a long tail.&lt;/p&gt;
&lt;p&gt;But all of these are &lt;em&gt;depreciating assets&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Make five new friends today.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The friends I made at college are still friends. Your friends will grow more influential and build stronger relationships. That is an &lt;em&gt;appreciating asset&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;And in the AI era, this is an asset that will appreciate even more.&lt;/p&gt;
&lt;p&gt;Talk to the person to your left. Your right. In front of you. Behind you.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Make five new friends today.&lt;/strong&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Let AI take your exams</title>
      <link>https://www.s-anand.net/blog/let-ai-take-your-exams/</link>
      <pubDate>Fri, 12 Jun 2026 08:10:56 +0530</pubDate>
      <guid>https://www.s-anand.net/blog/let-ai-take-your-exams/</guid>
      <description>&lt;p&gt;At 2 pm IST today (Fri 12 Jun 2026), I&amp;rsquo;m running a workshop at &lt;a href=&#34;https://www.iitmparadox.org/workshops&#34;&gt;Paradox, IITM&lt;/a&gt; - at &lt;a href=&#34;https://doms.iitm.ac.in/&#34;&gt;DOMS 101&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can join online at &lt;a href=&#34;https://meet.google.com/cpt-faee-ucx&#34;&gt;https://meet.google.com/cpt-faee-ucx&lt;/a&gt; and ask questions on chat.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Agenda&lt;/strong&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You&amp;rsquo;ve been told AI can pass your exams. But what happens when you actually watch it try — live, on your questions, in real time?&lt;/p&gt;
&lt;p&gt;This workshop starts with a collective experiment: we ask coding agents to solve real exams (including IITM exams) and see how they solve them.&lt;/p&gt;
&lt;p&gt;What follows isn&amp;rsquo;t a tutorial on prompting — it&amp;rsquo;s an autopsy that reveals what your exams are actually testing, where AI confidently hallucinates, and what that means for what&amp;rsquo;s worth learning.&lt;/p&gt;
&lt;p&gt;You&amp;rsquo;ll leave with a reframed understanding of your degree (the goal isn&amp;rsquo;t answers, it&amp;rsquo;s the ability to catch wrong ones) and a concrete study ritual that uses AI as a Socratic sparring partner rather than an answer machine.&lt;/p&gt;
&lt;p&gt;Come with a question you got wrong recently — it&amp;rsquo;s going to be useful.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Real agenda&lt;/strong&gt;: An &lt;a href=&#34;https://en.wikipedia.org/wiki/R/IAmA&#34;&gt;ask-me-anything&lt;/a&gt; session plus real-life experiments.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-18-let-ai-take-your-exams.avif&#34;&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Data Stories with AI Workshop</title>
      <link>https://www.s-anand.net/blog/data-stories-with-ai-workshop/</link>
      <pubDate>Mon, 08 Jun 2026 16:26:32 +0530</pubDate>
      <guid>https://www.s-anand.net/blog/data-stories-with-ai-workshop/</guid>
      <description>&lt;p&gt;On Sat 13 Jun 2026 at 3 pm, I&amp;rsquo;m conducting an online workshop on &amp;ldquo;Data Stories with AI&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Register at &lt;a href=&#34;https://forms.gle/dNkUxtJ2PVqNMNcE9&#34;&gt;https://forms.gle/dNkUxtJ2PVqNMNcE9&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this workshop, you will use ChatGPT and Claude, mostly, to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Find data&lt;/li&gt;
&lt;li&gt;Analyze it&lt;/li&gt;
&lt;li&gt;Extract insights&lt;/li&gt;
&lt;li&gt;Visualize as stories&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It&amp;rsquo;s a workshop for journalists on data visualization using AI - but you don&amp;rsquo;t need to know data, visualization, journalism, or even technology.&lt;/p&gt;
&lt;p&gt;But this &lt;em&gt;is&lt;/em&gt; a practical workshop. You’ll be doing stuff and sharing your results.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pre-requisites&lt;/strong&gt;: A paid &lt;a href=&#34;https://chatgpt.com/pricing/&#34;&gt;ChatGPT Plus&lt;/a&gt; or &lt;a href=&#34;https://claude.com/pricing&#34;&gt;Claude Pro&lt;/a&gt; account - both are about Rs 2,000. Just buy it for a month. It&amp;rsquo;s worth it.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s on &lt;strong&gt;Google Meet: &lt;a href=&#34;https://meet.google.com/qzt-obnb-cgp&#34;&gt;https://meet.google.com/qzt-obnb-cgp&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ll record and share it.&lt;br&gt;
Attend to &lt;em&gt;do&lt;/em&gt; stuff.&lt;br&gt;
Watch later to &lt;em&gt;see&lt;/em&gt; stuff.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://forms.gle/dNkUxtJ2PVqNMNcE9&#34;&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-08-data-stories-with-ai-poster.avif&#34;&gt;&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>A cynical view of WhatsApp&#39;s Advanced Privacy</title>
      <link>https://www.s-anand.net/blog/a-cynical-view-of-whatsapp-s-advanced-privacy/</link>
      <pubDate>Mon, 08 Jun 2026 14:13:47 +0530</pubDate>
      <guid>https://www.s-anand.net/blog/a-cynical-view-of-whatsapp-s-advanced-privacy/</guid>
      <description>&lt;p&gt;WhatsApp has an &lt;a href=&#34;https://faq.whatsapp.com/715385484388016/&#34;&gt;Advanced privacy mode&lt;/a&gt; they &lt;a href=&#34;https://blog.whatsapp.com/introducing-advanced-chat-privacy&#34;&gt;launched in Apr 2025&lt;/a&gt;. People in the chat:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Can&amp;rsquo;t ask Meta AI to answer questions, or to create images or summaries in this chat.&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Cynical view&lt;/strong&gt;: When regulators clamp down on AI or users complain about AI, Meta can say &amp;ldquo;We asked users and they gave permission!&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Can’t export the chat.&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Cynical view&lt;/strong&gt;: When regulators force Meta to be interoperable with Signal, Telegram, etc., Meta can say &amp;ldquo;Users don&amp;rsquo;t want to export their chats!&amp;rdquo; Also, it&amp;rsquo;s easier to tell businesses &amp;ldquo;You can disable exports - less litigation risk&amp;rdquo;.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Can’t save media to their device gallery automatically.&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Cynical view&lt;/strong&gt;: When you want to switch to Telegram or Signal, these photos can&amp;rsquo;t be exported - so you have to stay on WhatsApp.&lt;/li&gt;
&lt;/ul&gt;
&lt;!-- https://claude.ai/chat/cd137b92-3cbf-4ebf-b799-d3bc1d7eb79d --&gt;
</description>
    </item>
    <item>
      <title>Oh Shit moments with Gen AI</title>
      <link>https://www.s-anand.net/blog/oh-shit-moments-with-gen-ai/</link>
      <pubDate>Sun, 07 Jun 2026 20:13:01 +0530</pubDate>
      <guid>https://www.s-anand.net/blog/oh-shit-moments-with-gen-ai/</guid>
      <description>&lt;p&gt;Hacker News has a lively thread asking &lt;a href=&#34;https://news.ycombinator.com/item?id=48406174&#34;&gt;What was your &amp;ldquo;oh shit&amp;rdquo; moment with GenAI?&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here are two dozen that give a sense of what real people find impressive (or worrying) about AI capabilities.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Analysis&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48417647&#34;&gt;simonw&lt;/a&gt; used ChatGPT Code Interpreter to upload a CSV, analyze it, and create charts, automating everything that software for journalists would do.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Analysis&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48419002&#34;&gt;Sobrino&lt;/a&gt; saw that a months-long OCR project to read and clean up PDFs is now just a prompt on ChatGPT.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Coding&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48423405&#34;&gt;plumefar&lt;/a&gt; used Claude and Gemini to modernize 20-30 years of chemistry code in 10 days.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Coding&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48420655&#34;&gt;veidr&lt;/a&gt; used a multi-agent fleet managing coordination, testing, UI feedback loops, etc. with no-human-in-loop coding to build a useful git-submodule GUI.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Creativity&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48419220&#34;&gt;idopmstuff&lt;/a&gt; used Nano Banana Pro to turn a poor iPhone product photo into usable e-commerce product photography and Amazon-style infographics, replacing a photographer/designer workflow.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Creativity&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48419883&#34;&gt;koreth1&lt;/a&gt; used Suno to generate a K-pop-style anthem about their family dog with a catchy melody and lyrics funny enough to make the family laugh.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Education&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48424304&#34;&gt;plagasul&lt;/a&gt; saw a teacher automate grading feedback emails based on notes and the student list spreadsheet.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Education&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48431219&#34;&gt;aniviacat&lt;/a&gt; watched a non-technical brother build a complex working app with Codex using vague, shallow wording despite not knowing code, git, or technical details.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hardware&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48433163&#34;&gt;ivanvanderbyl&lt;/a&gt; used Claude to reverse engineer a FujiFilm camera&amp;rsquo;s Bluetooth/Wi-Fi transfer protocol and build a much faster native Mac/iOS transfer app.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hardware&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48417379&#34;&gt;shreddude&lt;/a&gt; had Claude decompile camper van firmware, document CAN interfaces, and program an ESP32 to control power, HVAC, lighting, and tanks.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Health&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48422978&#34;&gt;TylerE&lt;/a&gt; used Claude as a health adjunct to organize a complex medical profile, screen for drug interactions, log symptoms, and draft portal messages to doctors.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Legal&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48419540&#34;&gt;bsiverly&lt;/a&gt; used AI to prepare a San Francisco property-tax appeal with valuation research, and the city agreed, sending a $12k refund.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Legal&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48422177&#34;&gt;grumblepeet&lt;/a&gt; used AI to fill out complex government-framework enrollment forms and identify the certification steps needed, transforming their business.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Personal&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48420387&#34;&gt;acosmism&lt;/a&gt; used ChatGPT screenshots to understand and operate a 100-year-old home&amp;rsquo;s steam heating system in winter despite knowing nothing about it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Personal&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48417845&#34;&gt;andrewthornton&lt;/a&gt; used Gemini videos to diagnose a broken furnace during a cold holiday weekend and keep it running until HVAC service arrived.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Research&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48424383&#34;&gt;angusturner&lt;/a&gt; found that Opus reads papers, does architecture research, and creates CUDA kernels&amp;hellip; It is AI automating AI research.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Research&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48423241&#34;&gt;chaoxu&lt;/a&gt; used ChatGPT to find a counterexample to a theoretical computer science conjecture they&amp;rsquo;d been trying to resolve for 2 years.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Research&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48422278&#34;&gt;rochansinha&lt;/a&gt; built a physics-based digital twin for an electrolyzer system, covering thermodynamics, fluid dynamics, and electrochemical reactions at a level usually needing expensive specialist software.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Security&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48418364&#34;&gt;kstrauser&lt;/a&gt; used a coding agent to test an open source vulnerability, and in a few minutes, had a tool that could crash any system using this software.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Security&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48425576&#34;&gt;raesene9&lt;/a&gt; gave an LLM a Linux privilege-escalation PoC and asked whether it could become a container breakout; it generated a working container breakout in one prompt.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Society&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48423552&#34;&gt;laboring1&lt;/a&gt; read that a character.ai chatbot encouraged a child to commit suicide, making the &amp;ldquo;oh shit&amp;rdquo; moment about real-world harm, not capability.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Society&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48426293&#34;&gt;ozgung&lt;/a&gt; realized AI makes large-scale profiling, surveillance, and social-media analysis cheap, fast, and accurate enough to change privacy and power dynamics.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Work&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48422478&#34;&gt;binarysolo&lt;/a&gt; used Gemini to reverse engineer a departed employee&amp;rsquo;s work from their emails/docs/calendar/meetings and create an onboarding document.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Work&lt;/strong&gt; &lt;a href=&#34;https://news.ycombinator.com/item?id=48419595&#34;&gt;eqmvii&lt;/a&gt; built a Slack agent that took over a 30-minute internal business process, handled ambiguity and edits, and eventually killed the old process.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-07-oh-shit-moments-with-gen-ai.avif&#34;&gt;&lt;/p&gt;
&lt;!-- https://chatgpt.com/c/6a25564b-7b78-83ec-bb53-82be864b4eed --&gt;
</description>
    </item>
    <item>
      <title>Things I Learned - 07 Jun 2026</title>
      <link>https://www.s-anand.net/blog/things-i-learned-07-jun-2026/</link>
      <pubDate>Sun, 07 Jun 2026 00:00:00 +0000</pubDate>
      <guid>https://www.s-anand.net/blog/things-i-learned-07-jun-2026/</guid>
      <description>&lt;p&gt;This week, I learned:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;sudo resolvectl flush-caches&lt;/code&gt; clears the DNS cache on Linux. Useful when you&amp;rsquo;re changing DNS records and want to see the changes immediately. In my case, I was creating a Cloudflare tunnel to my laptop and wanted to test it quickly.&lt;/li&gt;
&lt;li&gt;Making something easy to verify makes it &lt;em&gt;much&lt;/em&gt; faster to train models on it. Arithmetic verification is easy - calculators can be deterministically verified. Chess verification is easy - Stockfish became easy to train. Code verification is easy - LLMs improved coding ability rapidly. Therefore:
&lt;ul&gt;
&lt;li&gt;Wherever we have environments that are easy to verify, AI will improve faster there.&lt;/li&gt;
&lt;li&gt;To make AI improve faster in an area, build environments that are easy to verify.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://blog.modelcontextprotocol.io/posts/2026-07-28-release-candidate/&#34;&gt;MCP is getting simpler&lt;/a&gt;. A stateless HTTP protocol. Simpler OAuth. Plugins. No idea when it will land in Claude or ChatGPT, though. Worth checking after 28 Jun 2026 - after it is finalized.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.microsoft.com/en-us/microsoft-365/blog/2026/06/02/introducing-microsoft-scout-your-always-on-personal-agent/&#34;&gt;Microsoft Scout&lt;/a&gt; is Microsoft&amp;rsquo;s version of OpenClaw or &lt;a href=&#34;https://gemini.google/overview/agent/spark/&#34;&gt;Gemini Spark&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;git subtree&lt;/code&gt; is a useful way of maintaining git repos inside git repos, for example, a tool &lt;code&gt;tool-a&lt;/code&gt; kept under a project. It&amp;rsquo;s more lightweight than submodules, lets you commit at any point to the parent &lt;em&gt;or&lt;/em&gt; child, and is a built-in feature in &lt;code&gt;git&lt;/code&gt;. &lt;!-- https://chatgpt.com/c/6a1d25bf-49f4-83ec-8e02-5905a22f4fe0 --&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12b/&#34;&gt;Gemma 4 12B is released&lt;/a&gt; and seems almost as good as the 26B version. This is the class of models that makes it practical to run edge AI on phones. It&amp;rsquo;s multimodal and reasonably smart (like frontier models were 12-18 months ago).&lt;/li&gt;
&lt;li&gt;I don&amp;rsquo;t use Claude/ChatGPT Projects much. They offer 4 advantages: custom instructions, memory, files, and chats. Files aren&amp;rsquo;t useful - I use my entire laptop as a file system via MCP. Instructions aren&amp;rsquo;t useful - I can paste commonly used prompts with a click. Chats aren&amp;rsquo;t useful - I have chat references enabled, so all past chats are accessible anyway. Memory isn&amp;rsquo;t useful - I have memory enabled globally anyway. In short, I haven&amp;rsquo;t discovered the power of projects that everyone&amp;rsquo;s raving about. SKILL.md is more useful for me. &lt;!-- https://claude.ai/chat/eb3bcf3e-f1f0-4d33-b5a2-14b09cbda189 --&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://gerrit.googlesource.com/git-repo/+/HEAD/README.md&#34;&gt;&lt;code&gt;repo&lt;/code&gt;&lt;/a&gt; is a Google/Android tool built on top of &lt;code&gt;git&lt;/code&gt; that lets you manage multiple git repos. It sounded promising until I realized it needs a &lt;code&gt;repo init&lt;/code&gt; that creates a &lt;code&gt;.repo/&lt;/code&gt; - which is more overhead than I&amp;rsquo;d like to keep.&lt;/li&gt;
&lt;li&gt;When using &lt;code&gt;&amp;lt;img onerror=...&amp;gt;&lt;/code&gt; fallbacks, include &lt;code&gt;this.onerror=null&lt;/code&gt; to prevent infinite loops if the fallback image also fails to load. &lt;a href=&#34;http://mvark.blogspot.com/2026/05/how-to-add-backup-image-in-html-when.html&#34;&gt;RK&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;One of the advantages of multiple agents (rather than a single agent loop) is: it&amp;rsquo;s easier to change directions when wrong. Single loops get stuck. &lt;a href=&#34;https://www.youtube.com/watch?v=mR-WAvEPRwE&#34;&gt;Build Agents That Run for Hours&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Claude Code also supports &lt;a href=&#34;https://code.claude.com/docs/en/agent-teams&#34;&gt;agent teams&lt;/a&gt; where sub-agents can talk to each other rather than rely on the main agent to coordinate. Useful for parallel exploration. Anthropic lets Claude define &amp;ldquo;organizational policies&amp;rdquo; for agent teams best suited for the task (AI-native workflows). It also lets agents push back on their scope, e.g. &amp;ldquo;This is too hard.&amp;rdquo; &lt;a href=&#34;https://www.youtube.com/watch?v=mR-WAvEPRwE&#34;&gt;Build Agents That Run for Hours&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Claude Code has a &lt;code&gt;/background [prompt]&lt;/code&gt; (or &lt;code&gt;/bg&lt;/code&gt;) command that runs the current session in the background. You can run &lt;code&gt;claude agents&lt;/code&gt; as a separate command to &lt;a href=&#34;https://code.claude.com/docs/en/agent-view&#34;&gt;monitor agents&lt;/a&gt;. (There&amp;rsquo;s no equivalent in Codex yet.) This seems to be the future of agentic operations: a bunch of agents running that you monitor and steer through an agent view dashboard.&lt;/li&gt;
&lt;li&gt;Models are evolving. Therefore prompts evolved. Now harnesses also need to evolve. The workflows will also evolve. As a result, evaluations might be the (relatively) more stable assets. Datasets are likely to be the most stable ground truth.&lt;/li&gt;
&lt;li&gt;How to learn a new field fast:
&lt;ul&gt;
&lt;li&gt;Yes, it&amp;rsquo;s possible to learn 50% of a field in 20 hours. &lt;a href=&#34;https://www.youtube.com/watch?v=5MgBikgcWnY&#34;&gt;Josh Kaufman, &amp;ldquo;The First 20 Hours&amp;rdquo;&lt;/a&gt; popularized it. The next 30% takes months and the last 20% takes years.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/Threshold_knowledge&#34;&gt;Threshold concepts&lt;/a&gt; are those that change your perspective and open up new ways of thinking.&lt;/li&gt;
&lt;li&gt;Experts&amp;rsquo; knowledge is hard-wired and they can&amp;rsquo;t identify or teach threshold concepts naturally. Don&amp;rsquo;t assume they can.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.google.com/search?q=Polanyi+%22we+know+more+than+we+can+tell%22&#34;&gt;&amp;ldquo;We know more than we can tell.&amp;rdquo;&lt;/a&gt; Polanyi&amp;rsquo;s 1966 book &amp;ldquo;The Tacit Dimension&amp;rdquo; says that there&amp;rsquo;s some knowledge that can&amp;rsquo;t be verbalized. This &lt;a href=&#34;https://en.wikipedia.org/wiki/Tacit_knowledge&#34;&gt;tacit knowledge&lt;/a&gt;, therefore, will be harder for humans and AI to learn.&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    <item>
      <title>What I don&#39;t post on LinkedIn</title>
      <link>https://www.s-anand.net/blog/what-i-don-t-post-on-linkedin/</link>
      <pubDate>Sat, 06 Jun 2026 11:55:10 +0530</pubDate>
      <guid>https://www.s-anand.net/blog/what-i-don-t-post-on-linkedin/</guid>
      <description>&lt;p&gt;I don’t post all my writing on &lt;a href=&#34;https://www.linkedin.com/in/sanand0/recent-activity/all/&#34;&gt;LinkedIn&lt;/a&gt;. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Fewer &lt;strong&gt;strategy posts&lt;/strong&gt;, e.g. “&lt;a href=&#34;https://www.s-anand.net/blog/where-enterprise-ai-is-headed/&#34;&gt;Where Enterprise AI is Headed&lt;/a&gt;”, “&lt;a href=&#34;https://www.s-anand.net/blog/how-the-innovation-team-works&#34;&gt;How My Innovation Team Works&lt;/a&gt;”, etc. aren&amp;rsquo;t on LinkedIn.&lt;/li&gt;
&lt;li&gt;Fewer &lt;strong&gt;developer posts&lt;/strong&gt;, e.g. my &lt;a href=&#34;https://raw.githubusercontent.com/sanand0/scripts/refs/heads/main/agents/AGENTS.md&#34;&gt;AGENTS.md&lt;/a&gt;, my &lt;a href=&#34;https://github.com/sanand0/scripts/tree/main/agents&#34;&gt;SKILL.md files&lt;/a&gt;, &lt;a href=&#34;https://github.com/sanand0/scripts/&#34;&gt;CLI tools&lt;/a&gt;, &lt;a href=&#34;https://sanand0.github.io/datastories/tds-2026-01-p1/&#34;&gt;evals&lt;/a&gt;, etc. aren&amp;rsquo;t on LinkedIn.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I also shorten content because of LinkedIn constraints. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No &lt;strong&gt;links&lt;/strong&gt;, e.g. the list of all my AI-in-education resources&lt;/li&gt;
&lt;li&gt;Short &lt;strong&gt;content&lt;/strong&gt;, e.g. my full &lt;a href=&#34;https://www.s-anand.net/blog/ai-advice-for-teams/&#34;&gt;advice for teams using AI&lt;/a&gt; is much longer than the LinkedIn post.&lt;/li&gt;
&lt;li&gt;Trimmed &lt;strong&gt;prompts&lt;/strong&gt;, e.g. how to &lt;a href=&#34;https://www.s-anand.net/blog/transcript-ai-ded-interviews/&#34;&gt;convert meeting transcripts&lt;/a&gt; into a personalized org-consulting report&lt;/li&gt;
&lt;li&gt;Snipped &lt;strong&gt;chats&lt;/strong&gt;, e.g. the full moves of &lt;a href=&#34;https://www.s-anand.net/blog/chatgpt-is-about-fide-1600/&#34;&gt;GPT-5.5 playing chess&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I filter the LinkedIn posts, sharing what’s most useful for most people.&lt;/p&gt;
&lt;p&gt;But some people asked if I can share the full content over email.&lt;/p&gt;
&lt;p&gt;So, I’ve set up my &lt;strong&gt;full blog&lt;/strong&gt;, &lt;strong&gt;by email&lt;/strong&gt; (free, via Google Groups): &lt;a href=&#34;https://groups.google.com/g/s-anand&#34;&gt;https://groups.google.com/g/s-anand&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Preview before you subscribe. Switch to weekly digest if it’s too much.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-06-anand-blog-google-groups.avif&#34;&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Editing images with code and AI</title>
      <link>https://www.s-anand.net/blog/editing-images-with-code-and-ai/</link>
      <pubDate>Sat, 06 Jun 2026 11:32:47 +0530</pubDate>
      <guid>https://www.s-anand.net/blog/editing-images-with-code-and-ai/</guid>
      <description>&lt;p&gt;&lt;a href=&#34;https://a16z.com/&#34;&gt;Andreessen Horowitz&lt;/a&gt; published an interesting article titled &lt;a href=&#34;https://www.a16z.news/p/the-next-frontier-of-visual-ai-is&#34;&gt;The Next Frontier of Visual AI Is Code&lt;/a&gt;. Here&amp;rsquo;s the summary.&lt;/p&gt;
&lt;p&gt;A lot of our work is visual: ads, slides, dashboards, logos, videos, architecture, etc.&lt;/p&gt;
&lt;p&gt;We can generate visual output either as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Pixels&lt;/strong&gt; (like Nano Banana generating a photo), or as&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Code&lt;/strong&gt; (like Claude generating an SVG)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Code is more powerful: AI can inspect the output and improve fast in a loop: &lt;strong&gt;Code &amp;gt; Render &amp;gt; Inspect &amp;gt; Revise&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;For &lt;strong&gt;2D and UI&lt;/strong&gt;: AI can generate SVG logos, HTML/CSS pages, Figma layers, apps, etc.
For &lt;strong&gt;animation and 3D&lt;/strong&gt;: AI can generate Lottie JSON, Blender scripts, USD scene graphs, shader code, or 3D programs.&lt;/p&gt;
&lt;p&gt;This means &lt;strong&gt;design, marketing, and product teams&lt;/strong&gt; are immediate prospects. Instead of static mockups, they&amp;rsquo;ll get &lt;em&gt;editable&lt;/em&gt; logos, landing pages, design prototypes, motion graphics, onboarding animations, etc.&lt;/p&gt;
&lt;p&gt;Emerging industries are &lt;strong&gt;robotics, manufacturing, architecture, and gaming companies&lt;/strong&gt;. Agents can build a 3D chair, machine, room, or robot with the right materials, joints, and constraints.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve seen some examples of this approach. Pavan used Blender MCP to &lt;a href=&#34;https://pavankumart18.github.io/ai-blender-design-journey/index.html&#34;&gt;build an entire building campus&lt;/a&gt; &lt;em&gt;solely by prompting Claude&lt;/em&gt;. Also to &lt;a href=&#34;https://pavankumart18.github.io/ai-blender-design-journey/index2.html&#34;&gt;design a mug from scratch&lt;/a&gt;. His approach is called &lt;a href=&#34;https://fugtemypt123.github.io/VIGA-website/&#34;&gt;VIGA: Vision-as-Inverse-Graphics Agent&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://pavankumart18.github.io/ai-blender-design-journey/index.html&#34;&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-06-building-campus.avif&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;If you see anyone creating visual assets of any form - manually, or with AI as pixels - please remember this approach.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>When the prompt is longer than the code</title>
      <link>https://www.s-anand.net/blog/when-the-prompt-is-longer-than-the-code/</link>
      <pubDate>Fri, 05 Jun 2026 17:14:17 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/when-the-prompt-is-longer-than-the-code/</guid>
      <description>&lt;p&gt;I used &lt;a href=&#34;https://pi.dev/&#34;&gt;pi&lt;/a&gt; to create a compact home page for &lt;a href=&#34;https://media.s-anand.net/&#34;&gt;media.s-anand.net&lt;/a&gt; using these prompts:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Create &lt;code&gt;index.html&lt;/code&gt; - a simple, elegant page that says that this page (media.s-anand.net) serves large media files for &lt;a href=&#34;https://www.s-anand.net/&#34;&gt;Anand&lt;/a&gt; - that&amp;rsquo;s where they should look instead.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;hellip; followed by:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Skip the part that says &amp;ldquo;Please visit &amp;hellip;&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;hellip; then:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Shorten index.html to just 2-3 elegant rules of CSS. I want it MUCH smaller and simpler.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;hellip; and finally:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Center vertically and horizontally.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;These prompts ended up being larger than the 471-byte &lt;code&gt;index.html&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-html&#34; data-lang=&#34;html&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;cp&#34;&gt;&amp;lt;!doctype html&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;html&lt;/span&gt; &lt;span class=&#34;na&#34;&gt;lang&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;en&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;meta&lt;/span&gt; &lt;span class=&#34;na&#34;&gt;charset&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;utf-8&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;meta&lt;/span&gt; &lt;span class=&#34;na&#34;&gt;name&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;viewport&amp;#34;&lt;/span&gt; &lt;span class=&#34;na&#34;&gt;content&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;width=device-width, initial-scale=1&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;title&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;media.s-anand.net&lt;span class=&#34;p&#34;&gt;&amp;lt;/&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;title&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;style&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nt&#34;&gt;body&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;{&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;min-height&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;mi&#34;&gt;100&lt;/span&gt;&lt;span class=&#34;kt&#34;&gt;vh&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;display&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;grid&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;place-content&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;center&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;margin&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;auto&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;font&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;mf&#34;&gt;1.25&lt;/span&gt;&lt;span class=&#34;kt&#34;&gt;rem&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;/&lt;/span&gt;&lt;span class=&#34;mf&#34;&gt;1.6&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;system-ui&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;kc&#34;&gt;sans-serif&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  &lt;span class=&#34;nt&#34;&gt;h1&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;{&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;font-size&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;clamp&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;mi&#34;&gt;2&lt;/span&gt;&lt;span class=&#34;kt&#34;&gt;rem&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;mi&#34;&gt;8&lt;/span&gt;&lt;span class=&#34;kt&#34;&gt;vw&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;mi&#34;&gt;4&lt;/span&gt;&lt;span class=&#34;kt&#34;&gt;rem&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;);&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;&amp;lt;/&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;style&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;h1&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;media.s-anand.net&lt;span class=&#34;p&#34;&gt;&amp;lt;/&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;h1&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;p&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;This domain serves large media files for &lt;span class=&#34;p&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;a&lt;/span&gt; &lt;span class=&#34;na&#34;&gt;href&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;https://www.s-anand.net/&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;Anand&lt;span class=&#34;p&#34;&gt;&amp;lt;/&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;a&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;.&lt;span class=&#34;p&#34;&gt;&amp;lt;/&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;p&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;&amp;lt;/&lt;/span&gt;&lt;span class=&#34;nt&#34;&gt;html&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Not that this matters, because:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;I didn&amp;rsquo;t know what I wanted and having an AI coding agent generate a first draft helped with starting trouble and ideation.&lt;/li&gt;
&lt;li&gt;I don&amp;rsquo;t know how to center on the screen and this did it for me.&lt;/li&gt;
&lt;li&gt;I am practicing the skill that matters - reviewing - and not the skill AI is taking over - writing code.&lt;/li&gt;
&lt;/ol&gt;
</description>
    </item>
    <item>
      <title>How AI bottlenecks shift</title>
      <link>https://www.s-anand.net/blog/how-ai-bottlenecks-shift/</link>
      <pubDate>Fri, 05 Jun 2026 13:59:11 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/how-ai-bottlenecks-shift/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-05-how-ai-bottlenecks-shift.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;I wrote about &lt;a href=&#34;https://www.s-anand.net/blog/my-changing-ai-opinions/&#34;&gt;my changing AI opinions&lt;/a&gt;. At least some of this is because the industry is moving so fast that the bottlenecks keep shifting.&lt;/p&gt;
&lt;p&gt;Here are four examples of how AI couldn&amp;rsquo;t do something (the bottleneck), but that became possible, and the bottleneck shifted - changing the way we work.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s good to keep this in mind when thinking about AI.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Coding&lt;/strong&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It can&amp;rsquo;t write useful code. We can&amp;rsquo;t get real help.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;But in &lt;a href=&#34;https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/&#34;&gt;Sep 2022: GitHub finds Copilot developers are 55% faster&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It writes code but doesn&amp;rsquo;t know our codebase. We can&amp;rsquo;t let it touch real projects.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;But in &lt;a href=&#34;https://cloud.google.com/blog/products/ai-machine-learning/gemini-on-vertex-ai-expands&#34;&gt;Feb 2024: Gemini 1.5 Pro has 1M-token context ~ 30K LOC&lt;/a&gt;. Cursor indexes code.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It understands the repo but can&amp;rsquo;t ship a fix on its own. We can&amp;rsquo;t hand it a whole issue.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;But in &lt;a href=&#34;https://cognition.ai/blog/introducing-devin&#34;&gt;Mar 2024: Devin solves 14% of SWE-bench - up from 2%&lt;/a&gt;. SWE-bench Verified is now 70%+.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It ships fixes, but we can&amp;rsquo;t review them fast enough or trust they&amp;rsquo;re stable.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://dora.dev/research/2024/dora-report/&#34;&gt;Oct 2024: DORA 2024 finds AI hurt both throughput and stability&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Now&lt;/strong&gt;: &lt;a href=&#34;https://cloud.google.com/blog/products/ai-machine-learning/announcing-the-2025-dora-report&#34;&gt;Sep 2025: DORA 2025 finds throughput is positive but stability stayed negative&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Now&lt;/strong&gt;: &lt;a href=&#34;https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/&#34;&gt;Jul 2025: METR&amp;rsquo;s RCT finds &lt;em&gt;experienced&lt;/em&gt; devs 19% slower&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Agents&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It does one step. We can&amp;rsquo;t chain actions.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;But &lt;a href=&#34;https://openai.com/index/function-calling-and-other-api-updates/&#34;&gt;Jun 2023: OpenAI function calling lets models invoke tools and return structured calls&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;Every integration is bespoke. We can&amp;rsquo;t connect it to all our systems.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;But &lt;a href=&#34;https://www.anthropic.com/news/model-context-protocol&#34;&gt;Nov 2024: Anthropic open-sources MCP, standardizing tool and data access&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It can act and connect, but over a long task its errors compound. We can&amp;rsquo;t trust a 20-step run.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Now&lt;/strong&gt;: &lt;a href=&#34;https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/&#34;&gt;Mar 2025: METR finds autonomous task horizon doubling ~every 7 months. Reliability is a challenge&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;But Claude Mythos, with a ~16 hour reliable execution, might fix this.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Enterprise knowledge work&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It only knows the public internet. We can&amp;rsquo;t use it on our own documents.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;But &lt;a href=&#34;https://openai.com/index/morgan-stanley/&#34;&gt;Sep 2023: Morgan Stanley&amp;rsquo;s assistant uses ~100K internal documents&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It reads our documents but can&amp;rsquo;t fit enough of them. We can&amp;rsquo;t ask across the whole corpus.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;But &lt;a href=&#34;https://www.anthropic.com/news/100k-context-windows&#34;&gt;May 2023: Claude&amp;rsquo;s 100K-token context&lt;/a&gt; and &lt;a href=&#34;https://cloud.google.com/blog/products/ai-machine-learning/gemini-on-vertex-ai-expands&#34;&gt;Feb 2024: Gemini 1.5&amp;rsquo;s 1M tokens&lt;/a&gt; reduce chunking needs.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It runs on our data, but we can&amp;rsquo;t trust it without a way to measure when it&amp;rsquo;s silently wrong.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Now&lt;/strong&gt;: &lt;a href=&#34;https://openai.com/index/morgan-stanley/&#34;&gt;the Morgan Stanley deployment relies on an eval framework&lt;/a&gt; - evals are the bottleneck.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Document processing&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It needs thousands of labeled samples. We can&amp;rsquo;t stand up new doc types quickly.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;But &lt;a href=&#34;https://cloud.google.com/blog/products/ai-machine-learning/mobilize-your-unstructured-data-with-generative-ai&#34;&gt;Sep 2023: Google Document AI extracts with limited-to-no ML training&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It learns fast but reads only text. We can&amp;rsquo;t handle scans, charts, and tables.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;But &lt;a href=&#34;https://openai.com/index/chatgpt-can-now-see-hear-and-speak/&#34;&gt;Sep 2023: GPT-4V&lt;/a&gt; vision model and &lt;a href=&#34;https://openai.com/index/hello-gpt-4o/&#34;&gt;May 2024: GPT-4o native multimodal&lt;/a&gt; solved this.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;It sees the page but can&amp;rsquo;t understand long, layout-heavy documents. We can&amp;rsquo;t trust it on real multi-page files.&amp;rdquo;&lt;/em&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Now&lt;/strong&gt;: &lt;a href=&#34;https://arxiv.org/abs/2407.01523&#34;&gt;NeurIPS 2024: on MMLongBench-Doc, GPT-4o scored under ~50 on multi-page chart/table documents&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;But Gemini 3.5 Flash, GPT 5.5, Claude 4.8 Opus, etc. have excellent vision and need to be tested.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;!-- https://claude.ai/chat/91d588c8-29d4-4b9d-bc3f-b4bd2e37d281 --&gt;
</description>
    </item>
    <item>
      <title>Watching videos with a plastic cover</title>
      <link>https://www.s-anand.net/blog/watching-videos-with-a-plastic-cover/</link>
      <pubDate>Fri, 05 Jun 2026 12:00:11 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/watching-videos-with-a-plastic-cover/</guid>
      <description>&lt;p&gt;On the Indigo 1026 from Singapore to Chennai, I saw a passenger two seats in front of me watch videos in an interesting way.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-05-indigo-flight-phone-plastic-video.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;She had wrapped her phone in a plastic cover, wedged it behind the tray table so that it would appear at a comfortable viewing position, and watched an Asian movie (presumably with bluetooth headphones).&lt;/p&gt;
&lt;p&gt;At first, I wondered if she travels with a plastic wrapper for this purpose. Then I realized it was from the Indigo safety instructions kit.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-05-indigo-flight-safety-instructions.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;Now, that&amp;rsquo;s either a very experienced flyer (who knows about the wrapper) or a very innovative person (likely both).&lt;/p&gt;
&lt;p&gt;I was curious, so I asked her:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Have you done this before? &lt;strong&gt;ANS&lt;/strong&gt;: Yes, many times.&lt;/li&gt;
&lt;li&gt;Did you know that Indigo carries these wrappers you could use? &lt;strong&gt;ANS&lt;/strong&gt;: No, but flights usually carry a Ziploc bag.&lt;/li&gt;
&lt;li&gt;Is it easy to control through the wrapper? &lt;strong&gt;ANS&lt;/strong&gt;: I don&amp;rsquo;t touch it much while watching. (I was able to control it but fingerprint unlock didn&amp;rsquo;t work)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It&amp;rsquo;s fascinating to discover such innovations in the wild!&lt;/p&gt;
</description>
    </item>
    <item>
      <title>My changing AI opinions</title>
      <link>https://www.s-anand.net/blog/my-changing-ai-opinions/</link>
      <pubDate>Fri, 05 Jun 2026 09:41:33 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/my-changing-ai-opinions/</guid>
      <description>&lt;p&gt;I asked Claude about my AI opinions.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Based on my transcripts and blog posts, find the three claims I make most consistently, the three I&amp;rsquo;ve quietly reversed, and the one assumption I&amp;rsquo;ve never questioned but everything depends on.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;!-- https://claude.ai/chat/1d41b5a3-7963-4e87-aade-c44a37d5f355 --&gt;
&lt;p&gt;Here are things I&amp;rsquo;ve changed my opinion on:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: One frontier model will win - not specialization. &lt;strong&gt;NOW&lt;/strong&gt;: Gemini for media, Claude for strategy/style, GPT for rigor. SLMs as tools.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: Carefully curate my course content. &lt;strong&gt;NOW&lt;/strong&gt;: Give students prompts directly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: Web apps are differentiated artifacts. &lt;strong&gt;NOW&lt;/strong&gt;: HTML is easier to generate than PPT - a signal of slop, not craft.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: Human in the loop. &lt;strong&gt;NOW&lt;/strong&gt;: Human NOT in the loop, bottlenecking it. On-the-loop, etc. is fine.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: Minimal single-agent loop, avoid sub-agents. &lt;strong&gt;NOW&lt;/strong&gt;: Multi-agent, sub-agent, and agent teams.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: Avoid MCP, prefer SKILLS.md. &lt;strong&gt;NOW&lt;/strong&gt;: Use MCP because integrating with Claude / ChatGPT / &amp;hellip; is easy.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These are the top contradictions in my opinions.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&amp;ldquo;Vibe code everything, end-to-end&amp;rdquo; &lt;em&gt;vs&lt;/em&gt; &amp;ldquo;Don&amp;rsquo;t commit based on vibe-coding, it&amp;rsquo;s not fully reliable&amp;rdquo;.&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Experience is a liability&amp;rdquo; &lt;em&gt;vs&lt;/em&gt; &amp;ldquo;Domain expertise is an edge&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Paid plans are fine for private data&amp;rdquo; &lt;em&gt;vs&lt;/em&gt; &amp;ldquo;Don&amp;rsquo;t assume paid = private&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Get out of the way / delegate everything&amp;rdquo; &lt;em&gt;vs&lt;/em&gt; &amp;ldquo;Verification is the whole job&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Build, don&amp;rsquo;t plan&amp;rdquo; &lt;em&gt;vs&lt;/em&gt; &amp;ldquo;Plan → Correct → Execute&amp;rdquo;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Some things, I have never questioned.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;For numbers, math, or correctness, make AI write and run code &amp;ndash; never trust prose arithmetic.&lt;/li&gt;
&lt;li&gt;Judge AI against human accuracy, never against perfection &amp;ndash; because experts disagree among themselves.&lt;/li&gt;
&lt;li&gt;Use AI heavily &amp;ndash; reach for it first, high volume.&lt;/li&gt;
&lt;li&gt;Don&amp;rsquo;t build a foundation model from scratch &amp;ndash; steer existing general models instead.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Things I usually say, &lt;strong&gt;but&lt;/strong&gt; there are exceptions.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&amp;ldquo;Always verify&amp;rdquo;. &lt;strong&gt;But&lt;/strong&gt; blindly trust AI in non-core areas where you are unskilled (such as personal finance).&lt;/li&gt;
&lt;li&gt;&amp;ldquo;AI is improving fast&amp;rdquo;. &lt;strong&gt;But&lt;/strong&gt; there is a jagged edge, so verify before upgrading.&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Don&amp;rsquo;t build models&amp;rdquo;. &lt;strong&gt;But&lt;/strong&gt; go ahead if it&amp;rsquo;s easy and beats LLMs clearly.&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Models keep getting cheaper&amp;rdquo;. &lt;strong&gt;But&lt;/strong&gt; not in May 2026.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-03-05-my-changing-ai-opinions.avif&#34;&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>My most memorable anniversary</title>
      <link>https://www.s-anand.net/blog/my-most-memorable-anniversary/</link>
      <pubDate>Wed, 03 Jun 2026 09:27:14 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/my-most-memorable-anniversary/</guid>
      <description>&lt;p&gt;At 9:30 pm, I checked my calendar for tomorrow&amp;rsquo;s appointments, alt-tabbed frantically into ChatGPT, and started typing:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Tomorrow is my 24th anniversary. It&amp;rsquo;s a bit late for me to buy anything (except maybe an online service) or prepare something.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;!-- https://chatgpt.com/c/6a1d867e-47bc-83ec-b25f-54d858826d24 --&gt;
&lt;p&gt;This has become a habit &amp;ndash; leaving things to the last minute &lt;em&gt;and&lt;/em&gt; asking ChatGPT to save my day.&lt;/p&gt;
&lt;p&gt;I &lt;em&gt;did&lt;/em&gt; give it good context, though.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You remember the OCBC expenses treemap you created by analyzing my transactions? That will give you a good guessable idea of the kinds of things she spends on and hopefully, therefore, what she likes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I was analyzing my bank transactions over the weekend, so that came in handy.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What can I do (either create for her, like I made a &lt;a href=&#34;https://www.s-anand.net/blog/birthday-sandwich-cake/&#34;&gt;peanut jelly nutella sandwich birthday for Dhyeya&lt;/a&gt; or buy - if it&amp;rsquo;s an online service / gift voucher kind of thing) in the next hour or so?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;ChatGPT began with gyaan - which I promptly ignored:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Your best move is not &amp;ldquo;buy a gift&amp;rdquo;. It is &amp;ldquo;prove that I have noticed your life, then attach a redeemable escape hatch&amp;hellip;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;hellip; and moved on to analyzing her personality and recommending from different perspectives:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Practical person&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;No-decision day&lt;/strong&gt;: &amp;ldquo;Tomorrow, you decide nothing. I handle transport, food, booking, payment, and exit plan.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Service relief voucher&lt;/strong&gt;: 24 hours where she forwards any household/admin nuisance to you.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Personal-care float&lt;/strong&gt;: ClassPass/spa/salon voucher plus the promise: &amp;ldquo;I will book the slot and taxi, not just give you a coupon.&amp;rdquo;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;A romance playwright&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;24 tiny vows&lt;/strong&gt;: not grand vows; operational vows: &amp;ldquo;I will stop asking where the charger is.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The ordinary-places pilgrimage&lt;/strong&gt;: one route across 3-5 mundane but meaningful places.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Private audio guide&lt;/strong&gt;: 8 short voice notes she unlocks at home / taxi / restaurant / temple.&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;hellip; and so on.&lt;/p&gt;
&lt;p&gt;For some of these, I&amp;rsquo;m absolutely the worst person to execute. ClassPass? Spa? Salon? I don&amp;rsquo;t even know the difference between them.&lt;/p&gt;
&lt;p&gt;But 1.2 and 2.4 were interesting. So, after a few iterations with ChatGPT, I woke up the next morning with:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Today&amp;rsquo;s our 24th anniversary. For the next 24 hours, I&amp;rsquo;m going to knock off 24 things from your TODO list.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I rescheduled my meetings. I spent the entire day doing nothing but what was on her list. Water purifier, doctor appointments, toothbrush shopping, investments, drying clothes, duct-taping, &amp;hellip; and we dined out.&lt;/p&gt;
&lt;p&gt;At 10:00 pm: &amp;ldquo;I think it was my most memorable anniversary. Thank you.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-03-my-most-memorable-anniversary.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;Thank you, ChatGPT. Thank you, habit of setting low expectations.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>It&#39;s who you know</title>
      <link>https://www.s-anand.net/blog/it-s-who-you-know/</link>
      <pubDate>Tue, 02 Jun 2026 09:47:12 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/it-s-who-you-know/</guid>
      <description>&lt;p&gt;&lt;a href=&#34;https://www.linkedin.com/in/dharmendrasingh17/&#34;&gt;Dharmendra Singh&lt;/a&gt; shared how they built an app with AI.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-02-paymentpulse.webp&#34;&gt;&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s normal. I&amp;rsquo;m just thrilled they used client transcripts as the source.&lt;/p&gt;
&lt;p&gt;Basically, they converted the &amp;ldquo;voice of the client&amp;rdquo; to working software. To quote them: &amp;ldquo;A strong spoken business narrative can be converted into a usable product brief quickly when the capture step is disciplined.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;You know what this means? &lt;strong&gt;Interviewing is a skill to hire for&lt;/strong&gt;. Better questions = better answers = better apps.&lt;/p&gt;
&lt;p&gt;&amp;hellip; until AI starts interviewing better than us (which it might be already). At that point, picking &lt;em&gt;whom to interview&lt;/em&gt; becomes important.&lt;/p&gt;
&lt;p&gt;You know what that means? &lt;strong&gt;People management is a skill to hire for&lt;/strong&gt;. Better stakeholders = better interviews = better apps.&lt;/p&gt;
&lt;p&gt;&amp;hellip; until AI understands people by mining signals better than us (which it might be already). At which point, &lt;em&gt;stuff you can&amp;rsquo;t capture or express&lt;/em&gt; (body language, trust, um&amp;hellip; &lt;a href=&#34;https://en.wikipedia.org/wiki/Nepotism?&#34;&gt;nepotism&lt;/a&gt;) becomes more important.&lt;/p&gt;
&lt;p&gt;You know what that means? &lt;strong&gt;It&amp;rsquo;s who you know&lt;/strong&gt;, not what you know.&lt;/p&gt;
&lt;p&gt;But wait&amp;hellip; isn&amp;rsquo;t there supposed to be something wrong with that?&lt;/p&gt;
&lt;p&gt;Sigh&amp;hellip; time to review &lt;a href=&#34;https://straive.com/&#34;&gt;Straive&lt;/a&gt; hiring policies.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Things I Learned - 31 May 2026</title>
      <link>https://www.s-anand.net/blog/things-i-learned-31-may-2026/</link>
      <pubDate>Sun, 31 May 2026 00:00:00 +0000</pubDate>
      <guid>https://www.s-anand.net/blog/things-i-learned-31-may-2026/</guid>
      <description>&lt;p&gt;This week, I learned:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://www.d-id.com/&#34;&gt;D-ID&lt;/a&gt; is an avatar generator platform like &lt;a href=&#34;https://heygen.com/&#34;&gt;HeyGen&lt;/a&gt;. &lt;a href=&#34;https://creatify.ai/&#34;&gt;Creatify&lt;/a&gt; and &lt;a href=&#34;https://www.synthesia.io/&#34;&gt;Synthesia&lt;/a&gt; are a couple of others I heard of. This space seems to be growing.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/sigstore/cosign&#34;&gt;cosign&lt;/a&gt; is a CLI that lets you sign and verify any piece of text with a Google, GitHub or Microsoft account. &lt;code&gt;cosign sign-blob FILE --bundle sign.json&lt;/code&gt; opens a login window and creates a &lt;code&gt;sign.json&lt;/code&gt; signature. Anyone who has &lt;code&gt;FILE&lt;/code&gt; and &lt;code&gt;sign.json&lt;/code&gt; and the email ID can verify via a Google account with &lt;code&gt;cosign verify-blob FILE --bundle sign.json --certificate-identity $EMAIL --certificate-oidc-issuer https://accounts.google.com&lt;/code&gt;. &lt;!-- https://chatgpt.com/c/6a197375-fd5c-83ec-9f21-43b084a3830a --&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://arxiv2md.org/&#34;&gt;arxiv2md.org&lt;/a&gt; converts arXiv papers to Markdown. &lt;a href=&#34;https://github.com/timf34/arxiv2md&#34;&gt;Source&lt;/a&gt;. &lt;a href=&#34;https://markxiv.org/&#34;&gt;markxiv.org&lt;/a&gt; claims the same - by just changing the URL - but it ended up reporting an error when I tried this link: &lt;a href=&#34;https://markxiv.org/abs/2604.08649&#34;&gt;https://markxiv.org/abs/2604.08649&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;From Akhilesh Tilotia: So we have someone in our team with initials AS. She made a document which was named vAS. Then I made edits and named it vAT. These docs were in a CoWork folder. I asked Claude to clean up my doc. It created another version for me to review. In its wisdom, it named the file vAU 🙂&lt;/li&gt;
&lt;li&gt;Maybe what a forward-deployed engineer does is engineer AI-native workflows. (This sounded profound when I wrote it down. Not sure if it&amp;rsquo;ll sound as profound tomorrow.) The idea is that the FDE will say, screw existing processes; let me fire up my AI agent and get stuff done; THEN we&amp;rsquo;ll figure out what works, how to optimize it, etc.&lt;/li&gt;
&lt;li&gt;The &lt;a href=&#34;https://arxiv.org/abs/2604.08649&#34;&gt;PRAGMA: Revolut Foundation Model&lt;/a&gt; has some good tokenization ideas for tabular data. Create your own token space with &lt;code&gt;key–value–time&lt;/code&gt; tokenization - to retain field information. Bucketize numbers by percentile, preserving magnitude/ordering that subword tokenization destroys. Encode time both as log-seconds &lt;em&gt;and&lt;/em&gt; as cyclical calendar features.&lt;/li&gt;
&lt;li&gt;Codex uses the &lt;kbd&gt;Alt + Up Arrow&lt;/kbd&gt; key to edit queued commands, but in the VS Code terminal, this key binding is not sent to the shell. Enable the &lt;code&gt;terminal.integrated.sendKeybindingsToShell&lt;/code&gt; setting to send it to the shell, and hence to Codex.&lt;/li&gt;
&lt;li&gt;Based on this &lt;a href=&#34;https://chatgpt.com/share/6a16dfd6-bd70-83ec-807a-646366ba9a99&#34;&gt;catalog&lt;/a&gt; on &amp;ldquo;universal foods&amp;rdquo;, here&amp;rsquo;s what I 🟢 like, am 🟡 neutral, 🔴 dislike, 🟣 must try, and will ⚫ skip. &lt;!-- https://chatgpt.com/c/6a165e95-5100-83ec-8b90-c41fd2876fdf --&gt;
&lt;ul&gt;
&lt;li&gt;Universal favorites: 🟢 pizza, 🟢 fried potatoes/chicken, 🟡 dumplings, 🟢 ice cream.&lt;/li&gt;
&lt;li&gt;Universal comfort foods: 🟢 khichdi, 🟡 congee, 🟡 dal-rice, 🟡 risotto, 🟡 ramen, 🟢 pho, ⚫ chicken noodle soup, 🔴 rice porridge, 🟡 mac-and-cheese, 🔴 mashed potato, 🟣 polenta, 🟢 oatmeal, 🟣 Japanese curry rice.&lt;/li&gt;
&lt;li&gt;Acquired tastes that convert most: 🟡 coffee, 🟢 tea, 🟡 dark chocolate, 🟢 mild fermented dairy, 🟢 pickles, 🟢 olives, 🟣 kimchi, 🟣 miso, 🟢 mild chili dishes.&lt;/li&gt;
&lt;li&gt;Acquired tastes that have cult devotion: 🟣 durian, 🟣 natto, 🟣 stinky tofu, ⚫ fermented fish, ⚫ hákarl, 🟢 very funky blue cheese, ⚫ offal.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://oceanofpdf.com/&#34;&gt;OceanofPDF&lt;/a&gt; seems like a good place to download ePubs of books.&lt;/li&gt;
&lt;li&gt;The entire Wikipedia is available as a &lt;a href=&#34;https://huggingface.co/datasets/wikimedia/structured-wikipedia&#34;&gt;Parquet file&lt;/a&gt;. You can query it like &lt;code&gt;duckdb -c &amp;quot;FROM &#39;hf://datasets/wikimedia/structured-wikipedia/enwiki/data/*.parquet&#39; LIMIT 5&amp;quot;&lt;/code&gt;. The English version is 35 GB with 7.6 million articles, so you&amp;rsquo;re better off downloading it rather than running analyses remotely.&lt;/li&gt;
&lt;li&gt;When you receive a Cal.com link of the form &lt;code&gt;https://cal.com/USER/EVENT&lt;/code&gt; you can fetch the available slots via &lt;code&gt;curl -H &#39;cal-api-version: 2024-09-04&#39; &#39;https://api.cal.com/v2/slots?eventTypeSlug=EVENT&amp;amp;username=USER&amp;amp;start=2026-05-25&amp;amp;end=2026-06-01&amp;amp;timeZone=Asia/Singapore&amp;amp;format=range&#39;&lt;/code&gt;. Useful to automate good meeting-slot selection. &lt;!-- https://chatgpt.com/c/6a126d5e-b9c8-83ec-a88b-f230d04434e9 --&gt;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Reference saved memories&amp;rdquo; in ChatGPT is different from &amp;ldquo;Reference chat history&amp;rdquo; as per &lt;a href=&#34;https://help.openai.com/en/articles/8590148-memory-faq&#34;&gt;OpenAI&lt;/a&gt;. In &lt;a href=&#34;https://help.openai.com/en/articles/12584461-developer-mode-and-mcp-apps-in-chatgpt&#34;&gt;Developer Mode&lt;/a&gt;, memory is turned off, but not chat history. I confirmed that I can access past conversations in Developer Mode. It might be a privacy concern for others, but for me, this is singularly useful, because I can use ChatGPT with &lt;a href=&#34;https://www.s-anand.net/blog/how-i-use-local-mcp/&#34;&gt;Local MCP&lt;/a&gt; effectively getting a non-metered AI coding agent. &lt;!-- https://chatgpt.com/c/6a12c899-ac5c-83ec-a4fa-6e0717f810b3 --&gt;&lt;/li&gt;
&lt;li&gt;Seems GPT-5.2 reaches expert level in peer review: 45 scientists took 469 hours evaluating human &amp;amp; AI reviews on 82 papers. &amp;ldquo;Surprisingly, current AI reviewers are competitive even with the top-rated reviewers in Nature’s official peer review&amp;hellip;&amp;rdquo; though not without weaknesses, so use AI + humans. &lt;a href=&#34;https://arxiv.org/abs/2605.20668&#34;&gt;On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists&lt;/a&gt; via &lt;a href=&#34;https://bsky.app/profile/emollick.bsky.social/post/3mmf2ano3ik27&#34;&gt;Ethan Mollick&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
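&lt;p&gt;The Cal.com slot lookup in the list above can also be scripted. Here is a minimal Python sketch that only builds the request URL and header from a booking link. The function name &lt;code&gt;slots_url&lt;/code&gt; is hypothetical, the query parameters are the ones in the curl command, and the shape of the JSON response is not covered here.&lt;/p&gt;

```python
from urllib.parse import urlencode, urlparse

# Header from the curl command quoted in the list above.
HEADERS = {"cal-api-version": "2024-09-04"}


def slots_url(booking_link, start, end, time_zone):
    """Turn https://cal.com/USER/EVENT into the slots query URL (hypothetical helper)."""
    user, event = urlparse(booking_link).path.strip("/").split("/")[:2]
    query = urlencode({
        "eventTypeSlug": event,
        "username": user,
        "start": start,
        "end": end,
        "timeZone": time_zone,
        "format": "range",
    })
    return "https://api.cal.com/v2/slots?" + query


print(slots_url("https://cal.com/USER/EVENT", "2026-05-25", "2026-06-01", "Asia/Singapore"))
```

&lt;p&gt;Pass the URL and &lt;code&gt;HEADERS&lt;/code&gt; to any HTTP client. &lt;code&gt;urlencode&lt;/code&gt; percent-encodes the slash in the time zone, which is equivalent to the unencoded form in the curl command.&lt;/p&gt;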
</description>
    </item>
    <item>
      <title>AI Coding Agent Subscription ROI</title>
      <link>https://www.s-anand.net/blog/ai-coding-agent-subscription-roi/</link>
      <pubDate>Sat, 30 May 2026 23:19:34 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/ai-coding-agent-subscription-roi/</guid>
      <description>&lt;p&gt;I ran &lt;a href=&#34;https://github.com/ryoppippi/ccusage&#34;&gt;&lt;code&gt;npx -y ccusage monthly --compact&lt;/code&gt;&lt;/a&gt; to get the following break-up of my AI coding agent costs.&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Month&lt;/th&gt;
          &lt;th style=&#34;text-align: right&#34;&gt;Codex&lt;/th&gt;
          &lt;th style=&#34;text-align: right&#34;&gt;Claude&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;2025-09&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$37.47&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$2.29&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2025-10&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$106.79&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$9.13&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2025-11&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$100.35&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$14.24&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2025-12&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$240.69&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$24.88&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2026-01&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$100.89&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$20.28&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2026-02&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$323.21&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$29.46&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2026-03&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$1996.32&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$134.87&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2026-04&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$401.36&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$47.07&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2026-05&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$378.20&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$45.13&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;This shows the ROI of my $20 subscriptions to each. I get ~$35 worth of API calls a month for my $20 Claude Pro subscription and ~$400 of API calls a month for my $20 ChatGPT Plus subscription (on top of my ChatGPT chats).&lt;/p&gt;
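&lt;p&gt;One way to arrive at figures in that range is to average the monthly columns and divide by the subscription price. Whether the numbers in the post were derived this way or read off the most recent months is an assumption. The helper name &lt;code&gt;value_multiple&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
from statistics import mean

# Monthly API-equivalent cost in USD, copied from the table above.
codex = [37.47, 106.79, 100.35, 240.69, 100.89, 323.21, 1996.32, 401.36, 378.20]
claude = [2.29, 9.13, 14.24, 24.88, 20.28, 29.46, 134.87, 47.07, 45.13]

SUBSCRIPTION_USD = 20  # price of each plan, as stated in the post


def value_multiple(monthly_costs, price=SUBSCRIPTION_USD):
    """Average monthly API-equivalent spend divided by the subscription price."""
    return mean(monthly_costs) / price


print(round(mean(claude), 2), round(value_multiple(claude), 1))  # 36.37 1.8
print(round(mean(codex), 2), round(value_multiple(codex), 1))    # 409.48 20.5
```

&lt;p&gt;On a plain average, that is roughly 2x the subscription price for Claude and roughly 20x for Codex.&lt;/p&gt;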
&lt;p&gt;I end up using Codex a lot more - partly because it&amp;rsquo;s a bit more diligent, but mostly because it&amp;rsquo;s a lot cheaper.&lt;/p&gt;
&lt;p&gt;Clearly, subscriptions are a good deal for individuals. Codex, especially.&lt;/p&gt;
&lt;p&gt;This may not be true for corporates. &lt;a href=&#34;https://simonwillison.net/2026/May/27/product-market-fit/&#34;&gt;Simon Willison&lt;/a&gt; says that Anthropic and OpenAI both changed &lt;em&gt;enterprise&lt;/em&gt; pricing to align with token prices. That means the cost of enterprise AI security is ~2-20 &lt;em&gt;times&lt;/em&gt; their token budget - which is growing rapidly.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;BTW, my moment of &lt;a href=&#34;https://en.wikipedia.org/wiki/Chatbot_psychosis&#34;&gt;AI psychosis&lt;/a&gt; was in March 2026. The coding agents had increased their limits and I was tokenmaxxing. I&amp;rsquo;m far from that limit today, but the symptoms linger.&lt;/p&gt;
&lt;noscript&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-30-ai-coding-agent-subscription-roi.avif&#34;&gt;&lt;/p&gt;
&lt;/noscript&gt;
&lt;p&gt;&lt;canvas id=&#34;ai-coding-agent-usage&#34;&gt;&lt;/canvas&gt;&lt;/p&gt;
&lt;script&gt;
  (async function () {
    const rows = [
      { month: &#39;2025-09&#39;, claude: 2.29, codex: 37.47 },
      { month: &#39;2025-10&#39;, claude: 9.13, codex: 106.79 },
      { month: &#39;2025-11&#39;, claude: 14.24, codex: 100.35 },
      { month: &#39;2025-12&#39;, claude: 24.88, codex: 240.69 },
      { month: &#39;2026-01&#39;, claude: 20.28, codex: 100.89 },
      { month: &#39;2026-02&#39;, claude: 29.46, codex: 323.21 },
      { month: &#39;2026-03&#39;, claude: 134.87, codex: 1996.32 },
      { month: &#39;2026-04&#39;, claude: 47.07, codex: 401.36 },
      { month: &#39;2026-05&#39;, claude: 45.13, codex: 378.20 }
    ];

    const theme = {
      ink: &#39;#231f20&#39;,
      muted: &#39;#6b625c&#39;,
      grid: &#39;rgba(35, 31, 32, 0.11)&#39;,
      axis: &#39;rgba(35, 31, 32, 0.22)&#39;,
      claude: &#39;#b96d3a&#39;,
      codex: &#39;#2d5f87&#39;,
      tooltip: &#39;rgba(35, 31, 32, 0.94)&#39;
    };

    const canvas = document.getElementById(&#39;ai-coding-agent-usage&#39;);

    Object.assign(canvas.style, {
      display: &#39;block&#39;,
      width: &#39;100%&#39;,
      height: &#39;100%&#39;,
      minHeight: &#39;480px&#39;
    });

    function loadChartJs() {
      if (window.Chart) return Promise.resolve();
      return new Promise((resolve, reject) =&gt; {
        const script = document.createElement(&#39;script&#39;);
        script.src = &#39;https://cdn.jsdelivr.net/npm/chart.js@4.4.1/dist/chart.umd.min.js&#39;;
        script.onload = resolve;
        script.onerror = reject;
        document.head.appendChild(script);
      });
    }

    await loadChartJs();

    const usd = new Intl.NumberFormat(&#39;en-US&#39;, {
      style: &#39;currency&#39;,
      currency: &#39;USD&#39;,
      maximumFractionDigits: 2
    });

    const compactUsd = new Intl.NumberFormat(&#39;en-US&#39;, {
      style: &#39;currency&#39;,
      currency: &#39;USD&#39;,
      notation: &#39;compact&#39;,
      maximumFractionDigits: 1
    });

    Chart.defaults.font.family = getComputedStyle(document.body).fontFamily;
    Chart.defaults.color = theme.muted;

    new Chart(canvas, {
      type: &#39;line&#39;,
      data: {
        labels: rows.map(d =&gt; d.month),
        datasets: [
          {
            label: &#39;Claude&#39;,
            data: rows.map(d =&gt; d.claude),
            borderColor: theme.claude,
            backgroundColor: theme.claude,
            pointBackgroundColor: &#39;#ffffff&#39;,
            pointBorderColor: theme.claude,
            pointBorderWidth: 2.5,
            pointRadius: 4,
            pointHoverRadius: 7,
            borderWidth: 3,
            tension: 0.22
          },
          {
            label: &#39;Codex&#39;,
            data: rows.map(d =&gt; d.codex),
            borderColor: theme.codex,
            backgroundColor: theme.codex,
            pointBackgroundColor: &#39;#ffffff&#39;,
            pointBorderColor: theme.codex,
            pointBorderWidth: 2.5,
            pointRadius: 4,
            pointHoverRadius: 7,
            borderWidth: 3,
            tension: 0.22
          }
        ]
      },
      options: {
        responsive: true,
        maintainAspectRatio: false,
        interaction: { mode: &#39;index&#39;, intersect: false },
        layout: { padding: { top: 12, right: 18, bottom: 4, left: 8 } },
        plugins: {
          legend: {
            position: &#39;bottom&#39;,
            labels: {
              usePointStyle: true,
              pointStyle: &#39;circle&#39;,
              boxWidth: 8,
              boxHeight: 8,
              padding: 22,
              color: theme.muted,
              font: { size: 13, weight: &#39;650&#39; }
            }
          },
          tooltip: {
            enabled: true,
            mode: &#39;index&#39;,
            intersect: false,
            backgroundColor: theme.tooltip,
            bodyFont: { size: 13, weight: &#39;650&#39; },
            padding: 13,
            displayColors: true,
            callbacks: {
              title: items =&gt; items[0].label,
              label: item =&gt; `${item.dataset.label}: ${usd.format(item.parsed.y)}`,
              afterBody: items =&gt; {
                const i = items[0].dataIndex;
                const total = rows[i].claude + rows[i].codex;
                return `Combined: ${usd.format(total)}`;
              }
            }
          }
        },
        scales: {
          x: {
            grid: { color: &#39;rgba(35,31,32,0.07)&#39;, drawTicks: false },
            border: { color: theme.axis },
            ticks: { maxRotation: 0, autoSkip: false, color: theme.muted, font: { size: 12 } }
          },
          y: {
            beginAtZero: true,
            suggestedMax: 2200,
            grid: { color: theme.grid },
            border: { display: false },
            ticks: {
              padding: 8,
              color: theme.muted,
              callback: value =&gt; value &gt;= 1000 ? compactUsd.format(value) : &#39;$&#39; + value
            },
            title: {
              display: true,
              text: &#39;Cost (USD)&#39;,
              color: theme.muted,
              font: { size: 12, weight: &#39;650&#39; }
            }
          }
        }
      }
    });
  })();
&lt;/script&gt;
</description>
    </item>
    <item>
      <title>Retire the Verify Button</title>
      <link>https://www.s-anand.net/blog/retire-the-verify-button/</link>
      <pubDate>Sat, 30 May 2026 16:25:35 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/retire-the-verify-button/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-30-retire-the-verify-button.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;My post &lt;a href=&#34;https://www.s-anand.net/blog/add-a-verify-button/&#34;&gt;&amp;ldquo;Add a Verify Button&amp;rdquo;&lt;/a&gt; has a problem. When &lt;a href=&#34;https://www.linkedin.com/in/rohitsaran/&#34;&gt;Rohit&lt;/a&gt; requested hyperlocal news for every PIN code in Mumbai, we&amp;rsquo;d need a &amp;ldquo;verify&amp;rdquo; button on &lt;em&gt;every&lt;/em&gt; &lt;a href=&#34;https://sanand0.github.io/journalists/statnostics/&#34;&gt;Statoistics card&lt;/a&gt; - hundreds of PIN codes, &lt;em&gt;every day&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Verifying every output introduces a new bottleneck: a person inspecting every unit. &lt;strong&gt;That&amp;rsquo;s 100% inspection - which you do when you don&amp;rsquo;t yet trust the process.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Manufacturing solved this a century ago. At Western Electric&amp;rsquo;s Hawthorne Works (famous for the &lt;a href=&#34;https://en.wikipedia.org/wiki/Hawthorne_effect&#34;&gt;Hawthorne Effect&lt;/a&gt;), quality control meant inspecting finished products and pulling the defective ones. &lt;a href=&#34;https://en.wikipedia.org/wiki/Walter_A._Shewhart&#34;&gt;Walter Shewhart&lt;/a&gt; sent his boss a &lt;a href=&#34;https://deming.org/the-first-control-chart/&#34;&gt;one-page memo&lt;/a&gt;; about a third of it was a control chart.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://deming.org/wp-content/uploads/2021/04/Screen-Shot-2021-04-18-at-7.30.33-PM.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/W._Edwards_Deming&#34;&gt;Deming&lt;/a&gt; turned this approach into his third point: &lt;em&gt;&amp;ldquo;Stop relying on inspection for quality.&amp;rdquo;&lt;/em&gt; Build quality in from the start instead of inspecting defects out at the end.&lt;/p&gt;
&lt;p&gt;His process tells us what to do with a verify button as volume climbs.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Measure how often it&amp;rsquo;s right.&lt;/strong&gt; Don&amp;rsquo;t retire inspection until you know your defect rate. For example, on &lt;a href=&#34;https://sanand0.github.io/llmevals/double-checking/&#34;&gt;one classification task I benchmarked&lt;/a&gt;, the average model error was about 14%. Until we know that number, &amp;ldquo;it&amp;rsquo;s probably fine&amp;rdquo; is just a feeling.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stratify.&lt;/strong&gt; &amp;ldquo;The garden has 18 plants&amp;rdquo; is easy to validate and low-risk if wrong. &amp;ldquo;This loan is denied&amp;rdquo; is neither. Verify the risky things carefully, let the cheap things through with low effort. Equal effort on both is waste.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sample.&lt;/strong&gt; Nobody inspected every artillery shell in the war. Shewhart&amp;rsquo;s Bell Labs colleagues &lt;a href=&#34;https://en.wikipedia.org/wiki/Harold_F._Dodge&#34;&gt;Harold Dodge&lt;/a&gt; and Harry Romig put sampling inspection on a statistical basis. Check a sample at known confidence; watch whether the process drifts. The equivalent: verify a random sample of cards, track the rate, and react when the rate moves, not when one card looks off.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Augment with other models.&lt;/strong&gt; When I &lt;a href=&#34;https://sanand0.github.io/llmevals/double-checking/&#34;&gt;correlated two models&amp;rsquo; errors&lt;/a&gt;, the correlation was about 20%. If one gets a case wrong, the other usually doesn&amp;rsquo;t miss the same one. So a second model is a cheap, imperfect inspector. Asking AI to generate verifiable output lets another model spot obvious errors.&lt;/li&gt;
&lt;/ul&gt;
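&lt;p&gt;The &amp;ldquo;Sample&amp;rdquo; step above can be sketched with a standard three-sigma control chart for proportions (a p-chart). This is a minimal Python sketch, not the pipeline behind the cards: the function names are hypothetical, the sample size of 100 is made up, and the 14% baseline is borrowed from the classification benchmark purely as an illustration.&lt;/p&gt;

```python
import math
import random


def pick_sample(card_ids, k, seed=0):
    """Choose k cards uniformly at random for human verification."""
    return random.Random(seed).sample(list(card_ids), k)


def p_chart_limits(baseline_error_rate, sample_size):
    """Three-sigma limits for the error rate observed in one sample."""
    sigma = math.sqrt(baseline_error_rate * (1 - baseline_error_rate) / sample_size)
    lower = max(0.0, baseline_error_rate - 3 * sigma)
    upper = min(1.0, baseline_error_rate + 3 * sigma)
    return lower, upper


# Illustrative numbers only: 14% baseline error, 100 cards checked per batch.
lower, upper = p_chart_limits(0.14, 100)
print(round(lower, 3), round(upper, 3))  # 0.036 0.244
```

&lt;p&gt;With these numbers, a batch of 100 sampled cards with anywhere from 4 to 24 errors is within normal variation. Only a rate outside the limits is a signal that the process has drifted, which is also the guard against tampering.&lt;/p&gt;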
&lt;p&gt;Also, it&amp;rsquo;s best to avoid overreacting to defects. Deming called this (re-tuning the process after every defect) &lt;em&gt;tampering&lt;/em&gt;. It makes the variation worse. It&amp;rsquo;s worth collecting data and finding the real causes before changing the process.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s what &lt;a href=&#34;https://www.linkedin.com/in/ankorrai&#34;&gt;Ankor&lt;/a&gt; calls the &lt;a href=&#34;https://sanand0.github.io/talks/2026-03-18-verifiable-agents/&#34;&gt;future of verifiable autonomy&lt;/a&gt;. It starts with:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;we are going to have to move beyond testing correctness to standard testing… if we test the pipeline once before deployment, we can trust that every single output produced by that pipeline, unless we make any adjustment to it, can be trusted.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;His analogy is software. Verification becomes a standard layer in the production loop, like how CI/CD is a standard step before you ship. Over a few years the need for human validation drops, and programmatic checks plus triage take over.&lt;/p&gt;
&lt;p&gt;Regulated finance has a lot of experience with this. After the GFC, the Fed and OCC issued &lt;a href=&#34;https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm&#34;&gt;SR 11-7&lt;/a&gt; in April 2011. Every quantitative model going into production needs independent validation by people separate from the developers, plus ongoing monitoring, before it ships. &amp;ldquo;Retire the verify button&amp;rdquo; doesn&amp;rsquo;t mean stop checking. &lt;strong&gt;It means have an independent validation layer with an owner.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Of course, this incurs cost - at scale. For us, it led to concerns from the Finance team that the token cost overhead was climbing. But, to quote &lt;a href=&#34;https://www.linkedin.com/in/srinivasankg/&#34;&gt;KG&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Token cost cannot be overhead. Token cost is direct cost because you&amp;rsquo;re replacing people.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So I now &lt;a href=&#34;https://sanand0.github.io/llmpricing/&#34;&gt;benchmark cost alongside accuracy&lt;/a&gt;. A contract-validation demo I run checks a contract against a clause checklist, citing where each clause sits, for about 3 cents and 6 seconds. Pricing it lets me decide whether a reviewer&amp;rsquo;s half-hour is worth more than 3 cents. Usually it is. Sometimes it isn&amp;rsquo;t.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Sometimes, this isn&amp;rsquo;t good enough. A client wanted PII scrubbed from 3 million user images with &lt;em&gt;zero&lt;/em&gt; leaks. I did the arithmetic out loud:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;with 99.9%, we&amp;rsquo;re talking about 3,000 images with personally identifiable information potentially slipping through. Is that OK?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He said, &amp;ldquo;No.&amp;rdquo; I told him we couldn&amp;rsquo;t do it. It needs more technology than we had. (Our sales team nearly had a heart attack.) &lt;strong&gt;A critical output of measuring is to check if it&amp;rsquo;s even possible.&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;I still manually verify AI output for new stuff. I don&amp;rsquo;t trust every pipeline yet. But when the scale becomes unwieldy, this is the process I switch to.&lt;/p&gt;
&lt;!-- https://claude.ai/chat/36780e30-48ca-4f84-af7a-4308e0880ce4 --&gt;
</description>
    </item>
    <item>
      <title>Add a Verify Button</title>
      <link>https://www.s-anand.net/blog/add-a-verify-button/</link>
      <pubDate>Sat, 30 May 2026 11:39:10 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/add-a-verify-button/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-30-add-a-verify-button.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.linkedin.com/in/rohitsaran/&#34;&gt;Rohit Saran&lt;/a&gt; looked at the &lt;a href=&#34;https://sanand0.github.io/journalists/statnostics/&#34;&gt;Statoistics cards&lt;/a&gt; my AI agents are generating for &lt;a href=&#34;https://x.com/hashtag/STATOISTICS&#34;&gt;The Times of India&lt;/a&gt;, and asked about a small button under each one.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://sanand0.github.io/journalists/statnostics/2026-04-27-citizen-survey/03-family-doctor-everyone-wants-nobody-has.svg&#34;&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-30-statoistics-card.avif&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In the list of Statoistics that you had put, I saw there&amp;rsquo;s a button called &amp;lsquo;Verify.&amp;rsquo; What was that meant to be or will do in future?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That verify button explains the claim, mentions the sources, and shows how to check the claim.&lt;/p&gt;
&lt;p&gt;One card said &amp;ldquo;9 in 10 Indians want a family doctor and barely 1 in 35 has one&amp;rdquo;. The button breaks that down:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&amp;ldquo;87% want a family doctor, 2.8% outpatient visits were to an Asha worker…&amp;rdquo;&lt;/li&gt;
&lt;li&gt;It identifies which columns of the source document it looked at and which numbers it verified.&lt;/li&gt;
&lt;li&gt;It links to the program that it wrote to do the verification.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I said, &amp;ldquo;it lets humans check if the numbers are right - by giving them steps &amp;ndash; where exactly to check, how to check if it is correct.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.linkedin.com/in/sajeev-kumarapuram-205ba933&#34;&gt;Sajeev&lt;/a&gt; pushed back: &lt;em&gt;&amp;ldquo;It&amp;rsquo;s more &amp;lsquo;explain&amp;rsquo; than &amp;lsquo;verify&amp;rsquo; really.&amp;rdquo;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;True. &lt;a href=&#34;https://timesofindia.indiatimes.com/toireporter/author-Saurabh-Banerjee-479202560.cms&#34;&gt;Saurabh&lt;/a&gt; had asked for exactly this earlier: while a person is checking by hand, give them something that shows how the AI got to its answer. &lt;strong&gt;A verify button&amp;rsquo;s first job is not to prove the AI is right. It&amp;rsquo;s to let a nervous journalist check, cheaply, until they stop being nervous.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This instinct is old. The Royal Society took &lt;a href=&#34;https://royalsociety.org/about-us/who-we-are/history/&#34;&gt;&lt;em&gt;nullius in verba&lt;/em&gt;&lt;/a&gt; as its motto around 1662, &amp;ldquo;take nobody&amp;rsquo;s word for it.&amp;rdquo; They didn&amp;rsquo;t print claims and ask you to trust the author. In 1662 they made &lt;a href=&#34;https://en.wikipedia.org/wiki/Robert_Hooke&#34;&gt;Robert Hooke&lt;/a&gt; their Curator of Experiments, whose job was to re-run the demonstration in front of the Fellows. A verify button is that, without Hooke.&lt;/p&gt;
&lt;p&gt;(Merchants got there two centuries earlier: double-entry bookkeeping, codified by &lt;a href=&#34;https://en.wikipedia.org/wiki/Luca_Pacioli&#34;&gt;Pacioli&lt;/a&gt; in 1494, means every entry has a counter-entry and the books either balance or they don&amp;rsquo;t.)&lt;/p&gt;
&lt;p&gt;Rohit&amp;rsquo;s reason for liking it went somewhere I hadn&amp;rsquo;t fully thought through. He went to brand.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It&amp;rsquo;s like why a product with 10-year guarantee is likely to be made better than a product with 2-year warranty, because the company has confidence to tell the customer, &amp;lsquo;Look, I am standing behind this product for 10 years.&amp;rsquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And later:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Any brand that is saying, &amp;lsquo;Whatever I write is verifiable,&amp;rsquo; is so much more in this age of misinformation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;His version of why this matters for a newspaper: &lt;em&gt;&amp;ldquo;a brand is only about trust. Rest is news is anyway a commodity.&amp;rdquo;&lt;/em&gt; &lt;strong&gt;A verify button is a public claim that you&amp;rsquo;re willing to be checked.&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Here&amp;rsquo;s how I actually build &amp;ldquo;Verify&amp;rdquo; buttons, in increasing order of effort.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Link plus a searchable string.&lt;/strong&gt; A hyperlink may still be wrong. I want a link &lt;em&gt;and&lt;/em&gt; a short quote I can paste into the page&amp;rsquo;s search box and find. &lt;em&gt;&amp;ldquo;When I click on that link, I should be able to literally search for and find that piece of text, verifying that it did not hallucinate.&amp;rdquo;&lt;/em&gt; Then even a plain program (not even an LLM) can open every link and confirm the text is there.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;For numbers, the SQL query.&lt;/strong&gt; If it&amp;rsquo;s data, the SQL query (or Python script) that fetches that particular result is the closest equivalent. The button should just run the query against live data and show the number. The user doesn&amp;rsquo;t need to know SQL - they just see that the number matches.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The procedure as a checklist.&lt;/strong&gt; The button breaks the card into steps: this is the claim, this is the number, this is the column it came from, check that the D1A value matches. A person ticks their way down it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Verify with an AI agent.&lt;/strong&gt; Add a link that opens the claim in Google AI mode with a pre-filled prompt asking it to fact-check the claim. For example: &lt;a href=&#34;https://tools.s-anand.net/askai/?q=Fact-check+with+step-by-step+evidence%3A+According+to+Citizen+Survey+2022-23%2C+87%25+of+Indians+want+a+dedicated+family+doctor+but+only+2.8%25+actually+use+one.&#34;&gt;Fact-check with step-by-step evidence: According to Citizen Survey 2022-23, 87% of Indians want a dedicated family doctor but only 2.8% actually use one. How might it have changed since the publication?&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Rohit framed verification as three jobs, not one: &lt;em&gt;&amp;ldquo;Verification has sourcing, verification, and updation.&amp;rdquo;&lt;/em&gt; The last clause lets you also ask whether the number has gone stale since you published it.&lt;/p&gt;
&lt;p&gt;Getting the source right is not the same as getting the conclusion right. Rohit said: &lt;em&gt;&amp;ldquo;you are asking AI not only to get right source and right data, but now we are asking to interpret.&amp;rdquo;&lt;/em&gt; And interpretation is subjective on both ends. The button can confirm the number is real but not &lt;em&gt;prove&lt;/em&gt; the argument is sound.&lt;/p&gt;
&lt;p&gt;Of course, the sources could be wrong. &amp;ldquo;Check the source&amp;rdquo; assumes good data quality. Luckily, data is more often right than wrong, and verification can shine a light on bad data.&lt;/p&gt;
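&lt;p&gt;The first option in the list above, a link plus a searchable string, is simple enough for a plain program to check. Here is a minimal Python sketch under stated assumptions: the function names are hypothetical, the page is assumed to be UTF-8, and the quote is assumed to sit in the page as uninterrupted text, so a quote split by inline markup would be missed.&lt;/p&gt;

```python
import re
from urllib.request import urlopen


def normalize(text):
    """Collapse whitespace and lowercase so line breaks do not hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_found(page_text, quote):
    """True when the quoted string appears in the page text."""
    return normalize(quote) in normalize(page_text)


def check_citation(url, quote):
    """Fetch the page and look for the quote (network call, not run here)."""
    with urlopen(url, timeout=10) as response:
        page_text = response.read().decode("utf-8", errors="replace")
    return quote_found(page_text, quote)


print(quote_found("87%  want a\nfamily doctor", "want a family doctor"))  # True
```

&lt;p&gt;Running &lt;code&gt;check_citation&lt;/code&gt; over every link and quote pair in a card gives a pass or fail per citation without involving an LLM.&lt;/p&gt;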
&lt;hr&gt;
&lt;p&gt;We can start simple. The cheapest version: &lt;em&gt;every&lt;/em&gt; AI output has a &amp;ldquo;Verify&amp;rdquo; link to a search query the user can easily inspect. That changes their question from &amp;ldquo;can I trust this?&amp;rdquo; to &amp;ldquo;let me check.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;If this can establish trust and a brand for India&amp;rsquo;s largest newspaper, enterprise AI apps might do well to follow.&lt;/p&gt;
&lt;!-- https://claude.ai/chat/36780e30-48ca-4f84-af7a-4308e0880ce4 --&gt;
</description>
    </item>
    <item>
      <title>One extra push-up every day</title>
      <link>https://www.s-anand.net/blog/one-extra-push-up-every-day/</link>
      <pubDate>Fri, 29 May 2026 09:57:03 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/one-extra-push-up-every-day/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-29-one-extra-push-up-every-day.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m doing one extra push-up every day.&lt;/p&gt;
&lt;p&gt;One of my &lt;a href=&#34;https://www.s-anand.net/blog/my-year-in-2025/&#34;&gt;2026 goals&lt;/a&gt; is to build muscles. I hadn&amp;rsquo;t done anything about it until May.&lt;/p&gt;
&lt;p&gt;This month, I figured I would do the absolute minimum, at least to get started, because I seem to have starting trouble more than anything else. &lt;a href=&#34;https://chatgpt.com/share/6a18f311-dcf0-83ec-aa24-9ecc82053f37&#34;&gt;I asked ChatGPT&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I want to build muscles. What&amp;rsquo;s the most effective thing that I can do that would take no more than one minute that I can practice every day without any equipment and I can do this anywhere and will have the most impact on building muscles? Research, give me the top five options and recommend one for me.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It suggested push-ups. &lt;a href=&#34;https://claude.ai/share/d6138707-2e80-48b7-9c9f-8ff07d424d9f&#34;&gt;So did Claude&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Since I do yoga every day, I decided to do push-ups after that. I kept forgetting, so I decided to do push-ups &lt;em&gt;before&lt;/em&gt; that. (This worked.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson&lt;/strong&gt;: &lt;a href=&#34;https://jamesclear.com/habit-stacking&#34;&gt;Habit stacking&lt;/a&gt; works. Habit pre-stacking works better.&lt;/p&gt;
&lt;p&gt;I remember that story about a boy carrying a young bull as a child, and as they both grew up, he grew into a man strong enough to carry an adult bull. I am applying a similar practice.&lt;/p&gt;
&lt;p&gt;I started with 10 push-ups a day. Every day, I&amp;rsquo;m adding one push-up to it. I just finished 23. That is, I have spent the last ~16 days (with 3 misses in between) adding one push-up each day to my routine.&lt;/p&gt;
&lt;p&gt;This seems to be just the right level of incremental difficulty. Every day feels as miserable as the previous one. I began absolutely unable to do a single push-up beyond 10. I just finished my 23 push-up routine, absolutely unable to do even one more. And it felt exactly the same as every other day.&lt;/p&gt;
&lt;p&gt;Maybe it&amp;rsquo;s because I &lt;em&gt;know&lt;/em&gt; the quota and the brain decides that&amp;rsquo;s &lt;em&gt;exactly&lt;/em&gt; the limit of what&amp;rsquo;s possible. But still, it feels like one extra push-up a day is a reasonable progression.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson&lt;/strong&gt;: &lt;a href=&#34;https://jamesclear.com/quotes/habits-are-the-compound-interest-of-self-improvement&#34;&gt;Compounding habits&lt;/a&gt; seem to work. I&amp;rsquo;ll keep you posted.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>ChatGPT is about FIDE 1600</title>
      <link>https://www.s-anand.net/blog/chatgpt-is-about-fide-1600/</link>
      <pubDate>Thu, 28 May 2026 16:04:51 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/chatgpt-is-about-fide-1600/</guid>
      <description>&lt;p&gt;I asked ChatGPT to play chess with &lt;a href=&#34;https://stockfishchess.org/&#34;&gt;Stockfish&lt;/a&gt;. Stockfish is a &amp;ldquo;strong open-source chess engine&amp;rdquo;. It has 8 levels of difficulty, which &lt;a href=&#34;https://share.google/aimode/yA9NvnPcsZ1TFtmna&#34;&gt;roughly maps to these FIDE levels&lt;/a&gt;:&lt;/p&gt;
&lt;section ai-disclosure=&#34;ai-generated&#34; data-ai-model=&#34;gemini-3.5-flash&#34; data-ai-provider=&#34;Google&#34;&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Stockfish&lt;/th&gt;
          &lt;th&gt;FIDE&lt;/th&gt;
          &lt;th&gt;Player Level &amp;amp; Description&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 1&lt;/td&gt;
          &lt;td&gt;~1000&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Beginner&lt;/strong&gt;: Constantly blunders, hangs pieces deliberately.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 2&lt;/td&gt;
          &lt;td&gt;~1100&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Advanced Beginner&lt;/strong&gt;: Fewer obvious tactical mistakes, plays completely aimlessly.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 3&lt;/td&gt;
          &lt;td&gt;~1200&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Early Intermediate&lt;/strong&gt;: Punishes very basic errors but regularly drops pieces.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 4&lt;/td&gt;
          &lt;td&gt;~1350&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Intermediate&lt;/strong&gt;: Plays standard opening moves; requires solid, blunder-free play to beat.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 5&lt;/td&gt;
          &lt;td&gt;~1450&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Advanced Intermediate&lt;/strong&gt;: Rarely hangs single pieces; you need positional advantages.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 6&lt;/td&gt;
          &lt;td&gt;~1650&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Strong Club Player&lt;/strong&gt;: Highly tactical. Aggressively exploits your mistakes.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 7&lt;/td&gt;
          &lt;td&gt;~1950&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Expert&lt;/strong&gt;: Exceptionally strong. Requires precise positional mastery and deep calculation.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 8&lt;/td&gt;
          &lt;td&gt;~2400&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Grandmaster&lt;/strong&gt;: Invincible for most humans. Plays with ruthless perfection.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Full Engine&lt;/td&gt;
          &lt;td&gt;~3600&lt;/td&gt;
          &lt;td&gt;Out of human reach completely, &amp;ldquo;like a smart ant trying to debate physics with a human.&amp;rdquo;&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;/section&gt;
&lt;p&gt;In the &lt;a href=&#34;https://chatgpt.com/share/6a17f88a-dd74-83ec-b6e6-b42fac198d9c&#34;&gt;first iteration&lt;/a&gt;, here were the results:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Stockfish&lt;/th&gt;
          &lt;th&gt;Result&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 0&lt;/td&gt;
          &lt;td&gt;Win&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 1&lt;/td&gt;
          &lt;td&gt;Win&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 2&lt;/td&gt;
          &lt;td&gt;Stalemate&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 3&lt;/td&gt;
          &lt;td&gt;Stalemate&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 4&lt;/td&gt;
          &lt;td&gt;Win&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 5&lt;/td&gt;
          &lt;td&gt;Loss&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 6&lt;/td&gt;
          &lt;td&gt;Loss&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&amp;hellip; etc.&lt;/td&gt;
          &lt;td&gt;Loss&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;When I asked ChatGPT how it played, it said something like &amp;ldquo;I wrote a Python program that plays chess using a fixed policy.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s crazy! So I told it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Rather than use a fixed policy, get the move that Stockfish made, analyze it, and return your next move. See if you can win at level 6.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;After a few attempts, it &lt;a href=&#34;https://chatgpt.com/share/6a17f740-0424-83ec-b298-5bf6056a3905&#34;&gt;won&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://lichess.org/l9vffWVr&#34;&gt;Here&amp;rsquo;s the game&lt;/a&gt;:&lt;/p&gt;
&lt;video controls=&#34;&#34; width=&#34;534&#34; height=&#34;542&#34; style=&#34;max-width: 100%; height: auto;&#34;&gt;
  &lt;source src=&#34;https://files.s-anand.net/images/2026-05-28-chatgpt-vs-stockfish-chess-game.webm&#34; type=&#34;video/webm&#34;&gt;&lt;a href=&#34;https://lichess.org/l9vffWVr&#34;&gt;ChatGPT vs Stockfish Level 6&lt;/a&gt;
&lt;/video&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-pgn&#34; data-lang=&#34;pgn&#34;&gt;[White &amp;#34;ChatGPT&amp;#34;]
[Black &amp;#34;Stockfish Skill Level 6&amp;#34;]
[Termination &amp;#34;White won by checkmate&amp;#34;]
[FinalFEN &amp;#34;4Q3/2qrkp2/4pN2/1pp1P3/7P/p1P3P1/P5K1/4R3 b - - 5 39&amp;#34;]

1. d4 e6 2. c4 Nf6 3. Nf3 Be7 4. g3 O-O 5. Bg2 a5
6. O-O c6 7. Qc2 d5 8. Rd1 Ne4 9. Nc3 Nxc3 10. bxc3 a4
11. e4 h6 12. Bf4 Re8 13. e5 b6 14. Nd2 Ba6 15. h4 Qc7
16. Be3 Bb7 17. f4 Na6 18. Rf1 Rad8 19. f5 Bf8 20. f6 Nb8
21. fxg7 Bxg7 22. Qd1 Nd7 23. Qg4 Nxe5 24. dxe5 c5
25. Bf4 Re7 26. Re1 Kf8 27. Qh5 a3 28. Bh6 dxc4 29. Nxc4 Bxg2
30. Kxg2 Rd3 31. Bxg7+ Kxg7 32. Rf4 Rd2+ 33. Nxd2 Rd7
34. Ne4 b5 35. Rg4+ Kf8 36. Rg8+ Kxg8 37. Nf6+ Kf8
38. Qh8+ Ke7 39. Qe8# 1-0
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;So, I guess ChatGPT (GPT-5.5, extended thinking) is at around the 1600 FIDE level right now.&lt;/p&gt;
&lt;p&gt;What&amp;rsquo;s impressive is that it wasn&amp;rsquo;t specifically trained on chess. It&amp;rsquo;s just something it picked up along the way.&lt;/p&gt;
&lt;p&gt;If it starts beating level 8 (grandmaster), will we finally acknowledge AGI? (Me? I think &lt;a href=&#34;https://marginalrevolution.com/marginalrevolution/2025/04/o3-and-agi-is-april-16th-agi-day.html&#34;&gt;we achieved AGI on 16 Apr 2025&lt;/a&gt;.)&lt;/p&gt;
</description>
    </item>
  </channel>
</rss>
