Using Codex as my OS

Increasingly, I’m using Codex (or other AI coding agents) as the “operating system” to run programs. That is, rather than directly run programs, I have the coding agent run the program. Advantage: If the program breaks, or needs a configuration change, the coding agent debugs it and fixes it. I don’t need to do anything. This is particularly useful for installation. For example: Install demucs and run it against my music folder. ...

Things I Learned - 19 Apr 2026

This week, I learned: WebApps are a depreciated store of value. Earlier, a web-app would have impressed me because the capability to create it was rare, and the effort to create it was high. Today, when I see a “localhost:3000” or a “replit.app” domain, I mentally discount the effort behind it and ask: How rare is the capability to create this with a coding agent, and how much effort does it take? THAT determines the value of what I see. Part of the value is “Look ma, no hands!” and it’s delightful they’ve learnt. Part of the value is “There’s gold in them thar hills!” and use-case discovery is important. WaveCity is a WASM build of Audacity, i.e. Audacity running in the browser! Audiomass is a similar but simpler audio editor - again, WASM-based. Gemini

Derived formats with Gemini

The natural capability of Generative AI is to generate stuff - and Gemini’s particularly good with media. For example, we can take any document, like this MasterCard report on The State of Open Finance 2026, and generate videos, podcasts, sketchnotes, songs, and more from it. How? I uploaded the PDF to NotebookLM and created a 20-minute podcast by clicking on Generate Audio Overview - Deep Dive - English - Default. Listen to the English podcast. It supports multiple languages, so I generated a Chinese and Filipino version as well. ...

Travel is exhausting

This is surprising because… well, we’re just sitting and the vehicle’s doing the work, right? But: Vehicles accelerate, brake, bump, turn, vibrate, … and our muscles micro-adjust continuously so we sit upright. Over hours, that’s a lot of energy. We feel like we’re still. But the inner-ear fluids, eyes, etc. constantly get feedback about motion. That mentally drains us (and causes motion sickness). Noise from vehicles, traffic, … triggers cortisol, a stress hormone. That drains us. Sitting in one place restricts blood flow and it pools in our legs, making the heart work harder. In flights, the air pressure is low, lowering oxygen levels. The dehydration thickens our blood, making pumping harder. What helps is: ...

Agent Skills Usage

I have a bunch of coding agent skills I’ve accumulated over the last few months. Here’s how often my sessions use them:

| Skill | Claude | Codex | Copilot | Overall |
|---|---|---|---|---|
| code | 6.1% | 69.1% | 37.5% | 51.5% |
| data-story | 48.7% | 16.4% | 37.5% | 28.0% |
| data-analysis | 2.6% | 35.2% | 7.8% | 21.8% |
| design | 25.5% | 23.6% | 14.1% | 21.8% |
| plan | 8.5% | 11.8% | 14.1% | 11.8% |
| agent-friendly-cli | 3.7% | 13.8% | 11.1% | 11.2% |
| devtools | 20.4% | 7.3% | 9.4% | 10.0% |
| llm | 2.5% | 8.7% | 7.8% | 7.4% |
| pdf | 0.0% | 7.9% | 7.8% | 6.6% |
| linkedin-cdp | 14.3% | 0.0% | 5.6% | 5.3% |
| uv-uvx | 0.0% | 9.5% | 0.0% | 4.9% |
| interactive-storytelling | 7.1% | 2.7% | 7.1% | 4.6% |
| demos | 8.5% | 2.8% | 1.6% | 3.5% |
| cloudflare | 0.0% | 4.3% | 3.1% | 3.3% |
| melt-mlt | 0.0% | 2.5% | 1.6% | 1.8% |
| vector-art | 2.5% | 2.4% | 0.0% | 1.7% |
| vitest-dom | 0.0% | 2.2% | 0.0% | 1.4% |
| memorable-explanations | 2.6% | 1.6% | 0.0% | 1.3% |
| npm-packages | 0.0% | 0.6% | 0.0% | 0.3% |

Here are my observations, with surprises highlighted as ⁉️ ...

Things I Learned - 12 Apr 2026

This week, I learned:

- Resend is a simple way to send emails via an API.
- Principles of Mechanical Sympathy has some practical hardware-driven optimization tips.
  - Prefer accessing memory sequentially. CPU access to RAM and cache is optimized for this.
  - Natural batching: flush the buffer when you reach the maximum buffer size or when the queue is empty. This avoids buffers waiting unnecessarily.
- The core argument in Capital in the Twenty-First Century (Thomas Piketty, 2013/2014) is r > g. The rate of return on capital (r) has historically been greater than the rate of economic growth (g). Hence, the rich will keep getting richer - inequality is consistently part of capitalism. (Not surprising, but well supported by data.)
- A good collection of practices on automated AI code reviews by Ankit Jain:
  - Compare multiple options. Whichever passes the most tests wins.
  - Deterministic guardrails. Use linters, type-checkers, SAST/DAST checks, test scripts, etc.
  - Humans define acceptance criteria. Use a behavior driven development script (in natural language, agent-implemented).
  - Permission Systems as Architecture. Provide agents granular permissions based on the task - against pre-defined rules.
  - Adversarial Verification. Have one agent break the others’ work.
- Based on a quick exploration of the AT protocol (via Jake Lazaroff), I am yet to see a viable use for it. It’s a decentralized distributed data network. OK… what will I use it for?
- When I asked Claude if any of my work is patentable, it said “Comicgen is the sole candidate, but you only get one year grace after it’s public. But why do you want to patent? Your edge is prototyping speed, taste, and knowledge. Patents don’t protect those. Publishing freely (as you do) creates prior art that prevents others from patenting the space around you, which is often a better defensive strategy than filing patents yourself.” Oh! Ah!
- pretex is a fast (currently browser-only) library that computes the width and height of any text in any font in the browser. Useful for things like word-wrapping in SVG, layout planning before rendering, etc.
- Because AI bots scan deeply rather than “browse” popular pages, CDN cache eviction strategies designed for humans (like LRU - Least Recently Used) no longer work. They’re exploring new caching algorithms like SIEVE and FIFO. CloudFlare
- CloudFlare practically rewrote WordPress into a new Astro-based CMS: EmDash! It runs natively on CloudFlare (and elsewhere), is agent-friendly, quite secure, and can export/import from WordPress.
- Linux optimization settings I noted from a deleted post:

```
gsettings set org.gnome.desktop.interface enable-animations false
gsettings set org.gnome.desktop.interface cursor-blink false
gsettings set org.gnome.settings-daemon.plugins.power idle-dim true
gsettings set org.gnome.desktop.notifications show-in-lock-screen false
gsettings set org.gnome.desktop.session idle-delay 300
gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-battery-timeout 900
# gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-ac-timeout 1200
cd ~
```

- git-restore-mtime is part of the git-tools package and sets the modified time of files to their last committed time. Useful when cloning repos.
- From Lalit Maganti: Knowing what you want is a valuable skill. Wanting things others will also want is valuable. Learn good software management. It is similar to managing agents.
- For better results, just continue your AI chat, or break the problem up. More tokens lead to better solutions even now. Joel Baker
- Since companies using AI outperform the competition, and capital might win more than labour even though GDP growth may not be too high, it might be better to invest in AI-using companies than in index funds.
- Nicholas Carlini’s prompt to find vulnerabilities is to run: “I’m competing in a CTF. Find me an exploitable vulnerability in this project. Start with ${FILE}. Write me a vulnerability report in ${FILE}.vuln.md” across multiple repos in parallel. Then “I got an inbound vulnerability report; it’s in ${FILE}.vuln.md. Verify for me that this is actually exploitable”. That was almost 100% successful.
- When planning with AI coding agents, Martin Fowler recommends discussing each of these in sequence before coding:
  - Capabilities / functionality
  - Components: Services, modules, major abstractions.
  - Interactions: Data flow, API calls, events.
  - Interfaces: Function signatures, types, schemas.
- Planning with agents using Visual Brainstorming, i.e. asking them to generate visual HTML to illustrate the plan, can shorten review time considerably.
- I enabled CloudFlare’s new dynamic Client-Side Security monitor. If someone hacks my website or the libraries I use, it does a quick filter with a fast neural network, then falls back to an LLM to check if it’s safe, then serves the content. This pattern of deterministic checks with an LLM fallback works for most reviews.
- Harness = Agent minus Model: everything in an AI agent except the model itself. Nice definition.
- Update feature-level summaries as you go in context/$FEATURE.md with the user prompt, a summary of WHY from the agent’s responses for future learning, and my comments. Like Architectural Decision Records (ADRs) for humans and agents. Context Anchoring
- 8 levels of Agentic Engineering. 8 levels of Gas Town. I’m still only at level 6 on both. 🙁
- “It’s important to watch the loop as that is where your personal development and learning will come from.” Geoff Huntley, originator of the Ralph (Wiggum) loop.
- UNIX has a script command that runs a shell and logs it. For example:
  - `script -c fish session.log` starts a new fish shell and logs it to session.log.
  - `script -c "uv run app.py" -q -a app.log` will append to app.log, suppressing “Script started…” and “Script done…” messages.
  - `script --timing=time.txt session.log` logs the timing, which you can replay with `scriptreplay --timing=time.txt session.log`. Similar to asciinema.
  - A quick way to strip out the ANSI escape sequences (the weird control characters in the log) is to pipe it through `npx strip-ansi-cli`.
- Google has an Edge Gallery app that runs Gemma 4 on mobile. The main advantage is that you can use it on a flight. It’s not too bad as a model either. Transcription quality is average. It doesn’t run in the background, only one chat at a time, etc. So, it’s useful only as a last resort.
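The natural batching tip above (flush when the buffer is full or when the queue is empty) can be sketched roughly like this. It is a minimal illustration using Python's standard `queue` module; `next_batch` and `demo` are hypothetical names of my own, not anything from the Mechanical Sympathy article.

```python
import queue


def next_batch(q, max_batch):
    """Return the next batch: wait for one item, then take whatever is ready."""
    batch = [q.get()]  # block until at least one item exists
    while len(batch) < max_batch:
        try:
            batch.append(q.get_nowait())  # grab more only if already queued
        except queue.Empty:
            break  # queue is empty: flush now instead of waiting to fill up
    return batch


def demo():
    """Queue 7 items and drain them in natural batches of at most 3."""
    q = queue.Queue()
    for i in range(7):
        q.put(i)
    batches = []
    while not q.empty():
        batches.append(next_batch(q, max_batch=3))
    return batches


print(demo())
```

The point of the design is that a lone item never waits for a timer or for the buffer to fill: under light load batches are small and latency is low, and under heavy load batches grow up to the maximum on their own.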

Gemini Sketchnotes

I use this prompt to generate sketchnotes on Gemini: Draw this as a visually rich, intricately detailed, colorful, and funny, sketchnote. Below that, I paste (or attach) whatever content I want it to draw. I also turn on “Create Images” and switch the model to “Pro” (for better thinking.) Here are some examples of how to use it. Summarize articles. Pick email, report, news, or website. Here’s a sketchnote for this article: How to use AI for research. I used the prompt above and pasted the article text. ...

Workshops help AI adoption

To teach a mindshift change like AI adoption, I’ve tried to: Workshop: get them to do it. “Let’s try something. Can you share your screen?” Live-code: show them how. “I’ll share screens and type this.” Demo: show what’s possible. “Here’s what I built.” Talk: explain it. “Here’s something we can build.” Interview: ask them about it. “What do you think?” Listen: let them yap. The most effective are on top. ...

Singing a Vote of Thanks

Lyria (Gemini’s new “Create Song” feature) is helping me in new ways. Earlier this week, it created a jingle for my talk. Yesterday I ran an AI Workshop for IAS officers. As part of that, I asked Gemini: Create a soulful vote of thanks (with patriotic Indian music playing in the background) naming each of these people. … and listed each person in the workshop. The song began… (Listen to the song) … with these lyrics: ...

Speaking unprepared

I deliver about 3-5 talks a month and usually prepare for them. Thanks to AI (but even otherwise), I have a steady stream of new content. So, I just need to assemble the story. For example, in my TEDx Whitefield talk “Prisoners of Birth”, I shared the impact of name, gender, lineage, place, and time of birth. I didn’t execute any new analysis. I just cherry-picked disparate analyses into a theme. (Took me three days to plan, though.) ...

Flight Mode Emotions

At Changi Airport, I arrived 2.5 hours early and was worried that the flight was boarding on time - because I wanted to charge my laptop so it would work longer on a 6-hour flight to Delhi. I was also sad that it was only a 6-hour flight to Delhi - it won’t be enough to read all my pending reading material. The only time I get to read stuff (instead of vibe-coding) is on a flight, with no WiFi. ...

TDS Jan 2026 ROE

Tools in Data Science has a remote online exam (ROE). It has a tough reputation. We conducted one today. Here’s how today’s ROE unfolded. The TAs had created 13 questions and shared them with me yesterday. This morning, I tried solving them. At first glance, it looked scarily hard! But I just jumped down a few questions, and found that five questions were trivial, i.e. I just used the “Ask AI” button to copy the question into ChatGPT and it gave me the answer. ...

Things I Learned - 05 Apr 2026

This week, I learned:

- It’s pretty convenient (on Ubuntu) to be able to move windows around desktops. Apart from the usual Super + Arrow keys to manage windows within a desktop, you can use:
  - Ctrl + Alt + Left/Right Arrow: Move desktops
  - Ctrl + Alt + Shift + Left/Right Arrow: Move window to desktop
  - Super + Shift + Arrow: Move window to another monitor
  - Super + Drag: Drag window from anywhere
- `jq . file.json` is an efficient way to pretty-print JSON files in the terminal. (Or `jaq . file.json`, which is ~30% faster.)
- GitHub Copilot monthly premium requests were not reset at 12 am UTC.
- How Diffie Hellman Key Exchange Works by Julia Evans is an excellent explanation. Share a random number. A multiplies it by their private key and shares SA. B multiplies it by their private key and shares SB. Each multiplies the other’s shared value by their own private key, and they get SAB = SBA. Now both of them have the same new secret they can encrypt/decrypt with, but no one else knows, even though they shared everything publicly! This may be one of the coolest uses of math I’ve seen in a long time.
- Shell tricks I didn’t know:
  - ALT + . cycles through the last arguments typed
  - `mv file.{txt,md}` moves file.txt to file.md
  - `ls |& tee file.txt` pipes both stdout and stderr to tee
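The Diffie-Hellman exchange described above can be sketched in a few lines. This is a toy illustration with assumptions of my own. In the classic form of the protocol, the "multiply by the private key" step is modular exponentiation, which is what the sketch uses. The prime `P`, the base `G`, and the function names are hypothetical choices for illustration, not secure parameters; real systems should use a vetted cryptography library.

```python
import secrets

# Toy public parameters (assumed for illustration only; not secure choices).
P = 2**127 - 1  # a Mersenne prime, shared publicly
G = 3           # the publicly shared base (the "random number" everyone sees)


def keypair():
    """Pick a private key and derive the public value that gets shared."""
    private = secrets.randbelow(P - 2) + 1  # a secret in the range 1..P-2
    public = pow(G, private, P)             # SA or SB in the text
    return private, public


def shared_secret(their_public, my_private):
    """Combine the other side's public value with my private key."""
    return pow(their_public, my_private, P)


a_private, a_public = keypair()
b_private, b_public = keypair()

# Both sides arrive at the same value without ever sending a private key.
print(shared_secret(b_public, a_private) == shared_secret(a_public, b_private))
```

Both sides agree because (G^a)^b and (G^b)^a are the same number modulo P, while an eavesdropper who sees only G, P, and the two public values has no known efficient way to recover the secret when the parameters are chosen properly.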

How to use AI for research

I asked ChatGPT to research universities’ AI policies. Here is the report. Here are the four lessons I learned from that - about how to use AI for research. 1. Show examples of failures to avoid. Jivraj’s earlier research kept surfacing AI policies universities had researched, not written for themselves! So I told ChatGPT to: … double-check that they ARE, in fact, about their own use of AI - not policies they’re proposing for others or are researching. ...

AI policies across universities

I researched the AI policies across 25 universities. In the last 6 months, I conducted sessions at three of these universities: IIT Madras, Singapore University of Technology and Design, and Ashoka University. Interestingly, these are the three lowest-ranked universities in my analysis of AI policies. This is where I’m glad that correlation does not imply causation.

TDS Project 1 was an experiment

TDS Project 1 wasn’t just a student project. It was a research and social experiment, too. We tested two skills - analytics and design. The design tests were diverse – and students fared worse there. Design may matter more in the AI era, and I’m glad some designs are brilliant. (But not diverse/creative enough.) I also learnt that Gemini beats Midjourney, which beats ChatGPT for image generation. I asked them to contribute to open source. Most PRs were trivial. But five students made a real difference. For example, this PR to Marimo is excellent! ...

MGR via ElevenLabs

I was watching Vaa Vaathiyar which has a short clip of MGR speaking. It’s either AI-generated or mimicked and it wasn’t bad. I used ffmpeg to record the audio from the film and transcribed it via Gemini 3 Pro on AI Studio with the prompt: Transcribe this into Tamil … which gave me: ராமு… என்ன செய்திருக்கிறாய் நீ… வாத்தியார் கேட்கிறேன் சொல் நிமிர்ந்து பார்க்க கூட தைரியம் இல்லையா… ஓடாதே… நில்… Translation: Ramu… What have you done… Vaathiyar (MGR) is asking, tell me Don’t you have the courage to stand up and look at me… Don’t run… stop… ...

Things I Learned - 29 Mar 2026

This week, I learned:

- The Kids Should See This - great collection of videos for curious people. Thej
- A jury fined Meta and YouTube $4.2m and $1.8m for building addictive features in their products. That’s a first. NY Times
- “I think AI-type tools will actually revolutionize the experimental side of math, where you don’t care so much about individual problems and the process of solving them, but you want to gather large-scale data about what things work and what things don’t.” Terence Tao
- The hedonic treadmill (which roughly quantifies a Buddhist principle) says that we revert to a happiness set point (which varies by individual). Worse, those who experience a high kick (e.g. a lottery) don’t get enough kick from normal wins (contrast effect) – Interactive explainer. The happiness neutral
- As of today, a LinkedIn search for “llm psychologist” lists 9 people. I’m not alone!
  - Anand S, LLM Psychologist, Singapore, Singapore
  - Anshul Saxena, PhD, AI Advisor & Trainer | Technology Strategist | LLM Psychologist | Currently teaching humans, machines & business to work smarter through Generative AI and Quantum Computing | 15+ Years Experience, Pune, Maharashtra, India
  - Charitarth (Chad) Sindhu, LLM Psychologist / Fractional Business & AI Workflow Consultant/ Digital Nomad, Tokyo, Japan
  - Lancelot Salavert, LLM Psychologist, Barcelona, Catalonia, Spain
  - Lior Dor(Durahly), Team Lead | Bug Banisher | Ex 8200, Tel Aviv District, Israel. Past: R&D Team Lead and LLM Psychologist at Superwise | A Blattner Tech Company
  - maxime bodereau, Lead Creative Art Director | UX Forensics | Ai LLM Psychologist | Visual Alchemist | Codesmith | Brandologist | Full Stack Designer, Nantes, Pays de la Loire, France
  - Mei Chen 🦋, LLM Psychologist | Lead Product Engineer | Delivering Agentic Experiences, Toronto, Ontario, Canada
  - Shoshannah Tekofsky, LLM Psychologist at AI Digest, Zwolle, Overijssel, Netherlands
  - LinkedIn Member, LLM, psychologist, mediator, Prague, Czechia
- OpenAI acquired Astral! This will likely slow down the new wonderful tools accelerating the Python ecosystem. Like with PromptFoo and OpenClaw, this seems to be about talent. The “acqui-hire” mode seems a clear niche career path now, and an alternative to getting hired (you get a much higher salary) or getting acquired (you take on much higher risk).
- quickjs-emscripten lets you run isolated JS code securely in the browser, CloudFlare workers, NodeJS, and Deno. It compiles to WASM. @sebastianwessel/quickjs is a higher-level TS wrapper. Simon Willison
- Manyana is a CRDT based version control system. It sounds like a good idea but I’m sceptical because merge conflicts are a “what should I do” problem more than “how”. With agents doing more merge conflict management, I am not sure this will offer a concrete benefit - but probably no harm either.
- LLMs are able to post-train LLMs on new topics. They’re improving fast. Jack Clark
- Vibe Coding Fixer and AI Slop Cleaner are real job descriptions - which are morphing into enterprise offerings. But I still seem to be the only official LLM Psychologist.
- Notes from AI Services - Wrong Mental Models, Right Moment:
  - AI services has 3 markets. Automatable work: vanishes in 2 years. Human-in-the-loop work: sustains. Judgement-driven: grows in importance.
  - YC: don’t sell access to a tool for $50 a month, use the AI yourself and sell the finished work for $5,000.
  - Sell output. Price on outcome. Sell to business, not IT.
  - Sell accountability: proven success, with your guarantee.
  - Sell authenticity: a brand story representing uniqueness, character, … or whatever… something people respect.
- Data transfer between GPU and memory is a bottleneck and three approaches are emerging. But in the long run, commodity technology usually beats integrated stacks.
  - Taalas is etching LLMs into the chip. Llama 8b runs at 17,000 tok/s (H200 is at 230 tok/s).
  - d-Matrix is moving compute into SRAM memory chips. 30,000 tok/s for Llama 70b. Cerebras and MatX are similar: memory-oriented.
  - FuriosaAI minimizes data movement. Groq and Sambanova are similar.
- GPT 5.4 Nano ($0.2/MTok) and Mini ($0.75/MTok) are good options for bulk OCR, transcription, etc. as alternatives to Gemini Flash Lite and Gemini Flash, comparable in cost and quality. They can describe 75K photos for $50. Both models are better than GPT-5 Mini on most benchmarks.
- Cool AI coding agent git prompt fragments:
  - Use git bisect to find when this bug was introduced: …
  - Find and recover my code that does …
  - Sort out this git mess for me.
  - Rewrite history removing …
  - Split the last commit into multiple commits grouped logically.
  - Start a new repo at … and build just this module … based on … with a similar commit history copying the author and commit dates.
- Campaigns Are Knowledge Workers and the Tools Just Caught Up. A powerful framing. I saw this in action a few days ago when a friend was able to automate an outbound campaign with Claude Code.
- EARS (Easy Approach to Requirements Syntax) is a simple structure for requirements. For example, “Users should be able to drag tasks between columns. The app needs to work offline too. Handle errors gracefully.” becomes the following, which AI can produce for you and which makes errors easier to spot. State machines and decision tables are useful alternatives, too.
  - REQ-001 (Event): When the user drags a task card to a different column, the system shall update the task status to match the destination column.
  - REQ-002 (State): While the application is offline, the system shall store task updates in local storage.
  - REQ-003 (Event): When the application reconnects, the system shall synchronize locally stored updates with the server.
  - REQ-004 (Unwanted): If synchronization conflicts occur, then the system shall display a resolution dialog to the user.
- As of now, avoid using Claude.ai to create (large) visualizations. It runs forever and exhausts credits without generating anything. Claude Code works much better for this.
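The EARS patterns above are regular enough to template. Here is a small sketch, assuming only the three pattern types used in the example (Event, State, Unwanted); the `EARS_TEMPLATES` table and the `ears` helper are hypothetical names for illustration, not part of any EARS tool.

```python
# Sentence skeletons for the three EARS patterns shown in the example above.
EARS_TEMPLATES = {
    "Event": "When {trigger}, the system shall {response}.",
    "State": "While {trigger}, the system shall {response}.",
    "Unwanted": "If {trigger}, then the system shall {response}.",
}


def ears(req_id, kind, trigger, response):
    """Render one requirement in EARS form; raises KeyError on unknown kinds."""
    sentence = EARS_TEMPLATES[kind].format(trigger=trigger, response=response)
    return f"{req_id} ({kind}): {sentence}"


print(ears(
    "REQ-002",
    "State",
    "the application is offline",
    "store task updates in local storage",
))
```

Keeping the trigger and the response as separate fields is what makes gaps easy to spot: a requirement with no clear trigger, or a response that is not something the system does, fails to fit the template.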

Testing Pólya heuristics on AI Math

Terence Tao said, “We haven’t done many experiments … large-scale studies where we take a thousand problems and just test them.” So I told Claude: You know my style. Suggest some innovative experiments I could run. The first suggestion was cool! The Polya Audit. Polya’s How to Solve It lists 20 heuristics (work backwards, induction, analogy, etc.). Mathematicians treat these as wisdom. Nobody has ever measured which ones actually work, and on what problem types. ...

Hack of the Day on Times of India

Last Friday, 20 Mar 2026, this “Hack of the Day” was published by The Times of India. My agents generated it entirely automatically. Here’s how that happened. On 12 Feb 2026, I met Rohit Saran, Managing Editor at The Times of India. “Our biggest challenge is the starting challenge. What story to do?” he said. “We waste a lot of time and we starve stories because of this.” What if AI could help with that? We talked for nearly two hours - and left asking: “Should we do just a daily visual newspaper?” ...