Testing Pólya heuristics on AI Math

Terence Tao said, “We haven’t done many experiments … large-scale studies where we take a thousand problems and just test them.” So I told Claude: You know my style. Suggest some innovative experiments I could run. The first suggestion was cool! The Polya Audit. Polya’s How to Solve It lists 20 heuristics (work backwards, induction, analogy, etc.). Mathematicians treat these as wisdom. Nobody has ever measured which ones actually work, and on what problem types. ...

Sonnet 4.6 vs MiniMax M2.7

Based on several (i.e. two) recommendations, I subscribed to MiniMax. At $10/month, you get 1,500 requests every 5 hours and 15,000 every week. That’s a LOT! Using the same prompt I had Claude Code generate two data stories: The first paragraph, by Claude Sonnet 4.6 The first paragraph, by MiniMax M2.7 Here’s my comparison of the two. It’s partly based on Claude Opus 4.6’s comparison but I felt the same way. ...

Coding agents ARE the new software

Increasingly, I use coding agents instead of writing software. For example, I built a Blog UMAP. Then, I built Calvin UMAP. And more. But instead of building re-usable software, I just ran Claude with prior context. Increasingly, I use coding agents to run software. For example, I use Codex to classify my expense receipts. It writes re-usable code, but I run it using Codex, and it updates the code with new/edge cases. ...

The Nov 2025 Vibe Coding Ghost Revolution

I kept hearing that with the Nov 2025 release of Opus 4.5 and GPT 5.2 Codex, ex-coders were sprinting back to coding. On a sample of ~1,700 developers on GitHub, exactly ten fit the “dormant returner” profile. Here are a couple of examples: But they’re the exception. I could find only TEN out of 1,700 developers who returned. I also found a few who exited: To be fair, the vibe coding revolution is real, but maybe we are (I am) mis-interpreting it. ...

Calvin UMAP

Similar to the embedding map of my blog posts, I created an embedding map of Calvin & Hobbes. It uses the same process as before. Video

AI in SDLC at PyConf

I was at a panel on AI in SDLC at PyConf. Here’s the summary of my advice: Process Make AI your entire SDLC loop. Record client calls, feed them to a coding agent to directly build & deploy the solution. Record your prompts, run post-mortems, and distill them into SKILLS.md files for reuse. Prompting Ask AI to make output more reviewable. Don’t waste time reviewing unclear output. Prefer directional feedback (feeling, emotion, intent) over implementational. Also give AI freedom to do things its way. Learn from that - you’ll be surprised. Learning ...

Blog embeddings map

I created an embedding map of my blog posts. Each point is a blog post. Similar posts are closer to each other. They’re colored by category. I’ve been blogging since 1999 and over time, my posts have evolved. 1999-2005: mostly links. I started by link-blogging 2005-2007: mostly quizzes, how I do things, Excel tips, etc. 2008-2014: mostly coding, how I do things and business realities 2015-2019: mostly nothing 2019-2023: mostly LinkedIn with some data and how I do things 2024-2026: mostly LLMs … and this transition is entirely visible in the embedding space. ...

Hardening my Dev Container Setup

I run AI coding agents inside a Docker container for safety. The setup is dev.dockerfile: builds the image dev.sh: launches the container with the right mounts and env vars dev.test.sh: verifies everything works. I wrote them semi-manually and it had bugs. I had GitHub Copilot + GPT-5.4 High update tests and actually run the commands to verify the setup. Here’s what I learned from the process. 1. Make it easier to review. The first run took long. I pressed Ctrl+C, told Copilot to “add colored output, timing, and a live status line”. Then I re-ran. Instead of a bunch of ERROR: lines, I now got a color-coded output with timing + a live status line showing what’s running. ...

Recording screencasts

Since WEBM compresses videos very efficiently, I’ve started using videos more often. For example, in Prototyping the prototypes and in Using game-playing agents to teach. I use a fish script to compress screencasts like this: # Increase quality with lower crf= (55 is default, 45 is better/larger) # and higher fps= (5 is default, 10 is better/larger). screencastcompress --crf 45 --fps 10 a.webm b.webm ... To record the screencasts, I prefer slightly automated approaches for ease and quality. ...

Leaked key sociology

It’s impressive how easy it is to find leaked API keys in public repositories. I asked Codex to run trufflehog on ~5,000 student GitHub accounts and (so far, after a few hours, 15% coverage), it found quite a few. Some are intended to be public, like Google Custom Search Engine keys. 1 2 const GOOGLE_API_KEY = "AIza..."; const GOOGLE_CX = "211a..."; Some are Gemini API keys. 1 2 3 4 5 6 7 api_key1 = "AIza..." But what’s really impressive is, when I ran: GEMINI_API_KEY=AIza... curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-flash-preview:generateContent" \ -H 'x-goog-api-key: $GEMINI_API_KEY' \ -H 'Content-Type: application/json' \ -d '{"contents": [{"parts": [{"text": "Hi"}]}]}' … on most leaked Gemini API keys, I got: ...

AI video compression

I recorded a short screen cast of a demo I built. It was ~900KB - way too large to publish as a thumbnail. So I asked ChatGPT: What’s the best equivalent of squoosh.app for WEBM compression? I’m looking for a free modern high-quality online video compressor. There are a few, and they compressed it to a third of its size, but 300KB is still too large. So I attached the original and asked: ...

Organizing PDF receipts

One of my goals this year is to “Automate finance + tax”. Today, I took a baby step by organizing my expenses. This is my current process: STEP 1: Download PDF receipts (from OpenAI, Anthropic, Google, and other cloud/AI services) STEP 2: Organize them, so I know which receipt to upload against which expense STEP 3: Submit on SAP Concur. All steps are manual as of now. I automated STEP 2: Organize them. ...

TDS Jan 2026 GA1 released

Graded Assignment 1 (GA1) for the Tools in Data Science course is released and is due Sun 15 Feb 2026. See https://exam.sanand.workers.dev/tds-2026-01-ga1 If you already started, you might notice some questions have changed. Why is GA1 changing? Because some questions don’t work. For example: We replaced Claude Artifacts with a Vercel question because Claude won’t allow a proxy anymore. A question had unintentionally wrong instructions. (Some questions have intentionally wrong instructions, but those are, …um… intentional). Someone changed an API key. … etc. When will GA1 stabilize? Probably by end of day, Sun 9 Feb 2026? ...

Migrating TDS from Docsify to Hugo

This morning, I migrated my Tools in Data Science course page from Docsify to Hugo using Codex. Why? Because Docsify was great for a single term. For multiple terms, archives became complex. I still could have made it work, but it felt like time to move towards a static site generator. I don’t know how Hugo or Go work. I didn’t look at the code. I just gave Codex instructions and it did the rest. This gives me a bit more confidence that educators can start creating their own course sites without needing coding or platforms. Soon, they might not be stuck to LMSs either - they can build their own. ...

Gemini Scraper

Gemini lets you copy individual responses as Markdown, but not an entire conversation. That’s useful if you want to save the chat for later, pass it to another LLM, or publish it. So I built a bookmarklet that scrapes the entire conversation as Markdown and copies it to the clipboard. SETUP: Drag the bookmarklet to your bookmarks bar. USAGE: On a Gemini chat page, click the bookmarklet. It copies the chat as Markdown. ...

NPTEL Applied Vibe Coding Workshop

For those who missed my Applied Vibe Coding Workshop at NPTEL, here’s the video: You can also: Read this summary of the talk Read the transcript Or, here are the three dozen lessons from the workshop: Definition: Vibe coding is building apps by talking to a computer instead of typing thousands of lines of code. Foundational Mindset Lessons “In a workshop, you do the work” - Learning happens through doing, not watching. “If I say something and AI says something, trust it, don’t trust me” - For factual information, defer to AI over human intuition. “Don’t ever be stuck anywhere because you have something that can give you the answer to almost any question” - AI eliminates traditional blockers. “Imagination becomes the bottleneck” - Execution is cheap; knowing what to build is the constraint. “Doing becomes less important than knowing what to do” - Strategic thinking outweighs tactical execution. “You don’t have to settle for one option. You can have 20 options” - AI makes parallel exploration cheap. Practical Vibe Coding Lessons Success metric: “Aim for 10 applications in a 1-2 hour workshop” - Volume and iteration over perfection. The subscription vs. platform distinction: “Your subscriptions provide the brains to write code, but don’t give you tools to host and turn it into a live working app instantly.” Add documentation for users: First-time users need visual guides or onboarding flows. Error fixing success rate: “About one in three times” fixing errors works. “If it doesn’t work twice, start again-sometimes the same prompt in a different tab works.” Planning mode before complex builds: “Do some research. Find out what kind of application along this theme can be really useful and why. Give me three or four options.” Ask “Do I need an app, or can the chatbot do it?” - Sometimes direct AI conversation beats building an app. Local HTML files work: “Just give me a single HTML file… opening it in my browser should work” - No deployment infrastructure needed. “The skill we are learning is how to learn” - Specific tool knowledge is temporary; meta-learning is permanent. Vibe Analysis Lessons “The most interesting data sets are our own data” - Personal data beats sample datasets. Accessible personal datasets: WhatsApp chat exports Netflix viewing history (Account > Viewing Activity > Download All) Local file inventory (ls -R or equivalent) Bank/credit card statements Screen time data (screenshot > AI digitization) ChatGPT’s hidden built-in tools: FFmpeg (audio/video), ImageMagick (images), Poppler (PDFs) “Code as art form” - Algorithmic art (Mandelbrot, fractals, Conway’s Game of Life) can be AI-generated and run automatically. “Data stories vs dashboards”: “A dashboard is basically when we don’t know what we want.” Direct questions get better answers than open-ended visualization. Prompting Wisdom Analysis prompt framework: “Analyze data like an investigative journalist” - find surprising insights that make people say “Wait, really?” Cross-check prompt: “Check with real world. Check if you’ve made a mistake. Check for bias. Check for common mistakes humans make.” Visualization prompt: “Write as a narrative-driven data story. Write like Malcolm Gladwell. Draw like the New York Times data visualization team.” “20 years of experience” - Effective prompts require domain expertise condensed into instructions. Security & Governance Simon Willison’s “Lethal Trifecta”: Private data + External communication + Untrusted content = Security risk. Pick any two, never all three. “What constitutes untrusted content is very broad” - Downloaded PDFs, copy-pasted content, even AI-generated text may contain hidden instructions. Same governance as human code: “If you know what a lead developer would do to check junior developer code, do that.” Treat AI like an intern: “The way I treat AI is exactly the way I treat an intern or junior developer.” Business & Career Implications “Social skills have a higher uplift on salary than math or engineering skills” - Research finding from mid-80s/90s onward. Differentiation challenge: “If you can vibe code, anyone can vibe code. The differentiation will come from the stuff you are NOT vibe coding.” “The highest ROI investment I’ve made in life is paying $20 for ChatGPT or Claude” - Worth more than 30 Netflix subscriptions in utility. Where Vibe Coding Fails Failure axes: “Large” and “not easy for software to do” - Complexity increases failure rates. Local LLMs (Ollama, etc.): “Possible but not as fast or capable. Useful offline, but doesn’t match online experience yet.” Final Takeaways “Practice vibe coding every day for one month” - Habit formation requires forced daily practice. “Learn to give up” - When something fails repeatedly, start fresh rather than debugging endlessly. “Share what you vibe coded” - Teaching others cements your own learning. “We learn best when we teach.” Tool knowledge is temporary: “This field moves so fast, by the time somebody comes up with a MOOC, it’s outdated.”

Finding open source bugs with Ty

Astral released Ty (Beta) last month. As a prototyper, I don’t type check much - it slows me down. But the few apps I shipped to production had bugs type checking could have caught. Plus, LLMs don’t get slowed by type checking. So I decided to check if Ty can spot real bugs in real code. I asked ChatGPT: Run ty (Astral’s new type checker) on a few popular Python packages’ source code, list the errors Ty reports (most of which may be false positives), and identify at least a few that are genuine bugs, not false positives. Write sample code or test case to demonstrate the bug. ...

Migrating my blog from WordPress to Hugo

In 2009, I migrated from a self-made Perl static site generator to WordPress because it was slow, WordPress was dynamic and rapidly growing in features, and I wanted to write rather than code. (Also, I had plenty of time in 2009 for such things!) Over the years, problems crept in. Hosting costs ($200/year) for a slow server. No local writing - Windows Live Writer was dead. I wasn’t using most WordPress features. So it was time to migrate back to a static site generator. (Also, I now have plenty of time for such things!) ...

Using SVG favicons with Unicode

Browsers support SVG favicons as data: URLs. For example, this SVG: <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 32 32"> <circle cx="16" cy="16" r="15" fill="#2563eb"/> <path fill="#fff" d="m16 7 2 7 7 2-7 2-2 7-2-7-7-2 7-2Z"/> </svg> … can be: Compressed via svgomg Converted to a data: URL via svgviewer Inserted into HTML like this: <link rel="icon" type="image/svg+xml" href="data:image/svg+xml,%3Csvg%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%20viewBox%3D%220%200%2032%2032%22%3E%3Ccircle%20cx%3D%2216%22%20cy%3D%2216%22%20r%3D%2215%22%20fill%3D%22%232563eb%22%2F%3E%3Cpath%20fill%3D%22%23fff%22%20d%3D%22m16%207%202%207%207%202-7%202-2%207-2-7-7-2%207-2Z%22%2F%3E%3C%2Fsvg%3E"/> The fun part is that you can use text inside the SVG, styled as you wish: ...

Creating a favicon from SVG

I use a tiny SVG favicon.svg. <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 32 32"> <circle cx="16" cy="16" r="15" fill="#2563eb"/> <path fill="#fff" d="m16 7 2 7 7 2-7 2-2 7-2-7-7-2 7-2Z"/> </svg> It’s small enough that I usually inline it in HTML: <link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 32 32'><circle cx='16' cy='16' r='15' fill='%232563eb'/><path fill='%23fff' d='m16 7 2 7 7 2-7 2-2 7-2-7-7-2 7-2Z'/></svg>"> But sometimes I need a /favicon.ico because I don’t want to change the HTML (e.g. generated content, others’ code, too many files to change) and /favicon.ico is the default browsers look for. ...