June 29, 2025

We created data visualizations using only LLMs at my VizChitra workshop yesterday. Titled Prompt to Plot, it covered:

- Finding a dataset
- Ideating what to do with it
- Analyzing the data
- Visualizing the data
- Publishing it on GitHub

… using only LLM tools like #ChatGPT, #Claude, #Jules, #Codex, etc. with zero manual coding, analysis, or story writing.

Here are 6 stories completed during the 3-hour workshop:

- Spotify Data Stories: https://rishabhmakes.github.io/llm-dataviz/
- The Price of Perfection: https://coffee-reviews.prayashm.com/
- The Anatomy of Unrest: https://story-b0f1c.web.app/
- The Page Turner’s Paradox: https://devanshikat.github.io/BooksVis/
- Do Readers Love Long Books? https://nchandrasekharr.github.io/booksviz/
- Books Viz: https://rasagy.in/books-viz/

The material is online. Try it! ...

My VizChitra talk on Data Design by Dialog was on LLMs helping in every stage of data storytelling. Main takeaways:

- After open data, LLMs may be the single biggest act of data democratization. https://youtu.be/hPH5_ulHtno?t=01m24s
- LLMs can help in every step of the (data) value chain. https://youtu.be/hPH5_ulHtno?t=00m47s
- LLMs are bad with numbers. Have them write code instead. https://youtu.be/hPH5_ulHtno?t=06m33s
- Don’t confuse it. Just ask it again. https://youtu.be/hPH5_ulHtno?t=05m30s
- If it doesn’t work, throw it away and redo it. https://youtu.be/hPH5_ulHtno?t=20m02s
- Keep an impossibility list. Revisit it whenever a new model drops. https://youtu.be/hPH5_ulHtno?t=20m02s
- Never ask for just one output from an LLM. Ask for a dozen. https://youtu.be/hPH5_ulHtno?t=22m20s
- Our imagination is the limit. https://youtu.be/hPH5_ulHtno?t=26m35s
- Two years ago, they were like grade 8 students. Today, a postgraduate. https://youtu.be/hPH5_ulHtno?t=00m47s
- Do as little as possible. Just wait. Models will catch up. https://youtu.be/hPH5_ulHtno?t=31m45s

Funny bits: ...

How long have you made ChatGPT think? My highest was 6m 50s, with the question: Here are vehicle telematics stats for 2 months. Unzip it and take a look. Find interesting insights from this data. Look hard until you find at least 5 surprising insights from this. The next largest thinking block (5m 42s) was where I asked: I would like to explore parallels to the current phenomenon where intelligence is becoming too cheap to meter. Historically, both in recent history as well as over ancient history, what technologies have made what kind of tasks so cheap that they are too cheap to meter? Give me a wide range of examples ...

How long can I make ChatGPT think?

Jack Clark’s Import AI 414 shares a Tech Tale about a game called “Go Think”: … we’d take turns asking questions and then we’d see how long the machine had to think for and whoever asked the question that took the longest won. I prompted Claude Code to write a library for this. (Cost: $2.30). (FYI, this takes 2.3 seconds in Node.js and 4.2 seconds in Python. A clear gap for JSON parsing.) ...

Things I Learned - 29 Jun 2025

This week, I learned:

- “People are great at feedback on what you are doing wrong. They are not so good at telling you how to fix it. They don’t know you that well.” Amit Kapoor
- Perfect Cursors makes periodic cursor positions animate smoothly by interpolating on a spline.
- Cloudflare and Vercel now support sandboxes where you can execute code. The price is not so low that we can execute for free in bulk, but it works well for infrequent or batched code execution. Simon Willison
- Here’s how I’m using ffmpeg for video recording & editing. To record screen at 5 frames per second, I run an abbreviation screenrecord which maps to:
- Gemini CLI has a generous free tier and uses Bootstrap over Tailwind. Ref #ai-coding
- Cloudflare has a native agents SDK that looks good, especially for Cloudflare users. Ref
- There are several brands with recognizable chart style guides. It’s possible to generate style guides for these from the charts, but applying them via matplotlib is almost #impossible today. ChatGPT
- Hyperfine is like %timeit for the shell. Written in Rust ⭐
- Vertical AI is a moat against AGI. Specialization reduces hallucinations. Custom workflows and regulations are sticky and defensible. We need to start selling to users, not IT, though. Ref
- When AI automates a task, the bottleneck shifts. AI process re-design is about reworking the process around the new bottleneck, and iterating quickly. With coding, it’s testing, reviewing, deploying, use-case identification.
- uvx git-smart-squash re-organizes haphazard commits using LLMs. git-smart-squash #ai-coding
- GitHub offers a free Docker container registry. Simon Willison
- There are three major areas where humans either are, or will soon be, more necessary than ever: trust, integration and taste – NYT. Anil. To deal with this:
  - Learn things that might grow in importance, like:
    - Data modeling
    - APIs
    - Code reviews
    - Drawing and 3D modeling
    - Narrative storytelling
    - Design
    - Movie making
    - Statistics
    - Sceptical fact checking
    - Continuous AI auditing e.g. awesome-continous-ai or automated-auditing
    - Zero knowledge proofs
    - Homomorphic encryption
    - Privacy-preserving computation
    - Fingerprinting and watermarking
    - Governance frameworks
    - Ethics and AI dilemmas
    - Negotiation
    - Change management
    - Remote working, management, hiring
    - Creating attention scarcity
    - Local cultures
  - Work with people of growing importance
    - People designing products in regulated industries
    - Cross domain experts
    - Art developers, game makers, designers
    - System thinkers. Economists, ecologists, system planners. People who look for second order effects.
  - Live in cities that might play a bigger role in the future
    - Cities like Singapore, to learn how it builds civic trust and creates digital IDs.
    - Cities like Bangalore and Hyderabad, to learn how they grow tech talent
    - Creative cities like Paris, Seoul, Mexico City, Berlin, etc. on sabbaticals to taste hubs
  - Try to:
    - Build auditing credentials and IP
    - Audit your calendar for what AI can do. Have it interview you
    - Practice sceptical fact checking and audit
- A clever way to test a library’s quality is to have LLMs write code from docs and test it. Failing libraries have flawed code/docs. Improve. Ref #ai-coding
- Common Pile is an 8TB open dataset for LLM training that includes ArXiv, PubMed, StackExchange, GitHub, IRC, Regulations.gov, Patents, UK parliament, books. Easier than scraping.
- A useful way to have reasoning models do deep-research-like work is to have them “First, create a plan to solve the problem, clearly listing the objective, approach, and output. Then follow the plan.”
- DE-COP is a method to check if LLMs were trained on private content. GPT-4o was trained on O’Reilly books, based on this method. Ref
- LLMs are more persuasive than humans. But repeated exposure reduces the effect. Ref
- Phoenix.new uses live views to publish apps as it codes. The testing framework looks at the screen while it codes and fixes errors. It commits every change.
- An Anthropic system prompt asking Claude to pursue its goals led to self-preservation behavior. Ref
- The hungrier I am, the better the food tastes. A good reason to eat less, and less often.
- You can purge the jsDelivr cache manually. Helps if you released a new version of a package and want to purge an alias (e.g. https://cdn.jsdelivr.net/npm/your-package@1).
- XConvert is a convenient online app to compress .webm videos. Not great design but fairly good compression.
- You can draw a treemap of import times via python -X importtime app.py 2> timing.txt (the timings are written to stderr, not stdout) and then paste them at https://kmichel.github.io/python-importtime-graph/.
- PyOpenLayers adds interactive mapping via OpenLayers to Marimo and Jupyter.
- In a TechCrunch interview, Jared Kaplan was asked if Anthropic is becoming less safety conscious because they released Opus 4, which blackmails. Kaplan replied that they have stronger testing and higher transparency, so they’re more likely to share AI dangers early. Great positioning! Conversations are about perspective change and this nailed it.
- The system prompts for Anthropic misalignment evals are a fascinating read.
- AI PR Watcher tracks GitHub pull requests from Codex and other LLMs. Codex is way ahead of anything else on volume and success rate. Devin is next on volume, Cursor is next on success rate.
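For the python -X importtime tip above, here is a minimal sketch of reading the timings without the treemap page. It assumes the usual row layout of "import time: self [us] | cumulative | imported package" as captured from stderr. The helper names (parse_importtime, slowest) and the sample rows, including hypothetical_slow_module, are made up for illustration and are not from any library.

```python
import re

# Matches data rows such as "import time:       157 |        400 |   json.decoder".
# The header row ("self [us] | cumulative | ...") has no leading digits, so it is skipped.
ROW = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(\S+)\s*$")


def parse_importtime(text):
    """Return a list of (module, self_us, cumulative_us) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            self_us, cumulative_us, module = match.groups()
            rows.append((module, int(self_us), int(cumulative_us)))
    return rows


def slowest(rows, n=5):
    """Return the n rows with the largest self time, slowest first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# Illustrative sample only; real output comes from the file captured from stderr.
SAMPLE = """\
import time: self [us] | cumulative | imported package
import time:       120 |        120 |     json.decoder
import time:        80 |        200 |   json
import time:       950 |        950 |   hypothetical_slow_module
"""

if __name__ == "__main__":
    for module, self_us, cumulative_us in slowest(parse_importtime(SAMPLE), n=3):
        print(f"{module}: self={self_us}us cumulative={cumulative_us}us")
```

To use it on a real run, read timing.txt and pass its contents to parse_importtime in place of SAMPLE.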