November 2025

Things I Learned - 30 Nov 2025

This week, I learned: Warp has a terminal agent feature - allowing Warp to control a terminal via text. I find that regular coding agents like Codex can do that too with tmux. For example, I opened a session and had Codex run commands in it while I watched. Here’s the guidance it needed: # Create a new session tmux new-session -d -s $SESSION 'uv run --with pandas,httpx,lxml python -iqu' # Capture output to a log file tmux pipe-pane -t $SESSION -o "cat >> /tmp/$LOG" # Run a command tmux send-keys -t $SESSION 'print(1 + 2)' C-m # See output cat /tmp/$LOG # Capture the last 5 lines of the pane tmux capture-pane -p -t $SESSION -S -5 Notes from Early science acceleration experiments with GPT-5 - via Claude LLMs are accelrating research because they are good at: Literature search, especially across disciplinary boundaries Generating and checking routine calculations Proposing variations on known techniques Identifying connections between disparate results Producing first-draft code for well-specified problems Explaining why certain approaches won’t work But they’re curently struggling with the following - though it’s a shrinking space Genuinely novel conceptual leaps (but this is increasingly happening, e.g. Sawhney and Sellke’s problem #848) Recognizing when it’s plagiarizing, e.g. when it “discovered” a proof for the Chevalley-Warning theorem which was copied from a Noga Alon paper - it wasn’t conscious of this Knowing what it doesn’t know Distinguishing important problems from unimportant ones Understanding the “negative space” of mathematics (why certain problems are hard, why obvious approaches fail) Anthropic introduced three excellent tool use practices that I expect will be adopted widely. Tool search: Don’t pass the tool definitions to the model. Model can ask for a tool search when needed Programmatic tool calling: Instead of calling a tool, it’ll return a Python program to execute that will call the tools! This is a huge win Tool use examples: Lets you specific examples of tool calls to guide th model better The Hacker News thread flags that CLIs solve these - but CLI updates are hard, while APIs auto-update. With AI, some skills that beome more valuable are (and will soon be in short supply, hence need to be taught) are: # Problem formulation (“What question should we actually ask?”) Traits: Curiosity (absolutely), systems thinking, comfort with ambiguity, metacognition (thinking about your thinking) Practice reframing exercises (“What are 5 other ways to frame this?”), study great questions in your field, work backward from outcomes, learn adjacent domains. The “5 Whys” technique helps. Also: deliberately pause before diving into solutions—force yourself to spend time in the question space. Taste and judgment (“Is this response appropriate?”) Traits: Pattern recognition from experience, cultural literacy, empathy, contextual awareness, aesthetic sense How to strengthen: Immerse yourself in excellent examples, study spectacular failures (they’re more instructive!), get feedback on your calls, practice explaining why you made a judgment. Build a “swipe file” of great/terrible examples. The key is volume—you need lots of reps. Quality assessment (“Is this AI output correct?”) Traits: Healthy skepticism, attention to detail, domain knowledge, logical reasoning, understanding of edge cases How to strengthen: Study common AI failure modes, build verification checklists, practice the “does this make sense?” test, learn what “good” looks like in your domain, cross-reference claims. Develop your “bullshit detector” by analyzing why wrong answers feel wrong. Creative synthesis (“How do these ideas connect?”) Traits: Associative thinking, wide knowledge base, playfulness, comfort with non-obvious connections, intellectual courage How to strengthen: Consume diverse inputs outside your field, practice analogical thinking (“X is like Y because…”), use visual thinking tools like concept maps, study how innovations happen in other domains, give yourself permission to make weird connections. Read broadly—fiction, history, science. Domain expertise (“Does this solution work in reality?”) Traits: Deep curiosity, persistence, willingness to get hands dirty, learning from failure, long-term commitment How to strengthen: Deliberate practice on real problems, seek mentorship, study edge cases and failure modes, build things (don’t just read about them), learn your field’s history. The “10,000 hours” thing is real, but it’s quality hours that matter. Meta pattern: Reflection loops: doing something, then analyzing why it worked/didn’t. Exposure to excellence: you can’t develop taste without seeing great work. Some more new CLI tools I installed: trash-cli: Alias rm to move files to trash instead of deleting permanently. After a week of seeing ligatures in Fira Code, all other fonts look ugly. My favorite ligatures: !== ==> =» <–> (and every possible arrow) >= ||> ||- |- … The first name, alphabetically (at least among Straive employees) is “Aabida” and the last is “Zyrene”. Something I would never have discovered working in a smaller company. chokidar-cli is an easy way to run commands when files change, e.g. npx -y chokidar-cli '**/*.js' -c 'npm run build' npx -y mapscii shows a map on the terminal. Not too useful, not maintained, but very interesting. termsvg converts asciinema .cast files to animated SVG suitable for embedding in GitHub (e.g. via mise x github:MrMarble/termsvg -- termsvg export file.cast --minify). The animated SVG is ~10X larger than the .cast file. The GZipped size is fine but saving it as .svgz is not recognized by GitHub. In contrast, agg, the official asciinema-to-GIF converter, creates .GIF files that are only 5X larger. The most efficient seems to be embedding via asciinema.org usql queries MySQL, Postgres, SQLite, MSSQL, Oracle, etc via a single interface. For example, usql 'mysql://rfamro:@mysql-rfam-public.ebi.ac.uk:4497/Rfam' -c "SELECT * FROM clan limit 3;". But DuckDB is more versatile, IMHO. INSTALL mysql; LOAD mysql; ATTACH 'host=mysql-rfam-public.ebi.ac.uk port=4497 user=rfamro database=Rfam' AS rfam (TYPE mysql); SELECT * from rfam.Rfam.clan LIMIT 3; SELECT * FROM 'file.xlsx' LIMIT 3; SELECT * FROM 'file.csv' LIMIT 3; Autistic and allistic people just have different communication styles. Autistic people have no trouble understanding other autists. They just happen to be in a minority which makes it seem like they have a social deficit. Conflict between Neurotypes 1 second = 10 tokens for OpenAI Realtime APIs. 1 second = 25 tokens for Gemini Live API 39 cents / hour on GPT Realtime Mini = 36 cents audio input + 3 cents text output 139 cents / hour on GPT Realtime = 115 cents audio input + 15 cents text output 30 cents / hour on Gemini 2.5 Flash Native Audio (Live API) = 27 cents audio input + 3 cents text output Here are some AI experiments I’m planning to try with our marketing team: Video Generation: Create marketing videos from text scripts in minutes Poster Generation: AI designs high-conversion posters from brief text inputs - notably Nano Banana Pro Synthetic Persona A/B Testing: LLM agents simulate 100K+ user behaviors to test designs before real users LLM-Powered A/B Automation: AgentA/B system runs experiments with AI-simulated traffic Vibe Coding Landing Pages: Marketers build production-ready pages in hours vs weeks On-demand Landing Pages: Generate pages for automated campaigns/products without human intervention Brand Voice Cloning at Scale: Train on company content to ensure consistency across 1000s of pieces Persona-Driven Content Synthesis: Use 1B+ personas to generate diverse content perspectives Competitive Intelligence Briefing: Real-time monitoring across millions of data points + data storytelling Marketing Analytics with LLMs: AI agents analyze complex datasets for insights Brand Compliance Checks: Ensure all content meets brand guidelines automatically Autonomous Blog Squads: AI agents identify trending topics / internal content, create data stories ready for review New skill unlocked: creating tutorials from talk proposals. I asked Claude to Write a Malcolm Gladwell article based on this talk description to teach me the topic and passed it this talk proposal: Your Causal Parrot might be lying to you. The story it wrote is very engaging and informative! LLMs “understand” causality because of training, but lack a world model to extrapolate to new situations. Giving them tools to reason (e.g. causal models, sub-agents to explore root causes) will help. A cool Gemini 3 Pro hack: convert satellite imagery into stylized maps! Bilawal Sidhu Running sub-agents in tmux helps avoid timeout cancellation, and hence allowing resuming Peter Steinberger

PC Dream Machine Specs across 30 years

In 1995, I wrote down the specs for my "dream machine". Comparing it against the machine I have today: Item19952025IncreaseRAM32 MB64 GB2000GPU RAM16 MB8 GB500HDD4 GB1 TB250HDD speed10 MB/s2 GB/s200Processor150 MHz5.10 GHz34Monitor21"27"1.3Resolution2048x15361920x12000.73 Clearly, RAM has seen the biggest growth. Low cost, high demand.Followed by the hard disk - both on capacity and speed. The processor speed increase, in comparison, is modest. What's surprising is that my monitor today isn't that much bigger than what I wanted. The resolution is actually lower than what I wanted 30 years ago! Clearly, I overestimated how important screen resolution would be. ...

Patterns for Short Code

I had Claude Code create a PR to update my Unicoder tool. As part of that, I prompted it to repeatedly: Shorten the code by exploring opportunities to rewrite more elegantly. Finally, I asked it: You applied several changes to refactor the code for elegance and brevity. What were the principles you applied? List them all with examples. The objective is to teach me how to shorten and simplify code elegantly. Its response was a good tutorial on refactoring to shorten and simplify code. ...

Mapscii Rendering

I asked Claude Sonnet 4.5 on GitHub Copilot: How does mapscii render vector tiles as text blocks on the terminal? What are the libraries it uses and how does it use them? Explain as a detailed tutorial with examples so that I can learn how to rasterize vector shapes on the terminal. Below’s the explanation. What’s noteworthy to me is: How it uses Braille characters to get 8 pixels per character cell (2x4 grid). How it fills polygons using (a) triangulation and (b) getting all edge points using Bresenham’s algorithm, then (c) filling horizontal spans between edge pairs. I’ll help you understand how mapscii renders vector tiles as text blocks on the terminal. Let me search through the repository to find the relevant code and understand the architecture. Now let me get more specific files to understand the rendering pipeline better:# How MapSCII Renders Vector Tiles as Text Blocks on the Terminal ...

Things I Learned - 23 Nov 2025

This week, I learned: Here are some new CLI tools I installed: vd (visidata): Terminal spreadsheet viewer & editor for CSV, Excel, JSON, SQL, Parquet, etc. qsv: Fast CSV command line toolkit for slicing, filtering, aggregating, and analyzing CSV files. rga (ripgrep-all): ripgrep that searches PDFs, Office docs, EPUBs, zip files. pdfcpu: PDF processor for splitting, merging, optimizing, and manipulating PDF files. gum: Stylish CLI tool for creating interactive prompts, confirmations, and more. Models read pretty fast, consuming input tokens at ~4K-20K words per second. It’s the “speaking” (output token rate) that is the bottleneck. So shortening input doesn’t matter as much as shortening output for latence. ChatGPT When building agents, as of now, prefer native provider SDKs (OpenAI Agents SDK, Anthropic SDK) over even light abstractions like Vercel AI SDK or Pydantic. There are subtle issues related to error messages, response handling, cache handling, etc. that trip up abstractions given how early things are. Armin Ronacher Gone are the times when LLMs couldn’t do mental math. Now they’re computing base64 and SHA256 from memory, without needing code! Example Organizing a round table event in Singapore costs ~$75-150. Here’s what drives the cost variation # 50%: brand/location. 25%: food and beverage. 15%: duration (full day is only slightly more expensive than half day) 10%: date, demand, etc. 10%: add-ons: AV, etc. OpenRouter supports embedding models. BGE base seems pareto optimal with 0.5 cents / MTok and a good MTEB ranking. TOON vs JSON. Early days, and TOON seems to be marketing a lot, so I’m wary, but for large tabular data where input tokens are crunched, it seems a readable alternative to multiple CSVs, but not worth the hype. 0 19 Nov 2025. Always use GPT-5.1-Codex-Max instead of GPT-5.1-Codex. At every thinking level, it takes fewer tokens for similar or higher accuracy. Tibo ug -i --smart-case --bool 'word1 word2 ...' seems the cleanest way to find files that have all words. –smart-case uses case-insensitive if all words are lowercase, else case-sensitive. Examples: ug --bool '"exact phrase" word2' # exact phrase + other tokens anywhere ug --bool 'word1 word2 -word3' # must contain word1 AND word2, but NOT word3 ug --bool '("foo bar") OR baz' # grouped expressions and OR ug --bool 'word1 NEAR/5 word2' # match when words are within 5 tokens/words ug -Z2 'word' # allows up to 2 typos in 'word' ⭐ ug -i --smart-case --bool -Q lets you interactively search within files. This is the coolest feature! Fixing laptop issues is clearly a whole lot easier with an AI chatbot. I fixed these Ubuntu issues purely using Claude. It told me what to run. I ran it, shared the output, it diagnosed, told me what to do next, etc. until the issues were fixed. For example: My keyboard shortcuts stopped working. It turned out I edited my media-keys.dconf and removed the trailing slash. # A 3-finger tap mapped to a middle click and I couldn’t remove it. It turned out my touchegg.conf explicitly had this mapping. I disabled it. # My gnome extensions would get disabled every time the screen went to sleep. It turned out my extension cache was corrupted or stale. sudo apt install --reinstall gnome-shell-extension-manager and rm -rf ~/.cache/gnome-shell/ fixed it. # GhostScript seems the best way to compress PDFs via the CLI. Example: gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf Pandoc supports Lua filters which are a powerful way to customize the document conversion process. Here is a Lua filter that converts horizontal rules in a markdown document to page breaks and preserve in a Word document (OpenXML format) function HorizontalRule() return pandoc.RawBlock('openxml', '<w:p><w:r><w:br w:type="page"/></w:r></w:p>') end readpst - via sudo apt install pst-utils - extracts emails from Outlook PST files to mbox format. Useful for email migrations. Write tutorials or blog posts as you learn. Steve Klabnik Running a coding agent post mortem, e.g. “what worked well, what didn’t, and why? Next time, what are a few bullets I could include that will avoid these problems?” helps me prompt better next time. For example, Claude Code suggested: Use Firefox for headless browser automation (Chromium often crashes) Set HOME=/root when running Playwright with Firefox Start a local HTTP server rather than using file:// protocol External images may not load in screenshots due to network isolation

Thanks Pratap Vardhan – this was my best birthday gift this year! LinkedIn

Nano Banano Pro has excellent text generation (though it doesn’t always give you what you want in the first try). I couldn’t spot any errors in the generated text. Can you? I used this prompt (with the workshop details and my photo): Create a professional poster for the below, including all relevant information. Use my photo (attached) professionally. The NPTEL workshop is real, BTW. First 100 seats, I think. You can register here: https://elearn.nptel.ac.in/shop/iit-workshops/ongoing/computer-science/applied-vibe-coding-workshop/ ...

While meditating, I realized 75% of “LULL” is the letter “L”. (This sort of thing happens a lot when I meditate.) MUMMY (60% M) and DADDY (60% D) have lower percentage, but are longer, so maybe get a bonus? I asked Claude Code what would top such a list. It picked a dictionary, generated the 333 words with 4+ letters and >50% concentration. What did I like best? “ASSESSES”. 5/8 letters are “S”. That’s nearly two-thirds. ...

Things I Learned - 16 Nov 2025

This week, I learned: Windows 11 got some very practical updates. Notepad now supports Markdown preview natively. MS Paint has an opacity filter. Microsoft Copilot can share screens and speak/listen. Things I learn when Ubuntu drivers crashed on my laptop: The SG.GS Ubuntu ISO mirror is a lot faster than the official Ubuntu ISO download (5 min vs 12 hours). Rufus and balenaEtcher are the de facto tools for bootable USB drives from ISO. Gemini 2.5 Flash Image is not great at generating text. But a clever a workaround is to provide the rendered text as an image input! Also, Gemini 2.5 Flash Image seems to ignore commands that try style transfer (e.g. “turn me into Studio Ghibli”). GemImg FLIP animation is an efficient animation technique. Capture the First position Apply the Last position (changing position, size, rotation, etc.) Invert, i.e. apply just the transform that’ll move it back to the First position Plan the animation. This only needs to change transform, hence no DOM reflow. Asking coding agents to create a codemod for large-scale refactoring works well Peter Steinberger When to quit vs persist. # # Do stats/signals support positive outcome? QUIT if not. Crossed any limits you set for yourself? QUIT if so. (Run pre-mortems to find these stats/signals and limits.) Is the decision hard to reverse AND uncertainty high? QUIT if so. Else you can experiment cheaply. (Create reversibility.) Are youI continuing because of past effort or pride? QUIT if so. (Set review cadence.) Is there a better alternative? SWITCH if so. (Get outside help.) Once a model generates an output, an agentic look tends not to change the fundamental approach and just tweaks it. So, if a solution is directionally wrong, restarting works better than iterating. Agentic Pelican on a Bicycle Reading between the lines on the Microsoft OpenAI deal: Microsoft values OpenAI’s growth (financial return) than control Neither trusts the other enough to decide what’s AGI Microsoft gets some wins: models until 2032 (even post AGI) as well as research IP. Both parties expect AGI between 2027-2030. OpenAI keeps all consumer hardware - so is betting hard on hardware. It’s more Apple than Microsoft territory Divorce preparation: Microsoft can pursue AGI with other partners. OpenAI can purchase compute from anyone and release open weights models. Infra has more value than model dev! OlmoEarth is a set of image models trained on labelled geospatial data. That’s useful for deforestation and land cover monitoring, wildfire detection, urban growth monitoring, crop mapping, etc. The models are open weights and can be fine-tuned. Claude Code’s output styles are a way of using Claude Code for anything (e.g. writing, analysis, research, personal advice, etc.), not just coding. Create a ~/.claude/output-style/your-style-name.md and run /output-style your-style-name to replace the system prompt will be replaced. You can also use the --system-prompt and --append-system-prompt flags with the CLI. Following Ethan Mollick’s lead I asked: I can travel back in time to any time before 1500 in India and change only one thing. What is the single thing you would change? Nothing obvious.. ChatGPT: Create a single, simple, phonetic script for all public life in India around 1100 CE. Claude: institutionalize systematic historical recordkeeping, introduce limited liability commercial entities, and mandate systematic translation of Sanskrit technical texts into all major regional languages. How about now? ChatGPT suggests: make all public rules and records computable by law. Claude suggests: make all state-level entitlements and civil documentation fully portable across India. For the first time in history, Russian troops surrendered to a wheeled drone that carried 138 pounds of explosives - Washington Post. Given the cost and accessibility of drones, I guess drone terrorist attacks will soon emerge. HTML + JS apps will last longer than server-side apps and it makes sense to write more of those. For essential back-end services, keep them generic. Specific services layers I see are: Auth (e.g. Google Auth, Auth0, Supabase, …) Storage (e.g. Supabase, Firebase) LLMs (e.g. OpenAI, Claude, OpenRouter) Communications (e.g. EmailJS) … #TODO Extend with LLMs https://gistpreview.github.io/ is an unofficial GIST preview tool. It accepts a ?GIST_ID and displays the gist as a standalone HTML page. Simon Willison XSLT is deprecated in Chrome. So the <script> tag in XML will become the new way of rendering RSS/Atom. This is one of the rare “break-the-web” changes from browsers. Simon Willison “India has absurdly low internal migration - around 9% annual migration rate versus 25-30% in China or the US. Not because people don’t want to move, but because the cost of moving is artificially massive. You lose your ration card, state entitlements, kids’ school continuity, voting rights, …” # Rolf Dobelli’s The Not To-Do List is a good application of inversion. Also, the chapter titles themselves explain most of the message, which is very helpful. Just thinking about any of these can be a useful path to improvement. Let things fall apart Feed your weaker self Be unreliable Be an asshole Have high expectations Drift through the day Mess up your marriage Be a quitter Be hypocritical Cling to your bad habits Set the wrong goals Drink yourself miserable Get involved in other people’s drama Only learn from your own experience Be hyperactive on social media Indulge in road rage Surround yourself with negative people Micromanage your neighbours Say yes to drugs Get stuck in your career Never be playful Feel guilty Practise ingratitude Trust your banker Be paranoid Make other people feel unimportant Live in the past Listen to your inner voice Expect rationality Get nihilistic Catastrophize Consider money unimportant Cultivate a victim mentality Become a lapdog Get rich quick, get smart quick Ruminate Trade your reputation for money Never suffer Let your emotions define you Try to end it all Marry the wrong person – and stay with them Celebrate your resentment Join a cult Try to change people Say everything you think Spin multiple plates Do only shallow work Invite bad people into your life Go where the competition is strong Say yes to everything Crowd your life with gadgets Fall into the content trap DeepSeek-V3.2-Exp has linear inference time, i.e. longer inputs don’t take longer time. It picks the top 2K most relevant tokenss from the input instead. This can make model inference cheaper and faster. California’s Bill AB 316 makes the people who build autonomous systems liable for their actions. That’s quite a step. Udio and Universal are launching a platform to generate music in the style of famous artistes. An interesting new way to monetize. Fingerprinting music is a hot area. VaultGemma shows a fine-tuning approach that eliminates personal info that appears only once from memorization. It works by adding noise to weights and capping weights updates so that no one example has undue influence. Model quality is mostly the same. Amazon is giving drivers smart glasses to scan packages, get directions, capture proof of delivery and detect hazards. Cool! TechCrunch ⭐ Over 3 months, I’ve recorded ~180 calls. Processing each costs ~1.25 cents (GPT-5) and 1 year’s conversations cost ~$9. That’s incredible value for money if I hired GPT-5 / Codex as a data-driven personal coach to guide me on: What are my blindspots? That is, feedback people share with me that I ignore? What are the clusters of persona that I interact with and which of these have a positive and negative influence on me? Where am I am being unreliable? Where am I being an asshole? Where are my expectations high? Where are they low? Where would the opposite have helped? Where do I quit early? Where do I persist? Where would the opposite have helped? What good habits should I continue? What bad habits should I stop? What are the strongest opportunities to thank or praise that I missed? Is there a pattern? What triggers could I use to build this habit? Where have I tried to change people? Where have people tried to change me? Where have I spotted wrong questions? That is, rather than answering the question, I spotted the more apt question and answered that instead? … and a hundred other questions that I wouldn’t even know to ask. Sub-agents can run parallel / independent tasks while keeping the context window small. (But the advantage over xargs seems marginal.) Simon Willison Document, lint, type-check, add test cases (or other similar tasks) for all folders in a monorepo. Research and create a report for each topic in */RESEARCH.md. Synthesize learnings from each conversation in transripts/*.md. “If you’re signed into sensitive accounts like your bank or your email provider in your browser, simply summarizing a Reddit post could result in an attacker being able to steal money or your private data.” Brave OpenAI Atlas has a “Watch Mode” that will stop working if you move away from that tab. Useful to keep an eye on sensitive sites. Simon Willison “… image editing platforms seem like they’ll eat and subsume Photoshop… modern image editors – especially Nano Banana from Google Gemini – … they’re extremely effective and, increasingly, instructable” - Import AI. Facebook now suggests edits to photos - TechCruch. WebPerl runs Perl in the browser via WebAssembly. Simon Willison

When I realized Aishwarya Rai begins and ends with AI, I had to find out if there were more like her. It took a coding agent (Claude Code in this case) 10 minutes to find the 10 celebrities who share that distinction, at least across the 24,086 names on Wikipedia: Ai Nagai - Japanese playwright Aiguo Dai - Chinese-American atmospheric scientist Ai (poet) - American poet Aisea Nawai - Fijian rugby player Ai (singer) - Japanese-American singer Aisha Chughtai - Pakistani actress Aiyappan Pillai - Indian social reformer Aizawa Seishisai - Japanese Confucian scholar Ainmuire mac Sétnai - Irish high king Aisha Yousef al-Mannai - Qatari artist Glory be to these AI bookends! ...

Habits of a code addict

AI can be held to account

“Humans can be held to account. Not AI.” I hear this often. But it’s not true. Corporations are non-human, but they can enter into contracts and face criminal charges. Ships can be sued directly. Courts can arrest the vessel itself. Deities and temples in India can own property. Forests and rivers in New Zealand, Colombia, Spain, have been granted legal personhood. Medieval Europe has held animal trials (e.g. for “guilty” pigs). ...

I always wondered why old movies are rated so high on IMDb. For example, 12 Angry Men (1954) with just ~900K votes ranks about as high as Inception (2010) with ~2M votes. Few people I know have seen 12 Angry Men. So where does this high rating come from? My theories were: Old movies really are that good. IMDb’s algorithm is biased towards old movies. People remember older movies fondly. Actually, it’s none of these. It’s selection bias. ...

If a bot passes your exam, what are you teaching?

It’s incredible how far coding agents have come. They can now solve complete exams. That changes what we should measure. My Tools in Data Science course has a Remote Online Exam. It was so difficult that, in 2023, it sparked threads titled “What is the purpose of an impossible ROE?” Today, despite making the test harder, students solve it easily with Claude, ChatGPT, etc. Here’s today’s score distribution: ...

Things I Learned - 09 Nov 2025

This week, I learned: “But when an identity based belief was challenged, the brain responded as if under physical attack.” Why Engineers Can’t Be Rational About Programming Languages Notes from How to build a cult, Lulu Cheng, The Knowledge Project podcast Conviction is infectious. Communicate at the INTERSECTION of interests. Learn theirs Begin with “why your story matters to them” (first sentence). That beats “how you tell it” > “where you tell it”. The easiest way to align with an audience is to find your community. Humor, curiosity, awe, any strong emotion is a hook. Culture has momentum. Best way to break it is to show an alternative that works. People will copy that REPEAT messages over and over with complete CONVICTION to convince people who TRUST you. That works, but you need all three. Trust builds from likeability, repeated exposure, common beliefs. An excellent way to defend against online criticism (when it matters) is to just SHOW UP and THANK them for feedback. Serious reputational damage must either be fixed immediately - or you live with it forever. Between a story and statistics, the story will always wins. Never fight a story with a statistic. Dig into your statistics and uncover BETTER stories. ⭐ Prebuttals are a great idea. Start with all possible criticisms yourself and diffuse them. The other person has nothing left to say Sparring keeps you sharp. Spar with LLMs. To defend, show how the attack targets other people, increasing the surface area. Show how the SPECIFIC attack targets a larger group. Create a SPECIFIC cause worth fighting for. Each role has specific objective to optimise for. The leader’s role is to balance across these. Cheerleader effect. People look beautiful next to a cheerleader. Associations taint. Each person has dozens of aspects to their persona. We cannot remember all of them. Each person can make a choice on who they project themselves to be in any group. Shaping their persona. The Rainbow CSV extension may be causing delays (infinite spinner) when pasting Markdown in VS Code. Restarting it seems to fix the issue. ⭐ Claude scientific skills is a collection of skills teaching Claude how to use scientific libraries, databases, and APIs across several domains. This may be a good example of a non-trivial skill library - that is hard for AI coding agents to infer by themselves. Notes from How I use every Claude Code feature Use AGENTS.md as guardrails, not a manual. Document what it gets wrong. Use self-documenting tools/APIs rather than documenting. Docs: Explain why and when to read each doc. Never say “Never.” Explain when to which which alternative. Prefer CLIs for stateless tools, MCPs for stateful, authenticated, or complex (e.g. Playwright). Coding agents work well with version control. Simon Willison Break up uncommitted changes into small commits Rewrite branch history for readability Use gh CLI to fetch line-wise comments from a PR and make requested changes (e.g. renaming, refactoring, adding types, etc.) ⭐ When using MCPs or tools with private data, “color untrusted content in red, unsafe actions in blue, and never mix colors.” Good advice. ⭐ DeepWiki offers a codemaps feature that explains code in an interactive way. It shows a structured explanation on the left. You can click on any note to see the code on the right. It’s an effective way to understand how a library or tool executes a task. Here’s an example of how Mermaid works. Gemini offers RAG with free storage. RAG costs are quite high. This simplifies the process a lot. But I tried running the sample program and after an hour, it still had not completed uploading a single file. Best to wait and watch. OpenRouter supports embedding models using an OpenAI-like API Kimi K2 Thinking seems popular because It’s an open-weights model on par with the top models on Humanity’s Last Exam (text-only) and BrowseComp Can run 200-300 tool calls without human guidance 4x cheaper than GPT-5 with low tokens (32B active on 1T parameters, INT4 quantized) Based on responses to Simon Willison’s question, ChatGPT Fine-tuning helps when: Lower latency, e.g. for type-ahead, at lower cost (37 mentions) Structured extraction, parsing and classifiers, e.g. postal address, detecting secrets (18 mentions) Custom vision models, e.g. check containers (12 mentions) Domain-specific code and stacks (niche languages, stack-specific generation, text→SQL) (11 mentions) … and a long tail. Fine tuning does not help: When A base model plus prompting or RAG does as well or better (15 mentions) When you risk being leapfrogged by a new release (4 mentions) When cost and data do not justify the ROI (3 mentions) The data I can export from my Android phone includes the below. 🟢 indicates it’s tracked. 🟡 might need action, e.g. enabling / coding. # 🟢 GPS/GNSS location (current & history). Turn on device Location. If you want a timeline you can export, enable Google Location History and later export via Google Takeout → Location History (JSON/KML). 🟡 GNSS raw measurements (engineering traces). Android exposes GNSS “raw” logs on many devices; capture with dev tools or logging apps if supported (intended for research). See GNSS Raw Measurements API. 🟢 Wi-Fi scans (nearby SSIDs/BSSIDs). Toggle Location scanning → Wi-Fi scanning in Location settings; apps need location permission to read results. 🟡 Wi-Fi RTT distance to APs (indoor ranging). Apps can use Wi-Fi RTT (802.11mc/az) to measure distance to compatible APs; requires location permission. 🟢 Bluetooth proximity/traffic. For packet-level logs, enable Developer options → Enable Bluetooth HCI snoop log, then pull /sdcard/btsnoop_hci.log (Wireshark). 🟢 Cell towers (IDs, signal strength). Apps can read via TelephonyManager (e.g., getAllCellInfo()), with appropriate telephony permissions. 🟢 Activity recognition (walking, running, in vehicle). Apps must request ACTIVITY_RECOGNITION (runtime) from Android 10+. 🟢 Steps (step counter / detector). Use sensors API; from Android 10+ you must declare ACTIVITY_RECOGNITION to access step counter/step detector. 🟢 Accelerometer / gyroscope / magnetometer streams. Apps read via SensorManager; some high-rate reads require HIGH_SAMPLING_RATE_SENSORS. 🟢 Ambient light / proximity. Read via SensorManager; typically no special permission. 🟢 Google Fit data (steps, workouts, heart rate from wearables, etc.). Manage and export from Google Fit / Google account Download your data. 🟢 Contacts. MIUI → Settings → System apps → Contacts → Import/Export to .vcf (vCard). 🟢 Call history / SMS (device). MIUI local/cloud backup can include call logs & messages; export by creating a local/Cloud backup and downloading. Note: 3P apps can’t read call/SMS logs unless they’re the default dialer/SMS. 🟡 Gmail, Calendar, Contacts (Google). Export via Google Takeout (MBOX/ICS/CSV etc.). 🟡 WhatsApp / Telegram / Signal chats. Use in-app exports: WhatsApp → Export chat, Telegram Desktop → Export, Signal → encrypted backup. 🟢 Advertising ID. View/reset in Settings → Google → Ads (wording varies), per Google help on Ad ID reset. 🟡 Per-app screen time / unlocks / opens. Third-party “usage” apps (e.g., analytics or “digital wellbeing” clones) require Usage Access (PACKAGE_USAGE_STATS). Use Android’s UsageStatsManager or apps that export CSV. Stock Digital Wellbeing does not offer an export. 🟡 Notification history (last 24h). Settings → Notifications → Notification history → On. OEM-optional, but present on most devices. Viewable once enabled. 🟡 Notification content stream (live). Grant an app Notification access to capture/export notifications going forward. (User-granted API via NotificationListenerService.) | 🟢 Per-app data usage (mobile/Wi-Fi). Apps/ADB can query NetworkStatsManager; Settings shows per-app totals. Advanced dumps via adb shell dumpsys netstats. 🟡 Wi-Fi detailed logs. Developer options → Enable Wi-Fi verbose logging for richer diagnostics. 🟡 Bluetooth packet logs. Developer options → Enable Bluetooth HCI snoop log; export file and analyze in Wireshark. 🟢 Per-app storage usage. Apps/ADB can query StorageStatsManager; Settings shows per-app storage. 🟡 Photo/video metadata (EXIF incl. location). Enable “Save location” in Camera app to embed GPS in EXIF; export files normally (EXIF remains). | 🟢 Downloads & file metadata. Use a file manager or connect via USB; metadata is in the files themselves. | 🟢 Battery usage history (per-UID/app), wakelocks, jobs. Generate adb bugreport and analyze with Battery Historian or dumpsys batterystats. 🟡 System/device logs (logcat). You can view via ADB/Android Studio. Android restricts 3rd-party access to system-wide logs for privacy. 🟢 Developer quick tiles (Sensors off). Developer options → Quick settings developer tiles → Sensors off to globally cut Camera/Mic & SensorManager sensors on demand. 🟡 Google Takeout: one-stop export for Location History (Timeline), Gmail (MBOX), Calendar (ICS), Google Photos, Drive, YouTube, Fit, etc. MacroDroid, Automate and Tasker sound like powerful Android workflow automation tools. Some uses I can put it to: Automatically upload recordings to Dropbox Turn off hotspot when I reach office Vibrate if I’m walking slowly Adding <link rel="alternate" type="text/markdown" title="LLM-friendly version" href="/llms.txt"> is an emerging approach for pointing to LLMs.txt. It works. I asked Codex to read the CloudFlare vitest page. It read the file truncating the middle, found the <link rel="alternate" type="text/markdown" href="https://developers.cloudflare.com/workers/testing/vitest-integration/write-your-first-test/index.md"/ link in it, and reasoned “Considering fetching markdown instructions” and fetched the Markdown page. Giles’ Blog toon is a YAML-like format that’s LLM friendly and especially token-efficient (CSV-like) for tables. You can convert back and forth between JSON and toon. Food printing applies 3D printing techniques to create real food items. Given the art that this can create, I expect at least some adoption in niche restaurants. PMTiles lets you store map tiles as a single-file archive that libraries like MapLibre can read. Useful to avoid tile servers. Mirrow is a CLI SVG animation builder that converts a DSL to animated SVGs. However, it may be easier to use an LLM to create the animated SVG directly with SMIL than learning Mirrow (or teaching the LLM Mirrow). ⭐ One approach to giving memory (“episodic memory”) to coding agents is to allow them to search their logs.This gives them access to past discussions about a repo or other repos. To configure Gemini CLI with an AI router, set: "security.auth.selectedType": "gemini-api-key" in ~/.gemini/settings.json export GOOGLE_GEMINI_BASE_URL=https://llmfoundry.straive.com/gemini/ (or your AI router base URL for Gemini) export GEMINI_API_KEY=... (your AI router API key) Passing a HAR export to an LLM to build a scraper is a powerful idea! Lessons from Diagram Chasing Addy Osmani’s Gemini CLI tips are practical guides to using any coding agent, not just Gemini. I learnt about: Run shell commands with !, e.g. !ls -la or even !bash. It’s added to the chat. On-the-fly tool creation: ask it to write code for the task on the fly. Use it for system optimization, e.g. editing dotfiles, system customization, log error analysis, etc. Run GEMINI_SYSTEM_MD=... gemini -p "task" --yolo --format json < input.txt to run Gemini with a different system prompt and feed it input.txt to run in a pipeline. (FYI: Codex does not send a default system prompt, so there’s nothing to override.) There is a Gemini CLI Show and Tell thread with examples. This include Janitor AI, a Gemini CLI session viewer, etc. Hands on with Gemini CLI has several Use cases to try out. Renaming photos and organizing files are clever ones. AGENTS.md can be used like a decision log - rules, styles, or preferences that evolve over time - on a per-repo basis. Gemini’s /memory add feature helps with this. gemini --checkpointing is a useful “undo” feature. /restore rolls you back to a specific checkpoint. The overhead is small. Caching is only available with API key or Vertex AI, not OAuth login as of now OpenAI TTS costs are confusing. But in short TTS-1 costs $15 / MChars (max 4,096 chars per request), which ends up at ~86c / hour GPT-4o Mini TTS costs ~$16 / MChars (max 2K tokens which is ~7,000 chars per request), which ends up at ~88c / hour. Very similar cost, effectively TTS-1 HD is twice TTS-1. OpenAI has a usage API that provides cost as well as usage for completions, images, audio speeches, etc. These require an organization admin key Cost API: curl "https://api.openai.com/v1/organization/costs?start_time=$TIMESTAMP&project_ids=$PROJECT_ID&group_by=line_item" Audio speech usage API: curl "https://api.openai.com/v1/organization/usage/audio_speeches?start_time=$TIMESTAMP&project_ids=$PROJECT_ID&group_by=model"

Is all AI content slop?

Is all AI content slop? I asked Claude to: Analyze this thread. Then explain it like a Malcolm Gladwell New Yorker article. https://news.ycombinator.com/item?id=45820872 It gave me a beautiful, engaging and insightful essay about a 300+ message debate about AI vs humans on routine tasks. https://claude.ai/share/60c5810f-5c81-4970-8026-a24bf89c3392 Is this slop? One phrase stood out: There’s an irony here that the commenter doesn’t quite state but implies beautifully: we’ve spent so long celebrating automation because humans are imperfect that we’ve forgotten we also value humans because they’re imperfect. ...

OpenAI TTS cost

The OpenAI text-to-speech cost documentation is confusing. As of 2 Nov 2025: GPT-4o mini TTS costs $0.60 / MTok input and $12.00 / MTok audio output according to the model page and the pricing page. They also estimate this to be ~1.5c per minute - both for input and output. It supports up to 2,000 tokens input. TTS-1 costs $15 / MTok speech generated according to the model page but the pricing page says it's $15 / MChars. No estimate per minute is provided. Is supports up to 4,096 characters input. TTS-1 HD is twice as expensive as TTS-1 I wanted to find the approximate total cost for a typical text input measured per character and token. ...

Things I Learned - 02 Nov 2025

This week, I learned: TVMaze API is an API for TV shows, episodes, cast, crew, etc. Useful for TV-related apps as well as learning APIs. Awesome Skills is a curated list of prompts and skills for AI coding agents. ⭐ nokode is a API server that has no code: just LLMs responding. Interestingly, it is compliant. Just expensive, slow, forgetful and unreliable compared to code. All four are improving with time, indicating that coding may be transitional. Notes from Vanya Seth’s keynote at OSAI HYD Superpowers of Gen AI to keep in mind when exploring AI coding agent use cases: Translating. Requirements to code, code to code, language to queries, standard to standard. Finding info just-in-time (in context). How does this work? What’s this error? What tools are permitted in my org? Who knows what? E.g. Atlassian Rovo queries across JIRA, Confluence, etc. Brainstorming and ideation. Product ideation. Requirements. Testing gaps. Architecture review. Exploratory / scenario testing. Summarizing and clustering. Change logs, incident management, research data, docs summary. Challenges in using AI coding agents: Adoption imbalance. Only certain roles are amplified by AI. Coding, QA, more than planning, maintenance, AI ops, etc. What’s the impact of this? ⭐ Goldratt’s ToC implies that backlogs need to fill faster. Downstream becomes a bottleneck. Technical debt piles up. ACTION: Use AI across entire value chain, from research to maintenance. Locality. enhances roles (nodes), not relationships (links). They optimize local work, not global flow. Workflow tools are missing. Coordination overhead. Context Fragmentation. Translation problems. ⭐ Expand productive roles to cover neighboring tasks. Productive developers shift left and build backlogs; shift right to reduce code review, maintenance tasks. E.g. Move maintenance/production activities into development. Security, performance, monitoring, observability, cost, infrastructure. We spend time on IDE, CI/CD, Jira, Confluence, Prod observability tools. A typical Agent Development Platform (ADP) covers evals, guardrails, workflow builder, agent builder, observability, prompt management, AI gateway (LiteLLM), MCP servers, model fine-tuning, model serving, model repository, vector stores We need ADP Agents covering delivery risk, continuous security, prod issues RCA, observability, performance, accessibility, product research, infra optiimzation, test data generation, anomaly detection, release management ACTION: Share ADP photo with Patrick. ACTION: ⭐ Centralize skills (“knowledge packs”) and MCPs and observe which gets used most. Allow people to use more. Lethal Trifecta. There’s growing demand for higher productivity with AI code assistants. But the lethal trifecta makes them an attack vector. It has access to sensitive information, exfiltrate data, and read and follow unsafe instructions. Can lead to supply chain poisoning attacks. Regulated industries cannot adopt. Technical debt growth. More productivity leads to poor code quality which will slow down future work. See Software Engineering Excellence 2025 AI induced complacency. Sunk-cost fallacy on AI-generated code hurts. ACTION: Evaluate code quality continuously to reduce technical debt. Double-down on good engineering practices. Compliance. Model residency. Self-hosting is required. Data observability gaps. Data privacy, audit trails, etc. are concerns. Token economics. $20/day happens in Thoughtworks. Token cost is subsidized. Rogue AI usage. Use of dis-allowed tools; shadow IT. ROI justification. Hard to quantify productivity gains. Adoption. AI Literacy. Tap into organizational knowledge Champions & communities of practice to support cross-pollination. Use-case driven adoption. Teams identify based on AI superpowers. AI playbook. Share what worked, what didn’t work. AI automation is likely less if a high portion of work Has legal liability (e.g. pharmacist/judge vs shop attendant/lawyer) Is subjective (e.g. perfumer/auction appraiser vs lab chemist/insurance appraiser) Needs rapid contextual decisions (e.g. detective/fireman/ER vs parking enforcer) Via ChatGPT, Claude parse-sse from Sindre Sorhus is a more standards-compliant, more likely-to-be-maintained alternative to my async-sse package. Which is better: Comment A: 1 upvote, 0 downvotes (100% positive) or Comment B: 99 upvotes, 1 downvote (99% positive)? Use Wilson’s Lower Bound which measures “What % positive am I 95% confident of?” Claude Using this, we can measure metrics for tweets, like below. ChatGPT Popularity = (5 _ WLB(reposts / views) + 2 _ WLB(likes / views)) * Decay(half-life of 72 h) Memorability = (5 _ WLB(bookmarks / views) + 4 _ WLB(replies / views)) * Decay(half-life of 36 hours) A nice visual “benchmark” of text-to-image and image editing models. Seadream 4, Gemini 2.5 Flash, and Qwen Image Edit lead. This includes examples like straightening te Tower of Pisa - which only Flux.1 and Seadream 4 do well on; or removing only the brown M&Ms - which only Qwen Image Edit manages to. Arch is a pure LLM router. It supports multiple LLMs, flexible routing and observability but not auth. From Codex docs Add custom prompts in ~/.codex/prompts/xyz.md and launch as /prompts:xyz. Optional: description: and argument-hint: in YAML front-matter. For example, create prompts to refactor, rewrite in a developer’s style, document AGENTS.md, identify re-usable code, etc. AGENTS.override.md overrides parent directory AGENTS.md. AGENTS.md appends to parent AGENTS.md. Fallback names are allowed. codex exec supports streaming JSON codex exec accepts a CODEX_API_KEY= environment variable. codex uses an OPENAI_API_KEY. You can configure which environment variables are passed to the shell Codex reads 32KB from AGENTS.md by default Things that I currently follow and don’t follow from Peter Steinberger’s excellent Just Talk To It: Prefer Codex > Claude Code. Ask for options before executing Generate & review specs collaboratively You don’t need git worktrees Prefer subscriptions over API to reduce cost Store docs with code Give examples Use voice input Use Codex Web as a mobile inbox for ideas Prefer CLI over agentic platforms Prefer CLI tools over MCP Avoid ALL-CAPS for Codex. It follows instructions well Avoid sub-agents, RAG, etc. Iterate UI live. Watch changes Use 3-8 agents in parallel on a single repo. Make small, atomic commit checkpoints. Commit only what the agent touches Add ast-grep as a pre-commit hook to block rule violations. Keep custom prompts minimal (commit, automerge, massageprs, review, …). Just “commit” reduces context Cancel long tasks and ask what’s happening Prefer Medium over High reasoning. It decides level of thinking Share screenshots Use tmux to run CLIs persistently Schedule refactor time (20%). Use jscpd, knip, oxlint, … Don’t reset context. Cold start wastes time + tokens Write tests in the same context. Yields better tests, reveals bugs. Prototype in a separate folder / PR Queue continue messages** before stepping away Ask it to “Preserve intent and add comments at tricky spots”. Future you needs the WHY On hard problems, add “take your time”, “be comprehensive”, “read all related code”, “form hypotheses”, etc. Maintain an evolving AGENTS.md with product notes, naming, API patterns, test policy, ast-grep rules, etc. Delete stale guidelines Fascinating implications from Quantifying Human-AI Synergy ChatGPT Models vary in ability to uplift humans. Don’t just use standalone model benchmarks. People vary in ability to work with AI. Don’t just measure solo skills. Reward AI collaboration ability (delegation, prompting, verification, revision, …) Train models to ask for missing Theory-of-Mind cues: goal, beliefs, constraints, audience, success test Train people by asking them to predict what the model will get right/wrong, and validate Design UI and models for synergy. UI: Surface/solicit assumptions, intent, uncertainty, constraints. Model: Infer & adapt to evolving user state. OpenRouter image generation now includes GPT-5 Image Mini. An image costs about 1 cent. Here’s the code: curl 'https://openrouter.ai/api/v1/chat/completions' \ -H "Authorization: Bearer $OPENROUTER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ model: "openai/gpt-5-image-mini", messages: [{ role: "user", content: "Draw a cat" }], modalities: ["image"], image_config: { "aspect_ratio": "16:9" } }' | jq -r '.choices[0].message.images[0].image_url.url' | cut -c23- | base64 -d > cat.png

Sometimes, technology creates truly memorable moments. Like when email connected me with my schoolmates in 1993. Or WhatsApp connected me with long-lost relatives in 2010. Today, Google Gemini took me back 55 years, converting the grainy black-and-white wedding photos of my parents into vivid high-resolution color images. So many people. Much younger. More alive. I look forward to when I can watch the video. Move around. Talk to them… Prompt: Convert this black and white photo to color. CAREFULLY ensure that the photo, especially faces, are EXACTLY the same. Use vivid colors and sharp photography, like in modern digital photos. Model: gemini-2.5-flash-image (nano-banana) Temperature: 0 ...