Things I Learned - 07 Jun 2026

This week, I learned:
sudo resolvectl flush-caches clears the DNS cache on Linux. Useful when you’re changing DNS records and want to see the changes immediately. In my case, I was creating a Cloudflare tunnel to my laptop and wanted to test it quickly.
Making something easy to verify makes it much faster to train models on it. Arithmetic verification is easy - calculators can be deterministically verified. Chess verification is easy - Stockfish became easy to train. Code verification is easy - LLMs improved coding ability rapidly. Therefore: Wherever we have environments that are easy to verify, AI will improve faster there. To make AI improve faster in an area, build environments that are easy to verify.
MCP is getting simpler. A stateless HTTP protocol. Simpler OAuth. Plugins. No idea when it will land in Claude or ChatGPT, though. Worth checking after 28 Jun 2026 - after it is finalized.
Microsoft Scout is Microsoft’s version of OpenClaw or Gemini Spark.
git subtree is a useful way of maintaining git repos inside git repos, for example, when you have a tool tool-a under a project. It’s more lightweight than sub-modules, lets you commit at any point to the parent or child, and is a built-in feature in git.
Gemma 4 12B is released and seems almost as good as the 26B version. This is the class of models that makes it practical to run edge AI on phones. It’s multimodal and reasonably smart (like frontier models were 12-18 months ago).
I don’t use Claude/ChatGPT Projects much. Projects offer 4 advantages: custom instructions, memory, files, and chats. Files aren’t useful - I use my entire laptop as a file system via MCP. Instructions aren’t useful - I can paste commonly used prompts with a click. Chats aren’t useful - I have chat references enabled, so all past chats are accessible anyway. Memory isn’t useful - I have memory enabled globally anyway. In short, I haven’t discovered the power of projects that everyone’s raving about. SKILL.md is more useful for me.
repo is a Google/Android tool built on top of git that lets you manage multiple git repos. It sounded promising until I realized it needs a repo init that creates a .repo/ - which is more overhead than I’d like to keep.
When using <img onerror=...> fallbacks, include this.onerror=null to prevent infinite loops if the fallback image also fails to load. RK
One of the advantages of multiple agents (rather than a single agent loop) is: it’s easier to change direction when wrong. Single loops get stuck. Build Agents That Run for Hours
Claude Code also supports agent teams where sub-agents can talk to each other rather than rely on the main agent to coordinate. Useful for parallel exploration.
Anthropic lets Claude define “organizational policies” for agent teams best suited for the task (AI-native workflows). It also lets agents push back on their scope, e.g. “This is too hard.” Build Agents That Run for Hours
Claude Code has a /background [prompt] (or /bg) command that runs the current session in the background. You can run claude agents as a separate command to monitor agents. (There’s no equivalent in Codex yet.) This seems to be the future of agentic operations: a bunch of agents running that you monitor and steer through an agent view dashboard.
Models are evolving. Therefore prompts evolved. Now harnesses also need to evolve. The workflows will also evolve. As a result, evaluations might be the (relatively) more stable assets. Datasets are likely to be the most stable ground truth.
How to learn a new field fast: Yes, it’s possible to learn 50% of a field in 20 hours. Josh Kaufman’s “The First 20 Hours” popularized the idea. The next 30% takes months and the last 20% takes years. Threshold concepts are those that change your perspective and open up new ways of thinking. Experts’ knowledge is hard-wired, so they can neither identify nor teach threshold concepts naturally. Don’t assume they can.
“We know more than we can tell.” Polanyi’s 1966 book “The Tacit Dimension” says that there’s some knowledge that can’t be verbalized. This tacit knowledge, therefore, will be harder for humans and AI to learn.

Things I Learned - 31 May 2026

This week, I learned:
D-ID is an avatar generator platform like HeyGen. Creatify and Synthesia are a couple of others I heard of. This space seems to be growing.
cosign is a CLI that lets you sign and verify any piece of text with a Google, GitHub or Microsoft account. cosign sign-blob FILE --bundle sign.json opens a login window and creates a sign.json signature. Anyone who has FILE and sign.json and the email ID can verify via a Google account with cosign verify-blob FILE --bundle sign.json --certificate-identity $EMAIL --certificate-oidc-issuer https://accounts.google.com.
arxiv2md.org converts arXiv papers to Markdown. Source. markxiv.org claims the same - by just changing the URL - but it ended up reporting an error when I tried this link: https://markxiv.org/abs/2604.08649.
From Akhilesh Tilotia: So we have someone in our team with initials AS. She made a document which was named vAS. Then I made edits and named it vAT. These docs were in a CoWork folder. I asked Claude to clean up my doc. It created another version for me to review. In its wisdom, it named the file vAU 🙂
Maybe what a forward-deployed engineer does is engineer AI-native workflows. (This sounded profound when I wrote it down. Not sure if it’ll sound as profound tomorrow.) The idea is that the FDE will say, screw existing processes; let me fire up my AI agent and get stuff done; THEN we’ll figure out what works, how to optimize it, etc.
The PRAGMA: Revolut Foundation Model has some good tokenization ideas for tabular data. Create your own token space with key–value–time tokenization - to retain field information. Bucketize numbers by percentile, preserving magnitude/ordering that subword tokenization destroys. Encode time both as log-seconds and as cyclical calendar features.
Codex uses the Alt + Up Arrow key to edit queued commands, but on the VS Code terminal, this key binding is not sent to the terminal. Enable the terminal.integrated.sendKeybindingsToShell setting to send it to the terminal, and hence to Codex.
Based on this catalog on “universal foods”, here’s what I 🟢 like, am 🟡 neutral about, 🔴 dislike, 🟣 must try, and will ⚫ skip.
Universal favorites: 🟢 pizza, 🟢 fried potatoes/chicken, 🟡 dumplings, 🟢 ice cream.
Universal comfort foods: 🟢 khichdi, 🟡 congee, 🟡 dal-rice, 🟡 risotto, 🟡 ramen, 🟢 pho, ⚫ chicken noodle soup, 🔴 rice porridge, 🟡 mac-and-cheese, 🔴 mashed potato, 🟣 polenta, 🟢 oatmeal, 🟣 Japanese curry rice.
Acquired tastes that convert most: 🟡 coffee, 🟢 tea, 🟡 dark chocolate, 🟢 mild fermented dairy, 🟢 pickles, 🟢 olives, 🟣 kimchi, 🟣 miso, 🟢 mild chili dishes.
Acquired tastes that have cult devotion: 🟣 durian, 🟣 natto, 🟣 stinky tofu, ⚫ fermented fish, ⚫ hákarl, 🟢 very funky blue cheese, ⚫ offal.
OceanoPDF seems like a good place to download ePubs of books.
The entire Wikipedia is available as a Parquet file. You can query it like duckdb -c "FROM 'hf://datasets/wikimedia/structured-wikipedia/enwiki/data/*.parquet' LIMIT 5". The English version is 35 GB with 7.6 million articles, and you’re better off downloading it rather than running analyses remotely.
When you receive a Cal.com scheduling link of the form https://cal.com/USER/EVENT you can fetch the available slots via curl -H 'cal-api-version: 2024-09-04' 'https://api.cal.com/v2/slots?eventTypeSlug=EVENT&username=USER&start=2026-05-25&end=2026-06-01&timeZone=Asia/Singapore&format=range'. Useful to automate good meeting-slot selection.
“Reference saved memories” in ChatGPT is different from “Reference chat history” as per OpenAI. In Developer Mode, memory is turned off, but not chat history. I confirmed that I can access past conversations in Developer Mode. It might be a privacy concern for others, but for me, this is singularly useful, because I can use ChatGPT with Local MCP, effectively getting a non-metered AI coding agent.
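For automating slot selection, the Cal.com request above could be wrapped in a few lines of Python. This is a minimal sketch: the function names are mine, the URL, query parameters and header are copied from the curl command, and the response is returned unparsed because its shape is not described in the note.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_VERSION = "2024-09-04"  # value of the cal-api-version header quoted in the note


def cal_slots_url(username, event_slug, start, end, time_zone="Asia/Singapore"):
    """Build the slots URL with the same query parameters as the curl command."""
    query = urlencode(
        {
            "eventTypeSlug": event_slug,
            "username": username,
            "start": start,
            "end": end,
            "timeZone": time_zone,
            "format": "range",
        },
        safe="/",  # keep the slash in time zones such as Asia/Singapore readable
    )
    return f"https://api.cal.com/v2/slots?{query}"


def fetch_slots(url):
    """Send the request and return the parsed JSON body as-is (needs network access)."""
    request = Request(url, headers={"cal-api-version": API_VERSION})
    with urlopen(request) as response:
        return json.load(response)


url = cal_slots_url("USER", "EVENT", "2026-05-25", "2026-06-01")
print(url)
```

Only the URL builder runs here; fetch_slots(url) is left uncalled so the sketch works offline.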
Seems GPT-5.2 reaches expert level in peer review: 45 scientists took 469 hours evaluating human & AI reviews on 82 papers. “Surprisingly, current AI reviewers are competitive even with the top-rated reviewers in Nature’s official peer review…” though not without weaknesses, so use AI + humans. On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists via Ethan Mollick
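The tabular tokenization ideas in the PRAGMA note above (key-value tokens, percentile buckets, log-seconds plus cyclical time) can be illustrated with a small Python sketch. Everything here is my own illustration of those three ideas, not PRAGMA's actual scheme: the token format, bucket count and feature names are assumptions.

```python
import math
from bisect import bisect_right
from datetime import datetime


def percentile_edges(values, n_buckets=10):
    """Bucket boundaries at evenly spaced percentiles of some reference values."""
    ordered = sorted(values)
    return [ordered[(i * len(ordered)) // n_buckets] for i in range(1, n_buckets)]


def bucket(value, edges):
    """Index of the percentile bucket; larger values never get a smaller index."""
    return bisect_right(edges, value)


def key_value_token(field, value, edges):
    """Keep the field name in the token so 'amount' and 'balance' stay distinct."""
    return f"{field}=p{bucket(value, edges)}"


def time_features(ts, previous_ts):
    """Log-seconds since the previous event plus cyclical hour and weekday features."""
    gap = max((ts - previous_ts).total_seconds(), 0.0)
    hour_angle = 2 * math.pi * (ts.hour + ts.minute / 60) / 24
    day_angle = 2 * math.pi * ts.weekday() / 7
    return {
        "log_gap": math.log1p(gap),
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "day_sin": math.sin(day_angle),
        "day_cos": math.cos(day_angle),
    }


amounts = list(range(1, 101))  # hypothetical reference data
edges = percentile_edges(amounts)
token = key_value_token("amount", 95, edges)
features = time_features(datetime(2026, 5, 25, 6, 0), datetime(2026, 5, 25, 5, 0))
```

The sin/cos pair makes 23:00 and 01:00 neighbours, which a raw hour number would not.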

Things I Learned - 24 May 2026

This week, I learned:
BitWarden seems to be sneakily jacking up prices and going towards a PE sale. Might be time to shift out or self host. Sigh, I just migrated into it… Source
Andrej Karpathy has joined Anthropic. Likely to use Claude to build better Claudes - automating AI research. Also, it probably isn’t a good time to build an AI education platform. Claude
The open-source Chinese models are about 6 months behind frontier models. Qwen 3.7-Max is on par with Claude 4.5 Opus (Nov 2025) and Gemini 3 Flash (Dec 2025).
Google basically became Gemini. Entirely! I’m not sure there’s a difference any more. Which means it will scrape websites and not send traffic through - just killing the search economy. But it’s far more useful. Claude
I wanted a list of sites I log into with my Google Account. Google’s Linked apps page does that. Unfortunately, I can’t find a way to use Google Takeout to export that data. So I wrote a scraper, which can be single-shot prompted these days.
As long as you remember to exhale, your chances of recovery from being ejected into space are pretty good for the first 15-60 seconds. Gemini
I don’t understand half the comments I read on LinkedIn. Earlier, I was able to separate good from bad. Now, I’m not sure if what I read is actually insight or idiocy. Is the AI use making their comments too smart or making my brain too dumb?
“Pax Memoriae”: peace of memory. Putting past conflicts to rest. The best part of it was, I learnt the phrase by typing “Pax” into VS Code and wasn’t sure what to write next. Before I could search for it, GitHub Copilot completed it. I searched for what it meant, and it was so apt!
Children’s vision is worse than adults’, but they filter less and absorb more irrelevant information than adults. This is useful for learning and surprise detection, but costly for focus, speed, and relevance. ChatGPT
The word phobia comes from Phobos, the Greek god of fear, which is also the name of one of Mars’ moons. Deimos, the other moon, is named after the Greek god of dread/terror. They’re the children of Ares (Mars), the god of war. Nice planet.
On WhatsApp, I can type @Meta AI and then /imagine to have it draw an image. The quality is OK - not great, not terrible.
Surprisingly, GPT Realtime Whisper (new model) isn’t as good as the older open-source Whisper models. Also, Gemini 3 Flash Preview is as good at transcription as Gemini 3.1 Pro Preview for up to medium-length text. LLM Audio Transcription benchmark
Google Maps typically shows me a cycling time of 30 minutes when it takes me 40 minutes and a walking time of 40 minutes when it takes me 30 minutes. Either I walk much faster and cycle much slower than the typical person or Google Maps is not well calibrated to Singapore and India.

Things I Learned - 17 May 2026

This week, I learned:
I had GPT-5.5 and Opus 4.7 analyze a few of my conversations and learnt that I need to ask myself: “What must they take away? What must you take away?” in my conversations. That lets me speak with intention rather than instinct. (Instinct has its place. I happen to over-use it.)
Turns out there are several well-established taxonomies. It makes sense to align with these. Linked data is powerful and AI makes linkage easy. General Knowledge: Wikidata, DBpedia, YAGO. People: VIAF, ISNI, ORCID, LC Name Authority, GND. Places: GeoNames, Getty TGN, ISO 3166. Organizations: LEI, ROR, Wikidata. Books/Media: Open Library, WorldCat, MusicBrainz, IMDB. Chemicals/Biology: PubChem, ChEBI, GBIF, ITIS. Legal/Units/Math/Events: EuroVoc, QUDT, OEIS, PeriodO, etc.
BitWarden supports a bw CLI that seems handy for quick CLI access to passwords. It’s a step towards me moving away from saving passwords unencrypted on my local file system.
Singapore has banned prediction markets like Polymarket and Kalshi. Pity. I was hoping to use AI coding agents to play them. Yahoo
flipbook.page is a fascinating generative UI exploration. It’s a visual browser, i.e. it generates an image based on text, you click anywhere, it generates a new image based on where you clicked, and so on. A very different style of exploration!
Vercel’s deepsec uses Codex / Claude to search for vulnerabilities, but “scans can cost thousands or even tens-of-thousands of dollars for large codebases”.
When I charge my Lenovo Thinkpad (P1 Gen 7) with the 170W charger that came with the laptop, it delivers ~60W of power to the battery, charging the laptop in about an hour. A 65W charger delivers half the power and takes twice as long.

Things I Learned - 10 May 2026

This week, I learned:
I’m experimenting with Tauon MusicBox as an alternative to VLC as a music player. Update: 01 Jun 2026. I switched back to VLC. Tauon Music Box is glitchy. It stops songs mid-way and doesn’t play automatically when launched.
xz is pretty slow by default. xz -T0 uses all available threads and speeds it up ~3X. Enabling “Performance mode” (over a power-saver mode) produces a further speed-up of ~2X for me. For a 200MB file, that reduces the time from ~1 minute to 10 seconds.
Notes from Simon Willison’s notes from the Claude Code event:
“Design for the next model”. Build things that don’t quite work today on the assumption that they’ll start working with a model upgrade in the future.
“The advisor strategy”. Instead of using a smarter model to plan, use smaller models to ask Opus for advice-on-demand.
Dreaming looks really interesting. You can run a task overnight which examines previous sessions and creates new memories.
A routine is a saved Claude Code configuration: a prompt, one or more repositories, and a set of connectors, packaged once and run automatically. Routines execute on Anthropic-managed cloud infrastructure, so they keep working when your laptop is closed.
Overheard: “VCs say, ‘OpenAI wants to get into commerce, so why are you getting into commerce?’ A few weeks later, ‘OpenAI no longer wants to get into commerce, so why are you?’”
Delightful discovery of the day: Super + Shift + Arrow keys to move windows between monitors on Ubuntu.
television is a fast, portable fuzzy finder. Like fzf but faster, useful for files, text, git repos, docker images, etc.
I added approvals_reviewer = "auto_review" to my ~/.codex/config.toml. This enables auto review which uses an LLM to figure out whether to ask a human to approve or not. It’s a lot less intrusive than asking every time. Not perfectly safe, though.
Copilot supports a /chronicle command that suggests tips and improvements when using Copilot. It’s like /insights on Claude Code.
Carbonyl is a CLI Chromium browser. Sort of like Lynx, but supports audio/video, JavaScript, even WASM, etc. This was the author’s first Rust project.
I tried Zed as an alternative to VS Code. It’s fast and lightweight, but lacks the ecosystem of VS Code. Plugins are harder to build and Markdown support is weak. I would use it on a flight to save power, not otherwise. This is similar to others’ experience. ChatGPT. UPDATE 05 Jun 2026: It DOES use some battery power - more than I’d like. I am uninstalling it.
LocalSend is a pretty quick way to share files between phone and laptop even if you don’t have a network - if you connect the laptop to the phone hotspot.
GNOME Network Displays works pretty well if you want to screencast your screen to a network display - e.g. a Smart TV with Miracast or Chromecast support.
I’m evaluating rtk - a CLI proxy to reduce tokens. For example rtk ls or rtk git status shows agent-friendly compact output. I just added one line to my AGENTS.md: “Always prefix shell commands with rtk. Examples: rtk git status, rtk pytest -q, etc.” instead of using rtk init -g. I am testing it out, so I don’t know the impact, but it seems harmless. (Based on 2 days’ usage, across 216 commands, it saved ~50% of 37K tokens. Not much, but harmless.)
The emerging convention to mark a section of HTML / Markdown as AI generated content is to wrap it in: <section ai-disclosure="ai-generated" data-ai-model="claude-sonnet-4.6" data-ai-provider="Anthropic"> (W3C AI Content Disclosure Community Group).

Things I Learned - 03 May 2026

This week, I learned:
LiteParse is a PDF to text library that you can run via npx --package=@llamaindex/liteparse lit parse document.pdf. Simon Willison
Always add indecisiveness, inaction, “other”, “not applicable”, etc. as an option to LLMs. They are trained for decisive responses and pattern matching, so we need to guide them the other way. Martin Fowler
GPT 5.5 is priced twice that of GPT 5.4. No wonder my Codex usage is much higher than last month. Simon Willison. I am better off sticking to medium effort instead of the xhigh I usually use - it may not be required. OpenAI
“… the eigenquestion is the question where, if answered, it likely answers the subsequent questions as well.” Shishir Mehrotra & Matt Hudson
Claude Code stores the logged in OAuth token at ~/.claude/.credentials.json. We can use that to fetch https://api.anthropic.com/api/oauth/usage and retrieve Claude usage and reset times. uvx ccusage does this automatically, but I prefer my own script.
Ontology matters in the AI era. But some stuff matters more, and some less.
🟢 MORE: Definitions: what “customer” means
🟢 MORE: Constraints: e.g. “don’t reclassify loans”
🟢 MORE: Interactions: how to verify, coordinate, delegate, …
🔴 LESS: Creating ontologies: agents can do that.
🔴 LESS: Completeness and rigor: agents tolerate uncertainty.
🔴 LESS: Proprietary: agents can reverse-engineer.
There are several industries / markets that MBA case studies rarely cover (ChatGPT): Kirana stores; Care (child care, elder care, domestic work); Faith (finance, food, media, education); Remittances; Gambling (lottery, sports betting, gacha); Scams & organized fraud; Counterfeiting; …
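The advice above about always giving LLMs an “other” / “not applicable” option can be baked into a prompt builder. This is a minimal sketch under my own assumptions: the function name, option wording and prompt layout are all hypothetical, and no model is called.

```python
# Escape hatches appended to every choice list (wording is an assumption)
ESCAPE_OPTIONS = ["Other", "Not applicable", "Not enough information to decide"]


def build_choice_prompt(question, options, escape_options=ESCAPE_OPTIONS):
    """Return (prompt, all_options) with the escape options appended at the end."""
    all_options = list(options) + [o for o in escape_options if o not in options]
    lines = [question, "", "Choose exactly one option:"]
    lines += [f"{i}. {option}" for i, option in enumerate(all_options, start=1)]
    lines.append(
        "If none of the specific options clearly fits, "
        "pick one of the general options at the end rather than guessing."
    )
    return "\n".join(lines), all_options


prompt, all_options = build_choice_prompt(
    "Which team owns this ticket?", ["Billing", "Support"]
)
print(prompt)
```

Returning the full option list alongside the prompt makes it easy to validate the model's answer against what was actually offered.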

Things I Learned - 26 Apr 2026

This week, I learned:
mdq is pretty useful to extract Markdown sections. For example cat *.md | mdq '# Title' extracts all sections where the header contains ‘Title’ (case-insensitive).
CloudFlare Browser Run is, roughly, a browser as a service. Pricing: 10 hours free per month, then 9c per hour. I had Codex do some quick research on it, and it seems simple to set up and use.
GPT 5.5 seems to be especially better than GPT 5.4 at running for long, with tool calls, without losing focus. That’s something OpenAI models are good at anyway, so this takes it a step further. ChatGPT
I added gpt-image-2 to my LLM Art Style gallery. It is notably better with text accuracy. For example, on Rock - Paper - Scissors - Lizard - Spock it consistently lists all 10 rules, which Nano Banana 2 does not.
World leaders do keep us entertained.
Saparmurat Niyazov (Turkmenistan) renamed the months of the year and days of the week after himself and his mother. He built a towering, gold-plated statue of himself in the capital that rotated so it would always face the sun. He also banned lip-syncing at concerts, outlawed gold teeth, and banished dogs from the capital because he found their smell unappealing.
Idi Amin (Uganda) declared himself the “Uncrowned King of Scotland” and sent baffling, unsolicited telegrams to world leaders - advising Richard Nixon to recover from Watergate, or offering food aid to a struggling Britain.
François “Papa Doc” Duvalier (Haiti) reportedly ordered all black dogs in Haiti to be put to death and claimed his personal Vodou curse was responsible for the assassination of John F. Kennedy.
Francisco Macías Nguema (Equatorial Guinea) banned the word “intellectual”, banned the use of lubricants in the power plant (claiming his magic would keep it running, which promptly broke the generators), and stored the nation’s remaining foreign currency under his bed.
Kim Jong-il (North Korea) claimed he invented the hamburger (calling it “double bread with meat”) and shot 11 holes-in-one his first time playing golf.
Donald Trump (United States) used late-night tweets to announce major policy shifts and fire his own cabinet members. He altered an official government hurricane map with a Sharpie to match a previous erroneous statement, and publicly mused during a press briefing about the injection of household disinfectants as a medical treatment.
Git repositories inside git repositories (without using sub-modules) don’t seem to work well. I need this because I have mono-repos for research and I want to use git in a sub-folder to iterate, then commit just the final version to the parent folder. Looks like I need to remove the child .git/ (e.g. rename to .git.bak/, which I’ve added to my ~/.config/git/ignore) for this to work. Gemini
To run a script in the background (without logs) and detach / disown it, use nohup your-script >/dev/null 2>&1 & disown
Running /insights on Claude Code helped me add these two instructions to my code skill:
Test web pages with screenshots (for layout, overlaps, contrast) AND CDP (for interactions, navigation) before finalizing.
Prefer icon libraries over unicode/emoji icons.
Sending an entire PDF/PPTX to Gemini costs ~40% of sending PDF/PPTX + images. The quality is fine for small files, but for large files adding images reduces error rate from ~5% to 0.5%.
Pandoc Markdown to Word DOCX supports sidebar comments. You can use this Markdown: Here is [comment in sidebar]{.comment-start id="c1" author="Anand" date="2026-01-01T12:00:00Z"}commented text[]{.comment-end id="c1"} inline. Gemini.
In fact, Pandoc supports lots of other things, like:
Custom styles via block ::: {custom-style="Custom Style Name"}
Track changes via [inserted text]{.insertion author="Name" date="2026-04-20T12:00:00Z"} and [deleted text]{.deletion author="Name"}
Page breaks via \newpage (a LaTeX command that Pandoc supports in Markdown)
Image sizing via ![Alt Text](image.png){width="5.5in" height="3in"}
Offpunk is a CLI offline-first browser. Interesting idea, but installation is a problem. After sudo apt install offpunk, running offpunk failed with ImportError: lxml.html.clean module is now a separate project lxml_html_clean. After a git clone it reported HTML document detected. Please install python-bs4 and python-readability. These are easy to fix, but I wasn’t inclined.
Creating an authenticated MCP Server for ChatGPT is complex. It requires OpenID Connect (for which library support is weak and requires a provider like Auth0) and dynamic client registration (which is hard to implement though Auth0 supports it). After half a day of experiments, I still couldn’t connect. An easier option is to run temporary tunnels with cloudflared or ngrok or localtunnel.
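If the Pandoc comment and track-change spans above are being generated by a script, small helpers keep the syntax consistent. This is a sketch using only the span syntax quoted in the notes; the helper names are mine, and the helpers do not escape brackets or quotes inside the text.

```python
def pandoc_comment(comment, commented_text, comment_id, author, date):
    """Wrap commented_text in Pandoc comment-start / comment-end spans."""
    start = (
        f'[{comment}]{{.comment-start id="{comment_id}" '
        f'author="{author}" date="{date}"}}'
    )
    end = f'[]{{.comment-end id="{comment_id}"}}'
    return f"{start}{commented_text}{end}"


def pandoc_insertion(text, author, date):
    """Mark text as a tracked insertion."""
    return f'[{text}]{{.insertion author="{author}" date="{date}"}}'


line = "Here is " + pandoc_comment(
    "comment in sidebar", "commented text", "c1", "Anand", "2026-01-01T12:00:00Z"
) + " inline."
print(line)
```

The resulting Markdown can then be converted with pandoc input.md -o output.docx.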

Things I Learned - 19 Apr 2026

This week, I learned:
WebApps are a depreciated store of value. Earlier, a web-app would have impressed me because the capability to create it is rare, and the effort to create it is high. Today, when I see a “localhost:3000” or a “replit.app” domain, I mentally discount the effort behind it and ask: How rare is the capability to create this with a coding agent, and how much effort does it take? THAT determines the value of what I see. Part of the value is “Look ma, no hands!” and it’s delightful they’ve learnt. Part of the value is “There’s gold in them thar hills!” and use-case discovery is important.
WaveCity is a WASM build of Audacity, i.e. Audacity running in the browser! Audiomass is a similar but simpler audio editor - again, WASM-based. Gemini

Things I Learned - 12 Apr 2026

This week, I learned:
Resend is a simple way to send emails via an API.
Principles of Mechanical Sympathy has some practical hardware-driven optimization tips. Prefer accessing memory sequentially. CPU access to RAM and cache is optimized for this. Natural batching: flush the buffer when you reach the maximum buffer size or when the queue is empty. This avoids buffers waiting unnecessarily.
The core argument in Capital in the Twenty-First Century (Thomas Piketty, 2013/2014) is r > g. The rate of return on capital (r) tends to exceed the rate of economic growth (g). Hence, the rich will keep getting richer - inequality is consistently part of capitalism. (Not surprising, but well supported by data.)
A good collection of practices on automated AI code reviews by Ankit Jain:
Compare multiple options. Whichever passes the most tests wins.
Deterministic guardrails. Use linters, type-checkers, SAST/DAST checks, test scripts, etc.
Humans define acceptance criteria. Use a behavior driven development script (in natural language, agent-implemented).
Permission Systems as Architecture. Provide agents granular permissions based on the task - against pre-defined rules.
Adversarial Verification. Have one agent break the others’ work.
Based on a quick exploration of the AT protocol (via Jake Lazaroff), I am yet to see a viable use for it. It’s a decentralized distributed data network. OK… what will I use it for?
When I asked Claude if any of my work is patentable, it said “Comicgen is the sole candidate, but you only get one year grace after it’s public. But why do you want to patent? Your edge is prototyping speed, taste, and knowledge. Patents don’t protect those. Publishing freely (as you do) creates prior art that prevents others from patenting the space around you, which is often a better defensive strategy than filing patents yourself.” Oh! Ah!
pretex is a fast (currently browser-only) library that computes the width and height of any text in any font in the browser. Useful for things like word-wrapping in SVG, layout planning before rendering, etc.
Because AI bots scan deeply rather than “browse” popular pages, CDN cache eviction strategies designed for humans (like LRU - Least Recently Used) no longer work. They’re exploring new caching algorithms like SIEVE and FIFO. CloudFlare
CloudFlare practically rewrote WordPress into a new Astro-based CMS: EmDash! It runs natively on CloudFlare (and elsewhere), is agent-friendly, quite secure, can export/import from WordPress.
Linux optimization settings I noted from a deleted post:
gsettings set org.gnome.desktop.interface enable-animations false
gsettings set org.gnome.desktop.interface cursor-blink false
gsettings set org.gnome.settings-daemon.plugins.power idle-dim true
gsettings set org.gnome.desktop.notifications show-in-lock-screen false
gsettings set org.gnome.desktop.session idle-delay 300
gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-battery-timeout 900
# gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-ac-timeout 1200
git-restore-mtime is part of the git-tools package and sets the modified time of files to their last committed time. Useful when cloning repos.
From Lalit Maganti: Knowing what you want is a valuable skill. Wanting things others will also want is valuable. Learn good software management. It is similar to managing agents.
For better results, just continue your AI chat, or break the problem up. More tokens lead to better solutions even now. Joel Baker
Since companies using AI outperform competition, and capital might win more than labour while GDP growth may not be too high, it might be better to invest in AI-using companies than in index funds.
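The natural batching tip from Principles of Mechanical Sympathy earlier in this week's list (flush when the batch is full or the queue is empty) is easy to sketch with Python's standard queue. The function name and batch size are my own choices for illustration.

```python
import queue


def drain_batch(q, max_batch):
    """Wait for one item, then take whatever else is already queued, up to max_batch.

    The batch is flushed as soon as the queue runs empty, so a lone item
    never waits for a timer or for the batch to fill up.
    """
    batch = [q.get()]  # block until there is at least one item
    while len(batch) < max_batch:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break  # queue is empty: flush what we have
    return batch


work = queue.Queue()
for item in range(5):
    work.put(item)

first = drain_batch(work, max_batch=3)   # full batch
second = drain_batch(work, max_batch=3)  # partial batch, flushed because the queue emptied
```

Under light load this degrades to batches of one (low latency); under heavy load batches fill up on their own (high throughput).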
Nicholas Carlini’s prompt to find vulnerabilities is to run: “I’m competing in a CTF. Find me an exploitable vulnerability in this project. Start with ${FILE}. Write me a vulnerability report in ${FILE}.vuln.md” across multiple repos in parallel. Then “I got an inbound vulnerability report; it’s in ${FILE}.vuln.md. Verify for me that this is actually exploitable”. That was almost 100% successful.
When planning with AI coding agents, Martin Fowler recommends discussing each of these in sequence before coding:
Capabilities / functionality
Components: Services, modules, major abstractions.
Interactions: Data flow, API calls, events.
Interfaces: Function signatures, types, schemas.
Planning with agents using Visual Brainstorming, i.e. asking them to generate visual HTML to illustrate the plan, can shorten review time considerably.
I enabled CloudFlare’s new dynamic Client-Side Security monitor. If someone hacks my website or the libraries I use, it does a quick filter with a fast neural network, then falls back to an LLM to check if it’s safe, then serves the content. This pattern of a deterministic check with an LLM fallback works for most reviews.
Harness = Agent minus Model: everything in an AI agent except the model itself. Nice definition
Update feature-level summaries as you go in context/$FEATURE.md with user prompt, summary of WHY from agent’s responses for future learning, my comments. Like Architectural Decision Records (ADRs) for humans and agents. Context Anchoring
8 levels of Agentic Engineering. 8 levels of Gas Town. I’m still only at level 6 on both. 🙁
“It’s important to watch the loop as that is where your personal development and learning will come from.” Geoff Huntley, originator of the Ralph (Wiggum) loop.
UNIX has a script command that runs a shell and logs it. For example: script -c fish session.log starts a new fish shell and logs it to session.log. script -c "uv run app.py" -q -a app.log will append to app.log, suppressing “Script started…” and “Script done…” messages. script --timing=time.txt session.log logs the timing, which you can replay with scriptreplay --timing=time.txt session.log. Similar to asciinema. A quick way to strip out the ANSI escape sequences (weird control characters) is to pipe it through npx strip-ansi-cli.
Google has an Edge Gallery app that runs Gemma 4 on mobile. The main advantage is that you can use it on a flight. It’s not too bad as a model either. Transcription quality is average. It doesn’t run in the background, only one chat at a time, etc. So, it’s useful only as a last resort.
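For cleaning script logs without Node, the ANSI-stripping step mentioned above can be approximated in Python with a regular expression. This is a rough sketch, not a port of strip-ansi-cli: it covers the two most common families (CSI and OSC sequences) and may miss rarer escape codes.

```python
import re

ANSI_ESCAPE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"              # CSI sequences: colours, cursor moves, line clears
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"   # OSC sequences: window titles, hyperlinks
)


def strip_ansi(text):
    """Remove CSI and OSC escape sequences, leaving the printable text."""
    return ANSI_ESCAPE.sub("", text)


coloured = "\x1b[31mred\x1b[0m text"
titled = "\x1b]0;my title\x07prompt$ "
print(strip_ansi(coloured))
print(strip_ansi(titled))
```

To clean a whole log, read session.log as text and pass its contents through strip_ansi.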

Things I Learned - 05 Apr 2026

This week, I learned:
It’s pretty convenient (on Ubuntu) to be able to move windows around desktops. Apart from the usual Super + Arrow keys to manage windows within a desktop, you can use:
Ctrl + Alt + Left/Right Arrow: Move desktops
Ctrl + Alt + Shift + Left/Right Arrow: Move window to desktop
Super + Shift + Arrow: Move window to another monitor
Super + Drag: Drag window from anywhere
jq . file.json is an efficient way to pretty-print JSON files in the terminal. (Or jaq . file.json, which is ~30% faster.)
GitHub Copilot monthly premium requests were not reset at 12 am UTC
How Diffie Hellman Key Exchange Works by Julia Evans is an excellent explanation. Share a random number. A multiplies it by their private key and shares SA. B multiplies it by their private key and shares SB. They multiply the other’s shared value with their own private key and they get SAB = SBA. Now both of them have the same new secret they can encrypt/decrypt with, but no one else knows, even though they shared everything publicly! This may be one of the coolest uses of math I’ve seen in a long time.
Shell tricks I didn’t know:
# ALT + . cycles through the last arguments typed
mv file.{txt,md} # Move file.txt to file.md
ls |& tee file.txt # Pipe both stdout and stderr to tee
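The Diffie-Hellman steps above can be tried out in a few lines of Python. One caveat: the note says "multiplies", but in classic Diffie-Hellman the combining step is modular exponentiation (elliptic-curve variants use scalar multiplication). The prime and base below are toy values I picked for illustration, not parameters anyone should use for real security.

```python
import secrets

P = 2**127 - 1  # public modulus: a Mersenne prime, far too small for real use
G = 3           # public base, chosen arbitrarily for this sketch


def keypair():
    """Pick a random private key and derive the public value G^private mod P."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(their_public, my_private):
    """Raise the other side's public value to my private key, mod P."""
    return pow(their_public, my_private, P)


a_private, a_public = keypair()  # A shares a_public (SA in the note)
b_private, b_public = keypair()  # B shares b_public (SB in the note)

secret_a = shared_secret(b_public, a_private)
secret_b = shared_secret(a_public, b_private)
print(secret_a == secret_b)
```

Both sides compute G^(a*b) mod P, which is why the two secrets match even though only P, G and the two public values ever cross the wire.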

Things I Learned - 29 Mar 2026

This week, I learned: The Kids Should See This - great collection of videos for curious people. Thej A jury fined Meta and YouTube $4.2m and $1.8m for building addictive features in their products. That’s a first. NY Times “I think AI-type tools will actually revolutionize the experimental side of math, where you don’t care so much about individual problems and the process of solving them, but you want to gather large-scale data about what things work and what things don’t.” Terence Tao The hedonic treadmill (which roughly quantifies a Buddhist principle) says that we revert to a happiness set point (which varies by individual). Worse, those who experience a high kick (e.g. a lottery) don’t get enough kick from normal wins (contrast effect) – Interactive explainer. The happiness neutral As of today, a LinkedIn search for “llm psychologist” lists 9 people. I’m not alone! Anand S, LLM Psychologist, Singapore, Singapore Anshul Saxena, PhD, AI Advisor & Trainer | Technology Strategist | LLM Psychologist | Currently teaching humans, machines & business to work smarter through Generative AI and Quantum Computing | 15+ Years Experience, Pune, Maharashtra, India Charitarth (Chad) Sindhu, LLM Psychologist / Fractional Business & AI Workflow Consultant/ Digital Nomad, Tokyo, Japan Lancelot Salavert, LLM Psychologist, Barcelona, Catalonia, Spain Lior Dor(Durahly), Team Lead | Bug Banisher | Ex 8200, Tel Aviv District, Israel. Past: R&D Team Lead and LLM Psychologist at Superwise | A Blattner Tech Company maxime bodereau, Lead Creative Art Director | UX Forensics | Ai LLM Psychologist | Visual Alchemist | Codesmith | Brandologist | Full Stack Designer, Nantes, Pays de la Loire, France Mei Chen 🦋, LLM Psychologist | Lead Product Engineer | Delivering Agentic Experiences, Toronto, Ontario, Canada Shoshannah Tekofsky, LLM Psychologist at AI Digest, Zwolle, Overijssel, Netherlands LinkedIn Member, LLM, psychologist, mediator, Prague, Czechia OpenAI acquired Astral!. 
This will likely slow down the new wonderful tools accelerating the Python ecosystem. Like with PromptFoo and OpenClaw, this seems to be about talent. The “acqui-hire” mode seems a clear niche career path now, and an alternative to getting hired (you get a much higher salary) or getting acquired (you take on much higher risk). quickjs-emscripten lets you run isolated JS code securely in the browser, Cloudflare Workers, NodeJS, and Deno. It compiles to WASM. @sebastianwessel/quickjs is a higher-level TS wrapper. Simon Willison Manyana is a CRDT based version control system. It sounds like a good idea but I’m sceptical because merge conflicts are a “what should I do” problem more than “how”. With agents doing more merge conflict management, I am not sure this will offer a concrete benefit - but probably no harm either. LLMs are able to post-train LLMs on new topics. They’re improving fast. Jack Clark Vibe Coding Fixer and AI Slop Cleaner are real job descriptions - which are morphing into enterprise offerings. But I still seem to be the only official LLM Psychologist. Notes from AI Services - Wrong Mental Models, Right Moment: AI services have 3 markets. Automatable work: vanishes in 2 years. Human-in-the-loop work: sustains. Judgement-driven: grows in importance. YC: don’t sell access to a tool for $50 a month, use the AI yourself and sell the finished work for $5,000. Sell output. Price on outcome. Sell to business, not IT. Sell accountability: proven success, with your guarantee. Sell authenticity: a brand story representing uniqueness, character, … or whatever… something people respect. Data transfer between GPU and memory is a bottleneck and three approaches are emerging. Taalas is etching LLMs into the chip. Llama 8b runs at 17,000 tok/s (H200 is at 230 tok/s). d-Matrix is moving compute into SRAM memory chips. 30,000 tok/s for Llama 70b. Cerebras and MatX are similar: memory-oriented. FuriosaAI minimizes data movement. Groq and Sambanova are similar. 
But in the long run, commodity technology usually beats integrated stacks. GPT 5.4 Nano ($0.2/MTok) and Mini ($0.75/MTok) are good options for bulk OCR, transcription, etc. as cost and quality comparable alternatives to Gemini Flash Lite and Gemini Flash. They can describe 75K photos for $50. Both models are better than GPT-5 Mini on most benchmarks. Cool AI coding agent git prompt fragments: Use git bisect to find when this bug was introduced: … Find and recover my code that does … Sort out this git mess for me. Rewrite history removing … Split the last commit into multiple commits grouped logically. Start a new repo at … and build just this module … based on … with a similar commit history copying the author and commit dates. Campaigns Are Knowledge Workers and the Tools Just Caught Up. A powerful framing. I saw this in action a few days ago when a friend was able to automate an outbound campaign with Claude Code. EARS (Easy Approach to Requirements Syntax) is a simple structure for requirements. For example, “Users should be able to drag tasks between columns. The app needs to work offline too. Handle errors gracefully.” becomes the following - which AI can convert to and is easier to spot errors in. State machines and decision tables are useful alternatives, too. REQ-001 (Event): When the user drags a task card to a different column, the system shall update the task status to match the destination column. REQ-002 (State): While the application is offline, the system shall store task updates in local storage. REQ-003 (Event): When the application reconnects, the system shall synchronize locally stored updates with the server. REQ-004 (Unwanted): If synchronization conflicts occur, then the system shall display a resolution dialog to the user. As of now, avoid using Claude.ai to create (large) visualizations. It runs forever and exhausts credits without generating anything. Claude Code works much better for this.
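The EARS example in this entry follows recognizable sentence templates, so a rough checker is easy to sketch. The snippet below is a minimal illustration, not an official EARS grammar: the regexes and the `classify_requirement` name are assumptions made here, and only the three templates quoted above (Event, State, Unwanted) are covered.

```python
import re

# Keyword patterns for the three EARS templates quoted in the notes.
# Labels match the example (Event, State, Unwanted); the regexes are
# illustrative assumptions, not a complete or official EARS grammar.
EARS_PATTERNS = [
    ("Unwanted", re.compile(r"^If\b.+\bthen\b.+\bshall\b", re.IGNORECASE)),
    ("Event", re.compile(r"^When\b.+\bshall\b", re.IGNORECASE)),
    ("State", re.compile(r"^While\b.+\bshall\b", re.IGNORECASE)),
]


def classify_requirement(text: str) -> str:
    """Return the first matching EARS label, or 'Unclassified'."""
    sentence = text.strip()
    for label, pattern in EARS_PATTERNS:
        if pattern.search(sentence):
            return label
    return "Unclassified"


requirements = [
    "When the application reconnects, the system shall synchronize locally stored updates with the server.",
    "While the application is offline, the system shall store task updates in local storage.",
    "If synchronization conflicts occur, then the system shall display a resolution dialog to the user.",
    "Handle errors gracefully.",
]
for req in requirements:
    print(f"{classify_requirement(req):<12} {req}")
```

A vague requirement such as "Handle errors gracefully." falls through to "Unclassified", which is the kind of gap the structured form makes easier to spot.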

Things I Learned - 22 Mar 2026

This week, I learned: Psychological operations in design by Narendra Ghate: When lights are dimmed people speak softer. So, dimming lights reduces sound levels in noisy offices. Rather than reduce the size of shampoo sachets (which customers and business both hate), include 2 shampoos in one sachet, tearable in the middle. Price sachets at 95p with a 5p deposit for the sachet - which rag-pickers can collect and return to the retailer. People think of stains like wounds on cloth. So a “stain band-aid” where you stick a strip, and remove it after 5 min to remove the stain, is catchy. A mechanical wind-up fish that stirs the water in the bucket while clothes are soaking speeds up the process. Senthil & Amutha, founders of Payir demonstrated a re-usable fabric calendar that converts into a bag for re-use. Pretty clever! Their message at the Chennai Design Festival was that good design can be for the masses and by the masses to reclaim their time, energy, and joy. The urinary bladder works based on involuntary muscular contractions towards the end, to clear out the last bits of fluid. It’s not fluid flow, it’s muscle contractions. (Oh, the things I learn!) Gemini Indigo bans ghee in cabin baggage. Also coconuts, pickles, oily foods, gooey cakes, spices (masala, powders), strong-smelling food. ChatGPT New skill unlocked: how to demo without knowing what you’re demo-ing. STEP 1: Copy-paste all demo pages as Markdown. STEP 2: Tell AI “Here is a demo I’ll be showing. (Add context.) Tell me how I should explain this and what I should point out as specific examples. Use concise bullets.” We’ve learnt not to do things we don’t know how to (until we learn it). When AI is doing things, this is a bottleneck. Get out of the way. Stop filtering for what YOU can do. Stop learning what IT can do. Ask for it. That’s faster. Learning can come later. I keep forgetting that QR codes need a white border for them to work. TerraDraw provides a unified API across multiple mapping libraries. 
(In the vibe-coding era, this is not as useful.) To create desktop apps declaratively on Linux, Slint, Flutter, QML(Qt) and GTK4 are options. Slint and Flutter seem to be cross platform. Slint is newer, less mature but compiles to small fast binaries and might be a good option to explore. Flutter seems more mature and fairly popular. Claude PyTorch Tracing watches one forward pass and freezes the path into a portable recipe. But it silently ignores branches your example didn’t take. Claude The Internet is forking into a human internet vs an agent web LinkedIn SamGeo is a Python Package for geospatial image processing. While OlmoEarth provides geospatial embeddings, SamGeo can convert geospatial data to vector data! So you can do things like: Create the outer boundary of all apartments with swimming pools in a city Extract the shape of all lakes across the years to find out how they’re changing. Terence started Foundation for Science and AI Research (SAIR) to use AI in science research. Verifiable proofs (e.g. LEAN) are a big part of this. Since AI needs to run on phones and that needs GPUs, a lot of phones might need replacement in the next few years.

Things I Learned - 15 Mar 2026

This week, I learned: Timsort is one of the fastest sorting algorithms. Switching from bat to moor as a pager, since bat doesn’t support wrapping via keyboard shortcuts. Gemini “Use (some-command) --help to …” is an efficient prompt prefix that tells agents to read the docs and use a CLI tool to solve a problem. For example, “Use uvx rodney --help and ffmpeg for a demo video of GitHub PRs”. As agents improve, we’ll have more mediocre output (e.g. dashboards) since people won’t know to ask for better, or validate the result. They’ll hire experts who know to ask better and verify better. Claude Opus 4.6 solved a problem Knuth was working on! Knuth Cognitive debt is what Simon Willison calls it when we build (or, in my case, say/write) stuff we don’t understand. The debt framing is apt. One solution is to generate a version intended for AI to read, and another for us. How can an innovator learn accountability? “I’m wired to start fires. Should I learn to also run the fire department, hire someone who does, or just stay a fire-starter and let others deal with the mess?” ANS: First, accountability is high value, so do it! Second, prefer a partner over building muscle. Build muscle only if output is checkable, has value, and customers will pay. Claude | ChatGPT | Gemini Commit publicly. Put your name on the output. Commit to process (or narrowly defined output) rather than outcome. Optimize with data, code, checklists, workflows, culture, etc. OpenAI released gpt-realtime-1.5 and gpt-audio-1.5. Both are ~20% cheaper than the 4o versions, but 6.7x more expensive than gpt-realtime-mini. 1 second is about 10 tokens, so an hour of audio input at $32/MTok is about $1.15. The “Effort” setting for AVIF files on Squoosh doesn’t reduce file size - it increases quality slightly (for a tiny increase in file size). So, set the quality to whatever file size you need and increase the effort for a slightly better quality. 
Polya believed in teaching problem-solving rather than solutions, i.e. teach How to Solve It, not just what you get at the end. To me, this includes: Understand the problem (from different perspectives) Plan (with different mental models) Execute (the easy bit) Look back (post-mortem, retrospectives, etc.) Browserless lets you run browsers via an API. Useful when you don’t want the overhead of setting up a browser infrastructure, or for multiple browsers in parallel. Scraping, testing, web app automation, PDF/screenshot/video generation, etc. are all possible. Gemini OpenAI has a Websocket mode GitHub Agentic Workflows lets you “compile” a Markdown file into an agentic GitHub action. Useful as a sceptical reviewer, issue-to-prototype builder, data to story generator, automated code migrator, etc. Gemini Claude
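The audio pricing note in this entry is simple arithmetic, and it is worth writing down so it can be re-run when prices change. This is a small sketch using only the figures quoted above (about 10 tokens per second, $32 per million tokens); the function name and parameter names are made up for illustration.

```python
def audio_input_cost(seconds: float,
                     tokens_per_second: float = 10,
                     usd_per_mtok: float = 32.0) -> float:
    """Estimated cost in USD for audio input of the given duration."""
    tokens = seconds * tokens_per_second
    return tokens * usd_per_mtok / 1_000_000


# One hour = 3,600 s -> 36,000 tokens -> 36,000 * 32 / 1,000,000 USD
hour_cost = audio_input_cost(3600)
print(f"${hour_cost:.2f} per hour of audio input")
```

The 10 tokens/second rate is the rough figure from the note, so treat the result as an estimate rather than a billing calculation.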

Things I Learned - 08 Mar 2026

This week, I learned: IITM has launched a 4 year degree in management & data science. “Use AI to replace early-career mentorship: use AI-driven synthetic practice when traditional apprenticeship pathways collapse. AI can generate personalized coaching, replacing the missing junior loop with training environments.” Jack Clark Observability is more than logging. It’s agents watching feeds and signalling insights! The GPT 5.4 prompt guidance is a bit complex, but here’s what it’s broadly saying: (Gemini) It’ll over-complicate answers and front-end design unless you tell it exactly how you want it It’ll keep checking with you or give up (e.g. on errors) unless you tell it otherwise, e.g. with checklists or rules Claude Code supports 32K output tokens by default. Since I generate large data stories, I usually hit this limit and lose an entire session. Setting the environment variable CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000 (which is the maximum) reduces this problem. Google Workspace CLI lets you run npx -y @googleworkspace/cli as a single unified service for all Google Workspace APIs. It follows agent-friendly CLI practices which I turned into a SKILL.md. I’ve been using mise use -g ubi:owner/repo to install GitHub packages. The ubi backend is now deprecated in favor of the new github backend. This works fine for most repos, with edge cases like jtroo/kanata which still require ubi:jtroo/kanata as of now. On the margin, I’ll likely switch to just as my task runner. Claude With AI now writing almost all of my code, I don’t see much need to format it. Code formatters like ruff, dprint, biome, etc. are not relevant when AI will be reading and writing the code, not humans. I just format the prompts in Markdown. Salt is the duct tape of food ingredients. Lemon juice, vinegar, butter/oil, onion/garlic, etc. are runners-up. Claude Claude’s prompt to import memory from other AI providers doesn’t seem to work with Claude’s free account: “No memories or stored context found.”

Things I Learned - 01 Mar 2026

This week, I learned: unidown is a Rust CLI tool that converts Markdown to Unicode characters - useful for LinkedIn. 3 years into Nestle, Sangeeta Talwar (who was selling Maggi soup cubes) took the “Maggi Instant Noodles” (popular in Malaysia), changed it to “2-minutes”, realized that noodles are fun for kids to play with, invented the masala flavor, positioned it as easy for moms, distributed hanging baskets (rodent-safe, brand visibility) at stores, marketed on TV and in stores, etc. Gemini Nano Banana Pro 2 is out. Better text, better instruction following. codespelunker is a fast CLI code search tool. Just run cs for an interactive search. It feels light and fast, like ug. lobste.rs Shadow IT is unpaid R&D, not a security threat. When frustrated marketing or sales teams secretly buy their own software tools and bypass the IT department, traditional companies try to ban them. Transformed companies study them. “Shadow IT” is a highly accurate heat map pointing exactly to where your current systems are failing and where the immediate business value lies. Source: CIO.com, Gartner: Business-Led IT Coding agents have introduced a “Usage” page to check your usage: Claude usage and ChatGPT usage. Both have weekly limits and 5 hour rolling limits - with Codex’s being more generous. This aggregates usage across the coding agents as well. However, Codex has a GitHub Code Review quota that is separate from this.

Things I Learned - 22 Feb 2026

This week, I learned: tree-sitter is a fast incremental parser generator. That means you can use it to create a parser for any language that works even if there are errors, e.g. malformed JSON, Python, etc. It’s used by most editors. For example, tree-sitter-python is a fast forgiving Python parser. There are official parsers and community parsers Programming Languages: All popular ones, less popular ones like Ada, Fortran, Lua, Zig, … and even niche / domain-specific languages (Gleam, TLA⁺). Markup & Data Formats: HTML, XML, Markdown, JSON, YAML, TOML, CSV, … Query, Scripting & Config: SQL, GraphQL, Bash, Dockerfile, Regex, Terraform (HCL), … Ligature fonts are nice, but it might not be worth forming a habit out of. Claude Cloudflare introduced Markdown for Agents. This converts websites from HTML to Markdown via Accept: text/markdown for any Cloudflare endpoint which has enabled this feature. This requires a Pro account. Microgrants is a list of microgrants programs - where you can give small amounts of money, e.g. $50 - $1K as well as large fellowships over $100K. This includes student grants, creative & community grants, tech grants, social & policy grants, etc. “Animated web formats are simply video codecs … stripped of their most powerful feature.” A .webm file is likely to compress much better than an animated .webp, etc. Gemini esbuild can compile CSS files to support old browsers, e.g. nested rules, custom properties, etc. Usage: esbuild input.css --target=chrome90 --outfile=output.css. Julia Evans New jargon I learnt: Human-On-The-Loop. Treasure In Treasure Out VS Code’s GitHub Copilot extension supports a github.copilot.chat.commitMessageGeneration.instructions setting that lets you add a [{"text": ...}] or [{"file": "path/to/file.ext"}] prompt to the commit message generation. I’ve pointed this to my git-commit.md custom prompt.
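The Cloudflare "Markdown for Agents" note in this entry comes down to sending an `Accept: text/markdown` request header. Here is a minimal sketch with the standard library. The helper names are hypothetical, and whether a given site returns Markdown depends on that site having the feature enabled, so the code reports the returned `Content-Type` rather than assuming it.

```python
import urllib.request


def markdown_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks the server for Markdown."""
    return urllib.request.Request(url, headers={"Accept": "text/markdown"})


def fetch_markdown(url: str, timeout: float = 10.0) -> tuple[str, str]:
    """Return (content_type, body). The body is Markdown only if the
    server honours the Accept header; otherwise it is usually HTML."""
    with urllib.request.urlopen(markdown_request(url), timeout=timeout) as resp:
        content_type = resp.headers.get("Content-Type", "")
        body = resp.read().decode("utf-8", errors="replace")
    return content_type, body


# Example (network access needed; the URL is a placeholder):
# content_type, body = fetch_markdown("https://example.com/")
# print(content_type, body[:200])
```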

Things I Learned - 15 Feb 2026

This week, I learned: ffmpeg lets you concatenate files without needing a separate input file. ffmpeg -i "concat:input1.ext|input2.ext|input3.ext" -c copy output.ext works as long as the files use the same codecs and parameters. There is a psychological phenomenon where we “overlay” old images of people we haven’t seen in decades onto their current selves, making it hard to distinguish between someone who is 30 and someone who is 70. Gemini Most modern ls tools like eza --icons or lsd support icons if the terminal font supports icons, like Nerd Fonts. For example, this:  shows up as a GitHub icon and 󰌻 as a LinkedIn icon. The Nerd Fonts Cheat Sheet is a good place to search for these. You may need to download a supporting font. I just replaced Fira Code with Maple Mono as my default font on VS Code. Like Fira Code, the ligatures are great, but there are extra ligatures like [TODO] or [ERROR], connected italics, nerd font support, variable font weights, and more. Via lobste.rs. (Update: Maple Mono is much harder to read than Fira Code, so I switched back. But it’s a nice idea.)
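The ffmpeg `concat:` form above is easy to get wrong when the file list is built by hand, so here is a small sketch that assembles the same command from a list. The function names are made up for illustration; the ffmpeg arguments are exactly the ones quoted in the note.

```python
import subprocess


def concat_command(inputs: list[str], output: str) -> list[str]:
    """Build the ffmpeg argument list for the concat protocol."""
    if not inputs:
        raise ValueError("need at least one input file")
    if any("|" in path for path in inputs):
        # "|" is the separator in the concat: URL, so it cannot appear in a name.
        raise ValueError("input paths must not contain '|'")
    return ["ffmpeg", "-i", "concat:" + "|".join(inputs), "-c", "copy", output]


def concat_files(inputs: list[str], output: str) -> None:
    """Run ffmpeg; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(concat_command(inputs, output), check=True)


print(concat_command(["input1.ext", "input2.ext", "input3.ext"], "output.ext"))
```

Passing the arguments as a list avoids shell quoting problems with spaces in file names. As far as I know, the concat protocol joins files at the byte level, so it suits containers such as MPEG-TS better than MP4.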

Things I Learned - 08 Feb 2026

This week, I learned: The Disconnected Git Workflow explains how to use the git send-email workflow. That’s like using email instead of GitHub as the collaboration mechanism - decentralizing and reducing dependencies. Grok throws an HTTP 431 when you pass it a query over 6,890 characters in the URL. Here’s an example with 6,900 characters. As of now, there’s no way to tell uv to use the cache and install only missing repos (#15454). But this is Deno’s default behavior, making Deno a slightly better choice in this regard. Shelling Out Sucks shares common pitfalls when calling the shell from programs. Suggestion: Shell-escape ALL inputs. Use set -o pipefail to detect failures in the middle of a pipe chain. Explicitly check the error code, not just stderr. dax, which is based on zx, is a simpler Deno-based alternative to shell scripts. See examples. ChatGPT. However, scripting language matters more when humans maintain shell scripts. Since I’m using AI, it’s easier to use bash scripts and let it handle any complexities. git push --force-with-lease is like git push --force but won’t overwrite if others have pushed in the meantime. Default to this instead of --force – it’s safer. Microsoft’s docfind generates a WASM search index for documents, giving you a dependency-free, browser-based search that is compact and fast. diffs seems a promising library for rendering diffs in the browser. Genie 3 seems pretty good. We should expect to see World Models becoming usable in a few months.
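The three "Shelling Out Sucks" suggestions in this entry (escape all inputs, use `set -o pipefail`, check the exit code) can be combined in one small helper. This is a sketch, not the article's code: `build_pipeline` and `run_pipeline` are hypothetical names, and it assumes `bash` is on the PATH.

```python
import shlex
import subprocess


def build_pipeline(stages: list[list[str]]) -> str:
    """Join argv-style stages into one shell pipeline string.

    Every argument goes through shlex quoting, and pipefail makes the
    pipeline fail if ANY stage fails, not just the last one.
    """
    if not stages:
        raise ValueError("need at least one stage")
    return "set -o pipefail; " + " | ".join(shlex.join(stage) for stage in stages)


def run_pipeline(stages: list[list[str]]) -> str:
    """Run the pipeline under bash and return stdout."""
    result = subprocess.run(
        ["bash", "-c", build_pipeline(stages)],
        capture_output=True,
        text=True,
    )
    # Check the exit code explicitly; some tools write warnings to stderr
    # on success, and some fail without printing anything.
    if result.returncode != 0:
        raise RuntimeError(f"exit {result.returncode}: {result.stderr.strip()}")
    return result.stdout


print(build_pipeline([["cat", "my file.txt"], ["wc", "-l"]]))
```

When no pipe is needed, passing a plain argument list to `subprocess.run` without a shell sidesteps quoting entirely.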

Things I Learned - 01 Feb 2026

This week, I learned: Android screen recorder is the easiest way to record phone and WhatsApp calls. But that won’t work for Google Meet, Teams, Zoom, etc. Gemini exiftool remains the best media metadata extractor (music, images, …) though it’s old, slow, and Perl-based. exiftool -csv -r ~/Music/ > music.csv exports all metadata as CSV. Installing the source via https://sourceforge.net/projects/exiftool/files/latest/download seems best. It’s a good alternative to mp3tag / puddletag UI-based exports. ChatGPT Gemini ⭐ Some questions are for us to learn. Some are Socratic, and meant for the answerer to learn. When working with AI agents and interns, I find myself asking them several questions that I don’t want to know the answer for, but is important for them along their journey. Roughly the equivalent of “Think step by step” converted into the Socratic method. For example: Instead of “Build a demo for this client”, ask “Who is the audience? What’s their objective?” and THEN ask for a demo. Instead of “Generate a dummy dataset for X”, ask “What interesting insights would we want when analyzing X?” and THEN ask for a dataset. Instead of “Write this code”, ask “What’s the best architecture for this?” and THEN ask for code. Executable Markdown files with Unix pipes sounds like a clever idea. Prefix Markdown files with #!/usr/bin/env codex (or claude -p). Then, just write programs by describing them. Quotes from Isles of the Emberdark: Really, he should have known better than to punch a senator. Important people had underlings you punched on their behalf, and he should have found one of those. ChatGPT Canvas has a cool feature for editing documents or code. Just select a portion, ask for changes, and it edits it. Importantly, it’s very fast. Greeking Out is a kid-friendly National Geographic podcast about ancient Greece and its influence on modern life. fly.io containers at sprites.dev seem impressive. You can SSH into them. They have public & private HTTPS URLs. 
It auto-sleeps after 30s. You can checkpoint any time and restore the ENTIRE system. It’s FAST! This is great for agents. Just install Claude Code / Codex and other tools. Checkpoint it. Then ssh into it and use as required. The cost is typically ~12c/hour - which is expensive to run forever but great for bursts. Simon Willison I’m seeing the Collider Bias in action (on a small sample). The developers who can communicate well don’t code as well, and vice versa. Not because there’s a negative correlation - but because I’m eliminating people who can neither code nor communicate. But interestingly, over a 1-3 month horizon, the ones who code start communicating much better but the ones who communicate well don’t start coding much better. My theory is that the developers I work with are more communication-bottlenecked (e.g. lack of confidence) than unskilled (e.g. poor communicators). Prefer Zod for TypeScript validation and Ajv for schema validation. Typing has a lot of value, but don’t overdo it. It’s best used at fragile boundaries. ChatGPT ⭐ Notes from LLM poetry and the “greatness” question: Gwern follows this process to create good poetry. It’s a good structure for ANY kind of expert workflow with LLMs today: Analyze the style, content, and intent of the original. Brainstorm 10+ different directions the poem could go. Emphasize diversity. Critique each direction. Rate 1-5 stars. Write the best one. Critique and edit line by line. Generate a new clean draft. Repeat at least twice. Print final version. “As a poet and scholar of poetry I feel comfortable arguing that Gwern’s work engineering prompts is, in effect, writing poetry.” Mercor uses expert poets to create a rubric. Models generate poems that experts grade, which refines the rubric, which trains the model. But models tend to the mean and need nudges (from humans?) to surface outliers and ascribe meaning (uniquely human?), which is where greatness lies. 
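The collider bias observation earlier in this entry is easy to reproduce with a toy simulation. The sketch below assumes coding and communication skill are independent standard-normal scores, which is a simplification chosen for illustration. Candidates who are below the cutoff on both are dropped, mirroring "eliminating people who can neither code nor communicate".

```python
import random
from statistics import fmean


def pearson(xs, ys):
    """Pearson correlation, written out to avoid version-specific helpers."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20_000, cutoff=0.0, seed=0):
    """Return (correlation among everyone, correlation among those kept)."""
    rng = random.Random(seed)
    # Independent skills: knowing one tells you nothing about the other.
    people = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    # Keep anyone who clears the cutoff on at least one skill.
    kept = [(code, comm) for code, comm in people if code > cutoff or comm > cutoff]
    return pearson(*zip(*people)), pearson(*zip(*kept))


everyone, selected = simulate()
print(f"all candidates: {everyone:+.2f}, after filtering: {selected:+.2f}")
```

The full population shows roughly zero correlation, while the filtered group shows a clearly negative one, even though nothing about the individuals changed.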
Ethan Mollick: “I keep warning that so many of our systems are still built around the assumption that quality writing and analysis are costly and therefore meaningful signals. Our systems are very much not ready for the revelation that this is no longer true, as this planning objection AI shows.” Basically, AI lowers the cost of Government and Corporate interactions. It’d be a cool hack to agent-ify these to death, i.e. do all kinds of Government / Corporate interactions that were painful earlier, but now are much easier. I just realized: “Will AI take my job?” is a variant of “Will immigrants take my job?” or “Will affirmative action take my job?” Any increase in labor capacity is a threat. But then, the only way to get promoted is if someone takes your job. So, maybe we should ask: “How do I become their boss?” Better yet, tell your boss “I created a 4-agent team and got 2X done. Give me a new title.” Some simple yet powerful AI adoption principles from Will Larson - that I’ve seen work rather well: Make tools accessible Document tips & tricks Highlight how people (especially senior leaders) are using it An analysis of 1,250 Claude user interviews indicates that: Adoption of Creatives > Workforce > Scientists. Interestingly, the identity threat and guilt of Creatives > Workforce > Scientists! Creatives feel they’re cheating, lazy, or not adding value! Scientists use it less, but it’s more a tool and THEY verify. Sceptical verification is the strongest thread. Mintlify is proposing .well-known/skills/ as the directory to store LLM skills sites want to publish. This could be an extension of the llms.txt mechanism. Open Responses is the open version of OpenAI’s Responses API. OpenRouter and HuggingFace support is a big deal, and though Google, Anthropic, Meta etc. don’t yet support it, they might. Restish converts OpenAPI specs into CLI tools - with shell completion. Combined with an OAuth CLI like oauth2c this is a great way to convert APIs to CLI commands. 
Via Vercel’s agent-browser seems a good CLI choice for browser automation, alongside playwright-cli. It may be worth switching from direct Playwright coding (on CDP). ChatGPT Capturing actions using HAR and passing it to LLMs seems like another clever way of using AI coding agents for browser automation. Via Open a browser. Open Devtools > Network and filter to HTML, XHR, WS, Other. Do what you want to automate, e.g. load LinkedIn, search, scroll, fetch next pages, etc. Devtools > Network > right click > “Save All As HAR”. Run the file through a HAR-sanitizer. Prompt: “Create a Python client to automate the actions I captured in file.har”. When any AI coding agent can build apps, value will probably migrate away from software to data, network (distribution and users), trust, taste, and physical goods. Owning these controls value. Also, infrastructure to run vibe-coded apps (e.g. auth, hosting, DB, LLM APIs, etc. bundled) will likely lead to Medium / WordPress like platforms. After 30 years of learning (and teaching) statistics, I finally found a good explanation of R². R²=80% means that ~80% of the variation in one variable is explained by the other variable. Gemini ⭐ People think numbers create trust; often they create attack surfaces. Goodhart’s Law: “When a measure becomes a target, it ceases to be a good measure.” By providing a number, you invite people to “game” the system or find the flaws in how that number was manufactured. The Precision Trap: While precise numbers can increase perceived credibility initially, they also lead to “anchoring.” If the number is even slightly off, the entire foundation of trust collapses more violently than it would for a general estimate. Statistical Literacy Gap: Most people don’t argue with “vibes,” but many will argue with “averages” if their personal experience represents an outlier. The number creates a surface for anecdotal rebuttal. Eraser.io offers an AI architecture diagram generator that creates reasonable architectures. 
It uses its own diagram-as-code DSL, competing with D2, PlantUML, and Mermaid. Exposing your workflow as a software interface productizes services businesses. For example, my auditors and immigration lawyers have portals where I can fill out forms, upload documents, see my status, etc. This standardizes their delivery, and creates a “product” moat. ⭐ Your “villains” or enemies are often alternatives/backups that have a role in the ecosystem, offering diversity/resilience when you’re wrong. Create roles and incentives for them rather than eliminating them. For example: Don’t make LLMs do all the work. Create a role for the clunky SQL whose resilience saves the day when LLMs hallucinate. Make the person who hates your prototype the Red Team Lead - to catch the flaws you miss. Make the people who reject your product the scouts / innovators - to find alternatives you miss. Neon.com is like Supabase but without auth, functions, etc. It’s just Postgres as a service. An alternative for prototypes (that I haven’t tried yet.) ChatGPT SuperTokens is an open-source self-hosted auth service that I’m hearing about more often, but haven’t tested. Seems to be ahead of alternatives like Auth.js / Better Auth. ChatGPT Bollywood Falls Out Of Love is a great visual data story on The Kontinentalist by Surbhi about the decline of romance and growth of nationalism in Bollywood genres. Recharts is a React charting library with some slick capabilities like brushing, customizable tooltips, and bar chart races. Via Rukmini - Data for India Qwen3 TTS is impressive. It voice-clones, streams, and the tone/style can be controlled via prompts. The model is small. I ran it locally without flash-attn (which I couldn’t get to work) and it took ~14 seconds to generate an audio file for 10 words on my GPU machine. 
Environment setup: uv venv --python 3.12 UV_TORCH_BACKEND=auto uv pip install -U qwen-tts DeepSeek created an external memory system for LLMs that lets them look up (instead of computing to remember) knowledge. That means CPU RAM can be used instead of GPU, models can become smaller, and training can become faster. This looks like an example of how algorithms/ideas can continue the scaling laws. Gemini via Jeremy Howard
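The R² note earlier in this entry can be checked numerically. For a straight-line fit, R² is usually defined as 1 − SS_res/SS_tot, i.e. the share of the variation in y that the fitted line accounts for. It measures association rather than cause. This is a minimal sketch with made-up data; `r_squared` is a name chosen here, not a library function.

```python
from statistics import fmean


def r_squared(xs, ys):
    """R² of the least-squares line y = a + b*x, as 1 - SS_res / SS_tot."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Variation left over after the fit vs. total variation around the mean.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up data: a weak upward trend with a lot of scatter.
print(r_squared([0, 1, 2, 3], [0, 1, 0, 1]))
# A perfect line leaves no unexplained variation.
print(r_squared([1, 2, 3, 4], [2, 4, 6, 8]))
```

In the first example the line explains about a fifth of the variation; in the second it explains all of it.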

Things I Learned - 25 Jan 2026

This week, I learned: POSSE - “Publish (on your) Own Site, Share Everywhere” - is a self-explanatory content sharing approach. 1 minute video introduction. Alternatives are: COPE: Create once, publish everywhere. POSE: Publish once, syndicate everywhere. PESOS: Publish elsewhere, syndicate to own site. PESETAS: Publish elsewhere, syndicate everything to a silo (one of the “elsewhere”s). Cory Doctorow’s essay on how Google will use AI against shoppers is a great primer on several lessons in behavioral economics. Fighting it is hard, but here is generally good advice: ChatGPT Never buy when rushed. Set a max price. Compare 2+ options (TOTAL price) early. Shop logged-out for big purchases. Vortex is supported by DuckDB. It’s a better format than parquet for analysis and querying remotely. Gemini Research suggests that insight emerges when we struggle with a problem, get stuck, then DISENGAGE (to inhibit distractions). EEGs can predict insight ~8 seconds before it happens using this pattern. Gemini Books that have suggested this are: The Art of Thought (1926) The Act of Creation (1964) The Inner Game of Tennis (1974) Hare Brain, Tortoise Mind (1997) Taoism’s Wu Wei: “muddy water, let stand, becomes clear” Zen Koans: giving students impossible riddles Books that have argued (probably incorrectly) that an impasse is a signal to push harder, not step back, are: The Protestant Ethic and the Spirit of Capitalism (1905) by Max Weber, Edison’s 99% perspiration quote The 10X Rule (2011) Grit (2016) Can’t Hurt Me (2018) We are starting to talk like LLMs. Empirical evidence of Large Language Model’s influence on human spoken communication Email support for animated AVIF is limited. Though it’s perhaps the best compression format, and Apple Mail supports it well, Outlook on Windows does not support it. GMail converts it to GIF. Also, Google Groups does not yet support it. 
When I sent an animated AVIF in a Google Group email, it was replaced by the content proxy to this nonexistent URL. Portkey Models is a repo of model related data (e.g. price, max tokens, capabilities, etc.) for a large number of models. Somewhat similar to Simon Willison’s LLM Prices. git-filter-repo is a surprisingly easy way to rewrite your git history to remove specific files (e.g. large files, sensitive files). It can preserve timestamps, messages, etc. It just filters out specific files as if they were never committed. This can reduce the size of repos dramatically.
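For the git-filter-repo note above, the usual invocation for removing files is `git filter-repo --invert-paths --path <file>`, repeated per path. Here is a small sketch that builds that command. The helper names are hypothetical, and it assumes git-filter-repo is installed and that you are working on a fresh clone, which the tool expects by default.

```python
import subprocess


def filter_repo_command(paths_to_remove: list[str]) -> list[str]:
    """Build a git filter-repo command that drops the given paths from history."""
    if not paths_to_remove:
        raise ValueError("list at least one path to remove")
    cmd = ["git", "filter-repo", "--invert-paths"]
    for path in paths_to_remove:
        # --path selects a file or directory; --invert-paths flips the
        # selection so the listed paths are removed instead of kept.
        cmd += ["--path", path]
    return cmd


def remove_from_history(repo_dir: str, paths_to_remove: list[str]) -> None:
    """Rewrite history in repo_dir. This is destructive: work on a clone."""
    subprocess.run(filter_repo_command(paths_to_remove), cwd=repo_dir, check=True)


# Placeholder paths, for illustration only.
print(filter_repo_command(["data/big.bin", "secrets.env"]))
```

After a rewrite, commit hashes change, so collaborators need to re-clone or reset to the new history.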