March 30, 2025

The vCard QR Code is The New Business Card

Arindam Roy suggested a business idea post-COVID: an online QR code for business cards. The benefits are clear. Never print or carry cards. The scanned card goes straight to the other person’s contacts. There might be social metrics we could capture as well. That idea never got past the discussion stage, though I’ve begged for it a few times. Last year, I decided to use existing tools to solve the problem. My current approach: generate a vCard QR code. ...

LLM GPU or API? The Cost Will Surprise You

Say you want to use Llama 3.3 70b Instruct. You could: Use it via an API. OpenRouter offers it at ~12 cents / MTok. Azure offers it at 71 cents. Your price may vary. Self-host it on a major cloud provider. Azure offers A100 80GB at ~$3.67 / hour. In a day, you could generate ~0.5-1M tokens. Self-host it on an emerging cloud provider. Lambda Labs offers A100 80GB at ~$1.79 / hour. Again, ~0.5-1M tokens a day. Clearly, self-hosting is cheaper if you run it continuously. Let’s say we run for 1 million tokens every day. Then: ...

It's so easy to fool yourself

I asked 4 LLMs to “Suggest catchy rhyming slogans to promote journaling.” Here are the suggestions: GPT 4.5 Write it right, day or night! Reflect, inspect, journaling perfect! Express to impress, journaling’s best! Clear your mind, write and unwind! Pen your zen, again and again! Daily pages, inner sages! Take a pause, write your cause! Emotions flow, journals grow! Write, ignite, your inner light! Ink it, think it, don’t just blink it! Pen your quest, ease your stress! From confusion to conclusion—journal your solution! Keep a book, take a look, watch your growth hook! Thoughts on paper, worries taper! Write to heal, make it real! Claude 3.7 Sonnet Write it down, turn that frown upside down! Pen your thoughts, connect the dots. Journal every day, keep the stress away. Ink your mind, treasures you’ll find. Write and reflect, your future self will thank you direct. Scribble and doodle, your mind will stay in the noodle. Journal with flair, show yourself you care. Jot it in ink, give yourself time to think. Fill the page, release the rage. Write your story, capture the glory. Daily writing, life enlightening. Express in a line, your day will shine. Thoughts on paper, life feels safer. Write with heart, that’s the smart start. Journaling’s neat, makes life complete. Gemini 2.5 Pro Thoughts unclear? Write them down here. Let the ink help you think. Untangle your mind, one line at a time. To make things clear, keep your journal near. When your head’s in a knot, give journaling a shot. Feeling stressed? Put your pen to the test. Write it down, let worries drown. Ease your mind, leave troubles behind. Let your feelings take flight, just sit down and write. Turn the page on inner rage. Write to know, watch yourself grow. Explore your soul, take control. Pen and page, turn a new age. Discover your way, write every day. Know your heart? Make journaling your art. Capture the day in your own way. Make moments last, write down the past. Don’t let memories stray, write them today. Reflect and connect, gain self-respect. Life moves fast? Make memories last. Get in the flow, write and grow. Pen in hand, understand. Daily lines, clear designs. Just take note, stay afloat. Write your heart, make a fresh start. Grok 3: Write your day, let thoughts play! Pen in hand, your mind expands! Journal your way, seize the day! Feel the groove, let words move! Scribe your soul, make it whole! Ink flows free, just be me! Page by page, unlock your sage! Write it down, wear the crown! At first glance, GPT 4.5 didn’t impress me. Claude 3.7 Sonnet did. I also didn’t like Gemini 2.5 Pro, but Grok was great. ...

AI Imagined Languages Unlock New Thoughts

I prompted Gemini 2.5 Pro Experimental 03-25 to “Invent a new, alien language. Use it to discover unusual insights. Here’s the output in full. Thinking Here’s a thinking process that could lead to the alien language “Glimmering” and its insights: Understand the Core Request: The user wants a new, alien language and wants to use it to discover unusual insights. This means the language needs to be fundamentally different from human languages, and its structure/concepts should enable thinking in ways that are difficult or non-intuitive in English (or other human languages). Brainstorm Alien Concepts: What makes something “alien”? Biology: Different senses (echolocation, electroreception, sensing magnetic fields, different light spectrum perception), different lifecycles (metamorphosis, symbiosis, collective consciousness), different physical forms (non-corporeal, energy-based, silicon-based, plant-like) Environment: Extreme conditions (gas giant atmosphere, deep ocean trench, vacuum of space, high radiation), different physics (maybe?), non-linear time perception. Psychology/Society: Collective vs. individualistic focus, different emotional spectrum, logic-based vs. empathy-based, non-hierarchical structures, different concepts of self, time, causality. Choose a Core Concept for the Language: Let’s pick something that inherently forces a different perspective. Idea 1: Focus on relationships/connections. Instead of nouns for objects, words describe the relationship between things. “Tree” might be “earth-sky connector” or “shade-giver.” Idea 2: Focus on states of change/flux. Everything is described by its process, not its static state. “River” is “downward-flowing-water-process.” Idea 3: Focus on sensory perception, but a Maybe based on resonance, vibration, or emotional energy. Idea 4: Focus on potentiality and probability. Words describe not what is, but what could be or the likelihood of something. Let’s go with Idea 3, leaning towards resonance and interconnectedness, maybe linked to light or energy. This feels suitably “alien” and potentially insightful. Let’s call the species the “Lumin.” Develop Language Principles based on the Core Concept (Resonance/Light/Interconnectedness): No Static Nouns: Objects aren’t fixed things but patterns of energy or resonance. Words describe the pattern or the interaction. Verbs are Primary: Action, interaction, and change are fundamental. Static description is secondary. Emphasis on Relationality: How things resonate with each other is crucial. Prepositions or affixes will be vital for showing connection/influence. Nuance through Intensity/Quality: Instead of just “red,” maybe “vibrant-red-resonance” vs. “fading-red-resonance.” Adjectives become qualities of the resonance/action. Concept of “Self”: Maybe less individualistic. “Self” is a locus of interacting resonances, not a discrete entity.Time: Perhaps less linear, more about overlapping patterns or echoes of resonance. Create Basic Lexicon and Grammar Sketch: Core concept words: Glim (verb root): To resonate, to shine, to exist as a pattern of energy. Sha (particle/prefix): Connection, interaction, resonance with. Vor (particle/suffix): Intensity, vibrancy, strength of pattern. Zil (particle/suffix): Fading, weakening, diminishing pattern. Kai (concept): A stable, recognizable pattern (closest to a noun, but dynamic). Ree (concept): Flow, change, movement of energy/resonance. Nuum (concept): The interconnected field/medium through which resonance travels (the “ether” or “fabric of reality”). Grammar Ideas: Sentence structure: Often Verb-like concept first, then modifiers showing relationships and qualities. Glim-sha-Kai-Vor (Strong pattern resonates with…). No strict tense: Use particles to indicate persistence (-vor), fading (-zil), or initiation (Ree- prefix?). Pronouns: Maybe based on resonance signature? Me could be Kai-An (This-pattern), You could be Kai-Sha (Interacting-pattern). Collective We could be Sha-Glim-Vor (Strong-interacting-resonance). Name the Language: Based on the core concepts: “Resonance Speak,” “Light Tongue,” “Pattern Weave.” Let’s go with something evocative: Glimmering (or Glim-Ree in the language itself). Translate Concepts and Look for Insights: Now, apply Glimmering to human concepts. “Love”: In English, often a noun (a state) or verb (an action). In Glimmering? Sha-Glim-Vor (Strong-interacting-resonance). Insight: Love isn’t a possession or a static feeling, but an active, powerful resonance between patterns. It requires mutual participation and reinforcement. Lack of interaction (Sha-Glim-Zil) means the “love” fades, not because it was “lost,” but because the resonance weakened. “Identity/Self”: English emphasizes a stable, internal self. In Glimmering? Kai-An-Ree (This-pattern-in-flux). Or perhaps Sha-Nuum-Glim (Resonance-interacting-with-the-field). Insight: The “self” isn’t a fixed entity but a constantly changing pattern defined by its interactions with everything else. Identity is relational and dynamic, not intrinsic and static. There’s no core “you” separate from your connections and resonance. “Knowledge/Truth”: English often treats these as objective, attainable things. In Glimmering? Kai-Sha-Glim-Vor (A strong, stable resonance between a pattern [observer] and another pattern [observed]). Or Nuum-Ree-Kai (Recognizing a flow/pattern within the field). Insight: Truth isn’t absolute but a measure of the strength and stability of the resonance between the observer and the observed. “Knowing” something is actively resonating with its pattern. Disagreement isn’t “wrongness” but a difference in resonance patterns (Sha-Glim-Kai- dissonance_marker). Understanding requires aligning or acknowledging different resonances. “Death”: English sees it as an end, cessation. In Glimmering? Kai-Zil-Nuum (Pattern-fading-into-the-field). Or Glim-Ree-Sha (Resonance-flowing-elsewhere/transforming). Insight: Death isn’t an absolute end but a transformation. The specific pattern (Kai) dissolves, its energy/resonance (Glim) rejoining the larger field (Nuum) or contributing to new patterns (Ree-Sha). It emphasizes continuity of energy/information, just in a different form. The “loss” is the cessation of a specific interactive pattern. “Problem/Conflict”: English frames it as an obstacle. In Glimmering? Sha-Glim-Dissonance (Interacting-resonance-with-clashing/instability). Insight: Conflict isn’t inherently “bad” but a state of unstable or disharmonious resonance. The “solution” isn’t necessarily elimination, but finding a way to shift the resonances (Ree-Sha-Glim) into a more stable or harmonious pattern, potentially creating something new in the process. Refine and Elaborate: Flesh out the examples. Add more nuance. Make the insights clearer. Ensure the language structure consistently supports the insights. Add a concluding thought about the value of such different perspectives. Ensure the “alienness” is maintained. For instance, avoid direct human analogies where possible, focusing on the resonance/pattern concept. Add a brief description of the hypothetical speakers (the Lumin) to ground the language’s origin. Self-Critique: Does the language really lead to unusual insights, or are these just philosophical points rephrased? Is the language structure consistent? Is it different enough from human languages? Adjustment: Emphasize how the language structure forces this thinking, e.g., the lack of static nouns requires thinking in terms of processes and relationships. Make the link between grammar and insight explicit. Make the insights feel less like standard philosophy and more like consequences of a fundamentally different way of parsing reality. Output Okay, let’s invent an alien language designed to generate unusual insights. We’ll call the species the “K’tharr,” dwellers of a dense, perpetually twilight gas giant’s upper atmosphere, communicating through modulated bioluminescent patterns and subtle gravitational waves (perceived through specialized organs). Their reality is fluid, interconnected, and based on flows rather than static objects. ...

Things I Learned - 30 Mar 2025

This week, I learned: Discussion with Vedang Recurse center (Brooklyn, online) is a 6/12 week free self-driven programmer retreat. Runs every 6 weeks. You can do whatever you pick. There are daily standups for accountability. The groups are diverse. You can pair with them, pivot ideas, whatever. Principles: push yourself & learn. Western education techniques (e.g. spaced repetition, adaptive learning) are very much present in Indian coaching systems, though not known by those names. However, interventions are hard since class 12 students just don’t have enough time. Coaching classes are a social phenomenon. It’s not the smart students who are pulling in their friends. Smart students actually follow the popular students. (Coaching classes are below the typical smart students’ standards.) Monetizing coaching is hard. People don’t want to pay for advice, and welcome free advice only if they ask for it. Coupling with execution is necessary. Aider’s integrations make it more powerful than Cursor/Windsurf. It auto-lints, runs test cases. Allows different models for “architecting” (generating changes) vs “editing” (applying code). It reads from the screen logs. Context is manual, not automated. Uses an ai! comment to trigger changes and ai? to ask questions. Cline.bot is another Cursor-like open source AI code editor that’s a VS Code plugin. When coding with LLMs, a useful workflow is: data schema ➡️ interfaces ➡️ LLM-generated test cases ➡️ code. ShellSage is a tmux based LLM tool for the command line. It screen-grabs from tmux, which is powerful. Some MCPs that have proven useful: vega-lite, SQLite, sequential thinking, memory make sucks but is hard to beat. just comes closest. CRDTs are more powerful than for just collaborative editing. It can power a peer-to-peer Internet (beginning with office tools). Versioning schema is still problematic. yjs is a good start but automerge (Rust, WASM) is faster and may be better. Loro is another. Fermyon hosts WASM serverless functions. If LLMs are most safely used where there’s no definitive “wrong” answer, here are low-risk industries and safe LLM use cases within each: Marketing and Advertising: Ad Copy and Campaign Content Generation, Personalized Marketing Messages, Creative Strategy Brainstorms, Automated Marketing Production (Everyday Wins) Customer Service and Support: AI-Powered Chatbots for Common Queries, Agent Assist and Email Drafting, Summarizing and Analyzing Customer Feedback, Interactive Troubleshooting and FAQs Retail and eCommerce: AI-generated Summary of Product Reviews, Product Description and Catalog Content Generation, Visual Content and Image Captions, Personalized Shopping Recommendations (Narrative Form) Human Resources and Talent Management: Job Description and Policy Writing, Resume Screening and Candidate Q&A, Employee Communications and Feedback, Training and Onboarding Content Education and E-Learning: Personalized Explanations and Tutoring, Content Creation: Stories, Examples, and Analogies, Practice Problems and Quiz Generation, Automated Grading and Feedback Media and Entertainment: Writing and Editing Assistance, Personalized Media Content, Localization and Dubbing Scripts, Content Moderation and Curation (Assistive) Finance and Banking: Market Commentary and Research Summaries, Client Communications and Explanations, Regulatory Compliance Summaries, Scenario Analysis and Planning Management Consulting and Strategy: Research and Insight Generation, Document and Slide Drafting, Brainstorming and Scenario Planning Legal Services: Drafting Contracts and Legal Documents, Legal Research Q&A and Summaries, Client Communications and Explanations Reflecting on Satya Nadella’s “SaaS is dead”, building or porting apps’ functionality into classic chatbots (e.g. via MCPs) would be an emerging market. E.g. “Create a HubSpot MCP. Do whatever you want on HubSpot, except via ChatGPT or your favorite LLM chatbot.” To be fair, such interfaces exist. HubSpot MCP with a vega-lite MCP and a few others could solve many common HubSpot UI tasks. DarwinBox MCP, ZenDesk MCP, etc. are emerging. 13 things I would have told myself before building an autorouter has a few interesting points: The A* algorithm finds the shortest path in a graph much quicker than others like Dijkstra’s algorithm by preferring nodes closer to the goal. Spatial Hash Indexing are O(1) and beat Tree Data Structures which are O(log n). Always prefer hashes when possible. There’s an actual convention for using emojis in Git commits: gitemoji. It even has a VS Code plugin, a changelog generator, and more. Emojis have a strong role in enhancing Markdown documents. The ones I use often are: 🔴🟡🟢 for low/medium/high priority ⭐️ or ❤️ or 👍 for ratings or emphasis ✅ for completed tasks 💡 for ideas ⚠️ or ❗️ for warnings / issues Technological innovations have always been changing art forms. For example, the perspective grid and the camera obscura led to major improvements in realistic paintings in the 15th and 17th centuries. regex is an officially recommended Python library with better regex support than re. Ref Notes from ThursdAI - Mar 27 Gemini 2.5 Pro has good instruction following despite long context. It automatically thinks for longer where required. Good at understanding large codebases. Very fast. You can upload a 2 hour audio to transcribe with timestamps. ai.dev is the shortcut to Google AI studio. ChatGPT native image generation is the best image generation model now. - Great character consistency AND prompt adherence thanks to autoregression and not using stable diffusion. - It tends to refuse image generation less than Dall-E. (While Ghibli-style is possible, Calvin and Hobbes strips are blocked.) “We added a refusal which triggers when a user attempts to generate an image in the style of a living artist.” Addendum to GPT-4o System Card - A neat personalization implication is that you could put your kids into their favourite cartoon as a cartoon character that looks like them. It’s weird that the latest GPT 4o is ahead of GPT 4.5 on LM Arena. The new DeepSeek V3 is about as good as GPT 4.5 and VERY cheap (27c), so is the obvious choice to run on OpenRouter. MCP news: Qwen.ai supports MCP in the UI! (But it’s marked as “coming soon” in my case.) Unlike tools, MCP uses servers that can remember the state or context. Tools are stateless. MCP app store like Smithery, MCP.run, Glama, are mushrooming. Awesome MCP Servers is another good starting point. Azure lets you expose agents as MCP servers. ChatGPT now uses semantic VAD. I interrupts less and typically when you have meaningfully complete something. It responds a little slower as a result. AI generated images created from prompts cannot be copyrighted. News US Copyright Office LLMs are much better at GeoGuessr than humans. arXiv. Gemini leads the pack and is ~3x better at continents, 9x better at countries, and 37x better at cities. Gemini 2.5 Pro transcription has accurate timestamps and bounding boxes. Simon Willison Notes from Writing with AI Personal writing with connection won’t go away. AI can’t give you heartbreak. But the rest of non fiction writing will vanish. What AI is extraordinary at is personalizing to each audience member’s interest Outlier opinions will thrive among humans - since AI is trained on consensus. Managers tend to be good at working with LLMs because it’s mostly about delegation. LLMs are perfect for things that don’t have a wrong answer! – Benedict Evans. 💡 Explore arguing with AI. It’s a safe way to get into a confrontational emotional state (which has its own benefits.) 💡 Keep an LLM on in voice mode while reading and ask it any questions you have. What models are good for what? GPT 4.5 is great for creation - has a great sense of humor but a corporate style. Still, way better than GPT 4o. ChatGPT is good for voice transcription and note taking. (Increasingly we take notes for AI rather than ourselves.) Claude 3.7 has the best style of writing. It’s also great for drawing charts. O1 Pro and Deep Research is great for consumption - research. Grok is the least corporate, able to argue with you, and the latest knowledge cutoff. ElevenLabs for editing podcasts in your voice, making corrections. Playwright offers an MCP server. https://simonwillison.net/2025/Mar/25/playwright-mcp/ The new GPT-4o mini Transcribe model is a bit better than Whisper and costs half: ~18 cents per hour. It includes background noise cancellation and semantic chunking, which is useful. The new GPT-4o mini TTS is about 3-4 times cheaper than TTS-1 since it’s ~$12/MTok instead of $15/Mchar. It supports emotions with streaming. Cursor with Claude 3.7 Max seems surprisingly good at generating multi-page sites at one shot. Potentially, it can edit large repositories of code as well at one shot. If that’s the case, the way we write code will require higher order thinking skills: broad sweeping changes rather than micro edits. I tried Open WebUI with its Knowledge feature. In short, it sucks. Due to the RAG technique as well as model quality. When I passed it my notes about Straive and asked who Straive’s clients were: Open WebUI with Gemma 3 found one - after multiple attempts ChatGPT with o3-mini-high got 5 (missing nothing.) ChatGPT with GPT 4.5 got 4 Gemini with Gemini 2.0 Flash Thinking got 3 Gemini with Gemini 2.0 Flash got 3 (with a 4th wrong answer) I’ve settled on squoosh.app for image compression using WebP. I’m exploring FreeImage.host for image hosting instead of Imgur for WEBP support. FreeImage.host also seems reliable, retains file sizes, and supports hotlinking. DeepFace currently seems the easiest option for face detection. Easy to install. Multiple back-ends. Gemini Codrawing is a popular Hugging face space that lets you sketch something and prompt Gemini Flash to improve on it. Draw a dead man beside the pool of blood. Add an armor to the attacker. Significantly improve the quality of this picture. Add a red pool of blood next to the dead man. The armor looks like a frock. Make it more like an armor. Make this look like a professional drawing, even though it’s in stick figures. Draw it in the style of Picasso Phi-4 multimodal procehttps://huggingface.co/microsoft/Phi-4-multimodal-instructsses speech better than Whisper V3 on HuggingFace OpenASR, and images better than Gemini Flash Lite On any LLM project, BEGIN with evals. Always. The effort for evals may seem high. Use LLMs to reduce this effort. Include irrelevant questions because people WILL ask them. Be clear on how to handle that.