July 1, 2025

I’m off for a 10-day Vipassana meditation program.
WHAT? A 10-day residential meditation. No phone, laptop, or speaking. https://www.dhamma.org/
WHEN? From today until next Sunday (13 July)
WHERE? Near Chennai. https://maps.app.goo.gl/PnGkLoZ8U6aG2RKk8
WHY? I’ve heard good things and am curious.
SURE? I’ve never been away from tech for this long. Let’s see!
I’ve scheduled LinkedIn posts, so you’ll still see stuff. But I won’t be replying.
LinkedIn

If someone asked me, “What’s changed this year in LLMs?”, here’s my list:
Prompt engineering is out. Evals are in. https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7335146366681194496/
Hallucinations are fewer and solvable by double-checking. https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7326902628490059776/
LLMs are great for throwaway code / tools. https://www.linkedin.com/feed/update/urn%3Ali%3AugcPost%3A7319277426029539329/
LLMs can analyze data. No more Excel. https://www.linkedin.com/feed/update/urn%3Ali%3Aactivity%3A7345062233996988417/
LLMs are good psychologists. https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7326504476712808449/
Image generation is much better. https://www.linkedin.com/feed/update/urn%3Ali%3AugcPost%3A7304716144379076608/
LLMs can speak well enough to co-host a panel. https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7283025621503356930/
… and create podcasts. https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7326544867734540288/
But:
LLMs are still not great at slides. https://www.linkedin.com/feed/update/urn%3Ali%3Ashare%3A7311066572113002497/
LLMs still can’t follow a data visualization style guide.
LLMs can’t yet create good sketch notes.
LLMs still can’t draw bounding boxes as well as specialized models.
Agents (LLMs running tools in a loop) can think only for ~6 min.
What’s on your list of things LLMs still can’t do? ...

LLMs are smarter than us in many areas. How do we control them? It’s not a new problem.
VC partners evaluate deep-tech startups.
Science editors review Nobel laureates.
Managers manage specialist teams.
Judges evaluate expert testimony.
Coaches train Olympic athletes.
… and they manage and evaluate “smarter” outputs in many ways:
Verify. Check against an “answer sheet”.
Checklist. Evaluate against pre-defined criteria.
Sampling. Randomly review a subset.
Gating. Accept low-risk work. Evaluate critical ones.
Benchmark. Compare against others.
Red-team. Probe to expose hidden flaws.
Double-blind review. Mask identity to curb bias.
Reproduce. Re-running gives the same output?
Consensus. Ask many. Wisdom of crowds.
Outcome. Did it work in the real world?
For example, you can apply them to: ...

How To Control Smarter Intelligences

LLMs are smarter than us in many areas. How do we manage them? This is not a new problem.
VC partners evaluate deep-tech startups.
Science editors review Nobel laureates.
Managers manage specialist teams.
Judges evaluate expert testimony.
Coaches train Olympic athletes.
… and they manage and evaluate “smarter” outputs in many ways:
Verify. Check against an “answer sheet”.
Checklist. Evaluate against pre-defined criteria.
Sampling. Randomly review a subset.
Gating. Accept low-risk work. Evaluate critical ones.
Benchmark. Compare against others.
Red-team. Probe to expose hidden flaws.
Double-blind review. Mask identity to curb bias.
Reproduce. Re-running gives the same output?
Consensus. Aggregate multiple responses. Wisdom of crowds.
Outcome. Did it work in the real world?
For example: ...
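Three of the techniques listed above (Sampling, Gating, and Consensus) are simple enough to sketch in code. This is a minimal Python sketch, not anything taken from the post: the function names, the `risk` scoring function, and the default thresholds are all hypothetical, and the "answers" are plain strings standing in for LLM outputs.

```python
import random
from collections import Counter


def consensus(answers):
    """Consensus: return the most common answer and its share of the votes."""
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)


def gate(items, risk, threshold=0.5):
    """Gating: auto-accept low-risk items, send the rest for evaluation.

    `risk` is a hypothetical scoring function returning a number in [0, 1].
    """
    accepted = [item for item in items if risk(item) < threshold]
    needs_review = [item for item in items if risk(item) >= threshold]
    return accepted, needs_review


def sample_for_review(items, fraction=0.1, seed=0):
    """Sampling: pick a random subset of items to review by hand."""
    if not items:
        return []
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    k = min(len(items), max(1, round(len(items) * fraction)))
    return rng.sample(items, k)
```

For instance, `consensus(["A", "B", "A"])` picks `"A"` with two of three votes, and `gate(outputs, risk)` leaves only the higher-risk outputs for a human or a second model to check.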

I catch up on long WhatsApp group discussions as podcasts. The quick way is to scroll on WhatsApp Web, select all, paste into NotebookLM, and create the podcast. Mine is a bit more complicated. Here’s an example:
Use a bookmarklet to scrape the messages https://tools.s-anand.net/whatsappscraper/
Generate a 2-person script https://github.com/sanand0/generative-ai-group/blob/main/config.toml
Have gpt-4o-mini-tts convert each line using a different voice https://www.openai.fm/
Combine using ffmpeg https://ffmpeg.org/
Publish on GitHub Releases https://github.com/sanand0/generative-ai-group/releases/tag/main
I run this every week. So far, it’s proved quite enlightening. ...
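The middle of that pipeline (one voice per speaker, then combining the clips) might look roughly like the sketch below. It is a guess at the shape of the steps, not the actual code in the repository. The speaker names, the voice mapping, the `Speaker: text` script format, and the file names are all assumptions. The text-to-speech call itself is left out; the sketch only plans which voice and output file each line gets, then builds the input for ffmpeg's concat demuxer.

```python
# Hypothetical speaker-to-voice mapping; speakers and voices are illustrative.
VOICES = {"Alex": "ash", "Maya": "nova"}


def parse_script(script):
    """Split 'Speaker: text' lines into (speaker, text) pairs."""
    turns = []
    for line in script.splitlines():
        if ":" not in line:
            continue  # skip blank lines and anything without a speaker
        speaker, text = line.split(":", 1)
        turns.append((speaker.strip(), text.strip()))
    return turns


def plan_audio(turns, out_dir="audio"):
    """Give each turn a voice and a numbered output file, in script order."""
    return [
        {"voice": VOICES[speaker], "text": text, "path": f"{out_dir}/{i:03d}.mp3"}
        for i, (speaker, text) in enumerate(turns)
    ]


def concat_list(plan):
    """Build the text file that ffmpeg's concat demuxer reads."""
    return "".join(f"file '{item['path']}'\n" for item in plan)


def ffmpeg_command(list_file="list.txt", output="podcast.mp3"):
    """Arguments for joining the clips without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", output]
```

Each entry in the plan would be sent to the text-to-speech model and saved to its `path`. The output of `concat_list` would then be written to `list.txt` and the command run with `subprocess.run(ffmpeg_command(), check=True)`. Copying streams with `-c copy` assumes every clip shares the same codec and sample rate.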

My Goals Bingo as of Q2 2025

In 2025, I’m playing Goals Bingo. I want to complete one row or column of these goals. Here’s my status from Jan – Jun 2025.
🟢 indicates I’m on track and likely to complete.
🟡 indicates I’m behind but I may be able to hit it.
🔴 indicates I’m behind and it’s looking hard.
Domain Repeat Stretch New
People 🟢 Better husband. Going OK 🟡 Meet all first cousins. 8/14 🟢 Interview 10 experts. 11/10 🟡 Live with a stranger. Tried homestay - doesn’t count
Education 🔴 50 books. 6/50 🟡 Teach 5,000 students. ~1,500 🟡 Run a course only with AI. Ran a workshop with AI
Technology 🟢 20 data stories. 10/20 🔴 LLM Foundry: 5K MaU. 2.2K MaU. 🟡 Build a robot. No progress. 🟢 Co-present with an AI. Done
Health 🟢 300 days of yoga. 183/183 days 🟡 80 heart points/day. Far from it 🔴 Bike 1,000 km 300 hrs. Far from it 🟢 Vipassana. 2 Jul 2025
Wealth 🔴 Buy low. No progress. 🔴 Beat inflation 5%. Not started. 🟡 Donate $10K. Ideating. 🔴 Fund a startup. Not started.
At the moment, there’s no row or column that looks like a definite win. ...