Gemini copies images almost perfectly

Summary: Nano Banana Pro is much better than recent models at copying images without errors. That lets us do a few useful things, like: Pre-process images for OCR, improving text recognition by cleaning up artifacts while preserving text shapes exactly. Convert textbook raster diagrams into clean vector-like images that vectorizers can process easily. Create in-betweens for cartoon animations. Copy torn, stained 1950s survey maps into pristine, high-contrast replicas with boundary lines preserved pixel-perfectly. Redraw sewage map blueprints or refinery blueprints into clean schematics, separating the “pipes” from the “background noise”. … and more! GPT Image 1.5 has a good reputation for drawing exactly what you tell it to. ...

Gemini Scraper

Gemini lets you copy individual responses as Markdown, but not an entire conversation. Copying the whole conversation is useful if you want to save the chat for later, pass it to another LLM, or publish it. So I built a bookmarklet that scrapes the entire conversation as Markdown and copies it to the clipboard. SETUP: Drag the bookmarklet to your bookmarks bar. USAGE: On a Gemini chat page, click the bookmarklet. It copies the chat as Markdown. ...

Learnings from building Babbage Insight

Here’s a great post by Karthik Shashidhar on why they shut down Babbage Insight, and the learnings from the experience. (I’m reproducing in full here since LinkedIn is hostile to content.) I added ⭐ to points I found most interesting. As the more perceptive of you would have figured out by now, we are shutting Babbage Insight. When I told this to one of my old friends, his immediate reaction was “so what were your learnings from this experience?”. And so I decided to write this. ...

LinkedIn is hostile to content

It’s incredible how hostile LinkedIn is for reading / writing content. Posts containing links to external websites (like my blog) get significantly less reach. That’s why you see links in comments, not the post! You can’t copy content from posts on their mobile app. You can’t even easily select the entire article on the web app! Selecting a part, and then shift-clicking elsewhere (which works almost everywhere) doesn’t work. Also, the copied text isn’t clean. It’s filled with hidden text (e.g. “Skip to search”), duplicated text (e.g. author name repeated), and other junk. It’s hard to export content. For example, the export feature does not include the original links in your articles, nor the links to images you posted! It’s hard to scrape content. LinkedIn actively tries to prevent scraping, and their TOS prohibits it. No formatting. You have to embed unicode characters. Search is terrible. You can’t search for posts by keyword, date, or author easily. No public posting - so you need to log in to read anything. ...

Baba Is You

I have this feeling that the skills we need for the AI era might be found in video games. (Actually, no. I just want an excuse to play games. Self-improvement is a bonus.) I asked the usual LLMs (Claude, Gemini, ChatGPT): What are mobile phone games that have been consistently proven to be educational, instructive or skill/mental muscle building on the one hand, and also entertaining, engaging and popular on the other? ...

Things I Learned - 18 Jan 2026

This week, I learned: Vulture is a neat library that finds unused Python code. uvx vulture script.py works fairly well, out of the box. This helps when cleaning up AI-edited scripts that often have left-over code or imports. One of the lightest alternatives to Google Analytics is GoatCounter. If you just want page views, referrers, browsers, OSes, countries, and devices, it’s great. It’s privacy-friendly (no cookies), open source, easy to self-host, free for small sites, and the data is exportable. The number of countries that allow visa-free entry to Indian passport holders is gently growing in Asia (Kazakhstan, Thailand, Sri Lanka, Malaysia, Iran, and Philippines). Lessons from performance books. Claude Summary: In early days, explore, sample. Then narrow based on interest & fit. Practice hard and persist. ⭐⭐⭐⭐ Range (David Epstein): In changing environments (rules shift, feedback is noisy/late), sample broadly, i.e. generalize. Specialization vs generalization. Nobel laureates have more hobbies. Olympic athletes have fewer. Shift nurses have the same hobbies as non-shift workers. Hobbies help expertise in some areas. Rewarding ONLY what succeeds locks behavior, halts exploration. Vary / delay incentives. Reward AFTER figuring out what works. Reinforcement and rewards. Maybe “orderly” people specialize and creative people generalize? So pick what aligns with personality? ⭐⭐⭐ Peak (Anders Ericsson & Robert Pool): Compounded practice at the edge of competence, with good immediate feedback, helps 14-26%. But talent (genetics, upbringing, brainpower) differentiates more at the expert level. Slow, effortful practice (spaced recall, interleaving topics, self-testing) builds lasting knowledge - but looks inefficient and doesn’t help with exams. Learning and long-term retention. “Easy” 10K hours don’t help. ⭐⭐ Grit (Angela Duckworth): predicts roughly the same as conscientiousness (18%). It predicts success in stable paths moderately (but brainpower, etc. matter too). 
But premature grit hurts. Quit if it helps. But environment can defeat grit. Lessons from attention economy books. Claude The attention economy is real. It is designed to capture our mind, and it is winning. Distractions hurt MUCH more than we think. Batching and focus time help. Privilege helps. The rich have more control over these than the poor do. ⭐⭐⭐⭐ Deep Work (Cal Newport, 2016) and ⭐⭐⭐ Digital Minimalism (Cal Newport, 2019): control the tools. Focus time, digital detox, embrace boredom. This helps - when you can afford to. ⭐⭐⭐ Indistractable (Nir Eyal): control yourself. The problem is internal (also true), so build habits, since willpower depletes (hm… not really). ⭐⭐⭐ How to Do Nothing (Jenny Odell, 2019): reject. Embrace boredom as resistance. This helps - when you can afford to. ⭐⭐ Stolen Focus (Johann Hari, 2022): regulate & rebel. The problem is systemic and external (also true). Reclaim your interface. BTW: Goldfish have excellent attention spans and memory :-) Lessons from trauma books. Claude ⭐⭐⭐ The Body Keeps the Score (Bessel van der Kolk, 2014): trauma recall shuts down the speech area. Eye movement desensitization and reprocessing (EMDR) helps. So does CBT, despite what the book says. Yoga helps only a little, and neurofeedback has too little data. ⭐⭐⭐ What Happened to You? (Bruce Perry & Oprah Winfrey, 2021): calm people down before talking. Strong connections help more than a therapist. ⭐⭐ The Myth of Normal (Gabor Maté, 2022): trauma causes cancer (no), autoimmunity (partly), ALS (?), etc. ⭐ It Didn’t Start with You (Mark Wolynn, 2016): maybe anxiety is epigenetic and hereditary? Unproven. Family Constellation Therapy is wrong. ⭐⭐ My Grandmother’s Hands (Resmaa Menakem, 2017): maybe racism is a somatic (body) response to generational (epigenetic) trauma? Too little data. ⭐⭐ No Bad Parts (Richard Schwartz): maybe we’re not one person but a collection of parts, and Internal Family Systems (IFS) therapy helps? 
Unclear. ⭐⭐⭐ Maybe You Should Talk to Someone (Lori Gottlieb): our memory is unreliable and therapy is messy. Connection & compassion help. Most of these are based on the contested Polyvagal Theory (PVT): the nervous system scans for danger before the mind can process it. But the specific claims of the theory are wrong and it makes no other falsifiable claims. The nervous system has hierarchical responses to threat: 🟢 Not unique to PVT. Social connection regulates physiology: 🟢 Not unique to PVT. Unconscious threat detection (neuroception): 🟡 Weak evidence. Mammalian brain (ventral vagal system) is uniquely mammalian: 🔴 Lungfish have it. Reptilian brain (dorsal vagal) “shutdown” causes dissociation: 🔴 No evidence. RSA directly measures vagal tone: 🔴 Contested. Reptiles are “asocial”: 🔴 Wrong. Trauma causes body changes too. It’s not just the mind. Childhood trauma persists. Relationships (connection & compassion) help more than therapy.
What constitutes tax residency in India? For an Indian citizen, as I understand it (after 2 hours of research):
If you were in India >= 182 days: Resident*.
Else, if you left India this year for employment: NRI.
Else, if you are an Indian Citizen living abroad (visiting or not):
    If Indian Income <= ₹15 Lakhs: NRI.
    Else if you were in India >= 120 days AND >= 365 days in the last 4 years: RNOR.
    Else if you are not liable to tax in any other country: RNOR.
Else, if you left India for non-employment (students, tourism) and were in India >= 60 days AND >= 365 days in the last 4 years: Resident*.
Else: NRI.
If you ended up as a Resident*:
    If you were NRI in 9 of the last 10 years OR in India <= 729 days in the last 7 years: RNOR.
    Else: ROR (Resident & Ordinarily Resident).
For all practical purposes, RNOR is like an NRI. You pay tax only on Indian income, not global income. It’s like a transition status for returning NRIs.
AVIF compresses better than WebP and may be the “next big thing”. I will be switching for all future images.
Squoosh remains my choice of compressor and Ezgif’s AVIF maker and GIF to AVIF are handy.
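The tax residency rules above read like a chain of if/else checks, so here is a minimal Python sketch of them. It only transcribes the rules as listed, in the same order. The `Profile` class, its field names, and `residency_status` are hypothetical names made up for illustration. One case is an assumption: the list does not say what happens to a citizen living abroad who matches none of the three sub-rules, so the sketch treats that case as NRI.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Inputs for one tax year. All field names are hypothetical."""
    days_in_india: int                     # days spent in India this year
    days_last_4_years: int = 0             # days in India over the last 4 years
    days_last_7_years: int = 0             # days in India over the last 7 years
    nri_years_in_last_10: int = 0          # how many of the last 10 years were NRI
    left_for_employment: bool = False      # left India this year for employment
    living_abroad: bool = False            # Indian citizen living abroad (visiting or not)
    left_for_non_employment: bool = False  # left as a student, tourist, etc.
    indian_income_lakhs: float = 0.0       # Indian income in ₹ lakhs
    taxed_elsewhere: bool = True           # liable to tax in some other country


def residency_status(p: Profile) -> str:
    """Return "NRI", "RNOR" or "ROR" by walking the listed rules in order."""
    resident = p.days_in_india >= 182
    if not resident:
        if p.left_for_employment:
            return "NRI"
        if p.living_abroad:
            if p.indian_income_lakhs <= 15:
                return "NRI"
            if p.days_in_india >= 120 and p.days_last_4_years >= 365:
                return "RNOR"
            if not p.taxed_elsewhere:
                return "RNOR"
            return "NRI"  # assumption: the list does not cover this case
        resident = (
            p.left_for_non_employment
            and p.days_in_india >= 60
            and p.days_last_4_years >= 365
        )
        if not resident:
            return "NRI"
    # Resident*: decide between RNOR and ROR
    if p.nri_years_in_last_10 >= 9 or p.days_last_7_years <= 729:
        return "RNOR"
    return "ROR"
```

For example, `residency_status(Profile(days_in_india=200, days_last_7_years=2000))` returns `"ROR"`, while the same profile with `nri_years_in_last_10=9` returns `"RNOR"`.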

Can AI Replace Human Paper Reviewers?

Stanford ran a conference called Agents for Science. It’s a conference for AI-authored papers, peer reviewed by AI. They ran three different AI systems on every paper submitted, alongside some human reviewers. The details of each of the 315 papers and reviews are available on OpenReview. I asked Codex to scrape the data, ChatGPT to analyze it, and Claude to render it as slides. The results are interesting! I think they’re also a reasonably good summary of the current state of using AI for peer review. ...

The Periodic Table by Primo Levi and Randall Munroe

I read The Periodic Table by Primo Levi, written in Randall Munroe’s style. Here is the conversation. I began with the prompt: Rewrite the first chapter Primo Levi’s The Periodic table in the style of Randall Munroe. Same content, but as if Primo Levi had written it in Randall Munroe’s style. After that, for each chapter, I prompted: Continue! Same depth, same style. ...

NPTEL Applied Vibe Coding Workshop

For those who missed my Applied Vibe Coding Workshop at NPTEL, here’s the video. You can also read this summary of the talk or read the transcript. Or, here are the three dozen lessons from the workshop: Definition: Vibe coding is building apps by talking to a computer instead of typing thousands of lines of code. Foundational Mindset Lessons “In a workshop, you do the work” - Learning happens through doing, not watching. “If I say something and AI says something, trust it, don’t trust me” - For factual information, defer to AI over human intuition. “Don’t ever be stuck anywhere because you have something that can give you the answer to almost any question” - AI eliminates traditional blockers. “Imagination becomes the bottleneck” - Execution is cheap; knowing what to build is the constraint. “Doing becomes less important than knowing what to do” - Strategic thinking outweighs tactical execution. “You don’t have to settle for one option. You can have 20 options” - AI makes parallel exploration cheap. Practical Vibe Coding Lessons Success metric: “Aim for 10 applications in a 1-2 hour workshop” - Volume and iteration over perfection. The subscription vs. platform distinction: “Your subscriptions provide the brains to write code, but don’t give you tools to host and turn it into a live working app instantly.” Add documentation for users: First-time users need visual guides or onboarding flows. Error fixing success rate: “About one in three times” fixing errors works. “If it doesn’t work twice, start again - sometimes the same prompt in a different tab works.” Planning mode before complex builds: “Do some research. Find out what kind of application along this theme can be really useful and why. Give me three or four options.” Ask “Do I need an app, or can the chatbot do it?” - Sometimes direct AI conversation beats building an app. Local HTML files work: “Just give me a single HTML file… opening it in my browser should work” - No deployment infrastructure needed. 
“The skill we are learning is how to learn” - Specific tool knowledge is temporary; meta-learning is permanent. Vibe Analysis Lessons “The most interesting data sets are our own data” - Personal data beats sample datasets. Accessible personal datasets: WhatsApp chat exports; Netflix viewing history (Account > Viewing Activity > Download All); local file inventory (ls -R or equivalent); bank/credit card statements; screen time data (screenshot > AI digitization). ChatGPT’s hidden built-in tools: FFmpeg (audio/video), ImageMagick (images), Poppler (PDFs). “Code as art form” - Algorithmic art (Mandelbrot, fractals, Conway’s Game of Life) can be AI-generated and run automatically. “Data stories vs dashboards”: “A dashboard is basically when we don’t know what we want.” Direct questions get better answers than open-ended visualization. Prompting Wisdom Analysis prompt framework: “Analyze data like an investigative journalist” - find surprising insights that make people say “Wait, really?” Cross-check prompt: “Check with real world. Check if you’ve made a mistake. Check for bias. Check for common mistakes humans make.” Visualization prompt: “Write as a narrative-driven data story. Write like Malcolm Gladwell. Draw like the New York Times data visualization team.” “20 years of experience” - Effective prompts require domain expertise condensed into instructions. Security & Governance Simon Willison’s “Lethal Trifecta”: Private data + External communication + Untrusted content = Security risk. Pick any two, never all three. “What constitutes untrusted content is very broad” - Downloaded PDFs, copy-pasted content, even AI-generated text may contain hidden instructions. 
Same governance as human code: “If you know what a lead developer would do to check junior developer code, do that.” Treat AI like an intern: “The way I treat AI is exactly the way I treat an intern or junior developer.” Business & Career Implications “Social skills have a higher uplift on salary than math or engineering skills” - Research finding from mid-80s/90s onward. Differentiation challenge: “If you can vibe code, anyone can vibe code. The differentiation will come from the stuff you are NOT vibe coding.” “The highest ROI investment I’ve made in life is paying $20 for ChatGPT or Claude” - Worth more than 30 Netflix subscriptions in utility. Where Vibe Coding Fails Failure axes: “Large” and “not easy for software to do” - Complexity increases failure rates. Local LLMs (Ollama, etc.): “Possible but not as fast or capable. Useful offline, but doesn’t match online experience yet.” Final Takeaways “Practice vibe coding every day for one month” - Habit formation requires forced daily practice. “Learn to give up” - When something fails repeatedly, start fresh rather than debugging endlessly. “Share what you vibe coded” - Teaching others cements your own learning. “We learn best when we teach.” Tool knowledge is temporary: “This field moves so fast, by the time somebody comes up with a MOOC, it’s outdated.”
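
The “Lethal Trifecta” rule in the Security & Governance lessons above boils down to a three-way AND: any two of the capabilities are tolerable, all three together are not. Here is a minimal Python sketch of that check. `AgentSetup`, its fields, and `has_lethal_trifecta` are hypothetical names for illustration, not part of any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSetup:
    """What an AI agent can touch. Field names are hypothetical."""
    private_data: bool            # can read private data
    external_communication: bool  # can send data out (email, HTTP, etc.)
    untrusted_content: bool       # reads content that may hide instructions


def has_lethal_trifecta(setup: AgentSetup) -> bool:
    """True only when all three risk factors are present together."""
    return (
        setup.private_data
        and setup.external_communication
        and setup.untrusted_content
    )
```

An agent that reads private data and downloaded PDFs but cannot send anything out passes this check. Give it email access too and it fails.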

Finding open source bugs with Ty

Astral released Ty (Beta) last month. As a prototyper, I don’t type check much - it slows me down. But the few apps I shipped to production had bugs type checking could have caught. Plus, LLMs don’t get slowed by type checking. So I decided to check if Ty can spot real bugs in real code. I asked ChatGPT: Run ty (Astral’s new type checker) on a few popular Python packages’ source code, list the errors Ty reports (most of which may be false positives), and identify at least a few that are genuine bugs, not false positives. Write sample code or test case to demonstrate the bug. ...

Mapping The Red Headed League

Mapping The Red Headed League, by Aman Bhargava, is a fascinating reconstruction of the actual places mentioned (or hinted at) in Arthur Conan Doyle’s The Red Headed League. We cross-reference railway timetables, scrutinize Victorian newspaper reports and historical incidents, scour government records, analyze meteorological data, and, in my specific case, pore over Ordnance Survey maps to make the pieces fit. What struck me is how little London has changed, how much old data is available, and what love it takes to reconstruct such a journey! ...

Things I Learned - 11 Jan 2026

This week, I learned: Software Heritage is a non-profit that archives software. You can submit any Git repo for archival. Over 400 million projects have been archived so far. Everything Bad Is Good For You by Steven Johnson (2005) argues that pop culture isn’t all bad. But it isn’t all good either, contrary to the book’s claims. Claude Popular culture formats (e.g. video games, manga, soap operas, game shows) have grown steadily more cognitively demanding and complex. They provide a dopamine kick from problem-solving. These may have led to the Flynn Effect (rising IQs in 1990s-2000s). Or it may be due to nutrition, smaller families, education, etc. Action games correlate with visual-spatial skills. Strategy games correlate with memory, planning. But is it causation? It doesn’t always translate to real-world skills. Also, side effects are real and bad: screen-time, addiction, misinformation, etc. The purpose of a featured image in a blog post is to help readers decide whether to read it. Share the article’s output/focus (e.g. for data stories, products). Else a visual summary (e.g. sketchnote, comic capturing the essence). Else skip. Avoid stock photos. NFLSavant.com has play-by-play data for NFL games. Ten of the least well-known psychology / sociology research findings. ChatGPT Learning styles are a myth. People might prefer visual / audio / … learning but it doesn’t help learning. Mix learning modes. NotebookLM can help. Casual acquaintances help find new information or jobs much more than close friends, since they’re in different social circles. Nurture weak ties. Use a relationship architect. Tell a lie often enough and people mistake familiarity for truth. Fact-check habitually. The more you see / hear something the more you like it. (Exposure effect.) Expose to good things. When others mess up, we blame them. When we mess up, we blame the situation. (Attribution error.) Pause before judging. Sometimes, rewarding people makes them like doing it less. 
(Overjustification effect.) People who know less over-estimate their knowledge. (Dunning-Kruger effect.) Habitualize calibration via feedback and tests. People do worse when they’re afraid their failure will confirm a stereotype about their group. (Stereotype threat.) Practice emotional resets. Higher expectations lead to better performance. (Pygmalion effect.) Engineer positive expectations. Benevolent sexism (e.g. protective paternalism) can be harmful too. Scan for well-meaning bias. Liberalism => economic growth, peace and expanding rights. Also colonial violence, exclusions (women, slavery, …), and eroding community. It is vulnerable to authoritarianism (e.g. emergency powers, recessions). Since 2006, democracy has declined year after year, reversing half the progress since WW2. But alternatives are unclear. Claude Notes from The Periodic Table by Primo Levi. Pure Zinc does not dissolve easily in sulphuric acid. An impurity like Copper Sulphate pulls electrons from Zinc and offers them to Hydrogen ions, speeding up the reaction. Impurities, foreign bodies, etc. have a purpose, too. Discomfort = Information. Overcoming discomfort = Capability. Capability = Freedom. Therefore: Seeking discomfort (carefully, purposefully) = Building freedom. Simple != Easy. Simple = Clear. Clear = Actionable. Indifference often feels like malice. ⭐ Analogies have limits. (The Map is not the Territory.) When using analogies, always explore where, when and why they will break. Pay close attention near where they break. ⭐ Knowledge vanishes with people unless written down. Write “Do X. Because of Y. Unless Z changes.” The last two are critical. I could NOT have read the book without a Randall Munroe re-styling. I cried anyway. “There’s about 300-400 that were corporate assets. One watched them all the time. These are people who in 15 years could be CEO. 
There’s something about them that caught your fancy when you were in a meeting… brilliant ideas that challenged your thinking… We called them “Corporate Assets” and tracked them, to make sure we game-planned them, give them the right assignments.” Indra Nooyi, The Knowledge Project. The accesskey attribute works a bit like magic. Adding an accesskey="h" on a home page link, or an accesskey="t" on a theme toggle button automatically enables keyboard shortcuts Alt+H or Alt+T to activate them. (Varies by browser and OS, but hovering shows the shortcut!) Familiarity and recency feel like learning but they’re not. Instead: Take tests. Review (spaced repetition). Interleave learning. That’s what helps. Claude Make It Stick (Peter C. Brown, 2014), A Mind for Numbers (Barbara Oakley, 2014), Ultralearning (Scott Young, 2019), How to Take Smart Notes (Sönke Ahrens, 2017).

Self-discover LLM capabilities

Q: “How do we learn what we can do with AI agents?” Me: “Ask them!” I mean, they are probably aware of their abilities. They can search online for how other people are using them. They have access to tools (connect to GMail, write & run code, etc.) which they’re aware of, and even if not, can try out. Asking them seems a useful way of figuring out how to use them. ...

Creating data stories in different styles

TL;DR: Don’t ask AI agents for one output. Ask for a dozen, each in the style of an expert. Share what works best. AI agents build apps, analyze data, and visualize it surprisingly well, these days. We used to tell LLMs exactly what to do. If you’re an expert, this is still useful. An expert analyst can do better analyses than an AI agent. An expert designer or data visualizer can tell an AI agent exactly how to design it. ...

No response required

I asked Gemini 3 Pro to: Suggest short words that mean innovation, exploration, research, etc. Spark: A small but significant beginning. Lab: Short for laboratory; implies experimentation. … (long and pretty good list) I went with "Labs". Thanks. No response required. THINKING… Assessing User Response I’ve carefully considered the user’s latest input, recognizing their selection of “Labs” and their appreciative, but brief, response. I’m focusing on parsing the meaning of “No response required” to determine if further interaction is needed. The information should help me to understand future similar responses. ...

Humans have taught LLMs well

Human LLM Bullshitting: Humans confidently assert wrong information, from flat-earth beliefs to misremembered historical “facts” and fake news that spread through sheer conviction Hallucination: LLMs generate plausible but factually incorrect content, stating falsehoods with the same fluency as facts People-Pleasing: Humans optimize for social harmony at the expense of honesty, nodding along with the boss’s bad idea or validating a friend’s flawed logic to avoid conflict Sycophancy: LLMs trained with human feedback tell users what they want to hear, even confirming obviously wrong statements to avoid disagreement Zoning Out: Humans lose focus during the middle of meetings, remembering the opening and closing but losing the substance sandwiched between Lost in the Middle: LLMs perform well when key information appears at the start or end of input but miss crucial details positioned in the middle Overconfidence: Humans often feel most certain precisely when they’re least informed—a pattern psychologists have documented extensively in studies of overconfidence Poor Calibration: LLMs express high confidence even when wrong, with stated certainty poorly correlated with actual accuracy Trees for the Forest: Humans can understand each step of a tax form yet still get the final number catastrophically wrong, failing to chain simple steps into complex inference Compositional Reasoning Failure: LLMs fail multi-hop reasoning tasks even when they can answer each component question individually First Impressions: Humans remember the first and last candidates interviewed while the middle blurs together, judging by position rather than merit Position Bias: LLMs systematically favor content based on position—preferring first or last items in lists regardless of quality Tip-of-the-Tongue: Humans can recite the alphabet forward but stumble backward, or remember the route to a destination but get lost returning Reversal Curse: LLMs trained on “A is B” cannot infer “B is A”—knowing 
Tom Cruise’s mother is Mary Lee Pfeiffer but failing to answer who her son is Framing Effects: Humans give different answers depending on whether a procedure is framed as “90% survival rate” versus “10% mortality rate,” despite identical meaning Prompt Sensitivity: LLMs produce dramatically different outputs from minor, semantically irrelevant changes to prompt wording Rambling: Humans conflate length with thoroughness, trusting the thicker report and the longer meeting over concise alternatives Verbosity Bias: LLMs produce unnecessarily verbose responses and, when evaluating text, systematically prefer longer outputs regardless of quality Armchair Expertise: Humans hold forth on subjects they barely understand at dinner parties rather than simply saying “I don’t know” Knowledge Boundary Blindness: LLMs lack reliable awareness of what they know, generating confident fabrications rather than admitting ignorance Groupthink: Humans pass down cognitive biases through culture and education, with students absorbing their teachers’ bad habits Bias Amplification: LLMs exhibit amplified human cognitive biases including omission bias and framing effects, concentrating systematic errors from their training data Self-Serving Bias: Humans rate their own work more generously than external judges would, finding their own prose clearer and arguments more compelling Self-Enhancement Bias: LLMs favor outputs from themselves or similar models when evaluating responses Via Claude ...

Yearly Goal Tracking FAQ

I track my yearly goals by publishing and emailing them to my contacts: My year in 2020, My year in 2021, My year in 2022, My year in 2023, My year in 2024, My year in 2025. Here are questions people have asked about my goal tracking. How do you know that you have achieved the Better Husband tag? In 2024, she said that I was “definitely worse in 2023 than 2024.” ...

Scrabble image generation

AI image generation still has a long way to go. Here are two images generated by Gemini and ChatGPT from the same prompt: “Create a funny scrabble board of dysfunctional family relationships!” Gemini It’s probably showing off, with coffee stains, and spelling “DYSFUNCTIONAL” right. But “ABLOMY”? “PASSIAVE”? “RGUCT_SVA”? “SORDSP”? Most of the vertical letters are wrong. Some horizontals (“DTENSION”?) are off, too. Also: “Z” has 2 points? “C” has “C” points? “DOUBLE STTER SCORE”? “UUT SCORE SCORE” instead of “TRIPLE WORD SCORE”? ...

AI agents to hire

GDPval is a benchmark that compares how well AI does (vs experts without AI) on useful real-world tasks. In several areas, the agents outperform experts. For example, AI beats personal financial advisors, but not accountants and auditors. So I used ChatGPT / Claude to decide where to invest, but am having an accountant file my taxes. That’s a high leverage activity, especially since I might not have hired a personal financial advisor by default, and ChatGPT is certainly better than me (I’m not an expert) at personal financial advice. ...

New ways of reading books

I’m using AI to read books by: Summarizing. This tells me what the book is about, the key points it makes and the main takeaways. It also helps me decide if I want to dig deeper. Fact-checking. I can find mistakes, alternate perspectives, and biases. That’s a huge win! Re-authoring. I can write it in the style of Malcolm Gladwell, Randall Munroe, Richard Feynman, or anyone else I like. Makes dense prose much more enjoyable. So far, I’ve applied this at different levels - and I’m sure there are more possibilities: ...