Things I Learned - 12 Oct 2025

This week, I learned: “…as few as 250 malicious documents can produce a ‘backdoor’ vulnerability in a large language model… data-poisoning attacks might be more practical than believed.” Anthropic Tim Urban’s 2015 article, The AI Revolution: The Road to Superintelligence, is surprisingly relevant. A key theme is that post artificial-super-intelligence, pretty much anything we know / predict is probably wrong. LLMs are bad at asking questions, so you need to plan on their behalf first. LLMs are bad at copy paste, so giving them a scaffolding to edit helps. Two things LLM coding agents are still bad at The VPN industry is a consolidating oligopoly that doesn’t offer much security and biases towards affiliates. Who Owns Express VPN, Nord, Surfshark? As of 2025, a fine-tuned DeBERTa-v3-Large / RoBERTa-Large model is better than an LLM at emotion classification. roberta-base-go_emotions is a good starting point if you don’t want to fine-tune. ChatGPT OpenAI defines an AI agent as “a system that can do work independently on behalf of the user”. swyx Brain coding is the new term for human coding - as opposed to vibe-coding (AI codes, human doesn’t review code) and AI coding (AI codes, human reviews code). npx -y emoj lets you type text and pick a relevant emoji. Many people who shifted away from conflict aversion did so by systematizing it. ChatGPT Martin Luther King Jr institutionalized not stepping back from conflicts in his movement. Kim Scott (Radical Candor) practiced caring more via short, specific feedback loops. Kwame Christian (Compassionate Curiosity) practiced asking open questions. Ed Catmull (Pixar) instituted Braintrust to ask candid questions. Ray Dalio (Bridgewater) instituted radical transparency. Many people who adopted a failure-seeking mindset made failure frequent, small, cheap, and informative. ChatGPT Jia Jiang ran a 100-day rejection challenge, acclimatizing himself to failure. Kim Liao (writer) moved from submission-avoidance to “100 rejections/year”. 
Reshma Saujani (Girls Who Code) built a practice of “brave, not perfect” - ship before perfect. Ray Dalio (Bridgewater) instituted mistake logs and “pain + reflection = progress”. Astro Teller (X, the Moonshot Factory) rewired incentives so teams are rewarded for killing their own ideas early. Sara Blakely (Spanx) set weekly failure quotas. Kathryn Schulz (author of Being Wrong) converts failures into teaching methods. Sindre Sorhus has already created a micro-framework css-extras using CSS @functions. Today, if I had to build agents, here are the tools and environment capabilities I’d ask for: Ask user (for clarifications) Internet tools Search Fetch (CORS-piercing) Scraper with XPath/CSS Selectors Access to llms.txt LLM APIs Summarizer (condenses chat) Sub-agents Coding tools Markdown convertor Code execution (including tests) Browser + DevTools for testing Memory / storage Tool/MCP directory with search Noting a few things that I find #impossible to do today with LLMs: LLMs can’t run experiments / explorations, like trying out on a new tool or web app in an environment, the way I would. LLMs can’t move stuff on my machine, e.g. notes from one list to another, when they’re only on my laptop, not GitHub. LLMs can’t capture the past wisdom in my head, e.g. the distilled principles of data visualization that we applied at Gramener. LLMs can’t prioritize my to-do list based on my preferences and what’s important to me. LLMs cannot write a blog post in my style of writing. When recruiting for people in the LLM era, look for questioning ability, sensible thinking, and how they use AI. Give them lots of fluff and context. Can they cut through it? Is their answer concise and to the point or waffling? Like post the industrial revolution, more people will become operators looking after AI, not craftsmen. This includes coding. zx is a nice JS-based alternative to shell scripts. 
const branch = await $`git branch --show-current`; await $`dep deploy --branch=${branch}`; docker run -it --name test --user vscode mcr.microsoft.com/devcontainers/base:ubuntu gives you a test Ubuntu image closer to a desktop / user setup rather than a server. Useful to try out apps.
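The zx snippet above interpolates ${branch} straight into a command. As I understand it, zx escapes interpolated values so that this is safe. Here is a simplified sketch of the idea using a plain tagged template. The names sh and quote are hypothetical, and this is not zx's actual implementation, only an illustration of why tagged templates suit shell scripting.

```javascript
// Simplified illustration only: NOT zx's actual implementation.
// `quote` and `sh` are hypothetical helpers for this sketch.

// Quote one value so a POSIX shell treats it as a single argument.
function quote(value) {
  const s = String(value);
  // Strings made only of safe characters need no quoting.
  if (/^[A-Za-z0-9_\/:=.-]+$/.test(s)) return s;
  // Wrap in single quotes; close, escape, and reopen around embedded quotes.
  return "'" + s.replace(/'/g, "'\\''") + "'";
}

// Tagged template: literal parts are kept as-is, interpolated values are quoted.
function sh(strings, ...values) {
  return strings.reduce(
    (cmd, part, i) => cmd + part + (i < values.length ? quote(values[i]) : ""),
    ""
  );
}

// A branch name containing spaces and shell metacharacters stays one argument.
const branch = "feature/my branch; rm -rf ~";
console.log(sh`dep deploy --branch=${branch}`);
```

The sketch only builds the command string. Running it would still need a process-spawning step, which zx handles for you.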

Tamil AI

I was testing LLMs’ sense of Tamil humor with this quote: Extend this post with more funny Tamil words that end with .ai - mentioning why they’re funny. Chenn.ai is the artificial intelligence capital of India. Kadal.ai Kad.ai Dos.ai Vad.ai Ad.ai Thal.ai Mallig.ai Aratt.ai And finally Podad.ai All spoken in namma bash.ai 😅 The Chinese models didn’t fare well. DeepSeek made up words. Mood.ai - An AI that perfectly captures your mood. Sokk.ai - The AI for when you’re bored. Thanni.ai - A hydration assistant. Qwen too. ...

How to create a data-driven exam strategy

Can ChatGPT give teachers data-driven heuristics on student grades? I uploaded last term’s scores from about 1,700 students in my Tools in Data Science course and asked ChatGPT: This sheet contains the scores of students … (and explained the columns). I want to find out what are the best predictors of the total plus bonus… (and explained how scores are calculated). I am looking for simple statements with 80%+ correctness along the lines of: ...

Vibe-Coding for Interesting Data Stories

Last weekend, I fed Codex my browser history and said “explore.” It found a pattern I call rabbit holes – three ways we browse: Linear spiral - one page > next page > next. E.g. filing income tax, clicking “next” on the PyCon schedule. Hub & spoke - hub > open tabs > back to hub. E.g. exploring DHH’s Ubuntu setup, checking Firebase config. Wide survey - source > many, many pages. E.g. clearing inbox, scanning news. Then Claude Code built this lovely data story. ...

The Non-Obvious Impact of Reasoning Defaults

Yesterday, I discovered how much reasoning improves model quality. My Tools in Data Science assignment asks students to draft an llms.txt file for ipify and auto-checks with GPT-5 Nano - a fast, cheap reasoning model. I set reasoning_effort to minimal and ran this checklist: 1. Starts with "# ipify" and explains ipify. 2. Markdown sections on API access, support (e.g. GitHub, libraries). 3. Covers API endpoints (IPv4, IPv6, universal) and formats (text, JSON, JSONP). 4. Mentions free, no-auth usage, availability, open-source, safeguards. 5. Has maintenance metadata (e.g. "Last updated: <Month YYYY>"). 6. Mentions robots.txt alignment. Stay concise (no filler, <= ~15 links). If even one checklist item is missing or wrong, fail it. Respond with EXACTLY one line: PASS - <brief justification> or FAIL - <brief explanation of the first failed item>. With a perfect llms.txt, it claimed “Metadata section is missing” and “JSONP not mentioned” – though both were present. ...
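A minimal sketch of how the one-line PASS/FAIL responses from such a checklist could be parsed and compared across reasoning settings. The helper names (parseVerdict, findMisgrades) and the sample responses are hypothetical, made up for illustration. They are not the course's actual grader or real model output.

```javascript
// Hypothetical helper: parse a verdict of the form
// "PASS - <justification>" or "FAIL - <explanation>".
function parseVerdict(text) {
  const lines = text.trim().split(/\r?\n/).filter((l) => l.trim() !== "");
  if (lines.length !== 1) {
    return { ok: false, verdict: null, reason: "expected exactly one line" };
  }
  const match = lines[0].trim().match(/^(PASS|FAIL) - (.+)$/);
  if (!match) {
    return { ok: false, verdict: null, reason: "unrecognized format" };
  }
  return { ok: true, verdict: match[1], reason: match[2] };
}

// Given raw responses keyed by reasoning setting, list the settings whose
// verdict differs from the verdict a human reviewer expects.
function findMisgrades(responsesByEffort, expected) {
  return Object.entries(responsesByEffort)
    .filter(([, raw]) => parseVerdict(raw).verdict !== expected)
    .map(([effort]) => effort);
}

// Illustrative sample strings (assumed, not real outputs).
const samples = {
  minimal: "FAIL - Metadata section is missing",
  medium: "PASS - all six checklist items are present",
};
console.log(findMisgrades(samples, "PASS"));
```

Running the same known-good file through each setting and diffing the verdicts is a cheap way to see whether a default is costing accuracy.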

Things I Learned - 05 Oct 2025

This week, I learned: Wrong answers are useful if you discover why they said that. Conversation is a game where you CO-CONSTRUCT common ground. Mike Caulfield BMTC hourly data from Bangalore Metro is available via RTI. Vivek “Find evidence for and against” improves LLM responses far more than “Are you sure?” Mike Caulfield SSH3 is an emerging SSH alternative that’s written on top of HTTP/3. It supports OAuth2, OpenID Connect, and HTTPS for certificates. Cholesterol has become a victim of its own success. We give statins to those with high LDL. So most people who have heart attacks have lower-than-natural cholesterol. Inflammation (HS-CRP) is now the strongest predictor of heart attack (American College of Cardiology). The usual stuff reduces HS-CRP: no sugar/carbs, veggies, nuts, green tea, turmeric/black pepper, weight loss, exercise, sleep, meditation. ⭐ The beginner mindset: scrub your instincts and don’t let life experience cloud you. This takes effort. Hold on to naivete and escape cynicism. The Knowledge Project: Barry Diller Forecasts give comfort. They may not be good but they feel safer than instinct. The Knowledge Project: Barry Diller My laptop’s mic is much better than my phone’s mic, surprisingly. When recording conversations, it’s better to leave my laptop open and record than use the phone’s recording app. ⭐ Here are the major not-immediately-obvious LLM megatrends/superpowers I see. Swarms. Ask for dozens of solutions in parallel. Merge, rank, auto-debate, converge. Personalize at Scale. Create feedback, designs, excerpts/summaries, … tailored to EACH person at scale. Computer use. Agents operate UIs like a human (browser, apps). LLM-as-a-judge. Use AI to validate ever-increasing AI generated output. Synthetic data. Create realistic data for prototypes, testing edge cases, market research simulation, training data, … Code on demand. Ask for outcomes directly. 
Agents code on the fly to get there, in data science, research, management, … Style transfer. Copy a master’s style of drawing, coding, writing, … creating an army of their apprentices. Multi-modality. Native voice/video/screensharing and long-context perception. Citizen experts. Non-expertise is not a barrier. Amateurs can create expert-level films, music, software, reports, … Long-context LLMs. Growing context size lets us process entire repos, legal libraries, personal lifelogs, … Memory. Assistants learn per-person / per-team. Cuts prompt, builds knowledge. Agent-to-Agent. Agents consuming content (e.g. llms.txt), agents calling agents (sub-agents, A2A protocol, …) Real-world tools. Write reports, send emails, shop online, use computer, control devices, … Jagged frontier. AI is great at certain things but terrible at others. This frontier is unknown and shifting rapidly. Lethal trifecta. You can only have 2 out of these 3: private data, untrusted content, and external communication. Edge/Private AI. Small models on private cloud compute. Authenticity. What content is authentic? What’s slop? What’s fraud? Are AI twins liable? AI Governance. Strict liability, transparency mandates, state control, … Not sure about or haven’t seen enough of these: Data / workflow as the moat AI native business models AI digital-divide ⭐ What I’d like to do next, maybe, is build a boutique “AI Studio”. Small group of good people coding delightful AI problems. Something that doesn’t scale. GLM models can be used with Claude Code. At $3/month and a quality close to Claude 4 Sonnet, this is a good deal. But the effort of adding a new subscription is too high for me. I’d rather use it via OpenRouter, which doesn’t support an Anthropic API end point at the moment. typst is a good LaTeX alternative. Markdown-like syntax with fast rendering. Mostly useful for researchers using LaTeX. But publishers / journals don’t accept typst often. 
libSQL is an SQLite compatible fork with remote access, replication, ALTER TABLE to modify columns, random ROWID, etc. It supports the same extensions. The maintainers are working on turso - a SQLite compatible improvement with async, vectors, change data capture, etc. (still in alpha). But because of this, I’m a bit uncertain about the future of libSQL. ⭐ LLM benchmarks show a correlation of ~0.5, hinting at a common theme of intelligence. Correlations in coding & science are particularly high. Ethan Mollick. Reminds me of student marks correlations. Strong correlation clusters (physics, chemistry, biology, mathematics, computer science) with the weaker correlations going down to ~0.5. What does it indicate? LLMs learn like people? Knowledge areas cluster? Humans write benchmarks like exams? Dayflow records your screen at 1 fps and uses Gemini to summarise your activity every 15 min. Has low CPU usage. ⭐ Code Mode is a smart way to use MCPs and a very likely future direction. Using LLMs to write code to call MCPs rather than directly. Cloudflare supports an AI Index which will eliminate the need for a lot of custom RAG engineering.
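For the benchmark-correlation note above, here is a minimal sketch of the Pearson correlation between two score columns, assuming that is the kind of correlation meant. The function name and the sample scores are hypothetical, not real benchmark or student data.

```javascript
// Pearson correlation between two equal-length arrays of numbers.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need two arrays of equal length with at least 2 values");
  }
  const n = xs.length;
  const mean = (a) => a.reduce((sum, v) => sum + v, 0) / n;
  const mx = mean(xs);
  const my = mean(ys);
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    sxy += dx * dy; // co-variation
    sxx += dx * dx; // variation in x
    syy += dy * dy; // variation in y
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Made-up scores for five models on two hypothetical benchmarks.
const codingScores = [42, 55, 61, 70, 88];
const scienceScores = [38, 60, 52, 75, 80];
console.log(pearson(codingScores, scienceScores).toFixed(2));
```

Computing this for every pair of benchmarks (or subjects) gives the correlation matrix in which clusters like the ones described would show up.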

The 11 sites I visit most: ChatGPT. It’s replaced Google as my default knowledge source. I prefer it over Gemini, Claude, etc. because the app has good features (memory from past conversations, code interpreter, strong voice mode, remote MCP on web app, etc.) The OpenAI models have pros and cons, but the app features are ahead of competition. Gmail. It’s my work inbox. Interestingly, I check it more (and respond faster) than social channels (e.g. WhatsApp, Google Chat, LinkedIn). It also doubles up as my task queue. WhatsApp. It’s my default phone + messaging app. A fair bit of my work communication happens here, too. Prime Video. I mainly watch The Mentalist. Totally love Patrick Jane! Google AI Studio. Mostly for transcription. It’s better than Gemini on UI, ability to handle uploads, file-formats, etc. It’s also free (though the data is used for training.) My Talks page: https://sanand0.github.io/talks/. I give 1-1.5 talks a week, mostly on AI/ML topics. I use Marp to render Markdown slides and publish it here. Google Chat. It’s Straive’s social channel. I can’t use it from my phone, so I log in only if I need to check if I missed something. LinkedIn. It’s where I post by default. I don’t use it for networking and only connect with people I’ve met and know well. YouTube. Mostly for movie clips over dinner. I occasionally watch educational content. LLM Foundry: https://llmfoundry.straive.com/. LLM Foundry is Straive’s internal gateway to multiple model APIs (I built it). I use it to experiment with models, grab API keys, and demo LLMs to clients. Squoosh. I compress every image, every time. Mostly into WebP (hands-down the best format today), typically lossless with an 8-color palette, or lossy at ~0-10% quality for photos. The list will change. But the reasons probably won’t: fast, simple, automatable, and practical (for me). ...

Tools in Data Science Sep 2025 edition is live: https://tds.s-anand.net/. Major update: a new AI-Coding section and fresh projects. I teach TDS at the Indian Institute of Technology, Madras as part of the BS in Data Science. Anyone can audit. The course is public. You can read the content and practice assessments. I fed the May 2025 term student feedback into The Sales Mind and asked: What are the top non-intuitive / surprising inferences? What are interesting observations? What are high impact actions? Full analysis: https://lnkd.in/gVWVqaxN: summary, outliers, and action ideas. ...

Vibe-Scraping: Write outcomes, not scrapers

There hasn’t been a box-office explosion like Dangal in the history of Bollywood. CPI inflation-adjusted to 2024, it is the only film in the ₹3,000 Cr club. 3 Idiots (2009) is the first member of the ₹1,000 Cr club (2024-inflation-adjusted). The hot streak was 2013-2017: each year, a film crossed that bar: Dhoom 3, PK, Bajrangi Bhaijaan, Dangal, Secret Superstar. Since then, we never saw such a release except in 2023 (Jawan, Pathan). ...
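A minimal sketch of the CPI adjustment behind figures like these, assuming the standard formula: adjusted value = nominal value × (CPI in target year ÷ CPI in release year). The function name and the numbers are placeholders, not actual box-office or CPI data.

```javascript
// Scale a nominal amount from its release year to a target year using CPI.
function adjustForInflation(nominal, cpiAtRelease, cpiTarget) {
  if (cpiAtRelease <= 0 || cpiTarget <= 0) {
    throw new Error("CPI values must be positive");
  }
  return nominal * (cpiTarget / cpiAtRelease);
}

// Placeholder example: a gross of 100 (in ₹ Cr) released when the index
// stood at 50, restated for a year when the index stands at 100.
console.log(adjustForInflation(100, 50, 100)); // 200
```

If prices doubled between release and the target year, the gross doubles too, which is why older hits climb the inflation-adjusted rankings.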

Things I Learned - 28 Sep 2025

This week, I learned: selectolax is a fast, easy-to-use, modern HTML5 parser with CSS selectors. A good replacement for lxml.html. The most effective way to convert a blob (e.g. file input) to a data URL on the browser seems to be via the FileReader API. const blobToDataURL = (blob) => new Promise((res, rej) => { const r = new FileReader(); r.onload = () => res(r.result); r.onerror = () => rej(r.error); r.readAsDataURL(blob); }); Tool calls in OpenAI support files and images. OpenAI ⭐ “Task parity is not the same thing as job parity There is a lot of complexity as many different tasks are bundled into jobs, and many jobs contribute to processes inside an organization The jagged frontier of AI ability means doing tasks well doesn’t translate to doing jobs well.” Ethan Mollick Adding // @ts-check to a JavaScript file and documenting types via JSDoc might be the simplest way to migrate phase-wise from JS to TypeScript. envsubst < file.txt prints file.txt with environment variable references substituted, e.g. $HOME is replaced by the value of the HOME environment variable. Clean shell-level templating. GitHub Copilot CLI is out. npx -y @github/copilot Compost is the cheapest thing per ton that I can buy on Amazon India. I can buy 1 ton of compost for Rs 13,500. ChatGPT yt-dlp requires Deno from now on. #14404 In meetings, make cameras optional by default – and judge engagement by contributions, not video – because a 4-week field experiment found camera-on increased fatigue and reduced voice, especially for women and newcomers. Camera on early for trust building is useful. PubMed wrkflw is a quick and light way to test GitHub actions before publishing. It runs GitHub actions locally. GPT-5-Codex is available as an API and on LLM. Simon Willison ⭐ I’m habit engineering, i.e. discovering and stacking habits on to existing ones. For example: ChatGPT suggested increasing observability based on code reviews. I’m including it in my weekly codecast. ChatGPT suggested defining closures in meetings. 
I’m now discussing objectives at meeting starts and effectiveness at the end. Since Anaconda cannot be used for free by organizations with 200+ people, Straive’s received legal notices from Anaconda. Since laptops are under central IT administration, they went ahead and deleted all Anaconda instances. Installing miniconda for use with conda-forge requires admin access that most developers do not have, however. That leads to an interesting “No Python” situation. This is where uv becomes the knight in shining armor. Perceptron is a SOTA LLM for object bounding boxes. Just 2B parameters. Gall’s “law” says that complex systems that work evolved from simple systems that worked. But a complex system designed from scratch won’t ever work. This holds in uncertain environments. But where formal theory or regulation exists, it doesn’t. ChatGPT uvx --with visidata vd gives you a command-line Excel editor to edit / convert CSV, Excel, JSON, SQLite, directories, etc. uvx markitdown https://example.com/ fetches example.com as Markdown. I learnt this when I told Codex it could use uvx markitdown to convert PDFs and it figured this part out by itself. The Dropbox connector for ChatGPT is a little flaky – at least on Android. It could not identify a file that was clearly there in Dropbox and I had to upload it manually. ChatGPT’s output is too dense for me. I added this to my custom instructions: “Write in simple language. Explain non-obvious terms intuitively.” yt-dlp has a --download-sections option that downloads specific YouTube time ranges. For example --download-sections "*00:01:00-00:03:00" downloads roughly (not exactly) from 1 min to 3 min. Note the * at the beginning. My Lenovo laptop’s touchpad started scrolling instead of moving when I moved my finger. Many things could have caused it, but the solution was to click (not tap) the top middle of the trackpad. ChatGPT The India Entrance Exam database is a dataset collating Indian entrance exams.
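To make the envsubst note above concrete, here is a rough JavaScript sketch of the same kind of substitution: $NAME and ${NAME} references are replaced from a lookup object, and unset names become empty strings. The function name substituteEnv is hypothetical, and this mimics only the basic behaviour, not every envsubst option.

```javascript
// Replace $NAME and ${NAME} in a template with values from `env`.
// Names missing from `env` are replaced with an empty string.
function substituteEnv(template, env) {
  const pattern = /\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))/g;
  return template.replace(pattern, (match, braced, bare) => {
    const name = braced !== undefined ? braced : bare;
    return Object.prototype.hasOwnProperty.call(env, name)
      ? String(env[name])
      : "";
  });
}

// Example with an explicit lookup object instead of the real environment.
const rendered = substituteEnv("Home is $HOME, user is ${USER}.", {
  HOME: "/home/anand",
  USER: "anand",
});
console.log(rendered);
```

In Node, passing process.env as the second argument would use the real environment.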

How to review trending GitHub repos on VS Code

Here’s how I track trending GitHub repos each week. I run a scheduled script that saves a clean TSV I can scan fast. It uses uvx gtrending to fetch weekly trending repos for: Rust: High-quality system tools. (Anything in Rust seems cool.) Go: Reliable CLI/infra tools. (Like Rust, most Go code seems good.) Python: Most AI/ML stuff TypeScript: Most modern JS codebases JavaScript: Most front-end utilities Shell: Productivity scripts I pipe results through jq to extract: ...

Vibe Shopping

I’ve started vibe shopping, i.e. using ChatGPT to shop for small, daily items and buying without verifying. For example: “A metal rack for the floor: at least 2 ft * 1 ft * 2 ft, small gaps, popular options on Amazon.in.” https://chatgpt.com/share/68d61d68-7040-800c-936b-354749539308 “An optical wired mouse that’s smaller than usual, 4*+, popular, Prime-eligible for Chennai by the weekend on Amazon.in.” https://chatgpt.com/share/68d61e0d-420c-800c-bc71-821b9f9296a9 The best use is when I don’t know the right terms. In this case, the terms were wire rack and mini mouse. ...

Tools in Data Science Sep 2025 edition is live: https://tds.s-anand.net/. Major update: a new AI-Coding section and fresh projects. I teach TDS at the Indian Institute of Technology, Madras as part of the BS in Data Science. Anyone can audit. The course is public. You can read the content and practice assessments. I fed the May 2025 term student feedback into The Sales Mind and asked: What are the top non-intuitive / surprising inferences? What are interesting observations? What are high impact actions? Full analysis: https://chatgpt.com/share/68cba081-afc0-800c-9da3-75222e84a499: summary, outliers, and action ideas. ...

The 10 sites I visit most often

Here are the 10 most frequent sites I use (based on Microsoft Edge’s home bar): ChatGPT. It replaced Google as my default knowledge source. I prefer it over Gemini, Claude, etc. because the app has good features (memory from past conversations, code interpreter, strong voice mode, remote MCP on web app, etc.) The OpenAI models have pros and cons, but the app features are ahead of competition. Gmail. It’s my work inbox. Interestingly, I check it more (and respond faster) than social channels (e.g. WhatsApp, Google Chat, LinkedIn). It also doubles up as my task queue. Prime Video. I mainly watch The Mentalist. Totally love Patrick Jane! Google AI Studio. Mostly for transcription. It’s better than Gemini on UI, ability to handle uploads, file-formats, etc. It’s also free (though the data is used for training.) My Talks page. I give 1-1.5 talks a week, mostly on AI/ML topics. I use Marp to render Markdown slides and publish it here. Google Chat. It’s Straive’s social channel. I can’t use it from my phone, so I log in only if I need to check if I missed something. LinkedIn. It’s where I post by default. I don’t use it for networking and only connect with people I’ve met and know well. YouTube. Mostly for movie clips over dinner. I occasionally watch educational content. Playground. LLM Foundry is Straive’s internal gateway to multiple model APIs (I built it). I use it to experiment with models, grab API keys, and demo LLMs to clients. Squoosh. I compress every image, every time. Mostly into WebP (hands-down the best format today), typically lossless with an 8-color palette, or lossy at ~0-10% quality for photos. That’s my current home row. It will change. But the reasons probably won’t: fast, simple, automatable, and practical (for me).

Voice coding is the new live coding

In Feb 2025 at PyConf Hyderabad, I tried a new slide format: command-line slideshows in bash. I’ve used this format in more talks since then: LLMs in the CLI, PyCon Singapore, Jun 2025 Agents in the CLI, Singapore Python User Group, Jul 2025 DuckDB is the new Pandas, PyCon India, Sep 2025 It’s my favorite format. I can demo code without breaking the presentation flow. It also draws interest. My setup was the top question in my PyCon talk. ...

Things I Learned - 21 Sep 2025

This week, I learned: When editing an image, ChatGPT’s non-thinking mode does a much better job of preserving the original image features than the thinking mode. When editing my photo, I found that the thinking mode creates images that look quite different from me. A surprising effect of overthinking. ⭐ When evaluating model accuracy, compare with human accuracy rather than perfect accuracy. SMEs rarely agree among themselves, so it’s unlikely that they will agree with an LLM. Instead, measure how often the LLM agrees with the majority of SMEs and how often it disagrees with all SMEs. This gives a more realistic measure of accuracy. LLMs instead of Human Judges? and Judging LLM-as-a-Judge. ChatGPT I understand at least one mechanism of how costs are inflated in large organizations. Even people who want to keep costs low find that the process of tracking expenses, submitting receipts, and answering questions around approval adds transaction cost. So, rather than going for a $10-plus top-up mechanism, I would rather ask people to take a $500 top-up. Better to ask for more and waste than have to ask again. YouTube downloaders: yt-dlp for the CLI, Stacher for Windows/Mac/Linux, Cobalt for a web-based app. Ref VS Code has a bunch of features I discovered: It has been able to run a terminal in its own new window for over a year (via Ctrl+P > Terminal: Move Terminal into New Window). Now, Ctrl + Alt + Shift + ` does this directly. Terminal Intellisense shows completion suggestions in the UI. Very helpful. Ctrl+Space triggers the menu completion. ⭐ “We find that the per-step error rate itself rises as the task progresses”, i.e. once a conversation goes the wrong way, it’s really hard to correct it. The Illusion of Diminishing Returns Japonaise Cake is the name of the pastry that I had as a child and grew up longing for. I have spent several weeks searching for it in the roadside bakeries at Bangalore and Chennai but only one bakery seems to have it. 
systemd is the modern way to run scheduled jobs, instead of cron. It’s far more complex. But it can catch up on missed runs via a Persistent option. Working with systemd timers ⭐ Vice-chancellors of universities resist AI in education because (a) their faculty does not know AI and (b) AI is unreliable. But they are interested in (a) large-scale AI-evaluation and (b) AI-enabling the entire campus. tldr.sh offers concise man pages, e.g. uvx tldr jq. cheat.sh offers detailed examples, e.g. curl cheat.sh/jq or curl cheat.sh/:help. ugrep is a fast drop-in replacement for grep. It supports fuzzy search with a customizable Levenshtein distance. Also, ug -Q shows an interactive TUI that searches like VS Code’s “Search in Files” feature. Very intuitive. Dagger lets you write CI/CD workflows in Python. I tried running it but after 7m of pulling large Docker containers, I gave up. Too heavy. dotslash lets you write scripts that download GitHub releases, cache them, and run them. Requires writing scripts. I prefer mise. ChatGPT has a quota for searches. I saw this phrase in the reasoning traces: “I’ll avoid overloading on citations since we only have a few calls left.” It doesn’t seem to be in ChatGPT’s system prompt from last month, so it’s either part of the tool response or a new prompt. Depending on the underlying chips that a model uses, the floating point multiplications may differ and model quality can vary. So Claude 4 Opus running on Anthropic’s GPUs can produce different results from when running on Google’s GPUs or Amazon’s GPUs.
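The SME-agreement idea above can be sketched in a few lines. This assumes each item has one LLM label and several SME labels, and that "majority" means a label chosen by more than half the SMEs. The function name, data shape, and sample labels are hypothetical.

```javascript
// For each item, compare the LLM label with the SME labels.
// Returns:
//   agreesWithMajority: share of items (with a strict SME majority) where
//                       the LLM matches that majority
//   disagreesWithAll:   share of all items where no SME chose the LLM's label
function agreementStats(items) {
  let hasMajority = 0;
  let withMajority = 0;
  let againstAll = 0;
  for (const { llm, smes } of items) {
    const counts = new Map();
    for (const label of smes) counts.set(label, (counts.get(label) || 0) + 1);
    // Strict majority: more than half the SMEs chose the same label.
    const majority = [...counts].find(([, c]) => c > smes.length / 2);
    if (majority) {
      hasMajority++;
      if (majority[0] === llm) withMajority++;
    }
    if (!smes.includes(llm)) againstAll++;
  }
  return {
    agreesWithMajority: hasMajority ? withMajority / hasMajority : null,
    disagreesWithAll: items.length ? againstAll / items.length : null,
  };
}

// Made-up labels for four items, three SMEs each.
const sampleItems = [
  { llm: "A", smes: ["A", "A", "B"] }, // matches the majority
  { llm: "B", smes: ["A", "A", "B"] }, // matches a minority SME
  { llm: "C", smes: ["A", "B", "A"] }, // matches nobody
  { llm: "A", smes: ["A", "B", "C"] }, // SMEs have no majority
];
console.log(agreementStats(sampleItems));
```

Items where the SMEs themselves have no majority are left out of the first ratio, since there is nothing to agree with. That is a design choice for this sketch, not something the note specifies.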

AfterSlides: Write Slides After Talks

25 years ago, Mr. Krishnan (IAS) amused us with anecdotes of bureaucrats writing meeting minutes before the meeting. This week, I flipped that. I wrote slides after the talk. I call them AfterSlides. Why. I ran a couple of Ask-Me-Anything (AMA) sessions where the audience set the agenda. I learned their interests. They got answers. No slides prepared. How. I okayed recording with the organizers, recorded on my phone, transcribed with Gemini, and asked ChatGPT to generate the AfterSlides. ...

Turning Generic Gifts Into Joy with AI

In 2001, I received a campus interview invitation from BCG. It opened like this: Dear Anand, We’d like to invite you to an interview on … We were impressed by your … … and went on to share 2-3 phrases about what they liked about my CV. A dozen of us got similar letters – each personalized! That was cool. Two decades later, I still remember it. It showed care and competence – care enough to personalize for each candidate, competence to pull it off at scale across campuses. ...

Tomorrow, we’ll be vibe-analyzing data at a Hasgeek Fifth Elephant workshop. It’s a follow-up to my DataHack Summit talk “RIP Data Scientists”. I showed how it’s possible to automate many data science tasks. In this workshop, the audience will be doing that. Slides: https://sanand0.github.io/talks/2025-09-16-vibe-analysis/ (minimal because… well, it’s “vibe analysis”. We’ll code as we go.) Here are datasets I’ll suggest to the audience: India Census 2011: https://www.kaggle.com/datasets/danofer/india-census MovieLens movies: https://grouplens.org/datasets/movielens/32m/ IMDb movies: https://datasets.imdbws.com/ Occupational Employment and Wage Statistics (OEWS): https://www.bls.gov/oes/tables.htm Global AI Job Market & Salary Trends 2025: https://www.kaggle.com/datasets/bismasajjad/global-ai-job-market-and-salary-trends-2025 Flight Delay Dataset: https://www.kaggle.com/datasets/shubhamsingh42/flight-delay-dataset-2018-2024 London House Price Data: https://www.kaggle.com/datasets/jakewright/house-price-data Exchange Rates to USD: https://www.kaggle.com/datasets/robikscube/exhange-rates-to-usd-from-imforg-updated-daily Thailand Road Accidents (2019-2022): https://www.kaggle.com/datasets/thaweewatboy/thailand-road-accident-2019-2022 … but if you’d like stories from any interesting recent datasets (10K - 10M rows, easy-to-download), please suggest in the comments. 🙏 ...

GPT-5 (Codex) follows instructions exactly as given. Usually a good thing, but sometimes, this is what happens. AGENTS.md: ALWAYS WRITE TESTS before coding. Codex: Let me begin with the tests. (Spends 5 minutes writing tests.) Anand: Stop! This is a proof of concept. We don’t need tests! AGENTS.md: Write tests before coding. Drop tests for proof-of-concepts. Codex: (Proceeds to delete all existing tests.) Anand: STOP! We need those tests! ...