Things I Learned - 13 Apr 2025
This week, I learned:
- It’s possible to intentionally train yourself to:
  - Form close friends. Care, ask, and share.
  - Become a do-er. Stay mindful of the problem or opportunity you’re deferring.
- AI Coding and the Peanut, Butter & Jelly problem: #ai-coding
  - This ability to define your desired outcome in crisp, complete terms is one of the most important superpowers of the AI era.
- The Singapore Urban Redevelopment Authority Property Data lets you search sale and rental prices of properties in Singapore. No API, though.
- Notes from meeting with Deepak Goel
  - We have linguistic boundaries in media today more than national boundaries. The Chinese language media, for example, is a very different ecosystem.
  - China culturally struggles with the exercise of branding and cultural power, unlike the West, which has adopted assertive and opinionated branding.
  - You really learn the character of a region only by traveling.
  - Similarities arise from unexpected sources. For example, Japan and Ecuador have similar cultures - both are disaster-prone locations.
  - AI unlocks so many social research possibilities that were not possible before, e.g. by interpreting and classifying what people share in different situations.
  - Companies send clients to third-party trainings (e.g. at Harvard) along with their employees - to learn clients’ real pain points! Education has become a tool for customer experience. Schools are tying up with companies for this (e.g. with Emeritus).
  - International Schools Partnership provides services to independent schools for a small stake. It’s an interesting business model.
  - Research for colleges is a business model that’s at risk thanks to Deep Research (e.g. analyse sustainability practices of listed companies).
- There’s an Indian Censor Board Scraper repo.
- Using chroot, you can boot from a Linux USB stick, then make the session use the OS installed on your hard disk as its root. Useful if your system won’t boot.
Ref
- Claude 3.7 Sonnet with extended thinking has an output limit of over 64,000 tokens. Given its strong instruction-following capability, that makes it one of the most powerful models for transforming text. For example, transcription restyling, translations, XML to JSON conversions, PDF to XML, etc.
- Notes from discussion with Sundeep
  - In his experience, investors tend to let you run the show (e.g. ask what you want rather than push in a specific direction) unless there is trouble.
  - We discussed the “running out of problems” problem with AI. His suggestions:
    - List problems we dropped or eliminated for lack of time/capacity. This filter is a blind spot.
    - Even if you know how to do something, use AI to discover an alternate solution approach. That’s the path to 10X (rather than incremental) optimization.
  - Having AI create end-to-end pitch videos based on a product idea is now a reality. (He showed me one for his product.)
  - Areas to explore with Deep Research are:
    - What hidden trends is media misdirecting away from?
    - What are second-order effects and hidden gameplays?
    - Which organizations would be good clients to target? What would be an apt pitch for them?
  - Experience dining is an emerging theme.
  - Having LLMs explain scenarios (i.e. what might happen if …) based on parameters can help understand/quantify the impact of actions, and therefore what to do.
- One way to copy as Markdown: copy page contents, paste in text-html.com, copy HTML, paste in Turndown, copy Markdown.
- Elimination Game is like Survivor for LLMs, where they form alliances and out-vote each other until 2 remain. The eliminated LLMs vote for the winner.
  GPT-4.5 Preview, both Claude Sonnets and Gemini 2.5 Pro consistently outperform the rest. Their dialogues are fascinating!
- SQLite can open locked databases (e.g. browser history) via sqlite3 'file:places.sqlite?mode=ro&nolock=1'. datasette uses this. For example, to read the Edge history on Linux, use: datasette ~/.config/microsoft-edge/Default/History --nolock (Ref)
- Notes from ThursdAI - Apr 03
  - Nomic Embed Multimodal models are the current SOTA on multi-modal embeddings. Notably, they embed PDFs natively.
  - Hailuo Speech-02 is the best speech model right now, beating ElevenLabs. It has excellent voice cloning. Pricing: $30/1M chars. That is 10% of ElevenLabs and 2X OpenAI TTS.
  - PaperBench is an open testing framework from OpenAI that requires models to replicate the research work in papers. It has ~8,000 tasks evaluated by LLMs, with LLMs judging the judges as well. The code is well worth studying.
  - Runway Gen 4 was released with very high character consistency and longer durations.
  - Dreamina creates lip-synced videos from audio + a single image. Hedra is better for animated characters, though.
  - Meta shared but has not released Mocha, an open character generation model that generates new characters speaking based on audio you provide. It is not based on existing images, but the quality is very good.
  - All Hands has a free online version where you can fix GitHub issues.
  - This realistic Frodo and Sam mining through a Minecraft tunnel, holding Minecraft pickaxes and torches made my day 🙂
  - AnimeJS released version 4. It animates HTML, SVG, Canvas, and WebGL with a consistent API. Looks elegant and powerful.
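
Going back to the SQLite note above: the same read-only, no-lock URI can be used from Python's built-in sqlite3 module. This is a minimal sketch, not how datasette itself is implemented. The helper names `open_readonly_nolock` and `list_tables` are made up for illustration. Skipping locks on a database another process is writing to may return an inconsistent snapshot, so treat the results as best-effort.

```python
import sqlite3
from pathlib import Path


def open_readonly_nolock(db_path):
    """Open a SQLite file read-only with file locking disabled.

    Hypothetical helper: builds the same kind of URI as
    'file:places.sqlite?mode=ro&nolock=1' from the note above.
    """
    # as_uri() needs an absolute path and percent-encodes special characters.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro&nolock=1"
    # uri=True tells sqlite3 to treat the string as a URI, not a plain filename.
    return sqlite3.connect(uri, uri=True)


def list_tables(conn):
    """Return the table names in the database, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]
```

For the Edge example, something like `list_tables(open_readonly_nolock(Path.home() / ".config/microsoft-edge/Default/History"))` should list the tables while the browser is open, assuming the history file lives at that path on your system. Writes through this connection fail because of `mode=ro`.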