Image generation gets better at comics

I heard a lot about the new image generation models last week. So, I tested to see what’s improved. I gave the prompt below to various image generation models – old and new. “A Calvin and Hobbes strip. Calvin is boxing Hobbes, with a dialog bubble from Calvin, saying ‘Bring it on!’” Stable Diffusion XL Lightning, Stable Diffusion XL Base, Dall-E API ...

Weird emergent properties on Llama 3 405B

In this episode of ThursdAI, Alex Volkov (of Weights & Biases) speaks with Jeffrey Quesnelle (of Nous Research) on what they found fine-tuning Llama 3 405B. This segment is fascinating. Llama 3 405B thought it was an amnesiac because there was no system prompt! In trying to make models align strongly with the system prompt, these are the kinds of unexpected behaviors we encounter. It’s also an indication of how strongly we can have current LLMs adopt a personality simply by beginning the system prompt with “You are …” ...

The LLM Psychologist

Andrej Karpathy mentioned the term LLM psychologist first in Feb 2023. I’ve been thinking about this for a while, now. I’ve always been fascinated by psychologists in fiction. I grew up with Hari Seldon in Foundation, wanting to be a psycho-historian. (I spent several teenage years building my mind-reading abilities.) I wanted to be Susan Calvin, the only robopsychologist. ...

Visiting client offices is usually a painful exercise, given travel and security. But there are some small things that make your day. Like the Mentos at the reception. Or the unsecured WiFi. Or the delightful view of the city from a skyscraper. Today, it was the noble admin person who placed the power sockets ON TOP OF the desks, so I don’t have to bend below the desk or dig into a hole to get connected. ...

Fascinating to see how the LLM cost-quality frontier moves. Recent fights were mostly on cost. Yesterday, #OpenAI halved the GPT-4o cost. At $2.5/MTok (and with GPT-4o-mini at 15 cents/MTok), the best and cheapest models are back with OpenAI, IMHO. Sigh, time to move all our stuff back from #Anthropic. For now… https://gramener.com/llmpricing/ LinkedIn

Loved this Rocky Aur Rani Kii Prem Kahaani scene where Ranveer asks, “Can we call Chinese people Chinese?” We can’t even say “behen di”? Aunty, I’m from Delhi. How can I not say “behen di”, behen di!? What have times come to? You can’t call fat people fat, you can’t call black people black, you can’t call old people old. I’m scared to open my mouth! You tell me, can we call Chinese people Chinese? ...

I'll leave tomorrow's problems to tomorrow's me

What a delightful idea. I’ll leave tomorrow’s problems to tomorrow’s me. – Saitama, One Punch Man Saitama is now one of my favorite heroes. Right up there with Atticus Finch and Juror #8. Very few people can articulate such a wonderful philosophy as effectively. The closest was Calvin. Of course, it’s not a perfect system. But they do say, “Sometimes, the best way to get something is to stop trying to get it.”

Hobbes on a calculator

I just learned that any word made of just these letters beighlosz can be spelt on a calculator. That includes Hobbes! 538804 upside-down looks like this: I’m surprised I never knew that. The longest, by far, appears to be hillbillies – 53177187714
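The mapping above can be sketched in a few lines of Python. This is a minimal illustration, not a canonical table: the function name `to_calculator` is made up for this example, and the choice of 6 for "g" is an assumption (some conventions use 9).

```python
# Letter -> digit mapping for upside-down calculator spelling.
# Assumption: 'g' maps to 6 (some conventions use 9 instead).
CALCULATOR_DIGITS = {
    "b": "8", "e": "3", "i": "1", "g": "6", "h": "4",
    "l": "7", "o": "0", "s": "5", "z": "2",
}


def to_calculator(word: str) -> str:
    """Return the digits to type so `word` appears when the display is flipped."""
    letters = word.lower()
    unsupported = sorted(set(letters) - set(CALCULATOR_DIGITS))
    if unsupported:
        raise ValueError(f"cannot spell {word!r}: no digit for {unsupported}")
    # Flipping the calculator reverses the reading order,
    # so the word has to be typed backwards.
    return "".join(CALCULATOR_DIGITS[ch] for ch in reversed(letters))


print(to_calculator("Hobbes"))       # 538804
print(to_calculator("hillbillies"))  # 53177187714
```

Both outputs match the numbers quoted above. Words with any other letter (say, "Calvin") raise a `ValueError`.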

The psychology of peer reviews

We asked the ~500 students in my Tools in Data Science course in Jan 2024 to create data visualizations. They then evaluated each other’s work. Each person’s work was evaluated by 3 peers. The evaluation was on 3 criteria: Insight, Visual Clarity, and Accuracy (with clear details on how to evaluate). I was curious to see what we can learn about student personas from their evaluations. ...

Embeddings in DuckDB

This article on Using DuckDB for Embeddings and Vector Search by Sören Brunk shows a number of DuckDB features I wasn’t aware of: DuckDB can read directly from Hugging Face datasets; it can read just the parts of a .parquet file it needs, even over HTTP; it lets you write custom functions in Python; and it now has a vector similarity search extension. I’ve recently become a DuckDB fan and continue to be impressed.

There are 4 frontier #LLMs today. No other (popular) model beats them on BOTH cost and quality: llama-3-8b-instruct, claude-3-haiku-20240307, llama-3-70b-instruct, and gpt-4o-2024-05-13. This list changes rapidly. But in practice, it means there’s little reason to use any other LLM. They beat every other model on cost and quality (measured by the LMSYS Arena Elo score). I opened Straive + Gramener’s keynote yesterday at marcus evans Group’s Digitech forum with this. Strange that this is not well known. Especially as switching from GPT-4 to Claude 3 Haiku can shrink a $1.2 million Gen AI budget to just $10K. ...
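"No other model beats them on both cost and quality" is the definition of a Pareto frontier. Here is a rough sketch of how one might compute it. The model names, prices, and scores below are hypothetical placeholders, not real pricing or Arena data, and `Model`, `dominates`, and `frontier` are names invented for this illustration.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    cost: float     # hypothetical price in $ per million tokens (lower is better)
    quality: float  # hypothetical arena-style score (higher is better)


def dominates(a: Model, b: Model) -> bool:
    """True if `a` is at least as good as `b` on both axes and strictly better on one."""
    no_worse = a.cost <= b.cost and a.quality >= b.quality
    strictly_better = a.cost < b.cost or a.quality > b.quality
    return no_worse and strictly_better


def frontier(models: list[Model]) -> list[Model]:
    """Keep only the models that no other model dominates."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


# Placeholder numbers, for illustration only.
MODELS = [
    Model("model-a", 0.10, 1150),
    Model("model-b", 0.25, 1180),
    Model("model-c", 0.50, 1170),   # costlier and weaker than model-b
    Model("model-d", 5.00, 1290),
    Model("model-e", 10.00, 1250),  # costlier and weaker than model-d
]

print([m.name for m in frontier(MODELS)])  # ['model-a', 'model-b', 'model-d']
```

Every model off the frontier has at least one alternative that is both cheaper and better, which is why there is little reason to pick it.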

250 BC is the year I’d pick to time-travel to. Ashoka was turning into one of the most famous emperors of India and Archimedes was growing into one of the greatest mathematicians of all time. Parallel Lives is a beautiful visualization by Jan Willem Tulp that shows who lived when, showing overlaps, and sized by their prevalence on Wikipedia. I’m a history fan and have spent several hours scrolling through the site: ...

A quick way to assess LLM capabilities

Simon Willison initiated this very interesting Twitter thread that asks, “What prompt can instantly tell us how good an LLM model is?” The Sally-Anne Test is a popular test that asks: “Sally hides a marble in her basket and leaves the room. While she is away, Anne moves the marble from Sally’s basket to her own box. When Sally returns, where will she look for her marble?” ...

When picking a number between 1 and 100, do #LLMs pick randomly? Or pick like a human? Leniolabs_ found #ChatGPT prefers 42. Gramener re-ran the experiment. Things have changed a bit. Now, 47 is the new favorite. But Claude 3 Haiku latched on to 42 as its favorite. Gemini’s favorite is 72. See https://sanand0.github.io/llmrandom/ They all avoid multiples of 10 (10, 20, …), repeated digits (11, 22, …), single digits (1, 2, …) and prefer 7-endings (27, 37, …). These are clearly human #biases – avoiding regular / round numbers and seeking 7 as “random”. ...
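One way to check for these patterns is to tag each pick with the traits listed above and compare the shares against a uniform 1-100 baseline. The sketch below is only an illustration: `traits` and `trait_share` are made-up names, the sample picks are hypothetical rather than the experiment's data, and treating 7 itself as a "7-ending" is an assumption.

```python
from collections import Counter
from typing import Iterable


def traits(n: int) -> set[str]:
    """Tag a pick in 1..100 with the human-like patterns it matches."""
    found = set()
    if n < 10:
        found.add("single digit")
    if n % 10 == 0:
        found.add("multiple of 10")
    if 10 <= n <= 99 and n % 11 == 0:
        found.add("repeated digits")
    if n % 10 == 7:  # assumption: 7 itself counts as a 7-ending
        found.add("ends in 7")
    return found


def trait_share(picks: Iterable[int]) -> dict[str, float]:
    """Fraction of picks that show each trait."""
    picks = list(picks)
    counts = Counter(t for n in picks for t in traits(n))
    return {t: c / len(picks) for t, c in counts.items()}


# What a truly uniform picker would produce.
baseline = trait_share(range(1, 101))

# Hypothetical picks, for illustration only.
sample = [47, 42, 72, 37, 57, 27, 73, 47]
observed = trait_share(sample)

for trait, expected in sorted(baseline.items()):
    print(f"{trait:16} uniform={expected:.2f} observed={observed.get(trait, 0):.2f}")
```

A uniform picker lands on a 7-ending 10% of the time, so a much higher observed share points to the bias described above.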

This is the coolest data visualization I’ve seen in a long time. It makes you think about human behaviour. Please try and GUESS why the AirBnB occupancy rates shoot up in the red areas on Apr 7 before you read the comments! LinkedIn

From Laptops to Chatbots: Coding at 30,000 ft

Until recently, I could code on flights. This year, I lost that ability. Again. It’s happened before. In each case, technology has solved the problem for me. Here’s the history. I need a laptop. Since 2001, I’ve never been without one on a flight. I need power. Since 2005, I’ve used dark mode and every low power feature available. (I also became good at finding hidden power outlets.) ...

From Calvin & Hobbes to Photo Tagging: Excel's Unexpected Image Capability

In Excel, using Visual Basic, you can change an image as you scroll. This makes it easy to look at each image and annotate it. This is how I transcribed every Calvin & Hobbes. I used this technique first when typing out the strips during my train rides from Bandra to Churchgate. I had an opportunity to re-apply it recently when we needed to tag hundreds of photographs based on a set of criteria. ...

Oh, wonderful! They’re keen to get in. Wise enough to take help. Honest enough not to be able to cover it up. Sounds like a good hire! LinkedIn

AI makes me a better person

Every time I get annoyed at people, I remind myself to be more like ChatGPT. Specifically: Don't get annoyed. Be patient. Encourage them. Step back and show them the big picture. (Then I get annoyed at myself for getting annoyed.) Today, I analyzed how exactly ChatGPT is different from me. So, I took a pitch document I co-authored with ChatGPT. Section A: Authored by Anand WHAT DO WE NEED? ...

A friend told me today that using #ChatGPT will make humanity dumber. “Probably.” Like always, #Calvin has the best response I know to that. “I propose we leave math to the machines and go play outside.” 🙂 LinkedIn