I accidentally pressed the emergency button in the toilet. I was smarter this time, unlike earlier. https://www.linkedin.com/posts/sanand0_chatgpt-llm-activity-7246836804249628672-5QXy/ I asked #ChatGPT, which (unhelpfully) told me that “Typically, these buttons cannot be turned off”. I called reception, who couldn’t understand a word of what I said. “Do you want water?” they asked when I told them, “I pressed the emergency button in the bathroom.” So, I went to ChatGPT’s advanced voice mode (I’m so grateful it was enabled last week) and said, “Translate everything I say into Korean. ...

After 15 minutes of hard struggle, I finally asked #ChatGPT “How do I open the thing that’s closing the sink to allow the water to go down?” Here’s the thing with “maturity” (aka age, wisdom, experience, grey hair). It took me 15 minutes to realize I could use an #LLM to solve this problem. Despite supposedly being an “LLM psychologist.” I suspect today’s schoolchildren won’t waste even a minute before checking ChatGPT. ...

Looks like XML tags are the best way to structure prompts and separate sections for an #LLM. It’s the only format that Anthropic, Google, and OpenAI all encourage. For example: … … … … Anthropic Docs: https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/use-xml-tags OpenAI Docs: https://platform.openai.com/docs/guides/prompt-engineering/strategy-write-clear-instructions Google Docs: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/structure-prompts Alternatives include JSON, Markdown, templating formats like Mustache/Jinja, etc. Even Llama’s system tokens seem a little XML-like. https://github.com/meta-llama/llama3/blob/main/llama/tokenizer.py#L61-L74 Personally, I’ve been using Markdown so far. But it’s time to switch over. (Only on the prompt side. On the generation side, Markdown still seems the best.) ...
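Here’s a minimal sketch of what an XML-structured prompt might look like in code. The tag names, task, and model are my own illustrative assumptions, not taken from the docs above:

```python
# Hypothetical example: separate instructions and source text with XML tags,
# then send the prompt via the Anthropic Python SDK.
import anthropic

document_text = "(paste the source document here)"
prompt = f"""<instructions>
Summarize the document below in three bullet points.
</instructions>
<document>
{document_text}
</document>"""

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
message = client.messages.create(
    model="claude-3-haiku-20240307",  # illustrative; any current model works
    max_tokens=500,
    messages=[{"role": "user", "content": prompt}],
)
print(message.content[0].text)
```

The tags just delimit sections of plain text, so the same prompt string can be sent unchanged to OpenAI or Gemini clients; only the API call differs.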

Today, I learned that I began my career at TCS, not IBM, and that I never worked at the Boston Consulting Group (BCG). I am very curious (but a bit scared) to ask an #LLM whom I’m married to. LinkedIn

Visiting client offices is usually a painful exercise, given travel and security. But there are some small things that make your day. Like the Mentos at the reception. Or the unsecured WiFi. Or the delightful view of the city from a skyscraper. Today, it was the noble admin person who placed the power sockets ON TOP OF the desks, so I don’t have to bend below the desk or dig into a hole to get connected. ...

Fascinating to see how the LLM cost-quality frontier moves. Recent fights were mostly on cost. Yesterday, #OpenAI halved the GPT-4o cost. At $2.5/MTok (and with GPT-4o-mini at 15 cents/MTok), the best and cheapest models are back with OpenAI, IMHO. Sigh, time to move all our stuff back from #Anthropic. For now… https://gramener.com/llmpricing/ LinkedIn
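For a sense of what those per-million-token prices mean in practice, here’s a back-of-the-envelope sketch. The workload numbers are made up purely for illustration:

```python
# Hypothetical workload: estimate monthly spend at a given $/million-token price.
def monthly_cost(requests_per_day: int, tokens_per_request: int, usd_per_mtok: float) -> float:
    tokens_per_month = requests_per_day * 30 * tokens_per_request
    return tokens_per_month / 1_000_000 * usd_per_mtok

# The same (made-up) workload priced at GPT-4o vs GPT-4o-mini input rates.
workload = dict(requests_per_day=10_000, tokens_per_request=2_000)
print(monthly_cost(**workload, usd_per_mtok=2.50))  # GPT-4o:      $1,500 / month
print(monthly_cost(**workload, usd_per_mtok=0.15))  # GPT-4o-mini:    $90 / month
```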

Loved this Rocky Aur Rani Kii Prem Kahaani scene where Ranveer asks, “Chinese ko Chinese bol sakte hai?” (“Can we call the Chinese ‘Chinese’?”) Roughly translated: “We can’t even say ‘behen di’? Aunty, I’m from Delhi. How can I not say ‘behen di’, behen di!? What times have we come to? We can’t call the fat ‘fat’, the Black ‘Black’, the old ‘old’. I’m scared to even open my mouth! You tell me, can we call the Chinese ‘Chinese’?” ...

There are 4 frontier #LLMs today. No other (popular) model beats them on BOTH cost and quality: llama-3-8b-instruct, claude-3-haiku-20240307, llama-3-70b-instruct, and gpt-4o-2024-05-13. This list changes rapidly. But in practice, it means there’s little reason to use any other LLM. Every other model is beaten by one of these on both cost and quality (measured by the LMSYS Arena ELO score). I opened Straive + Gramener’s keynote yesterday at the marcus evans Group’s Digitech forum with this. Strange that this is not well known. Especially as switching from GPT-4 to Claude 3 Haiku can shrink a $1.2 million Gen AI budget to just $10K. ...
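A minimal sketch of how you can find that frontier from a table of (cost, quality) pairs. The model names, prices, and ELO scores below are placeholders, not real data:

```python
# Hypothetical (cost in $/MTok, quality as an arena-style ELO) for a few models.
models = {
    "model-a": (0.25, 1180),
    "model-b": (0.50, 1150),   # costlier AND lower ELO than model-a: dominated
    "model-c": (5.00, 1280),
    "model-d": (15.00, 1260),  # costlier AND lower ELO than model-c: dominated
}

# A model is on the frontier if no other model is both cheaper and higher quality.
frontier = [
    name
    for name, (cost, elo) in models.items()
    if not any(c <= cost and e >= elo and (c, e) != (cost, elo)
               for c, e in models.values())
]
print(frontier)  # ['model-a', 'model-c']
```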

250 BC is when I’d pick to time-travel to. Ashoka was turning into one of the most famous emperors of India and Archimedes was growing into one of the greatest mathematicians of all time. Parallel Lives is a beautiful visualization by Jan Willem Tulp that shows who lived when and where their lives overlap, with each person sized by their prevalence on Wikipedia. I’m a history fan and have spent several hours scrolling through the site: ...

When picking a number between 1 and 100, do #LLMs pick randomly? Or pick like a human? Leniolabs_ found #ChatGPT prefers 42. Gramener re-ran the experiment. Things have changed a bit. Now, 47 is the new favorite. But Claude 3 Haiku latched on to 42 as its favorite. Gemini’s favorite is 72. See https://sanand0.github.io/llmrandom/ They all avoid multiples of 10 (10, 20, …), repeated digits (11, 22, …), and single digits (1, 2, …), and prefer numbers ending in 7 (27, 37, …). These are clearly human #biases – avoiding regular / round numbers and treating 7 as “random”. ...
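A minimal sketch of how such an experiment can be re-run. The model name, sample size, and prompt wording are illustrative assumptions; the methodology behind the link above may differ:

```python
# Ask an LLM for a number between 1 and 100 many times and tally its favourites.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
counts = Counter()

for _ in range(100):  # a real run would use far more samples
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; swap in whichever model you are testing
        messages=[{"role": "user",
                   "content": "Pick a number between 1 and 100. Reply with the number only."}],
        temperature=1.0,
    )
    reply = response.choices[0].message.content.strip()
    if reply.isdigit():
        counts[int(reply)] += 1

print(counts.most_common(10))  # does one number (42? 47?) dominate?
```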

This is the coolest data visualization I’ve seen in a long time. It makes you think about human behaviour. Please try and GUESS why the Airbnb occupancy rates shoot up in the red areas on Apr 7 before you read the comments! LinkedIn

Oh, wonderful! They’re keen to get in. Wise enough to take help. Honest enough not to be able to cover it up. Sounds like a good hire! LinkedIn

A friend told me today that using #ChatGPT will make humanity dumber. Probably. As always, #Calvin has the best response I know to that: “I propose we leave math to the machines and go play outside.” 🙂 LinkedIn

For those in #Singapore and interested in #datavisualization & #llms, I’m talking about Visualizing LLM Hallucinations at SUTD on Thu 8 Feb at 7 pm SGT. This is for a non-technical audience. We’ll visualize the basics of how LLMs work, how they make mistakes, and at least one technique for spotting these. https://www.meetup.com/data-vis-singapore/events/298902921/ LinkedIn

My PyCon talks are a way for me to learn. I usually pick topics I don’t know about. But at PyCon India 2023 the organizers picked “Programming Minecraft with Python” - a talk I’d given before. So, I started exploring ways to game it. (I like gaming things. It’s boring otherwise. Once, Infosys had me write a 400-page document. I began each page with a letter so that, read in order, they spelled out a poem.) ...

Ashwini Mathur and I are conducting a webinar on the impact of LLMs in Pharma. It’s online at 10 am Eastern on Mon Sep 11. Simon Willison described LLMs as alien technology we’re still discovering. I couldn’t agree more - and it helps to see it from different perspectives. So, we’re pairing the tech research at Gramener with the domain research Ashwini Mathur is doing at Novartis to explore the good, the bad, and the surprising uses of generative AI. ...

And THIS is impact. Thanks for making MY 2022 an amazing year, Tanvi Bansal! LinkedIn

Dear Alumni 🙂, A strategy consulting firm is looking for a senior person in advanced analytics at an associate partner level. Ideally someone who has come up the data scientist route but has developed a business mindset. If you’re interested or know anyone who is, please mail me at [email protected]. LinkedIn

I built an internal app at Gramener that nudges people to respond to calendar invites. A colleague replied, “… this is really helpful for improving hobbits”. #Autocorrect is delightfully serendipitous. I promptly watched all three #films in the Peter Jackson Hobbit series 🙂 LinkedIn

Anand Madhav described this video as “Me, convincing tech lead to solve a problem on the weekend” 🙂 For those who don’t understand Hindi, here’s a rough translation: “I’ve come for help. There’s no one smarter at solving tricky problems. You’re a wizard. You’re strong AND smart. That’s why I need your help.” I’ve used this strategy when reaching out. Praise does help. But has it backfired on anyone? Any examples that you can share? ...