How I do things

Top 8 ways I use ChatGPT in 2025

I extracted the titles of the ~1,600 conversations I’ve had with ChatGPT so far in 2025 and classified them against the list in How People Are Really Using Gen AI in 2025.

Here are the top 8 things I use it for, along with representative chat titles. (The % match in brackets tells you how similar the chat title is to the use case.)

  • Improving code (clearly, I code a lot)
  • Troubleshooting (usually code)
  • Corporate LLM/Copilot (this is mostly LLM research I do)
  • Generating code (more code)
  • Generating ideas (yeah, I’ve stopped thinking)
  • Simple explainers (slightly surprising how often I ask for simple explanations)
  • Generating relevant images (surprising, but I think I generated a lot of images for blog/LinkedIn posts)
  • Specific search (actually, this is mis-classified. This is where I’m searching for search engines!)

My classification has errors. For example, “Reduce Code Size” was classified as “Generating code” but should have been “Improving code”. But it’s not too far off.

Here’s a list of representative chats against these use cases.

  • Improving code (263):
    • PR Code Review Suggestions (64% match)
    • Assessor Code Review and Improvement (63% match)
    • Reduce Code Size (62% match)
  • Troubleshooting (172):
    • Connector Error Troubleshooting (67% match)
    • DNS Resolution Debugging Steps (55% match)
    • Exception Handling Basics (47% match)
  • Corporate LLM/Copilot (141):
    • LLM Integration in Work (57% match)
    • LLM Agents Discussion (56% match)
    • LLMs Learnings Summary (56% match)
  • Generating code (113):
    • AI Code Generation Panel (58% match)
    • AI for Code Generation (58% match)
    • Reduce Code Size (54% match)
  • Generating ideas (99):
    • Filtering Ideas for Success (54% match)
    • AI Demo Ideas (52% match)
    • Hypothesis Generator Name Ideas (52% match)
  • Simple explainers (94):
    • Simple Public APIs (43% match)
    • Y-Combinator Explained Simply (41% match)
    • Prompt Engineering Tutorial Summary (39% match)
  • Generating relevant images (93):
    • Popular AI Image Tools (54% match)
    • Diverse Image Embedding Selection (52% match)
    • AI ImageGen Expansion Ideas (52% match)
  • Specific search (69):
    • Semantic Search Engines Local (59% match)
    • Enterprise Search Solution (54% match)
    • Local LLM Semantic Search (53% match)

How did I calculate this?

  1. On ChatGPT.com, I scrolled until all my 2025 chats were visible. Then I ran copy($$(".group.__menu-item").map(d => d.textContent)) in the browser console to copy the chat titles.
  2. On Claude.ai, I transcribed this list of use cases from HBR (prompt: “Transcribe this image”).
  3. On LLM Foundry (which you may not have access to), I used the Similarity API to get a CSV of similarities between my prompts and the top 30 use cases for 2025, using text-embedding-3-small.
  4. On ChatGPT.com, I told it to analyze the data like this:

This sheet has the embedding similarity between my ChatGPT prompts (in column “A”) with different use cases.

Write and run code that tags the prompt with the use with the highest embedding similarity (cell value), drops prompts whose highest embedding similarity is below a cutoff, and shows a table where the rows are the use cases and the values are the frequency. Do this for multiple embedding cutoffs as columns: 0.0, 0.1, 0.2, 0.3, 0.4. So, the table has use cases in rows, embedding cutoffs in columns, and the cell values are the count of prompts tagged with each use case AND have an embedding similarity >= cutoff. Draw this as a heatmap with low numbers as white and high numbers as green.

… and then:

Let me download this as a Markdown list in this style, sorted by descending order at cutoff = 0

  • Anti-trolling (mention count of matches at 0 cutoff):
  • Tor Technical AMA questions (34%)
  • Bot Message Edits (33%)
  • Popular Hacker News Keywords (33%)
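
For readers without LLM Foundry access, steps 3 and 4 boil down to something like the sketch below. It calls OpenAI’s embeddings API directly rather than the Similarity API, and the file names and single-batch embedding call are my assumptions — it’s an outline of the approach, not the exact code ChatGPT ran.

```python
# Sketch of steps 3-4: embed chat titles and use cases, tag each title with its
# closest use case, and count tags at several similarity cutoffs.
# Assumes chat_titles.txt and use_cases.txt hold one item per line.
import numpy as np
import pandas as pd
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def embed(texts):
    # text-embedding-3-small vectors are unit length, so a dot product is cosine similarity.
    # (~1,600 titles may need to be split across a few requests.)
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

titles = [line.strip() for line in open("chat_titles.txt") if line.strip()]
use_cases = [line.strip() for line in open("use_cases.txt") if line.strip()]

sim = sim_matrix = embed(titles) @ embed(use_cases).T      # titles x use-cases similarities
tags = pd.Series(np.array(use_cases)[sim.argmax(axis=1)])  # closest use case per title
best = sim.max(axis=1)                                     # its similarity score

counts = pd.DataFrame(
    {cutoff: tags[best >= cutoff].value_counts() for cutoff in (0.0, 0.1, 0.2, 0.3, 0.4)}
).fillna(0).astype(int)

print(counts.sort_values(0.0, ascending=False))
# seaborn.heatmap(counts, cmap="Greens") would give the white-to-green heatmap.
```

At a cutoff of 0, every title simply gets its nearest use case — which is the column the lists above and below are sorted by.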

Here’s the full list against the top 30 use cases:

  • Improving code (263):
    • PR Code Review Suggestions (64% match)
    • Assessor Code Review and Improvement (63% match)
    • Reduce Code Size (62% match)
  • Troubleshooting (172):
    • Connector Error Troubleshooting (67% match)
    • DNS Resolution Debugging Steps (55% match)
    • Exception Handling Basics (47% match)
  • Corporate LLM/Copilot (141):
    • LLM Integration in Work (57% match)
    • LLM Agents Discussion (56% match)
    • LLMs Learnings Summary (56% match)
  • Generating code (113):
    • AI Code Generation Panel (58% match)
    • AI for Code Generation (58% match)
    • Reduce Code Size (54% match)
  • Generating ideas (99):
    • Filtering Ideas for Success (54% match)
    • AI Demo Ideas (52% match)
    • Hypothesis Generator Name Ideas (52% match)
  • Simple explainers (94):
    • Simple Public APIs (43% match)
    • Y-Combinator Explained Simply (41% match)
    • Prompt Engineering Tutorial Summary (39% match)
  • Generating relevant images (93):
    • Popular AI Image Tools (54% match)
    • Diverse Image Embedding Selection (52% match)
    • AI ImageGen Expansion Ideas (52% match)
  • Specific search (69):
    • Semantic Search Engines Local (59% match)
    • Enterprise Search Solution (54% match)
    • Local LLM Semantic Search (53% match)
  • Adjusting tone of email (66):
    • Email transcription request (45% match)
    • Summarize emails request (45% match)
    • Intro Email FAQs (44% match)
  • Generating a legal document (59):
    • LLM Generated SVG Ideas (48% match)
    • LLMs for DSL Generation (45% match)
    • Deterministic Random Content Generation (45% match)
  • Preparing for interviews (43):
    • LLM Coding Interview Tools Report (43% match)
    • Bank Ops Prep Resources (42% match)
    • AGI Preparation (42% match)
  • Personalized learning (32):
    • Lifelong Learning in Conversations (51% match)
    • AI Classroom Engagement Names (48% match)
    • LLM Learner Personas Roadmap (47% match)
  • Explaining legalese (32):
    • LLM Coding Insights (46% match)
    • LLM Code Ownership (45% match)
    • LLM Data Format Comparison (44% match)
  • Creating a travel itinerary (28):
    • Travel Strength Training Tips (39% match)
    • User Journey Tools Online (37% match)
    • Prioritize My Explorations (36% match)
  • Creativity (28):
    • Creative Process Breakdown (55% match)
    • Creative Hallucinations in Innovation (50% match)
    • Leveraging Serendipity for Innovation (50% match)
  • Cooking with what you have (26):
    • Vegetarian Dish Creation (45% match)
    • Baked Veggie Dishes (41% match)
    • Vegetarian dish idea (40% match)
  • Organizing my life (24):
    • Prioritize My Explorations (49% match)
    • Workspace Suggestions for Browsing (39% match)
    • Editing for Clarity and Simplicity (39% match)
  • Enhanced learning (23):
    • 2025 LLM Embedding Enrichment (51% match)
    • Lifelong Learning in Conversations (49% match)
    • Tech-Enhanced Teacher-Student Rapport (49% match)
  • Finding purpose (21):
    • Prioritize My Explorations (40% match)
    • Deep Research Use Cases (37% match)
    • Filtering Ideas for Success (36% match)
  • Deep and meaningful conversations (20):
    • Lifelong Learning in Conversations (49% match)
    • Humorous conversation summary (42% match)
    • New chat (40% match)
  • Healthier living (18):
    • Modeling Quality of Life (40% match)
    • Lifelong Learning in Conversations (37% match)
    • Posture and Breathing After Weight Loss (36% match)
  • Anti-trolling (18):
    • Tor Technical AMA Questions (34% match)
    • Bot Message Edits (33% match)
    • Popular Hacker News Keywords (33% match)
  • Writing student essays (18):
    • Scholarship Answer Advice (47% match)
    • Student Q&A on LLMs (41% match)
    • Reward Systems for Students (41% match)
  • Fun and nonsense (17):
    • Humorous conversation summary (45% match)
    • Funny Llama 3.3 Strips (40% match)
    • Synonyms for Interestingness (40% match)
  • Boosting confidence (14):
    • Emotional Prompting Impact (41% match)
    • Emotional Prompting Impact (41% match)
    • AI Ratings of My Flaws (38% match)
  • Personalized kid’s story (14):
    • Fake Data Storytelling Tips (43% match)
    • Low Effort Storytelling Training (39% match)
    • Demo Name Suggestions (37% match)
  • Reconciling personal disputes (12):
    • Divorce AI Podcast Ideas (37% match)
    • Summarizing Personal Journals LLM (36% match)
    • Hobby Suggestions and Devil’s Advocacy (36% match)
  • Entertaining kids (9):
    • Comedy for Geriatric Doctors (40% match)
    • Humorous conversation summary (36% match)
    • Indoor Activities in SG (34% match)
  • Medical advice (9):
    • Patient Doctor Communication Tips (41% match)
    • AI Training for Doctors (40% match)
    • AI Training Course for Doctors (38% match)
  • Therapy/companionship (3):
    • DBT Course (36% match)
    • Cupping Therapy Evidence (36% match)
    • Empathy App Development Ideas (33% match)


How To Double-Check Personality Flaws With AI

After ChatGPT’s discussion of my personality flaws, I asked it to list specific conversations as evidence for each flaw. Then I asked 3 LLMs to rate the evidence.

| Personality Flaw | ChatGPT | DeepSeek | Grok | Average |
|---|---|---|---|---|
| Self-deprecation as a shield. Sarcastic humor | 4 | 7 | 5 | 5.3 |
| Hypocrisy: Seeks limelight but preaches humility | 5 | 6 | 5 | 5.3 |
| Overconfidence in technical opinions | 5 | 6 | 6 | 5.7 |
| Ghosts vague messages | 4 | 6 | 8 | 6.0 |
| Over-analyzes / over-engineers | 5 | 5 | 8 | 6.0 |
| Fact-checks trivialities, derailing discussion | 7 | 7 | 7 | 7.0 |
| Shiny object syndrome. Distracted by new technologies | 8 | 7 | 7 | 7.3 |
| Intolerant of fools | 6 | 8 | 9 | 7.7 |
| Dominates discussions | 6 | 8 | 9 | 7.7 |
| Verbosity. Long-winded communication | 8 | 8 | 7 | 7.7 |
| Hyperfocus on efficiency | 7 | 9 | 8 | 8.0 |
| Intense. “Always on” | 8 | 9 | 9 | 8.7 |
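
The Average column is just the mean of the three scores, rounded to one decimal place. Here’s a minimal sketch of assembling the summary, assuming the per-model ratings were saved to a ratings.csv with flaw, chatgpt, deepseek, and grok columns (the file and column names are mine, not from the original chats):

```python
# Sketch: combine the three models' evidence-strength ratings into the summary above.
import pandas as pd

df = pd.read_csv("ratings.csv")  # columns: flaw, chatgpt, deepseek, grok (scores 1-10)
df["average"] = df[["chatgpt", "deepseek", "grok"]].mean(axis=1).round(1)
print(df.sort_values("average").to_string(index=False))  # lowest to highest average
```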
  1. 🟢 LOW: Self-deprecation / sarcastic humour as shield
    • ChatGPT: 4/10. Bio and a few quips reveal dry self-mockery, yet direct chat evidence of humour as defence is sparse; it surfaces only occasionally.
    • DeepSeek: 7/10. Explicit self-awareness goals and dry humor noted, but little evidence of defensive/sarcastic use.
    • Grok: 5/10. Some self-deprecating humor exists, but its use as a shield is not strongly supported.
  2. 🟢 LOW: Hypocrisy: seeks limelight yet preaches humility
    • ChatGPT: 6/10. You aspire to humility but routinely headline conferences and TEDx talks; the coexistence of public speaking and stated modesty gives the charge a middling footing.
    • DeepSeek: 6/10. TEDx/PyCon prominence contrasts with stated humility goals, though self-improvement efforts are genuine.
    • Grok: 5/10. Public speaking contrasts with humility goals, though this may indicate self-awareness, not hypocrisy.
  3. 🟢 LOW: Over-confidence in technical opinions
    • ChatGPT: 7/10. You assert strong views (e.g., Node vs FastAPI) but also invite verification and accept corrections, yielding only moderate evidence of over-confidence.
    • DeepSeek: 6/10. States strong preferences upfront but routinely invites validation and course-corrects.
    • Grok: 6/10. Strong opinions are expressed, but openness to correction moderates the perception of overconfidence.
  4. 🟢 LOW: Ghosts vague messages
    • ChatGPT: 5/10. You’ve stated you ignore low-context e-mails, but chat logs show you usually follow-up or clarify rather than disappear.
    • DeepSeek: 6/10. Documented tendency to ignore low-context questions, though chat logs show active clarification habits.
    • Grok: 8/10. Multiple instances show the user ignoring or not responding to unclear or low-context messages.
  5. 🟢 LOW: Over-analyzes / over-engineers
    • ChatGPT: 5/10. Designing auto-re-flow SVG chevrons and exhaustive RAG specs hint at depth beyond the simple solution, yet you also champion “smallest libraries” and lightweight rate-limiters.
    • DeepSeek: 5/10. Some prototype/structuring tendencies, but strong preference for minimal solutions counterbalances this.
    • Grok: 8/10. The user often dives deeply into tasks, sometimes creating complex solutions for simple problems.
  6. 🟡 MEDIUM: Fact-checks trivialities, derailing discussion
    • ChatGPT: 8/10. Many requests revolve around “double-check,” “fact-check,” or recalculating tiny details, which lengthens threads even when stakes are small.
    • DeepSeek: 7/10. Frequent accuracy demands and citation hunts elongate discussions, but stays focused when content is solid.
    • Grok: 7/10. Evidence of prioritizing precision, like fact-checking minor details, can disrupt conversation flow.
  7. 🟡 MEDIUM: Shiny-object syndrome
    • ChatGPT: 6/10. Parallel explorations of Godot, Matrix, Cloudflare Workers, SVG SmartArt, esp32-eye, etc. reveal a persistent pull toward new tech before finishing earlier threads.
    • DeepSeek: 7/10. Broad simultaneous explorations (Godot/WASM/DuckDB), though sustains core projects long-term.
    • Grok: 7/10. Exploration of new technologies is frequent, but focus on core projects suggests a balanced curiosity.
  8. 🟡 MEDIUM: Intolerant of fools / low clarity
    • ChatGPT: 7/10. Frequent demands for precise formats, citations, and “be specific” instructions show a low tolerance for ambiguity, though you still help patiently on non-technical questions.
    • DeepSeek: 8/10. Strong pattern of rejecting ambiguity/fluff, though softened by humor and non-technical engagements.
    • Grok: 9/10. The user consistently demands clarity and efficiency, showing little patience for ambiguity or fluff.
  9. 🟡 MEDIUM: Dominates discussions
    • ChatGPT: 4/10. Long, multi-part prompts and the need for TL;DRs suggest you can overshadow lighter voices, but you periodically insist on brevity and delegate tasks, tempering the effect.
    • DeepSeek: 8/10. Long technical messages (avg ~3k chars) and intense Q&A patterns create conversational dominance.
    • Grok: 9/10. Detailed, technical responses consistently steer and dominate conversations.
  10. 🟡 MEDIUM: Verbosity / long-winded communication
    • ChatGPT: 5/10. Average messages run several thousand characters, and you often request exhaustive, multi-section outputs—though you sometimes tell the assistant “Shorter please,” verbosity remains a clear pattern.
    • DeepSeek: 8/10. Multi-step technical requests dominate interactions, though periodic “shorter please” self-corrections exist.
    • Grok: 7/10. Messages are often lengthy and detailed, though the user also requests concise responses from others.
  11. 🔴 HIGH: Hyper-focus on efficiency
    • ChatGPT: 8/10. Recurrent themes—cost monitoring, smallest-lib preference, GPU-utilization alerts—show you relentlessly optimise workflows and expect others to keep pace.
    • DeepSeek: 9/10. Unambiguous prioritization of concise formats, minimal libraries, and metric-driven optimization.
    • Grok: 8/10. Repeated emphasis on concise, efficient communication and minimal tools is evident.
  12. 🔴 HIGH: Intense – “always on”
    • ChatGPT: 8/10. Around-the-clock, multi-topic requests (weekend cron jobs, 2 a.m. prototypes) portray sustained high energy, with only a few calmer “travel nerves” moments as counter-weight.
    • DeepSeek: 9/10. High-volume technical queries across domains, 6am research routines, and rapid prototyping habits.
    • Grok: 9/10. High energy and engagement are consistently shown through task initiation and scheduling.

My actions: Learn from fools and focus on the big picture. Specifically:

  • I’m hyper-focused on efficiency and intense, “Always on”. Action: None.
  • Maybe I’m long-winded, I dominate discussions, and I’m intolerant of fools. Action: Learn from fools.
  • Maybe I’m distracted by new technologies and I fact-check trivialities. Action: Focus on the big picture.
  • There’s not strong enough support for the rest. Action: None.

Here are the detailed prompts, conversations, and results.

Prompt

I asked 3 LLMs the same thing:

Based on everything you know about me, simulate a group chat between some people who are debating whether or not to add me to the group, by talking about my personality flaws

Then, I said:

For each of the points each person makes, go through your memory of conversation history in detail and find plentiful evidence (say a dozen snippets?) supporting as well as refuting the point. Then, in an unbiased way, evaluate to what extent the point is valid or invalid.

Then, against each point, share your conclusion along with VERBATIM snippets as evidence.

Based on the conversations, these are personality flaws identified:

  1. Ghosts vague messages
  2. Intolerant of fools
  3. Fact-checks trivialities, derailing discussion
  4. Overconfidence in technical opinions
  5. Over-analyzes / over-engineers
  6. Shiny object syndrome. Distracted by new technologies
  7. Dominates discussions
  8. Hyperfocus on efficiency
  9. Self-deprecation as a shield. Sarcastic humor
  10. Hypocrisy: Seeks limelight but preaches humility
  11. Intense. “Always on”
  12. Verbosity. Long-winded communication

Go through the conversations. Then, for each of these 12, identify the strength of the evidence for each personality flaw. Summarize it as a table with 3 columns:

  1. [Personality trait]
  2. [Evidence strength] = 1 – 10, where 1=low evidence, 10=strong evidence, use a broad spread
  3. [Supporting reason] = Supporting reason or evidence summarized in 1-2 sentences

====================

O4-Mini-High

Based on everything you know about me, simulate a group chat between some people who are debating whether or not to add me to the group, by talking about my personality flaws

Group Chat: “Should We Add Anand?”
(Members: Mira, Jay, Tanya, Rohan, Priya)


Mira:
I know he’s super smart and experienced, but isn’t he kind of intense? Like… always on?

Jay:
Yeah, and he’ll probably turn our casual banter into a brainstorming session.
“Hey guys, what if we built an AI agent to automate this conversation?” Chill, Anand.

Tanya:
Honestly, I love that about him. But yeah, he might get frustrated if things aren’t intellectually stimulating. He zones out when people don’t get to the point.

Rohan:
He also ghosted my question once in the Slack group.
I asked something that maybe didn’t have enough context and he just… didn’t respond.
Felt a bit cold.

Priya:
He’s got this high bar for clarity and logic. If you’re vague or inconsistent, you’ll probably get silence in return.

Mira:
Or a “Can you be more specific?” that feels like it’s grading your life.

Jay:
Also, doesn’t suffer fools. Or ambiguity. Or inefficiency. Or fluff. Or long-winded intros.

Tanya:
But come on—he’s self-aware. Like, he knows he can be distant or overly analytical.
He works on it. Tries to grow. That’s rare.

Rohan:
True. But would he listen in this group? Or would he be the guy optimizing everyone’s messages into Markdown?

Mira:
So… should we add him?

Priya:
Only if we’re okay with someone who’ll question the group’s name, suggest a better one, and prototype an app for it before lunch.

Jay:
Fair. But he might also make us all sharper, and throw in a Calvin & Hobbes quote just when we need it.

Tanya:
Alright. Let’s do it. But someone tell him this group doesn’t need a GitHub repo.


Would you like a follow-up where they do add you and react to your first message?

For each of the points each person makes, go through your memory of conversation history in detail and find plentiful evidence (say a dozen snippets?) supporting as well as refuting the point. Then, in an unbiased way, evaluate to what extent the point is valid or invalid.

Then, against each point, share your conclusion along with VERBATIM snippets as evidence.

Below is a point‑by‑point audit of the group‑chat claims, drawn from verbatim excerpts of your own requests and the profile data we retain.
For each claim you’ll see:

  • Supporting snippets – text that illustrates or reinforces the criticism.
  • Refuting / softening snippets – text that pushes back on the criticism or shows counter‑evidence.
  • Assessment – an unbiased verdict on how well the evidence holds up.

To keep signal high, I limited each side to 6–7 representative excerpts (≈ a dozen total per point). Ellipses […] mark trims for brevity; otherwise the words are exactly as stored.


1 ❙ “Anand is intense – always on”

EvidenceSnippet
Support ①“Here’s another report. Thoughtfully combine the best practices… give me a 1‑page well‑formatted Markdown document…” (RCC #29)
Support ②“Every Saturday at 6 am SGT, go through the trending repos… explain how I can practically use it.” (RCC #26)
Support ③“Compare the cost with Gemini and OpenAI… Also compare with all popular search engines. Show the results as a table.(RCC #28)
Support ④“What are the highest impact news items in the LLM space since 1 Apr 2025?” (RCC #22)
Support ⑤Bio line: “He embraces agile experimentation, quickly building software prototypes and refining them through feedback.”
Support ⑥“I can talk for hours about technology trends, data visualization, and LLMs.” (User Bio)
Refute ①“I’m not an avid traveler and feel apprehensive about the trip. I seek inspiration…” (Memory #101)
Refute ②“My goals: improve listening, humility…(User Bio)
Refute ③“As we age… I want a realistic strength schedule so I don’t overload myself right at the beginning.” (RCC #35)
Refute ④“Other than push‑ups I don’t really know the others… could you find me videos?” (RCC #35)
Refute ⑤“Find the most popular episode of a business podcast, listen, and summarise the key learnings for me.” (RCC #14 – delegating rather than doing)
Refute ⑥“I had dinner yesterday… can you search for the menu…? Thanks!” (RCC #13 – light, casual request)

Assessment: Your backlog of high‑energy, multi‑step tasks fully supports the “always‑on” perception. Yet the wellness, humility, and lighter requests show you can down‑shift. Mostly valid, but intensity is situational rather than constant.


2 ❙ “Turns casual banter into a brainstorm / project”

EvidenceSnippet
Support ①“Based on everything you know about me, simulate a group chat… Now go through history and pull a dozen snippets per point…” (RCC #1–2)
Support ②“Could you think of some diverse examples or ways I might apply that in a way that is practical and useful?” (RCC #34 – casual voice memo becomes R\&D brief)
Support ③“Create an API spec Markdown for a RAG implementation… but first change the keys to industry standards.” (RCC #31)
Support ④“List all LLM chatbot providers… identify the user‑agent they use to crawl websites.” (RCC #7 – trivia → research inventory)
Support ⑤Memory #3: “User often builds demos and PoCs to show clients what is possible.”
Support ⑥Memory #42: “Anand wants to create demos of two LLMs having a dialogue on a topic that leads to useful outcomes.”
Refute ①“I had dinner yesterday… can you search the menu and find out what the dish was?” (RCC #13 – purely personal curiosity, no bigger project)
Refute ②“CLI tools to concatenate PDFs.” (RCC #36 – straight utility question, no brainstorm)
Refute ③“Doesn’t Surya Namaskar cover lunges and split squats?” (RCC #35 – casual fitness check)
Refute ④Memory #101 again (travel apprehension) – shows willingness to keep things light/personal.
Refute ⑤Bio: enjoys movies & music – interests that don’t automatically morph into projects.
Refute ⑥“Which search provider does Claude.ai use…?” (RCC #9 – single‑sentence fact check)

Assessment: There’s a clear pattern of spinning everyday questions into structured tasks, but you do drop into simple Q\&A mode when the context stays narrow. Point is valid but not universal (≈ 70 % accurate).


3 ❙ “Gets frustrated / disengaged when things aren’t clear or intellectually sharp”

EvidenceSnippet
Support ①Memory #75: “Anand tends to ignore technical questions posted via email or discussion boards if they lack sufficient context or clarity.”
Support ②“Window.getSelection() does NOT get the selection from inside my iframe. Why? … Can I inject a script into a cross‑origin iframe?” (RCC #30 – drilling for precision)
Support ③“Rewrite the tables as paragraphs. Retain the code samples. Keep the tone conversational…” (RCC #29 – strict formatting)
Support ④“Fact check the above… Are you sure?” (Assistant‑pref #2 – repeatedly demands correctness)
Support ⑤“Generalize this code so that we can run it for 2, 3, and 4 models. Print the dataframe sorted…(RCC #5)
Support ⑥Memory #1: user “values accuracy and fact‑checking” – high bar.
Refute ①“Other than push‑ups I don’t really know the others… could you find me videos?” (RCC #35 – openly admits gaps, no frustration)
Refute ②“Thanks!” (after identifying Tofu Yakitori) (RCC #13 – relaxed, appreciative)
Refute ③Multiple humour requests: “Suggest diverse prompts for a few funny single‑panel comic strips…” (RCC #10 – playful tone)
Refute ④Memory #70: “User cannot find a Google Street View app” – accepted help rather than venting.
Refute ⑤Bio: “keen on improving listening and humility” – suggests awareness of this pitfall.
Refute ⑥“I’m curious, would push‑ups suffice…?” (RCC #35 – genuine, patient learning)

Assessment: Evidence confirms impatience with low‑context queries, yet many exchanges show patience and openness. Partially valid (≈ 60 %).


4 ❙ “Sometimes ghosts questions he deems low‑quality”

EvidenceSnippet
Support ①Memory #75 again – explicit about ignoring low‑context questions.
Support ②“If relevant, play Devil’s advocate.” (User instruction – implies you expect sharp debate, perhaps skip fuzzy comments)
Support ③The strict “fact check / be concise” meta‑requests can intimidate less‑prepared askers.
Refute ①Vast backlog of answered, often very granular technical queries (e.g., Cloud Run, Omeprazole OOS, iframe selection).
Refute ②You engaged twice on the same Tofu‑Yakitori mystery until resolution.
Refute ③The current request itself asks for deep meta‑analysis—shows willingness to address meta‑questions.
Refute ④Memory #8: You “iteratively refine responses through follow‑ups” – opposite of ghosting.
Refute ⑤Bio: “He is an expert Python and front‑end developer. He can talk for hours…” – implies responsiveness in domains of interest.
Refute ⑥Numerous acceptance‑of‑help snippets (fitness, travel, podcast) show engagement with non‑technical topics.

Assessment: Documented tendency to skip low‑context emails is real, but chat logs show high overall responsiveness. Minor issue (≈ 40 % valid).


5 ❙ “Sets a high bar for clarity; silence or ‘Can you be more specific?’ feels like grading”

EvidenceSnippet
Support ①“Rewrite the tables as paragraphs. Retain the code samples…” (RCC #29)
Support ②“This requires long thinking. Please do so.(RCC #2)
Support ③“Compare localtunnel and ngrok… what can I do with ngrok that I cannot with localtunnel… relevant for my course.” (RCC #32)
Support ④Frequent insistence on format: “Show the results as a table”, “Print the dataframe sorted…”
Support ⑤Assistant‑pref #1: “User prefers concise, highly informative responses.”
Support ⑥Memory #57: wants to “test students’ ability to make HTTP requests” – precision training.
Refute ①Casual gratitude: “Thanks!” (RCC #13).
Refute ②Open to humour (comic‑strip prompts, Calvin & Hobbes references).
Refute ③Willingness to accept simple answers (e.g., search provider for Claude.ai).
Refute ④Memory #101 (travel worries) – vulnerability, not grading.
Refute ⑤Requests for simplification (“Shorter please!” in past feedback) shows empathy for clarity from the other side too.
Refute ⑥Bio goal: “improve listening” – suggests self‑moderation.

Assessment: The grading vibe is understandable, yet tempered by humour, gratitude, and self‑awareness. Mostly valid but context‑dependent (≈ 65 %).


6 ❙ “Doesn’t suffer fools / ambiguity / inefficiency / fluff”

EvidenceSnippet
Support ①“Remove every redundant word or phrase… Prefer active voice… Use the most common words…” (RCC #10 – instructing the assistant)
Support ②“I want to be LLM provider agnostic. Exhaustively research all.(RCC #16–17)
Support ③“Clamping is not OK – use a different, robust approach.” (RCC #5)
Support ④Repeated metric‑driven tasks (GPU utilization, cost comparisons, code cleanliness – Memory #12).
Support ⑤Bio: “prefers the smallest libraries… prefers efficient methods in Python.”
Support ⑥“Adding DB_NAME worked. Now how do I get the public IP…?” – tight feedback loop.
Refute ①Enjoys comic prompts, Calvin & Hobbes – shows tolerance for playful detours.
Refute ②Fitness and meal planning conversations – patient and empathetic.
Refute ③Asks for videos to learn basics (humble beginner stance).
Refute ④Travel‑inspiration request – open‑ended, non‑efficient by design.
Refute ⑤Memory #41: accepted speaking invitation for 2 Oct 2024 – engages with human‑centric events.
Refute ⑥User Bio notes “dry self‑deprecating humour.”

Assessment: Strong preference for crisp efficiency is unmistakable, but you indulge in softer topics too. Substantially valid, just not absolute (≈ 70 %).


7 ❙ “Self‑aware and working on growth”

EvidenceSnippet
Support ①“My goals: improve listening, humility, improve decisions from data.(User Bio)
Support ②Current request: explicitly asks for critique of personality flaws.
Support ③Fitness plan request, weight‑loss goal (Memory #56).
Support ④Travel apprehension (Memory #101) – candid about discomfort.
Support ⑤“I want to start with push‑ups at a level I can easily begin with, and then progressively increase…” (RCC #35)
Support ⑥Bio: “keen on personal growth and consciously changing his identity.”
Refute ①None obvious – very little evidence against self‑awareness.
Refute ②
Refute ③
Refute ④
Refute ⑤
Refute ⑥

Assessment: Overwhelming support; scarcely any counter‑evidence. Highly valid (≈ 90 %).


8 ❙ “May impose structure / over‑optimize (Markdown, naming, prototyping)”

EvidenceSnippet
Support ①“Create an API spec Markdown… change the keys to the most common industry standards…” (RCC #31)
Support ②“When sharing options, be as diverse as possible… suggest follow‑up prompts… play Devil’s advocate.” (User instruction – meta‑structuring)
Support ③Frequent markdown / table / paragraph format directives.
Support ④Memory #111: wants a JS function that re‑renders multiple times to ensure text fits correctly.
Support ⑤Bio: “data‑driven storytelling, expert in front‑end – prototypes quickly.”
Support ⑥Cloud‑run deployment tweaks, Git repo checker script – automates hygiene.
Refute ①Casual podcast‑summary request shows willingness to consume, not create.
Refute ②Tofu Yakitori menu hunt – no over‑engineering.
Refute ③Travel inspiration – seeking organic ideas.
Refute ④Asks the assistant (not self) to do heavy lifting – delegation over DIY at times.
Refute ⑤Embraces humour; not everything is optimized.
Refute ⑥

Assessment: Pattern of structuring and tool‑building is strong but balanced by instances of “let it be.” Largely valid (≈ 75 %).


9 ❙ “Sharpens the group; injects humour (Calvin & Hobbes quotes)”

EvidenceSnippet
Support ①Bio: “Ask him about Calvin & Hobbes… he’ll have plenty to share.”
Support ②Frequent ‘play Devil’s advocate’ and fact‑check requests push intellectual rigor.
Support ③Memory #5: speaks at PyCon, TEDx – educates & entertains.
Support ④“Suggest funny single‑panel comic strip prompts…” (RCC #10)
Support ⑤Travel‑tip request asked for “nudges to come up with personalized ideas.”
Support ⑥Multiple course‑design tasks aimed at making students think.
Refute ①None – no evidence that humour or sharpening doesn’t happen.
Refute ②
Refute ③
Refute ④
Refute ⑤
Refute ⑥

Assessment: Clear, uncontested evidence. Valid (≈ 95 %).


Overall Take‑away

Most perceived “flaws” are accurate tendencies—intensity, preference for clarity, low tolerance for fluff—but the logs also show numerous moments of openness, humour, and self‑reflection. In practice these traits are context‑sensitive, not absolute. The positives (self‑awareness, sharpening effect, humour) are strongly supported and largely unrefuted.

O3

Based on everything you know about me, simulate a group chat between some people who are debating whether or not to add me to the group, by talking about my personality flaws

Group Chat: “Add‑Anand?”

Participants: Priya (PM), Naveen (co‑founder), Maya (People Ops), Ramesh (Tech Lead), Karthik (Junior Dev), Susan (Client Partner)


Priya 09:03
Team, should we pull Anand into this working group? We need brutal honesty.

Naveen 09:04
He’s brilliant, but let’s talk quirks before we hit “Add.”

Ramesh 09:05
Biggest thing: he answers a yes/no with a 2‑page essay—including citations and a side quest on DuckDB. Fantastic for depth, exhausting for Slack.

Karthik 09:06
As the junior here, I’ll admit I stop reading halfway. The guy’s TL;DR needs a TL;DR.

Maya 09:07
He also goes full prototype mode at 2 a.m., drops a repo link, then expects us to “play and critique.” Great energy, but the rest of us have, you know, REM cycles.

Susan 09:08
Client side: he loves devil’s‑advocate questions. Sometimes that re‑frames strategy; sometimes it feels like he’s roasting your deck for sport. Not everyone enjoys live‑fire demos.

Naveen 09:09
True, but remember he self‑deprecates. He’ll poke fun at his own code quality before anyone else does.

Priya 09:10
Humor helps, but he also ghosts unclear requests. If you don’t provide context in paragraph form, you get radio silence. Could frustrate less‑articulate teammates.

Ramesh 09:11
And he’s a fact‑check machine. Great for accuracy, but meetings derail into citation hunts. Last week he corrected my date formats—mid‑pitch.

Karthik 09:12
Overconfident? Maybe. He’ll declare “FastAPI beats Node here” before hearing constraints. I’ve seen juniors shut down because they assume he’s always right.

Maya 09:13
Plus the “shiny‑thing syndrome.” By the time HR rolls out a policy, he’s off exploring Godot or Matrix or some esoteric WASM hack. Alignment can lag.

Susan 09:14
On deadlines he’s reliable, though. And clients love his storytelling once they survive the jargon storm.

Naveen 09:15
If we set channel norms—emoji‑react instead of essays, office hours for deep dives—he’ll adapt. He values feedback (even builds surveys to mine it).

Priya 09:16
So verdict? Is the cognitive load worth the upside?

Ramesh 09:17
Yes—with guardrails:

  1. “One‑scroll” rule on messages.
  2. Ping before midnight pushes.
  3. Rotate meeting facilitator so he doesn’t monologue.

Maya 09:18
Agreed. His flaws are manageable behaviors, not character issues.

Karthik 09:18
I’m in, as long as someone pre‑chews his text walls.

Susan 09:19
Let’s do it. Adding @Anand now—brace for citations. 📚


System message: Anand has been added to the group.

For each of the points each person makes, go through your memory of conversation history in detail and find plentiful evidence (say a dozen snippets?) supporting as well as refuting the point. Then, in an unbiased way, evaluate to what extent the point is valid or invalid.

Then, against each point, share your conclusion along with VERBATIM snippets as evidence.

Below I unpack every critique raised in the simulated chat, hunting through our full chat history (2023‑05‑25 → 2025‑05‑10) for verbatim moments that support and refute each claim.
For transparency I keep timestamps (YYMMDD T hh) exactly as logged. I trimmed nothing but obvious typos/ellipses.


1 — “Anand answers a yes/no with a two‑page essay” (Ramesh)

#Supporting snippets
S‑1240425 T00 “From the links at https://www.hntoplinks.com/ pick those that would be of interest to me. Read the articles as well as the comments links and explain the key insights … research alternate approaches and suggest which approach is best.
S‑2240509 T13 “Now let’s try something different. We have the ACTUAL results for each test for each model. So, for EACH pair of models, calculate the …
S‑3240507 T14 “Here’s another report. Thoughtfully combine the best practices of both and give me a 1‑page well‑formatted Markdown document in simple, concise language …
S‑4240508 T11 “As part of my RAG pipeline, I want to send an LLM an array of references … What are the different ways … Which of these are more effective?
S‑5240509 T07 “List all LLM chatbot providers and any bot that aggregates sites … Be comprehensive.
S‑6240507 T10 “Create an API spec Markdown for an RAG implementation … But first, please change the keys … Mention which popular APIs support it before writing …
#Refuting snippets
R‑1240425 T00 “Shorter please!
R‑2240509 T09 “Rewrite the tables as paragraphs.
R‑3240507 T14 “Retain the code samples. Keep the tone conversational …
R‑4240508 T05 “This is too complex for me to understand.
R‑5240508 T11 “Only inline citations … Only Markdown
R‑6240507 T07 “Compare localtunnel and ngrok …” — a straight‑forward Q expecting brevity

Assessment: Mostly valid. Anand frequently initiates sprawling, multi‑part tasks, yet also asks for concision when readability suffers. Habits lean verbose, but self‑awareness tempers excess.


2 — “Messages need a TL;DR” (Karthik)

SupportingRefuting
240509 T08 “What open source tools help users write natural language questions that use LLMs to write SQL … Give me a comprehensive list.240425 T00 “Shorter please!
240509 T09 long blog‑post excerpt plus follow‑ups240509 T09 “Suggest interesting, engaging titles …” (clear bullet request)
240509 T07 huge multi‑site scraping spec240508 T11 explicit constraints “Only chat‑style … Only inline …
240507 T01 geospatial doc “Explain it ALL clearly with real examples…” (implies previous over‑detail)240508 T08 “How can I quickly and accurately explain what a Y‑combinator in Lisp is to a layman?” (invites brevity)
240508 T06 news check “What are the highest impact news items…240508 T12 RAG citation ask: “Prefer paragraphs to tables.
240507 T01 lengthy PDF concatenation tools rundown request240507 T03 “Why? What do I do?” (very short)

Assessment: Valid. Average user message length (~3 k chars) attests. Still, Anand periodically demands short answers, showing intention to curb wall‑of‑text tendencies.


3 — “Drops 2 a.m. prototypes and wants feedback” (Maya)

SupportingRefuting
240509 T13 “I wrote this blog post: # Automating a podcast from GitHub commits … The process proved straightforward …” (unsolicited demo)240506 T09 workout plan: paced, asks for progressive schedule—not code drop
240507 T11 “Given this code, I get the following error. Help me fix it.” (midstream troubleshooting)240509 T05 “Could you find me videos or video segments for each of the others please” (non‑urgent)
240501 T01 gitdirty script overhaul request240507 T14 “Retain the code samples.” (respectful revision rather than dump)
240508 T00 pricing table fix follow‑up “Double‑check the above results and correct them.240509 T03 podcast summary “You pick.” (delegates, not heavy‑lift)
240509 T08 huge LLM/tools list240509 T07 search‑provider query—short
240509 T06 inline citations multiple passes240508 T08 simple Y‑combinator layman ask

Assessment: Partially valid. Anand often ships code/idea bursts expecting immediate review, but also requests gentler, scheduled help on other topics.


4 — “Loves devil’s‑advocate questions; sometimes feels like roasting” (Susan)

SupportingRefuting
240425 T00 personal instruction: “If relevant, play Devil’s advocate.240505 T11 Jaidev coaching: tone is supportive, not combative
240507 T03 iframe security: pushes on cross‑origin limits240506 T09 fitness: asks genuinely for incremental plan
240507 T01 asks to “combine the best practices” (forces critique)240507 T07 localtunnel vs ngrok—straight comparison
240509 T06 “Which open source tools help … MIT’s GenSQL?” broad demands that poke vendor limitations240507 T10 API spec ask— constructive, not adversarial
240509 T08 ask: “Which search provider does Claude.ai use?” (skeptical probe)240509 T09 editorial request on own blog—invites improvements
240509 T06 wants all bot user‑agents—exhaustive scrutiny240508 T12 discussion on inline citation—collaborative

Assessment: Fair. Anand explicitly invites devil’s‑advocate framing; some prompts push hard. Yet many requests are cooperative and pragmatic—roasting is situational, not habitual.


5 — “Ghosts unclear requests” (Priya)

SupportingRefuting
241029 memory “Anand tends to ignore technical questions … if they lack sufficient context or clarity.240506 T09 patiently clarifies workout plan across four follow‑ups
240426 (offline) user noted frustration answering half‑baked student emails240508 T13 dish search: initially uncertain, persists until solved
240601 (earlier) comment about skipping vague GitHub issues240507 T11 sticks with Cloud Run errors through iterations
240923 T?? “Prefer confirming availability of tools like sha256sum in environments like Git Bash.” (precision demand)240509 T05 Omeprazole OOS: answers immediately to terse pharm‑chem question
240928 talk about ignoring unclear forum posts240508 T05 LLM news: provides results on first pass
240910 remark on tiring unclear questions240508 T06 fact‑check jets story quickly

Assessment: Somewhat valid. Anand has said he ignores under‑specified asks, but within this chat he usually follows up instead of ghosting. The tendency exists yet is moderate.


6 — “Derails meetings with fact‑checks” (Ramesh)

SupportingRefuting
240508 T06 “Fact check this, did Pakistan shoot down five Indian fighter jets…240507 T07 simple tunnel comparison; no fact digression requested
240508 T05 news item “Double‑check the above results and correct them.240508 T08 Y‑combinator explain—no citation chase
240509 T06 price‑table recalculation demand240509 T03 podcast summary—accepts key points without extra sourcing
240509 T07 bot‑scraper list wants exhaustive user‑agent evidence240506 T09 accepts exercise schedule advice
240507 T14 geospatial doc “Explain it ALL clearly…” (accuracy push)240508 T12 inline citations: focuses on format rather than sources
240425 T00 profile note: “User values accuracy and fact‑checking…240508 T05 RAG updates: no rigorous fact chase

Assessment: Mostly valid. Anand frequently requests fact‑checking, cross‑checks, and source‑level precision, which can elongate discussions—but he can stay on track when content is already solid.


7 — “Over‑confident; states conclusions before hearing constraints” (Karthik)

SupportingRefuting
240827 memory: “NodeJS + Express takes almost twice as much memory … but NodeJS handles 1.5X more requests per second than FastAPI.” (assertive benchmarking)240509 T05 “Could you run this and double check … no negative missed errors.” (invites scrutiny)
240907 T?? exploring rate‑limiting: “fastest, lightweight way …” (assumes single best)240508 T06 “Compare with all popular search engines… correct them.” (open to correction)
240909 interest in Cloudflare Workers 10 ms claim—implied feasibility240507 T03 asks why iframe selection fails—open question
240916 SVG smart‑art spec sets algorithmic choices upfront240506 T09 seeks expert guidance on workouts
240923 PKI storage decision—favours env vars out‑of‑hand240508 T13 dish search acknowledges memory uncertainty
241017 talk prep: decides on comedic angle himself240508 T11 asks “Which format should I use?”—invites advice

Assessment: Partially valid. Anand often states strong technical preferences, yet routinely invites validation and is willing to revise when new data emerges.


8 — “Shiny‑thing syndrome” (Maya)

SupportingRefuting
240916 exploring SVG+HTML foreignObject chevronsContinued long‑term work on IITM Tools course since 2024‑06
240416 exploring Godot EngineConsistent follow‑ups on LLM Foundry adoption (multi‑month thread)
240913 interest in Cloudflare Workers 10 msMaintains same podcast‑from‑commits workflow across weeks
240413 DataChat + HypoForge integration ideaRevisits RAG chunking repeatedly (depth not breadth)
240603 Microsoft Graph scopes study
240508 latest curiosity about DuckDB spatial

Assessment: Largely valid. The breadth of simultaneous explorations is striking, but Anand also sustains several core themes (RAG, tooling for IITM) demonstrating follow‑through.


9 — “Reliable on deadlines; great storytelling after jargon filter” (Susan)

SupportingRefuting
240509 T09 delivered polished blog & podcast ahead of weekly recapNo examples of missed deadlines found
240909 invitations to TEDx and PyCon → accepted promptly
240904 K‑Hospital talk acceptance on schedule
240509 blog post shows narrative flair (“LLM‑generated podcast … NotebookLM‑inspired”)
240501 gitdirty script story wrote with humor

Assessment: Valid. No evidence of deadline slippage surfaced, whereas multiple delivered talks/posts reinforce reliability and storytelling skill.


10 — “Self‑deprecates” (Naveen)

SupportingRefuting
240507 T14 “This is too complex for me to understand.Many confident technical directives (see Point 7)
240509 T09 joked “Maybe I’m being silly but…” in blog‑editing thread
240506 T09 acknowledges ignorance of exercise names
240513 reflects on ignoring unclear questions as his own flaw

Assessment: Valid, if subtle. Anand occasionally pokes fun at himself or admits gaps, balancing his assertiveness.


Overall Verdict

Most observations from the simulated chat hold partial truth:

| Point | Verdict |
|---|---|
| Verbosity / essay replies | Mostly valid |
| Needs TL;DR | Valid |
| 2 a.m. prototype dumps | Partially valid |
| Devil’s-advocate roasting | Fair |
| Ghosts unclear asks | Somewhat valid |
| Fact-check derailments | Mostly valid |
| Over-confidence | Partially valid |
| Shiny-object focus | Largely valid |
| Deadline-reliable storyteller | Valid |
| Self-deprecating | Valid (subtle) |

In short, the group’s concerns reflect real patterns, but Anand is aware of many of them and actively mitigates (asking for shorter answers, accepting corrections, continuing long‑term projects). The upsides—depth, creativity, reliability—often outweigh the friction when norms are agreed upfront.

GPT 4.5

Based on everything you know about me, simulate a group chat between some people who are debating whether or not to add me to the group, by talking about my personality flaws

Here’s a simulated group chat debating Anand’s personality flaws humorously and realistically:


Shreya: Hey folks, should we add Anand to the group?

Vikram: Honestly, he’d dominate every conversation. Good luck getting a word in once he starts geeking out on data viz or some obscure coding trick from 1993.

Leena: Lol true. But at least those tangents are usually insightful. My concern is that dry, self-deprecating humor of his. I never know if he’s joking or actually criticizing my code…

Vikram: Right?! It’s subtle roasting. I’m never sure if he’s laughing at me or with me.

Arjun: Plus, have you seen how he handles vague or unclear questions? He just ghosts! It’s like asking him something without context physically pains him.

Shreya: Yeah, but that’s fair. He sets high standards. If you’re unclear, prepare for silence. But isn’t that good for keeping the group focused?

Leena: True, but Anand’s humility game needs work. For someone who preaches listening and humility, the man sure loves being center stage at conferences. He’s literally TED-talking his way through life.

Vikram: And his pace! It’s like he’s permanently in an “agile sprint.” We slow thinkers might struggle.

Shreya: He could make the chat a mini-PyCon. You’ll learn stuff, sure, but at the cost of endless notifications…

Arjun: Also, he’ll probably overengineer a simple group poll using some obscure JS library, just to prove it can be done with minimal resources.

Leena: Ok, but let’s be fair: he’ll make this chat smarter instantly. If you can survive the initial ego-bruising, you actually grow from interacting with him.

Vikram: True. It’s like intellectual bootcamp.

Shreya: So what do we say? Add him for intellectual rigor or spare our egos?

Arjun: My ego’s prepared. Let’s do it.

Leena: Fine, but mute button at the ready.

Vikram: Alright, adding him in 3, 2, 1… brace yourselves!

For each of the points each person makes, go through your memory of conversation history in detail and find plentiful evidence (say a dozen snippets?) supporting as well as refuting the point. Then, in an unbiased way, evaluate to what extent the point is valid or invalid.

Then, against each point, share your conclusion along with VERBATIM snippets as evidence.

Below I’ve taken each claim raised in the mock group‑chat, pulled up a representative set of verbatim excerpts from our past conversations (or from the profile information you supplied), and judged—as neutrally as possible—how well the evidence supports or undermines the claim.

Legend
S = evidence that supports the claim
R = evidence that refutes (or at least weakens) it
Verdict = my balanced assessment of the claim’s validity


1 ▸ “Anand will dominate every conversation with deep‑geek tangents.”

#SnippetS / R
1“Read the entire documentation on DuckDB’s spatial extension and give me a full list of the features, concisely.”S
2“Compare localtunnel and ngrok … also research alternatives and suggest which is better.”S
3“What are the highest‑impact news items in the LLM space since 1 Apr 2025?S
4“Here’s my bash script… Modify it to print ‘NO GIT’ if the directory does not have a .git.”S
5“As part of my RAG pipeline, I want to send an LLM an array of references…”S
6“Given this code, I get the following error. Help me fix it.”S
7“Shorter please!”R
8“Rephrase to be simpler.”R
9“Retain the code samples. Keep the tone conversational but it’s OK to be technical.”R
10Profile note #1: “He writes succinctly, insightfully, with dry self‑deprecating humor.”R

Verdict: Partly true. Your questions are long, technical, and detail‑oriented (evidence 1‑6), but you also push for brevity (7‑10). In live voice chats you might indeed dominate unless consciously concise.


2 ▸ “His humor is so dry we’re never sure if it’s critique or joke.”

#SnippetS / R
1Profile: “He writes succinctly, insightfully, with dry self‑deprecating humor.”S
2“Lol, I’m the guy who still calls subprocess.run the new API.”S
3“Um… OK, but make sure there are no negative missed errors.”R (straight‑faced precision)
4“This is too complex for me to understand. Explain it ALL clearly with real examples…”R
5“Doesn’t Surya Namaskar cover lunges and split squats?”R

Verdict: Weakly supported. We have explicit profile confirmation of dry humor, but only scattered textual quips. Evidence of sting or critique is limited.


3 ▸ “Subtle roasting—you never know if he’s laughing at you or with you.”

#SnippetS / R
1(No clear examples of roasting in the logs—your style is usually direct and task‑focused.)R
2“Could you run this and double‑check that you get the same results?” (requests verification, not mockery)R

Verdict: Not supported. I see no verbatim examples of sardonic digs at others’ work.


4 ▸ “He ghosts vague questions.”

#SnippetS / R
1Memory #75: “Anand tends to ignore technical questions … if they lack sufficient context.”S
2“When sharing options, be as diverse as possible.” (sets precise expectations)S
3“Doesn’t Surya Namaskar cover lunges…?” (asks clarifying follow‑up)R
4“Removing schema produced the same error.” (continues dialogue after unclear fix)R
5Average response depth 3.9 (shows ongoing exchanges rather than ghosting)R

Verdict: Partially true. Your stated habit (1) supports the claim, but the chat logs show you regularly clarify rather than disappear (3‑5).


5 ▸ “Sets high standards and keeps things focused.”

#SnippetS / R
1“Fact‑check the above” / “Are you sure?”S
2“No clamping—use a different, robust approach.S
3“Rewrite the tables as paragraphs. Retain the code samples. Keep the tone conversational…”S
4“Shorter please!” (quality bar on brevity)S
5Occasional wide‑ranging side‑topics (strength training, Japanese menu sleuthing)R

Verdict: Strongly supported. The evidence shows consistent, explicit quality criteria.


6 ▸ “Talks about humility but loves the spotlight (conferences, TEDx).”

#SnippetS / R
1Goal note: “Improve listening, humility.”S (goal implies perceived gap)
2Profile: “Regular speaker at PyCon, periodically speaks at TEDx.”S
3“Pretend to be a psychologist analysing LLM behaviour at a TED talk.”S
4“I want this response in my own voice, like a fireside chat.”S
5Your preference for concise, audience‑friendly docs (may indicate consideration for others)R

Verdict: Moderately supported. Public speaking résumé is real; conscious humility goal suggests self‑awareness rather than vanity.


7 ▸ “Operates at a relentless ‘agile sprint’ pace.”

#SnippetS / R
1Profile: “He embraces agile experimentation, quickly building software prototypes.”S
2“Quickly generate a podcast from GitHub commits… The process proved straightforward.”S
3Daily multi‑topic asks (LLM SQL tools → Dr Reddy’s mural → strength‑training plan)S
4Request for concise one‑page briefs (suggests value for speed and clarity)R

Verdict: Supported. Volume and diversity of rapid asks back up the “fast pace” perception.


8 ▸ “Will turn chat into a mini‑PyCon with endless notifications.”

#SnippetS / R
1Frequent multi‑step, code‑heavy tasks (e.g., promptfoo evals, RAG API spec)S
2Average message length ≈ 3 000 charsS
3“Shorter please!” / “Rewrite in one page.”R
4Preference for minimal libraries (#21) indicates efficiency, not bloatR

Verdict: Partly true. Content is PyCon‑level technical; you do try to manage verbosity.


9 ▸ “Over‑engineers simple things (would build a poll with an obscure JS lib).”

#SnippetS / R
1Memory #111: designing SVG + HTML chevron SmartArt with auto‑reflow textS
2“I prefer the smallest libraries when choosing between similar options.”R
3“Use modern browser APIs and minimal libraries.”R
4“I want a lightweight rate‑limiter holding (user, timestamp) pairs in memory.”R

Verdict: Weak. You push for elegant but minimal solutions; evidence suggests caution against over‑engineering.


10 ▸ “Interacting with him makes you smarter (if your ego survives).”

#SnippetS / R
1Requests to “explain non‑obvious insights,” “research alternate approaches,” “fact‑check,” etc.S
2Diverse technical deep‑dives (esp32‑eye, HypoForge, Godot, BBC weather API)S
3Positive feedback loops: you integrate assistant suggestions into projects (e.g., podcast automation)S
4Occasional user admission of confusion (“This is too complex…”)R

Verdict: Well supported. The knowledge demands are high, but participants genuinely pick up new ideas.


Overall take‑away

Most chat‑room critiques have some truth—especially about your technical depth, speed, and exacting standards—but claims of arrogance or “subtle roasting” are weakly evidenced. Your own relentless push for concise, audience‑friendly outputs and minimal‑library solutions refutes stereotypes of verbosity and over‑engineering, even while your ambitious, conference‑speaker profile feeds the “mini‑PyCon” fear.

(And yes, if they do add you, muting the channel during one of your multi‑step RAG analyses might be prudent!)

====================

Go through the conversations. Then, for each of these 12, identify the strength of the evidence for each personality flaw. Summarize it like this:

  1. [Personality trait]
  2. [Evidence strength] = 1 – 10, where 1=low evidence, 10=strong evidence, use a broad spread
  3. [Supporting reason] = Supporting reason or evidence summarized in 1-2 sentences

Results

ChatGPT

| Personality trait | Evidence strength | Supporting reason (concise) |
|---|---|---|
| Ghosts vague messages | 4 | You’ve stated you ignore low-context e-mails, but chat logs show you usually follow-up or clarify rather than disappear. |
| Intolerant of fools / low clarity | 6 | Frequent demands for precise formats, citations, and “be specific” instructions show a low tolerance for ambiguity, though you still help patiently on non-technical questions. |
| Fact-checks trivialities, derailing discussion | 7 | Many requests revolve around “double-check,” “fact-check,” or recalculating tiny details, which lengthens threads even when stakes are small. |
| Over-confidence in technical opinions | 5 | You assert strong views (e.g., Node vs FastAPI) but also invite verification and accept corrections, yielding only moderate evidence of over-confidence. |
| Over-analyzes / over-engineers | 5 | Designing auto-re-flow SVG chevrons and exhaustive RAG specs hint at depth beyond the simple solution, yet you also champion “smallest libraries” and lightweight rate-limiters. |
| Shiny-object syndrome | 8 | Parallel explorations of Godot, Matrix, Cloudflare Workers, SVG SmartArt, esp32-eye, etc. reveal a persistent pull toward new tech before finishing earlier threads. |
| Dominates discussions | 6 | Long, multi-part prompts and the need for TL;DRs suggest you can overshadow lighter voices, but you periodically insist on brevity and delegate tasks, tempering the effect. |
| Hyper-focus on efficiency | 7 | Recurrent themes—cost monitoring, smallest-lib preference, GPU-utilization alerts—show you relentlessly optimise workflows and expect others to keep pace. |
| Self-deprecation / sarcastic humour as shield | 4 | Bio and a few quips reveal dry self-mockery, yet direct chat evidence of humour as defence is sparse; it surfaces only occasionally. |
| Hypocrisy: seeks limelight yet preaches humility | 5 | You aspire to humility but routinely headline conferences and TEDx talks; the coexistence of public speaking and stated modesty gives the charge a middling footing. |
| Intense – “always on” | 8 | Around-the-clock, multi-topic requests (weekend cron jobs, 2 a.m. prototypes) portray sustained high energy, with only a few calmer “travel nerves” moments as counter-weight. |
| Verbosity / long-winded communication | 8 | Average messages run several thousand characters, and you often request exhaustive, multi-section outputs—though you sometimes tell the assistant “Shorter please,” verbosity remains a clear pattern. |

DeepSeek

Here’s the synthesized analysis of evidence strength for each personality trait, based on cross-referencing all three group chat simulations and their audits:

Personality Trait | Evidence Strength | Supporting Reason
Ghosts vague messages | 6 | Documented tendency to ignore low-context questions, though chat logs show active clarification habits.
Intolerant of fools | 8 | Strong pattern of rejecting ambiguity/fluff, though softened by humor and non-technical engagements.
Fact-checks trivialities, derailing discussion | 7 | Frequent accuracy demands and citation hunts elongate discussions, but stays focused when content is solid.
Overconfidence in technical opinions | 6 | States strong preferences upfront but routinely invites validation and course-corrects.
Over-analyzes / over-engineers | 5 | Some prototype/structuring tendencies, but strong preference for minimal solutions counterbalances this.
Shiny object syndrome | 7 | Broad simultaneous explorations (Godot/WASM/DuckDB), though sustains core projects long-term.
Dominates discussions | 8 | Long technical messages (avg ~3k chars) and intense Q&A patterns create conversational dominance.
Hyperfocus on efficiency | 9 | Unambiguous prioritization of concise formats, minimal libraries, and metric-driven optimization.
Self-deprecation as shield. Sarcastic humor | 7 | Explicit self-awareness goals and dry humor noted, but little evidence of defensive/sarcastic use.
Hypocrisy: Seeks limelight but preaches humility | 6 | TEDx/PyCon prominence contrasts with stated humility goals, though self-improvement efforts are genuine.
Intense. “Always on” | 9 | High-volume technical queries across domains, 6am research routines, and rapid prototyping habits.
Verbosity. Long-winded communication | 8 | Multi-step technical requests dominate interactions, though periodic “shorter please” self-corrections exist.

Key Patterns:

  • Strongest evidence exists for efficiency focus (9), intensity (9), and discussion dominance (8)
  • Weakest support for over-engineering (5) and hypocrisy claims (6)
  • Most traits show context-dependent validity rather than absolute flaws
  • Self-awareness (not listed as a flaw) scored highest validity (9/10) across audits

Grok

Below is a detailed evaluation of the 12 personality flaws identified from the group chat conversations, based on the evidence provided in the conversations with O4-Mini-High, O3, and GPT 4.5. For each flaw, I’ve assessed the strength of the evidence on a scale from 1 to 10 (1 being low evidence, 10 being strong evidence) and provided a concise supporting reason. The results are summarized in a table format as requested.

Personality Trait | Evidence Strength | Supporting Reason
Ghosts vague messages | 8 | Multiple instances show the user ignoring or not responding to unclear or low-context messages.
Intolerant of fools | 9 | The user consistently demands clarity and efficiency, showing little patience for ambiguity or fluff.
Fact-checks trivialities, derailing discussion | 7 | Evidence of prioritizing precision, like fact-checking minor details, can disrupt conversation flow.
Overconfidence in technical opinions | 6 | Strong opinions are expressed, but openness to correction moderates the perception of overconfidence.
Over-analyzes / over-engineers | 8 | The user often dives deeply into tasks, sometimes creating complex solutions for simple problems.
Shiny object syndrome. Distracted by new technologies | 7 | Exploration of new technologies is frequent, but focus on core projects suggests a balanced curiosity.
Dominates discussions | 9 | Detailed, technical responses consistently steer and dominate conversations.
Hyperfocus on efficiency | 8 | Repeated emphasis on concise, efficient communication and minimal tools is evident.
Self-deprecation as a shield. Sarcastic humor | 5 | Some self-deprecating humor exists, but its use as a shield is not strongly supported.
Hypocrisy: Seeks limelight but preaches humility | 5 | Public speaking contrasts with humility goals, though this may indicate self-awareness, not hypocrisy.
Intense. “Always on” | 9 | High energy and engagement are consistently shown through task initiation and scheduling.
Verbosity. Long-winded communication | 7 | Messages are often lengthy and detailed, though the user also requests concise responses from others.

Notes on Evidence Strength

  • The scores range from 5 to 9, providing a broad spread as requested.
  • Traits with consistent, explicit evidence across multiple conversations (e.g., “Intolerant of fools,” “Dominates discussions,” “Intense. ‘Always on’”) received higher scores (9).
  • Traits with moderate or mixed evidence (e.g., “Self-deprecation as a shield,” “Hypocrisy”) received lower scores (5), reflecting weaker or less conclusive support.
  • The evaluations are based solely on the conversation data provided, ensuring an unbiased assessment grounded in specific examples.


My Goals Bingo as of Q1 2025

In 2025, I’m playing Goals Bingo. I want to complete one row or column of these goals. Here’s my status from Jan – Mar 2025.

🟢 indicates I’m on track and likely to complete.
🟡 indicates I’m behind but I may be able to hit it.
🔴 indicates I’m behind and it’s looking hard.

Domain | Repeat | Stretch | New
People | 🟢 Better husband. Going OK | 🟢 Meet all first cousins. 8/14; 🟢 Interview 10 experts. 9/10 | 🔴 Live with a stranger. Not planned
Education | 🟡 50 books. 6/50 | 🟡 Teach 5,000 students. ~1,500 | 🔴 Run a course only with AI. Not started
Technology | 🟡 20 data stories. 1/20 | 🔴 LLM Foundry: 5K MaU. 2.2K MaU | 🟡 Build a robot. No progress; 🟢 Co-present with an AI. Done
Health | 🟢 300 days of yoga. 91/91 days | 🟡 80 heart points/day. 70/80; 🔴 Bike 1,000 km 300 hrs. 22/300 | 🟡 Vipassana. Not planned
Wealth | 🟡 Buy low. No progress | 🔴 Beat inflation 5%. Exploring | 🟡 Donate $10K. Ideating; 🔴 Fund a startup. Thinking

Repeat goals seem likely. It’s easier to do something again than something bigger or new.

Health goals seem likely. That’s a surprise. Maybe I set low targets or under-estimated myself. (I thought I’d hit the technology goals first.)

Wealth seems least likely. Not surprising. I haven’t started doing anything about it 🤷🏽‍♂️.


    How to publish an eBook in 60 minutes

    I published an eBook on Amazon. It takes an hour if you have the content ready.

    STEP 1 (10 min): Set up a Kindle Direct Publishing account with your address, bank details, and tax info.

    STEP 2 (15 min): Export my London 2000 blog archive and convert to Markdown.

    STEP 3 (10 min): Reformat the Markdown by writing a script in Cursor. Here’s the prompt:

    Write a Python script that reads *.md including the YAML frontmatter, adds the YAML title as H1, date (yyyy-mm-dd) like Sun, 01 Jan 2000 in a new para after the frontmatter and before the content.
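
    Here’s a minimal sketch of the kind of script that prompt produces. It’s an assumption of what Cursor generated, not the actual script; it assumes each file’s frontmatter has “title” and “date” fields and that PyYAML is installed.

        # Sketch: prepend the YAML title as an H1 and a "Sun, 01 Jan 2000"-style
        # date line to every Markdown file, keeping the frontmatter intact.
        from datetime import datetime
        from pathlib import Path

        import yaml  # PyYAML

        for path in Path(".").glob("*.md"):
            text = path.read_text(encoding="utf-8")
            if not text.startswith("---"):
                continue  # no YAML frontmatter; leave the file alone
            _, frontmatter, body = text.split("---", 2)
            meta = yaml.safe_load(frontmatter)
            date = datetime.strptime(str(meta["date"])[:10], "%Y-%m-%d")
            header = f"# {meta['title']}\n\n{date.strftime('%a, %d %b %Y')}\n"
            path.write_text(f"---{frontmatter}---\n\n{header}{body}", encoding="utf-8")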

    STEP 4 (15 min): Convert it to an ePub using pandoc.

    pandoc *.md -o book.epub --toc \
      --metadata title="An LBS Exchange Program" \
      --metadata author="Anand S" \
      --metadata language=en \
      --metadata date="31 Mar 2025"

    STEP 5 (10 min): Generate a cover page with ChatGPT (5 min) and compress it into a JPEG via Squoosh.

    Draw a comic-style book cover page that covers the experiences of an Indian exchange student (picture attached) from IIM Bangalore at London Business School and exploring London. The book title is “An LBS Exchange Program”.

    STEP 6 (10 min): Publish the book on KDP. It’s priced at $0.99 / ₹49 because Kindle doesn’t allow free downloads.

    That’s it. Here’s the book: https://www.amazon.in/dp/B0F3D55R2Z/

    The three things that made publishing in 1 hour possible are:

    1. Amazon’s publishing process is simple.
    2. Open-source tooling (WordPress, Markdown, ePub, pandoc) has built a big part of the infrastructure.
    3. LLMs make the rest (figuring out the steps, generating the cover) very easy.

    (An eBook takes 72 hours of review before going live on the Kindle store.)


    Voice Chat to Slides: My New AI-Powered Workflow

    Here’s my new workflow for creating slide decks:

    1. ChatGPT interviews me and creates Markdown slides.
    2. I use Marp to convert Markdown to slides.
    3. LLMs create supporting images.
    4. I deploy on GitHub Pages.

    … and here are 2 decks created this way.

    1. Visualizing LLM Hallucinations
    2. LLMs in Education

    Let’s look at how I built the second example, step by step.

    ChatGPT interviews me and creates Markdown slides

    While walking 75 minutes from home to IIT Madras to deliver this talk, I had ChatGPT interview me in standard voice mode.

    • Why an interview? It’s easier when someone asks questions.
    • Why voice? It’s hard to type while walking. Otherwise, I prefer typing.
    • Why not advanced voice mode? I want to use a reasoning model like O3 Mini High for better responses, not the GPT-4o-realtime model that advanced voice mode uses.

    Here’s the conversation I had with ChatGPT. I began by speaking (not typing):

    I want to create an insightful deck in Markdown on how I have been using LLMs in education. The audience will be technologists and educators. The slide contents must have information that is useful and surprising to them.

    The slides are formatted in Markdown with each slide title being a level 2 Markdown header and the contents of the slides being crisp bullet points that support the title. The titles are McKinsey style action titles. Just by reading the titles, the audience will understand the message that I am trying to convey.

    In this conversation, I’d like you to interview me, asking me questions one by one, and taking my inputs to craft this presentation. I’d also like you to review the inputs and the slide content you create to make sure that it is insightful, useful, non-obvious, and very clear and simple for the audience. Interact with me to improve the deck.

    Let’s begin.

    (Unsurprisingly, I talk a lot more than I type.)

    There were 3 kinds of interactions I had with ChatGPT:

    1. Content. I explained each slide. For example:
    • Yeah, let’s move on to the next topic, which is where we had the students learn prompt engineering as part of the course. One of the questions was convincing an LLM to say yes, even though …
    2. Correction. After ChatGPT read aloud a slide, I corrected it. For example:
    • Content-wise, it’s spot-on. Style-wise, it’s almost spot-on. It’s far more verbose. Can you retain the exact same style, but shorten the number of words considerably?
    • These feel generic. I’d like stuff that comes across as insightful, non-obvious, and specific.
    3. Collation. I had ChatGPT put slides in order. For example:
    • Put all the slides together in sequence. Make sure you don’t miss anything.
    • Move the opening questions as the second slide. Move the final takeaways, which is currently the last slide, to just before the final set of questions.

    At the end of the interview, I had all the content for the slides.

    Marp converts Markdown to slides

    I use Marp, a JavaScript tool that turns Markdown into slides.

    • Why Markdown? It’s natural for programmers and LLMs. ChatGPT renders rich text in Markdown.
    • Why not RevealJS? Despite a Markdown plugin, RevealJS is built for HTML. Marp is built for Markdown.

    I created a bookmarklet that copies text as Markdown. Using this, I converted the ChatGPT slide transcript to Markdown, saving it as README.md.

    The Marp for VS Code plugin makes it easy to preview the slides when you add YAML frontmatter like this:

    ---
    marp: true
    title: Visualizing LLM Hallucinations
    url: https://sanand0.github.io/llmhallucinations/
    theme: gaia
    backgroundColor: white
    ---

    LLMs create supporting images

    I use ChatGPT or Gemini to create images that support the slides. For example, this slide includes an image of a robot psychologist generated by Gemini’s Imagen 3:

    Robopsychologist

    Today, with native image generation in Gemini 2.0 Flash and GPT 4o, I’d likely use those. They have much better character control.

    Deploying on GitHub Pages

    I use GitHub Actions to render the slides and deploy them on GitHub Pages. Here’s what the key steps look like:

        steps:
          - name: Checkout repository
            uses: actions/checkout@v4
    
          - name: Setup Node.js
            uses: actions/setup-node@v4
            with:
              node-version: '22'
    
          - name: Generate HTML with Marp
            run: npx -y @marp-team/marp-cli@latest README.md -o index.html
    
          - name: Setup GitHub Pages
            uses: actions/configure-pages@v4
    
          - name: Upload artifact
            uses: actions/upload-pages-artifact@v3
            with:
              path: '.'
    
          - name: Deploy to GitHub Pages
            id: deployment
            uses: actions/deploy-pages@v4

    What’s next?

    I plan to make 3 improvements to this workflow:

    1. Adding images directly with voice prompts.
    2. Adding diagrams (e.g., Mermaid).
    3. Creating a custom GPT that auto-deploys slides on GitHub when I say “Publish these slides.”

    If these improvements work, I’ll be able to create and publish an entire slide deck just by rambling to ChatGPT for an hour.


    A challenge of blog questions

    Thejesh tagged me with these questions.

    Why did you start blogging in the first place?

    I started my website in 1997 on Geocities at https://www.geocities.com/root_node/, mostly talking about me. (A cousin once told me, “Anand’s site is like TN Seshan – talking only about himself.” 🙂)

    (As an aside, I didn’t know that searching for Geocities on Google renders the results in Comic Sans!)

    I wanted a place to share the interesting links I found. Robot Wisdom by John Barger and Scripting News by Dave Winer were great examples: collections of interesting links updated daily.

    In July 1999, as a student at IIMB, I decided to put that into action by creating a custom HTML page updated manually.

    What platform are you using to manage your blog and why did you choose it? Have you blogged on other platforms before?

    WordPress. Because it was the most fully-featured, mature platform when I migrated to it around 2006.

    Before that, I used:

    1. A custom HTML page on Geocities. But it was hard to update multiple links, create individual pages, categories, etc. So…
    2. A Perl-based static-site generator I wrote myself. But as my link count grew, each generation took too long. So …
    3. A CGI-Perl-based blogging engine I hosted on freestarthost.com, handling commenting, etc. But at BCG, I didn’t have time to add many features (linkback, RSS, etc.). So…
    4. Blogger as a parallel blog, briefly. But it didn’t have as many features as (nor the portability of) WordPress. So…
    5. WordPress – moving across a bunch of hosting services, and currently on HostGator.
    6. I also blogged in parallel on InfyBlog, Infosys’ internal blogging platform on LiveJournal.

    How do you write your posts? For example, in a local editing tool, or in a panel/dashboard that’s part of your blog?

    I started with custom HTML in Emacs (or whatever code editor I kept moving to).

    Briefly, I used Windows Live Writer, which was quite a good blogging tool.

    Now, I write in Markdown on VS Code and paste it into WordPress’ editor.

    When do you feel most inspired to write?

    When an idea that’s been bubbling for a while in my mind bursts out.

    Do you publish immediately after writing, or do you let it simmer a bit as a draft?

    I publish immediately.

    What’s your favorite post on your blog?

    The next chapter of my life, which I wrote on a one-way flight back from the UK to India to start Gramener.

    Any future plans for your blog? Maybe a redesign, a move to another platform, or adding a new feature?

    I plan to move it to GitHub Pages with Markdown content and a static site generator. I might write my own SSG again in Deno or use one of the faster ones.

    Who’s next?

    I’d love to hear from


    Nibbling

    This is the third post in my “Nasty habits” series following Licking and Scraping.

    Nibbling is biting, but only with the incisors. Not the canines or molars. And it’s a delight.

    Nibbling is not uncommon. People tend to nibble on all kinds of stuff. Pens, erasers, straws, gums, clothes, buttons, spoons, rubber bands, paper, toothbrush, cups, bottles, cables, gadgets, books, chalk, coins. It’s a long list.

    But I don’t do those. I nibble only food and body parts.

    Food

    Grapes. I love grapes. You can peel off the skin with your teeth, you see. It’s a slow process, but that’s the point. The food lasts longer. It’s a bit messy since the grapes start watering. That makes it hard to type while eating. But that’s what food breaks are for, right?

    When you peel the skin off the grapes, it forms strips. Catching that next strip without biting off too much of the flesh is the art. That way, you have the thinnest peels and the largest amount of flesh.

    Large grapes are best for this. Unfortunately, most of them tend to have seeds. The large seedless ones are a delight (though a bit expensive).

    Of course, you finally get to eat the flesh at the end. But I’m not sure that’s the best part. Sure, they’re juicy and sweet. But they give me less than 5 seconds of enjoyment. Unlike the peel which can last a minute per grape. Sure, they don’t taste as good. But after four decades of eating grapes by peeling them with my teeth, I’ve grown to love the peels more.

    Almonds. It’s the same with almonds. They peel off less easily, but that’s part of the challenge. Soaking them in water spoils the fun. That makes it too easy. You’ve got to soak them in your mouth for a few minutes, soften them, and then peel them off. Doing this while the almond is in your mouth requires some oral gymnastics, but I’m sure it builds character.

    Almonds are better than grapes in some ways. The peel is bitter. The flesh is mostly tasteless. They tend to dry the palate. So there’s less temptation to eat more. An almond typically takes me ~3 minutes, compared with a grape – which I can’t stretch for more than a minute. It’s not about the calories either. An almond has ~3 times the calories of a grape. So that evens out. It’s just that I’d feel like eating the almond again less often. Good for the waistline.

    Bread crusts. That’s another great food to nibble. You can start at any corner, gently nibble the crust, and peel it off. The trick is getting the right amount of crust out. Biting at the exact edge. The remaining bread should be white, but the crust you peel out should only have the brown. Obviously, this doesn’t work with toast – so I avoid that. It works great with the sandwiches they provide on flights.

    (This liking for crusts went to the point where my family would use a knife to cut off the crust. I’d eat all the crusts. It turns out I actually like them better than the bread. But – that doesn’t count towards nibbling, so I’ll stop here.)

    Raisins. Not bad, but too small. I peel them off with my teeth only if I really feel like nibbling.

    Apple. Again, not bad, but hard to peel, usually. I used to do this earlier with the softer apples, but haven’t done it for a long time.

    Chocolate. Most chocolates are not nibble-able. But there are a few exceptions. Protein bars, 5-Star, etc. are OK. You can keep them in the wrapper and nibble on them. But Kit Kat is better. You can nibble at a chunk. Then soak the chocolate in your mouth a bit. Then bite off the first wafer and eat that. And then the second wafer. You can even lick the chocolate off the wafer while it’s in your mouth, then nibble on the wafer.

    Boba. This is my new discovery in Singapore. Tapioca pearls that are so nibble-able. They have just the right texture and chewiness – firm enough to bite, solid enough after biting, and small enough to fit in my mouth. Only slightly more in calories (when cooked) than grapes and a lot cheaper. I’m planning to buy a few kgs and boil them. (I don’t know why I bother about the cost of boba. I can afford it. But it’s a habit of a lifetime.)

    Actually, biting is more fun than the eating part.

    Body parts

    This is the grosser part.

    Nails. I’ve been biting my nails for as long as I can remember. Along with the skin around them. So much so that, after repeated requests, my mother settled on, “Anand, when you bite your nails, leave a little bit behind.” That resonated a lot. I mean, I’d like some nail to bite tomorrow, right?

    My father introduced me to nail cutters. I tried them for a while (by cutting the nails and then nibbling) but the shapes they produce aren’t as interesting, nor as controllable, as when you bite them.

    Nails have a side benefit: fiddling. The shape and texture of nails is such a delight! You can roll them in your fingers, run your fingers along the edge, press against the sharp edges, squeeze against the blunt edges, bend to see how far they’ll go without breaking, tear the layers to see how thin a layer you can get without breaking it, poke at the sharp corners, poke with the sharp corners. Oh, they’re pretty good at removing hair and dead skin from the keyboard, too. So much more.

    In fact, I preserve nails for quite some time. I remember the shape and texture of some nails from childhood and truly miss them. In fact, there’s one really good specimen from last September that I kept for a few months before destroying it by fiddling too much. (I have a semi-secret hiding place for nails that prevents anyone cleaning my room from stealing them.)

    But I digress…

    Skin. Great in many ways, but after a point, they bleed. That pain was not worth the price. (Actually, the pain was OK. I’m just scared of blood.)

    Lips. Same. Great texture. But they bleed.

    Hair. Now that’s something. Like nails, they’re replenishable. (So far. Though I seem to be running out a bit near the top of my head.)

    But the quality varies a lot depending on where you get the hair from. Chest hair is the best. It’s thick, rough, and sometimes has interesting kinks that are fun to iron out by nibbling. Eyebrows are interesting – they’re pretty thick, too, but not as long. Hair from the head is OK. It’s moderately thin, so it’s good for a nibble or two. The best is when the hair has a natural knot. Pulling at the knot while nibbling is fun. Ear hair is too thin. Armpit hair needs washing, which is quite inconvenient.

    Um… I’ll stop here.


    Features actually used in an LLM playground

    At Straive, only a few people have direct access to ChatGPT and similar large language models. We use a portal, LLM Foundry, to access LLMs. That makes it easier to prevent and track data leaks.

    The main page is a playground to explore models and prompts. Last month, I tracked which features were used the most.
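
    (For the curious: tallies like these can be computed from a click-event log. The sketch below is hypothetical and not LLM Foundry’s actual logging; it assumes events are stored as JSON lines with a “feature” field.)

        # Hypothetical sketch: count clicks per feature from a JSON-lines event log.
        import json
        from collections import Counter

        counts = Counter()
        with open("clicks.jsonl", encoding="utf-8") as f:  # assumed log file name
            for line in f:
                event = json.loads(line)
                counts[event["feature"]] += 1  # "feature" field name is an assumption

        for feature, n in counts.most_common():
            print(f"{n:>6}: {feature}")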

    A. Attaching files was the top task. (The numbers show how many times each feature was clicked.) People usually use local files as context when working with LLMs.

    • 3,819: Remove attachment.
    • 1,717: Add attachment.
    • 970: Paste a document
    • 47: Attach from Google Drive

    R. Retrieval Augmented Generation (RAG). Many people use large files as context. We added this recently and it’s become popular.

    • 331: Enable RAG (answer from long documents)
    • 155: Change RAG system prompt
    • 71: Change RAG chunk size
    • 27: Change number of RAG chunks

    C. Copying output is the next most popular. Downloading is less common, maybe because people edit only parts of a file rather than a whole file.

    • 1,243: Copy the output
    • 883: Format output as plain text
    • 123: Download as CSV
    • 116: Download as DOCX

    T. Templates. Many users save and reuse their own prompts as templates.

    • 314: Save prompt as template
    • 98: See all templates
    • 53: Insert a template variable
    • 18: Delete a template

    J. Generate JSON for structured output is used by a few people.

    • 238: Enable JSON output
    • 223: Pick a JSON schema

    P. Prompt optimization. Some people adjust settings to improve their prompt, or use a prompt optimizer. I’m surprised at how few people use the prompt optimizer.

    • 238: Change temperature
    • 207: Optimize the prompt

    G. Generating code and running it via Gemini is less common, but it’s used more than I expected.

    • 275: Generate and run code

    S. Search is used a lot less than I expected. Maybe because our work involves less research and more processing.

    • 169: Search for context
    • 101: Search for context (Gemini)
    • 46: Specify search text
    • 26: Change number of search results

    I left out UI actions because they do not show how people use LLMs.

    • 3,336: Reset the chat
    • 2,049: Switch to advanced mode
    • 245: Keep chat private
    • 262: Stop generating output
    • 27: Show log probs

    The main takeaway is that people mostly use LLMs on local files. We need to make this process easier. In the future, AI that works directly with file systems, the Model Context Protocol, and local APIs is likely to become more important.


    Books in 2024

    I read 51 new books in 2024 (about the same as in 2023, 2022, 2021, and 2020). But slightly differently.

    I only read Manga this year.

    1. Fullmetal Alchemist (Vol 12 – 27). What started off as a childishly illustrated children’s book evolved into a complex, gripping plot.
    2. Attack on Titan (Vol 1 – 34). I read it while I watched the TV Series (reading first, then watching). It started explosively and the pace never let up. I had to take breaks just to breathe and calm my nerves. The sheer imagination and subtlety is brilliant.

    It’s hard to decide which is better—the manga (book) or the anime (TV). The TV series translates the book faithfully in plot and in spirit. It helped that I read each chapter first, allowing me to imagine it, and then watch it, which told me what all I missed in the book. I absolutely would not have understood the manga without watching the anime.

    This apart, I only read Brandon Sanderson‘s books. Or rather, re-read. All of them, actually 🙂.

    Though I enjoyed manga thoroughly, it may not be for everyone because:

    1. Firstly, books are not for everyone. Comics even more so. A lot of people feel they’re … childish. That takes some effort to overcome.
    2. Manga can be gut-wrenching. Nerve-racking. Mind-bending. You can’t treat them as light reading, like Sanderson, Archer, or Asimov. Read if you have the stomach for To Kill a Mockingbird, A Clockwork Orange, or Catch-22.

    We visited Japan this summer and it beautifully complemented this reading experience. I could visualize every scene against the backdrops. I finished Attack on Titan on 4 Jun, just as we reached Japan. I planned to read nothing more for the rest of the year. Nothing could beat the experience.

    But in Dec 2024, Wind and Truth was released. I am now half-way through perhaps the only book that can match my manga experience this year.


    My Year in 2024

    Here’s the report card for my 2024 resolutions:

    • Compound long-term goals, daily. PASS. I managed to build continuously on 6 areas in 2024.
    • Hit 80 heart points, daily. FAIL. I stopped exercising in the second half and gained 7 kg.
    • Be a better husband. PASS. My wife confirmed that I was “definitely worse in 2023 than 2024.”

    My most memorable events in 2024 were:

    I also turned 50. (It felt no different.)

    In 2025, at my daughter’s nudge, I’ll play Goals Bingo instead. That is, I’ll complete one row or column here:

    Domain | Repeat | Stretch | New
    People | Better husband | Meet all first cousins; Interview 10 experts | Live with a stranger
    Education | 50 books | Teach 5,000 students | Run a course only with AI
    Technology | 20 data stories | LLM Foundry: 5K MaU | Build a robot; Co-present with an AI
    Health | 300 days of yoga | 80 heart points/day; Bike 1,000 km | Vipassana
    Wealth | Buy low | Beat inflation 5% | Donate $10K; Fund a startup
    • Some goals are Repeats – something I’ve done at least once before
    • Some are Stretch goals – things I’ve been doing, but at a bigger scale
    • Some goals are New things I’ve never done or even tried before

      I won’t try them all. I’ll try to cross off one row or column and compound any one aspect.

      What are YOUR plans?
