S Anand

One Year of Transforming Thoughts by Changing Environments

From The Extended Mind I learnt that our environment shapes our thinking more than I’d expected. That we can arrange our environment to extend our thoughts. In 2023, each month I changed something in my environment to see: What does “changing my environment” involve? What can I change? Will I succeed? Does it affect my thoughts? Can I track this? Here are the results. ...

Things I Learned - 24 Dec 2023

This week, I learned:
- DPO is a simpler alternative to RLHF for fine-tuning. Several HuggingFace models use DPO for training.
- Name2Vec is a potential embedding for names.
- Google Knowledge Graph ID powers the Knowledge Graph. If it begins with /m/ it’s the same as the Freebase ID. This is now available in Wikidata, e.g. https://www.wikidata.org/wiki/Property:P2671
- I tried running Mixtral-8x7b locally (via Llamafile) and on together.ai. It’s good, but far from GPT-4.
- Generic compute-intensive algorithms eventually beat domain-specific tuning, because of Moore’s law. Ref
- The Hidden Brain podcast: The Mystery of Beauty
  - Evolution drove us to beauty as an efficient survival mechanism. Understanding the world is one such mechanism. Hence we enjoy maths and chess.
- ⭐ This leaderboard included paid models like GPT-4 and Claude and compared them with open models on HUMAN + system benchmarks.
- Lex Fridman Podcast: Jeff Bezos
  - Build stuff that is so ubiquitous that other people take it for granted. The initial idea needs to be that obvious and easy. Like one-click purchase or customer reviews.
  - Build stuff that other people can build on. The Internet makes startups possible. Infrastructure is about enabling others at scale.
  - Decision making approaches: a single person decides on two-way doors. Deliberate as a team on one-way doors.
  - Conflict resolution: disagree and COMMIT. NO sniping, I-told-you-so, or malicious compliance. Avoid compromise. Avoid decision by attrition (most persistent wins).
  - People are inherently biased towards hierarchy. So the senior-most person should speak last.
  - We have a happiness bias. Counteract it by choosing the unhappier options first.
  - The map is not the territory. The metric is not the objective. We need metrics. But make sure you know why.
  - See the world through the eyes of the customer. Use your own product. It’s living their lives that makes customer obsession real. Jeff Bezos called their own customer care to see how long the actual wait time was. It was much longer than the metric reported.
  - How to prioritize: whatever problems customers will still face in 10 years are the big problems. These are worth putting time into because they are stable in time.
  - People working on big problems will never get down to the small problems. So have a dedicated team that works only on the paper cuts.
  - We co-evolve with our tools. We build tools and then our tools change us. They reprogram our brains.
  - Carve out 10 minutes at the beginning of each meeting for people to read the material. They never pre-read anyway. This makes the meetings more productive.
  - PowerPoint is designed for persuasion, not truth seeking. It is also easier for the author than for the reader. Prefer narratives that are focused on finding the truth and are easier for the audience though tougher for the author.
- ⭐ whisper-standalone-win provides a Windows binary for Faster-Whisper. It just needs CUDA and cuDNN installed. Then whisper-faster.exe video.mkv --language=English --model=medium generates the transcript.
- LLM use cases by Benedict Evans
  - “Every text box on the internet will get an LLM”
  - “Infinite interns”
  - “Every UNIX function has become a company.”
  - “Every ChatGPT suggestion…”
- llm360 publishes models along with training datasets.
- In The Age of AI has begun, Mar 2023, Bill Gates says, “In my lifetime, I’ve seen two demonstrations of technology that struck me as revolutionary.” The GUI (1980) and ChatGPT (2022).
- Rubeus is an HTTP proxy for multiple LLMs with load-balancing, fallbacks and retries.
- GPTRouter is a Python interface for multiple LLMs with fallbacks and retries.
- ⭐ Token Tally has an LLM Cost Tool that estimates GPU memory required and token cost across cloud providers.

Things I Learned - 17 Dec 2023

This week, I learned:
- Grab: improving last-mile delivery in maps. When did people pick up the phone? When should a driver be allocated to minimize waiting time? It’s a layer on top of OSM.
- Singapore developed the Sea Lion 7b model.
- Try vLLM with AWQ format. Can do batch inferencing. Needs a good GPU.
- Amex predicts whether customers can pay back in 1 year or 18 months. That choice is a business decision. In real time. Precompute an individual score and use it as input to another model. The model must be explainable by regulation, so they create decision tree models. The compliance team must agree before a feature can be used. Can’t use gender. Age (in US, Canada): high age is more risk. Can’t use education level in the US.
- Capture information from cameras and use LLMs. Like traffic camera mapping. Explore GIS from video cameras.
- Grab tracks road closures and road accidents and whether a cycle can go on a road vs a bike vs a car.
- All drivers have a front-facing camera.
- Drivers report road accidents by pressing a button.
- Amex prices individual loans when selling to a collection agency.
- #TODO buy a bike head camera!
- Playwright is a browser-based test framework. Supports recording.
- OpenAI provides logprobs for tokens! This can be used to create cool visualizations of the likelihood of each token.
- Github Copilot’s new features make your entire workspace or a specific file its context. It also auto-writes your commit messages and PR descriptions.
- Mixtral-8x7b-Instruct “… really does seem to be equivalent in quality to ChatGPT 3.5.” Ref
- Practical AI podcast
  - Advent of Gen AI is going on. Explore adding it to the Tools in Data Science course.
  - Model validation: write a book as an open source GitHub repository. Easier to evolve and easier to get feedback on.
  - Explore utterances as a GitHub commenting platform.
  - Automatically give credit to contributors who have sent a pull request that was accepted or an issue that was fixed. This encourages contribution.
  - Visit book.premai.io
- ast-grep is a semgrep alternative that focuses on code refactoring rather than security. Comby is another such tool.
- Serply is a Google Search API alternative to Google CSE.
- ⭐ Generate textbooks! ChatGPT is good at generating questions or training datasets. It genuinely creates them rather than replicating from memory. Ref
- v0.dev creates web pages from code. Example.
- LIDA from Microsoft is an LLM-based data visualization tool.
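
The logprobs note above lends itself to a quick text visualization. Here is a minimal sketch, assuming you have already pulled (token, logprob) pairs out of an API response. The sample values and the token_bars helper are made up for illustration; no API call is made.

```python
import math

def token_bars(token_logprobs, width=20):
    """Return one text line per token with a bar sized by its probability."""
    lines = []
    for token, logprob in token_logprobs:
        prob = math.exp(logprob)          # a logprob is ln(probability)
        bar = "#" * round(prob * width)   # longer bar = more likely token
        lines.append(f"{token!r:>12} {prob:6.1%} {bar}")
    return lines

# Hypothetical (token, logprob) pairs, NOT real model output
sample = [("The", -0.05), (" sky", -0.9), (" is", -0.01), (" blue", -2.3)]
for line in token_bars(sample):
    print(line)
```

Short bars flag the tokens the model was least sure about, which is usually where to look first.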

Things I Learned - 10 Dec 2023

This week, I learned:
- Bard supports extensions that include @Gmail – i.e. converse with your email.
- llama-cpp-python works with other GGUF models like Mistral and allows constrained output - JSON, function calling, etc. Ref
- 12 Tuning Strategies for RAG
- Llama Datasets are RAG datasets created mostly using GPT-4. Mostly small datasets.
- ⭐ Intuitions about large language models
  - Bigger models (70b) are much better at learning from few-shot examples. They really learn.
  - Bigger models will keep getting better!
  - Chain of Thought prompting is a way of giving complex problems the extra compute they need.
  - Models will show emergent (completely new) behaviors that can’t be predicted from extrapolation. These may not be intentional.
- CodeAnt.ai is a VS Code plugin to detect code smells, refactor for modularity, and write docstrings and unit tests.
- Anyscale prices the 7b Llama2, Zephyr, Mistral models at 15 cents per 1M tokens. Roughly 1/10th of GPT-3.5 Turbo’s ~$1.5 per 1M tokens.
- Tools to identify personally identifiable information:
  - galactic can use LLMs to detect PII
  - Presidio by Microsoft
  - Sherlock is a generic semantic type matching DL model
  - pii-extractor-llm was trained on Indian names
  - GLiNER is a Lightweight Generalist model for NER
- Tools to explore
  - ElevenLabs speaks in your voice
  - Cutout Pro removes backgrounds and parts of images
  - Vocal Remover removes vocals from songs
  - CapCut video editor
- TheBloke’s $35/month Patreon might be one of the least expensive ways to set up quantized LLMs in production.
- Microsoft released table-transformer to extract tables from PDFs. Sample usage
- Convert PDF to markdown with marker - an improvement over nougat.
- JupyterLab has a %%ai magic to use LLMs within notebooks. Ref
- Telling ChatGPT that the year is 2123 makes it bypass copyright. Ref
- Meta released SeamlessExpressive which preserves emotions in speech-to-speech translations.
- Unsloth offers faster lower-memory LLM QLoRA finetuning.
- DeepSeek is an open-source high-quality LLM.
- Scalable Extraction of Training Data from (Production) Language Models extracts training data by repeating a token infinitely.
- SkyPilot lets you run LLMs on any cloud provider.
- vLLM lets you deploy LLMs with a single command.
- llamafile lets you run LLMs locally as a single file executable!
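
The Anyscale vs GPT-3.5 Turbo comparison above is simple arithmetic. A small sketch, using only the two per-million-token prices quoted in the note; the 10M token workload and the cost_usd helper are assumptions for illustration.

```python
def cost_usd(tokens, usd_per_million):
    """Cost of processing `tokens` tokens at a per-1M-token price."""
    return tokens / 1_000_000 * usd_per_million

ANYSCALE_7B = 0.15   # USD per 1M tokens, as quoted in the note
GPT35_TURBO = 1.50   # USD per 1M tokens (approximate), as quoted in the note

tokens = 10_000_000  # hypothetical monthly workload
ratio = ANYSCALE_7B / GPT35_TURBO
print(f"Anyscale 7b:   ${cost_usd(tokens, ANYSCALE_7B):.2f}")
print(f"GPT-3.5 Turbo: ${cost_usd(tokens, GPT35_TURBO):.2f}")
print(f"Ratio: {ratio:.2f}")
```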

Things I Learned - 03 Dec 2023

This week, I learned:
- Gwern Branwen says LLMs nudge his “… making heavier use of the languages I don’t know well (Emacs Lisp & Python) since I increasingly trust that an LLM can help me maintain them.”
- Undetectable.ai checks for AI content. But it had false positives AND negatives in the 5 checks I ran. GPTZero got 2/2 right and seems better at detecting AI content.
- CoVA scrapes web pages via OCR.
- When coding with LLMs, have SHORT, RELIABLE feedback loops. Ref

ChatGPT Custom Instructions

I speak with ChatGPT ~20 times a day. That’s more than I speak with most of my colleagues. ChatGPT is clearly my favorite team member. I conduct trainings, reviews and mentoring sessions with my colleagues. How to write code. How to write slides. How to communicate. That last bit is particularly important. With ChatGPT Custom Instructions, I can guide ChatGPT on how to work better with me. Currently, I have 10 custom instructions. They evolved over time and will continue to evolve. ...

Things I Learned - 26 Nov 2023

This week, I learned:
- This is an interesting GPT Vision API prompt from Simon Willison: “given this event flyer, create a link to add it to my Google Calendar”. Ref
- Quote from Jerry Liu: “GPT 4 is really good at complex reasoning”. It’s worth exploring what that means.
- Quote from Jerry Liu: “RAG is a hack”. It’s engineered, not machine learnt, so it’s suboptimal. We need an ML way of creating the context. Maybe fine tuning can be a way of CREATING the right context. But RAG can handle deterministic stuff like access control.
- OpenAI’s fine tuning API is not good at memorizing info the way it is exposed. But the Gorilla paper shows that fine tuning can actually memorize well.
- Learn the ML optimization approach - LLMOps. Have an evaluation framework with metrics, using tools like Weights & Biases or TensorBoard. It helps figure out where fine tuning helps and where RAG does. Soon, this will become important.
- Flat indexing of chunks is not the only way to store embeddings. LlamaIndex allows you to create hierarchies that you can traverse for retrieval.
- Agents mimic programming primitives. Switch. While. Call a function. Print.
- OpenRouter hosts several models and offers them as APIs!
- Ragas metrics evaluate the quality of a RAG pipeline.
- Orca 2 was trained on different reasoning techniques (e.g. step-by-step) and is as good as larger models.
- Embeddings can help just re-rank regular search results. Ref
- Anthropic’s Claude 2 has a 200K context window but is still crap.
- Video-Llava can understand videos too.
- CoVA scrapes web pages using LLMs and visual information.
- jsonrepair can fix JSON fairly well.
- jsonformer wraps HuggingFace models to produce JSON. Ref
- Google has a model garden with lots of pre-trained and trainable models.
- Gorilla LLM specializes in API calls: Torch Hub, TensorFlow Hub, HuggingFace.
- GPT-4 does not do abstraction at human levels.
- Each of the GPTs / Prompts we create could be like a UNIX command prompt, and become a startup of its own.
- Llava Plus extends LlaVA with pre-trained vision models that make image editing better.
- Ollama runs local LLMs.
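
To make the note about re-ranking search results with embeddings concrete, here is a minimal sketch. It assumes you already have an embedding for the query and for each result from a regular search. The 2-dimensional vectors and document names are toy values, not output from a real embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rerank(query_vec, results):
    """Re-order (doc_id, embedding) pairs by similarity to the query."""
    return sorted(results, key=lambda r: cosine(query_vec, r[1]), reverse=True)

# Results in the order a keyword search returned them (toy embeddings)
query = [1.0, 0.0]
results = [("doc-a", [0.0, 1.0]), ("doc-b", [1.0, 0.1]), ("doc-c", [0.5, 0.5])]
print([doc_id for doc_id, _ in rerank(query, results)])
```

The regular search still decides which documents are candidates. The embeddings only change their order.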

Winning the alphabetical race

Since my name (Anand) begins with “A”, I used to get called on fairly early at school. In attendance. Answering questions. Classroom exercises. Quizzes. Even the distribution of test results. A few people later told me that it is good training, since I’d always be prepared. (Maybe. I’ve no idea.) At IBM and IIMB, Ajit was the only one ahead of me, alphabetically. Then he went a step ahead and named his son Aadi. I thought that’s impossible to beat. ...

Things I Learned - 19 Nov 2023

This week, I learned:
- XOT - Everything of Thought is a new prompt from Microsoft but I don’t understand it.
- Creating Fine-Tuning datasets WITHOUT inputs
- Tamil-Llama
- Voyager plays Minecraft!
- Langchain supports evaluators.
- Pydantic is all you need drives towards code = data = text!

Things I Learned - 12 Nov 2023

This week, I learned:
- Julius.ai queries structured data.
- TODO: Explore https://github.com/microsoft/TaskMatrix
- microsoft/autogen enables multi-agent conversations.
- Architecture of today’s LLMs is similar to the A16Z architecture.
- Stanford Foundational Model Transparency index was critiqued as misleading.
- vLLM runs HuggingFace transformers models faster. So does DeepSpeed.

LLMs can teach experts

I am a fairly good programmer. So, when I see a problem, my natural tendency is to code. I’m trying to break that pattern. Instead, I ask ChatGPT. For example, I asked:

    Write a compact 1-line Python expression that checks if user.id ends with @gramener.com or @straive.com

    user.id.endswith(("@gramener.com", "@straive.com"))

After 15 years of using Python, I learnt that .endswith() supports tuple suffixes. This has been around since Python 2.5 (released in 2006 – before I knew Python.) The documentation has a tiny sentence in the middle saying “suffix can also be a tuple of suffixes to look for.” ...
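
Here is a small runnable version of the tuple-suffix check above. The is_internal helper name and the sample addresses are made up for illustration. The one thing to watch for is that the suffixes must be a tuple: passing a list raises a TypeError.

```python
ALLOWED_SUFFIXES = ("@gramener.com", "@straive.com")

def is_internal(user_id: str) -> bool:
    """True if user_id ends with any of the allowed suffixes."""
    return user_id.endswith(ALLOWED_SUFFIXES)

print(is_internal("someone@gramener.com"))  # True
print(is_internal("someone@example.com"))   # False

# A list of suffixes is rejected; only str or a tuple of str works
try:
    "someone@gramener.com".endswith(list(ALLOWED_SUFFIXES))
except TypeError as error:
    print("TypeError:", error)
```

str.startswith() accepts a tuple of prefixes in the same way.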

Father of the bride

In 2012, I started Gramener with half a dozen friends. This week, we were acquired by Straive, a part of Barings Private Equity Asia. How do you feel? I feel like the father of the bride. Gramener was registered on 26 Feb. A day before my daughter’s birthday. I’ve spent more time with Gramener than my daughter. That makes Gramener my elder child. Who’s moving into a new household. Along with me. (I feel like சகலகலா சம்மந்தி.) ...

Scraping

I was at Cream Centre with my father on a Sunday afternoon. We’d finished a light lunch and were debating dessert. (He has triglycerides. I have cholesterol.) This was my fifth visit this year, and I had abstained so far. I couldn’t any longer. I ordered a Sizzling Brownie Sundae. But not for reasons you might think. Expertise comes from experience. I scrape food more than 99% of the people I know. So, I consider myself an expert. Here’s a guide on the art of scraping. ...

My PyCon talks are a way for me to learn. I usually pick topics I don’t know about. But at PyCon India 2023 the organizers picked “Programming Minecraft with Python” - a talk I’d given before. So, I started exploring ways to game it. (I like gaming things. It’s boring otherwise. Once, Infosys had me write a 400-page document. I began each page with a letter that spells out a poem.) ...

Ashwini Mathur and I are conducting a webinar on the impact of LLMs in Pharma. It’s online at 10 am Eastern on Mon Sep 11. Simon Willison described LLMs as alien technology we’re still discovering. I couldn’t agree more - and it helps to see it from different perspectives. So, we’re pairing the tech research at Gramener with the domain research Ashwini Mathur is doing at Novartis to explore the good, the bad, and the surprising uses of generative AI. ...

Always use value= for dynamic HTML options

Even after 30 years of HTML, I learn new things about it. This Monday morning, I woke up to a mail from Sundeep saying requests for a Data Engineer - AWS/Azure/GCP in our internal fulfilment portal raised an error. My guess was one of these:
- The “/” in the role is causing a problem. (Developer mistake.)
- The role exists in one table but not the other. (Recruitment team mistake.)
- The application wasn’t set up / restarted properly. (IT mistake.)
All three were wrong. So I dug deeper. ...

My first LAMBDA in Excel

Ever since Excel introduced the LAMBDA function, I've been itching to use it in real life. I got my first chance today. We track the skill index of our different teams (consulting, analytics, technology, etc.) like this:

Team        Skill Index  Apr-23  May-23  Jun-23  Jul-23
Consulting  0%           0%
Analytics   33%          33%
Technology  72%          72%
etc.

The "Skill Index" column should pick the LAST value. If Apr-23 is filled, use that. But if May-23 is also filled, use that. ...
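
The "pick the LAST filled value" rule is easy to state outside Excel too. This is a Python sketch of that rule only, not the LAMBDA formula from the post. The last_filled helper and the sample rows are hypothetical.

```python
def last_filled(values):
    """Return the last value that is not blank (None or ""), else None."""
    for value in reversed(values):
        if value not in (None, ""):
            return value
    return None

# Monthly columns Apr-23 .. Jul-23 for one team (hypothetical rows)
print(last_filled(["33%", "", "", ""]))     # only Apr-23 filled
print(last_filled(["33%", "40%", "", ""]))  # May-23 also filled
print(last_filled(["", "", "", ""]))        # nothing filled yet
```

Scanning from the right means a 0% entry still counts as filled, which matters for a row like Consulting.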

Licking

Last week, I was at IIT Madras for lunch with the faculty. The dessert was carrot halwa with ice cream. I scraped the last bits with my spoon, but a little ice cream was left over. I was torn. I CAN’T POSSIBLY waste it. But can I lick it? In public? I don’t have a problem licking at home. I lick my fingers. Plates. Bowls. Ladles. The cream on milk. The leftover milk in the glass. (If my tongue doesn’t reach that far, I wipe it with my finger and lick the finger.) ...

Zeigarnik effect vs my procrastination

I make commitments but don’t always deliver on time. In 2022, I ran an experiment to find out why I procrastinate. In Jan-Feb 2022, I listed the top 2 things I wanted to get done each day and measured how often I completed them.
14 Jan. ❌ Summarise from three research reports
12 Jan. ❌ UIFactory experiment ✅ Decide if I am a (…)
11 Jan. ❌ UIFactory experiment ✅ Agree on publishing in (…)
10 Jan. ❌ Client video. ❌ UIFactory experiment
09 Jan. ❌ UIFactory experiment. ❌ Attrition email as a story
07 Jan. ❌ ZS visual
06 Jan. ❌ Release Gramex Guide. ✅ UWC application
05 Jan. ❌ Publish network cluster post. ❌ Release Gramex guide
04 Jan. ❌ Publish network cluster post. ✅ Release Gramex.
03 Jan. ✅ Publish election TDS video. ❌ Publish Network cluster post.
02 Jan. ❌ Publish election TDS video. ❌ Publish Network cluster post.
01 Jan. ❌ Publish Network cluster post. ✅ Finalize SG school.
I completed 23 / 57 things (40%). That’s one of my TOP priorities. ...
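
A log like the one above is easy to tally with a few lines of Python. This sketch assumes one line per day with ✅ for done and ❌ for missed. The completion_rate helper is made up for illustration, and the three sample lines are copied from the excerpt above.

```python
def completion_rate(log_lines):
    """Count done/missed markers and return (done, total, rate)."""
    done = sum(line.count("✅") for line in log_lines)
    missed = sum(line.count("❌") for line in log_lines)
    total = done + missed
    return done, total, (done / total if total else 0.0)

log = [
    "03 Jan. ✅ Publish election TDS video. ❌ Publish Network cluster post.",
    "02 Jan. ❌ Publish election TDS video. ❌ Publish Network cluster post.",
    "01 Jan. ❌ Publish Network cluster post. ✅ Finalize SG school.",
]
done, total, rate = completion_rate(log)
print(f"{done} / {total} = {rate:.0%}")

# The full Jan-Feb tally quoted above: 23 of 57
print(f"{23 / 57:.0%}")
```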

Picking books to read

I add book recommendations to my GoodReads – To-read list. Then I sort by rating and pick the first one I like to read. In 2023, I’m reshaping my environment. Picking books I usually won’t pick. (Read The Unknown Unknown: Bookshops and the Delight of Not Getting What You Wanted if you want to be similarly inspired.) So here are 4 approaches I’m adding to my process.
- Algorithmic. Sort Kaggle books based on popularity, rating, and age. Pick the top 10 (or 50)
- Serendipitous. Go to bookstores and libraries. Pick the most popular books
- Award-winning. Pick from the Pulitzer, Booker, Nobel, Hugo, and other award winners
- Challenges. Pick from Popsugar, Book Riot, Goodreads, The 52 Book Club, and other challenges
FYI, here are algorithmic results (for books with 100+ ratings and a 4+ average on Goodreads): ...
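
The algorithmic approach can be sketched in a few lines. The filter (100+ ratings, 4+ average) comes from the text above. Everything else is an assumption: the field names, the sample books, and especially the sort order (rating first, then popularity, then older books winning ties), since the post does not say how popularity, rating, and age are combined.

```python
def pick_books(books, top_n=10, min_ratings=100, min_average=4.0):
    """Filter by rating count and average, then sort and take the top N."""
    eligible = [
        book for book in books
        if book["ratings_count"] >= min_ratings
        and book["average_rating"] >= min_average
    ]
    # ASSUMED ordering: higher rating, then more ratings, then older books
    eligible.sort(
        key=lambda b: (b["average_rating"], b["ratings_count"], -b["year"]),
        reverse=True,
    )
    return eligible[:top_n]

# Hypothetical rows, not real Kaggle or Goodreads data
books = [
    {"title": "Book A", "average_rating": 4.5, "ratings_count": 500, "year": 1990},
    {"title": "Book B", "average_rating": 3.9, "ratings_count": 10000, "year": 2000},
    {"title": "Book C", "average_rating": 4.5, "ratings_count": 500, "year": 1960},
    {"title": "Book D", "average_rating": 4.8, "ratings_count": 50, "year": 2010},
    {"title": "Book E", "average_rating": 4.2, "ratings_count": 2000, "year": 2015},
]
for book in pick_books(books, top_n=3):
    print(book["title"])
```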