Things I Learned - 26 Nov 2023

This week, I learned:

- This is an interesting GPT Vision API prompt from Simon Willison: “given this event flyer, create a link to add it to my Google Calendar”.
- Quote from Jerry Liu: “GPT 4 is really good at complex reasoning”. It’s worth exploring what that means.
- Quote from Jerry Liu: “RAG is a hack”. It’s engineered, not machine learnt, so it’s suboptimal. We need an ML way of creating the context. Maybe fine tuning can be a way of CREATING the right context. But RAG can handle deterministic stuff like access control.
- The OpenAI fine tuning API, the way it is exposed, is not good at memorizing info. But the Gorilla paper shows that fine tuning can actually memorize well.
- Learn the ML optimization approach: LLMOps. Have an evaluation framework with metrics, tracked in something like Weights & Biases or TensorBoard. It helps figure out where fine tuning helps and where RAG does. Soon, this will become important.
- Flat indexing of chunks is not the only way to store embeddings. LlamaIndex allows you to create hierarchies that you can traverse for retrieval.
- Agents mimic programming primitives. Switch. While. Call a function. Print.
- OpenRouter hosts several models and offers them as APIs!
- Ragas metrics evaluate the quality of a RAG pipeline.
- Orca 2 was trained on different reasoning techniques (e.g. step-by-step) and is as good as larger models.
- Embeddings can help just re-rank regular search results.
- Anthropic’s Claude 2 has a 200K context window but is still crap.
- Video-Llava can understand videos too.
- CoVA scrapes web pages using LLMs and visual information.
- jsonrepair can fix JSON fairly well. jsonformer wraps HuggingFace models to produce JSON.
- Google has a model garden with lots of pre-trained and trainable models.
- Gorilla LLM specializes in API calls: Torch Hub, TensorFlow Hub, HuggingFace.
- GPT-4 does not do abstraction at human levels.
- Each of the GPTs / Prompts we create could be like a UNIX command prompt, and become a startup of its own.
- Llava Plus extends LLaVA with pre-trained vision models that make image editing better.
- Ollama runs local LLMs.
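
To make the note about embeddings re-ranking regular search results concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the document titles and the 3-dimensional vectors are made up, and in practice the vectors would come from whichever embedding model you use. It only shows the idea: keep the keyword search as the retriever, then reorder its results by cosine similarity to the query embedding.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rerank(query_vec, results):
    """Reorder (title, vector) pairs, most similar to the query first."""
    return sorted(results, key=lambda r: cosine(query_vec, r[1]), reverse=True)


# Hypothetical toy data: results in the order a keyword search returned them.
query = [1.0, 0.0, 0.0]
keyword_results = [
    ("doc_a", [0.0, 1.0, 0.0]),
    ("doc_b", [0.9, 0.1, 0.0]),
    ("doc_c", [0.5, 0.5, 0.0]),
]

reranked = [title for title, _ in rerank(query, keyword_results)]
print(reranked)
```

The search engine still decides which documents are candidates; the embeddings only change their order.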

Winning the alphabetical race

Since my name (Anand) begins with “A”, I used to get called on fairly early at school. In attendance. Answering questions. Classroom exercises. Quizzes. Even the distribution of test results. A few people later told me that it was good training, since I’d always be prepared. (Maybe. I’ve no idea.) At IBM and IIMB, Ajit was the only one ahead of me, alphabetically. Then he went a step ahead and named his son Aadi. I thought that was impossible to beat. ...

Things I Learned - 19 Nov 2023

This week, I learned:

- XoT (Everything of Thoughts) is a new prompting technique from Microsoft, but I don’t understand it.
- Creating Fine-Tuning datasets WITHOUT inputs
- Tamil-Llama
- Voyager plays Minecraft!
- Langchain supports evaluators.
- “Pydantic is all you need” drives towards code = data = text!

Things I Learned - 12 Nov 2023

This week, I learned:

- Julius.ai queries structured data.
- TODO: Explore https://github.com/microsoft/TaskMatrix
- microsoft/autogen enables multi-agent conversations.
- Architecture of today’s LLMs is similar to the A16Z architecture.
- Stanford Foundational Model Transparency index was critiqued as misleading.
- vLLM runs HuggingFace transformers models faster. So does DeepSpeed.

LLMs can teach experts

I am a fairly good programmer. So, when I see a problem, my natural tendency is to code. I’m trying to break that pattern. Instead, I ask ChatGPT. For example, I asked:

Write a compact 1-line Python expression that checks if user.id ends with @gramener.com or @straive.com

user.id.endswith(("@gramener.com", "@straive.com"))

After 15 years of using Python, I learnt that .endswith() supports tuple suffixes. This has been around since Python 2.5 (released in 2006 – before I knew Python.) The documentation has a tiny sentence in the middle saying “suffix can also be a tuple of suffixes to look for.” ...
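
Here is a small runnable sketch of that expression in context. The `User` class, the `ALLOWED_DOMAINS` name and the `is_internal` helper are hypothetical, added only so the one-liner has something to run against. The post only gives the `.endswith()` call itself.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical stand-in for whatever object carries the id."""
    id: str


# .endswith() needs a tuple here; a list raises TypeError.
ALLOWED_DOMAINS = ("@gramener.com", "@straive.com")


def is_internal(user):
    """True if user.id ends with any of the allowed suffixes."""
    return user.id.endswith(ALLOWED_DOMAINS)


print(is_internal(User("someone@gramener.com")))
print(is_internal(User("someone@example.com")))
```

The same tuple form works for `.startswith()`, which is handy for checks like URL schemes.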

Father of the bride

In 2012, I started Gramener with half a dozen friends. This week, we were acquired by Straive, a part of Barings Private Equity Asia. How do you feel? I feel like the father of the bride. Gramener was registered on 26 Feb. A day before my daughter’s birthday. I’ve spent more time with Gramener than my daughter. That makes Gramener my elder child. Who’s moving into a new household. Along with me. (I feel like சகலகலா சம்மந்தி.) ...