This week, I learned:

  • This is an interesting GPT Vision API prompt from Simon Willison: “given this event flyer, create a link to add it to my Google Calendar”. Ref
  • Quote from Jerry Liu: “GPT-4 is really good at complex reasoning”. It’s worth exploring what that means.
  • Quote from Jerry Liu: “RAG is a hack”. It’s engineered, not machine-learned, so it’s suboptimal. We need an ML way of creating the context. Maybe fine-tuning can be a way of CREATING the right context. But RAG can still handle deterministic requirements like access control.
  • The OpenAI fine-tuning API, the way it is exposed, is not good at memorizing information. But the Gorilla paper shows that fine-tuning can actually memorize well.
  • Learn the ML optimization approach: LLMOps. Have an evaluation framework that tracks metrics in a tool like Weights & Biases or TensorBoard. It helps figure out where fine-tuning helps and where RAG does. Soon, this will become important.
  • Flat indexing of chunks is not the only way to store embeddings. LlamaIndex lets you create hierarchies that you can traverse for retrieval.
  • Agents mimic programming primitives. Switch. While. Call a function. Print.
  • OpenRouter hosts several models and offers them as APIs!
  • Ragas metrics evaluate the quality of a RAG pipeline.
  • Orca 2 was trained on different reasoning techniques (e.g. step-by-step) and is as good as larger models
  • Embeddings can help just re-rank regular search results. Ref
  • Anthropic’s Claude 2.1 has a 200K context window but is still crap.
  • Video-LLaVA can understand videos too.
  • CoVA scrapes web pages using LLMs and visual information.
  • jsonrepair can fix JSON fairly well. jsonformer wraps HuggingFace models to produce JSON. Ref
  • Google has a model garden with lots of pre-trained and trainable models.
  • Gorilla LLM specializes in API calls: Torch Hub, TensorFlow Hub, HuggingFace.
  • GPT-4 does not do abstraction at human levels
  • Each of the GPTs / Prompts we create could be like a UNIX command prompt, and become a startup of its own
  • LLaVA-Plus extends LLaVA with pre-trained vision models that make image editing better.
  • Ollama runs local LLMs
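
The note above about embeddings re-ranking regular search results can be sketched roughly like this. It is a minimal illustration, not the method from the referenced post: the document IDs and the 3-dimensional vectors are made-up toy values, and `cosine` and `rerank` are hypothetical helper names. In practice the vectors would come from an embedding model and the candidate list from a keyword search engine.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank(query_vec, results):
    """Re-order (doc_id, embedding) pairs by similarity to the query, best first."""
    return sorted(results, key=lambda r: cosine(query_vec, r[1]), reverse=True)


# Toy data (assumed): results in the order a keyword search returned them.
query_vec = [1.0, 0.0, 0.0]
keyword_results = [
    ("doc_a", [0.0, 1.0, 0.0]),  # matched the keywords, but semantically unrelated
    ("doc_b", [0.9, 0.1, 0.0]),  # closest to the query
    ("doc_c", [0.5, 0.5, 0.0]),
]

reranked_ids = [doc_id for doc_id, _ in rerank(query_vec, keyword_results)]
print(reranked_ids)  # ['doc_b', 'doc_c', 'doc_a']
```

The regular search still decides which documents are candidates; the embeddings only change their order, so there is no need to embed or index the whole corpus up front.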