December 10, 2023

This week, I learned:

- Bard supports extensions that include @Gmail, i.e. you can converse with your email.
- llama-cpp-python works with other GGUF models like Mistral and allows constrained output: JSON, function calling, etc. Ref
- 12 Tuning Strategies for RAG
- Llama Datasets are RAG datasets created mostly using GPT-4. Mostly small datasets.
- ⭐ Intuitions about large language models
  - Bigger models (70b) are much better at learning from few-shot examples. They really learn.
  - Bigger models will keep getting better!
  - Chain of Thought prompting is a way of giving the model more compute on complex problems that need it.
  - Models will show emergent (completely new) behaviors that can't be predicted from extrapolation. These may not be intentional.
- CodeAnt.ai is a VS Code plugin to detect code smells, refactor for modularity, and write docstrings and unit tests.
- Anyscale prices the 7b Llama2, Zephyr, and Mistral models at 15 cents per 1M tokens. That is roughly 1/10th of GPT-3.5 Turbo's ~$1.5 per 1M tokens.
- Tools to identify personally identifiable information:
  - galactic can use LLMs to detect PII
  - Presidio by Microsoft
  - Sherlock is a generic semantic type matching DL model
  - pii-extractor-llm was trained on Indian names
  - GLiNER is a Lightweight Generalist model for NER
- Tools to explore
  - ElevenLabs speaks in your voice
  - Cutout Pro removes backgrounds and parts of images
  - Vocal Remover removes vocals from songs
  - CapCut video editor
- TheBloke's $35/month Patreon might be one of the least expensive ways to set up quantized LLMs in production.
- Microsoft released table-transformer to extract tables from PDFs. Sample usage
- Convert PDF to markdown with marker, an improvement over nougat.
- JupyterLab has a %%ai magic to use LLMs within notebooks. Ref
- Telling ChatGPT that the year is 2123 makes it bypass copyright. Ref
- Meta released SeamlessExpressive, which preserves emotions in speech-to-speech translations.
- Unsloth offers faster, lower-memory LLM QLoRA finetuning.
- DeepSeek is an open-source high-quality LLM.
- Scalable Extraction of Training Data from (Production) Language Models extracts training data by repeating a token infinitely.
- SkyPilot lets you run LLMs on any cloud provider.
- vLLM lets you deploy LLMs with a single command.
- llamafile lets you run LLMs locally as a single file executable!
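
To sanity-check the Anyscale vs GPT-3.5 Turbo price comparison in these notes, here is a quick back-of-the-envelope sketch. The two per-1M-token prices are the ones quoted above (the GPT-3.5 Turbo figure is approximate); the `cost_usd` helper and the 50M-token monthly workload are my own assumptions, purely for illustration.

```python
# Prices in US dollars per 1M tokens, as quoted in the notes above.
ANYSCALE_7B_PER_1M = 0.15   # 15 cents (7b Llama2, Zephyr, Mistral)
GPT35_TURBO_PER_1M = 1.50   # approximate figure


def cost_usd(tokens: int, price_per_1m: float) -> float:
    """Cost in dollars for `tokens` tokens at a flat per-1M-token price."""
    return tokens / 1_000_000 * price_per_1m


# Ratio of the two prices: the "roughly 1/10th" claim.
ratio = ANYSCALE_7B_PER_1M / GPT35_TURBO_PER_1M

# Hypothetical workload, chosen only to make the difference concrete.
monthly_tokens = 50_000_000
anyscale_cost = cost_usd(monthly_tokens, ANYSCALE_7B_PER_1M)
gpt35_cost = cost_usd(monthly_tokens, GPT35_TURBO_PER_1M)

print(f"Price ratio: {ratio:.2f}")
print(f"Anyscale 7b: ${anyscale_cost:.2f} vs GPT-3.5 Turbo: ${gpt35_cost:.2f}")
```

This assumes a single flat rate per token; real pricing may split input and output tokens, so treat the numbers as a rough guide only.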
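
The Chain of Thought intuition above (more reasoning tokens means more compute spent before the answer) is easiest to see by comparing two prompts. This is a minimal sketch: the function names are hypothetical, and "Let's think step by step." is one commonly used trigger phrase, not the only way to elicit step-by-step reasoning.

```python
def direct_prompt(question: str) -> str:
    """Ask for the answer immediately, with no room for intermediate steps."""
    return f"Q: {question}\nA:"


def chain_of_thought_prompt(question: str) -> str:
    """Nudge the model to write out its reasoning before the final answer.

    The extra tokens the model generates while reasoning are the
    "more compute" that the note above refers to.
    """
    return f"Q: {question}\nA: Let's think step by step."


question = "A shop sells pens at 3 for $2. How much do 12 pens cost?"
print(direct_prompt(question))
print()
print(chain_of_thought_prompt(question))
```

Either string can be sent to whichever LLM you are using; the only difference is the cue that invites intermediate steps.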