Things I Learned - 10 Dec 2023

This week, I learned:

Bard supports extensions that include @Gmail – i.e. converse with your email.
llama-cpp-python works with other GGUF models like Mistral and allows constrained output - JSON, function calling, etc. Ref
12 Tuning Strategies for RAG
Llama Datasets are RAG datasets created mostly using GPT-4. Mostly small datasets.
⭐ Intuitions about large language models
- Bigger models (70b) are much better at learning from few-shot examples. They really learn.
- Bigger models will keep getting better!
- Chain of Thought prompting is a way of providing more compute to complex problems that require more compute
- Models will show emergent (completely new) behaviors that can’t be predicted from extrapolation. These may not be intentional.
CodeAnt.ai is a VS Code plugin to detect code smells, refactor for modularity, to write docstrings and unit tests
Anyscale prices the 7b Llama2, Zephyr, Mistral models at 15 cents per 1M tokens. Roughly 1/10th of GPT-3.5 Turbo’s ~$1.5 per 1M tokens
Tools to identify personally identifiable information:
- galactic can use LLMs to detect PII
- Presidio by Microsoft
- Sherlock is a generic sematic type matching DL model
- pii-extractor-llm was trained on Indian names
- GLiNER is a Lightweight Generalist model for NER
Tools to explore
- ElevenLabs speaks in your voice
- Cutout Pro removes backgrounds and parts of images
- Vocal Remover removes vocals from songs
- CapCut video editor
TheBloke’s $35/month Patreon might be one of the least expensive ways to set up quantized LLMs in production.
Microsoft released table-transformer to extract tables from PDFs. Sample usage
Convert PDF to markdown with marker - an improvement over nougat.
JupyterLab has a %%ai magic to use LLMs within notebooks. Ref
Telling ChatGPT that the year is 2123 makes it bypass copyright. Ref
Meta released SeamlessExpressive which preserves emotions in speech-to-speech translations
Unsloth offers faster lower-memory LLM QLoRA finetuning
DeepSeek is an open-source high-quality LLM
Scalable Extraction of Training Data from (Production) Language Models extracts training data by repeating a token infinitely.
SkyPilot lets you run LLMs on any cloud provider.
vLLM lets you deploy LLMs with a single command.
llamafile lets you run LLMs locally as a single file executable!