This week, I learned:
- `**/*.md` can search for all Markdown files. Julia Evans
- Windows 11 2024 Update features: Ref
- Live captions (via the tray) can transcribe audio and microphone.
- Cocreator in Paint lets you draw crudely and enhances it with AI. The neat UI is a slider that lets you control how close it should be to your drawing.
- Voice Clarity automatically cancels echo, reduces background noise, and minimizes reverb.
- Studio Effects (via the tray) lets you apply camera effects on all apps. Eye contact feature is CLEVER!
- sudo lets you run commands with admin privileges from the command line. source
- Roaming RAG is an alternative to RAG without the vector database.
- Applicable to well-structured documents, e.g. technical books, manuals, etc.
- Create a hierarchical outline of the document. Code
- Keep the top-level headings.
- Preserve the first ~100 characters of opening text from each section.
- Present the second-level headings, but without any subsidiary content.
- Provide each section a unique 8 digit hex identifier.
- Each section heading is followed by a guiding comment for the model: `Section collapsed - expand with expand_section("{identifier}").`
- Then read the relevant sections as context to answer the question. Code
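
The outline steps above can be sketched roughly as follows. This is a minimal illustration, not the code linked from the post: it assumes the document is Markdown with `#`-style headings, ignores `#` lines inside fenced code, keeps opening text only for top-level sections, and derives the 8-digit hex identifier from a truncated SHA-1 (which is not strictly guaranteed unique). The names `split_sections`, `build_outline`, and the Python form of `expand_section` are hypothetical helpers.

```python
import hashlib
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_sections(markdown):
    """Split Markdown into one dict per heading: level, title, body lines."""
    sections, current = [], None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            current = {"level": len(match.group(1)),
                       "title": match.group(2).strip(),
                       "body": []}
            sections.append(current)
        elif current is not None:
            current["body"].append(line)
    return sections


def render(sections):
    """Rebuild the Markdown text for a run of consecutive sections."""
    lines = []
    for section in sections:
        lines.append("#" * section["level"] + " " + section["title"])
        lines.extend(section["body"])
    return "\n".join(lines).strip()


def build_outline(markdown, preview_chars=100):
    """Return (outline_text, index), where index maps section id -> full text."""
    sections = split_sections(markdown)
    outline, index = [], {}
    for i, section in enumerate(sections):
        if section["level"] > 2:
            continue  # deeper headings stay hidden until a parent is expanded
        # Assumption: truncated SHA-1 of position + title as the 8-digit hex id.
        sid = hashlib.sha1(f"{i}:{section['title']}".encode()).hexdigest()[:8]
        # A section runs until the next heading at the same or a higher level.
        end = next((j for j in range(i + 1, len(sections))
                    if sections[j]["level"] <= section["level"]), len(sections))
        index[sid] = render(sections[i:end])
        outline.append("#" * section["level"] + " " + section["title"])
        if section["level"] == 1:
            # Keep roughly the first 100 characters of opening text.
            opening = " ".join(" ".join(section["body"]).split())[:preview_chars]
            if opening:
                outline.append(opening)
        outline.append(
            f'<!-- Section collapsed - expand with expand_section("{sid}"). -->')
    return "\n".join(outline), index


def expand_section(index, identifier):
    """Tool the model would call to read a collapsed section in full."""
    return index[identifier]
```

In use, the outline goes into the prompt, `expand_section` is exposed as a tool, and whatever sections the model expands become the context for answering the question.
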
- Traffic to StackOverflow has fallen considerably, especially from young and Indian developers, and StackOverflow revenue is down. Via Prashanth. They’re exploring:
- Licensing their content. (Meta says high quality content improves LLM performance by 30% on HumanEval)
- Enterprise StackOverflow for system integration
- Fine-tuned versions of Enterprise StackOverflow for enterprises
- Integrate StackOverflow within your IDE. Ask questions, post directly
- I surveyed the Gramener QA team on how they were using LLMs.
- 7 used it for code generation (e.g. date extraction, regex generation)
- 4 used it for learning (e.g. Robot Framework, how to define test cases, API usage)
- 3 used it for formula generation (e.g. Excel)
- 2 used it for test scenario identification
- 2 used it for test data generation
- 2 used it for comparing expected vs actual datasets
- 1 used it for data type identification (e.g. given sample values, identify the data type).
- 1 used it for evaluating results (LLM as a judge)
- I asked the Straive Digitalized Operations team what management techniques they would apply to manage LLMs. Here are the responses:
- Ask better questions. (Prompt engineering.)
- Create templates or step-by-step instructions. (Chain of Thought.)
- Ask for multiple options and pick from the best options. (Agentic approach?)
- Training. (Fine tuning.)
- Price weaker responses lower. (Stratified model pricing?)
- “LLM hallucinations are a good thing. They are a sign of diversity, allowing us to improve the answer by exploring multiple paths.” – A colleague from Straive.
- Hyperbrowser is a cloud-based Puppeteer service.
- Bedrock Llama models can’t be directly called with their model names. You need to use their inference profile names, e.g. `us.meta.llama3-2-11b-instruct-v1:0` if the model is in a US region.
- Hacker News RSS is a good way to get RSS feeds from Hacker News. It’s also a good way to understand how to convert a news source into RSS feeds. BlueSky has RSS feeds too
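
To make the "news source into RSS feed" idea concrete, here is a minimal sketch of emitting an RSS 2.0 document from already-collected items. It is not how the Hacker News RSS service is implemented; the `build_rss` helper, the item field names (`title`, `link`, `published`), and the example URLs are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(title, link, description, items):
    """Build an RSS 2.0 feed string from a list of item dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        # Assumption: the item URL doubles as its unique identifier.
        ET.SubElement(node, "guid").text = item["link"]
        # RSS dates use the RFC 822 style, which format_datetime produces.
        ET.SubElement(node, "pubDate").text = format_datetime(item["published"])
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(
    "Example News",
    "https://example.com/",
    "Hypothetical feed built from scraped items",
    [{"title": "First & foremost",
      "link": "https://example.com/1",
      "published": datetime(2025, 1, 6, 12, 0, tzinfo=timezone.utc)}],
)
```

The scraping or API step that produces `items` is the part that varies per news source; the feed structure itself stays the same.
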
- When embedding using `SentenceTransformer.encode(docs)`, it’s best if we embed with smaller `docs` and call it multiple times (rather than embedding more at once). On Colab T4, for `gte-base-en-v1.5`, when embedding 1,000 docs of up to 8K chars each, here is the TOTAL time it took, based on batch sizes (lower is better):
- 1 doc per call: 10s
- 2 docs per call: 13s
- 4 docs per call: 19s
- 8 docs per call: 23s
- 16 docs per call: 32s
- 32 docs per call: 40s
- Running embeddings without a GPU is extremely slow. It takes ~2.4 seconds per string.
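
The docs-per-call comparison above could be reproduced with something like the sketch below. It is not the original benchmark code: `encode_in_chunks` and `time_chunk_sizes` are hypothetical helpers that accept any `encode` callable, so the model is passed in rather than loaded here.

```python
import time
from typing import Callable, List, Sequence


def encode_in_chunks(encode: Callable[[List[str]], Sequence],
                     docs: List[str], chunk_size: int = 1) -> list:
    """Call `encode` on consecutive slices of `docs`, keeping results in order."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    embeddings = []
    for start in range(0, len(docs), chunk_size):
        # Each call sees at most `chunk_size` docs.
        embeddings.extend(encode(docs[start:start + chunk_size]))
    return embeddings


def time_chunk_sizes(encode, docs, chunk_sizes=(1, 2, 4, 8, 16, 32)) -> dict:
    """Return total wall-clock seconds to embed all docs at each chunk size."""
    timings = {}
    for size in chunk_sizes:
        started = time.perf_counter()
        encode_in_chunks(encode, docs, size)
        timings[size] = time.perf_counter() - started
    return timings
```

With a loaded `SentenceTransformer` model, `encode_in_chunks(model.encode, docs, chunk_size=1)` corresponds to the "1 doc per call" row, and `time_chunk_sizes(model.encode, docs)` gives one total per row of the list.
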