This week, I learned:

  • snapdom is a fast, light, element capture alternative to html2canvas but doesn’t work well with non-CORS images or iframes.
  • Slidev (sli.dev) is a Markdown-based slide tool, similar to Marp.
  • Don’t split your code into microservices until you need to scale. Ref
  • Vibe coding is like getting others’ code to work, which is exactly what most devs do. Simon Willison #ai-coding
  • Tofu Yakitori is a Japanese dish. It’s like a dhokla. Marinated tofu cubes brushed with that sweet‑savory tare (soy, mirin, sake, a hint of sugar), then grilled until caramel‑charred. One of the better (tasty + different) dishes I’ve had recently. I used ChatGPT to remind me of the dish name.
  • Trust, attitudes and use of artificial intelligence surveyed over 48,000 people (~1,000 per country) across 47 countries on their views on AI. PDF
    • Emerging economies trust and use AI more. It’s an opportunity to leapfrog.
    • 26% of students use AI daily (vs 17% of employees). Efficiency is the main benefit.
  • Gemini APIs now cache automatically, giving a 75% cost reduction on cached tokens when the prompt is >1K (Flash) or >2K (Pro) tokens. Ref
  • YOLO is much better than Gemini at object detection. Use it for pre-processing. Ref
  • Using [[n]] is probably the best citation format for inline search references in RAG. ChatGPT
  • ⭐ Double-checking is surprisingly efficient since LLM hallucinations are only weakly correlated across models. LLMs perform human tasks (e.g. classifying customer support messages) at ~85% accuracy. This might be unacceptable. But by asking 2 moderately correlated LLMs and manually checking only the cases where they disagree, we reduce automation by ~20% but cut errors to 0.25%. Triple-checking reduces automation by ~25% but cuts errors to under 0.01%! Ref
  • Anthropic introduces web search in the API at $10 per 1K searches.
  • India attacked Pakistan!
  • ⭐ When writing notes, summarize the learnings and next steps at the end of the day.
  • GitHub Pages does not let you control the cache duration, but there are many creative workarounds. ChatGPT
    • HTML meta tags: <meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate">
    • Use a service worker (blog)
    • Proxy through a CDN. Cloudflare, Netlify
    • Move to another static host: S3 + CloudFront, Heroku, Vercel, Surge, Firebase Hosting
  • Notes from the PromptEvals paper:
    • Good evals must be:
      • Objectively MEASURABLE (even if by an LLM). Otherwise, we won’t know if it’s right.
      • Directly RELEVANT to the input/prompt. Otherwise, we’re not evaluating the input.
    • Typical evals fall into 6 categories:
      • Structured output: Adhere to a schema (Markdown, HTML, DSL, JSON + Schema)
      • Multiple choice
      • Length constraints: N characters, words, sentences, list items, etc.
      • Semantic constraints: Exclude terms, topic relevance, follow grammar, etc.
      • Stylistic constraints: Style, tone, persona
      • Prevent hallucinations: Factual accuracy, instruction following
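
Several of the eval categories in the PromptEvals notes above are mechanically checkable, which is what makes them objectively measurable. Here is a rough Python sketch of four of them. The function names, arguments, and sample strings are all made up for illustration and are not taken from the paper. Stylistic constraints and hallucination checks would likely need an LLM judge, so they are left out.

```python
import json


def check_structured(output, required_keys):
    """Structured output: a valid JSON object containing every required key."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and set(required_keys).issubset(data.keys())


def check_multiple_choice(output, choices):
    """Multiple choice: the answer (ignoring surrounding whitespace) is an allowed option."""
    return output.strip() in choices


def check_max_words(output, limit):
    """Length constraint: at most `limit` whitespace-separated words."""
    return len(output.split()) <= limit


def check_excluded_terms(output, banned):
    """Semantic constraint: none of the banned terms appear (case-insensitive)."""
    lowered = output.lower()
    return not any(term.lower() in lowered for term in banned)


# Hypothetical LLM outputs, one per check.
print(check_structured('{"label": "billing", "score": 0.9}', {"label"}))
print(check_multiple_choice(" B\n", {"A", "B", "C"}))
print(check_max_words("Refund issued within five days", 10))
print(check_excluded_terms("We guarantee a refund", {"guarantee"}))
```

Each check returns a plain boolean, so a pass rate per criterion can be tracked across prompt versions.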
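
The double-checking arithmetic from the ⭐ note above (LLMs at ~85% accuracy, accept only when they agree) can be sketched in Python. This is a simplified model, not the referenced post's method. It assumes the models err independently (the note says moderately correlated), a hypothetical 10-class task, and that a wrong answer lands on any wrong class with equal chance. `agreement_stats` and its parameters are names invented for illustration.

```python
def agreement_stats(accuracy, n_models, n_classes):
    """Return (automated_share, undetected_error_share) when an item is
    auto-accepted only if all models give the same label.

    Assumptions (mine): independent errors, and wrong answers spread
    uniformly over the n_classes - 1 wrong labels.
    """
    wrong = 1.0 - accuracy
    # Every model is right, so they trivially agree.
    all_right = accuracy ** n_models
    # Every model is wrong AND they all pick the same wrong label.
    all_wrong_same = wrong ** n_models / (n_classes - 1) ** (n_models - 1)
    automated = all_right + all_wrong_same
    return automated, all_wrong_same


for n in (1, 2, 3):
    automated, errors = agreement_stats(accuracy=0.85, n_models=n, n_classes=10)
    print(f"{n} model(s): automated {automated:.1%}, undetected errors {errors:.4%}")
```

Under these assumptions, two models leave 0.25% undetected errors, matching the note. The automated share drops further than the note's figures, since independent models disagree more often than correlated ones.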