This week, I learned:

  • When running a Hello world app:
    • FastAPI takes ~26K RAM, 3% CPU
    • NodeJS + Express takes ~62K RAM, 2% CPU
    • Deno + Express takes ~62K RAM, 1% CPU
    • Deno + Fresh takes ~54K RAM, 0.4% CPU
  • I was testing out different video LLMs:
    • Luma Labs lets you create videos from text
    • Runway ML lets you create video from an image + text
    • Viggle lets you add images to a video or move a character in a certain way
    • Veed.io is a video editor that offers AI video editing features
    • DeepMotion generates 3D animations from video
    • Wonder Dynamics may be similar to DeepMotion
  • I tested out a few audio LLMs:
    • Suno is fast, has a better UI, lots of examples
    • Udio is slow, poor UI, creates richer music
  • Reflection 70B is one of the top models now, and is open source! It works by making the LLM reflect on its answer inside <reflection>...</reflection> tags.
  • The best diarization model today is whisperX. Run it on a Colab T4 GPU with:
  • Scale’s SEAL Leaderboards seem fairly good.
  • coedit-xxl is Grammarly’s google/flan-t5-xxl model fine-tuned on the CoEdIT text-editing dataset. It’s mainly for single-line editing, though, and far from a full-document or full-email zero-shot editor.
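
On the Reflection 70B note above: since the model wraps its self-corrections in <reflection>...</reflection> tags, you probably want to separate them from the visible answer before showing output to a user. A minimal sketch, assuming the tags appear literally in the output and are never nested. The function name `split_reflections` and the sample text are mine, not from the model's docs:

```python
import re

# Assumption: the tag name comes from the note above and tags are not nested.
REFLECTION_RE = re.compile(r"<reflection>(.*?)</reflection>", re.DOTALL)


def split_reflections(text: str) -> tuple[str, list[str]]:
    """Return (text without reflection blocks, list of reflection bodies)."""
    # Collect the body of every reflection block, trimmed of whitespace.
    reflections = [body.strip() for body in REFLECTION_RE.findall(text)]
    # Remove the blocks, then tidy the whitespace they leave behind.
    visible = REFLECTION_RE.sub("", text)
    visible = re.sub(r"[ \t]+", " ", visible)
    visible = re.sub(r"\n{3,}", "\n\n", visible).strip()
    return visible, reflections


# Hypothetical model output, for illustration only.
sample = (
    "The answer is 12. "
    "<reflection>3 * 5 is 15, not 12.</reflection> "
    "The answer is 15."
)
visible, notes = split_reflections(sample)
print(visible)
print(notes)
```

Keeping the reflections in a separate list makes it easy to log them for debugging while only showing the cleaned text.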