Things I Learned - 08 Sep 2024

I benchmarked RAM and CPU usage across FastAPI, Node, and Deno, while exploring several video and audio generative AI tools. I also identified whisperX for diarization and tested Reflection 70b's internal reflection mechanism for improved model accuracy.

This week, I learned:

When running a Hello world app:
- FastAPI takes ~26K RAM, 3% CPU
- NodeJS + Express takes ~62K RAM, 2% CPU
- Deno + Express takes ~62K RAM, 1% CPU
- Deno + Fresh takes ~54K RAM, 0.4% CPU
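A minimal sketch of one way to take a memory reading like this on Linux, assuming the server's process ID is known. It reads resident memory (VmRSS) from `/proc/<pid>/status`. The helper names `parse_vmrss_kb` and `rss_kb` are hypothetical, and this is not necessarily how the numbers above were measured.

```python
import os
import re


def parse_vmrss_kb(status_text: str) -> int:
    """Extract resident memory (VmRSS, in kB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS line not found in status text")
    return int(match.group(1))


def rss_kb(pid: int) -> int:
    """Read the current resident memory of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmrss_kb(f.read())


if __name__ == "__main__":
    # Assumption: swap os.getpid() for the PID of the server under test.
    print(f"{rss_kb(os.getpid())} kB resident")
```

Sampling this a few times while the app is idle gives a steadier figure than a single reading.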
I tested several generative AI video tools:
- Luma Labs lets you create videos from text
- Runway ML lets you create video from an image + text
- Viggle lets you add images to a video or move a character in a certain way
- Veed.io is a video editor that offers AI video editing features
- DeepMotion generates 3D animations from video
- Wonder Dynamics may be similar to DeepMotion
I tested a few generative AI audio tools:
- Suno is fast, has a better UI, and offers lots of examples
- Udio is slow and has a poor UI, but creates richer music
Reflection 70b is one of the top models now, and is open source! It works by making the LLM reflect on its answer inside <reflection>...</reflection> tags.
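Since the reflection sits inside tags, it is easy to separate from the rest of the response. Here is a minimal sketch, assuming the model emits literal `<reflection>...</reflection>` blocks in its text output. The function name `split_reflections` and the sample response are made up for illustration.

```python
import re

REFLECTION_RE = re.compile(r"<reflection>(.*?)</reflection>", flags=re.DOTALL)


def split_reflections(response: str) -> tuple[str, list[str]]:
    """Return (response with reflection blocks removed, list of reflection texts)."""
    reflections = [m.strip() for m in REFLECTION_RE.findall(response)]
    visible = REFLECTION_RE.sub("", response)
    # Collapse the whitespace left behind by the removed blocks.
    visible = re.sub(r"\s+", " ", visible).strip()
    return visible, reflections


# Hypothetical model output, made up for illustration.
sample = (
    "The capital of Australia is Sydney. "
    "<reflection>Sydney is the largest city, but the capital is Canberra.</reflection> "
    "The capital of Australia is Canberra."
)
visible, reflections = split_reflections(sample)
print(visible)
print(reflections)
```

Keeping the reflections in a separate list makes it possible to log them for inspection while showing only the remaining text to the user.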
The best diarization model today is whisperX. Run on Colab T4 GPU with:
Scale’s SEAL Leaderboards seem fairly good.
coedit-xxl is Grammarly’s google/flan-t5-xxl model fine-tuned on CoEdIT, a text-editing dataset. It’s mainly for single-line editing, though, and far from a full-document or full-email zero-shot editor.
