Self-Discover LLM Capabilities

Q: “How do we learn what we can do with AI agents?”

Me: “Ask them!”

I mean, they are probably aware of their abilities. They can search online for how other people are using them. They have access to tools (connecting to Gmail, writing & running code, etc.) that they’re aware of, and even if they aren’t, they can try them out.

Asking them seems a useful way of figuring out how to use them.

For example, I didn’t know that ffmpeg (which ChatGPT, Gemini, Claude, etc. can run) can visualize audio using filters. They could create a bunch of stunning visualizations as a video compilation.
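To make that concrete, here is a minimal sketch of what such a command might look like, built in Python. It assumes ffmpeg's showwaves filter; the file names input.mp3 and waveform.mp4 and the helper name audio_visualization_command are made up for illustration, and this is not the command any of the models actually ran.

```python
import shlex


def audio_visualization_command(audio_path, video_path, size="1280x720"):
    """Build (but do not run) an ffmpeg command that renders audio as a waveform video."""
    # showwaves turns an audio stream into video frames.
    # s sets the frame size; mode=line draws a vertical line for each sample.
    filter_graph = f"[0:a]showwaves=s={size}:mode=line[v]"
    return [
        "ffmpeg", "-y",
        "-i", audio_path,                 # hypothetical input file
        "-filter_complex", filter_graph,
        "-map", "[v]",                    # the generated waveform video
        "-map", "0:a",                    # keep the original audio alongside it
        video_path,                       # hypothetical output file
    ]


command = audio_visualization_command("input.mp3", "waveform.mp4")
print(shlex.join(command))
```

If ffmpeg is installed, passing the list to subprocess.run(command, check=True) should produce the video. Other audio filters such as showspectrum or avectorscope can be swapped into the filter graph in the same way.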

So, I told Claude:

I did not know ffmpeg could visualize audio via filters…
You have a container environment with a set of tools installed and you can run commands.
Identify creative ways in which the tools you have access to can be used…
…
Fact-check by cursorily verifying the command options…
But no need to implement any of these…
BLOW MY MIND!!

It gave me 125 ideas, ranging from drum patterns from log timestamps, directory structures as artistic graphs, frequency domains of images via Fourier transforms, and morphological image erosion/dilation effects to a whole bunch of things I’d never heard of.

It was too much to read, so I didn’t bother. (I’ll read it later.) Instead, I just asked:

Implement the most visually impressive among these.

And the result was a stunning video compilation:

It generated these 10 purely algorithmic (no external assets) visualizations:

Mandelbrot Set via FFmpeg’s mandelbrot filter: Deep zoom into the famous seahorse valley, revealing infinite complexity from z² + c
Sierpinski Carpet via FFmpeg’s sierpinski filter: Recursive self-similar fractal pattern that animates through chaos game iterations
Game of Life via FFmpeg’s life filter: Conway’s cellular automaton with glowing cells and mold trails showing emergent complexity from 4 simple rules
Rule 30 via FFmpeg’s cellauto filter: Wolfram’s elementary cellular automaton that generates apparent randomness from deterministic rules
Domain Coloring via Python + NumPy: Complex function visualization where hue represents angle and brightness represents magnitude, morphing through z², z³, and rational functions
L-Systems via Python + PIL: Three fractal trees grow algorithmically using Lindenmayer system grammar rules - pure mathematical botany
Barnsley Fern via Python chaos game: 500,000 points plotted using an Iterated Function System, emerging from randomness into a perfect fern
Julia Set via Python + NumPy: Dancing fractals as the complex parameter c traces a wobbling circle, continuously morphing fractal boundaries
Plasma Effect via FFmpeg’s geq expression filter: Real-time interference patterns using layered sine waves in RGB channels
Gradient Spiral via FFmpeg’s gradients filter: Six-color rotating spiral with 8x speed, creating hypnotic color field animation

… with cinematic title cards (fade transitions, credits) in a 1080×1080 square format perfect for social media.
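To give a feel for how little code some of these need, here is a rough sketch of the Barnsley Fern chaos game using only the Python standard library. It is my own illustration, not the code Claude generated: it uses the commonly published fern coefficients, far fewer than 500,000 points, and a crude text rendering instead of PIL. The names barnsley_fern and render_ascii are made up for this example.

```python
import random

# Each row is (a, b, c, d, e, f, probability) for the affine map
#   x' = a*x + b*y + e
#   y' = c*x + d*y + f
# These are the commonly published Barnsley fern coefficients.
TRANSFORMS = [
    (0.00, 0.00, 0.00, 0.16, 0.0, 0.00, 0.01),   # stem
    (0.85, 0.04, -0.04, 0.85, 0.0, 1.60, 0.85),  # successively smaller leaflets
    (0.20, -0.26, 0.23, 0.22, 0.0, 1.60, 0.07),  # largest left-hand leaflet
    (-0.15, 0.28, 0.26, 0.24, 0.0, 0.44, 0.07),  # largest right-hand leaflet
]
WEIGHTS = [t[6] for t in TRANSFORMS]


def barnsley_fern(n_points, seed=0):
    """Play the chaos game: repeatedly apply a randomly chosen affine map."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    points = []
    for _ in range(n_points):
        a, b, c, d, e, f, _p = rng.choices(TRANSFORMS, weights=WEIGHTS)[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        points.append((x, y))
    return points


def render_ascii(points, width=40, height=30):
    """Plot points on a text grid; the fern spans roughly -2.2..2.7 in x and 0..10 in y."""
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = int((x + 3.0) / 6.0 * (width - 1))
        row = int((1.0 - y / 10.0) * (height - 1))
        # Clamp so a stray point can never index outside the grid.
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)
        grid[row][col] = "*"
    return "\n".join("".join(r) for r in grid)


print(render_ascii(barnsley_fern(20000)))
```

The same loop with a bigger point count and an image library in place of render_ascii would be one way to get closer to the frames in the video.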

I learnt at least a few things from this:

The tool side. I now know that ffmpeg has built-in fractal capability. Fractals have fascinated me since I was 12. This is something to explore.
The technique side. I’m learning new terms like “temporal slit-scan photography” - used to create time-slice effects like bullet time in The Matrix, using ffmpeg. Or “music chord visualization” using neato, or capturing packet flow data using tcpdump to visualize network traffic, etc.
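As a starting point for exploring the tool side, here is a small sketch that builds a command around ffmpeg's built-in mandelbrot source. The helper name mandelbrot_zoom_command, the output file name, and the 5-second duration are my own placeholders, and I haven't tuned any of the zoom options, so treat it as an assumption-laden first try.

```python
import shlex


def mandelbrot_zoom_command(video_path, size="1080x1080", fps=30, seconds=5):
    """Build (but do not run) an ffmpeg command that renders the mandelbrot source."""
    # mandelbrot is a lavfi video source, so no input file is needed.
    source = f"mandelbrot=size={size}:rate={fps}"
    return [
        "ffmpeg", "-y",
        "-f", "lavfi", "-i", source,
        "-t", str(seconds),        # stop after this many seconds
        "-pix_fmt", "yuv420p",     # widely playable pixel format
        video_path,                # hypothetical output file
    ]


command = mandelbrot_zoom_command("mandelbrot.mp4")
print(shlex.join(command))
```

The source also has options for where the zoom starts and ends, which is where I'd expect the seahorse-valley style shots to come from.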

I would never have thought of these, but the capabilities are in my hands.

I think there’s benefit in just spending time with LLMs, asking them (in different ways) what they can do, and what would help, interest, or even amuse us.

PS: ChatGPT’s response to this was a bunch of good ideas and a tiny 0.5 second Mandelbrot video. Gemini shared a tiny list of 10 ideas (I read them all) but made up for it with this brilliant Veo-generated video.