I asked multiple coding agents and models to build the same app:

Create a single-page web app at index.html that beautifully renders a GitHub user profile and activity comprehensively. Pick the ID in the URL ?id=…, default to ?id=torvalds.

… and compared their quality, cost, and speed.

My observations:

Quality varies the most. Some models / agents produce great visuals, some are average, and some fail completely.

Cost and time vary less among the successful models: about 2X in each.

This is unlike non-code usage, where quality varies less than cost.
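
To make the "about 2X" comparison concrete, here is a minimal sketch of one way to measure spread: the ratio of the largest to the smallest value across successful runs. This is an assumption about what "variance" means here, not necessarily how the results page computes it. The agent names and numbers are made up for illustration and are not taken from the results.

```python
def spread(values):
    """Return max/min of the values; 2.0 means a 2X spread."""
    lo, hi = min(values), max(values)
    if lo <= 0:
        raise ValueError("spread needs strictly positive values")
    return hi / lo


# Hypothetical runs (illustrative numbers only, NOT the measured results).
runs = {
    "agent_a": {"cost_usd": 0.50, "minutes": 4.0},
    "agent_b": {"cost_usd": 1.00, "minutes": 8.0},
    "agent_c": {"cost_usd": 0.80, "minutes": 5.0},
}

cost_spread = spread([r["cost_usd"] for r in runs.values()])
time_spread = spread([r["minutes"] for r in runs.values()])

print(f"cost spread: {cost_spread:.1f}X, time spread: {time_spread:.1f}X")
```

A ratio like this only works for cost and time. Quality needs a separate scale (for example, a rubric score), and failed runs would have to be excluded or scored as zero.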

My takeaway: Pick the best model / agent. Don’t worry much about speed and cost, since they vary far less than quality.

Results: https://sanand0.github.io/llmevals/coding-agents/
