This is a popular visualization, but I wanted to see if I could get the newer OpenAI models to create the visual without me running any code (i.e. I just want the answer). After a couple of iterations, O3 did a great job with this prompt:
Ask for the output, not the code. Models like O3 and O4 Mini can run code while thinking. Let’s stop asking for code to run. Just ask for the output directly. Let it figure out how.
Edge cases are everywhere. I had a problem with the UK, France, Algeria, etc. straddling the prime meridian. If all goes well, you get AI-speed results. But it never all goes well, and fixing things takes an expert working at human speed. Programmers underestimate edge cases, so compensate for this.
Here are downtimes for llmfoundry.straive.com. Convert this to CSV and allow me to download it. Also, draw the downtimes on a grid, rows=hour of day, columns=date, cell contains 1 circle per outage in that time period, size of each circle is based on the duration of the outage.
Apr 20, 2025 03:11:27 PM +08 to Apr 20, 2025 03:27:12 PM +08 (15 mins 45 secs) Apr 19, 2025 10:03:15 PM +08 to Apr 19, 2025 10:05:45 PM +08 (2 mins 30 secs) Apr 19, 2025 09:47:13 PM +08 to Apr 19, 2025 09:49:45 PM +08 (2 mins 32 secs) … (rest of the data – about 50 rows)
Here’s the power of what a model like O4 Mini High can do.
It can reason. So, it planned an approach. (Convert to CSV, transform into date and hour, create a grid-based plot, use a pandas DataFrame, save it to a CSV, etc.)
It can code. It is pretty good at coding, and this is not too hard a problem, so it got the code right in one shot.
It can run code. This is a powerful step. It executed the code and produced the visualization above.
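For the curious, the parsing step of that plan can be sketched in a few lines of Python (my own reconstruction – I never saw the model's actual code, and the field names are assumptions):

```python
import re
from datetime import datetime

# Parses lines like:
# "Apr 20, 2025 03:11:27 PM +08 to Apr 20, 2025 03:27:12 PM +08 (15 mins 45 secs)"
SPAN = re.compile(r"(.+?) \+08 to (.+?) \+08")
FMT = "%b %d, %Y %I:%M:%S %p"

def parse_outage(line):
    start_s, end_s = SPAN.match(line).groups()
    start = datetime.strptime(start_s, FMT)
    end = datetime.strptime(end_s, FMT)
    return {
        "date": start.date().isoformat(),               # grid column
        "hour": start.hour,                             # grid row
        "minutes": (end - start).total_seconds() / 60,  # circle size
    }

row = parse_outage(
    "Apr 20, 2025 03:11:27 PM +08 to Apr 20, 2025 03:27:12 PM +08 (15 mins 45 secs)"
)
```

From rows like this, the grid plot is a scatter of (date, hour) points with marker size proportional to minutes.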
All of this took less than one minute.
I did not look at the code. I just focused on the picture and suggested changes.
This draws crosses, not circles, for each hour. Also, if there are multiple outages in an hour, I want multiple circles.
Here’s the output that took less than 10 seconds:
Next iteration:
Make the circles red with the same level of transparency. Set the title to “LLM Foundry Downtime (SGT)”. Instead of jittering the circle, let the Y position be the middle of the outage time.
Next iteration:
Change the red to a milder shade. Set alpha to 0.5 but add a stroke with alpha 0.9. Format the dates like “Sun 20 Apr”, etc.
That’s it! I never even looked at the code. The whole loop took 3 minutes – far faster than I could manage, though I’m good at code and data visualization!
More importantly, the model frees me to focus on the real problem: why is the downtime high?
Language. Not country. For example, the Spanish / Mexican group spans countries, while Indian actors split into North Indian and South Indian. It’s language, not country.
Time period. Old American actors are a separate group from Hollywood. (Naturally. Brad Pitt was born after Humphrey Bogart died. They couldn’t have acted together.)
Genre. Hollywood Porn actors don’t act with mainstream Hollywood. Same with Japanese Porn, Hollywood TV, and Hollywood Horror actors.
How are these groups themselves connected? Do Chinese actors act with Hollywood often? How isolated is Bollywood from world cinema?
Hollywood is the core group
Take groups that act with other groups at least 5% of the time. Mainstream Hollywood acts with British and Hollywood TV/Horror actors. All other clusters are isolated.
Indian & Japanese clusters emerge
Let’s go more liberal. Take groups that act with other groups at least 2% of the time. Hollywood forms a big connected cluster. It includes most of Europe — British, German, French, Czech, Yugoslavian & Italian actors.
North & South Indian actors form the first non-Hollywood cross-language cluster.
The Japanese and Japanese porn actors form a cluster too. (Interestingly, it’s easy for a Japanese porn actor to act with mainstream Japanese actors. Hollywood porn actors find it far harder to act with Hollywood.)
Among groups that act with other groups at least 1% of the time, we have:
Chinese & Korean cluster emerges
Chinese & South Korean actors form the first cross-country cross-language cluster.
Hollywood expands to act with Scandinavian, Spanish, Polish, Brazilian & Nigerian films.
Other film industries (Russian, Greek, Egyptian – even Hollywood Porn) are still isolated.
World Cinema vs the rest
Among groups that act with other groups at least 0.5% of the time, we have:
Turkish & Iranian groups coming together
Indonesian actors acting with the Chinese
Hollywood expanding to cover Russian, Greek, Egyptian, and finally, Hollywood Porn. (It’s easier for Brazilian / Nigerian to act with Hollywood than to be a Hollywood Porn actor.)
At this point, there are 6 actor groups that act with each other at least 1 out of 200 times (0.5%).
World Cinema (Hollywood & friends)
Japanese (mainstream & porn)
Indian (North & South)
Chinese, South Korean & Indonesian
Turkish & Iranian
Filipino
One world of cinema
If we look at groups that act with other groups at least 0.25% of the time, we have a far more unified picture. Almost every actor group acts with another group at least 1 out of 400 times.
But even here, there’s an exception. Filipino actors — the most insular major actor group in the world.
So, how isolated is Bollywood from World Cinema? For its size, it’s one of the most isolated actor groups. (But not as much as Iranian/Turkish or Filipino.)
In reality, a colour is a combination of light waves with frequencies between 400–700 THz, just like sound is a combination of sound waves with frequencies from 20–20,000 Hz. Just like mixing various pure notes produces a new sound, mixing various pure colours (like those of a rainbow) produces new colours (like white, which isn’t on the rainbow).
Our eyes aren’t like our ears, though. They have 3 sensors that are triggered differently by different frequencies. The sensors roughly peak around red, green and blue. Roughly.
It turns out that it’s possible to recreate most (not all) colours using a combination of just red, green and blue by mimicking these three sensors to the right level. That’s why TVs and monitors have red, blue and green cells, and we represent colours using hex triplets for RRGGBB – like #00ff00 (green).
There are a number of problems with this from a computational perspective. Conceptually, we think of (R, G, B) as a 3-dimensional cube. That’d mean that 100% red is about as bright as 100% green or blue. Unfortunately, green is a lot brighter than red, which is a lot brighter than blue. Our 3 sensors are not equally sensitive.
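To put numbers on that: the standard Rec. 709 luminance coefficients weight the three primaries very unequally (strictly these apply to linear RGB, so treating display values this way is only an approximation):

```python
# Relative luminance weights from ITU-R BT.709: green dominates,
# blue barely contributes. (Strictly defined for linear RGB values;
# applying them to gamma-encoded display RGB is an approximation.)

def luminance(r, g, b):
    """Approximate relative luminance of an RGB triple in [0, 1]."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

red   = luminance(1, 0, 0)   # 0.2126
green = luminance(0, 1, 0)   # 0.7152 -- over 3x brighter than red
blue  = luminance(0, 0, 1)   # 0.0722 -- about a tenth of green
```

So the corners of the "cube" are nowhere near equally bright, which is exactly the problem described above.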
You’d also think that a colour that’s numerically mid-way between 2 colours should appear to be mid-way. Far from it.
This means that if you’re picking colours using the RGB model, you’re using something very far from the intuitive human way of perceiving colours.
Which is all very nice, but I’m usually in a rush. So what do I do?
I go to the Microsoft Office colour themes and use a colour picker to pick one. (I extracted them to make life easier.) These are generally good on the eye.
If I absolutely have to do things programmatically, I use the HCL colour scheme. The good part: it’s perceptually uniform. The bad part: not every interpolation is a valid colour.
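A sketch of what interpolating in HCL involves (hue in degrees; the function name and tuple layout are my own). Hue has to be interpolated along the shortest arc around the colour wheel, and the result still has to be converted to RGB and checked against the gamut – that's the "not every interpolation is a valid colour" caveat:

```python
# Linear interpolation in HCL space (hue, chroma, luminance) -- a sketch.
# Hue wraps around 360 degrees, so interpolate along the shortest arc.
# Caveat: the interpolated point may fall outside the sRGB gamut once
# converted back to RGB; that validity check is omitted here.

def lerp_hcl(a, b, t):
    h1, c1, l1 = a
    h2, c2, l2 = b
    dh = ((h2 - h1 + 180) % 360) - 180   # shortest signed hue difference
    return ((h1 + t * dh) % 360,
            c1 + t * (c2 - c1),
            l1 + t * (l2 - l1))

# Midway between hue 350 and hue 10 is hue 0, not hue 180:
mid = lerp_hcl((350, 50, 60), (10, 50, 60), 0.5)
```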
A question from Dorai got me thinking: does being good at maths help in programming?
I don’t have a personal view. But since Reportbee has data on the Class 12 examination results for the last three years, we thought we could do a bit of analysis.
Here’s the correlation of the scores of various subjects with Computer Science.
Correlation   Subject
0.79          CHEMISTRY
0.79          PHYSICS
0.75          ENGLISH
0.75          MATHEMATICS
0.72          LANGUAGE
0.67          BIOLOGY
0.66          ECONOMICS
0.66          COMMERCE
0.65          ACCOUNTANCY
0.56          HISTORY
0.52          GEOGRAPHY
It almost breaks neatly into four groups.
Physics & Chemistry, both of which have a correlation of 0.79, and clearly are the most correlated with Computer Science
Maths, English & Language, which have a correlation of 0.72 – 0.75
Biology, Economics, Commerce and Accountancy, which hover at around 0.66
History & Geography, which are 0.52 – 0.56
The results in 2010 are almost exactly the same.
Correlation   Subject
0.78          PHYSICS
0.78          CHEMISTRY
0.75          ENGLISH
0.75          MATHEMATICS
0.73          LANGUAGE
0.67          ACCOUNTANCY
0.65          ECONOMICS
0.65          COMMERCE
0.64          BIOLOGY
0.60          GEOGRAPHY
0.55          HISTORY
I’m not sure what it is that leads to this kind of correlation. In fact, the full correlation between every pair of subjects (for 2011) is below:
You can now plot data available at a district level on a map, like the temperature in India over the last century (via IndiaWaterPortal). The rows are years (1901, 1911, … 2001) and the columns are months (Jan, Feb, … Dec). Red is hot, green is cold.
(Yeah, the west coast is a great place to live in, but I probably need to look into the rainfall.)
districts.svg has 640 districts (I’ve no idea what the 641st looks like) and is tagged with the State and District names as titles:
I made it from the 2011 census map (0.4MB PDF). I opened it in Inkscape, removed the labels, added a layer for the districts, and used the paint bucket to fill each district’s area. I then saved the districts layer, cleaning it up a bit. Then I labelled each district with a title. (Seemed like the easiest way to get this done.)
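Once each district carries its name as a title, colouring the map from data is mechanical. Here’s a sketch with a two-district stand-in for districts.svg (the real file’s markup may differ in detail, and the temperatures are hypothetical):

```python
import xml.etree.ElementTree as ET

# A stand-in for districts.svg: each district path carries its name
# in a <title> child. (The real file's markup may differ.)
SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <path d="M0 0h10v10H0z"><title>Tamil Nadu: Chennai</title></path>
  <path d="M10 0h10v10H10z"><title>Kerala: Wayanad</title></path>
</svg>"""

# Hypothetical mean temperatures; red = hot, green = cold.
temps = {"Tamil Nadu: Chennai": 34.0, "Kerala: Wayanad": 22.0}

def colour(t, lo=20.0, hi=35.0):
    """Map a value to a red-green ramp: hi -> red, lo -> green."""
    f = max(0.0, min(1.0, (t - lo) / (hi - lo)))
    return f"#{round(255 * f):02x}{round(255 * (1 - f)):02x}00"

ns = {"svg": "http://www.w3.org/2000/svg"}
root = ET.fromstring(SVG)
for path in root.findall("svg:path", ns):
    district = path.find("svg:title", ns).text
    path.set("fill", colour(temps[district]))
```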
A couple of years ago, I managed to lose a fair bit of weight. At the start of 2010, I started putting it back on, and the trajectory continues. I’m at the stage where I seriously need to lose weight.
I subscribe to The Hacker’s Diet principle – that you lose weight by eating less, not exercising.
An hour of jogging is worth about one Cheese Whopper. Now, are you going to really spend an hour on the road every day just to burn off that extra burger?
You don't exercise to lose weight (although it certainly helps). You exercise because you'll live longer and you'll feel better.
I’m afraid I’ll live too long anyway, so I won't bother exercising just yet. It's down to eating less.
Sadly, I like food. So to make my “diet” work, I need foods that add fewer calories per gram. Usually, when browsing stores, I check these manually. But being a geek, I figured there’s an easier way.
Below is a graph of some foods (the kind I particularly need to avoid, but still end up eating). The ones on the top add a lot of calories (per 100g) and are better avoided. The ones at the right cost a lot more. Now, I’m no longer at the point where I need to worry about food expenses, but still, I can’t quite kick the habit.
Hover over the foods to see what they are, and click on them to visit the product. (If you’re using an RSS reader and this doesn’t work, read on my site.)
(The data was picked from Tesco.)
It’s interesting that cereals are in the middle of the calorie range. I always thought they’d be low in calories per gram. Turns out that if I want to have such foods, I’m better off with desserts or ice creams (profiterole, lemon meringue or tiramisu). In fact, even jams have fewer calories than cereals.
But there are some desserts to avoid. Nuts are a disaster. So are chocolates. Gums, dates and honey are in the middle – about as good as cereals. Salsa dip seems surprisingly low. Custards seem to hit the sweet spot – cheap, and very low in calories. Same for jellies.
So: custards and jelly. My daughter’s going to be happy.
The IMDb Top 250, as a source of movies, dries out quickly. In my case, I’ve seen about 175/250. Not sure how much I want to see the rest.
When chatting with Col Needham (who’s working his way through every movie with over 40,000 votes), I came up with this as a useful way of finding what movies to watch next.
Each box is one or more movies. Darker boxes mean more movies. Those on the right have more votes. Those on top have a better rating. The ones I’ve seen are green, the rest are red. (I’ve seen more movies than that – just haven’t marked them green yet 🙂)
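The chart itself is just 2-D binning. A sketch with made-up movies (the real chart used the IMDb Top 250; the bin widths here are my own choice):

```python
import math
from collections import Counter

# (title, rating, votes, seen) -- made-up rows standing in for IMDb data.
movies = [
    ("A", 9.2, 2_500_000, True),
    ("B", 8.1, 40_000, False),
    ("C", 8.1, 55_000, True),
]

def cell(rating, votes):
    """Row = rating in 0.1 steps, column = log10(votes) in quarter-decades."""
    return (round(rating, 1), int(math.log10(votes) * 4))

# Darker box = more movies landing in the same cell.
counts = Counter(cell(r, v) for _, r, v, _ in movies)
```

Using log10(votes) for the columns is what spreads the long tail of vote counts into a readable grid.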
I think people like to watch the movies on the top right – that popularity compensates (at least partly) for rating, and the number of votes is an indication of popularity.
It’s easy to pick movies in a specific genre as well.
Clearly, there are many more Comedy movies in the list than any other type – though Romance and Action are doing fine too. And I seem to have a strong preference for the Fantasy genre, in stark contrast to Horror.
(Incidentally, I’ve given up trying to see The Shining after three attempts. Stephen King’s scary enough. The novel kept me awake checking under my bed for a week at night. Then there’s Stanley Kubrick’s style. A Clockwork Orange was disturbing enough, but Haley Joel Osment in the first part of A.I. was downright scary. Finally, there’s Jack Nicholson. Sorry, but I won’t risk that combination on a bright sunny day with the doors open.)
Sometimes, school marks are moderated. That is, the actual marks are adjusted to better reflect students’ performances. For example, if an exam is very easy compared to another, you may want to scale down the marks on the easy exam to make it comparable.
I was testing out the impact of moderation. In this video, I’ll try and walk through the impact, visually, of using a simple scaling formula.
First, let me show you how to generate marks randomly. Let’s say we want marks with a mean of 50 and a standard deviation of 20. That means roughly two-thirds of the marks will fall within one standard deviation of the mean – between 30 and 70. I use the NORMINV formula in Excel to generate the numbers. The formula =NORMINV(RAND(), Mean, SD) generates a random mark that fits this distribution. Let’s say we create 225 students’ marks in this way.
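For anyone replicating this outside Excel, here’s a Python equivalent of =NORMINV(RAND(), 50, 20). (The clipping to 0–100 is my addition; raw normal draws can stray outside the mark range.)

```python
import random

random.seed(0)  # reproducible draws
# 225 marks from a normal distribution with mean 50 and SD 20,
# clipped to the valid 0-100 mark range.
marks = [min(100.0, max(0.0, random.normalvariate(50, 20)))
         for _ in range(225)]
```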
Now, I’ll plot it as a scatterplot. We want the X-axis to range from 0 to 225. We want the Y-axis to range from 0 to 100. We can remove the title, axes and the gridlines. Now, we can shrink the graph and position it in a single column. It’s a good idea to change the marker style to something smaller as well. Now, that’s a quick visual representation of students’ marks in one exam.
Let’s say our exam has a mean of 70 and a standard deviation of 10. The students have done fairly well here. If I want to compare the scores in this exam with another exam with a mean of 50 and standard deviation of 20, it’s possible to scale that in a very simple way.
We subtract the mean from the marks. We divide by the standard deviation. Then multiply by the new standard deviation. And add the new mean.
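The four steps above amount to standardising each mark as a z-score and re-expressing it against the target distribution – in code:

```python
def moderate(mark, mean, sd, new_mean, new_sd):
    """Scale a mark from one (mean, sd) to another via its z-score."""
    return (mark - mean) / sd * new_sd + new_mean

# One SD above average stays one SD above average:
moderate(80, 70, 10, 50, 20)  # -> 70.0
# The old mean maps exactly onto the new mean:
moderate(70, 70, 10, 50, 20)  # -> 50.0
```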
Let me plot this. I’ll copy the original plot, position it, and change the data.
Now, you can see that the mean has gone down a bit — it’s down from 70 to 50, and the spread has gone up as well — from 10 to 20.
Let’s try and understand what this means.
If the first column has the marks in a school internal exam, and the second in a public exam, we can scale the internal scores to be in line with the public exam scores for them to be comparable.
The internal exam has a higher average, which means it was easier, and a lower spread, which means most students answered similarly. When scaling it to the public exam, students who performed well in the internal exam would continue to perform well after scaling. But students with an average performance would have their scores pulled down.
This is because the internal exam is an easy one, and in order to make it comparable, we’re stretching their marks to the same range. As a result, the good performers would continue getting a top score. But poor performers who’ve gotten a better score than they would have in a public exam lose out.