This statistic is often inadequately explained

Yukio
3 min read · Aug 22, 2024


I’m a data scientist and I create content on data-related topics across multiple social media channels. Recently, I ran a quiz in my stories and realized that not many people are aware of the harmonic mean, even though it is the foundation of one of the most widely used metrics for classification models: the F1 score, which is the harmonic mean of precision and recall.
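As a quick illustration of that link, here is a minimal sketch showing that the usual F1 formula and the harmonic mean of precision and recall give the same number. The precision and recall values are made up for the example; they do not come from any real model.

```python
from statistics import harmonic_mean

# Hypothetical precision and recall (illustrative values only).
precision = 0.9
recall = 0.6

# Textbook F1 formula.
f1_formula = 2 * precision * recall / (precision + recall)

# The same quantity, written as a harmonic mean.
f1_harmonic = harmonic_mean([precision, recall])

print(f"{f1_formula:.2f}")   # 0.72
print(f"{f1_harmonic:.2f}")  # 0.72
```

Note that the arithmetic mean of the same two values would be 0.75, so the harmonic mean pulls the score toward the weaker of the two.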

In reality, while the harmonic mean is an interesting measure for anyone working with data, I’ve noticed that many tutorials fail to give clear guidance on how to apply it correctly. They usually just mention that it’s used for calculating averages of rates, but that alone isn’t enough, since the arithmetic mean can also be used in these cases.

Here’s a breakdown with a solid example I adapted from one of my favourite Springer books:

Image source: https://www.geeksforgeeks.org/harmonic-mean/

Rates are always expressed as a ratio between two units, like miles per gallon or price per kilogram. To decide whether to use the arithmetic or harmonic mean, keep the following in mind:

- Harmonic Mean: the right choice when you’re averaging rates whose numerators are constant.
- Arithmetic Mean: the right choice when you’re averaging rates whose denominators are constant.
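One way to see why this rule holds is to write the overall rate as total numerator over total denominator. The notation below ($a_i$, $b_i$, $r_i$, $\bar{r}$) is my own shorthand for this sketch, not something taken from the book:

```latex
r_i = \frac{a_i}{b_i}, \qquad
\bar{r} = \frac{\sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} b_i}

% Constant denominators, b_i = b, so a_i = r_i b:
\bar{r} = \frac{\sum_{i=1}^{n} r_i \, b}{n \, b}
        = \frac{1}{n} \sum_{i=1}^{n} r_i
        \quad \text{(arithmetic mean)}

% Constant numerators, a_i = a, so b_i = a / r_i:
\bar{r} = \frac{n \, a}{\sum_{i=1}^{n} a / r_i}
        = \frac{n}{\sum_{i=1}^{n} 1 / r_i}
        \quad \text{(harmonic mean)}
```

In both cases the constant cancels, which is why only the rates themselves appear in the final formula.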

Still confused? Let’s look at an example:


Suppose we have five chefs (A, B, C, D, and E) in a donut shop, producing 20, 18, 15, 12, and 10 donuts per hour, respectively. Their productivity can be recorded in two different ways:

First Approach:
- Chef A: 20 donuts per hour
- Chef B: 18 donuts per hour
- Chef C: 15 donuts per hour
- Chef D: 12 donuts per hour
- Chef E: 10 donuts per hour

Second Approach:
- Chef A: 3 minutes per donut
- Chef B: 3.33 minutes per donut
- Chef C: 4 minutes per donut
- Chef D: 5 minutes per donut
- Chef E: 6 minutes per donut

To average the chefs’ productivity using the first approach (donuts per hour), the arithmetic mean is appropriate because the denominator, one hour of work per chef, is constant. In the second approach (minutes per donut), that same constant working time moves to the numerator: each chef spends 60 minutes producing a different number of donuts. Since the numerator is now the constant, the harmonic mean is the right choice.

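With the first approach, the arithmetic mean is (20 + 18 + 15 + 12 + 10) / 5 = 15 donuts per hour. With the second, the harmonic mean is 5 / (1/3 + 1/3.33 + 1/4 + 1/5 + 1/6), which is approximately 4 minutes per donut (exactly 4 when Chef B’s time is not rounded to 3.33). The two approaches agree, because 4 minutes per donut is the same as 15 donuts per hour. The arithmetic mean of the second list, about 4.27 minutes per donut, would not match.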
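If you want to check these numbers yourself, here is a minimal sketch using Python’s standard `statistics` module. The variable names are my own, and the minutes-per-donut values are derived from the donuts-per-hour figures rather than typed in rounded.

```python
from statistics import harmonic_mean, mean

# Productivity of chefs A to E, recorded both ways.
donuts_per_hour = [20, 18, 15, 12, 10]
minutes_per_donut = [60 / rate for rate in donuts_per_hour]

# First approach: time is the constant denominator -> arithmetic mean.
avg_donuts_per_hour = mean(donuts_per_hour)

# Second approach: time is the constant numerator -> harmonic mean.
avg_minutes_per_donut = harmonic_mean(minutes_per_donut)

# Cross-check against the totals: 300 minutes of work, 75 donuts.
total_minutes = 60 * len(donuts_per_hour)
total_donuts = sum(donuts_per_hour)

print(f"{avg_donuts_per_hour:.2f}")             # 15.00
print(f"{avg_minutes_per_donut:.2f}")           # 4.00
print(f"{total_minutes / total_donuts:.2f}")    # 4.00
print(f"{mean(minutes_per_donut):.2f}")         # 4.27 (wrong mean for this case)
```

The harmonic mean matches total minutes divided by total donuts, while the arithmetic mean of minutes per donut overstates the time per donut.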

Have you considered this before? How do you incorporate the harmonic mean into your analyses?

Hope you enjoyed the reading, cheers!


Written by Yukio

Mathematician with a master’s degree in Economics. Working as a Data Scientist for the last 10 years.
