Law of large numbers

The frequentist approach makes it possible to forecast the probability of a future event from observed frequencies. It works because of the law of large numbers.

There are two different versions of the law of large numbers: the weak law of large numbers and the strong law of large numbers.


In 1713, after 20 years of hard work, the Swiss mathematician Jakob Bernoulli proved that if you have a sample of independent and identically distributed random variables, then as the sample size grows, the sample average converges in probability to the expected value (the population mean).

This is known as the Weak Law of Large Numbers, or Bernoulli's theorem.


It tells us that as more trials are performed, the observed frequency is more and more likely to be close to the true probability. Be aware that it is only more likely, not a 100% guarantee.
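We can watch this happen in a simulation. The following is a minimal sketch in Python; the function name and parameters are my own illustration, and the coin's true probability `p` plays the role of the unknown probability the frequency drifts toward.

```python
import random

def flip_frequency(n_flips, p=0.5, seed=42):
    """Flip a coin with heads probability p a total of n_flips times
    and return the observed frequency of heads."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n_flips) if rng.random() < p)
    return heads / n_flips

# The frequency tends toward p as the number of flips grows,
# but any single short run can still be far off.
for n in (10, 100, 10_000, 1_000_000):
    print(n, flip_frequency(n))
```

Running this, the short runs bounce around while the million-flip run sits very close to 0.5, which is exactly the "more likely, not guaranteed" behavior the weak law describes.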

Bernoulli's theorem gives us confidence that a random event's observed frequency will, with high probability, settle near a fixed value, its probability.


In 1930, Andrey Kolmogorov proved the strong law of large numbers, which states that the sample average converges almost surely to the expected value.

It tells us that as more trials are performed, the frequency will become closer to the true probability. Now we can confidently say that a random event has a definite probability.
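The strong law is a statement about a single sequence of trials: one sample path of running averages converges. A minimal sketch, using a fair die (expected value 3.5) and a helper name of my own choosing:

```python
import random

def running_mean_path(n_rolls, seed=0):
    """One sample path: roll a fair die n_rolls times and return
    the running mean after every roll."""
    rng = random.Random(seed)
    total, path = 0, []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)
        path.append(total / i)
    return path

# The strong law says this single path converges to E[X] = 3.5
# almost surely -- not merely that large deviations become unlikely.
path = running_mean_path(100_000)
print(path[9], path[999], path[-1])
```

The early entries of the path wander, while the later entries pin down near 3.5 and stay there.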



The LLN is important because it guarantees stable long-term results for the averages of some random events.



How the law of large numbers operates in insurance:

Do you think insurance is a very risky business? Insurers charge only a small fee, but when something goes wrong they have to pay out much more. Actually, it is not so risky: the risk can be controlled, thanks to the LLN. Imagine there is only one property insurance company, and it charges the same fee to all customers. As we know, older houses tend to have more problems, so would new-house owners and old-house owners be willing to pay the same fee? Soon enough another company is bound to say: I have two plans, plan A for new houses and plan B for old houses, and of course plan A is cheaper than plan B. The result is that all the new-house owners move to this company. Following this track, yet another company will introduce even more plans to target specific customers. Of course, there is more to consider when assessing the risk, such as the building's structure, materials, and location. So, to improve their products, insurance companies try their best to put similar customers into the same group. This is called segregation.


Aggregation: how can insurance companies reduce risk after putting similar customers into the same group? Here the LLN comes to help: as more trials are performed, the frequency becomes closer to the true probability. If an insurer knows the claim probability, it can offer a price that is both attractive and profitable. So the solution is clear: sign up as many customers as they can.


Another casino example: I am not just going to say that the casino has a slightly higher than 50% chance of winning each game, and therefore makes money because of the LLN.

Imagine you know a game has roughly a 50% chance of winning, and you see someone play it four times and lose every time. Do you feel that if he keeps playing, his chance of winning the next game is greater than 50%? After all, the overall probability of winning is 50% and he has already lost so many times. And you know the law of large numbers: if he keeps losing, how can the frequency become closer to the true probability as more trials are performed?

That is not true. The LLN works not by compensating for what has already happened but through mean reversion. To understand mean reversion, consider an example: if I put a spoonful of sugar into a cup of water, you can surely taste the sweetness. But what if I put that same spoonful of sugar into the sea? Even with the most advanced methods, I bet you couldn't tell the difference. The four early losses are like that spoonful of sugar: they are never cancelled out, they are simply diluted by the huge number of later trials. Can you think of another example of mean reversion? Leave a comment below.
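We can check this numerically. The sketch below (function name and parameters are my own illustration) starts from a four-loss streak, then plays many fair games. It tracks both the overall frequency and the raw win count relative to expectation:

```python
import random

def after_losing_streak(n_more, p=0.5, initial_losses=4, seed=7):
    """Start from an observed streak of initial_losses losses, then
    play n_more independent games with win probability p. Returns
    (overall win frequency, wins minus the expected p * n_more)."""
    rng = random.Random(seed)
    wins = sum(1 for _ in range(n_more) if rng.random() < p)
    freq = wins / (initial_losses + n_more)
    gap = wins - p * n_more  # later play does not "repay" the streak
    return freq, gap

# The overall frequency still approaches 0.5 -- not because the later
# games win more than 50% of the time, but because four early losses
# become negligible next to a million later games.
for n in (100, 10_000, 1_000_000):
    print(n, after_losing_streak(n))
```

The frequency converges to 0.5 while the count gap keeps wandering: nothing "owes" the player his lost games; the streak just gets diluted, like sugar in the sea.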


Two takeaways:

First: as more trials are performed, the frequency will become closer to the true probability.

Second: the LLN works not by compensating for what has already happened but through mean reversion.

