Frequentist probability


Almost all of the statistics we practice today is based on frequentist probability.

Frequentists believe that as the number of trials increases, the relative frequency of an event gets closer to the probability of that event. On any single trial, only one of two possibilities can hold: the event occurs or it does not. But if a random event is repeated many times, its relative frequency will approach its probability.

For example, to estimate the percentage of defective products, we collect a large enough sample and divide the number of defective products by the total number of products.

Frequentists believe every event has a fixed probability. As long as we perform enough trials, the frequency of that event approaches its probability.
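Here is a minimal sketch of that idea in Python. The defect rate, sample sizes, and function name are made up for illustration; the point is only that the estimate stabilizes as the sample grows:

```python
import random

# Hypothetical factory: each product is independently defective
# with a "true" probability we pretend not to know.
TRUE_DEFECT_RATE = 0.03

def count_defective(n: int) -> int:
    """Inspect n products and return how many are defective."""
    return sum(random.random() < TRUE_DEFECT_RATE for _ in range(n))

for n in (100, 10_000, 1_000_000):
    defective = count_defective(n)
    # Frequentist estimate: defective count divided by total inspected.
    print(f"n = {n:>9}: estimated defect rate = {defective / n:.4f}")
```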


The idea of repeating a random experiment many times and using the observed frequency to estimate its probability is easy to understand. However, you may have a doubt: frequency and probability are two different concepts, so how can we confidently use one to measure all random events? To answer this question, let me run an experiment: toss a coin. Toss once: heads. Can I say the probability of heads is 100%? Of course not; I haven’t repeated it enough times. So toss the coin 10 times: 8 heads and 2 tails, still not the 50/50 we expected. Shall we continue? Hold on, many mathematicians have already done this before:


Experiment   Researcher    Number of tosses   Number of heads   Relative frequency of heads
1            Buffon                  4 040             2 048    0.5069
2            Pearson                12 000             6 019    0.5016
3            Pearson                24 000            12 012    0.5005
4            Feynman                 3 000             1 492    0.497
5            De Morgan               4 092             2 048    0.5005
6            Jevons                 20 480            10 379    0.5068
7            Romanovsky             80 640            39 699    0.4923
8            Feller                 10 000             4 979    0.4979


A big thanks to them for the hard work. Now we can see that heads comes up about 50% of the time if we toss the coin enough times.
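You can reproduce the same trend in a few lines of simulation. This is only a sketch with made-up toss counts, not a re-run of the historical experiments:

```python
import random

def heads_frequency(n: int) -> float:
    """Toss a fair coin n times and return the relative frequency of heads."""
    heads = sum(random.random() < 0.5 for _ in range(n))
    return heads / n

# The frequency wanders for small n and settles near 0.5 for large n.
for n in (10, 100, 4_040, 24_000, 1_000_000):
    print(f"{n:>9} tosses: relative frequency of heads = {heads_frequency(n):.4f}")
```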


By now, I think most of you are satisfied with the conclusion and comfortable applying it to your own problems. But some of you may ask: can you prove it? Tossing the coin thousands of times seems to align with the theory, but what about tossing it millions of times? And even if the theory holds for coin tossing, does it hold for other random events?

Do you know the turkey story? On a farm, there is a smart turkey that is good at observation and deduction. It finds out that a farmer brings food every morning. The turkey thinks: “It seems the farmer will feed me every morning, but I won’t rush to that conclusion. Let me observe a little longer.” A year later, the smart turkey finally, comfortably, reaches the conclusion that there will be food every morning. The conclusion remains true until Thanksgiving...

  

Experiments can’t prove a theory right; they can only prove it wrong. We need a mathematical proof!


Luckily, this theory has been proved. In 1713, the proof that the Swiss mathematician Jakob Bernoulli had spent 20 years of hard work on was published in his book Ars Conjectandi. The result is the law of large numbers (LLN): the average of the results obtained from a large number of trials should be close to the expected value, and it tends to become closer to the expected value as more trials are performed.
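For reference, here is one standard modern statement of the result (the weak law), written in today’s notation rather than Bernoulli’s:

```latex
% Weak law of large numbers: for independent, identically distributed
% random variables X_1, X_2, ... with expected value mu, the sample
% mean converges in probability to mu.
\[
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\lim_{n \to \infty} \Pr\bigl(\,\lvert \bar{X}_n - \mu \rvert > \varepsilon \,\bigr) = 0
\quad \text{for every } \varepsilon > 0.
\]
```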


The law of large numbers is the keystone of statistics and probability because it guarantees stable long-term results for the averages of random events. Because of it, we can be confident that using frequency to estimate probability is sound. Further, under the same conditions, using historical data to predict the future becomes possible. It opens a door for us to understand the world!


You may notice that using frequency to estimate probability comes with one condition: “repeat the trial many times.” So how many times is enough? Ideally, “many times” means approaching infinity. But infinity is a mathematical idea; we can’t reach it in the real world. Therefore, to put the method to use, we introduce two more concepts to help: the confidence coefficient (or confidence level) and the margin of error.


The law of large numbers tells us that as more trials are performed, the frequency tends to become closer to the probability. In other words, the observed frequency will be bounded within a range around the true probability; this range is the margin of error. For example, tossing a coin should theoretically give 50% heads. If an experiment yields a relative frequency of heads between 48% and 52%, we say the margin of error is ±2%. If, with that ±2% margin of error, we repeat the whole experiment 100 times and 96 of the runs fall within the range, then the confidence coefficient is 96%.
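A minimal sketch of this repeated-experiment reading, assuming a fair coin; the number of runs and tosses per run are made-up illustrative values:

```python
import random

def heads_frequency(n_tosses: int) -> float:
    """One run: toss a fair coin n_tosses times; return the frequency of heads."""
    heads = sum(random.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

p, margin, runs, n_tosses = 0.5, 0.02, 1_000, 7_000  # illustrative values
# Count how many whole runs land inside the +/-2% band around 0.5.
inside = sum(abs(heads_frequency(n_tosses) - p) <= margin for _ in range(runs))
print(f"confidence coefficient ≈ {inside / runs:.1%}")  # ~99.9% for these values
```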


Thanks to the confidence coefficient (or confidence level) and the margin of error, we can study an event’s probability without infinite samples. If we require a 99.9% confidence coefficient and a ±2% margin of error, about 7,000 samples are enough. If we require a 95% confidence coefficient and a ±3% margin of error, about 1,000 samples are enough. In fact, 95% and ±3% is a common standard in scientific research and statistical surveys.
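The quoted sample sizes can be recovered with the standard normal-approximation formula n ≈ z²·p(1−p)/E² for estimating a proportion. This is a textbook result rather than something derived in this post, and the worst case p = 0.5 is assumed below:

```python
import math
from statistics import NormalDist

def sample_size(confidence: float, margin: float, p: float = 0.5) -> int:
    """Normal-approximation sample size for estimating a proportion p
    to within +/-margin at the given two-sided confidence level."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return math.ceil(z * z * p * (1 - p) / margin ** 2)

print(sample_size(0.999, 0.02))  # 6768 -> "about 7,000" samples
print(sample_size(0.95, 0.03))   # 1068 -> "about 1,000" samples
```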

