About a month or two ago, my AP Statistics teacher, Mr. Wylder, presented us with the t-distribution (or, in full, the Student’s t-distribution). He laid out the concept simply: the t-distribution is nothing but a normal distribution that is thicker in the tails, to better approximate small samples. I looked at his nice explanation, then at the bizarre equation in front of me, and asked, “Why?”, and now we are here! While I can’t promise an exact answer to why the t-distribution is the way it is, I hope you can appreciate the journey I went on these past few weeks.
To Start…
We start our journey with clearly… NOT MATH BUT HISTORY! Yeah, I got you, you silly math nerd. In English-language accounts, at least, the Student’s t-distribution is attributed to William Sealy Gosset. This English statistician was working for the Guinness brewery when he noticed that his confidence intervals and z tests (which we will get into later) were slightly less accurate than predicted (total scam!). He then invented the t-distribution (t for the test statistic) to correct for the error by giving the normal distribution thicker tails (or higher kurtosis). He wanted to publish a paper on his new invention, but his employer wanted him to conceal his name for fear of competing breweries learning of their secret weapon of error correction. Gosset decided to publish under the pseudonym Student, and that was how, in 1908, the Student’s t-distribution was born!
Or so we are told. The full story of the Student’s t-distribution starts with the German mathematicians Friedrich Robert Helmert and Jacob Lüroth, who originally derived the t-distribution’s formula (as a posterior distribution, at the time) in 1876. Additionally, in 1895, Karl Pearson published the Pearson Type IV family of distributions, whose special case, the Pearson Type VII distribution, is just a non-standardised Student’s t-distribution! Only in 1908 did Gosset rederive the distribution and put it to practical use in his paper.
Conceptualizing the Student’s T-Distribution
Nice story, you may be thinking, but what IS the Student’s t-distribution? Now we get to those confidence intervals and z/t tests. The t-distribution is used primarily when we don’t know the population’s standard deviation (which is most of the time) but still want to lean on the CLT (central limit theorem), such as in confidence intervals and significance tests for a mean.
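To make that concrete, here is a minimal sketch in Python (the sample values and the 95% level are made up for illustration) of the kind of interval Gosset cared about: because the population standard deviation is unknown, the critical value comes from the t-distribution, and the interval comes out a bit wider than the one the normal distribution would give.

```python
import numpy as np
from scipy import stats

# A hypothetical small sample (n = 10); the values are invented for illustration
sample = np.array([5.1, 4.8, 5.6, 5.0, 4.9, 5.3, 5.2, 4.7, 5.4, 5.0])
n = len(sample)
xbar = sample.mean()
s = sample.std(ddof=1)  # sample standard deviation (divides by n - 1)

# 95% confidence interval for the mean using the t-distribution (n - 1 degrees of freedom)
t_crit = stats.t.ppf(0.975, df=n - 1)
t_margin = t_crit * s / np.sqrt(n)
print(f"t interval: {xbar - t_margin:.3f} to {xbar + t_margin:.3f}")

# For comparison: the narrower interval we would (over-optimistically) get from the normal distribution
z_crit = stats.norm.ppf(0.975)
z_margin = z_crit * s / np.sqrt(n)
print(f"z interval: {xbar - z_margin:.3f} to {xbar + z_margin:.3f}")
```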
The CLT itself states that if we take a big enough sample of independent observations, then the sampling distribution of the sample mean (the probability density function of all possible sample means for that sample size) will be approximately normal. The Student’s t-distribution, then, is what that sampling distribution becomes once we are forced to estimate the population’s standard deviation from the sample itself.
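In symbols (this is the standard textbook notation, not anything from Mr. Wylder’s handout): the z statistic standardises the sample mean with the true standard deviation σ, while the t statistic has to make do with the sample standard deviation s.

```latex
\[
  z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}
  \qquad \text{vs.} \qquad
  t = \frac{\bar{x} - \mu}{s / \sqrt{n}}
\]
```

Here t follows a Student’s t-distribution with n - 1 degrees of freedom, because s is itself computed from the same sample as x̄ and carries extra uncertainty.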
What does that mean exactly? Well, if we compare the sample variance to the population variance, we will notice that the sample variance carries a correction factor: it divides by n - 1 (the degrees of freedom) instead of n, which better approximates the population variance from a sample. In light of this, the t-distribution is to the normal sampling distribution what a statistic is to a parameter: the version we can actually compute from the data we have.
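As a quick sanity check on that correction factor, here is a small simulation sketch (the population, sample size, and repetition count are arbitrary choices of mine): averaging many small-sample variance estimates shows that dividing by n systematically underestimates the population variance, while dividing by n - 1 lands close to the target.

```python
import numpy as np

rng = np.random.default_rng(0)
true_var = 4.0   # population variance (sigma = 2), chosen arbitrarily
n = 5            # deliberately small sample size

# Draw many small samples and average the two competing variance estimates
biased, corrected = [], []
for _ in range(100_000):
    sample = rng.normal(loc=0.0, scale=2.0, size=n)
    biased.append(np.var(sample, ddof=0))     # divide by n
    corrected.append(np.var(sample, ddof=1))  # divide by n - 1 (Bessel's correction)

print(f"divide by n:     {np.mean(biased):.3f}  (underestimates {true_var})")
print(f"divide by n - 1: {np.mean(corrected):.3f}  (close to {true_var})")
```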