Clarifying Commonly Confused Stats Concepts
In a previous blog post about the Stanford Introduction to Statistics course, I mentioned that I’d be revisiting a few topics that took me a little longer to wrap my head around. This post is exactly that: a way for me to break those topics down, both to clear things up for myself and to create something I can revisit whenever I need a quick refresher.
So here are the topics I'm going to cover:
- Law of Large Numbers vs Central Limit Theorem
- Significance Level vs. P-value
- Confidence Level vs Confidence Interval
- Standard Deviation vs Standard Error
- Z-test vs T-test
1. Law of Large Numbers (LLN) vs Central Limit Theorem (CLT)
Both are foundational ideas in statistics and often mentioned together, but they’re doing two very different jobs.
LLN says that as your sample size increases, your sample mean tends to get closer to the true population mean.
CLT explains that, regardless of the shape of your population’s distribution (as long as its variance is finite), the sampling distribution of the sample mean becomes approximately normal as your sample size grows.
Example: Imagine flipping a fair coin. LLN says that as the number of flips increases, the proportion of heads will approach 0.5. CLT says that if you repeatedly take samples of 30 flips and record the proportion of heads in each, the distribution of those sample proportions will be approximately normal, centred on 0.5.
In short:
- LLN = Accuracy improves with more data
- CLT = Sampling distributions become bell-shaped as n increases
- LLN focuses on convergence; CLT focuses on distribution
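To see both ideas with the coin-flip example, here is a minimal simulation sketch using only Python's standard library. The seed, the run sizes, and the helper name `proportion_heads` are my own choices for illustration, not anything from the course.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

def proportion_heads(n_flips):
    """Flip a fair coin n_flips times and return the proportion of heads."""
    return sum(random.random() < 0.5 for _ in range(n_flips)) / n_flips

# LLN: independent runs of increasing size; the proportion settles near 0.5.
for n in (10, 100, 10_000, 100_000):
    print(f"n={n:>7}: proportion of heads = {proportion_heads(n):.4f}")

# CLT: many samples of 30 flips each; look at how their means are spread.
sample_means = [proportion_heads(30) for _ in range(5_000)]
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# Theory predicts a centre of 0.5 and a spread of sqrt(0.5 * 0.5 / 30).
theoretical_sd = (0.5 * 0.5 / 30) ** 0.5
print(f"mean of sample means = {mean_of_means:.4f}")
print(f"SD of sample means   = {sd_of_means:.4f} (theory: {theoretical_sd:.4f})")
```

Plotting a histogram of `sample_means` should show the rough bell shape, even though a single flip can only ever be heads or tails.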
2. Significance Level vs P-value
Both relate to how surprising your results are under the null hypothesis.
Significance level (α) is the threshold you set before doing your test. It's like saying, "I’ll only consider a result unusual if it would have less than a 5% chance of happening when the null hypothesis is true."
P-value is the probability of getting your observed result (or something more extreme), assuming the null is true. It is calculated from your data after the test is run.
Example: You run a hypothesis test and get a p-value of 0.03. If α = 0.05, then 0.03 < 0.05, so you reject the null.
In short:
- α = cutoff for "rare enough" (usually 0.05)
- p-value = probability of observing data at least as extreme as yours, assuming H₀ is true
- If p < α, results are statistically significant
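As a rough sketch of how the two fit together, the snippet below computes an exact two-sided p-value for a coin-flip experiment and compares it with α. The scenario (61 heads in 100 flips) and the function name `binomial_two_sided_p` are hypothetical, chosen only to illustrate the decision rule.

```python
from math import comb

def binomial_two_sided_p(heads, flips, p_null=0.5):
    """Exact two-sided p-value for a coin-flip test.

    Sums the probabilities of every outcome that is no more likely
    under the null hypothesis than the observed one.
    """
    def pmf(k):
        return comb(flips, k) * p_null**k * (1 - p_null)**(flips - k)

    observed = pmf(heads)
    # Small tolerance so equally likely outcomes are not lost to rounding.
    return sum(pmf(k) for k in range(flips + 1)
               if pmf(k) <= observed * (1 + 1e-9))

alpha = 0.05                                   # chosen before looking at the data
p_value = binomial_two_sided_p(heads=61, flips=100)
reject_null = p_value < alpha                  # the decision rule

print(f"p-value = {p_value:.4f}, reject null at alpha={alpha}: {reject_null}")
```

Note that α is fixed in advance, while the p-value changes with every new dataset.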
3. Confidence Level vs Confidence Interval
These two are often used together, but they represent different parts of the same idea.
Confidence level is about how confident you are in your method. If you created 100 different confidence intervals from different samples, around 95 of them would contain the true population parameter (if you’re using 95% confidence).
Confidence interval is the actual range you calculate from your sample.
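Example: From a sample, you calculate a 95% CI for the mean as (1.8, 2.1). This means the method used to build the interval would capture the true mean in about 95% of repeated samples; it does not tell you whether this particular interval is one of the successes.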
In short:
- Confidence level = how reliable the process is
- Confidence interval = your actual range estimate
- Never say “there’s a 95% chance the true mean is in this interval”; once calculated, the interval either contains it or doesn’t
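The "repeat the process many times" idea can be checked with a small simulation. This is a sketch under stated assumptions: the population values (`TRUE_MEAN`, `SIGMA`), the sample size, and the choice to treat the population SD as known are all mine, picked to keep the code within the standard library.

```python
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN, SIGMA, N = 2.0, 0.5, 40   # hypothetical population and sample size
CONFIDENCE = 0.95
z_crit = NormalDist().inv_cdf(0.5 + CONFIDENCE / 2)  # about 1.96

def confidence_interval(sample):
    """z-based interval for the mean, treating SIGMA as known."""
    margin = z_crit * SIGMA / len(sample) ** 0.5
    centre = mean(sample)
    return centre - margin, centre + margin

trials = 2_000
hits = 0
for _ in range(trials):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    low, high = confidence_interval(sample)
    hits += low <= TRUE_MEAN <= high   # True counts as 1

coverage = hits / trials
print(f"{coverage:.1%} of the {trials} intervals contained the true mean")
```

Each individual interval either contains 2.0 or it doesn't; the 95% describes the long-run share of intervals that do.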
4. Standard Deviation vs Standard Error
They sound similar, but they tell you different things.
Standard deviation (SD) measures the variability within a single dataset.
Standard error (SE) tells you how much variability there is in the sample mean itself.
Example: You measure the height of 50 people. SD tells you how spread out the individual heights are; SE tells you how much the average height would vary if you repeated this sampling.
In short:
- SD = spread of individual data points
- SE = SD / √n
- Use SE when you're interested in how reliable your estimate of the mean is
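Here is a minimal sketch of the height example. The heights are simulated (the mean of 170 cm and spread of 10 cm are made-up values), so only the relationship between SD and SE matters, not the specific numbers.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical heights (cm) for 50 people; simulated, not real measurements.
heights = [random.gauss(170, 10) for _ in range(50)]

n = len(heights)
sd = statistics.stdev(heights)   # sample SD: spread of individual heights
se = sd / n ** 0.5               # standard error: uncertainty in the mean

print(f"mean = {statistics.mean(heights):.1f} cm")
print(f"SD   = {sd:.2f} cm")
print(f"SE   = {se:.2f} cm")
```

Collecting more people would leave the SD roughly unchanged, while the SE would keep shrinking because of the √n in the denominator.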
5. Z-test vs T-test
The difference comes down to whether you know the population standard deviation (σ) and how big your sample is.
Z-test is used when σ is known and/or the sample size is large (usually n ≥ 30).
T-test is used when σ is unknown and has to be estimated from the sample.
Example: You’re testing if the average weight of apples is 150g. You don’t know σ and your sample is only 20 apples, so use a t-test.
In short:
- Use Z-test if population SD is known
- Use T-test if it’s unknown (which is common)
- T-distribution becomes more like Z as n increases
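The apple example can be sketched as a one-sample t-test. The 20 weights below are invented for illustration, and because the standard library has no t-distribution, the critical value 2.093 (two-sided, 5% level, 19 degrees of freedom) is taken from a standard t-table rather than computed.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical weights (g) of 20 apples; made up for illustration.
weights = [152, 148, 155, 149, 151, 153, 147, 156, 150, 154,
           149, 152, 158, 146, 151, 153, 150, 155, 148, 152]
MU_0 = 150  # hypothesised mean under the null

n = len(weights)
sample_mean = mean(weights)
sample_sd = stdev(weights)                 # estimates the unknown sigma
t_stat = (sample_mean - MU_0) / (sample_sd / n ** 0.5)

# Two-sided 5% critical values: t with n - 1 = 19 degrees of freedom
# (from a standard t-table) versus the standard normal distribution.
T_CRIT_DF19 = 2.093
z_crit = NormalDist().inv_cdf(0.975)       # about 1.96

reject_null = abs(t_stat) > T_CRIT_DF19
print(f"t = {t_stat:.3f}, t critical = {T_CRIT_DF19}, z critical = {z_crit:.3f}")
print(f"reject null at the 5% level: {reject_null}")
```

The t critical value is larger than the z one, which is how the t-test accounts for the extra uncertainty of estimating σ from a small sample. As n grows, the gap between the two closes.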
Saving the Tough Ones for Later
While drafting this post, I realised that I needed a deeper understanding of a few of the topics, specifically the Monte Carlo method, the bootstrap principle, the chi-squared test, and ANOVA. These concepts are powerful and widely used, but as I tried to explain them, it became clear that I don’t fully grasp them just yet. So instead of rushing through them here, I’m going to take some more time to study and explore them properly, and once I feel more confident, I’ll circle back and write them up in another blog post!