You probably encounter at least a few statistical claims every day. You might see them in your day-to-day work, in your TikTok routine, or while scanning your local news roundup.
If you’re like most people, you don’t give them much thought unless you’re working on something that requires you to. Instead, you say “Hmm, interesting!” and move on. You rarely stop to interrogate the data, to ask whether any claims deriving from it are reasonable.
Fortunately, people like noted statistician Ryan McCorvie do interrogate statistical models day in and day out. It’s their job, after all. They’re not wizards; they simply have more practice than most of us at gathering and interpreting data.
If you’d like to know enough to follow along with them, here are nine key terms you should know.
1. Median
The median is the middle value, or midpoint, of any given data set. If the set has an odd number of values, it is the number exactly in the middle when the values are ordered from lowest to highest. If the set has an even number of values, it is the mean of the two middle numbers.
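As a minimal sketch of both cases, here is how the median could be computed with the median function from Python's standard statistics module. The two small data sets are made up for illustration.

```python
from statistics import median

# Hypothetical data sets, chosen only to show the odd and even cases.
odd_set = [9, 4, 7, 6, 7]   # five values
even_set = [9, 4, 7, 6]     # four values

# median() orders the values itself, so the input need not be sorted.
print(median(odd_set))    # ordered: 4, 6, 7, 7, 9 -> middle value is 7
print(median(even_set))   # ordered: 4, 6, 7, 9 -> mean of 6 and 7 is 6.5
```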
2. Mode
The mode is the most common value found in a data set. For example, in the following data set — 4, 6, 6, 7, 7, 7, 9 — the value ‘7’ is the mode (and also the median).
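The example above can be checked with a short Python sketch using the standard statistics module. The second list, with two tied values, is an added illustration and not part of the original example.

```python
from statistics import median, mode, multimode

values = [4, 6, 6, 7, 7, 7, 9]

most_common = mode(values)   # 7 appears three times, more than any other value
middle = median(values)      # the fourth of seven ordered values, also 7
print(most_common, middle)

# Illustrative extra: a data set can have more than one mode.
# multimode() returns every value tied for most frequent.
tied = multimode([1, 1, 2, 2, 3])
print(tied)                  # [1, 2]
```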
3. Mean
The mean is the “straight” (arithmetic) average of a given data set. To find it, add up all of the values in the set and divide the total by the number of values.
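Spelled out step by step in Python, the calculation might look like the sketch below. It reuses the data set from the mode example; the variable names are only illustrative.

```python
from statistics import fmean

values = [4, 6, 6, 7, 7, 7, 9]

total = sum(values)       # add up all of the values: 46
count = len(values)       # how many values there are: 7
average = total / count   # 46 / 7, roughly 6.57

print(average)

# The standard library's fmean() does the same arithmetic in one call.
print(fmean(values))
```

Note that the mean here (about 6.57) is lower than the median and mode (both 7), because the single low value of 4 pulls it down.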
4. Standard Deviation
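The standard deviation is a measure of spread within a data set. More specifically, it describes how far values typically fall from the mean: a small standard deviation means the values cluster tightly around the average, while a large one means they are widely scattered. It is the square root of the variance.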
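To make the idea of spread concrete, the sketch below compares two made-up data sets that share the same mean but differ in how scattered they are. It uses pstdev from Python's statistics module, which treats the data as a whole population; the companion function stdev would be the usual choice for a sample.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical data: both sets average 7, but one is far more spread out.
tight = [6, 7, 7, 7, 8]
wide = [1, 4, 7, 10, 13]

def spread_by_hand(data):
    """Square root of the average squared distance from the mean."""
    center = mean(data)
    squared_distances = [(x - center) ** 2 for x in data]
    return sqrt(sum(squared_distances) / len(data))

print(mean(tight), mean(wide))       # both are 7
print(pstdev(tight), pstdev(wide))   # the second is much larger
print(spread_by_hand(wide))          # matches pstdev(wide)
```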
5. Regression Analysis
Regression analysis examines how an outcome relates to one or more variables that might influence it. It is “a way of mathematically sorting out which of those variables does indeed have an impact,” says statistics expert Amy Gallo. This is useful for finding which factors matter most, how the factors interact with one another, and how much confidence we can place in those estimates.
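The simplest version, with a single factor, fits a straight line through the data. The sketch below is one hedged illustration using ordinary least squares; the function name fit_line and the hours-versus-score numbers are hypothetical, and real analyses typically involve several variables and dedicated statistical software.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y move together, relative to how much x moves on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied and test scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 57, 61, 68, 72]

slope, intercept = fit_line(hours, scores)
print(f"Each extra hour is associated with about {slope:.1f} more points")
print(f"Predicted score at 6 hours: {intercept + slope * 6:.1f}")
```

The slope is the estimated impact of the factor on the outcome; a slope near zero would suggest the factor has little influence.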
6. Normal Distribution
A normal distribution is represented by the bell curve: values become more frequent the closer they are to the mean and rarer the farther away they are. The distribution is usually drawn as a symmetric, bell-shaped curve with a peak at the average and tails that taper off on either side, hence the name.
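Python's standard library includes a NormalDist class that can illustrate this shape. In the sketch below, the mean of 170 and standard deviation of 10 are hypothetical values (think of adult heights in centimeters), not figures from any real data set.

```python
from statistics import NormalDist

# Hypothetical bell curve: mean 170, standard deviation 10.
heights = NormalDist(mu=170, sigma=10)

# The curve is highest at the mean and falls away on both sides.
print(heights.pdf(170) > heights.pdf(180) > heights.pdf(190))  # True

# Share of values within one and two standard deviations of the mean.
within_one = heights.cdf(180) - heights.cdf(160)
within_two = heights.cdf(190) - heights.cdf(150)
print(f"{within_one:.1%} within one standard deviation")    # about 68%
print(f"{within_two:.1%} within two standard deviations")   # about 95%
```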
7. T-distribution
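The t-distribution describes how the mean of a small sample drawn from a normally distributed population varies when the population’s standard deviation is not known and must be estimated from the sample. It looks like the normal distribution but has heavier tails, and it approaches the normal distribution as the sample size grows.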
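A small simulation can show the practical effect. The sketch below repeatedly draws tiny samples from a normal population, estimates the spread from each sample alone, and counts how often the resulting t-statistic lands beyond 1.96, a cutoff that a normal distribution exceeds only about 5% of the time. The sample size, trial count, and random seed are arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev

def t_statistic(sample, population_mean):
    """Distance of the sample mean from the population mean,
    scaled by the spread estimated from the sample itself."""
    n = len(sample)
    return (mean(sample) - population_mean) / (stdev(sample) / n ** 0.5)

rng = random.Random(0)   # fixed seed so the run is repeatable
trials = 20_000
sample_size = 5          # deliberately small

extreme = 0
for _ in range(trials):
    sample = [rng.gauss(0, 1) for _ in range(sample_size)]
    if abs(t_statistic(sample, 0)) > 1.96:
        extreme += 1

share_extreme = extreme / trials
# With only five values per sample, this share is well above 5%.
print(f"{share_extreme:.1%} of t-statistics fell beyond 1.96")
```

The extra share of extreme results is why analysts working with small samples use wider cutoffs from the t-distribution.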
8. Standard Error
Also known as “standard error of the mean,” this is a measure of how far a sample’s mean is likely to deviate from the mean of the larger population. To calculate standard error, simply take the square root of the sample size and divide the sample’s standard deviation by it.
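The calculation described above can be sketched in a few lines of Python. The sample is the same made-up data set used in the mode example, and stdev is the sample standard deviation from the standard statistics module.

```python
from math import sqrt
from statistics import stdev

sample = [4, 6, 6, 7, 7, 7, 9]

sd = stdev(sample)        # sample standard deviation, roughly 1.51
n = len(sample)           # sample size: 7
se = sd / sqrt(n)         # standard error of the mean, roughly 0.57

print(f"standard deviation: {sd:.2f}")
print(f"standard error:     {se:.2f}")
```

Because the sample size sits under a square root, quadrupling the sample only halves the standard error.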
9. Confidence Interval
The confidence interval uses a sample to estimate a range (or interval) of plausible values for a quantity in the larger population, such as its mean, according to the National Library of Medicine. A 95% confidence interval, for example, comes from a procedure that would capture the true population value in about 95% of repeated samples.
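One common way to build such an interval for a mean is to take the sample mean plus or minus a multiple of the standard error. The sketch below assumes a reasonably large sample and uses a normal-distribution cutoff (about 1.96 for 95%); with small samples, a larger cutoff from the t-distribution would normally be used instead. The function name mean_confidence_interval and the simulated data are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation interval for the population mean."""
    center = fmean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    # Cutoff that leaves (1 - confidence) / 2 in each tail; ~1.96 for 95%.
    cutoff = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = cutoff * standard_error
    return center - margin, center + margin

# Simulated sample of 200 values; the seed and parameters are arbitrary.
rng = random.Random(1)
sample = [rng.gauss(100, 15) for _ in range(200)]

low, high = mean_confidence_interval(sample)
print(f"sample mean: {fmean(sample):.1f}")
print(f"95% confidence interval: {low:.1f} to {high:.1f}")
```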
Know Your Numbers
When it comes to numerical lingo, these nine terms are only the tip of the iceberg. There’s so much more to learn, not only about how data is gathered and statistical models are developed but also about how experts like Ryan McCorvie talk about them once the work is done.
But it’s important to know this stuff, even if you’re not a “numbers person.” Governments, corporations and other influential organizations use statistics to inform policies and strategies that directly or indirectly affect our lives. The more you understand about their work, the more you may understand what drives their decision-making — and the more informed you can be when it’s time to question it, too.