Statistical variance measures how the data are spread about the mean or expected value. Unlike the range, which only looks at the extremes, the variance takes every data point into account when describing the distribution.
In many cases of statistics and experimentation, it is the variance that gives invaluable information about the data distribution.
Variance Calculation (Population of Scores)
The mathematical formula to calculate the variance is given by:
σ² = ∑ (X - µ)² / N
where:
σ² = variance
∑ (X - µ)² = the sum of (X - µ)² for all data points
X = individual data points
µ = mean of the population
N = number of data points
This means the variance is the average of the squared differences between the data points and the mean.
Step By Step Calculation
For example, suppose you want to find the variance of scores on a test. Suppose the scores are 67, 72, 85, 93 and 98.
1. Write down the formula for variance:
σ² = ∑ (x-µ)² / N
2. There are five scores in total, so N = 5.
σ² = ∑ (x-µ)² / 5
3. The mean (µ) of the five scores is (67 + 72 + 85 + 93 + 98) / 5 = 415 / 5 = 83, so µ = 83.
σ² = ∑ (x-83)² / 5
4. Now, compare each score (x = 67, 72, 85, 93, 98) to the mean (µ = 83)
σ² = [ (67-83)² + (72-83)² + (85-83)² + (93-83)² + (98-83)² ] / 5
5. Carry out the subtraction inside each pair of parentheses.
67-83 = -16
72-83 = -11
85-83 = 2
93-83 = 10
98 - 83 = 15
The formula will look like this:
σ² = [ (-16)² + (-11)² + (2)² + (10)² + (15)² ] / 5
6. Then, square each term in parentheses. We get 256, 121, 4, 100 and 225.
This is how:
σ² = [ (-16)×(-16) + (-11)×(-11) + (2)×(2) + (10)×(10) + (15)×(15) ] / 5
σ² = [ 16×16 + 11×11 + 2×2 + 10×10 + 15×15 ] / 5
σ² = [ 256 + 121 + 4 + 100 + 225 ] / 5
7. Then add up the numbers inside the brackets:
σ² = 706 / 5
8. To get the final answer, divide the sum by 5 (because there were five scores):
σ² = 141.2
This is the variance of the population of scores.
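The steps above can be reproduced in a few lines of Python. This is only a minimal sketch: the function name `population_variance` is chosen here for illustration and does not come from the article. The standard library's `statistics.pvariance` is used as a cross-check.

```python
import statistics


def population_variance(scores):
    """Average of the squared deviations from the population mean."""
    n = len(scores)                          # step 2: N
    mu = sum(scores) / n                     # step 3: the mean
    deviations = [x - mu for x in scores]    # steps 4-5: x - mu
    squares = [d ** 2 for d in deviations]   # step 6: square each deviation
    return sum(squares) / n                  # steps 7-8: sum, then divide by N


scores = [67, 72, 85, 93, 98]
variance = population_variance(scores)
print(variance)                      # 141.2
print(statistics.pvariance(scores))  # library cross-check
```

Writing the steps out by hand keeps the code close to the worked example; in practice `statistics.pvariance` alone would usually be enough.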
Variance of a Sample
In many cases, instead of a whole population, we deal with a sample drawn from it.
In this case, we need to slightly change the formula for the variance to:
S² = ∑ (X - X̄)² / (n - 1)
S² = the variance of the sample
X̄ = mean of the sample
n = number of data points in the sample
Note that the denominator is one less than the sample size in this case.
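As a rough illustration of the smaller denominator, the sketch below treats the five test scores from the worked example as a sample rather than a full population. That reinterpretation is an assumption made purely for the example, and `sample_variance` is a hypothetical helper name. The standard library's `statistics.variance` uses the same n - 1 denominator and serves as a cross-check.

```python
import statistics


def sample_variance(sample):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(sample)
    if n < 2:
        raise ValueError("sample variance needs at least two data points")
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / (n - 1)


sample = [67, 72, 85, 93, 98]  # assumed here to be a sample, not a population
s2 = sample_variance(sample)
print(s2)                           # 176.5, i.e. 706 / 4
print(statistics.variance(sample))  # library cross-check
```

Dividing by 4 instead of 5 gives 176.5 rather than 141.2, so the sample estimate is always somewhat larger than the population formula applied to the same numbers.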
The concept of variance can be extended to continuous data sets too. In that case, instead of summing up the individual squared differences from the mean, we need to integrate them. This approach is also useful when the number of data points is very large, like the population of a country.
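For a continuous variable with probability density f(x), the sum becomes the integral of (x - µ)² f(x) over the range of x. The sketch below approximates that integral numerically with the midpoint rule. The uniform density on [0, 1] is an assumed example, not taken from the article, chosen because its variance is known to be 1/12. The names `continuous_variance` and `uniform_pdf` are hypothetical.

```python
def continuous_variance(pdf, a, b, steps=100_000):
    """Approximate the variance of a density on [a, b] with the midpoint rule."""
    dx = (b - a) / steps
    xs = [a + (i + 0.5) * dx for i in range(steps)]  # midpoints of each slice
    mean = sum(x * pdf(x) for x in xs) * dx          # integral of x f(x)
    return sum((x - mean) ** 2 * pdf(x) for x in xs) * dx


def uniform_pdf(x):
    """Density of the uniform distribution on [0, 1] (assumed example)."""
    return 1.0


approx = continuous_variance(uniform_pdf, 0.0, 1.0)
print(approx)  # close to 1/12
```

The midpoint rule is used only because it is short; a dedicated integration routine would be the more usual choice for real work.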
Variance is used extensively in probability theory and statistics, where more general conclusions need to be drawn from a smaller sample. Because variance describes how the data are spread around the mean, it helps us estimate where an unknown data point is likely to fall.