Statistical correlation is a technique that tells us whether, and how strongly, two variables are related.

For example, consider the variables of family income and family expenditure. It's well known that income and expenditure tend to increase or decrease together. Thus they are related in the sense that a change in either variable is accompanied by a change in the other.

Likewise, the price and the demand of a commodity are related variables; when price increases, demand will tend to decrease, and vice versa.

If the change in one variable is accompanied by a change in the other, then the variables are said to be correlated. We can therefore say that family income and family expenditure are correlated, as are commodity price and demand.

Correlation is about the relationship between variables. Correlations tell us:

- whether this relationship is positive or negative
- the strength of the relationship.

In the case of family income and family expenditure, it is easy to see that they both rise or fall together in the same direction. This is called a positive correlation.

In the case of price and demand, change occurs in opposing directions, so that an increase in one is accompanied by a decrease in the other. This is called a negative correlation.

Statistical correlation is measured by what is called the coefficient of correlation (r). Its numerical value ranges from +1.0 to -1.0. It gives us an indication of both the strength and direction of the relationship between variables.
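The article does not say which coefficient it means by r. Assuming it is the commonly used Pearson product-moment coefficient (an assumption, not something stated above), then for n paired observations (x_i, y_i) with sample means x̄ and ȳ it can be written as:

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

The numerator is positive when the two variables tend to sit on the same side of their means and negative when they tend to sit on opposite sides, which is where the sign of r comes from. The denominator rescales the result so that it always lies between -1.0 and +1.0.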

In general, r > 0 indicates a positive relationship, r < 0 indicates a negative relationship and r = 0 indicates no linear relationship. Note that r = 0 does not by itself mean the variables are independent of each other; they may still be related in a nonlinear way. Here r = +1.0 describes a perfect positive correlation and r = -1.0 describes a perfect negative correlation.

The closer the coefficient is to +1.0 or -1.0, the greater the strength of the relationship between the variables.
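To make the sign and size of r concrete, here is a minimal Python sketch that computes it for two small data sets. It assumes r is the Pearson coefficient, which the article does not state. The function name `pearson_r` and all the numbers are hypothetical, made up purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each observation from its variable's mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_deviation = sum(a * b for a, b in zip(dx, dy))
    sum_sq_x = sum(a * a for a in dx)
    sum_sq_y = sum(b * b for b in dy)
    if sum_sq_x == 0 or sum_sq_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return co_deviation / math.sqrt(sum_sq_x * sum_sq_y)


# Hypothetical figures, invented for illustration only.
income = [30, 40, 50, 60, 70]
expenditure = [25, 32, 41, 47, 58]
price = [1.0, 1.5, 2.0, 2.5, 3.0]
demand = [100, 85, 72, 60, 41]

r_income = pearson_r(income, expenditure)
r_price = pearson_r(price, demand)
print(f"income vs expenditure: r = {r_income:+.3f}")
print(f"price vs demand:       r = {r_price:+.3f}")
```

With these made-up numbers, the income and expenditure pair gives an r close to +1.0, and the price and demand pair gives an r close to -1.0, matching the positive and negative correlations described above.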

As a rule of thumb, the following guidelines on strength of relationship are often useful (though experts disagree on exactly where the boundaries should be drawn).

Value of r | Strength of relationship |
---|---|
-1.0 to -0.5 or 0.5 to 1.0 | Strong |
-0.5 to -0.3 or 0.3 to 0.5 | Moderate |
-0.3 to -0.1 or 0.1 to 0.3 | Weak |
-0.1 to 0.1 | None or very weak |
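The rule of thumb in the table can be sketched as a small Python lookup. The function name `strength_of_relationship` is hypothetical. The table's ranges share their endpoints, so this sketch assumes a value that falls exactly on a boundary (for example 0.5) goes to the stronger category; the article does not specify this.

```python
def strength_of_relationship(r):
    """Map a correlation coefficient to the rule-of-thumb labels in the table."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.0 and +1.0")
    # Strength depends only on the magnitude of r, not its sign.
    magnitude = abs(r)
    if magnitude >= 0.5:
        return "Strong"
    if magnitude >= 0.3:
        return "Moderate"
    if magnitude >= 0.1:
        return "Weak"
    return "None or very weak"


for value in (-0.72, 0.35, -0.2, 0.05):
    print(f"r = {value:+.2f}: {strength_of_relationship(value)}")
```

The sign of r is ignored here because it describes the direction of the relationship rather than its strength.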

Correlation is only appropriate for examining the relationship between meaningful quantifiable data (e.g. air pressure, temperature) rather than categorical data such as gender or color.

While 'r' (the correlation coefficient) is a powerful tool, it has to be handled with care.

- The most commonly used correlation coefficients only measure linear relationships. It is therefore perfectly possible for r to be close to 0, or even exactly 0, while there is a strong nonlinear relationship between the variables. In such a case, a scatter diagram can roughly indicate whether or not a nonlinear relationship exists.
- One has to be careful in interpreting the value of 'r'. For example, it has been shown that the number of people who have fallen into swimming pools each year since 1999 correlates with the number of films Nicolas Cage has appeared in. Obviously, irrespective of the value of 'r', this is what's called a nonsense correlation, and for good reason!
- 'r' should never be used on its own to claim a cause and effect relationship. From the value of 'r' we can conclude only that variables X and Y are related; it does not tell us whether X influences Y or the other way round, or whether a third variable influences both. Because of this third-variable problem, statistical correlation should not be the primary tool used to study causation.
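The first caveat above, that r can be zero despite a strong nonlinear relationship, can be illustrated with a short Python sketch. It assumes r is the Pearson coefficient. The helper `pearson_r` and the data are hypothetical: y is set to exactly x squared, so y is completely determined by x.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_deviation = sum(a * b for a, b in zip(dx, dy))
    return co_deviation / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Hypothetical data: x is symmetric around zero and y depends on x exactly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r_nonlinear = pearson_r(xs, ys)
print(f"y = x^2 on symmetric x: r = {r_nonlinear:.3f}")
```

Because the x values are symmetric around zero, the positive and negative contributions cancel and r works out to zero, even though knowing x tells you y exactly. A scatter diagram of these points would show the U-shaped pattern that r misses.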

Full reference:

Explorable.com, Lyndsay T Wilson (May 2, 2009). Statistical Correlation. Retrieved Apr 24, 2024 from Explorable.com: https://explorable.com/statistical-correlation

The text in this article is licensed under the Creative Commons-License Attribution 4.0 International (CC BY 4.0).

This means you're free to copy, share and adapt any parts (or all) of the text in the article, as long as you give *appropriate credit* and *provide a link/reference* to this page.

That is it. You don't need our permission to copy the article; just include a link/reference back to this page. You can use it freely (with some kind of link), and we're also okay with people reprinting in publications like books, blogs, newsletters, course-material, papers, wikipedia and presentations (with clear attribution).

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 827736.
