Statistical correlation is a technique that tells us whether two variables are related.
For example, consider the variables family income and family expenditure. It is well known that income and expenditure increase or decrease together. Thus they are related in the sense that a change in one variable is accompanied by a change in the other.
Similarly, the price and demand of a commodity are related variables; when price increases, demand tends to decrease, and vice versa.
If a change in one variable is accompanied by a change in the other, then the variables are said to be correlated. We can therefore say that family income and family expenditure are correlated, as are price and demand.
Correlation can tell you something about the relationship between variables. It is used to understand whether the relationship is positive or negative, and how strong the relationship is.
Correlation is a powerful tool that provides both of these pieces of information.
In the case of family income and family expenditure, it is easy to see that they both rise or fall together in the same direction. This is called positive correlation.
In the case of price and demand, the change occurs in opposite directions, so that an increase in one is accompanied by a decrease in the other. This is called negative correlation.
Statistical correlation is measured by the coefficient of correlation (r). Its numerical value ranges from -1.0 to +1.0. Its sign indicates the direction of the relationship, and its magnitude indicates the strength.
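The text does not say which coefficient r denotes. The sketch below assumes the common Pearson product-moment coefficient, computed as the sum of products of deviations from the means divided by the square root of the product of the summed squared deviations. The function name `pearson_r` and all data values are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, made up for this example only.
income = [20, 30, 40, 50, 60]
expenditure = [18, 25, 33, 41, 50]
price = [1, 2, 3, 4, 5]
demand = [90, 75, 60, 50, 35]

r_income = pearson_r(income, expenditure)  # rises together: r is positive
r_price = pearson_r(price, demand)         # moves oppositely: r is negative
print(round(r_income, 3), round(r_price, 3))
```

With these made-up numbers, the income and expenditure pair gives a value close to +1 and the price and demand pair gives a value close to -1, matching the positive and negative cases described above.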
In general, r > 0 indicates a positive relationship, r < 0 indicates a negative relationship, and r = 0 indicates no linear relationship (this alone does not establish that the variables are independent, since they may still be related in a nonlinear way). Here r = +1.0 describes a perfect positive correlation and r = -1.0 describes a perfect negative correlation.
The closer the coefficient is to +1.0 or -1.0, the greater the strength of the relationship between the variables.
As a rule of thumb, the following guidelines on the strength of a relationship are often useful (though experts disagree somewhat on the choice of boundaries).
| Value of r | Strength of relationship |
|---|---|
| -1.0 to -0.5 or 0.5 to 1.0 | Strong |
| -0.5 to -0.3 or 0.3 to 0.5 | Moderate |
| -0.3 to -0.1 or 0.1 to 0.3 | Weak |
| -0.1 to 0.1 | None or very weak |
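The rule of thumb in the table can be sketched as a small lookup. The function name `strength_of_relationship` is hypothetical. Because the table's ranges share their endpoints, the sketch makes an assumption the source does not state: a boundary value such as 0.5 is assigned to the stronger category.

```python
def strength_of_relationship(r):
    """Map a correlation coefficient to the rule-of-thumb labels in the table."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.0 and +1.0")
    magnitude = abs(r)  # strength ignores the sign; the sign gives the direction
    # Assumption: each boundary value belongs to the stronger category.
    if magnitude >= 0.5:
        return "Strong"
    if magnitude >= 0.3:
        return "Moderate"
    if magnitude >= 0.1:
        return "Weak"
    return "None or very weak"

for value in (-0.72, 0.4, -0.2, 0.05):
    print(value, strength_of_relationship(value))
```

Since the boundaries themselves are a matter of convention, the cut-offs here are best treated as adjustable defaults rather than fixed rules.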
Correlation is appropriate only for examining the relationship between meaningful quantitative data (e.g. air pressure, temperature), not categorical data such as gender or favorite color.
While 'r' (correlation coefficient) is a powerful tool, it has to be handled with care.
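One reason for care, assuming r is the Pearson coefficient, is that it measures only linear association. The sketch below uses made-up data in which y is completely determined by x, yet r comes out as zero. The helper `pearson_r` is a hypothetical name, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: y depends entirely on x, but through a symmetric curve.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson_r(x, y)
print(r)  # the positive and negative deviations cancel out
```

Here the products of deviations on the left half of the curve cancel those on the right half, so a coefficient near zero does not by itself show that two variables are unrelated. Plotting the data alongside computing r is a common safeguard.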