
Linear regression analysis is a powerful technique used for predicting the unknown value of a variable from the known value of another variable.


More precisely, if X and Y are two related variables, then linear regression analysis helps us to predict the value of Y for a given value of X, or vice versa.

For example, the age of a human being and their maturity are related variables. Linear regression analysis can then predict the level of maturity given the age of a human being.

By linear regression, we mean here simple linear regression: models with just one independent and one dependent variable. The variable whose value is to be predicted is known as the dependent variable, and the one whose known value is used for prediction is known as the independent variable.

There are two lines of regression: that of Y on X and that of X on Y. The line of regression of Y on X is given by Y = a + bX, where a and b are unknown constants known as the intercept and the slope of the equation, respectively. It is used to predict the unknown value of variable Y when the value of variable X is known.

Y = a + bX
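As a minimal sketch of how a and b might be estimated, the snippet below uses the ordinary least-squares formulas, which are one common choice and are not spelled out in this article. The function name `fit_line` and the data values are made up purely for illustration.

```python
def fit_line(xs, ys):
    """Return least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


# Hypothetical observations of X and Y (illustrative values only).
xs = [10.0, 20.0, 30.0, 40.0, 50.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.0]

a, b = fit_line(xs, ys)

# Predict the unknown Y for a known X using Y = a + bX.
predicted_y = a + b * 35.0
print(a, b, predicted_y)
```

Once a and b have been estimated, a prediction is just a matter of plugging the known X into the equation.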

On the other hand, the line of regression of X on Y is given by X = c + dY, which is used to predict the unknown value of variable X using the known value of variable Y. Often, only one of these lines makes sense.

Exactly which of these is appropriate for the analysis at hand depends on which variable is labeled as dependent and which as independent in the problem being analyzed.
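The two lines are generally not the same line written two ways. The sketch below, assuming least-squares fitting and using made-up data, fits both Y on X and X on Y and shows that the slope d is not simply 1/b. The helper name `fit_line` is hypothetical.

```python
def fit_line(us, vs):
    """Least-squares (intercept, slope) for predicting v from u."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    suu = sum((u - mean_u) ** 2 for u in us)
    slope = suv / suu
    return mean_v - slope * mean_u, slope


# Illustrative data only.
xs = [10.0, 20.0, 30.0, 40.0, 50.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.0]

a, b = fit_line(xs, ys)  # Y on X:  Y = a + bX
c, d = fit_line(ys, xs)  # X on Y:  X = c + dY

# If the two lines coincided, d would equal 1/b. For least-squares
# fits the product b*d instead equals the squared correlation, which
# is 1 only when the points lie exactly on a straight line.
print("b =", b, " d =", d, " 1/b =", 1.0 / b, " b*d =", b * d)
```

This is why the choice of dependent variable has to be made before fitting rather than by rearranging a fitted equation afterwards.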

For example, consider two variables: crop yield (Y) and rainfall (X). Here, constructing the regression line of Y on X would make sense and would demonstrate the dependence of crop yield on rainfall. We would then be able to estimate crop yield given rainfall.

Careless use of linear regression analysis could mean constructing the regression line of X on Y instead, which would suggest the laughable scenario that rainfall depends on crop yield: if you grow really big crops, you will be guaranteed heavy rainfall.

The coefficient of X in the line of regression of Y on X is called the regression coefficient of Y on X. It represents the change in the value of the dependent variable (Y) corresponding to a unit change in the value of the independent variable (X).

For instance, if the regression coefficient of Y on X is 0.53, the predicted value of Y increases by 0.53 units when X increases by 1 unit. A similar interpretation can be given for the regression coefficient of X on Y.
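A small sketch of this interpretation follows. The slope 0.53 is the value from the text, while the intercept and the X values are hypothetical and chosen only to make the example concrete.

```python
a = 2.0   # hypothetical intercept (not from the article)
b = 0.53  # regression coefficient of Y on X, as in the text


def predict(x):
    """Predicted Y for a given X using Y = a + bX."""
    return a + b * x


# Increasing X by one unit changes the prediction by b,
# whatever the starting value of X and whatever the intercept.
change = predict(11.0) - predict(10.0)
print(change)
```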

Once a line of regression has been constructed, one can check how good it is (in terms of predictive ability) by examining the coefficient of determination (R²). For a least-squares line with an intercept, R² always lies between 0 and 1. Most statistical software reports it whenever a regression procedure is run.

R² - coefficient of determination
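One common way to compute the coefficient of determination is as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes a least-squares fit and uses made-up data. The function name `r_squared` is hypothetical.

```python
def r_squared(xs, ys):
    """Coefficient of determination for the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Variation left unexplained by the line.
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of Y about its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Illustrative data only.
xs = [10.0, 20.0, 30.0, 40.0, 50.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.0]

r2 = r_squared(xs, ys)
print(r2)
```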

The closer R² is to 1, the better the model fits the data and the better its predictions. A related question is whether the independent variable significantly influences the dependent variable. Statistically, this is equivalent to testing the null hypothesis that the regression coefficient is zero, which can be done using a t-test.
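The t-test for the regression coefficient can be sketched as follows, assuming the usual simple-regression setup in which the slope is divided by its standard error and compared with a t distribution on n - 2 degrees of freedom. The data are made up. The critical value 3.182 is the two-sided 5% point for 3 degrees of freedom taken from standard t tables, and the function name `slope_t_statistic` is hypothetical.

```python
import math


def slope_t_statistic(xs, ys):
    """t statistic for testing that the slope of ys on xs is zero."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Residual variance uses n - 2 degrees of freedom
    # because two parameters (a and b) were estimated.
    residual_variance = ss_res / (n - 2)
    standard_error_b = math.sqrt(residual_variance / sxx)
    return b / standard_error_b


# Illustrative data only (n = 5, so 3 degrees of freedom).
xs = [10.0, 20.0, 30.0, 40.0, 50.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.0]

t_stat = slope_t_statistic(xs, ys)

# Two-sided 5% critical value for 3 degrees of freedom (from t tables).
T_CRITICAL_DF3 = 3.182
reject_null = abs(t_stat) > T_CRITICAL_DF3
print(t_stat, reject_null)
```

In practice a statistics package would report the exact p-value, so the table lookup here is only a stand-in for that step.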

Linear regression does not test whether the relationship in the data is linear. It finds the slope and the intercept assuming that the relationship between the independent and dependent variable can be best explained by a straight line.

One can construct a scatter plot to check this assumption. If the scatter plot reveals a nonlinear relationship, a suitable transformation can often be used to attain linearity.
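As one illustration of such a transformation, the sketch below assumes data that follow an exponential curve, Y = 2·exp(0.5·X), generated without noise purely for demonstration. Taking the natural logarithm of Y turns the curve into a straight line, ln Y = ln 2 + 0.5·X, which a least-squares fit can then recover. The logarithm is only one possible choice, and the helper name `fit_line` is hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares (intercept, slope) for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Synthetic, noise-free exponential data (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

# Transform Y so that the relationship with X becomes linear.
log_ys = [math.log(y) for y in ys]
a, b = fit_line(xs, log_ys)

# Back-transform the intercept to recover the original scale factor.
scale = math.exp(a)
print(scale, b)
```

Predictions made on the transformed scale need to be converted back (here with `math.exp`) before they are reported in the original units.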

Full reference:

Explorable.com (Mar 7, 2009). Linear Regression Analysis. Retrieved May 15, 2022 from Explorable.com: https://explorable.com/linear-regression-analysis

The text in this article is licensed under the Creative Commons-License Attribution 4.0 International (CC BY 4.0).

This means you're free to copy, share and adapt any parts (or all) of the text in the article, as long as you give *appropriate credit* and *provide a link/reference* to this page.

That is it. You don't need our permission to copy the article; just include a link/reference back to this page. You can use it freely (with some kind of link), and we're also okay with people reprinting in publications like books, blogs, newsletters, course material, papers, Wikipedia and presentations (with clear attribution).


Thank you to...

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 827736.
