Created on: September 18, 2009
What is regression?
Regression refers to a collection of techniques for modeling one variable (the dependent variable, or DV) as a function of one or more other variables (the independent variables, or IVs). Different regression techniques should be applied to different types of DV. If the DV is a dichotomy (such as living vs. dead), the most common method is logistic regression. If the DV has multiple categories (e.g., Republican, Democrat, Independent), the usual method is either multinomial or ordinal logistic regression. If the DV is a count (such as the number of times something happens), there are Poisson regression and negative binomial regression. If the DV is a time to an event (such as time to death), there is a range of techniques known as survival analysis. There are other varieties too. But the most common type of DV is one that is continuous, or nearly so, such as weight, IQ, or income.
What is linear regression?
When the DV is a continuous variable, or nearly continuous, then by far the most common regression technique is linear regression, almost always fitted by ordinary least squares.
In fact, if you see the word "regression" used in a statistical context with nothing further specified, you can usually assume that linear regression is meant. However, linear regression, like many statistical techniques, is often applied inappropriately. In these notes, I will first explain what linear regression is for, then show how it works, and then explain when it is appropriate and when it is not. I will not cover how to actually do regression, but will give references to works that explain it.
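For readers who want a concrete picture of what "ordinary least squares" means, here is a minimal sketch. It is not a guide to doing regression in practice. It assumes Python with NumPy, and the heights and weights are made-up numbers used only for illustration; the variable names are likewise hypothetical.

```python
import numpy as np

# Hypothetical data, invented for illustration only:
# one IV (height in inches) and one continuous DV (weight in pounds).
height = np.array([60.0, 62.0, 65.0, 68.0, 70.0, 72.0])
weight = np.array([115.0, 120.0, 140.0, 155.0, 165.0, 180.0])

# Design matrix: a column of ones (for the intercept) next to the IV.
X = np.column_stack([np.ones_like(height), height])

# Ordinary least squares picks the intercept and slope that minimize
# the sum of squared differences between observed and fitted DV values.
coef, _, _, _ = np.linalg.lstsq(X, weight, rcond=None)
intercept, slope = coef

fitted = X @ coef            # the model's predicted weights
residuals = weight - fitted  # what the straight line leaves unexplained

print(f"weight is modeled as {intercept:.2f} + {slope:.2f} * height")
```

The slope is the modeled change in the DV for a one-unit change in the IV. The residuals are the part of the DV that the fitted line does not account for.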
As noted, linear regression is a technique for modeling a continuous, or nearly continuous, DV as a function of one or more IVs. A variable is continuous when it can take on any value in a specified range. It is nearly continuous when it can take on a great many values. Thus, height is a continuous variable, because (say) adult humans can take on any height between (say) 3 feet and 8 feet. IQ is a nearly continuous variable, because, while it can't be fractional, it can be any integer between (say) 40 and 200. The DV is the variable that you wish to explain, or model, or explore. The IVs are the variables that you think will help explain the DV. Note that while the DV must be continuous or nearly so, the IVs need not be. Methods exist for using IVs that are dichotomous, categorical, or continuous. Also note that we do not expect, or even desire, our model to