14 Introduction to regression models

Concepts

  • The slope and intercept we compute in a regression model are statistics calculated from the sample data. They are point estimates of the corresponding parameters, namely the slope and intercept in the "population". Hence they are subject to sampling variation, just like the other point estimates we’ve seen.
  • The predicted y-value given by the regression line can be seen as the mean of all the y-values that we could observe at that particular x-value (assuming the model is good). So the slope of the line is an estimate of the mean difference in Y for a one-unit increase in X. We call these “slope values” regression coefficients or beta coefficients.
  • The main distinction between the different types of regression model is the type of outcome each is used for (e.g., linear regression for a continuous outcome and logistic regression for a binary outcome). The regression coefficients in each therefore have a different interpretation.
  • Statistical inference procedures can be performed on the regression coefficients. In particular, we can test the null hypothesis that the population slope takes the value that describes no association (zero, in a linear regression). If this hypothesis is true, then our linear model is not "useful", in the sense that our explanatory variable does not help us explain the value of our response variable.
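
The ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "population" (intercept 1.0, slope 0.5, normal errors with standard deviation 2), the sample size, and the helper names `fit_line` and `draw_sample` are all made up for the example. It fits a least-squares line, computes the standard error of the slope and the test statistic for the null hypothesis that the population slope is zero, and then refits on repeated samples to show that the estimated slope varies from sample to sample.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x.

    Returns (a, b, se_b, t_b): intercept, slope, standard error of the
    slope, and the t statistic for H0: population slope = 0.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # point estimate of the slope
    a = mean_y - b * mean_x            # point estimate of the intercept
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    t_b = b / se_b                     # compare with a t distribution, n - 2 df
    return a, b, se_b, t_b


def draw_sample(n=30):
    """One sample from the assumed (hypothetical) population."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [1.0 + 0.5 * x + random.gauss(0, 2) for x in xs]
    return xs, ys


random.seed(1)

# One study: a single point estimate of the slope, with its test statistic.
a, b, se_b, t_b = fit_line(*draw_sample())
print(f"intercept={a:.3f} slope={b:.3f} se={se_b:.3f} t={t_b:.2f}")

# Many repeated studies: the slope estimate varies around the true value.
slopes = [fit_line(*draw_sample())[1] for _ in range(1000)]
print(f"slopes range from {min(slopes):.3f} to {max(slopes):.3f}, "
      f"mean {sum(slopes) / len(slopes):.3f}")
```

In a real analysis we only ever see one sample, so the spread of `slopes` is not observable directly; the standard error from a single fit is our estimate of it.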

Connections

  • Regression is probably the most explicit example of a statistical model. The regression model provides both a systematic component (y = a + bx) and a random component (errors).
  • Independent-samples t-tests can be carried out using a regression model, with the two groups coded as a binary explanatory variable.
  • All of the ideas of statistical inference that we have encountered apply to regression coefficients. A regression coefficient would vary on repeated independent sampling, so the coefficient from a single study is just a point estimate of the true association (population parameter) that we are trying to understand. Hence we use hypothesis tests and confidence intervals to try to understand the true underlying association/model, following exactly the same principles that we've learnt in statistical inference.
  • Regression models are subject to assumptions about the way the data have been collected, just like the simple comparisons. If the assumptions are not met, then our results may not be valid.
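
The link between the independent-samples t-test and regression can be checked numerically. The sketch below uses made-up measurements for two groups (`group_a` and `group_b` are hypothetical, as are the helper names). It codes group membership as x = 0 or 1 and fits y = a + bx by least squares. The intercept then equals the mean of the group coded 0, the slope equals the difference in group means, and the slope's t statistic matches the equal-variance (pooled) two-sample t statistic. The match is with the pooled test specifically, not the unequal-variance (Welch) version.

```python
import math

# Hypothetical outcome measurements for two independent groups
group_a = [5.1, 4.9, 6.2, 5.8, 5.5]
group_b = [6.8, 7.1, 6.0, 7.4, 6.6, 7.0]


def mean(values):
    return sum(values) / len(values)


def pooled_t(g0, g1):
    """Equal-variance two-sample t statistic for mean(g1) - mean(g0)."""
    n0, n1 = len(g0), len(g1)
    m0, m1 = mean(g0), mean(g1)
    ss0 = sum((v - m0) ** 2 for v in g0)
    ss1 = sum((v - m1) ** 2 for v in g1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))


def slope_t(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, t statistic for b)."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    return a, b, b / se_b


# Code group membership as a 0/1 explanatory variable.
xs = [0] * len(group_a) + [1] * len(group_b)
ys = group_a + group_b

a, b, t_reg = slope_t(xs, ys)
t_two_sample = pooled_t(group_a, group_b)

print(f"intercept={a:.3f} (mean of group A={mean(group_a):.3f})")
print(f"slope={b:.3f} (difference in means={mean(group_b) - mean(group_a):.3f})")
print(f"t from regression={t_reg:.4f}, t from pooled t-test={t_two_sample:.4f}")
```

Both statistics are compared with the same t distribution (n - 2 degrees of freedom), so the two approaches lead to the same conclusion.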