What Is Statistical Learning?

Suppose that we observe a quantitative response and different predictors, , we assume that there is some relationship between and , which can be written in the very general form: Here, is some fixed but unknown function of $, and is a random error term, which is independent of and has mean zero. In essence, statistical learning refers to a set of approaches for estimating .

Why Estimate ?

There are two main reasons that we may wish to estimate : prediction and inference.

Prediction

We can predict using: where represents our estimate for , and represents the resulting prediction for .

The accuracy of as a prediction for depends on two quantities, reducible error and irreducible error. Even if it were possible to form a perfect estimate for , so that our estimated response took the form , our prediction would still have some error in it! This is because is also a function of , which cannot be predicted using .

Assume for a moment that both and are fixed, then it's easy to show that: Where represents the average, or expected value, of the squared different between predicted and actual value of , and represents the variance associated with the error term .

The focus of this book is on techniques for estimating with the aim of minimizing the reducible error. Irreducible error will always provide an upper bound on the accuracy of our prediction for , this bound is almost always unknown in practice.

Inference

We are often interested in understanding that way that is affected as $ change. In this situation we wish to estimate , but our goal is not necessarily to make predictions for .

We instead want to understand the relationship between and , or more specially, to understand how changes as a function of $.

In this setting, one may be interested in answering the following questions:

  • Which predictors are associated with the response?
  • What is the relationship between the response and each predictor?
  • Can the relationship between and each predictor be adequately summarized using a linear equation, or is the relationship more complicated?

Depending on whether our ultimate goal is prediction, inference, or a combination of the two, different methods for estimating may be appropriate.

How Do We Estimate ?

Our goal is to apply a statistic learning method to the training data in order to estimate the unknown function . Broadly speaking, most statistic learning methods for this task can be characterized as either parametric or non-parametric.

Some Trade-Offs

  • Prediction accuracy versus interpretability -- Linear models are easy to interpret; thin-plate splines are not.
  • Good fit versus over-fit or under-fit -- How do we know when the fit is just right?
  • Parsimony versus black-box -- We often prefer simpler model involving fewer variables over a black-box predictor involving them all.

results matching ""

    No results matching ""