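Basis Expansions and Regularization
Based on Chapter 5 of Hastie, Tibshirani and Friedman (Prepared by David Madigan)

Basis Expansions for Linear Models
$f(X) = \sum_{m=1}^{M} \beta_m h_m(X)$
Here the $h_m$'s might be:
$h_m(X) = X_m$, $m = 1, \ldots, p$: recovers the original model
$h_m(X) = X_j^2$ or $h_m(X) = X_j X_k$
$h_m(X) = I(L_m \le X_k < U_m)$
[Figure label: "knots"]

Regression Splines
Bottom left panel uses the basis: $h_1(X) = 1$, $h_2(X) = X$, $h_3(X) = (X - \xi_1)_+$, $h_4(X) = (X - \xi_2)_+$
Number of parameters = (3 regions) × (2 params per region) - (2 knots × 1 constraint per knot) = 4
[Figure label: cubic spline]

Cubic Spline
Number of parameters = (3 regions) × (4 params per region) - (2 knots × 3 constraints per knot) = 6
Knot discontinuity essentially invisible to the human eye
[Figure label: continuous first and second derivatives]

Natural Cubic Spline
Adds a further constraint that the fitted function is linear beyond the boundary knots.
A natural cubic spline model with K knots $\xi_1 < \cdots < \xi_K$ is represented by K basis functions:
$N_1(X) = 1$, $N_2(X) = X$, $N_{k+2}(X) = d_k(X) - d_{K-1}(X)$ for $k = 1, \ldots, K-2$, where $d_k(X) = \dfrac{(X - \xi_k)_+^3 - (X - \xi_K)_+^3}{\xi_K - \xi_k}$
Each of these basis functions has zero 2nd and 3rd derivative outside the boundary knots.

Natural Cubic Spline Models
Can use these ideas in, for example, regression models.
For example, if you use 4 knots and hence 4 basis functions per predictor variable, then simply fit a logistic regression model with four times the number of predictor variables.

Smoothing Splines
Consider this problem: among all functions f(x) with two continuous derivatives, find the one that minimizes the penalized residual sum of squares:
$\mathrm{RSS}(f, \lambda) = \sum_{i=1}^{N} \{y_i - f(x_i)\}^2 + \lambda \int \{f''(t)\}^2 \, dt$
$\lambda$ is the smoothing parameter:
$\lambda = 0$: f can be any function that interpolates the data
$\lambda = \infty$: least squares line
Seems like there will be N features and presumably overfitting of the data. But the smoothing term shrinks the model towards the linear fit.
This is a generalized ridge regression. Can show that $\hat{\mathbf{f}} = (\mathbf{I} + \lambda \mathbf{K})^{-1} \mathbf{y}$, where $\mathbf{K}$ does not depend on $\lambda$.

Smoothing Splines
Theorem: The unique minimizer of this penalized RSS is a natural cubic spline with knots at the unique values of $x_i$, $i = 1, \ldots, N$.

Nonparametric Logistic Regression
Consider logistic regression with a single x:
$\log \dfrac{\Pr(Y = 1 \mid X = x)}{\Pr(Y = 0 \mid X = x)} = f(x)$
and a penalized log-likelihood criterion:
$\ell(f; \lambda) = \sum_{i=1}^{N} \left[ y_i f(x_i) - \log\left(1 + e^{f(x_i)}\right) \right] - \tfrac{1}{2} \lambda \int \{f''(t)\}^2 \, dt$
Again can show that the optimal f is a natural spline with knots at the data points.
Can use Newton-Raphson to do the fitting.

Thin-Plate Splines
The discussion up to this point has been one-dimensional. The higher-dimensional analogues of smoothing splines are "thin-plate splines." In 2-D, instead of minimizing:
$\sum_{i=1}^{N} \{y_i - f(x_i)\}^2 + \lambda \int \{f''(t)\}^2 \, dt$
minimize:
$\sum_{i=1}^{N} \{y_i - f(x_i)\}^2 + \lambda J[f]$, where $J[f] = \iint \left[ \left(\dfrac{\partial^2 f}{\partial x_1^2}\right)^2 + 2 \left(\dfrac{\partial^2 f}{\partial x_1 \partial x_2}\right)^2 + \left(\dfrac{\partial^2 f}{\partial x_2^2}\right)^2 \right] dx_1 \, dx_2$

Thin-Plate Splines
The solution has the form:
$f(x) = \beta_0 + \beta^T x + \sum_{j=1}^{N} \alpha_j h_j(x)$, with $h_j(x) = \|x - x_j\|^2 \log \|x - x_j\|$
a type of "radial basis function"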
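
To make the natural cubic spline slides concrete, here is a minimal sketch in Python with NumPy. It builds the K-column truncated-power basis for K knots, fits it by ordinary least squares, and checks numerically that the fit is linear beyond the last knot. The function name `natural_cubic_basis`, the simulated data, and the quantile-based knot placement are illustrative assumptions, not something taken from the slides.

```python
import numpy as np

def natural_cubic_basis(x, knots):
    """Truncated-power natural cubic spline basis: K knots give K columns.

    Hypothetical helper for illustration; columns are 1, x, and
    d_k(x) - d_{K-1}(x) for k = 1..K-2.
    """
    x = np.asarray(x, dtype=float)
    knots = np.sort(np.asarray(knots, dtype=float))
    K = len(knots)

    def d(k):
        # d_k(x) = [(x - xi_k)_+^3 - (x - xi_K)_+^3] / (xi_K - xi_k), 0-based k
        num = np.maximum(x - knots[k], 0.0) ** 3 - np.maximum(x - knots[-1], 0.0) ** 3
        return num / (knots[-1] - knots[k])

    cols = [np.ones_like(x), x]
    for k in range(K - 2):
        cols.append(d(k) - d(K - 2))  # d(K - 2) is d_{K-1} in 1-based notation
    return np.column_stack(cols)

# Simulated data (assumption: a noisy sine curve on [0, 1]).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

# Four knots, hence four basis functions, as in the example on the slides.
knots = np.quantile(x, [0.05, 0.35, 0.65, 0.95])
B = natural_cubic_basis(x, knots)

# Ordinary least squares on the expanded features.
beta, *_ = np.linalg.lstsq(B, y, rcond=None)
fitted = B @ beta

# Beyond the last knot the fit should be linear, so second differences
# on an evenly spaced grid should vanish up to rounding error.
x_out = np.linspace(knots[-1] + 0.5, knots[-1] + 1.5, 11)
second_diff = np.diff(natural_cubic_basis(x_out, knots) @ beta, n=2)
print(B.shape, np.max(np.abs(second_diff)))
```

The same basis matrix could be passed to a logistic regression routine in place of the raw predictor, which is the "four times the number of predictor variables" idea from the Natural Cubic Spline Models slide. The truncated-power form is chosen here for readability; it can be poorly conditioned with many knots, where a B-spline basis is usually preferred.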