I've been asked similar questions by several people about my paper:

Boosted Regression Active Shape Models
David Cristinacce and Tim Cootes, in Proceedings of the 18th BMVA British Machine Vision Conference, pages 890-899, Warwick, UK, September 2007.

This is an attempt to combine my notes and, hopefully, make the paper easier to understand.

-------------------------Note 1----------------------------------

There are some typos in the paper, in section 3.3 "Boost Feature Regression" on p3, specifically in Algorithm2.

Equation 2(c) should read:

2(c) F(x) <- F(x) + alpha * f_m(x)

where alpha is the "shrinkage" (aka "learning rate") parameter. To be consistent, equation 2(d) should be:

2(d) y_i <- y_i - alpha * f_m(x_i)

and the final regression output F(x) also needs multiplying by alpha.

---------------------------Note 2----------------------------------

Also in Algorithm2, the citation of ref 9 is not quite clear. Ref 9 is:

J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: a statistical view of boosting. The Annals of Statistics, 28:337-407, 2000.

It does discuss Algorithm2 in section 2.1/2.2, under the heading "extended additive models", but it does not explicitly state the algorithm in the same way as my paper.

Note that the classification method described in Algorithm1 of my paper is exactly the same as the GentleBoost classifier method in ref 9 (Algorithm4 in Friedman et al).

-------------------------Note 3---------------------------------------

In step 2(a) of the algorithm, the phrase "Fit the regression function fm(x) by least squares of yi to xi" probably isn't very clear.

What I mean is that each feature f_m has an associated Haar feature H_m. You compute the set of Haar responses {H_m(x_i)} over your training set and divide them into histogram bins (25 bins in my version). Then, for each bin, you calculate the average of the target values y_i that fall in it (i.e. the displacement distance associated with each image x_i).
Then the function f_m(x) makes a simple prediction: it takes an image x_i, applies the Haar filter H_m, computes the bin from the Haar response and outputs the average displacement for that particular bin.

You do the above for all possible Haar features and pick the one with the lowest squared displacement error (relative to the target values {y_i}). The target values y_i vary between stages, so the optimum histogram bin displacement values have to be recomputed at each stage.

-------------------------Note 4----------------------------------------

Training the local regression features takes a long time, so you might be tempted to train regression functions for the x and y displacements independently. However, in my experience you need to use all x+y displacements simultaneously (i.e. n^2 possible displacements instead of 2*n). Training two independent directions unfortunately does not seem to work.

------------------------Note 5-----------------------------------------

No source code - sorry
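
What follows is not the code used for the paper, only a minimal Python/NumPy sketch of the procedure described in Notes 1 and 3, to make the steps concrete. Several things here are assumptions made purely for illustration: the Haar responses are taken as a precomputed matrix H (one column per candidate feature) rather than computed from images, the target y is a single scalar displacement per sample, the 25 bins are equally spaced over the observed response range, and the function names, n_rounds and the default alpha are all hypothetical.

```python
import numpy as np

N_BINS = 25  # histogram bins per feature, as in Note 3


def fit_binned_regressor(responses, targets, n_bins=N_BINS):
    """Fit one weak regressor: bin the responses, store the mean target per bin."""
    # Interior bin edges, equally spaced over the response range (an assumption).
    edges = np.linspace(responses.min(), responses.max(), n_bins + 1)[1:-1]
    bins = np.digitize(responses, edges)  # bin index in 0..n_bins-1
    sums = np.bincount(bins, weights=targets, minlength=n_bins)
    counts = np.bincount(bins, minlength=n_bins)
    # Mean target per bin; empty bins predict zero displacement.
    means = np.divide(sums, counts, out=np.zeros(n_bins), where=counts > 0)
    error = np.sum((targets - means[bins]) ** 2)
    return edges, means, error


def train_boosted_regressor(H, y, n_rounds=50, alpha=0.1):
    """H: (n_samples, n_features) feature responses; y: (n_samples,) targets."""
    residual = y.astype(float).copy()
    stages = []
    for _ in range(n_rounds):
        # Step 2(a): try every feature, keep the lowest squared error.
        # The bin means are recomputed each round because the targets change.
        best = None
        for j in range(H.shape[1]):
            edges, means, error = fit_binned_regressor(H[:, j], residual)
            if best is None or error < best[3]:
                best = (j, edges, means, error)
        j, edges, means, _ = best
        stages.append((j, edges, means))
        # Step 2(d) with shrinkage: y_i <- y_i - alpha * f_m(x_i)
        residual -= alpha * means[np.digitize(H[:, j], edges)]
    return stages


def predict(stages, H, alpha=0.1):
    """Step 2(c) accumulated over all stages: F(x) = alpha * sum_m f_m(x)."""
    out = np.zeros(H.shape[0])
    for j, edges, means in stages:
        out += alpha * means[np.digitize(H[:, j], edges)]
    return out
```

The same alpha must be passed to predict as was used for training. Note 4 is not modelled here: the sketch says nothing about how the training displacements are sampled.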