p. 11, eq (2.9), the third expression is incorrect. The first, second and fourth expressions are equal. (thanks to Kevin S. Van Horn)
p. 12, line 16, typo: "setting Z^{-1}=\Sigma_p^2", should be "setting Z^{-1}=\Sigma_p". (thanks to Mikhail Parakhin)
p. 28, eq (2.41), top line, typo: the term "K_*K_y^{-1}(y - H \bar\beta)" should be "K_*^T K_y^{-1}(y - H^T \bar\beta)", i.e. transposes on both "K_*" and "H" are missing. (thanks to Mikhail Parakhin)
p. 50, algorithm 3.3, lines 15 and 16, the term "+\sum_i\log(...)" has the wrong sign, the correct term is "-\sum_i\log(...)" in both lines 15 and 16. (thanks to Chris Mansley)
p. 51, algorithm 3.4, line 11, two occurrences of "R" should be "R_c". A simpler expression is "c := E_c(M^T\(M\b))". (thanks to Yang Song and Chris Mansley)
p. 56, above eq. (3.61) "the role of W" should be "the role of W^{-1}".
p. 59, penultimate line, replace "In eq.(3.65) we have factored" with "In eq.(3.73) we have factored". (thanks to Baback Moghaddam)
p. 61, figure caption, the final two lines should read "The maximum value attained is 0.84, and the minimum is 0.19."
p. 68, Figure 3.9, caption, third line: Figure 3.7(a) should read Figure 3.8(a). (thanks to Baback Moghaddam)
p. 95, line 14 just under eq. (4.35) typo: "k(x,x')" should be "\tilde k(x,x')", i.e. there is a tilde missing over the "k".
p. 97, the eigenfunction given in eq. (4.40) is not normalized. It can be shown using eq 7.374.1 in Gradshteyn and Ryzhik (1980) that "\phi_k(x) = h_k \exp(-(c-a)x^2) H_k(\sqrt{2c}x)" is normalized wrt p(x) by setting "h_k^{-2} = \sqrt{a/c} 2^k k!". (thanks to Christian Walder)
p. 102, eq (4.46), \phi_{\theta}(x) should be \phi^T_{\theta}(x). (thanks to Baback Moghaddam)
p. 121, eq (5.18), the Kronecker delta, \delta_{xx'} should really be on the indexes (not the values of x), i.e. \delta_{pq} as in eq (2.20). (thanks to Aki Vehtari)
p. 125, eq (5.24), the matrix "B^{-1}" in the right-hand side is incorrect. Instead of "B^{-1}" it should be "(I+KW)^{-1}".
p. 126, caption for Algorithm 5.1, 4th line: "line 11" should be "line 12". (thanks to Baback Moghaddam)
p. 126, line -6: "B^{-1}=(I+W^\frac{1}{2}KW^\frac{1}{2})^{-1}" is incorrect. It should be "(I+KW)^{-1}".
p. 127, eq (5.26) and (5.27), in both equations, the left hand side should be the partial derivative of the log of Z_{EP}, not Z_{EP} itself. (thanks to Baback Moghaddam and Jurgen Van Gael)
p. 127, eq (5.26), first line, first term in rhs, \Sigma should be \tilde{\Sigma}. (thanks to Baback Moghaddam)
p. 137, both occurrences of 4 \pi should be 4 \pi^2. (thanks to Baback Moghaddam)
p. 139, eq (6.30), Z(u) should be Z(u) du. (thanks to Baback Moghaddam)
p. 144, eqs (6.37) and (6.38), both equations are missing a transpose on the first term's first f vector. (thanks to Baback Moghaddam)
p. 148, eq (6.43), replace both occurrences of p(y_i|X, y, \theta) with p(y_i|X, y_{-i}, \theta). (thanks to Baback Moghaddam)
p. 153, eq (7.9), rhs final denominator should be 1 + S^{-1}_f (s) \sigma^2_n/ \rho. (thanks to Baback Moghaddam)
p. 158, third line, replace J < \infty with KL_{sym} < \infty. (thanks to Baback Moghaddam)
p. 160, eq. (7.23), third line, last term k_1(x)^T should read k_1(x_*)^T. (thanks to Baback Moghaddam)
p. 160, above eq. (7.26): eq. (7.26) is a lower bound on the generalization error, not (as stated) an upper bound. (thanks to Benjamin Sobotta)
p. 165, eq (7.34), first term is missing a trailing | (for det) and the third term is missing a tr(). (thanks to Baback Moghaddam)
p. 182 and 184, sec. 8.3.7: Unfortunately there was an error in the scripts that meant that the noise variance was added in twice when computing the predictive variance for the PP runs; this affected the PP results for MSLL (but not SMSE) in Table 8.1 and Figure 8.1(b). This pdf file gives corrected versions of the Table and plot. Notice now that there is not much difference in performance between the SR, PP and BCM methods for various sizes of m on this problem, and that they all outperform the SD method.
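To illustrate why the p. 182 error changed MSLL but left SMSE untouched, here is a minimal sketch on synthetic data. It assumes the usual definitions: SMSE is the mean squared error divided by the test-target variance, and MSLL is the negative log predictive density minus that of a trivial Gaussian fitted to the training targets. The data, the value of `noise_var`, and all function names are hypothetical and are not taken from the original scripts.

```python
import numpy as np

def smse(y_true, mean_pred):
    """Standardized MSE: depends only on the predictive mean."""
    return np.mean((y_true - mean_pred) ** 2) / np.var(y_true)

def msll(y_true, mean_pred, var_pred, y_train):
    """Mean standardized log loss: depends on the predictive variance too."""
    nll_model = (0.5 * np.log(2 * np.pi * var_pred)
                 + (y_true - mean_pred) ** 2 / (2 * var_pred))
    # Trivial baseline: Gaussian with the training targets' mean and variance.
    mu0, var0 = np.mean(y_train), np.var(y_train)
    nll_trivial = (0.5 * np.log(2 * np.pi * var0)
                   + (y_true - mu0) ** 2 / (2 * var0))
    return np.mean(nll_model - nll_trivial)

rng = np.random.default_rng(0)
noise_var = 0.1    # hypothetical noise variance
latent_var = 0.01  # hypothetical predictive variance of the latent function

# Synthetic targets and a synthetic predictive mean (illustration only).
y_train = rng.normal(size=200) + rng.normal(scale=np.sqrt(noise_var), size=200)
f_test = rng.normal(size=500)
y_test = f_test + rng.normal(scale=np.sqrt(noise_var), size=500)
mean_pred = f_test + rng.normal(scale=np.sqrt(latent_var), size=500)

var_correct = np.full(500, latent_var + noise_var)  # noise added once
var_double = var_correct + noise_var                # noise added twice (the bug)

# SMSE never sees the variance, so it is the same in both cases.
smse_value = smse(y_test, mean_pred)
msll_correct = msll(y_test, mean_pred, var_correct, y_train)
msll_double = msll(y_test, mean_pred, var_double, y_train)

print("SMSE:", smse_value)
print("MSLL, noise added once: ", msll_correct)
print("MSLL, noise added twice:", msll_double)
```

Only the variance argument differs between the two MSLL calls, which mirrors why the SMSE column of Table 8.1 did not need correcting.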