Gaussian Processes

In finite dimensions, the Gaussian distribution is defined by the density function

$$
p(x) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right),
$$

where $\mu = \mathbb{E}[x] \in \mathbb{R}^d$ is the mean and $\Sigma = (\operatorname{Cov}(x_i, x_j))_{i,j=1}^d$ is the covariance matrix.
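A minimal numerical sketch of this density, with an illustrative mean vector and covariance matrix (the values below are assumptions, not taken from the text), evaluated both directly from the formula and via SciPy for comparison:

```python
# Minimal sketch: evaluating the d-dimensional Gaussian density above.
# The mean `mu` and covariance `Sigma` are illustrative choices.
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 1.0])                      # mean vector (d = 2)
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])                 # symmetric positive definite covariance

x = np.array([0.5, 0.5])

# density computed directly from the formula ...
d = len(mu)
diff = x - mu
direct = np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) / (
    (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma))
)

# ... and via scipy for comparison
print(direct, multivariate_normal(mean=mu, cov=Sigma).pdf(x))
```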

Definition. Let $X$ be a set, $m: X \to \mathbb{R}$ a mean function, and $k: X \times X \to \mathbb{R}$ a covariance function.

  • A random function $f: X \to \mathbb{R}$ (equivalently written $f: \Omega \times X \to \mathbb{R}$) is called a Gaussian process with mean $m$ and covariance $k$ if for any choice of points $D = \{x_1, \dots, x_n\} \subset X$, the values $f(x_1), \dots, f(x_n)$ are jointly Gaussian distributed with mean vector $(m(x_1), \dots, m(x_n))$ and covariance matrix $(k(x_i, x_j))_{i,j=1}^n$.

    $\mathbb{E}[f(x_i)] = m(x_i), \qquad \operatorname{Cov}(f(x_i), f(x_j)) = k(x_i, x_j).$

  • The covariance matrix has to be symmetric positive semi-definite for any $D \subset X$.

  • Conversely, for any $m: X \to \mathbb{R}$ and any positive definite kernel $k: X \times X \to \mathbb{R}$, there is a Gaussian process with these characteristics, which we denote by $\mathcal{GP}(m, k)$; see the sampling sketch after this list.
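To make the finite-dimensional marginals concrete, here is a minimal sketch that fixes a finite set of points $D$, builds the mean vector and covariance matrix, and draws one joint sample of $(f(x_1), \dots, f(x_n))$. The zero mean and squared-exponential kernel are assumed illustrative choices, not part of the definition above.

```python
# Minimal sketch: drawing a sample of (f(x_1), ..., f(x_n)) for a GP(m, k).
# The zero mean and squared-exponential kernel are assumed examples.
import numpy as np

def m(x):
    return np.zeros_like(x)                    # mean function m(x) = 0

def k(x, y, lengthscale=1.0):
    # squared-exponential kernel k(x, y) = exp(-(x - y)^2 / (2 l^2))
    return np.exp(-0.5 * (x - y) ** 2 / lengthscale ** 2)

# finite set of input points D = {x_1, ..., x_n}, here X = R
xs = np.linspace(-3, 3, 50)

mean = m(xs)                                   # (m(x_1), ..., m(x_n))
K = k(xs[:, None], xs[None, :])                # (k(x_i, x_j))_{i,j}

# jointly Gaussian finite-dimensional marginal; small jitter keeps K numerically PSD
rng = np.random.default_rng(0)
sample = rng.multivariate_normal(mean, K + 1e-9 * np.eye(len(xs)))
```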

Covariance function (kernel) $k$

Take $x, y \in X$, $x \neq y$. Then $\operatorname{Cov}(f(x), f(y)) = k(x, y)$.

Set $m = 0$ for simplicity. Then $(f(x), f(y))^T \in \mathbb{R}^2$ is Gaussian with mean zero and covariance matrix

$$
\begin{pmatrix} k(x,x) & k(x,y) \\ k(y,x) & k(y,y) \end{pmatrix}.
$$
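Numerically, for two fixed inputs this $2 \times 2$ matrix can be formed directly and read off as a correlation between $f(x)$ and $f(y)$; the squared-exponential kernel and the input values below are again assumed illustrative choices.

```python
# Minimal sketch: the 2x2 covariance matrix of (f(x), f(y)) for two inputs,
# using an assumed squared-exponential kernel with lengthscale 1.
import numpy as np

def k(x, y):
    return np.exp(-0.5 * (x - y) ** 2)

x, y = 0.0, 1.5
C = np.array([[k(x, x), k(x, y)],
              [k(y, x), k(y, y)]])

# correlation between f(x) and f(y); it decays as |x - y| grows
corr = C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])
print(C, corr)
```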

GP Regression

Given data $(x_i, y_i)_{i=1}^N$, the goal is to find a function $f: X \to \mathbb{R}$ with $f(x_i) \approx y_i$.

In the Bayesian approach, we place a GP prior on $f$, i.e. $f \sim \mathcal{GP}(m, k)$.
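As a sketch of where this heads (assuming the standard Gaussian noise model $y_i = f(x_i) + \varepsilon_i$ with $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$, a zero-mean prior, and an illustrative squared-exponential kernel, none of which have been fixed above), the posterior mean and covariance of $f$ at test inputs can be computed as follows. The data values and noise level are placeholders.

```python
# Minimal sketch of GP regression: posterior mean/variance at test inputs,
# assuming a zero-mean prior, squared-exponential kernel, and Gaussian noise.
import numpy as np

def k(a, b):
    # squared-exponential kernel on 1D inputs, returned as a matrix
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

# given data (x_i, y_i), i = 1..N (illustrative values)
X = np.array([-2.0, -0.5, 1.0, 2.5])
y = np.sin(X)
sigma2 = 0.01                                   # assumed noise variance

Xs = np.linspace(-3, 3, 100)                    # test inputs

K = k(X, X) + sigma2 * np.eye(len(X))           # K(X, X) + sigma^2 I
Ks = k(Xs, X)                                   # K(X*, X)
Kss = k(Xs, Xs)                                 # K(X*, X*)

alpha = np.linalg.solve(K, y)
post_mean = Ks @ alpha                          # E[f(x*) | data]
post_cov = Kss - Ks @ np.linalg.solve(K, Ks.T)  # Cov[f(x*) | data]
post_var = np.diag(post_cov)
```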