In finite dimensions, the Gaussian distribution on \(\mathbb{R}^d\) is defined by the density function
\[p(x) = \frac{1}{(2\pi)^{d/2}|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right)\]
where \(\mu\in\mathbb{R}^d\) is the mean vector and \(\Sigma = (\mathrm{Cov}(x_i,x_j))_{i,j=1}^d\) is the (symmetric positive definite) covariance matrix.
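A minimal sketch of this density in code; the mean \(\mu\), covariance \(\Sigma\), and query point below are illustrative choices, not taken from the text.

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])          # symmetric positive definite
x = np.array([0.5, 0.5])

# Density via the closed-form expression above
d = len(mu)
diff = x - mu
p = np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) / (
    (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma)))

# Same value via scipy, as a sanity check
assert np.isclose(p, multivariate_normal(mean=mu, cov=Sigma).pdf(x))
```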
Definition. Let \(X\neq \varnothing\) be a set, let \(m:X\to \mathbb{R}\) be a mean function, and let \(k:X\times X\to \mathbb{R}\) be a covariance function.
A random function \(f:X\to \mathbb{R}\) (equivalently, \(f:\Omega\times X\to \mathbb{R}\)) is called a Gaussian process with mean \(m\) and covariance \(k\) if for any finite set of points \(\mathcal{D}=\{x_1,\dots,x_n\}\subset X\), the vector \((f(x_1),\dots,f(x_n))\) is jointly Gaussian with mean \((m(x_1),\dots,m(x_n))\) and covariance matrix \((k(x_i,x_j))_{i,j=1}^n\).
\[\mathbb{E}f(x_i) = m(x_i),\quad \mathrm{Cov}(f(x_i),f(x_j)) = k(x_i,x_j).\]
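The definition is directly computable: at any finite set of points, the GP reduces to a multivariate Gaussian that we can sample. The squared-exponential kernel below is one illustrative choice of \(k\), not prescribed by the text.

```python
import numpy as np

def k(x, y, lengthscale=1.0):
    # illustrative squared-exponential covariance function
    return np.exp(-0.5 * (x - y) ** 2 / lengthscale ** 2)

m = lambda x: 0.0                       # zero mean function

xs = np.linspace(0.0, 5.0, 50)          # points x_1, ..., x_n
mean = np.array([m(x) for x in xs])
K = np.array([[k(xi, xj) for xj in xs] for xi in xs])

# One draw of (f(x_1), ..., f(x_n)) ~ N(mean, K); the jitter term
# stabilizes the factorization of K numerically
rng = np.random.default_rng(0)
sample = rng.multivariate_normal(mean, K + 1e-10 * np.eye(len(xs)))
```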
The covariance matrix \((k(x_i,x_j))_{i,j=1}^n\) has to be symmetric positive semi-definite for every finite \(\mathcal{D}\subset X\).
Conversely, for any \(m:X\to \mathbb{R}\) and any positive semi-definite kernel \(k:X\times X\to \mathbb{R}\), there exists a Gaussian process with these characteristics (by the Kolmogorov extension theorem), which we denote by \(\mathrm{GP}(m,k)\).
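A small numerical check of this requirement, using the same illustrative kernel: the kernel matrix built from any finite \(\mathcal{D}\) should be symmetric with nonnegative eigenvalues.

```python
import numpy as np

def k(x, y):
    return np.exp(-0.5 * (x - y) ** 2)

D = np.array([0.0, 0.3, 1.0, 2.5])      # a finite subset of X
K = np.array([[k(xi, xj) for xj in D] for xi in D])

assert np.allclose(K, K.T)              # symmetry
eigvals = np.linalg.eigvalsh(K)
assert np.all(eigvals >= -1e-12)        # PSD, up to round-off
```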
Take \(x, y\in X\) with \(x\neq y\); then \(\mathrm{Cov}(f(x),f(y)) = k(x,y)\).
Set \(m=0\) for simplicity. Then \((f(x),f(y))^T \in \mathbb{R}^2\) is Gaussian with mean zero and covariance matrix
\[\begin{pmatrix} k(x,x) & k(x,y) \\ k(y,x) & k(y,y) \end{pmatrix}\]
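For concreteness, with the same illustrative kernel the \(2\times 2\) covariance matrix above shows how nearby points are strongly correlated.

```python
import numpy as np

def k(x, y):
    return np.exp(-0.5 * (x - y) ** 2)

x, y = 0.0, 0.5
C = np.array([[k(x, x), k(x, y)],
              [k(y, x), k(y, y)]])
corr = C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])
print(C, corr)   # corr ≈ 0.88 for |x - y| = 0.5
```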
Given data \((x_i, y_i)_{i=1}^N\), the goal is to find a function \(f:X\to \mathbb{R}\) with \(f(x_i)\approx y_i\).
In the Bayesian approach, we place a prior on the unknown function, \(f\sim \mathrm{GP}(m,k)\), and condition on the observed data to obtain a posterior distribution over functions.
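A hedged sketch of this conditioning step, using the standard GP regression formulas for observations \(y_i = f(x_i) + \varepsilon_i\) with Gaussian noise; the kernel, noise scale \(\sigma\), and toy data below are all illustrative assumptions.

```python
import numpy as np

def k(a, b):
    # illustrative squared-exponential kernel on 1-d inputs
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

X = np.array([-2.0, -1.0, 0.5, 2.0])    # observed inputs x_i
y = np.sin(X)                           # observed targets y_i (toy data)
Xs = np.linspace(-3, 3, 100)            # test points
sigma = 0.1                             # assumed noise scale

K = k(X, X) + sigma ** 2 * np.eye(len(X))
Ks = k(X, Xs)
Kss = k(Xs, Xs)

# Posterior of f at the test points, conditioned on (X, y):
#   mean = Ks^T (K + sigma^2 I)^{-1} y
#   cov  = Kss - Ks^T (K + sigma^2 I)^{-1} Ks
alpha = np.linalg.solve(K, y)
post_mean = Ks.T @ alpha
post_cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
```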