Given a vector field \(v\), \(\nabla_v\) is an operator defined on the following spaces
\[ \nabla_v : TM\to TM, \quad\nabla_v : \mathcal{O}_M\to \mathcal{O}_M\]
These operators are completely determined by linearity in \(v\), the Leibniz rule, and the two requirements \(\nabla_v f = \frac{\partial f}{\partial v}\) and \(\nabla_{e_j}e_i = \Gamma_{ij}^k e_k\).
By extending tensorially (by the Leibniz rule) and dually (by the contravariant Hom functor), we can define the covariant derivative on any tensor product of copies of \(TM\) and \(T^*M\), in particular on \(\mathrm{End}(TM)=TM\otimes T^*M\).
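As a sketch of the two extensions, assume (as is usual, though it is not spelled out above) that the pairing \(\omega(u)\) and the tensor product both satisfy the Leibniz rule. Then for a covector field \(\omega\) and vector fields \(u, v\),
\[ (\nabla_v\omega)(u) = \nabla_v(\omega(u)) - \omega(\nabla_v u), \qquad \nabla_v(a\otimes b) = \nabla_v a\otimes b + a\otimes \nabla_v b.\]
Applied to the dual frame \(dx^k\) of a coordinate frame \(e_i\), the first formula gives
\[ (\nabla_{e_j}dx^k)(e_i) = e_j(\delta^k_i) - dx^k(\Gamma_{ij}^l e_l) = -\Gamma_{ij}^k, \quad\text{so}\quad \nabla_{e_j}dx^k = -\Gamma_{ij}^k\, dx^i.\]
The minus sign here is the same one that shows up later when differentiating a dual slot.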
There may be many choices of connection (covariant derivative), but on a Riemannian manifold there is a unique canonical choice, the Levi-Civita connection, characterized by being torsion-free and metric-compatible.
If we leave the argument \(v\) of \(\nabla_v\) open, then \(\nabla\) becomes an operator that attaches a covector factor to any tensor bundle \(V\):
\[\nabla : V \to T^*M\otimes V.\]
In coordinates, for example for \(V=TM\), we have the formula
\[ \nabla : TM\to TM\otimes T^*M, \quad\nabla u = \sum_j (\partial_j u^i\, e_i + u^i\Gamma_{ij}^k e_k) \otimes dx^j,\]
where \(\partial_j u^i\) is the ordinary partial derivative of the component \(u^i\).
Notice that on \(TM\otimes T^*M = \mathrm{End}(TM)\) there is a natural trace map, which gives the composite \(TM\xrightarrow{\nabla} \mathrm{End}(TM)\xrightarrow{\mathrm{tr}} \mathcal{O}_M\)
\[ u\mapsto \partial_j u^j + u^i\Gamma_{ij}^j, \]
and this is the divergence operator on vector fields!
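As a quick check of this formula, consider the flat plane in polar coordinates \((r,\theta)\) with its Levi-Civita connection. This example and its Christoffel symbols are not taken from the text above; they are the standard ones, written in the index convention \(\nabla_{e_j}e_i = \Gamma_{ij}^k e_k\) used here. The nonzero symbols are
\[ \Gamma_{r\theta}^{\theta} = \Gamma_{\theta r}^{\theta} = \frac{1}{r}, \qquad \Gamma_{\theta\theta}^{r} = -r,\]
so the only nonzero contraction \(\Gamma_{ij}^j\) is \(\Gamma_{rj}^j = 1/r\), and for \(u = u^r\partial_r + u^\theta\partial_\theta\) the trace becomes
\[ \mathrm{div}\, u = \partial_r u^r + \partial_\theta u^\theta + \frac{u^r}{r} = \frac{1}{r}\partial_r(r u^r) + \partial_\theta u^\theta,\]
which is the familiar polar divergence expressed in the coordinate frame.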
The map \(\nabla: TM\to \mathrm{End}(TM)\) means that the covariant derivative makes \(TM\) a \(TM\)-module! (\(TM\) is not a ring, though.)
We can try to extend the module structure to an actual ring, the tensor ring \(T^\bullet(TM) = \bigoplus_n (TM)^{\otimes n}\), so that \(TM\) becomes a \(T^\bullet(TM)\)-module. (Curious: if this module does not have to give up things like curvature, why do \(D\)-modules have to give them up and work with flat connections only? We will see.)
How should we define the second covariant derivative \(\nabla^2\)? There are two candidates.
The first is the iterated covariant derivative \(\nabla_Y\circ \nabla_X\), with both vector arguments left open:
\[ \nabla^2 : V \to T^*M\otimes T^*M \otimes V.\]
This one is just \(\nabla^2_{Y,X} = \nabla_Y \circ \nabla_X\).
The second is the composite \(V\to T^*M \otimes V \to T^*M\otimes (T^*M\otimes V)\), where the second arrow differentiates the whole of \(T^*M \otimes V\):
\[ \nabla^2 : V \to T^*M\otimes T^*M \otimes V.\]
This is a bit different. The first arrow maps \(u\mapsto \nabla_\square u\), and in the second arrow we need to differentiate both the slot \(\square\) and \(u\). Because \(\square\) is a dual slot, differentiating it contributes a term with a negative sign, in which \(\nabla_Y\) is applied to the argument of the slot. We get
\[ \nabla_Y (\nabla_\square u) = (\nabla_Y\circ \nabla_\square)u - \nabla_{\nabla_Y \square} u,\]
therefore
\[ \nabla^2_{Y,X} = \nabla_Y \nabla_X - \nabla_{\nabla_Y X}.\]
It seems that people prefer the second definition.
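One likely reason, offered here as a sketch rather than as a stated motivation: the second definition is linear over functions in \(X\), while the first is not. For a function \(f\),
\[\begin{align*} \nabla_Y\nabla_{fX} &= Y(f)\nabla_X + f\,\nabla_Y\nabla_X,\\ \nabla_{\nabla_Y (fX)} &= Y(f)\nabla_X + f\,\nabla_{\nabla_Y X}, \end{align*}\]
so the terms \(Y(f)\nabla_X\) cancel in the difference:
\[ \nabla_Y\nabla_{fX} - \nabla_{\nabla_Y(fX)} = f\left(\nabla_Y\nabla_X - \nabla_{\nabla_Y X}\right).\]
Hence \(\nabla^2_{Y,X}\) in the second sense depends only on the values of \(X\) and \(Y\) at a point, as a section of \(T^*M\otimes T^*M\otimes V\) should.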
The Riemann curvature tensor is defined as
\[ R(X,Y) = [\nabla_X, \nabla_Y] - \nabla_{[X,Y]} = \nabla^2_{X,Y} - \nabla^2_{Y,X},\]
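which measures the difference between taking the second derivative in the two possible orders. The second equality holds for a torsion-free connection, such as the Levi-Civita connection, and it shows that the second definition of \(\nabla^2\) exhibits this interpretation directly.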
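Expanding both second derivatives with \(\nabla^2_{X,Y} = \nabla_X\nabla_Y - \nabla_{\nabla_X Y}\) shows exactly where torsion-freeness enters. Writing \(T(X,Y) = \nabla_X Y - \nabla_Y X - [X,Y]\) for the torsion (a symbol introduced here only for this check),
\[\begin{align*} \nabla^2_{X,Y} - \nabla^2_{Y,X} &= \nabla_X\nabla_Y - \nabla_Y\nabla_X - \nabla_{\nabla_X Y - \nabla_Y X}\\ &= [\nabla_X,\nabla_Y] - \nabla_{[X,Y]} - \nabla_{T(X,Y)}. \end{align*}\]
When \(T = 0\) the last term drops out and the antisymmetrized second derivative is exactly \(R(X,Y)\).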
Given any vector bundle \(V\) on \(M\), one can similarly define a connection as a map
\[ \nabla : V \to \Omega^1_M \otimes V. \]
Starting here we will write \(\Omega^1_M\) instead of \(T^*M\). Imagine them as the same thing!
Taking iterated covariant derivatives should give us a sequence of maps
\[ V \to \Omega^1 \otimes V \to \Omega^1 \otimes (\Omega^1 \otimes V) \to \Omega^1\otimes (\Omega^1\otimes (\Omega^1\otimes V)) \to \cdots \]
For example, the second derivative is given by \(\nabla^2_{X,Y}= \nabla_X \nabla_Y - \nabla_{\nabla_X Y}\), and
\[ \nabla^3_{X,Y,Z} = \nabla_X\nabla_Y\nabla_Z - \nabla_X\nabla_{\nabla_Y Z} -\nabla_{\nabla_X Y}\nabla_Z - \nabla_Y \nabla_{\nabla_X Z} + \nabla_{\nabla_{\nabla_X Y} Z} + \nabla_{\nabla_Y\nabla_X Z}.\]
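The third derivative can be obtained from the same slot-differentiation rule used for \(\nabla^2\): differentiate the composite, then subtract one correction for each open slot. As a sketch,
\[ \nabla^3_{X,Y,Z} = \nabla_X\left(\nabla^2_{Y,Z}\right) - \nabla^2_{\nabla_X Y,\, Z} - \nabla^2_{Y,\, \nabla_X Z},\]
where the three pieces expand as
\[\begin{align*} \nabla_X\left(\nabla^2_{Y,Z}\right) &= \nabla_X\nabla_Y\nabla_Z - \nabla_X\nabla_{\nabla_Y Z},\\ \nabla^2_{\nabla_X Y,\, Z} &= \nabla_{\nabla_X Y}\nabla_Z - \nabla_{\nabla_{\nabla_X Y} Z},\\ \nabla^2_{Y,\, \nabla_X Z} &= \nabla_Y\nabla_{\nabla_X Z} - \nabla_{\nabla_Y\nabla_X Z}. \end{align*}\]
A useful sanity check is to replace \(Y\) by \(fY\): every term containing \(X(f)\) must cancel, since the result has to be linear over functions in each slot.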
In order to relate the above construction, induced from the connection, to the de Rham complex and cohomology theories, we can define another complex similar to the above sequence, with spaces \(\Omega^k_M \otimes V\) in place of \((\Omega^1)^{\otimes k}\otimes V\).
To do this, we recall the following constructions on vector spaces:
\(\wedge^k V \xhookrightarrow{\iota_k} V^{\otimes k}\), the inclusion of antisymmetric tensors into all tensors, given by
\[\iota_k(v_1\wedge\cdots\wedge v_k) = \sum_{\sigma\in S_k} \mathrm{sgn}(\sigma) v_{\sigma(1)}\otimes\cdots\otimes v_{\sigma(k)}.\]
\(V^{\otimes k} \xrightarrow{\alpha_k} \wedge^k V\), the projection of all tensors onto antisymmetric tensors, given by
\[\alpha_k(v_1\otimes\cdots\otimes v_k) = \frac{1}{k!} v_1\wedge\cdots\wedge v_k.\]
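We have \(\alpha_k\circ \iota_k = \mathrm{id}\), so \(\alpha_k\) is a split surjection and \(V^{\otimes k} = \ker\alpha_k \oplus \wedge^k V\); for \(k=2\) the kernel is \(\mathrm{Sym}^2 V\), and the sequence \(0\to \mathrm{Sym}^2 V \to V^{\otimes 2} \xrightarrow{\alpha_2} \wedge^2 V \to 0\) is split exact. Therefore we can place a second row of spaces \(\Omega^k\otimes V\) under the sequence above, with vertical maps \(\alpha_k\otimes \mathrm{id}\) and horizontal maps induced from the first row:
\[\begin{array}{ccccccc} V & \xrightarrow{\nabla} & \Omega^1\otimes V & \xrightarrow{\nabla} & (\Omega^1)^{\otimes 2}\otimes V & \xrightarrow{\nabla} & \cdots \\ \downarrow{\scriptstyle \alpha_0} & & \downarrow{\scriptstyle \alpha_1} & & \downarrow{\scriptstyle \alpha_2} & & \\ V & \to & \Omega^1\otimes V & \to & \Omega^2\otimes V & \to & \cdots \end{array}\]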
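Clearly \(\alpha_0 =\mathrm{id}\), \(\alpha_1 = \mathrm{id}\).
On the first line \(\Omega^1\otimes V\to \Omega^1\otimes\Omega^1\otimes V\), the connection is given by the Leibniz rule
\[\nabla_\square(\omega\otimes v) = \nabla_\square \omega \otimes v + \omega\otimes \nabla_\square v,\]
which, evaluated on a pair of vector fields \((X,Y)\), reads
\[ (\nabla_\square(\omega\otimes v))(X,Y) = (\nabla_X \omega)(Y) v + \omega(Y) \nabla_X v.\]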
On the second line \(\Omega^1\otimes V\to \Omega^2\otimes V\), the map is induced from the first line. Let’s see what \(\iota_2\circ \alpha_2(\nabla_\square(\omega\otimes v))\) is:
\[\begin{align*} 2\iota_2\circ \alpha_2(\nabla_\square(\omega\otimes v)) (X,Y) &= (\nabla_X \omega)(Y) v - (\nabla_Y \omega)(X) v + \omega(Y)\nabla_X v - \omega(X)\nabla_Y v. \end{align*}\]
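The factor \(2\) on the left can be checked directly from the definitions of \(\iota_2\) and \(\alpha_2\). On a decomposable tensor \(a\otimes b\),
\[ \iota_2\alpha_2(a\otimes b) = \tfrac{1}{2}\,\iota_2(a\wedge b) = \tfrac{1}{2}\,(a\otimes b - b\otimes a),\]
so \(2\iota_2\circ\alpha_2\) sends a bilinear expression \(T(X,Y)\) to \(T(X,Y) - T(Y,X)\). In the other order,
\[ \alpha_2\iota_2(a\wedge b) = \tfrac{1}{2}\,(a\wedge b - b\wedge a) = a\wedge b,\]
which is the case \(k=2\) of \(\alpha\circ\iota = \mathrm{id}\).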
Compare the antisymmetrized expression with the definition \(\nabla (\omega\otimes v) = d\omega\otimes v - \omega\wedge \nabla v\), which we expand term by term.
The first term is \[ (d\omega\otimes v)(X,Y) = \left[(\nabla_X\omega)(Y) - (\nabla_Y\omega)(X) + \omega(\nabla_X Y - \nabla_Y X - [X,Y])\right]v,\]
where \(d\omega\) is computed as
\[\begin{align*} (d\omega)(X,Y) &= X(\omega(Y)) - Y(\omega(X)) - \omega([X,Y])\\ &= \nabla_X \langle \omega, Y\rangle - \nabla_Y \langle \omega, X\rangle - \langle \omega, [X,Y]\rangle \\ &= \langle \nabla_X \omega, Y\rangle + \langle \omega, \nabla_X Y\rangle - \langle \nabla_Y \omega, X\rangle - \langle \omega, \nabla_Y X\rangle - \langle \omega, [X,Y]\rangle \\ &= (\nabla_X\omega)(Y) - (\nabla_Y\omega)(X) + \omega(\nabla_X Y - \nabla_Y X - [X,Y]). \end{align*}\]
The second term is \[ (\omega \wedge \nabla v)(X,Y) = \omega(X)\nabla_Y v- \omega(Y)\nabla_X v.\]
So this definition gives us
\[ (\nabla_X\omega)(Y)v - (\nabla_Y\omega)(X)v + \omega(\nabla_X Y - \nabla_Y X - [X,Y])v - \omega(X)\nabla_Y v+ \omega(Y)\nabla_X v\]
If we add the assumption that our connection is torsion-free (that is, \(\nabla_X Y - \nabla_Y X = [X,Y]\)), then this definition coincides with the one induced from the first row, up to the constant factor \(k!\) (here \(2! = 2\)).
Remark:
The above construction interprets \(\Omega^i\otimes V\) as the subspace of totally antisymmetric tensors inside \((\Omega^1)^{\otimes i}\otimes V\), the space in which the iterated connection from geometry takes values.
With this correspondence and intuition, we can now define the connection on the complex \(\Omega^\bullet\otimes V\) purely algebraically, without referring to the first row.
The connection \(\nabla^{(p)}:\Omega^p\otimes V\to \Omega^{p+1}\otimes V\) is defined by the rule
\[ \nabla(\omega\otimes v) = d\omega\otimes v +(-1)^p \omega\wedge \nabla v, \quad \omega\in \Omega^p, v\in V.\]
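Two low-degree cases may help in reading the sign. For \(p=0\) and a function \(f\), the rule is the usual Leibniz rule, and for \(p=1\) the sign \((-1)^p\) becomes a minus:
\[ \nabla(f v) = df\otimes v + f\,\nabla v, \qquad \nabla(\omega\otimes v) = d\omega\otimes v - \omega\wedge\nabla v \quad (\omega\in\Omega^1).\]
The second formula is the one that was compared with the first row above.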
Because we are working with antisymmetric tensors, the curvature tensor appears as soon as we differentiate twice:
\[\begin{align*} \nabla^2(\omega\otimes v) &= \nabla(d\omega\otimes v + (-1)^p \omega\wedge \nabla v)\\ &= d(d\omega)\otimes v + (-1)^{p+1} d\omega\wedge \nabla v + (-1)^p d\omega\wedge \nabla v + (-1)^{2p} \omega\wedge \nabla^2 v\\ &= \omega \wedge \nabla^2 v\\ &= \omega\wedge (X,Y\mapsto \nabla'^2_{X,Y}v - \nabla'^2_{Y,X}v)\\ &= \omega\wedge (X,Y\mapsto R(X,Y)v)\\ &= \omega\wedge R v, \end{align*}\]
where \(\nabla'\) denotes the original connection (the one in the first row, without implicit antisymmetrization).
Accordingly, the connection \(\nabla\) is called flat or integrable if \(\nabla^2=0\), i.e. if the curvature tensor vanishes. When that happens, \((\Omega^\bullet\otimes V, \nabla)\) is a genuine complex, which can be used to define algebraic de Rham cohomology with coefficients in \(V\).
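A minimal example, under the assumption that \(V\) is a trivial line bundle with a global frame \(s\) and that \(\nabla s = a\otimes s\) for some 1-form \(a\) (the names \(s\) and \(a\) are introduced here only for illustration): the rule for \(p=1\) gives
\[\begin{align*} \nabla^2 s &= \nabla(a\otimes s) = da\otimes s - a\wedge \nabla s\\ &= da\otimes s - (a\wedge a)\otimes s = da\otimes s, \end{align*}\]
since \(a\wedge a = 0\) for a 1-form. So in this case the curvature is \(da\), and the connection is flat exactly when \(a\) is closed. For \(a=0\) the rule reduces to \(\nabla(\omega\otimes s) = d\omega\otimes s\), and the complex \((\Omega^\bullet\otimes V,\nabla)\) is just the ordinary de Rham complex \((\Omega^\bullet, d)\).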