Author: Eiko

Tags: Algebra, Group, Ring, Module, Field, Linear Algebra, Symmetry, Group Action, Group Representation, Sylow's Theorem, Commutative Algebra, PID, Number Theory, Vector Space, Jordan Canonical Form, Classification of Linear Endomorphisms, Tensor Product, Dual Space

Time: 2024-12-07 04:00:00 - 2024-12-07 04:00:00 (UTC)

The purpose of this book is to lead readers to understand and comprehend algebraic structures such as groups, rings, and modules. The book aims to explain the principles systematically, so that readers can grasp the basic principles and the theory as a whole, rather than just being presented with information. Many exercises are interspersed throughout the book to help readers become familiar with the concepts and check their understanding. Harder problems are marked with the difficulty symbol \(\bigstar\), the number of \(\bigstar\) indicating their difficulty (from 0 to 5 stars). The 0-star problems are those whose answers can be seen almost immediately. When encountering a 0-star problem, readers should try to think of the answer right away, as this is a good check of their understanding of the concepts; if no answer comes to mind, they should review the preceding material. Some exercises are actually simple but state important or useful conclusions; thinking them through will leave readers with a better understanding and a stronger impression of them.

The original version of this book was written by me in Chinese. I used some Neovim magic and the ChatGPT API to translate it into English, so something may be missing or inaccurate. If you find any errors, please let me know. I hope this book helps you understand algebra happily and easily. Enjoy reading!

Last updated: 2024

Symmetry Phenomena and Groups

Understanding Symmetry

In the vast and boundless universe, there are countless colorful and complex symmetry phenomena. The word "symmetry" is not unfamiliar to us. For example, we often say that a circle is both an axisymmetric figure and a central symmetric figure, and it is also "rotationally" symmetric; (equilateral) triangles are axisymmetric, with three axes of symmetry. Moreover, we feel that a triangle looks the same from three different directions, which seems to be a kind of symmetry... So what is symmetry?

We need to make a high-level summary of the phenomenon of symmetry. It is not difficult to see that we can re-understand the above examples in a unified way: reflecting a circle across a symmetry axis, rotating it about its center, or performing the central symmetry transformation, i.e. the map \((x,y)\mapsto(-x,-y)\), brings the circle back to its original state. Reflecting an (equilateral) triangle across a symmetry axis, or rotating it by \(120\) degrees, does not change the triangle.

So we can say that a symmetry of an object is a transformation (mapping) that keeps the object as a whole unchanged. In general, however, the allowed transformations are restricted: we usually select the mappings that keep the object unchanged from a larger, more natural space of mappings. Arbitrary, badly-behaved mappings are excluded, otherwise there would be far too many options. For the triangle, we only consider transformations that treat it as an indivisible rigid body. Mappings that tear the triangle apart would also count as symmetries under this definition, but we do not consider them. Under this restriction, a symmetry of an (equilateral) triangle is determined by how its three vertices 1, 2, 3 move. For example, the transformation \((1,2,3)\mapsto(2,3,1)\) is a rotation, while \((1,2,3)\mapsto(1,3,2)\) is a reflection about an axis.

Essential Properties of Symmetry

We now use mathematical notation to express our previous discussion. We first have an object \(X\) that may have some symmetries, which are transformations \(T_1, T_2, \dots\) acting on \(X\) and keeping it unchanged, that is, \[T_i(X)=X.\] A natural principle is that every object has one trivial symmetry: doing nothing. This do-nothing transformation is denoted by \(1\), and its existence is one of the essential properties of symmetry.

Since symmetries are transformations (mappings), they can be composed, and composition must satisfy the associative law. We can easily see that the composition of symmetries is again a symmetry. In slightly more abstract language, this naive principle (another essential property of symmetry) reads \[T_i\circ T_j(X)=T_i(T_j(X))=T_i(X)=X\] (In the following, we will omit the composition symbol \(\circ\) and write \(T_iT_j\) directly.) This simple equation shows how to "create" new symmetries from two given ones: their composition must still be a symmetry of \(X\)! As the simplest example, let us look at the equilateral triangle again. Let \(\sigma\) denote the rotation by \(120\) degrees (counterclockwise). This rotation can be applied twice, giving \(\sigma\circ\sigma=\sigma^2\), which is in fact a rotation by \(240\) degrees. Applying it three times amounts to doing nothing: \(\sigma\circ\sigma\circ\sigma\) is the identity mapping, that is, \[\sigma^3=1\] It is worth noting that, just as composition of mappings need not commute, there is no reason to believe that composition of symmetries must commute: \(T_iT_j=T_jT_i\) is not necessarily true.

Symmetry has one more essential property: every symmetry can be reversed! That is, for any symmetry \(T\) there is a corresponding inverse symmetry, denoted \(T^{-1}\) and called the inverse of \(T\), such that \(T^{-1}\circ T=1\). Being an inverse is a mutual relationship: you are my inverse, and I am also yours, so it must also satisfy \(T\circ T^{-1}=(T^{-1})^{-1}\circ T^{-1}=1\). For example, the inverse of \(\sigma\) is the rotation by \(120\) degrees clockwise. This is in fact the same as \(\sigma^2\), which can be seen by composing both sides of the equation \(\sigma^3=1\) with \(\sigma^{-1}\): \[\sigma^2=\sigma^{-1}\sigma^3=\sigma^{-1}\cdot 1=\sigma^{-1}.\] If we use \(S(X)\) to denote the set of all symmetry transformations of \(X\), the above discussion can be summarized as follows:

  1. \(1\in S(X).\)

  2. \(a,b\in S(X)\Rightarrow ab\in S(X).\)

  3. \(a(bc)=(ab)c\).

  4. \(a\in S(X)\Rightarrow\)there exists an inverse \(b\in S(X)\) for \(a\), such that \(ab=ba=1.\)

In fact, we call such a set \(S(X)\) a group, namely the symmetric group of \(X\).

Symmetric Group of an Equilateral Triangle

The symmetries of an equilateral triangle include rotations and reflections. It has three different axes of symmetry, and we take \(\tau\) to be the reflection across the vertical axis. There are also two different rotations \(\sigma, \sigma^2 = \sigma^{-1}\) (one by 120 degrees counterclockwise, the other by 120 degrees clockwise), and the do-nothing symmetry \(1\). As before, we use 1, 2, 3 to denote the three vertices of the triangle (place the equilateral triangle flat, with the top vertex as 1, the bottom right vertex as 2, and the bottom left vertex as 3). Each symmetry is then a mapping \(f\) from \(\{1,2,3\}\) to \(\{1,2,3\}\). Do not forget that, as a symmetry, \(f\) needs to have an inverse, so \(f\) must be a bijection. How many such \(f\) are there? We only need to determine the values \(f(1), f(2), f(3)\); there are 6 possible permutations, so the triangle has at most 6 symmetries! How many does it actually have? Is each such \(f\) a symmetry of the triangle? Indeed it is! We will use the notation \[\left( \begin{array}{ccc} 1 & 2 & 3 \\ a & b & c \\ \end{array} \right)\] to represent the mapping that sends 1 to \(a\), 2 to \(b\), and 3 to \(c\). It is easy to see that \[\sigma=\left( \begin{array}{ccc} 1 & 2 & 3 \\ 3 & 1 & 2 \\ \end{array} \right) \quad \tau=\left( \begin{array}{ccc} 1 & 2 & 3 \\ 1 & 3 & 2 \\ \end{array} \right)\] Do not forget what we said: symmetry transformations can be composed. Let us compose \(\sigma\) and \(\tau\) and see what happens. The calculation gives \[\sigma\tau=\left( \begin{array}{ccc} 1 & 2 & 3 \\ 3 & 1 & 2 \\ \end{array} \right) \left(\begin{array}{ccc} 1 & 2 & 3 \\ 1 & 3 & 2 \\ \end{array}\right) =\left(\begin{array}{ccc} 1 & 2 & 3 \\ 3 & 2 & 1 \\ \end{array}\right)\] Interestingly, this composition did not produce anything new: \(\sigma\tau\) is another one of the symmetries, the reflection that keeps vertex 2 fixed. It is worth noting that \(\sigma\tau\not=\tau\sigma\), as readers can verify. One can check that the entire symmetry group of the triangle consists of the following 6 distinct elements (which happen to form all the bijective maps from \(\{1,2,3\}\) to \(\{1,2,3\}\)): \[1,\sigma,\sigma^2,\tau,\tau\sigma,\tau\sigma^2\] This group has good operational properties: it has an identity element \(1\), every element has an inverse, and the composition of any two elements (also called the product of elements) still belongs to the group, meaning that the multiplication operation is closed. This group is generally denoted \(D_3\) and called a dihedral group. The dihedral groups in fact form a whole series \(D_n\), where \(D_n\) is the group of all symmetries of a regular \(n\)-gon. (Note: some books use \(D_{2n}\) to refer to what we call \(D_n\) here; this is just a difference in notation, and unfortunately both notations are widely used.)
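To make the bookkeeping concrete, here is a minimal computational sketch (my own illustration, not from the book): it encodes \(\sigma\) and \(\tau\) as Python dictionaries, composes them right-to-left as in the text, and closes the set under composition to recover all 6 elements of \(D_3\).

```python
# A minimal sketch: permutations on {1,2,3} as dicts, composed
# right-to-left as in the text, i.e. (s*t)(x) = s(t(x)).
def compose(s, t):
    return {x: s[t[x]] for x in t}

sigma = {1: 3, 2: 1, 3: 2}   # rotation by 120 degrees
tau   = {1: 1, 2: 3, 3: 2}   # reflection fixing vertex 1

st = compose(sigma, tau)
ts = compose(tau, sigma)
print(st)            # {1: 3, 2: 2, 3: 1} -- the reflection fixing vertex 2
print(st == ts)      # False: composition of symmetries need not commute

# Generate the whole group by closing {sigma, tau} under composition.
group = [{1: 1, 2: 2, 3: 3}]
frontier = [sigma, tau]
while frontier:
    g = frontier.pop()
    if g not in group:
        group.append(g)
        frontier += [compose(g, h) for h in (sigma, tau)]
print(len(group))    # 6 = |D_3|
```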

Another Simple but Important Example of a Group: Permutation Groups

We consider the set \(\{1,2,\dots,n\}\) and study its symmetries. What are the transformations acting on it? They are all the bijective maps from the set to itself (they must be bijective because symmetry transformations require an inverse). We denote this group by \(S_n\) and call it the permutation group on \(n\) elements; its elements are called permutations. \[S_n=\{\text{bijective maps }f:\{1,2,\dots,n\}\to\{1,2,\dots,n\}\}.\] It is easy to see that \(|S_n|=n!\). Permutation groups are of fundamental importance, so it is important to introduce simple and convenient notation for them. We still use \[f=\left( \begin{array}{cccc} 1 & 2 & \dots & n \\ f(1) & f(2) & \dots & f(n) \\ \end{array} \right)\] to represent a permutation. In particular, we call a permutation of the following shape a cycle: \[\left( \begin{array}{ccccccc} a_1 & a_2 & \dots & a_m & b_1 & \dots & b_k\\ a_2 & a_3 & \dots & a_1 & b_1 & \dots & b_k\\ \end{array} \right)\] We denote it by \((a_1 a_2 a_3 \dots a_m)\). Cycles can have length less than \(n\); for example, \((12)\) represents the cycle that maps 1 to 2, 2 to 1, and leaves the other elements unchanged.

Cycle Representation of Permutations

An interesting fact is that every permutation can actually be represented as a product (composition) of cycles! To obtain the cycle representation of a permutation \(f\), we simply do the following: it may map \(a_1\) to \(a_2\), \(a_2\) to \(a_3\), and so on, until some \(a_k\) is mapped back to \(a_1\); then we have found a cycle \((a_1a_2\dots a_k)\). Starting from an element not yet included in any cycle, we let \(f\) act on it repeatedly until it cycles back, obtaining another cycle \((b_1b_2\dots b_l)\), and so on. Eventually we exhaust all the elements and obtain \[f=(a_1a_2\dots a_k)(b_1b_2\dots b_l)\dots\] Moreover, the cycles appearing in this expression have no common elements (hence the order in which these cycles are composed does not matter). For example, we can obtain the cycle representation \[\left( \begin{array}{cccc} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \\ \end{array} \right)=(14)(23)=(23)(14)\] simply by observing that \(1\mapsto4\mapsto1\) and \(2\mapsto3\mapsto2\).
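The procedure just described is easy to mechanize. Below is a small illustrative sketch (my own, not the book's) that extracts the disjoint cycle decomposition of a permutation given as a dictionary on \(\{1,\dots,n\}\).

```python
# A sketch of the procedure above: follow each element until it cycles
# back, collecting one cycle at a time.
def cycle_decomposition(f):
    remaining = set(f)
    cycles = []
    while remaining:
        start = min(remaining)
        cycle, x = [start], f[start]
        while x != start:          # follow start -> f(start) -> ... back
            cycle.append(x)
            x = f[x]
        remaining -= set(cycle)
        if len(cycle) > 1:         # omit fixed points, as in the text
            cycles.append(tuple(cycle))
    return cycles

# The example from the text: 1->4, 2->3, 3->2, 4->1 gives (14)(23).
print(cycle_decomposition({1: 4, 2: 3, 3: 2, 4: 1}))  # [(1, 4), (2, 3)]
```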

Exercise 1.1. Compute \((12)(34)(13)(24)\) and write it as a product of disjoint cycles.

Exercise 1.2. Write the permutation \[\left( \begin{array}{ccccc} 1 & 2 & 3 & 4 & 5 \\ 3 & 4 & 1 & 5 & 2\\ \end{array} \right)\] as a product of disjoint cycles.

The Essential Framework of Symmetry: Groups

A group essentially refers to the set of all symmetry transformations acting on some symmetric object \(X\), denoted as \(S(X)\). This set satisfies the following properties:

  1. \(1\in S(X)\).

  2. \(a,b\in S(X)\Rightarrow ab\in S(X)\).

  3. Multiplication is associative, \(a(bc)=(ab)c\).

  4. \(a\in S(X)\Rightarrow\) there exists an inverse \(b\in S(X)\) such that \(ab=ba=1\).

In fact, when we abstractly consider a group, we can see that its operation structure has nothing to do with \(X\), and all its operation information is completely contained in the multiplication operation. These conditions are present in all symmetric phenomena. In order to study the essential structure of all symmetric phenomena in the universe, we can simplify and define it as follows:

Definition 1.1. A group is a set \(G\) with an associated binary operation \(\,\cdot:G\times G\to G\) called multiplication, satisfying:

  1. \(a,b\in G\Rightarrow ab\in G\)

  2. There exists an element \(e\in G\) (called the identity element) such that for any element \(a\in G\), we have \(ea=ae=a\).

  3. For any \(a\in G\), there exists an element \(b\in G\) (called the inverse of \(a\)) such that \(ab=ba=e\).

  4. Multiplication is associative, meaning that for any \(a,b,c\in G\), we have \((ab)c=a(bc)\).

In particular, if this operation also satisfies the commutative law, \(ab=ba\), it is called an Abelian group.

Exercise 1.3. Verify that in a group, the identity element is unique. The inverse of an element is also unique.

In this definition of a group, we did not mention any symmetry transformations. Looking at the definition alone, it is not even clear what it is for, and a group may simply be thought of as a set closed under some operation. Indeed, a group can also describe such a set closed under an operation, such as the set of all integers under addition. (In fact, such a set is itself a group of symmetry transformations: the integers under addition are exactly the translation transformations acting on the set of integers.) But please remember the origin of groups: the most essential feature of groups is symmetry. In this way, by studying all groups, we are essentially studying all possible symmetries that can act on any possible object.

We only study finite groups, i.e. groups with a finite number of elements. We call the number of elements in a group its order. In Chapter 2, we will prove that for every finite group, we can construct an object \(X\) such that this group is exactly the symmetry group of that object, i.e. every group is a symmetry group. Therefore, the essence of groups is symmetry: groups are symmetry, and symmetry is groups!

Some Examples of Groups

In addition to the groups \(S_n\) and \(D_n\) that we have already introduced, there are other simple examples of groups. The simplest example is probably \(\mathbb{Z}\), the group formed by all integers under addition. We can think of this group as the group of translations of the integers. It is easy to verify that the set of even integers also forms a group under addition. In fact, the set \(n\mathbb{Z}\) of all multiples of \(n\) forms a group under addition (since it is closed under addition and under taking negatives).

If we take all the rotations in \(D_n\), namely \(\{1,\sigma,\sigma^2,\dots,\sigma^{n-1}\}\), it is easy to verify that they form a group on their own (since the composition of rotations is still a rotation). This type of group is also one of the most basic groups; we denote it by \(\mathbb{Z}_n\) and call it the cyclic group of order \(n\). It is an Abelian group.

Regarding cyclic groups, there is another, arithmetic description: the integers are divided into \(n\) different residue classes according to the modulus \(n\), namely the classes with remainders \(0,1,2,\dots,n-1\). Addition can be performed between residue classes; for example, modulo \(5\) we have \(2+4=1\). These residue classes form a group under addition, which looks different from the previous \(\mathbb{Z}_n\), because one consists of rotation transformations \(\{1,\sigma,\sigma^2,\dots,\sigma^{n-1}\}\) and the other of residue classes \(\{0,1,2,3,\dots,n-1\}\). However, the two groups have no essential structural difference: if we identify the residue class \(k\) with \(\sigma^k\), every group operation on one side matches the corresponding operation on the other. This means that under the correspondence \(f\) \[\begin{aligned} f: \{0,1,2,3,\dots,n-1\} &\to \{1,\sigma,\sigma^2,\dots,\sigma^{n-1}\}\\ k &\mapsto \sigma^k \end{aligned}\] the operations on the left correspond to the operations on the corresponding elements on the right. In this case, we say that the two groups are isomorphic. Isomorphic groups have no essential difference and can be considered the same. A precise description of isomorphism will be given later.
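For readers who like to experiment, here is a small sketch (an illustration, not part of the book) checking for \(n=6\) that the correspondence \(k\mapsto\sigma^k\) turns addition of residues into composition of rotations.

```python
# A sketch checking the isomorphism described above for n = 6: the
# residue k corresponds to the rotation sigma^k, and addition mod n
# matches composition of rotations.
n = 6

def rot(k):  # sigma^k as a dict: vertex x goes to x + k (mod n)
    return {x: (x + k) % n for x in range(n)}

def compose(s, t):
    return {x: s[t[x]] for x in t}

ok = all(rot((a + b) % n) == compose(rot(a), rot(b))
         for a in range(n) for b in range(n))
print(ok)  # True: f(k) = sigma^k respects the two operations
```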

Exercise 1.4. Consider all \(n\)th roots of unity \(\{1,e^{\frac{2\pi i}{n}},e^{\frac{4\pi i}{n}},\dots,e^{\frac{2(n-1)\pi i}{n}}\}\) and verify that they form a group under the usual complex multiplication, that is, verify that they satisfy the definition of a group. Is this group isomorphic to \(\mathbb{Z}_n\)?

In this chapter, we will explore the basic structure of groups using a simple approach.

Structural Information on Elements

Order of Elements

Let \(g\in G\). If there exists a positive integer \(n\) such that \(g^n=1\), then we say that \(g\) is of finite order and we call the smallest such positive integer \(n\) the order of \(g\). Otherwise, we say that \(g\) is of infinite order. For a finite group \(G\), for any element \(g\in G\), we can keep multiplying it to get the sequence \(g,g^2,g^3,\dots\). Since \(G\) is finite, this sequence will eventually repeat. Suppose \(g^k=g^l,k>l\), then we have \(g^{k-l}=1\). This means that every element in a finite group is of finite order.

Exercise 1.5. Let \(g\) be an \(n\)-order element in \(G\) and \(g^N=1\). Prove that \(N\) is a multiple of \(n\).

Proposition 1.1. Let \(g\) be an \(n\)-order element in a group \(G\). Then the order of \(g^m\) is \(\frac{n}{(m,n)}\), where \((m,n)\) denotes the greatest common divisor of \(m\) and \(n\).

Proof. First, from \((g^m)^{\frac{n}{(m,n)}}=g^\frac{mn}{(m,n)}=1\), we know that the order of \(g^m\) is \(\le\frac{n}{(m,n)}\). Conversely, if \((g^m)^k=g^{mk}=1\), then \(mk\) must be a multiple of \(n\), so \(\frac{m}{(m,n)}k\) is a multiple of \(\frac{n}{(m,n)}\); since \(\frac{m}{(m,n)}\) and \(\frac{n}{(m,n)}\) are coprime, \(k\) must be a multiple of \(\frac{n}{(m,n)}\). This shows that the order of \(g^m\) is exactly \(\frac{n}{(m,n)}\). ◻

In an Abelian group, if the orders of \(a\) and \(b\) are known, then the order of \(ab\) is constrained.

Exercise 1.6. Let \(G\) be an Abelian group, and let \(a,b\in G\) have orders \(m\) and \(n\) respectively. Then the order of \(ab\) is a factor of \([m,n]\), the least common multiple of \(m\) and \(n\).

Exercise 1.7 (\(\bigstar\)). In the above exercise, if \(m\) and \(n\) are relatively prime, prove that the order of \(ab\) is \(mn\).

It is worth noting that for non-Abelian groups, knowing only the orders of \(a\) and \(b\) gives no information about the order of \(ab\). It can be proven that there exist groups in which elements \(a,b\) of given orders have a product \(ab\) of any prescribed order. As an example, in \(S_5\), let \(a=(12)(34)\) and \(b=(135)\); then \(a\) has order two, \(b\) has order three, and \(ab=(14352)\) has order five.
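The example can be checked mechanically. The following sketch (illustrative helpers, not the book's notation) computes the orders of \(a\), \(b\) and \(ab\) in \(S_5\) by brute force.

```python
# A sketch verifying the example: a = (12)(34) has order 2, b = (135)
# has order 3, yet ab = (14352) has order 5.
def compose(s, t):
    return {x: s[t[x]] for x in t}

def order(p):
    identity = {x: x for x in p}
    q, k = dict(p), 1
    while q != identity:
        q, k = compose(q, p), k + 1
    return k

a = {1: 2, 2: 1, 3: 4, 4: 3, 5: 5}    # (12)(34)
b = {1: 3, 3: 5, 5: 1, 2: 2, 4: 4}    # (135)
ab = compose(a, b)                     # apply b first, then a
print(order(a), order(b), order(ab))   # 2 3 5
```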

Exercise 1.8 (\(\bigstar\)). Prove that if a group only has elements of order \(1\) and \(2\), then it is an Abelian group.

Exercise 1.9 (\(\bigstar\)). Let \(G\) be a group of even order, then the equation \(g^2=1\) has an even number of solutions. Hence, there must be elements of order two in the group \(G\).

Subgroups and Quotient Structures

It is not difficult to see that \(D_6\), the symmetry group of a regular hexagon, contains all the symmetries of an equilateral triangle! For example, the rotation by \(120\) degrees is also a symmetry of the regular hexagon. The three axes of symmetry of the triangle are also axes of symmetry of the regular hexagon, which of course has more axes of symmetry (reflections) besides these. In this situation we say that \(D_3\) is a subgroup of \(D_6\). The history of mathematics tells us that studying the substructures of a structure is important, so we make the following definition:

Definition 1.2. Let \(G\) be a group, and let \(S\subset G\) be a subset. If \(S\) satisfies the defining properties of a group under the multiplication operation of \(G\), then \(S\) is called a subgroup of \(G\). We denote this as \(S\le G\).

We have many examples of subgroups. For example, the additive group \(2\mathbb{Z}\) of all even numbers is a subgroup of \(\mathbb{Z}\). As another example, \(\mathbb{Z}_6=\{1,\sigma,\sigma^2,\sigma^3,\sigma^4,\sigma^5\}\) has subgroups \(\{1,\sigma^2,\sigma^4\}\) and \(\{1,\sigma^3\}\), which are in fact \(\mathbb{Z}_3\) and \(\mathbb{Z}_2\).

Example 1.1. \(\{1,(12)(34),(13)(24),(14)(23)\}\) forms a subgroup of \(S_4\), called the Klein four-group, sometimes denoted \(V\) or \(\mathbb{Z}_2^2\). This group is abelian, but it is not isomorphic to the cyclic group of the same order, \(\mathbb{Z}_4\).

Exercise 1.10. Prove that the intersection of (any) subgroups is still a subgroup.

Cosets and Lagrange’s Theorem

Let \(G\) be a group and let \(H \leq G\) be a subgroup. We can define an equivalence relation on \(G\) as follows: \[a \sim b \Leftrightarrow a^{-1}b \in H.\] It can be easily verified that this indeed defines an equivalence relation, and according to this relation the group \(G\) is partitioned into disjoint equivalence classes. Observe that the set of all elements equivalent to \(b\) is \(bH\), since \(b \sim a \Leftrightarrow b^{-1}a \in H \Leftrightarrow a \in bH\). Thus all the equivalence classes are of the form \(bH\), and they all have the same size \(|bH| = |H|\). We call such subsets of \(G\) cosets of \(H\), more precisely left cosets. We denote the number of cosets, called the index of the subgroup \(H\), by \([G:H]\). We have \[G = \bigcup_{i=1}^{[G:H]} b_iH\] Since different cosets are disjoint, we have obtained

Theorem 1.1 (Lagrange’s Theorem). Let \(H\) be a subgroup of the group \(G\). Then, \[|G| = [G:H] |H|\]

By continuously applying the coset decomposition, we can obtain

Exercise 1.11 (\(\bigstar\)). Let \(A\le B\le G\). Then we have \([G:A]=[G:B][B:A].\)

This theorem also shows that the order of a subgroup must be a factor of \(|G|\). For example, using Lagrange’s theorem, we can see that a group of order \(12\) cannot have a subgroup of order \(5\).

Example 1.2. For the group \[D_3=\{1,\sigma,\sigma^2,\tau,\tau\sigma,\tau\sigma^2\}\] and its subgroup \(H=\{1,\sigma,\sigma^2\}\), we have the following coset decomposition: \[D_3=H\cup \tau H.\] In this case, \([D_3:H]=2\).
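The following sketch (again my own illustration) reproduces Example 1.2 by listing the distinct left cosets \(gH\) in \(D_3\) and checking Lagrange's theorem numerically.

```python
# A sketch of Example 1.2: the left cosets of H = {1, s, s^2} in D_3,
# with elements encoded as permutations of the vertices {1,2,3}.
def compose(s, t):
    return {x: s[t[x]] for x in t}

e  = {1: 1, 2: 2, 3: 3}
s  = {1: 3, 2: 1, 3: 2}                         # sigma
t  = {1: 1, 2: 3, 3: 2}                         # tau
H  = [e, s, compose(s, s)]
D3 = H + [t, compose(t, s), compose(t, compose(s, s))]

# Each coset gH is stored as a frozenset of canonical (hashable) forms.
def coset(g):
    return frozenset(tuple(sorted(compose(g, h).items())) for h in H)

cosets = {coset(g) for g in D3}
print(len(cosets))                       # 2 = [D_3 : H]
print(len(D3) == len(cosets) * len(H))   # True: |G| = [G:H] |H|
```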

Exercise 1.12. Let \(H\le G\). Prove that \(g\in H\Leftrightarrow gH=H\), and hence \(a\sim b\Leftrightarrow aH=bH\).

Theorem 1.2. Let \(g\in G\) be an \(n\)-order element in a finite group \(G\). Then \(n\) is a factor of \(|G|\).

Proof. Since \(g\) is an \(n\)-order element, it is easy to verify that \(H=\{1,g,g^2,\dots,g^{n-1}\}\) forms an \(n\)-order subgroup of \(G\). Therefore, by Lagrange’s theorem, we know that \(n\) must be a factor of \(|G|\). ◻

Quotient Groups and Normal Subgroups

Just as a quotient space can be constructed from a vector space, groups also have quotient structures.

Given \(H\le G\), we try to view the set of cosets \(G/H=\{H,b_1H,\dots,b_{k-1}H\}\) as a group, and the desired group operation is defined as: \[\forall a, b\in G\quad (aH)(bH)=(ab)H\] Unfortunately, for some subgroups \(H\), this equation does not hold. In other words, not all subgroups can be used to construct the quotient group \(G/H\). We reason as follows: \((aH)(bH)=(ab)H\Leftrightarrow HbH=bH\Leftrightarrow (b^{-1}Hb)H=H\Leftrightarrow b^{-1}Hb\subset H\), and for all \(b\), \(b^{-1}Hb\subset H\) is equivalent to \(b^{-1}Hb=H\) for all \(b\). Therefore, in order to construct the quotient group \(G/H\), the subgroup \(H\) must satisfy the condition: \(\forall b\in G, b^{-1}Hb=H\), which leads to the following definition:

Definition 1.3. Let \(N\le G\). If \[\forall g\in G\quad g^{-1}Ng=N\] then \(N\) is called a normal subgroup of \(G\), denoted \(N\triangleleft G\). If a group \(G\) has no normal subgroups other than \(\{1\}\) and \(G\), it is called a simple group.

Example 1.3. If \(G\) is an Abelian group, then any subgroup of \(G\) is a normal subgroup.

Example 1.4. We have \(\mathbb{Z}_3 \triangleleft D_3\), which can be verified by showing \(\tau^{-1}\sigma\tau = \sigma^{-1}\). It is easy to see that \(\{1,\tau\}\) is a subgroup of \(D_3\), but it is not a normal subgroup of \(D_3\), because \(\sigma^{-1}\tau\sigma = \tau\sigma^2 \not\in \{1,\tau\}\).

Example 1.5. We have \(\{1,(12)(34),(13)(24),(14)(23)\}=V\triangleleft S_4\). This will follow immediately once we discuss the concept of conjugation in permutation groups.

It is worth noting that normality is not transitive, that is, \(A\triangleleft B\triangleleft C\) does not imply \(A\triangleleft C\). A counterexample will be given later.

Exercise 1.13. Let \(N\le H\le G\) and \(N\triangleleft G\). Prove that \(N\triangleleft H\).

Exercise 1.14. Prove that \(N\triangleleft G\) is equivalent to saying that for any \(g\in G\), \(Ng=gN\).

Exercise 1.15 (\(\bigstar\)). Assume \(M,N\triangleleft G\) and \(M\cap N=\{1\}\). Prove that the elements of \(M\) and \(N\) commute with each other. That is, for any \(m\in M\) and \(n\in N\), \(mn=nm\).

Exercise 1.16 (\(\bigstar\)). Let \(H\le G\) and \([G:H]=2\). Prove that \(H\triangleleft G\).

Exercise 1.17 (\(\bigstar\)). Find all finite simple groups.

Product of Subgroups

Let \(N,H\le G\). We define the set \[NH=\{nh|n\in N,h\in H\}\subset G.\] Under the operation defined on the larger group \(G\), can this set form a group? The answer is not always. However, this is true when at least one of \(N\) or \(H\) is a normal subgroup. Without loss of generality, let \(N\triangleleft G\). We calculate \[n_1h_1n_2h_2=n_1(h_1n_2h_1^{-1})h_1h_2\in NH\] \[(n_1h_1)^{-1}=(h_1^{-1}n_1^{-1}h_1)h_1^{-1}\in NH\] \[1\in NH\] The associativity of the group \(NH\) is inherited from the associativity of the larger group, so in this case, \(NH\) forms a subgroup.

In addition, even if neither \(N\) nor \(H\) is a normal subgroup, we can still calculate the number of elements in the set \(NH\) by using the method of coset decomposition.

Proposition 1.2. Let \(A,B\le G\), then \[|AB|=\frac{|A||B|}{|A\cap B|}\] Proof sketch: \(AB\) is a union of left cosets \(aB\) with \(a\in A\), and for \(a,a'\in A\) we have \(aB=a'B\iff a^{-1}a'\in A\cap B\), so the number of distinct cosets is \([A:A\cap B]=|A|/|A\cap B|\), each of size \(|B|\).

Exercise 1.18 (\(\bigstar\bigstar\)). Let \(A,B\le G\). Prove that \[[G:A\cap B]\le [G:A][G:B].\] In particular, if \([G:A]\) and \([G:B]\) are coprime, prove that the equality must hold and \(G=AB\).

Exercise 1.19 (\(\bigstar\bigstar\)). Let \(A,B\subset G\) be subsets (note that it is not said that they are subgroups). If \(|A|+|B|>|G|\), prove that \(G=AB\).

Group Homomorphism

Between linear spaces, one usually considers only linear mappings, those related to the linear structure. Similarly, between groups one considers only mappings related to the group structure. The group mappings we are looking for are the group homomorphisms.

Definition 1.4 (Group Homomorphism). Let \(G,H\) be groups, and \(f:G\to H\) be a mapping. If \[f(ab)=f(a)f(b)\] then \(f\) is called a group homomorphism or group mapping, and is simply called a homomorphism. If \(f\) is an injection, it is called a monomorphism. If \(f\) is a surjection, it is called an epimorphism. If \(f\) is both a monomorphism and an epimorphism, it is called an isomorphism, and \(G\) and \(H\) are called isomorphic, denoted by \(G\cong H\).

It is worth noting that in the equation \(f(ab)=f(a)f(b)\), the multiplication operation used for \(ab\) is from the group \(G\), while the multiplication operation used for \(f(a)f(b)\) is from the group \(H\).

Example 1.6. As before, by numbering the vertices of the triangle, we find that in fact \(S_3\cong D_3\). However, for \(n>3\), \(S_n\) is not isomorphic to \(D_n\) (they do not even have the same order).

Example 1.7. A group is isomorphic to itself, as witnessed by taking \(f(g)=g\). Interestingly, a group can have isomorphisms to itself other than the identity map, such as the map \[\begin{aligned} f: \mathbb{Z}_3 &\to \mathbb{Z}_3\\ g &\mapsto g^2 \end{aligned}\] This is an isomorphism, but it is different from the identity map. We call these maps (including \(1\)) the automorphisms of \(\mathbb{Z}_3\). In fact, the existence of such non-trivial isomorphisms shows that the structure of a group \(G\) itself has a certain symmetry, and all automorphisms of \(G\), which are all the symmetries acting on \(G\), also form a group, called the automorphism group of \(G\) and denoted \({\rm Aut}(G)\).

Exercise 1.20. Verify that a homomorphism must map \(1\) in \(G\) to \(1\) in \(H\), and map \(g^{-1}\) to \(f(g)^{-1}\), i.e. \(f(1)=1\) and \(f(g^{-1})=f(g)^{-1}\).

Exercise 1.21. Verify that \(\mathbb{Z}_4\) has a subgroup of order \(2\), and prove that the quotient group by this subgroup is isomorphic to \(\mathbb{Z}_2\).

Exercise 1.22. Let \(g\) be an \(n\)-order element, prove that \(f(g)\) must be a finite-order element, and its order is a factor of \(n\).

Exercise 1.23. Find \({\rm Aut}(\mathbb{Z}_4)\).

Definition 1.5 (Kernel and Image). Let \(f:G\to H\) be a group homomorphism. We define the set \[\ker f=f^{-1}(1)=\{g\in G|f(g)=1\}\] as the kernel of \(f\), and the set \[{\rm Im}f=\{f(g)|g\in G\}\] as the image of \(f\).

It is worth noting that \(\ker f\) and \({\rm Im}f\) are subgroups of \(G\) and \(H\), respectively, and in fact \(\ker f\triangleleft G\).

Exercise 1.24. Prove that

  1. \(\ker f=\{1\}\Leftrightarrow f\text{ is injective.}\)

  2. \(\ker f\triangleleft G\).

As an exercise, find all groups of order \(\le 5\). Counting the trivial group with only one element, there are a total of \(6\) groups of order not exceeding \(5\). (Isomorphic groups are considered the same.)

Simple, Important and Typical Group Homomorphisms

Monomorphism

The simplest group homomorphisms are probably the monomorphisms. A monomorphism \(f:H\to G\) essentially maps \(H\) onto a subgroup \(f(H)\le G\) isomorphic to \(H\). In other words, a monomorphism is a way to "insert" \(H\) into \(G\) isomorphically.

Example 1.8. Consider the monomorphism of groups \[f:\mathbb{Z}_3=\{1,a,a^2\}\to\mathbb{Z}_6=\{1,b,b^2,b^3,b^4,b^5\}\] It is easy to see that this homomorphism is completely determined by the value of \(f(a)\), since \(f(1)=1\) and \(f(a^2)=f(a)^2\). Since \(a\) is a 3-order element, \(f(a)\) must be a 1-order or 3-order element. The 3-order elements in \(\mathbb{Z}_6\) are only \(b^2\) and \(b^4\). Therefore, we know that \(f(a)\) can only take \(1,b^2,b^4\). Since we only consider monomorphisms, there are only two possible choices left: \(f(a)=b^2\) or \(f(a)=b^4\).

If \(f(a)=b^2\), then we have \(f(a^2)=b^4\), and the entire group \(\mathbb{Z}_3=\{1,a,a^2\}\) is mapped to \(\{1,b^2,b^4\}\). If we choose \(f(a)=b^4\), then we have \(f(a^2)=(b^4)^2=b^8=b^2\), and the entire group \(\mathbb{Z}_3=\{1,a,a^2\}\) is again mapped to \(\{1,b^2,b^4\}\). So \(\{1,b^2,b^4\}\) is the subgroup of \(\mathbb{Z}_6\) isomorphic to \(\mathbb{Z}_3\), but there are two different ways to embed \(\mathbb{Z}_3\) into \(\mathbb{Z}_6\). In either case, the monomorphism establishes an isomorphism between \(H\) and \(f(H)\). Although there is only one subgroup of \(G=\mathbb{Z}_6\) isomorphic to \(H=\mathbb{Z}_3\), there are multiple ways to embed \(H\) into it, because there exists an automorphism of \(H\) mapping \((1,a,a^2)\) to \((1,a^2,a)\). In other words, this phenomenon occurs because of the symmetry inherent in \(H\) itself.

Projection and Homomorphism Decomposition

Another simple and important mapping is projection. As we have mentioned, for a normal subgroup \(N\) of a group \(G\), we can construct the quotient group \(G/N\). We can consider a natural mapping that maps an element \(g\) in \(G\) to the coset in \(G/N\) that contains \(g\): \[\begin{aligned} p: G &\to G/N\\ g &\mapsto gN \end{aligned}\] This mapping is obviously a homomorphism, since \(p(ab)=abN=aNbN=p(a)p(b)\). It is also a surjective homomorphism. What is its kernel? Recall that the identity element in \(G/N\) is \(N\), which is the coset containing the identity element in \(G\). Therefore, \(p(a)=aN=N \Leftrightarrow a \in N\), and we conclude that \(\ker p = N\).

Projections are important and fundamental for the following reason: consider the kernel \(\ker f\) of any homomorphism \(f: G \to H\). One thing worth noting is that \(a^{-1}b \in \ker f\) is equivalent to \(f(a^{-1}b) = 1\), i.e. to \(f(a) = f(b)\). In other words, if we decompose \(G\) into cosets of \(\ker f\), every element of the coset \(b\ker f\) is mapped by \(f\) to the same element \(f(b)\), and different cosets are mapped to different elements (if two cosets \(a\ker f\) and \(b\ker f\) are mapped to the same element, then \(a^{-1}b \in \ker f\), i.e. \(a\ker f = b\ker f\)). Since all elements of \(a\ker f\) are mapped to the same element \(f(a)\) (which seems like a waste), we can define a new mapping that regards \(a\ker f\) as an element of the quotient group \(G/\ker f\) and maps it to \(f(a)\).

In other words, we find that any homomorphism \(f: G \to H\) with kernel \(\ker f\) can be decomposed into the composition of two mappings: \[G\xrightarrow{\;p\;}G/\ker f\xrightarrow{\;h\;}H\] This means that we can map \(a\) to \(f(a)\) in two steps: \(a \mapsto a\ker f \mapsto f(a)\). The first step maps \(a\) to the coset \(a\ker f\) it belongs to (i.e. the natural projection from \(G\) to \(G/\ker f\)), and the second step maps \(a\ker f\) to \(f(a)\). The only potential issue is whether the second step is a homomorphism. The answer is yes, and it is easy to verify. Thus any group mapping can be decomposed into the composition of two basic mappings: a projection and an injective homomorphism. By the way, we can add one more step, the most trivial one: \(a \mapsto a\ker f \mapsto f(a) \mapsto f(a)\), which first maps \(a\ker f\) to the element \(f(a)\) of the subgroup \({\rm Im}f\) of \(H\), and then includes \({\rm Im}f\) into \(H\) through the trivial injective mapping: \[G\xrightarrow{\;p\;}G/\ker f\xrightarrow{\;h\;}{\rm Im}f\hookrightarrow H\] The middle map \(G/\ker f \to {\rm Im}f,\ a\ker f \mapsto f(a)\) is a homomorphism and a bijection, and therefore an isomorphism. From this, we obtain the following important theorem:

Theorem 1.3 (Fundamental Homomorphism Theorem). If \(f: G \to H\) is a homomorphism, then \(\ker f \triangleleft G\), \({\rm Im}f \le H\), and there exists an isomorphism \[G/\ker f \cong {\rm Im}f\]

Proof. As above, we construct a mapping \[\begin{aligned} h: G/\ker f &\to {\rm Im}f\\ a\ker f &\mapsto f(a) \end{aligned}\] It is easy to verify that this mapping is well-defined, since \(f(b)=f(a)\) whenever \(b\ker f= a\ker f\). Now we prove that this is a homomorphism: \[h((a\ker f)(b\ker f))=h(ab\ker f)=f(ab)=f(a)f(b)\] The first equality holds because \(\ker f\) is normal. \(h\) is injective, since \(h(a\ker f)=f(a)=1\Leftrightarrow a\in\ker f\). Moreover, \(h\) is surjective, so \(h\) is an isomorphism. ◻

Isomorphism Theorem

The Fundamental Homomorphism Theorem is also known as the First Isomorphism Theorem. From it we can derive many other isomorphism theorems, similar to those we learned in linear algebra.

Theorem 1.4 (Second Isomorphism Theorem). Let \(N\triangleleft G\) and \(H\le G\). Then \(N\triangleleft NH\) and \(N\cap H\triangleleft H\), and \[NH/N\cong H/N\cap H\]

Proof. Since \(N\triangleleft G\) and \(N\le NH\le G\), it is easy to see that \(N\triangleleft NH\). For any \(n\in N\cap H\) and \(h\in H\), we have \[h^{-1}nh\in N\cap H\] Therefore, \(N\cap H\triangleleft H\). Now, we construct a mapping \[\begin{aligned} f: H &\to NH/N\\ h &\mapsto hN \end{aligned}\] We first prove that this mapping is a homomorphism. Since \(N\triangleleft NH\), we have \[f(ab)=abN=aNbN=f(a)f(b).\] This is clearly a surjective homomorphism. Now, we calculate \(\ker f\). Let \(h\in \ker f\), then \[hN=N\iff h\in N\iff h\in N\cap H\] Hence, \(\ker f=N\cap H\). By the first isomorphism theorem, we have \[H/N\cap H\cong NH/N.\] ◻

The following correspondence theorem is fundamental; it describes the correspondence between subgroups of \(G/N\) and subgroups of \(G\) that contain \(N\).

Theorem 1.5 (Correspondence Theorem). Let \(N\triangleleft G\), and let \(\rho:G\to G/N\) be the natural projection. Then there exists a one-to-one correspondence between the subgroups of \(G\) that contain \(N\) and the subgroups of \(G/N\): \[H\mapsto \rho(H)=H/N\subset G/N\] This maps \(H\) to the set of all cosets of the form \(hN\) with \(h\in H\). The inverse map is \(T\mapsto \rho^{-1}(T)\), where \(T\) is a subgroup of \(G/N\).

Proof.

  • Note that the image of \(H\to G\to G/N\) is \(H/N\), which is the image of a homomorphism, and therefore is a subgroup of \(G/N\). Similarly, for any subgroup \(T\le G/N\), the preimage \(\rho^{-1}(T)\) is also a subgroup.

  • Now we only need to verify that these two mappings are inverse mappings, i.e. \(\rho^{-1}(\rho(H))=H\) and \(\rho(\rho^{-1}(T))=T\). The latter is obvious since \(\rho\) is surjective. For the former, suppose there is a coset decomposition \(H=\cup hN\), then \[\rho^{-1}(\rho(H))=\rho^{-1}\left(\bigcup \rho(h)\right)=\bigcup hN = H.\]

 ◻

Cyclic Groups, Generators and Group Presentations

Cyclic groups are the simplest and most basic type of group, but they are still very important. The cyclic group \(\mathbb{Z}_n\) can be described in various ways, for example as the rotation group of a regular \(n\)-gon, or as the additive group of residue classes modulo \(n\). The essential characteristic of a cyclic group is that it can be generated by a single element: we can find an element \(a\) such that, by repeatedly multiplying by \(a\) and its inverse \(a^{-1}\), forming \(1,a,a^2,a^3,\dots\), we obtain all the elements of the group \(G\). To make this precise:

Definition 1.6. Let \(G\) be a group, and \(S\subset G\) be a subset of \(G\). The intersection of all subgroups of \(G\) that contain \(S\), also known as the smallest subgroup containing \(S\), is called the subgroup generated by \(S\), denoted by \(\langle S\rangle\). If \(G=\langle S\rangle\), then \(G\) is called a group generated by \(S\), and \(S\) is called a set of generators for \(G\). If a group can be generated by a single element, i.e. there exists \(a\in G\) such that \(G=\langle a\rangle\), then it is called a cyclic group.

According to this definition, in addition to the finite cyclic group \(\mathbb{Z}_n\), there can be infinite cyclic groups, which are essentially \(\mathbb{Z}\).

Therefore, the cyclic group \(\mathbb{Z}_n\) can be imagined as the group generated by an abstract letter \(a\) satisfying the relation \(a^n=1\). This method of describing a group by generators and relations is called a group presentation. In this example, we write \[\mathbb{Z}_n=\langle a|a^n=1\rangle.\]

Example 1.9. Similarly, we can imagine that the group \(D_3\) can be abstractly described as a group generated by the abstract letters \(\sigma\) and \(\tau\), satisfying the constraints \(\sigma^3=\tau^2=1\) and \(\sigma\tau=\tau\sigma^{-1}\). Let’s try to list a series of elements generated by these two letters: \[1,\sigma,\sigma^2,\tau,\tau\sigma,\tau\sigma^2\] In the special case of the group \(D_3\), any element generated by abstract letters, such as \(\sigma\tau\sigma^2\), can be written in the above form, i.e. in the form of \(\tau^{i}\sigma^j\). This is because \[\sigma\tau\sigma^2=\tau\sigma^{-1}\sigma^2=\tau\sigma.\] We write this as \[D_3=\langle \sigma,\tau|\sigma^3=\tau^2=(\sigma\tau)^2=1\rangle\] (The condition \(\sigma\tau=\tau\sigma^{-1}\) can be simplified to \((\sigma\tau)^2=1\)). This can be easily generalized to the definition of \(D_n\) \[D_n=\langle\sigma,\tau|\sigma^n=\tau^2=(\sigma\tau)^2=1\rangle.\] This definition is abstract, it does not involve symmetry, but it clearly determines the structure of the group!

Exercise 1.25. Continuing with the example, please calculate \[\sigma\tau\sigma^{-1}\tau\sigma\tau^{-1}.\]

The description in this section is meant to be inspirational and informal. Later, we will treat group presentations more rigorously.

How many generators does a cyclic group of order \(n\) have? The answer is straightforward. Let \(a\) be a generator; then \(a^i\) is a generator if and only if \(i\) is coprime to \(n\). In other words, there are \(\varphi(n)\) generators in total, where \(\varphi(n)\) is the number of integers in \(\{1,\dots,n\}\) coprime to \(n\), known in number theory as Euler's totient function. There is an interesting identity about Euler's function: \[\sum_{d|n}\varphi(d)=n\] where \(d|n\) means \(d\) divides \(n\), and the summation \(\sum_{d|n}\) runs over all positive divisors of \(n\). This identity can be derived from the following observation: for each \(d|n\), there is exactly one subgroup of order \(d\) in the cyclic group \(\mathbb{Z}_n\), and this subgroup has \(\varphi(d)\) generators. Every element \(g\in\mathbb{Z}_n\) is a generator of exactly one such subgroup \(\mathbb{Z}_d\), namely \(\langle g\rangle\).
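The identity \(\sum_{d|n}\varphi(d)=n\) is easy to check numerically; here is a brute-force sketch (illustrative, with a naive \(\varphi\)) verifying it for small \(n\).

```python
# A sketch checking sum over d | n of phi(d) = n for n = 1..12.
from math import gcd

def phi(n):
    # naive Euler totient: count the k in 1..n coprime to n
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

for n in range(1, 13):
    divisor_sum = sum(phi(d) for d in range(1, n + 1) if n % d == 0)
    assert divisor_sum == n
print("sum of phi(d) over d | n equals n for n = 1..12")
```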

There is an important proposition that characterizes the properties of cyclic groups:

Theorem 1.6. Let \(G\) be a finite group. If for any factor \(d\) of \(|G|\), there is at most one \(d\)-order subgroup in \(G\), then \(G\) is a cyclic group. The converse is also true.

Proof. Note that any \(g\in G\) generates a cyclic subgroup \(\langle g\rangle\). By assumption, there is at most one subgroup of each order. For \(d|n\), if there is a cyclic subgroup of order \(d\), let \({\rm gen}(d)\) denote its set of generators; otherwise, let \({\rm gen}(d)\) be the empty set. Since any \(g\in G\) is a generator of some cyclic subgroup, whose order \(d\) divides \(n\), we have \[G=\bigcup_{d|n}{\rm gen}(d)\] and a cyclic group of order \(d\) has \(\varphi(d)\) generators, thus \[n= \sum_{d|n}|{\rm gen}(d)|\le\sum_{d|n}\varphi(d)=n\] Equality must hold, so for each \(d|n\) there is exactly one cyclic subgroup of order \(d\). In particular this holds for \(d=n\), so \(G\) is a cyclic group. ◻

Exercise 1.26 (\(\bigstar\)). (For those who have not heard of fields, you can skip this) Prove that any finite subgroup of the multiplicative group \(F^\times\) of a field \(F\) is cyclic.

Permutation Groups and Conjugation

Permutation groups are the most important class of finite groups. It is no exaggeration to say that they are the most fundamental, because permutation groups actually contain all finite groups: any finite group is a subgroup of some permutation group! (We will prove this later.)

Structure and Conjugacy of Permutations

The fundamental element of a permutation group is, of course, the permutation itself. For a permutation, its disjoint cycle decomposition best reflects its structure. We know that every permutation has a decomposition into a product of disjoint cycles, and a priori there could be many such decompositions. In fact, however, the cycle decomposition of a permutation is unique up to the order of the cycles. This is quite intuitive and can be proven by induction.

Proposition 1.3. Every \(\alpha\in S_n\) can be written as a product of disjoint cycles \(\beta_1,\dots,\beta_k\): \[\alpha=\beta_1\dots\beta_k.\] Moreover, this decomposition is unique up to the order of the cycles.

Proof. Assume that \(\alpha\) has two disjoint cycle decompositions \[\alpha=(\alpha_{11}\dots\alpha_{1k_1})(\alpha_{21}\dots\alpha_{2k_2})\dots\] \[\alpha=(\beta_{11}\dots\beta_{1l_1})(\beta_{21}\dots\beta_{2l_2})\dots\] If \(\alpha\) moves \(\alpha_{11}\), then \(\alpha_{11}\) will appear in one of the cycles in the other decomposition, let’s say \(\beta_{11}=\alpha_{11}\). By repeatedly applying \(\alpha\) to \(\alpha_{11}\), we get \(\alpha_{11}=\beta_{11}\), \(\alpha_{12}=\beta_{12}\dots\) It is clear that the first cycles in both decompositions must be equal. Multiplying both equations by the inverse of this cycle, we get a decomposition with fewer cycles. Repeating this process, we can easily prove the proposition. ◻

So each permutation has a unique cycle structure, for example \((123)(45)\) and \((234)(15)\) have the same cycle structure, only the numbers are different. We say their cycle structure is \(2\cdot 3\). Similarly, if the cycle decomposition of \(\alpha\) has \(m_k\) cycles of length \(k\), we say its cycle structure is \(2^{m_2}\cdot 3^{m_3}\dots k^{m_k}\).

Having the same cycle structure is like matrices having exactly the same Jordan blocks, and can be achieved by "change of basis", for example \[(123)(45)=\sigma^{-1}(234)(15)\sigma\] where \(\sigma=(14)\). This phenomenon is called conjugation:
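The conjugation computation can be verified directly; the sketch below (my own illustrative encoding) checks that \(\sigma^{-1}(234)(15)\sigma=(123)(45)\) for \(\sigma=(14)\).

```python
# A sketch verifying the "change of basis" example above.
def compose(s, t):
    return {x: s[t[x]] for x in t}

def inverse(p):
    return {v: k for k, v in p.items()}

alpha = {1: 5, 5: 1, 2: 3, 3: 4, 4: 2}   # (234)(15)
sigma = {1: 4, 4: 1, 2: 2, 3: 3, 5: 5}   # (14)

conj = compose(inverse(sigma), compose(alpha, sigma))
# maps 1->2, 2->3, 3->1, 4->5, 5->4, i.e. (123)(45)
print(sorted(conj.items()))
```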

Definition 1.7. For \(a,b\in G\), if there exists \(g\in G\) such that \[a=g^{-1}bg\] we say \(a\) and \(b\) are conjugate in \(G\). It can be verified that conjugation is an equivalence relation. Elements that are conjugate only to themselves are called central elements; the set of all central elements is denoted by \(C(G)\) or \(Z(G)\) and called the center of the group \(G\).

Exercise 1.27. Prove that the number of conjugacy classes of a group \(G\) is equal to \(|G|\) if and only if \(G\) is an Abelian group.

Exercise 1.28. Let \(g\in G\) be a fixed element. Verify that the conjugation map \[f_g(a)=gag^{-1}\] is an automorphism of the group \(G\).

Exercise 1.29 (\(\bigstar\)). Continuing from the previous exercise, we have a mapping from \(G\) to \({\rm Aut}(G)\): \[\begin{aligned} i: G &\to {\rm Aut}(G)\\ g &\mapsto f_g \end{aligned}\] Verify that this is also a group homomorphism. Prove that \(\ker i=Z(G)\), and thus conclude that \(Z(G)\) is a normal subgroup of \(G\).
Let \({\rm Im}i={\rm Inn}(G)\le {\rm Aut}(G)\), and call it the inner automorphism group of \(G\). By the fundamental homomorphism theorem, we have: \[{\rm Inn}(G)\cong G/Z(G).\]

As an example, we prove a useful proposition.

Proposition 1.4. If \(G/Z(G)\) is a cyclic group, then \(G\) is an Abelian group.

Proof. Let \(a\in G\) be an element whose coset \(aZ(G)\) generates the cyclic group \(G/Z(G)\). Then we have a coset decomposition of \(G\) with respect to \(Z(G)\): \[G=\bigcup_{i=0}^{n-1}a^iZ(G)\] Thus every element can be written as \(a^iz\) with \(z\in Z(G)\), and it is obvious that \(a^iz_1a^jz_2=a^jz_2a^iz_1\). ◻

Exercise 1.30 (\(\bigstar\)). Prove that if \({\rm Aut}(G)\) is a cyclic group, then \(G\) is an Abelian group.

Exercise 1.31 (\(\bigstar\)). Let \(|G|=p^2\), where \(p\) is a prime number. Prove that \(G\) is an Abelian group.

Exercise 1.32 (\(\bigstar\bigstar\bigstar\bigstar\)). Let \(\alpha\in{\rm Aut}(G)\), and let \(I=\{g\in G|\alpha(g)=g^{-1}\}\). Prove that if \[|I|>\frac{3}{4}|G|\] then \(G\) is an Abelian group.

So, since conjugation is an equivalence relation on a group, the elements in the group will naturally be divided into some equivalence classes, which are called conjugacy classes. Generally speaking, the situation of conjugacy classes in a group is quite complex, but it is relatively simple for \(S_n\).

Exercise 1.33 (\(\bigstar\)). Find the number of conjugacy classes in \(D_5\).

Exercise 1.34 (\(\bigstar\bigstar\bigstar\)). Let \(p>2\) be a prime number. Find the number of conjugacy classes in \(D_p\).

It is easy to see that if a subgroup \(H\) of a group \(G\) is a normal subgroup, then \(H\) is composed of complete conjugacy classes. Conversely, if some conjugacy classes are put together and happen to form a subgroup, then this subgroup is a normal subgroup.

Exercise 1.35. Think carefully about the above statement and prove that \[\{1,(12)(34),(13)(24),(14)(23)\}\triangleleft S_4.\]

Exercise 1.36 (\(\bigstar\)). Observe that \[\{1,(12)(34)\}\triangleleft V\triangleleft S_4,\] but \(\{1,(12)(34)\}\) is not a normal subgroup of \(S_4\). Explain why normality is not transitive.

Exercise 1.37 (\(\bigstar\)). We call \(H\le G\) a characteristic subgroup of \(G\) if every automorphism \(\alpha\) of \(G\) maps elements of \(H\) to elements of \(H\). Prove that if \(H\le N\triangleleft G\) and \(H\) is a characteristic subgroup of \(N\), then \(H\triangleleft G\). In other words, the characteristic subgroup of a normal subgroup is still a normal subgroup.

Exercise 1.38 (\(\bigstar\)). Let \(p\) be the smallest prime factor of \(|G|\). Prove that if there exists a normal subgroup \(A\triangleleft G\) of order \(p\), then \(A\le Z(G)\).

Parity of Permutations

Permutations can be classified as odd or even, recorded by the sign \({\rm sgn}\,\sigma\): odd permutations have \({\rm sgn}\,\sigma=-1\), even permutations have \({\rm sgn}\,\sigma=1\). The sign of a permutation is not unfamiliar to us, as we have encountered it in linear algebra. It can be defined as \((-1)^{\text{number of inversions}}\), or as the function that satisfies the following equation: \[\prod_{1\le i<j\le n}(X_{\sigma(j)}-X_{\sigma(i)})={\rm sgn}\sigma\prod_{1\le i<j\le n}(X_j-X_i).\] Any permutation can be decomposed into a product of transpositions, since any cycle can: \((12\dots k)=(1k)(1,k-1)\dots(13)(12)\). We can therefore also define the parity of a permutation as the parity of the number of transpositions in such a decomposition; it can be proven that this parity does not depend on the chosen decomposition. According to this definition, a transposition such as \((12)\) is an odd permutation, while cycles of odd length, such as \((123)\) and \((12345)\), are even permutations.

We will give yet another definition, and explain how it is derived. We know that every permutation \(\sigma \in S_n\) has a unique disjoint cycle decomposition. If we count each fixed element \(c\) as a cycle \((c)\) of length \(1\), we can let \(t\) denote the total number of cycles in the disjoint cycle decomposition, including the cycles of length \(1\). Then we define \[{\rm sgn}\sigma = (-1)^{n-t}.\] First, for the identity permutation \(t=n\), so \({\rm sgn}1 = 1\). For the cycle \((12\dots k)\) we have \(t=n-k+1\), so \({\rm sgn}(12\dots k) = (-1)^{k-1}\). In fact, \(n-t\) is the sum over all cycles of (length minus \(1\)), so its parity depends on the number of even-length cycles (each of which is an odd permutation) in the decomposition. The tricky part is proving that \({\rm sgn}\) is a homomorphism. Proving this directly is not easy, so we take a slightly simpler route: we prove that for any transposition \(\tau\), we have \({\rm sgn}(\tau \sigma) = {\rm sgn}(\tau) {\rm sgn}(\sigma) = -{\rm sgn}(\sigma)\).
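As a sanity check, the two descriptions of \({\rm sgn}\) agree; the sketch below (an illustration using 0-based tuples, not the book's notation) compares \((-1)^{\text{inversions}}\) with \((-1)^{n-t}\) over all of \(S_4\).

```python
# A sketch comparing the two definitions of sgn on all of S_4:
# (-1)^(number of inversions) versus (-1)^(n - t) from the cycle count.
from itertools import permutations

def sgn_inversions(p):          # p is a tuple: p[i] is the image of i
    n = len(p)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return (-1) ** inv

def sgn_cycles(p):
    n, seen, t = len(p), set(), 0
    for i in range(n):
        if i not in seen:       # walk one cycle, counting it once
            t += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return (-1) ** (n - t)

print(all(sgn_inversions(p) == sgn_cycles(p)
          for p in permutations(range(4))))   # True
```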

Lemma 1.1. For any transposition \(\tau\), we have \({\rm sgn}(\tau \sigma) = {\rm sgn}(\tau) {\rm sgn}(\sigma) = -{\rm sgn}(\sigma)\).

Proof. Let \(\tau=(ab)\). If neither \(a\) nor \(b\) appears in the cycles of \(\sigma\) (that is, both are fixed by \(\sigma\)), the cycle decomposition of \(\tau\sigma\) simply gains the cycle \((ab)\), and the sign changes. If exactly one of them appears, say \(a\) in a cycle \((ac..d)\), it is easy to calculate that \[(ab)(ac..d)=(ac..db)\] The length of \((ac..db)\) increases by \(1\) relative to \((ac..d)\), so \(n-t\) increases by \(1\) and the sign changes. If both \(a\) and \(b\) appear in the same cycle, we have \[(ab)(ac..dbe..f)=(ac..d)(be..f)\] (where the segments \(c..d\) and \(e..f\) may be empty): one cycle splits into two, \(n-t\) decreases by \(1\), and the sign changes. Finally, if \(a\) and \(b\) appear in two different cycles, multiplying the previous equation by \((ab)\) gives \[(ab)(ac..d)(be..f)=(ac..dbe..f)\] the two cycles merge into one, \(n-t\) increases by \(1\), and the sign changes again. Therefore, we immediately obtain the proposition. ◻

Theorem 1.7. \({\rm sgn}:S_n\to \{1,-1\}\cong\mathbb{Z}_2\) is a homomorphism.

Proof. Let \(\alpha=\tau_1\tau_2\dots\tau_k\) be a decomposition of \(\alpha\) into transpositions. Then \[\begin{aligned} {\rm sgn}(\alpha\beta)&={\rm sgn}(\tau_1\tau_2\dots\tau_k\beta)\\ &={\rm sgn}(\tau_1){\rm sgn}(\tau_2\dots\tau_k\beta)\\ &\dots\\ &={\rm sgn}(\tau_1)\dots{\rm sgn}(\tau_{k-1}){\rm sgn}(\tau_k){\rm sgn}(\beta)\\ &={\rm sgn}(\tau_1)\dots{\rm sgn}(\tau_{k-1}\tau_k){\rm sgn}(\beta)\\ &\dots\\ &={\rm sgn}(\tau_1\tau_2\dots\tau_k){\rm sgn}(\beta)\\ &={\rm sgn}(\alpha){\rm sgn}(\beta). \end{aligned}\] ◻

The kernel of the homomorphism \({\rm sgn}\) is denoted \(A_n\) and called the alternating group on \(n\) elements; it is the group of all even permutations. As a corollary, \(A_n\triangleleft S_n\).

Exercise 1.39. Determine the parity of the permutation \((132)(421)(12345)\) and the permutation \[\left( \begin{array}{ccccc} 1 & 2 & \dots & n-1 & n \\ n & n-1 & \dots & 2 & 1 \\ \end{array} \right)\]


Exercise 1.40. Prove that the order of a permutation is equal to the least common multiple of the lengths of all the cycles in its disjoint cycle decomposition.

Exercise 1.41 (\(\bigstar\)). Classify all elements in \(S_4\) into conjugacy classes and find all its normal subgroups.

Exercise 1.42 (\(\bigstar\bigstar\)). Prove that \(A_{2n}\) has a subgroup isomorphic to \(S_n\).

Direct Product

We can construct new groups using old groups, and one very simple construction is the direct product. Let \(G_1\) and \(G_2\) be groups, and we want to give the Cartesian product \(G_1\times G_2\) the structure of a group. The simplest way to do this is to define the multiplication as component-wise multiplication, that is, \[(g_1,g_2)(h_1,h_2):=(g_1h_1,g_2h_2).\] This way, the identity element in the group is \((1,1)\), and the inverse of \((g,h)\) is \((g^{-1},h^{-1})\). We will denote this group as \(G_1\times G_2\). If \(G_1\) and \(G_2\) are both abelian groups, this construction is sometimes denoted as the direct sum \(G_1\oplus G_2\), with the group operation denoted by addition.

Exercise 1.43. Investigate the group \(\mathbb{Z}_2\times\mathbb{Z}_2\) (also denoted as \(\mathbb{Z}_2\oplus\mathbb{Z}_2\)) and prove that it is isomorphic to \[V=\{1,(12)(34),(13)(24),(14)(23)\}.\]

Exercise 1.44 (\(\bigstar\bigstar\)). Prove that \[D_{6}\cong D_3\times \mathbb{Z}_2.\] In fact, for any odd number \(n\), we have \[D_{2n}\cong D_n\times \mathbb{Z}_2.\]

Exercise 1.45 (\(\bigstar\)). Prove that when \(m,n\) are coprime, we have \[\mathbb{Z}_m\times\mathbb{Z}_n\cong\mathbb{Z}_{mn}\]
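A numerical illustration of the phenomenon behind Exercise 1.45 (not a proof): in \(\mathbb{Z}_m\times\mathbb{Z}_n\) the element \((1,1)\) has order \({\rm lcm}(m,n)\), so the product is cyclic exactly when \(\gcd(m,n)=1\).

```python
# A sketch computing the order of (1,1) in Z_m x Z_n by brute force.
from math import gcd, lcm

def order_of_one_one(m, n):
    # repeatedly add (1,1) until reaching the identity (0,0)
    x, k = (1 % m, 1 % n), 1
    while x != (0, 0):
        x = ((x[0] + 1) % m, (x[1] + 1) % n)
        k += 1
    return k

for m, n in [(2, 3), (3, 5), (4, 6)]:
    print(m, n, order_of_one_one(m, n) == lcm(m, n), gcd(m, n) == 1)
# the order equals m*n exactly in the coprime cases, where (1,1) generates
```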

Group Actions

Basic Terminology

A group is essentially the abstract structure of a symmetry group/transformation group. In this chapter, we will study group actions, that is, ways of realizing the abstract elements of a group as concrete transformations. For example, how can a cyclic group, as a structure of symmetry, act on various objects? We know that it can be realized as the rotations of a regular polygon, which can be expressed by a group homomorphism: \[\mathbb{Z}_n\to\{\text{rotation group of a regular $n$-gon}\}\] where the generator of \(\mathbb{Z}_n\) is mapped to a unit rotation. This action can also be described as follows: interpreting a rotation of the regular \(n\)-gon as a permutation of the set of vertices \(\{1,2,\dots,n\}\), we get a (mono)morphism \[\mathbb{Z}_n\to S_n\] where the generator of \(\mathbb{Z}_n\) is mapped to the cycle \((123\dots n)\).

To study the most general actions of a group, we consider homomorphisms from the group \(G\) to the largest possible classes of transformation groups. The two main classes of transformation groups usually considered are permutation groups and matrix groups. The study of homomorphisms to matrix groups is known as group representation theory. When studying the action of a group on a set, we focus on homomorphisms to permutation groups. We denote by \({\rm Sym}(X)\) the set of all permutations of a set \(X\), which consists of all bijective mappings \(X\to X\) and is called the symmetric group of \(X\). Then, we define an action of a group \(G\) on a set \(X\) to be a homomorphism (not necessarily injective or surjective): \[\alpha:G\to{\rm Sym}(X).\] This is also known as a permutation representation of the group \(G\), or simply a representation. In other words, \(\alpha(g)\) becomes a transformation (permutation) of the set \(X\), which is bijective, i.e. \(\alpha(g):X\to X\). This transformation can act on an element \(x\in X\) to give \(\alpha(g)(x)\). This notation is rigorous but too long: once the homomorphism \(\alpha\) is fixed, we can imagine \(G\) as a set of "operators" and simply write \(gx\) for \(\alpha(g)(x)\).

Let us first give some basic terminology for group actions. If \(\alpha\) is a monomorphism, the action is called faithful, meaning that no two different elements of the group give the same transformation. For \(x\in X\), the set \[Gx=\{gx|g\in G\}\] is called the orbit of \(x\). Define \(a\sim b\) if they belong to the same orbit, i.e. there exists \(g\) such that \(a=gb\); it is easily shown that this is an equivalence relation, a fact closely related to \(G\) being a group. As a result, \(X\) is partitioned into disjoint classes, i.e. a union of disjoint orbits. If \(X\) has only one orbit, we say the action is transitive, meaning that any \(x\) can be mapped to any given \(y\) by some element of \(G\).

Orbit Formula

Definition 2.1. Given a group \(G\) acting on a set \(X\), we define \[{\rm Stab}(x)=\{g\in G|gx=x\}\] to be the subgroup of \(G\) that keeps \(x\) fixed. This is called the stabilizer of \(x\).

Exercise 2.1 (\(\bigstar\)). There is a stronger concept than transitivity, called double transitivity: for any pairs \((x,y)\) and \((z,w)\) with \(x\neq y\) and \(z\neq w\), there exists \(g\in G\) such that \[(gx,gy)=(z,w).\] Prove: \(G\) is doubly transitive on \(X\) if and only if for all \(x\in X\), \({\rm Stab}(x)\) is transitive on \(X\setminus\{x\}\).

Since \({\rm Stab}(x)\) is a subgroup, we can form the coset decomposition \[G=\bigcup_{i}g_i{\rm Stab}(x)\] where all elements of a coset act on \(x\) in the same way, sending it to \(g_ix\). Moreover, different cosets act differently on \(x\), since \(ax=bx\iff a^{-1}b\in{\rm Stab}(x)\iff a{\rm Stab}(x)=b{\rm Stab}(x)\). We immediately obtain the following simple but fundamental orbit formula (for computing the length of an orbit)

Theorem 2.1 (Orbit formula). \[|Gx|=[G:{\rm Stab}(x)]\]

Since \(X\) is partitioned into orbits, denote by \(x_i\) a representative of each orbit. We then clearly have \[|X|=\sum|\text{orbit}|=\sum_{i}[G:{\rm Stab}(x_i)].\] Do not underestimate this orbit formula: with it, we have a method to calculate the orders of some symmetry groups, because \(|G|=|Gx||{\rm Stab}(x)|\).

Example 2.1. We calculate \(|D_n|\). Take a vertex \(x\) of a regular \(n\)-gon, since the action is transitive, \(|Gx|=n\). And there are only two transformations in \(D_n\) that keep \(x\) unchanged, the identity transformation and a reflection (axisymmetric) transformation. Therefore, \[|D_n|=|Gx||{\rm Stab}(x)|=2n.\]
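As a quick illustration, here is a small sketch (not from the original text; helper names like `generate` are ad hoc) that builds \(D_4\) as a set of vertex permutations and verifies \(|G|=|Gx||{\rm Stab}(x)|\) for a vertex \(x\).

```python
from itertools import product

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    """Closure of a set of permutations under composition."""
    group = set(gens)
    while True:
        new = {compose(p, q) for p, q in product(group, group)} - group
        if not new:
            return group
        group |= new

n = 4
r = tuple((i + 1) % n for i in range(n))   # rotation by one step
s = tuple((n - i) % n for i in range(n))   # a reflection
D4 = generate({r, s})

x = 0
orbit = {g[x] for g in D4}
stab = [g for g in D4 if g[x] == x]
print(len(D4), len(orbit), len(stab))      # 8 4 2
assert len(D4) == len(orbit) * len(stab)
```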

Exercise 2.2 (\(\bigstar\)). Calculate the order of the symmetry group of a regular tetrahedron.

Example 2.2. Let \(H,K\le G\) and \(a\in G\). We calculate the number of elements in the double coset \[HaK=\{hak|h\in H, k\in K\}.\] Consider the following action of \(H\times K\) on \(G\): \[\begin{aligned} \rho: H\times K &\to {\rm Sym}(G)\\ (h,k) &\mapsto (g\mapsto hgk^{-1}) \end{aligned}\] It is easy to see that \(HaK\) is the orbit of \(a\) under this action. Therefore, by the orbit formula, \[|HaK|=[H\times K: {\rm Stab}(a)].\] Now \((h,k)\in {\rm Stab}(a)\Leftrightarrow hak^{-1} = a\Leftrightarrow h=aka^{-1}\in H\cap aKa^{-1}\), so \(|{\rm Stab}(a)|=|H\cap aKa^{-1}|\). Hence, \[|HaK|=\frac{|H||K|}{|H\cap aKa^{-1}|}.\]

In addition, we can immediately obtain the following formula for calculating the number of orbits. For any subset \(S\subset G\) of \(G\), we introduce the notation \[X^S=\{\text{elements of $X$ fixed by all elements of $S$}\}.\] Then we have

Theorem 2.2 (Burnside). The number of orbits \(N\) is given by the formula \[N=\frac{1}{|G|}\sum_{g\in G}|X^g|.\]

Proof. Each orbit contributes exactly \(1\) to the sum \(\sum_{x\in X}\frac{1}{|Gx|}\) (its \(|Gx|\) elements each contribute \(\frac{1}{|Gx|}\)), so this sum equals \(N\). Hence \[\begin{aligned} N&=\sum_{x\in X}\frac{1}{|Gx|}\\ &=\sum_{x\in X}\frac{|{\rm Stab}(x)|}{|G|}\\ &=\frac{1}{|G|}\sum_{x\in X}\sum_{g\in G}1_{gx=x}\\ &=\frac{1}{|G|}\sum_{g\in G}\sum_{x\in X}1_{gx=x}=\frac{1}{|G|}\sum_{g\in G}|X^g|. \end{aligned}\] ◻
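Burnside's formula is easy to test by brute force. The sketch below (again plain Python, reusing `compose`, `generate`, and `D4` from the earlier sketch) counts the orbits of \(D_4\) acting on \(2\)-colorings of the square's vertices both directly and via fixed points.

```python
from itertools import product

def act(g, coloring):
    """Relabel a coloring by the permutation g (enough for orbit counting)."""
    return tuple(coloring[g[i]] for i in range(len(g)))

X = list(product([0, 1], repeat=4))        # 2 colors on 4 vertices

remaining, orbits = set(X), 0              # brute-force orbit count
while remaining:
    x = remaining.pop()
    remaining -= {act(g, x) for g in D4}
    orbits += 1

fixed = sum(sum(1 for x in X if act(g, x) == x) for g in D4)
assert orbits == fixed // len(D4)
print(orbits)                               # 6
```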

Exercise 2.3 (\(\bigstar\)). If \(G\) acts transitively on \(X\) and \(|X|>1\), then there exists \(g\in G\) without a fixed point.

Exercise 2.4 (\(\bigstar\)). Let \(G\) be a group of order \(p^k\). Prove that if \(G\) acts on a set \(X\), then \[\left|X^G\right|\equiv |X|\mod p.\]

Kernel of Representation

We will find \(\ker\alpha\), which is generally important, because by the first isomorphism theorem, \(G/\ker\alpha\) is isomorphic to a subgroup of the permutation group \({\rm Sym}(X)\).

What is \(\ker\alpha\)? It is the set of all elements \(g\) that fix every element of \(X\), so \[\ker\alpha=\bigcap_{x\in X}{\rm Stab}(x).\] If the action is transitive, then for any fixed \(x\in X\), \[\ker\alpha=\bigcap_{g\in G}{\rm Stab}(gx).\] There is a natural relationship between \({\rm Stab}(x)\) and \({\rm Stab}(gx)\): a transformation fixing \(x\) can be modified slightly to fix \(gx\), by first composing with \(g^{-1}\), applying an element of \({\rm Stab}(x)\), and then composing with \(g\). In other words, \[g{\rm Stab}(x)g^{-1}\subset{\rm Stab}(gx).\] Similarly, we have \[g^{-1}{\rm Stab}(gx)g\subset{\rm Stab}(x).\] Therefore, \[{\rm Stab}(gx)=g{\rm Stab}(x)g^{-1}.\] Thus, when the action is transitive, we have \[\ker\alpha=\bigcap_{g\in G}g{\rm Stab}(x)g^{-1},\] which is a normal subgroup. This fact also suggests the following easy proposition: for \(H\le G\), we have \[\bigcap_{g\in G} gHg^{-1}\triangleleft G.\]

Common Actions

Groups can not only act on other objects, but they can also naturally act on certain structures within themselves. In the study of group theory, this is an extremely important, intuitive, and powerful method. We will give a few simple examples.

Conjugation Action

The group \(G\) acts on itself by conjugation, i.e. \[\begin{aligned} \alpha: G &\to {\rm Sym}(G)\\ x &\mapsto (g\mapsto xgx^{-1}) \end{aligned}\]

Exercise 2.5 (\(\bigstar\)). Can we define the conjugation action as \[\begin{aligned} \alpha: G &\to {\rm Sym}(G)\\ x &\mapsto (g\mapsto x^{-1}gx) \end{aligned}\] Why is this definition not valid?

The orbits of this action are the conjugacy classes, and it is easy to see that the kernel of the action is \(\ker\alpha=Z(G)\), the center of \(G\). For \(g\in G\), its stabilizer \({\rm Stab}(g)\) is denoted \(C_G(g)=\{h\in G|gh=hg\}\) and called the centralizer of \(g\); it is a subgroup of \(G\). The orbit equation in this case, \[|G|=\sum_{i}[G:C_G(g_i)]\] (where the \(g_i\) are representatives of the conjugacy classes), is called the class equation of the group \(G\). It is worth noting that an element \(g\in G\) has orbit length \(1\) exactly when it is conjugate only to itself, i.e. when it is a central element of \(Z(G)\). If \(|G|=p^n\), where \(p\) is a prime number, then \[p^n=|Z(G)|+\sum_{i,g_i\not\in Z(G)}[G:C_G(g_i)]\] Each index \([G:C_G(g_i)]\) with \(g_i\not\in Z(G)\) is greater than \(1\) and divides \(p^n\), hence is a multiple of \(p\). Therefore \(|Z(G)|\) must be a multiple of \(p\), which means \(|Z(G)|>1\). In other words, we have the following conclusion:

Proposition 2.1. If \(|G|=p^n\) with \(n\ge 1\), then \(G\) has a non-trivial center \(Z(G)\).

Exercise 2.6 (\(\bigstar\)). If we denote the number of conjugacy classes of a group \(G\) by \(c(G)\), prove that \[c(G\times H)=c(G)\cdot c(H).\]

Exercise 2.7 (\(\bigstar\bigstar\bigstar\)). With the notation as above, suppose \(G\) is a non-Abelian group. Prove that \[c(G)\le \frac{5}{8}|G|\]

Exercise 2.8 (\(\bigstar\bigstar\bigstar\bigstar\bigstar\)). Can the equality hold? Furthermore, prove that if \(p\) is the smallest prime factor of \(|G|\), then \[c(G)\le\frac{p^2+p-1}{p^3}|G|\]

Regular Representation (Action)

The group \(G\) acts on itself by left multiplication, i.e. \[\begin{aligned} \alpha: G &\to {\rm Sym}(G)\\ g &\mapsto (a\mapsto ga) \end{aligned}\] This naturally views each \(g\) as a permutation \(\alpha(g)\) of the underlying set of \(G\), and gives a faithful representation.

Exercise 2.9. Prove that this is a faithful and transitive action.

We immediately obtain Cayley’s theorem: every group is isomorphic to a subgroup of some permutation group. This may seem extremely trivial, but it is precisely this naive observation that allows us to obtain non-trivial conclusions. We give an example.

Proposition 2.2. Let \(G\) be a group of order \(4k+2\). Then it must have a normal subgroup of index \(2\).

Proof. Consider the left regular action of the group \[\alpha:G\to{\rm Sym}(G)\] Since this is a faithful representation, we can identify \(G\) with \(\alpha(G)\). Each \(\alpha(g)\) with \(g\neq 1\) is a permutation of \(G\) without fixed points, because \(ga=a\iff g=1\). Therefore, \(\alpha(g)\) is a product of disjoint cycles of length greater than \(1\). Now take \(g\) to be an element of order \(2\), which exists since the group has even order. Then \(\alpha(g)\) decomposes into a product of \(|G|/2=2k+1\) disjoint transpositions, and is therefore an odd permutation. Thus \(\alpha(G)\) contains an odd permutation, and the even permutations in \(\alpha(G)\) form a normal subgroup of index \(2\). ◻
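The parity argument is easy to see concretely. A small sketch (not from the original text) for \(G=\mathbb{Z}_6\), i.e. \(k=1\): left translation by the unique element of order \(2\) decomposes into \(3\) disjoint transpositions, an odd permutation.

```python
n, g = 6, 3                             # Z_6 and its element of order 2
perm = [(g + a) % n for a in range(n)]  # left translation a -> g + a

seen, cycles = set(), []                # disjoint cycle decomposition
for a in range(n):
    if a not in seen:
        cycle, b = [], a
        while b not in seen:
            seen.add(b)
            cycle.append(b)
            b = perm[b]
        cycles.append(cycle)
print(cycles)                           # [[0, 3], [1, 4], [2, 5]]
```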

Corollary 2.1. Let \(k\ge1\), then a group of order \(4k+2\) is not a simple group.

Induced Representation (Action on Cosets)

Given a subgroup \(H\) of a group \(G\), there is a (left) coset set \(G/H\), and \(G\) naturally acts on \(G/H\): \[\begin{aligned} \alpha: G &\to {\rm Sym}(G/H)\\ g &\mapsto (aH\mapsto gaH) \end{aligned}\] This is called the (left) induced representation. It is obviously transitive. Its kernel is \[\ker\alpha=\bigcap_{g\in G}gHg^{-1},\] and note that \(\ker\alpha\le H\) and \(\ker\alpha\triangleleft G\); in fact, it is the largest normal subgroup of \(G\) contained in \(H\). Using this representation, we can obtain many powerful (actually simple) conclusions.

Example 2.3. Let \(H\) be a proper subgroup of an infinite group \(G\) with finite index, then \(G\) must have a proper normal subgroup with finite index.

Proof. Consider the induced representation of \(G\) on the cosets \(G/H\): \[\rho:G\to{\rm Sym}(G/H)\] Its kernel \(\ker\rho\) is a normal subgroup of finite index, since \([G:\ker\rho]\le|{\rm Sym}(G/H)|=[G:H]!\) is finite. Moreover \(\ker\rho\le H\), so it is a proper subgroup, which completes the proof. ◻

Here is another typical application:

Proposition 2.3. Let \(p\) be the smallest prime factor of \(|G|\). If \(H\le G\) and \([G:H]=p\), then \(H\triangleleft G\).

Proof. Consider the induced representation \(\alpha: G \to {\rm Sym}(G/H)\) on the coset set \(G/H\). Since \(G/\ker\alpha\) is isomorphic to a subgroup of \({\rm Sym}(G/H)\), the order \(|G/\ker\alpha|\) divides \([G:H]!=p!\). Now, since \(\ker\alpha \le H\), we have \[|G/\ker\alpha| = \frac{|G|}{|H|}[H:\ker\alpha] = p[H:\ker\alpha].\] Every prime factor of \([H:\ker\alpha]\) divides \(|G|\) and is therefore \(\ge p\). If \([H:\ker\alpha]\neq 1\), then \(p[H:\ker\alpha]\) is divisible either by \(p^2\) or by \(pq\) for some prime \(q>p\); neither divides \(p!\), a contradiction. Therefore \([H:\ker\alpha]=1\) and \[H = \ker\alpha \triangleleft G.\] ◻

Exercise 2.10 (\(\bigstar\bigstar\)). Let \(G\) be a non-Abelian simple group and \(H\le G\) a proper subgroup. Prove that \([G:H]> 4\).

Theorem (Cauchy). Let \(p\) be a prime factor of \(|G|\). Then \(G\) contains an element of order \(p\).

Before looking at the proof, let's understand the idea. The case \(p=2\) was already done as an exercise; do you remember the method? Presumably you paired the elements of \(G\), pairing \(g\) with \(g^{-1}\); an element of order \(2\) satisfies \(g=g^{-1}\), so it cannot be paired with a different element. Since the order of the group is even and the paired elements come in twos, the number of elements satisfying \(x^2=1\) is also even. Since \(1^2=1\), there must be an element of order \(2\).

However, at first glance, this proof does not seem to be generalizable to the case of \(p>2\). But if you restate the above proof from the perspective of symmetry, you can immediately see how to generalize it. We want to understand the phenomenon mentioned in the proof as a kind of symmetry, so there must be a symmetry transformation. What is the phenomenon we observe? It is the symmetry transformation \((x,y)\mapsto (y,x)\) on the set of all ordered pairs \(X=\{(g,g^{-1})|g\in G\}\). This is the action of the group \(\mathbb{Z}_2\) on this set. So \(X\) can be decomposed into a union of orbits. And the length of the orbit under this action is \(1\) if and only if \(g=g^{-1}\), that is, \(g^2=1\). From the orbit equation \[|X|=|\{(g,g)|g^2=1\}|+\sum|\text{other orbits of length $2$}|\] the orbit formula shows that the length of the orbit can only be \(1\) or \(2\), and we immediately conclude that \(|\{g|g^2=1\}|\) is a multiple of \(2\). Now, I think it is obvious how to generalize this proof.

Proof. Let \(X=\{(g_1,g_2,\dots,g_p)|g_1g_2\dots g_p=1\}\), and let \(a\) be a generator of the group \(\mathbb{Z}_p\). Consider the following action of the group \(\mathbb{Z}_p\) on \(X\): \[\begin{aligned} \alpha: \mathbb{Z}_p &\to {\rm Sym}(X)\\ a &\mapsto f \end{aligned}\] Where \(f:X\to X\) is a mapping that maps \((g_1,g_2,\dots,g_p)\) to \((g_2,g_3,\dots,g_p,g_1)\).

We need to verify that \(f\) indeed maps \(X\) to \(X\), i.e. that \(g_2g_3\dots g_pg_1=1\); this follows from \(ab=1\Leftrightarrow ba=1\). Now, from the orbit equation \[|G|^{p-1}=|X|=|\{g\in G|g^p=1\}|+\sum|\text{other orbits of length $p$}|\] we conclude that \(|\{g\in G|g^p=1\}|\) is a multiple of \(p\). Since \(1^p=1\), there exists an element of order \(p\) in \(G\), and in fact there are at least \(p-1\) of them. More precisely, we obtain the stronger conclusion that the number of elements of order \(p\), plus \(1\), is a multiple of \(p\). ◻
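A brute-force check of this counting argument (not from the original text; plain Python) for \(G=S_3\) and \(p=3\): among all triples \((g_1,g_2,g_3)\) with \(g_1g_2g_3=1\) there are \(|G|^2\) in total, and the number of solutions of \(g^3=1\) is a multiple of \(3\).

```python
from itertools import permutations, product

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

G = list(permutations(range(3)))            # S_3 as tuples
e = (0, 1, 2)

X = [(a, b, c) for a, b, c in product(G, repeat=3)
     if compose(compose(a, b), c) == e]
assert len(X) == len(G) ** 2                # |X| = |G|^{p-1}

cube_roots = [g for g in G if compose(compose(g, g), g) == e]
print(len(cube_roots))                      # 3: identity and the two 3-cycles
assert len(cube_roots) % 3 == 0
```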

Exercise 2.11 (\(\bigstar\bigstar\)). Let \(|G|=mp,1<m<p,\) where \(p\) is a prime number. Prove that \(\mathbb{Z}_p\triangleleft G\).

Exercise 2.12 (\(\bigstar\)). Let \(G\) be an Abelian group of order \(6\). Prove that \(G\cong \mathbb{Z}_6\) using the following hints. (By Cauchy’s theorem, \(G\) has elements of order \(2\) and \(3\). Hence, \(G\cong\mathbb{Z}_2\times\mathbb{Z}_3\cong\mathbb{Z}_6\).)

Exercise 2.13 (\(\bigstar\bigstar\)). Let \(G\) be a non-Abelian group of order \(6\). Prove that \(G\cong S_3\) using the following hints. Hence, there are only two groups of order \(6\): \(\mathbb{Z}_6\) and \(S_3\).

  1. First prove that \(Z(G)=1\).

  2. Prove that \(G\) has exactly \(3\) elements of order \(2\).

  3. Consider the conjugation action of \(G\) on the set \(X=\{a,b,c\}\), where \(a,b,c\) are all elements of order \(2\). (You need to prove that this is an action on \(X\) first.)

  4. Prove that \(\{a,b,c\}\) generates \(G\), and hence the above action is faithful. The first isomorphism theorem gives \(G\cong S_3\).

Conjugation on Subgroups

Let \(A\le G\) be a subgroup. It is easy to see that \(g^{-1}Ag\) is also a subgroup: the mapping \(x\mapsto g^{-1}xg\) is an automorphism of \(G\), and restricted to a subgroup it remains an isomorphism, so its image is a group. We say that subgroups \(A\) and \(B\) are conjugate if \(A=g^{-1}Bg\) for some \(g\in G\). Thus \(G\) acts on the set of its subgroups by conjugation. A subgroup can have many conjugate subgroups, and the normal subgroups of \(G\) are exactly those invariant under the conjugation action, i.e. those conjugate only to themselves; they form conjugacy classes of size one.

For \(S\subset G\), we use the notation \(N_G(S)\) to denote \(\{g\in G|gS=Sg\}\), which is called the normalizer of \(S\). It is easy to see that the stabilizer of \(A\le G\) is \(N_G(A)\), and thus the size of the conjugacy class is \[[G:N_G(A)].\]

Exercise 2.14 (\(\bigstar\)). Let \(|G|=p^n\). Prove that the number of non-normal subgroups of \(G\) is a multiple of \(p\).

Applications of Groups in Counting Problems

How many ways are there to paint two faces of a cube red? At first glance, it seems like there are \[\binom{6}{2}=\frac{6\times 5}{2}=15\] ways to do so. However, many of these ways are actually the same, differing only by a rotation. In other words, when painting a symmetric object, we only need to consider its symmetry group and let this group act on all possible sets of colors. To determine if two ways of painting are the same, we just need to see if they are in the same orbit. Therefore, to find the number of essentially different ways to paint, we just need to count the number of orbits! Burnside’s theorem provides a method for doing so.

Let’s use a simple example to illustrate this method. Consider the following counting problem: how many different ways are there to color a circular necklace of length \(n\), using \(c\) different colors? Two colorings are considered the same if they are equivalent under rotation and reflection (axis symmetry). We can consider the space of group actions \[X=\{(a_1,a_2,\dots,a_n)|a_i\text{ is one of $c$ colors}\}\] and let \(D_n\) act on this space in the following way:

  1. \(\sigma (a_1,a_2,\dots,a_{n-1},a_n) = (a_2,a_3,\dots,a_n,a_1)\)

  2. \(\tau (a_1,a_2,\dots, a_{n-1},a_n) = (a_n,a_{n-1},\dots, a_2,a_1)\).

It is easy to see that since all elements of \(D_n\) can be written in the form \(\sigma^i\tau^j\), it suffices to define the actions of \(\sigma\) and \(\tau\) on \(X\). It can easily be verified that the above definition indeed gives an action of \(D_n\) on \(X\) (one must verify that \(\sigma^n=\tau^2=(\sigma\tau)^2=1\) act trivially). Therefore, our problem is now equivalent to computing the number of orbits in \(X\), denoted \(N\). By Burnside's theorem, \[N=\frac{1}{|D_n|}\sum_{g\in D_n} |X^g|.\] Now consider the size of \(X^g\). First, \(D_n\) can be viewed as a subgroup of \(S_n\), acting on \(X\) by permuting the indices. Therefore, we may take the disjoint cycle decomposition of \(g\in D_n\) in \(S_n\): \[g=c_1c_2\dots c_k=(c_{11} c_{12} \dots c_{1l_1})(c_{21} c_{22} \dots c_{2l_2})\dots\] where \(c_i=(c_{i1} c_{i2} \dots c_{il_i})\) are disjoint cycles of lengths \(l_i\). For example, \(\sigma = (123\dots n)\), \(\tau = (1,n) (2,n-1)\dots\).

If \(g\) fixes an \(x\in X\), i.e. \(gx=x\), we obtain \[x=(a_1,a_2,\dots,a_n)= (a_{g(1)},a_{g(2)},\dots, a_{g(n)})\] i.e. \(a_i = a_{g(i)}\). From \(gx=x\) it follows that \(g^kx=x\), thus \(a_{c_{11}}=a_{c_{12}}=\dots=a_{c_{1l_1}}\). This means that all the positions appearing in the cycle \(c_1\) must receive the same color. The same holds for every cycle, thus \[|X^g|=c^{\text{number of cycles of }g},\] where fixed points are counted as cycles of length \(1\).

Note that elements in the same conjugacy class of \(D_n\) have the same cycle type. In general, it suffices to find all conjugacy classes of \(D_n\) and compute their cycle decompositions. In this example, we can write \[D_n=\{1,\sigma,\sigma^2,\dots,\sigma^{n-1},\tau,\tau\sigma,\dots,\tau\sigma^{n-1}\}\] Note that \(\sigma^{j}(\tau\sigma^{i})\sigma^{-j} =\sigma^j\tau\sigma^{i-j} = \tau \sigma^{i-2j}\), thus when \(n\) is odd, all \(\tau\sigma^i\) are conjugate, and like \(\tau\), they have \(\frac{n-1}{2}+1=\frac{n+1}{2}\) cycles. When \(n\) is even, the reflections split into two conjugacy classes \(\tau\sigma^{2k+1}\) and \(\tau\sigma^{2k}\), which have \(\frac{n}{2}+1\) and \(\frac{n}{2}\) cycles, respectively, just like \(\tau\sigma\) and \(\tau\).

For elements in \(\mathbb{Z}_n\le D_n\), \(\sigma^k\) is a cycle of length \(n\) when \(k\) and \(n\) are coprime. When their greatest common divisor \((k,n)=d\), it is a product of \(d\) cycles of length \(\frac{n}{d}\). Recall that in the section on cyclic groups, we mentioned that for factors of \(n\), the number of generators of the subgroup of order \(\frac{n}{d}\), \(d\mathbb{Z}_n\le \mathbb{Z}_n\), which are elements \(\sigma^k\) satisfying \((k,n)=d\), is equal to the Euler’s totient function \(\varphi\left(\frac{n}{d}\right)\). Therefore, we can now derive the formula for \(N\): \[\begin{aligned} N&=\frac{1}{|D_n|}\sum_{g\in D_n}|X^g|\\ &=\frac{1}{2n}\left(\sum_{k=1}^{n}c^{(k,n)}+1_{2|n}\frac{n}{2}c^{\frac{n}{2}}+1_{2|n}\frac{n}{2}c^{\frac{n}{2}+1}+1_{2|n+1}nc^{\frac{n+1}{2}}\right)\\ &=\frac{1}{2n}\sum_{d|n}\varphi\left(\frac{n}{d}\right)c^{d}+1_{2|n}\frac{c^{\frac{n}{2}}(c+1)}{4}+1_{2|n+1}\frac{c^{\frac{n+1}{2}}}{2}. \end{aligned}\]
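The formula is easy to sanity-check by computer. Below is a sketch (not from the original text; plain Python, helper names ad hoc) comparing the closed formula with brute-force orbit counting for small \(n\) and \(c\).

```python
from itertools import product
from math import gcd

def phi(m):                                   # Euler's totient
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def necklaces_formula(n, c):
    rot = sum(phi(n // d) * c ** d for d in range(1, n + 1) if n % d == 0)
    if n % 2 == 0:
        refl = (n // 2) * c ** (n // 2) + (n // 2) * c ** (n // 2 + 1)
    else:
        refl = n * c ** ((n + 1) // 2)
    return (rot + refl) // (2 * n)

def necklaces_bruteforce(n, c):
    X, orbits = set(product(range(c), repeat=n)), 0
    while X:
        x = X.pop()
        for i in range(n):                    # remove the whole dihedral orbit
            y = x[i:] + x[:i]
            X.discard(y)
            X.discard(y[::-1])
        orbits += 1
    return orbits

for n in range(3, 8):
    for c in range(1, 4):
        assert necklaces_formula(n, c) == necklaces_bruteforce(n, c)
print("formula agrees with brute force")
```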

Exercise 2.15 (\(\bigstar\bigstar\)). Assuming that on a necklace of length \(2n\), only two colors are allowed and each color must use exactly \(n\) beads. How many different coloring methods are there?

Sylow’s Theorem

Sylow’s theorem may be the most important theorem in elementary group theory, it is also a wonderful application of group actions. We will first introduce the theorem, and then give a proof using group actions.

Theorem 2.3 (Sylow). Let \(G\) be a group with \(|G|=mp^n\), where \(p\) is a prime and \(p\nmid m\). Then:

  1. There exists a subgroup \(P\le G\) of order \(p^n\), called a Sylow-\(p\) subgroup of \(G\).

  2. Let \(N(p^r)\) denote the number of subgroups of \(G\) of order \(p^r\), for \(r\le n\). Then \[N(p^r)\equiv 1\mod p.\]

  3. All Sylow-\(p\) subgroups are conjugate to each other, so \(N(p^n)=[G:N_G(P)]\).

It is worth noting that general \(p\)-subgroups may not be conjugate to each other.

Proof. Let \(|G|=np^r\) (without requiring \(n\) and \(p\) to be coprime, i.e. \(r\) is not necessarily maximal). Consider the set \(X\) of all \(p^r\)-element subsets of \(G\), so \(|X| = \binom{np^r}{p^r}\). Now consider the left action of \(G\) on \(X\): \[\begin{aligned} \rho: G &\to {\rm Sym}(X)\\ g &\mapsto (S\mapsto gS) \end{aligned}\] We decompose \(X\) into a union of orbits: \[X=\bigcup_{i=1}^m X_i\] with \(|X_i| = [G:{\rm Stab}(S_i)]\). Since \({\rm Stab}(S_i) S_i = S_i\), the set \(S_i\) is a union of right cosets of its stabilizer: \[S_i = \bigcup_{j=1}^{k_i} {\rm Stab}(S_i) g_{ij},\] which implies \(k_i |{\rm Stab}(S_i)| = p^r\) and \(|X_i| = nk_i\). Note that \(k_i\) must be a power of \(p\), and \(k_i = 1\) if and only if \(S_i\) is of the form \(Pg\), where \(P\le G\) is a subgroup of order \(p^r\) and \(g\in G\). Thus \(k_i=1\) if and only if the orbit \(X_i\) contains exactly one subgroup of order \(p^r\), and orbits with \(k_i=1\) correspond bijectively to such subgroups, which implies \[|X| = \binom{np^r}{p^r} = \sum_{i=1}^m nk_i \equiv n\sum_{k_i = 1} 1 \mod np.\] Hence \[\frac{1}{n}\binom{np^r}{p^r} \equiv N(p^r) \mod p.\]

Now we only need to compute the left-hand side modulo \(p\). First, \(\frac{1}{n}\binom{np^r}{p^r}=\binom{np^r-1}{p^r-1}\), which is the coefficient of \(x^{p^r-1}\) in \((1+x)^{np^r-1}\). Working modulo \(p\) in the formal power series ring \(\mathbb{Z}_p\llbracket x\rrbracket\), we have \[\frac{(1+x)^{np^r}}{1+x}=\frac{(1+x^{p^r})^n}{1+x}=(1+nx^{p^r}+\dots)(1-x+\dots +(-1)^{p^r-1}x^{p^r-1}+\dots)\] so the desired coefficient modulo \(p\) is \((-1)^{p^r-1}\equiv 1\), which gives the proposition. Of course, this number can also be computed by other means, e.g. directly, or by considering the special case where \(G\) is a cyclic group, for which we know \(N(p^r)=1\); since the left-hand side is independent of the group and depends only on its order, this also gives the result.

Now we prove that all Sylow-\(p\) subgroups are conjugate to each other. First, \(G\) acts on the set of all Sylow-\(p\) subgroups by conjugation, and the length of each orbit, of the form \([G:N_G(P')]\), is coprime to \(p\), since \(N_G(P')\supseteq P'\). Let \(P\) be a Sylow-\(p\) subgroup of \(G\), and restrict the conjugation action to \(P\). A Sylow subgroup \(Q\) is fixed by the action of \(P\) if and only if \(P\le N_G(Q)\). In this case, consider the composite \(P \to N_G(Q)\to N_G(Q)/Q\). Since \(|N_G(Q)/Q|\) is coprime to \(p\) while every element of \(P\) has order a power of \(p\), the image of this composite must be trivial, which implies \(P\le Q\), and therefore \(P=Q\). This shows that \(P\) itself is the only fixed point of the \(P\)-action, and every other \(P\)-orbit has length a positive power of \(p\). Now take any \(G\)-orbit: its length is coprime to \(p\), so when it is decomposed into \(P\)-orbits, it must contain a \(P\)-orbit of length \(1\). Since the only \(P\)-fixed point is \(P\) itself, every \(G\)-orbit contains \(P\); hence there is only one \(G\)-orbit. ◻

Corollary 2.2. If \(|G|=np^r\), where \(n\) and \(p\) are coprime, and there exists a unique Sylow-\(p\) subgroup, then this subgroup is a normal subgroup.

Corollary 2.3. If \(|G|=np^r\), where \(n\) and \(p\) are coprime, then \(N=N(p^r)=[G:N_G(P)]\) is a factor of \(n\) and has the form \(pk+1\).

Example 2.4. We prove that a group of order \(15\) must be cyclic. By Sylow's theorem, consider the Sylow subgroups \(P_3\) and \(P_5\) of orders \(3\) and \(5\). We have \(N(3)\equiv 1\mod 3\) and \(N(3)\mid 5\); the only such number is \(1\). Similarly, \(N(5)=1\). Therefore \(P_3\) and \(P_5\) are both normal. Since \(P_3\cap P_5=1\), we have \(|P_3P_5|=15\), so \(G=P_3P_5\); moreover the commutators satisfy \([P_3,P_5]\subseteq P_3\cap P_5=1\), so the two subgroups commute elementwise, giving \(G\cong \mathbb{Z}_3\times\mathbb{Z}_5\cong\mathbb{Z}_{15}\).
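The "numerical application" of Sylow's theorem is mechanical enough to automate. A small helper (not from the original text; plain Python) lists the candidate values of \(N\) allowed by \(N\equiv 1\bmod p\) and \(N\mid n\); it instantly handles this example, as well as Exercise 2.16 and Example 2.5 below.

```python
def sylow_counts(order, p):
    """Candidate numbers of Sylow-p subgroups: divisors of the p'-part
    of `order` that are congruent to 1 mod p."""
    m = order
    while m % p == 0:
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]

print(sylow_counts(15, 3), sylow_counts(15, 5))   # [1] [1]: order 15 is cyclic
print(sylow_counts(40, 5))                        # [1]
print(sylow_counts(36, 3))                        # [1, 4]
```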

Exercise 2.16. Let \(G\) be a group of order \(40\). Prove that it has a normal subgroup of order \(5\).

Exercise 2.17 (\(\bigstar\bigstar\)). Let \(P\) be a Sylow-\(p\) subgroup of \(G\) and \(N\le G\). Prove that if \(N\triangleleft G\) or \(P\triangleleft G\), then \(N\cap P\) is a Sylow-\(p\) subgroup of \(N\).

Exercise 2.18 (\(\bigstar\bigstar\bigstar\bigstar\)). Prove that if \(H\le G\) and \(P\) is a Sylow-\(p\) subgroup of \(G\), then there exists \(g\in G\) such that \(H\cap gPg^{-1}\) is a Sylow-\(p\) subgroup of \(H\).

Exercise 2.19. Find all Sylow-\(2\) subgroups and Sylow-\(3\) subgroups of \(S_3\).

Exercise 2.20 (\(\bigstar\bigstar\bigstar\)). Try to find all Sylow subgroups of \(S_4\).

As we have seen, Sylow’s theorem is often used in conjunction with conjugation, because there is a natural relationship between them: Sylow \(p\)-subgroups are conjugate to each other. In addition to the most common numerical applications of Sylow’s theorem, which must be mastered, using it in conjunction with conjugation can often lead to non-trivial conclusions.

Using Sylow’s theorem to find normal subgroups

Finding normal subgroups is important and can often be used to classify the structure of a group or prove that a group is not simple. One common application of Sylow’s theorem is to use it to find normal subgroups. The following are common methods for using Sylow’s theorem to find normal subgroups.

  • Take \(|G|=np^r\) where \(p\) and \(n\) are coprime, and calculate the possible number of Sylow \(p\)-subgroups \(N\), which satisfies \[\left\{ \begin{array}{c} N\equiv 1\mod p \\ N\text{ is a factor of }n \\ \end{array} \right.\]

  • If \(N=1\), then by Sylow’s theorem we know that \(P\triangleleft G\).

  • If \(r=1\), count elements: the \(N\) Sylow \(p\)-subgroups have order \(p\) and intersect pairwise trivially, so they contribute at least \(N(p-1)+1\) elements in total. Doing this for each prime \(p\) with \(r=1\) may show that \(G\) cannot accommodate so many Sylow subgroups.

  • If \(r=2\) and \(|P|^2 > |G|\), consider the intersection \(H=P_1\cap P_2\) of two distinct Sylow subgroups: then \(|H|=p\) (since \(|P_1P_2|=|P|^2/|P_1\cap P_2|\le|G|\)), and since groups of order \(p^2\) are abelian, \(P_1,P_2\le N_G(H)\); if this forces \(N_G(H)=G\), then \(H\triangleleft G\).

  • Consider the conjugation action of \(G\) on \(X=\{P_1,P_2,\dots,P_N\}\), and its kernel \(\ker\rho\), which may be a non-trivial normal subgroup of \(G\).

Example 2.5. We prove that a group of order \(36\) is not simple. Let \(N\) be the number of subgroups of order \(9\) (the Sylow-\(3\) subgroups) of \(G\); then \(N=1\) or \(4\). If \(N=4\), consider the conjugation action of \(G\) on these four Sylow-\(3\) subgroups: \[\rho: G\to S_4\] Since \(|S_4|=24<36\), \(\rho\) cannot be injective. Moreover, since the Sylow subgroups are permuted transitively by conjugation, the action is non-trivial. Therefore \(\ker\rho\) is a non-trivial proper normal subgroup of \(G\).

There is another proof. If \(N=4\), consider \(H=P_1\cap P_2\) for two distinct Sylow-\(3\) subgroups. Since \(36\ge |P_1P_2|=81/|H|\), we must have \(|H|=3\). But \(P_1,P_2\le N_G(H)\) (groups of order \(9\) are abelian), so \(|N_G(H)|\ge|P_1P_2|=27\), and since \(|N_G(H)|\) divides \(36\), we get \(N_G(H)=G\). Hence \(H\) is a non-trivial normal subgroup of \(G\).

Exercise 2.21 (\(\bigstar\)). Prove that the group of order \(150\) is not a simple group.

Using Sylow’s Theorem and Conjugation

How can we combine Sylow’s Theorem and conjugation? The following lemma describes how conjugation interacts with Sylow subgroups.

Lemma 2.1. Let \(H\) be a \(p\)-subgroup of \(G\), acting by conjugation on the set of all Sylow \(p\)-subgroups. Then \(P_i\) is a fixed point (an orbit of length \(1\)) \(\Leftrightarrow H\le P_i\).

Proof. One direction is obvious; we prove the other. Suppose \(a^{-1}Pa=P\) where \(a\) has order \(p^k\). Then \(a\in N_G(P)\). Since \(P\triangleleft N_G(P)\), consider the image of \(a\) under the projection \[N_G(P)\to N_G(P)/P.\] Since \(|N_G(P)/P|\) is coprime to \(p\), this image can only be the identity element, so \(a\in P\). Applying this to every element of \(H\) proves the lemma. ◻

Proposition 2.4. Every \(p\)-power order subgroup \(H\) of \(G\) is contained in some Sylow-\(p\) subgroup.

Proof. Consider the action of \(H\) by conjugation on all Sylow-\(p\) subgroups of \(G\). The orbit lengths are either \(1\) or multiples of \(p\). Since the number of Sylow-\(p\) subgroups is \(\equiv 1\mod p\), we know that there must be one \(P\) that forms its own orbit, thus by Lemma, \(H\le P\). ◻

Proposition 2.5. Let \(H\) be a \(p\)-subgroup. Then the number \(a\) of Sylow-\(p\) subgroups containing \(H\) satisfies \(a\equiv 1\mod p\).

Proof. Let \(H\) act by conjugation on the set of all Sylow-\(p\) subgroups; by the lemma, \(H\le P\) is equivalent to the orbit of \(P\) having length \(1\). All other orbit lengths are multiples of \(p\), so \[a\equiv a+\sum|\text{orbits of length}>1|=N\equiv 1\mod p.\] ◻

Structure of finitely generated abelian groups

We will study the structure of all abelian groups that can be generated by a finite number of elements. Note that not all abelian groups are finitely generated; for example, \(\mathbb{Q}\) cannot be generated by finitely many elements. The simplest example of a finitely generated abelian group is the free abelian group of rank \(n\), \(\mathbb{Z}^n=\mathbb{Z}\times\mathbb{Z}\times\dots\times\mathbb{Z}\), which is generated by the \(n\) elements \(e_1=(1,0,\dots,0), e_2=(0,1,\dots,0),\dots,e_n=(0,\dots,0,1)\). It can also be called a free \(\mathbb{Z}\)-module: \(\mathbb{Z}\)-modules and abelian groups are the same concept.

Let \(G\) be an Abelian group that can be generated by a finite number of elements \(g_1,\dots, g_n\in G\). Then we can define a surjective homomorphism as follows: \[\begin{aligned} \varphi: \mathbb{Z}^n &\to G\\ \sum_{i=1}^n a_ie_i &\mapsto \sum_{i=1}^n a_ig_i. \end{aligned}\] According to the fundamental homomorphism theorem, we have \(G\cong \mathbb{Z}^n/\ker\varphi\). Therefore, our goal is to study what kind of subgroups \(\mathbb{Z}^n\) has, which will allow us to determine all possible quotient groups.

Let’s first consider the simple case of \(n=1\). We know that the only subgroups of \(\mathbb{Z}\) are \(0\) and \(n\mathbb{Z}\) for \(n\ge 1\). Therefore, all non-trivial subgroups of \(\mathbb{Z}\) are isomorphic to \(\mathbb{Z}\).

For the case \(n=2\), \(G=\mathbb{Z}^2\), let \(H\le G\) be a subgroup. Consider the two coordinate projections \(\pi_1,\pi_2:\mathbb{Z}^2\to\mathbb{Z}\). The images \(\pi_1(H)\) and \(\pi_2(H)\) are subgroups of \(\mathbb{Z}\), so they must be of the form \(a\mathbb{Z}\) and \(d\mathbb{Z}\). This means the first component of every element of \(H\) is divisible by \(a\) and the second by \(d\), and we can find two elements \(\alpha_1=(a,xd), \alpha_2=(ya,d)\in H\). Now, note that \[\alpha_2-y\alpha_1=(0,(1-xy)d)\in H\] and \[\alpha_1-x\alpha_2=((1-xy)a,0)\in H.\] We will prove the following important result (essentially linear algebra) in Chapter 4:

Theorem 2.4. Subgroups of \(\mathbb{Z}^n\) are necessarily free Abelian groups, isomorphic to \(\mathbb{Z}^m\) with \(m\le n\). In fact, if \(H\le \mathbb{Z}^n\), then there exist a basis \(\alpha_1,\dots,\alpha_n\) of \(\mathbb{Z}^n\) and non-negative integers \(d_i\) with \(d_i|d_{i+1}\) such that \[H=\mathbb{Z}d_1\alpha_1\oplus \mathbb{Z}d_2\alpha_2 \oplus\dots\oplus \mathbb{Z}d_n\alpha_n.\]

Corollary 2.4. All finitely generated Abelian groups are isomorphic to some \[\mathbb{Z}^m\oplus \mathbb{Z}_{d_1}\oplus \mathbb{Z}_{d_2}\oplus\dots \oplus \mathbb{Z}_{d_k}.\]

Corollary 2.5. All finite Abelian groups are direct sums of finite cyclic groups.
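The invariant factors \(d_i\) can be computed mechanically from a matrix whose rows generate the subgroup of relations, via the Smith normal form. A brief sketch, assuming SymPy is available (its `smith_normal_form` lives in `sympy.matrices.normalforms` in recent versions):

```python
from sympy import Matrix, ZZ
from sympy.matrices.normalforms import smith_normal_form

# Z^2 / <(2,0),(0,4)>: relation matrix already diagonal, invariant factors 2 | 4
print(smith_normal_form(Matrix([[2, 0], [0, 4]]), domain=ZZ))

# Z^2 / <(1,1),(1,-1)>: invariant factors 1 | 2, so the quotient is Z_2
print(smith_normal_form(Matrix([[1, 1], [1, -1]]), domain=ZZ))
```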

Semidirect Product

The semidirect product is an important construction commonly used in group theory. If \(N\triangleleft G\), we can form the quotient group \(Q=G/N\). One might wonder whether \(G\cong N\times Q\) always holds. In most cases it does not: for example, for \(G=\mathbb{Z}_4\) and \(N=2\mathbb{Z}_4\), obviously \(\mathbb{Z}_4\not\cong \mathbb{Z}_2\times \mathbb{Z}_2\).

If \(Q\) can be naturally realized as a subgroup of \(G\), i.e. there exists \(Q\le G\) such that the restriction to \(Q\) of the natural projection \(G\to G/N\) is an isomorphism, then we can assert a weaker proposition: \(G\) is the semidirect product of \(N\) and \(Q\), written \(G\cong N\rtimes Q\). Let us explain this: we can reconstruct \(G\) from \(N\) and \(Q\). For each \(g\in G\), there is an image \(q\in Q\) under the projection, so \(gq^{-1}\in N\), which means \(g=nq\), and this representation is unique. Since \(N\triangleleft G\), the multiplication on \(G=NQ\) can be recovered from the conjugation action of \(Q\) on \(N\): \[(n_1 q_1) (n_2 q_2) = n_1 (q_1 n_2 q_1^{-1}) (q_1 q_2).\] This inspires us to define a group structure on the set \(N\times Q\) from \(N\), \(Q\), and a homomorphism \(\theta:Q\to {\rm Aut}(N)\) (playing the role of the conjugation action) as follows: \[(n_1, q_1) (n_2, q_2) = (n_1 \theta_{q_1}(n_2), q_1q_2).\] The identity element of this group is \((1, 1)\), and the inverse of \((n,q)\) is \((\theta_{q^{-1}}(n^{-1}), q^{-1})\). One can verify that this multiplication is associative, so \(N\times Q\) forms a group under it, denoted \(N\rtimes_\theta Q\) and called the semidirect product of \(N\) and \(Q\).
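A concrete sketch of this multiplication rule (not from the original text; plain Python) for \(N=\mathbb{Z}_3\), \(Q=\mathbb{Z}_2\), with \(\theta\) sending the generator of \(\mathbb{Z}_2\) to the inversion automorphism \(n\mapsto -n\); the resulting group of order \(6\) is non-abelian (it is \(D_3\), as in the example below).

```python
def theta(q, n):
    """theta_q in Aut(Z_3): identity for q = 0, inversion for q = 1."""
    return n % 3 if q == 0 else (-n) % 3

def mul(x, y):
    (n1, q1), (n2, q2) = x, y
    return ((n1 + theta(q1, n2)) % 3, (q1 + q2) % 2)

a, b = (1, 0), (0, 1)                 # generators of N and Q
assert mul(a, b) != mul(b, a)         # non-abelian
assert mul(b, mul(a, b)) == (2, 0)    # b a b^{-1} = a^{-1}, the D_3 relation
print("Z_3 semidirect Z_2 satisfies the dihedral relations")
```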

Exercise 2.22 (\(\bigstar\)). Verify the associative law in the above definition.

Example 2.6. \(G=\mathbb{Z}_3\rtimes_\theta \mathbb{Z}_2\) has two possibilities depending on the choice of \(\theta:\mathbb{Z}_2\to {\rm Aut}(\mathbb{Z}_3)\cong \mathbb{Z}_2\):

  1. If \(\theta\) is the trivial homomorphism, i.e. \(\theta_q={\rm id}_{\mathbb{Z}_3}\) for all \(q\), then \(G=\mathbb{Z}_3\times \mathbb{Z}_2\cong \mathbb{Z}_6\).

  2. If \(\theta\) is the homomorphism mapping the generator of \(\mathbb{Z}_2\) to the unique non-trivial automorphism \(\alpha\in {\rm Aut}(\mathbb{Z}_3)\), namely \(\alpha(a)=a^{-1}\), then we have \[G=\langle a, b|a^3 = b^2 = 1, bab^{-1} = a^{-1}\rangle \cong D_3.\]

Verify that \(D_n\cong \mathbb{Z}_n\rtimes_\theta \mathbb{Z}_2\); what is the homomorphism \(\theta\)? We give the following simple criterion:

Proposition 2.6. If \(N\triangleleft G\), \(Q\le G\), \(G=NQ\), and \(N\cap Q=1\), then \[G\cong N\rtimes_\theta Q\] where \(\theta\) is the conjugation action of \(Q\) on \(N\).

Proof. From \(G=NQ\) and \(N\cap Q = 1\), each element \(g\in G\) can be uniquely represented in the form \(nq\). Therefore, by the construction of the semidirect product, the obvious map \[N\rtimes_\theta Q\to G,\qquad (n,q)\mapsto nq\] is a homomorphism, and the uniqueness of the representation makes it bijective, hence an isomorphism. ◻

Exercise 2.23 (\(\bigstar\)). Verify that \(G=N\rtimes H\) implies \(N\triangleleft G\).

It is worth noting that although there are many possible choices for \(\theta\) in the semi-direct product \(N\rtimes_\theta Q\), different choices may result in isomorphic groups. (In fact, the same group may have different semi-direct product representations.) In order to easily identify these isomorphic groups in applications, we provide the following two criteria to determine when \(N\rtimes_\theta Q\cong N\rtimes_{\theta'} Q\).

Proposition 2.7 (Symmetry of \(N\)). If there exists \(\alpha\in{\rm Aut}(N)\) such that for all \(n,q\), \[\theta_q(n)=\alpha^{-1}(\theta'_q(\alpha(n))),\] or equivalently \(\alpha\circ\theta_q=\theta'_q\circ\alpha\) for all \(q\), then \(N\rtimes_{\theta} Q\cong N\rtimes_{\theta'} Q\).

Proof. Construct a homomorphism \(\varphi: N\rtimes_\theta Q\to N\rtimes_{\theta'} Q\) by \[\varphi(n, q) := (\alpha(n), q).\] First we verify that it is a homomorphism: \[\begin{aligned} \varphi((n_1, q_1)(n_2, q_2)) &= \varphi(n_1\theta_{q_1}(n_2), q_1q_2) \\ &= (\alpha(n_1)\alpha(\theta_{q_1}(n_2)), q_1q_2) \\ &= (\alpha(n_1)\theta'_{q_1}(\alpha(n_2)), q_1q_2) \\ &= \varphi(n_1, q_1) \varphi(n_2, q_2). \end{aligned}\] Since \(\alpha\) is bijective, \(\varphi\) is a bijection, hence an isomorphism. ◻

Similarly, we have the following proposition:

Proposition 2.8 (Symmetry of \(Q\)). If there exists \(\beta\in{\rm Aut}(Q)\) such that for all \(n,q\), \[\theta_q(n)=\theta'_{\beta(q)}(n),\] i.e. \(\theta=\theta'\circ\beta\), then \(N\rtimes_{\theta} Q\cong N\rtimes_{\theta'} Q\).

Exercise 2.24 (\(\bigstar\)). Prove the above proposition.

Exercise 2.25 (\(\bigstar\)). Prove the group isomorphism \[\langle a,b| a^3=b^7=1, aba^{-1}=b^2\rangle\cong \langle a,b| a^3=b^7=1, aba^{-1}=b^4\rangle.\]

Exercise 2.26 (\(\bigstar\bigstar\bigstar\)). Find all \(pq\)-order groups, where \(p\) and \(q\) are both prime numbers. Prove that there are at most two different isomorphism classes.

Exercise 2.27 (\(\bigstar\bigstar\bigstar\)). If \(G=\mathbb{Z}_p\rtimes_\theta \mathbb{Z}_n\), where \(p\) is a prime number, prove that the number of conjugacy classes \(c(G)\) satisfies \[c(G)=n\left(1+\frac{p-1}{h^2}\right).\] Here, \(h\) represents the order of \(\theta\), i.e. the smallest number such that \(\theta^h=1\).

Exercise 2.28 (\(\bigstar\bigstar\bigstar\)). If \(Q\) is a cyclic group and \(\theta(Q)\) and \(\theta'(Q)\) are conjugate in \({\rm Aut}(N)\), prove that \[N\rtimes_\theta Q\cong N\rtimes_{\theta'} Q.\] (Hint: Prove that there exist \(\alpha\in {\rm Aut}(N)\) and \(\beta\in {\rm Aut}(Q)\) such that \(\alpha^{-1}\circ\theta_q\circ\alpha = \theta'_{\beta(q)}\).)

All Groups with \(|G|\le 15\)

This section will contain quite a few examples: we will classify all groups with order not exceeding \(15\), that is, we will find all non-isomorphic groups in this range. First, it should be noted that when \(p\) is a prime, the only \(p\)-order group is the cyclic group \(\mathbb{Z}_p\). Therefore, we are left with orders \(n=4,6,8,9,10,12,14,15\). We have already proven that there are only two groups of order \(p^2\), \(\mathbb{Z}_{p^2}\) and \(\mathbb{Z}_p^2\). Now consider the following proposition:

Proposition 2.9 (\(|G|=pq\) case). Let \(1<p<q\) be two primes, and \(G\) be a group of order \(pq\). Then,

  1. If \(p|q-1\), then \(G\cong \mathbb{Z}_{pq}\) or \(G\cong \mathbb{Z}_q\rtimes \mathbb{Z}_p\).

  2. If \(p\not|q-1\), then \(G\cong \mathbb{Z}_{pq}\).

Proposition 2.10. By Sylow’s theorem, the \(q\)-order cyclic subgroup, denoted by \(\mathbb{Z}_q\), is a normal subgroup of \(G\). Let \(\mathbb{Z}_p\) be a Sylow-\(p\) subgroup of \(G\). Since \(|\mathbb{Z}_q\cap\mathbb{Z}_p|=1\) and \(G=\mathbb{Z}_p\mathbb{Z}_q\), we have \(G=\mathbb{Z}_q\rtimes_\theta \mathbb{Z}_p\), where \[\theta:\mathbb{Z}_p\to{\rm Aut}(\mathbb{Z}_q)\cong \mathbb{Z}_{q-1}.\] This implies that if \(p\not|q-1\), the above mapping is trivial, so \(G\cong\mathbb{Z}_q\times\mathbb{Z}_p\cong\mathbb{Z}_{pq}\). If \(p|q-1\), consider the non-trivial choice of \(\theta\), so that \(\theta\) becomes a monomorphism. However, the embedding image of \(\mathbb{Z}_p\) in \(\mathbb{Z}_{q-1}\) is unique, so these non-trivial homomorphisms differ only by an automorphism of \(\mathbb{Z}_p\). Therefore, the use of the symmetry lemma tells us that the groups they induce are isomorphic.

So the main difficulty to be discussed is concentrated on the orders \(n=8,12\).

Classification of \(8\)-order groups

The commutative groups of order \(8\) are exactly \(\mathbb{Z}_8\), \(\mathbb{Z}_4\times \mathbb{Z}_2\), and \(\mathbb{Z}_2^3\). Now we study the non-commutative groups of order \(8\). By Sylow's theorem, \(G\) has an odd number \(2k+1\) of subgroups of order \(4\). Five (or more) subgroups are impossible, since they would occupy at least \(5\times(4-2)+2=12>8\) elements. Therefore, there can only be \(1\) or \(3\) such subgroups.

If there is only one subgroup of order \(4\), say \(H\triangleleft G\), then \(H\cong \mathbb{Z}_4\) must hold: if instead \(H\cong \mathbb{Z}_2^2\), there would be no elements of order \(4\) in \(G\) (any such element would generate a second subgroup of order \(4\)), so all elements of \(G\) would have order \(1\) or \(2\), making \(G\) abelian, which is not the case under discussion. So now \(H\cong\mathbb{Z}_4\). Let \(a\in G-H\); if \(a\) had order \(4\), then \(\langle a\rangle\) would be a second subgroup of order \(4\), so \(a\) has order \(2\) and \(G=\langle a\rangle H\cong \mathbb{Z}_4\rtimes \mathbb{Z}_2\cong D_4\), where there is only one non-trivial choice for the conjugation action. However, \(D_4\) in fact has \(3\) subgroups of order \(4\): \(\langle \sigma\rangle, \langle \sigma^2, \tau\rangle, \langle \tau\sigma, \sigma^2\rangle\); so this case does not actually occur.

If there are \(3\) subgroups of order \(4\), note that all subgroups of index \(2\) are normal. By Cauchy's theorem, we can choose an element \(b\) of order \(2\). If \(b\) lies outside one of the order-\(4\) subgroups \(H\), then \(G=H\rtimes\langle b\rangle\) is a semidirect product. The case \(\mathbb{Z}_4\rtimes \mathbb{Z}_2\) is the \(D_4\) mentioned above. In the case \(\mathbb{Z}_2^2\rtimes \mathbb{Z}_2=\langle a,b,c|a^2=b^2=(ab)^2=c^2=1, cac^{-1}=b\rangle\), choosing \(a=\tau\sigma,b=\tau\sigma^3,c=\tau\) shows that this is also \(D_4\).

Now, let’s assume that all elements of order \(2\) belong to all \(4\)th-order subgroups. Since at least one of these subgroups must be \(\mathbb{Z}_4\) (otherwise, \(G\) would be abelian with only elements of order \(2\)), we can conclude that there is only one element of order \(2\) in \(G\). This means that all \(4\)th-order subgroups are isomorphic to \(\mathbb{Z}_4\). Let \(i,j,k\) be the generators of these subgroups, then we have \(i^2=j^2=k^2=b\). Now, consider the element \(ij\), it cannot belong to \(\langle i\rangle\) or \(\langle j\rangle\), otherwise if \(ij = i^k\), we would have \(j\in \langle i\rangle\), which is impossible. Therefore, we must have \(ij\in \langle k\rangle -\{b\}\). If \(ij=k\), then \(ijk=b\). If \(ij=k^{-1}=kb\), we can choose \(k\) to be \(k^{-1}\) and reduce it to the previous case. These relations determine the group \(G\), which we will denote as \(Q_8\), known as the quaternion group. In fact, this group is generated by \(\{i,j,k\}\) in the quaternion field: \[Q_8=\{1,-1,i,-i,j,-j,k,-k\}.\] Now, we will show that \(Q_8\) is not isomorphic to \(D_4\), as \(Q_8\) has \(1\) element of order \(2\) while \(D_4\) has \(5\).

Classification of Groups of Order \(12\)

For groups of order \(12\), it is easy to see that the only Abelian ones are \(\mathbb{Z}_{12}\) and \(\mathbb{Z}_2\times \mathbb{Z}_6\). Now we consider the non-Abelian groups. By Sylow's theorem, \(G\) has either \(1\) or \(3\) subgroups of order \(4\), and either \(1\) or \(4\) subgroups of order \(3\). If a subgroup \(H\) of order \(4\) is normal, then taking a Sylow-\(3\) subgroup \(K\), we get \(G=HK\cong H\rtimes_\theta \mathbb{Z}_3\), where \(H\cong \mathbb{Z}_4\) or \(\mathbb{Z}_2^2\) and \(\theta:\mathbb{Z}_3\to {\rm Aut}(H)\) is a non-trivial homomorphism (if it were trivial, we would get a direct product, i.e. one of the Abelian groups above).

First, assume that \(G\) has a normal subgroup \(H\) of order \(4\). If \(H=\mathbb{Z}_4\), then considering \(\theta:\mathbb{Z}_3\to {\rm Aut}(\mathbb{Z}_4)\cong \mathbb{Z}_2\), we see that \(\theta\) must be trivial, so this case produces no non-Abelian group.

If \(H=\mathbb{Z}_2\times \mathbb{Z}_2\), we know that \({\rm Aut}(H)\cong S_3\) is the permutation group of the three elements of order \(2\) in \(H\). A non-trivial \(\theta:\mathbb{Z}_3\to {\rm Aut}(H)=S_3\) must map onto the unique subgroup \(A_3\cong \mathbb{Z}_3\) of \(S_3\). Thus \(\theta\) can be seen as an isomorphism from \(\mathbb{Z}_3\) to \(A_3\cong\mathbb{Z}_3\), and different choices of \(\theta\) differ only by an element of \({\rm Aut}(\mathbb{Z}_3)\). By the symmetry of \(Q\) (Proposition 2.8), the groups \(G=\mathbb{Z}_2^2\rtimes \mathbb{Z}_3\) they induce are isomorphic. In fact, this group is not a new and unfamiliar one.

Exercise 2.29 (\(\bigstar\)). Show that the above group is \(A_4\).

Now let’s consider the case where it has \(3\) subgroups of order \(4\). Since these \(3\) subgroups must occupy \((4-2)\times 3+1=7\) elements, it is impossible to have \(4\) subgroups of order \(3\), because \(4\) subgroups of order \(3\) would occupy at least \((3-1)\times 4 + 1 = 9\) elements. We conclude that there can only be one subgroup of order \(3\), which makes it normal. If \(H=\mathbb{Z}_4\), there is only one non-trivial semidirect product \(\mathbb{Z}_3\rtimes \mathbb{Z}_4\).

If the Sylow-\(2\) subgroup is \(\mathbb{Z}_2^2\), then \(\theta:\mathbb{Z}_2^2\to {\rm Aut}(\mathbb{Z}_3)\cong \mathbb{Z}_2\). It is easy to see that there are only three non-trivial choices of \(\theta\), and these differ only by an automorphism of \(\mathbb{Z}_2^2\), so they determine the same group \(\mathbb{Z}_3\rtimes \mathbb{Z}_2^2\). In fact, it is \(D_6\).

Exercise 2.30 (\(\bigstar\)). Prove that it is \(D_6\).

Now we prove that \(\mathbb{Z}_3\rtimes \mathbb{Z}_4\) is neither \(D_6\) nor \(A_4\). Since \(A_4\) has no normal subgroup of order \(3\), \(\mathbb{Z}_3\rtimes \mathbb{Z}_4\) cannot be isomorphic to \(A_4\). And \(D_6\) has no element of order \(4\), so \(\mathbb{Z}_3\rtimes \mathbb{Z}_4\not\cong D_6\) either.

Table of Small Order Groups

To summarize our discussion in a table, we have the following tables of all groups of order not exceeding \(15\):

| Order | Abelian | Non-Abelian |
|:---:|:---|:---|
| \(2\) | \(\mathbb{Z}_2\) | |
| \(3\) | \(\mathbb{Z}_3\) | |
| \(4\) | \(\mathbb{Z}_4,\mathbb{Z}_2^2\) | |
| \(5\) | \(\mathbb{Z}_5\) | |
| \(6\) | \(\mathbb{Z}_6\) | \(S_3\) |
| \(7\) | \(\mathbb{Z}_7\) | |
| \(8\) | \(\mathbb{Z}_8,\mathbb{Z}_4\times \mathbb{Z}_2, \mathbb{Z}_2^3\) | \(D_4,Q_8\) |
| \(9\) | \(\mathbb{Z}_9,\mathbb{Z}_3^2\) | |
| \(10\) | \(\mathbb{Z}_{10}\) | \(D_5\) |
| \(11\) | \(\mathbb{Z}_{11}\) | |
| \(12\) | \(\mathbb{Z}_{12},\mathbb{Z}_2^2\times \mathbb{Z}_3\) | \(A_4, D_6, \mathbb{Z}_3\rtimes \mathbb{Z}_4\) |
| \(13\) | \(\mathbb{Z}_{13}\) | |
| \(14\) | \(\mathbb{Z}_{14}\) | \(D_7\) |
| \(15\) | \(\mathbb{Z}_{15}\) | |

Ring and Field

Definition and Examples

From an operational perspective, a group is a set with a "multiplication operation". When this group is an abelian group, the operation is usually written as addition. What if we consider both addition and multiplication operations on a set? For example, \(\mathbb{Z}\) is an abelian group under addition, but it also has a multiplication structure. This is a new algebraic structure called a ring. Of course, the addition and multiplication on a ring cannot be arbitrarily defined, they need to be compatible with each other and satisfy some reasonable properties. The complete definition is as follows:

Definition 3.1 (Ring). A set \(R\) with two operations \((+)\) and \((\cdot)\) is called a ring if \(R\) forms an abelian group under the operation \((+)\), and for any \(a,b,r\in R\), the following properties hold:

  1. (Associativity) \((ab)r=a(br)\)

  2. (Distributivity) \(r(a+b)=ra+rb\), \((a+b)r=ar+br\)

  3. There exists a multiplicative identity \(1\in R\) such that \(r1 = 1r = r\).

Here, as usual, we agree that the multiplication operation has a higher priority than addition, and we omit the multiplication symbol between letters. In a ring, we use \(0\) to represent the additive identity and \(1\) to represent the multiplicative identity.

It is worth noting that in the definition of a ring, we do not require the multiplication to be commutative, we only require the addition to be commutative. If the multiplication is also commutative, we call \(R\) a commutative ring.

Definition 3.2 (Field). If \(R\) is a ring and every non-zero element \(r\in R\) has a multiplicative inverse \(r^{-1}\in R\) such that \[r^{-1}r = r r^{-1} = 1\] then we call \(R\) a field. If the multiplication in \(R\) is commutative, then \(R\) is called a commutative field.

In a ring, non-zero elements may not have a multiplicative inverse. If a non-zero element \(r\in R\) has a multiplicative inverse, we call it a unit.

Exercise 3.1. Find all units in the ring \(\mathbb{Z}\).

The concepts of rings and fields have many examples in algebra, here are a few common ones.

Example 3.1. The set of all integers \(\mathbb{Z}\) forms a ring under the usual addition and multiplication.

Example 3.2. The set of all rational numbers \(\mathbb{Q}\), real numbers \(\mathbb{R}\), and complex numbers \(\mathbb{C}\) form fields under the usual operations.

Example 3.3. The set of all polynomials with integer coefficients, denoted by \(\mathbb{Z}[x]\), forms a ring under the usual polynomial multiplication and addition, where \[\mathbb{Z}[x]:=\{a_nx^n+a_{n-1}x^{n-1}+\dots+a_1x+a_0 | a_i\in \mathbb{Z}, n\ge 0\}\] Similarly, for any ring \(R\), we can define the polynomial ring \(R[x]\) as \[R[x]:=\{a_nx^n+a_{n-1}x^{n-1}+\dots+a_1x+a_0 | a_i\in R, n\ge 0\}\] This is also a ring, with addition defined in the usual way and multiplication defined as \[\left(\sum_{i=0}^m a_i x^i\right)\left(\sum_{j=0}^n b_j x^j\right)=\sum_{k=0}^{m+n} \left(\sum_{i+j=k}a_i b_j\right) x^k\] It is easy to verify that this satisfies the associative and distributive properties.
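The convolution formula above is exactly how polynomial multiplication is implemented with coefficient lists. A minimal sketch (not from the original text; plain Python, coefficients \([a_0,a_1,\dots,a_n]\) over \(\mathbb{Z}\)):

```python
def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists [a_0, a_1, ...]."""
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj     # contributes to the x^{i+j} coefficient
    return prod

# (1 + x)(1 - x + x^2) = 1 + x^3
print(poly_mul([1, 1], [1, -1, 1]))    # [1, 0, 0, 1]
```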

Example 3.4. Let \(S\) be the set of all finite strings over some fixed alphabet. Define \(\mathbb{Z}[S]\) as the ring consisting of all finite formal \(\mathbb{Z}\)-linear combinations of strings, for example \[5+2\text{hello}-3\text{world}\in \mathbb{Z}[S]\] Addition is given in the natural way, and multiplication is given by concatenation of strings, extended bilinearly: \[\text{abc}\cdot \text{adc}=\text{abcadc}\] \[(2a+3b)(2a-3b)=4aa-6ab+6ba-9bb.\] The multiplicative identity is the empty string. Under this definition, \(\mathbb{Z}[S]\) forms a non-commutative ring; for example, \(a\cdot b = ab\) while \(b\cdot a =ba\).

Exercise 3.2. Is the set of all natural numbers \(\mathbb{N}=\{0,1,2,\dots\}\) a ring under the usual integer operations?

Exercise 3.3 (\(\bigstar\)). Verify that the set \(\mathbb{Z}[\sqrt{2}]=\{a+b\sqrt{2} | a,b\in \mathbb{Z}\}\) forms a ring under the usual operations.

Exercise 3.4 (\(\bigstar\)). Consider the set \[\mathbb{Z}_{(2)}:=\left\{\left.\frac{a}{b}\in \mathbb{Q}\right| \text{as a reduced fraction, the denominator }b\text{ is odd}\right\}.\] Is it a ring? Is it a field?

Exercise 3.5 (\(\bigstar\bigstar\)). Let \(\mathbb{H}=\{a+bi+cj+dk|a,b,c,d\in\mathbb{R}\}\) be the ring whose multiplication is defined \(\mathbb{R}\)-bilinearly by the rules \[ij=k, jk=i, ki = j\] \[ji=-k, kj=-i, ik = -j\] \[i^2=j^2=k^2=-1\] Prove that \(\mathbb{H}\) is a field. (Hint: Consider \((a+bi+cj+dk)(a-bi-cj-dk)\))

Just as with group homomorphisms, we need to study the relationships between rings. We can define ring homomorphisms and field homomorphisms, which must preserve the operations and the identity elements.

Definition 3.3 (Ring Homomorphism). A mapping \(f:R\to S\) between rings is called a ring homomorphism if \(f\) satisfies

  1. (Abelian group homomorphism) \(f(a+b)=f(a)+f(b)\)

  2. (Preserves multiplication) \(f(ab) = f(a)f(b)\)

  3. (Preserves identity element) \(f(1)=1, f(0)=0\).

If \(f\) is in addition injective, it is called a monomorphism; if surjective, an epimorphism; if \(f\) has an inverse that is also a homomorphism, an isomorphism. Homomorphisms between fields are understood as homomorphisms of the underlying rings.

Example 3.5. Let \(K\) be a field, \(x_0\in K\), and define a polynomial valuation homomorphism \[\begin{aligned} f: K[x] &\to K\\ p(x) &\mapsto p(x_0) \end{aligned}\] that maps a polynomial \(p(x)\) to its value at \(x=x_0\), \(p(x_0)\).

Exercise 3.6. Verify that this is a ring homomorphism and also a surjective homomorphism.

Ideal

The concept of an ideal was first introduced by Kummer. It was proposed to remedy the failure of unique factorization in many rings arising in number theory: unlike the integers, which factor uniquely into products of primes, elements of a general ring may not admit unique prime factorization. Kummer discovered that if one introduces "ideal numbers", the products of ideals still satisfy a unique factorization law.

Simply put, an ideal is a set of multiples. For example, for a prime number \(p\in\mathbb{Z}\), we can consider the set of all multiples of \(p\), denoted as \((p):=\{kp|k\in\mathbb{Z}\}\), which is called the ideal generated by \(p\). The essential characteristic of this set of multiples is its "absorption" property, which means that for any \(r\in\mathbb{Z}, i\in(p)\), we have \(ri\in(p)\). This leads to the concept of an ideal as follows:

Definition 3.4 (Ideal). A subgroup \(I\) of the additive group of a ring \(R\) is called a left ideal if for any \(r\in R, i\in I\), we have \(ri\in I\). If the condition is changed to \(ir\in I\), it is called a right ideal. If \(I\) is both a left and a right ideal, it is called a two-sided ideal. In a commutative ring there is no distinction between left and right ideals, all ideals are two-sided, and they are simply referred to as ideals. \(R\) itself is an ideal, called the trivial ideal, and any ideal not equal to \(R\) is called a proper ideal.

One important use of ideals is that they can be used to construct a quotient ring, just like defining a quotient group. Given a ring \(R\) and a two-sided ideal \(I\), we can define the set of cosets \[R/I=\{r+I|r\in R\}.\] This is first and foremost an Abelian group, the quotient group of \(R\) by the subgroup \(I\), with elements being cosets \(r+I\) or understood as representatives of the equivalence relation on \(R\) determined by \(I\). Its multiplication is inherited from the larger ring \(R\), defined as \[(r+I)(s+I)=rs+I.\] It can be easily verified that due to the absorption property of ideals, as a subset of \(R\), the product \((r+I)(s+I)\subset rs+I\). Therefore, the above multiplication is well-defined (independent of the choice of representatives).

Example 3.6. Consider the ideal \(n\mathbb{Z}=\{nk|k\in \mathbb{Z}\}\) of the ring \(\mathbb{Z}\). The quotient ring with respect to this ideal is called the ring of residue classes modulo \(n\): \[\mathbb{Z}/n\mathbb{Z}=\{a+n\mathbb{Z}|a\in\mathbb{Z}\}=\{n\mathbb{Z},1+n\mathbb{Z},2+n\mathbb{Z},\dots,(n-1)+n\mathbb{Z}\}.\] As an additive group, this is a cyclic group of order \(n\). As a ring, it is a finite ring, with the additive identity element being \(n\mathbb{Z}\) and the multiplicative identity element being \(1+n\mathbb{Z}\). The number-theoretic meaning of this ring is that it consists of \(n\) residue classes of all integers modulo \(n\). It is worth noting that when \(n\) is a prime number \(p\), \(\mathbb{Z}/p\mathbb{Z}\) is in fact a finite field with \(p\) elements.
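A quick computational check of the last remark (a sketch, not from the original text): for a prime \(p\), every non-zero residue class modulo \(p\) is invertible, with inverse \(a^{p-2}\) by Fermat's little theorem (proved as Exercise 3.8 below).

```python
p = 7
for a in range(1, p):
    inv = pow(a, p - 2, p)             # a^(p-2) mod p
    assert (a * inv) % p == 1
print(f"every non-zero element of Z/{p}Z is a unit")
```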

Exercise 3.7. For a family of ideals \(I_\lambda\), we can define their sum \(\sum_\lambda I_\lambda\) as the following ideal \[\left\{\text{all finite sums }\sum_\lambda i_\lambda \,\middle|\, i_\lambda\in I_\lambda\right\}.\] Prove that this is an ideal. For finitely many ideals \(I_1,\dots,I_n\), we can define their product \(J=I_1\dots I_n\) as \[J=\left\{\text{all finite sums }\sum_t i_1^{(t)}i_2^{(t)}\dots i_n^{(t)} \,\middle|\, i_k^{(t)}\in I_k\right\}.\] Prove that this is also an ideal.

Exercise 3.8. Prove Fermat’s little theorem in number theory by following steps:

  1. Let \(0\not=a\in \mathbb{Z}/p\mathbb{Z}\). Prove that for any \(k\), \(a^k\not=0\).

  2. From this, it follows that \(a\) has a multiplicative inverse, i.e. \(\mathbb{Z}/p\mathbb{Z}\) is a field, sometimes denoted as \(\mathbb{F}_p\), called the \(p\)-element finite field.

  3. Prove that \(\mathbb{F}_p^\times := \mathbb{Z}/p\mathbb{Z}-\{0\}\) is a multiplicative group, hence for any \(a\in \mathbb{F}_p^\times\), \(a^{p-1}=1\).

Prime Ideals

The concept of ideals comes from number theory, so it is naturally related to many arithmetic concepts. In number theory, the most important step in proving the unique factorization law for the integers is Euclid's lemma, which states that if \(p\) is a prime number and \(p|ab\), then \(p|a\) or \(p|b\). From the perspective of ideals, \((p)=p\mathbb{Z}\subset \mathbb{Z}\) is the set of multiples of \(p\), which is an ideal of \(\mathbb{Z}\), and we have \[ab\in (p)\Rightarrow a\in (p)\text{ or }b\in (p).\] We can generalize this concept to any ring. If a proper ideal \(I\subset R\) satisfies the above property, that is, \[ab\in I\Rightarrow a\in I\text{ or }b\in I,\] then \(I\) is called a prime ideal. For a commutative ring \(R\), the set of all prime ideals is usually denoted by \({\rm Spec}(R)\). Note that the trivial ideal is not a prime ideal. This is a convention, similar to the convention that \(1\) is not a prime number: \(1\) is a unit, and should not be called a prime.

Exercise 3.9 (\(\bigstar\)). We call a ring \(R\) an integral domain if \(ab=0\) implies \(a=0\) or \(b=0\) in \(R\). Prove that \(R/I\) is an integral domain if and only if \(I\) is a prime ideal.

Exercise 3.10. Let \(f:R\to S\) be a ring homomorphism. Prove that the preimage of a prime ideal is always a prime ideal. That is, if \(P\subset S\) is a prime ideal, then \(f^{-1}(P)\) is a prime ideal of \(R\).

Another useful concept is the maximal ideal. A proper ideal \(I\) of a ring \(R\) is a maximal ideal (left, right, two-sided) if there is no proper ideal (left, right, two-sided) of \(R\) that strictly contains \(I\). We state a few simple properties of maximal ideals.

Proposition 3.1 (Maximal ideals are always prime ideals). Let \(I\) be a maximal ideal (two-sided) of \(R\). Then \(I\) is also a prime ideal (two-sided) of \(R\).

Proof. Otherwise, suppose \(ab\in I\) but \(a,b\not\in I\). Consider \[((a)+I)((b)+I)\subset (ab)+I\subset I.\] Since \((a)+I\) and \((b)+I\) are strictly larger than \(I\), by maximality they can only be the whole ring \(R\). Thus, we have \[R=R\cdot R\subset I,\] which contradicts the fact that \(I\) is a proper ideal (which a maximal ideal is by definition). ◻

Proposition 3.2. Any non-unit element \(a\in R\) is necessarily contained in some maximal ideal.

Proof. Consider the (left) ideal \(Ra\) generated by \(a\). Since \(a\) is not a unit, we know that \(Ra\neq R\), so \(Ra\) is a proper ideal. Applying Zorn’s lemma to the set of all proper (left) ideals containing \(Ra\) — the union of a chain of proper ideals is again proper, since it never contains \(1\) — we can find a maximal (left) ideal containing it. ◻

Exercise 3.11 (\(\bigstar\)). Let \(R\) be a commutative ring and \(I\) an ideal of \(R\). Prove that \(R/I\) is a field if and only if \(I\) is a maximal ideal.

Exercise 3.12 (\(\bigstar\)). Let \(a\) be an element contained in every maximal ideal of \(R\). Prove that \(1+a\) is a unit.

Kernels and Images of Homomorphisms

Just like the kernels and images of group homomorphisms, for a ring homomorphism \(f:R\to S\), we can define \[\ker(f):=\{r\in R|f(r)=0\}\] and \[{\rm Im}(f):=\{f(r)| r\in R\}.\]

Exercise 3.13. Verify that the kernel is a two-sided ideal in \(R\) and the image is a subring in \(S\).

We have an obvious isomorphism theorem.

Theorem 3.1 (First Isomorphism Theorem). Given a ring homomorphism \(f:R\to S\), it induces a ring isomorphism \[\overline{f}:R/\ker(f)\to {\rm Im}(f)\]

Proof. This isomorphism is defined by \[r+\ker(f)\mapsto f(r)\] It is obviously well-defined, since the image of \(\ker(f)\) is 0. If \(f(r)=0\), then \(r\in\ker(f)\), so it is injective. Surjectivity is also obvious. ◻

Theorem 3.2 (Correspondence Theorem). Let \(I\) be a two-sided ideal and \(p:R\to R/I\) the natural projection. Then the ideals of \(R/I\) correspond one-to-one with the ideals of \(R\) that contain \(I\).

Exercise 3.14. Prove that the only ideals of a field are the whole field and the zero ideal \((0):=\{0\}\). Deduce that a homomorphism from a field to any ring can only be a monomorphism or the zero homomorphism.

Noetherian Property

If all ideals in the ring \(R\) are finitely generated, i.e. any ideal \(I\subset R\) can be expressed as \[I=Ra_1+Ra_2+\dots+Ra_n\] (where for right ideals, we replace \(Ra_i\) with \(a_iR\)), where \(a_i\in R\), then we say that \(R\) is a Noetherian ring (left, right, or two-sided). This is an important and fundamental property in ring theory, and most of the rings we study are Noetherian.

There are several classical equivalent statements about the Noetherian property, as follows:

Theorem 3.3 (Noetherian Property). The following three statements are equivalent (each with a corresponding theorem for left, right, and two-sided, respectively):

  1. \(R\) is a Noetherian ring.

  2. Any ascending chain of ideals in \(R\), \[I_1\subset I_2\subset\dots\] must eventually stabilize, i.e. there exists \(n\) such that \(I_n=I_{n+1}=I_{n+2}=\dots\).

  3. Any non-empty set \(S\) consisting of ideals in \(R\) must have a maximal element, i.e. there exists \(I\in S\) such that \(I\subset J\in S\Rightarrow I=J\).

Proof. \((1)\Rightarrow (2):\)
Given an ascending chain \(I_1\subset I_2\subset\dots\), let \(I=\bigcup_{i=1}^\infty I_i\); it is easy to check that this is an ideal. Therefore \(I\) is finitely generated, say by \(a_1,\dots, a_m\). Each \(a_i\in I_{n_i}\) for some \(n_i\), so for \(n\ge \max(n_1,\dots, n_m)\) we have \(I_n=I\), and the chain stabilizes from there on.

\((2)\Rightarrow (3):\)
If some non-empty set \(S\) of ideals had no maximal element, then every ideal in \(S\) would be strictly contained in another member of \(S\), so we could build a strictly increasing infinite chain, contradicting \((2)\).

\((3)\Rightarrow (1):\)
Let \(I\) be an arbitrary ideal, and let \(S\) be the set of all finitely generated ideals contained in \(I\); it is non-empty since \((0)\in S\). Choose a maximal element \(J\in S\). If \(J\neq I\), there exists \(a\in I\) with \(a\not\in J\). Then \(J+Ra\) is a finitely generated ideal contained in \(I\) and strictly larger than \(J\), which contradicts the maximality of \(J\). Therefore, \(J=I\). ◻

Exercise 3.15. Prove that a finite ring is necessarily Noetherian.

Exercise 3.16. Prove that if \(R\) is Noetherian, then \(R/I\) is also, where \(I\) is a two-sided ideal. Hence, it follows that any homomorphic image of \(R\) is also Noetherian.

The following fundamental theorem shows that many rings are Noetherian.

Theorem 3.4 (Hilbert’s Basis Theorem). If \(R\) is Noetherian, then \(R[x]\) is also.

Proof. Let \(I\) be an ideal in \(R[x]\) (say a left ideal). Define \[J_m = \{r\in R \mid r\text{ is the leading coefficient of some polynomial of degree } m \text{ in } I\}\cup\{0\}.\] Each \(J_m\) is a (left) ideal of \(R\), and \(J_m\subset J_{m+1}\), because multiplying a polynomial by \(x\) raises its degree by one without changing its leading coefficient. This gives an ascending chain of ideals \[J_0\subset J_1\subset J_2\subset\dots\] By the Noetherian property of \(R\), each \(J_i\) is finitely generated and this chain must stabilize, say \(J_n = J_{n+1} = \dots\) For each \(i\le n\), choose polynomials \(f_{i,j}(x)\in I\) of degree \(i\) whose leading coefficients generate \(J_i\). We will prove that every \(g_0\in I\) is generated by \(\{f_{i,j}\}\). Let \(d=\deg(g_0)\) and \(m=\min(d,n)\). The leading coefficient of \(g_0\) lies in \(J_d=J_m\), so it can be written as an \(R\)-combination \(\sum_j r_j\cdot(\text{leading coefficient of } f_{m,j})\), and then \[g_1=g_0-\sum_j r_j f_{m,j}x^{d-m}\] lies in \(I\) with degree strictly less than \(d\). If \(g_1\neq 0\), we repeat the same step with \(g_1\), and so on. Since degrees are non-negative integers, this process must stop at \(0\), expressing \(g_0\) as a combination of the \(f_{i,j}\). This proves the proposition. ◻

Arithmetic in Principal Ideal Domains

In this section, we assume that all rings are commutative. In this case, there is no need to distinguish between left and right ideals, and we refer to them collectively as ideals. Let \(a_1,\dots, a_n\in R\), we define the ideal notation \[(a_1,\dots, a_n):= Ra_1+Ra_2+\dots +Ra_n\] to represent the ideal generated by \(a_1,\dots,a_n\). If an ideal \(I\) can be generated by a single element \(a\in R\), i.e. \(I=(a)\), then we call it a principal ideal. If all ideals of an integral domain \(R\) are principal, then we call it a principal ideal domain (PID).

Theorem 3.5 (Greatest Common Divisor Theorem). \(\mathbb{Z}\) is a PID.

Proof. Let \(I\) be a nonzero ideal of \(\mathbb{Z}\), and let \(0<a\in I\) (such an element exists, since \(I\) contains some \(x\neq 0\) together with \(-x\)). If \(I=(a)\), the proof is complete. Otherwise, let \(0<b\in I\) be such that \(a\) does not divide \(b\), and consider the division with remainder \[b=aq+r, \quad 0<r<a.\] Thus, we have \(r=b-aq\in I\). If all elements of \(I\) are divisible by \(r\), then \(I=(r)\) and the proof is complete. Otherwise, we take an element of \(I\) not divisible by \(r\) and repeat this process, obtaining ever smaller positive elements of \(I\). Since positive integers cannot decrease forever, the process must stop at a step where the current \(r\in I\) divides all elements of \(I\). Therefore, \(I\) is a principal ideal. ◻

Corollary 3.1. Let \(d\) be the greatest common divisor of \(a_1,\dots, a_n\in \mathbb{Z}\). Then there exist integers \(x_1,\dots, x_n\in \mathbb{Z}\) such that \[d=a_1x_1+\dots +a_n x_n.\]

Proof. According to the theorem, \(I=(a_1,\dots,a_n)=(r)\) is a principal ideal. Obviously, \(d|r=r_1a_1+\dots+r_na_n\). But since \(a_i\in (r)\), we also have \(r|a_i\), so \(r\) divides the greatest common divisor of \(a_i\), \(r|d\). This shows that \(r=\pm d\). Therefore, \(d\in (d)=\mathbb{Z}a_1+\dots +\mathbb{Z}a_n\), and there exist \(x_i\in \mathbb{Z}\) such that \[d=a_1x_1+\dots +a_n x_n.\] ◻
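Corollary 3.1 is effective: the coefficients \(x_i\) can be computed by the extended Euclidean algorithm, which runs the division-with-remainder process from the proof of Theorem 3.5 and tracks the coefficients. A minimal Python sketch for two generators (iterating extends it to \(n\) generators, since \(\gcd(a_1,\dots,a_n)=\gcd(\gcd(a_1,\dots,a_{n-1}),a_n)\)):

```python
# Extended Euclidean algorithm: returns (d, x, y) with d = gcd(a, b) = a*x + b*y.

def ext_gcd(a, b):
    if b == 0:
        return (a, 1, 0)
    d, x, y = ext_gcd(b, a % b)
    # d = b*x + (a % b)*y = b*x + (a - (a//b)*b)*y = a*y + b*(x - (a//b)*y)
    return (d, y, x - (a // b) * y)

d, x, y = ext_gcd(240, 46)
assert d == 240 * x + 46 * y
print(d, x, y)  # gcd is 2, with Bezout coefficients x, y
```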

Let \(R\) be a general integral domain. Similar to how we define prime numbers, we call a non-unit element \(a\in R\) irreducible if \(a=bc\) implies that either \(b\) or \(c\) is a unit. We also call \(p\in R\) a prime element if \((p)\) is a prime ideal, meaning that \(p|ab\) implies \(p|a\) or \(p|b\). We know that in the case of the integer ring \(\mathbb{Z}\), irreducible elements and prime elements are equivalent, but this is not necessarily true in a general integral domain. Interestingly, this is true in a PID.

Exercise 3.17 (\(\bigstar\)). Prove that in a PID, irreducible elements and prime elements are equivalent.

We will now show that principal ideal domains have very good properties in terms of arithmetic.

Theorem 3.6 (PID\(\Rightarrow\)UFD). We call an integral domain \(R\) a Unique Factorization Domain (UFD) if every non-zero non-unit element can be written as a product of irreducible elements, uniquely up to reordering and unit factors. We have: \(R\) is a PID \(\Rightarrow R\) is a UFD. As a corollary, irreducible elements and prime elements are also equivalent in a UFD.

We first prove a lemma, which essentially provides the existence part of the factorization.

Lemma 3.1. In a Noetherian ring, for any non-zero ideal \(I\), there exist a finite number of non-zero prime ideals \(P_i\) such that \(I\supset P_1\dots P_n\).

Proof. Let \(S\) be the set of non-zero ideals for which the proposition fails, and suppose \(S\) is non-empty. By the Noetherian property, we can choose a maximal element \(J\in S\). Note that \(J\) is not a prime ideal, since a non-zero prime ideal trivially contains itself as a product of one prime. Hence there exist \(x,y\not\in J\) with \(xy\in J\). We have \[((x)+J)((y)+J)\subset (xy)+J\subset J.\] Note that \((x)+J\) and \((y)+J\) are strictly larger than \(J\), so by the maximality of \(J\) each of them contains a product of non-zero prime ideals. Therefore their product, and hence \(J\), also contains such a product, which contradicts \(J\in S\). Hence, \(S\) is an empty set.

As for the uniqueness in Theorem 3.6, assume there are two factorizations \[a = u p_1\dots p_m = v q_1\dots q_n\] where \(u,v\) are units and the \(p_i,q_j\) are prime elements. From \(p_1 \mid v q_1\dots q_n\) we conclude that \(p_1\) divides some \(q_j\); since \(q_j\) is irreducible and \(p_1\) is not a unit, \((p_1)=(q_j)\), which means they differ by a unit and can be cancelled from both sides. Repeating this process until one side becomes a unit, the other side cannot have any prime elements left. This proves that \(m=n\) and that the prime factors appearing in the two factorizations are the same up to order and units. ◻

Corollary 3.2. \(\mathbb{Z}\) is a UFD.

Exercise 3.18 (\(\bigstar\)). Explain why in a PID all non-zero prime ideals are maximal.

Note that although the unique factorization property may seem obvious and you might think it holds in all rings, this is not the case. For example, in \(\mathbb{Z}[\sqrt{-5}]=\{a+b\sqrt{-5}|a,b\in\mathbb{Z}\}\), we have \[6=2\cdot 3 = (1+\sqrt{-5}) (1-\sqrt{-5})\] These are two distinct factorizations of \(6\). The interesting thing is that the factors appearing in these factorizations are irreducible but not prime.

Exercise 3.19 (\(\bigstar\bigstar\)). Prove that \(2,3,1+\sqrt{-5},1-\sqrt{-5}\) are irreducible elements in \(\mathbb{Z}[\sqrt{-5}]\), but not prime. Hence, the unique factorization property does not hold in \(\mathbb{Z}[\sqrt{-5}]\).
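For readers who want to experiment: the norm \(N(a+b\sqrt{-5})=a^2+5b^2\) is multiplicative, and the irreducibility part of Exercise 3.19 reduces to the fact that no element has norm \(2\) or \(3\) (a non-trivial factor of an element of norm \(4\), \(6\), or \(9\) would have to have norm \(2\) or \(3\)). A short Python search confirms this; the search bound is harmless, since \(a^2+5b^2\le 3\) already forces \(|a|\le 1\) and \(b=0\).

```python
# In Z[sqrt(-5)], N(a + b*sqrt(-5)) = a^2 + 5b^2 is multiplicative.
# 2, 3, 1±sqrt(-5) have norms 4, 9, 6; a nontrivial factor would have norm 2 or 3.

def has_norm(n, bound=10):
    return any(a * a + 5 * b * b == n
               for a in range(-bound, bound + 1)
               for b in range(-bound, bound + 1))

print(has_norm(2), has_norm(3))  # False False: no elements of norm 2 or 3
```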

Interesting Example: Gaussian Integers

The ring of Gaussian integers is defined as \(R=\mathbb{Z}[i]=\{a+bi| a,b\in\mathbb{Z}\}\). We will use a method similar to that for \(\mathbb{Z}\) to prove that \(\mathbb{Z}[i]\) is a PID and therefore a UFD. We define a norm on the ring \(\mathbb{Z}[i]\): \[N(a+bi):= a^2+b^2.\] It describes the 'size' of elements in the ring. It is easy to see that this norm is multiplicative, i.e. \(N(\alpha \beta)=N(\alpha)N(\beta)\). This also has implications in arithmetic.

Exercise 3.20. Prove that if both \(n\) and \(m\) can be written as the sum of two squares, then \(nm\) can also be written as the sum of two squares.

Exercise 3.21. Prove that in \(\mathbb{Z}[i]\), \(N(\alpha)=1\) if and only if \(\alpha\) is a unit.

Lemma 3.2 (Euclidean division in Gaussian integers). Let \(\alpha,\beta\in\mathbb{Z}[i]\). Then there exist \(\gamma,\delta\in \mathbb{Z}[i]\) such that \[\alpha=\beta\gamma +\delta\] and \(0\le N(\delta)<N(\beta)\).

Proof. Consider the quotient of complex numbers \(\frac{\alpha}{\beta}=x+iy\), and let \(a\) and \(b\) be the integers closest to \(x\) and \(y\), respectively. Write \(E=(x-a)+(y-b)i\). Then \[\alpha = \beta(x+iy) = \beta(a+bi + E) = \beta (a+bi) + \delta,\] where \(\delta=\alpha - \beta(a+bi)\in \mathbb{Z}[i]\), and \[N(E)=|E|^2\le \frac{1}{4}+\frac{1}{4}<1.\] Thus, \[N(\delta) = N(\beta) |E|^2 < N(\beta).\] ◻
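The proof is itself an algorithm: round the complex quotient to the nearest Gaussian integer. A small Python sketch (using Python's built-in complex numbers, which are exact for the small integer inputs used here; helper names are ours):

```python
# Euclidean division in Z[i]: divide alpha by beta by rounding the
# complex quotient to the nearest Gaussian integer.

def gauss_divmod(alpha, beta):
    q = alpha / beta
    a, b = round(q.real), round(q.imag)   # nearest integers to x and y
    gamma = complex(a, b)
    delta = alpha - beta * gamma
    return gamma, delta

def norm(z):
    return round(z.real) ** 2 + round(z.imag) ** 2

alpha, beta = complex(27, 23), complex(8, 1)
gamma, delta = gauss_divmod(alpha, beta)
assert alpha == beta * gamma + delta
assert norm(delta) < norm(beta)   # in fact N(delta) <= N(beta)/2
print(gamma, delta)
```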

Corollary 3.3. \(\mathbb{Z}[i]\) is a PID, and therefore also a UFD.

Proof. Following the proof that \(\mathbb{Z}\) is a PID: let \(I\) be a non-zero ideal and pick \(0\neq a\in I\). If \((a)\neq I\), we choose \(b\in I-(a)\) and use Euclidean division \(b=aq+\delta\) to produce a remainder \(\delta=b-aq\in I\) with \(0<N(\delta)<N(a)\); we then replace \(a\) by \(\delta\) and repeat. Since norms are positive integers, this process must eventually stop, yielding \(r\in I\) that divides all elements of \(I\), so \(I=(r)\). ◻

Let’s see the powerful application of the fact that \(\mathbb{Z}[i]\) is a UFD.

Proposition 3.3. The equation \[y^2+1=x^3\] has a unique integer solution, \((x,y)=(1,0)\).

Proof. Consider the equation in \(\mathbb{Z}[i]\): \[(y+i)(y-i)=x^3.\] Let \((d)=(y+i,y-i)\) be the greatest common divisor of \(y+i\) and \(y-i\) in \(\mathbb{Z}[i]\). Then \(d|y+i,d|y-i\Rightarrow d|(y+i)-(y-i)=2i\). It is easy to factor \(2i\) as \(2i=(1+i)(1-i)i\), where \(i\) is a unit and \((1+i),(1-i)\) are irreducible elements, hence prime elements. If \(d\) contained \((1+i)\) or \((1-i)\) as a prime factor, then \((y-i)(y+i)=x^3\) would be divisible by \(2\), so \(x\) would be even and \(x^3\) divisible by \(8\); thus \(y^2=x^3-1\) would be of the form \(8k+7\). However, this is impossible, since the square of an integer leaves remainder \(0,1,\) or \(4\) when divided by \(8\). Therefore, \(d\) is a unit, which means \(y+i\) and \(y-i\) are coprime. In a UFD, if the product of two coprime elements is a cube, then each of them is a cube up to a unit, so we can write \[y+i = u(a+bi)^3,\] \[y-i = \overline{u}(a-bi)^3,\] where \(u\in \mathbb{Z}[i]^\times\) is a unit. Since the only units in \(\mathbb{Z}[i]\) are \(\pm 1, \pm i\), and \(\pm 1 = (\pm 1)^3\), \(\pm i = (\mp i)^3\), we can absorb \(u\) into the cube, so we may assume \(u=1\). Then: \[y+i = a^3-3ab^2 + (3a^2b-b^3)i.\] This implies \(3a^2b-b^3=b(3a^2-b^2)=1\), so \(b=\pm 1\) and \(3a^2-1=\pm 1\), which is only possible when \(a=0\) and \(b=-1\). Then \(y+i=(-i)^3=i\), giving \(y=0\) and \(x=1\). Thus, the only integer solution of the equation is \(y=0, x=1\). ◻

Exercise 3.22 (\(\bigstar\bigstar\)). Prove that \(\mathbb{Z}[\sqrt{-2}]\) is a PID.

Exercise 3.23 (\(\bigstar\bigstar\bigstar\bigstar\)). Find all integer solutions to \[y^2+2=x^3\]

Exercise 3.24 (\(\bigstar\bigstar\bigstar\)). Prove that \(\mathbb{Z}[\sqrt{-3}]\) is not a UFD, but that the slightly larger ring \(\mathbb{Z}\left[\frac{1+\sqrt{-3}}{2}\right]\) is a PID.

Polynomial Rings over Fields are PIDs

This conclusion is classic and important. We give its proof here, which is again based on division with remainder.

Theorem 3.7. Let \(a,b\in K[x]\) be two polynomials with \(b\neq 0\). Then there exist polynomials \(q,r\in K[x]\) such that \[a = bq+r\] where \(\deg(r)<\deg(b)\) (with the convention \(\deg 0=-\infty\)).

Proof. If \(\deg(a)<\deg(b)\), take \(q=0\) and \(r=a\). Otherwise, let \(a(x)=a_0 x^n + a_1 x^{n-1}+\dots\), \(b(x) = b_0 x^m + b_1 x^{m-1} +\dots\) with \(n\ge m\), and take \[q_0(x)=\frac{a_0}{b_0}x^{n-m}.\] We have \[a-bq_0 = 0\cdot x^n +\dots\] The degree-\(n\) term is cancelled. If the degree of \(a-bq_0\) is still not less than that of \(b\), let \(a_0'x^{n_1}\) be its highest non-zero term and take \[q_1 = q_0 + \frac{a_0'}{b_0}x^{n_1-m}.\] We have \[a-bq_1 = 0\cdot x^n + 0\cdot x^{n_1}+\dots\] Since the degree strictly decreases at each step, after finitely many steps the degree becomes less than that of \(b\), and we denote the final \(a-bq\) by \(r\). ◻
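The proof above is again an algorithm. Here is a direct transcription in Python over \(K=\mathbb{Q}\), representing a polynomial by its coefficient list (lowest degree first, using `fractions.Fraction`); this is a sketch for illustration, not a library routine.

```python
# Polynomial division over K = Q, transcribing the proof:
# repeatedly cancel the top term of a by (a0'/b0) * x^(n1-m) times b.
from fractions import Fraction

def polydiv(a, b):
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while b and b[-1] == 0:          # normalize b; b must be nonzero
        b.pop()
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    r = a[:]
    while any(r):
        while r and r[-1] == 0:      # drop leading zeros
            r.pop()
        if len(r) < len(b):
            break
        c = r[-1] / b[-1]            # a0' / b0
        k = len(r) - len(b)          # exponent n1 - m
        q[k] += c
        for i, bc in enumerate(b):   # subtract c * x^k * b from r
            r[k + i] -= c * bc
    return q, r

# (x^3 + 2x + 1) = (x^2 + 1) * x + (x + 1)
q, r = polydiv([1, 2, 0, 1], [1, 0, 1])
print(q, r)  # q = x, r = x + 1
```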

Corollary 3.4. \(K[x]\) is a PID.

Exercise 3.25 (\(\bigstar\)). Is the polynomial ring \(K[x,y]\) generated by two variables \(x\) and \(y\) still a PID? Why?

Fractionalization and Fraction Field of Rings

This section discusses only commutative rings. For an integral domain \(R\), we can take a multiplicative subset \(S \subset R\), that is, a subset satisfying \(a \in S, b \in S \Rightarrow ab \in S\); we assume \(0\not\in S\), and without loss of generality \(1 \in S\). The fractionalization (localization) of the ring \(R\) with respect to \(S\) is defined as \[S^{-1}R := \left\{\frac{r}{s} \,\middle|\, r \in R, s \in S\right\},\] where two fractions \(r/s, r'/s'\) are regarded as equal if and only if \(rs' - r's = 0\). This set can be naturally equipped with addition and multiplication structures \[\frac{r}{s} + \frac{r'}{s'} := \frac{rs' + sr'}{ss'} \in S^{-1}R,\] \[\frac{r}{s} \cdot \frac{r'}{s'} := \frac{rr'}{ss'} \in S^{-1}R.\] It can be easily verified that these operations are well-defined and make \(S^{-1}R\) a ring. If we take \(S = R \backslash \{0\}\), then the resulting fractionalization is denoted as \({\rm Frac}(R)\) and is called the fraction field of \(R\).
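For \(R=\mathbb{Z}\) this construction is exactly \(\mathbb{Q}\), and Python's `fractions.Fraction` behaves exactly like it: a fraction is stored as a reduced pair \((r,s)\), and equality realizes the relation \(rs'-r's=0\). A tiny demonstration:

```python
# Frac(Z) = Q in practice: Fraction stores a reduced pair (numerator, denominator)
# and its equality test realizes the equivalence r/s = r'/s' <=> r*s' - r'*s = 0.
from fractions import Fraction

x = Fraction(2, 6)                # reduced internally to 1/3
y = Fraction(1, 3)
assert x == y                     # 2*3 - 1*6 == 0
print(x + y, x * y)               # 2/3 1/9
```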

Definition 3.5 (Integrally Closed). Let \(R\) be an integral domain and let \(K = {\rm Frac}(R)\). If an element \(x \in K\) satisfies a monic polynomial equation with coefficients in \(R\), \[x^n + r_1x^{n-1} + \dots + r_{n-1}x + r_n = 0,\] then \(x\) is said to be integral over \(R\). The set of all elements of \(K\) that are integral over \(R\) is called the integral closure of \(R\) in \(K\), denoted \(\overline{R}\). If the integral closure of \(R\) in \(K\) equals \(R\), then \(R\) is said to be integrally closed (or normal).

Example 3.7. The integral closure of \(\mathbb{Z}\) in \(\mathbb{Q}\) is \(\mathbb{Z}\): if \(\frac{x}{y}\) is a reduced fraction satisfying a monic equation with \(\mathbb{Z}\)-coefficients, we can multiply both sides by \(y^n\) to get \[x^n+r_1yx^{n-1}+\dots +r_n y^n=0,\] which implies that every prime factor of \(y\) divides \(x^n\), hence divides \(x\). Since the fraction is reduced, \(y\) has no prime factors at all, which means \(y\) is a unit and \(\frac{x}{y}\) is an integer. Therefore, \(\mathbb{Z}\) is integrally closed.

Example 3.8. Let \(d\neq 0,1\) be a square-free integer, then \[{\rm Frac}(\mathbb{Z}[\sqrt{d}])=\mathbb{Q}(\sqrt{d}):=\{x+y\sqrt{d}|x,y\in\mathbb{Q}\}.\] This is because \[\frac{r+s\sqrt{d}}{a+b\sqrt{d}}=\frac{(r+s\sqrt{d})(a-b\sqrt{d})}{(a+b\sqrt{d})(a-b\sqrt{d})}=\frac{ra-sbd}{a^2-b^2d}+\frac{sa-rb}{a^2-b^2d}\sqrt{d}.\] The denominator \(a^2-b^2d\) is non-zero for \((a,b)\neq(0,0)\), because \(d\) is square-free (so \(\sqrt{d}\) is irrational).

Exercise 3.26. Prove that UFDs are integrally closed.

Exercise 3.27. Verify that \(\frac{1+\sqrt{-7}}{2}\in{\rm Frac}(\mathbb{Z}[\sqrt{-7}])\) is integral over \(\mathbb{Z}\) (hence over \(\mathbb{Z}[\sqrt{-7}]\)) but does not lie in \(\mathbb{Z}[\sqrt{-7}]\); by the previous exercise, this proves that \(\mathbb{Z}[\sqrt{-7}]\) is not a UFD.

Exercise 3.28 (\(\bigstar\bigstar\)). Find the integral closure of \(\mathbb{Z}[\sqrt{5}]\).

Exercise 3.29 (\(\bigstar\bigstar\)). Let’s examine an example of how localization affects prime ideals.

  1. For the ring \(\mathbb{Z}\), list all of its prime ideals.

  2. Take \(S=\{1,2,4,8,16,\dots\}\) and consider the localization of the ring \(S^{-1}\mathbb{Z}\). List all of its prime ideals.

  3. Take the prime ideal \((2)\subset \mathbb{Z}\) and let \(S=\mathbb{Z}\backslash (2)\). Prove that this is a multiplicative subset and list all of the prime ideals in \(S^{-1}\mathbb{Z}\).

Exercise 3.30. By following the prompts below, prove that \(K[x,y]\) is a UFD. Hence, it can be concluded that a UFD is not necessarily a PID.

  1. Let \(K(x)\) be the field of fractions of \(K[x]\). Then \(K(x)[y]\) is a polynomial ring over the field \(K(x)\), and thus a PID.

  2. Place any polynomial \(f(x,y)\in K[x,y]\) into \(K(x)[y]\) to obtain a factorization.

  3. Examine this factorization in some \((K[x]/(p(x)))[y]\), where \(p(x)\in K[x]\) is some irreducible polynomial. Note that \(K[x]/(p(x))\) is a field, and thus \((K[x]/(p(x)))[y]\) is also a PID.

  4. Prove that \(K[x,y]\) is a UFD.

Elementary Number Theory in Cryptography

\(\mathbb{Z}/n = \mathbb{Z}_n\) is an abelian group, representing the \(n\) residue classes modulo \(n\), and forms a group under addition. If we consider \(\mathbb{Z}_n^\times = (\mathbb{Z}/n)^\times\), the set of all residue classes coprime to \(n\), this set forms a group under multiplication, and its order is \(\varphi(n)\). According to Lagrange’s theorem, for any \(a\in \mathbb{Z}_n^\times\), we have \(a^{\varphi(n)}=1\), which means that for any integer \(a\) coprime to \(n\), we have \[a^{\varphi(n)} \equiv 1 \mod n.\] This is the Euler-Fermat theorem in elementary number theory. This theorem has interesting applications in modern cryptography.

In public key cryptography, people hope to achieve the following goals:

  • We need to communicate in a channel that is reliable in terms of communication quality, but not secure in terms of information security.

  • We want only the sender and receiver to be able to decipher the content of the message, while any third party eavesdropping on the channel can only obtain the ciphertext, not the plaintext.

  • We also hope that this encryption communication system does not require pre-distribution of keys.

The most secure encryption method in classical cryptography is the one-time pad: both parties in communication prepare a long random string \(k\in \mathbb{Z}_n^m\) as a one-time key that only they know. To communicate, the key \(k\) is added to the plaintext \(x\) to obtain the ciphertext \(x+k\). Since \(k\) is completely random and independent of \(x\), the ciphertext \(x+k\) is uniformly distributed. Therefore, anyone who obtains \(x+k\) without knowing \(k\) gains no information about \(x\). The recipient only needs to calculate \((x+k)-k=x\) to recover the original message. However, this key can only be used once: \(x\) has a non-trivial probability distribution, so reusing the same key leaks information. Both parties therefore need to prepare a large number of one-time keys, which is not only inconvenient but also makes key distribution a difficult problem: it is hard to ensure that fresh keys can be delivered to the other party securely (through a channel or offline). A minimal sketch of the scheme follows.
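Here is a minimal Python sketch of a one-time pad over \(\mathbb{Z}_{256}\) (bytes), adding the key to the plaintext componentwise; the function names are ours.

```python
# One-time pad over Z_256 (bytes): ciphertext = plaintext + key (mod 256).
# The key must be uniformly random, as long as the message, and never reused.
import secrets

def otp_encrypt(plaintext: bytes, key: bytes) -> bytes:
    assert len(key) == len(plaintext)
    return bytes((x + k) % 256 for x, k in zip(plaintext, key))

def otp_decrypt(ciphertext: bytes, key: bytes) -> bytes:
    return bytes((c - k) % 256 for c, k in zip(ciphertext, key))

msg = b"attack at dawn"
key = secrets.token_bytes(len(msg))   # the shared one-time key
ct = otp_encrypt(msg, key)
assert otp_decrypt(ct, key) == msg
```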

The beginning of modern cryptography, the RSA public key encryption system, solves this problem by introducing elementary number theory and the idea of asymmetric encryption. Consider a large integer \(n=pq\) formed by multiplying two large prime numbers; then \(\mathbb{Z}_n^\times\) is a group of order \(\varphi(n)=(p-1)(q-1)\). That is, for \(a\in\mathbb{Z}_n^\times\), \(a^{(p-1)(q-1)}=1\). In fact, if we take a pair of exponents \((e,d)\) such that \(ed\equiv 1\mod \varphi(pq)\), then \[(x^{e})^d \equiv x^{ed} \equiv x \mod n\] holds for all integers \(x\) — even those not coprime to \(n\) — as can be checked modulo \(p\) and modulo \(q\) separately using Fermat's little theorem. Therefore, we can use \(e\) to encrypt the message \(x\in \mathbb{Z}/n\), obtaining the ciphertext \(x^e\in \mathbb{Z}/n\), and then use \(d\) to decrypt it, recovering \(x^{ed}=x\in \mathbb{Z}/n\). Note that the key used for encryption and the key used for decryption are different, and their relationship is obtained by solving \(ed=1 \in \mathbb{Z}/(p-1)(q-1)\) (by ideal theory, it suffices that \(e\) and \((p-1)(q-1)\) are coprime, so that \(ex+(p-1)(q-1)y=1\) has a solution). If \(p,q\) are large, it is difficult to factor \(n\), and without factoring \(n\) it is impossible to know the value of \((p-1)(q-1)\) (it can easily be shown that, given \(n\), knowing \((p-1)(q-1)\) is equivalent to knowing \(p\) and \(q\)). Therefore, we can construct the following encryption system, which requires no key distribution (a toy numerical example follows the list):

  1. First, generate two random, large prime numbers \(p,q\).

  2. Calculate \(n=pq, m=(p-1)(q-1)\).

  3. Randomly generate an integer \(e\) that is coprime to \(m\), and use the extended Euclidean algorithm to calculate \(d\) such that \(ed\equiv 1\mod m\).

  4. Publicly announce \((n,e)\) as your public key, keep \(d\) as private (secret) information, and keep or discard information such as \(p,q,m\).

  5. When someone needs to send an encrypted message \(x\in\mathbb{Z}/n\) to you, they can calculate \(x^e\in \mathbb{Z}/n\) and send it to you through a channel.

  6. A third party who obtains \(x^e\) cannot know \(x\) because they do not know \(d\) and cannot factor \(n\).

  7. You only need to calculate \((x^e)^d=x\in \mathbb{Z}/n\) to decrypt the original message.
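The following toy Python sketch runs the steps above with artificially small primes (completely insecure; real implementations use much larger primes and pad messages):

```python
# Toy RSA with small primes, for illustrating the arithmetic only.
from math import gcd

p, q = 61, 53
n, m = p * q, (p - 1) * (q - 1)    # n = 3233, m = phi(n) = 3120

e = 17
assert gcd(e, m) == 1
d = pow(e, -1, m)                   # d with e*d ≡ 1 (mod m); needs Python 3.8+

x = 1234                            # message in Z/n
c = pow(x, e, n)                    # encrypt with the public key (n, e)
assert pow(c, d, n) == x            # decrypt with the private key d
print(n, e, d, c)
```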

Modules and Linear Algebra

Basic Knowledge of Modules

Modules can be considered among the most pervasive and important algebraic structures in modern algebra. They can be seen as a natural generalization of linear algebra, but the phenomena they exhibit far exceed those of vector spaces over fields. Just as groups can act on sets, we can also let a ring act on a set. However, we want both the addition and the multiplication of the ring to act naturally on such a set. If we regard ring multiplication as composition of actions, it is not yet clear how ring addition should interact with the action, so we need an additional additive structure. This leads to the definition of modules:

Definition 4.1 (\(R\)-module, multiple module, bimodule). Let \(R\) be a ring, an \(R\)-module \(M\) is defined as an abelian group with a (left) action of \(R\) on \(M\), which means for \(r\in R\), \(m\in M\), we can define an action of \(r\) on \(m\), denoted as \(rm\in M\). This can be understood as a mapping from \(R\times M\) to \(M\), and this action satisfies the following properties:

  1. (Identity) \(1m=m\)

  2. (Associativity) \((r'r)m=r'(rm)\)

  3. (Distributivity over addition in \(R\)) \((r'+r)m = r'm + rm\)

  4. (Distributivity over addition in \(M\)) \(r(m+m')=rm+rm'\)

We say that \(M\) is a left \(R\)-module, also denoted as \({}_RM\). Similarly, if \(M\) defines a right action \(mr\in M\), satisfying similar properties, note that the associativity should be changed to \(m(r'r)=(mr')r\). We say that \(M\) is a right \(R\)-module, also denoted as \(M_R\).

If we compare this with the definition made for group actions — a group action is a group homomorphism \(G\to {\rm Aut}(X)\) — then we can understand the above definition as follows: a left \(R\)-module \(M\) is an action of \(R\) on an abelian group \(M\), that is, a ring homomorphism \(\rho:R\to {\rm End}(M)\). Here, \({\rm End}(M)\) denotes the ring formed by group homomorphisms from \(M\) to itself, with multiplication given by composition of abelian group homomorphisms, \(f\cdot g = f\circ g\). So each \(r\) can be imagined as an operator on \(M\), an endomorphism: \(rm = \rho(r)m\). Because \(\rho(r)\) is an abelian group homomorphism, the distributive law over \(M\) is automatically satisfied; the identity axiom, associativity, and the distributive law over \(R\) are guaranteed by the fact that \(\rho\) is a ring homomorphism.

It is worth noting that a right \(R\)-module \(M\) is the same as a ring anti-homomorphism \(\varphi:R\to {\rm End}(M)\), which reverses the direction of multiplication: \(\varphi(rs) = \varphi(s)\cdot \varphi(r)\).

If an abelian group \(M\) carries several module structures — left module structures over a family of rings \(\{A_i\}\) and right module structures over a family of rings \(\{B_j\}\) — we say that these actions commute, or are compatible, if \(a_j(a_im)=a_i(a_jm)\), \((a_im)b_j=a_i(mb_j)\), and \((mb_i)b_j=(mb_j)b_i\) for all \(m\in M\) and all elements of the respective rings. If these module structures are compatible with each other, we call \(M\) a multi-module, which can be denoted as a \((\{A_i\};\{B_j\})\)-module, or an \((A_i|B_j)\)-module, written \({}_{(A_i)}M_{(B_j)}\), and sometimes also \(M_{A_i|B_j}\). If \(M\) is an \((R|R)\) multi-module, i.e. it is both a left \(R\)-module and a right \(R\)-module and the two structures are compatible, \((rm)r' = r(mr')\), then \(M\) is called a bimodule over \(R\). When \(R\) is a commutative ring, left \(R\)-modules and right \(R\)-modules are collectively referred to as \(R\)-modules.

Exercise 4.1. Verify that the left multiplication of a ring \(R\) on itself can be seen as a left \(R\)-module. Similarly, the right multiplication is a right \(R\)-module, so \(R\) is a bi-module over \(R\).

Exercise 4.2. Let \(I\) be a two-sided ideal of \(R\). Verify that \(R/I\) is a bimodule over \(R\) under the natural left and right multiplications.

Exercise 4.3. Explain that every Abelian group is a \(\mathbb{Z}\)-module, and that a \(\mathbb{Z}\)-module is also an Abelian group. Show the equivalence of these two concepts.

Exercise 4.4. For a commutative ring \(R\), the structure of a right \(R\)-module can be defined for a left \(R\)-module \(M\) as follows: \[mr:= rm\] Why does this definition not hold when \(R\) is not a commutative ring?

Rings can be very complicated, and so can the modules over them. However, two classes of modules are particularly important and well-behaved: modules over fields (vector spaces) and finitely generated modules over principal ideal domains.

Homomorphism of Modules

Definition 4.2. Let \(M\) and \(N\) be two \(R\)-modules. A homomorphism of \(R\)-modules \(\varphi: M \to N\) is defined as an Abelian group homomorphism satisfying \(\varphi(rm) = r\varphi(m)\) for all \(r \in R\) and \(m \in M\). For right modules, the condition is modified to \(\varphi(mr) = \varphi(m)r\). A homomorphism of \(R\)-modules is also called an \(R\)-linear map.

For homomorphisms between multiple modules, \(\varphi\) must be compatible with all module structures.

The meanings of monomorphism, epimorphism, and isomorphism are the obvious ones and will not be repeated here. If there exists a module isomorphism \(\varphi:M\to N\), we say that \(M\) and \(N\) are isomorphic and write \(M\cong N\). Since the theory of right modules is completely parallel, we will mostly consider left modules by default.

For a subgroup \(N\subset M\) (under addition), if it satisfies \(RN\subset N\), we call \(N\) a submodule of \(M\). For any submodule, we can construct a quotient module \(M/N\), whose elements are cosets \(m+N\), with the action defined as \[r(m+N):=rm+N.\] It can be easily verified that this definition does not depend on the choice of coset representatives and satisfies the definition of an \(R\)-module.

Exercise 4.5. Assume that \(\varphi:M\to N\) is an \(R\)-module homomorphism. Prove that \(\ker\varphi:=\{m\in M|\varphi(m)=0\}\) is a submodule of \(M\), and \({\rm Im}\varphi\) is a submodule of \(N\). Prove the first isomorphism theorem \[\overline{\varphi}:M/\ker\varphi \cong {\rm Im}\varphi.\]

Exercise 4.6. Let \(\{M_i\}_{i\in I}\) be a family of submodules of \(M\). Prove that \[\sum_{i\in I}M_i=\{\text{all finite sums}\,m_{i_1}+\dots+m_{i_n}|m_{i_1}\in M_{i_1}, \dots, m_{i_n}\in M_{i_n}, \{i_1,\dots, i_n\}\subset I\}\] and \(\bigcap_{i\in I} M_i\) are both submodules of \(M\).

The following two theorems are essentially the same as the isomorphism theorem in group theory.

Theorem 4.1 (Second Isomorphism Theorem). Let \(M,N\) be submodules of some \(R\)-module. Then \(M+N\) and \(M\cap N\) are also submodules and \[(M+N)/N \cong M/M\cap N.\]

Theorem 4.2 (Correspondence Theorem). Let \(N\subset M\) be a submodule of \(M\). Then the submodules of \(M/N\) correspond one-to-one with the submodules \(L\) of \(M\) satisfying \(N\subset L\subset M\).

Exact Sequence

When we have a sequence of modules and homomorphisms \[M\xrightarrow{f} N\xrightarrow{g} P,\] if \(\ker(g)={\rm Im}(f)\), we say that the sequence is exact at \(N\). If a sequence of mappings of modules is exact at every point (except the leftmost and rightmost endpoints), we say that it is an exact sequence.

Exercise 4.7. \(0\to M\xrightarrow{f} N\) is an exact sequence if and only if \(f\) is a monomorphism.

Exercise 4.8. \(M\xrightarrow{f} N\to 0\) is an exact sequence if and only if \(f\) is an epimorphism.

Example 4.1. \[0\to \mathbb{Z}\xrightarrow{2}\mathbb{Z}\xrightarrow{p}\mathbb{Z}/2\mathbb{Z}\to 0\] is a short exact sequence of \(\mathbb{Z}\)-modules, where \(2\) is the multiplication-by-\(2\) map and \(p\) is the projection. Here, \(\ker(p)=2\mathbb{Z} = {\rm Im}(2)\).

Basic Construction of Modules

In this section, we will discuss how to construct new modules from existing ones. The methods we know include quotient modules, module intersections, and module sums.

\({\rm Hom}\) Functor

Given two left \(R\)-modules \(M\) and \(N\), we can define the set of all \(R\)-module homomorphisms from \(M\) to \(N\) as \({\rm Hom}_R(M,N)\). For homomorphisms of \(((A_i)|(B_j))\) multi-modules, we denote the set by \({\rm Hom}_{(A_i)|(B_j)}(M,N)\). Sometimes, to distinguish the direction of module homomorphisms, we use the notation \({\rm Hom}_{R|}(M,N)\) for the set of left \(R\)-module homomorphisms from \(M\) to \(N\). For two homomorphisms \(f:M\to N\) and \(g:M\to N\), we can define their sum as \[(f+g)(m):=f(m)+g(m),\] which is also a homomorphism. Therefore, \(f+g\in {\rm Hom}_R(M,N)\), showing that \({\rm Hom}_R(M,N)\) is an Abelian group. It is worth noting that if \(M\) and \(N\) are plain left (or plain right) \(R\)-modules without any further multi-module structure, then \({\rm Hom}_R(M,N)\) has, in general, no natural \(R\)-module structure and is only an Abelian group (a \(\mathbb{Z}\)-module). However, if at least one of \(M\) and \(N\) has a multi-module structure, \({\rm Hom}\) can be naturally endowed with a module structure, as we now demonstrate.

Module Structure of \({\rm Hom}\)

Let \(R\) and \(S\) be two rings. If \(M\) has a left \(R\)-module structure and a right \(S\)-module structure, we write it as \({}_RM_S\) when we need to specify the module structure.

We will give an example to show that \({\rm Hom}_R({}_RM_S,{}_RN)\) has a left \(S\)-module structure. This is because we can define, for \(f\in{\rm Hom}(M,N)\) and \(s\in S\), \[(sf)(m):= f(ms)\] Then \((sf)\) is clearly a left \(R\)-module homomorphism (we need to use the definition of multiple module: the module structures of multiple rings need to be compatible with each other). We need to verify that \({\rm Hom}_R(M,N)\) satisfies the definition of a left \(S\)-module, all other properties are obvious, the only one that needs careful checking is the associativity: \[(s's)f = s'(sf)\] The left homomorphism is \(m\mapsto f(ms's)\), the right homomorphism is \(s'(m\mapsto f(ms))=m\mapsto f(ms's)\), so this is true, which means that \({\rm Hom}_R(M,N)\) becomes a left \(S\)-module under this definition.

There are a total of eight similar cases, and we list them all below (each item gives the two symmetric cases, the natural module structure, and its definition):

  • \({\rm Hom}_{R|}\left({}_RM,{}_RN\right)\) and \({\rm Hom}_{|R}\left(M_R,N_R\right)\): only a \(\mathbb{Z}\)-module; no natural \(S\)-module structure.

  • \({\rm Hom}_{R|}\left({}_RM_S,{}_RN\right)\) and \({\rm Hom}_{|R}\left({}M_{S,R},{}N_R\right)\): left \(S\)-module, defined by \((sf)(m):=f(ms)\).

  • \({\rm Hom}_{R|}\left({}_{S,R}M,{}_RN\right)\) and \({\rm Hom}_{|R}\left({}_SM_R,{}N_R\right)\): right \(S\)-module, defined by \((fs)(m):=f(sm)\).

  • \({\rm Hom}_{R|}\left({}_RM,{}_RN_S\right)\) and \({\rm Hom}_{|R}\left({}M_R,{}N_{S,R}\right)\): right \(S\)-module, defined by \((fs)(m):=f(m)s\).

  • \({\rm Hom}_{R|}\left({}_RM,{}_{R,S}N\right)\) and \({\rm Hom}_{|R}\left({}M_R,{}_SN_R\right)\): left \(S\)-module, defined by \((sf)(m):=sf(m)\).

In summary, the direction of the \(S\)-module structure on \(N\) determines the direction of the \(S\)-module structure on \({\rm Hom}_R(M,N)\). If the bimodule structure appears on \(M\), the action will be reversed by composing functions from the inside out, causing a change in the direction of the module.

Exercise 4.9 (\(\bigstar\)). Confirm that you understand all the cases in the table.

Endomorphism Ring

When \(M=N\), \({\rm Hom}_R(M,M)\) is denoted as \({\rm End}_R(M)\), and it has a natural ring structure, where multiplication is the composition of homomorphisms from \(M\) to \(M\).

Exercise 4.10 (\(\bigstar\)). Prove that if \(M\) is a left \(R\)-module, then \(M\) carries a natural \((R,{\rm End}_R(M))\) multi-module structure, where \({\rm End}_R(M)\) also acts on the left via \(fm:=f(m)\) and the two actions are compatible.

Functoriality of \({\rm Hom}\)

What does the \({\rm Hom}\) functor mean? The so-called functor is an abstract concept in modern category theory, which describes how objects in different categories correspond to each other. It requires:

  1. To correspond the object \(A\) to the object \(F(A)\).

  2. To correspond the map \(f:A\to B\) to another map \(F(f):F(A)\to F(B)\) or \(F(f):F(B)\to F(A)\). If it satisfies the former, it is called a covariant functor, and if it satisfies the latter, it is called a contravariant functor.

  3. The functor must preserve composition: for a covariant functor, \(F(f\circ g)=F(f)\circ F(g)\); for a contravariant functor, \(F(f\circ g)=F(g)\circ F(f)\).

  4. We also require the functor to correspond the identity map to the identity map, i.e. \(F(1_A)=1_{F(A)}\).

As an example, we will explain in detail how \({\rm Hom}\) forms a functor.

  • For a fixed \(R\)-module \(N\), we find that \(M\mapsto {\rm Hom}_{R}(M,N)\) is a contravariant functor, sometimes denoted as \(h_N\). This functor maps the \(R\)-module \(M\) to the Abelian group \({\rm Hom}_R(M,N)\), and maps the \(R\)-module homomorphism \(f:M'\to M\) to the group homomorphism \[{\rm Hom}(f,N):{\rm Hom}(M,N)\to {\rm Hom}(M',N),\quad h\mapsto h\circ f.\] It is clear that it preserves composition and identity maps.

  • Similarly, for a fixed module \(M\), we find that \(N\mapsto {\rm Hom}_R(M,N)\) is a covariant functor, which maps the module homomorphism \(g:N'\to N\) to the group homomorphism \[{\rm Hom}(M,g):{\rm Hom}(M,N')\to {\rm Hom}(M,N),\quad h\mapsto g\circ h.\] It also preserves composition and identity maps.

Direct Sum

Given two left \(R\)-modules \(M,N\), their direct sum \(M\oplus N\) is defined as the set \(M\times N=\{(m,n)|m\in M, n\in N\}\) as an Abelian group, with addition defined as component-wise addition, and \(R\)-action defined as \[r(m,n):=(rm,rn)\] It can be easily verified that this definition gives an \(R\)-module structure. Direct sums are not limited to finite sums, for any family of \(R\)-modules indexed by a set \(I\), \(\{M_\lambda\}\), we can define the direct sum as \[\bigoplus_{\lambda\in I} M_\lambda:=\left\{(m_\lambda)\in \prod_{\lambda\in I} M_\lambda | \text{at most finitely many $m_\lambda$ are non-zero}\right\}.\] If we take a sequence of identical modules \(M_{\lambda}=M\), we denote this direct sum as \(M^{(I)}\).

Exercise 4.11 (\(\bigstar\)). Verify that if \(M,N\) are two submodules of \(L\), satisfying \[M\cap N=0\] then there exists an isomorphism \(M+N\cong M\oplus N\). In this case, we call \(M+N\) the internal direct sum of \(M,N\).

Exercise 4.12. Prove that we have the following obvious exact sequence:

  1. \[0\to M\to M\oplus N\to N\to 0\]

  2. Here, let \(M'\subset M\) be a submodule. Then \[0\to M'\to M\to M/M'\to 0\] is a natural exact sequence.

Direct Product

Similar to the direct sum, the direct product of a family of modules \(M_\lambda\) is obtained by giving the set \(\prod_\lambda M_\lambda\) its natural module structure. The difference is that in the direct product we do not require that only finitely many components be non-zero: any family \((m_\lambda)\in \prod M_\lambda\) is allowed as an element. Similarly, if we take a family of copies of the same module \(M_\lambda=M\), we denote this direct product as \(M^I\).

Example 4.2. Consider \(F=\oplus_{i=0}^\infty \mathbb{Z}\), and \(G=\prod_{i=0}^\infty \mathbb{Z}\) are two \(\mathbb{Z}\)-modules, then \((1,2,3,4,\dots)\in G\) is an element of \(G\), but not in \(F\). There is a natural inclusion \(F\subset G\), and \(F\) is a submodule of \(G\).

Exercise 4.13. The finite direct sum of modules is isomorphic to the finite direct product.

Exercise 4.14. Prove the following universal properties of direct sums and direct products, and the natural isomorphisms: \[{\rm Hom}_R\left(\bigoplus_{\lambda\in I} M_\lambda, N\right)=\prod_{\lambda\in I}{\rm Hom}_R(M_\lambda, N).\] \[{\rm Hom}_R\left(M, \prod_{\lambda\in I} N_\lambda\right)=\prod_{\lambda\in I}{\rm Hom}_R(M, N_\lambda).\] Explain their equivalence with the following statements:

  • For any family of mappings \(h_\lambda:M_\lambda\to N\), there exists a unique mapping \((\oplus h_\lambda):\bigoplus_\lambda M_\lambda\to N\) such that \(h_\lambda = (\oplus h_\lambda)\circ \iota_\lambda\). Here, \(\iota_\lambda:M_\lambda\to \oplus M_\lambda\) is the natural inclusion mapping.

  • For any family of mappings \(g_\lambda:P\to N_\lambda\), there exists a unique mapping \((\prod g_\lambda):P\to \prod N_\lambda\) such that \(g_\lambda={\rm pr}_\lambda\circ (\prod g_\lambda)\). Here, \({\rm pr}_\lambda:\prod N_\lambda\to N_\lambda\) is the projection onto the \(\lambda\)-th component.

Free Module

A module obtained as a direct sum of an arbitrary (possibly infinite) number of copies of \(R\), or any module isomorphic to such a direct sum, is called a free \(R\)-module. For example, \(R^n = R \oplus R \oplus \dots \oplus R\) (\(n\) copies of \(R\)).

The importance of free modules lies in their good properties. Any homomorphism \(f: R^n \to M\) starting from \(R^n\) can be easily determined by specifying where each \(e_i = (0, \dots, 0, 1, 0, \dots, 0) \in R^n\) (the \(i\)-th position is \(1\)) is mapped to. The entire homomorphism is completely determined by this. Because every element \((r_1, \dots, r_n) \in R^n\) can be written as \[(r_1, \dots, r_n) = \sum r_i e_i,\] we have \[f((r_1, \dots, r_n)) = \sum f(r_i e_i) = \sum r_i f(e_i).\] It is worth noting that there is no restriction on how to choose the value of \(f(e_i)\). In other words, the determination of \(f\) depends entirely on the arbitrary choice of \(n\) values \(f(e_i)\), or, it depends on an arbitrary element \((f(e_1), f(e_2), \dots, f(e_n)) \in M^n\). This gives the following natural isomorphism \[{\rm Hom}_R(R^n, M) = M^n.\]

Exercise 4.15. Prove the basic property of free modules \[{\rm Hom}_R(R^{(I)}, M) = M^I\] Note that \(I\) may be infinite, and note the difference between direct sums and direct products.

Generators and Basis

For a (left) \(R\)-module \(M\), after selecting a family of elements \((m_i)_{i\in I}\in M^I\), we obtain, via \({\rm Hom}_R(R^{(I)},M)=M^I\), a map \(R^{(I)}\to M\) sending the \(i\)-th basis element to \(m_i\); we temporarily denote this map by \(\rho\). Elements of the image of \(\rho\) are called \(R\)-linear combinations of the \(m_i\). We say that \(\{m_i\}_{i\in I}\subset M\) is a set of generators of \(M\) if \(\rho:R^{(I)}\to M\) is surjective. In other words, any \(m\in M\) can be written as a finite linear combination of the \(m_i\): \[m=r_{1}m_{i_1}+\dots+r_n m_{i_n}.\] If there exists a finite family of \(m_i\) generating \(M\), then \(M\) is called finitely generated. We say that \(\{m_i\}_{i\in I}\) is \(R\)-linearly independent, or simply independent, if the corresponding map \(\rho:R^{(I)}\to M\) is injective. In other words, whenever a finite subfamily \(m_{i_1},\dots, m_{i_n}\) satisfies an \(R\)-linear relation \[r_1 m_{i_1}+\dots +r_n m_{i_n}=0,\] all coefficients \(r_1=r_2=\dots=r_n=0\). If a family \(\{m_i\}\) is both a set of generators and independent, then it is called a basis of \(M\).

Exercise 4.16. Show that \(M\) having a basis is equivalent to \(M\) being isomorphic to a free module.

Exercise 4.17. Give an example of a non-free module. Such a module has no basis: it certainly has generating sets, but none of them can be linearly independent.

Exercise 4.18 (\(\bigstar\)). Consider \(R=k\) as a field. Since \(k\) is a commutative ring, \(k\)-modules are not distinguished between left and right. We call \(k\)-modules \(k\)-vector spaces or \(k\)-linear spaces. \(k\)-module homomorphisms are called \(k\)-linear maps or \(k\)-linear transformations.

Basis of Vector Space

The best feature of linear spaces is that they always have a basis, so every linear space is isomorphic to a free \(k\)-module. This makes linear spaces very easy to study, as we now explain. In what follows, we assume that \(V\) is a \(k\)-vector space.

Theorem 4.3 (Basis Extension Theorem). Let \(F\) be an independent set of \(V\) and \(G\) be a generating set of \(V\), satisfying \(F \subset G\). Then there exists a basis \(B\) of \(V\) such that \(F \subset B \subset G\).

Proof. If there is a chain of increasing independent sets \(B_i\) with \(F \subset B_i \subset G\), then \(\cup_i B_i\) is also an independent set and satisfies \(F \subset \cup_i B_i \subset G\). Therefore, by Zorn’s lemma, there exists a maximal independent set \(B\) with \(F \subset B \subset G\). We will prove that \(B\) generates \(V\), that is, the corresponding mapping \(\rho_B: k^{(B)} \to V\) is an isomorphism. Otherwise, there exists \(v \in V\) that cannot be represented as a linear combination of elements of \(B\). Since \(G\) is a generating set, there exists a finite expression \(v = \sum_{g \in G'} a_g g\), where \(G'\subset G\) is finite. There must be some \(g \in G'\) such that \(B\cup\{g\}\) is still independent: otherwise every \(g \in G'\) would lie in the span of \(B\) (using that \(B\) itself is independent), and then \(v\) would also lie in the span of \(B\), which is impossible. But then \(\{g\} \cup B\) is a strictly larger independent set with \(F \subset \{g\} \cup B \subset G\), which contradicts the maximality of \(B\). Therefore, \(B\) is a basis. ◻

Example 4.3. Consider \(k=\mathbb{R}\), and let \(M=\mathbb{R}^2=\{(x,y)\}\) be the set of all vectors in the plane. Under component-wise addition, \(M\) forms an Abelian group. Then \(M\) naturally becomes a \(k\)-vector space under \(k\)-multiplication: \[a(x,y):=(ax,ay)\in M.\] It is easy to verify that \(M\) satisfies the four axioms of a \(k\)-vector space (\(k\)-module) under this action. \(\{(0,1),(1,0)\}\) is a basis for \(M\).

Exercise 4.19 (\(\bigstar\)). Prove that the following statements are equivalent for a vector space \(V\):

  1. \(B\) is a basis for \(V\).

  2. \(B\) is a maximal linearly independent set in \(V\).

  3. \(B\) is a minimal generating set for \(V\).

Exercise 4.20. Let \(k=\mathbb{Q}\), find out which of the following are bases of \(k^2\) and which are not?

  1. \((1,2),(3,4)\)

  2. \((-1,1),(1,-1)\)

  3. \((1,1),(1,2),(1,0)\)

Dimension Theory

If \(V\) has two different bases \(B_1, B_2\), we hope to prove that \(B_1\) and \(B_2\) are equipotent, that is, there exists a bijection between them.

Finite Case

If \(B_1, B_2\) are both finite, say \(B_1\) is the smaller one and \(m = |B_1|\). We use induction on \(m\) to prove that \(|B_2| \le m\). If \(m = 1\), \(B_1 = \{b\}\), then every element of \(B_2\) is a non-zero \(k\)-multiple of \(b\), and two such elements are never linearly independent, so \(|B_2| = 1\).

Now assume the proposition holds whenever the smaller basis has at most \(m\) elements, and consider the case \(|B_1|=m+1\). Let \(a\in B_2\). By the basis extension theorem, there exists a basis \(B=\{a\}\cup B'\) with \(\{a\}\subset B\subset \{a\}\cup B_1\), where \(B'\subset B_1\); note that \(B'\neq B_1\), since \(a\) is a linear combination of \(B_1\), so \(|B'|\le m\). Then \(V\simeq ka\oplus \left(\bigoplus_{b'\in B'} kb'\right)\), and \(V'=V/ka\) has (the projection image of) \(B'\) as a basis, with at most \(m\) elements. Similarly, \(V'\) also has the image of \(B_2-\{a\}\) as a basis, so by the induction hypothesis \(|B_2|-1\le m\).

Infinite Case

Now assume that \(V\) has an infinite basis \(B\), so \(V\cong k^{(B)}\). If \(C\) is another basis, each \(c\in C\) has a finite expression \(c=\sum_{b\in B} c_b b\), where only finitely many \(c_b\) are non-zero. For each \(c\in C\), consider the finite set \(B_c:=\{b\in B|c_b\neq 0\}\). Since \(C\) generates the whole space, every \(b\in B\) must appear in some \(B_c\) (otherwise every \(c\) would lie in the span of \(B-\{b\}\), and then so would \(b\), contradicting the independence of \(B\)), so \[\bigcup_{c\in C} B_c=B.\] If \(C\) were finite, \(V\) would be finitely generated and \(B\) finite; so \(C\) is infinite, and since each \(B_c\) is finite, \(|B|\le \aleph_0|C|=|C|\). By the same reasoning with the roles exchanged, \(|C|\le |B|\), hence \(|B|=|C|\).

Definition 4.3. We define the dimension \(\dim_k V\) of \(V\) to be the cardinality \(|B|\) of any basis \(B\) of \(V\).

Exercise 4.21 (\(\bigstar\)). Given a short exact sequence of \(k\)-vector spaces \[0\to E\to F\to G\to 0,\] use the basis extension theorem to prove that \[\dim E + \dim G=\dim F.\] From this, deduce the following dimension formulas:

  1. Let \(f:V\to W\), then \[\dim V=\dim \ker f +\dim {\rm Im}f.\]

  2. Let \(F\subset E\) be a subspace, then \[\dim E = \dim(E/F)+\dim F.\]

  3. \(\dim(M\oplus N)=\dim M+\dim N\).

  4. Let \(M,N\subset V\) be two subspaces, then \[\dim M+\dim N=\dim(M\oplus N)=\dim (M+N)+\dim M\cap N.\]

Matrices over Commutative Rings

Finitely Generated Free Modules

In this section, we assume that \(R\) is a commutative ring. It is worth noting that in the free module \(R^m\), there exists a standard basis, where each element can be uniquely written in the form \(a_1e_1+a_2e_2+\dots+a_me_m\).

If an \(R\)-module \(M\) is isomorphic to \(R^n\), we say that \(M\) has rank \(n\). However, we have not yet proven the uniqueness of rank, that is, we need to prove \(R^m\cong R^n\Rightarrow m=n\). Unfortunately, this is not always true in general rings. But at least in the case of PIDs, this is true. To prove this, we need to develop some knowledge of linear algebra over rings.

Matrices over Rings

We consider the \(R\)-module homomorphisms from \(R^n\) to \(R^m\). Since \[{\rm Hom}_R(R^n,R^m)=({\rm Hom}_R(R,R^m))^n=(R^m)^n=R^{mn}\] we know that to determine a homomorphism \(\varphi:R^n\to R^m\), it is essentially enough to know where each \(e_j\in R^n\) is mapped to in \(R^m\). We use \(e'_i\in R^m\) to denote the standard basis of \(R^m\). If we denote the \(i\)th component of the image of \(e_j\) as \(a_{ij}\), then we can write \[\varphi(e_j)=\sum_{i} a_{ij}e'_i\] Here \(a_{ij}\in R\) are the \(mn\) elements in \(R\) that we want. We can arrange these elements into a matrix \[A=\left( \begin{array}{ccccc} a_{11}& a_{12} & a_{13} & \dots & a_{1n} \\ a_{21}& a_{22} & a_{23} & \dots & a_{2n} \\ \dots & \dots & \dots & \dots & \dots \\ a_{m1}& a_{m2} & a_{m3} & \dots & a_{mn} \\ \end{array} \right)\] Matrices are usually represented by capital letters \(A,B,\dots\), etc. Here our matrix is the matrix of \(\varphi\), which we also denote as \(A_\varphi\). When we need to emphasize that the matrix elements are \(a_{ij}\), we write \(A=(a_{ij})_{1\le i\le m,1\le j\le n}\). This matrix has \(m\) rows and \(n\) columns, and we say it is an \(m\times n\) matrix over \(R\), denoted as \(A\in M_{m\times n}(R)\). When \(m=n\), this notation is simply written as \(M_n(R)\).

Exercise 4.22. Make sure you understand: giving an \(m\times n\) matrix is equivalent to giving a homomorphism \(R^n\to R^m\). That is, any matrix consisting of elements of \(R\) represents a homomorphism. So any matrix \(A\) can be (uniquely) written in the form of \(A_f\), where \(f\) is some homomorphism.

Composition of Mappings and Multiplication of Matrices

For \(f,g\in {\rm Hom}_R(R^n,R^m)\), and \(r\in R\), we can define the homomorphisms \(rf\) and \(f+g\). Similarly, we can define scalar multiplication and addition of matrices, i.e. we define \(r\) multiplied by \(A_f\) as the matrix \(B_{rf}\) corresponding to the mapping \(rf\): \[rA_f := B_{rf}\] Let \(A_f\) and \(B_g\) be the matrices corresponding to the homomorphisms \(f\) and \(g\), respectively. Then we define \(A+B\) as the matrix \(C_{f+g}\) corresponding to the mapping \(f+g\): \[A_f+B_g:=C_{f+g}.\]

Exercise 4.23. Prove that the elements of \(rA\) are \((ra_{ij})\), and the elements of \(A+B\) are \((a_{ij}+b_{ij})\). Thus, prove that \(M_{m\times n}(R)\) forms an \(R\)-module.

When we have a composition of homomorphisms \[R^l\xrightarrow{f} R^m\xrightarrow{g} R^n,\] we can calculate \(C_{gf}\) using \(A_f\) and \(B_g\), and we call this process matrix multiplication, denoted as \[B_g\cdot A_f=C_{gf}.\] How do we actually perform this operation? We can write out the matrices \(A_f=(a_{ij}),B_g=(b_{ij})\) as \[f(e_j)=\sum_i a_{ij}e'_{i}\] \[g(e'_i)=\sum_k b_{ki}e''_k\] Therefore, \[(g\circ f)(e_j)=g(f(e_j))=g\left(\sum_{i}a_{ij} e'_i\right)=\sum_k\left(\sum_{i} b_{ki} a_{ij}\right)e''_k.\] This shows that the coefficients of \(C_{gf}=(c_{ij})=B\cdot A\) should be \[c_{ij}=\sum_{k} b_{ik}a_{kj}.\] This is the formula for matrix multiplication. Note that \(B\) is \(n\times m\), \(A\) is \(m\times l\), and \(BA\) is \(n\times l\), which means that matrix multiplication gives a mapping \[M_{n\times m}(R)\times M_{m\times l}(R) \to M_{n\times l}(R).\] When \(m=n=l\), this multiplication operation gives multiplication on \(M_n(R)\), making \(M_n(R)\) a (usually non-commutative) ring.
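The formula \(c_{ij}=\sum_k b_{ik}a_{kj}\) can be transcribed directly into code. The following Python sketch (the helper names are ours) checks on an example that the matrix of a composition \(g\circ f\) acts the same way as the product \(B\cdot A\):

```python
# Matrix multiplication as composition: c_ij = sum_k b_ik * a_kj.
# Matrices are lists of rows over a commutative ring (here Z).

def matmul(B, A):
    n, m, l = len(B), len(A), len(A[0])
    assert len(B[0]) == m                 # B is n x m, A is m x l
    return [[sum(B[i][k] * A[k][j] for k in range(m)) for j in range(l)]
            for i in range(n)]

def apply(A, v):
    # the homomorphism given by A, acting on a column vector v
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

A = [[1, 2], [3, 4], [5, 6]]              # a map R^2 -> R^3
B = [[1, 0, 1], [0, 1, 1]]                # a map R^3 -> R^2
v = [7, 8]
# composing the maps agrees with multiplying the matrices:
assert apply(B, apply(A, v)) == apply(matmul(B, A), v)
```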

Exercise 4.24. A necessary condition for \(M_n(R)\) to be a ring is that its multiplication is associative. Deduce from the associativity of composition of mappings that matrix multiplication is associative.

Exercise 4.25. Let \(1:R^n\to R^n\) be the identity mapping. Write its matrix as \(I_n\). This matrix is called the \(n\)-dimensional identity matrix. It is the multiplicative identity in \(M_n(R)\).

Exercise 4.26. Let \(A\in M_{m\times n}(R)\).

  1. Show that if we view elements in \(R^n\) as \(n\times 1\) matrices (called column vectors), then the mapping \(R^n\to R^m\) corresponding to \(A\) is the matrix multiplication \[R^n\to R^m:v\mapsto A\cdot v\] where \(v\in M_{n\times 1}(R)=R^n\).

  2. Show that the image of \(A\) as a linear mapping, \({\rm Im}(A)\), is a submodule of \(R^m\) generated by all the columns of \(A\).

Reduction Theory of Matrices over PID

In this section, we assume that \(R\) is a PID and consider an \(m\times n\) matrix \(M\in M_{m\times n}(R)\) over \(R\). Note that, by matrix multiplication, \(M_{m\times n}(R)\) has a \((M_{m}(R),R;M_{n}(R),R)\) multi-module structure, which means that it can be acted upon by \(m\)-order square matrices on the left and \(n\)-order square matrices on the right.

The group formed by the invertible elements (units) of the ring \(M_n(R)\) is called \(GL_n(R)\). We say that \(A, B \in M_{m \times n}(R)\) are equivalent if there exist invertible matrices \(P \in GL_m(R)\), \(Q \in GL_n(R)\) such that \(A = PBQ\). Note that this is indeed an equivalence relation.

Elementary Transformations of Matrices

For the matrix multiplication \(C = BA\), we can interpret \(C\) as a recombination of the columns of \(B\), or of the rows of \(A\), with coefficients determined by the other matrix. From the formula for matrix multiplication \[c_{ij} = \sum_k b_{ik}a_{kj}\] we can see that the \(j\)th column of \(C\) is obtained by multiplying the first column of \(B\) by \(a_{1j}\), adding the second column of \(B\) multiplied by \(a_{2j}\), and so on through all the columns of \(B\). Similarly, the \(i\)th row of \(C\) is obtained by multiplying the \(k\)th row of \(A\) by \(b_{ik}\) for each \(k\) and adding them together. Thus, left multiplication by a matrix corresponds to row operations, while right multiplication corresponds to column operations.

The so-called elementary transformation refers to a class of simple and basic reversible operations on matrices. These transformations can be applied to a matrix by left and right multiplying with some simple invertible matrices. There are several types of elementary transformations:

  1. Multiplying one row (or column) of the matrix by a non-zero element in \(R\), and then adding it to another row (or column). For example, multiplying the first row of \(A\) by \(r\) and adding it to the second row: \[\left(\begin{array}{cc} 1 & 0 \\ r & 1 \\ \end{array}\right) \left( \begin{array}{cc} a_{11} & a_{12}\\ a_{21} & a_{22}\\ \end{array} \right) = \left( \begin{array}{cc} a_{11} & a_{12}\\ ra_{11}+a_{21}& ra_{12}+a_{22}\\ \end{array} \right).\]

  2. Rearranging the rows or columns of the matrix. Let \(A=(a_{ij})\in M_{m\times n}(R)\), if we want to rearrange the rows of the matrix using the permutation \(\sigma\in S_m\), we left multiply with an \(m\)-dimensional matrix \(P=(\delta_{\sigma(i)j})\). Here, \(\delta_{ij}=1_{i=j}\) is a function that equals \(1\) when \(i=j\) and \(0\) otherwise. We have \[(PA)_{ij}=\sum_k \delta_{\sigma(i)k}a_{kj} = a_{\sigma(i)j}.\] Such a matrix is called a permutation matrix. Similarly, to rearrange the columns of the matrix, we right multiply with an \(n\)-dimensional permutation matrix.

  3. Multiplying a row or column of the matrix by a unit \(u\in R^\times\). This can be achieved by left or right multiplying the matrix with a diagonal matrix \[\left( \begin{array}{ccccc} 1 & & & & \\ & u & & & \\ & & 1 & & \\ & & & \dots & \\ & & & & 1\\ \end{array} \right).\] The row (or column) containing \(u\) is the one that needs to be multiplied.

  4. Adding \(a\) times the \(i\)th row (or column) of the matrix to \(b\) times the \(j\)th row (or column) to create a new \(i\)th row (or column), and adding \(c\) times the \(i\)th row (or column) to \(d\) times the \(j\)th row (or column) to create a new \(j\)th row (or column), where \(ad-bc=1\). This is also a reversible transformation, which is equivalent to left multiplying with the following matrix (for the column case: right multiply and swap \(b\) and \(c\)): \[\left( \begin{array}{ccccc} \dots & & & & \\ & a & & b & \\ & & \dots & & \\ & c & & d & \\ & & & & \dots\\ \end{array} \right)\] where the omitted parts are all diagonal \(1\)s. Its inverse is \[\left( \begin{array}{ccccc} \dots & & & & \\ & d & & -b & \\ & & \dots & & \\ & -c & & a & \\ & & & & \dots\\ \end{array} \right).\]

It is worth noting that these elementary transformations are all invertible, and their inverses are also elementary transformations. Moreover, the combined effect of a series of elementary transformations can be realized as left multiplication by one matrix and right multiplication by another. This is because performing a series of row and column elementary transformations amounts to multiplying by a series of \(P_i\) and \(Q_i\) in a certain order; by associativity, the result can be written as \[(P_k\dots P_1)A (Q_1\dots Q_k).\] Of course, this is just a restatement of the fact that \(M_{m\times n}(R)\) is an \((M_m(R),M_n(R))\)-bimodule.
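To see this concretely, here is a quick illustration in Python over \(R=\mathbb{Z}\) (a sketch; the helper names are mine): left multiplication by the type-1 elementary matrix performs the corresponding row operation.

```python
# Left-multiplying by an elementary matrix performs the row operation (R = Z).

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def add_multiple(n, i, j, r):
    """E = I + r*e_{ij}; left multiplication adds r times row j to row i."""
    E = identity(n)
    E[i][j] = r
    return E

A = [[1, 2],
     [3, 4]]
E = add_multiple(2, 1, 0, 5)   # add 5 * (row 0) to row 1
print(mat_mul(E, A))           # [[1, 2], [8, 14]]
```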

Equivalent Canonical Form

Theorem 4.4 (Equivalent Canonical Form in PID). Let \(A\in M_{m\times n}(R)\), where \(R\) is a PID. Then there exists a unique (up to units in \(R\)) set of \(d_i\in R\) such that \(d_1|d_2|\dots|d_r\) and \(A\) is equivalent to the following diagonal form: \[\left( \begin{array}{ccccc} d_1 & & & & \\ & d_2 & & & \\ & & \dots & & \\ & & & d_r & \\ & & & & O \end{array} \right).\] Here, \(O\) represents a zero matrix (which may not exist), and the empty positions are all \(0\). We call the obtained \(r\) the rank of \(A\), denoted as \({\rm rank}A\).

Proof.

  • Consider the top left element \(a=a_{11}\) of the matrix. By performing row and column swaps, we can move the element with the least number of prime factors to this position. We define \(0\) to have infinitely many prime factors. If the entire matrix is \(0\), we can consider it as already completed. Otherwise, we can assume that \(a_{11}\) has a finite number of prime factors.

  • If \(a_{11}\) divides all elements in the first column, we can use elementary row operations to eliminate all remaining elements in the first column. Otherwise, we move the \(a_{i1}\) that is not divisible by \(a_{11}\) to the position of \(a_{21}\). Let \(I=(a_{11},a_{21})=(d)\). We have \(\alpha a_{11}+\beta a_{21}=d\). Here, it must be that \((\alpha,\beta)=(1)\), otherwise if both \(\alpha\) and \(\beta\) are divisible by some prime factor \(p\), we have \((\alpha/p) a_{11}+(\beta/p) a_{21}=d/p\in I=(d)\), which is impossible. Therefore, we can assume that there exist \(\delta,\gamma\in R\) such that \(\alpha\delta-\beta\gamma=1\). Then, by multiplying the original first row by \(\alpha\) and adding it to the second row multiplied by \(\beta\) as the new first row, and multiplying the original first row by \(\gamma\) and adding it to the second row multiplied by \(\delta\) as the new second row, we obtain a new top left element \(d\), which is a proper factor of the original \(a_{11}\) and has fewer prime factors.

  • We continue this operation until all elements in the first column are divisible by \(a_{11}\). At this point, the other elements in the first column can be eliminated by \(a_{11}\) through elementary operations, and we perform a similar operation on the first row until all elements except \(a_{11}\) are \(0\).

  • If \(a_{11}\) divides all elements in the matrix, we let \(d_1=a_{11}\) and continue the above induction operation on the matrix starting from the second row and second column. Otherwise, we repeat the first step and move the element with the least number of prime factors in the matrix to \(a_{11}\). This operation will eventually stop, because the number of prime factors of \(a_{11}\) will continue to decrease until it either divides all elements in the matrix or all its prime factors are removed and it becomes a unit, in which case it still divides all elements in the matrix.

  • By repeating this operation, the matrix can eventually be transformed into the desired form.

 ◻
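To make the procedure in the proof concrete, here is a minimal sketch in Python for the special case \(R=\mathbb{Z}\) (helper names are mine; "fewest prime factors" is replaced by "smallest absolute value", which serves the same purpose). It returns the diagonal \(d_1\,|\,d_2\,|\,\dots\):

```python
# A minimal sketch of the reduction in the proof, specialized to R = Z.
from math import gcd

def smith_diagonal(M):
    A = [row[:] for row in M]
    m, n = len(A), len(A[0])
    d = []
    t = 0  # current corner position (t, t)
    while t < m and t < n:
        nonzero = [(abs(A[i][j]), i, j)
                   for i in range(t, m) for j in range(t, n) if A[i][j]]
        if not nonzero:
            break  # the remaining block is zero
        _, pi, pj = min(nonzero)            # entry of smallest absolute value
        A[t], A[pi] = A[pi], A[t]           # move it to the corner by swaps
        for row in A:
            row[t], row[pj] = row[pj], row[t]
        dirty = True
        while dirty:                        # clear the first row and column
            dirty = False
            for i in range(t + 1, m):
                q = A[i][t] // A[t][t]
                for j in range(t, n):
                    A[i][j] -= q * A[t][j]
                if A[i][t]:                 # nonzero remainder: smaller corner
                    A[t], A[i] = A[i], A[t]
                    dirty = True
            for j in range(t + 1, n):
                q = A[t][j] // A[t][t]
                for i in range(t, m):
                    A[i][j] -= q * A[i][t]
                if A[t][j]:
                    for row in A:
                        row[t], row[j] = row[j], row[t]
                    dirty = True
        d.append(abs(A[t][t]))
        t += 1
    for i in range(len(d)):                 # enforce d_1 | d_2 | ... via gcd/lcm
        for j in range(i + 1, len(d)):
            g = gcd(d[i], d[j])
            d[i], d[j] = g, d[i] * d[j] // g
    return d

print(smith_diagonal([[2, 4, 4], [-6, 6, 12], [10, 4, 16]]))  # [2, 2, 156]
```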

Exercise 4.27 (\(\bigstar\)). In this problem, \(R\) is a PID, and \(A\in M_n(R)\) is viewed as an \(R\)-linear transformation from \(R^n\) to \(R^n\). Follow the hints below to discover the Gaussian elimination method.

  1. Prove that \(A\in M_n(R)\) is invertible if and only if it is equivalent to the identity matrix \(I_n\), and hence conclude that \(A\) is invertible if and only if \({\rm rank}A=n\) and \(d_1=d_2=\dots=d_n=1\).

  2. Show that any invertible matrix can be written as a product of elementary matrices (i.e. the matrices used in elementary transformations). In other words, elementary matrices generate \(GL_n(R)\).

  3. Prove that if \(A\) is invertible, then it can be transformed into the equivalent standard form \(I_n\) by using only row transformations.

  4. Consider the following algorithm for computing \(A^{-1}B\): write the two matrices side by side as \((A|B)\), where \(B\) is an \(n\times m\) matrix, and use only row transformations, performing the same operations on both \(A\) and \(B\). Prove that when \(A\) has been transformed into the identity matrix, \(B\) has been transformed into \(A^{-1}B\). (A variant using only column transformations, when \(B\) is also an \(n\times n\) square matrix, computes \(BA^{-1}\) instead.) A concrete implementation sketch is given after this exercise.

  5. Consider the cases where \(B=I_n\) and \(B=\beta\) is a column vector, and obtain an algorithm for finding the inverse matrix and solving the equation \[Ax=\beta\] when \(A\) is invertible. Here \(x\in R^n =M_{n\times 1}(R)\) is an unknown column vector.

  6. Now suppose \(A\) is not necessarily invertible. Prove that the kernel of \(A\), denoted by \(\ker A\), is generated by \(Qe_{r+1},\dots,Qe_n\), where \(Q\) is the invertible matrix of column transformations taking \(A\) to its equivalent standard form (i.e. \(PAQ=D\)), and \(r={\rm rank}A\) is the number of nonzero invariant factors \(d_1|\dots|d_r\).

  7. Obtain the following algorithm for computing the submodule \(\ker A\subset R^n\): write the two matrices \((A|I)\) side by side, where \(I\) is the \(n\times n\) identity matrix. Apply elementary transformations to \(A\), mirroring each column operation on \(I\) as well, but applying row operations to \(A\) only. When \(A\) has been transformed into its standard form, the last \(n-r\) columns of the right-hand matrix generate \(\ker A\).

  8. Let \(\varphi:V\to W\) be a linear map of vector spaces over a field \(k\). Under certain bases of \(V\) and \(W\), the corresponding matrix is \(A\in M_{m\times n}(k)\). Prove that \({\rm rank}A=\dim_k {\rm Im}\varphi\).
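For concreteness, here is a minimal sketch in Python of the \((A|B)\) algorithm from item 4, over \(k=\mathbb{Q}\) with exact rational arithmetic (helper name mine); taking \(B=I_n\) yields \(A^{-1}\):

```python
# A sketch of the (A | B) row-reduction algorithm from item 4, over k = Q.
from fractions import Fraction

def solve_right(A, B):
    """Row-reduce (A | B); when A becomes I, B has become A^{-1} B."""
    n = len(A)
    M = [[Fraction(x) for x in A[i]] + [Fraction(x) for x in B[i]]
         for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]              # bring a nonzero pivot up
        M[col] = [x / M[col][col] for x in M[col]]   # scale pivot row to 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

A = [[1, 2],
     [3, 4]]
I2 = [[1, 0],
      [0, 1]]
print(solve_right(A, I2))   # A^{-1} = [[-2, 1], [3/2, -1/2]]
```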

Structure of Finitely Generated Modules over a PID

Using the results established in this chapter, we will prove two important conclusions. The first is that the rank of a (finitely generated) free module over a PID is well-defined, that is, \(R^m\cong R^n\Rightarrow m=n\). The second is that a submodule of a finitely generated free module over a PID is again a finitely generated free module.

Rank of Finitely Generated Free Modules

In fact, this holds for all integral domains.

Theorem 4.5. If \(R\) is an integral domain, then \(R^n\cong R^m\Rightarrow n=m\).

Proof. We use the field of fractions \(K={\rm Frac}(R)\) of the integral domain \(R\) and consider "vectorizing" this problem. This may allow us to use the fact that vector spaces have a well-defined dimension to prove the proposition.

In fact, an isomorphism \(f:R^n\to R^m\) is given by an \(m\times n\) matrix \(A\) over \(R\). Since \(R\) is an integral domain, the inclusion \(i:R\to K\) is injective, and we can view \(A\) as a matrix over \(K\), denoted \(A_K\). This matrix defines a mapping \(f_K:K^n\to K^m\), and since \(f\) has an inverse, \(f_K\) has one too, so \(f_K\) is also an isomorphism. By the well-definedness of dimension for vector spaces, we have \(m=n\).

Thus, we can define the rank of a finitely generated free module as \({\rm rank}(R^n)=n\). ◻

Submodules of finitely generated free modules

Noetherian Modules

A Noetherian module is a module in which all submodules are finitely generated (and hence the module itself is also finitely generated).

Example 4.4. Let \(R\) be a Noetherian ring. Viewed as a module over itself, the submodules of \(R\) are exactly the ideals of \(R\). By the Noetherian property, these ideals are finitely generated, so \(R\) is a Noetherian module.

Similar to Noetherian rings, we have the following equivalent statements:

Theorem 4.6. The following statements about \(R\)-module \(M\) are equivalent:

  1. \(M\) is a Noetherian module.

  2. Any ascending chain of submodules of \(M\) must terminate.

  3. Every non-empty set of submodules of \(M\) has a maximal element.

Exercise 4.28 (\(\bigstar\)). By adapting the corresponding proofs for Noetherian rings, prove the equivalences stated above.

In order to show that the modules we commonly encounter are Noetherian, we now construct further Noetherian modules starting from \(R\).

Theorem 4.7.

  1. If \(M\) is a Noetherian module, then all homomorphic images of \(M\) are also Noetherian modules.

  2. Let \(N \subset M\) be a submodule. Then \(M\) is a Noetherian module if and only if both \(N\) and \(M/N\) are Noetherian modules.

  3. Finite sums and finite direct sums of Noetherian modules are also Noetherian modules.

Proof.

  1. The image of any homomorphism \(\varphi:M\to N\) is isomorphic to \(M/\ker\varphi\). By the Correspondence Theorem, every ascending chain of submodules of \(M/\ker\varphi\) corresponds to an ascending chain of submodules of \(M\), and thus must terminate.

  2. \((\Rightarrow)\): Clearly holds: any ascending chain of submodules of \(N\) is an ascending chain in \(M\), and \(M/N\) is a homomorphic image of \(M\).
    \((\Leftarrow)\): Consider an ascending chain of submodules of \(M\), denoted by \(M_i\). Then \(M_i\cap N\) and \((M_i+N)/N\) are ascending chains of submodules of \(N\) and \(M/N\), respectively. Since \(N\) and \(M/N\) are both Noetherian, these two chains must terminate. Therefore, it suffices to prove that if \(P\subset Q\), \(P\cap N=Q\cap N\), and \((P+N)/N=(Q+N)/N\), then \(P=Q\). Suppose \(q\in Q\); then there exists \(p\in P\) such that \(q-p\in N\). Since \(q-p\in Q\cap N = P\cap N\subset P\), we have \(q\in P\).

  3. From the short exact sequence \(0\to M\to M\oplus N\to N\to 0\) and part 2, we see that finite direct sums of Noetherian modules are Noetherian. Moreover, a general finite sum \(\sum M_i\subset M\) is the image of the homomorphism \(\bigoplus M_i\to M\), so it is Noetherian by part 1.

 ◻

As a conclusion, if \(R\) is a Noetherian ring, then \(R^n\) is Noetherian, so all finitely generated \(R\)-modules are Noetherian, because they are all homomorphic images of some \(R^n\).

Finitely generated free modules over a PID

Let \(R\) be a PID. Then \(R\) is obviously a Noetherian ring, so \(R^n\) is a Noetherian module, and all its submodules are finitely generated. Let \(N \subset R^n\) be a submodule. Since it is finitely generated, there is a surjection \(R^m \to N\), which can be composed with the natural inclusion \(N \to R^n\) to obtain a map \(f: R^m \to R^n\) whose image is \(N\). We can write this map as an \(n \times m\) matrix \(A\). Considering the equivalent standard form of \(A\), there are invertible \(n\times n\) and \(m\times m\) matrices \(P, Q\) such that \(PAQ = D\), where \(D\) is a diagonal matrix. Translating back into the language of maps, this means that there are two isomorphisms \(p: R^n \to R^n\) and \(q: R^m \to R^m\) such that \(\delta := p \circ f \circ q: R^m \to R^n\) is the map described by \(D\). Obviously \({\rm Im}(f) = {\rm Im}(f \circ q) \cong {\rm Im}(p \circ f \circ q) = {\rm Im}\,\delta\), where \(D\) is of the following form \[\left( \begin{array}{ccccc} d_1 & & & & \\ & d_2 & & & \\ & & \dots & & \\ & & & d_r & \\ & & & & O \end{array} \right).\] Here \(d_i \in R\), \(r \le n\). So there is a basis \(e_1', e_2', \dots, e_n'\) of \(R^n\) such that \({\rm Im}\,\delta = d_1Re_1' \oplus d_2Re_2' \oplus \dots \oplus d_rRe_r'\), where the direct sum is an internal direct sum. This \({\rm Im}\,\delta\) is already isomorphic to \(N\); to recover \(N\) in its original position, note that since \(p^{-1}\) is an isomorphism, \[N = p^{-1}({\rm Im}\,\delta) = \bigoplus_{i \le r} d_iR\, p^{-1}(e_i').\] Here the \(e_i := p^{-1}(e_i')\) still form a basis of \(R^n\), because \(p^{-1}\) is an isomorphism. So there is a basis \(e_1, \dots, e_n\) of \(R^n\) such that \[N = d_1Re_1 \oplus \dots \oplus d_rRe_r.\]

Corollary 4.1. \(N \cong R^r\) and \({\rm rank}N = r \le n\).

Finitely Generated Modules over a PID

Any finitely generated module \(M\) is the homomorphic image of a free module, and thus is the quotient of a free module by one of its submodules. We know that the submodules of a free module \(R^n\) over a PID are of the form \((d_1)e_1\oplus \dots \oplus (d_r)e_r\), where the \(e_i\) form a basis of \(R^n\) (not necessarily the standard one). Therefore, under the isomorphism \(R^n\cong Re_1\oplus \dots \oplus Re_n\), this quotient is isomorphic to \[R/(d_1)\oplus\dots\oplus R/(d_r)\oplus R^{n-r}.\]

Theorem 4.8. Let \(R\) be a PID and \(M\) a finitely generated \(R\)-module. Then there exists a sequence of elements \(d_1|d_2|\dots|d_r\) in \(R\) and a non-negative integer \(k\) such that \[M\cong R/(d_1)\oplus\dots \oplus R/(d_r)\oplus R^k.\] This is called the ’cyclic decomposition’ of \(M\).

Exercise 4.29 (\(\bigstar\)). Prove that all finitely generated abelian groups are isomorphic to the following form \[G\cong \mathbb{Z}_{d_1}\oplus\dots\oplus \mathbb{Z}_{d_r}\oplus \mathbb{Z}^k\] where \(d_1|d_2|\dots|d_r\) is a sequence of integers.

Exercise 4.30 (\(\bigstar\)). Using prime factorization in PID, transform the above structure theorem for finitely generated modules into the following "quasi-prime decomposition" form.

  1. If \(d=up_1^{k_1}\dots p_s^{k_s}\), where \(u\) is a unit, the \(p_i\) are distinct primes, and the \(k_i\ge 0\) are non-negative integers, prove that \[R/(d)\cong R/(p_1^{k_1})\oplus \dots \oplus R/(p_s^{k_s}).\] (This can be seen as a special case of the Chinese Remainder Theorem; essentially, you only need to prove the case \(d=d_1d_2\) with \(d_1,d_2\) coprime.)

  2. Show that \(\mathbb{Z}_2\oplus \mathbb{Z}_6\oplus \mathbb{Z}_{18}\cong \mathbb{Z}_2^3\oplus (\mathbb{Z}_3\oplus \mathbb{Z}_9)\).

  3. Prove that all finitely generated \(R\)-modules are direct sums of modules of the form \(R/(p^k)\) and \(R\). More precisely, there is the following decomposition \[M\cong R^k\oplus \bigoplus_{p} \left((R/p^{\alpha_{p,1}})^{\beta_{p,1}}\oplus (R/p^{\alpha_{p,2}})^{\beta_{p,2}}\oplus\dots \oplus (R/p^{\alpha_{p,s_p}})^{\beta_{p,s_p}}\right)\] where \(0\le \alpha_{p,1}\le \alpha_{p,2}\le\dots\le \alpha_{p,s_p}\) and \(\beta_{p,i}\ge 0\).

  4. Conversely, show how to transform quasi-prime decomposition into cyclic decomposition.

Standard Form of Linear Maps

In this section, we consider a linear map \(\varphi:V\to V\) on a finite-dimensional \(k\)-vector space. Let \(B=\{e_1,\dots,e_n\}\) be a basis of \(V\) and write \(\varphi\) in matrix form \(A\). A subtlety arises here: neither the vector space \(V\) nor the definition of \(\varphi\) comes with a preferred choice of basis. In fact, many linear spaces and maps can be defined without specifying a basis, i.e. without relying on the choice of basis. However, if we want to write \(\varphi\) in matrix form, we must artificially choose a basis. This leads to the following questions:

  1. How does the choice of different bases affect the matrix?

  2. Is there a ’best’ or ’good’ special basis that simplifies the form of the matrix?

Similar Matrices

Let’s first look at the first question. Let \(B'=\{e_1',\dots,e_n'\}\) be another basis, with corresponding matrix \(A'\). A matrix is essentially a mapping from \(k^n\) to \(k^n\), and its relationship with the original mapping can be described using the basis mapping \(\rho:k^{(B)}\to V\), the isomorphism determined by \(\{e_1,\dots,e_n\}\). Viewing \(A\) as a linear mapping from \(k^n\) to \(k^n\), we have \(A=\rho^{-1}\circ\varphi\circ \rho\), as shown in the commutative diagram below (a commutative diagram is one in which composing the mappings along different paths yields the same result): \[\begin{array}{ccc} k^{(B)} & \xrightarrow{\ A\ } & k^{(B)}\\ \downarrow{\scriptstyle\rho} & & \downarrow{\scriptstyle\rho}\\ V & \xrightarrow{\ \varphi\ } & V \end{array}\] Similarly, if \(\rho'\) is the mapping determined by the other basis, we have \(A'=\rho'^{-1}\circ\varphi\circ \rho'\). Therefore, \[A'=\rho'^{-1}\rho A \rho^{-1}\rho' = (\rho'^{-1}\rho) A (\rho'^{-1}\rho)^{-1}.\] This story can be represented by the commutative diagram \[\begin{array}{ccc} k^{(B)} & \xrightarrow{\ A\ } & k^{(B)}\\ \downarrow{\scriptstyle\rho} & & \downarrow{\scriptstyle\rho}\\ V & \xrightarrow{\ \varphi\ } & V\\ \uparrow{\scriptstyle\rho'} & & \uparrow{\scriptstyle\rho'}\\ k^{(B')} & \xrightarrow{\ A'\ } & k^{(B')} \end{array}\] Note that \(\rho'^{-1}\circ \rho:k^{(B)}\to k^{(B')}\) is also an \(n\times n\) matrix and an isomorphism (thus an invertible matrix). Let’s denote this matrix by \(P\) and call it the coordinate transformation matrix. We then have \[A'=PAP^{-1}.\] This connects the two matrices arising from the same mapping. If there exists an invertible matrix \(P\) such that two square matrices \(A',A\) are related in this way, we call these matrices similar. It is easily verified that similarity is an equivalence relation on \(M_n(k)\).

Exercise 4.31. Let \(V=k^2\), and choose \(\varphi:V\to V\) to be the unique linear mapping that satisfies \(\varphi(e_1)=e_2\) and \(\varphi(e_2)=e_1\). What is the matrix \(A\) corresponding to \(\varphi\)? If we choose a new basis \(e_1':=(1,1),e_2'=(1,-1)\), calculate the transformation matrix \(P\) and \(A'\).

Similar Standard Form

Now we need to answer the question: given two matrices \(A, B\), how do we determine whether they are similar? This is essentially a classification problem. Generally speaking, we study such a problem from two directions. First, we look for invariants, quantities that can be calculated from the matrix \(A\) and that are unchanged under similarity transformations. Second, we can find a representative element in each similarity class, and ask whether \(A\) and \(B\) can be transformed into the same representative, or whether these representatives can be computed from the invariants. Either way, we are led to approach the problem from a coordinate-independent perspective. Let us assume that our matrix is the matrix representation of a linear mapping \(\varphi\) on \(V\) under some basis. We observe that, in addition to being a \(k\)-module, \(V\) can also be acted upon by \(\varphi\). By repeatedly applying \(\varphi\) and taking linear combinations, we can apply any \(k\)-polynomial \(p(\varphi)\) to \(V\), where \(p(t)\in k[t]\) is a polynomial. Thus, \(V\) becomes a \(k[t]\)-module, with the action defined as \[p(t)v:=p(\varphi)v.\] We now ask: if we let the matrix act as \(\varphi\), is similarity of matrices reflected in the corresponding \(k[t]\)-module structure on \(V\)? In fact, we find that

Lemma 4.1. Let \(V_A\) be the \(k[t]\)-module structure of \(V\) under the action \(t.v:=A.v\), and \(V_B\) be the \(k[t]\)-module structure of \(V\) under the action \(t.v:=B.v\). Then \(A\) and \(B\) are similar if and only if, as \(k[t]\)-modules, \(V_A\cong V_B\).

Proof.

  • If \(A\) is similar to \(B\), say \(B=PAP^{-1}\), then the transition matrix \(P: V_A \to V_B\) is the required isomorphism, noting that \(t\) acts differently on its domain and codomain. Since \(P\) is an invertible matrix, it is obviously a \(k\)-module isomorphism. To verify that it is a \(k[t]\)-module homomorphism, it suffices to show that it commutes with \(t\): \[P(tv) = PAv = PAP^{-1}Pv = B(Pv) = t(Pv).\] Note that conceptually \(Pv \in V_B\): although \(V_A\) and \(V_B\) are identical as vector spaces, their \(k[t]\)-module structures are different.

  • Conversely, if there exists a \(k[t]\)-module isomorphism \(P: V_A \cong V_B\), then we have \[P(tv) = tP(v)\] which implies \(PA = BP \Leftrightarrow PAP^{-1} = B\).

 ◻

So we have found that classifying matrices or mappings up to similarity is equivalent to classifying \(k[t]\)-modules. Note that \(k[t]\) is a PID, and the structure of finitely generated modules over a PID has been completely classified! Such a module can be written as a direct sum of modules of the form \(R/(d)\), or of quasi-prime modules \(R/(p^k)\). One idea is to use the matrices corresponding to these forms as our representatives; then we only need to transform \(V_A\) into this form to bring the matrix into standard form. Let’s first discuss what the standard form looks like.

Rational Standard Form

In the cyclic decomposition of \(V_A\), we call the resulting \(d_i\) the invariant factors of the matrix \(A\) or of \(\varphi\). In order to find a simpler form, we decompose the invariant factors into products of prime factors, turning the decomposition into a quasi-prime decomposition; that is, \(V\) becomes a direct sum of modules of the form \(R/(p^k)\), where \(p\in R\) is an irreducible element (\(R\) itself will not appear in the decomposition, because \(V\) is a finite-dimensional \(k\)-vector space). When we write \[V_A\cong R/(p_1^{k_1})\oplus \dots\oplus R/(p_r^{k_r})\] (allowing repetitions in the sequences \(p_i\) and \(k_i\)), we call all of these \(p_i(t)^{k_i}\) (with multiplicity) the elementary factors of \(A\) or \(\varphi\). It is easy to see that the elementary factors completely determine the module structure of \(V_A\). The above decomposition is not only a direct sum decomposition of \(R\)-modules, but also a direct sum decomposition of \(k\)-modules. Therefore, we only need to find a \(k\)-basis for each \(R/(p_i^{k_i})\) in order to find a \(k\)-basis for \(V\). Write \(p_i(t)^{k_i}=t^{d_i}-a_{d_i-1}t^{d_i-1}-\dots -a_{0}\), a polynomial with leading coefficient \(1\) (we may always take it monic, since the non-zero scalars \(k\setminus\{0\}\) are units in \(k[t]\)); we take the images of \(1,t,t^2,\dots, t^{d_i-1}\) in \(R/(p_i^{k_i})\) as a \(k\)-basis. The action of \(\varphi\) on this basis is just multiplication by \(t\), so we have \[t\cdot 1=t\] \[t\cdot t=t^2\] \[\dots\] \[t\cdot t^{d_i-1}=t^{d_i}=a_{d_i-1}t^{d_i-1}+a_{d_i-2}t^{d_i-2}+\dots +a_{0}\] This gives the matrix of \(\varphi\) on the subspace corresponding to \(R/(p_i^{k_i})\) as \[A_{p_i^{k_i}}= \left( \begin{array}{ccccc} & & & & a_{0}\\ 1 & & & & a_1\\ & 1 & & & a_2\\ & & \dots & & \dots\\ & & & 1 & a_{d_i-1} \end{array} \right).\] Thus, in this basis, the matrix of \(\varphi\) is block-diagonal, composed of such blocks: \[A= \left( \begin{array}{cccc} A_{p_1^{k_1}} & & & \\ & A_{p_2^{k_2}} & & \\ & & \dots & \\ & & & A_{p_r^{k_r}} \end{array} \right).\] Here, all entries of the matrix belong to \(k\), and we call it the rational standard form of \(\varphi\).
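A small helper (a sketch in Python; the function name is mine) that builds the block \(A_{p^k}\) above from the coefficients \(a_0,\dots,a_{d-1}\):

```python
# Builds the companion block A_{p^k} above from [a_0, ..., a_{d-1}], where
# p(t)^k = t^d - a_{d-1} t^{d-1} - ... - a_0.

def companion(a):
    d = len(a)
    C = [[0] * d for _ in range(d)]
    for i in range(1, d):
        C[i][i - 1] = 1       # subdiagonal: t * t^{i-1} = t^i
    for i in range(d):
        C[i][d - 1] = a[i]    # last column: the expansion of t * t^{d-1}
    return C

# q(t) = t^2 - 5t - 2, the invariant factor appearing in Example 4.5 below:
for row in companion([2, 5]):
    print(row)
# [0, 2]
# [1, 5]
```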

Exercise 4.32 (\(\bigstar\)). Let \(\varphi:V\to V\) be a linear transformation on an \(n\)-dimensional \(k\)-vector space. Thus, \(\varphi\) makes \(V\) into a \(k[t]\) module \(V_\varphi\).

  1. Prove that \(d_n\) in the invariant factors of \(\varphi\) is the least common multiple of all elementary factors.

  2. For a module \(M\) over a ring \(R\), define \[{\rm ann}(M):=\{r\in R| rM=0\}\] and prove that it is an ideal of \(R\).

  3. Show that \(\varphi\) has a minimal polynomial, that is, \({\rm ann}(V_\varphi)=(m(t))\) and the polynomial is \(m(t)=d_n(t)\), with degree \(\le n\).

Jordan Normal Form

If our field \(k\) is algebraically closed, meaning every polynomial over \(k\) has a root in \(k\) and hence factors into a product of linear factors (e.g. \(k=\mathbb{C}\)), then the matrix can be reduced to an even simpler form. In this case, the irreducible polynomials in \(k[t]\) are exactly those of the form \(t-a\), so each \(p_i^{k_i}\) must be of the form \((t-\lambda_i)^{k_i}\). We consider a more convenient \(k\)-basis for \(R/((t-\lambda_i)^{k_i})\): namely \((t-\lambda_i)^{k_i-1},(t-\lambda_i)^{k_i-2},\dots,1\). In this basis, the action of \(t\) is as follows: \[t\cdot (t-\lambda_i)^{k_i-1} = \lambda_i\cdot (t-\lambda_i)^{k_i-1}+0\] \[t\cdot (t-\lambda_i)^{k_i-2} = \lambda_i\cdot (t-\lambda_i)^{k_i-2}+(t-\lambda_i)^{k_i-1}\] \[\dots\] \[t\cdot 1 = \lambda_i\cdot 1+(t-\lambda_i)\] This shows that in this basis, \(\varphi\) has the following matrix representation on this subspace: \[J_{p_i^{k_i}}= \left( \begin{array}{ccccc} \lambda_i & 1 & & & \\ & \lambda_i & 1 & & \\ & & \lambda_i & 1 & \\ & & & \dots & 1 \\ & & & & \lambda_i \end{array} \right).\] Here the matrix is of size \(k_i\times k_i\). So \(\varphi\) has the following matrix representation on \(V\) with this chosen basis: \[A= \left( \begin{array}{cccc} J_{p_1^{k_1}} & & & \\ & J_{p_2^{k_2}} & & \\ & & \dots & \\ & & & J_{p_r^{k_r}} \end{array} \right).\] This is called the Jordan normal form of \(\varphi\). In particular, if \(k_i=1\), then \(J_{(t-\lambda_i)}=(\lambda_i)\) is a \(1\times 1\) block consisting of a single diagonal entry.
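For instance, a tiny helper (name mine) that writes down the Jordan block for an elementary factor \((t-\lambda)^k\):

```python
# The Jordan block for an elementary factor (t - lam)^k, in the basis order
# (t-lam)^{k-1}, ..., (t-lam), 1 chosen above.

def jordan_block(lam, k):
    J = [[0] * k for _ in range(k)]
    for i in range(k):
        J[i][i] = lam
        if i + 1 < k:
            J[i][i + 1] = 1   # superdiagonal 1s
    return J

for row in jordan_block(3, 3):
    print(row)
# [3, 1, 0]
# [0, 3, 1]
# [0, 0, 3]
```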

Exercise 4.33. Prove that \(A\) is diagonalizable over the field \(k\) if and only if all of its elementary factors are linear (of degree one). This is equivalent to the minimal polynomial of \(A\) factoring over \(k\) into a product of distinct linear factors.

Exercise 4.34 (\(\bigstar\bigstar\bigstar\)). Let \(k=\mathbb{F}_p\) be a finite field. How many \(k\)-similarity equivalence classes are there in \(M_2(k)\)?

The \(k[t]\)-decomposition of \(V\)

In this section, we denote \(R=k[t]\). How can we give a surjection from some \(R^n\) onto \(V\)? In other words, how can we find a set of generators? In fact, once we fix a basis \(\alpha_1,\dots,\alpha_n\) of \(V\), \(\varphi\) has a matrix representation \(A\), and \(\alpha_1,\dots,\alpha_n\) are a set of \(k\)-generators of \(V\), hence also a set of \(R\)-generators. Therefore, we consider the \(R\)-surjection determined by \(\alpha_1,\dots,\alpha_n\in V\), \[\rho:R^n\to V.\] To find the kernel of this mapping, we consider the matrix representation \(A\); obviously we have \[x_i:=te_i-(a_{1i}e_1+\dots+a_{ni}e_n)\in \ker\rho.\] Here \(e_i\in R^n\) is the standard basis. This gives us \(n\) elements \(x_1,\dots,x_n\) of \(\ker\rho\), and we can construct a mapping \(R^n\to R^n\) sending \(e_i\) to \(x_i\); in fact, this mapping is described by the matrix \(tI-A\in M_n(R)\). Since \(te_i=x_i+(a_{1i}e_1+\dots)\), we find that \(N=\sum R x_i + \sum k e_i\) is an \(R\)-submodule of \(R^n\) (it is closed under multiplication by \(t\)), and since it contains all the \(e_i\), in fact \(N=R^n\). Therefore, any \(x=\sum p_i(t)e_i\in \ker\rho\) belongs to \(N\) and can be written as \[x=\sum q_i(t)x_i + \sum c_i e_i.\] Noting that \(0=\rho(x)=\sum c_i \alpha_i\), we see that all \(c_i=0\). This shows that \(x\) is an \(R\)-linear combination of the \(x_i\), and we have proved that the \(x_i\) are indeed a set of generators of \(\ker\rho\).

We also need to prove that this \(R^n\to R^n\) is an injection, which means that the \(x_i\) are linearly independent over \(R\). To do this, suppose \(\sum q_i(t)x_i=0\). We have \[\sum_{i}q_i(t)te_i - \sum_{i,j}q_ja_{ij}e_i=\sum_i \left(q_it-\sum_{j}q_ja_{ij}\right)e_i = 0\] which implies \(tq_i=\sum_j a_{ij}q_j\). It follows that the ideal \((q_1,\dots,q_n)=\sum Rq_i\) is contained in \(\sum kq_i\). Since \(R\) is a PID, this ideal is principal, say \((d)\). If \(d\neq 0\), then since \(R\) is an integral domain, \((d)\) contains polynomials of arbitrarily large degree; but \(\sum kq_i\) is a finite-dimensional \(k\)-space containing polynomials of bounded degree only, which is impossible. Therefore \(d=0\), and we conclude that \(q_1=q_2=\dots =q_n=0\).

Corollary 4.2. We have the following short exact sequence of \(R\)-modules \[0\to R^n\to R^n\to V_A\to 0.\] where \(R^n\to R^n\) is the map determined by \(tI-A\in M_n(R)\).

After having this basic exact sequence, we will examine how to concretely calculate the module structure of \(V_A\). Note that \(tI-A\) is a matrix over a PID, so there exist invertible matrices \(P,Q\in M_n(R)\) such that \[P(tI-A)Q=D={\rm diag}(d_1,\dots,d_r,0,\dots,0)\] where \(d_1|\dots|d_r\), and \({\rm diag}\) denotes the diagonal matrix with these elements on the diagonal and \(0\) elsewhere. Setting \(V':=R^n/DR^n\), we have the following commutative diagram with exact rows: \[\begin{array}{ccccccccc} 0 & \to & R^n & \xrightarrow{\ tI-A\ } & R^n & \to & V_A & \to & 0\\ & & \downarrow{\scriptstyle Q^{-1}} & & \downarrow{\scriptstyle P} & & \downarrow{\scriptstyle \gamma} & & \\ 0 & \to & R^n & \xrightarrow{\ D\ } & R^n & \to & V' & \to & 0 \end{array}\] Here we need the following simple lemma.

Lemma 4.2. Let \(R\) be a (not necessarily commutative) ring, and let \(M', M, M'', N', N, N''\) be (say) left \(R\)-modules. Suppose we are given the following diagram of \(R\)-modules and \(R\)-module homomorphisms (except for \(\gamma\), which is yet to be constructed), \[\begin{array}{ccccccc} M' & \xrightarrow{\ f\ } & M & \xrightarrow{\ g\ } & M'' & \to & 0\\ \downarrow{\scriptstyle\alpha} & & \downarrow{\scriptstyle\beta} & & \downarrow{\scriptstyle\gamma} & & \\ N' & \xrightarrow{\ f'\ } & N & \xrightarrow{\ g'\ } & N'' & \to & 0 \end{array}\] in which the left square commutes and both rows are exact. Then we can uniquely construct an \(R\)-module homomorphism \(\gamma\) making the entire diagram commute. Furthermore, if \(\alpha\) and \(\beta\) are isomorphisms, then \(\gamma\) is also an isomorphism. (Essentially, this homomorphism \(\gamma\) is induced by \(\beta\).)

Proof. In fact, let \(m'' \in M''\) and consider its preimage \(m\) in \(M\). We map it to \(g' \circ \beta(m)\) and define \(\gamma(m'') = g' \circ \beta(m)\). It is necessary to verify that this definition does not depend on the choice of preimage \(m\). In fact, another preimage differs from \(m\) by an element in \(\ker(g)\), which is equivalent to an element in \({\rm Im}f\). Thus, we can denote the other preimage as \(m + f(m')\), and its image in \(N''\) is \(g'(\beta(m)) + g'(\beta(f(m'))) = g'(\beta(m)) + g'(f'(\alpha(m'))) = g'(\beta(m))\). Therefore, the definition of \(\gamma\) does not depend on the choice of preimage \(m\). Here we use commutativity. It is easy to verify that \(\gamma\) is indeed an \(R\)-module homomorphism. Uniqueness is obvious, since \(\gamma\) is uniquely determined by the commutativity of the diagram.

Now let us consider the case when \(\alpha\) and \(\beta\) are isomorphisms, and show that \(\gamma\) is also an isomorphism. In fact, we can use \(\alpha^{-1}\) and \(\beta^{-1}\) to form the commutative diagram \[\begin{array}{ccccccc} N' & \xrightarrow{\ f'\ } & N & \xrightarrow{\ g'\ } & N'' & \to & 0\\ \downarrow{\scriptstyle\alpha^{-1}} & & \downarrow{\scriptstyle\beta^{-1}} & & \downarrow{\scriptstyle\gamma'} & & \\ M' & \xrightarrow{\ f\ } & M & \xrightarrow{\ g\ } & M'' & \to & 0 \end{array}\] By the first part of the lemma, we can uniquely construct \(\gamma'\) making this diagram commute. Stacking this diagram below the original one, \(1_{M''}: M'' \to M''\) is obviously a map that makes the combined diagram (from the first row to the third) commute. Therefore, by uniqueness, \(\gamma' \circ \gamma = 1_{M''}\); symmetrically, \(\gamma \circ \gamma' = 1_{N''}\), which implies that \(\gamma\) is an isomorphism. ◻

Now, since \(P,Q\) are invertible, \(Q^{-1}\) and \(P\) define isomorphisms. From the lemma, an isomorphism can be constructed from \(V_A\) to \(V'\), where \[V'\cong R^n/DR^n=R/(d_1)\oplus \dots \oplus R/(d_r) \oplus R^{n-r}.\] Note that \(V'\cong V_A\) is also a \(k\)-module isomorphism and \(V_A\) is a finite-dimensional \(k\)-vector space, while \(R\) is not finite-dimensional over \(k\); therefore \(n=r\), and \(d_1|d_2|\dots |d_n\) are nonzero elements of \(R\) such that \[V_A\cong V'\cong R/(d_1)\oplus \dots \oplus R/(d_n).\] Of course, to know that \(V_A\) is isomorphic to a module of the above form, the structure theorem for finitely generated \(R\)-modules already suffices; the point of the above process is that it gives an explicit way to calculate the standard form of \(V_A\) from the matrix \(A\).

Exercise 4.35 (\(\bigstar\)). Prove that, given the following commutative diagram of \(R\)-module homomorphisms in which both rows are short exact sequences, \[\begin{array}{ccccccccc} 0 & \to & M' & \to & M & \to & M'' & \to & 0\\ & & \downarrow{\scriptstyle\alpha} & & \downarrow{\scriptstyle\beta} & & \downarrow{\scriptstyle\gamma} & &\\ 0 & \to & N' & \to & N & \to & N'' & \to & 0 \end{array}\] a unique mapping \(\alpha\) can be constructed (from \(\beta\) and \(\gamma\)) to make the diagram commute. If \(\beta\) and \(\gamma\) are both isomorphisms, then so is \(\alpha\).

Exercise 4.36 (\(\bigstar\bigstar\)). Let \(K\subset L\) be two fields, where \(K\) is a subfield of \(L\). Prove that \(A,B\in M_n(K)\) are similar over \(K\) if and only if they are similar over \(L\).

Eigenvectors and Root Subspaces

In this section, we discuss how to actually obtain a transition matrix \(P\) such that \(P^{-1}AP=D\) is our standard form, or equivalently, \(AP=PD\). Let \(k\) be a field (e.g. an algebraically closed field like \(k=\mathbb{C}\)) such that all elementary divisors of \(\varphi\) are powers of linear factors, \((t-\lambda_i)^{k_i}\). We call the \(\lambda\)’s appearing in these elementary divisors the eigenvalues of \(\varphi\). Grouping the elementary divisors by eigenvalue, we have \[V_\varphi\cong \bigoplus_{\lambda} \left(\bigoplus_k (R/(t-\lambda)^{k})^{n_k}\right).\] For each eigenvalue \(\lambda\), the summand \(\bigoplus_k (R/(t-\lambda)^{k})^{n_k}\) corresponds under this isomorphism to a subspace of \(V_\varphi\), which we call the root subspace with respect to \(\lambda\). It is given by \[R_\lambda:=\bigcup_k \ker(\varphi-\lambda)^k.\] The subspace \(\ker(\varphi-\lambda)\) is called the eigenspace with respect to \(\lambda\), denoted by \(V_\lambda\); the non-zero vectors in this subspace are called eigenvectors of \(\varphi\), and the eigenspace is obviously contained in the root subspace of the same eigenvalue. Recall that when we discussed the Jordan canonical form, we chose a \(k\)-basis for each submodule of the form \(R/((t-\lambda)^k)\): the image of \(1\in R\) in this quotient generates it, and it has the basis \((t-\lambda)^{k-1},\dots,(t-\lambda),1\), which must be arranged in this order to make the matrix of \(\varphi\) the Jordan form described earlier. Now if we find the image of \(1\in R/((t-\lambda)^k)\) under the isomorphism to \(V_\varphi\), denoted by \(\alpha\), then the \(k\)-basis for this subspace can be written as \((\varphi-\lambda)^{k-1}\alpha, \dots, (\varphi-\lambda)\alpha, \alpha\).

If \(\varphi\) is diagonalizable, then all elementary factors are linear, and the root subspaces coincide with the eigenspaces, so by finding the eigenvectors we can obtain \(P\). For a general \(\varphi\), we need to find a basis for each root subspace \(\ker(\varphi-\lambda)^k\). Here are some examples to illustrate this.

Example 4.5. We consider the following matrix over the field \(k=\mathbb{Q}\): \[A=\left( \begin{array}{cc} 1 & 2\\ 3 & 4\\ \end{array} \right),tI-A=\left( \begin{array}{cc} t-1 & -2 \\ -3 & t-4 \\ \end{array} \right).\] By performing elementary transformations on \(tI-A\), we get: \[\Rightarrow \left( \begin{array}{cc} -2 & t-1 \\ t-4 & -3 \\ \end{array} \right)\Rightarrow \left( \begin{array}{cc} -2 & 0 \\ t-4 & \frac{(t-4)(t-1)}{2}-3 \\ \end{array} \right)\Rightarrow \left( \begin{array}{cc} 1 & 0 \\ 0 & t^2-5t-2 \\ \end{array} \right).\] Thus, the rational canonical form of \(A\) over \(\mathbb{Q}\) is \(\left( \begin{array}{cc} 0 & 2 \\ 1 & 5 \\ \end{array} \right)\). In other words, there exists an invertible matrix over \(\mathbb{Q}\) such that \(A\) is similar to the above form. If we consider \(k=\mathbb{R}\) or \(k=\mathbb{C}\) (in fact, we only need \(k=\mathbb{Q}(\sqrt{33})\)), then from \(t^2-5t-2=\left(t-\frac{5+\sqrt{33}}{2}\right)\left(t-\frac{5-\sqrt{33}}{2}\right)\), we obtain its elementary factors, and thus it is similar to the following diagonal matrix over \(k\): \[A'=\left( \begin{array}{cc} \frac{5+\sqrt{33}}{2} & 0 \\ 0 & \frac{5-\sqrt{33}}{2} \\ \end{array} \right).\] This is the Jordan canonical form of \(A\) over any field \(k\) that contains \(\mathbb{Q}(\sqrt{33})\).

To find \(P\), consider its eigenspaces for the two eigenvalues \(\lambda_1=\frac{5+\sqrt{33}}{2}, \lambda_2=\frac{5-\sqrt{33}}{2}\), \(V_{\lambda_1}=\ker(A-\lambda_1 I)\) and \(V_{\lambda_2}=\ker(A-\lambda_2 I)\). Using the algorithm for calculating \(\ker\) described in the previous sections (or solving the equations directly), we can compute the generators of these two spaces, which are the eigenvectors \(\left(\begin{array}{cc} 2 \\ \lambda-1\end{array}\right)\), where \(\lambda=\lambda_1,\lambda_2\) for the corresponding eigenvalues. Therefore (note that \(P\) does not need to be unique), \[P=\left( \begin{array}{cc} 2 & 2 \\ \frac{3+\sqrt{33}}{2} & \frac{3-\sqrt{33}}{2} \\ \end{array} \right)\] satisfies \(AP=PA'\), i.e. \(P^{-1}AP=A'\).
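We can double-check \(AP=PA'\) numerically; a small sketch in Python (floating point, so expect tiny rounding errors):

```python
# Floating-point check of Example 4.5: A P should equal P A' (up to rounding).
import math

s = math.sqrt(33)
l1, l2 = (5 + s) / 2, (5 - s) / 2
A = [[1, 2], [3, 4]]
P = [[2, 2], [(3 + s) / 2, (3 - s) / 2]]
Ap = [[l1, 0], [0, l2]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

print(mul(A, P))   # both print (approximately) the same matrix
print(mul(P, Ap))
```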

Example 4.6. Consider the following matrix over \(\mathbb{Q}\), transform it into Jordan canonical form \(J\), and find the corresponding \(P\) such that \(P^{-1}AP=J\). \[A=\left( \begin{array}{cccc} 2 & 1 & 2 & 1 \\ 1 & 2 & 2 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ \end{array} \right).\] By performing elementary transformations on \(tI-A\), we can calculate its invariant factors \[\left( \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & (t-1) & 0 \\ 0 & 0 & 0 & (t-1)^2(t-3) \\ \end{array} \right)\] Thus, the elementary factors are \((t-1),(t-1)^2\), and \((t-3)\). We then calculate \(\ker(A-1\cdot I)^2\) and obtain the root subspace \(R_1\) with the following basis (the basis is not unique and may differ from what others calculate) \[v_1=\left(\begin{array}{c} -1 \\ 1 \\ 0\\0\end{array}\right), v_2=\left(\begin{array}{c} -2 \\ 0\\ 1\\0\end{array}\right), v_3=\left(\begin{array}{c} -2 \\0 \\0\\ 1\end{array}\right).\] By applying \(A-I\) to these vectors, we find that the first two vectors fall into \(\ker(A-I)\), while the last one \((A-I)v_3=-e_1-e_2+e_3\) belongs to \(\ker(A-I)^2-\ker(A-I)\). Therefore, we let \(\alpha_2=v_3\), which corresponds to the generator \(1\) in \(R/((t-1)^2)\), and its Jordan block corresponds to the basis \((A-I)\alpha_2,\alpha_2\). We also need another eigenvector that does not belong to the space generated by \(k[\varphi]\alpha_2\) to correspond to the other elementary factor \((t-1)\). It is clear that we can take \(\alpha_1=v_1\).

Finally, by computing \(\ker(A-3I)\), we find the eigenvector corresponding to the eigenvalue \(3\): \[\beta=\left(\begin{array}{c} 1\\ 1\\ 0\\ 0\end{array}\right)\] Arranging the vectors in the order \(\alpha_1,\,(A-I)\alpha_2,\,\alpha_2,\,\beta\) as the columns of \(P\), we obtain the desired transition matrix \[P=\left( \begin{array}{cccc} -1 & -1 & -2 & 1 \\ 1 & -1 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \end{array} \right), P^{-1}AP= \left( \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 3 \end{array} \right).\]
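As a sanity check, we can verify this last identity exactly with rational arithmetic; a small sketch in Python (helper names mine):

```python
# Exact check of Example 4.6 with rational arithmetic: P^{-1} A P = J.
from fractions import Fraction

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse(A):
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == r)) for i in range(n)]
         for r, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)  # pivot search
        M[c], M[p] = M[p], M[c]
        M[c] = [x / M[c][c] for x in M[c]]                # normalize pivot row
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

A = [[2, 1, 2, 1], [1, 2, 2, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
P = [[-1, -1, -2, 1], [1, -1, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0]]
for row in mul(inverse(P), mul(A, P)):
    print(row)   # the Jordan form: blocks (1), (1 1; 0 1), (3)
```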

Exercise 4.37 (\(\bigstar\bigstar\)). In fact, in the calculation process of \(P\) in the above example, we can also reduce the amount of calculation by the following means.

  1. Let \(\varphi:V\to V\) be a general linear map on an \(n\)-dimensional space, and its minimal polynomial is of the form \((t-\lambda_1)^{k_1}(t-\lambda_2)^{k_2}\), where the two eigenvalues are distinct. Prove that \[{\rm Im}(\varphi-\lambda_1)^{k_1}=\ker(\varphi-\lambda_2)^{k_2}\] \[\ker(\varphi-\lambda_1)^{k_1}={\rm Im}(\varphi-\lambda_2)^{k_2}.\]

  2. Here, \(A\) returns to the \(A\) in the previous example. By calculating the matrix \((A-3I)\) and \(\ker(A-3I)\), find the eigenvector of \(\lambda=3\), and use \(\ker(A-I)^2={\rm Im}(A-3I)\) to show that \(\ker(A-I)^2\) is directly generated by the columns of \(A-3I\), so there is no need to calculate \((A-I)^2\) and its kernel.

Exercise 4.38 (\(\bigstar\)). The Fibonacci sequence is a unique sequence of positive integers determined by the initial conditions \(F_1=1, F_2=1\) and the recurrence relation \[F_{n+2}=F_{n+1}+F_n.\] We can convert the above recurrence relation into matrix form \[\left(\begin{array}{c} F_{n+2}\\ F_{n+1}\end{array}\right)=\left( \begin{array}{cc} 1 & 1 \\ 1 & 0 \\ \end{array} \right)\left(\begin{array}{c} F_{n+1}\\F_n\end{array}\right).\] Let the matrix be denoted as \(A\). By converting this matrix into Jordan canonical form over \(k=\mathbb{C}\), we can easily calculate \(A^n\) and derive the general formula for \(F_n\).
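A numerical sketch in Python (not a substitute for the derivation the exercise asks for) checking both the matrix-power computation and the closed formula \(F_n=(\phi^n-\psi^n)/\sqrt{5}\) one obtains, where \(\phi,\psi\) are the two eigenvalues of \(A\):

```python
# Matrix-power computation of F_n, plus the closed formula obtained from the
# diagonal (Jordan) form of A over C.
import math

def mat_pow(n):
    """Compute A^n for A = [[1, 1], [1, 0]] by repeated squaring."""
    F = [[1, 1], [1, 0]]
    R = [[1, 0], [0, 1]]
    while n:
        if n & 1:
            R = [[sum(R[i][k] * F[k][j] for k in range(2)) for j in range(2)]
                 for i in range(2)]
        F = [[sum(F[i][k] * F[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]
        n >>= 1
    return R

phi = (1 + math.sqrt(5)) / 2   # the two eigenvalues of A
psi = (1 - math.sqrt(5)) / 2
n = 10
print(mat_pow(n)[0][1])                             # F_10 = 55
print(round((phi ** n - psi ** n) / math.sqrt(5)))  # same, via the formula
```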

Exercise 4.39 (\(\bigstar\bigstar\)). Is there a matrix \(X\) such that \[X^2=\left(\begin{array}{ccc} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{array}\right)?\]

Invariants of Linear Maps

An invariant is either a quantity whose definition does not depend on the choice of basis, or a quantity that remains unchanged for any choice of basis (i.e. a similarity invariant). In this section, we will look at several classical invariants of linear maps.

Determinant

Consider an \(n\)-dimensional \(k\)-vector space \(V\) and a linear map \(\varphi:V\to V\). We consider an \(n\)-linear alternating function on \(V\) \[D(v_1,v_2,\dots,v_n):V\times V\times\dots \times V\to k.\] Here, \(n\)-linear means that \(D\) is linear in each variable \(v_i\), i.e. when fixing all other arguments, \[v_i\mapsto D(v_1,\dots, v_i,\dots,v_n): V\to k\] is a \(k\)-linear map. Alternating means that if two of the vectors are equal, e.g. \(v_i=v_j=v\), then the value of the function is \(0\): \[D(v_1,\dots,v,\dots,v,\dots,v_n)=0.\] As a consequence, when we swap the positions of two vectors \(v_i,v_j\), the value of the function changes sign: \[D(v_1,\dots,v_i,\dots,v_j,\dots,v_n)=-D(v_1,\dots,v_j,\dots,v_i,\dots,v_n).\] This is because \(D(\dots,v_i+v_j,\dots,v_i+v_j,\dots)=0\), and expanding by \(n\)-linearity yields the claim. Now, we may ask: with all these strange conditions, can such a function even exist? We introduce a basis \(e_1,\dots,e_n\) of \(V\) and consider the decomposition of each \(v_i\) in terms of this basis, \[v_i=\sum a_{ji}e_j.\] Then, by \(n\)-linearity, we have \[\begin{aligned} D(v_1,\dots)&=\sum_{j_1} a_{j_11}D(e_{j_1},\dots)\\ &=\sum_{j_1}\sum_{j_2} a_{j_11}a_{j_22}D(e_{j_1},e_{j_2},\dots)\\ &=\dots\\ &=\sum_{j_1,\dots,j_n} a_{j_11}\dots a_{j_nn} D(e_{j_1},\dots,e_{j_n}) \end{aligned}\] Since \(D(e_{j_1},\dots,e_{j_n})\) can only be non-zero when there are no repeated entries, we only need to consider the case where \(k\mapsto j_k\) is a permutation of \(\{1,\dots,n\}\), which we denote by \(\sigma\). Such a permutation can be decomposed into a product of transpositions, and each transposition produces a sign change of \(D\). Therefore, \[D(e_{\sigma(1)},\dots,e_{\sigma(n)})={\rm sgn}(\sigma)D(e_1,\dots, e_n)\] Thus, \[D(v_1,\dots,v_n)=\left(\sum_{\sigma\in S_n} {\rm sgn}(\sigma) a_{\sigma(1)1}\dots a_{\sigma(n)n} \right) D(e_1,\dots,e_n).\] It is easy to verify that this expression defines a function satisfying \(n\)-linearity and the alternating property, so the desired function \(D\) exists. However, its definition seems to depend on the choice of the value \(D(e_1,\dots,e_n)\) and on a basis for \(V\). To avoid the meaningless zero function, we require \(D(e_1,\dots,e_n)\neq 0\). For any mapping \(\varphi\), we consider \[\det(\varphi):=\frac{D(\varphi(e_1),\dots,\varphi(e_n))}{D(e_1,\dots,e_n)}\] which is called the determinant of \(\varphi\). If we write the matrix decomposition as \(\varphi(e_i)=\sum a_{ji}e_j\), the formula above shows that this definition does not depend on the choice of the value \(D(e_1,\dots,e_n)\) (as long as it is non-zero). Therefore, we can write the definition of the determinant of a matrix \(A\) as \[\det A = \sum_{\sigma\in S_n} {\rm sgn}(\sigma) a_{\sigma(1)1}\dots a_{\sigma(n)n}.\] This amounts to evaluating \(D(\alpha_1,\dots,\alpha_n)\), where the \(\alpha_i\) are the columns of \(A\), normalized so that \(D(e_1,\dots,e_n)=1\). Interestingly, the definition of \(\det\varphi\) also does not depend on the choice of the basis \(e_1, \dots, e_n\)! To prove this, we only need to show that similar matrices have the same determinant. This follows because the determinant is a multiplicative homomorphism:

Theorem 4.9 (Determinant is a multiplicative homomorphism). Fix a choice of basis for the vector space \(V\), and let \(A,B\in M_n(k)\) be two matrices. Then, \[\det(AB)=\det(A)\det(B).\]

Proof. Let the columns of \(A\) from left to right be \(\alpha_1,\dots,\alpha_n\), then the \(j\)th column of \(AB\) is \(b_{1j}\alpha_1+\dots+b_{nj}\alpha_n\). By using a similar alternating sum expansion as before, we have \[\begin{aligned} \det(AB)&=\sum_{\sigma\in S_n} {\rm sgn}(\sigma)b_{\sigma(1)1}\dots b_{\sigma(n)n}D(\alpha_1,\dots,\alpha_n)\\ &=\sum_{\sigma\in S_n} {\rm sgn}(\sigma)b_{\sigma(1)1}\dots b_{\sigma(n)n}\det(A)\\ &=\det(A)\det(B). \end{aligned}\] ◻

Proof. It is easy to see that \(\det(I)=1\), so for an invertible matrix \(P\), we have \[\det(PP^{-1})=\det(P)\det(P^{-1})=1\Rightarrow \det(P^{-1})=(\det P)^{-1}.\] Therefore, two similar matrices \(A'=PAP^{-1}\) have equal determinants: \[\det(A')=\det(P)\det(A)(\det P)^{-1}=\det(A).\] In fact, the determinant has a clear meaning: it is a signed volume function in \(n\)-dimensional space, describing the volume of the parallelotope spanned by the vectors \(v_1,\dots,v_n\). The definition of this volume ultimately depends on the value of the "base volume" \(D(e_1,\dots,e_n)\). The determinant of \(\varphi\) is therefore the scaling factor by which \(\varphi\) multiplies the volume of the parallelotope spanned by the basis, and this factor does not depend on the value of the "base volume". In other words, it describes the "size" of the linear mapping \(\varphi\) in a signed manner, independently of the choice of basis. ◻
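The permutation-sum formula above translates directly into code; a small sketch in Python (function names mine), suitable only for small matrices:

```python
# The permutation-sum (Leibniz) formula for det, exactly as derived above.
import math
from itertools import permutations

def sign(p):
    """sgn of a permutation, via the parity of its number of inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(A):
    n = len(A)
    return sum(sign(s) * math.prod(A[s[i]][i] for i in range(n))
               for s in permutations(range(n)))

print(det([[1, 2], [3, 4]]))   # -2
```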

Exercise 4.40. Explain why the determinant can actually be defined for matrices in \(M_n(R)\), where \(R\) is a commutative ring.


Exercise 4.41 (\(\bigstar\)). In this problem, let \(A\in M_n(R)\) be a matrix over a PID.

  1. \(A\) is invertible if and only if \(\det(A)\in R^\times\) is a unit.

  2. \(A\) defines an injection if and only if \(\det(A)\neq 0\).

  3. \({\rm rank}A<n\) if and only if \(\det(A)=0\).

Exercise 4.42 (\(\bigstar\)). Let \(\varphi:V\to V\) be a \(k\)-linear map on an \(n\)-dimensional vector space, and let \(A\) be its matrix with respect to some basis. Prove the equivalence of the following statements:

  1. \({\rm rank}A=\dim V\).

  2. The columns of \(A\) span the whole space \(k^n\).

  3. \(\ker \varphi = 0\), i.e. \(\varphi\) is injective.

  4. \(\det\varphi \neq 0\).

  5. \(0\) is not an eigenvalue of \(\varphi\).

  6. \(Ax=0\) has no non-zero solutions.

  7. \(A\) is an invertible matrix and \(\varphi\) is an isomorphism.

Exercise 4.43 (\(\bigstar\bigstar\)). Let \(\varphi:V\to V\) be a linear map on an \(n\)-dimensional vector space over a field \(k\). Define the polynomial \(p_\varphi(t)=\det(t1_{V}-\varphi)\).

  1. \(p_\varphi(t)\) is an invariant of \(\varphi\) under similarity, and its roots are exactly the eigenvalues of \(\varphi\). Therefore, it is called the characteristic polynomial of \(\varphi\).

  2. Prove that \(p_\varphi(t)\) is equal to the product of all invariant factors of \(\varphi\), i.e. the product of all elementary factors.

  3. Prove the Cayley-Hamilton theorem \(p_\varphi(\varphi)=0\).

Tensor product

Let \(M\) and \(N\) be a right \(A\)-module and a left \(A\)-module, respectively. We can define an abelian group \(M\otimes_A N\), where the elements are generated by symbols of the form \(m\otimes n\). The zero element is \(0\otimes 0\), and it satisfies the following three constraints:

  1. (Coefficient balance) \((ma)\otimes n = m\otimes (an)\),

  2. (Left linearity) \((m+m')\otimes n = m\otimes n + m'\otimes n\),

  3. (Right linearity) \(m\otimes (n+n') = m\otimes n+ m\otimes n'\).

That is, it is required that the symbol \(\otimes\) has bilinearity, and the elements in \(A\) can freely pass through this symbol. Strictly speaking, this definition means that \(M\otimes_A N\) is a free \(\mathbb{Z}\)-module generated by all symbols \(m\otimes n\), modulo all relations generated by elements such as \((ma)\otimes n - m\otimes (an), (m+m')\otimes n-m\otimes n-m'\otimes n\), etc. When we do not write out \(A\), either the context is clear about which base ring is taking the tensor product, or we are taking the tensor product over \(A=\mathbb{Z}\). Since any module can be viewed as a bi-module over \(\mathbb{Z}\), \(\otimes_\mathbb{Z}\) can always be defined.

Example 4.7. Let \(M=\mathbb{Z}/2\) and \(N=\mathbb{Z}/3\) be two \(\mathbb{Z}\)-modules. Consider \(T=M\otimes_\mathbb{Z}N=\mathbb{Z}/2 \otimes \mathbb{Z}/3\). This group is generated by all symbols of the form \(\overline{m}\otimes \overline{n}\), where \(\overline{m}\in \mathbb{Z}/2\) represents the residue class of the integer \(m\).

  1. When \(\overline{m}=\overline{0}\), we have \(\overline{0} = \overline{0}\cdot 0\), so \(\overline{0}\otimes \overline{n} = (\overline{0}\cdot 0)\otimes \overline{n} = \overline{0}\otimes (0\cdot \overline{n}) = \overline{0}\otimes \overline{0}=0\), which is the zero element in \(T\).

  2. Similarly, when \(\overline{n}=\overline{0}\), \(\overline{m}\otimes \overline{0} = 0\).

  3. When \(\overline{m}=\overline{1}\), we have \(\overline{1}=3\cdot \overline{1}\), so \(\overline{1}\otimes \overline{n} = (\overline{1}\cdot 3)\otimes \overline{n} = \overline{1}\otimes (3\cdot \overline{n}) = \overline{1}\otimes \overline{0} = 0\).

Therefore, in this example, the tensor product \((\mathbb{Z}/2)\otimes_\mathbb{Z}(\mathbb{Z}/3)=0\). In fact, for \((m,n)=1\) coprime, we have \((\mathbb{Z}/m)\otimes_\mathbb{Z}(\mathbb{Z}/n)=0\).
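More generally, the same computation shows that \((\mathbb{Z}/m)\otimes_\mathbb{Z}(\mathbb{Z}/n)\cong \mathbb{Z}/\gcd(m,n)\). Here is a sketch: writing \(d=\gcd(m,n)=am+bn\) by Bézout's identity, the generator \(\overline{1}\otimes\overline{1}\) satisfies \[d(\overline{1}\otimes\overline{1}) = am(\overline{1}\otimes\overline{1})+bn(\overline{1}\otimes\overline{1}) = (\overline{am}\otimes\overline{1})+(\overline{1}\otimes\overline{bn}) = 0,\] so the tensor product is a quotient of \(\mathbb{Z}/d\); conversely, the bilinear map \((\overline{x},\overline{y})\mapsto \overline{xy}\in\mathbb{Z}/d\) is well-defined (since \(d\) divides both \(m\) and \(n\)) and hits a generator, so by the universal property below, the tensor product is exactly \(\mathbb{Z}/d\).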

Exercise 4.44. Observe the above example and answer: Is \(m\otimes n=0\) equivalent to at least one of \(m,n\) being \(0\) in \(M\otimes_A N\)?

Although in general, the \(M\otimes_A N\) we define is just an Abelian group, similar to \({\rm Hom}\), when either \(M\) or \(N\) has a bimodule structure, we can give \(M\otimes_A N\) a module structure. Let’s give an example of how \({}_SM_R\otimes_R {}_RN\) has a left \(S\)-module structure. This is because we can define, for \(s\in S\), \[s(m\otimes n):= (sm)\otimes n.\]

Similarly, there are four possible cases, which we list in the following table:

| Group | Natural module structure | Definition of \(S\)-module structure |
|---|---|---|
| \(M_R \otimes {}_RN\) | \(\mathbb{Z}\)-module | none |
| \({}_SM_R \otimes {}_RN\) | Left \(S\)-module | \(s(m\otimes n):=(sm)\otimes n\) |
| \({}M_{R}\otimes {}_{S,R}N\) | Left \(S\)-module | \(s(m\otimes n):=m\otimes (sn)\) |
| \({}M_{S,R} \otimes {}_RN\) | Right \(S\)-module | \((m\otimes n)s:=(ms)\otimes n\) |
| \({}M_{R}\otimes {}_{R}N_S\) | Right \(S\)-module | \((m\otimes n)s:=m\otimes (ns)\) |

Exercise 4.45. Confirm that you understand all the situations in the table.

Example 4.8. Let \(M\) be a left \(A\)-module, and let \(\rho: A \to B\) be a coefficient extension homomorphism. Here, you can think of \(B\) as a ring with larger coefficients, such as \(\mathbb{Q}\to \mathbb{C}\). Then, since \(B\) can be viewed as a left \(B\)-right \(A\)-module, we can use tensor product to extend the \(A\)-coefficients of \(M\) to \(B\)-coefficients, resulting in a left \(B\)-module: \[N = B \otimes_A M\] The balancing relation can be understood as \(b \otimes (am) = (b\rho(a)) \otimes m\), since the right \(A\)-module structure of \(B\) is given by \(\rho\).

Theorem 4.10 (Universal Property of Tensor Product).

  1. Let \(M\) and \(N\) be two \(\mathbb{Z}\)-modules. Then any bilinear map \(B:M\times N\to P\) can be decomposed as the composition of maps \[\begin{array}{ccc} M\times N & \xrightarrow{\ \varphi\ } & M\otimes N\\ & {\scriptstyle B}\searrow & \downarrow{\scriptstyle B'}\\ & & P \end{array}\] where \(\varphi(m,n)=m\otimes n\) is a canonical map that depends only on \(M\) and \(N\), and \(B':M\otimes N\to P\) is the unique map that makes the diagram commute.

  2. Let \(M\) and \(N\) be two modules with right and left \(A\)-module structures, respectively. Then similarly, any \(\mathbb{Z}\)-bilinear map \(B:M\times N\to P\) that is \(A\)-balanced can be uniquely decomposed as \(B'\circ \varphi: M\times N\to M\otimes_A N\to P\).

Proof. According to the definition, \(M\otimes N=F/I\), where \[F=\bigoplus_{(m,n)\in M\times N}\mathbb{Z}(m\otimes n)\] is the free module generated by all abstract symbols \(m\otimes n\), and \(I\) is the submodule generated by all bilinear relations, \[I=\langle (m+m')\otimes n - m\otimes n - m'\otimes n,\ m\otimes (n+n')-m\otimes n - m\otimes n'\rangle_{m,n}.\] Given a bilinear map \(B:M\times N\to P\), we can define a \(\mathbb{Z}\)-linear map \[f\in {\rm Hom}_\mathbb{Z}(F, P) = \prod_{M\times N} {\rm Hom}(\mathbb{Z}(m\otimes n), P)\] satisfying \(m\otimes n \mapsto B(m,n)\), uniquely; its existence is guaranteed by the property of free modules. Since \(B\) is bilinear, it is clear that \(f(I)=0\) (because \(f\) vanishes on all generators of \(I\), we have \(I\subset \ker f\)), so we obtain a quotient map \(\overline{f}:F/I\to P: a+I\mapsto f(a)\), which is the \(B':M\otimes N\to P\) we need. The second part of the theorem is similar, requiring only minor modifications. ◻

Corollary 4.3. \[{\rm Hom}_\mathbb{Z}(M\otimes_A N,P) = {\rm Hom}_{A}(M, {\rm Hom}_{\mathbb{Z}}(N, P))\] Here, the left side corresponds to \(B'\) in the theorem, and the right side corresponds to the \(A\)-balanced bilinear map \(B:M\times N\to P\); the right \(A\)-module structure on \({\rm Hom}_\mathbb{Z}(N,P)\) is given by \((f\cdot a)(n):=f(an)\).

Lemma 4.3. If \(V,W\) are \(k\)-vector spaces, then in \(V\otimes_k W\), \(v\otimes w=0\Leftrightarrow v=0\) or \(w=0\).

Proof. \((\Leftarrow)\) is obvious. For \((\Rightarrow)\), if \(v,w\) are both non-zero, we can construct \(k\)-linear functions \(f:V\to k,g:W\to k\) such that \(f(v),g(w)\neq 0\). Then it is easy to verify that \(B(x,y):=f(x)g(y)\) is a \(k\)-balanced bilinear map, so it can be decomposed as \(B=B'\circ \varphi: V\times W\to V\otimes_k W\to k\). But \(B(v,w)=f(v)g(w)\) is not \(0\), so \(\varphi(v,w)=v\otimes w\neq 0\). ◻

Exercise 4.46. Generalize this lemma to the case where \(M,N\) are right and left \(A\)-modules, respectively. Here we need to add an appropriate assumption: for every non-zero element of \(M\) (and of \(N\)), there is an \(A\)-linear function that does not vanish on it.

Example 4.9. Let \(k\) be a field, \(V\) and \(W\) be two \(k\)-vector spaces. Obviously, they are two bi-\(k\)-modules, so we can define a \(k\)-linear space \(V\otimes_k W\). If \(V\) and \(W\) are generated by bases \(\{e_i\}\) and \(\{f_j\}\) respectively, then \(V\otimes_k W\) is generated by the basis \(e_i\otimes f_j\). It is obvious that this basis can generate the entire \(V\otimes W\), the difficulty lies in showing that \(\{e_i\otimes f_j\}\subset V\otimes W\) is linearly independent. Suppose there is a finite sum \[x=\sum a_{ij} e_i\otimes f_j =0\] This implies that for any \(k\)-balanced bilinear map \(B:V\times W\to k\), we have \(B'(x)=0\). Since \((v,w)\mapsto e_i^*(v)f_j^*(w)\) is obviously a \(k\)-balanced bilinear map, where \(e_i^*\) represents the unique linear function on \(V\) that takes the value \(1\) on the basis \(e_i\) and \(0\) on all other bases, applying this function to \(x\) gives us \(a_{ij}=0\). Since \(i\) and \(j\) are arbitrary, this shows that all coefficients of \(x\) are \(0\).

Exercise 4.47 (\(\bigstar\)).

  1. Can a natural linear mapping \(f\otimes g:V'\otimes W'\to V\otimes W\) be defined if there are two linear mappings \(f:V'\to V\) and \(g:W'\to W\)?

  2. Let \(V',V,W',W\) have bases \(e_i',e_i,f_j',f_j\) respectively, and let \(f,g\) have matrix representations \(A=(a_{ij}),B=(b_{ij})\) with respect to the bases. Find the matrix representation \(A\otimes B\) of \(f\otimes g\) with respect to the natural bases \(\{e_i'\otimes f_j'\},\{e_i\otimes f_j\}\) of \(V'\otimes W'\) and \(V\otimes W\) respectively. You should be able to obtain the following form by arranging the bases in the appropriate order: \[\left(\begin{array}{ccc} a_{11}B & \dots & a_{1m'}B \\ \vdots & \vdots & \vdots \\ a_{m1}B & \dots & a_{mm'}B \\ \end{array}\right).\] If you change the order of the bases, you can obtain the following matrix form: \[\left(\begin{array}{ccc} b_{11}A & \dots & b_{1n'}A \\ \vdots & \vdots & \vdots \\ b_{n1}A & \dots & b_{nn'}A \\ \end{array}\right).\] As a corollary, these two matrices must be similar. Usually, we define the former as the matrix \(A\otimes B\).
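The block form in part 2 is exactly the Kronecker product; a small sketch in Python (function name mine) that produces it:

```python
# The Kronecker product A (x) B in the block form shown above.

def kron(A, B):
    m, n = len(A), len(A[0])
    p, q = len(B), len(B[0])
    return [[A[i][j] * B[k][l] for j in range(n) for l in range(q)]
            for i in range(m) for k in range(p)]

A = [[1, 2]]               # 1 x 2
B = [[0, 1],
     [1, 0]]               # 2 x 2
for row in kron(A, B):
    print(row)
# [0, 1, 0, 2]
# [1, 0, 2, 0]
```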

Dual Space

For an \(A\)-module \(M\), we can define a dual module \(M^* := {\rm Hom}_A(M,A)\). Its module structure is given by the bi-module structure of \(A\), i.e. if \(M\) is a left \(A\)-module, then \(M^*\) is a right \(A\)-module, and vice versa. In the case where \(A\) is a commutative ring, or a field, we do not need to distinguish between left and right modules, and \(M^*\) is naturally an \(A\)-module.

In the simple case of a \(k\)-vector space \(V\), \(V^*={\rm Hom}_k(V,k)\) is called the dual space of \(V\). It can be understood as the vector space of all linear functions \(f:V\to k\) on \(V\). The interesting thing about the dual space is that it is a contravariant functor.

Theorem 4.11. The dual functor \(M\mapsto M^*\) is a contravariant functor from left \(A\)-modules to right \(A\)-modules, meaning it maps a left \(A\)-module to a right \(A\)-module \(M^*\), and maps a left \(A\)-module homomorphism \(f:M\to N\) to a right \(A\)-module homomorphism \(f^*:M^*\xleftarrow{}N^*\). This correspondence satisfies the law of composition, meaning that the composition \[L\xrightarrow{g} M \xrightarrow{f} N = L\xrightarrow{f\circ g} N\] is preserved by the dual functor (but because it is a contravariant functor, the direction of composition is reversed): \[L^*\xleftarrow{g^*} M^* \xleftarrow{f^*} N^* = L^*\xleftarrow{(f\circ g)^*=g^*\circ f^*} N^*\]

Proof. In fact, this is a special case of the property of the Hom functor \((-)^* = {\rm Hom}_A(-,A)\) that we discussed earlier. Recall that it maps \(f:M\to N\) to \(f^*:M^*\xleftarrow{\circ f} N^*\), i.e. it pulls back the linear function \(h:N\to A\) on \(N\) to the linear function \(h\circ f: M\to N\to A\) on \(M\) through \(f:M\to N\). ◻
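As a toy illustration of this contravariance, one can model maps as plain Python functions; `dual` below is only a sketch of the pullback \(h\mapsto h\circ f\), not actual module arithmetic.

```python
def dual(f):
    """Pull back a functional h along f, i.e. send h to h . f."""
    return lambda h: (lambda x: h(f(x)))

g = lambda x: x + 1          # g : L -> M (toy maps on integers)
f = lambda x: 2 * x          # f : M -> N
h = lambda x: 3 * x          # h : a "linear function" on N

fg = lambda x: f(g(x))       # the composition f . g : L -> N

# (f . g)^* = g^* . f^* : both sides send h to h . f . g.
lhs = dual(fg)(h)
rhs = dual(g)(dual(f)(h))
assert all(lhs(x) == rhs(x) for x in range(10))
```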

Dual basis

If \(M\) has a basis, i.e. \(M\cong A^{(I)}\) is a free module, then its dual module is \({\rm Hom}_A(A^{(I)},A)={\rm Hom}_A(A,A)^I=A^I\). The latter is usually a larger module than \(A^{(I)}\): for example, the dual of the polynomial space \(k^{(\mathbb{N})}\cong k[x]\) is the space of formal power series \(k^{\mathbb{N}}\cong k[[x]]\), pairing coefficients term by term. But if \(I\) is finite, then \(A^{(I)}\) and \(A^I\) are isomorphic. In the case of vector spaces, this shows that the dual space \(V^*\) of a finite-dimensional vector space \(V\) has the same dimension as \(V\). However, this isomorphism is not canonical: it cannot be given without specifying a basis, it depends on the choice of basis, and it is not a natural isomorphism. Nevertheless, we can still discuss the concept of a dual basis corresponding to a given basis.

Let \(V\) have a basis \(\{b_1,\dots,b_n\}\). This basis is equivalent to an isomorphism \(\varphi:k^n\to V\) mapping the standard basis \(\{e_i\}\) of \(k^n\) to \(\varphi(e_i) = b_i\). Applying the dual functor to this mapping, we obtain \[k^n \cong (k^n)^*\xleftarrow{\varphi^*} V^*.\] By functoriality, \(\varphi^*\) is also an isomorphism. Reversing its direction, we get an isomorphism \(k^n\cong (k^n)^*\xrightarrow{(\varphi^*)^{-1}} V^*\); the image of the standard basis under it is a basis of \(V^*\), called the dual basis of \(\{b_i\}\) and denoted \(\{b_i^*\}\).

Specifically, \(b_i^*\) is defined so that its image under \(\varphi^*\) is \(e_i^*\), the function on \(k^n\) that takes the \(i\)th coordinate. That is, \[b_i^*\circ \varphi = e_i^*.\] Substituting \(e_j\) gives \[b_i^*(b_j) = e_i^*(e_j) = \delta_{ij}.\] Here, \(\delta_{ij}=1_{i=j}\) is the symbol that equals \(1\) when \(i=j\) and \(0\) otherwise. This last line is also commonly used as an equivalent definition of the dual basis: prescribing the values \(b_i^*(b_j)=\delta_{ij}\) determines \(b_i^*\) by linear extension, which recovers \(b_i^*\circ \varphi = e_i^*\).
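Concretely, if we realize \(V\) as \(k^n\) and record the basis vectors \(b_i\) as the columns of an invertible matrix \(M\), then the dual basis vectors are the rows of \(M^{-1}\), since \(M^{-1}M=I\) says exactly \(b_i^*(b_j)=\delta_{ij}\). A minimal sketch (the particular matrix is an arbitrary choice):

```python
import numpy as np

# Basis b_1, b_2, b_3 of R^3, stored as the columns of M.
M = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [0., 0., 1.]])

# Writing functionals as row vectors, b_i^* is the i-th row of M^{-1}.
Mstar = np.linalg.inv(M)

# Row i of M^{-1} applied to column j of M gives delta_ij.
assert np.allclose(Mstar @ M, np.eye(3))
```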

Matrix Representation of Dual Linear Mapping: Transpose

Let \(f:V\to W\) be a linear mapping between finite-dimensional vector spaces. After choosing bases \(v_i\) and \(w_i\) for \(V\) and \(W\) respectively, we can represent \(f\) as a matrix \(A=(a_{ij})\) such that \(f(v_j)=\sum_i a_{ij}w_i\). Then what is the matrix representation \(B=(b_{ij})\) of the dual mapping \(f^*:V^*\xleftarrow{} W^*\) under the dual bases \(v_i^*\) and \(w_i^*\)? We can calculate \[\sum_{i'} b_{i'j} v_{i'}^* = f^*(w_j^*) = w_j^*\circ f,\] and substituting \(v_i\) gives \[b_{ij} = w_j^*(f(v_i)) = w_j^*\left(\sum_{j'} a_{j'i}w_{j'}\right) = a_{ji}.\] This matrix is obtained by reflecting the entries of \(A\) across the main diagonal, i.e. rows become columns and columns become rows. This operation is called the transpose of \(A\), denoted \(A^T\). Therefore, we have the following corollary:

Theorem 4.12. The matrix representation of the dual mapping under the dual bases is the transpose of the matrix representation of the original mapping under the original bases.

Corollary 4.4. Transposition reverses the direction of matrix multiplication, i.e. \((AB)^T = B^TA^T\).

It is worth noting that the above reasoning does not require \(V\) and \(W\) to have the same dimension, and the transpose of a matrix can be applied to matrices of any shape.
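A quick numerical check of the corollary; the shapes are deliberately rectangular to reflect the remark above (the specific sizes are arbitrary):

```python
import numpy as np
rng = np.random.default_rng(1)

A = rng.standard_normal((3, 2))   # f : V -> W with dim V = 2, dim W = 3
B = rng.standard_normal((2, 4))   # g : U -> V with dim U = 4

# Transposition reverses composition: (AB)^T = B^T A^T.
assert np.allclose((A @ B).T, B.T @ A.T)
```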

Exercise 4.48 (\(\bigstar\)). Let \(V\) be a finite-dimensional vector space. Discuss how the dual basis changes when the basis changes. That is, if we have a basis \(\{\alpha_i\}\) and a new basis \(\{\beta_i\}\) given by a coordinate transformation \(\beta_i=\sum_{j} a_{ji} \alpha_j\), where \(A=(a_{ij})\) is an invertible matrix, how can the corresponding dual basis \(\beta_i^*\) be expressed in terms of the \(\alpha_i^*\)?
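If you want to verify your answer numerically, here is a possible harness. It computes the change-of-basis matrix for the dual bases without stating the closed form; taking \(\alpha_i\) to be the standard basis is a simplifying assumption.

```python
import numpy as np
rng = np.random.default_rng(2)

n = 3
A = rng.standard_normal((n, n))   # invertible with probability 1
Alpha = np.eye(n)                 # take alpha_i = standard basis
Beta = Alpha @ A                  # columns: beta_i = sum_j a_ji alpha_j

# Dual bases as rows of inverse matrices (see the sketch above).
Alpha_star = np.linalg.inv(Alpha)
Beta_star = np.linalg.inv(Beta)

# Solve beta_i^* = sum_j c_ji alpha_j^* for C, then compare C with
# your answer to the exercise.
C = np.linalg.solve(Alpha_star.T, Beta_star.T)
print(C)
```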

Double Dual

Although \(V\) and \(V^*\) are isomorphic but not naturally so when the vector space is finite-dimensional, it is remarkable that \(V\) and \(V^{**}\) are naturally isomorphic.

Theorem 4.13. Let \(V\) be a finite-dimensional vector space. Then the map \(\mathrm{ev}: V\to V^{**}: v\mapsto \mathrm{ev}_v\) is a natural isomorphism that does not depend on the choice of basis. Here, \(\mathrm{ev}_v\) represents the linear map that maps a function \(f\) in \(V^*\) to \(\mathrm{ev}_v(f)=f(v)\in k\), i.e. the value of \(f\) at \(v\).

Proof. It is easy to see that \(\mathrm{ev}_v:V^*\to k\) is an element of \(V^{**}=(V^*)^*\). We will prove that \(\mathrm{ev}\) is an isomorphism. Firstly, \(\mathrm{ev}\) is a linear mapping, since each \(f\in V^*\) is linear: \(\mathrm{ev}_{v+\lambda w}(f)=f(v+\lambda w)=f(v)+\lambda f(w)\). It then only remains to show that \(\mathrm{ev}\) is an injection, because in the finite-dimensional case we know that \(\dim V=\dim V^*= \dim V^{**}\).

If \(v\neq 0\), we can always find a linear function \(f:V\to k\) such that \(f(v)\neq 0\) (for example, by extending \(v\) to a basis and then taking the dual basis \(f=v^*\)). In this case, \(\mathrm{ev}_v(f) = f(v)\neq 0\), so as an element of \(V^{**}\), \(\mathrm{ev}_v\neq 0\). This proves that \(\mathrm{ev}\) is an injection, and thus proves the proposition. ◻
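In coordinates, the evaluation map is easy to see explicitly. A toy model, identifying \(V\) with \(k^3\) and functionals with row vectors (this identification is an assumption of the sketch):

```python
import numpy as np

# Model V = R^3; a functional f in V^* is a row vector with f(v) = f @ v.
v = np.array([1., 2., 3.])

def ev(v):
    """The element ev_v of V**: it takes a functional f to f(v)."""
    return lambda f: f @ v

# Evaluating ev_v on the dual basis e_i^* recovers the coordinates of v,
# so under the double-dual basis, ev_v has the same coordinates as v.
dual_basis = np.eye(3)   # rows are e_1^*, e_2^*, e_3^*
coords = np.array([ev(v)(f) for f in dual_basis])
assert np.allclose(coords, v)
```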

If \(V\) is not finite-dimensional, \(V^{**}\) is usually much larger than \(V\), but we still have the natural monomorphism \(\mathrm{ev}:V\to V^{**}\) described above.

Field Extensions and Galois Theory


  1. Most books only write this definition from the beginning, which can be difficult for beginners. They may think that a group is just a set with an operation that satisfies some definition without knowing why. It takes them a long time to realize that groups actually represent symmetries.↩︎

  2. It is also denoted as \(C_n\) or \(\mathbb{Z}/n\mathbb{Z}\).↩︎

  3. We will mainly discuss finite groups.↩︎

  4. Strictly speaking, \(D_3\) is isomorphic to a subgroup of \(D_6\).↩︎

  5. A relation \(\sim\) is called an equivalence relation if it satisfies reflexivity \(a \sim a\), symmetry \(a \sim b \Leftrightarrow b \sim a\), and transitivity \(a \sim b, b \sim c \Rightarrow a \sim c\).↩︎

  6. Right cosets can also be defined. However, for convenience, this book uses left cosets.↩︎

  7. i.e. elements that commute with all other elements.↩︎

  8. Some books also use \(S_X\).↩︎

  9. It is said that Frobenius discovered this result before Burnside.↩︎

  10. For \(S\subset G\), we use the notation \(C_G(S)\) to represent \(\{g\in G|gs=sg,\forall s\in S\}\), which is called the centralizer of \(S\).↩︎

  11. We can also define the right regular representation \[\begin{aligned} \alpha: G &\to {\rm Sym}(G)\\ g &\mapsto (a\mapsto ag^{-1}) \end{aligned}\]↩︎

  12. It is called induced representation because it is actually \({\rm Ind}_H^G V_0=\bigoplus_{g\in G/H}gV_0\), where \(V_0\) is the trivial one-dimensional representation of \(H\).↩︎