Mathematically, a relation is just a subset of certain product spaces, like \(R\subset A\times B\times C\). Its elements are of the form of tuples \((a, b, c)\).
The format for creating a table like this in SQL is simple:
CREATE TABLE table_name (
column_a datatype_a,
column_b datatype_b,
column_c datatype_c,....
);
There are three core operations:
Selection: This is similar to projection to coordinates
\[ \prod_{\text{all}} T_i \xrightarrow{\text{select}} \prod_{\text{some}} T_i \]
Where: This is a filter operation
\[\prod_i T_i \xrightarrow{\text{conditions}} \prod_i T_i\]
Join: Join is similar to a fibred product in mathematics, but there are multiple different kinds of join operations. For example the inner join on \(R_{AB}\subset A\times B\) and \(R_{BC}\) is \(R_{AB}\times_B R_{AC}\), selecting all \((a,b,c)\) that satisfies \((a,b)\in R_{AB}\) and \((b,c)\in R_{BC}\).
\[ \prod_I T_i \times \prod_J T_j \xrightarrow{\text{join}} \prod_I T_i \times_{\prod_{K} T_k} \prod_{J} T_j , \quad K\subset I\cap J \]
SELECT a, b, c
FROM table1
INNER JOIN table2
ON table1.b = table2.b;
Left join: besides a normal join, it forces the left table to be included in the result, even if there is no match in the right table (by filling NULL values).
\[ \prod_I T_i \times \prod_J T_j \xrightarrow{\text{left join}} \prod_I T_i \times_{\prod_{K} T_k} \left(\text{Either }\prod_{J} T_j \text{ or NULL}\right) , \quad K\subset I\cap J \]
SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.column_name = table2.column_name;
Right join: similar to left join, but the right table is included in the result.
Full join: includes all rows from both tables, filling NULL values when necessary.
Cross join: returns the Cartesian product of the two tables.
SELECT column_name(s)
FROM table1
CROSS JOIN table2;
Basic types like INTEGER
, REAL
, TEXT
, BLOB
, NULL
, DATE
, BOOLEAN
, etc.
Constraints like NOT NULL
, UNIQUE
, PRIMARY KEY
, FOREIGN KEY
, CHECK
, DEFAULT
, etc.
Transactions: COMMIT
, ROLLBACK
, SAVEPOINT
, BEGIN TRANSACTION
, etc. (We can think of this as a kind of homotopy theory in mathematics.)
Domain Types: you can create your custom types that encapsulate certain constraints and properties.
Example:
CREATE TABLE Users
PRIMARY KEY
( user_id SERIAL UNIQUE NOT NULL
, username TEXT NOT NULL
, email TEXT INTEGER CHECK (age >= 18)
, age DATE DEFAULT CURRENT_DATE
, created_on );
here
SERIAL
is a shorthand for INTEGER PRIMARY KEY AUTOINCREMENT
.
PRIMARY KEY
is a constraint that uniquely identifies each record in a table, it must be injective.
UNIQUE
constraint ensures that all values in this column are different, ensuring injectivity.
NOT NULL
constraint enforces a column to not accept NULL values.
CHECK
constraint ensures that all values in this column satisfy certain conditions.
PostgreSQL has a extension system allowing users to augment the database with additional functionalities, this can be new types, new functions, new operators, index types, etc.