In database management an aggregate function is a function where the values of multiple rows are grouped together to form a single value of more significant meaning or measurement such as a set, a bag or a list. Wikipedia
Common Aggregate Functions
count()returns total number of rows that match our search
sum()returns added sum of values for a group of rows
avg()returns calculated average value for a group of rows
max()returns largest value in a group of rows
min()returns smallest value in a group of rows
min()only work if the values are numbers (numeric values)
count()doesn’t count null values
count( * )will count all the rows, regardless of the null values
SELECT count( * ) FROM movies; # count all rows in movies database SELECT count( title ) FROM movies; # count all rows in title column, which are not `null` SELECT sum( cost ) FROM movies; # product cost for all movies SELECT avg( tickets ) FROM movies; # average ticket cost SELECT max( tickets ) FROM movies; # most popular movie, by determining the max amount of tickets sold SELECT min( tickets ) FROM movies; # least popular movie, by determining the max amount of tickets sold SELECT max( tickets ), min( tickets ) FROM movies; # multiple functions = multiple values
GROUP BYgroups results by column values
# SELECT ___, aggregateFunction( ___ ) FROM ___ GROUP BY ___ # SELECT columnName, aggregateFunction( columnName ) FROM tableName GROUP BY columnName SELECT genre, sum(cost) FROM movies GROUP BY genre # returns lists of all genres & sum total of costs for each of them
Grouping with conditions
HAVINGis the keyword for inclusion, show results only if they have met the condition
HAVNIGworks in conjunction with an aggregate function
# SELECT ___, aggregateFunction( ___ ) # FROM ___ # GROUP BY ___ # HAVING aggregate_functino(___) operator value; # SELECT columnName, aggregateFunction( columnName ) FROM tableName GROUP BY columnName HAVING aggregate_functino(columnName) operator value; SELECT genre, sum(cost) FROM movies GROUP BY genre HAVING count(*) > 1 # returns lists of all genres (grouped) & sum total of costs for each of them, only if there are more than one row
- You can add a
GROUP BYto further refine your query
Constraints are meant to avoid addition of bad or unwanted data. They provide additional validation
NOT NULLprevents NULL values
UNIQUEEnsures column values are unique (prevents duplicates)
PRIMARY KEYunique identifiers (both UNIQUE and NOT NULL). Can only be assigned once per table (i.e. you can not have more than one primary key column).
PRIMARY KEYcan only be defined on only one column (once per table) while
NOT NULLcan be used on multiple columns
Column constraints syntax
CREATE TABLE Promotions ( id int PRIMARY KEY, # name must be NOT NULL, must be UNIQUE name varchar(150) NOT NULL, # name can not be null category varchar(15) ); CREATE TABLE Promotions ( id int PRIMARY KEY, name varchar(150) NOT NULL UNIQUE, # name can't be null, or the same as another value already in column category varchar(15) );
ERROR: duplicate key value violates unique constraint "promotions_name_key" DETAIL: Key (name)=(Matinee) already exists.
Table constraints syntax
You can define your constraints (i.e. assign it a name). Makes it easy to edit them later. Also, you get the constraint name in the error message instead of the key value
CREATE TABLE Promotions ( id int PRIMARY KEY, name varchar(150) NOT NULL, category varchar(15) CONSTRAINT unique_name UNIQUE (name) # unique_name is what we assigned for our custom constraint name ); CREATE TABLE Promotions ( id int PRIMARY KEY, name varchar(150) NOT NULL, category varchar(15) CONSTRAINT unique_name UNIQUE (name, category) # constraint on multiple columns );
ERROR: duplicate key value violates unique constraint "unique_name" DETAIL: Key (name)=(Matinee) already exists.
With value constraints you create references between two tables.
Naming convention for these is to singularize the table you are referencing, followed by an underscore and the column name (e.g.
- A foreign key is a column in one table that references the primary key column of another table
movie_idcolumn is a foreign key referencing the
idprimary key column in the
# find out movie id SELECT id FROM movies WHERE title = "Gone With The Wind"; # assume it returns 2 # use that movie id to find out promotions SELECT name, category FROM Promotions WHERE movie_id = 2;
- To avoid overriding referenced columns (e.g. adding values to a foreign key column will add the value, but it wouldn’t give you any referenced data because there is no reference, you just added the value directly in this table, not the referenced one), you use a
- You can only reference (i.e. create a constraint) an existing table
movie_id int REFERENCES movies(id)
# table being referenced must be created first (should be existing) CREATE TABLE Movies ( id int PRIMARY KEY, title varchar(120) NOT NULL UNIQUE ); CREATE TABLE Promotions ( id int PRIMARY KEY, movie_id int REFERENCES movies(id), name varchar(50), category varchar(15) ); CREATE TABLE Promotions # same as above ( id int PRIMARY KEY, movie_id int REFERENCES movies, # primary key column is used by default name varchar(50), category varchar(15) ); CREATE TABLE Promotions # same as above, different syntax ( id int PRIMARY KEY, movie_id int, name varchar(50), category varchar(15), FOREIGN KEY (movie_id) REFERENCES movies );
Once you have a foreign key constraint in place, it’ll give you an error if you try to reference a foreign key value that doesn’t exist
INSERT INTO Promotions (id, movie_id, name, category) VALUES (4, 999, 'Fake Promotion, 'Hoax');
ERROR: insert or update on table "promotions" violates foreign key constraint "promotions_movie_id fkey" DETAIL: Key (movie_id)=(999) is not present in table "movies".
(Child) records with a foreign key reference to a (parent) record that has been deleted
- Orphan records are records with foreign key references where the foreign records have since been deleted. For example, you were referencing
movie_id11 in your Promotions table, but the movie has been deleted.
- This could happen when you delete the referenced row or just drop the referenced table
- Orphan records = bad data
- The solution is a foreign key constraint
DELETE FROM Movies WHERE id = 6;
ERROR: update or delete on table "movies" violates foreign key constraint "promotions_movie_id_fkey" on table "promotions" DETAIL: Key (id)=(6) is still being referenced from table "promotions".
DROP TABLE Movies;
ERROR: can not drop table movies because other objects depend on it DETAIL: constraint promotions_movie_id_fkey on table promotions depends on table movies
If you delete both the Movie and Promotions, you’ll get no errors. Basically, you’ll delete all referenced/associated records first and then delete the thing you wanted to delete
DELETE FROM Promotions WHERE movie_id = 6; DELETE FROM Movies WHERE id = 6;
Same goes for dropping tables. First, drop any tables that are making references to the table you want to delete, and then delete the actual table
DROP TABLE Promotions; DROP TABLE Movies;
Validating input values
For example, let’s say you want to make sure people are not entering negative values
CREATE TABLE Movies ( id int PRIMARY KEY, title varchar(120) NOT NULL UNIQUE, genre varchar(120), duration int CHECK (duration > 0) # validate input for this column );
INSERT INTO Movies (id, title, genre, duration) VALUES (7, 'Intouchables', 'Drama', -10);
ERROR: new row for relation "movies" violates check constraint "movies_duration_check" DETAIL: Failing row contains (7, 'Intouchables', 'Drama', -10 )