SQL cheetsheet

Relational Algebra

A procedural language that tells the machine “How” to get what we want

Types

Selection(select) Projection(get required column) Cartesian product(outer join) and others…

Relational Calculus

A non-procedural language that tells the machine “What” to get.

A basic SQL clause

SELECT, FROM WHERE

It usually looks like SELECT A1, A2…An FROM r1,…rm where P; SELECT is used to list desired attributes in the result A1…An are attributes FROM is a list of relations we need to access WHERE clause is a predicate(conditions) involving attributes of the relaion in the from clause.

SELECT doesn’t eliminate duplicates!

Does FROM internally calculates a Cartesian product?

for each tuple t1 in relation r1
    ...
       for each tuple tm in relation rm
            comcatanate t1,....tm into a single tuple t, 
            add t into the result relation

But we rarely used it. Instead we use where to restrict the combinations. If we have

SELECT a1, a2
FROM R1, R2
WHERE R1.a3 == R2.a3

We would only combine tuple in R1 with only those tuples in R2 that has the same a3. We only match tuples with the same a3.

Natural Join

Natural join will join two relations together. But will only concatenate tuples that have the same values for all attributes that appear in the schemas of both relations.

If R1 has a1, a2, a3 and R2 has a2, a3, a4. R1 NATURAL JOIN R2 will only concatenate tuple with identical a2 and a3. It will return relation with R3 = a1, a2, a3, a4.

Specifying what column should be equated

The above example shows that natural join could be error prone. To change it, use

R1 NATURAL JOIN R2 USING(a2)

This will only equate a2 as a condition for concatnating tuples in R1 and R2.

Rename

Reasons for rename

Identical attribute names from two relations

Arithmetic expression will result in attribute with no name

Just wanna change it

As clause

The syntax for as is old name as new name As can appear in both select ** and **from In select, we can rename attributes

SELECT name as person_name
...

We can also rename relations to simplify

SELECT A.name, B.id
FROM banana_whatever_is_good as A, no_apple_is_the_best as B
where A.id = B.id;
Very helpful to compare tuple in the same relation.
VERY USEFUL

EG: Find all student whose credit hours completed is at least greater than at least one senior at our college.
```SQL
SELECT DISTINCT S.name
FROM student as S, student as C
WHERE S.hour_complete > C.hour_complete and C.year = 'senior'

But C is not a copy of S. It is just an alisas.

String ops

Strings are in single quote.

Pattern matching

% for matching substring

hello%' will match any string that starts with hello%hello$’ will match anystring that has hello

_(underscore) for matching any character

‘ will match any string with exact 3 chars ‘%%’ will match any string at least three chars long.

Order by

the result tuples will be in sorted order. (may it be numerical, alphabetical…) Can also be performed on multiple attributes

`\* FROM student ORDER BY birthday desc, name asc; This is list all students info in descending birthday order. In case people have the same birthday, they will be ordered ascendingly based on name.

Set ops

Union

Combined tuples from two relations that have the same set of attributes

No duplicates

If want duplicate, you use UNION ALL

INTERSECT

Find all tuples that exist in both relations.

NO duplicates as well.

Except

Find all tuples that exist in first relation but not the second

Null

Arithmetic with null

For + – * /, if one of the operand is null, the value is null.

Comparison

If any comparison has one operand as NULL. Then the comparison is unknown. This creates a third logical value besides true and false.

In a where clause

true AND unknown = unknown. false AND unknown = false. true OR unknown = true, false OR unknown = unknown. NOT unknown = unknown

If in where a tuple is eval to false or unknown, it won’t be included.

Aggregate functions

They take a collection of values and return a single value. Such as AVG, MIN, MAX, SUM, COUNT

count unique student id. we need to use distinct

SELECT COUNT(DISTINCT id)
FROM STUDENT

Group by

WE want to aggregate by group of sets of tuple. Such as calculating average GPA for different class man in college. We use group by. The attribute given in group by clause are used to form groups. Tuples with the same value on all attributes in the GROUP BY clause are placed in one group.

SELECT class_name, avg(GPA) as avg_GPA FROM student GROUP BY class

The attributes in SELECT must appear in GROUP BY

Having

Having will state a condition on the group formed by GROUP BY. We used it after GROUP BY, when the groups have been formed. The predicate for HAVING is often an aggregate.

EG

select name, avg (GPA)
from student
group by department
having avg (GPA) > 3;

Nested query

in WHERE clause

A common use of subqueries is to perform tests for set membership, make set comparisons, and determine set cardinality, by nesting subqueries in the where clause.

IN operator

We can use IN to specify multiple value in WHERE clause. It is also a shorthand for multiple OR.

SELECT column_name(s)
FROM table_name
WHERE column_name IN (value1, value2, ...);
SELECT name
FROM student
WHERE name IN ('TOM', 'JOHN')

Find all the TOM and JOHN in the student body.

We can also put in subquery in IN.

SELECT column_name(s)
FROM table_name
WHERE column_name IN (SELECT STATEMENT);

If we want to find all the courses that are taught in 2009 and 2010. We can find it using intersect.

(SELECT course id
FROM section
WHERE semester = ’Fall’ AND year= 2009)
INTERSECT
(SELECT course id
FROM section
WHERE semester = ’Spring’ AND year= 2010);

(This example is from DB systems concept 6e)

We can also use IN

SELECT course id
FROM section 
WHERE year = 2009 and course_id in (SELECT course id FROM section
                                     WHERE year = 2019)

Set comparison

SOME

For a query like “Find the names of all instructors whose salary is greater than at least one instructor in the Biology department.” We can use alias to refer to the same relation.

select distinct T.name
from instructor as T, instructor as S
where T.salary > S.salary and S.dept name = ’Biology’;
select name
from instructor
where salary > some (select salary
                     from instructor
                     where dept name = ’Biology’);

The subquery will generate all the salary of biology professor. Then the >SOME comparison is TRUE if the salary of that professor is greater than at least one member of all the salary values.

ALL

select name
from instructor
where salary > all (select salary
               from instructor
               where dept name = ’Biology’);

This query will find instructor’s name whose salary is greater than all salary in the bio department.

consider the query “Find the departments that have the highest average salary.”

select dept_name 
from prof_list
group by dept_name
having avg(salary) >=all (select avg(salary) from prof_list group by dept_name)

We first write the sub-query to find the avg salary of all the department. Then we will find select all the group that has their average salary higher than all department’s salary average.

Subqueries in FROM

A select-from-where expression in SQL returns a relation as a result. So we can insert select-from-where(a subquery) anywhere a relation can appear.

In an previous example, we tried to find the departments whose student has average gpa higher than 3.6. We used having on the group generated by GROUP BY.

select dept_name, avg(GPA) as avg_gpa
from student_list
group by dept_name
having avg_gpa > 3.5

We can also express this with a sub-query in the from clause.

select dept_name, avg_gpa
from (select dept_name, avg(GPA) as avg_gpa
      from dept_name
      group by dept_name)
where avg_gpa > 3.6

In this query, the subquery first generate all dept with their average gpa.
Then we use a where clause to filter out needed tuples.
The having predicate is now in the where clause.

The attribute of the sub-query can be used in the outer query. As in the case for dept_name and avg_gpa.

This sub-query in from can be replaced by HAVING. But in some situations, HAVING can’t help. Notably aggregate functions like SUM that is not DESCRIPTIVE of the population.( I don’t even know how to describe it. Can someone help?)

EG: We want to find the max salary spending department.

select dept_name,max( total_salary)
from (select dept_name, sum(salary)
      from dept
      group by dept_name)


##With Clause
The with clause defines a temporary relation whose definition is available only to the query in which the WITH clause occurs. It gives a sub-query block a name.

An example from geeksforgeeks:
```SQL
WITH temporaryTable (averageValue) as
    (SELECT avg(Attr1)
    FROM Table),
    SELECT Attr1
    FROM Table
    WHERE Table.Attr1 > temporaryTable.averageValue;

Modification of DB

Delete

We can only delete tuple. We can not delete particular attributes.

delete from relation
where P

we will delete all tuples that satisfy the predicate.

We can make P the predicate very complex. Some example borrowed from the book:
delete all instructors who’s department is located in X building. instructor is stored in the instructor table and department is stored in the department table.

delete from instructor
where dept_name in (select dept_name
                    from department
                    where dept_loc = 'X')

delete can only delete one relation at a time.
But we can refer to the relation under deletion in the where clause’s. EG: delete all students with below-average gpa.

delete from student
where gpa < (select avg(gpa) 
             from student);

We will first calculate the avg gpa. Then do the comparisons and delete all tuples that fail the test.

Insert

Either specify the tuple to be inserted or write a query that return a set of tuples to be inserted.

simple insert

insert into course
values (’CS-437’, ’Database Systems’, ’Comp. Sci.’, 4);

query insert

EG: add all students who have higher than 3.5 GPA to honor group

insert into honor
       select ID, name, GPA
       from student
       where GPA > 3.5

The select statement gives a set of tuple. Then it is insert into the relation.

Update

Update some value in a tuple without changing all.

update student
set grade = grade * 1.5
where grade < 60;

Give students who need help a curve.

update student
set grade = grade * 1.5
where grade < (select avg(grade) from student);

We can also add cases for and if-elseif-else style update

update student
set grade = case
            when grade < 60 then grade*1.5
            when  ....      then ...
            else grade * 1.3
            end

Leave a Reply

Your email address will not be published. Required fields are marked *

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax