Joins
Joins types
Inner join
Inner join will combined tuples from two relations that satisfy the join predicate and return a relation where each tuple is a concatenation of the two tuples that satisfy the join predicate
ON condition
We can use ON for a general predicate over the relations being joined. EG:
select *
from student join classes_taken on student_id = taken_id
This on keyword specifies that a student matches with class if their id match. It is very similar to using natural join. But using on will list the id twice. Using natural join will only have id attribute once/
Difference between WHERE and ON
Even though using WHERE and JOIN are effectively the same for inner join. But where clause will let the inner join execute first then filter. But ON will first filter out.
Recall, outer join adds null-padded tuples only for tuples that don’t match.(or don’t contribute to the inner join). The ON condition is part of the outer join specification.
select *
from student left outer join takes on student.ID = takes.ID
Assume we have a student name kd with ID 1001. But there is no class with ID 1001.
Then using the query above, since there is no match, we will a tuple with attribute
(kd, 1001,..,...,..., NULL,NULL,NULL). Where all the NULL fill the parts for takes's attribute.
But if we move the predicates to where:
```SQL
select *
from student left outer join takes on true
where student.id = takes.id
the on true will be evaluated as true for every pair of tuples from student and takes. Therefore we will have a Cartesian product of the two relations because the join predicate is all true. Since there is no class with ID 1001, every time a tuple with name “kd” appear in the result of the outer join, the ID of student.id and takes.id must be different. So the where clause will eliminate all tuples with name “kd”.
outer join
Outer join is a join the preserve those tuples that don’t get matched by creating tuples in the result containing null values. EG: If we want to create a table of students with their courses. But some students might not have taken any classes. We need to use outer join.
left outer join
Preserve tuples on in the relation named before the OUTER JOIN clause
right outer join
Preserve tuples in the relation named after the OUTER JOIN clause
Full outer join
Preserve tuples in both direction.
Primary key of outer join.
Join conditions
Natural
On
using(A1…An)*
View
Why view
Not desiriable for all users to see the entire logical model.
Security
Virtual relation
In SQL, we can define a virtual relation by a query and the relation contains the result of the query. It is usually not stored. Any such relation that is not part of the logical model is called a view.
Defn
EG
create view *v* as <query expression>;
We can also specify the name of the attributes explicitly
create dept_avg_gpa(dept_name, avg_gpa) as
select dept_name, avg(gpa)
from student
group by dept_name;
storing the views(most of the time) or store it
Why?
Because views are result of a query, so if the underling relations change, the views stored might be out of views.
Materialized views
We can store it and update it when the underlying relations change.
Benefit
If we need fast response. Or If we need to repeatedly do large aggregates.
Update views
Rule of Thumbs: Don’t do it since you might have to alter the underlying relations.
EG:
“`create view instructor_info as
select ID, name, building
from instrctor, department
where instructor.dept_name = department.dept_name
Then
```SQL
insert into instructor_info
values('1001', 'kd', 'Taylor building')
But what if there is no tuples in instructor with id 1001 and name ‘kd’. and there is no building called Taylor. If we just naively insert (1001, ‘kd’, NULL, NULL…) into instructor relation and (Taylor building, NULL…) into department relation, this is no good. Becuase the instroctor_info view still won’t have the desired tuple since dept_name for both newly created tuples are NULL.
When can you do updatable views.
- from clauses only have one relation.
- select contains only attribute names. No aggregate or distinct.
Transactions
A transaction consists of a sequence of query and or update statements.
Two outcomes
Commit work
updates performed become permanent in the DB
Rollback work
all the updates performed SQL are undone. DB is restored to what it was before the transaction.
Atomic
Transactions are Atomic because of rollback. It is indivisible. Transaction is either done or not done. No half done.
How big is the transaction?
By default, each query is a transaction and gets committed each time it is finished.
We can use begin atomic…end to group multiple SQL statements into one trasaction.