A DB is mapped into files maintained by OS.
Layers of a file
File
File is organized logically as a sequence of records.
Blocks
Each file is logically partitioned into fixed_length storage units called blocks. Usually of size 4 to 8 kb.
records
A block might contain several records.
Assumptions about records
- No record is larger than a block. If the record has variable length objects, we use pointers.
- each record is contained in only one block. It is good for speed up accessing
Records
Fixed length record
type person = record
name varchar(10);
id = numeric(8,2);
Assume the above record takes 20 bytes.
How many records in a block
We use block size // record size(int division) as the number of records per block. Some space might left unused.
Delete
When deleting a record, we could move all records after it forward but too much movement.
Move the last record to the place of the deleted record.
Leave open and wait for subsequent insertion using free list
We need to record where each empty record is. We do so by adding a header in front of the block that records the first deleted record(at the beginning, point it to NULL). After deleting the first record, we point the pointer in the header to the beginning of the deleted block. If we delete again, we will store the location of the second deleted record in the first deleted record.
Upon insertion, we will insert the new record into the place pointed to by the header pointer, and update the header pointer to the next available free location (that was stored in the location of the first deleted record).
The deleted records are like a linked list. It is called free list.
Variable length records
Where do they come from
- Storing multiple record type in a file
- record has variable length attributes
Representation of variable length record
Usually has two parts. a fixed length attributes, then followed by variable-length attributes.
The last three fields are variable field. Before the variable length field, The beginning of the record will have information about each attribute’s length and beginning through (offset, length) pairs, no matter variable of fixed length.
In the middle, we have null bitmap, indicating which attr have a null value. In some DB, null bitmap is stored at the beginning and the null attributes will have no data at all, saving space.
Storing varlen record in a block using slotted page structure
At the beginning of each block there is a header that has
- Num of record
- end of free space
- an array that has location and size of each record
The records are allocated continuously in the block starting from the end of the block. The free space in the block is contiguous, between the header and the first record.
Insert record in slotted page structure
Insert at the end of the free space, so the records are still contiguous.
Delete
We set the entry in the array to deleted(for example, set the size to -1). Then we need to occupy the deleted space with other record to maintain the contiguous free space between header and the records.
How to organize records in a file
Heap file organization
Any record can be placed anywhere in the file. No ordering. Typically one file only contains one relation.
Sequential file organization
Order the records based on search key. We also store records physically in search-key order(or as close to it). Insertion creates problem: If we can’t find a free block, we insert the record in a overflow block and adjust the pointer of the previous block to point to this record.
Might need to reorder the records.
Hash
Use hash function computed on the record to locate the block it resides
What large database does with file
Large db usually don’t use OS file managements. It just ask the OS to give it a big file and put all records in it.
Multi table clustering file system
If we store multiple relations in a single block, it might be advantageous for some join query.
select dept_name, student_name, dept_location
from dept natural join student
For each tuple of department, db must locate the student tuple with the same department name and transfer it to the main memory. Worst case, each record will be in a different block we have to do one block read for each record. But if we store related records in each block, we can use one block read to satisfy join.
In the previous example, Assume department looks like
dept_name, dept_loc, dept_chair
CS Klaus Buzz
MUSIC NYC Shostakovitch
and the student looks like
id, name, dept_name, year
1 kd CS 2020
Then we would store these two relation in a multitable like:
cs | klaus | Buzz
1 kd | 2020
2 ted | 2021
music| nyc | Shostakovitch
33 | tom | 2019
Thus a join will require fewer block seek.
Buffer manager
Since we can only alter data in memory, we need to drag pages to from disk to main memory. Each page being dragged in will be stored in the buffer pool’s frame. When we need more space for more pages, we might evict some pages in the buffer pool based some predefined criteria. If the page is “dirty”, we also need to write the dirty page back to the disk.