File organization in DB

A DB is mapped into files maintained by OS.

Layers of a file


File is organized logically as a sequence of records.


Each file is logically partitioned into fixed_length storage units called blocks. Usually of size 4 to 8 kb.


A block might contain several records.

Assumptions about records

  1. No record is larger than a block. If the record has variable length objects, we use pointers.
  2. each record is contained in only one block. It is good for speed up accessing


Fixed length record

type person = record
              name varchar(10);
              id = numeric(8,2);

Assume the above record takes 20 bytes.

How many records in a block

We use block size // record size(int division) as the number of records per block. Some space might left unused.


When deleting a record, we could move all records after it forward but too much movement.

Move the last record to the place of the deleted record.

Leave open and wait for subsequent insertion using free list

We need to record where each empty record is. We do so by adding a header in front of the block that records the first deleted record(at the beginning, point it to NULL). After deleting the first record, we point the pointer in the header to the beginning of the deleted block. If we delete again, we will store the location of the second deleted record in the first deleted record.

Upon insertion, we will insert the new record into the place pointed to by the header pointer, and update the header pointer to the next available free location (that was stored in the location of the first deleted record).

The deleted records are like a linked list. It is called free list.

Variable length records

Where do they come from

  1. Storing multiple record type in a file
  2. record has variable length attributes

Representation of variable length record

Usually has two parts. a fixed length attributes, then followed by variable-length attributes. var len representation

The last three fields are variable field. Before the variable length field, The beginning of the record will have information about each attribute’s length and beginning through (offset, length) pairs, no matter variable of fixed length.

In the middle, we have null bitmap, indicating which attr have a null value. In some DB, null bitmap is stored at the beginning and the null attributes will have no data at all, saving space.

Storing varlen record in a block using slotted page structure

At the beginning of each block there is a header that has

  1. Num of record
  2. end of free space
  3. an array that has location and size of each record

The records are allocated continuously in the block starting from the end of the block. The free space in the block is contiguous, between the header and the first record.

Insert record in slotted page structure

Insert at the end of the free space, so the records are still contiguous.


We set the entry in the array to deleted(for example, set the size to -1). Then we need to occupy the deleted space with other record to maintain the contiguous free space between header and the records. slotted page structure

How to organize records in a file

Heap file organization

Any record can be placed anywhere in the file. No ordering. Typically one file only contains one relation.

Sequential file organization

Order the records based on search key. We also store records physically in search-key order(or as close to it). Insertion creates problem: If we can’t find a free block, we insert the record in a overflow block and adjust the pointer of the previous block to point to this record.

Might need to reorder the records.


Use hash function computed on the record to locate the block it resides

What large database does with file

Large db usually don’t use OS file managements. It just ask the OS to give it a big file and put all records in it.

Multi table clustering file system

If we store multiple relations in a single block, it might be advantageous for some join query.

select dept_name, student_name, dept_location
from dept natural join student

For each tuple of department, db must locate the student tuple with the same department name and transfer it to the main memory. Worst case, each record will be in a different block we have to do one block read for each record. But if we store related records in each block, we can use one block read to satisfy join.

In the previous example, Assume department looks like

dept_name, dept_loc, dept_chair
CS         Klaus     Buzz       
MUSIC      NYC       Shostakovitch 

and the student looks like

id, name, dept_name, year
1    kd    CS        2020

Then we would store these two relation in a multitable like:

cs   |  klaus   | Buzz
1       kd      | 2020
2       ted     | 2021
music|  nyc     | Shostakovitch
33   |  tom     | 2019

Thus a join will require fewer block seek.

Buffer manager

Since we can only alter data in memory, we need to drag pages to from disk to main memory. Each page being dragged in will be stored in the buffer pool’s frame. When we need more space for more pages, we might evict some pages in the buffer pool based some predefined criteria. If the page is “dirty”, we also need to write the dirty page back to the disk.

Leave a Reply

Your email address will not be published. Required fields are marked *

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    Markdown is turned off in code blocks:
     [This is not a link](

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see