We want to locate where our records are. We might have to resort to index, just like when we are trying to find a key word, we go to the indice in the back of a book.
Indices are used to locate record; Attributes that are used to loop up are called search key. They don’t have to be Key or super key.
Ordered indices: it is based on sorted order of the records
operations and performance affected by choice of indices
We can have specific attribute look up or range look up
Ordered indices are indices that store indices in sorted order.
Record in the files might themselves be sorted in some order. A clustering index is an index whose search key also defines the sequential order of the file. They are also called primary index.
Indices order is different from the sequential order of the records. There is a second list that has pointers to the physical rows.
It is a search key value and pointer to one or **more than one ** records that has the specific search key value.
Index entry appears for every search key value in the file. If multiple records have the same search key, then the index’s pointer will point to the first record. But for non-clustering index, each index entry will need store pointers to all records that has the specific search key. (sequential scan from pointer won’t work since the record is not in order).
Sparse index doesn’t store index entry for every search key in the records. It can only be used by clustering index. Because to locate a record, we will find the largest index entry that’s smaller than the search key we are looking for. Then do a sequential scan.
Comparisons and compromise
Dense index is fast but takes up more space Sparse index is slower but take less space. Compromise: Use sparse index to point to the beginning of each block. Since the cost of processsing in DB is to drag blocks from disk to memory.
Store a index record of the original index. For comparison purpose, let’s compared this two level index with binary search.
For a clustering index, we can use binary search and the time is O(log_2 N). So for a 10,000 block of index, it will take at most 14 block reads to find the desired index. Also, if overflow blocks are used, binary search is not even available.
We can construct a multilevel index, treating the index just a regular sequential file and construct a sparse outer index on the original index. (Because the original index is sorted, we can use sparse outer index).
To find a record, we first find the largest outer index that’s smaller than the search key we want. The pointer in that index will point to a block of index. Then we can do sequential scan.
let’s assume 10,000 block of orignal index would need 100 block of outer sparse index. Since 100 block is small enough, it could be stored in memory so there will be no block read from disk. This is a huge improvement from 14 block reads required by binary search.
We need to update index when a record is inserted or delete. We also need to update index when the search key attribute is altered.
First we need to look up.
Dense index insertion
If the search key is not in the index, Db inserts an index entry at the appropriate place.
- If the index entry stores pointers of all records that have the same search key value, append this record’s pointer to the index entry.
- Else if the index entry only stores the first record with the same search key value, DB will place the record being inserted after the other records that have the same search key value.
Sparse index insertion
We first need to assume each index stores an entry for each block.
- If our record is stored in a new block. we store the search key value in index entry.
- ELIF the new record has the least search key value in its block, making that record the first of its block, we update the index entry of that block pointing to our record.
- Else we do not change the index
Delete for dense index
First, we need to look up the search key
If the delete record is the only record with the search key, just remove the record and index entry.
If the dense index stores all the pointers to records with the same search key, then remove pointer to to-be-deleted record
Elif the dense index stores the pointer of the first record that has the same search key value and if the to-be-deleted record is the first record in the search-key pointer, we need to update the pointer to point the next next record that has the same search key value.
Delete for sparse
If the index doesn’t have the search key value of the to-be-deleted record, then do nothing
- If the deleted record is the only record with that search key and the next record in the search key order doesn’t have index pointing at it, update the index entry to point to the next record by search key order. But if that record already has an index entry pointing at it, we will delete the index record.
- ELIF if multiple records have the same record, we update the index to point to the next record that has the same search key value.