# Insertion sort

The idea is really simple. When you sort an array of number, you start with two sides. One side will hold the sorted elements, the other will hold the rest of the unsorted. In the very beginning, the sorted part will hold one element, which is sorted by itself. Then we pick the second element will be inserted into the sorted part of the array based on the preference(non-descending or non-increasing).

This process continues in a loop. We start with the 1… (i – 1) element sorted in the array and we pick the ith element and insert it into the right place. If we want non-descending and the ith element is bigger than (i – 1)element, then we will move the (i – 1)element to ith element’s place, effectively making space for the ith element. The psedo code is:

We assume the array starts at position 1.

```
for j = 2 to A.length
key = A[j] //insert A[j] into the sorted array A[1..J-1]
i = j - 1
while i > 0 and A[i] > key // while we haven't reached the first elem of sorted array and the key is smaller than the current element A
A[i + 1] = A[i]
i --
A[i + 1] = key
```

## Loop invariants and the correctness of the insertion sort

This algorithm is in place. To show the correctness of the algorithm, we will usee something called *loop invariant*, which means at the start of each iteration, the subarrayA[1..j – 1] consists of the same elements originally in A[1..j – 1] but in sorted order. To show this, we need to prove three things:

1. Initialization: it is true at the beginning of the loop. (The first element is sorted by itself.) 2. Maintenance: it is true before an iteration of a loop and it is also true before the next iteration. (after an iteration, the sorted part will remain sorted with one additional element, the ith element.) 3. Termination: when the loop stops, the invariant gives us useful information.(When then loop stops, the subarray A[1..n] will be sorted.

It is very similar to induction except induction will never terminate.

# Analyzing algorithms

We want to predict the resource an algorithm needs such as memory and time.

## Assumption

RAM, No concurrent operations. No memory hierarchy such as cache or virtual effect.

## Input size

It depends on the problem to decide which notion should be used for input size. For sorting, it is better to use the number of element, but for graph problem, it is better to use two numbers: the number of vertices and edges.

## Running time

Of an algorithm is the number of primitive ops or steps executed.

We will now analyze the insertion sort. First the complicated way.

The and are costs of the instruction. For each where n = A.length, we denote to be the number of times the while loop will be executed for the value of

Then the running time is the sum of each statement’s time. We pay special attention to line 5 – 7. For line 5, the time this statement will be run is determined by the number of comparison needed to find the right place for *key*. For each *key*, the time will differ. But the sum of these multiplies by the time it takes to execute the comparison will give us the total time of the while statement. The same for the statement that follows while. But the test is executed one more time than the body.

The formula is shown above. The order of the input will hugely affect the time. Assuming the list is already sorted. Then for each comparison in the while statement, we only need to compare once therefore will be 1 for all j and our formula becomes The expression is of the form and it is linear function of n.

But if the array is in reverse order then each comparison in while loop will take j times. and

the formula will be above. It is of the form and this is a quardratic function.

## Worst case and average case analysis

We usually will try to find the worst-case running time, the longest running time for *any* input of size n. There are three reason as to why we want to find worst case.

- The worst-case will provide us with a guarantee.
- The worst case often happens. EG: For database search, the worst case is not finding the content and that happened quite often.
- The “average case is often as bad as the worst case. For our insertion sort, assume half of the array is sorted and half is not. So we basically will need to sort the unsorted half, which has input size of and it is still a quadratic function.

## Order of growth

We want to make some simplification. We are interested in the order of growth. So we consider only the leading term of the formula . We also ignore the leading coefficient since it is far less significant than the n’s effect. So we are left with . The worst-case running time is .

# Designing algorithms

For our insertion sort, we have an incremental approach. We first start with a singleton sorted list, then make the sorted list bigger and bigger one element after another. We will now learn about divide-and-conqure

## The divide-and-conquer approach

Lots of good algorithms are recursive; Solving a problem by breaking the problem up into smaller and similar problems and after solving them, combine the solutions to find the final solution. Divide-and-conquer is one.

- Divide: Divide the problem into sub problems
- Conquer: Solve the subproblem recursively(breaking them up again and again). If the subproblems are small enough, we will solve them straghtforwardly.
- Combine: Combine the solutions of subproblems to get the original.

For merge sort: We will divide the unsorted list into 2 unsorted list(divide). Then we will sort the two unsorted list recursively using merge sort.(conquer). Then merge the two sorted list to get the original answer.

To merge, we need to get an auxiliary function merge(A, p, q, r), where A is the list, and we assume A[1,p] and A[p + 1, q] are both sorted.

The merge function will start with an empty array and pick the smaller of the two array’s first element. Then after picking an element from that array, we push all other element forward so we have a new first element in that array. This process will repeat till one array is empty. Then we will just append the other array to the end of the sorted array.

The psedo code has one trick: It won’t check if we have reached the end for both array. Instead it will use an sentinel value, . Since we will put the smaller of the two top element into the sorted array and nothing is smaller than , we virtually append the array to the sorted array.

```
Merge(A, p, q, r)
n1 = q - p + 1 // length of the first sorted array
n2 = r - q //length of the second sorted array
let L[1... n1 + 1] and R[1..n2 + 1] be new arrays //leave 1 extra slot for sentinel value
for i = 1 to n1
L[i] = A[ p + i - 1]
for j = 1 to n2
R[j] = A[ q + j]
L[n1 + 1] = infty
R[n2 + 1] = infty
i = 1
j = 1
for k = p to r
if L[i] <= R[j]
A[k] = L[i]
i += 1
else
A[k] = R[j]
j += 1
```

We need to check the loop invariant for this algorithm. The loop invariant is **at the beginning of the last for loop, A[p..k – 1] contain k – p number of the smallest element among array L and R. And L[i] and R[j] are the smallest element in their arrays that aren’t copied into A.** The initialization is good since at the beginning the A array is “empty” and k – p = 0. And L[i] and R[j] have the smallest element(sorted, not touch, so the first one is the smallest).

Maintenance: Picking one element from the two array and insert will give us a A[p..k] that’s sorted. And L and R will remain sorted.

Termination: This algorithm terminate when k = r. So the array A[p..r] will be sorted which is the whole array.

We can see that merge’s largest iteration is a single for loop and has step. In each step, the operation takes constant time. So the running time is .

Now go back to merge sort

```
Merge_sort(A, p, r)
if p < r
q = floor((p + q)/2)
Merge_sort(A, p, q)
Merge_sort(A, q + 1, r)
Merge(A, p, q, r)
```

It will look like:

To find the running time, we assume that we will divide the problem into *a* parts and each one is the size of the original. Ignoring the time it takes to divide and combine

[T(n) = aT(\dfrac{n/b})]

We will need master theorem to find the exact running time. But we can informally deduct the time for merge sort.

Let’s assume that the n is exactly a power of 2. Then we will make a recursive tree. The next level from the top will have two subtree that has size n/2. Assuming the time to do divide and combine step for each element is c, the total cost of the two subtree is c(n/2 + n/2) = cn. The next level will have 4 (n/4) subtree. The total time is 4c(n/4) = cn.

The i level below the top has node and will takes . The total number of level is lgn + 1.

So the total cost is level * cost at each level = cn * lgn = .