Insertion sort

vector<int> sortArray(vector<int>& nums) {
    // Insertion sort.
    // [sorted | ... > x ... | x | ...]
    // after inserting x:
    // [smaller than x | x | larger than x | not sorted]
    // We need to move x to the right place. The right place means
    // everything larger than x is to x's right (up to the sorted
    // boundary) and everything smaller than x is to its left (up to
    // the beginning).
    //
    // Loop invariant: after the kth iteration, the first k + 1
    // elements are in (relative) order. When k == nums.size(), this
    // invariant means the whole array is in order. (If not, we can
    // derive a contradiction.)
    int cur_elem_pos = 1;
    for (; cur_elem_pos < nums.size(); cur_elem_pos++) {
        int insert_index = cur_elem_pos;
        int elem_to_be_insert = nums[cur_elem_pos];
        while (insert_index > 0 &&
               nums[insert_index - 1] > elem_to_be_insert) {
            nums[insert_index] = nums[insert_index - 1];
            insert_index--;
        }
        nums[insert_index] = elem_to_be_insert;
    }
    return nums;
}
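The loop invariant above can be checked mechanically. A small Python sketch of the same algorithm (function name is mine) that asserts the invariant after every outer iteration:

```python
def insertion_sort(nums):
    """Insertion sort; after iteration k, nums[0..k] is relatively sorted."""
    for cur in range(1, len(nums)):
        elem = nums[cur]
        i = cur
        # shift everything larger than elem one slot to the right
        while i > 0 and nums[i - 1] > elem:
            nums[i] = nums[i - 1]
            i -= 1
        nums[i] = elem
        # loop invariant: the first cur + 1 elements are in order
        assert all(nums[j] <= nums[j + 1] for j in range(cur))
    return nums
```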


Learning ANTLR 3: More examples are coming

Matching arithmetic expression

An arithmetic grammar for +-x/ and parenthesis:

grammar Expr;

/** The start rule; begin parsing here. */
prog:   stat+ ;         // a program contains multiple statement
//a statement can either be expr \n or assignment \n or \n
stat:   expr NEWLINE                // expression statement
    |   ID '=' expr NEWLINE         // assignment statement
    |   NEWLINE                     // blank line
    ;

expr:   expr ('*'|'/') expr
    |   expr ('+'|'-') expr
    |   INT
    |   ID
    |   '(' expr ')'
    ;

ID  :   [a-zA-Z]+ ;      // match identifiers
INT :   [0-9]+ ;         // match integers
NEWLINE:'\r'? '\n' ;     // return newlines to parser (is end-statement signal)
WS  :   [ \t]+ -> skip ; // toss out whitespace

Some highlights

  1. A grammar consists of rules for syntactic structure and rules for vocabulary symbols (tokens).
  2. A program consists of multiple statements.
  3. Separate the alternatives of a rule with |.
  4. Group symbols into subrules with parentheses, e.g. ('*' | '/').
  5. We can use left recursion (a rule that invokes itself at the start of an alternative).

To test, we run

$ antlr4 Expr.g4
$ javac Expr*.java
$ grun Expr prog -gui t.expr # launches org.antlr.v4.runtime.misc.TestRig

We will see the parse tree for the input in t.expr when we **invoke the parser with rule prog**.

Calling a rule method is like invoking that rule. We can call any parser rule method.

Building a calculator

We need to perform some actions using parse-tree visitor.

To do so we need to:

  1. Label the alternative rules so that we get a different visitor method per alternative, i.e. a different event for each kind of input.

expr: expr op=('*'|'/') expr # MulDiv
    | expr op=('+'|'-') expr # AddSub
    | INT                    # int
    | ID                     # id
    | '(' expr ')'           # parens
    ;

  2. Define token names for the operator literals (e.g. MUL, DIV, ADD, SUB) so we can reference them as Java constants in the visitor.

This time, we will use the visitor pattern and ANTLR will generate a method for each labeled alternative name

$ antlr4 -no-listener -visitor LabeledExpr.g4

It will also also generate default visitor implementation and we can subclass it. We will need to override methods associated with statement and expression alternatives.

Some interesting override is:

    @Override
    public Integer visitMulDiv(LabeledExprParser.MulDivContext ctx) {
        int left = visit(ctx.expr(0));  // get value of left subexpression
        int right = visit(ctx.expr(1)); // get value of right subexpression
        if ( ctx.op.getType() == LabeledExprParser.MUL ) return left * right;
        return left / right; // must be DIV
    }
Here, when we visit a multiply or divide node, we first evaluate the left and right subexpressions. Then we use ctx.op.getType() to determine the type of the operator and apply the corresponding operation.

Translation of java Source code

import java.util.List;
import java.util.Map;
public class Demo {
    void f(int x, String y) { }
    int[ ] g(/*no args*/) { return null; }
    List<Map<String, Integer>>[] h() { return null; }
}

We want to translate the source code into an interface (just like a header definition) while preserving whitespace and comments.

interface IDemo {
    void f(int x, String y);
    int[ ] g(/*no args*/);
    List<Map<String, Integer>>[] h();
}

We are given a Java grammar and build a parse tree. We will derive the interface name from the class name and grab each method signature (return type, name, and argument list).

We want to know how a class is declared and how a method is defined. The relevant grammar fragments are:

classDeclaration
    :   'class' Identifier typeParameters? ('extends' type)?
        ('implements' typeList)?
    ;

methodDeclaration
    :   type Identifier formalParameters ('[' ']')* methodDeclarationRest
    |   'void' Identifier formalParameters methodDeclarationRest
    ;

We will use antlr4 to get the visitors

$ antlr4 Java.g4

Listing the generated files with

ls Java*.java

shows the generated classes, including JavaLexer.java, JavaParser.java, and JavaBaseListener.java.

We will then write a class extending JavaBaseListener ("Java" here means a listener for the Java grammar). We need three strategies:

  1. Print the interface header at the start of a class.
  2. Print a terminating } at the end of the class definition.
  3. When we see a method declaration, print out its signature.
    public void enterClassDeclaration(JavaParser.ClassDeclarationContext ctx) {
        System.out.println("interface I"+ctx.Identifier()+" {");
    }

    public void exitClassDeclaration(JavaParser.ClassDeclarationContext ctx) {
        System.out.println("}");
    }

    public void enterMethodDeclaration(JavaParser.MethodDeclarationContext ctx) {
        // need the parser to get the token stream
        TokenStream tokens = parser.getTokenStream();
        String type = "void";
        if ( ctx.type()!=null ) {
            type = tokens.getText(ctx.type());
        }
        String args = tokens.getText(ctx.formalParameters());
        System.out.println("\t"+type+" "+ctx.Identifier()+args+";");
    }
We also need to write a main to run the parser and walk the tree with listener.

public class ExtractInterfaceTool {
    public static void main(String[] args) throws Exception {
        String inputFile = null;
        if ( args.length>0 ) inputFile = args[0];
        InputStream is = System.in;
        if ( inputFile!=null ) {
            is = new FileInputStream(inputFile);
        }
        ANTLRInputStream input = new ANTLRInputStream(is);

        JavaLexer lexer = new JavaLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        JavaParser parser = new JavaParser(tokens);
        ParseTree tree = parser.compilationUnit(); // parse

        ParseTreeWalker walker = new ParseTreeWalker(); // create standard walker
        ExtractInterfaceListener extractor = new ExtractInterfaceListener(parser);
        walker.walk(extractor, tree); // initiate walk of tree with listener
    }
}

Embedding actions in a grammar

For more control, we can embed actions in the grammar. These actions are copied into the generated recursive-descent parser. Actions are code snippets surrounded by curly braces.

Disclaimer: code snippets are from the ANTLR reference. If you find this note to be infringing upon its copyright, please contact me and I will take it down.

LC 81, Search in Rotated Sorted Array II

The sorted array is rotated at some pivot, and we might have duplicates (i.e., [0,0,1,2,2,5,6] might become [2,5,6,0,0,1,2]).

We will still use binary search, but this time the duplicates give us an edge case:
[1,1,1,1,1,1,1,1,1,2,3,4] rotated -> [1,2,3,4,1,1,1,1,1,1,1,1]
Here nums[low] == nums[mid] == nums[high] == 1, and we can't tell which part is sorted. So we can either low++ or high-- to edge toward an interval where we know at least one of [low, mid] or [mid, high] is sorted.

class Solution {
public:
    bool search(vector<int>& nums, int target) {
        int low = 0;
        int high = nums.size() - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (nums[mid] == target)
                return true;
            if (nums[low] < nums[mid]) {          // [low, mid] is sorted
                if (nums[low] <= target && target < nums[mid])
                    high = mid - 1;
                else
                    low = mid + 1;
            }
            else if (nums[mid] < nums[low]) {     // [mid, high] is sorted
                if (nums[mid] < target && target <= nums[high])
                    low = mid + 1;
                else
                    high = mid - 1;
            }
            else {                                // nums[low] == nums[mid]
                low++;
            }
        }
        return false;
    }
};


else if condition

In the else if, I first used

else if(nums[mid] < nums[high])

to decide whether [mid, high] is sorted, but it fails on a test case such as [3,1,1] with target = 3, where
low = 0, high = 2, mid = 1.

We see that nums[low] is not less than nums[mid], so [low, mid] is not sorted, which means [mid, high] should be sorted. But the else if condition is false because nums[mid] == nums[high]. So we should instead make the else if the complement of

if(nums[low] < nums[mid])

which is

else if(nums[low] > nums[mid])

(not the exact complement, since equality is handled by the final else).
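The corrected condition can be verified quickly. The same algorithm in a Python sketch (function name is mine), exercised on the [3,1,1] case that broke the earlier condition:

```python
def search_rotated(nums, target):
    """Binary search in a rotated sorted array that may contain duplicates."""
    low, high = 0, len(nums) - 1
    while low <= high:
        mid = low + (high - low) // 2
        if nums[mid] == target:
            return True
        if nums[low] < nums[mid]:        # [low, mid] is sorted
            if nums[low] <= target < nums[mid]:
                high = mid - 1
            else:
                low = mid + 1
        elif nums[low] > nums[mid]:      # [mid, high] is sorted
            if nums[mid] < target <= nums[high]:
                low = mid + 1
            else:
                high = mid - 1
        else:                            # nums[low] == nums[mid]: can't tell, shrink
            low += 1
    return False
```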

LC 58. Length of Last Word

Given a string s consisting of upper/lower-case alphabets and space characters ' ', return the length of the last word in the string.

If the last word does not exist, return 0.


Keep a counter, lastWordLen, for the length of the last word encountered, initialized to 0. We also need a counter for the length of the current word.

Then we iterate through the string; if we see a non-whitespace character, we increment the current word length by 1.

If we see a whitespace

  1. if curWordLen is 0, the previous character was also whitespace and doesn't end a word, so we do nothing
  2. else, we update lastWordLen and set curWordLen to 0.

Edge case

Empty input

Since lastWordLen is 0 and is never updated, returning 0 is correct.

Single word

The for loop only updates lastWordLen when it sees a whitespace character. If the input is a single word, we need a final check after the loop to update lastWordLen.

    int lengthOfLastWord(string s) {
        int lastWordLen = 0;
        int lenCtr = 0;
        for (int i = 0; i < s.size(); i++) {
            if (s[i] != ' ') {
                lenCtr += 1;
            }
            else {
                if (lenCtr != 0) {
                    lastWordLen = lenCtr;
                }
                lenCtr = 0;
            }
        }
        if (lenCtr != 0)
            lastWordLen = lenCtr;
        return lastWordLen;
    }
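The same logic ports to a short Python check (function name is mine), covering the empty, single-word, and trailing-space edge cases:

```python
def length_of_last_word(s):
    """Length of the last word; a word ends at a space or at end of string."""
    last_word_len = 0
    cur_len = 0
    for ch in s:
        if ch != ' ':
            cur_len += 1
        else:
            if cur_len != 0:          # a word just ended
                last_word_len = cur_len
            cur_len = 0
    if cur_len != 0:                  # input ended inside a word
        last_word_len = cur_len
    return last_word_len
```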


Recovery system in database

We will use a log and checkpoints to recover.

Transaction rollback

If we try to roll back a transaction Ti:

Scan the log backward

We scan backward through every log record <Ti, Xj, V1, V2> that belongs to Ti.

Then we write V1 (the old value) back to Xj.

We also write a redo-only log record <Ti, Xj, V1> to the log. This is called a compensating log record. (We don't need to store both the old and new value here, because if the system crashes during recovery, we only need to REDO these records to finish the UNDO.)

Once we find the <Ti start> record in our backward scan, we stop and add a <Ti abort> record to the log.

Recovery after a crash

REDO phase

Scan the log forward from the last checkpoint record and replay all updates.
This replay includes updates of rolled-back and uncommitted transactions as well (repeating history).

We also need to determine the list of incomplete transactions. They are either in the <checkpoint L> record (the transactions active during the checkpoint) or started after the checkpoint and have neither a commit nor an abort record. This list is called the undo-list and is used for rollback during the UNDO phase.

UNDO Phase

Roll back all transactions in the undo-list by scanning the log backward from the end.

If the DB finds an update record of a transaction Ti in the undo-list, it restores the old value.

When the DB sees a <Ti start> record and Ti is in the undo-list, that means we have undone all of Ti's changes, and it writes a <Ti abort> record to the log to signify that the rollback of this transaction is complete.

UNDO terminates once we have seen the <Ti start> record of every Ti in the undo-list.
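The two phases can be played out on a toy in-memory log. This is a sketch only: the tuple-based log format (`(txn, 'start')`, `(txn, 'commit')`, `(txn, var, old, new)`) is an invented illustration, not a real DBMS format:

```python
def recover(log, db):
    """Toy ARIES-style pass: REDO everything, then UNDO incomplete txns."""
    # REDO phase: replay every update in log order (repeating history),
    # while collecting the undo-list of incomplete transactions.
    undo_list = set()
    for rec in log:
        if rec[1] == 'start':
            undo_list.add(rec[0])
        elif rec[1] in ('commit', 'abort'):
            undo_list.discard(rec[0])
        else:                            # (txn, var, old_val, new_val)
            _, var, _, new = rec
            db[var] = new
    # UNDO phase: scan backward, restoring old values for undo-list txns.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] in undo_list:
            _, var, old, _ = rec
            db[var] = old                # a real system would also emit a
                                         # redo-only compensating record here
        elif rec[1] == 'start' and rec[0] in undo_list:
            undo_list.discard(rec[0])    # would append an abort record here
    return db
```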

Fuzzy checkpoint

A fuzzy checkpoint allows updates to resume once the <checkpoint L> record has been written, but before the modified buffer blocks are written to disk.

If a crash happens before all pages in the modified buffers are written to disk, we might have an incomplete checkpoint.
We deal with this by storing a last-checkpoint pointer at a fixed position on disk, and we do not update this pointer when we first write the <checkpoint L> record.

The <checkpoint L> record contains a list of modified buffer blocks, and last-checkpoint is only updated once all of those modified blocks have been written to disk.

LC 39 Combination Sum

Given a set of candidate numbers (candidates) (without duplicates) and a target number (target), find all unique combinations in candidates where the candidate numbers sum to target.

The same number may be chosen from candidates an unlimited number of times.


We basically try to find paths in a tree that sum to target. At the beginning, we pick a number as the root and initialize total_sum to that number. If total_sum < target, we proceed to "branch off", recursively calling DFS and adding leaves.

update solution sets

We update the solution set by appending the current path when the remaining target == 0.


return is used to terminate the recursion at the two base cases.

Avoid duplicates

How do duplicates occur?

Duplicates occur because when we branch off the tree, we don't exclude the elements already considered at earlier indices.
To exclude duplicates, when we make recursive calls at each level, we make sure each call only branches on candidates from its start index onward.


class Solution(object):
    def __init__(self):
        self.all_path = []

    def combinationSum(self, candidates, target):
        """
        :type candidates: List[int]
        :type target: int
        :rtype: List[List[int]]
        """
        self.bfs(candidates, 0, target, [])
        return self.all_path

    def bfs(self, nums, index, target, path):
        if target < 0:
            return
        if target == 0:
            self.all_path.append(path)
            return
        for i in range(index, len(nums)):
            self.bfs(nums, i, target - nums[i], path + [nums[i]])

Exclude duplicates

In the for loop, we append nums[i] to the path. Also, we pass i as the new index, so the calls afterward will only consider nums[i]…nums[end].
This avoids duplicates: any combination summing to target that uses both nums[0]…nums[i - 1] and nums[i]…nums[end] has already been found by a call starting at an earlier index, so a call starting at index i will not add that solution again.
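A standalone sketch of the same DFS (function names are mine), which can be sanity-checked against the classic example [2,3,6,7] with target 7:

```python
def combination_sum(candidates, target):
    """Passing i (not 0) as the start index prevents duplicate combinations."""
    paths = []

    def dfs(index, remaining, path):
        if remaining < 0:
            return                      # overshot: dead branch
        if remaining == 0:
            paths.append(path)          # path sums exactly to target
            return
        for i in range(index, len(candidates)):
            # reuse candidates[i] is allowed, but never go back before index i
            dfs(i, remaining - candidates[i], path + [candidates[i]])

    dfs(0, target, [])
    return paths
```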

Database index and hashing

We want to locate where our records are, so we may resort to an index, just like going to the index at the back of a book when trying to find a keyword.

Basic concepts:


Indices are used to locate records. The attributes used to look up records are called the search key. They don't have to be a key or super key.

Ordered indices: it is based on sorted order of the records

Hash indices

Operations and performance affected by the choice of indices

Access type:

We can look up records with a specific attribute value, or do a range lookup.

Space overhead



Ordered indices

Ordered indices store index entries in sorted order of the search key.

Clustering indices

Records in the file might themselves be stored in some sorted order. A clustering index is an index whose search key also defines the sequential order of the file.
They are also called primary indices.

Nonclustering indices

The index order is different from the sequential order of the records. The index is a second list that holds pointers to the physical rows.

Index entry

It is a search key value and a pointer to one or **more than one** records with that search key value.

Dense index

An index entry appears for every search key value in the file.

CLRS chpt 11: hashtable

We use a hashtable when we only need to insert, search, and delete. There are three mainstream methods: 1. direct addressing, 2. chaining, and 3. open addressing.

Direct addressing.

We directly index an array that contains pointers to satellite data. It works well when the universe of keys is small and we can afford to allocate enough space (we assume each key is unique). Since every key already has its own slot, we can even store the data in the slot directly.

Exercise for direct addressing


A bit vector is simply an array of bits (0s and 1s). A bit vector of length m takes much less space than an array of m pointers. Describe how to use a bit vector to represent a dynamic set of distinct elements with no satellite data. Dictionary operations should run in O(1) time.

Given a bit vector b, position k contains 1 if and only if k is in the dynamic set.
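A minimal sketch of that bit-vector set, using a single Python integer as the bit array (class and method names are mine):

```python
class BitVectorSet:
    """Dynamic set of distinct keys from {0, ..., m-1}, no satellite data."""
    def __init__(self, m):
        self.m = m
        self.bits = 0                      # a Python int serves as the bit vector

    def insert(self, k):                   # O(1): set bit k
        self.bits |= (1 << k)

    def delete(self, k):                   # O(1): clear bit k
        self.bits &= ~(1 << k)

    def search(self, k):                   # O(1): test bit k
        return (self.bits >> k) & 1 == 1
```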


Suggest how to implement a direct-address table in which the keys of stored elements do not need to be distinct and the elements can have satellite data. All three dictionary operations (INSERT, DELETE, and SEARCH) should run in O(1) time. (Don’t forget that DELETE takes as an argument a pointer to an object to be deleted, not a key.)

The keys are not distinct, so we let T[x.key] point to a doubly linked list storing all the elements with the same key. INSERT takes O(1). For DELETE, we take a pointer to the object as the argument, and removing a node from a doubly linked list is O(1).
SEARCH takes a key as the argument and checks whether T[k] is null, so O(1).
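A sketch of that scheme, assuming keys in {0, ..., m-1} (class and field names are mine):

```python
class Node:
    def __init__(self, key, data):
        self.key, self.data = key, data
        self.prev = self.next = None

class DirectAddressTable:
    """T[key] heads a doubly linked list of all elements with that key."""
    def __init__(self, m):
        self.T = [None] * m

    def insert(self, x):                 # O(1): splice node at the front
        x.next = self.T[x.key]
        if x.next is not None:
            x.next.prev = x
        x.prev = None
        self.T[x.key] = x

    def delete(self, x):                 # O(1): takes the node, not the key
        if x.prev is not None:
            x.prev.next = x.next
        else:
            self.T[x.key] = x.next
        if x.next is not None:
            x.next.prev = x.prev

    def search(self, k):                 # O(1): return the list head (or None)
        return self.T[k]
```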

Down side of direct addressing

We need to allocate space for every possible key in the universe, which becomes infeasible when the universe is large. More importantly, the set of keys actually used might be much smaller than U. So we use a hashtable.


A hashtable uses a hash function to compute the slot for each key. For an element with key k, we store it at slot h(k). But multiple keys might hash to the same value: when h(k1) = h(k2) with k1 != k2, a collision has occurred.

What constitutes a good hashing function

A good hash function appears "random" while being deterministic (otherwise we couldn't retrieve what we stored).

Load factor

Assume simple uniform hashing, where any given element is equally likely to hash into any of the m slots. Let n be the number of elements in the dictionary and m the number of slots in the table. Then n/m is the load factor: the average length of a chain in the table.


A collision occurs when two elements' keys hash to the same value. Then we can do chaining: we store all elements with the same h(k) in the same linked list.
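A minimal chaining sketch, using Python lists as the chains and the division-method hash as an assumption (class name is mine):

```python
class ChainedHashTable:
    """Collision resolution by chaining; each slot holds a list of (k, v)."""
    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def _h(self, k):
        return k % self.m                      # division-method hash (assumed)

    def insert(self, k, v):
        self.slots[self._h(k)].insert(0, (k, v))   # insert at chain head, O(1)

    def search(self, k):
        for key, v in self.slots[self._h(k)]:      # expected O(1 + n/m)
            if key == k:
                return v
        return None
```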

Run time of Insert, search and delete


Insert(T, x)

We insert at the front of the doubly linked list pointed to by h(x.key). O(1) time.

Search(T,k) k is the key value.

Go to slot h(k) and traverse the linked list until we either find k or reach the end. It takes O(1 + n/m) on average:
(1) if the search fails, we traverse the whole list, O(n/m);
(2) if the key is in the dictionary, we take the average over all elements x of 1 plus the expected number of elements added to x's list after x was inserted (because we traverse all the elements before x in the list, which were added after x):

\dfrac{1}{n} \sum_{i=1}^{n} \left( 1 + \sum_{j=i+1}^{n} E[X_{ij}] \right)

where X_{ij} is the indicator variable for the event h(k_i) = h(k_j). The inner summation counts the expected number of collisions after x is inserted, and the earlier x was inserted, the larger this sum. P(h(k_i) = h(k_j)) = 1/m by uniform hashing, so the inner sum is at most n/m in expectation. Taking the average over i still gives roughly n/m. (Please ignore the gross math omissions; the general idea is close.)

Exercise

1

Professor Marley hypothesizes that he can obtain substantial performance gains by modifying the chaining scheme to keep each list in sorted order. How does the professor's modification affect the running time for successful searches, unsuccessful searches, insertions, and deletions?

It will make insert slower, O(n/m), since we need to traverse to find the insertion point. It will **NOT** make deletion faster, since ordering doesn't help deletion asymptotically. But search can become faster if, instead of a linked list, we use a sorted array, because then we can binary search: search time drops to O(1 + lg(n/m)).

Hash function

Division method

h(k) = k mod m. For this method, we want m not to be a power of two; otherwise h just takes the low-order bits and will not have the "random" effect a good hash function should have.

Multiplication method

h(k) = floor(m * (kA mod 1)), where kA mod 1 is the fractional part of kA. This method has the advantage that m can be a power of 2.

Universal hashing

We randomly choose the hash function from a suitable class of functions. The probability of picking a bad one is slim (just like the random pivot in quicksort).

Open addressing

In open addressing, all elements occupy the table itself. NO POINTERS: the stored element is the key. The table can fill up; no more than m elements can be stored, and the load factor is never greater than 1.

Instead of chaining, for each key we generate a sequence of slots to examine in case of collision. To insert and search, we **probe** the table until we find a suitable position for our element:

h : U x {0, 1, ..., m - 1} -> {0, 1, ..., m - 1}

The probe sequence depends on the key: for every key k, we want the probe sequence <h(k,0), h(k,1), ..., h(k, m-1)> to be a permutation of <0, 1, ..., m-1>, so that every slot is eventually considered as the table fills up.

hash_insert(T, k):
    i = 0
    # probe until we find an empty slot or run out of slots
    while i != m and T[h(k, i)] != null:
        i += 1
    if i != m:
        T[h(k, i)] = k
    else:
        error "hash table overflow"

The same logic goes for search: we either find the key along the probe sequence, or hit an empty slot / exhaust the whole sequence.

hash_search(T, k):
    i = 0
    while i != m and T[h(k, i)] != null:
        if T[h(k, i)] == k:
            return h(k, i)
        i += 1
    return null

But for delete, we cannot just set the slot back to null, because that would break searches that probe past it. Instead, we store a special DELETED marker when deleting a key.

We define h'(k), an ordinary hash function, as the auxiliary hash function.

Linear probing

h(k, i) = (h'(k) + i) mod m. Linear probing searches h'(k), then h'(k) + 1, ..., h'(k) + m - 1, then wraps around and searches up to h'(k) - 1, so it eventually examines the whole table. It suffers from **primary clustering**: long runs of occupied slots build up and searches take longer. **The initial starting position dictates the entire sequence.**

Quadratic probing

h(k, i) = (h'(k) + c_1 i + c_2 i^2) mod m. The offset is provided by the polynomial. It works better than linear probing, but we must constrain the choice of c_1, c_2, and m. **The initial starting position dictates the sequence as well.**

Double hashing

h(k, i) = (h_1(k) + i * h_2(k)) mod m, where h_1 and h_2 are both auxiliary hash functions. The first probe position is h_1(k), and each subsequent probe is offset by h_2(k).
So the probe sequence depends on two hash functions and is not dictated by the initial starting point alone.
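The probing and tombstone logic above can be sketched concretely. A minimal open-addressing table using linear probing with h'(k) = k mod m and a DELETED marker (all names are mine, not CLRS's):

```python
DELETED = object()   # tombstone so searches can probe past removed keys

class LinearProbingTable:
    def __init__(self, m):
        self.m = m
        self.T = [None] * m

    def _probe(self, k, i):
        return (k + i) % self.m        # h(k, i) = (h'(k) + i) mod m

    def insert(self, k):
        for i in range(self.m):
            j = self._probe(k, i)
            if self.T[j] is None or self.T[j] is DELETED:
                self.T[j] = k
                return j
        raise OverflowError("hash table overflow")

    def search(self, k):
        for i in range(self.m):
            j = self._probe(k, i)
            if self.T[j] is None:      # a truly empty slot ends the sequence
                return None
            if self.T[j] == k:
                return j
        return None

    def delete(self, k):
        j = self.search(k)
        if j is not None:
            self.T[j] = DELETED        # not None, or later keys become unreachable
```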

Perfect hashing

Two-level hashing, using universal hashing at each level.