kth largest element in an unsorted array

sort the array and pick the kth


Use size k min heap

class Solution {
    int findKthLargest(vector<int>& nums, int k) {
        priority_queue<int, vector<int> ,greater<int>> pq;
        for(auto n : nums){
            if (pq.size() > k)
        return pq.top();

We use a min heap of size k to filter through all elements and if after adding an element to heap causing the heap size to exceed k, then we pop it.

This algorithm will eventually contain the kth largest element since all elements smaller than kth largest elem will be at the internal or leaf. And any node larger than kth largest will eventually be at the top and get pop off.

Quick select

Divide & conquer 2: binary search

class Solution {
    int search(vector<int>& nums, int target) {
        return bin_search(nums, 0, nums.size() - 1, target);
    int bin_search(vector<int>& nums, int l, int h, int target){
        if(l > h) return -1;
        int mid = (l + h)  / 2;
        if(nums[mid] == target) return mid;
        if(nums[mid] > target){
            // target in [l: mid - 1]
            return bin_search(nums, l, mid - 1, target);
        return bin_search(nums, mid + 1, h, target);


One thing that helps with this recursive problem is to set the boundary of input for the algorithm.
For binary search, a bin_search(nums, int l, int h, int target), specify that this will search [l, h] for target. So for the subproblems, you will know that it will search between [l, mid – 1] and [mid + 1, h] with bin_search.

Merge sort

It uses divide and conquer and use O(n) extra memory to achieve the O(nlogn) run time

class Solution { public: vector<int> sortArray(vector<int>& nums) { int l = 0, r = nums.size() -1; vector<int> temp(r-l+1); mergeSort(nums, temp, l, r); return nums; } void mergeSort(vector<int>& A, vector<int>& temp, int l, int r){ if(l >= r) return; int m = (l+r)/2; mergeSort(A, temp, l, m); mergeSort(A, temp, m+1, r); merge(A, temp, l, m, r); } void merge(vector<int>& A, vector<int>& temp, int l, int m, int r){ int p1 = l, p2 = m+1, p_temp = 0; while(p1 <= m && p2 <= r){ if(A[p1] <= A[p2]) temp[p_temp++] = A[p1++]; else temp[p_temp++] = A[p2++]; } while(p1 < m+1) temp[p_temp++] = A[p1++]; while(p2 < r+1) temp[p_temp++] = A[p2++]; for(int i = 0; i < r-l+1; i++){ A[l+i] = temp[i]; } } };

The function MergeSort(nums, temp, start, end) will sort an array that starts at index start and whose last element is at index end.
So the subsequent subproblem will sort [start, middle], and [middle + 1, end].

Merge will mergetwo sorted subarrays A[l, m] and A[m + 1, r].

NOTICE, we uses index, not size in the while in merge that’s why we have p1 <= m.
If we use size, notice that an array that starts from index l and ends on index m will have size m – l + 1 and since p1 starts at l, it needs to go forward m steps to reach the last element.

Insertion sort

vector<int> sortArray(vector<int>& nums) {
        //insertion sort. 
        // [sorted | ... > x ... | x | ....]
        // after sort
        // [smaller than x | x | larger than x|  not sorted]
         we need to move x to the right place
         The right place means everything larger than x 
         is to x's right (till the sorted boundary)
         everything smaller than x is to its left(till the beginning)

         In insertion sort, during the kth iteration, the first k + 1th 
         element are in order (relatively)
         So this loop invariant, when k == nums.size() means the whole 
         array is in order relatively. IF not show contradiction.

        int cur_elem_pos = 1;
        for(;cur_elem_pos < nums.size()  ; cur_elem_pos ++){
            int insert_index = cur_elem_pos;
            int elem_to_be_insert = nums[cur_elem_pos];
            while(insert_index > 0 &&
                  nums[insert_index - 1] > elem_to_be_insert){
                nums[insert_index] = nums[insert_index - 1];
                insert_index --;
            nums[insert_index] = elem_to_be_insert;


        return nums;


Bubble sort

Gonna start from the bottom and work my way up. Starting from Bubble sort!

Bubble sort, as the name implies, bubble up the largest element to the top of the array. For a size k array, it will iterate it k times, each times, bring the kth largest element to the top.

    vector<int> sortArray(vector<int>& nums) {
        for(int i = 0; i < nums.size(); i ++){
            for(int j = 0; j < nums.size() - 1; j ++){
                if(nums[j] > nums[j + 1]){
                    int temp = nums[j];
                    nums[j] = nums[j + 1];
                    nums[j + 1] = temp;
        return nums;

We can optimize it 2 ways
1: we know that at kth iteration, 1st … kth largest element are already in their right place. So we can stop swapping loop at k – 1.

2: If no swapping happens in a loop, we know the array is already sorted.

An internship at Google

Just started my internship at Google in the network infra team.
I have to say it has been very eye-opening, in a very literal way.
The networking knowledge I learned from school and books are actually out-dated, compared with the cutting-edge Google and other companies (AWS and other).

MY colleagues are all very nice and knowledgeable. Based on my manager, 70% of the team consists of PhD. There are also a lot of professors. I guess Google runs half of the internet so there are no where better to be than here.

I will keep updating this post and reflect upon my internship here.


Forwarding and routing are different

Forwarding means taking a package, looking up its entry in forward table and sending it.
Routing is the process of building the forward table.

Routing table might look like this :
prefix | Next hop
18/8 | 171.1.1

And forward table might look like
prefix | Interface |MAC
18/8 | if0 |1:1:1:1:1:1:1:1

Forwarding table should use data structure that’s fast for look up and routing table should be fast for topology calculation.

The routing protocols described below are all Intradomain routing protocol or Interior gateway protocol (IGP). Domain here means An internetwork which all routers are under the same admin.


Or Bellman-Ford algorithm O(VE)
Each node has a 1d vector of cost to reach all other nodes. Immediate links cost are known and all other are first set to + inf.
The process of getting this info for all nodes is called convergence. No nodes need global knowledge


Periodic or event driven

Problem with down link

If one link go down and update the cost to reach the node at other end to inf. It will try to learn a new way with lower cost to that node.
EG link between A and G goes down. cost of A -> G becomes inf.
Assume the networks look like B -> A -> G.
A might attempt to learn the route through B since B still has a cost of 2.
Then the cost of A-> G goes up to 3. B will update its cost to G to 4 and it goes on till the end of world…

Can use poison reverse to prevent route from sending routes learned from neighbour back to neighbour

Protocol using distance vector (RIP)

Routing information protocol. Routers will broadcast min distance to Network. Also a major constrain is the num of hops in RIP can be no more than 16.
Since 16 counts as infinity (because of the counting to inf problem mentioned abov).

Link state

Each node creates an update msg called LSP(Link state packet) and flood the network with it
It contains
1. ID of the node that created the LSP. 2. Node that’s directly link to the sending node. 3. A sequence number 4. TTL
The first 2 info is enough to calculate route. using Dijkstra

Comparison between Link state and Distance vector

Distance vector shares the distances to all nodes to each other | L-S shares knowledge of its neighbors with every other router in the network

Protocol using link state

OSPF Open Shortest path first protocol

Additional feature of OSPF

additional hierarchy allow use to partition domains into areas so routers within a network doesn’t need to know how to reach every networks in the same domain.
load balacing

Switches and Bridges

Switches and bridges are concerned with connecting Homogeneous network(think Ethernet and 802.11)
They connect links together.

They have two approaches to user the packet header to forward. datagram connectionless and **virtual circuit **(connection orineted).


Every packet contains complete address of the host and just use the forward table. Given a forward table, datagram model’s host can send a packet
anywhere at anytime. It doesn’t need to establish a session.(Unlike virtual circuit).

Virtual circuit

Connection oriented just like TCP. Need to first get state info from the other host. Then transfer.

We need to use virtual circuit identifier VCI to identify a unique at each switch. It will be in the header package.
We also need a incoming interface for this VCI and an outgoing interface for this VCI.
**A potentially different VCI will be used for the outgoing
VCI is unique at the given link(ethernet).

VCI example:
For switch 1 S
incoming interface | incoming VCI | outgoing interface | outgoing VCI|

1 | 5 | 4 | 11 |

For switch 2

incoming interface | incoming VCI | outgoing interface | outgoing VCI|

5 | 11 | 9 | 2 |

Note the outgoing VCI at 1 switch == incoming VCI at another switch.

To send packet, we will use interface + VCI to determine where to switch.


Need to do a round trips to insert VCI entries for each switch along the way.
To do this we need to use global address of the end host.
But after the setup, we only need to use the VCI and the interface.

Learning forward table via spanning tree algo

Used to handle loop situation.


Internetworking is an arbitrary collection of network(either 802.11 or Eithernet or other) inter-connected together via routers(Gateway) to provide end-to-end service.

It uses IP to build heterogenious and scalable network.

Service model

IP needs to provide a service model that’s undemanding enough that the underlying physical network can provide the necessary service.

Addressing scheme

Datagram delivery

Datagram is a type of packet that’s sent in connectionless manner over a network. If the packet is corrupted, then that’s it. It’s unreliable. This “best effort” means: 1. Packet could be lost 2. Packet might arrive out of order 3. Packet might be received twice

IP packet format

ip format

First word:

Version: IP v4 or Iv Hlen : Header length in words. If not set the default is 5 words(20 bytes) TOS: Type of service: TCP or UDP. Allow packet to be treated differently for different service Lentgh: Size of datagram. In BYTE. It has 16 bits so the max size of a packet is 65535 bytes.

Second word:

Used to fragmentation. Explained in detail in next section.

Third word:

About transportation TTL: Time to live. Each router will decrease by 1. Set by sending host. Default at 64. Protocol:TCP or UDP Checksum:

Forth word

Source addr

Fifth word

Destination Address


Fragmentation breaks larger packets into smaller packets and provide means to reassemble them eventually at the end host. Fragmentation is needed because every network has its MTU Maximum trasmission unit, the largest IP datagram it can carry in a frame.

MTU and and max IP datagram size

MAX IP datagram size is smaller than MTU since we need to fit the entire IP datagram into the payload of frame for the network(Ethernet or wireless)


With Offset and IsEnd flag in the second word in the IP header.

Offset counts 8-bytes chunk

It is assumed that the boundary for fragmentation is always gonna be multiple of 8 bytes It is done so because the num of bits used to express offset length is smaller than 16 because we need to accommendate other flags. So we have to count in a higher unit.

Global Address

IP is hierarchical

Made up with several parts. Network and host. The network part identifies the network that the host is connected to. All host in the same network has the same network part for IP address.

Class-ful network

IP address is divided into 3 classes. A, B and C. Class A has leading bit 0, class B has leading bit 10 and class C has leading 110. Presumabily, class D has leading 1110.

Size of network for each class

There are 2^32 – 1 possible address. Since half of them can start with leading 0 bit, class A has half of those IP address. Of the remaining half, the leading bit of the IP addresses all start with one and can be split in half by the second leading bit. Class B address starts with 10, which takes half from the remaining half == 1/4 of total.

Datagram forwarding in IP

Assume the forwarding table is built already by routing proces. Entries in forwarding table is just <NetworkNumber, NextHop>

if(NetworkNum of Dest IP == NetworkNum of one of my interface){
  forward to that interface
else if (NetworkNum of DestIp == one of the entry in forwarding table){
  Forward it to NextHop
else {
Forward to default

Inefficiency introduced by class-ful network

A class C network can have 254 hosts. But if you need to have 255 hosts, then you need to upgrade to a class B network. But the utilization would be less than 1%.


Appear to outside as a single network but inside its another Inter-network. Use subnet mask to check if packet belong to that specific network

Classless network (CIDR)

CIDR stands for Classless interdomain Routing. Imagine a client needs 16 class C addresses. We give them a contiguous class C addresses. EG 192.4.16 — 192.4.31. The top 20 bits are identical. Effectively created a subnet of size 20 bits. EG: ISP can advertise a shorter network address prefix so the routing backbone can route all traffic that share the same prefix to that ISP.

ARP(address resolution protocol)

Translate IP addres to link-layer address(MAC). Broadcast an ARP request. The host with the dest address will reply the ARP

Virtual Network

It’s like virtual circuit switching in layer2 but in layer 3(in concept) It is a virtual point to point connection. But actually its separated by arbitrary amount of links.

The router at the entrance of the tunnel will encapsulate the IP packet with another IP header ** that contains **the dest IP address. The router at the entrance will also has a virtual interface for packet to the exit of tunnel. If the next hop of the forwarding table, the router will simply append the IP header and treat it like a regular packet, forwarding it to the right physical interface based on the appended IP header.

At the end of the tunnel, since the router must be connected to the network the sender is intended, the router can just forward the packet to the network.

Host config via DHCP

In Ethernet, MAC addresses are predetermined by the manufacturer. IP addresses also need to be unique and also need to reflect structure of the network. This prevents hard-code IP since the device’s network might change.
Use DHCP, dynamic host configuration protocol.
Set up a DHCP server that maintains a pool of available IP and users can request them.
Q: How to get to DHCP in the first place?
A: A new device can send a DHCPDISCOVER msg to ip(225,225,225,225) the broadcast address. All hosts and routers will receive this msg routers won’t forward it.
DHCP will response.

We can let multiple networks use the same DHCP server by using relay agents

The Ethernet’s size limit

Ethernet is a LAN(local area network) that connects devices via a shared medium. The size of the Ethernet is confined to less than 2.5 km. The reason for the size limit is because Ethernet needs to run collision detection and for the worst case scenario, two devices at two opposite ends send frames at the same time and if it takes d times to go from one end to the other, then in the worst case scenario, it will take 2d times to detect the collision.

So the sender has to keep transmitting after 2d time so it can jam the frame.