Median of two sorted array

Median of Two Sorted Arrays

The generalize version of the problem: find the kth elem of two sorted array.

Thoughts about this problem

To find the medium of one sorted array, we can just go to the middle of it.

Divide and conquer

We will start by finding the median position of the two sorted list. If k is larger than med1_pos + med2_pos, then the kth element must be in the latter part of array1 and array2 so we can now reduce the search range from two sorted array of length L to two sorted array of length L/2.

We have 4 situations here about the size of k, position of median of array 1 and position of median of array 2.

k is larger than both med1_pos, med2_pos and med1_pos + med2_pos

A1 = [1,2,3], A2 = [4,5,6], med1_pos = 1, med2_pos = 1, k = 6
For example, we want to find the 6th element, aka, the element that would be at position 5 of the combined array. 5 > 1 and 5 > 1 + 1. So we know that the 6th element will not be in the left side of A1 or A2 because the array is sorted and anything on the left side of the median will be smaller than median and the k we are looking for is bigger than both median and their sum. So we will shrink the search range to right side of A1 and A2. In our case, that would be A1 = [2,3], A2 = [5,6]. We will also need to subtract the size of the chopped of arrays from k. So our new k would be 6 – 2 = 4.

k is smaller than both med1_pos and med2_pos

It is the same as above except we flip the direction of chopping. We will chop the right side of both array.

k > med1_pos, k > med2_pos and k < med1_pos + med2_pos

A1 = [1,2,3,4,5], A2 = [6,7,8,9,10], med1_pos = 2, med2_pos = 2, k = 4
the kth element will be at position 3 of the combined array.

when k > med1_pos, k > med2_pos and k < med1_pos + med2_pos. We know that the potential subarrays that kth element will be in can not be in the left side of array with a smaller median. Let’s prove it by contradiction; if the kth element is indeed in the left side of the array with smaller median, then it must be smaller than med1 and also med2. So therefore k must be k < med1_pos and k < med2_pos, a contradiction of k > med1_pos.

k also can not be in the right side of the array with larger median. Let’s prove it by contradiction; If it is indeed on the right side of the array with the larger median. We certainly have k > med2_pos. And also the kth element must be larger than med2 and also med1. If we combine A1 and A2, we would find k > med1_pos + med2_pos, a contradiction.

So we will chop off the left side of array with smaller median and the right side of the array with larger median and the new k will be k – med1_pos because we have discarded the left side of the array with larger median.

Base case

We stop when one of A1 or A2 reaches size 0.

Recursive implementation

    def findMedianSortedArrays(self, nums1, nums2):
        """
        :type nums1: List[int]
        :type nums2: List[int]
        :rtype: float
        """
        total_len = len(nums1) + len(nums2)
        if total_len % 2 == 0:
            left_med = self.__find_kth_of_two_sorted_array(nums1, nums2, total_len // 2 - 1)
            right_med = self.__find_kth_of_two_sorted_array(nums1, nums2, total_len // 2 )
            return (left_med + right_med) / 2

        return self.__find_kth_of_two_sorted_array(nums1, nums2, total_len // 2)

    def __find_kth_of_two_sorted_array(self, nums1, nums2, k):

            len1 = len(nums1)
            len2 = len(nums2)
            if len1 == 0:
                return nums2[k]

            if len2 == 0:
                return nums1[k]


            med1_pos = len1 // 2
            med2_pos = len2 // 2

            if k > med1_pos + med2_pos:
                if nums1[med1_pos] >= nums2[med2_pos]:  
                    return self.__find_kth_of_two_sorted_array(nums1, nums2[med2_pos + 1: ], k - med2_pos -1 )

                else: 
                    return self.__find_kth_of_two_sorted_array(nums1[med1_pos + 1 : ], nums2, k - med1_pos - 1)

            else: 

                if nums1[med1_pos] >= nums2[med2_pos]:

                    return self.__find_kth_of_two_sorted_array(nums1[ : med1_pos], nums2, k)
                else:
                    return self.__find_kth_of_two_sorted_array(nums1, nums2[: med2_pos], k)



Use binary search

Leave a Reply

Your email address will not be published. Required fields are marked *

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax