A comparison-based, divide-and-conquer algorithm that partitions the input into equal halves and then combines those in a sorted manner.
Like the quick sort algorithm, Merge Sort is also a divide and conquer based algorithm. But in contrast, it performs the maximum work while combining the individual partitions instead of during the partitioning itself (as in quick sort).
The main idea behind merge sort is to recursively divide the input into halves until no more partitions can be created (up to single or no element partition). Once this stage is reached, the individual partitions are combined in a sorted manner, resulting in the sorted input.
Also, unlike quick sort, the initial ordering of the input does not impact the merge sort’s runtime.
Consider the following example: [3,5,1,9,7,8]
The input is partitioned into subsets till it cannot be further subdivided: [3], [5], [1], [9], [7], [8]
. On the way back up the recursive tree, the individual partitions are combined in a sorted order to make the complete input sorted:
Java implementation for the above algorithm is as follows:
public void sort(int[] input, int start, int end){
if(start < end){
int mid = (start + end) / 2;
// array[start...mid]
sort(input, start, mid);
// array[mid+1.. end]
sort(input, mid + 1, end);
// merge the array in sorted order
// array[start... mid ... end]
merge(input, start, mid, end);
}
}
The above method will divide the given input into partitions. The merge
method mentioned below is then responsible for combining those individual partitions in a sorted manner:
private void merge(int[] input, int start, int mid, int end){
// create new temp array of size end-start+1
int[] temp = new int[end - start + 1];
int i = start, j = mid + 1, k = 0;
while(i <= mid && j <= end){
if(input[i] < input[j]){
temp[k++] = input[i++];
} else{
temp[k++] = input[j++];
}
}
// copy remaining elements
while(i <= mid){
temp[k++] = input[i++];
}
while(j <= end){
temp[k++] = input[j++];
}
// copy from temp to input
System.arraycopy(temp, 0, input, start, temp.length);
}
The overall complexity of the algorithm is \(𝑂(n * log_2n)\) irrespective of where the partition happens (balanced or unbalanced). Additionally, it requires a space of \(O(n)\).
One thing to note here is that the implementation is highly parallelizable, where we can have separate threads for separate partitions and can then finally combine the individual results. See this Wikipedia article for details.
Be notified of new posts. Subscribe to the RSS feed.