Quick reference
Complexity | |
---|---|
Worst case time | |
Best case time | |
Average case time | |
Space |
Strengths:
- Fast. Heap sort runs in time, which scales well as n grows. Unlike quicksort, there's no worst-case complexity.
- Space efficient. Heap sort takes space. That's way better than merge sort's overhead.
Weaknesses:
- Slow in practice. While the asymptotic complexity of heap sort makes it look faster than quicksort, in real systems heap sort is often slower. (Remember, n and 2n are both , even though the first is twice as fast as the second. Heap sort's hides constant factors, but they still impact overall performance.)
The High-Level Idea
Heapsort is similar to selection sort—we're repeatedly choosing the largest item and moving it to the end of our array. The main difference is that instead of scanning through the entire array to find the largest item, we convert the array into a max heap to speed things up.
Heapify
Our first step is to transform the input array into a heap (a.k.a. "heapify" it).
Take this example input:
Instead of treating the input like an array, we can treat it like the nodes in a complete binary tree. The 0^{th} item in the array is the root; the 1^{st} item in the array is the root's left child, and so on.
We're not actually creating a tree here. We're just treating the input like it specifies the nodes in a tree.
When we manipulate the tree by moving around items or removing elements, we're actually rearranging the underlying input array (not some separate tree structure).
We'll show both the tree and the array in our diagrams, but the tree is just to help us visualize how we're interpreting the array as a heap. In memory, we're just storing the array.
Our tree will need some fixups in order to make it into a valid heap. For instance, it's definitely not valid now since the largest element (9) isn't at the root.
To transform the tree into a heap, we'll work from the bottom of the tree upwards. We'll compare each node to its children and move nodes around so that parent nodes are always larger than their children.
This causes smaller nodes to move lower in the tree, "bubbling down" to allow larger values to reach the top.
To start off, let's look at the leaves.
The leaf nodes don't have any children, so they don't need to move down at all. Great.
Let's look at the nodes in the next level:
We'll start with the left node (3) and its children:
Since 7 and 9 are both greater than 3, we definitely need to move some items around. We'll swap the 3 and the 9 to make the parent larger than its children. (If we'd swapped the 3 with the 7, then we'd still have a problem since the parent node (7) would be smaller than one of its children (9).)
Next we'll look at the right node (2) and its children. Since 4 is larger than 2, we'll swap them.
Notice how we've got two small valid max-heaps. We're getting close!
Moving up, we've got an 8 at the root.
Since 8 is smaller than 9, 8 bubbles down, swapping places with the larger child: 9.
Then, we need to compare 8 to its two children—7 and 3. Since 8 is bigger than both of them, we're done bubbling down and don't need to do any additional swaps.
At this point, we've transformed the input tree into a valid max heap. Nice!
Repeatedly Removing The Max
Each time we remove an element from the heap, it's the largest item in the underlying array. So, in sorted order, it belongs at the end of the array. As we'll see, removing an item from the heap conveniently frees up space at the end of the underlying array where we can put the removed item.
Here's what that looks like. Taking our heap from earlier:
We'll remove the max element.
That leaves a gap at the root that needs to be filled. We'll put our right-most item (2) there.
But this isn't a valid heap anymore. Our largest item (8) isn't at the root. So we'll "bubble down" to restore our heap (just like in our heapify step).
Notice the empty hole at the end of the array? We'll fill that in with the 9 we removed.
We've got the largest item at the end of the array. Progress! We'll continue this strategy, removing the next largest item from our heap and adding it to our growing sorted section at the end of the array.
The next-largest element is 8. We'll remove it from the root, filling its spot with the bottom, right-most element (1).
To finish up the removal, we have to bubble down the 1 to make the heap valid again.
Finally, we can use the freed space where the 1 used to be to store the removed max value (8).
Removing the next-largest value (7), we've got:
And after "bubbling down":
Next up, 4:
And after "bubbling down":
You can probably finish the rest yourself. :)
Implementation
Here's how we'd code it up:
Complexity
For the heapify step, we're examining every item in the tree and moving it downwards until it's larger than its children. Since our tree height is , we could do up to moves. Across all n nodes, that's an overall time complexity of .
We've shown that the heapify step is . With a bit more complex analysis, it turns out to actually be . Pretty cool!
After transforming the tree into a heap, we remove all n elements from it—one item at a time. Removing from a heap takes time, since we have to move a new value to the root of the heap and bubble down. Doing n remove operations will be time.
Is this analysis too pessimistic? Each time we remove an item from the heap it gets smaller, so won't later removes be cheaper than earlier ones?
A more thorough analysis shows that doing n removals is still .
Putting these steps together, we're at time in the worst case (and on average).
But what happens if all the items in the input are the same?
Every time we remove an element from the tree root, the item replacing it won't have to bubble down at all. In that case, each remove takes time, and doing n remove operations takes .
So the best case time complexity is . This is the runtime when everything in the input is identical.
Since we cleverly reused available space at the end of the input array to store the item we removed, we only need space overall for heapsort.
Interview coming up?
Get the free 7-day email crash course. You'll learn how to think algorithmically, so you can break down tricky coding interview questions.
No prior computer science training necessary—we'll get you up to speed quickly, skipping all the overly academic stuff.
No spam. One-click unsubscribe whenever.
You're in! Head over to your email inbox right now to read day one!