We use cookies to personalize content and ads, to provide social media features and to analyse our traffic. We also share information about your use of our site with our social media, advertising and analytics partners. By clicking "Accept Cookies," scrolling this page, clicking a link or continuing to browse otherwise, you agree to the use of cookies. Privacy and Cookie Policies
Intuitive. Ever arranged your cards while playing
poker or go-fish? Chances are, you used something like insertion sort.
Space efficient. Insertion sort can be done in-place↴
An in-placefunction
modifies data structures or objects outside of its
own stack frame↴
Overview
The call stack is what a program uses to keep
track of function calls. The call stack is
made up of stack frames—one for
each function call.
For instance, say we called a function that
rolled two dice and printed the sum.
defroll_die():return random.randint(1,6)defroll_two_and_sum():
total =0
total += roll_die()
total += roll_die()print total
roll_two_and_sum()
First, our program calls roll_two_and_sum(). It goes
on the call stack:
roll_two_and_sum()
That function calls roll_die(), which gets pushed
on to the top of the call stack:
roll_die()
roll_two_and_sum()
Inside of roll_die(), we call random.randint().
Here's what our call stack looks like then:
random.randint()
roll_die()
roll_two_and_sum()
When random.randint() finishes, we return back to
roll_die() by removing
("popping") random.randint()'s stack frame.
roll_die()
roll_two_and_sum()
Same thing when roll_die() returns:
roll_two_and_sum()
We're not done yet! roll_two_and_sum()
calls roll_die()again:
roll_die()
roll_two_and_sum()
Which calls random.randint() again:
random.randint()
roll_die()
roll_two_and_sum()
random.randint() returns, then roll_die() returns,
putting us back in roll_two_and_sum():
roll_two_and_sum()
Which calls print():
print()
roll_two_and_sum()
What's stored in a stack frame?
What actually goes in a function's
stack frame?
A stack frame usually stores:
Local variables
Arguments passed into the function
Information about the caller's stack frame
The return address—what the program should do
after the function returns (i.e.: where it should "return
to"). This is usually somewhere in the middle of the caller's
code.
Some of the specifics vary between processor architectures. For
instance, AMD64 (64-bit x86) processors pass some arguments in
registers and some on the call stack. And, ARM processors (common
in phones) store the return address in a special register instead
of putting it on the call stack.
The Space Cost of Stack Frames
Each function call creates its own stack
frame, taking up space on the call stack. That's important
because it can impact the space complexity of an algorithm.
Especially when we use recursion.
For example, if we wanted to multiply all the numbers
between 1 and n,
we could use this recursive approach:
defproduct_1_to_n(n):return1if n <=1else n * product_1_to_n(n -1)
What would the call stack look like
when n = 10?
First, product_1_to_n() gets called
with n = 10:
product_1_to_n() n = 10
This calls product_1_to_n() with
n = 9.
product_1_to_n() n = 9
product_1_to_n() n = 10
Which calls product_1_to_n()
with n = 8.
product_1_to_n() n = 8
product_1_to_n() n = 9
product_1_to_n() n = 10
And so on until we get to n = 1.
product_1_to_n() n = 1
product_1_to_n() n = 2
product_1_to_n() n = 3
product_1_to_n() n = 4
product_1_to_n() n = 5
product_1_to_n() n = 6
product_1_to_n() n = 7
product_1_to_n() n = 8
product_1_to_n() n = 9
product_1_to_n() n = 10
Look at the size of all those stack frames! The entire call stack
takes up O(n) space. That's right—we
have an O(n) space cost even though
our function itself doesn't create any data
structures!
What if we'd used an iterative approach instead of a recursive one?
defproduct_1_to_n(n):# We assume n >= 1
result =1for num in range(1, n +1):
result *= num
return result
This version takes a constant amount of space. At the beginning of the loop,
the call stack looks like this:
product_1_to_n() n = 10, result = 1, num = 1
As we iterate through the loop, the local variables change, but we
stay in the same stack frame because we don't call any other
functions.
product_1_to_n() n = 10, result = 2, num = 2
product_1_to_n() n = 10, result = 6, num = 3
product_1_to_n() n = 10, result = 24, num = 4
In general, even though the compiler or interpreter will take
care of managing the call stack for you, it's important to consider the
depth of the call stack when analyzing the space complexity of an
algorithm.
Be especially careful with recursive functions!
They can end up building huge call stacks.
What happens if we run out of space? It's a stack
overflow! In Python 2.7, you'll get
a RecursionError.
If the very last thing
a function does is call
another function, then its stack frame
might not be needed any more. The functioncould free up its stack frame before doing its final
call, saving space.
This is called tail call optimization
(TCO). If a recursive function is optimized with TCO, then it
may not end up with a big call stack.
In general, most languages don't provide TCO. Scheme
is one of the few languages that guarantee tail call
optimization. Some Ruby, C, and Javascript
implementations may do it. Python and Java decidedly
don't.
(i.e.: stored on
the process heap or in
the stack frame of a calling function). Because of this, the
changes made by the function remain after
the call completes.
In-place algorithms are sometimes called
destructive, since the original input is
"destroyed" (or modified) during
the function call.
Careful: "In-place" does not mean "without
creating any additional variables!" Rather, it means
"without creating a new copy of the input." In general, an
in-place function will only create
additional variables that are O(1) space.
An out-of-placefunction
doesn't make any changes that are visible to
other functions. Usually,
those functions copy any data structures or objects
before manipulating and changing them.
In many languages, primitive values (integers,
floating point numbers, or characters) are copied when passed as
arguments, and more complex data structures
(lists, heaps, or hash tables) are
passed by
reference. This is what Python does.
Here are two functions that do the same
operation on a list, except one is
in-place and the other is out-of-place:
defsquare_list_in_place(int_list):for index, element in enumerate(int_list):
int_list[index]*= element
# NOTE: no need to return anything - we modified# int_list in placedefsquare_list_out_of_place(int_list):# We allocate a new list with the length of the input list
squared_list =[None]* len(int_list)for index, element in enumerate(int_list):
squared_list[index]= element **2return squared_list
Working in-place is a good way to save time and
space. An in-place algorithm avoids the cost of
initializing or copying data structures, and it usually has
an O(1) space cost.
But be careful: an in-place algorithm can cause side effects.
Your input is "destroyed" or "altered," which can affect
code outside of your function. For
example:
Generally, out-of-place algorithms are considered safer
because they avoid side effects. You should only use an
in-place algorithm if you're space constrained or
you're positive you don't need the original input
anymore, even for debugging.
, requiring O(1) additional space.
Fast on a sorted list. If the input
list is already sorted,
then insertion sort runs in O(n) time.
Weaknesses:
Slow. Insertion sort usually
takes O(n2) time—too slow
to be used on super-big data sets.
How It Works
Insertion sort works by inserting
elements from an unsorted list into
a sorted subsection of the
list, one item at a time.
Here's how it'd work on this list:
We'll break the the list into two chunks:
a sorted portion and an unsorted portion.
(The 0th item is "sorted" to start.)
Now we add the next item, 3, to our sorted portion.
It goes before 8, so we need to swap them.
Now the first two elements in
the list are sorted.
Next up we've got a 2.
It goes before the 8, so we'll swap them.
2 is also smaller than 3, so we need to swap those as well.
At this point, the first three elements are sorted.
As you can see, the idea is to "swap" each new item to the left until it's in the right spot.
The next unsorted item is a 7. It'll need to be swapped with the
8.
Once that's done, we can compare 7 and 3. Since 7 is bigger,
we don't need to swap them. So we're done moving the 7 down.
Now we've got the first four elements in sorted order.
9 is next, and it's already in the right spot.
We'll have to swap a bunch of elements to get the 1 down to the
start of the list
Finally, we'll swap the 4 over to the right spot.
Implementation
Let's code it up:
definsertion_sort(the_list):# For each item in the input listfor index in xrange(len(the_list)):# Shift it to the left until it's in the right spotwhile index >0and the_list[index -1]>= the_list[index]:
the_list[index], the_list[index -1]=\
the_list[index -1], the_list[index]
index -=1
Time Complexity
Worst Case
In the worst case, the input list
is in descending order (reverse-sorted order). So each time we
insert an element into the sorted portion, we'll need to swap it
with each of the elements already in the
sorted list to get it all the way
to the start. That's 1 swap the first time, 2 swaps the second
time, 3 swaps the third time, and so on, up to n−1 swaps for the last item.
Adding them all up:
1+2+3+…+(n−2)+(n−1)
That's the triangular series,
whose sum is O(n2)↴
A triangular series is a series of numbers where each number could be the row of an equilateral triangle.
So 1, 2, 3, 4, 5 is a triangular series, because you could stack the numbers like this:
Their sum is 15, which makes 15 a triangular number.
A triangular series always starts with 1 and increases by 1 with each number.
Since the only thing that changes in triangular series is the value of the highest number, it’s helpful to give that a name. Let’s call it n.
# n is 81,2,3,4,5,6,7,8
Triangular series are nice because no matter how large n is, it’s always easy to find the total sum of all the numbers.
Take the example above. Notice that if we add the first and last numbers together, and then add the second and second-to-last numbers together, they have the same sum! This happens with every pair of numbers until we reach the middle. If we add up all the pairs of numbers, we get:
1+8=92+7=93+6=94+5=9
This is true for every triangular series:
Pairs of numbers on each side will always add up to the same value.
That value will always be 1 more than the series’ n.
This gives us a pattern. Each pair's sum is n+1, and there are 2n pairs. So our total sum is:
(n+1)∗2n
Or:
2n2+n
Ok, but does this work with triangular series with an odd number of elements? Yes. Let's say n=5. So if we calculate the sum by hand:
1+2+3+4+5=15
And if we use the formula, we get the same answer:
252+5=15
One more thing:
What if we know the total sum, but we don't know the value of n?
Let’s say we have:
1+2+3+…+(n−2)+(n−1)+n=78
No problem. We just use our formula and set it equal to the sum!
2n2+n=78
Now, we can rearrange our equation to get a quadratic equation (remember those?)
n2+n=156n2+n−156=0
Here's the quadratic formula:
2a−b±b2−4ac
If you don't really remember how to use it, that's cool. You can just use an online calculator. We don't judge.
Taking the positive solution, we get n=12.
So for a triangular series, remember—the total sum is:
2n2+n
. Each swap costs constant time, so that's O(n2) time overall.
Best Case
In the best case, the
input list is already sorted. We'll still need to
compare every element to the one before it, to check if it needs to be swapped. But the answer will always be "no." So we sidestep the swapping entirely and just do those O(n) comparisons, costing
O(n) time overall.
Space Complexity
Insertion sort doesn't rely on any
extra lists, so
it's O(1) space.
Actually, we don't support password-based login. Never have. Just the OAuth methods above. Why?
It's easy and quick. No "reset password" flow. No password to forget.
It lets us avoid storing passwords that hackers could access and use to try to log into our users' email or bank accounts.
It makes it harder for one person to share a paid Interview Cake account with multiple people.
“I started at Google as of this week! Interview Cake was the primary studying tool I used before the onsite interview and it prepared me well for what they asked.
—
Victor