You only have 3 free questions left (including this one).

But it doesn't have to end here! Sign up for the 7-day coding interview crash course and you'll get a free Interview Cake problem every week.

Write a function to find the 2nd largest element in a binary search tree.

Costs

Balanced Unbalanced (Worst Case)
space O(n)O(n) O(n)O(n)
insert O(lg(n))O(lg(n)) O(n)O(n)
lookup O(lg(n))O(lg(n)) O(n)O(n)
delete O(lg(n))O(lg(n)) O(n)O(n)

Quick reference

A binary search tree is a binary tree where the nodes are ordered in a specific way. For every node:

  • The nodes to the left are smaller than the current node.
  • The nodes to the right are larger than the current node.

Checking if a binary tree is a binary search tree is a favorite question from interviews.

A binary search tree with nodes containing integers. The root node contains the integer 50. Each child node to the left of the root contains integers less than 50, and each child node to the right of the root contains integers greater than 50.

Strengths:

  • Good performance across the board. Assuming they're balanced, binary search trees are good at lots of operations, even if they're not constant time for anything.

    • Compared to a sorted array, lookups take the same amount of time (O(lg(n))O(lg(n))), but inserts and deletes are faster (O(lg(n))O(lg(n)) for BSTs, O(n)O(n) for arrays).
    • Compared to dictionaries, BSTs have better worst case performance—O(lg(n))O(lg(n)) instead of O(n)O(n). But, on average dictionaries perform better than BSTs (meaning O(1)O(1) time complexity).
  • BSTs are sorted. Taking a binary search tree and pulling out all of the elements in sorted order can be done in O(n)O(n) using an in-order traversal. Finding the element closest to a value can be done in O(lg(n))O(lg(n)) (again, if the BST is balanced!).

Weaknesses:

  • Poor performance if unbalanced. Some types of binary search trees balance automatically, but not all. If a BST is not balanced, then operations become O(n)O(n).
  • No O(1)O(1) operations. BSTs aren't the fastest for anything. On average, a list or a dictionary will be faster.

Balanced Binary Search Trees

Two binary search trees can store the same values in different ways:

Two binary search trees: the one on the left is balanced and the one on the right is unbalanced. Both contain the same values: the integers 1 through 6.

Some trees (like AVL trees or Red-Black trees) rearrange nodes as they're inserted to ensure the tree is always balanced. With these, the worst case complexity for searching, inserting, or deleting is always O(lg(n))O(lg(n)), not O(n)O(n).

Here's a sample binary tree node class:

  class BinaryTreeNode(object):

    def __init__(self, value):
        self.value = value
        self.left  = None
        self.right = None

    def insert_left(self, value):
        self.left = BinaryTreeNode(value)
        return self.left

    def insert_right(self, value):
        self.right = BinaryTreeNode(value)
        return self.right

Gotchas

Our first thought might be to do an in-order traversal of the BST

Sometimes we have a BST and we want to go through the items in order from smallest to largest. This is called an "in-order traversal."

We can write a recursive algorithm for this:

  • Everything in the left subtree is smaller, so print that first, then,
  • print the current node, then
  • print the right subtree.
  def inorder_print(node):
    if node:
        inorder_print(node.left)
        print node.value
        inorder_print(node.right)

This takes O(n)O(n) time (we're traversing the whole tree, so we're looking at all nn items) and O(h)O(h) space, where hh is the height of the tree (this is the max depth of the call stack during our recursion).

If the tree is balanced its height is lg(n)lg(n), so we have O(lg(n))O(lg(n)) space. If we can't make any assumptions about the "shape" of the tree, in the worst case it's just one continuous line, giving us a height of nn, and a space complexity of O(n)O(n) for our recursive in-order traversal.

and return the second-to-last item. This means looking at every node in the BST. That would take O(n)O(n) time and O(h)O(h) space, where hh is the max height of the tree (which is lg(n)lg(n) if the tree is balanced,

Formally, a tree is said to be balanced if the difference between the depths of any node's left tree and right tree is no greater than 1.

Thus a 'balanced' tree 'looks full', without any apparent chunks missing or any branches that end much earlier than other branches.

but could be as much as nn if not).

We can do better than O(n)O(n) time and O(h)O(h) space.

We can do this in one walk from top to bottom of our BST. This means O(h)O(h) time (again, that's O(lgn)O(\lg{n}) if the tree is balanced, O(n)O(n) otherwise).

A clean recursive implementation will take O(h)O(h) space in the call stack,

Overview

The call stack is what a program uses to keep track of function calls. The call stack is made up of stack frames—one for each function call.

For instance, say we called a function that rolled two dice and printed the sum.

  def roll_die():
    return random.randint(1, 6)

def roll_two_and_sum():
    total = 0
    total += roll_die()
    total += roll_die()
    print total

roll_two_and_sum()

First, our program calls roll_two_and_sum(). It goes on the call stack:

roll_two_and_sum()

That function calls roll_die(), which gets pushed on to the top of the call stack:

roll_die()
roll_two_and_sum()

Inside of roll_die(), we call random.randint(). Here's what our call stack looks like then:

random.randint()
roll_die()
roll_two_and_sum()

When random.randint() finishes, we return back to roll_die() by removing ("popping") random.randint()'s stack frame.

roll_die()
roll_two_and_sum()

Same thing when roll_die() returns:

roll_two_and_sum()

We're not done yet! roll_two_and_sum() calls roll_die() again:

roll_die()
roll_two_and_sum()

Which calls random.randint() again:

random.randint()
roll_die()
roll_two_and_sum()

random.randint() returns, then roll_die() returns, putting us back in roll_two_and_sum():

roll_two_and_sum()

Which calls print():

print()
roll_two_and_sum()

What's stored in a stack frame?

What actually goes in a function's stack frame?

A stack frame usually stores:

  • Local variables
  • Arguments passed into the function
  • Information about the caller's stack frame
  • The return address—what the program should do after the function returns (i.e.: where it should "return to"). This is usually somewhere in the middle of the caller's code.

Some of the specifics vary between processor architectures. For instance, AMD64 (64-bit x86) processors pass some arguments in registers and some on the call stack. And, ARM processors (common in phones) store the return address in a special register instead of putting it on the call stack.

The Space Cost of Stack Frames

Each function call creates its own stack frame, taking up space on the call stack. That's important because it can impact the space complexity of an algorithm. Especially when we use recursion.

For example, if we wanted to multiply all the numbers between 11 and nn, we could use this recursive approach:

  def product_1_to_n(n):
    return 1 if n <= 1 else n * product_1_to_n(n - 1)

What would the call stack look like when n = 10?

First, product_1_to_n() gets called with n = 10:

    product_1_to_n()    n = 10

This calls product_1_to_n() with n = 9.

    product_1_to_n()    n = 9
    product_1_to_n()    n = 10

Which calls product_1_to_n() with n = 8.

    product_1_to_n()    n = 8
    product_1_to_n()    n = 9
    product_1_to_n()    n = 10

And so on until we get to n = 1.

    product_1_to_n()    n = 1
    product_1_to_n()    n = 2
    product_1_to_n()    n = 3
    product_1_to_n()    n = 4
    product_1_to_n()    n = 5
    product_1_to_n()    n = 6
    product_1_to_n()    n = 7
    product_1_to_n()    n = 8
    product_1_to_n()    n = 9
    product_1_to_n()    n = 10

Look at the size of all those stack frames! The entire call stack takes up O(n)O(n) space. That's right—we have an O(n)O(n) space cost even though our function itself doesn't create any data structures!

What if we'd used an iterative approach instead of a recursive one?

  def product_1_to_n(n):
    # We assume n >= 1
    result = 1
    for num in range(1, n + 1):
        result *= num

    return result

This version takes a constant amount of space. At the beginning of the loop, the call stack looks like this:

    product_1_to_n()    n = 10, result = 1, num = 1

As we iterate through the loop, the local variables change, but we stay in the same stack frame because we don't call any other functions.

    product_1_to_n()    n = 10, result = 2, num = 2

    product_1_to_n()    n = 10, result = 6, num = 3

    product_1_to_n()    n = 10, result = 24, num = 4

In general, even though the compiler or interpreter will take care of managing the call stack for you, it's important to consider the depth of the call stack when analyzing the space complexity of an algorithm.

Be especially careful with recursive functions! They can end up building huge call stacks.

What happens if we run out of space? It's a stack overflow! In Python 2.7, you'll get a RecursionError.

If the very last thing a function does is call another function, then its stack frame might not be needed any more. The function could free up its stack frame before doing its final call, saving space.

This is called tail call optimization (TCO). If a recursive function is optimized with TCO, then it may not end up with a big call stack.

In general, most languages don't provide TCO. Scheme is one of the few languages that guarantee tail call optimization. Some Ruby, C, and Javascript implementations may do it. Python and Java decidedly don't.

but we can bring our algorithm down to O(1)O(1) space overall.

Breakdown

Start your free trial!

Log in or sign up with one click to get immediate access to 3 free mock interview questions

Where do I enter my password?

Actually, we don't support password-based login. Never have. Just the OAuth methods above. Why?

  1. It's easy and quick. No "reset password" flow. No password to forget.
  2. It lets us avoid storing passwords that hackers could access and use to try to log into our users' email or bank accounts.
  3. It makes it harder for one person to share a paid Interview Cake account with multiple people.

Solution

Start your free trial!

Log in or sign up with one click to get immediate access to 3 free mock interview questions

Where do I enter my password?

Actually, we don't support password-based login. Never have. Just the OAuth methods above. Why?

  1. It's easy and quick. No "reset password" flow. No password to forget.
  2. It lets us avoid storing passwords that hackers could access and use to try to log into our users' email or bank accounts.
  3. It makes it harder for one person to share a paid Interview Cake account with multiple people.

Complexity

We're doing one walk down our BST, which means O(h)O(h) time, where hh is the height of the tree (again, that's O(lgn)O(\lg{n}) if the tree is balanced, O(n)O(n) otherwise). O(1)O(1) space.

What We Learned

Start your free trial!

Log in or sign up with one click to get immediate access to 3 free mock interview questions

Where do I enter my password?

Actually, we don't support password-based login. Never have. Just the OAuth methods above. Why?

  1. It's easy and quick. No "reset password" flow. No password to forget.
  2. It lets us avoid storing passwords that hackers could access and use to try to log into our users' email or bank accounts.
  3. It makes it harder for one person to share a paid Interview Cake account with multiple people.

Do you have an answer?

1
2
import unittest
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Reset editor

Powered by qualified.io

. . .