Write a function to see if a
binary tree
↴
Binary trees are a fundamental data structure that organizes items
in a hierarchical, branching way. Each node in a binary tree has a
value and pointers to two other nodes - a left child and a right
child.
Binary trees are surprisingly powerful and efficient. With
balanced implementations, most operations run in logarithmic time,
meaning they stay fast even when the number of nodes is large.
Binary trees are a common interview topic because they test
multiple skills simultaneously: recursion, efficient algorithm
design, and pointer manipulation. In this article, we'll explore
the fundamentals of binary trees, describe common traversal
techniques, and cover a few sample interview questions. By
building a strong foundation, you'll be well-prepared to face a
wide range of coding challenges.
Let's dive in!
There’s an extensive set of terms used to describe properties of
binary trees. Make sure you’re familiar with these, so you’re on
the same page as your interviewer when they describe the problem:
Trees are often measured by their height or depth — the
number of levels between the root and the furthest leaf
node. Depending how it’s defined, a tree with just one node might
have a height of zero or one. Ask your interviewer if you’re
unsure how they’re using the term.
Depending on their structure, some binary trees can have special names.
A Binary Search Tree is a specialized binary tree optimized for efficient searching operations.
In a binary search tree, when looking at any node: every value in its left subtree is smaller than the node's value, and every value in its right subtree is larger.
How does this make for efficient search operations? Starting at
the root node, we compare the value of our target to the value
stored in our current node. If our target is smaller, we move into the left subtree; if it's larger, we move into the right subtree; and if it's equal, we've found our node.
If a binary search tree is balanced, then each time we descend
into a subtree we’ll eliminate roughly half of the remaining
nodes. That means our search operation will
be O(lg(n)).
Some implementations of binary search trees are self-balancing,
meaning their search, insert, and delete operations will always be
O(lg(n)). Other implementations might not be, though. In those
cases, binary search trees can degenerate into a
linked list,
meaning search operations could be O(n). Yikes!
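To make that search concrete, here's a minimal sketch of an iterative BST lookup. It assumes a node class with value, left, and right attributes (like the BinaryTreeNode class shown later in this article); the function name bst_contains is just for illustration.

def bst_contains(root, target):
    # Walk down from the root, discarding half the remaining tree
    # (if it's balanced) at each comparison.
    current = root
    while current is not None:
        if target == current.value:
            return True
        elif target < current.value:
            current = current.left   # target is smaller: go left
        else:
            current = current.right  # target is larger: go right
    return False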
During a coding interview, always be explicit about your
assumptions. If your input is a tree, is it a binary search tree?
Is it a balanced binary search tree? Make sure you’re in alignment
with your interviewer to ensure you’re solving the problem they
want you to focus on.
When working with binary trees, a common operation is to visit
each node exactly once to process or inspect its data. This
process is known as tree traversal. Recognizing the appropriate
traversal technique for a given problem is a crucial skill in
coding interviews.
Because of the inherent recursive structure of trees (each node
has smaller left and right subtrees), tree traversals are often
implemented using recursive algorithms. Let's explore the common
traversal methods.
In an in-order traversal, we visit nodes in the following order: the entire left subtree first, then the current node, then the entire right subtree.
Here’s the order we’d visit each node in our tree using an in-order traversal:
In-order traversal is particularly useful when working with Binary
Search Trees (BSTs). Because of the BST property (smaller values
in the left subtree, larger values in the right subtree), an
in-order traversal will visit the nodes in ascending sorted
order. This makes it useful for operations like retrieving a BST's values in sorted order or validating a BST, as sketched below.
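Here's a minimal recursive in-order traversal sketch, assuming the BinaryTreeNode class shown later in this article; visit is whatever callback you want to run on each value. Moving the visit call to the front gives pre-order, and moving it to the end gives post-order.

def in_order(node, visit):
    # Left subtree, then the current node, then the right subtree.
    if node is None:
        return
    in_order(node.left, visit)
    visit(node.value)
    in_order(node.right, visit)

# On a BST, in_order(root, print) prints the values in ascending order.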
In a pre-order traversal, we visit nodes in this order: the current node first, then its left subtree, then its right subtree.
Here’s what that order looks like:
Pre-order traversal is often used in scenarios like copying or serializing a tree, since each node is visited before its children.
In a post-order traversal, the node visiting order is: the left subtree first, then the right subtree, then the current node itself.
Going back to our sample tree, here's the order when traversing this way:
Post-order traversal is frequently employed when deleting a tree or computing values that depend on both subtrees, since children are fully processed before their parent.
This one is different from the others. In a level-order traversal,
the tree is explored level by level, starting from the root.
Walking our example tree one last time:
Level-order traversal is typically implemented iteratively using a queue. It's useful for processing the tree level by level and for finding the shallowest node that meets some condition; a sketch follows below.
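A minimal level-order sketch using collections.deque, again assuming the BinaryTreeNode class from this article:

from collections import deque

def level_order(root):
    # Returns every node's value, level by level, left to right.
    if root is None:
        return []
    values = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        values.append(node.value)
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
    return values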
Let's analyze the time and space complexity of these traversal techniques:
Time Complexity: O(N)—
where N is the number of nodes in the tree. In
each traversal, we visit every node exactly once and perform a
constant amount of work at each node (assuming each visit
operation is O(1)).
Space
Complexity: O(h)—where h
is the height of the tree. The space complexity is primarily
determined by the recursive call stack. In the worst case, for a
skewed or unbalanced tree (e.g., a linked list-like tree), the
height can be N, leading
to O(N) space complexity. However, for a
balanced binary tree, the height is
approximately lg(N), resulting in O(lg(N)) space complexity.
Time Complexity: O(N)
— similar to the recursive traversals, we visit each node
once and do O(1) work during the visit.
Space Complexity: O(W)
— where W is the maximum width of the
tree. In the worst-case scenario, a complete binary tree, the
maximum width is approximately N/2 (at the last
level). Therefore, the space complexity can
be O(N) in the worst case, as the queue
might hold a significant portion of the nodes at a given level. In
more sparse trees, the space complexity might be less.
Interviews often test your ability to apply traversals and tree
thinking to solve specific problems. Let’s look at a few classic
examples to see how the concepts discussed above come up in
practice.
Problem Statement: Given a binary tree,
determine if the tree is a valid binary search tree.
General Approach: Remember that a BST has the
property that for each node, all values in its left subtree are
smaller than the node's value, and all values in its right subtree
are larger.
For each node, we maintain a valid range of values it can
take. Starting from the root, the range is initially unbounded
(negative infinity to positive infinity). As we traverse down, stepping into a left child tightens the upper bound to the current node's value, and stepping into a right child tightens the lower bound.
At each node, we check if the node's value falls within its
allowed range. If it does, we update the range and recursively
check the left and right subtrees. If not, we short circuit and
return that the tree is not a BST.
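A minimal sketch of that range-based check, assuming the same node class; note that it treats duplicate values as invalid, which is one of the edge cases worth raising with your interviewer.

def is_valid_bst(node, lower=float('-inf'), upper=float('inf')):
    # Every node's value must fall strictly inside (lower, upper).
    if node is None:
        return True
    if not (lower < node.value < upper):
        return False
    # Left children tighten the upper bound; right children tighten the lower bound.
    return (is_valid_bst(node.left, lower, node.value) and
            is_valid_bst(node.right, node.value, upper))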
Alternate Approach: Generate an in-order
traversal of the tree and then check if the resulting list is in
ascending order.
Edge Cases: What if the tree is empty? What if the tree contains duplicate values?
Problem Statement: Given a binary tree and two
nodes (let's call them node1
and node2), find their Lowest Common
Ancestor (LCA). The LCA is defined as the root of the smallest
subtree that contains both node1
and node2.
General Approach: When examining a node and its subtrees, there are three possible cases: both target nodes are in the left subtree (so the LCA is there too), both are in the right subtree, or the two nodes sit on different sides (or one of them is the current node itself), in which case the current node is the LCA. A recursive sketch follows below.
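One way to code that up, as a sketch that assumes both nodes are actually present in the tree and can be compared by identity:

def lowest_common_ancestor(root, node1, node2):
    # Base cases: empty subtree, or we've found one of the targets.
    if root is None or root is node1 or root is node2:
        return root
    left = lowest_common_ancestor(root.left, node1, node2)
    right = lowest_common_ancestor(root.right, node1, node2)
    # A target was found on each side, so this node is the split point: the LCA.
    if left and right:
        return root
    # Otherwise pass up whichever side found something (or None).
    return left or right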
Edge Cases: What happens if one of the nodes
isn’t in the tree? Are there possible optimizations if we know the
tree is a binary search tree? What if we could add parent pointers
to the tree nodes, allowing us to walk upwards from both of the
nodes?
Problem Statement: Determine if there’s a path
from the root to a leaf that sums to a target sum.
General Approach: Recursion shines here. Subtract each node's value from the target as we walk down; when we reach a leaf, the path works if the remaining target is exactly zero. A sketch follows below.
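A minimal recursive sketch, assuming paths run root-to-leaf and the node class from earlier:

def has_path_sum(node, target):
    # An empty tree has no root-to-leaf paths.
    if node is None:
        return False
    remaining = target - node.value
    # At a leaf, the path works only if this node uses up the whole target.
    if node.left is None and node.right is None:
        return remaining == 0
    return has_path_sum(node.left, remaining) or has_path_sum(node.right, remaining)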
Edge Cases: What if we wanted to count how many
paths there were? Or, what if a path didn’t have to run from the
root all the way to a leaf, but instead was any sequence of
adjacent nodes?
Problem Statement: Determine whether two binary trees are mirrors of each other.
General Approach: Another elegant recursive algorithm: two trees are mirrors if their root values match, the left subtree of one mirrors the right subtree of the other, and vice versa. A sketch follows below.
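A sketch of that recursion, comparing two trees; to check whether a single tree is symmetric, call it on the root's left and right subtrees.

def are_mirrors(a, b):
    # Two empty subtrees mirror each other.
    if a is None and b is None:
        return True
    # One empty, one not: no mirror.
    if a is None or b is None:
        return False
    # Values must match, and each subtree must mirror the opposite one.
    return (a.value == b.value and
            are_mirrors(a.left, b.right) and
            are_mirrors(a.right, b.left))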
Edge Case: What if we didn’t care about the
values, but only wanted to check that the tree structures were
mirrors of each other?
Sometimes, the problem won't explicitly scream "binary tree!" You
need to learn to spot the subtle clues that hint at a tree-based
solution. Look for these indicators: hierarchical relationships in the data, recursively defined structures, or a need to keep data ordered while still supporting fast search and insertion.
While binary tree operations often have logarithmic time
complexity (assuming they’re balanced), space complexity can
sometimes be a concern, especially with recursion. Here are a few
techniques to consider for space optimization: replacing recursion with an iterative traversal and an explicit stack, and processing nodes as you visit them rather than storing an entire traversal in memory.
Prioritize clarity and correctness over premature
optimization. In an interview setting, it's generally
better to first implement a correct and understandable
solution, even if it's not the most space-optimized. If space
complexity becomes a concern or if the interviewer
specifically asks about space optimization, then you can
discuss or implement iterative approaches or more advanced
techniques.
Always consider these edge cases when working with trees: an empty tree, a single-node tree, a heavily skewed (linked-list-like) tree, and trees containing duplicate values.
In this guide, we’ve covered fundamental tree concepts, traversal
techniques, and worked up to complex problems and edge cases. From
understanding basic terminology to tackling classic interview
problems, you've built a solid foundation.
Remember, binary trees are more than just abstract concepts; they
are powerful tools for representing hierarchical data and solving
a wide array of computational problems. In an interview, remember
to be explicit about your assumptions (is it a BST? is it balanced?), pick a traversal that fits the problem, analyze time and space complexity, and check the edge cases above.
With dedication and consistent practice, you'll be able to
confidently "branch out" and conquer any binary tree problem that
comes your way in your next coding interview.
Good luck!
class BinaryTreeNode(object):
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
Nodes, Edges, and Tree Terminology
Types of Binary Trees
Binary Search Trees (BST) - Special Properties and Uses
Tree Traversal Techniques
In-order traversal
Pre-order traversal
Post-order Traversal
Level-order Traversal (Breadth First Search)
Complexity Analysis
For Recursive Traversal Algorithms (In-order, Pre-order, Post-order):
For Level-order Traversal (Breadth-First Search):
Classic Binary Tree Interview Problems
Validate a Binary Search Tree
Lowest Common Ancestor
Path Sum Problem
Binary Tree Mirror
Binary Tree Problem-Solving Strategies
How to Recognize an Interview Problem is a Binary Tree Problem
Space Optimization Techniques
Common Edge Cases to Consider
Conclusion: Branch Out and Conquer Tree Challenges
is "superbalanced."
A tree is "superbalanced" if the difference between the depths of any two leaf nodes is no greater than one.
↴
A leaf node is a tree node with no children.
It's the "end" of a path to the bottom, from the root.
Here's a sample binary tree node class:
class BinaryTreeNode(object):
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

    def insert_left(self, value):
        self.left = BinaryTreeNode(value)
        return self.left

    def insert_right(self, value):
        self.right = BinaryTreeNode(value)
        return self.right
Your first thought might be to write a recursive function, thinking, "the tree is balanced if the left subtree is balanced and the right subtree is balanced." This kind of approach works well for some other tree problems.
But this isn't quite true. Counterexample: suppose that from the root of our tree:
Both subtrees are balanced, but from the root we will have leaves at 3 different depths.
We could instead have our recursive function get the list of distinct leaf depths for each subtree. That could work fine. But let's come up with an iterative solution instead. It's usually better to use an iterative solution instead of a recursive one because it avoids
stack overflow.
↴
The call stack is what a program uses to keep
track of function calls. The call stack is
made up of stack frames—one for
each function call.
For instance, say we called a function that
rolled two dice and printed the sum.
First, our program calls roll_two_and_sum(). It goes
on the call stack:
That function calls roll_die(), which gets pushed
on to the top of the call stack:
Inside of roll_die(), we call random.randint().
Here's what our call stack looks like then:
When random.randint() finishes, we return back to
roll_die() by removing
("popping") random.randint()'s stack frame.
Same thing when roll_die() returns:
We're not done yet! roll_two_and_sum()
calls roll_die() again:
Which calls random.randint() again:
random.randint() returns, then roll_die() returns,
putting us back in roll_two_and_sum():
Which calls print():
What actually goes in a function's
stack frame?
A stack frame usually stores: the function's local variables, the arguments passed to the function, and the return address (where to resume in the calling function when this one finishes).
Some of the specifics vary between processor architectures. For
instance, AMD64 (64-bit x86) processors pass some arguments in
registers and some on the call stack. And, ARM processors (common
in phones) store the return address in a special register instead
of putting it on the call stack.
Each function call creates its own stack
frame, taking up space on the call stack. That's important
because it can impact the space complexity of an algorithm.
Especially when we use recursion.
For example, if we wanted to multiply all the numbers
between 1 and n,
we could use this recursive approach:
What would the call stack look like
when n = 10?
First, product_1_to_n() gets called
with n = 10:
This calls product_1_to_n() with
n = 9.
Which calls product_1_to_n()
with n = 8.
And so on until we get to n = 1.
Look at the size of all those stack frames! The entire call stack
takes up O(n) space. That's right—we
have an O(n) space cost even though
our function itself doesn't create any data
structures!
What if we'd used an iterative approach instead of a recursive one?
This version takes a constant amount of space. At the beginning of the loop,
the call stack looks like this:
As we iterate through the loop, the local variables change, but we
stay in the same stack frame because we don't call any other
functions.
In general, even though the compiler or interpreter will take
care of managing the call stack for you, it's important to consider the
depth of the call stack when analyzing the space complexity of an
algorithm.
Be especially careful with recursive functions!
They can end up building huge call stacks.
What happens if we run out of space? It's a stack
overflow! In Python 3.6, you'll get
a RecursionError.
If the very last thing
a function does is call
another function, then its stack frame
might not be needed any more. The function
could free up its stack frame before doing its final
call, saving space.
This is called tail call optimization
(TCO). If a recursive function is optimized with TCO, then it
may not end up with a big call stack.
In general, most languages don't provide TCO. Scheme
is one of the few languages that guarantee tail call
optimization. Some Ruby, C, and Javascript
implementations may do it. Python and Java decidedly
don't.
Overview
import random

def roll_die():
    return random.randint(1, 6)

def roll_two_and_sum():
    total = 0
    total += roll_die()
    total += roll_die()
    print(total)

roll_two_and_sum()
Call stack snapshots for that run, with the top of the stack listed first:

1. roll_two_and_sum()
2. roll_die() | roll_two_and_sum()
3. random.randint() | roll_die() | roll_two_and_sum()
4. roll_die() | roll_two_and_sum()
5. roll_two_and_sum()
6. roll_die() | roll_two_and_sum()
7. random.randint() | roll_die() | roll_two_and_sum()
8. roll_two_and_sum()
9. print() | roll_two_and_sum()
What's stored in a stack frame?
The Space Cost of Stack Frames
def product_1_to_n(n):
    return 1 if n <= 1 else n * product_1_to_n(n - 1)
Call stack snapshots for product_1_to_n(10), top of the stack listed first:

1. product_1_to_n() n = 10
2. product_1_to_n() n = 9 | product_1_to_n() n = 10
3. product_1_to_n() n = 8 | product_1_to_n() n = 9 | product_1_to_n() n = 10
...and so on, until the stack is ten frames deep, holding a product_1_to_n() frame for every n from 1 (on top) down to 10 (on the bottom).
def product_1_to_n(n):
    # We assume n >= 1
    result = 1
    for num in range(1, n + 1):
        result *= num
    return result
The single stack frame for product_1_to_n(10) as the loop runs:

product_1_to_n() n = 10, result = 1, num = 1
product_1_to_n() n = 10, result = 2, num = 2
product_1_to_n() n = 10, result = 6, num = 3
product_1_to_n() n = 10, result = 24, num = 4
...
We can do this in O(n) time and O(n) space.
What about a tree with only one leaf node? Does your function handle that case properly?
Sometimes it's good to start by rephrasing or "simplifying" the problem.
The requirement of "the difference between the depths of any two leaf nodes is no greater than 1" implies that we'll have to compare the depths of all possible pairs of leaves. That'd be expensive—if there are n leaves, there are O(n²) possible pairs of leaves.
But we can simplify this requirement to require less work. For example, we could equivalently say: the tree has at most two distinct leaf depths, and those depths are no more than one apart.
If you're having trouble with a recursive approach, try using an iterative one.
To get to our leaves and measure their depths, we'll have to traverse the tree somehow. What methods do we know for traversing a tree?
Depth-first
↴
Depth-first search (DFS) is a method for exploring a
tree or graph. In a DFS, you go as deep as possible down one path
before backing up and trying a different one.
Depth-first search is like walking through a corn maze. You explore
one path, hit a dead end, and go back and try a different one.
Here's how a DFS would traverse this tree, starting with the root:
We'd go down the first path we find until we hit a dead end:
Then we'd do the same thing again—go down a path until we hit a dead end:
And again:
And again:
Until we reach the end.
Depth-first search is often compared with breadth-first search.
Advantages: a DFS typically uses less memory than a breadth-first search, since it only has to remember the current path rather than a whole level, and it maps naturally onto recursion.
Disadvantages: a DFS doesn't necessarily find the shortest path to a node, while a breadth-first search does.
Breadth-First Search is an algorithm that systematically
explores a graph level by level. Starting from a source point, BFS
visits all its immediate neighbors first, then their neighbors,
and so on. This methodical approach makes it incredibly useful for
finding shortest paths in unweighted graphs, exploring tree-like
structures, and solving a variety of problems that come up
frequently in coding interviews.
BFS is a cornerstone algorithm. It's often favored in interviews
because it strikes a balance: it's conceptually straightforward to
understand and implement, yet adaptable to a wide range of
challenging questions.
In this article, we’ll cover core concepts of BFS, look at some
implementations with time / space complexity analysis, and explore
a few different problems where BFS shines. Let's dive in!
Breadth-First Search (BFS) explores a graph or tree level by
level. Imagine dropping a pebble into still water: the ripples
spread outwards in circles. BFS mimics this, starting at a source
node and systematically visiting all nodes at the current level
before moving to the next level. This level-by-level approach
ensures you explore the graph in layers of "hops" from the
starting point.
Starting from one point in the graph,
we look at all the nodes one level away,
then all the nodes two levels away,
then all the nodes three levels away,
and so on - until we’ve seen every node.
The key behind BFS's level-by-level exploration is a
queue.
A queue is a First-In, First-Out (FIFO)
data structure: items are added to the back and removed from the
front.
In BFS, a queue stores the nodes we need to explore. When we visit
a node, we add its unvisited neighbors to the back of the
queue. Then, we process the next node at the
front of the queue. This FIFO behavior ensures that we
explore all nodes at the current "level" before moving to the next
level.
To prevent infinite loops and redundant processing in graphs
(especially those with cycles), we need to keep track of visited
nodes. This is where a
visited set
is essential.
Before adding a neighbor to the queue, we check if it's already in
the visited set. If it is, we skip it. Otherwise, we add it to the
queue and mark it as visited by adding it to the visited
set. Using an
efficient set data
structure, these checks and inserts can be done in
O(1) time on average.
In many BFS problems, especially shortest path problems, you'll
need to reconstruct the path from the starting node to the target
node once you find it. To do this, we use a predecessor
dictionary (or
parent dictionary).
As we explore the graph with BFS, whenever we enqueue a neighbor
node, we record in the predecessor dictionary that we reached this
neighbor from the current node. This dictionary stores each node and its
immediate predecessor in the BFS traversal.
Once BFS finds the target node, we can easily reconstruct the path
by starting at the target node and using the predecessor dictionary to
trace back to its predecessor, then to its predecessor, and so on,
until we reach the starting node. This backtracking process gives
us the shortest path in reverse order, which we can then reverse
to get the correct path.
Path reconstruction takes time proportional to the length of the
shortest path. In the worst case, every node in the graph could be
in the path, so reconstructing takes
O(V) overall.
Let's break down the BFS algorithm into clear steps: add the start node to the queue, the visited set, and the predecessors dictionary; then repeatedly dequeue the node at the front of the queue, stop if it's the target, and otherwise enqueue each unvisited neighbor, marking it visited and recording the current node as its predecessor.
Termination: The algorithm terminates when we reach
our target node or the queue becomes empty.
At this point, you often will reconstruct the path from
start_node to
end_node using the
predecessors dictionary.
Once you’ve coded up an algorithm, interviewers are going to ask you to analyze its time and space complexity.
Time Complexity: O(V+E)
V is the number of vertices (nodes)
and E is the number of edges in the
graph. We visit each vertex at most once (due to the visited
set). For each vertex,
we iterate through its neighbors. In total, across all
vertices, we examine each edge at most twice (once for the
node on each end). Putting it all together, we’ve
got O(V+E) time complexity.
Space Complexity: O(V)
V is the number of vertices. Our three data
structures are the node queue, the
visited set, and the
backtracking dictionary. Because each node gets added to the
queue exactly once, the queue won’t grow
beyond O(V). Our
visited set also grows
to hold all the nodes, O(V)
size. Finally, our backtracking dictionary has one entry per
node, so it’s O(V), too. Putting these
together, we’re still O(V) space.
Because of its versatility, BFS shows up in a lot of
places. Look for these problem structures that suggest BFS is
the appropriate algorithm: shortest paths or fewest moves in an unweighted graph or grid, level-by-level processing of a tree, or exploring everything reachable from a starting point.
When approaching a problem that might be solved with BFS,
consider these steps: identify what the nodes and edges are, pick the starting node(s), decide what counts as reaching the goal, and choose what to track along the way (a visited set, distances, predecessors).
Here are some common interview problems where a BFS-style algorithm
could work:
Problem Description: Given a binary tree,
return its node values in level order traversal. That is,
nodes at each level are visited from left to right, level by
level.
Applying BFS: push the root onto a queue and repeatedly dequeue nodes, enqueueing their children; tracking how many nodes are in the queue at the start of each round groups the values level by level.
Problem Description: Given an ASCII maze
where each step has the same cost, find the shortest path
from the top left to the bottom right.
Applying BFS: treat each open cell as a node and adjacent open cells as neighbors; since every step costs the same, the first time BFS reaches the exit it has found a shortest path.
Problem Description: Given a 2D grid
indicating land and water, count the number of islands.
Applying BFS: treat each land cell as a node with its four adjacent cells as neighbors, and run a BFS (a flood fill) from every unvisited land cell, counting how many times a new fill starts. A sketch follows below.
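A sketch of that flood fill, assuming the grid is a list of lists where 1 marks land and 0 marks water (the exact encoding varies between problem statements):

from collections import deque

def count_islands(grid):
    if not grid:
        return 0
    rows, cols = len(grid), len(grid[0])
    visited = set()
    islands = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1 and (r, c) not in visited:
                # Found an unvisited land cell: it starts a new island.
                islands += 1
                queue = deque([(r, c)])
                visited.add((r, c))
                while queue:
                    row, col = queue.popleft()
                    # Explore the four adjacent cells.
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = row + dr, col + dc
                        if (0 <= nr < rows and 0 <= nc < cols and
                                grid[nr][nc] == 1 and (nr, nc) not in visited):
                            visited.add((nr, nc))
                            queue.append((nr, nc))
    return islands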
Problem Description: Given a start word and a
target word, find the fewest steps required to transform one
into the other, changing only one letter at a time.
As an example, to transform head into tail, we could move through a chain of real words, changing one letter at each step.
Applying BFS: treat each word as a node and connect words that differ by a single letter; a BFS from the start word reaches the target in the fewest steps. A sketch follows below.
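A sketch of that search, assuming every word is lowercase, all words have the same length, and word_list holds the valid intermediate words:

from collections import deque
import string

def word_ladder_steps(start, target, word_list):
    # Returns the number of words in the shortest transformation sequence
    # (including start and target), or 0 if no sequence exists.
    words = set(word_list)
    words.add(target)
    queue = deque([(start, 1)])
    visited = {start}
    while queue:
        word, length = queue.popleft()
        if word == target:
            return length
        # Neighbors are words that differ by exactly one letter.
        for i in range(len(word)):
            for letter in string.ascii_lowercase:
                candidate = word[:i] + letter + word[i + 1:]
                if candidate in words and candidate not in visited:
                    visited.add(candidate)
                    queue.append((candidate, length + 1))
    return 0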
Breadth-First Search is an indispensable algorithm in your
coding interview toolkit. Its systematic level-by-level
exploration makes it a powerful and efficient approach for a
wide range of problems. By mastering the core concepts – the
queue, visited set, predecessor dictionary – and practicing its
implementation, you'll be well-prepared to confidently tackle
BFS-related interview questions.
Remember to: use a queue, track visited nodes, record predecessors when you need to reconstruct a path, and be ready to state the O(V+E) time and O(V) space complexity.
With practice and a solid understanding of BFS, you'll be
ready to impress in your next coding interview! Good luck!
Understanding the Core Concepts of BFS
What is Breadth-First Search?
Key Data Structure: The Queue
The Visited Set
Path Reconstruction: The Predecessor Dictionary
The BFS Algorithm: Step-by-Step
from collections import deque

def reconstruct_path(predecessors, start_node, end_node):
    reversed_shortest_path = []
    # Start from the end of the path and work backwards
    current_node = end_node
    while current_node:
        reversed_shortest_path.append(current_node)
        current_node = predecessors[current_node]
    # Reverse our path to get the right order
    reversed_shortest_path.reverse()  # flip it around, in place
    return reversed_shortest_path  # no longer reversed

def bfs_get_path(graph, start_node, end_node):
    if start_node not in graph:
        raise Exception('Start node not in graph')
    if end_node not in graph:
        raise Exception('End node not in graph')

    queue = deque()
    queue.append(start_node)
    visited = set([start_node])
    predecessors = {start_node: None}

    while len(queue) > 0:
        current_node = queue.popleft()

        # Stop when we reach the end node
        if current_node == end_node:
            return reconstruct_path(predecessors, start_node, end_node)

        for neighbor in graph[current_node]:
            if neighbor not in visited:
                queue.append(neighbor)
                visited.add(neighbor)
                predecessors[neighbor] = current_node

    # If we get here, then we never found the end node
    # so there's NO path from start_node to end_node
    return None
Analyzing Time and Space Complexity of BFS
Applications of BFS: Where is BFS Useful?
Sample BFS Interview Questions
Binary Tree Level Order Traversal
Shortest Path Through a Maze
.####### @#######
..#.#..# @@#.#..#
#.....## #@@@@@##
###.#..# ==> ###.#@.#
#...#.## #...#@##
#.###.#. #.###@#.
#.....## #....@##
#####... #####@@@
Number of Islands
Word Ladder
Conclusion: BFS - A Powerful Tool in Your Interview Toolkit
The worst-case time and space costs of both are the same—you could make a case for either.
But one characteristic of our algorithm is that it could short-circuit and return False as soon as it finds two leaves with depths more than 1 apart. So maybe we should use a traversal that will hit leaves as quickly as possible...
Depth-first traversal will generally hit leaves before breadth-first, so let's go with that. How could we write a depth-first walk that also keeps track of our depth?
We do a depth-first walk through the tree, keeping track of each node's depth as we go.
Each time we hit a leaf with a new depth, there are two ways that our tree might now be unbalanced: we could have more than two distinct leaf depths, or we could have two leaf depths that are more than one apart.
Why are we doing a depth-first walk and not a breadth-first one? As noted above, depth-first traversal generally hits leaves sooner, which lets us short-circuit earlier.
def is_balanced(tree_root):
    # A tree with no nodes is superbalanced, since there are no leaves!
    if tree_root is None:
        return True

    # We short-circuit as soon as we find more than 2 distinct leaf depths
    depths = []

    # We'll treat this list as a stack that will store tuples of (node, depth)
    nodes = []
    nodes.append((tree_root, 0))

    while len(nodes):
        # Pop a node and its depth from the top of our stack
        node, depth = nodes.pop()

        # Case: we found a leaf
        if (not node.left) and (not node.right):
            # We only care if it's a new depth
            if depth not in depths:
                depths.append(depth)

                # Two ways we might now have an unbalanced tree:
                #   1) more than 2 different leaf depths
                #   2) 2 leaf depths that are more than 1 apart
                if ((len(depths) > 2) or
                        (len(depths) == 2 and abs(depths[0] - depths[1]) > 1)):
                    return False
        else:
            # Case: this isn't a leaf - keep stepping down
            if node.left:
                nodes.append((node.left, depth + 1))
            if node.right:
                nodes.append((node.right, depth + 1))

    return True
O(n) time and O(n) space.
For time, the worst case is the tree is balanced and we have to iterate over all n nodes to make sure.
For the space cost, we have two data structures to watch: depths and nodes.
depths will never hold more than three elements, so we can write that off as O(1).
Because we’re doing a depth first search, nodes will hold at most d nodes where d is the depth of the tree (the number of levels in the tree from the root node down to the lowest node). So we could say our space cost is O(d).
But we can also relate d to n. In a balanced tree, d is O(log2(n)). And the more unbalanced the tree gets, the closer d gets to n.
In the worst case, the tree is a straight line of right children from the root where every node in that line also has a left child. The traversal will walk down the line of right children, adding a new left child to nodes at each step. When the traversal hits the rightmost node, nodes will hold half of the n total nodes in the tree. Half is O(n), so our worst case space cost is O(n).
This is an intro to some tree basics. If this is new to you, don't worry—it can take a few questions for this stuff to come together. We have a few more coming up.
Particular things to note:
Focus on the differences between depth-first and breadth-first traversal, and on the strengths and weaknesses of each.
You should also be very comfortable coding each of them up.
One tip: Remember that breadth-first uses a queue, while depth-first uses a stack (either the call stack via recursion, or an explicit stack, as in our solution above).
↴
A queue stores items in a first-in, first-out
(FIFO) order.
Picture a queue like the line outside a busy restaurant. First come, first served.
Queues are easy to implement with linked lists: enqueue by inserting at the tail, dequeue by removing at the head, keeping a pointer to each end so both operations are O(1). A minimal sketch appears below.
You could implement a queue with an array or dynamic array, but it would get kinda messy. Try drawing it out. You'll notice that you'd need to build out a "scoot over" or "re-center" operation that automatically fires when your queue items hit the bottom edge of the array.
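For illustration, here's a minimal linked-list queue sketch (the QueueNode and Queue classes are hypothetical, not from this article) showing why enqueue and dequeue are both O(1):

class QueueNode(object):
    def __init__(self, value):
        self.value = value
        self.next = None

class Queue(object):
    def __init__(self):
        # Track both ends so enqueue and dequeue each take O(1) time.
        self.head = None  # front of the line: dequeue here
        self.tail = None  # back of the line: enqueue here

    def enqueue(self, value):
        node = QueueNode(value)
        if self.tail:
            self.tail.next = node
        else:
            self.head = node
        self.tail = node

    def dequeue(self):
        if self.head is None:
            raise IndexError('dequeue from an empty queue')
        value = self.head.value
        self.head = self.head.next
        if self.head is None:
            self.tail = None
        return value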
A stack stores items in a last-in, first-out (LIFO)
order.
Picture a pile of dirty plates in your sink. As you add more
plates, you bury the old ones further down. When you take a plate
off the top to wash it, you're taking the last plate you put
in. "Last in, first out."
You can implement a stack with either a linked list or a dynamic array—they both work pretty well:
Quick reference for queues (worst case): space O(n), enqueue O(1), dequeue O(1), peek O(1).
Quick reference for stacks (worst case): space O(n), push O(1), pop O(1), peek O(1).
Implementing a stack:
With a linked list: push = insert at head, pop = remove at head.
With a dynamic array: push = append, pop = remove last element.