Breadth-First Search (BFS) - Shortest Paths in Unweighted Graphs

Breadth-First Search is an algorithm that systematically explores a graph level by level. Starting from a source point, BFS visits all its immediate neighbors first, then their neighbors, and so on. This methodical approach makes it incredibly useful for finding shortest paths in unweighted graphs, exploring tree-like structures, and solving a variety of problems that come up frequently in coding interviews.

BFS is a cornerstone algorithm. It's often favored in interviews because it strikes a balance: it's conceptually straightforward to understand and implement, yet adaptable to a wide range of challenging questions.

In this article, we’ll cover core concepts of BFS, look at some implementations with time / space complexity analysis, and explore a few different problems where BFS shines. Let's dive in!

Understanding the Core Concepts of BFS

What is Breadth-First Search?

Breadth-First Search (BFS) explores a graph or tree level by level. Imagine dropping a pebble into still water: the ripples spread outwards in circles. BFS mimics this, starting at a source node and systematically visiting all nodes at the current level before moving to the next level. This level-by-level approach ensures you explore the graph in layers of "hops" from the starting point.

Starting from one point in the graph,

A 4-row binary tree represented by circles connected with lines. Our breadth-first search has us start at the root node at the top of the tree.

we look at all the nodes one level away,

The same 4-row binary tree with all nodes at depth 1 (second row) bolded after being visited.

then all the nodes two levels away,

The same 4-row binary tree with all nodes at depth 2 (third row) bolded after being visited.

then all the nodes three levels away,

The same 4-row binary tree with all nodes at depth 3 (fourth and final row) bolded after being visited.

and so on - until we’ve seen every node.

Key Data Structure: The Queue

The key behind BFS's level-by-level exploration is a queue. A queue is a First-In, First-Out (FIFO) data structure: items are added to the back and removed from the front.

Loading...

In BFS, a queue stores the nodes we need to explore. When we visit a node, we add its unvisited neighbors to the back of the queue. Then, we process the next node at the front of the queue. This FIFO behavior ensures that we explore all nodes at the current "level" before moving to the next level.

The Visited Set

To prevent infinite loops and redundant processing in graphs (especially those with cycles), we need to keep track of visited nodes. This is where a visited set is essential.

Before adding a neighbor to the queue, we check if it's already in the visited set. If it is, we skip it. Otherwise, we add it to the queue and mark it as visited by adding it to the visited set. Using an efficient set data structure, these checks and inserts can be done in time on average.

Path Reconstruction: The Predecessor Dictionary

In many BFS problems, especially shortest path problems, you'll need to reconstruct the path from the starting node to the target node once you find it. To do this, we use a predecessor dictionary (or parent dictionary).

As we explore the graph with BFS, whenever we enqueue a neighbor node, we record in the predecessor dictionary that we reached this neighbor from the current node. This dictionary stores each node and its immediate predecessor in the BFS traversal.

Once BFS finds the target node, we can easily reconstruct the path by starting at the target node and using the predecessor dictionary to trace back to its predecessor, then to its predecessor, and so on, until we reach the starting node. This backtracking process gives us the shortest path in reverse order, which we can then reverse to get the correct path.

Path reconstruction takes time proportional to the length of the shortest path. In the worst case, every node in the graph could be in the path, so reconstructing takes overall.

The BFS Algorithm: Step-by-Step

Let's break down the BFS algorithm into clear steps:

  1. Initialization:
    • Create an empty queue, queue.
    • Create an empty predecessor dictionary, predecessors
    • Create an empty set, visited
    • Enqueue the starting node into the queue.
    • Add the starting node to visited and predecessors.
  2. Iteration (while the queue is not empty):
    • Dequeue a node from the front of the queue. Let's call this current_node.
    • Process current_node. This might involve:
      • Checking if it's the target node you're searching for.
      • Performing some operation based on the node's value.
      • Simply noting that you've visited it.
    • Explore Neighbors: Get all valid, unvisited neighbors of current_node.
    • For each unvisited neighbor:
      • Enqueue the neighbor into the queue.
      • Add it to visited.
      • In predecessors, record that we got to this neighbor via current_node
  3. Termination: The algorithm terminates when we reach our target node or the queue becomes empty.

    At this point, you often will reconstruct the path from start_node to end_node using the predecessors dictionary.

from collections import deque def reconstruct_path(predecessors, start_node, end_node): reversed_shortest_path = [] # Start from the end of the path and work backwards current_node = end_node while current_node: reversed_shortest_path.append(current_node) current_node = predecessors[current_node] # Reverse our path to get the right order reversed_shortest_path.reverse() # flip it around, in place return reversed_shortest_path # no longer reversed def bfs_get_path(graph, start_node, end_node): if start_node not in graph: raise Exception('Start node not in graph') if end_node not in graph: raise Exception('End node not in graph') queue = deque() queue.append(start_node) visited = set([start_node]) predecessors = {start_node: None} while len(queue) > 0: current_node = queue.popleft() # Stop when we reach the end node if current_node == end_node: return reconstruct_path(predecessors, start_node, end_node) for neighbor in graph[current_node]: if neighbor not in visited: queue.append(neighbor) visited.add(neighbor) predecessors[neighbor] = current_node # If we get here, then we never found the end node # so there's NO path from start_node to end_node return None

Analyzing Time and Space Complexity of BFS

Once you’ve coded up an algorithm, interviewers are going to ask you to analyze its time and space complexity.

Time Complexity:

V is the number of vertices (nodes) and E is the number of edges in the graph. We visit each vertex at most once (due to the visited set). For each vertex, we iterate through its neighbors. In total, across all vertices, we examine each edge at most twice (once for the node on each end). Putting it all together, we’ve got time complexity.

Space Complexity:

V is the number of vertices. Our three data structures are the node queue, the visited set, and the backtracking dictionary. Because each node gets added to the queue exactly once, the queue won’t grow beyond . Our visited set also grows to hold all the nodes, size. Finally, our backtracking dictionary has one entry per node, so it’s , too. Putting these together, we’re still space.

Applications of BFS: Where is BFS Useful?

Because of its versatility, BFS shows up in a lot of places. Look for these problem structures that suggest BFS is the appropriate algorithm:

  • Problems where you’re exploring a set of states and have to find the shortest sequence from one state to another.
  • Problems involving grid-based structures (mazes, grids, etc.) where each step has the same cost.
  • Problems where you need to look at connected items (friends in a social network, links within a website, railroads between cities, etc.).

When approaching a problem that might be solved with BFS, consider these steps:

  • Identify BFS Applicability: Does the problem involve finding the shortest path in an unweighted graph, exploring levels or layers, or searching connected components? If so, BFS might be a good fit.
  • Define the Graph/States: What are the "nodes" in your problem? How are they connected (what are the "edges" or transitions)? This could be explicit graph nodes, grid cells, words in a word ladder, etc.
  • Define Start and End: What is the starting node(s)? Is there a specific target node or condition you are searching for?
  • Define Neighbors: How do you find the valid neighbors of a node? Consider constraints (walls in a maze, valid word transformations, etc.).
  • Handle Boundary Conditions: Consider edge cases like empty inputs, disconnected components, and whether BFS is feasible given the input size.

Sample BFS Interview Questions

Here are some common interview problems where a BFS-style algorithm could work:

Binary Tree Level Order Traversal

Problem Description: Given a binary tree, return its node values in level order traversal. That is, nodes at each level are visited from left to right, level by level.

Applying BFS:

  • Our starting node is the root of the binary tree. We don't have a specific ending node; we'll continue until the queue is empty.
  • Each node's neighbors are their left and right children (if they have them).
  • We can check for an empty tree (comparing the root node to None). And, assuming the tree is a reasonable size, BFS should be feasible.

Shortest Path Through a Maze

Problem Description: Given an ASCII maze where each step has the same cost, find the shortest path from the top left to the bottom right.

.####### @####### ..#.#..# @@#.#..# #.....## #@@@@@## ###.#..# ==> ###.#@.# #...#.## #...#@## #.###.#. #.###@#. #.....## #....@## #####... #####@@@

Applying BFS:

  • Our starting node is the top left corner. Our ending node is the bottom right corner.
  • Each node's neighbors are the points above, below, left, and right - as long as they're not walls. We need to be careful of boundary conditions, like nodes on the edges of the grid.
  • We can check for edge cases like the start or end position being a wall or mazes where there's no path from the start to the finish.
  • If the maze is very large, it's possible BFS will be too slow or require too much memory. Ask your interviewer how large the maze is expected to be!

Number of Islands

Problem Description: Given a 2D grid indicating land and water, count the number of islands.

Applying BFS:

  • Start with a set of visited nodes, initially empty.
  • Iterate through the entire grid. When we reach a point that's an island:
    • If the point is already visited, do nothing.
    • Otherwise, increase our count by 1 and do BFS from that point outwards. Each BFS visits all the points on a single island. A point's neighbors are adjacent points that are also land.
  • Return the total number of islands seen.

Word Ladder

Problem Description: Given a start word and a target word, find the fewest steps required to transform one into the other, changing only one letter at a time.

As an example, to transform head into tail, we could generate this sequence:

  • head
  • heal
  • teal
  • tell
  • tall
  • tail

Applying BFS:

  • Our starting node is the start word. Our ending node is the target word.
  • Each node's neighbors are words that differ by one letter. As an example, some neighboring words for node would be:
    • code
    • mode
    • nods
    • note
    • rode
    • ...
  • We should check for edge cases like:
    • The start or target not being valid words
    • The start and target being different lengths
    • No possible word ladder

Conclusion: BFS - A Powerful Tool in Your Interview Toolkit

Breadth-First Search is an indispensable algorithm in your coding interview toolkit. Its systematic level-by-level exploration makes it a powerful and efficient approach for a wide range of problems. By mastering the core concepts – the queue, visited set, predecessor dictionary – and practicing its implementation, you'll be well-prepared to confidently tackle BFS-related interview questions.

Remember to:

  • Practice recognizing problems where BFS could be used.
  • Clearly articulate your BFS approach during interviews, highlighting how you are defining nodes, neighbors, and handling visited states.
  • Understand and be able to explain the time and space complexity for different BFS variants.

With practice and a solid understanding of BFS, you'll be ready to impress in your next coding interview! Good luck!

. . .