You only have 3 free questions left (including this one).

But it doesn't have to end here! Sign up for the 7-day coding interview crash course and you'll get a free Interview Cake problem every week.

Given an undirected graph

A graph organizes items in an interconnected network.

Each item is a node (or vertex). Nodes are connected by edges

A graph is composed of nodes (or vertices) that are connected by edges.

Strengths:

  • Representing links. Graphs are ideal for cases where you're working with things that connect to other things. Nodes and edges could, for example, respectively represent cities and highways, routers and ethernet cables, or Facebook users and their friendships.

Weaknesses:

  • Scaling challenges. Most graph algorithms are O(nlg(n))O(n*lg(n)) or even slower. Depending on the size of your graph, running algorithms across your nodes may not be feasible.

Terminology

Directed or undirected

In directed graphs, edges point from the node at one end to the node at the other end. In undirected graphs, the edges simply connect the nodes at each end.

In a directed graph, edges point from one node to another. In an undirected graph, the edges simply connect the nodes at each end.

Cyclic or acyclic

A graph is cyclic if it has a cycle—an unbroken series of nodes with no repeating nodes or edges that connects back to itself. Graphs without cycles are acyclic.

A cyclic graph has at least one cycle: an unbroken series of nodes with no repeating nodes or edges that connects back to itself. Otherwise the graph is acyclic.

Weighted or unweighted

If a graph is weighted, each edge has a "weight." The weight could, for example, represent the distance between two locations, or the cost or time it takes to travel between the locations.

Each edge in a weighted graph has a weight, which could represent counts, distance, time, or any other quantifiable measurement.

Legal coloring

A graph coloring is when you assign colors to each node in a graph. A legal coloring means no adjacent nodes have the same color:

For a graph to be colored legally, no adjacent nodes can share the same color.

Representations

There are a few different ways to store graphs. Let's take this graph as an example:

An example graph composed of 4 labeled vertices.

Edge list

A list of all the edges in the graph:

  int[][] graph = new int[][] { new[] {0, 1}, new[] {1, 2}, new[] {1, 3}, new[] {2, 3} };

Since node 3 has edges to nodes 1 and 2, {1, 3} and {2, 3} are in the edge list.

Sometimes it's helpful to pair our edge list with a list of all the nodes. For example, what if a node doesn't have any edges connected to it? It wouldn't show up in our edge list at all!

Adjacency list

A list where the index represents the node and the value at that index is a list of the node's neighbors:

  int[][] graph = new int[][]
{
    new[] {1},
    new[] {0, 2, 3},
    new[] {1, 3},
    new[] {1, 2}
};

Since node 3 has edges to nodes 1 and 2, graph[3] has the adjacency list {1, 2}.

We could also use a dictionary where the keys represent the node and the values are the lists of neighbors.

  var graph = new Dictionary<int, int[]>
{
    {0, new[] {1}},
    {1, new[] {0, 2, 3}},
    {2, new[] {1, 3}},
    {3, new[] {1, 2}},
};

This would be useful if the nodes were represented by strings, objects, or otherwise didn't map cleanly to array indices.

Adjacency matrix

A matrix of 0s and 1s indicating whether node x connects to node y (0 means no, 1 means yes).

  int[][] graph = new int[][]
{
    new[] {0, 1, 0, 0},
    new[] {1, 0, 1, 1},
    new[] {0, 1, 0, 1},
    new[] {0, 1, 1, 0},
};

Since node 3 has edges to nodes 1 and 2, graph[3][1] and graph[3][2] have value 1.

Algorithms

BFS and DFS

You should know breadth-first search (BFS) and depth-first search (DFS) down pat so you can code them up quickly.

Lots of graph problems can be solved using just these traversals:

Is there a path between two nodes in this undirected graph? Run DFS or BFS from one node and see if you reach the other one.

What's the shortest path between two nodes in this undirected, unweighted graph? Run BFS from one node and backtrack once you reach the second. Note: BFS always finds the shortest path, assuming the graph is undirected and unweighted. DFS does not always find the shortest path.

Can this undirected graph be colored with two colors? Run BFS, assigning colors as nodes are visited. Abort if we ever try to assign a node a color different from the one it was assigned earlier.

Does this undirected graph have a cycle? Run BFS, keeping track of the number of times we're visiting each node. If we ever visit a node twice, then we have a cycle.

Advanced graph algorithms

If you have lots of time before your interview, these advanced graph algorithms pop up occasionally:

  • Dijkstra's Algorithm: Finds the shortest path from one node to all other nodes in a weighted graph.
  • Topological Sort: Arranges the nodes in a directed, acyclic graph in a special order based on incoming edges.
  • Minimum Spanning Tree: Finds the cheapest set of edges needed to reach all nodes in a weighted graph.
with maximum degree

The degree of a node is the number of edges connected to the node.

A node's degree refers to the number of edges connected to that node.

The maximum degree of a graph is the highest degree of all the nodes. The graph above has a maximum degree of 3.

In a directed graph, nodes have an indegree and an outdegree.

Nodes in a directed graph also have an indegree (the number of edges pointing to the node) and an outdegree (the number of edges pointing out from the node).
DD, find a graph coloring

Graph coloring is when you assign colors to the nodes in a graph.

A legal coloring means no adjacent nodes have the same color:

For a graph to be colored legally, no adjacent nodes can share the same color.

The color could be used literally. Say we're coloring a map. We'll want to color adjacent states or countries differently so their borders are clear.

The color could also represent some concept or property. For example, say we have a graph where nodes represent university classes and edges connect classes with overlapping students. We could use colors to represent the scheduled class exam time. In an illegal coloring, a student could be booked for multiple exams at once!

Edge coloring is less common, but it's also a thing. A legal edge coloring means no nodes have two edges with the same color.

using at most D+1D+1 colors.

For example:

First described by Robert Frucht in 1939, the Frucht graph is a 3-regular graph with 12 vertices, 18 edges, and no nontrivial symmetries.

This graph's maximum degree (DD) is 3, so we have 4 colors (D+1D+1). Here's one possible coloring:

The Frucht graph with legal coloring.

Graphs are represented by an array of NN node objects, each with a label, a set of neighbors, and a color:

  using System.Collections.Generic;

public class GraphNode
{
    public string Label { get; }
    public ISet<GraphNode> Neighbors { get; }
    public string Color { get; set; }
    public bool HasColor { get { return Color != null; } }

    public GraphNode(string label)
    {
        Label = label;
        Neighbors = new HashSet<GraphNode>();
    }

    public void AddNeighbor(GraphNode neighbor)
    {
        Neighbors.Add(neighbor);
    }
}

var a = new GraphNode("a");
var b = new GraphNode("b");
var c = new GraphNode("c");

a.AddNeighbor(b);
b.AddNeighbor(a);
b.AddNeighbor(c);
c.AddNeighbor(b);

var graph = new GraphNode[]
{
    a,
    b,
    c
};

Gotchas

D+1D+1 colors is always enough. Does your method ever need more colors than that?

Does your method go through every color for every node? You can do better. You don't want NDN*D in your final runtime.

We can color a graph in linear

In general, linear time means an algorithm runs in O(n)O(n) time, where nn is the size of the input.

In graph problems like this, we usually express complexity in terms of the number of nodes (NN) and edges (MM) in the input graph. So, when we say linear time here, we mean O(N+M)O(N+M).

time and space (on the number of nodes, edges and/or the maximum degree).

What if the input graph has a loop

A loop in a graph is an edge where both ends connect to the same node:

A loop is an edge that connects a node to itself.
? Does your method handle that reasonably?

Breakdown

Start your free trial!

Log in or sign up with one click to get immediate access to 3 free mock interview questions

Where do I enter my password?

Actually, we don't support password-based login. Never have. Just the OAuth methods above. Why?

  1. It's easy and quick. No "reset password" flow. No password to forget.
  2. It lets us avoid storing passwords that hackers could access and use to try to log into our users' email or bank accounts.
  3. It makes it harder for one person to share a paid Interview Cake account with multiple people.

Solution

Start your free trial!

Log in or sign up with one click to get immediate access to 3 free mock interview questions

Where do I enter my password?

Actually, we don't support password-based login. Never have. Just the OAuth methods above. Why?

  1. It's easy and quick. No "reset password" flow. No password to forget.
  2. It lets us avoid storing passwords that hackers could access and use to try to log into our users' email or bank accounts.
  3. It makes it harder for one person to share a paid Interview Cake account with multiple people.

Complexity

O(N+M)O(N+M) time where NN is the number of nodes and MM is the number of edges.

The runtime might not look linear

In general, linear time means an algorithm runs in O(n)O(n) time, where nn is the size of the input.

In graph problems like this, we usually express complexity in terms of the number of nodes (NN) and edges (MM) in the input graph. So, when we say linear time here, we mean O(N+M)O(N+M).

because we have outer and inner loops. The trick is to look at each step and think of things in terms of the total number of edges (MM) wherever we can:

  • We check if each node appears in its own hash set of neighbors. Checking if something is in a set is O(1)O(1), so doing it for all NN nodes is O(N)O(N).
  • When we get the illegal colors for each node, we iterate through that node's neighbors. So in total, we cross each of the graphs MM edges twice: once for the node on either end of each edge. O(M)O(M) time.
  • When we assign a color to each node, we're careful to stop checking colors as soon as we find one that works. In the worst case, we'll have to check one more color than the total number of neighbors. Again, each edge in the graph adds two neighbors—one for the node on either end—so there are 2M2*M neighbors. So, in total, we'll have to try O(N+M)O(N+M) colors.

Putting all the steps together, our complexity is O(N+M)O(N+M).

What about space complexity? The only thing we're storing is the illegalColors hash set. In the worst case, all the neighbors of a node with the maximum degree (DD) have different colors, so our hash set takes up O(D)O(D) space.

Bonus

  1. Our solution runs in O(N+M)O(N+M) time but takes O(D)O(D) space. Can we get down to O(1)O(1) space?
  2. Our solution finds a legal coloring, but there are usually many legal colorings. What if we wanted to optimize a coloring to use as few colors as possible?

The lowest number of colors we can use to legally color a graph is called the chromatic number.

There's no known polynomial time solution for finding a graph’s chromatic number. It might be impossible, or maybe we just haven’t figured out a solution yet.

We can't even determine in polynomial time if a graph can be colored using a given kk colors. Even if kk is as low as 3.

We care about polynomial time solutions (nn raised to a constant power, like O(n2)O(n^2)) because for large nns, polynomial time algorithms are more practical to actually use than higher runtimes like exponential time (a constant raised to the power of nn, like O(2n)O(2^n)). Computer scientists usually call algorithms with polynomial time solutions feasible, and problems with worse runtimes intractable.

The problem of determining if a graph can be colored with kk colors is in the class of problems called NP (nondeterministic polynomial time). This means that in polynomial time, we can verify a solution is correct but we can’t come up with a solution. In this case, if we have a graph that's already colored with kk colors we verify the coloring uses kk colors and is legal, but we can't take a graph and a number kk and determine if the graph can be colored with kk colors.

If you can find a solution or prove a solution doesn't exist, you'll win a $1,000,000 Millennium Problem Prize.

For coloring a graph using as few colors as possible, we don’t have a feasible solution. For real-world problems, we'd often need to check so many possibilities that we’ll never be able to use brute-force no matter how advanced our computers become.

One way to reliably reduce the number of colors we use is to use the greedy algorithm but carefully order the nodes. For example, we can prioritize nodes based on their degree, the number of colored neighbors they have, or the number of uniquely colored neighbors they have.

What We Learned

Start your free trial!

Log in or sign up with one click to get immediate access to 3 free mock interview questions

Where do I enter my password?

Actually, we don't support password-based login. Never have. Just the OAuth methods above. Why?

  1. It's easy and quick. No "reset password" flow. No password to forget.
  2. It lets us avoid storing passwords that hackers could access and use to try to log into our users' email or bank accounts.
  3. It makes it harder for one person to share a paid Interview Cake account with multiple people.

Do you have an answer?

14
15
16
# Create a valid coloring for the graph
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Reset editor

Powered by qualified.io

. . .