Guide to Competitive Programming

A spanning tree contains all nodes of a graph and some of its edges so that there is a path between any two nodes. Like trees in general, spanning trees are connected and acyclic. The weight of a spanning tree is the sum of its edge weights. For example, Shows a graph and one of its spanning tree. The weight of this spanning tree is 3 + 5 + 9 + 3 + 2 = 22.
A minimum spanning tree is a spanning tree whose weight is as small as possible. Shows a minimum spanning tree for our example graph with weight 20.

In a similarway, a maximum spanning tree is a spanning tree whose weight is as large as possible. Shows a maximum spanning tree for our example graph with
weight 32. Note that a graph may have several minimum and maximum spanning trees, so the trees are not unique.

graph and spanning tree

It turns out that several greedy methods can be used to construct minimum and maximum spanning trees. This section discusses two algorithms that process the edges of the graph ordered by their weights.We focus on finding minimum spanning trees, but the same algorithms can also find maximum spanning trees by processing the edges in reverse order.

Kruskal’s Algorithm:

Kruskal’s algorithm builds a minimum spanning tree by greedily adding edges to the graph. The initial spanning tree only contains the nodes of the graph and does not contain any edges. Then the algorithm goes through the edges ordered by their weights and always adds an edge to the graph if it does not create a cycle.

The algorithm maintains the components of the graph. Initially, each node of the graph belongs to a separate component. Always when an edge is added to the graph, two components are joined. Finally, all nodes belong to the same component, and a minimum spanning tree has been found.

As an example, let us construct a minimum spanning tree for our example graph. The first step is to sort the edges in increasing order of their weights:

graph Kruskals algorithm

Then, we go through the list and add each edge to the graph if it joins two separate components. Shows the steps of the algorithm. Initially, each node belongs to its own component. Then, the first edges on the list (5–6, 1–2, 3–6, and 1–5) are added to the graph. After this, the next edge would be 2–3, but this edge is not added, because it would create a cycle. The same applies to edge 2–5. Finally, the edge 4–6 is added, and the minimum spanning tree is ready.

Why Does This Work? It is a good question of why Kruskal’s algorithm works. Why does the greedy strategy guarantee that we will find a minimum spanning tree?

Let us see what happens if the minimum weight edge of the graph is not included in the spanning tree. For example, suppose that a minimum spanning tree of our example graph would not contain the minimum weight edge 5–6. We do not know the exact structure of such a spanning tree, but in any case it has to contain some edges. Assume that the tree would look like the tree.

However, it is not possible that the tree would be a minimum spanning tree, because we can remove an edge from the tree and replace it with the minimum weight edge 5–6. This produces a spanning tree whose weight is smaller.

hypothetical minimum spanning tree

For this reason, it is always optimal to include the minimum weight edge in the tree to produce a minimum spanning tree. Using a similar argument, we can show that it is also optimal to add the next edge in weight order to the tree, and so on. Hence, Kruskal’s algorithm always produces a minimum spanning tree.

Implementation When implementing Kruskal’s algorithm, it is convenient to use the edge list representation of the graph. The first phase of the algorithm sorts the edges in the list in O(m logm) time. After this, the second phase of the algorithm builds the minimum spanning tree as follows:

for (…) {
if (!same(a,b)) unite(a,b);
}

The loop goes through the edges in the list and always processes an edge (a, b) where a and b are two nodes. Two functions are needed: the function same determines if a and b are in the same component, and the function unite joins the components that contain a and b.

The problem is how to efficiently implement the functions same and unite. One possibility is to implement the function same as a graph traversal and check if we can get from node a to node b. However, the time complexity of such a function would be O(n+m) and the resulting algorithm would be slow, because the function same will be called for each edge in the graph.

We will solve the problem using a union-find structure that implements both functions in O(log n) time. Thus, the time complexity of Kruskal’s algorithm will be O(m log n) after sorting the edge list.

Union-Find Structure:

A union-find structure maintains a collection of sets. The sets are disjoint, so no element belongs to more than one set. Two O(log n) time operations are supported:

union-find structure with three sets

The unite operation joins two sets, and the find operation finds the representative of the set that contains a given element.

In a union-find structure, one element in each set is the representative of the set, and there is a path from any other element of the set to the representative. For example, assume that the sets are {1, 4, 7}, {5} and {2, 3, 6, 8}. Shows one way to represent these sets.

In this case the representatives of the sets are 4, 5, and 2.We can find the representative of any element by following the path that begins at the element. For example, the element 2 is the representative for the element 6, because we follow the path 6 → 3 → 2. Two elements belong to the same set exactly when their representatives are the same.

To join two sets, the representative of one set is connected to the representative of the other set. For example, a possible way to join the sets {1, 4, 7} and {2, 3, 6, 8}. From this on, the element 2 is the representative for the entire set and the old representative 4 points to the element 2.

The efficiency of the union-find structure depends on how the sets are joined. It turns out that we can follow a simple strategy: always connect the representative of the smaller set to the representative of the larger set (or if the sets are of equal size, we can make an arbitrary choice). Using this strategy, the length of any path will be O(log n), so we can find the representative of any element efficiently by following the corresponding path.

Implementation The union-find structure can be conveniently implemented using arrays. In the following implementation, the array link indicates for each element the next element in the path, or the element itself if it is a representative, and the array size indicates for each representative the size of the corresponding set.
Initially, each element belongs to a separate set:

for (int i = 1; i <= n; i++) link[i] = i;
for (int i = 1; i <= n; i++) size[i] = 1;

The function find returns the representative for an element x. The representative can be found by following the path that begins at x.

int find(int x) {
while (x != link[x]) x = link[x];
return x;
}

The function same checks whether elements a and b belong to the same set. This can easily be done by using the function find:

bool same(int a, int b) {
return find(a) == find(b);
}

The function unite joins the sets that contain elements a and b (the elements have to be in different sets). The function first finds the representatives of the sets and then connects the smaller set to the larger set.

void unite(int a, int b) {
a = find(a);
b = find(b);
if (size[a] < size[b]) swap(a,b);
size[a] += size[b];
link[b] = a;
}

The time complexity of the function find is O(log n) assuming that the length of each path is O(log n). In this case, the functions same and unite also work in O(log n) time. The function unite makes sure that the length of each path is O(log n) by connecting the smaller set to the larger set.

Path Compression Here is an alternative way to implement the find operation:

int find(int x) {
if (x == link[x]) return x;
return link[x] = find(link[x]);
}

This function uses path compression: each element in the path will directly point to its representative after the operation. It can be shown that using this function, the union-find operations work in amortized O(α(n)) time, where α(n) is the inverse Ackermann function which grows very slowly (it is almost a constant). However, path compression cannot be used in some applications of the union-find structure, such as in the dynamic connectivity algorithm (Sect. 15.5.4).

Prim’s algorithm

Prim’s Algorithm:

Prim’s algorithm is an alternative method for constructing minimum spanning trees. The algorithm first adds an arbitrary node to the tree, and then always chooses a minimum weight edge that adds a new node to the tree. Finally, all nodes have been added and a minimum spanning tree has been found.

Prim’s algorithm resembles Dijkstra’s algorithm. The difference is that Dijkstra’s algorithm always selects a node whose distance from the starting node is minimum, but Prim’s algorithm simply selects a node that can be added to the tree using a minimum weight edge.

As an example, how Prim’s algorithm constructs a minimum spanning tree for our example graph, assuming that the starting node is node 1. Like Dijkstra’s algorithm, Prim’s algorithm can be efficiently implemented using a priority queue. The priority queue should contain all nodes that can be connected o the current component using a single edge, in increasing order of the weights of the corresponding edges.

The time complexity of Prim’s algorithm is O(n + m logm) that equals the time complexity of Dijkstra’s algorithm. In practice, Prim’s and Kruskal’s algorithms are both efficient, and the choice of the algorithm is a matter of taste. Still, most competitive programmers use Kruskal’s algorithm.