## Lecture 30 - 04/05

### Minimum Spanning Trees

• Remember from last time we have shortest path trees (SPT's) from Djikstra's Algorithm, that give us a tree telling us the shortest way to get from a start node s to any other vertex in the graph
• Alternate idea: minimum spanning tree (MST)
• We want to create a tree (acyclic graph) that connects all of the vertices in the graph, such that the total edge cost for this spanning tree is the minimum possible
• So essentially we want a tree that spans the entire graph, such that it's total cost is the minimum possible
• Depending on the graph and node, running Djikstra's algorithm may or may not return us the MST (MST could potentially be equal to the SPT from a certain start vertex)
• However, this is often not the case, as the shortest path from vertices A and B could be different from the path between them as determined by the MST
• Example of SPT's vs MST's (in this case, the SPT from vertex B is equivalent to the graph's MST):
• It's also very possible that none of the vertices of a graph will yield a SPT that is actually the MST. So how to get the MST?

#### Prim's Algorithm

• To find the MST, we start from any arbitrary start vertex, and expand/traverse through the tree exactly like we did with Djikstra's Algorithm, with a key difference: we no longer use total cumulative costs, and only consider immediate edge costs when determining the edges of the algorithm's resulting tree
• The resulting tree will be a MST (it may or may not be an SPT for some vertex of the graph)
• Essentially we keep building up a set of nodes in our "cut" of the graph (a subset of all the vertices of the graph), such that our cut has the minimum possible total edge cost between its vertices
• We build up this cut until our cut spans the entire graph
• Build up is by looking at all the edges that connect our cut of the graph to the rest of the graph (if we drew a squiggly line around only the nodes in our cut, we would be looking at all the edges that intersect this line)
• Among all these edges, we find the edge of least weight, add this to our cut, and create an edge to it's corresponding vertex in our MST
• So instead of Djikstra, which considers distance from a start node, Prim considers the distance from the entire tree (by tree I'm referring to the cut of the tree that we create and build up through the algorithm)
• Runtime identical to Dijkstra's algorithm
• Assuming E>V$E>V$, runtime is Θ(ElogV)$\Theta(E \log V)$
• Implementation pseudocode:
public PrimMST(Graph G) {
distTo = new double[G.V()];
Edge[] edgeTo = new Edge[G.V()];
boolean[] marked = new boolean[G.V()];
PQ<Double> fringe = new PQ<>(G.V());

distTo[s] = 0.0; // Arbitrary start node
setAllDistancesToInfinityExceptS(distTo);
fringe.insert(s, distTo[s]);
// Fringe ordered by distTo the cut tree

// Add vertices in order of distance from cut tree
while (!fringe.isEmpty()) {
int v = fringe.delMin();
scan(G, v);
}
}

private void scan(Graph G, int v) {
marked[v] = true;
for (Edge e : G.adj(v)) {
int w = e.other(v); // Other vertex in edge
if (marked[w]) { continue; } // Already in MST so go to next edge
if (e.weight() < distTo[w]) { // This is a better path so update MST
distTo[w] = e.weight();
edgeTo[w] = e;
if (fringe.contains(w)) {
pq.decreasePriority(w, distTo[w]);
} else {
pq.insert(w, distTo[w]);
}
}
}


#### Kruskal’s Algorithm:

• Alternate approach to Prim's Algorithm
• Prim's algorithm: start from an arbitrary start node, and build up a "cut" tree
• Tree expansion based on distance from the tree (not from the start node)
• Kruskal's algorithm: add all the edges to the priority queue fringe, and then each iteration pop the smallest/tied for smallest weight edge from the fringe
• Add the given edge to the MST, unless doing so would create a cycle
• Repeat untill the MST we create is of size V1$V-1$ (when we reach this, no need to go through remaining edges in the fringe)
• Kruskal's algorithm converges to Prim's algorithm! (both have same result, just different means to attain it)
• Implementation pseudocode:
public KruskalMST {
private Queue<Edge> mst = new Queue<>();

public KruskalMST(Graph G) {
PQ<Double> fringe = new PQ<>(G.V());
for (Edge e : G.E()) {
fringe.insert(e);
}

// Weighted Quick Union with Path Compression (disjoint set)
WeightedQuickUnionPC uf = new WeightedQuickUnionPC(G.V());
while (!fringe.isEmpty() && mst.size() < G.V() - 1) {
Edge e = fringe.delMin();
int v = e.from();
int w = e.to();
if (!uf.connected(v, w)) {
uf.union(v, w);
mst.enqueue(e);
}
}
}
}

• Kruskal's Algorithm is Θ(ElogE)$\Theta(E\log E)$ but if the edges are pre-sorted(stored in ordered linked-list so no need for PQ building) then we have Θ(ElogV)$\Theta(E\log* V)$ runtime
• Over the years, algorithms have been developed that give tighter and tighter bounds on the runtime