## Lecture 29 - 04/03

### DFS vs. BFS, Shortest Paths

• DFS worse when dealing with very deep, spindly trees (the recursion stack can grow to the full depth)
• BFS worse when dealing with absurdly bushy trees (a huge number of vertices get enqueued at once)
• Summary so far: paths from a starting vertex to any other vertex, topological sorts (with DFS), shortest paths for graphs of all equal edge length (BFS)
• All $\Theta(V+E)$ runtime and $\Theta(V)$ space
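The BFS shortest-path claim above can be sketched as a short, self-contained example. The adjacency-list representation and the class name `BfsPaths` are assumptions for illustration, not from the lecture; with unit-length edges, vertices leave the queue in order of distance from the start, so the first visit to a vertex fixes its shortest distance.

```java
import java.util.*;

public class BfsPaths {
    // BFS from s on a graph with unit-length edges: distTo[v] ends up
    // holding the length of a shortest path from s to v.
    static int[] shortestDistances(List<Integer>[] adj, int s) {
        int[] distTo = new int[adj.length];
        Arrays.fill(distTo, -1);          // -1 marks "not yet visited"
        distTo[s] = 0;
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(s);
        while (!queue.isEmpty()) {
            int v = queue.poll();
            for (int w : adj[v]) {
                if (distTo[w] == -1) {    // first visit = shortest path
                    distTo[w] = distTo[v] + 1;
                    queue.add(w);
                }
            }
        }
        return distTo;
    }

    public static void main(String[] args) {
        // Path graph 0 - 1 - 2 - 3 (undirected, so add edges both ways)
        List<Integer>[] adj = new List[4];
        for (int i = 0; i < 4; i++) adj[i] = new ArrayList<>();
        int[][] edges = {{0, 1}, {1, 2}, {2, 3}};
        for (int[] e : edges) { adj[e[0]].add(e[1]); adj[e[1]].add(e[0]); }
        System.out.println(Arrays.toString(shortestDistances(adj, 0)));  // [0, 1, 2, 3]
    }
}
```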

#### Shortest paths and Dijkstra's Algorithm

• We use this when dealing with graphs that have edges of non-uniform length (different edges have different length)
• From some start node s, expand to other vertices in the graph based on cumulative edge cost (total cost to the vertex we are considering expanding to)
• This in essence is like a BFS, but instead of expanding by depth of tree, we are expanding by total cost/length from the start node s
• This generates a Shortest Path Tree (SPT)
• Will never generate a cycle, since a cycle would indicate there are two paths to get to a certain vertex
• Say in the SPT for a certain graph there are two edges pointing to a vertex t, forming a cycle (there are two ways to go from s to t, thus cycle)
• Then one of those two paths from s to t must be shorter than the other, or they are equal distance
• If one path is shorter, then it would never be in the SPT in the first place! This contradiction ensures the SPT is always a tree (acyclic)
• If the paths are equal in distance, we can arbitrarily choose one or the other, but we won't choose both. Either choice will lead to the creation of a valid SPT via Dijkstra's Algorithm
• Remember in Dijkstra's algorithm, we consider cumulative cost
• Instead of always recomputing the total distance from the start node to the next vertices to explore, as we traverse through the graph in Dijkstra's algorithm we store the cumulative cost
• So let's say we are running the algorithm and are at some intermediary step where we have some set of vertices e, f, and g to still explore the children of, and we want to find the next edge to expand to
• The algorithm will already have stored the total cumulative cost from e, f, and g at those vertices
• Let's say there are now two edges to get to some vertex h: (f, h) and (g, h)
• We want to compute the cumulative costs of getting to h from both paths, and compare and choose the lesser
• We do this by adding the edge cost for (f, h) to the cumulative cost of f (which we have stored at f), and the same for the edge (g, h) and vertex g
• Let's say the cumulative cost to h via g is smaller. Then we set the edge leading to h in our SPT as (g, h), and store at h the cumulative cost to h (which is the cumulative cost of g summed with the (g, h) edge cost)
• Code pseudo-implementation (`distTo`, `edgeTo`, and `fringe` are instance variables, since both the constructor and `relax` use them):

```java
public DijkstraSP(Graph G, int s) {
    distTo = new double[G.V()];
    edgeTo = new DirectedEdge[G.V()];
    distTo[s] = 0;
    // Initialize cumulative distance to all non-start nodes
    setAllDistancesToInfinityExceptS(distTo);
    fringe = new PQ<Integer>();
    insertAllVertices(fringe);

    // Relax vertices in order of distance from s
    while (!fringe.isEmpty()) {
        int v = fringe.delMin();
        for (DirectedEdge e : G.adj(v)) {
            relax(e);
        }
    }
}

private void relax(DirectedEdge e) {
    int v = e.from();
    int w = e.to();
    // If this edge gives a shorter path to w than the current best
    // (distTo[w] starts at infinity, so the first edge into w always wins)
    if (distTo[w] > distTo[v] + e.weight()) {
        distTo[w] = distTo[v] + e.weight();
        edgeTo[w] = e;
        // If the vertex w is still in the fringe, update its priority
        if (fringe.contains(w)) {
            fringe.decreasePriority(w, distTo[w]);
        }
    }
}
```
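The pseudo-implementation leans on helper classes (`Graph`, `PQ`) that aren't shown. As a self-contained sketch, the same algorithm can be run with `java.util.PriorityQueue`; since that class has no `decreasePriority`, a standard workaround (not from the lecture) is to insert duplicate entries and skip stale ones on removal. The adjacency representation here, where `adj[v]` holds `{neighbor, weight}` pairs, is an assumption for brevity.

```java
import java.util.*;

public class DijkstraDemo {
    // adj[v] holds {neighbor, weight} pairs; returns shortest distances from s.
    static double[] shortestDistances(List<double[]>[] adj, int s) {
        double[] distTo = new double[adj.length];
        Arrays.fill(distTo, Double.POSITIVE_INFINITY);
        distTo[s] = 0.0;
        // Entries are {vertex, priority}; duplicates stand in for decreasePriority.
        PriorityQueue<double[]> fringe =
            new PriorityQueue<>(Comparator.comparingDouble(a -> a[1]));
        fringe.add(new double[]{s, 0.0});
        while (!fringe.isEmpty()) {
            double[] top = fringe.poll();
            int v = (int) top[0];
            if (top[1] > distTo[v]) continue;   // stale entry, skip
            for (double[] e : adj[v]) {         // relax each outgoing edge
                int w = (int) e[0];
                if (distTo[w] > distTo[v] + e[1]) {
                    distTo[w] = distTo[v] + e[1];
                    fringe.add(new double[]{w, distTo[w]});
                }
            }
        }
        return distTo;
    }

    public static void main(String[] args) {
        // Small graph: 0 -> 1 (cost 2), 0 -> 2 (cost 5), 1 -> 2 (cost 1)
        List<double[]>[] adj = new List[3];
        for (int i = 0; i < 3; i++) adj[i] = new ArrayList<>();
        adj[0].add(new double[]{1, 2});
        adj[0].add(new double[]{2, 5});
        adj[1].add(new double[]{2, 1});
        System.out.println(Arrays.toString(shortestDistances(adj, 0)));  // [0.0, 2.0, 3.0]
    }
}
```

Note the tradeoff: lazy deletion keeps up to $E$ entries in the queue instead of $V$, but the asymptotic runtime stays $O(E\log E) = O(E\log V)$.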


#### Dijkstra Runtime Analysis

• We have $V$ insertions into the priority queue fringe, each costing $\Theta(\log V)$ time, where $V$ is the number of vertices
• We have $V$ delete-mins from the fringe, each costing $\Theta(\log V)$ time
• We have at most $E$ priority decreases in our fringe, each costing $\Theta(\log V)$ time, where $E$ is the number of edges
• This is an overall runtime of $\Theta(V\log V + V\log V + E\log V) = \Theta((V+E)\log V)$, which simplifies to $\Theta(E\log V)$
• The simplification assumes that $E > V$, which is almost always the case
• Algorithm takes $\Theta(V)$ space (for `distTo`, `edgeTo`, and the fringe)

#### A* Algorithm introduction

• Like Dijkstra's algorithm, but addition of a heuristic function in our graph traversal
• The heuristic function is like an estimated distance to the goal
• Dijkstra's algorithm will expand in all directions based on cumulative cost; A* expands specifically towards some specified goal/target in the graph
• Sense of closeness to the goal is supplied by the heuristic function
• Thus A* will expand by total cumulative cost summed with heuristic function value
• A good heuristic function will lead to us reaching the goal much faster than by the other search methods
• For example, in a map application where we are trying to find the shortest route from city A to city B, we can treat the geographic straight-line distance to B as the heuristic function
• Thus, vertices/roads that are closer to the goal city B (favored by the heuristic function) are expanded over vertices farther from the goal
• It is important that we have a good heuristic function: must be admissible and consistent
• Admissible: heuristic doesn't overestimate (heuristic function must always return an estimated distance less than or equal to the true distance)
• Consistent: for every edge $(v, w)$, $h(v) \le \text{weight}(v, w) + h(w)$ — the heuristic estimate never drops by more than the cost of the edge traversed
• Worst-case runtime is the same as Dijkstra's algorithm. Basically we now add a bias to our vertex expansion order via the heuristic function
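The change from Dijkstra's algorithm is only the fringe priority: `distTo[v] + h(v)` instead of `distTo[v]`. A minimal sketch, assuming the same `{neighbor, weight}` adjacency-list representation as above and a precomputed heuristic array `h` (both hypothetical names for illustration):

```java
import java.util.*;

public class AStarDemo {
    // A* = Dijkstra's algorithm, but the fringe is ordered by distTo[v] + h[v].
    // h[v] is an assumed admissible, consistent estimate of v's distance to goal.
    static double shortestDistance(List<double[]>[] adj, int s, int goal, double[] h) {
        double[] distTo = new double[adj.length];
        Arrays.fill(distTo, Double.POSITIVE_INFINITY);
        distTo[s] = 0.0;
        // Entries are {vertex, priority = cumulative cost + heuristic}.
        PriorityQueue<double[]> fringe =
            new PriorityQueue<>(Comparator.comparingDouble(a -> a[1]));
        fringe.add(new double[]{s, h[s]});
        while (!fringe.isEmpty()) {
            int v = (int) fringe.poll()[0];
            if (v == goal) return distTo[v];   // goal dequeued: distance is final
            for (double[] e : adj[v]) {        // relax each outgoing edge
                int w = (int) e[0];
                if (distTo[w] > distTo[v] + e[1]) {
                    distTo[w] = distTo[v] + e[1];
                    fringe.add(new double[]{w, distTo[w] + h[w]});
                }
            }
        }
        return Double.POSITIVE_INFINITY;       // goal unreachable from s
    }

    public static void main(String[] args) {
        // Graph: 0 -> 1 (1), 1 -> 2 (1), 0 -> 2 (3); heuristic = true distance to goal 2
        List<double[]>[] adj = new List[3];
        for (int i = 0; i < 3; i++) adj[i] = new ArrayList<>();
        adj[0].add(new double[]{1, 1});
        adj[1].add(new double[]{2, 1});
        adj[0].add(new double[]{2, 3});
        double[] h = {2, 1, 0};
        System.out.println(shortestDistance(adj, 0, 2, h));  // 2.0
    }
}
```

With a consistent heuristic, the first time the goal leaves the fringe its distance is final, so the search can stop early; with `h` identically zero this degenerates to exactly Dijkstra's algorithm.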