Monday, September 28, 2009

Notes from Lecture #6

Some points from today's lecture:

Firstly, here's a discussion of the pointer machine versus the RAM machine models from the Handbook of Data Structures (just a convenient online ref, you can check out the wikipedia articles on these machine models, or check out other books):

As you can see, random access of arrays (given i, access A[i]) is possible in unit time on a RAM machine, but not in the pointer machine model.

Secondly, we went over Seidel's algorithm for Undirected Unweighted APSP really fast at the end, so here's the description again:

UUAPSP(adjacency matrix A for G)
  1. if A is the adjacency matrix for the complete graph K_n, then return d = J_n - I_n, where J_n is the n*n all-ones matrix and I_n is the identity matrix.
  2. compute B, the adjacency matrix for G^2. (This can be done by setting B_ij = 1 whenever (A^2 + A)_ij > 0, and takes one matrix multiply.)
  3. D = UUAPSP(B)
  4. compute D*A, this takes another matrix multiply.
  5. let d_ij = 2D_ij - 1( (DA)_ij < (D_ij * degree_j) ).
  6. return the matrix (d_ij)
Note that if the original graph is connected, the recursion bottoms out in O(log n) calls. And each call requires two matrix multiplies (computing A^2 and D*A), and another O(n^2) work. Hence the total runtime is O(M(n) log n), where M(n) is time for one matrix multiply.

For the correctness, (DA)_ij = \sum_k D_ik A_kj = sum_{k neighbor of j} D_ik, which we're comparing to D_ij * degree_j in step 5,

We know by Claim 1 that d_ij (the distance in G) should either be 2D_ij or 2D_ij-1, where D_ij is the distance in G^2.

Suppose d_ij should be 2D_ij. In this case, each term in the summation is at least D_ij by Claim 2, and hence the sum is at least D_ij * degree_j. In this case, we (correctly) do not subtract anything and set d_ij = 2D_ij.

On the other hand, suppose d_ij should be 2D_ij - 1. Now Claim 3 implies that each term in the summation is at most D_ij, and at least one term is strictly less than D_ij. Hence the sum is less than D_ij * degree_j. In this case we (correctly) subtract 1, and set d_ij = 2D_ij - 1.

This completes the proof of correctness of Seidel's algorithm for computing the distances. The algorithm to compute (an implicit description of) the shortest paths will be covered as part of the homeworks that accompany this fine television series.

In next lecture, we will cover some other shortest path algorithms: e.g., using the van Emde Boas priority queues, and Goldberg's scaling algorithm to compute valid potentials faster than Bellman Ford. And the homework should be out by tomorrow or Wednesday.

Wednesday, September 23, 2009

Lecture 5 notes

Some notes on today's lecture:
  1. The algorithm we saw today was independently due to Chu and Liu (1965), Edmonds (1967) and Bock (1971).

  2. The best running time for the min-cost rooted arborescence is apparently O(m + n log n) due to Gabow, Galil, Spencer and Tarjan, in the same paper that gave the O(m log \beta(m,n)) algorithm for undirected MSTs.

  3. The variant of Prim's algorithm Or had mentioned fails on this example:
  4. I didn't explicitly say why the final two claims proved optimality for the branching F*, here is the reason --- if there were a branching F' that had smaller cost, then by the second claim it would have cost strictly less than \sum_S w*_S. But since w* was a valid weight function, that would violate the first claim.

Tuesday, September 22, 2009

HW1: updates

Some changes to problem 2, thanks for Jeremiah for asking the question: please have a look at the new PDF file...

Sunday, September 20, 2009

Union-Find Notes (Part 1)

I've written some notes on Danny's lecture last Wednesday. It doesnt have all the proofs yet (I'll add them soon), but hopefully it will be useful to recap things for lecture tomorrow, when Danny will finish the analysis.

Monday, September 14, 2009

Homework 1 posted

I've posted the first assignment on the course website.
http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15859-f09/www/