Monday, November 23, 2009

The Final Three Lectures

Today's lecture and the two lectures next week are the final three of the course. Today we will do a bit more on approximation algorithms; in particular, we will discuss the max-cut problem.

Next week we'll talk a bit more about online algorithms via solving LPs, and also about using the ideas behind online learning algorithms to solve LPs offline.

Wednesday, November 18, 2009

Lecture 21

The deterministic integrality gap instance is super-simple: the universe consists of all d-bit strings with at least d/2 ones. There are d sets, one per coordinate, with the set S_i containing all strings whose i-th bit is 1.
  • Now OPT is at least d/2: if we picked fewer than d/2 sets, the string that is 0 exactly on the chosen coordinates (and 1 everywhere else) would have at least d/2 ones, so it lies in the universe but is covered by none of the chosen sets.
  • But the LP can set each x_S to the fraction 2/d: since each element lies in at least d/2 sets, it is fractionally covered to extent at least 1, and the LP cost is d(2/d) = 2.
For d = log n, the universe has roughly n elements, so this gives an Omega(log n) integrality gap. You should try to see whether you can get an integrality gap larger than this, and also fill in the details of the randomized construction I sketched in class.
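For concreteness, here is a brute-force sanity check of this instance (a sketch in Python; the variable names are mine, and d is kept small so the exhaustive checks finish):

from itertools import combinations, product
from fractions import Fraction

d = 6
# Universe: all d-bit strings with at least d/2 ones.
universe = [s for s in product([0, 1], repeat=d) if sum(s) >= d / 2]
# S_i: all strings whose i-th bit is 1.
sets = [{s for s in universe if s[i] == 1} for i in range(d)]

# The LP solution x_S = 2/d covers every element to extent >= 1
# (exact rationals avoid floating-point trouble).
assert all(sum(Fraction(2, d) for S in sets if u in S) >= 1 for u in universe)

# OPT >= d/2: no collection of d/2 - 1 sets covers the whole universe.
k = d // 2 - 1
assert all(set().union(*C) != set(universe) for C in combinations(sets, k))
print("LP cost = 2, but any", k, "sets leave some element uncovered")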

Monday, November 16, 2009

Lectures 20 and 21 (today and Wednesday)

Danny lectured today on online algorithms, in particular, on ski-rental, list-update, and paging/k-server problems. Some directions to think about, before we briefly return to online algorithms in a post-thanksgiving lecture.
  • It is an easy exercise to show that in ski-rental, you cannot do better than 2-r/p using any deterministic algorithm. (I always think of r=1, in which case this is 2-1/p.) Can you get a randomized algorithm whose competitive ratio is bounded away from 2, independent of p?

    For example, consider the algorithm that buys on day 2p/3 with probability 1/2, and on day p with the remaining probability 1/2. What is the competitive ratio of this algorithm?

    • If the sequence has length less than 2p/3, we are optimal.
    • If the sequence has length i in [2p/3, p), the algorithm's expected cost is 0.5(2p/3 + p) + 0.5(i), and the optimal cost is i, giving c(alg)/c(opt) = 5p/(6i) + 1/2. For i in this interval, this ratio is bounded above by 5/4 + 1/2 = 1.75.

    • If the sequence has length >= p, the algorithm's expected cost is 0.5(2p/3 + p) + 0.5(2p - 1) <= 5p/6 + p = 11p/6, while the optimal cost is p. Hence c(alg)/c(opt) <= 11/6.

    The worst of the three cases is 11/6, which is the competitive ratio of this algorithm (see the simulation sketch after this list). This shows that one can do strictly better than 2 (independent of the value of p) using randomization. How much better can you get for the ski-rental problem using randomization? How do you show lower bounds for randomized online algorithms? More about this in an upcoming lecture.

  • Note that the analysis for randomized algorithms computes the expected cost of the online algorithm for every fixed sequence, and compares that to the best offline algorithm's cost on the same sequence. It is as though the adversary chooses the sequence sigma knowing the algorithm but not the random choices made by the online algorithm---such an adversary is known as an oblivious adversary. We will discuss more about adversary models in a future post.

  • For the paging problem where you have a set of n pages, but can only store k pages in a cache, the generalization of the cat-and-mouse game is clear: we now have n-k mice and n positions. It's fun to think about what the algorithm for this case should be.
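As promised above, here is a small simulation sketch of the randomized ski-rental algorithm (assuming renting costs 1 per day, buying costs p, and buying on day t means paying t-1 days of rent plus p; the helper alg_cost is mine):

def alg_cost(buy_day, seq_len, p):
    # Season ends before we would buy: we only ever rented.
    if seq_len < buy_day:
        return seq_len
    # Otherwise: rent for buy_day - 1 days, then buy.
    return (buy_day - 1) + p

p = 300
t1, t2 = (2 * p) // 3, p      # buy on day 2p/3 or on day p, each w.p. 1/2
worst = max(
    (0.5 * alg_cost(t1, i, p) + 0.5 * alg_cost(t2, i, p)) / min(i, p)
    for i in range(1, 4 * p)  # sequence lengths; OPT pays min(i, p)
)
print("worst-case expected ratio:", worst)   # about 11/6

The exact accounting (the -1 rent days we dropped in the analysis) only helps the algorithm, which is why the printed ratio comes out slightly below 11/6.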
For Wednesday's lecture, we will do a quick introduction to approximation algorithms. If you are already familiar with the topic, this lecture may be less useful to you---but it seemed wrong to do a course on algorithms without at least touching upon this topic, and talking about some of the concepts used and the techniques commonly employed.

Wednesday, November 11, 2009

Lecture 19 notes

I think I was over-simplifying the proof of Fact 2 (the expected number of constraints violated by the optimum of a random r-set is at most md/r, where m is the total number of constraints, d the dimension, and r the sample size); here is a just-as-simple proof of that fact.

Note that given a set R, a constraint h (not in R) is violated by the optimal point for R only if h is in the optimal basis for (R union h). Take a random h outside a random r-set R; now h is a uniformly random element of the (r+1)-set (R union h), and hence h is one of the d basis constraints with probability d/(r+1). Hence the expected number of violated constraints is at most (m-r)d/(r+1) <= md/r. QED.

This is a "backwards-analysis" style proof: instead of considering what happens when we add a random constraint h to a random r-set R, we consider what happens when we remove a random constraint from a random (r+1)-set R'.
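If you want to see Fact 2 in action, here is a Monte Carlo sketch on random 2-dimensional LPs (the instances and parameters are made up for illustration; it assumes numpy and scipy are available, and adds a bounding box so the sampled LPs stay bounded):

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
d, m, r, trials = 2, 200, 20, 50
theta = rng.uniform(0, 2 * np.pi, m)
A = np.column_stack([np.cos(theta), np.sin(theta)])   # constraint normals a_i
b = rng.uniform(0.1, 1.0, m)     # constraints a_i . x <= b_i; x = 0 is feasible
c = np.ones(d)

violated = []
for _ in range(trials):
    R = rng.choice(m, size=r, replace=False)          # a random r-set
    res = linprog(c, A_ub=A[R], b_ub=b[R], bounds=(-2, 2), method="highs")
    violated.append(int(np.sum(A @ res.x > b + 1e-9)))
print("average #violated:", np.mean(violated), " vs bound md/r =", m * d / r)

The box constraints mean the optimal basis may occasionally involve a bound rather than a sampled constraint, but the averages still come out comfortably below md/r.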

Tuesday, November 10, 2009

Homework #3

There was a bug in problem #4b of the homework (thanks, Srivatsan): I've removed that part.

Friday, November 6, 2009

Long-delayed Posts II: Cutting Planes

This is a vast topic dealing with the following question: you are given a convex polytope P in R^n (say, as the intersection of some halfspaces), and you only care about the integer points inside P (i.e., S = P intersect Z^n). How do you come up with some halfspace h such that S lies in h, but some portion of P lies outside the halfspace (and hence is "cut off")?

Ideally you want to find a set H of halfspaces so that P' = P intersect (the intersection of the halfspaces in H) equals convex-hull(S); note that convex-hull(S) is the smallest convex body that contains all of S.

As a simple illustration, if you have the following inequalities
3x + 4y <= 17
2x + 6y <= 22
x, y >= 0
Graphing this, we get: [figure omitted: the feasible region defined by these inequalities]

Adding the first two inequalities gives us
5x + 10y <= 39
or
x + 2y <= 7.80

But since we only care about integer values of x,y, we can round down the RHS, and get the following "valid" inequality:
x+2y <= 7.

Similarly, the fact that x>=0 along with the second inequality implies that 6y <= 22, and using the same rounding-down idea, we get another "valid" inequality:
y <= 3.
And similarly (using the first inequality, and y >= 0), one gets
x <= 5.
Note that each of these cuts off some portion of the original polytope; though we are still left with a fractional vertex (5, 0.5). (How should we cut this fractional vertex off?)

Note that we used some simple rules-of-thumb to generate these inequalities that were valid for all integer points S inside P, but cut off some portions of the polytope. One can ask: are there automated procedures to generate such valid inequalities in a systematic fashion? Are these procedures to generate cutting planes "complete", i.e., do they eventually leave us with the convex hull of S? How much time do these procedures take?
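To make the rounding rule concrete, here is a tiny sketch of one Chvatal-Gomory cut (cg_cut is a hypothetical helper that just implements the rule above using exact rationals): any nonnegative multipliers u with u^T A integral give the valid inequality (u^T A) x <= floor(u^T b).

import math
from fractions import Fraction as F

def cg_cut(A, b, u):
    # Combine the rows of A x <= b with nonnegative multipliers u,
    # then round down the right-hand side.
    coeffs = [sum(ui * aij for ui, aij in zip(u, col)) for col in zip(*A)]
    assert all(c.denominator == 1 for c in coeffs), "u^T A must be integral"
    rhs = math.floor(sum(ui * bi for ui, bi in zip(u, b)))
    return [int(c) for c in coeffs], rhs

# The example above: add the two rows with multipliers (1/5, 1/5).
A = [[F(3), F(4)], [F(2), F(6)]]
b = [F(17), F(22)]
print(cg_cut(A, b, [F(1, 5), F(1, 5)]))   # ([1, 2], 7), i.e., x + 2y <= 7

Adding the row -x <= 0 (from x >= 0) with multiplier 1/3 to 1/6 times the second row similarly produces y <= 3.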

You can refer to entire books on the subject (e.g., Schrijver's book on Linear and Integer Programming). There has been much interest in some of these advanced techniques: the Lovasz-Schrijver method, the Sherali-Adams method, and the Lasserre method. We may blog about this sometime later, but before we end, let us see how the simple idea of rounding down the right-hand side applies to the matching polytope.

The Matching Polytope

One can use the same ideas as above to get Edmonds' matching polytope formulation from the naive matching constraints: start with
x(delta v) <= 1 for all v
x_e >= 0

now take a set A of vertices, and add up the constraints x(delta v) <= 1 over v in A. Each edge with both endpoints in A is counted twice, and each edge with exactly one endpoint in A is counted once, so you get
2 \sum_{e subsetof A} x_e + \sum_{e in delta A} x_e <= |A|.

Now since the x_e's are non-negative, drop the second summation to get
\sum_{e subsetof A} x_e <= |A|/2.

But since all integer solutions give an integer value on the left, you can replace the right by
\sum_{e subsetof A} x_e <= floor(|A|/2).
For odd |A|, this is exactly Edmonds' odd-set constraint, since floor(|A|/2) = (|A|-1)/2.
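The classic example showing these rounded inequalities have bite is the triangle with x_e = 1/2 on every edge; here is a tiny check (a sketch):

# On the triangle, x_e = 1/2 satisfies every degree constraint
# x(delta v) <= 1, but violates the rounded odd-set inequality for
# A = {0, 1, 2}: the left side is 3/2 > 1 = floor(|A|/2).
edges = [(0, 1), (1, 2), (0, 2)]
x = {e: 0.5 for e in edges}

for v in range(3):
    assert sum(xe for e, xe in x.items() if v in e) <= 1   # degree constraints

A = {0, 1, 2}
inside = sum(xe for e, xe in x.items() if set(e) <= A)     # edges inside A
print(inside, "> floor(|A|/2) =", len(A) // 2)             # 1.5 > 1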

Thursday, November 5, 2009

Integrality of the Bipartite Perfect Matching Polytope

Just to elaborate on what Yuan mentioned in class yesterday: (yet) another way to prove the integrality of the bipartite perfect matching polytope BPM is via the following fact:

Given a Delta-regular bipartite graph, it can be decomposed into Delta perfect matchings.

(The proof is via Hall's theorem: show that there exists a perfect matching in any regular bipartite graph, remove this perfect matching from the graph, and repeat.)

Suppose there were a non-integral vertex x* of BPM. Since x* is defined by some set of tight constraints with integer coefficients, it must be rational, so you can scale by some integer C to get an integer vector Cx*. This integer vector can be viewed as a bipartite multigraph (with edge ij appearing Cx*_ij times) in which every vertex i has degree C(\sum_j x*_ij) = C(1) = C. Now we can use the fact above: the vector Cx* can be written as the sum of C perfect matchings.

Hence x* can be written as 1/C times this sum of perfect matchings, i.e., as a convex combination of perfect matchings; since x* is non-integral, these matchings cannot all be identical, and so x* cannot itself be a vertex (extreme point) of BPM.
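Here is a minimal sketch of the decomposition fact in code (for simplicity it assumes a Delta-regular bipartite graph without parallel edges; it repeatedly finds a perfect matching by augmenting paths, i.e., Kuhn's algorithm, and peels it off):

from collections import defaultdict

def find_perfect_matching(adj, left):
    match = {}                        # right vertex -> matched left vertex
    def try_augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match or try_augment(match[v], seen):
                match[v] = u
                return True
        return False
    for u in left:
        if not try_augment(u, set()):
            return None               # cannot happen in a regular bipartite graph
    return {u: v for v, u in match.items()}   # left vertex -> right vertex

def decompose_regular(edges, left):
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    matchings = []
    while any(adj[u] for u in left):
        m = find_perfect_matching(adj, left)   # exists by Hall's theorem
        matchings.append(m)
        for u, v in m.items():                 # remove it; graph stays regular
            adj[u].remove(v)
    return matchings

# Example: K_{3,3} is 3-regular and decomposes into 3 perfect matchings.
edges = [(u, "r%d" % ((i + j) % 3)) for j, u in enumerate("abc") for i in range(3)]
for m in decompose_regular(edges, list("abc")):
    print(m)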

Lecture 17: Separation for Odd Cuts

Yesterday we reduced the separation problem for the matching polytope to the following problem:

given an undirected graph G, find a minimum cut (A,A')
such that A has odd cardinality,
and A does not contain the special vertex s.


We showed that if the weight of the edges crossing such a min-cut was strictly less than 1, then A would correspond to a violated constraint in the (exponentially-sized) LP. And one can find a min-cut (B, B') in the graph in polynomial time. But what if the side B not containing the special vertex s in this min-cut turned out to be of even cardinality? Well, it turns out we need a little more work.

To handle this, we do the following: first, we recurse on G/B (which is the graph obtained by shrinking the set B to a single node), and on G/B' obtained by shrinking B' to a single node. In the latter, we treat the new node obtained from shrinking B' as the "special" node.

Why would one of the two recursions succeed in finding a minimum odd cut? Suppose (A, A') is a minimum odd cut we are looking for (with s not in A). Recall that (B, B') was the min-cut, but |B| was even, and s lies in neither A nor B.
  • If A lay completely inside B or B', we'd find it in one of the recursive calls.
  • If A contains B, we'd find it in the recursive call on G/B.
  • Suppose A intersects both B and B' but does not contain B. Then consider the 4 parts: A cap B, A cap B', A' cap B, and A' cap B'. Since |A| = |A cap B| + |A cap B'| is odd, either |A cap B| is odd, or |A cap B'| is odd.

    1. Say it's the former: |A cap B| is odd. A' cap B' contains s, so it's non-empty.

      Now we bust out the following fact, known as submodularity of the cut function (a numerical sanity check appears at the end of this post): if cut(X) is the weight of all edges going out of some set X, then
      cut(X) + cut(Y) >= cut(X cap Y) + cut(X cup Y).

      So cut(A) + cut(B) >= cut(A cap B) + cut(A cup B) = cut(A cap B) + cut(A' cap B'), where the equality uses the fact that a set and its complement have the same cut, so cut(A cup B) = cut(A' cap B').

      cut(A) = min-odd-cut.
      cut(B) = min-cut.
      cut(A' cap B') >= min-cut, since A' cap B' is non-empty.
      cut(A cap B) >= min-odd-cut.

      Putting these together, the only way to satisfy submodularity is to have equality everywhere. So cut(A cap B) = min-odd-cut; since A cap B lies inside B (and avoids s), we will find it in the recursive call on G/B'.

    2. If it's the latter: |A cap B'| is odd. Since A does not contain all of B, A' cap B is non-empty. Now apply submodularity to A and B', and repeat the same argument.

    This shows that at least one of the recursive calls to G/B or G/B' still has an odd-cut with the same value as cut(A), and hence we will find it recursively.
Finally, we need to bound the runtime of the entire procedure: the two recursive instances G/B and G/B' have n - |B| + 1 and |B| + 1 vertices respectively, and one can check that the whole recursion requires only polynomially many min-cut computations, so the total time is poly(n).
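As promised, here is a quick numerical sanity check of the submodularity fact used above (a sketch on a random weighted graph, brute-forcing all pairs of subsets, so keep n tiny):

import itertools, random

n = 6
random.seed(1)
w = {(i, j): random.random() for i in range(n) for j in range(i + 1, n)}

def cut(X):
    # Total weight of edges with exactly one endpoint in X.
    return sum(wt for (i, j), wt in w.items() if (i in X) != (j in X))

subsets = [set(S) for k in range(n + 1) for S in itertools.combinations(range(n), k)]
assert all(cut(X) + cut(Y) >= cut(X & Y) + cut(X | Y) - 1e-12
           for X in subsets for Y in subsets)
print("cut(X) + cut(Y) >= cut(X cap Y) + cut(X cup Y) holds on this instance")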

Long-delayed Posts I: the Hirsch Conjecture

In Lecture 16, when we'd mentioned the simplex algorithm, I'd mumbled something about the diameter of polytopes and the Hirsch conjecture --- here are some details.

If you consider a polyhedron defined in d dimensions by the intersection of n half-spaces, then the Hirsch conjecture says that the diameter of the 1-skeleton of such a polyhedron is at most n-d. We've been using n dimensions (variables) and m half-spaces (constraints), but I will stick with the n,d notation --- you can do the translation. Kalai and Kleitman showed (in this two-page paper) that the diameter is at most n^{2 + log d} --- and this is essentially the best result known.

This problem was of interest to us, because if the diameter of some LP polytope were indeed large, then the simplex algorithm starting from a bad vertex might require a large number of steps. Of course, we only care about polynomiality, so we would be happy with the polynomial Hirsch conjecture: the diameter of the 1-skeleton is at most poly(n,d).

Some more resources: