UB Theory Weekly – Week 23

Hi all,

The 23rd Theory Weekly Meeting will happen this Friday from 2:00 to 3:00 pm at 203 Davis Hall (Theory Lab). The schedule will be as follows:

2:00 – 2:50 pm – A talk by Michael Langberg. The abstract in his own words:

Can one bit of added communication significantly change the amount of information one can communicate over a noisy network? In this talk I will introduce and discuss this question. Hopefully, I will be able to show the (surprising?) results of a recent work on the topic.

Ken Clarkson’s Talk at FWCG 2015

I was extremely fortunate to attend Ken Clarkson’s excellent talk on “Input Sparsity and Hardness for Robust Subspace Approximation” at FWCG this year. Excerpts from the talk –

Let’s assume that we are given an input of n points a_1, a_2, \ldots, a_n in d dimensions. These points are arranged as the rows of a matrix A \in \mathbb{R}^{n \times d}.

The goal is to find a k-dimensional subspace L minimizing cost(A,L) \equiv \sum_{i\in [n]} dist(a_i,L)^2 among all possible k-dimensional subspaces.

The immediate solution is the Singular Value Decomposition (SVD hereafter): there is a best rank-k approximation A_k = U_k\Sigma_kV_k^\top with cost(A,L^*) = \|A - A_k\|_F^2 = \sum_{i\in[n]}dist(a_i, a_iV_kV_k^\top)^2, where L^* is the span of the top k right singular vectors in V_k^\top. The best known approximation algorithms can find L' such that dim(L') = k and cost(A, L') \le (1 + \epsilon)cost(A, L^*) in O(nnz(A) + (n + d)poly(k\epsilon^{-1})) time. It is also important to note that, because the cost squares each distance, a single point (or an outlier) can carry a lot of weight and dominate the decisions made in the decomposition. This leads to the conclusion that SVD is not a very robust technique.
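To make the SVD baseline concrete, here is a minimal numpy sketch (my own illustration with arbitrary sizes, not code from the talk) of computing the optimal subspace L^* and cost(A, L^*):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 100, 10, 3
A = rng.standard_normal((n, d))

# Full SVD: A = U @ diag(s) @ Vt.  The top-k right singular vectors
# (first k rows of Vt) span the optimal subspace L*.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Vk = Vt[:k].T                      # d x k basis of L*

# Project each row a_i onto L*; the cost is the sum of squared residuals,
# which equals the sum of the squared tail singular values (Eckart–Young).
A_k = A @ Vk @ Vk.T                # best rank-k approximation
cost = np.sum((A - A_k) ** 2)
assert np.isclose(cost, np.sum(s[k:] ** 2))
print(f"cost(A, L*) = {cost:.4f}")
```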

The alternative solution is to use M-estimators for regression. Let the r_i be the residuals, i.e., the differences between the i-th observations and the fitted values. The traditional least-squares method minimizes \sum_i r_i^2, which runs into trouble when there are outliers in the data, as we saw in the earlier case. M-estimators instead solve \min_x\sum_iM(r_i) to address this issue and are proven to reduce the effect of outliers.
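As an illustration, here is a small sketch of one standard M-estimator, the Huber loss, fit by iteratively reweighted least squares. The choice of Huber, the IRLS scheme, and the data are my own, not from the talk:

```python
import numpy as np

def huber_regression(X, y, delta=1.0, iters=50):
    """Sketch of M-estimation with the Huber loss via IRLS:
    M(r) = r^2/2 for |r| <= delta, and delta*|r| - delta^2/2 otherwise.
    Residuals beyond delta get weight delta/|r| < 1, damping outliers."""
    x = np.linalg.lstsq(X, y, rcond=None)[0]        # least-squares start
    for _ in range(iters):
        r = y - X @ x
        w = delta / np.maximum(np.abs(r), delta)    # 1 inside, delta/|r| outside
        sw = np.sqrt(w)                             # solve min sum_i w_i r_i^2
        x = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
    return x

rng = np.random.default_rng(0)
X = np.c_[np.ones(50), rng.uniform(0, 10, 50)]      # intercept + one feature
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.standard_normal(50)
y[:3] += 40.0                                       # plant a few gross outliers
print("least squares:", np.linalg.lstsq(X, y, rcond=None)[0])
print("Huber (IRLS): ", huber_regression(X, y))
```

On this toy data the plain least-squares fit is dragged toward the planted outliers, while the Huber fit stays close to the true coefficients (1, 2).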

Now that a more robust technique is established, we need to see how it can be applied to low-rank approximation. Low-rank approximation also calls for rotational invariance: we want the cost to be preserved when \mathbb{R}^d is rotated by a matrix W. For example, cost functions of the form \sum_i \|a_i - b_i\|_1 can never be used: even when B has rank zero, the cost is not preserved under most rotations. The workaround is to use cost functions of the form \sum_{i \in [n]}M(dist(a_i, L)) for M(z) = z^p. These functions are rotationally invariant, and more robust for smaller values of p. For any basis V of L, we have cost(A,L) = cost(A,V) \equiv \sum_{i\in [n]} M(dist(a_i, a_i V V^\top)). A subspace L' is a (1 + \epsilon)-approximation if dim(L') = k and cost(A, L') \le (1 + \epsilon)\min_{dim(L) = k}cost(A, L).
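A quick numeric check of this point (my own illustration): with B = 0, the p = 2 cost of the rows is preserved under a random rotation W, while the entrywise L1 cost generally is not:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

# A random rotation W (orthogonal matrix with det +1) via QR.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
W = Q if np.linalg.det(Q) > 0 else -Q

l2 = lambda M: np.sum(np.linalg.norm(M, axis=1) ** 2)   # sum_i ||a_i||_2^2
l1 = lambda M: np.sum(np.abs(M))                        # sum_i ||a_i||_1

print(l2(A), l2(A @ W))   # equal: Euclidean row lengths are rotation-invariant
print(l1(A), l1(A @ W))   # different in general: entrywise L1 is not
```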

Some nice results on this front –

  1. Hardness – It is NP-hard to find an O(1 + 1/d) approximation for p \in [1,2).
  2. For p \in [1, 2), there is an algorithm that returns a (1 + \epsilon)-approximation in O(nnz(A) + (n+d)poly(k\epsilon^{-1}) + exp(poly(k\epsilon^{-1}))) time, with constant probability.
  3. Properties of the “nice” M-estimators (checked numerically in the sketch after this list) –
    1. All such functions are even – M(z) = M(-z)
    2. Monotonic – |a| > |b| \Rightarrow M(a) > M(b)
    3. Bounded growth – there is a constant C_M such that for a \ge b > 0, C_M a/b \le M(a)/M(b) \le a^2/b^2.
    4. Root sub-additivity – \sqrt{M(a + b)} \le \sqrt{M(a)} + \sqrt{M(b)}.
  4. For any vector x, \sqrt{\sum_{i \in [n]}M(x_i)} is almost a norm – it fails only the scale-related properties (scaling x by c need not scale it by c).
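Here is a small numeric sanity check of these properties (my own illustration, not a proof), using the Huber function as a canonical nice M-estimator:

```python
import numpy as np

# Huber with delta = 1: M(z) = z^2/2 for |z| <= 1, and |z| - 1/2 otherwise.
def M(z):
    z = np.abs(z)
    return np.where(z <= 1.0, z ** 2 / 2, z - 0.5)

z = np.linspace(-10, 10, 2001)
assert np.allclose(M(z), M(-z))              # even: M(z) = M(-z)
assert np.all(np.diff(M(z[z >= 0])) >= 0)    # monotone on [0, inf)

# Root sub-additivity: sqrt(M(a+b)) <= sqrt(M(a)) + sqrt(M(b)).
rng = np.random.default_rng(0)
a, b = rng.uniform(0, 10, 10000), rng.uniform(0, 10, 10000)
assert np.all(np.sqrt(M(a + b)) <= np.sqrt(M(a)) + np.sqrt(M(b)) + 1e-12)

# "Almost a norm": scaling x by c does not scale sqrt(sum M(x_i)) by c.
x = np.array([0.5, 2.0, 3.0]); c = 10.0
print(np.sqrt(np.sum(M(c * x))), c * np.sqrt(np.sum(M(x))))   # not equal
```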

There are some open questions too –

  1. If \epsilon is fixed, what is the hardness of approximation?

He also went on to describe some algorithms for the nice M-estimators, as well as row- and column-reduction techniques that preserve the approximation. Overall, I would say it was a wonderful talk which helped me understand low-rank approximation.

Note – I have only sketched the basic ideas of the topic here. I strongly recommend the following papers for further reading –

1. Sparse Subspace Embeddings

2. Sketching for M-estimators

Acknowledgements – I would like to thank Ken Clarkson for his support and encouragement (and slides :)), which played a major role in this blog post.

UB Theory Weekly – Week 2

Hi all,

The second Theory Weekly Meeting will happen this Friday from 4:00 to 5:00 pm at 203 Davis Hall (Theory Lab). The schedule will be as follows:

4:00 – 4:50 pm – A talk by Prof. Roger (Xin) He.

The abstract in his own words:

Two “graph drawing” problems –

Given a graph G, the general idea of “graph drawing” is to draw G on the plane so that certain conditions/optimization measures are met. Depending on these conditions, the problems can be fairly practical or very theoretical. The tools involved are often graph-theoretic; sometimes there is a geometric flavor to them. (Since we are talking about drawings on the plane, most of the interesting results are for planar graphs.)

I’ll be discussing two such problems (toward the theoretical end) that my students and I have studied or are currently studying.

Definition 1: Given a planar graph G and a metric function D(*,*), a “greedy drawing” of G is a drawing in which:
1. Each vertex of G is drawn as a point on the plane.
2. Each edge of G is drawn as a straight line connecting its two end vertices.
3. The “greedy drawing” condition holds: for any two vertices u, w in G, there exists a neighbor v of u such that D(v,w) < D(u,w). (In other words, v is closer to w than u is.)

Condition 3 is equivalent to the following: for any two vertices u, w in G, there is a path P_{uw} such that, when traveling from u to w, the distance D(x,w) from the intermediate vertices x of P_{uw} to w strictly decreases.

The problem is motivated by a network routing problem. If G has a greedy drawing, we get a very simple routing protocol: if u wants to send a message to the destination w, u can simply send it to a neighbor v that is closer to w than u itself. (Then v follows the same protocol; the greedy property guarantees the message will reach the destination w.)
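As an aside (my own sketch, not part of the abstract), the protocol is simple enough to state in a few lines of Python; the graph and coordinates below are made up for illustration:

```python
import math

def greedy_route(pos, adj, u, w):
    """Return the greedy path from u to w, or None if routing gets stuck.
    pos maps vertices to coordinates; adj maps vertices to neighbor lists."""
    dist = lambda a, b: math.dist(pos[a], pos[b])
    path = [u]
    while u != w:
        # Forward to a neighbor of u strictly closer to the destination w.
        closer = [v for v in adj[u] if dist(v, w) < dist(u, w)]
        if not closer:
            return None        # no greedy next hop: the drawing is not greedy
        u = min(closer, key=lambda v: dist(v, w))
        path.append(u)
    return path

# A 4-cycle drawn as a unit square: greedy routing succeeds here.
pos = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
print(greedy_route(pos, adj, 0, 2))   # e.g. [0, 1, 2]
```

Since the distance to w strictly decreases at every hop, the route can never revisit a vertex, which is exactly why the greedy property guarantees delivery.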

The questions are: which classes of graphs have such drawings? And when such a drawing exists, what is its smallest size?

A (moderately famous) related conjecture (now a theorem) is (Papadimitriou, 2003): every 3-connected plane graph G has a greedy drawing with respect to the Euclidean distance D(*,*). The conjecture was proved by Tom Leighton et al. (2008). However, there are drawbacks and open problems, which we will discuss.

Another problem:

Def 2: Same as Def 1, except that condition 3 is replaced by the following “monotone condition”: for any two vertices u, w, there is a path P_{uw} from u to w and a line (or direction) l_{uw}, such that P_{uw} is monotone with respect to l_{uw} (namely, the order of the projections of the vertices of P_{uw} onto the direction l_{uw} is the same as the order of the vertices along P_{uw}).

This drawing style was introduced three years ago, partially to address the drawbacks of “greedy drawing”. The questions are the same: which classes of graphs have such drawings? And when such a drawing exists, what is its smallest size?
I’ll discuss partial results, and related problems.
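Again as an aside (my own sketch, not part of the abstract), the monotone condition for a single path is easy to check numerically by projecting its vertices onto the candidate direction:

```python
import numpy as np

def is_monotone(points, direction):
    """points: ordered vertex coordinates of a path; direction: a 2-vector.
    The path is monotone w.r.t. the direction if the scalar projections
    of its vertices are strictly increasing along the path."""
    l = np.asarray(direction, dtype=float)
    l /= np.linalg.norm(l)
    proj = np.asarray(points, dtype=float) @ l
    return bool(np.all(np.diff(proj) > 0))

# A staircase path is monotone w.r.t. the diagonal direction (1, 1),
# but not w.r.t. the x-axis alone.
path = [(0, 0), (1, 0), (1, 1), (2, 1)]
print(is_monotone(path, (1, 1)))   # True
print(is_monotone(path, (1, 0)))   # False: the step (1,0) -> (1,1) is flat in x
```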

4:50 – 5:00 pm – Socializing with the Theory community.

All are welcome!