I was extremely fortunate to attend Ken Clarkson’s excellent talk on “Input Sparsity and Hardness for Robust Subspace Approximation” at FWCG this year. Excerpts from the talk –

Let’s assume that we are given an input of $n$ points in $d$ dimensions. These points are arranged as the rows $a_1, \dots, a_n$ of a matrix $A \in \mathbb{R}^{n \times d}$.

The goal is to find a $k$-dimensional subspace $L$ minimizing the total distance of the points to $L$ among all possible $k$-dimensional subspaces.

The immediate solution is through Singular Value Decomposition (SVD hereon) – there is a best rank-$k$ approximation $A_k$ for the sum-of-squared-distances cost, where the optimal subspace $L_k$ is the span of the top $k$ right singular vectors of $A$. The best known approximation algorithms can find a rank-$k$ matrix $\hat{A}$ such that $\|A - \hat{A}\|_F \le (1+\varepsilon)\|A - A_k\|_F$ in $O(\mathrm{nnz}(A)) + (n + d)\cdot\mathrm{poly}(k/\varepsilon)$ time. It is also important to note that because SVD minimizes a sum of *squared* distances, a single faraway point (an outlier) can carry a lot of weight and dominate the decomposition. This leads to the conclusion that SVD is not a very robust technique.
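As a quick illustration of this sensitivity (my own sketch, not from the talk), the NumPy snippet below fits the best rank-$k$ subspace via SVD and shows how a single large outlier can tilt it. The helpers `svd_subspace` and `max_principal_angle` are names I made up for this example:

```python
import numpy as np

rng = np.random.default_rng(0)

# 100 points in R^5 lying near a 2-dimensional subspace, as rows of A.
n, d, k = 100, 5, 2
basis = np.linalg.qr(rng.standard_normal((d, k)))[0]   # orthonormal d x k
A = rng.standard_normal((n, k)) @ basis.T + 0.01 * rng.standard_normal((n, d))

def svd_subspace(A, k):
    """Best rank-k subspace for squared distances: span of top k
    right singular vectors (Eckart-Young)."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T                                    # d x k orthonormal basis

V = svd_subspace(A, k)

# A single extreme outlier can swing the recovered subspace.
A_out = A.copy()
A_out[0] = 1000.0 * rng.standard_normal(d)
V_out = svd_subspace(A_out, k)

def max_principal_angle(U, W):
    """Largest principal angle (degrees) between the subspaces
    spanned by the columns of U and W."""
    s = np.linalg.svd(U.T @ W, compute_uv=False)
    return np.degrees(np.arccos(np.clip(s.min(), -1.0, 1.0)))

print(max_principal_angle(basis, V))      # near 0 degrees: good fit
print(max_principal_angle(basis, V_out))  # noticeably larger: outlier tilted it
```

The principal-angle check makes the effect quantitative: without the outlier the fitted subspace essentially coincides with the true one, while the single corrupted row drags it away.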

The alternative solution is to use $M$-estimators for regression – let $r_i = y_i - \hat{y}_i$ be the residuals, i.e. the difference between the $i$th observation and the fitted value. The traditional least squares method tries to minimize $\sum_i r_i^2$, which runs into trouble when there are outliers in the data, as we saw in the earlier case. An $M$-estimator instead minimizes $\sum_i M(r_i)$ for a function $M$ that grows more slowly than $r^2$ in the tails, and such estimators are proven to reduce the effect of outliers.
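A minimal sketch of the contrast, using the Huber function as the $M$-estimator and iteratively reweighted least squares (IRLS) to fit it – the example data and the `huber_irls` helper are my own, not from the talk:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simple linear model y = 2x + noise, with one gross outlier.
x = np.linspace(0, 1, 50)
y = 2.0 * x + 0.05 * rng.standard_normal(50)
y[10] = 50.0                                  # outlier
X = np.column_stack([x, np.ones_like(x)])     # design matrix (slope, intercept)

# Least squares: minimizes sum r_i^2 and is dragged toward the outlier.
beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]

def huber_irls(X, y, delta=1.0, iters=50):
    """Fit the Huber M-estimator by IRLS: the loss is quadratic for
    |r| <= delta and linear beyond, so large residuals are down-weighted."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(iters):
        r = y - X @ beta
        w = np.where(np.abs(r) <= delta, 1.0, delta / np.abs(r))
        sw = np.sqrt(w)[:, None]              # weighted least squares step
        beta = np.linalg.lstsq(sw * X, sw[:, 0] * y, rcond=None)[0]
    return beta

beta_huber = huber_irls(X, y)
print(beta_ls)     # slope far from the true 2 because of the outlier
print(beta_huber)  # slope close to 2
```

The single corrupted observation moves the least-squares slope by several units, while the Huber fit recovers the true slope almost exactly.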

Now that a more robust technique is established, we need to see how it can be applied to low-rank approximation. Low-rank approximation also calls for rotational invariance, where we want the cost to be preserved even if $A$ is rotated by an orthogonal matrix $R$. For example, entrywise cost functions of the form $\sum_{i,j} f(A_{ij} - B_{ij})$ can never be used, since even when $B$ has rank zero the rotational invariance property fails for most rotations. The workaround is to use cost functions of the form $\mathrm{cost}(L) = \sum_{i=1}^{n} \mathrm{dist}(a_i, L)^p$ for $p \ge 1$. These functions are rotationally invariant and more robust for smaller values of $p$. For any orthonormal basis $V$ of $L$, we have $\mathrm{dist}(a_i, L) = \|a_i(I - VV^{\top})\|_2$. A subspace $L'$ is a $(1+\varepsilon)$-approximation if $\dim(L') = k$ and $\mathrm{cost}(L') \le (1+\varepsilon)\min_{\dim(L)=k}\mathrm{cost}(L)$.
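This invariance is easy to check numerically. The sketch below (my own, with a hypothetical `cost` helper) verifies that the distance-based cost is unchanged when both the points and the subspace are rotated, while an entrywise cost is not:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k, p = 30, 4, 2, 1.0

A = rng.standard_normal((n, d))
V = np.linalg.qr(rng.standard_normal((d, k)))[0]   # orthonormal basis of L

def cost(A, V, p):
    """cost_p(A, L) = sum_i dist(a_i, L)^p with dist(a_i, L) = ||a_i (I - V V^T)||_2."""
    resid = A - (A @ V) @ V.T          # component of each row orthogonal to L
    return np.sum(np.linalg.norm(resid, axis=1) ** p)

# A random orthogonal R: rotating the points (rows map a -> a R) and the
# subspace (basis maps V -> R^T V) preserves Euclidean distances, so the
# cost is unchanged.
R = np.linalg.qr(rng.standard_normal((d, d)))[0]
c1 = cost(A, V, p)
c2 = cost(A @ R, R.T @ V, p)
print(abs(c1 - c2) < 1e-9)             # True

# By contrast, an entrywise cost like sum |A_ij - B_ij| changes under
# the same rotation (almost surely, for random data).
B = (A @ V) @ V.T
print(np.sum(np.abs(A - B)), np.sum(np.abs(A @ R - B @ R)))
```

Note the basis transforms as $V \mapsto R^{\top}V$ because the rows of $A$ transform as $a \mapsto aR$; the transformed basis stays orthonormal.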

Some nice results on this front –

- Hardness – It is NP-hard to find a $(1 + 1/\mathrm{poly}(d))$-approximation for $p \in [1, 2)$.
- For $p \in [1, 2)$, an algorithm exists such that it returns a $(1+\varepsilon)$-approximation in $O(\mathrm{nnz}(A)) + (n + d)\cdot\mathrm{poly}(k/\varepsilon)$ time with constant probability.
- Results hold for $M$-estimators $M(\cdot)$ that are “nice” –
- All such functions are even – $M(x) = M(-x)$.
- Monotonic – $M(x) \le M(y)$ whenever $|x| \le |y|$.
- Bounded growth – There is a constant $C_M$ such that for $|x| \ge |y| > 0$, $M(x)/M(y) \le C_M\,(x/y)^2$.
- Root sub-additivity property – $\sqrt{M(x+y)} \le \sqrt{M(x)} + \sqrt{M(y)}$.

- For any vector $y$, the quantity $\|y\|_M = \sum_i M(y_i)$ behaves almost like a norm, though it lacks a few properties such as scale invariance (homogeneity under scaling).
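To make the “nice” conditions concrete, here is a small numerical check (my own sketch) that the Huber function satisfies all four properties; the choice of `delta` and the test points are arbitrary:

```python
import numpy as np

def M(x, delta=1.0):
    """Huber function: quadratic near zero, linear in the tails."""
    x = np.abs(x)
    return np.where(x <= delta, 0.5 * x**2, delta * (x - 0.5 * delta))

xs = np.linspace(-10, 10, 2001)

# Even: M(x) = M(-x).
assert np.allclose(M(xs), M(-xs))

# Monotonic: M is non-decreasing in |x|.
pos = np.linspace(0, 10, 1001)
assert np.all(np.diff(M(pos)) >= 0)

# Bounded (at most quadratic) growth: M(x)/M(y) <= C_M (x/y)^2 for x >= y > 0.
# For Huber, C_M = 1 works since M(x)/x^2 is non-increasing on x > 0.
x, y = 7.0, 0.3
assert M(x) / M(y) <= (x / y) ** 2

# Root sub-additivity: sqrt(M(x + y)) <= sqrt(M(x)) + sqrt(M(y)).
a, b = 2.5, 4.0
assert np.sqrt(M(a + b)) <= np.sqrt(M(a)) + np.sqrt(M(b))

print("all four niceness properties hold for Huber at the tested points")
```

These are spot checks at sampled points rather than proofs, but they track exactly the four conditions listed above.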

There are some open questions too –

- If $k$ is fixed, what is the hardness of approximation?

He also went on to describe some algorithms for the nice $M$-estimators, along with row and column reduction techniques which preserve a $(1+\varepsilon)$-approximation. Overall, I would say it was a wonderful talk which helped me understand low-rank approximation.

Note – I have only sketched a basic idea of the topic here. I strongly recommend the following papers for further reading –

1. Input Sparsity and Hardness for Robust Subspace Approximation.
2. Sketching for M-estimators.

Acknowledgements – I would like to thank Ken Clarkson for his support and encouragement (and slides :)), which played a major role in this blog post.