Nuit Blanche: 03/01/2008

Monday, March 31, 2008

Monday Morning Algorithm 12: Boosted Subspace Pursuit using Re-weigthed Lp for Compressive Sensing

[Warning: Before you read this entry, read this one first ]
Today's algorithm is going to be a composition of two algorithms already implemented. When I implemented the algorithm of Wei Dai and Olgica Milenkovic on Subspace Pursuit (Monday Morning Algorithm 10), my implementation and my simple example would work most of the time but not all the time. I also noticed that the final solution was close to the expected solution. Then I remembered what Emmanuel Candes said in his lecture on re-weighted L1 (specifically, the last one entitled Applications, experiments and open problems) : find a solution close enough and then use re-weighted L1 to refine the solution further. Since the re-weighted Lp solution technique of Rick Chartrand and Wotao Yin has also been implemented (Monday Morning Algorithm 2), I rewrote a line into it so that it would take the solution of the subspace Pursuit as the initial guess of its solution (instead of the L2 solution as is done in the original algorithm). By running SPtrial of the initial subspace pursuit implementation one may get this result:

The solution is wrong but not that wrong. By using this solution as a guess into the re-weighted Lp algorithm, one eventually obtains:

function x = lp_re1(A,y,p,x0)
%
% lp-Reweighted Least Squares Problem Solver
% Algorithm implementing an approach suggested in
% "Iteratively Reweighted Algorithms for Compressive Sensing"
% by Rick Chartrand and Wotao Yin
% Algorithm implemented as featured in:
% http://math.lanl.gov/Research/Publications/Docs/chartrand-2008-iteratively.pdf
%
%
% lp_re solves problems of the following form:
% minimize ||x||_p
%
% subject to A*x = y
%
% where A and y are problem data and x is variable (described below).
%
% CALLING SEQUENCES
% [x] = lp_re1(A,y,p,x0);
%
% INPUT
% A : mxn matrix; input data. columns correspond to features.
%
% m : number of measurements ( or rows) of A
% n : dimension of ambient space (or columns) of A
%
% y : vector of dimension m; result.
%
% p : norm order; p = 1 is l1 norm
%
% x0 : guess for x, vector of size n
%
% OUTPUT
% x : vector of vector n; initial unknown
%
% USAGE EXAMPLES
% x = lp_re(A,y,p,x0);
%
% AUTHOR Igor Carron
% UPDATE ver. 0.1, Feb 7 2008
%
% Creative Commons Licence
% http://en.wikipedia.org/wiki/Creative_Commons
%
epsilon = 1
% u_0 is a guess given to the subroutine
u_0 = x0;
u_old = u_0;
j=0;
while epsilon > 10^(-8)
j;
epsilon;
j = j + 1
w = (u_old.^(2) + epsilon).^(p/2-1);
v = 1./w;
Q_n = diag(v,0);
tu = inv(A*Q_n*A');
u_new = Q_n * A' * tu * y;
if lt(norm(u_new - u_old,2),epsilon^(1/2)/100)
epsilon = epsilon /10;
end
u_old = u_new;
end
x=u_new;

This is obviously a toy example that can be started by running SP1trial. In larger cases, one needs to pay attention to the issue Joel Tropp mentioned earlier (fast multiply) as re_lp1 is real slow otherwise. The code and example can be found on the Monday Morning Algorithm page.

Credit:NASA/JPL/GSFC/SwRI/SSI, Enceladus Heat map as seen by Cassini 19 days ago.

Compressed Sensing: Who are you ? and the poll result.

This blog was initially there so I would be able to put those nagging ideas in a written form somewhere. It has evolved to be a little more than that in many ways but the most profound change recently has been a focus on Compressed Sensing or Compressive Sensing. With that in mind and after a year in that mode, here are some of the statistics of the blog that might account for how a nascent subject is faring on the interwebs.

First things first, there have been about 170 posts on the subject in the past year. More recent entries provide the abstract of the new papers because I believe that in this day and age, unless there is a location that does that, the papers won't get read. It's a shame to say this but my own experience suggests that unless you have been hooked on by an abstract and maybe some graphs, it is very difficult to get people to read your stuff. No matter how good the paper is. The reason ? Everybody is busy and at the same time drowning under a pile of things to do. As one of the reader mentioned, each entry provides a good read when commuting. The blog serves the purpose of getting people's attention on every new paper but it also provides for Google a way to index it. While Mark Davenport and other folks at Rice do an excellent job at gathering new papers on the subject when it is submitted to them, sometimes a more pro-active means is necessary to find new papers on the subject. Mark also does a great job at adding other subjects areas that have, in retrospect, great relevance to the subject.

Eventually, I am still bewildered at the different communities that contribute to the subject. The side effect of this mix is that the community sees itself as loosely tied both in terms of subject areas treated and geographically (as witnessed from the heat map above.) The figure below provides actual hits and page views to the blog. It shows a steady increase of the readership mostly through search requests from Google. This March was the first time the blog saw more than 4,000 page views in a month. This is not a Slashdot effect as I have seen before but rather a growing trend over several months.

The blog is also read through RSS readers that are traditionally not counted toward page views or site visits. Feedburner says that about 40 readers read the blog daily, while Google Reader says that I have 62 readers using that application. Those two numbers should not be added but rather should be understood as saying that about 50 people read this blog daily through their RSS readers.

Feedburner also provides page statistics, and the 160 average daily visit squares well with the 4,000 visit/monthly given above. That 160 number is also well correlated with Google analytics numbers shown below.

But the statistics I like the most is this one: In the past week, about 20 percent of the people visiting the blog directly, ended up spending more than 5 minutes on the blog. Five minutes is a loooooong time......

Now with regards to the poll. Answer 2 "Compressed Sensing: A Paradigm shift in sensing, don't be the last one to say DUH" is the one with the most hits. It looks like going for the Simpsons is always a winner.

One more thing, the ads ran for 18 days and the words "compressed sensing" and "compressive sensing" seemed to have been searched on Google about 1,600 times or about 88 times a day (number of impressions) during that time period.

And remember, this statistic 50/160/88 squarely goes against George Constanza's routine: It's not me, It's you :-)

Friday, March 28, 2008

Compressed Sensing: A Lecture on the Convergence of CoSaMP

Joel Tropp gave an excellent talk today at IHP. For two hours, he pretty much gave the proofs of the convergence of CoSaMP, a work he has done with Deanna Needell.

Besides the talk, here are some excerpts that I found interesting. He mentioned that Emmanuel Candes is still getting reviews saying "Everyone knows this is impossible" when talking about finding the solution of an under-determined system when the solution is sparse. I should probably include this in the adwords ads (please take the poll). It also really points to the need to do a better job spreading that word.

One of the reason, Joel is partial to the Partial Fourier Transform lies in three reasons:

It is technologically relevant: some technology like MRI directly access the Fourier domain
The FFT is a fast matrix-vector multiply (see previous discussion) and finally,
There is a low storage requirement for that matrix (as opposed to Gaussian ensembles for instance)

Finally, when the Conjugate gradient error estimate is included in the general error treatment for the convergence of CoSaMP ( see previous discussion) it looks as though, one requires on the order of three iterations per use of the Conjugate Gradient to get a satisfactory convergence without polluting CoSaMP.

Joel also seems to think that RIP is not the reason why CoSaMP works and that there maybe a more generic reason but that it has not been found yet (my words). The proof of the convergence of CoSaMP relies heavily on RIP but maybe, this is my guess, better bounds independent of RIP could be found in the future.

The organizers did a good job of inviting him over.

On a different note Irina Rish seems top think that taping the workshop is a good idea. We'll see.

Call for Papers: Sparse Optimization and Variable Selection

Here is a workshop related to many of the themes mentioned on Nuit Blanche. It is an ICML/UAI/COLT 2008 workshop on Sparse Optimization and Variable Selection. There are 38 days before the deadline submission on May 5. From the site:

Call for Papers

Please submit an extended abstract (1 to 3 pages in two-column ICML format) to the workshop email address sparse.ws@gmail.com. The abstract should include author names, affiliations, and contact information. Papers will be reviewed by at least 3 members of the program committee.

Overview

Variable selection is an important issue in many applications of machine learning and statistics where the main objective is discovering predictive patterns in data that would enhance our understanding of underlying physical, biological and other natural processes, beyond just building accurate 'black-box' predictors. Common examples include biomarker selection in biological applications [1], finding brain areas predictive about 'brain states' based on fMRI data [2], and identifying network bottlenecks best explaining end-to-end performance [3,4], just to name a few.

Recent years have witnessed a flurry of research on algorithms and theory for variable selection and estimation involving sparsity constraints. Various types of convex relaxation, particularly L1-regularization, have proven very effective: examples include the LASSO [5], boosted LASSO [6], Elastic Net [1], L1-regularized GLMs [7], sparse classifiers such as sparse (1-norm) SVM [8,9], as well as sparse dimensionality reduction methods (e.g. sparse component analysis [10], and particularly sparse PCA [11,12] and sparse NMF [13,14]). Applications of these methods are wide-ranging, including computational biology, neuroscience, graphical model selection [15], and the rapidly growing area of compressed sensing [16-19]. Theoretical work has provided some conditions when various relaxation methods are capable of recovering an underlying sparse signal, provided bounds on sample complexity, and investigated trade-offs between different choices of design matrix properties that guarantee good performance.

We would like to invite researchers working on the methodology, theory and applications of sparse models and selection methods to share their experiences and insights into both the basic properties of the methods, and the properties of the application domains that make particular methods more (or less) suitable. We hope to further explore connections between variable selection and related areas such as dimensionality reduction, optimization and compressed sensing.

Suggested Topics

We would welcome submissions on various aspects of sparsity in machine-learning,from theoretical results to novel algorithms and interesting applications. Questions of interest include, but are not limited to:

* Does variable selection provide a meaningful interpretation of interest to domain experts?
* What method (e.g., combination of regularizers) is best-suited for a particular application and why?
* How robust is the method with respect to various type of noise in the data?
* What are the theoretical guarantees on the reconstruction ability of the method? consistency? sample complexity?

Comparison of different variable selection and dimensionality reduction methods with respect to their accuracy, robustness, and interpretability is encouraged.

I need to ask if some recordings of the tutorials will be made, the same way IMA did for the course on Compressed Sensing last year.

Compressed Sensing: A Poll.

In the past three weeks, I have toyed with the Google's Adwords system to gauge the general interest in Compressed Sensing or Compressive Sensing. The field is very small, as expected, so the keywords are very likely to show up every time somebody hits on these keywords (please don't click it kills me :-)). There is also the ability to customize different banners. The poll question is: As of today (no need to click on the ad to change the percentage) which Ad listed above got the most hits ? I'll give the answer on Monday.

Thursday, March 27, 2008

Compressed Sensing: Chirp sensing codes a new deterministic measurement tool and Image Processing with Non-local Spectral Bases

A new update from the Rice Repository, where we have yet another deterministic construction of a measurement matrix by Lorne Applebaum, Stephen Howard, Stephen Searle and Robert Calderbank in Chirp sensing codes: Deterministic compressed sensing measurements for fast recovery. The abstract reads:

Compressed sensing is a novel technique to acquire sparse signals with few measurements. Normally, compressed sensing uses random projections as measurements. Here we design deterministic measurements and an algorithm to accomplish signal recovery with computational efficiently. A measurement matrix is designed with chirp sequences forming the columns. Chirps are used since an efficient method using FFTs can recover the parameters of a small superposition. We show empirically that this type of matrix is valid as compressed sensing measurements. This is done by a comparison with random projections and a modified reduced isometry property. Further, by implementing our algorithm, simulations show successful recovery of signals with sparsity levels similar to those possible by Matching Pursuit with random measurements. For sufficiently sparse signals, our algorithm recovers the signal with computational complexity O(K logK) for K measurements. This is a significant improvement over existing algorithms.

Let us note the use of the statistics of the eigenvalues of the Gram matrices (made from the measurement matrix) "rather than a formulation of their bound" in oder to express some statement on the RIP constants. The measurement matrix is made up of a combination of chirps. The paper also prescribes a reconstruction technique as well. The reconstruction complexity is expected to scale as O(MK logK). From the conclusion:

This paper can serve as a preliminary read to [11] for those less familiar with Reed-Muller codes. Regardless of the above limitation, chirp sensing codes excel as method for fast signal recovery with compressed sensing.

In a different closely related area, we have an investigation of Image Processing with Non-local Spectral Bases by Gabriel Peyré. The abstract reads:

This article studies regularization schemes that are defined using a lifting of the image pixels in a high dimensional space. For some specific classes of geometric images, this discrete set of points is sampled along a low dimensional smooth manifold. The construction of differential operators on this lifted space allows one to compute PDE flows and perform variational optimizations. All these schemes lead to regularizations that exploit the manifold structure of the lifted image. Depending on the specific definition of the lifting, one recovers several well-known semi-local and non-local denoising algorithms that can be interpreted as local estimators over a semilocal or a non-local manifold. This framework also allows one to define thresholding operators in adapted orthogonal bases. These bases are eigenvectors of the discrete Laplacian on a manifold adapted to the geometry of the image. Numerical results compare the efficiency of PDE flows, energy minimizations and thresholdings in the semi-local and non-local settings. The superiority of the non-local computations is studied through the performance of non-linear approximation in orthogonal bases.

It is a long paper worth reading as it tries to connect traditional techniques to newer ones like diffusion methods that have also been used for dimension reduction and by the same token the building of sparse dictionaries.

The main contribution of this paper is that it bridges the gap between adaptive filtering methods and thresholding in adapted orthogonal bases. In order to do so, it reviews recent approaches to semi-local and non-local adaptive filtering of images. The use of thresholding in non-local orthogonal bases is suggested in [50] and we further study the properties of such thresholdings. The connexions with more traditional wavelet-based and total variation algorithms is made clear together with a study of the pros and cons of each method.

Wednesday, March 26, 2008

Compressed Sensing: Combinatorial Explosion of Dictionaries-Measurements-Solvers, Can Sparco Help ?

The current undertakings summarized in the big picture document show a large diversity of approaches when performing Compressed Sensing or Compressive Sensing. And yet, if one were new to the field, what would be required to design a specific means for acquiring compressed samples ?

One would first have to know the domain specific characteristics of the dataset, that is we take for granted that it is understood that the domain specific signals are sparse in some basis. Given that information, one would then perform some type of trade studies on how the processes for measurement and reconstruction are eventually yielding satisfactory data. But here lies a problem that has also plagued wavelets a decade ago: too much choice and a predisposition to always deal with 1-d information.

On the measurement side, we have about 26 ways of performing compressed measurements. On the reconstruction side, we have 17 different algorithms. I am not saying they are a good fit to one another, but we are facing, today, a potential of about 442 different implementations for the same problem. Let us not even talk about the fact that several dictionaries of sparse functions could be used in the reconstruction of one test case. Obviously, a hardware solution will remove many of the measurement and reconstruction schemes either based on engineering trade-offs or because of the nature of the signal. Yet, it would seem that a more automated evaluation of the phase space of possibilities might be helped by a general framework like the one implemented in Sparco. The richness of potential combinations for a specific signal type seem painless to implement. After I mentioned some installation issues, Michael P. Friedlander and Ewout van den Berg have written some work-around and it now works flawlessly after a clean install. They just released that version (v.1.1.3) on the Sparco website. They are also looking for feedbacks and users.

Michael and Ewout mentioned that they wanted Sparco to be relatively agnostic in terms of the solver used, but it would seem to me that there should also be a similar concern with regards to measurement means. In particular, one should probably think of a way to add measurements and solvers at once so that it can be run directly within Sparco.

Credit: NASA/JPL/Space Science Institute, Janus one of Saturn's moon, photograph released today.

Tuesday, March 25, 2008

Wired on Face Recognition and Sparse Representation: An Introduction to Compressed Sensing ?

Wired has a report on the excellent work of Allen Yang, John Wright and Yi Ma mentioned earlier (a technique we implemented in one of the Monday Morning Algorithm.) For those readers coming to this site, please note that a similar type of effort has been undertaken by Jort Gemmeke for voice (Using sparse representations for missing data imputation in noise robust speech recognition). For those of you that want to know more from the description given in the Wired article:

The secret sauce in Yang's new method is a mathematical technique for solving linear equations with sparse entries called, appropriately enough, sparse representation

The "solving linear equation with sparse entries" technique is intimately linked to a subject called compressed sensing or compressive sensing this blog is covering. The same technique is used to implement the astounding single pixel camera at Rice. In short, solving this type of system does not involve solving a least-square problem but involves the use of "weirder" techniques that have only recently been unveiled by people like the recent Fields Medalist Terry Tao, Emmanuel Candes, Justin Romberg, David Donoho and many others. A technical view of the rapidly evolving field on the subject can be found here. Reference papers can be found here.

Credit: NASA/JPL/Space Science Institute, A photograph of Thetys, Titan and Saturn released yesterday.

Compressed Sensing: Computation and Relaxation of Conditions for Equivalence between l1 and l0 Minimization, Sparco remark and an addendum.

Here is an update on Computation and Relaxation of Conditions for Equivalence between l1 and l0 Minimization by Allen Yang, John Wright and Yi Ma. The abstract reads

In this paper, we investigate the exact conditions under which the l1 and l0 minimizations arising in the context of sparse error correction or sparse signal reconstruction are equivalent. We present a much simplified condition for verifying equivalence, which leads to a provably correct algorithm that computes the exact sparsity of the error or the signal needed to ensure equivalence. In the case when the encoding matrix is imbalanced, we show how an optimal diagonal rescaling matrix can be computed via linear programming, so that the rescaled system enjoys the widest possible equivalence.

As stated in this interesting paper:

The main contribution of this paper is a simple, novel algorithm for determining when l1-l0 equivalence holds in a given linear system of equations.

I wonder if this type of algorithm should not be included in Sparco. While Sparco wants to be agnostic with regards to the reconstruction solver (so that everyone can use theirs), measurement matrices and other algorithm checking UUP and/or l0-l1 correspondance might add some additional value to the software.

There is also an addendum to the published version of a paper we covered before (Single Pixel Imaging via Compressive Sampling). It reads:

Erratum for "Single Pixel Imaging via Compressive Sampling"

In page 91, paragraph 5 of the appendix, where the paper reads

"...and so each measurement follows a Poisson distribution with mean and
variance (PT/2) for BS and (PTM/2N) for CS."

it should read

"...and so each measurement follows a Poisson distribution with mean and
variance (PT/2) for BS and (PTN/2M) for CS."

Image courtesy: NASA/JPL-Caltech/Cornell/ASU/Texas A&M/Navigation camera, Opportunity rover camera watching Martian clouds.

Monday, March 24, 2008

Monday Morning Algorithm 11: Sparse Measurement Matrix using Scrambled Block Hadamard Ensemble

In a previous entry, we discovered the paper by Lu Gan, Thong Do and Trac Tran that provides us with a compact measurement matrix as described in Fast compressive imaging using scrambled block Hadamard ensemble. We are going to implement this sparse measurement matrix as we have previously done with a similar entry in the Monday Morning Algorithm series. This measurement matrix is a cornerstone in the field of Compressed Sensing. If you don't know about Compressed Sensing, you may want to check this out. As usual all errors are mine and if you find a mistake in the implementation, please let me know. If you also know how to quietly do evals with matlab I am also a taker.

function [wfinal]=sbhe(k,hd,mt)
%
% Function implementing measurement matrix as given in
% "Fast compressive imaging using scrambled block Hadamard ensemble."
% by Lu Gan, Thong Do and Trac Tran
%
% Written by Igor Carron.
%
% Input:
% k : number of hadamard matrix blocks
% hd: size of Hadamard matrix (needs to be divisible by 2)
% mt: number of columns of wfinal
% Output:
% wfinal: matrix with k*nd columns and mt rows
%
% k=10;
% hd = 32;
% mt = 30;
ntotal = hd * k;
wh = hadamard(hd);
st1='';
for i=1:k-1
st1 = strcat(st1,'wh,');
end
st1= strcat(st1,'wh');
eval(['w = blkdiag(' st1 ')']);
u = randperm(ntotal);
for ii=1:ntotal
wnew(:,ii)=w(:,u(ii));
end
%figure(1)
%spy(w)
%figure(2)
%spy(wnew)
vv = [];
while size(vv,1)~=mt
v2 = cat(2,vv,round(1+rand(mt,1)*(ntotal-1)));
vv = unique(v2);
end
m = size(vv,1);
for j=1:m
wfinal(j,:) = wnew(vv(j),:);
end
%spy(wfinal)

The m file can be downloaded from here or here.

Credit Photo: ESA/ MPS/DLR/IDA, Venus’s atmosphere, taken by the Venus Monitoring Camera (VMC) during Venus Express orbit number 458 on 23 July 2007.

Saturday, March 22, 2008

The story behind the Dantzig Selector paper and an Anagram

Terry Tao provides a nice overview of the story behind the ideas and results of “The Dantzig selector: Statistical estimation when p is much larger than n" a paper he published with Emmanuel Candés. Publishing the paper was one thing, but as he explains:

After accepting our paper, the Annals of Statistics took the (somewhat uncommon) step of soliciting responses to the paper from various experts in the field, and then soliciting a rejoinder to these responses from Emmanuel and I. Recently, the Annals posted these responses and rejoinder on the arXiv.

I'd say, this is indeed rather unusual even in the Applied Sciences. In my view, this move only attests to importance of the work. The blog entry is here.

On a lighter note, Laurent Duval proposes a witty anagram for Compressive sensing: Processing Vein Mess.

Compressed Sensing: Reconstructing sparse signals from their zero crossings

In looking at the updates on the Rice site, I nearly missed this one from Petros Boufounos, Richard Baraniuk, Reconstructing sparse signals from their zero crossings. The abstract reads:

Classical sampling records the signal level at pre-determined time instances, usually uniformly spaced. An alternative implicit sampling model is to record the timing of pre-determined level crossings. Thus the signal dictates the sampling times but not the sampling levels. Logan’s theorem provides sufficient conditions for a signal to be recoverable, within a scaling factor, from only the timing of its zero crossings. Unfortunately, recovery from noisy observations of the timings is not robust and usually fails to reproduce the original signal. To make the reconstruction robust this paper introduces the additional assumption that the signal is sparse in some basis. We reformulate the reconstruction problem as a minimization of a sparsity inducing cost function on the unit sphere and provide an algorithm to compute the solution. While the problem is not convex, simulation studies indicate that the algorithm converges in typical cases and produces the correct solution with very high probability.

In the conclusion they mention:

In summary, formulating the problem of sparse signal reconstruction from zero crossings as a sparsity inducing optimization on the unit sphere stabilizes the reconstruction. This formulation can be extended beyond Fourier sparsity to accommodate other sparsity bases by appropriately modifying the sampling operator but the performance of such a modification needs to be evaluated. Furthermore, although the algorithm described provides the solution with high empirical probability, further research on reconstruction algorithms and their performance is necessary.

This is not unlike the result of Yves Meyer and Basarab Matei where using an additional constraint on the optimization (in the case of Meyer and Basarab, it's really a theorem), one seems to be able to retrieve full signal (in the case of Meyer and Basarab, it is proven).

Compressed Sensing: Fast compressive imaging using scrambled block Hadamard ensemble, Compressed sensing for surface metrology

New updates from the Rice repository:

Lu Gan, Thong Do, Trac Tran that seem to provide us with a compact measurement matrix with Fast compressive imaging using scrambled block Hadamard ensemble

With the advent of a single-pixel camera, compressive imaging applications have gained wide interests. However, the design of efficient measurement basis in such a system remains as a challenging problem. In this paper, we propose a highly sparse and fast sampling operator based on the scrambled block Hadamard ensemble. Despite its simplicity, the proposed measurement operator offers universality and requires a near-optimal number of samples for perfect reconstruction. Moreover, it can be easily implemented in the optical domain thanks to its integer-valued elements. Several numerical experiments show that its imaging performance is comparable to that of the dense, floating-coefficient scrambled Fourier ensemble at much lower implementation cost.

The conclusion is attractive where one would to implement it right away:

This paper has proposed the scrambled block Hadamard ensemble (SBHE) as a new sampling operator for compressive imaging applications. The SBHE is highly sparse and fast computable along with optical-domain friendly implementations. Both theoretical analysis and numerical simulation results have been presented to demonstrate the promising potential of the SBHE. In particular, we showed that a highly sparse SBHE can produce similar compressive imaging performance as that of a dense scrambled Fourier ensemble at much lower implementation cost.

In a different area Jianwei Ma provides a new fascinating reason for using Compressed Sensing in Compressed sensing for surface metrology.

Surface metrology is the science of measuring small-scale features on surfaces. In this paper we introduce compressed sensing theory into the surface metrology to reduce data acquisition. We describe that the compressed sensing is naturally fit to surface analysis. A recovery algorithm is proposed for scratched and textural surfaces by solving a convex optimal problem with sparse constrained by curvelet transform and wave atom transform. One can stably recover compressible surfaces from a random incomplete and inaccurate measurements, i.e., far fewer measurements than traditional methods use while does not obey the Shannon sampling theorem. This is very significant for measurements limited by physical and physiological constraints, or extremely expensive. The compressed measurement essentially shift measurement cost to computational cost of off-line nonlinear recovery. Thus we call this method as compressed metrology or computational metrology. Different from traditional measuring method, it directly senses the geometric and structural features instead of single pixel's information. By combining the idea of sampling, sparsity and compression, the compressed measurement indicates an new acquisition protocol and leads to building simpler and cheaper measurement instruments. Experiments on engineering and bioengineering surfaces demonstrate good performances of the proposed method.

Creidt Photo: Darvaza flaming crater in Turkmenistan John Bradley pictures, Via Fogonazos.

Friday, March 21, 2008

Compressed Sensing: SPARCO is your friend.

In October, I mentioned the birth of SPARCO ( by Ewout van den Berg, Michael P. Friedlander, Gilles Hennenfent, Felix J. Herrmann, Rayan Saab, and Ozgur Yilmaz) a new framework where one could try his/her reconstruction technique and follow the reproducible research guidelines. The software has grown to be a little bit more since then it seems and now it provides a good platform for learning about Compressed Sensing or Compressive Sensing by running different cases and have direct access to the sometime strange measurement matrices found in the literature. Not unlike Wavelab, SPARCO provides a set of problem example that have been worked out and published. All these problem sets are nicely featured in this page. Not only does it list the problem but it also provides the reader with the Measurement matrix type and the Sparsity matrix as well. This is is a nice touch.

Of further interest is the ability to use complex operations that enable to build measurement or sparsity operators as well as meta-operators. A lot has gone into this and the authors should be thanked for it.

The technical report provides ample information on how these examples and operators work. In the documentation, one can see how to rapidly prototype a new case with SPARCO operators. I have had some problems running some of the examples because of functions of the Rice Wavelet Toolbox not getting the right parameters.

Besides this minor problem, SPARCO seems a solid platform that will grow by including other dictionaries, measurement matrices and reconstruction operators to the satisfaction of the whole community. Eventually, we'll need operators to bring the dictionaries found from tensor factorizations into 2-D that SPARCO can handle.

Wednesday, March 19, 2008

Computing Nonnegative Tensor Factorizations, Arthur C. Clarke and Multi-Aperture Lenses

If you are lost because your data isn't 2-D, and you want a sparse decomposition of your dataset, Michael Friedlander and Kathrin Hatz just released Computing nonnegative tensor factorizations. The abstract reads:

Nonnegative tensor factorization (NTF) is a technique for computing a parts-based representation of high-dimensional data. NTF excels at exposing latent structures in datasets, and at finding good low-rank approximations to the data. We describe an approach for computing the NTF of a dataset that relies only on iterative linear-algebra techniques and that is comparable in cost to the nonnegative matrix factorization. (The better-known nonnegative matrix factorization is a special case of NTF and is also handled by our implementation.) Some important features of our implementation include mechanisms for encouraging sparse factors and for ensuring that they are equilibrated in norm. The complete Matlab software package is available under the GPL license.

The Matlab code is available here. This is a tool that enables one to build dictionaries from data not necessarily in dimension 2. It has its place in the sparse dictionaries section of the big picture on Compressive Sensing.

Arthur Clarke (who just passed away), the man publicizing the idea of geostationary satellites, once said, there are three phases to a great idea:

The first phase is when people tell you it’s a crazy idea, it will never work; the second phase is when people say, it might work, but it’s not worth doing; and the third phase is when people say, I told you that was a great idea all along!

Projecting on a few random bases and getting the full signal back from solving a linear programming step, that'll never work....

Following up on the evaluation of the current imaging systems that are tweaked to provide additional information, here is a multi-aperture system at Stanford using 12,616 lenses.

The 3-D information is extracted from the stereographic view projected in different parts of the CMOS. One should also note the statement that one now needs about 10 MP in order to provide a 3 MP spatial resolution. Much of that hardware engineering implementation is focused on not mixing different rays of light and one wonders if mixing them as done in Compressive Sensing would not be simpler. The difficulty of the Compressed Sensing approach resides in that most photographic customers do not want a "computational reconstruction" if that process requires a computer.

So the real question is: is there a customer base that is OK with the computational reconstruction step because it brings additional information (in hyperspectral mode or in a time mode) that they really care about?

Photo Credit: NASA/JPL, 36 locations found by the Mars Odyssey orbiter (named in Honor of Arthur Clarke and his book 2001: A Space Odyssey) and selected by NASA to be the first regions where humans (future Martians) will set foot on Mars when they go there.

Tuesday, March 18, 2008

Compressed Sensing: Subspace Pursuit Code, Fast Matrix-Vector multiply is crucial for Fast Greedy Algorithms, Partial FMM or MRP

Wei Dai kindly provided me with the original code used in the paper on Subspace Pursuit and asked me to put it on this site. In this day and age, I think people expect students and post-docs to set up a webpage. For instance, Google Pages (create a Google account first) provides an easy way of creating a webpage (through page creator) and allows one to own a nice placeholder, especially when one is likely to move into another position in the future. I generally link directly to the website of the people who have written papers on Compressed Sensing. I am always surprised when I cannot find any trace of somebody on the net, especially when that somebody has published and do research at a U.S. University. If you feel I did not link to your site, or did so but to the wrong site, please let me know, I'll correct it.

Following yesterday's entry on CoSaMP: Iterative signal recovery from incomplete and inaccurate samples, Joel Tropp sent me an e-mail on an issue I totally missed and misreported:

"... Thanks for noting CoSaMP in your blog. There's another key point that you didn't discuss, so I wanted to mention it.

There are about three trillion different variations on the OMP algorithm, none of which is especially interesting per se. In fact, I wouldn't be surprised if CoSaMP needs to be tweaked before it performs well in practice. (Different halting criteria, extra least-squares steps, etc.)

There's a truly fundamental algorithmic idea that everyone keeps missing: the least-squares problems MUST be solved iteratively. For large-scale problems, you simply cannot use direct methods (e.g., Gram--Schmidt orthogonalization) for the reasons we outlined in the paper. Once you recognize the importance of iterative methods, you realize that the sampling matrix really ought to support a fast multiply. As a result, with a partial Fourier sampling matrix, you can get CoSaMP to run in time O(N*log(N)). This kind of efficiency is essential if you want to do compressive sampling in practice. The O(m*N) time bound for a Gaussian matrix is unacceptable.

(By the way, a second consequence of using iterative least-squares is that you have to revise the analysis to reflect the fact that iterative methods result in an approximation error.)

The other important point is that you need to prune the approximation to make sure it stays sparse. This idea, however, is not new: it is an important step in HHS pursuit. Indeed, most of the ideas that make CoSaMP work can ultimately be traced back to HHS pursuit and its precursors..."

And he is absolutely right. As one can see in this and other greedy pursuit algorithms, one of the bottleneck (beside proxy forming and sample update) is the computation of the pseudo-inverse of the measurement matrix. In order to do this fast, one needs an iterative scheme like the conjugate gradient which puts an emphasis of a fast matrix-vector multiplication operation. One can also note that there is a similar issue in the re-weighted Lp algorithm of Rick Chartrand and Wotao Yin.

The Fourier transform can be seen as a fast vector-matrix multiply but so can other transforms. This point also brings to light the potential for other transforms that have fast matrix-vector multiply to be part of new Compressive Sensing Measurement schemes such as :

Wavelet transforms (choose whichever one from this list) and,
Fast Multipole Methods or Fast Multipole-like methods.

Let us note that we have seen some of these fast transforms in the measurements side of the compressive sensing before. Multipole methods, for instance have been used in the computational sensing experiment of a linear scattering analysis by Lawrence Carin, Dehong Liu, Wenbin Lin, and Bin Guo in Compressive Sensing for Multi-Static Scattering Analysis.
An extract reads:

The MLFMM (Multi-Level Fast Multipole Method) code was now run in a CS mode. Specifically, rather than assuming plane-wave excitation, the excitation v in (15) was composed of a linear combination of plane waves, with the weights on this superposition drawn iid from a Gaussian random variable, with zero mean and unit variance.

Sampling different columns of a Fourier transform could be seen as equivalent to the process of a random linear combination of a Fast Multipole Transform. That probably needs to be looked into. Let us note that the authors did not try a different random set such as choosing only certain columns of that transform such as in the Partial Fourier Transform.

Fast matrix-vector multiply also point to other algorithms that are hierarchical in nature such as the Multiscale Random Projections identified by Marco Duarte, Mark Davenport, Michael Wakin, Jason Laska, Dharmpal Takhar, Kevin Kelly, and Richard Baraniuk [1]. The hierarchical nature of the measurement matrix would also ensure that at every stage, the result of the matrix - vector multiply is sparse.

The big question then becomes: Are there any transforms such as a Partial Fast Multipole Transform, Partial Wavelet Transform or the Multiscale Random Projections that yield measurement matrices that fulfill the Restricted Isometry Property with a small constant? and how do they fare with a greedy algorithm such as CoSaMP ?

While the partial Fourier Transform might point to application in optics, Fast Multipole Methods might point to applications in numerous engineering inverse problems and related hardware.

References:
[1]Marco Duarte, Mark Davenport, Michael Wakin, Jason Laska, Dharmpal Takhar, Kevin Kelly, and Richard Baraniuk, Multiscale random projections for compressive classification
Credit Photo: NASA/JPL, Dust devil on Mars as seen by the Spirit rover on February 26, 2007.

Monday, March 17, 2008

Compressed Sensing: CoSaMP: Iterative signal recovery from incomplete and inaccurate samples

Deanna Needell and Joel Tropp just released CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. The abstract reads:

Compressive sampling offers a new paradigm for acquiring signals that are compressible with respect to an orthonormal basis. The major algorithmic challenge in compressive sampling is to approximate a compressible signal from noisy samples. This paper describes a new iterative recovery algorithm called CoSaMP that delivers the same guarantees as the best optimization-based approaches. Moreover, this algorithm offers rigorous bounds on computational cost and storage. It is likely to be extremely efficient for practical problems because it requires only matrix-vector multiplies with the sampling matrix. For many cases of interest, the running time is just O(N logN), where N is the length of the signal.

It is a long manuscript for an algorithm but the paper itself is a very impressive one. On top of describing their algorithm, they do a very nice job providing a picture on how their algorithm and other similar ones are related to each other. This is a welcome addition for newcomers as it gives a good big picture of the current crop of greedy algorithms and their corresponding limitations.

The authors also have an interest in mind with regards to the hardware implementation as shown here:

Even though partial Fourier matrices may require additional samples to achieve a small restricted isometry constant, they are more interesting for the following reasons [2]. There are technologies that acquire random Fourier measurements at unit cost per sample.
The sampling matrix can be applied to a vector in time O(N logN).
The sampling matrix requires only O(mlogN) storage.
Other types of sampling matrices, such as the random demodulator [33],
enjoy similar qualities. These traits are essential for the translation of compressive sampling
from theory into practice.

I like this paper for many reasons including bringing an awareness of other current reconstruction methods. I should follow their classification in my big picture page. As it turns out, I had not made clear the difference between convex relaxation and combinatorial algorithms. The authors point :

The literature describes a huge number of approaches to solving this problem. They fall into three rough categories:

Greedy pursuits: These methods build up an approximation one step at a time by making locally optimal choices at each step. Examples include OMP [34], stagewise OMP (StOMP) [12], and regularized OMP (ROMP) [27, 26].
Convex relaxation: These techniques solve a convex program whose minimizer is known to approximate the target signal. Many algorithms have been proposed to complete the optimization, including interior-point methods [2, 23], projected gradient methods [13], and iterative thresholding [10].
Combinatorial algorithms: These methods acquire highly structured samples of the signal that support rapid reconstruction via group testing. This class includes Fourier sampling [18, 20], chaining pursuit [15], and HHS pursuit [16], as well as some algorithms of Cormode/Muthukrishnan [9] and Iwen [21].

At present, each type of algorithm has its native shortcomings. Many of the combinatorial algorithms are extremely fast-sublinear in the length of the target signal but they require a large number of somewhat unusual samples that may not be easy to acquire. At the other extreme, convex relaxation algorithms succeed with a very small number of measurements, but they tend to be computationally burdensome. Greedy pursuits, in particular, the ROMP algorithm- are intermediate in their running time and sampling efficiency.

The analysis is inspired by the work on ROMP [27, 26] and the work of Candes-Romberg-Tao [3] on convex relaxation methods.

The relative performance explanation of the following table can be found on page 21.

We probably need to have a living document on these performances.

The Man Who Lived in Three Centuries and Ended One

In 1897, Joseph Thomson discovered electrons that allow us to read these words. That year was also the year Lazare Ponticelli was born. To most people, World War I was the war that defined the 20th century. Lazare was the last soldier who fought that war in the trenches of France and Italy. In an interview, he reminded us of what these soldiers were saying to each other before they would fatefully get out of the trenches and fight in horrible conditions:

"Si j'y passe, tu te rappelleras de moi",

(If I don't make it, you'll remember me)

Lazare passed away last week on March 12th and with him all these remembrances of his comrades.

The same day of his passing, the Cassini probe was taking pictures of the North Polar Region of Enceladus, a moon of Saturn millions miles away. The uncanny resemblance of that moon with the famous illustration of the Little Prince is startling. This week-end, we also finally learned, after 64 years, what was the final fate of his author: Antoine de Saint Exupery. Horst Rippert, now 88, then a Luftwaffe pilot describes with remorse what happened and continues with:

"He could deftly describe the sky, the thoughts and feelings of pilots. His work inspired our vocation for many of us. I liked the man...In our youth, at school, we had all read him. We loved his books....If I had known, I would not have opened fire. Not on him!..."

War is a bitch and then it kills you.

Credit Photo: NASA/JPL/Space Science Institute/Le Figaro/Wikipedia

Friday, March 14, 2008

Monday Morning Algorithm 10: Subspace Pursuit for Compressive Sensing.

[Warning: Before you read this entry, read this one first ]
I know this is not Monday, but if you read this on Monday, it'll look like it was posted on Monday. Today's algorithm has to do with Compressed Sensing. In Compressed Sensing, the idea is that you can sample a signal much below the Nyquist rate if you have the prior knowledge that it is sparse. Since most signal are sparse, this technique should eventually become mainstream. If you want to know more about Compressed Sensing check the Big Picture out.

If you have been obtaining Compressed Sensing measurements, the most difficult task is then to decode that information into the actual signal. You can use any of the amazing reconstruction codes produced so far or you could implement a simple one proposed by Wei Dai and Olgica Milenkovic in Subspace Pursuit for Compressive Sensing: Closing the Gap Between Performance and Complexity. [Update: Wei Dai provided me with the original code used in the paper and asked me to put it on this site ]

The algorithm is pretty straightforward from the explanation given in the paper. Here is the SP.m file but the full set of m files implementing this technique can be found either in the local CS code section or in the Monday Morning Algorithm archive. Once you have downloaded all the functions in one directory, you can run directly trialSP from the prompt. It runs one simple example of a vector of 290 elements and 3 non-zero elements. As expected by the log(K) asymptotic, I am pretty much amazed at the speed of this thing. If you find a bug please do let me know.

function [xfinal,That]=SP(K, phi, y,nn)
%
% This is an example using the Subspace Pursuit
% Algorithm of Wei Dai and Olgica Milenkovic
% "Subspace Pursuit for Compressive Sensing: Closing the
% Gap Between Performance and Complexity"
%
% Written by Igor Carron
% http://nuit-blanche.blogspot.com
% Creative Commons Licence

% Initialization
%
u = phi' * y;
[That,phiThat]=deft(phi,u,K);
y_r = resid(y,phiThat);
%
%
% Iteration
%
while y_r ~= 0
u1 = phi' * y_r;
[Tprime1,phiTprime1]=deft(phi,u1,K);
[Tprime,phiTprime]=uni(Tprime1,That,phi);
phiTprimecross = cros(phiTprime);
x_pprime = phiTprimecross * y;
[Ttilde,phiTtilde] = deft(phi,x_pprime,K);
ytilde_r = resid(y,phiTtilde);
if gt(norm(ytilde_r) ,norm(y_r))
break
else
That = Ttilde;
y_r = ytilde_r;
end
end
%
% Output
%
phiThatcross = cros(phiThat);
x_That = phiThatcross * y;
That;
xfinal = zeros(nn,1);
for ii = 1:K
xfinal(That(ii))=x_That(ii);
end

Compressed Sensing: A negative result, cross validation in CS via JL, a new deterministic construction, GRIP, Exploiting channel sparsity, CISS 08

Here is a new batch from the Rice repository. I am listing only those that I have not covered before. First, a somewhat bad news for some deterministic constructions of measurement matrices such as the contruction of Radu Berinde and Piotr Indyk or this one. Venkat Chandar came up with A negative result concerning explicit matrices with the restricted isometry property. The abstract reads:

In this note, we prove that matrices whose entries are all 0 or 1 cannot achieve good performance with respect to the Restricted Isometry Property (RIP). Most currently known deterministic constructions of matrices satisfying the RIP fall into this category, and hence these constructions suffer inherent limitations. In particular, we show that DeVore’s construction of matrices satisfying the RIP is close to optimal once we add the constraint that all entries of the matrix are 0 or 1.

A paper by Rachel Ward on Cross validation in compressed sensing via the Johnson Lindenstrauss lemma takes aim at providing sharp bounds for iterative reconstruction algorithms.

Compressed Sensing decoding algorithms aim to reconstruct an unknown N dimensional vector x from m less than N given measurements y = x, with an assumed sparsity constraint on x. All algorithms presently are iterative in nature, producing a sequence of approximations (s1, s2, ...) until a certain algorithm-specific stopping criterion is reached at iteration j, at which point the estimate ˆx = sj is returned as an approximation to x. In many algorithms, the error ||x−ˆx||lN2 of the approximation is bounded above by a function of the error between x and the best k-term approximation to x. However, as x is unknown, such estimates provide no numerical bounds on the error. In this paper, we demonstrate that tight numerical upper and lower bounds on the error ||x − sj ||lN2 for j p iterations of a compressed sensing decoding algorithm are attainable with little effort. More precisely, we assume a maximum iteration length of p is pre-imposed; we reserve 4 log p of the original m measurements and compute the sj from the m − 4 log(p) remaining measurements; the errors ||x − sj ||lN2 , for j = 1, ..., p can then be bounded with high probability. As a consequence, a numerical upper bound on the error between x and the best k-term approximation to x can be estimated with almost no cost. Our observation has applications outside of Compressed Sensing as well.

We also have another deterministic construction of a measurement matrix by Stephen Howard, Robert Calderbank, and Stephen Searle in A fast reconstruction algorithm for deterministic compressive sensing using second order Reed-Muller codes. The abstract reads:

This paper proposes a deterministic compressed sensing matrix that comes by design with a very fast reconstruction algorithm, in the sense that its complexity depends only on the number of measurements n and not on the signal dimension N. The matrix construction is based on the second order Reed- Muller codes and associated functions. This matrix does not have RIP uniformly with respect to all k-sparse vectors, but it acts as a near isometry on k-sparse vectors with very high probability.

There is a follow-on on GRIP by Jarvis Haupt, Robert Nowak . The technical report is entitled A generalized restricted isometry property. The abstract reads:

Compressive Sampling (CS) describes a method for reconstructing high-dimensional sparse signals from a small number of linear measurements. Fundamental to the success of CS is the existence of special measurement matrices which satisfy the so-called Restricted Isometry Property (RIP). In essence, a matrix satisfying RIP is such that the lengths of all sufficiently sparse vectors are approximately preserved under transformation by the matrix. In this paper we describe a natural consequence of this property – if a matrix satisfies RIP, then acute angles between sparse vectors are also approximately preserved. We formulate this property as a Generalized Restricted Isometry Property (GRIP) and describe one application in robust signal detection.

Finally, the paper by Georg Taubock and Franz Hlawatsch is out. It is entitled A compressed sensing technique for OFDM channel estimation in mobile environments: Exploiting channel sparsity for reducing pilots. The abstract reads:

We consider the estimation of doubly selective wireless channels within pulse-shaping multicarrier systems (which include OFDM systems as a special case). A new channel estimation technique using the recent methodology of compressed sensing (CS) is proposed. CS-based channel estimation exploits a channel’s delay-Doppler sparsity to reduce the number of pilots and, hence, increase spectral efficiency. Simulation results demonstrate a significant reduction of the number of pilots relative to least-squares channel estimation.

At Princeton, the Conference on Information Sciences and Systems will take place March 19 through 21. It has been added to the Calendar. We have already seen some preprints but there are new papers as well. From the abstract list, the talks relevant to Compressed Sensing are:

Wednesday, March 19, 8:30am- 11:45am Session (with a break 10:00am -10:15am)

(INVITED SESSION)
WA-01 - Sparse Representations and Frames I: Compressed Sensing

Iteratively Re-weighted Least Squares Minimization: Proof of Faster than Linear Rate for Sparse Recovery
Daubechies, Ingrid, Princeton University (687)
DeVore, Ronald, University of South Carolina (688)
Fornasier, Massimo, Courant Institute of Mathematical Sciences (689)
Gunturk, Sinan, Johann Randon Institute for Computational and Applied Mathematics (690)

1-Bit Compressive Sensing (287)
Boufounos, Petros T., Rice University (665)
Baraniuk, Richard G., Rice University (666)

The Dantzig Selector and Generalized Thresholding (290)
Romberg, Justin, Georgia Tech (684)

Compressed Channel Sensing (264)
Bajwa, Waheed U., University of Wisconsin-Madison (610)
Haupt, Jarvis D., University of Wisconsin-Madison (611)
Raz, Gil M., GMR Research and Technology (612)
Nowak, Robert D., University of Wisconsin-Madison (613)

Noisy Compressive Sampling Limits in Linear and Sublinear Regimes (233)
Akcakaya, Mehmet, Harvard University (541)
Tarokh, Vahid, Harvard University (542)

A fast reconstruction algorithm for deterministic compressed sensing using second order Reed Muller codes (270)
Howard, Stephen, Defence Science and Technology Organisation, Australia (629)
Calderbank, Robert, Princeton University (630)
Searle, Stephen, University of Melbourne (631)

Thursday, March 20
8:30am- 11:45am Session (with a break 10:00am -10:15am)
TA-01 - Compressed Sensing I

On the frequency resolution of empirical mode decomposition (20)
Roy, Arnab, Pennsylvania State University (41)
Doherty, John F., Pennsylvania State University (42)

Improved Bounds for a Deterministic Sublinear-Time Sparse Fourier Algorithm (72)
Iwen, Mark A., University of Michigan, Ann Arbor (156)
Spencer, Craig V., University of Michigan, Ann Arbor (157)

An adaptive implementation of reference levels in Level-Crossing Analog-to-Digital Converters (116)
Guan, Karen M., University of Illinois at Urbana and Champaign (261)
Singer, Andrew C., University of Illinois at Urbana and Champaign (262)

Sparse Weighted Euclidean Superimposed Coding for Integer Compressed Sensing (158)
Dai, Wei, University of Illinois at Urbana and Champaign (366)
Milenkovic, Olgica, University of Illinois at Urbana and Champaign (367)

Reconstruction of compressively sensed images via neurally plausible local competitive algorithms (159)
Ortman, Robert L., Rice University (368)
Rozell, Christopher J., University of California, Berkeley (369)
Johnson, Don H., Rice University (370)

Sparse Weighted Euclidean Superimposed Coding for Integer Compressed Sensing (253)
Dai, Wei, University of Illinois at Urbana and Cha (588)
Milenkovic, Olgica, University of Illinois at Urbana and Cha (589)

Friday, March 21
8:30am- 11:45am Session (with a break 10:00am -10:15am)

FA-02 - Compressed Sensing II

Unsupervised distributional anomaly detection for a self-diagnostic speech activity detector (77)
Borges, Nash M., Johns Hopkins University (169)
Meyer, Gerard G., Johns Hopkins University (170)

Compressive Sensing Detection of Stochastic Signals (97)
Vila-Forcen, Jose-Emilio, Universidad Carlos III de Madrid (217)
Artes-Rodriguez, Antonio, Universidad Carlos III de Madrid (218)
Garcia-Frias, Javier, University of Delaware (219)

A Measure of Interference in Sparse Atomic Estimations (105)
Sturm, Bob L., University of California (238)
Shynk, John J., University of California (239)
Laurent, Daudet, University Pierre and Marie Curie (286)

On empirical capacity, random coding bound, and probability of outage of an object recognition system under constraint of PCA-encoding (186)
Chen, Xiaohan, West Virginia University (432)
Schmid, Natalia A., West Virginia University (433)

Image Credit: NASA/JPL/Space Science Institute, Photo of Enceladus taken the day before yesterday by Cassini.