Wednesday, April 23, 2008

Compressed Sensing: Building Dictionaries, What is it good for?

In the big picture on Compressed Sensing, a significant amount of time is spent figuring out the difference between what the theorems can provably show and what works pretty often in practice. Let us recall that the initial results showing that L1 regularization provided the sparsest decomposition were found in geophysics; only afterwards did they catch the attention of more mathematically inclined folks. This is why surprising empirical results such as the one by Laurent Jacques and others are always welcome: they will eventually be proven for some category of signals or some range of parameters. Until then we are left with a lot of tweaking and trial and error.
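
If you want to see that L1 claim in action rather than just in a theorem, here is a minimal sketch (mine, assuming numpy and scipy are at hand, certainly not the geophysicists' code of the time): it recovers a sparse vector from a few random measurements by solving the basis pursuit linear program.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n, k = 40, 100, 5                    # measurements, ambient dimension, sparsity
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
y = A @ x_true

# Basis pursuit: minimize ||x||_1 subject to A x = y,
# posed as a linear program via the split x = u - v with u, v >= 0.
res = linprog(c=np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=y, bounds=(0, None))
x_hat = res.x[:n] - res.x[n:]
print("max recovery error:", np.abs(x_hat - x_true).max())

The error should come out near machine precision even though the system is underdetermined, which is the surprise the geophysicists stumbled on in the first place.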

Part of the big picture is built on the assumption that, once the dictionaries have been found, one can proceed directly to the sensing (measurement) part with the knowledge that the two ensembles (the dictionary and the measurement ensemble) are somehow incoherent. The intent is then to produce some sort of hardware that projects as efficiently as possible onto this measurement ensemble. Since each measurement is just a linear projection, i.e., a handful of inner products, the expectation is that it will reduce power consumption at the sensor level.
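
To make "somehow incoherent" slightly less hand-wavy, here is a back-of-the-envelope sketch (my own toy example, assuming numpy): it measures the mutual coherence between a random measurement ensemble and the spike basis, then performs the linear sensing step itself.

import numpy as np

rng = np.random.default_rng(0)
n, m, k = 256, 32, 5                    # ambient dimension, measurements, sparsity

Psi = np.eye(n)                         # dictionary: the spike (canonical) basis
Phi, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthobasis as measurement ensemble

# Mutual coherence between two orthobases lives in [1, sqrt(n)];
# the closer to 1, the fewer measurements the theorems ask for.
mu = np.sqrt(n) * np.abs(Phi.T @ Psi).max()
print("coherence:", round(mu, 2), "vs worst case", np.sqrt(n))

# The sensing step is nothing more than m inner products:
alpha = np.zeros(n)
alpha[rng.choice(n, size=k, replace=False)] = 1.0   # a k-sparse signal in Psi
y = Phi[:m] @ (Psi @ alpha)             # m = 32 stored numbers instead of n = 256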

But how do you produce the dictionaries in the first place, and what type of properties should they have? The initial thought is to take known signals, decompose them into some sort of atoms, then cross one's fingers and hope for the best. In particular, hope that with different training data one would obtain the same dictionary. Well, the story is a little more complicated than that, as eloquently presented by Rémi Gribonval in his Habilitation à Diriger des Recherches. In France, this document is part of a process whereby, upon completion, one is allowed to become the thesis advisor of Ph.D. and M.S. students. The title is in French but the main body of the report is in English:

Sur quelques problèmes mathématiques de modélisation parcimonieuse [On some mathematical problems of sparse modeling].
[The presentation slides of that defense (more visual) can be found here (and they are in English)]

The document uses "I" pretty often, since its goal is to present to the reviewers the accomplishments the individual has made in the field. In all, this review is an excellent snapshot of what we know and what we do not know in that field. Because it is an important document, I have added it to the CS big picture page. Maybe instead of being called the big picture, the page should be renamed: Compressed Sensing Technology Watch.
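
Coming back to the question of how dictionaries get built in the first place: a toy way to see what "decompose training signals into atoms and hope" means is an alternating scheme in the spirit of MOD (sketch below, my own and assuming numpy; it is not one of the algorithms analyzed in the document).

import numpy as np

rng = np.random.default_rng(0)
n, K, N, k = 20, 30, 600, 3             # signal dim, atoms, training signals, sparsity

# Synthetic training data generated by a hidden dictionary
D_true = rng.standard_normal((n, K))
D_true /= np.linalg.norm(D_true, axis=0)
A_true = np.zeros((K, N))
for j in range(N):
    A_true[rng.choice(K, size=k, replace=False), j] = rng.standard_normal(k)
X = D_true @ A_true

# Alternate crude sparse coding with a least-squares dictionary update
D = rng.standard_normal((n, K))
D /= np.linalg.norm(D, axis=0)
for _ in range(30):
    A = np.zeros((K, N))
    corr = D.T @ X
    for j in range(N):
        S = np.argsort(-np.abs(corr[:, j]))[:k]     # k most correlated atoms
        A[S, j] = np.linalg.lstsq(D[:, S], X[:, j], rcond=None)[0]
    D = X @ np.linalg.pinv(A)                       # MOD-style dictionary update
    D /= np.linalg.norm(D, axis=0) + 1e-12
print("relative residual:", np.linalg.norm(X - D @ A) / np.linalg.norm(X))

Run it twice on two training sets drawn from the same hidden dictionary and compare the learned atoms: whether you get "the same" dictionary back is precisely the kind of question the document takes seriously.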

As made clear in the document, Rémi Gribonval is one of the co-authors of the Matching Pursuit Toolkit listed in the code section. Let us note the incredible capabilities of this software (a naive sketch of the underlying algorithm follows the list below):

MPTK provides an implementation of Matching Pursuit which is:

  • FAST: e.g., extract 1.5 million atoms from a 1 hour long, 16kHz audio signal (15dB extracted) in 0.25x real time on a Pentium IV@2.4GHz, out of a dictionary of 178M Gabor atoms. Such incredible speed makes it possible to process "real world" signals.

  • FLEXIBLE: multi-channel, various signal input formats, flexible syntax to describe the dictionaries => reproducible results, cleaner experiments.

  • OPEN: modular architecture => add your own atoms! Free distribution under the GPL.
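
MPTK gets that speed from structured Gabor dictionaries and fast residual updates, but the loop it accelerates, plain Matching Pursuit, fits in a few lines. Here is a naive numpy sketch (nothing like MPTK's optimized C++, with a random dictionary standing in for the Gabor atoms):

import numpy as np

def matching_pursuit(x, D, n_iter):
    # D must have unit-norm columns. Each pass grabs the atom most
    # correlated with the residual and subtracts its contribution.
    r = x.astype(float).copy()
    coef = np.zeros(D.shape[1])
    for _ in range(n_iter):
        c = D.T @ r
        i = int(np.argmax(np.abs(c)))
        coef[i] += c[i]
        r -= c[i] * D[:, i]
    return coef, r

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
x = 2.0 * D[:, 10] - 1.5 * D[:, 99]     # a signal made of two atoms
coef, r = matching_pursuit(x, D, 10)
print("residual energy:", np.linalg.norm(r))  # shrinks as atoms are extracted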

But if you think this is fast, wait till people get their hands on one of these Red cameras at 12 MP and 100 fps and want to process those movies! Talk about a data flood... We had a similar camera we wanted to put on the International Space Station, but our main issue was data storage: data transmission could really only be done by carrying the hard drives back to Earth on the Space Shuttle. TDRSS was never designed for that purpose.
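
To put a number on that flood (back-of-the-envelope; the 12 bits per pixel figure is my assumption, not Red's spec sheet):

# 12 Mpixel frames at 100 fps, assuming a raw 12-bit readout
bits_per_second = 12e6 * 100 * 12
print(bits_per_second / 8 / 1e9, "GB/s")      # ~1.8 GB/s before any compression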


The table of contents of the document features the following sections:

1 Audio source separation
1.1 Introduction
1.2 Evaluation of audio source separation algorithms
1.2.1 Performance measures
1.2.2 Oracle source separation estimators
1.3 Multichannel separation by sparse decomposition
1.3.1 Identification of a mixing matrix
1.3.2 Separation by sparse decomposition, given the mixing matrix
1.3.3 Dictionary design
1.4 Single-channel source separation
1.4.1 "Morphological" source diversity through sparsity?
1.4.2 Gaussian Mixture Models
1.5 Conclusion and perspectives
2 Sparse approximation with redundant dictionaries
2.1 Introduction
2.2 Example: approximation with an orthonormal basis
2.3 Jackson and Bernstein inequalities
2.4 Approximation with general redundant dictionaries
2.4.1 Sparsity spaces
2.4.2 Approximation spaces
2.4.3 General Jackson type embeddings
2.5 A quest for Bernstein inequalities
2.5.1 Approximation with framelet systems
2.5.2 Approximation with localized frames?
2.5.3 Approximation with decomposable incoherent dictionaries
2.6 Conclusion
3 Algorithms for sparse approximation
3.1 Introduction
3.2 Recovery of sparse representations by global optimization
3.2.1 Recovery in arbitrary incoherent dictionaries
3.2.2 Simultaneous recovery with ℓ^τ representations
3.2.3 From representations to approximations: stable recovery
3.3 Matching Pursuit
3.3.1 The Matching Pursuit ToolKit (MPTK)
3.3.2 Stable recovery with Matching Pursuit
3.4 Bridging the gap between theory and practice
3.4.1 How incoherent are real-life dictionaries?
3.4.2 Why is incoherence too pessimistic a tool?
3.4.3 Structured approximation
3.4.4 Recovery with high probability
3.5 Conclusion and perspectives
