Tuesday, July 12, 2011

Facing Mona Lisa

I cannot seem to find the paper for this lens-free tiny imager anywhere, but I have gone through this route several times already. It looks like we are witnessing a typical Compressive Sensing problem, but the designers are probably not aware of it. This is understandable, as it takes a lot of effort to design hardware, and the deconvolution is sometimes, if not an afterthought, something that is considered known. If one goes to Alyosha Molnar's group site and then to his research page, one can see some of the drawings/figures of what this sensor is doing.


In short, the response of the chip is angle dependent: for each ray of light (which one could consider a Dirac) striking it at a specific angle, the chip records three different voltages. These voltages are different enough over the range of the field of view to permit some discretization, thanks probably to the dynamic range of the chip (this is not unlike the Anger camera scheme).
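To make that measurement model concrete, here is a small Matlab sketch of an angle-dependent linear response. The sensitivity profile below is entirely made up for illustration purposes; the only point is the structure y = A*x, where each column of A holds the voltages the chip would record for a ray arriving from one discretized angle.

% Toy forward model (illustration only): column j of A holds the voltages
% recorded when a unit-intensity ray arrives from angle theta(j).
% The smooth profile below is made up; the real chip response would be
% measured during calibration.
n = 64;                              % number of discretized incident angles
m = 48;                              % number of voltage readings (m < n: underdetermined)
theta = linspace(-pi/3, pi/3, n);    % field of view
s = (1:m)';                          % sensor index
A = zeros(m, n);
for j = 1:n
    A(:, j) = (1 + cos(4*s*theta(j))) .* exp(-(s - m/2).^2 / (2*(m/4)^2));
end
x_true = zeros(n, 1);                % a sparse scene: a few bright rays
x_true([10 25 40]) = [1.0 0.5 0.8];
y = A * x_true;                      % voltages recorded by the chip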

In this example, we have a ray of light that is providing three measurements in parallel. Remember, this is a one-to-many setup similar to the analog-to-digital converter work being undertaken by several groups. For one pixel, several rays of light of different intensities (figure 3) will strike the chip. This is similar to a typical multiplexing operation. In order to obtain the image back from the chip measurements, one is facing a deconvolution problem with a transfer function that looks like the second figure above. What do we get out of this if we naively take the results from the chip and deconvolve them using, I am guessing, a least-squares solver? Well, from the press release, we get this pretty bad Mona Lisa (on the right).
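To see what that naive route does, here is the least-squares step on the toy model above (A, y and x_true are only stand-ins for the real calibrated transfer function and voltages): the minimum-norm solution fits the voltages but smears energy over all angles, which is one plausible way of ending up with a blurry Mona Lisa.

% Naive deconvolution, continuing the toy A, y, x_true from the sketch above:
% the minimum-norm least-squares solution matches the measured voltages but
% spreads energy across all angles instead of the few true rays.
x_ls = pinv(A) * y;
fprintf('least-squares fit error: %.2e\n', norm(A*x_ls - y));
fprintf('distance to true scene : %.2e\n', norm(x_ls - x_true));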


Why? Most probably because an L_1 reconstruction solver was not used to deconvolve this image. There are many solutions consistent with this multiplexing process since there are not that many sensors (the system is very likely underdetermined). The L_1 solvers have the particularity that they will "search" for the sparsest decomposition that fits the measurements provided by the chip, as opposed to searching for the solution with the smallest least-squares error. It turns out that, in most cases, the L_1 solution is also the closest to the true image among all the solutions with a small fitting error (including the least-squares solution).
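For readers who want to see the difference without downloading anything, here is a minimal basis pursuit sketch, assuming the Optimization Toolbox (linprog) is available and continuing the toy A, y above. It solves min ||x||_1 subject to A*x = y with the standard positive/negative split; the dedicated solvers mentioned below do the same job much faster and more robustly.

% L1 recovery (basis pursuit): minimize ||x||_1 subject to A*x = y.
% Standard trick: write x = u - v with u, v >= 0 and minimize sum(u + v).
% Continues the toy A, y; assumes linprog from the Optimization Toolbox.
[m, n] = size(A);
f   = ones(2*n, 1);                  % objective: sum of u and v
Aeq = [A, -A];                       % enforce A*(u - v) = y
beq = y;
lb  = zeros(2*n, 1);                 % u, v >= 0
z   = linprog(f, [], [], Aeq, beq, lb, []);
x_l1 = z(1:n) - z(n+1:end);
fprintf('L1 fit error           : %.2e\n', norm(A*x_l1 - y));
fprintf('distance to true scene : %.2e\n', norm(x_l1 - x_true));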

After having spent so many hours devising this ingenious piece of hardware, how can we make this Mona Lisa look better ... fast? L_1 reconstruction solvers can be used like any other least-squares solver. If one were interested in getting a better looking Mona Lisa, easy-to-use and efficient implementations are available, such as SPGL1, GPSR or SL0. Instead of performing x = A\y in Matlab (with y the chip measurements and x the image), one just needs to perform x = spgl1(A, y, ...) or x = GPSR_BB(y, A, ...) or x = SL0(A, y, ...) after having downloaded any of these packages. Please note that the "..." replaces the arguments generally used in the examples provided with these packages. There should not be a need to tune these parameters to get a better Mona Lisa. Let us also note that, for an optimal result, one should certainly add a wavelet dictionary in the reconstruction process, i.e. use A*W as opposed to just A, with W representing a wavelet basis (see Sparco's problems for an easy-to-use implementation).
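As an illustration of the A*W suggestion, here is a sketch that assumes SPGL1 is on the Matlab path (spg_bp is its basis pursuit interface) and uses a toy 1-D orthonormal Haar matrix as W; for the actual sensor one would use the calibrated A and a proper 2-D wavelet operator, for instance through Sparco.

% Sparsity in a wavelet basis: solve for the wavelet coefficients alpha with
% the combined operator A*W, then synthesize the scene as W*alpha.
% Assumes SPGL1 (spg_bp) is installed; W is a toy orthonormal Haar synthesis
% matrix, valid when n is a power of two (n = 64 in the sketch above).
W = 1;
while size(W, 1) < n
    W = [kron(W, [1; 1]), kron(eye(size(W, 1)), [1; -1])] / sqrt(2);
end
alpha = spg_bp(A * W, y);            % sparsest coefficient vector fitting y
x_wav = W * alpha;                   % reconstructed scene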

11 comments:

Anonymous said...

The discussion on this blog is very "scientific" and I haven't started to read up on it yet. I was imagining that it would be possible to use these methods to create a new compression algorithm for images, like a new JPEG, ECW or MrSID format. What is your idea on this?

Thanks for a nice blog and I will try to read up on compressed sensing!

Igor said...

Compressive sensing provides a simple way of compressing data; however, it is also known that a naive approach to it yields schemes that are not as efficient as JPEG or wavelet-based compression encodings. Less naive schemes include adaptive compressive sampling and/or structured-sparsity-based decoders and are likely to yield better results than the naive approach. However, if one looks at the semi-success of JPEG 2000, a wavelet-based encoding system, it looks like the era of better compression systems does not have a bright future.

In the case of this sensor, however, a JPEG or JPEG 2000 encoding system would not make sense, as the compressed data is already gathered through the voltage readings of the chip. What is needed is a better reconstruction solver that takes these voltage readings to an image. The point I made in this entry is that there are better reconstruction solvers than the least-squares solution (which seems to have been used by the authors of the paper), especially for a system that looks underdetermined, as happens in compressive sensing.

Thanks for reading the blog,

Cheers,

Igor.

Anonymous said...

Thx for the quick and good reply. That's all I wanted to know for now. Keep up the good work.

Laurent DUVAL said...

"Compressive sensing provides a simple way of compressing data however, it is also known that a naive approach". I would not be in complete accordance with that. Compressive is (still IMHO) somehow an adjective to sensing, as (seemingly) pointed out by Bob Sturm here: "I agree that compressed sensing is much more a method of acquisition than it is a method for compression.", more than a way to compress stuff. Actual compression generally needs some amount of enginner wizardy than goes far beyond usual "sparsity" assumptions. In a word: Fourier+Huffman < local Fourier (DCT)+ Huffman but local (8x8) wavelet + Huffamn < local Fourier (DCT)+ Huffman. Yet Wavelet + Zerotree > local Fourier (DCT)+ Huffman. Again, Wavelet + Zerotree ~ DCT + zerotree, both < to Windowed Fourier (GenLOT like) + treillis. And the story may not be ver yet.
There is some intertwining between sparsity yield by transform and coding, AND expectation. Poor sparsity + black magic coding may perform better than highest sparsity and 2nd ordre entropy.

As for:
"However, if one looks at the semi-success of jpeg2000, a wavelet based encoding system, it looks like the era of better compression systems does not have bright future."
i believe this assertion deserves some further historical, time-to-market and corporate weighting. And a reference to Sentience is indescribable by Daniel Lemire. When we can describe the forest... but NP or PCP progresses are needed first.

Eric Tramel said...

WRT the "next era" of coding methods, I think we're still on the lookout for the next wave. Currently, advanced coders, like Laurent mentioned, are mostly completely hyper-engineered, 100-mode contraptions that can squeeze compression out of rocks. H.264 for video is a hairy beast of corner cases, but it is my understanding that H.265 makes it look simple by comparison, all for a ~30% rate-distortion performance gain.

CS is attractive for its simplicity, and for the fact that despite its simplicity it has passable performance in practice (in some applications). In order for CS to become the next thing in coding, measurement quantization will need to be fully understood and we'll need more accurate solvers (i.e. ones that make the most out of signal models & priors). I don't see this happening at a rate that would put CS on par with anything but last-gen traditional coding techniques. But of course, CS isn't intended for these purposes :P

TL;DR: Still looking for something elegant for the next gen of coders...

Eric Tramel said...

Another comment that is more on the topic of what you talked about in the post, Igor:

You'd be surprised what you can recover from. Well, maybe you wouldn't be, but others might. We talk a lot about making sure that the projector, \Phi, A, etc., matches bounds to ensure performance. But legibility can be achieved with even really, really, really crappy & poorly conditioned projectors and a handful of measurements. This is potentially useful for some very simple vision applications.

So, it is very possible that, as you are suggesting, an L1 solver approach could provide some kind of performance gain. They just have to make sure that they characterize the effective projection accurately :P Also remember that they are dealing with a single, non-reconfigurable shot. So they are also limited by the number of sensors on the chip. It is possible that, for a low sensor resolution and a single capture, a maximum likelihood approach might make more sense than an L1 solution.

But, I'm having the same problem you stated at the beginning of the article. I can't find the paper! I went to look for it right after reading the press release but came up empty. It could be that they are waiting for some kind of patent application to go through...

No paper makes it hard to tell what they're doing exactly for recovery or what kind of trade-offs they're making in design :'(

Alain K said...

The paper is in the early posting system (http://8.18.37.105/ol/upcoming.cfm, page 2 for the moment)

best

Igor said...

Alain,

Thanks, Laurent also sent me a copy of the link:

www.opticsinfobase.org/ol/upcoming_pdf.cfm?id=147442

I am writing a follow up as we speak.

Cheers,

Igor.

Anonymous said...

From what I understand, the blurred image is the result not so much of the processing as of a hardware limitation.

Igor said...

Dear Anonymous,

No.

Cheers,

Igor.

Igor said...

Dear Anonymous,

The longer answer:

Compressive sensing has the potential to improve on the current approach for calibrating this imager. The physical limitations of this device are, in my view, not even reached.

Cheers,

Igor.
