## Saturday, October 31, 2015

### Saturday Morning Videos: ICML 2015 in Lille

All the videos of all the talks at ICML were just released this week. Enjoy !

Here are a few I had not noticed before. On average they last about 15 to 20 minutes:

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Friday, October 30, 2015

### Phase Retrieval: An Overview of Recent Developments

It's nice to take a step back sometimes:

The problem of phase retrieval is a classic one in optics and arises when one is interested in recovering an unknown signal from the magnitude (intensity) of its Fourier transform. While there have existed quite a few approaches to phase retrieval, recent developments in compressed sensing and convex optimization-based signal recovery have inspired a host of new ones. This work presents an overview of these approaches.
Since phase retrieval, by its very nature, is ill-posed, to make the problem meaningful one needs to either assume prior structure on the signal (e.g., sparsity) or obtain additional measurements (e.g., masks, structured illuminations). For both the cases, we review conditions for the identifiability of the signal, as well as practical algorithms for signal recovery. In particular, we demonstrate that it is possible to robustly and efficiently identify an unknown signal solely from phaseless Fourier measurements, a fact with potentially far-reaching implications.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

### Is audio signal processing still useful in the era of machine learning?

A continuing step toward the great convergence:

the link goes to a zip file with the presentation and a few .wav files.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Thursday, October 29, 2015

### Learning to Hash for Indexing Big Data - A Survey

Learning to Hash for Indexing Big Data - A Survey by Jun Wang, Wei Liu, Sanjiv Kumar, Shih-Fu Chang

The explosive growth in big data has attracted much attention in designing efficient indexing and search methods recently. In many critical applications such as large-scale search and pattern matching, finding the nearest neighbors to a query is a fundamental research problem. However, the straightforward solution using exhaustive comparison is infeasible due to the prohibitive computational complexity and memory requirement. In response, Approximate Nearest Neighbor (ANN) search based on hashing techniques has become popular due to its promising performance in both efficiency and accuracy. Prior randomized hashing methods, e.g., Locality-Sensitive Hashing (LSH), explore data-independent hash functions with random projections or permutations. Although having elegant theoretic guarantees on the search quality in certain metric spaces, performance of randomized hashing has been shown insufficient in many real-world applications. As a remedy, new approaches incorporating data-driven learning methods in development of advanced hash functions have emerged. Such learning to hash methods exploit information such as data distributions or class labels when optimizing the hash codes or functions. Importantly, the learned hash codes are able to preserve the proximity of neighboring data in the original feature spaces in the hash code spaces. The goal of this paper is to provide readers with systematic understanding of insights, pros and cons of the emerging techniques. We provide a comprehensive survey of the learning to hash framework and representative techniques of various types, including unsupervised, semi-supervised, and supervised. In addition, we also summarize recent hashing approaches utilizing the deep learning models. Finally, we discuss the future direction and trends of research in this area.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

### Sparse SPM: Sparse-Dictionary Learning for Resting-state Functional Connectivity MRI Analysis

Jong just sent me the following:

Hi Igor,

I would like to bring your attention to our recent paper showing that  a sparse dictionary learning can address a long-standing  problem in brain connectivity analysis.

Young-Beom Lee, Jeonghyeon Lee, Sungho Tak, Kangjoo Lee, Duk L. Na, Sangwon Seo, Yong Jeong, and Jong Chul Ye, "Sparse SPM: Sparse-Dictionary Learning for Resting-state Functional Connectivity MRI Analysis", NeuroImage (in press), 2015.

In this  paper,  we developed a unified  mixed-model  called sparse SPM for group sparse dictionary learning and inference for  resting-state fMRI analysis.
Unlike  ICA methods,  the new algorithm exploits the fact that  temporal dynamics at each voxel can be represented as a sparse combination of global dynamics because of  the property of small-worldness of brain networks.  Under the reasonable assumption that the individual network structures in the same group are similar, we demonstrated that  a group sparse dictionary learning algorithm can extract the subnetwork structures as group biomarkers, and that mixed-effect analysis with ReML covariance estimation can provide  a unified individual- and group- level statistical analysis and inference.   This unified mixed effect framework also enabled ANOVA  at a group level, which provided systematic tools to investigates the disease progression. We compared and validated our tools with the existing seed-based and ICA approaches for normal, MCI and Alzheimer's disease patients.  The results indicate that the DMN network extracted with our method shows a clear correlation with the progression of disease.

Cheers,
-Jong

Thank you Jong for the context, here is the paper:
Sparse SPM: Sparse-Dictionary Learning for Resting-state Functional Connectivity MRI Analysis by Young-Beom Lee, Jeonghyeon Lee, Sungho Tak, Kangjoo Lee, Duk L. Na, Sangwon Seo, Yong Jeong, Jong Chul Ye , and The Alzheimer's Disease Neuroimaging Initiative

Recent studies of functional connectivity MR imaging have revealed that the default-mode network activity is disrupted in diseases such as Alzheimer's disease (AD). However, there is not yet a consensus on the preferred method for resting-state analysis. Because the brain is reported to have complex interconnected networks according to graph theoretical analysis, the independency assumption, as in the popular independent component analysis (ICA) approach, often does not hold. Here, rather than using the independency assumption, we present a new statistical parameter mapping (SPM)-type analysis method based on a sparse graph model where temporal dynamics at each voxel position are described as a sparse combination of global brain dynamics. In particular, a new concept of a spatially adaptive design matrix has been proposed to represent local connectivity that shares the same temporal dynamics. If we further assume that local network structures within a group are similar, the estimation problem of global and local dynamics can be solved using sparse dictionary learning for the concatenated temporal data across subjects. Moreover, under the homoscedasticity variance assumption across subjects and groups that is often used in SPM analysis, the aforementioned individual and group analyses using sparse dictionary learning can be accurately modeled by a mixed-e ect model, which also facilitates a standard SPM-type group-level inference using summary statistics. Using an extensive resting fMRI data set obtained from normal, mild cognitive impairment (MCI), and Alzheimer's disease patient groups, we demonstrated that the changes in the default mode network extracted by the proposed method are more closely correlated with the progression of Alzheimer's disease.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Wednesday, October 28, 2015

### Random Features Roundup: Dual Control for Approximate Bayesian RL, Scatter Component Analysis, Spherical Random Features

Here are a few uses the Random Features approximation:

Dual Control for Approximate Bayesian Reinforcement Learning by Edgar D. Klenske, Philipp Hennig

Control of non-episodic, finite-horizon dynamical systems with uncertain dynamics poses a tough and elementary case of the exploration-exploitation trade-off. Bayesian reinforcement learning, reasoning about the effect of actions and future observations, offers a principled solution, but is intractable. We review, then extend an old approximate approach from control theory---where the problem is known as dual control---in the context of modern regression methods, specifically generalized linear regression. Experiments on simulated systems show that this framework offers a useful approximation to the intractable aspects of Bayesian RL, producing structured exploration strategies that differ from standard RL approaches. We provide simple examples for the use of this framework in (approximate) Gaussian process regression and feedforward neural networks for the control of exploration.

Scatter Component Analysis: A Unified Framework for Domain Adaptation and Domain Generalization by Muhammad Ghifary, David Balduzzi, W. Bastiaan Kleijn, Mengjie Zhang

This paper addresses classification tasks on a particular target domain in which labeled training data are only available from source domains different from (but related to) the target. Two closely related frameworks, domain adaptation and domain generalization, are concerned with such tasks, where the only difference between those frameworks is the availability of the unlabeled target data: domain adaptation can leverage unlabeled target information, while domain generalization cannot.
We propose Scatter Component Analyis (SCA), a fast representation learning algorithm that can be applied to both domain adaptation and domain generalization. SCA is based on a simple geometrical measure, i.e., scatter, which operates on reproducing kernel Hilbert space. SCA finds a representation that trades between maximizing the separability of classes, minimizing the mismatch between domains, and maximizing the separability of data; each of which is quantified through scatter. The optimization problem of SCA can be reduced to a generalized eigenvalue problem, which results in a fast and exact solution.
Comprehensive experiments on benchmark cross-domain object recognition datasets verify that SCA performs much faster than several state-of-the-art algorithms and also provides state-of-the-art classification accuracy in both domain adaptation and domain generalization. We also show that scatter can be used to establish a theoretical generalization bound in the case of domain adaptation.

Compact explicit feature maps provide a practical framework to scale kernel methods to large-scale learning, but deriving such maps for many types of kernels remains a challenging open problem. Among the commonly used kernels for non-linear classification are polynomial kernels, for which low approximation error has thus far necessitated explicit feature maps of large dimensionality, especially for higher-order polynomials. Meanwhile, because polynomial kernels are unbounded, they are frequently applied to data that has been normalized to unit `2 norm. The question we address in this work is: if we know a priori that data is so normalized, can we devise a more compact map? We show that a putative affirmative answer to this question based on Random Fourier Features is impossible in this setting, and introduce a new approximation paradigm, Spherical Random Fourier (SRF) features, which circumvents these issues and delivers a compact approximation to polynomial kernels for data on the unit sphere. Compared to prior work, SRF features are less rank-deficient, more compact, and achieve better kernel approximation, especially for higher-order polynomials. The resulting predictions have lower variance and typically yield better classification accuracy.

Image Credit: NASA/JPL-Caltech/Space Science Institute
NASA's Cassini spacecraft zoomed by Saturn's icy moon Enceladus on Oct. 14, 2015, capturing this stunning image of the moon's north pole. A companion view from the wide-angle camera (PIA20010) shows a zoomed out view of the same region for context. Scientists expected the north polar region of Enceladus to be heavily cratered, based on low-resolution images from the Voyager mission, but high-resolution Cassini images show a landscape of stark contrasts. Thin cracks cross over the pole -- the northernmost extent of a global system of such fractures. Before this Cassini flyby, scientists did not know if the fractures extended so far north on Enceladus. North on Enceladus is up. The image was taken in visible green light with the Cassini spacecraft narrow-angle camera.
The view was acquired at a distance of approximately 4,000 miles (6,000 kilometers) from Enceladus and at a Sun-Enceladus-spacecraft, or phase, angle of 9 degrees. Image scale is 115 feet (35 meters) per pixel.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

### Input Sparsity and Hardness for Robust Subspace Approximation

Input Sparsity and Hardness for Robust Subspace Approximation by  Kenneth L. Clarkson, David P. Woodruff

In the subspace approximation problem, we seek a k-dimensional subspace F of R^d that minimizes the sum of p-th powers of Euclidean distances to a given set of n points a_1, ..., a_n in R^d, for p ge 1. More generally than minimizing sum_i dist(a_i,F)^p,we may wish to minimize sum_i M(dist(a_i,F)) for some loss function M(), for example, M-Estimators, which include the Huber and Tukey loss functions. Such subspaces provide alternatives to the singular value decomposition (SVD), which is the p=2 case, finding such an F that minimizes the sum of squares of distances. For p in [1,2), and for typical M-Estimators, the minimizing $F$ gives a solution that is more robust to outliers than that provided by the SVD. We give several algorithmic and hardness results for these robust subspace approximation problems.
We think of the n points as forming an n x d matrix A, and letting nnz(A) denote the number of non-zero entries of A. Our results hold for p in [1,2). We use poly(n) to denote n^{O(1)} as n goes to infty. We obtain:
(1) For minimizing sum_i dist(a_i,F)^p, we give an algorithm running in O(nnz(A) + (n+d)poly(k/eps) + exp(poly(k/eps))),
(2) we show that the problem of minimizing sum_i dist(a_i, F)^p is NP-hard, even to output a (1+1/poly(d))-approximation, answering a question of Kannan and Vempala, and complementing prior results which held for p  gt 2,
(3) For loss functions for a wide class of M-Estimators, we give a problem-size reduction: for a parameter K=(log n)^{O(log k)}, our reduction takes O(nnz(A) log n + (n+d) poly(K/eps)) time to reduce the problem to a constrained version involving matrices whose dimensions are poly(K eps^{-1} log n). We also give bicriteria solutions,
(4) Our techniques lead to the first O(nnz(A) + poly(d/eps)) time algorithms for (1+eps)-approximate regression for a wide class of convex M-Estimators.

Credit: NASA/Johns Hopkins University Applied Physics Laboratory/Southwest Research Institute

Sputnik Planum, in Color
Release Date: October 15, 2015
Keywords: LORRI, MVIC, PlutoThis high-resolution image captured by NASA's New Horizons spacecraft combines blue, red and infrared images taken by the Ralph/Multispectral Visual Imaging Camera (MVIC). The bright expanse is the western lobe of the "heart," informally called Sputnik Planum, which has been found to be rich in nitrogen, carbon monoxide and methane ices.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Tuesday, October 27, 2015

### A Picture is Worth a Billion Bits: Real-Time Image Reconstruction from Dense Binary Pixels

It looks like Eric Fossum's QIS and similar concepts [3,4] are raising the interest of people looking to reconstruct images from these unique flows of photons. We mentioned a path a while back in this Sunday Morning Insight: A Compressive Sensing Approach to Quanta Image Sensor (QIS) but today's preprint is taking a slightly different path:

A Picture is Worth a Billion Bits: Real-Time Image Reconstruction from Dense Binary Pixels Tal Remez, Or Litany, Alex Bronstein

The pursuit of smaller pixel sizes at ever increasing resolution in digital image sensors is mainly driven by the stringent price and form-factor requirements of sensors and optics in the cellular phone market. Recently, Eric Fossum proposed a novel concept of an image sensor with dense sub-diffraction limit one-bit pixels jots, which can be considered a digital emulation of silver halide photographic film. This idea has been recently embodied as the EPFL Gigavision camera. A major bottleneck in the design of such sensors is the image reconstruction process, producing a continuous high dynamic range image from oversampled binary measurements. The extreme quantization of the Poisson statistics is incompatible with the assumptions of most standard image processing and enhancement frameworks. The recently proposed maximum-likelihood (ML) approach addresses this difficulty, but suffers from image artifacts and has impractically high computational complexity. In this work, we study a variant of a sensor with binary threshold pixels and propose a reconstruction algorithm combining an ML data fitting term with a sparse synthesis prior. We also show an efficient hardware-friendly real-time approximation of this inverse operator.Promising results are shown on synthetic data as well as on HDR data emulated using multiple exposures of a regular CMOS sensor.
References:

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

### Fast genome and metagenome distance estimation using MinHash / Machine learning for metagenomics: methods and tools

Awesome!  If you've read these blog entries [1,2,3] you know that genomics is now heading toward the analysis of large population of genomes aka metagenomics.

The first preprint was promised back in August and uses a Locality Sensitive Hashing to perform a comparison on a large population of genomes. The second preprint reviews several ML techniques for metagenomics and points to the earlier third preprint where another hashing technique is used.  That last approach used Vowpal Wabbit that is not a Locality Sensitive Hashing technique while MinHash used in the first preprint is. Without further ado:

Fast genome and metagenome distance estimation using MinHash by Brian D. Ondov, Todd J. Treangen, Adam B. Mallonee, Nicholas H. Bergman, Sergey Koren, Adam M. Phillippy

Machine learning for metagenomics: methods and tools by Hayssam Soueidan, Macha Nikolski

While genomics is the research field relative to the study of the genome of any organism, metagenomics is the term for the research that focuses on many genomes at the same time, as typical in some sections of environmental study. Metagenomics recognizes the need to develop computational methods that enable understanding the genetic composition and activities of communities of species so complex that they can only be sampled, never completely characterized.
Machine learning currently offers some of the most computationally efficient tools for building predictive models for classification of biological data. Various biological applications cover the entire spectrum of machine learning problems including supervised learning, unsupervised learning (or clustering), and model construction. Moreover, most of biological data -- and this is the case for metagenomics -- are both unbalanced and heterogeneous, thus meeting the current challenges of machine learning in the era of Big Data.
The goal of this revue is to examine the contribution of machine learning techniques for metagenomics, that is answer the question "to what extent does machine learning contribute to the study of microbial communities and environmental samples?" We will first briefly introduce the scientific fundamentals of machine learning. In the following sections we will illustrate how these techniques are helpful in answering questions of metagenomic data analysis. We will describe a certain number of methods and tools to this end, though we will not cover them exhaustively. Finally, we will speculate on the possible future directions of this research.

Large-scale Machine Learning for Metagenomics Sequence Classification by Kévin Vervier, Pierre Mahé, Maud Tournoud, Jean-Baptiste Veyrieras, Jean-Philippe Vert
Metagenomics characterizes the taxonomic diversity of microbial communities by sequencing DNA directly from an environmental sample. One of the main challenges in metagenomics data analysis is the binning step, where each sequenced read is assigned to a taxonomic clade. Due to the large volume of metagenomics datasets, binning methods need fast and accurate algorithms that can operate with reasonable computing requirements. While standard alignment-based methods provide state-of-the-art performance, compositional approaches that assign a taxonomic class to a DNA read based on the k-mers it contains have the potential to provide faster solutions. In this work, we investigate the potential of modern, large-scale machine learning implementations for taxonomic affectation of next-generation sequencing reads based on their k-mers profile. We show that machine learning-based compositional approaches benefit from increasing the number of fragments sampled from reference genome to tune their parameters, up to a coverage of about 10, and from increasing the k-mer size to about 12. Tuning these models involves training a machine learning model on about 10 8 samples in 10 7 dimensions, which is out of reach of standard soft-wares but can be done efficiently with modern implementations for large-scale machine learning. The resulting models are competitive in terms of accuracy with well-established alignment tools for problems involving a small to moderate number of candidate species, and for reasonable amounts of sequencing errors. We show, however, that compositional approaches are still limited in their ability to deal with problems involving a greater number of species, and more sensitive to sequencing errors. We finally confirm that compositional approach achieve faster prediction times, with a gain of 3 to 15 times with respect to the BWA-MEM short read mapper, depending on the number of candidate species and the level of sequencing noise.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Monday, October 26, 2015

### When Are Nonconvex Problems Not Scary?

When Are Nonconvex Problems Not Scary? by Ju Sun, Qing Qu, John Wright

In this paper, we focus on nonconvex optimization problems with no "spurious" local minimizers, and with saddle points of at most second-order. Concrete applications such as dictionary learning, phase retrieval, and tensor decomposition are known to induce such structures. We describe a second-order trust-region algorithm that provably converges to a local minimizer in polynomial time. Finally we highlight alternatives, and open problems in this direction.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

### A Faster Cutting Plane Method and its Implications for Combinatorial and Convex Optimization

This is wonderful news. One of the things I wonder is if this new approach will do better than some  greedy algorithms: A Faster Cutting Plane Method and its Implications for Combinatorial and Convex Optimization by Yin Tat Lee, Aaron Sidford, Sam Chiu-wai Wong

We improve upon the running time for finding a point in a convex set given a separation oracle. In particular, given a separation oracle for a convex set KRn contained in a box of radius R, we show how to either find a point in K or prove that K does not contain a ball of radius ϵ using an expected O(nlog(nR/ϵ)) oracle evaluations and additional time O(n3logO(1)(nR/ϵ)). This matches the oracle complexity and improves upon the O(nω+1log(nR/ϵ)) additional time of the previous fastest algorithm achieved over 25 years ago by Vaidya for the current matrix multiplication constant ω<2.373 when R/ϵ=nO(1).
Using a mix of standard reductions and new techniques, our algorithm yields improved runtimes for solving classic problems in continuous and combinatorial optimization:
• Submodular Minimization: Our weakly and strongly polynomial time algorithms have runtimes of O(n2lognMEO+n3logO(1)nM) and O(n3log2nEO+n4logO(1)n), improving upon the previous best of O((n4EO+n5)logM) and O(n5EO+n6).
• Matroid Intersection: Our runtimes are O((nrlog2nTrank+n3logO(1)n)lognM) and O((n2lognTind+n3logO(1)n)lognM), achieving the first quadratic bound on the query complexity for the independence and rank oracles. In the unweighted case, this is the first improvement since 1986 for independence oracle.
• Submodular Flow: Our runtime is O(n2lognCUEO+n3logO(1)nCU), improving upon the previous bests from 15 years ago roughly by a factor of O(n4).
• Semidefinite Programming: Our runtime is O~(n(n2+mω+S)), improving upon the previous best of O~(n(nω+mω+S)) for the regime where the number of nonzeros S is small.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.