Wednesday, March 22, 2017

Paris Machine Learning Hors Serie #10 : Workshop SPARK (atelier 1)






Leonardo Noleto, data scientist at KPMG, walks us through the process of cleaning and transforming raw data into “clean” data with Apache Spark.

Apache Spark is a general-purpose open source framework designed for distributed data processing. It extends the MapReduce model, with the advantage of being able to process data in memory and interactively. Spark offers a set of components for data analysis: Spark SQL, Spark Streaming, MLlib (machine learning) and GraphX (graphs).

This workshop focuses on the fundamentals of Spark and its data processing paradigm, using the Python programming interface (more precisely, PySpark).

Installation, configuration, processing on a cluster, Spark Streaming, MLlib and GraphX will not be covered in this workshop.

The material to install is here.


Objectives

  • Understand the fundamentals of Spark and situate it within the Big Data ecosystem;
  • Know how it differs from Hadoop MapReduce;
  • Use RDDs (Resilient Distributed Datasets);
  • Use the most common actions and transformations to manipulate and analyze data;
  • Write a data transformation pipeline (see the sketch after this list);
  • Use the PySpark programming API.
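To make these objectives concrete, here is a minimal PySpark sketch of the kind of cleaning pipeline the workshop builds up to. The file name and the three-field record format are hypothetical placeholders, not the workshop's actual dataset.

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "cleaning-demo")

# Load raw text into an RDD, one element per line.
# "raw_logs.csv" is a hypothetical file, not the workshop data.
raw = sc.textFile("raw_logs.csv")

# Transformations are lazy: nothing runs until an action is called.
clean = (raw
         .map(lambda line: line.strip())            # normalize whitespace
         .filter(lambda line: line != "")           # drop empty lines
         .map(lambda line: line.split(","))         # parse comma-separated records
         .filter(lambda fields: len(fields) == 3))  # keep well-formed rows

# Actions trigger the actual distributed computation.
print(clean.count())   # number of clean records
print(clean.take(5))   # peek at the first few

sc.stop()
```

Note how the pipeline is expressed as a chain of transformations that are only evaluated when an action (count, take) is invoked; this laziness is what lets Spark distribute the work.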


This workshop is the first in a series of two Apache Spark workshops. To follow the upcoming workshops, you must have attended the previous ones or be comfortable with the topics already covered.


What are the prerequisites?


  • Know the basics of the Python language (or learn them quickly via this online course, Python Introduction)
  • Have some exposure to data processing with R, Python or Bash (why not?)
  • No prior knowledge of distributed processing or Apache Spark is required. This is an introductory workshop. People who already have some experience with Spark (in Scala, Java or R) may get bored (this is a workshop for beginners).


How should I prepare for this workshop?


  • You must bring a reasonably modern laptop with at least 4 GB of memory and a web browser installed. You must be able to connect to the Internet over Wifi.
  • Follow the instructions to prepare for the workshop (Docker installation + the workshop's Docker image).
  • The data to be cleaned is included in the Docker image. The exercises will be provided during the workshop as Jupyter notebooks.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Summer School "Structured Regularization for High-Dimensional Data Analysis" - IHP Paris - June 19th to 22nd

Gabriel just sent me the following:

Dear Igor,
In case you find it suitable, could you advertise this summer school?
All the best
Gabriel

Sure !
=======
=======
The SMF (French Mathematical Society) and the Institut Henri Poincaré organize a mathematical summer school on "Structured Regularization for High-Dimensional Data Analysis". This summer school will be the opportunity to bring together students, researchers and people working on High-Dimensional Data Analysis around three courses and four talks on new methods in structured regularization. The mathematical foundations of this event will lie between probability, statistics, optimization, image and signal processing.
More information (including registration, free but mandatory) is available on the webpage: https://regularize-in-paris.github.io/
Courses:
  • Anders Hansen (Cambridge)
  • Andrea Montanari (Stanford)
  • Lorenzo Rosasco (Genova and MIT)
Talks:
  • Francis Bach (INRIA and ENS)
  • Claire Boyer (UPMC)
  • Emilie Chouzenoux (Paris Est)
  • Carlos Fernandez-Granda (NYU)
Organizers:
  • Yohann De Castro (Paris-Sud)
  • Guillaume Lecué (CNRS and ENSAE)
  • Gabriel Peyré (CNRS and ENS)
The program is here: https://regularize-in-paris.github.io/program/



Tuesday, March 21, 2017

Making Backpropagation Plausible

The two papers mentioned on Monday morning are, to a certain extent, opening the door to making backpropagation plausible within the architecture of the human brain, and potentially to much faster and more scalable ways of learning. Here are four papers that investigate this avenue as a result of those two papers ([1][2]).

When training neural networks, the use of Synthetic Gradients (SG) allows layers or modules to be trained without update locking - without waiting for a true error gradient to be backpropagated - resulting in Decoupled Neural Interfaces (DNIs). This unlocked ability of being able to update parts of a neural network asynchronously and with only local information was demonstrated to work empirically in Jaderberg et al (2016). However, there has been very little demonstration of what changes DNIs and SGs impose from a functional, representational, and learning dynamics point of view. In this paper, we study DNIs through the use of synthetic gradients on feed-forward networks to better understand their behaviour and elucidate their effect on optimisation. We show that the incorporation of SGs does not affect the representational strength of the learning system for a neural network, and prove the convergence of the learning system for linear and deep linear models. On practical problems we investigate the mechanism by which synthetic gradient estimators approximate the true loss, and, surprisingly, how that leads to drastically different layer-wise representations. Finally, we also expose the relationship of using synthetic gradients to other error approximation techniques and find a unifying language for discussion and comparison.
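For readers new to the idea, here is a minimal numpy sketch (not the paper's architecture) of a synthetic gradient on a toy two-layer linear network: a local module M predicts the gradient at the hidden layer so that layer 1 can update immediately, and M itself is regressed toward the true gradient whenever it arrives. The sizes, learning rates and the target map A are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 8, 16, 4

W1 = rng.normal(0, 0.1, (n_hid, n_in))   # layer 1
W2 = rng.normal(0, 0.1, (n_out, n_hid))  # layer 2
M  = np.zeros((n_hid, n_hid))            # synthetic gradient module: h -> dL/dh
A  = rng.normal(size=(n_out, n_in))      # hypothetical target map y = A x

lr, lr_sg = 0.01, 0.01
for step in range(5001):
    x = rng.normal(size=n_in)
    y = A @ x
    h = W1 @ x
    y_hat = W2 @ h

    # Layer 1 updates right away from the *synthetic* gradient,
    # without waiting for the true error to be backpropagated.
    g_syn = M @ h
    W1 -= lr * np.outer(g_syn, x)

    # When the true gradient arrives, update layer 2 and regress M toward it.
    e = y_hat - y                      # dL/dy_hat for squared error
    g_true = W2.T @ e                  # true dL/dh
    W2 -= lr * np.outer(e, h)
    M  -= lr_sg * np.outer(g_syn - g_true, h)

    if step % 1000 == 0:
        print(step, float(e @ e))
```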

An ongoing challenge in neuromorphic computing is to devise general and computationally efficient models of inference and learning which are compatible with the spatial and temporal constraints of the brain. One increasingly popular and successful approach is to take inspiration from inference and learning algorithms used in deep neural networks. However, the workhorse of deep learning, the gradient descent Back Propagation (BP) rule, often relies on the immediate availability of network-wide information stored with high-precision memory, and precise operations that are difficult to realize in neuromorphic hardware. Remarkably, recent work showed that exact backpropagated weights are not essential for learning deep representations. Random BP replaces feedback weights with random ones and encourages the network to adjust its feed-forward weights to learn pseudo-inverses of the (random) feedback weights. Building on these results, we demonstrate an event-driven random BP (eRBP) rule that uses an error-modulated synaptic plasticity for learning deep representations in neuromorphic computing hardware. The rule requires only one addition and two comparisons for each synaptic weight using a two-compartment leaky Integrate & Fire (I&F) neuron, making it very suitable for implementation in digital or mixed-signal neuromorphic hardware. Our results show that using eRBP, deep representations are rapidly learned, achieving nearly identical classification accuracies compared to artificial neural network simulations on GPUs, while being robust to neural and synaptic state quantizations during learning.

The back-propagation (BP) algorithm has been considered the de-facto method for training deep neural networks. It back-propagates errors from the output layer to the hidden layers in an exact manner using the transpose of the feedforward weights. However, it has been argued that this is not biologically plausible because back-propagating error signals with the exact incoming weights is not considered possible in biological neural systems. In this work, we propose a biologically plausible paradigm of neural architecture based on related literature in neuroscience and asymmetric BP-like methods. Specifically, we propose two bidirectional learning algorithms with trainable feedforward and feedback weights. The feedforward weights are used to relay activations from the inputs to target outputs. The feedback weights pass the error signals from the output layer to the hidden layers. Different from other asymmetric BP-like methods, the feedback weights are also plastic in our framework and are trained to approximate the forward activations. Preliminary results show that our models outperform other asymmetric BP-like methods on the MNIST and the CIFAR-10 datasets.



Recent studies have shown that synaptic unreliability is a robust and sufficient mechanism for inducing the stochasticity observed in cortex. Here, we introduce Synaptic Sampling Machines (S2Ms), a class of neural network models that uses synaptic stochasticity as a means to Monte Carlo sampling and unsupervised learning. Similar to the original formulation of Boltzmann machines, these models can be viewed as a stochastic counterpart of Hopfield networks, but where stochasticity is induced by a random mask over the connections. Synaptic stochasticity plays the dual role of an efficient mechanism for sampling, and a regularizer during learning akin to DropConnect. A local synaptic plasticity rule implementing an event-driven form of contrastive divergence enables the learning of generative models in an on-line fashion. S2Ms perform equally well using discrete-timed artificial units (as in Hopfield networks) or continuous-timed leaky integrate and fire neurons. The learned representations are remarkably sparse and robust to reductions in bit precision and synapse pruning: removal of more than 75% of the weakest connections followed by cursory re-learning causes a negligible performance loss on benchmark classification tasks. The spiking neuron-based S2Ms outperform existing spike-based unsupervised learners, while potentially offering substantial advantages in terms of power and complexity, and are thus promising models for on-line learning in brain-inspired hardware.

Job: Machine Learning, LightOn, Paris.


LightOn is hiring. We have two openings: One for a person in Machine Learning, the other one for a person in Electronics Hardware Design. More info at: http://www.lighton.io/careers

1. Machine Learning Research Engineer
Do you want to contribute to a fast-growing company at the cutting edge of innovation between optics and artificial intelligence? LightOn is looking for a Research Engineer specialized in Machine Learning / Data Science for the development of new optical co-processors for Artificial Intelligence.
Within the R&D team and reporting to the CTO, your main duties will include:
  • the design of statistical learning algorithms that take advantage of LightOn processors,
  • algorithm testing on LightOn’s processors,
  • managing and interacting with industrial partners,
  • interacting with developers of the software layer for network access (API),
  • interfacing with the hardware developing team,
  • carrying out rapid prototyping activities in synchronization with the rest of the team.
REQUIRED PROFILE An Engineering Degree (MSc or PhD) in Machine Learning / Data Science. Industry experience would be a plus.
Technical skills (required): You should
  • Have some theoretical knowledge and hands-on experience in unsupervised or supervised machine learning (e.g. Deep Neural Networks),
  • Have some experience in processing and making sense of very large amounts of data,
  • Be proficient in scientific programming (Python, C++, Matlab, ...),
  • Be a user of one or more Machine Learning/Deep Learning framework(s) (Scikit-learn, TensorFlow, Keras, Theano, Torch, etc).
A significant interest in one or more of the following topics would be a plus:
  • automated search for hyper-parameters,
  • digital electronics or FPGA programming.
In order to work in a small startup such as LightOn, you will also need to be creative and pragmatic, and have team spirit and good communication skills.
CONDITIONS  This is a full-time position that can start as soon as possible. Salary will be based on technical skills and experience. The candidate must have the right to work in the EU. We cannot pay for relocation costs.
CONTACT
To respond to this offer, please send an e-mail to jobs@LightOn.io with [ML Engineer] in the subject line. Please attach a resume and cover letter both in PDF.
THE COMPANY  Founded in 2016, LightOn (www.LightOn.io) is a technology start-up that develops a new generation of optical co-processors designed to accelerate low-power Artificial Intelligence algorithms for massive amounts of data. The technology developed by LightOn originates from the ESPCI and Ecole Normale Supérieure laboratories. In 2016, LightOn won the City of Paris's award for best Digital Tech startup. We are located in the center of Paris, within the Agoranov incubator.
2. Electronics Hardware Systems engineer
Would you like to contribute to a fast-growing company at the cutting edge of innovation between optics and artificial intelligence? LightOn is looking for a Research Engineer specialized in Electronics and Embedded Systems to develop our new optical co-processors for Artificial Intelligence.
Within the R&D team, reporting to the CTO, your main duties will include:
  • system integration with high-throughput opto-electronic components,
  • design of driver software,
  • digital design / guidance of PCB layout,
  • interaction with developers of the software layer for network access (API),
  • rapid prototyping activities with the rest of the team,
  • functional verification,
  • manufacturing production / subcontracting support.
REQUIRED PROFILE An Engineering Degree (MSc or PhD) in Electrical Engineering or related field.
Technical skills (required):
  • Relevant experience (ideally 5+ years in industry) designing embedded systems,
  • Successful track record of delivering highly innovative products,
  • Digital logic board design of embedded CPU, RAM, ROM, and FPGA subsystems,
  • Experience with high-speed digital circuits such as HDMI, DDR, PCIe, USB, Ethernet / GigE.
A significant interest in one or more of the following topics would be a plus:
  • Machine Learning,
  • Cloud-based services.
In order to work in a small startup such as LightOn, you will also need to be creative and pragmatic, and have team spirit and good communication skills.
CONDITIONS  This is a full-time position that can start as soon as possible. Salary will be based on technical skills and experience. The candidate must have the right to work in the EU. We cannot pay for relocation costs.
CONTACT
To respond to this offer, please send an e-mail to jobs@LightOn.io with [EE Engineer] in the subject line. Please attach a resume and cover letter both in PDF.
THE COMPANY  Founded in 2016, LightOn (www.LightOn.io) is a technology start-up that develops a new generation of optical co-processors designed to accelerate low-power Artificial Intelligence algorithms for massive amounts of data. The technology developed by LightOn originates from the ESPCI and Ecole Normale Supérieure laboratories. In 2016, LightOn won the City of Paris's award for best Digital Tech startup. We are located in the center of Paris, within the Agoranov incubator.







Random Triggering Based Sub-Nyquist Sampling System for Sparse Multiband Signal

Yijiu let me know of his new paper:



Random Triggering Based Sub-Nyquist Sampling System for Sparse Multiband Signal by Yijiu Zhao, Yu Hen Hu, Jingjing Liu

We propose a novel random triggering based modulated wideband compressive sampling (RT-MWCS) method to facilitate efficient realization of sub-Nyquist rate compressive sampling systems for sparse wideband signals. Under the assumption that the signal is repetitively (not necessarily periodically) triggered, RT-MWCS uses random modulation to obtain measurements of the signal at randomly chosen positions. It uses the multiple measurement vector method to estimate the non-zero supports of the signal in the frequency domain. Then, the signal spectrum is solved using least squares estimation. The distinct ability to estimate sparse multiband signals is facilitated by the use of level triggering and time-to-digital converter devices previously used in the random equivalent sampling (RES) scheme. Compared to existing compressive sampling (CS) techniques, such as the modulated wideband converter (MWC), RT-MWCS has a simple system architecture and can be implemented with one channel at the cost of more sampling time. Experimental results indicate that, for sparse multiband signals with unknown spectral support, RT-MWCS requires a sampling rate much lower than the Nyquist rate, while delivering high-quality signal reconstruction.




Monday, March 20, 2017

Learning in the Machine: Random Backpropagation and the Learning Channel

Carlos Perez's blog entry on Medium entitled Deep Learning: The Unreasonable Effectiveness of Randomness just led me to the following paper I had not read before (probably because it came out during NIPS). I also added the latest version of Arild Nokland's earlier paper on a similar idea that was itself published at NIPS (and featured on Nuit Blanche). 




Random backpropagation (RBP) is a variant of the backpropagation algorithm for training neural networks, where the transpose of the forward matrices are replaced by fixed random matrices in the calculation of the weight updates. It is remarkable both because of its effectiveness, in spite of using random matrices to communicate error information, and because it completely removes the taxing requirement of maintaining symmetric weights in a physical neural system. To better understand random backpropagation, we first connect it to the notions of local learning and the learning channel. Through this connection, we derive several alternatives to RBP, including skipped RBP (SRBP), adaptive RBP (ARBP), sparse RBP, and their combinations (e.g. ASRBP) and analyze their computational complexity. We then study their behavior through simulations using the MNIST and CIFAR-10 benchmark datasets. These simulations show that most of these variants work robustly, almost as well as backpropagation, and that multiplication by the derivatives of the activation functions is important. As a follow-up, we study also the low-end of the number of bits required to communicate error information over the learning channel. We then provide partial intuitive explanations for some of the remarkable properties of RBP and its variations. Finally, we prove several mathematical results, including the convergence to fixed points of linear chains of arbitrary length, the convergence to fixed points of linear autoencoders with decorrelated data, the long-term existence of solutions for linear systems with a single hidden layer, and the convergence to fixed points of non-linear chains, when the derivative of the activation functions is included.



Artificial neural networks are most commonly trained with the back-propagation algorithm, where the gradient for learning is provided by back-propagating the error, layer by layer, from the output layer to the hidden layers. A recently discovered method called feedback-alignment shows that the weights used for propagating the error backward don't have to be symmetric with the weights used for propagating the activation forward. In fact, random feedback weights work equally well, because the network learns how to make the feedback useful. In this work, the feedback alignment principle is used for training hidden layers more independently from the rest of the network, and from a zero initial condition. The error is propagated through fixed random feedback connections directly from the output layer to each hidden layer. This simple method is able to achieve zero training error even in convolutional networks and very deep networks, completely without error back-propagation. The method is a step towards biologically plausible machine learning because the error signal is almost local, and no symmetric or reciprocal weights are required. Experiments show that the test performance on MNIST and CIFAR is almost as good as those obtained with back-propagation for fully connected networks. If combined with dropout, the method achieves 1.45% error on the permutation invariant MNIST task.
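As a toy illustration of the principle (a numpy sketch under assumed sizes and a hypothetical linear target task, not the paper's setup), here is feedback alignment on a one-hidden-layer network: the backward pass uses a fixed random matrix B in place of the transpose of the forward weights.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid, n_out = 10, 32, 2

W1 = rng.normal(0, 0.1, (n_hid, n_in))   # forward weights, layer 1
W2 = rng.normal(0, 0.1, (n_out, n_hid))  # forward weights, layer 2
B  = rng.normal(0, 0.1, (n_hid, n_out))  # fixed random feedback weights
T  = rng.normal(size=(n_out, n_in))      # hypothetical target map y = T x

lr = 0.05
for step in range(5001):
    x = rng.normal(size=n_in)
    y = T @ x
    h = np.tanh(W1 @ x)
    y_hat = W2 @ h

    e = y_hat - y                 # output error
    dh = (B @ e) * (1 - h ** 2)   # feedback alignment: B replaces W2.T
    W2 -= lr * np.outer(e, h)
    W1 -= lr * np.outer(dh, x)

    if step % 1000 == 0:
        print(step, float(e @ e))
```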


Saturday, March 18, 2017

Six million page views: The Number's Game in Long Distance Blogging



The Long Distance Blogging continues. Six million page views amounts to roughly a million page views per year, but those stats don't include the 800+ people receiving this blog as a "newsletter" every day. If you want every new entry directly in your mailbox, enter your email address here:
At NIPS, I was surprised to learn that some readers would call the blog a newsletter, as they probably seldom come to the site directly. There is also the Nuit Blanche RSS feed, which has about 1,200 subscribers.

The Nuit Blanche communities where I repost blog entries on other social networks are:
There are also two groups on LinkedIn where people can interact. Members of those groups receive only a monthly Nuit Blanche in Review, as opposed to the daily blog posting frequency above:


We are currently in Season 4 of the Paris Machine Learning meetup. Our membership numbers suggest it is probably the third largest in the world by region, after Silicon Valley and New York, so Franck and I decided to invest in a site: MLParis.org. We've had 18 meetups since September. Here are some sites associated with the Paris Machine Learning community:

Over the course of writing Nuit Blanche, there was a need to make some information available in a more permanent fashion; here are those reference pages:
Here are the historical figures:

Friday, March 17, 2017

The Unreasonable Effectiveness of Random Orthogonal Embeddings



In the series "The Unreasonable Effectiveness of ...", we've had:

Today, we have something that is a subset of random projections: The Unreasonable Effectiveness of Random Orthogonal Embeddings by Krzysztof Choromanski, Mark Rowland, Adrian Weller
We present a general class of embeddings based on structured random matrices with orthogonal rows which can be applied in many machine learning applications including dimensionality reduction, kernel approximation and locality-sensitive hashing. We show that this class yields improvements over previous state-of-the-art methods either in computational efficiency (while providing similar accuracy) or in accuracy, or both. In particular, we propose the \textit{Orthogonal Johnson-Lindenstrauss Transform} (OJLT) which is as fast as earlier methods yet provably outperforms them in terms of accuracy, leading to a `free lunch' improvement over previous dimensionality reduction mechanisms. We introduce matrices with complex entries that further improve accuracy. Other applications include estimators for certain pointwise nonlinear Gaussian kernels, and speed improvements for approximate nearest-neighbor search in massive datasets with high-dimensional feature vectors.
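To see the effect on a toy example, here is a small numpy comparison using a dense QR-based orthogonal matrix as a stand-in (the paper's point is that structured orthogonal matrices obtain the same benefit much faster). Both maps are scaled to be unbiased for squared norms, but the orthogonal one typically shows lower estimation error.

```python
import numpy as np

rng = np.random.default_rng(2)
d, m, n_trials = 256, 64, 2000

# An i.i.d. Gaussian JL map vs. one whose rows are orthogonal (dense QR
# stand-in for fast structured constructions); both are scaled so that
# E[||Px||^2] = ||x||^2.
G = rng.normal(size=(m, d)) / np.sqrt(m)
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
G_orth = Q[:m] * np.sqrt(d / m)

def mean_rel_err(P):
    errs = np.empty(n_trials)
    for t in range(n_trials):
        x = rng.normal(size=d)
        true = x @ x                 # true squared norm
        est = (P @ x) @ (P @ x)      # embedded squared norm
        errs[t] = abs(est - true) / true
    return errs.mean()

print("i.i.d. Gaussian :", mean_rel_err(G))
print("orthogonal rows :", mean_rel_err(G_orth))
```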


Warming Up

Two items on Twitter made passing mention of the connection between AI and power consumption:








Thursday, March 16, 2017

Sparsebn - A new R package for learning sparse graphical models from high-dimensional data via sparse regularization - implementation -

Bryon just sent me the following:


Igor -

I hope that all is well! My group recently released a new library for learning sparse graphical models using ideas from compressed sensing and high-dimensional statistics. For Bayesian networks it scales much better than existing approaches. I would very much appreciate it if you would consider posting this on your blog; I think the readers of Nuit Blanche would be interested.
Here is a more formal announcement:


Introducing sparsebn - A new R package for learning sparse graphical models from high-dimensional data via sparse regularization. Designed from the ground up to handle:
  • Experimental data with interventions
  • Mixed observational / experimental data
  • High-dimensional data with p much larger than n
  • Datasets with thousands of variables
  • Continuous and discrete data
The emphasis of this package is scalability and statistical consistency on high-dimensional datasets. Compared to existing algorithms, sparsebn scales much better. The package is under active development, and we have several features and improvements in the pipeline that will help scale these algorithms even further.



-B

Bryon Aragam
CMU Machine Learning Department
Twitter: @itsrainingdata


Learning graphical models from data is an important problem with wide applications, ranging from genomics to the social sciences. Nowadays datasets typically have upwards of thousands---sometimes tens or hundreds of thousands---of variables and far fewer samples. To meet this challenge, we develop a new R package called sparsebn for learning the structure of large, sparse graphical models with a focus on Bayesian networks. While there are many existing packages for this task within the R ecosystem, this package focuses on the unique setting of learning large networks from high-dimensional data, possibly with interventions. As such, the methods provided place a premium on scalability and consistency in a high-dimensional setting. Furthermore, in the presence of interventions, the methods implemented here achieve the goal of learning a causal network from data. The sparsebn package is open-source and available on CRAN.



Wednesday, March 15, 2017

Jobs: Two Senior Research Scientist/Principal Research Scientist, NPL, Teddington, U.K.

Stéphane just sent me the following:

Dear Igor,


I hope you are doing fine. We have openings at NPL in the Data Science Division.
· 65350 - Senior Research Scientist/Principal Research Scientist
· 65352 - Senior Research Scientist/Principal Research Scientist
The link to the jobs description is here: http://careers.npl.co.uk/vacancies/
These openings might be of interest to the readers of Nuit Blanche.
Kind regards, 
Stephane

Job: Research Associate in Deep Learning, NRL, Washington D.C.

Leslie just sent me the following:

Hi Igor,
I hope you are doing well. I am currently looking for a postdoctoral researcher to work on a few of my projects at the US Naval Research Laboratory (NRL) in DC. Below is a description of this opportunity. I would deeply appreciate it if you would consider posting it on Nuit Blanche. Many thanks in advance.

Best regards,
Leslie


Here is the announcement:


Research Associate in Deep Learning


We are seeking a postdoctoral researcher in deep learning, machine learning, artificial intelligence, computer science, applied mathematics or a related field, for basic and applied research, development, and evaluation of innovative deep learning methodologies.


Our current research foci include improving state-of-the-art deep learning methods, making deep networks explainable, and deep reinforcement learning for decision making. Specifically, projects include reducing the need for large labeled training datasets, improving deep neural network architectural design, empirically probing network loss function topology anomalies, investigating dynamic hyper-parameters, explainable DL, classifying genomic sequences, the use of new types of non-linear functionality within networks, investigating information overlap between data modalities, and the use of Generative Adversarial Networks for generating novel experiences for training deep reinforcement learning networks. Expertise gained from this fundamental deep learning research is utilized to support novel Navy applications which often guide us to new research avenues.


The successful applicant will be based in the Navy Center for Applied Research in Artificial Intelligence (NCARAI) at the U.S. Naval Research Laboratory, in our nation’s capital, Washington, D.C. Our group, which often includes students, visiting summer researchers, and other postdocs, enjoys close ties with academia, the military, and other customers who can benefit from the prototypes which we develop. The goals of our deep learning team are both to publish improvements to the state of the art in deep learning research and to consult with groups applying deep learning to novel applications of importance to the U.S. Navy. The successful applicant will work closely with Dr. Leslie N. Smith and potential candidates should feel free to contact him with questions at leslie.smith@nrl.navy.mil.



Further information is available at http://nrc58.nas.edu/RAPLab10/Opportunity/Opportunity.aspx?LabCode=64&ROPCD=641518&RONum=B8506. Please note that potential candidates need to be either US citizens or permanent residents.









k-NN at Scale and Deep Learning





Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures. This paper tackles the problem of better utilizing GPUs for this task. While GPUs excel at data-parallel tasks, prior approaches are bottlenecked by algorithms that expose less parallelism, such as k-min selection, or make poor use of the memory hierarchy.
We propose a design for k-selection that operates at up to 55% of theoretical peak performance, enabling a nearest neighbor implementation that is 8.5x faster than prior GPU state of the art. We apply it in different similarity search scenarios, by proposing optimized design for brute-force, approximate and compressed-domain search based on product quantization. In all these setups, we outperform the state of the art by large margins. Our implementation enables the construction of a high accuracy k-NN graph on 95 million images from the Yfcc100M dataset in 35 minutes, and of a graph connecting 1 billion vectors in less than 12 hours on 4 Maxwell Titan X GPUs. We have open-sourced our approach for the sake of comparison and reproducibility.
An implementation is here: https://github.com/facebookresearch/faiss 
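For those who want to try it, basic CPU usage of the library looks like this (a minimal example on random data; the GPU indexes follow the same add/search pattern):

```python
import numpy as np
import faiss  # installation instructions are in the repo above

d = 64                                             # vector dimensionality
xb = np.random.rand(100000, d).astype('float32')   # database vectors
xq = np.random.rand(5, d).astype('float32')        # query vectors

index = faiss.IndexFlatL2(d)   # exact, brute-force L2 index
index.add(xb)                  # index the database
D, I = index.search(xq, 4)     # 4 nearest neighbors: distances and ids
print(I)                       # row r holds the ids of xq[r]'s neighbors
```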



Nearest neighbor (kNN) methods have been gaining popularity in recent years in light of advances in hardware and efficiency of algorithms. There is a plethora of methods to choose from today, each with their own advantages and disadvantages. One requirement shared between all kNN based methods is the need for a good representation and distance measure between samples.
We introduce a new method called differentiable boundary tree which allows for learning deep kNN representations. We build on the recently proposed boundary tree algorithm which allows for efficient nearest neighbor classification, regression and retrieval. By modelling traversals in the tree as stochastic events, we are able to form a differentiable cost function which is associated with the tree's predictions. Using a deep neural network to transform the data and back-propagating through the tree allows us to learn good representations for kNN methods. We demonstrate that our method is able to learn suitable representations allowing for very efficient trees with a clearly interpretable structure.



Tuesday, March 14, 2017

Emmanuel Candès in Paris, Insense, Imaging M87, gauge and perspective duality, Structured signal recovery from quadratic measurements




Emmanuel Candès is in Paris for the week and will be giving a talk every day, starting today. More information can be found here. Today we have four papers from different areas of compressive sensing. The first one has to do with sensor selection when signals are sparse, the second is about imaging the structure of an unknown object using sparse modeling, the third shows the equivalence of certain problem set-ups, and the last one is about nonlinear compressive sensing. Enjoy!




Sensor selection refers to the problem of intelligently selecting a small subset of a collection of available sensors to reduce the sensing cost while preserving signal acquisition performance. The majority of sensor selection algorithms find the subset of sensors that best recovers an arbitrary signal from a number of linear measurements that is larger than the dimension of the signal. In this paper, we develop a new sensor selection algorithm for sparse (or near sparse) signals that finds a subset of sensors that best recovers such signals from a number of measurements that is much smaller than the dimension of the signal. Existing sensor selection algorithms cannot be applied in such situations. Our proposed Incoherent Sensor Selection (Insense) algorithm minimizes a coherence-based cost function that is adapted from recent results in sparse recovery theory. Using six datasets, including two real-world datasets on microbial diagnostics and structural health monitoring, we demonstrate the superior performance of Insense for sparse-signal sensor selection.

An implementation of Insense is here: https://github.com/amirmohan/Insense
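This is not Insense itself (which minimizes a smooth coherence-based cost), but to give a feel for the quantity it controls, here is a toy numpy sketch that greedily selects sensors, i.e. rows of a candidate sensing matrix, so as to keep the mutual coherence of the selected submatrix low. All sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def coherence(Phi):
    """Mutual coherence: max absolute inner product between
    distinct normalized columns of Phi."""
    Phin = Phi / np.linalg.norm(Phi, axis=0, keepdims=True)
    G = np.abs(Phin.T @ Phin)
    np.fill_diagonal(G, 0.0)
    return G.max()

# Candidate sensing matrix: each row is one available sensor.
n_sensors, n_signal = 100, 40
Phi = rng.normal(size=(n_sensors, n_signal))

# Greedy selection: repeatedly add the sensor whose inclusion yields
# the lowest mutual coherence of the selected submatrix.
k = 15
selected = [0]
while len(selected) < k:
    best, best_c = None, np.inf
    for i in range(n_sensors):
        if i in selected:
            continue
        c = coherence(Phi[selected + [i]])
        if c < best_c:
            best, best_c = i, c
    selected.append(best)

print("selected sensors:", sorted(selected))
print("coherence of selection:", coherence(Phi[selected]))
```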





We propose a new imaging technique for radio and optical/infrared interferometry. The proposed technique reconstructs the image from the visibility amplitude and closure phase, which are standard data products of short-millimeter very long baseline interferometers such as the Event Horizon Telescope (EHT) and optical/infrared interferometers, by utilizing two regularization functions: the ℓ1-norm and total variation (TV) of the brightness distribution. In the proposed method, optimal regularization parameters, which represent the sparseness and effective spatial resolution of the image, are derived from data themselves using cross validation (CV). As an application of this technique, we present simulated observations of M87 with the EHT based on four physically motivated models. We confirm that ℓ1+TV regularization can achieve an optimal resolution of ∼20−30% of the diffraction limit λ/Dmax, which is the nominal spatial resolution of a radio interferometer. With the proposed technique, the EHT can robustly and reasonably achieve super-resolution sufficient to clearly resolve the black hole shadow. These results make it promising for the EHT to provide an unprecedented view of the event-horizon-scale structure in the vicinity of the super-massive black hole in M87 and also the Galactic center Sgr A*.



Common numerical methods for constrained convex optimization are predicated on efficiently computing nearest points to the feasible region. The presence of a design matrix in the constraints yields feasible regions with more complex geometries. When the functional components are gauges, there is an equivalent optimization problem---the gauge dual---where the matrix appears only in the objective function and the corresponding feasible region is easy to project onto. We revisit the foundations of gauge duality and show that the paradigm arises from an elementary perturbation perspective. We therefore put gauge duality and Fenchel duality on an equal footing, explain gauge dual variables as sensitivity measures, and show how to recover primal solutions from those of the gauge dual. In particular, we prove that optimal solutions of the Fenchel dual of the gauge dual are precisely the primal solutions rescaled by the optimal value. The gauge duality framework is extended beyond gauges to the setting when the functional components are general nonnegative convex functions, including problems with piecewise linear quadratic functions and constraints that arise from generalized linear models used in regression.
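To make the abstract's key point concrete, in the simplest linear-measurement instance (following the gauge-duality literature, e.g. Freund's original formulation), the primal problem min κ(x) subject to Ax = b pairs with the gauge dual min κ°(Aᵀy) subject to ⟨b, y⟩ ≥ 1, where κ°(z) = sup { ⟨x, z⟩ : κ(x) ≤ 1 } is the polar gauge. Note that the matrix A has moved into the objective, and the dual feasible set { y : ⟨b, y⟩ ≥ 1 } is a half-space, which is trivial to project onto; this is exactly the structural advantage the abstract describes.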


This paper concerns the problem of recovering an unknown but structured signal x∈Rn from m quadratic measurements of the form yr=|ar,x|2 for r=1,2,...,m. We focus on the under-determined setting where the number of measurements is significantly smaller than the dimension of the signal (m much less than n). We formulate the recovery problem as a nonconvex optimization problem where prior structural information about the signal is enforced through constrains on the optimization variables. We prove that projected gradient descent, when initialized in a neighborhood of the desired signal, converges to the unknown signal at a linear rate. These results hold for any constraint set (convex or nonconvex) providing convergence guarantees to the global optimum even when the objective function and constraint set is nonconvex. Furthermore, these results hold with a number of measurements that is only a constant factor away from the minimal number of measurements required to uniquely identify the unknown signal. Our results provide the first provably tractable algorithm for this data-poor regime, breaking local sample complexity barriers that have emerged in recent literature. In a companion paper we demonstrate favorable properties for the optimization problem that may enable similar results to continue to hold more globally (over the entire ambient space). Collectively these two papers utilize and develop powerful tools for uniform convergence of empirical processes that may have broader implications for rigorous understanding of constrained nonconvex optimization heuristics. The mathematical results in this paper also pave the way for a new generation of data-driven phase-less imaging systems that can utilize prior information to significantly reduce acquisition time and enhance image reconstruction, enabling nano-scale imaging at unprecedented speeds and resolutions.
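Here is a minimal numpy sketch of the kind of algorithm the abstract analyzes: projected gradient descent on the quartic least-squares loss for measurements yᵣ = ⟨aᵣ, x⟩², with a simple nonnegativity constraint standing in for the structural prior. The step size, initialization radius and constraint set are illustrative assumptions; the paper's guarantees concern initialization in a neighborhood of the signal, which the sketch mimics.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 50, 300

x = np.abs(rng.normal(size=n))       # ground truth obeying the prior (x >= 0)
A = rng.normal(size=(m, n))          # measurement vectors a_r as rows
y = (A @ x) ** 2                     # quadratic (phaseless) measurements

def grad(z):
    """Gradient of f(z) = (1/4m) * sum_r (<a_r, z>^2 - y_r)^2."""
    Az = A @ z
    return A.T @ ((Az ** 2 - y) * Az) / m

proj = lambda z: np.maximum(z, 0.0)  # projection onto the constraint set
                                     # (nonnegativity, as an example prior)

step = 0.2 / np.mean(y)              # Wirtinger-flow style step size (assumption)
z = proj(x + 0.1 * rng.normal(size=n))   # init near the signal, as the theory assumes
for _ in range(2000):
    z = proj(z - step * grad(z))

print("relative error:", np.linalg.norm(z - x) / np.linalg.norm(x))
```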


