Thursday, October 30, 2014

Higher Criticism for Large-Scale Inference: especially for Rare and Weak effects

Two papers on an interesting subject today with a common co-author. From a footnote in the first paper:

Our point here is not that HC [Higher Criticism] should replace formal methods using random matrix theory, but instead that HC can be used in structured settings where theory is not yet available. A careful comparison to formal inference using random matrix theory, not possible here, would illustrate the benefits of theoretical analysis of a specific situation as exemplified by random matrix theory, in this case over the direct application of a general procedure like HC.



Higher Criticism for Large-Scale Inference: especially for Rare and Weak effects by David Donoho, Jiashun Jin

In modern high-throughput data analysis, researchers perform a large number of statistical tests, expecting to find perhaps a small fraction of significant effects against a predominantly null background. Higher Criticism (HC) was introduced to determine whether there are any non-zero effects; more recently, it was applied to feature selection, where it provides a method for selecting useful predictive features from a large body of potentially useful features, among which only a rare few will prove truly useful.
In this article, we review the basics of HC in both the testing and feature selection settings. HC is a flexible idea, which adapts easily to new situations; we point out how it adapts to clique detection and bivariate outlier detection. HC, although still early in its development, is seeing increasing interest from practitioners; we illustrate this with worked examples. HC is computationally effective, which gives it a nice leverage in the increasingly more relevant "Big Data" settings we see today.
We also review the underlying theoretical "ideology" behind HC. The Rare/Weak (RW) model is a theoretical framework simultaneously controlling the size and prevalence of useful/significant items among the useless/null bulk. The RW model shows that HC has important advantages over better known procedures such as False Discovery Rate (FDR) control and Family-wise Error control (FwER), in particular, certain optimality properties. We discuss the rare/weak phase diagram, a way to visualize clearly the class of RW settings where the true signals are so rare or so weak that detection and feature selection are simply impossible, and a way to understand the known optimality properties of HC.
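
For readers who have not seen the statistic itself, here is a minimal sketch of one common form of HC in the testing setting: sort the p-values, standardize the gap between each sorted p-value and its expected position under the null, and take the maximum over the smallest fraction alpha0 of them. This is not code from the paper. The function names, the default alpha0 = 0.5, the choice to skip p-values equal to 0 or 1, and the simulation parameters in the demo are all assumptions made for illustration.

```python
import math
import random


def higher_criticism(p_values, alpha0=0.5):
    """Return one common form of the HC statistic for a list of p-values.

    HC = max over 1 <= i <= alpha0 * n of
         sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))),
    where p_(1) <= ... <= p_(n) are the sorted p-values.
    Returns -inf if every candidate p-value is degenerate (0 or 1).
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    sorted_p = sorted(p_values)
    k = max(1, int(alpha0 * n))  # only the k smallest p-values are scanned
    best = -math.inf
    for i in range(1, k + 1):
        p = sorted_p[i - 1]
        if p <= 0.0 or p >= 1.0:
            continue  # assumption: skip degenerate values to avoid dividing by zero
        score = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, score)
    return best


def one_sided_p(z):
    """Upper-tail p-value of a standard normal z-score."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def simulate_p_values(n, eps, mu, rng):
    """Rare/weak toy data: each z-score is N(mu, 1) with probability eps, else N(0, 1)."""
    zs = [rng.gauss(mu if rng.random() < eps else 0.0, 1.0) for _ in range(n)]
    return [one_sided_p(z) for z in zs]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    null_p = simulate_p_values(10_000, eps=0.0, mu=0.0, rng=rng)
    mixed_p = simulate_p_values(10_000, eps=0.01, mu=3.0, rng=rng)  # hypothetical settings
    print("HC under the null:      ", higher_criticism(null_p))
    print("HC with rare/weak signal:", higher_criticism(mixed_p))
```

In this sketch a large HC value suggests that the collection of p-values as a whole departs from the null, even when no single p-value would survive a multiple-testing correction on its own.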




Rare and Weak effects in Large-Scale Inference: methods and phase diagrams by Jiashun Jin, Tracy Ke

Often when we deal with "Big Data", the true effects we are interested in are Rare and Weak (RW). Researchers measure a large number of features, hoping to find perhaps only a small fraction of them to be relevant to the research in question; the effect sizes of the relevant features are individually small so the true effects are not strong enough to stand out for themselves.
Higher Criticism (HC) and Graphlet Screening (GS) are two classes of methods that are specifically designed for the Rare/Weak settings. HC was introduced to determine whether there are any relevant effects in all the measured features. More recently, HC was applied to classification, where it provides a method for selecting useful predictive features for trained classification rules. GS was introduced as a graph-guided multivariate screening procedure, and was used for variable selection.
We develop a theoretic framework where we use an Asymptotic Rare and Weak (ARW) model simultaneously controlling the size and prevalence of useful/significant features among the useless/null bulk. At the heart of the ARW model is the so-called phase diagram, which is a way to visualize clearly the class of ARW settings where the relevant effects are so rare or weak that desired goals (signal detection, variable selection, etc.) are simply impossible to achieve. We show that HC and GS have important advantages over better known procedures and achieve the optimal phase diagrams in a variety of ARW settings.
HC and GS are flexible ideas that adapt easily to many interesting situations. We review the basics of these ideas and some of the recent extensions, discuss their connections to existing literature, and suggest some new applications of these ideas.
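
To make the phase diagram idea concrete, here is a small sketch of the detection boundary in the simplest setting I know of, the sparse Gaussian mixture from the earlier HC literature. The assumptions, none of which are spelled out in the abstract above: a fraction n^(-beta) of the n features carry a signal of size sqrt(2 r log n), with 1/2 < beta < 1 and 0 < r < 1. In that setting the boundary is r = beta - 1/2 for beta up to 3/4, and r = (1 - sqrt(1 - beta))^2 beyond that. Function names are hypothetical, and the more general ARW settings of the paper may have different boundaries.

```python
import math


def calibrate(n, beta, r):
    """Map (beta, r) to a signal fraction and signal size for n features.

    Assumed calibration: eps = n ** (-beta), tau = sqrt(2 * r * log(n)).
    """
    eps = n ** (-beta)
    tau = math.sqrt(2.0 * r * math.log(n))
    return eps, tau


def detection_boundary(beta):
    """Detection boundary for the sparse Gaussian mixture, 1/2 < beta < 1."""
    if not 0.5 < beta < 1.0:
        raise ValueError("beta must lie in (1/2, 1)")
    if beta <= 0.75:
        return beta - 0.5
    return (1.0 - math.sqrt(1.0 - beta)) ** 2


def is_detectable(beta, r):
    """True when (beta, r) lies strictly above the detection boundary."""
    return r > detection_boundary(beta)


if __name__ == "__main__":
    # Print a coarse slice of the phase diagram at a fixed signal strength.
    for beta in (0.55, 0.65, 0.75, 0.85, 0.95):
        region = "detectable" if is_detectable(beta, 0.3) else "undetectable"
        print(f"beta={beta:.2f}  boundary={detection_boundary(beta):.3f}  r=0.30 is {region}")
```

Below the curve, signals are too rare or too weak for any test to separate the mixture from pure noise as n grows; above it, detection is possible in principle.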
 