LEI Lihua: AdaPT: An interactive procedure for multiple testing with side information

Theme: AdaPT: An interactive procedure for multiple testing with side information

Lecturer: Dr. LEI Lihua, Stanford University

Host: Professor CHANG Jinyuan, School of Statistics

Date: 10:00-11:20, July 03, 2020

Conference ID: Tencent Conference, 686 799 248

Organizers: Statistical Research Center, Joint Laboratory for Data Science and Business Intelligence, School of Statistics and Office of Research Affairs

Introduction to the Lecturer:

Lihua Lei is a postdoctoral researcher in the Department of Statistics at Stanford University, advised by Professor Emmanuel Candès. He received his Ph.D. from UC Berkeley, advised by Professors Peter Bickel and Michael Jordan, and was also fortunate to be supervised by Professors Noureddine El Karoui, William Fithian and Peng Ding on particular projects. Before that, he majored in mathematics and statistics in the School of Mathematical Sciences at Peking University, with a minor in economics at the China Center for Economic Research at Peking University. He was a research assistant with Professor Lan Wu and was supervised by Professor Song Xi Chen on his undergraduate thesis. His research interests include multiple hypothesis testing, causal inference, network analysis, high-dimensional statistical inference, optimization, resampling methods, time series analysis and econometrics.

Content Summary:

We consider the problem of multiple hypothesis testing with generic side information: for each hypothesis H_i we observe both a p-value p_i and some predictor x_i encoding contextual information about the hypothesis. For large-scale problems, adaptively focusing power on the more promising hypotheses (those more likely to yield discoveries) can lead to much more powerful multiple testing procedures. We propose a general iterative framework for this problem, called the Adaptive p-value Thresholding (AdaPT) procedure, which adaptively estimates a Bayes-optimal p-value rejection threshold and controls the false discovery rate (FDR) in finite samples. At each iteration of the procedure, the analyst proposes a rejection threshold, observes partially censored p-values, estimates the false discovery proportion (FDP) below the threshold, and either stops to reject or proposes another threshold, until the estimated FDP falls below α. Our procedure is adaptive in an unusually strong sense, permitting the analyst to use any statistical or machine learning method she chooses to estimate the optimal threshold, and to switch between different models at each iteration as information accrues. This is joint work with Professor William Fithian.
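The iterative loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the full AdaPT procedure: it uses a single constant threshold s in place of a covariate-dependent threshold s(x_i), a simple geometric shrinkage rule in place of a fitted model, and the mirror-image estimate FDP(s) = (1 + #{p_i ≥ 1 − s}) / max(1, #{p_i ≤ s}) for the false discovery proportion. The function name and the `shrink` parameter are illustrative choices, not part of the original method.

```python
import numpy as np

def adapt_constant_threshold(pvals, alpha=0.1, s_init=0.45, shrink=0.9):
    """Toy AdaPT-style loop with a constant (covariate-free) threshold.

    At each iteration: count candidate rejections {p_i <= s}, count the
    censored "mirror" p-values {p_i >= 1 - s}, form the FDP estimate
    (1 + mirrors) / max(1, rejections), and either stop (estimate <= alpha)
    or propose a smaller threshold.
    """
    pvals = np.asarray(pvals, dtype=float)
    s = s_init  # keep s_init < 0.5 so rejection and mirror regions are disjoint
    while s > 1e-6:
        rejections = pvals <= s        # candidate rejection set below s
        mirrors = pvals >= 1.0 - s     # mirror region used to estimate false discoveries
        fdp_hat = (1 + mirrors.sum()) / max(1, rejections.sum())
        if fdp_hat <= alpha:
            return np.flatnonzero(rejections), s
        s *= shrink                    # propose a smaller threshold and iterate
    return np.array([], dtype=int), 0.0

# Usage: 50 clearly significant p-values mixed with 50 moderate ones.
pvals = np.concatenate([np.full(50, 1e-6), np.full(50, 0.5)])
rejected, s_final = adapt_constant_threshold(pvals, alpha=0.1)
```

In the real procedure, the threshold update would come from any model the analyst likes (the "unusually strong" adaptivity in the summary), fit to the partially censored p-values rather than a fixed shrinkage schedule.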