Grad Seminar: Jonathan Williams
Jonathan P. Williams
University of North Carolina at Chapel Hill
Joint work with: Jan Hannig (UNC-Chapel Hill) and Yuying Xie (Michigan State University)
Non-penalized variable selection in high-dimensional settings via generalized fiducial inference
Standard penalized methods of variable selection and parameter estimation rely on the magnitude of coefficient estimates to decide which variables to include in the final model. However, coefficient estimates are unreliable when the design matrix is collinear. To overcome this challenge, an entirely new perspective on variable selection is presented within a generalized fiducial inference framework. The new procedure effectively accounts for linear dependencies among subsets of covariates in a high-dimensional setting where p can grow almost exponentially in n, as well as in the classical setting where p ≤ n. It is shown that the procedure very naturally assigns small probabilities to subsets of covariates that include redundancies, by way of explicit L0 minimization. Furthermore, under a typical sparsity assumption, it is shown that the proposed method is consistent in the sense that the probability assigned to the true sparse subset of covariates converges in probability to 1 as n → ∞, or as n → ∞ and p → ∞. Only very reasonable conditions are needed, and little restriction is placed on the class of possible subsets of covariates to achieve this consistency result.
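As a rough illustration of why redundancy matters, the sketch below builds a small design matrix with an exact linear dependency and scores every subset of covariates. It is not the generalized fiducial procedure of the talk: a BIC-style L0 criterion is swapped in as a stand-in, and the names `rss`, `l0_score`, and `best_subset`, the simulated data, and the coefficients are all assumptions made for this example.

```python
import itertools
import numpy as np


def rss(X, y, subset):
    """Residual sum of squares of the least-squares fit on the chosen columns."""
    if len(subset) == 0:
        return float(y @ y)
    Xs = X[:, list(subset)]
    # lstsq returns the minimum-norm solution when the columns are collinear
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    return float(resid @ resid)


def l0_score(X, y, subset):
    """BIC-style score (an assumption, not the fiducial probability):
    goodness of fit plus a penalty proportional to the subset size."""
    n = len(y)
    return n * np.log(rss(X, y, subset) / n) + len(subset) * np.log(n)


def best_subset(X, y):
    """Exhaustive search over all subsets; feasible only for small p."""
    p = X.shape[1]
    subsets = [s for k in range(p + 1)
               for s in itertools.combinations(range(p), k)]
    return min(subsets, key=lambda s: l0_score(X, y, s))


# Simulated data (hypothetical): x3 is an exact linear combination of x1 and x2
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 - 3.0 * x2 + 0.5 * rng.normal(size=n)

chosen = best_subset(X, y)
full = (0, 1, 2)
print("rank of X:", np.linalg.matrix_rank(X))
print("chosen subset:", chosen)
print("score of chosen subset:", l0_score(X, y, chosen))
print("score of full (redundant) subset:", l0_score(X, y, full))
```

Because any two of the three columns span the same space, every two-covariate subset fits equally well, and the full subset gains nothing in fit while paying for an extra covariate. Which pair is reported is therefore arbitrary. Exhaustive enumeration is used only because p = 3 here; it does not scale to the high-dimensional settings the abstract describes.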
In addition to establishing the variable selection methodology in the high-dimensional linear model setting, a forthcoming second paper modifies the procedure for use in the vector autoregressive time series setting. Assuming a true sparse coefficient matrix, we are able to prove that, under typical conditions, the generalized fiducial probability of the true model converges to 1 in probability, even when the number of unknown parameters exceeds the number of observations of the time series.