STOR Colloquium: Li Ma, Duke University
8 Apr @ 3:30 pm - 4:30 pm
Generative modeling with trees and recursive partitions
Trees and recursive partitions are best known in supervised learning for predictive tasks such as regression and classification; famous examples include CART and its ensemble variants, such as random forests and boosting. A natural question is whether these successes can be replicated in unsupervised problems, and in particular for the task of generative modeling. I present a recent example of a tree-based approach to generative modeling, where the primary objective is to learn complex multivariate, possibly high-dimensional distributions from unlabeled i.i.d. training data and to allow effective sampling from the trained distribution. The approach addresses density learning and generative modeling in multivariate sample spaces using an additive ensemble of tree-based density learners. The use of trees and partitions leads to highly efficient, statistically rigorous algorithms that scale approximately linearly in the sample size. The resulting model can be trained quickly, even in real time on a single computer, and achieves competitive performance against neural network-based approaches in some application contexts. The method is a counterpart of supervised tree boosting and preserves many desirable properties of its supervised cousin. It is also closely connected to normalizing flows and diffusion models, which generate samples from a target distribution through sequential transforms of noise samples.
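To make the core idea concrete, here is a minimal illustrative sketch of a single tree-based density learner: recursively partition the sample space, estimate a piecewise-constant density on the leaf cells, and sample by descending the tree in proportion to leaf mass. This is a toy single-tree histogram, not Ma's actual algorithm (which uses an additive ensemble analogous to boosting); all function names, the midpoint splitting rule, and the `max_pts` stopping threshold are hypothetical choices for illustration.

```python
import random

class Node:
    """One cell of the recursive partition."""
    def __init__(self, bounds, mass):
        self.bounds = bounds          # [(lo, hi), ...] per dimension
        self.mass = mass              # fraction of training points in this cell
        self.dim = None               # split dimension (internal nodes only)
        self.split = None             # split location (internal nodes only)
        self.left = self.right = None

def fit(points, bounds, n_total, max_pts=20, max_depth=12):
    """Recursively halve the widest dimension while a cell is crowded."""
    node = Node(bounds, len(points) / n_total)
    if len(points) <= max_pts or max_depth == 0:
        return node
    d = max(range(len(bounds)), key=lambda j: bounds[j][1] - bounds[j][0])
    lo, hi = bounds[d]
    mid = (lo + hi) / 2.0
    node.dim, node.split = d, mid
    lb, rb = list(bounds), list(bounds)
    lb[d], rb[d] = (lo, mid), (mid, hi)
    node.left = fit([p for p in points if p[d] < mid],
                    lb, n_total, max_pts, max_depth - 1)
    node.right = fit([p for p in points if p[d] >= mid],
                     rb, n_total, max_pts, max_depth - 1)
    return node

def density(node, x):
    """Piecewise-constant estimate: leaf mass divided by leaf volume."""
    while node.left is not None:
        node = node.left if x[node.dim] < node.split else node.right
    vol = 1.0
    for lo, hi in node.bounds:
        vol *= hi - lo
    return node.mass / vol

def sample(node, rng=random):
    """Descend to a leaf with probability proportional to its mass,
    then draw a point uniformly inside that leaf's cell."""
    while node.left is not None:
        total = node.left.mass + node.right.mass
        node = node.left if rng.random() * total < node.left.mass else node.right
    return [rng.uniform(lo, hi) for lo, hi in node.bounds]
```

Each training pass touches every point once per tree level, which is where the approximately linear scaling in sample size comes from; the "sequential transforms on noise samples" view mentioned in the abstract corresponds, loosely, to the uniform draw being pushed through the partition structure.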