Grad Student Seminar: Jianyu Liu and Zhengling Qi
Joint Skeleton Estimation of Multiple Directed Acyclic Graphs for Heterogeneous Population
The directed acyclic graph (DAG) is a powerful tool to model the interactions of high-dimensional variables. While estimating edge directions in a DAG often requires interventional data, one can estimate the skeleton of a DAG (i.e., the undirected graph formed by removing the direction of each edge in a DAG) using observational data. In real data analyses, the samples of the high-dimensional variables may be collected from a mixture of multiple populations. Each population has its own DAG, while the DAGs across populations may have significant overlap. In this paper, we propose a two-step approach to jointly estimate the DAG skeletons of multiple populations, whether or not the population origin of each sample is labeled. In particular, our method allows a probabilistic soft label for each sample, which can be easily computed and often leads to more accurate skeleton estimation than hard labels. Compared with separate estimation of skeletons for each population, our method is more accurate and robust to labeling errors. Simulation studies are performed to demonstrate the performance of the new method. Finally, we apply our method to analyze gene expression data from breast cancer patients of multiple cancer subtypes.
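To make the notion of a skeleton concrete, here is a minimal Python sketch that drops edge directions from a DAG's adjacency matrix and counts the undirected edges two skeletons share. It only illustrates the definition and is not the two-step estimator from the talk. The function name `dag_skeleton`, the convention that `adj[i, j] != 0` means an edge from `i` to `j`, and both example graphs are assumptions made for illustration.

```python
import numpy as np


def dag_skeleton(adj: np.ndarray) -> np.ndarray:
    """Return the symmetric 0/1 adjacency matrix of a DAG's skeleton.

    Assumed convention: adj[i, j] != 0 means a directed edge i -> j.
    """
    a = np.asarray(adj) != 0
    # An undirected edge i - j exists if either i -> j or j -> i is present.
    return (a | a.T).astype(int)


# Hypothetical DAG for population A: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
dag_a = np.array([
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
])

# Hypothetical DAG for population B: 0 -> 1, 2 -> 0, 1 -> 3
# (the edge between 0 and 2 points the other way than in population A)
dag_b = np.array([
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
])

skel_a = dag_skeleton(dag_a)
skel_b = dag_skeleton(dag_b)

# Undirected edges present in both skeletons; each edge appears twice
# in a symmetric matrix, hence the division by 2.
shared = skel_a & skel_b
n_shared_edges = int(shared.sum()) // 2
print(n_shared_edges)
```

In this toy example the two DAGs disagree on the direction of the edge between nodes 0 and 2, yet that edge still counts as shared once directions are removed. This is the kind of overlap across populations that the abstract refers to.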
Multi-armed Angle-based Direct learning for Estimating Optimal Individual Treatment Rules with Treatment Scores
Estimating an optimal individual treatment rule (ITR) based on patients’ information is an important problem in personalized medicine. An optimal ITR is a decision function that optimizes expected clinical outcomes. Many existing methods in the literature are designed for binary treatment settings with a continuous outcome of interest. Much less work has been done on estimating interpretable optimal ITRs in multiple-treatment settings. In this paper, we propose Angle-based Direct Learning (AD-learning) to efficiently estimate optimal ITRs with multiple treatments. Our proposed method can be applied to various types of outcomes, such as continuous, survival, or binary outcomes. Moreover, it has an interesting geometric interpretation of the effect of different treatments for each individual patient, which can help doctors and patients make better decisions. Finite sample error bounds have been established to provide a theoretical guarantee for AD-learning. Finally, we demonstrate the superior performance of our method via an extensive simulation study and real data applications.
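The geometric idea behind angle-based methods can be sketched in a few lines of Python. The abstract does not spell out the construction, so the following rests on assumptions: it uses the regular simplex coding common in angle-based multicategory classification, where each of k treatments is a unit vector in (k-1)-dimensional space and all pairs of vectors are at the same angle, and it recommends the treatment whose vertex has the largest inner product with a patient's score vector. The names `simplex_vertices` and `recommend`, the example scores, and the larger-is-better decision rule are all hypothetical. No estimation step is shown.

```python
import numpy as np


def simplex_vertices(k: int) -> np.ndarray:
    """Return k unit vectors in R^(k-1) with equal pairwise angles.

    Row j is the vertex assigned to treatment j (assumed coding).
    """
    if k < 2:
        raise ValueError("need at least two treatments")
    w = np.zeros((k, k - 1))
    w[0] = (k - 1) ** -0.5
    c = -(1.0 + np.sqrt(k)) / (k - 1) ** 1.5
    d = np.sqrt(k / (k - 1))
    for j in range(1, k):
        w[j] = c
        w[j, j - 1] += d
    return w


def recommend(scores: np.ndarray, vertices: np.ndarray) -> np.ndarray:
    """Pick, per patient, the treatment whose vertex best aligns with the score.

    scores has shape (n_patients, k - 1); larger inner product is assumed better.
    """
    return np.argmax(scores @ vertices.T, axis=1)


k = 3
W = simplex_vertices(k)

# Gram matrix: ones on the diagonal, -1 / (k - 1) everywhere else.
gram = W @ W.T

# Hypothetical score vectors for two patients, pointing toward
# treatment 2 and treatment 0 respectively.
scores = 2.0 * W[[2, 0]]
choices = recommend(scores, W)
print(choices)
```

Because every vertex has unit length and the same angle to every other vertex, no treatment is favored by the coding itself. The angle between a patient's score vector and each vertex then gives a direct picture of how the treatments compare for that patient.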