  • This event has passed.

Colloquium: Ye Tian (Columbia)

13 Jan @ 3:30 pm - 4:30 pm

Title: Transfer and Multi-task Learning: Statistical Insights for Modern Data Challenges

Abstract: Knowledge transfer, a core human ability, has inspired numerous data integration methods in machine learning and statistics. However, data integration faces significant challenges: (1) unknown similarity between data sources; (2) data contamination; (3) high dimensionality; and (4) privacy constraints.

This talk addresses these challenges in three parts across different contexts, presenting both innovative statistical methodologies and theoretical insights.

In Part I, I will introduce a transfer learning framework for high-dimensional generalized linear models that combines a pre-trained Lasso with a fine-tuning step. We provide theoretical guarantees for both estimation and inference, and apply the methods to predict county-level outcomes of the 2020 U.S. presidential election, uncovering valuable insights.

In Part II, I will explore an unsupervised learning setting where task-specific data is generated from a mixture model with heterogeneous mixture proportions. This complements the supervised learning setting discussed in Part I, addressing scenarios where labeled data is unavailable. We propose a federated gradient EM algorithm that is communication-efficient and privacy-preserving, and we provide estimation error bounds for the mixture model parameters.

In Part III, I will introduce a representation-based multi-task learning framework that generalizes the distance-based similarity notion discussed in Parts I and II. This framework is closely related to modern applications of fine-tuning in image classification and natural language processing. I will discuss how this study enhances our understanding of the effectiveness of fine-tuning and the influence of data contamination on representation-based multi-task learning.

Finally, I will summarize the talk and briefly introduce my broader research interests. The three main sections of this talk are based on a series of papers [TF23, TWXF22, TWF24, TGF23] and a short course I co-taught at NESS 2024 [STL24]. More about me and my research can be found at https://yet123.com.

[TF23] Tian, Y., & Feng, Y. (2023). Transfer Learning under High-dimensional Generalized Linear Models. Journal of the American Statistical Association, 118(544), 2684-2697.
[TWXF22] Tian, Y., Weng, H., Xia, L., & Feng, Y. (2022). Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models. arXiv preprint arXiv:2209.15224.
[TWF24] Tian, Y., Weng, H., & Feng, Y. (2024). Towards the Theory of Unsupervised Federated Learning: Non-asymptotic Analysis of Federated EM Algorithms. ICML 2024.
[TGF23] Tian, Y., Gu, Y., & Feng, Y. (2023). Learning from Similar Linear Representations: Adaptivity, Minimaxity, and Robustness. arXiv preprint arXiv:2303.17765.
[STL24] A (Selective) Introduction to the Statistics Foundations of Transfer Learning. (2024).

Details

Date:
13 Jan
Time:
3:30 pm – 4:30 pm

Venue

120 Hanes Hall
Hanes Hall, Chapel Hill, NC, 27599, United States

Organizer

Department of Statistics & Operations Research
