318 Hanes Hall, CB #3260 Chapel Hill, NC 27599-3260
919-962-1329

STOR Colloquium: Jacob Bien, USC

November 2, 2020 @ 4:00 pm - 5:15 pm

Jacob Bien
University of Southern California


Tree-Based Aggregation of Rare Features for Prediction

It is common in modern prediction problems for many features to be counts of rarely occurring events. The challenge posed by such “rare features” has received little attention despite its prevalence in diverse areas, ranging from biology (e.g., rare species within a microbiome) to natural language processing (e.g., rare words within an online hotel review). We show, both theoretically and empirically, that failing to account explicitly for the rareness of features can greatly reduce the effectiveness of an analysis. We then propose a framework for aggregating rare features into denser features in a flexible manner that creates better predictors of the response. Applications to the microbiome and to online hotel reviews show that our methodology is useful in a wide range of contexts.

  • This event has passed.

Details

Date: November 2, 2020
Time: 4:00 pm - 5:15 pm

Venue

Hanes Hall
Chapel Hill, NC 27599 United States