318 Hanes Hall, CB #3260 Chapel Hill, NC 27599-3260
919-962-1329

Ph.D. Defense: Weiwei Li

March 19, 2020 @ 12:30 pm - 2:30 pm

Ph.D. Thesis Defense

Public Presentation
Thursday, March 19th, 2020

12:30 PM

Location: Virtual

www.zoom.us/join

Meeting ID: 604 122 248

Password: 244139

Weiwei Li

  

Data Science Methods with Applications to Genetic Sequencing

 

Data science methods are of increasing importance in modern genetic sequencing analysis. This dissertation focuses on applying statistical modeling to the structural variant detection problem and on a new framework for scalable and provable subspace clustering.

In the first project, we discuss the optimal sampling strategy for structural variant detection using optical mapping. We develop an optimization approach based on a simple yet realistic model of the genomic mapping process, built on a hypergeometric distribution and probabilistic concentration inequalities.
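As a rough illustration of the kind of hypergeometric calculation such a sampling model might involve, the sketch below computes the chance that a sample drawn without replacement contains at least a given number of "informative" items, and searches for the smallest sample size reaching a target probability. The setup (a pool of N molecules of which K span a variant) and the function names are assumptions made for illustration only; the abstract does not specify the model, and this is not the method presented in the dissertation.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k) when n items are drawn without replacement from N items,
    K of which are 'informative' (hypothetical: molecules spanning a variant)."""
    # Outside the support of the distribution the probability is zero.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def prob_at_least(k_min, N, K, n):
    """P(X >= k_min) for the same hypergeometric setup."""
    return sum(hypergeom_pmf(k, N, K, n) for k in range(k_min, min(n, K) + 1))


def smallest_sample_size(N, K, k_min, target):
    """Smallest n with P(X >= k_min) >= target, or None if no n <= N works.
    A brute-force search, used here only to illustrate the idea."""
    for n in range(1, N + 1):
        if prob_at_least(k_min, N, K, n) >= target:
            return n
    return None


# Toy numbers (not from the dissertation): 10 molecules, 5 informative,
# at least 1 informative molecule wanted with probability 0.9.
n_needed = smallest_sample_size(N=10, K=5, k_min=1, target=0.9)
```

For realistic pool sizes, a closed-form bound from a concentration inequality would replace the brute-force search, which is presumably where such inequalities enter the analysis.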

In the second project, we introduce a formal probabilistic model for assessing how well an optical read maps to a reference genome. We use this approach to infer the most likely location within that reference for any given read, as well as the likelihood of mapping to all other possible locations. Using data produced by BioNano Saphyr to parameterize a simulation, we show that our approach accurately identifies the likeliest locations of the observed optical read data.

If time permits, the third part introduces a new algorithm for subspace clustering, in which a collection of points in a high-dimensional space is modeled as a union of low-dimensional subspaces. In particular, we propose a highly scalable sampling-based algorithm that clusters the entire dataset by first applying spectral clustering to a small random sample and then classifying the remaining out-of-sample points. Numerical results indicate that the algorithm outperforms other state-of-the-art subspace clustering algorithms in both accuracy and speed.



Details

Date:
March 19, 2020
Time:
12:30 pm - 2:30 pm

Venue

Hanes Hall