Dissertation Defense: Robust, Scalable, and Provable Approaches to High Dimensional Unsupervised Learning
January 8, 2018 @ 10:30 AM - 12:30 PM
Announcing the Final Examination of Mostafa Rahmani for the degree of Doctor of Philosophy
This doctoral thesis focuses on three popular unsupervised learning problems: subspace clustering, robust PCA, and column sampling. For the subspace clustering problem, a new transformative idea is presented. The proposed approach, termed Innovation Pursuit, is a new geometrical solution to the subspace clustering problem whereby subspaces are identified based on their relative novelties. A detailed mathematical analysis is provided establishing sufficient conditions for the proposed method to correctly cluster the data. The numerical simulations with both real and synthetic data demonstrate that Innovation Pursuit notably outperforms the state-of-the-art subspace clustering algorithms. For the robust PCA problem, we focus on both the outlier detection and the matrix decomposition problems. A new outlier detection algorithm, termed Coherence Pursuit, is presented in addition to three scalable randomized frameworks for high dimensional robust PCA. The Coherence Pursuit method is the first provable and non-iterative robust PCA method which is provably robust to both unstructured and structured outliers. Coherence Pursuit is remarkably simple and it notably outperforms the existing methods in dealing with structured outliers. In the proposed randomized designs, we leverage the low dimensional structure of the low rank component to apply the robust PCA algorithm to a random sketch of the data as opposed to the full scale data. Importantly, it is analytically shown that the presented randomized designs can make the computation and sample complexities of the low rank matrix recovery algorithm independent of the size of the data. At the end, we focus on the column sampling problem. A new sampling tool, dubbed Spatial Random Sampling, is presented which performs the random sampling in the spatial domain. The most compelling feature of Spatial Random Sampling is that it is the first unsupervised column sampling method which preserves the spatial distribution of the data.
Committee in Charge: George Atia (Chair), Azadeh Vosoughi, Marianna Pensky, Wasfy B. Mikhael, Zuhair Nashed