Clustering and dimensionality reduction are two of the most common approaches to analysing high-dimensional gene expression data in genomics. In clustering, the goal is to find subpopulations of cells or patients that exhibit similar features. In dimensionality reduction, by contrast, the goal is to compress features that are highly correlated. Combining two fundamental unsupervised methods, factor analysis and Gaussian mixture models, yields a statistical method that simultaneously performs clustering and, within each cluster, local dimensionality reduction.
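
The idea of clustering plus per-cluster dimensionality reduction can be illustrated with a rough two-stage sketch. It is not the joint model described above (often called a mixture of factor analysers), which fits both parts together. As a simplifying assumption, it first clusters with scikit-learn's `GaussianMixture` and then fits a separate `FactorAnalysis` inside each cluster. The synthetic data, the helper name `cluster_then_reduce`, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.mixture import GaussianMixture


def cluster_then_reduce(X, n_clusters=2, n_factors=2, seed=0):
    """Two-stage stand-in (an assumption, not the joint model):
    cluster with a GMM, then fit one factor analysis per cluster."""
    gmm = GaussianMixture(
        n_components=n_clusters, covariance_type="diag", random_state=seed
    )
    labels = gmm.fit_predict(X)

    models = {}
    for k in np.unique(labels):
        # Each cluster gets its own loadings, i.e. a local low-dimensional view.
        fa = FactorAnalysis(n_components=n_factors, random_state=seed)
        models[int(k)] = fa.fit(X[labels == k])
    return labels, models


# Hypothetical synthetic "expression" matrix: rows are samples, columns are genes.
# Two well-separated groups, each driven by its own small set of latent factors.
rng = np.random.default_rng(0)
n_per_cluster, n_genes, n_factors = 100, 20, 2
blocks = []
for centre in (-5.0, 5.0):
    loadings = rng.normal(size=(n_factors, n_genes))
    scores = rng.normal(size=(n_per_cluster, n_factors))
    noise = 0.1 * rng.normal(size=(n_per_cluster, n_genes))
    blocks.append(centre + scores @ loadings + noise)
X = np.vstack(blocks)

labels, models = cluster_then_reduce(X, n_clusters=2, n_factors=n_factors)

# Project every sample onto the factors of the cluster it was assigned to.
local_scores = {k: fa.transform(X[labels == k]) for k, fa in models.items()}
for k, z in local_scores.items():
    print(f"cluster {k}: {z.shape[0]} samples reduced to {z.shape[1]} factors")
```

A joint fit would instead estimate cluster memberships and factor loadings together, typically with an EM algorithm, so that the cluster assignments can take the low-rank covariance structure into account. The two-stage version here only conveys the shape of the output: one label per sample and one set of loadings per cluster.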
BSc in Computer Science, 2022
Goethe-Universität Frankfurt am Main