Probabilistic latent variable models
Latent variable models (LVMs) are a statistical tool for inferring the unobserved, hidden state of a complex (e.g. biological) system from observable, often high-dimensional data. To this end, a high-dimensional dataset of correlated observations is reduced to a low-dimensional set of uncorrelated and interpretable latent variables. Probabilistic approaches offer a principled way to disentangle distinct sources of variation and to explicitly model dependencies between features as well as between samples.
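As a concrete illustration of a probabilistic LVM, the sketch below fits probabilistic PCA (Tipping & Bishop's maximum-likelihood solution via eigendecomposition) to simulated data in which ten correlated features are driven by two latent factors. All variable names and the simulated data are illustrative assumptions, not taken from the work summarised here.

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulate 200 samples of 10 correlated features driven by 2 latent factors
Z = rng.normal(size=(200, 2))
W = rng.normal(size=(2, 10))
Y = Z @ W + 0.1 * rng.normal(size=(200, 10))

# Probabilistic PCA: ML solution from the eigendecomposition of the
# sample covariance (Tipping & Bishop, 1999)
q = 2
Yc = Y - Y.mean(axis=0)
S = Yc.T @ Yc / Yc.shape[0]            # sample covariance of features
vals, vecs = np.linalg.eigh(S)         # eigenvalues in ascending order
vals, vecs = vals[::-1], vecs[:, ::-1] # reorder to descending
sigma2 = vals[q:].mean()               # ML estimate of noise variance
W_ml = vecs[:, :q] * np.sqrt(np.maximum(vals[:q] - sigma2, 0.0))

# Posterior mean of the latent variables for each sample
M = W_ml.T @ W_ml + sigma2 * np.eye(q)
Z_post = Yc @ W_ml @ np.linalg.inv(M)
```

The recovered `Z_post` is a low-dimensional, decorrelated summary of the data (up to rotation), which is exactly the reduction described above.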
Accounting for dependencies between genes in LVMs
- Standard latent variable models only model dependencies between samples
- Can we make dependencies between features (genes) explicit?
- Use framework of Gaussian Process Latent Variable Models (GP-LVM)
- Probabilistic kernel PCA via GP regression with unobserved input
- Introduce kernel to model covariance between genes
- Learn latent variables for genes and samples and connect them via a Kronecker product
- Apply to matrix completion tasks
Reference: Yang & Buettner, UAI 2021 (in revision)
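The Kronecker construction above can be sketched in a few lines: an RBF kernel on latent inputs for samples and another on latent inputs for genes are combined into a single covariance over the vectorised data matrix, which then supports matrix completion by Gaussian conditioning. This is a minimal illustration of the general idea, not the method of Yang & Buettner; all latent inputs, kernel choices, and the noise level are assumptions.

```python
import numpy as np

def rbf(X, lengthscale=1.0):
    # Squared-exponential kernel on latent inputs X of shape (n, q)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(1)
Xs = rng.normal(size=(5, 2))   # latent variables for 5 samples
Xg = rng.normal(size=(4, 2))   # latent variables for 4 genes

Ks = rbf(Xs)                   # sample-by-sample covariance
Kg = rbf(Xg)                   # gene-by-gene covariance
K = np.kron(Kg, Ks)            # covariance of vec(Y), shape (20, 20)

# Matrix completion: draw a data vector from the model, hide some
# entries, and predict them by conditioning the joint Gaussian
noise = 1e-2
y = rng.multivariate_normal(np.zeros(K.shape[0]), K + noise * np.eye(K.shape[0]))
obs = np.ones(K.shape[0], dtype=bool)
obs[[3, 7, 12]] = False        # three "missing" entries
K_oo = K[np.ix_(obs, obs)] + noise * np.eye(obs.sum())
K_mo = K[np.ix_(~obs, obs)]
y_missing_pred = K_mo @ np.linalg.solve(K_oo, y[obs])
```

Because the covariance factorises over genes and samples, the Kronecker structure can also be exploited for efficient inference rather than forming `K` explicitly, which matters at genome scale.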
Hierarchical autoencoders for Domain Generalisation
- Learn a VAE to disentangle domain-specific information from class-specific information and residual variance
- Place Dirichlet prior on domain representation
- Learn “topics” that describe domain structure in unsupervised manner
- Interpretable model for unsupervised domain generalisation
Reference: Sun & Buettner, ICLR Workshop on robustML, 2021
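A common way to place a Dirichlet prior on a VAE latent while keeping Gaussian reparameterisation is the softmax-Laplace approximation of the Dirichlet (used e.g. in Srivastava & Sutton's ProdLDA). The sketch below shows this trick for a "topic"-style domain representation; it is an assumption about the mechanics, not necessarily the construction used by Sun & Buettner, and all parameter values are illustrative.

```python
import numpy as np

# Laplace approximation of Dirichlet(alpha) in softmax space:
# a Gaussian over logits whose softmax approximately follows the Dirichlet.
K = 4                                     # number of domain "topics"
alpha = np.full(K, 0.5)                   # sparse prior over topics
mu = np.log(alpha) - np.log(alpha).mean()
var = (1.0 / alpha) * (1 - 2.0 / K) + (1.0 / K**2) * (1.0 / alpha).sum()

# Reparameterised samples of topic proportions, as a VAE decoder would use
rng = np.random.default_rng(2)
eps = rng.normal(size=(3, K))             # standard-normal noise
logits = mu + np.sqrt(var) * eps
theta = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
```

Each row of `theta` lies on the simplex, so the learned topics admit the usual mixed-membership interpretation of the domain structure.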