From Causal Inference to Autoencoders and Gene Regulation
Recent progress in genomics makes it possible to perform perturbation experiments at a very large scale. This motivates the development of a causal inference framework based on both observational and interventional data. We characterize the causal relationships that are identifiable and present the first provably consistent algorithm for learning a causal network from such data. We then couple gene expression with the 3D organization of the genome, discussing approaches for integrating different data modalities, such as sequencing or imaging, via autoencoders. We end with a theoretical analysis of autoencoders that links overparameterization to memorization. In particular, we show that overparameterized autoencoders trained using standard optimization methods implement associative memory and provide a mechanism for memorization and retrieval of real-valued data.
Caroline Uhler recently joined ETH Zürich as Full Professor of Machine Learning, Statistics and Genomics. Prior to joining ETH Zürich, she was Associate Professor at MIT. She holds a PhD in statistics from UC Berkeley, spent a semester in the “Big Data” program at the Simons Institute at UC Berkeley, held postdoctoral positions at the IMA and at ETH Zürich, and was an assistant professor at IST Austria for 3 years. Her research focuses in particular on graphical models, causal inference, algebraic statistics and applications to genomics, for example linking the spatial organization of DNA with gene regulation.