Palmer Commons, Forum Hall
"Integrative computational approaches for modeling complex disease biology across scales and systems"
Abstract
To effectively model the molecular underpinnings of complex traits and diseases, computational methods must integrate diverse data types, handle partial or limited observations, and remain robust to variations in dataset size. In this talk, I will present several recent methods developed to address these challenges across diverse studies, assay types, and organisms, leveraging novel statistical and machine learning approaches.
First, I will introduce ALPINE, an NMF-based framework that disentangles the influence of technical and non-relevant phenotypic factors in single-cell transcriptomic data, enabling the integration of multiple studies.
Next, turning to integration across data types, I will discuss seismic, our method that combines genome-wide association studies with single-cell RNA sequencing to prioritize disease-relevant cell types, linking genetic variation to cellular function.
Finally, I will discuss ETNA, a machine translation-inspired approach that embeds protein-protein interaction networks from different organisms into a shared space, facilitating cross-species functional comparisons.
Together, these methods highlight how diverse data sources can be integrated across molecular, cellular, and organism levels to better model complex disease biology.

Assistant Professor of Computer Science at Rice University
Vicky Yao is an Assistant Professor in the Department of Computer Science at Rice University. She was a postdoctoral fellow at the Lewis-Sigler Institute for Integrative Genomics and received her PhD from the Department of Computer Science at Princeton University, where she was advised by Olga Troyanskaya.
Her research focuses on computational biology, where she develops machine learning and statistical methods to improve our understanding of the biological circuitry that underlies living organisms and how its dysregulation may lead to disease. More specifically, she has worked on modeling tissue and cell type specificity as well as disease progression, both by developing general methods (such as semi-supervised network integration) and by applying them to decipher the molecular underpinnings of diseases such as Alzheimer’s, Parkinson’s, and rheumatoid arthritis.
An important facet of her research is building intuitive, interactive systems as interfaces to the models and predictions that she develops, and she has built such systems whenever appropriate.

Professor