Palmer Commons, Forum Hall
"Permutation enhances the rigor of single-cell data analysis"
Abstract
Ensuring the reliability and accuracy of single-cell data analysis is critical, particularly in visualizing complex biological structures and addressing data sparsity. This talk introduces two novel statistical methods—scDEED and mcRigor—that leverage permutation-based techniques to enhance the rigor of these analyses.
scDEED ([Xia et al., 2024, Nature Communications](https://www.nature.com/articles/s41467-024-45891-y)) addresses the challenge of evaluating the reliability of two-dimensional (2D) embeddings produced by visualization methods like t-SNE and UMAP, which are commonly used to visualize cell clusters. These methods can misrepresent data structure, leading to erroneous interpretations. scDEED calculates a reliability score for each cell embedding by measuring how consistent a cell's neighbors in the 2D embedding space are with its pre-embedding neighbors. Cells with low reliability scores are flagged as dubious, while those with high scores are deemed trustworthy. Additionally, scDEED provides guidance for optimizing t-SNE and UMAP hyperparameters by minimizing the number of dubious embeddings, which improves visualization reliability across multiple datasets.
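The idea of scoring each cell against a permutation null can be illustrated with a small sketch. This is not the scDEED implementation: the neighbor-overlap score, the PCA stand-in for t-SNE/UMAP, the 5% cutoff, and all function names (`knn`, `neighbor_overlap`, `pca_2d`, `flag_dubious`) are assumptions made here for illustration only.

```python
import numpy as np

def knn(X, k):
    """Indices of each row's k nearest neighbors (Euclidean), excluding itself."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # a cell is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]

def neighbor_overlap(pre, emb, k=10):
    """Hypothetical reliability score: fraction of a cell's k pre-embedding
    neighbors that are also among its k embedding neighbors."""
    a, b = knn(pre, k), knn(emb, k)
    return np.array([len(set(r) & set(s)) / k for r, s in zip(a, b)])

def pca_2d(X):
    """Stand-in 2D embedding (assumption): top two principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T

def flag_dubious(pre, embed=pca_2d, k=10, alpha=0.05, seed=0):
    """Score each cell, then compare against scores from permuted data."""
    rng = np.random.default_rng(seed)
    observed = neighbor_overlap(pre, embed(pre), k)
    # Null data: shuffle every feature independently across cells,
    # which destroys cell-cell structure but keeps each feature's marginal.
    permuted = np.column_stack([rng.permutation(col) for col in pre.T])
    null = neighbor_overlap(permuted, embed(permuted), k)
    threshold = np.quantile(null, alpha)  # assumed cutoff, not from the abstract
    return observed, observed < threshold
```

Under these assumptions, `flag_dubious(pre)` takes a cells-by-features array and returns the per-cell scores together with a boolean mask of cells whose score falls below the lower tail of the permutation null. Swapping `embed` for a t-SNE or UMAP call and counting flagged cells across hyperparameter settings would mirror the tuning use described above.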
mcRigor focuses on metacell partitioning in single-cell RNA-seq and ATAC-seq data analysis, a common strategy that addresses data sparsity by aggregating similar single cells into metacells. Existing algorithms often fail to verify metacell homogeneity, risking bias and spurious findings. mcRigor introduces a feature-correlation-based statistic to measure heterogeneity within a metacell, identifying dubious metacells composed of heterogeneous single cells. By optimizing the hyperparameters of metacell partitioning algorithms, mcRigor enhances the reliability of downstream analyses. Moreover, mcRigor allows for benchmarking and selecting the most suitable partitioning algorithm for a dataset, supporting more robust discoveries.
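A toy version of a feature-correlation check within one metacell might look like the following. It is a sketch under stated assumptions, not mcRigor's published statistic: the choice of summed squared correlations, the within-metacell permutation null, and the names `offdiag_corr_energy` and `heterogeneity_ratio` are all hypothetical.

```python
import numpy as np

def offdiag_corr_energy(M):
    """Sum of squared off-diagonal feature-feature Pearson correlations
    for a cells-by-features matrix M."""
    C = np.nan_to_num(np.corrcoef(M, rowvar=False))  # constant features -> 0
    return (C ** 2).sum() - (np.diag(C) ** 2).sum()

def heterogeneity_ratio(M, n_perm=20, seed=0):
    """Observed correlation energy divided by its average under permutation.

    Assumption: in a homogeneous metacell, features vary roughly
    independently across cells, so the ratio stays near 1; a mixture of
    distinct cell states makes features co-vary and inflates the ratio.
    """
    rng = np.random.default_rng(seed)
    observed = offdiag_corr_energy(M)
    null = [
        # Shuffle each feature independently across the metacell's cells.
        offdiag_corr_energy(np.column_stack([rng.permutation(c) for c in M.T]))
        for _ in range(n_perm)
    ]
    return observed / np.mean(null)
```

In this sketch, a metacell whose ratio is far above 1 would be a candidate for the "dubious" label, and the fraction of such metacells could be compared across partitioning algorithms or hyperparameter settings.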
scDEED and mcRigor demonstrate the power of permutation-based approaches in refining single-cell data analysis, providing researchers with tools to achieve more accurate and reproducible insights into complex cellular processes.
Professor of Statistics (primary), Department of Human Genetics
Professor of Biomathematics (secondary)
University of California, Los Angeles
Jingyi Jessica Li (李婧翌) is a Professor in the Department of Statistics (primary) and in the Department of Human Genetics and the Department of Biomathematics (secondary) at the University of California, Los Angeles. She is also a faculty member in the Interdepartmental Ph.D. Program in Bioinformatics and a member of the Jonsson Comprehensive Cancer Center (JCCC) Gene Regulation Research Program Area. Prior to joining UCLA, Jessica obtained her Ph.D. from the Interdepartmental Group in Biostatistics at the University of California, Berkeley, where she worked with Profs. Peter J. Bickel and Haiyan Huang. Jessica received her B.S. (summa cum laude) from the Department of Biological Sciences and Technology at Tsinghua University, China, in 2007.