Finding biological signal in single-cell RNA-seq data
Single-cell RNA sequencing (scRNA-seq) is a technology that measures gene expression in individual cells. It has brought the study of the transcriptome to a higher resolution and lets scientists answer biological questions with a clarity that was not possible before. I will share our methods development for finding signal unique to scRNA-seq data. The first goal is to identify how the expression of individual genes is regulated. We present a method, SC2P, that identifies the phase of expression a gene is in by taking into account both cell- and gene-specific contexts, in a model-based and data-driven fashion. We then identify two forms of transcription regulation: phase transition and magnitude tuning. We demonstrate that, compared with existing methods, SC2P provides substantial improvement in sensitivity without sacrificing the control of false discovery, and with better robustness. The second goal is to identify biological topics by adapting the Latent Dirichlet Allocation (LDA) model. By treating cells as documents and genes as words, we use concepts from natural language processing to identify latent biological topics in cells. We find that topic-level summaries provide a robust dimension reduction that can improve cell clustering and classification, and that the topic profiles provide biological insight into gene networks.
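To make the "cells as documents, genes as words" analogy concrete, here is a minimal sketch of a standard collapsed Gibbs sampler for plain LDA applied to a cells-by-genes count matrix. It is not the adapted model described in the abstract, whose details are not given here. The function name `fit_lda_gibbs`, the hyperparameter values, and the toy count matrix are all assumptions made up for illustration.

```python
import random


def fit_lda_gibbs(counts, n_topics, n_iter=300, alpha=0.5, beta=0.1, seed=0):
    """Plain LDA via collapsed Gibbs sampling (illustrative sketch only).

    counts[c][g] is the integer count of gene g in cell c.
    Returns (cell_topic, topic_gene), each with rows that sum to 1.
    alpha and beta are symmetric Dirichlet priors (assumed values).
    """
    rng = random.Random(seed)
    n_cells, n_genes = len(counts), len(counts[0])

    # Each observed transcript is one "word" token in its cell's "document".
    tokens = [(c, g)
              for c in range(n_cells)
              for g in range(n_genes)
              for _ in range(counts[c][g])]
    # Random initial topic assignment for every token.
    z = [rng.randrange(n_topics) for _ in tokens]

    n_ck = [[0] * n_topics for _ in range(n_cells)]   # tokens per cell/topic
    n_kg = [[0] * n_genes for _ in range(n_topics)]   # tokens per topic/gene
    n_k = [0] * n_topics                              # tokens per topic
    for (c, g), k in zip(tokens, z):
        n_ck[c][k] += 1
        n_kg[k][g] += 1
        n_k[k] += 1

    for _ in range(n_iter):
        for i, (c, g) in enumerate(tokens):
            # Remove the token's current assignment from the counts.
            k = z[i]
            n_ck[c][k] -= 1
            n_kg[k][g] -= 1
            n_k[k] -= 1
            # Conditional probability of each topic given all other tokens.
            weights = [
                (n_ck[c][t] + alpha) * (n_kg[t][g] + beta)
                / (n_k[t] + n_genes * beta)
                for t in range(n_topics)
            ]
            k = rng.choices(range(n_topics), weights=weights)[0]
            z[i] = k
            n_ck[c][k] += 1
            n_kg[k][g] += 1
            n_k[k] += 1

    # Smoothed estimates from the final sample.
    cell_topic = [
        [(n_ck[c][k] + alpha) / (sum(n_ck[c]) + n_topics * alpha)
         for k in range(n_topics)]
        for c in range(n_cells)
    ]
    topic_gene = [
        [(n_kg[k][g] + beta) / (n_k[k] + n_genes * beta)
         for g in range(n_genes)]
        for k in range(n_topics)
    ]
    return cell_topic, topic_gene


# Toy data (made up): cells 0-2 express genes 0-2, cells 3-5 express genes 3-5.
counts = [[5, 5, 5, 0, 0, 0]] * 3 + [[0, 0, 0, 5, 5, 5]] * 3
cell_topic, topic_gene = fit_lda_gibbs(counts, n_topics=2)

# Dominant topic per cell: a crude stand-in for a clustering step.
dominant = [max(range(2), key=lambda k: row[k]) for row in cell_topic]
print(dominant)
```

Each row of `cell_topic` is a low-dimensional summary of one cell that could feed a clustering or classification step, and each row of `topic_gene` ranks genes within a topic. Real scRNA-seq matrices are far larger and sparser than this toy example, so a practical analysis would more likely use an optimized variational or online implementation.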