Droplet-based single-cell assays, including single-cell RNA sequencing (scRNA-seq), single-nucleus RNA sequencing (snRNA-seq) and cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), generate considerable background noise counts, the hallmarks of which are nonzero counts in cell-free droplets and off-target gene expression in unexpected cell types. Such systematic background noise can lead to batch effects and spurious differential gene expression results. Here we develop a deep generative model based on the phenomenology of noise generation in droplet-based assays. The proposed model accurately distinguishes cell-containing droplets from cell-free droplets, learns the background noise profile and provides noise-free quantification in an end-to-end fashion. We implement this approach in the scalable and robust open-source software package CellBender. Analysis of simulated data demonstrates that CellBender operates near the theoretically optimal denoising limit. Extensive evaluations using real datasets and experimental benchmarks highlight enhanced concordance between droplet-based single-cell data and established gene expression patterns, while the learned background noise profile provides evidence of degraded or uncaptured cell types.
Summary
Human genome variation contributes to diversity in neurodevelopmental outcomes and vulnerabilities; recognizing the underlying molecular and cellular mechanisms will require scalable approaches. Here, we describe a “cell village” experimental platform we used to analyze genetic, molecular, and phenotypic heterogeneity across neural progenitor cells from 44 human donors cultured in a shared in vitro environment using algorithms (Dropulation and Census-seq) to assign cells and phenotypes to individual donors. Through rapid induction of human stem cell-derived neural progenitor cells, measurements of natural genetic variation, and CRISPR-Cas9 genetic perturbations, we identified a common variant that regulates antiviral IFITM3 expression and explains most inter-individual variation in susceptibility to the Zika virus. We also detected expression QTLs corresponding to GWAS loci for brain traits and discovered novel disease-relevant regulators of progenitor proliferation and differentiation such as CACHD1. This approach provides scalable ways to elucidate the effects of genes and genetic variation on cellular phenotypes.
Recent technological innovations have enabled the high-throughput quantification of gene expression and epigenetic regulation within individual cells, transforming our understanding of how complex tissues are constructed. Missing from these measurements, however, is the ability to routinely and easily spatially localise these profiled cells. We developed a strategy, Slide-tags, in which single nuclei within an intact tissue section are ‘tagged’ with spatial barcode oligonucleotides derived from DNA-barcoded beads with known positions. These tagged nuclei can then be used as input into a wide variety of single-nucleus profiling assays. Application of Slide-tags to the mouse hippocampus positioned nuclei at less than 10 micron spatial resolution, and delivered whole-transcriptome data that was indistinguishable in quality from ordinary snRNA-seq. To demonstrate that Slide-tags can be applied to a wide variety of human tissues, we performed the assay on brain, tonsil, and melanoma. We revealed cell-type-specific spatially varying gene expression across cortical layers and spatially contextualised receptor-ligand interactions driving B-cell maturation in lymphoid tissue. A major benefit of Slide-tags is that it is easily adaptable to virtually any single-cell measurement technology. As proof of principle, we performed multiomic measurements of open chromatin, RNA, and T-cell receptor sequences in the same cells from metastatic melanoma. We found that spatially distinct tumour subpopulations were differentially infiltrated by an expanded T-cell clone and were undergoing cell state transition driven by spatially clustered accessible transcription factor motifs. Slide-tags offers a universal platform for importing the compendium of established single-cell measurements into the spatial genomics repertoire.
Competing Interest Statement
E.Z.M. and F.C. are academic founders of Curio Bioscience. F.C., E.Z.M., A.J.C.R., J.A.W., N.M.N., and V.K. are listed as inventors on a patent application related to the work.