Single-cell RNA sequencing (scRNA-seq) is a high-throughput sequencing technology that profiles genome-wide gene expression at the single-cell level. It can effectively resolve the heterogeneity of cell populations and is now widely applied in fields such as developmental biology and disease research. However, scRNA-seq data are typically noisy, high-dimensional, and highly sparse, and traditional analysis methods show clear limitations when handling such data. In recent years, deep learning models, represented by autoencoders and generative adversarial networks, have been widely applied to scRNA-seq data analysis tasks, including expression imputation, batch effect correction, dimensionality reduction, cell clustering, and cell type annotation, and have demonstrated the advantages of deep learning in these tasks. Notably, Transformer-based large deep learning models use self-attention mechanisms to learn implicit dependencies among genes and the associations between gene expression and cells. They offer a new approach and direction for scRNA-seq data analysis, and provide innovative solutions with promising applications for the integrative analysis of multimodal omics data.