本篇内容接上一篇稿子:都2025年了啥单细胞数据分析还没有实验验证能发到IF10+呢?,讨论B细胞亚群中是否真的存在 Breg亚群。数据背景等可以参考上面的微信稿!
数据读取在上面的稿子中也有,这里重新看看:
###
### Create: Jianming Zeng
### Date: 2023-12-31
### Email: jmzeng1314@163.com
### Blog: http://www.bio-info-trainee.com/
### Forum: http://www.biotrainee.com/thread-1376-1-1.html
### CAFS/SUSTC/Eli Lilly/University of Macau
### Update Log: 2023-12-31 First version
### Update Log: 2024-12-09 by juan zhang (492482942@qq.com)
###
rm(list=ls())
options(stringsAsFactors = F)
library(ggsci)
library(dplyr)
library(future)
library(Seurat)
library(clustree)
library(cowplot)
library(data.table)
library(ggplot2)
library(patchwork)
library(stringr)
library(qs)
library(Matrix)
# 创建目录
getwd()
gse <- "GSE290927"
dir.create(gse)
###### step1: 导入数据 ######
samples <- list.dirs("GSE290927/", recursive = F, full.names = F)
samples
scRNAlist <- lapply(samples, function(pro){
#pro <- samples[1]
print(pro)
folder <- file.path("GSE290927/", pro)
folder
counts <- Read10X(folder, gene.column = 2)
sce <- CreateSeuratObject(counts, project=pro, min.cells = 3)
return(sce)
})
names(scRNAlist) <- samples
scRNAlist
# merge
sce.all <- merge(scRNAlist[[1]], y=scRNAlist[-1], add.cell.ids=samples)
sce.all <- JoinLayers(sce.all) # seurat v5
sce.all
library(qs)
qsave(sce.all, file="GSE290927/sce.all.qs")
然后经过质控,并降维聚类分群,harmony去批次,使用下面的基因先来注释一下大的亚群,提取出B细胞亚群后做亚群细分。
细胞对应的markers基因文献中提供了一个附件表格,如下:
cell_markers <- list(
"Epithelial cells" = c("EPCAM", "KRT18", "KRT19", "KRT8", "KRT17", "KRT15"),
"T cells" = c("CD2D", "CD3D", "CD3E", "CD3G"),
"B cells" = c("CD19", "CD79A", "CD79B", "MS4A1"),
"Myeloid cells" = c("CD33", "CD68", "CD1E", "LYZ", "LAMP3"),
"NK cells" = c("CD56", "CD16", "NKP46", "NKP30"),
"Fibroblasts" = c("COL1A1", "COL1A2", "COL3A1", "ACTA2", "FAP", "S100A4"),
"Endothelial cells" = c("PECAM1", "CD34", "CDH5", "VWF", "VEGFR", "TEK", "CD54" )
)
文献给出的列表里面有很多蛋白名字,修正一下
CD56:NCAM1
CD2D:没有对应的基因名或者蛋白名
CD16:FCGR3A
NKP46:NCR1
NKP30:NCR3
VEGFR:KDR
CD54:ICAM1
修正后的基因列表:
cell_markers <- list(
"Epithelial cells" = c("EPCAM", "KRT18", "KRT19", "KRT8", "KRT17", "KRT15"),
"T cells" = c("PTPRC", "CD3D", "CD3E", "CD3G"),
"B cells" = c("CD19", "CD79A", "CD79B", "MS4A1"),
"Myeloid cells" = c("CD33", "CD68", "CD1E", "LYZ", "LAMP3"),
"NK cells" = c("NCAM1", "FCGR3A", "NCR1", "NCR3"),
"Fibroblasts" = c("COL1A1", "COL1A2", "COL3A1", "ACTA2", "FAP", "S100A4"),
"Endothelial cells" = c("PECAM1", "CD34", "CDH5", "VWF", "KDR", "TEK", "ICAM1")
)
还辅助一些其他的marker基因,如这篇稿子中使用的:中性粒细胞的质量值到底是多低呢?
大类注释结果如下:
现在挑选出来 B细胞亚群:
rm(list=ls())
library(harmony)
library(qs)
library(Seurat)
library(clustree)
###### step3: harmony整合多个单细胞样品 ######
dir.create("4-subcluster_Bcells")
getwd()
# 读取数据
sce.all <- readRDS("3-check-by-0.3/sce.all.int.RData")
table(Idents(sce.all))
sce_sub <- subset(sce.all, idents = c("B cells") )
# 默认 ScaleData 没有添加"nCount_RNA", "nFeature_RNA"
sce_sub
print(dim(sce_sub))
head(sce_sub@meta.data)
然后重新走一下标准流程分析,开始注释!文献里面用到的B细胞marker基因如下:这些marker在用的时候有一些不是理想。
################################ 本数据marker:2025-GSE290927-Breg B细胞亚群
cell_markers <- list(
"B cells" = c("CD19", "CD79A", "CD79B"),
"Memory B cells" = c("MS4A1", "CD27", "AIM2", "TNFRSF13B", "CRIP2", "ITGB1"),
#"Naïve B cells" = c("MS4A1", "IGHD", "FCER2", "TCL1A", "IL4R"),
"Naïve B cells" = c("IGHD", "FCER2", "TCL1A", "IL4R"),
"GC B cells" = c("AICDA", "RGS13", "GCSAM"),
"Bregs" = c("IL10", "CD1D", "CD5", "TGFB1"),
"Plasma cells" = c("CD38", "SDC1", "MZB1")
)
又找了一下别的地方的marker基因,曾老板以前写的一个帖子:B细胞细分亚群。首先 B cells (form marker gene: CD79A) 可以细分成为 CD20+ B cells 和 CD138+ plasma cells。
CD20+ B cells,又是可以细分成为:
CD138+ plasma cells,可以细分成为:
背诵一些基因功能:
总结如下:
############## 这个来自曾老板的脚本中的marker,然后发现在这里来自 https://mp.weixin.qq.com/s/6JO50MSVntbVC9SYupGpzQ
# CD20 (MS4A1)表达于除plasma B 之外的所有B,很关键的区分naive 和plasma的marker
# SDC1 = CD138 plasma B (接受抗原,可表达抗体)
Bcels_markers_list <- list(
All = c('MS4A1','SDC1', 'CD38','CD19', 'CD79A'),
naive =c( 'IGHD', 'FCER2', 'TCL1A','IL4R'),
memory=c('CD27', 'AIM2', 'TNFRSF13B'),
GC_B = c('S1PI2', 'LRMP', 'SUGCT', 'MME', 'MKI67', 'AICDA'),
plasma =c('IGHA1','IGHG1','JCHAIN')
)
p <- DotPlot(sce.all.int, features = Bcels_markers_list, assay='RNA',group.by = select_idet,cols = c("grey", "red") ) +
ggtitle(paste0(select_idet, ": Bcels_markers_list")) +
xlab("") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # 更改x轴标签角度
p[["theme"]][["strip.text"]]$angle <- 90
p[["theme"]][["strip.text"]]$hjust <- 0
p
ggsave(filename = "4-subBcells_check-by-0.3/Markers_Bcels_markers_list-dotplot.pdf", plot=p, width=15, height = 8,bg="white")
注释:
第六群看marker这个地方很有意思,文献里面的 "Bregs" = c("IL10", "CD1D", "CD5", "TGFB1") 表达很微弱,然后你会发现它还大量表达T细胞marker,这个现象在微信群里每天都有人问:我的这群细胞里面为什么同时表达 B细胞marker和 T细胞marker呢?
还表达T细胞marker:
是不是非常有意思!这个问题可以做更多的探索~(我目前也无法给出一个具体的答案)
今天分享到这~