前面分别介绍过了单细胞常见的可视化方式DimPlot,FeaturePlot ,DotPlot ,VlnPlot 和 DoHeatmap的优化方式
本次介绍ggplot2 - gghalves 绘制豆荚图/对半小提琴图的方法。
一 载入R包,数据
仍然使用之前注释过的sce.anno.RData数据 ,后台回复 anno 即可获取。
首先下载gghalves-R包,然后geom_half_violin绘制分半小提琴图
library(Seurat)
library(tidyverse)
#devtools::install_github('erocoar/gghalves')
library(gghalves)
library(gridExtra)
load("sce.anno.RData")
head(sce2,2)
使用FetchData函数提取自己研究关注的基因表达量 以及 重要的分组和注释信息
gene <- c("BNIP3","CD3D","CSTB","APOE","EGFR","VEGFA","IL6")
exprs <- data.frame(FetchData(object = sce2, vars = c("celltype",gene,"group")) )
exprs$Proj <- "Seurat"
二 gghalves 绘图
1,绘制单一基因
首先绘制单个基因的对半小提琴图,先提取单一分组的数据,然后使用
geom_half_violin函数进行绘制左半边 ,然后叠加右边的图,注意side='r' 参数
p <- ggplot() +
geom_half_violin(data = exprs[exprs$group == 'MET',],
aes(x = Proj, y = BNIP3, fill = group),
color = 'black',
scale = 'width')
#在上图基础上叠加右边,绘图逻辑相同
p1 <- p +
geom_half_violin(data = exprs[exprs$group == 'PT',],
aes(x = Proj, y = BNIP3, fill = group),
color = 'black',
scale = 'width',
side = 'r')
p1
使用ggplot2的参数对图形进行修饰
p2 <- p1 +
theme_bw() +
theme(axis.text.x = element_blank(),
panel.grid = element_blank()) +
scale_fill_manual(values = c("#E39A35","#68A180")) +
labs(x = 'BNIP3', y = 'Expression Level') #y轴标题本文内容修改
p2
当基因个数较多时,使用循环的方式无疑是一种很好的选择
# 创建空的图表列表
plot_list <- list()
# 循环替换基因并创建半小提琴图层
for (gene in c("CD3D","CSTB","APOE","EGFR","VEGFA","IL6")) {
# 创建半小提琴图层
violin_layer1 <- ggplot() + geom_half_violin(data = exprs[exprs$group == "MET", ],
aes(x = Proj, y = !!sym(gene), fill = group),
color = 'black',
scale = 'width')
violin_layer2 <- geom_half_violin(data = exprs[exprs$group == 'PT',],
aes(x = Proj, y = !!sym(gene), fill = group),
color = 'black',
scale = 'width',
side = 'r')
# 添加图层到图表列表中
plot_list[[gene]] <- violin_layer1 + violin_layer2 +
theme_bw() +
theme(axis.text.x = element_blank(),
panel.grid = element_blank()) +
scale_fill_manual(values = c("#E39A35","#68A180")) +
labs(x = gene ,y = 'Expression Level')
}
# 列表中的所有图绘制到一张图中
combined_plot <- do.call(grid.arrange, c(plot_list, nrow =2, ncol = 3))
# 绘图
combined_plot
需要前期使用reshape2的melt函数将提取的重点基因数据,分组数据和celltype数据 转为长数据,然后facet_grid函数添加细胞类型的分面。
#从Seurat对象中提取细胞注释以及基因表达量
gene <- c("CD3D","CSTB","APOE","EGFR","VEGFA","IL6")
exprs$Cell <- rownames(exprs)
exprs.melt <- reshape2::melt(exprs,
id.vars = c("Cell","celltype","group"),
measure.vars = gene,
variable.name = "gene",
value.name = "Expr")
head(exprs.melt, 10)
p5 <- ggplot() +
geom_half_violin(data = exprs.melt[exprs.melt$group == 'MET',],
aes(x = gene, y = Expr, fill = group),
color = 'black',
scale = 'width') +
facet_grid(rows = vars(celltype), scales = 'free_y')
#在上图基础上叠加,绘图逻辑相同
p51 <- p5 +
geom_half_violin(data = exprs.melt[exprs.melt$group == 'PT',],
aes(x = gene, y = Expr, fill = group),
color = 'black',
scale = 'width',
side = 'r') +
facet_grid(rows = vars(celltype), scales = 'free_y')
p51
p52 <- p51 +
theme_bw() +
theme(
panel.grid = element_blank()) +
scale_fill_manual(values = c("#E39A35","#68A180")) +
labs(x = "", y = 'Expression Level') #y轴标题本文内容修改
p52
到这里就完成了分组情况下的对半小提琴图的绘制,geom_half_violin 该函数这种有 geom_half_boxplot ,ggbeeswarm::geom_beeswarm,geom_half_dotplot,geom_half_boxplot ,geom_half_point等多种变种,可自行尝试。单细胞数据可能不是很合适。
参考资料:
https://cran.r-project.org/web/packages/gghalves/vignettes/gghalves.html
https://stackoverflow.com/questions/71752123/geom-half-violin-on-right-side-of-plot-cut-off