
In the previous section, [[77-R可视化13-多个ggplot图象映射实现以假乱真的dodge+stack效果]], we mentioned this figure:

Here is the figure we originally set out to reproduce:

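A reader pointed out that the figure does have its own logic: it shows the differential genes of OM and YM stacked on top of each other, a bit over 50 each, and their sum is what lines up with the Y axis.
Oh, so that's how it works. Then it's even more rubbish.
A chart that invites this kind of basic misunderstanding is taboo in data-science visualization!
You clearly have better options, such as moving the other comparison group to the negative side of the axis.
It's very simple. Here are the fake data and the plotting code together: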
library(ggplot2)

# fake data
a1 <- data.frame(
  counts = c(-53, -40, -59, -39), # negative values, so a1 is drawn below the x-axis
  type1 = c(rep("NK", 2), rep("TC", 2)),
  type2 = rep(c("YM", "YF"), 2)
)
a1$counts2 <- abs(a1$counts) # positive copy, so the labels show positive values
a2 <- data.frame(
  counts = c(52, 24, 57, 28),
  type1 = c(rep("NK", 2), rep("TC", 2)),
  type2 = rep(c("OM", "OF"), 2)
)
ggplot() +
  # bars above the axis (a2) and below the axis (a1)
  geom_col(data = a2, aes(type1, counts, fill = type2),
           position = "dodge") +
  geom_col(data = a1,
           aes(type1, counts, fill = type2),
           position = "dodge") +
  # geom_text has no fill aesthetic; use group so the labels dodge with the bars
  geom_text(data = a2,
            aes(type1, counts, group = type2, label = counts),
            position = position_dodge(0.9), vjust = -0.8) +
  geom_text(data = a1,
            aes(type1, counts, group = type2, label = counts2),
            position = position_dodge(0.9), vjust = 1.5) +
  ggthemes::theme_economist() +
  # colors picked from https://colorbrewer2.org/
  scale_fill_manual(values = c("#fc9272", "#9ecae1", "#de2d26", "#3182bd"))
But then I noticed that the result looks a bit jarring:

Because the plot is built from two separate layer mappings, the way it is split up is rather unsatisfying:

Tweak the theme a little and plot again:
p1 <- ggplot() +
  # bars above the axis (a2) and below the axis (a1)
  geom_col(data = a2, aes(type1, counts, fill = type2),
           position = "dodge") +
  geom_col(data = a1,
           aes(type1, counts, fill = type2),
           position = "dodge") +
  # geom_text has no fill aesthetic; use group so the labels dodge with the bars
  geom_text(data = a2,
            aes(type1, counts, group = type2, label = counts),
            position = position_dodge(0.9), vjust = -0.8) +
  geom_text(data = a1,
            aes(type1, counts, group = type2, label = counts2),
            position = position_dodge(0.9), vjust = 1.5) +
  ggthemes::theme_economist() +
  # colors picked from https://colorbrewer2.org/
  scale_fill_manual(values = c("#fc9272", "#9ecae1", "#de2d26", "#3182bd")) +
  # drop the x-axis title and tick marks
  labs(x = NULL) + theme(
    axis.ticks.x = element_blank()
  )
p1
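
One side effect of pushing a group below zero is that the y-axis now prints negative tick labels, even though the counts themselves are positive. Here is a minimal sketch of one way around that, assuming the same fake data as above: pass `abs` as the `labels` function of `scale_y_continuous()` so the ticks print as magnitudes. The name `p2` is only a placeholder, and the theme is left at the default to keep the sketch short.

```r
library(ggplot2)

# Same fake data as above (the counts are made up for illustration)
a1 <- data.frame(
  counts = c(-53, -40, -59, -39),          # negative: drawn below the axis
  type1  = rep(c("NK", "TC"), each = 2),
  type2  = rep(c("YM", "YF"), 2)
)
a2 <- data.frame(
  counts = c(52, 24, 57, 28),              # positive: drawn above the axis
  type1  = rep(c("NK", "TC"), each = 2),
  type2  = rep(c("OM", "OF"), 2)
)

p2 <- ggplot() +
  geom_col(data = a2, aes(type1, counts, fill = type2), position = "dodge") +
  geom_col(data = a1, aes(type1, counts, fill = type2), position = "dodge") +
  geom_text(data = a2, aes(type1, counts, group = type2, label = counts),
            position = position_dodge(0.9), vjust = -0.8) +
  # abs() inside aes() replaces the separate counts2 column
  geom_text(data = a1, aes(type1, counts, group = type2, label = abs(counts)),
            position = position_dodge(0.9), vjust = 1.5) +
  # show tick labels as magnitudes; the data themselves stay negative
  scale_y_continuous(labels = abs) +
  labs(x = NULL, y = "counts")
p2
```

Only the tick text changes here. The underlying values are still negative, so the lower bars keep pointing downward.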

For addition within 100 like this, do you really need an axis to tell you the number of differential genes?