🤒 CMplot | Even the Manhattan plots in Nature are getting rolled up (Part 2)

生信漫卷

Published 2022-10-31 09:06:21

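1 Introduction

Original paper: https://www.nature.com/articles/s41467-020-17376-1

The previous post reproduced the common Manhattan plots; this time, let's roll them up! Today we reproduce a circular Manhattan plot from Nature Communications.

2 Packages used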

rm(list = ls())
library(CMplot)

3 Example data

data(pig60K)
data(cattle50K)

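Note! In the example data, the first three columns are the SNP name, the chromosome and the position; the remaining columns hold GWAS p-values or the GS/GP values of each trait. If you only want to draw an SNP density plot, the first three columns are enough.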
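To make the expected layout concrete, here is a minimal sketch that builds a small data frame with the same column order using simulated values. The column names, the number of SNPs and the trait names are illustrative assumptions, not taken from pig60K itself.

```r
# Simulated data in the column order described above (all values are made up).
set.seed(1)
n_snp <- 200
my_gwas <- data.frame(
  SNP        = paste0("snp", seq_len(n_snp)),            # column 1: SNP name
  Chromosome = rep(1:4, each = n_snp / 4),                # column 2: chromosome
  Position   = rep(sort(sample.int(5e7, n_snp / 4)), 4),  # column 3: position (bp)
  trait1     = runif(n_snp),                              # p-values, hypothetical trait
  trait2     = runif(n_snp)                               # p-values, hypothetical trait
)

# For a density-only plot, the first three columns are sufficient.
density_input <- my_gwas[, 1:3]
str(density_input)
```

A data frame shaped like `my_gwas` could then be passed to CMplot in place of the bundled example data.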

4 SNP density plot

The previous post did not draw a simple SNP density plot, so we add one here.

CMplot(pig60K,
       type="l", # "p"(point), "l"(line), "h"(vertical lines) 
       plot.type="d", # "d", "c", "m", "q" or "b" 
       bin.size=1e6, #the size of bin for SNP_density plot.
       chr.den.col=c("darkgreen", "yellow", "red"), #the color for the SNP density
       file="jpg", #"jpg", "pdf","tiff"
       memo="",
       dpi=300,
       main="Illumina_60K", #title
       file.output=F,
       verbose=F,
       width=9,height=6
       )

The available values for plot.type are:
✅ "d" → SNP density plot
✅ "c" → circle-Manhattan plot
✅ "m" → Manhattan plot
✅ "q" → Q-Q plot
✅ "b" → circle-Manhattan, Manhattan and Q-Q plots together
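To give a feel for what bin.size=1e6 means in the density plot, the sketch below counts SNPs per 1 Mb window on simulated positions in base R. It only illustrates the idea of window counting; CMplot's internal binning may differ in detail, and the data and variable names are assumptions.

```r
# Count SNPs per 1 Mb window on each chromosome (simulated positions).
set.seed(2)
snps <- data.frame(
  Chromosome = rep(c("1", "2"), each = 500),
  Position   = c(sample.int(2e7, 500), sample.int(1.5e7, 500))
)
bin_size <- 1e6

# Bin 0 covers positions 1..1e6, bin 1 covers 1e6+1..2e6, and so on.
snps$bin <- (snps$Position - 1) %/% bin_size

# One row per (chromosome, bin) with the number of SNPs that fall in it.
density <- aggregate(
  list(n_snp = snps$Position),
  by  = list(Chromosome = snps$Chromosome, bin = snps$bin),
  FUN = length
)
head(density[order(density$Chromosome, density$bin), ])
```

A smaller bin_size gives a finer but noisier density track; a larger one smooths it.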

5 Circular SNP density plot

CMplot(pig60K,
       type="p",
       plot.type="c",
       chr.labels=paste("Chr",c(1:18,"X","Y"),sep=""),
       r=0.4, # radius for the circle
       cir.legend=T,
       outward=F, # TRUE: points plotted from inside to outside; FALSE: the reverse
       cir.legend.col="black", # the color of the axis of legend.
       cir.chr.h=1.3, # the width for the boundary
       chr.den.col="black", # the colour for the SNP density. 
       file="jpg",
       memo="",
       dpi=300,
       file.output=F,
       verbose=F,
       width=10,height=10)

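6 Polishing the plot and highlighting important points

Here we set the amplify = T argument to enlarge the points we want to highlight; see the code comments for the other parameters~ 🤜🤛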

CMplot(pig60K,
       type="p",
       plot.type="c",
       r=0.4,
       col=c("#E64B35E5","#4DBBD5E5","#00A087E5","#3C5488E5"), # dot color
       chr.labels=paste("Chr",c(1:18,"X","Y"),sep=""),
       threshold=c(1e-6,1e-4), # significant threshold.
       cir.chr.h=1.5, # the width for the boundary
       amplify=T, # Amplify points bigger than the minimal significant level
       threshold.lty=c(1,2), # type for the line of threshold levels,
       threshold.col=c("#91D1C2E5", "#DC0000E5"), # significant threshold color
       signal.line=1, # thickness of lines of significant SNPs cross the circle.
       signal.col=c("#91D1C2E5","#DC0000E5"), # colour of significant points
       chr.den.col=c("#91D1C2E5","white","#DC0000E5"),
       bin.size=1e6,
       outward=FALSE,
       file="jpg",
       memo="",
       dpi=300,
       file.output=F,
       verbose=F,
       width=10,height=10)
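To see which SNPs the two threshold values would separate, here is a small base R sketch on simulated p-values. The planted p-values are made up, and the amplified rule reflects my reading of the code comment above (points beyond the least strict threshold are enlarged), not a statement about CMplot internals.

```r
set.seed(3)
p <- runif(1000, min = 0.001, max = 1)  # background p-values (simulated)
p[c(10, 500)] <- c(5e-8, 3e-5)          # two planted signals (made-up values)

thresholds <- c(1e-6, 1e-4)             # same values as threshold= above
logp <- -log10(p)                       # the scale used on a Manhattan plot

# Classify each SNP by the strictest threshold it passes.
level <- cut(
  p,
  breaks = c(0, sort(thresholds), 1),
  labels = c("p <= 1e-6", "p <= 1e-4", "not significant"),
  include.lowest = TRUE
)
table(level)

# Assumed rule: points beyond the least strict threshold get amplified.
amplified <- p < max(thresholds)
which(amplified)
```

On the plot, a threshold of 1e-6 corresponds to a line at 6 on the -log10(p) scale and 1e-4 to a line at 4.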

7 Genomic Selection/Prediction (GS/GP)

A few concepts worth introducing first:

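✅ Genomic selection: uses high-density SNP markers covering the whole genome, combined with phenotypic or pedigree records, to estimate the breeding values of individuals. It assumes that every QTL controlling the trait is in linkage disequilibrium with at least one of these markers.
✅ Reference population: the individuals that have both genotypes and phenotypes. The prediction model is built on these data.
✅ Candidate population: the individuals that have genotypes only. Their phenotypes or breeding values are predicted with the model fitted on the reference population.
The efficiency of genomic selection depends mainly on the size of the reference population and on how closely it is related to the candidate population.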
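The reference/candidate split can be sketched with a plain ridge regression on simulated genotypes in base R. This is a stand-in chosen for brevity, not the method of any particular GS tool; the population sizes, marker count, effect sizes and the value of lambda are all arbitrary assumptions.

```r
set.seed(4)
n_ref <- 200; n_cand <- 50; n_marker <- 500

# Genotypes coded 0/1/2 (simulated).
geno <- matrix(rbinom((n_ref + n_cand) * n_marker, 2, 0.3),
               nrow = n_ref + n_cand)
true_effect    <- rnorm(n_marker, sd = 0.1)
breeding_value <- as.vector(geno %*% true_effect)

ref  <- seq_len(n_ref)            # reference: genotypes + phenotypes
cand <- n_ref + seq_len(n_cand)   # candidates: genotypes only
pheno_ref <- breeding_value[ref] + rnorm(n_ref, sd = sd(breeding_value))

# Centre markers using the reference-population means.
marker_mean <- colMeans(geno[ref, ])
X_ref  <- sweep(geno[ref, ],  2, marker_mean)
X_cand <- sweep(geno[cand, ], 2, marker_mean)

# Ridge solution (X'X + lambda I)^-1 X'y; lambda is picked arbitrarily here.
lambda <- 100
effect_hat <- solve(crossprod(X_ref) + diag(lambda, n_marker),
                    crossprod(X_ref, pheno_ref - mean(pheno_ref)))

# Predicted values for candidates, which never contributed a phenotype.
gebv_cand <- as.vector(X_cand %*% effect_hat)
cor(gebv_cand, breeding_value[cand])
```

The final correlation is only available because the data are simulated; in practice the true breeding values of the candidates are unknown. Per-marker values such as `effect_hat` are the kind of GS/GP output that is plotted with LOG10=FALSE, since they are not p-values.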

CMplot(cattle50K,type="p",
       plot.type="c",
       LOG10=FALSE,
       outward=TRUE,
       col=matrix(c("#4DAF4A",NA,NA, # one row of colours per trait; NA pads unused slots
                    "dodgerblue4","deepskyblue",NA,
                    "dodgerblue1","olivedrab3","darkgoldenrod1"),
                  nrow=3,
                  byrow=TRUE),
       chr.den.col="black",
       chr.labels=paste("Chr",c(1:29),sep=""),
       threshold=NULL,
       r=1.2,
       cir.chr.h=1.5,
       cir.legend.cex=0.5,
       cir.band=1,
       file="jpg", 
       memo="",
       dpi=300,
       file.output= F,
       verbose= F, 
       width=10,height=10)