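This exercise reproduces a less mainstream set of transcriptome analysis figures. The mainstream figures are:
The mammary epithelium was traditionally thought to comprise two major cell types: secretory luminal cells in the inner layer and basal/myoepithelial cells in the outer layer.
A paper published in Nature Communications in September 2017, "Construction of developmental lineage relationships in the mouse mammary gland by single-cell RNA profiling", systematically tracked the single-cell transcriptomes of mouse epithelial cells across developmental stages, including pre-puberty, puberty, adulthood and pregnancy, as well as at different points of the estrus cycle.
The paper whose figures we reproduce today (Cell Rep. 2016 Nov 15;) notes that the normal adult human mammary gland consists of a bilayered epithelium:
In short, the outer layer is basal and the inner layer is luminal, and the luminal cells can be further split into two types. The authors studied these 3 cell types plus stromal cells, 4 cell types in total.
A 2018 paper in Nature Communications, titled "Profiling human breast epithelial cells using single cell RNA sequencing identifies cell diversity", sequenced the transcriptomes of 25,790 single cells isolated from the breast epithelium of 7 individuals.
Three distinct epithelial cell populations were identified: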
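Now let's watch the apprentice's performance.
This is the "seller's show" (the figure as published).
It is enough to know that these are different cell populations.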
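The grid package is not used well here; this will be improved later.
rm(list = ls())
# Read mmc3.csv; the first column becomes the row names
a <- read.csv('mmc3.csv', header = TRUE, row.names = 1, fill = TRUE)
# log10 expression (columns 1-4) plus the three comparison columns (5, 6, 8)
data <- cbind(log10(a[, 1:4]), a[, c(5, 6, 8)])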
library(ggplot2)
library(grid)
library(gridExtra)
# One scatter plot per pairwise comparison, colored by the comparison column
ph1 <- ggplot(data, aes(x = LP, y = BC, color = BCvsLP)) + geom_point() + scale_color_brewer(type = 'qual', palette = 2)
ph2 <- ggplot(data, aes(x = LP, y = LC, color = LCvsLP)) + geom_point() + scale_color_brewer(type = 'qual', palette = 2)
ph3 <- ggplot(data, aes(x = LC, y = BC, color = BCvsLC)) + geom_point() + scale_color_brewer(type = 'qual', palette = 2)
# log2(x + 1) expression, used to compute absolute log2 fold changes
data1 <- cbind(log2(a[, 1:4] + 1), a[, c(5, 6, 8)])
# |log2FC| for genes flagged in each comparison (non-empty label); 0 otherwise
data1$BCvsLP_log <- ifelse(data1$BCvsLP != '', abs(data1$BC - data1$LP), 0)
data1$BCvsLC_log <- ifelse(data1$BCvsLC != '', abs(data1$BC - data1$LC), 0)
data1$LCvsLP_log <- ifelse(data1$LCvsLP != '', abs(data1$LC - data1$LP), 0)
# Any remaining NA is set to 0
data1[is.na(data1)] <- 0
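#### Box plot data preparation
tmp <- data1[, 8:10]
# Stack columns 8, 9 and 10 (one per comparison) with a matching group label;
# repeating column 8 three times would draw three identical boxes
tmp1 <- data.frame(abslog2FC = unlist(tmp, use.names = FALSE),
                   group = rep(colnames(tmp), each = nrow(tmp)))
# Drop genes that are not flagged in a given comparison
tmp1 <- tmp1[tmp1$abslog2FC != 0, ]
ph4 <- ggplot(tmp1, aes(x = group, y = abslog2FC, color = group)) + geom_boxplot() + coord_flip()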
#### Combine the panels
grobs <- list(ph1, ph2, ph3, ph4)
# Draw the four panels on one page as a 2 x 2 grid
grid.arrange(grobs = grobs, nrow = 2, ncol = 2)
save(data1, file = 'data1.Rdata')
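The box plot step above depends on stacking the three |log2FC| columns into one long table. Here is a minimal sketch of that reshaping using base R `stack()`. The data frame `toy` and its values are made up for illustration and are not taken from mmc3.csv.

```r
# Toy stand-in for columns 8:10 of data1; the numbers are invented.
toy <- data.frame(
  BCvsLP_log = c(1.2, 0, 3.4, 0),
  BCvsLC_log = c(0, 2.1, 0.7, 0),
  LCvsLP_log = c(0.5, 0, 0, 4.0)
)

# stack() puts each column's values in `values` and its name in `ind`,
# so every comparison contributes its own numbers.
long <- stack(toy)
names(long) <- c("abslog2FC", "group")

# Zeros mark genes that were not flagged in that comparison.
long <- long[long$abslog2FC != 0, ]

# Number of remaining genes per comparison.
per_group <- table(long$group)
```

Checking `per_group` before plotting is a cheap way to notice when all groups carry the same values, which is what happens if one column is accidentally repeated.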
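This is the "buyer's show". A quick critique: the gap between the seller's show and the buyer's show really is large. The plots are not annotated with the numbers of up- and down-regulated genes, and the color scheme does not match. The bar chart is also clearly wrong, and there must be a deeper reason behind it!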
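One of the points in the critique, the missing up/down gene counts, could be addressed roughly as follows. This is only a sketch on hypothetical data. The flag value "sig" is an assumption, as is the convention that a positive log2 difference means "up"; the real labels in mmc3.csv may differ.

```r
# Hypothetical toy data: log2 expression in two populations plus a flag
# column that is "" for genes not called differential.
toy <- data.frame(
  BC = c(5.0, 1.0, 3.0, 2.0, 6.5),
  LP = c(2.0, 4.0, 3.1, 2.0, 1.5),
  BCvsLP = c("sig", "sig", "", "", "sig"),
  stringsAsFactors = FALSE
)

is_de <- toy$BCvsLP != ""
# Assumption: direction is the sign of the log2 difference (BC minus LP).
direction <- ifelse(!is_de, "ns", ifelse(toy$BC > toy$LP, "up", "down"))
counts <- table(factor(direction, levels = c("up", "down", "ns")))

# Text that could be drawn in a corner of the scatter plot.
label <- sprintf("up: %d\ndown: %d", counts[["up"]], counts[["down"]])
```

The resulting `label` could then be added to a plot such as ph1 with ggplot2's `annotate("text", ...)`, at x/y coordinates chosen to suit the axis range.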
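library(clusterProfiler)
library(org.Hs.eg.db)
load('data1.Rdata')
# Mapping table between ENTREZID, ENSEMBL and SYMBOL
en_sym <- AnnotationDbi::select(org.Hs.eg.db, keys = keys(org.Hs.eg.db), columns = c('ENSEMBL', 'SYMBOL', 'ENTREZID'))
# Append the annotation, matching on the ENSEMBL IDs stored in the row names
data1 <- cbind(data1, en_sym[match(rownames(data1), en_sym$ENSEMBL), ])
# ENTREZ IDs of the flagged genes in each comparison; unmapped genes (NA) are dropped
entrez_of <- function(flag) unique(na.omit(data1$ENTREZID[flag != '']))
BC_LP <- entrez_of(data1$BCvsLP)
BC_LC <- entrez_of(data1$BCvsLC)
LC_LP <- entrez_of(data1$LCvsLP)
gene_clusters <- list(all = Reduce(union, list(BC_LP, BC_LC, LC_LP)),
                      BC_LP = BC_LP, BC_LC = BC_LC, LC_LP = LC_LP)
# GO biological process enrichment for each gene list
xx <- compareCluster(gene_clusters,
                     fun = "enrichGO",
                     OrgDb = "org.Hs.eg.db",
                     ont = "BP")
dotplot(xx, showCategory = 5, includeAll = FALSE)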
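For readers wondering what each dot in the enrichment plot stands for: as far as I understand, enrichGO runs an over-representation test based on the hypergeometric distribution. The sketch below shows that kind of test for a single GO term in base R. All four counts are invented for illustration and are not results from this dataset.

```r
# All numbers below are invented for illustration.
N <- 20000  # background genes with any GO annotation
M <- 200    # background genes annotated to the term of interest
n <- 500    # genes in our differential list
k <- 15     # genes in our list that carry the term

# P(X >= k) when drawing n genes without replacement from N, of which M are hits.
p_hyper <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# The same one-sided test written as Fisher's exact test on a 2 x 2 table
# (rows: in list / not in list; columns: in term / not in term).
tab <- matrix(c(k, M - k, n - k, N - M - n + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value
```

With these toy counts the list would be expected to contain about 5 annotated genes by chance (500 × 200 / 20000), so observing 15 gives a small p-value. In a real analysis the p-values are additionally adjusted for the number of terms tested.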