I need to do a simple data transformation in R for use with igraph. My data frame is in this format, grouped by GROUP:
A GROUP
1 1 a
2 2 a
3 3 a
4 4 a
5 1 b
6 3 b
7 5 b
1. How can I expand the groups to get an undirected edge list el in this format?
A B
1 1 2
2 1 3
3 1 4
4 2 3
5 2 4
6 3 4
7 1 3
8 1 5
9 3 5
Note: there are no self-references (1-1, 2-2, 3-3, ...).
2. How can I count the occurrences of each A-B pair and create a weighted edge list from el?
A B weight
1 1 2 1
2 1 3 2
3 1 4 1
4 2 3 1
5 2 4 1
6 3 4 1
7 1 5 1
8 3 5 1
Posted on 2012-03-24 17:35:05
Here is a way to get the edge list using plyr:
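# example data from the question
foo <- data.frame(
  A = c(1, 2, 3, 4, 1, 3, 5),
  GROUP = c("a", "a", "a", "a", "b", "b", "b"))
library("plyr")
# for each GROUP, build all 2-element combinations of A, then stack them
E1 <- do.call(rbind, dlply(foo, .(GROUP), function(x) t(combn(x$A, 2))))
E1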
This returns:
[,1] [,2]
[1,] 1 2
[2,] 1 3
[3,] 1 4
[4,] 2 3
[5,] 2 4
[6,] 3 4
[7,] 1 3
[8,] 1 5
[9,] 3 5
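Then get the weights. This relies on combn keeping the input order, so the smaller number comes first in every pair because A is sorted within each group: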
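# weight = number of rows of E1 identical to the current row
W <- apply(E1, 1, function(x) sum(E1[, 1] == x[1] & E1[, 2] == x[2]))
E2 <- cbind(E1, weight = W)
# keep one row per distinct edge
E2 <- E2[!duplicated(E2), ]
E2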
This returns:
weight
[1,] 1 2 1
[2,] 1 3 2
[3,] 1 4 1
[4,] 2 3 1
[5,] 2 4 1
[6,] 3 4 1
[7,] 1 5 1
[8,] 3 5 1
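The weight count above treats 1-3 and 3-1 as different rows, so it depends on A being sorted within each group. A minimal sketch of one way to drop that assumption, using base R only: put the smaller endpoint first with pmin/pmax before counting. The data frame foo2 and the names pairs, edges and weighted are made up for illustration and are not part of the answer.

```r
# Hypothetical input in which A is NOT sorted within group "a"
foo2 <- data.frame(
  A = c(3, 1, 2, 1, 3, 5),
  GROUP = c("a", "a", "a", "b", "b", "b"))

# All pairs per group; split + lapply stands in for dlply here
pairs <- do.call(rbind, lapply(split(foo2$A, foo2$GROUP),
                               function(x) t(combn(x, 2))))

# Normalise each undirected edge so the smaller endpoint comes first
edges <- data.frame(A = pmin(pairs[, 1], pairs[, 2]),
                    B = pmax(pairs[, 1], pairs[, 2]),
                    weight = 1)

# Sum the unit weights per distinct A-B pair
weighted <- aggregate(weight ~ A + B, edges, sum)
weighted
```

Group "a" produces the pair 3-1 and group "b" produces 1-3; after normalisation both count towards the same edge, so 1-3 ends up with weight 2.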
Posted on 2012-03-24 20:15:34
Here is a solution, with comments in the code:
# your data
df <- data.frame(A = c(1, 2, 3, 4, 1, 3, 5),
                 GROUP = c("a", "a", "a", "a", "b", "b", "b"))
# define a function returning the edges for a single group
group.edges <- function(x) {
  edges.matrix <- t(combn(x, 2))
  colnames(edges.matrix) <- c("A", "B")
  edges.df <- as.data.frame(edges.matrix)
  return(edges.df)
}
# apply the function above to each group and bind altogether
all.edges <- do.call(rbind, lapply(unstack(df), group.edges))
# add weights
all.edges$weight <- 1
all.edges <- aggregate(weight ~ A + B, all.edges, sum)
all.edges
# A B weight
# 1 1 2 1
# 2 1 3 2
# 3 2 3 1
# 4 1 4 1
# 5 2 4 1
# 6 3 4 1
# 7 1 5 1
# 8 3 5 1
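One caveat with combn: when x is a single positive number n, combn(x, 2) takes combinations of seq_len(n) rather than of x itself, so a group with only one member would yield spurious edges (or an error when n is 1). A sketch of a guarded variant follows, assuming such groups should simply contribute no edges. The names safe.group.edges, df2, groups and all.edges2 are hypothetical and not from the answer.

```r
# Hypothetical variant of group.edges that tolerates groups with < 2 members
safe.group.edges <- function(x) {
  if (length(x) < 2) {
    # no pair can be formed: return an empty edge list with the same columns
    return(data.frame(A = numeric(0), B = numeric(0)))
  }
  edges.matrix <- t(combn(x, 2))
  colnames(edges.matrix) <- c("A", "B")
  as.data.frame(edges.matrix)
}

# Hypothetical data: group "c" has the single member 4
df2 <- data.frame(A = c(1, 2, 3, 4),
                  GROUP = c("a", "a", "a", "c"))

# split() always returns a list, whatever the group sizes
groups <- split(df2$A, df2$GROUP)
all.edges2 <- do.call(rbind, lapply(groups, safe.group.edges))
all.edges2
```

Here only the three pairs from group "a" remain; without the guard, combn(4, 2) would have contributed all six pairs of 1:4 for group "c".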
https://stackoverflow.com/questions/9850514