I have a data frame like the following:
sub = c("X001","X001", "X001","X002","X002","X001","X002","X001","X002","X002","X002","X002")
revenue = c(20, 15, -10,-25,20,-20, 17,9,14,12, -9, 11)
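df = data.frame(sub, revenue)
I want to summarize it by sub so that the second column shows the sum of all revenue, the third column the sum of the absolute values, the fourth column the sum of all positive values, and the fifth column the sum of all negative values.
The result should look like this: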
Sub   All Sum  Absolute Sum  Positive Sum  Negative Sum
X001       14            74            44           -30
X002       40           108            74           -34
I wrote code that computes the overall sum:
y <- aggregate(df$revenue, by = list(Feature = df$sub), FUN = sum)
I would be very grateful if someone more knowledgeable in R could help me compute the other three columns.
Posted on 2017-04-23 22:45:23
Here is how to do this with dplyr:
library(dplyr)
df %>%
  group_by(sub) %>%
  summarise(All_Sum = sum(revenue),
            Absolute_Sum = sum(abs(revenue)),
            Positive_Sum = sum(revenue[revenue > 0]),
            Negative_Sum = sum(revenue[revenue < 0]))
     sub All_Sum Absolute_Sum Positive_Sum Negative_Sum
  <fctr>   <dbl>        <dbl>        <dbl>        <dbl>
1   X001      14           74           44          -30
2   X002      40          108           74          -34
Posted on 2017-04-23 23:18:51
Using aggregate in base R:
aggregate(.~sub, df, function(a) c(sum(a), sum(abs(a)), sum(a[a>0]), sum(a[a<0])))
#    sub revenue.1 revenue.2 revenue.3 revenue.4
#1  X001        14        74        44       -30
#2  X002        40       108        74       -34
Posted on 2017-04-24 03:18:04
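A side note on the base R aggregate answer above: because the function returns a vector of four values, aggregate stores them in a single matrix column named revenue, which only prints as revenue.1 to revenue.4. The sketch below is one possible way to get four ordinary, named columns. It assumes the data from the question; the names res and flat are made up for illustration and are not part of the original answer.

```r
# Same data as in the question.
sub <- c("X001", "X001", "X001", "X002", "X002", "X001",
         "X002", "X001", "X002", "X002", "X002", "X002")
revenue <- c(20, 15, -10, -25, 20, -20, 17, 9, 14, 12, -9, 11)
df <- data.frame(sub, revenue)

# Naming the elements of the returned vector names the matrix columns.
res <- aggregate(revenue ~ sub, df, function(a) c(
  All_Sum      = sum(a),
  Absolute_Sum = sum(abs(a)),
  Positive_Sum = sum(a[a > 0]),
  Negative_Sum = sum(a[a < 0])
))

# res has two columns: sub and a matrix column revenue.
# do.call(data.frame, ...) splits the matrix into separate columns.
flat <- do.call(data.frame, res)

# Drop the "revenue." prefix by reusing the matrix column names.
names(flat) <- c("sub", colnames(res$revenue))
print(flat)
```

After this, flat can be used like the dplyr and data.table results, for example flat$Positive_Sum.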
We can also use data.table:
library(data.table)
setDT(df)[, .(All_Sum = sum(revenue), Absolute_Sum = sum(abs(revenue)),
Positive_Sum = sum(revenue[revenue>0]), Negative_Sum = sum(revenue[revenue<0])), by = sub]
#     sub All_Sum Absolute_Sum Positive_Sum Negative_Sum
#1:  X001      14           74           44          -30
#2:  X002      40          108           74          -34
https://stackoverflow.com/questions/43577225