I have the following table:
create table account_info(
id int not null unique,
creation_date date,
deletion_date date,
gather boolean);

I added sample data to it:
insert into account_info(id,creation_date,deletion_date,gather)
values(1,'2019-09-10',null,true),
(2,'2019-09-12',null,true),
(3,'2019-09-14','2019-10-08',true),
(4,'2019-09-15','2019-09-18',true),
(5,'2019-09-22',null,false),
(6,'2019-09-27','2019-09-29',true),
(7,'2019-10-04','2019-10-17',false),
(8,null,'2019-10-20',true),
(9,'2019-10-12',null,true),
(10,'2019-10-18',null,true);

I want to see how many accounts were created per week and how many accounts were deleted per week.
I tried the following:
select dd, count(distinct ai.id) as created ,count(distinct ai2.id) as deleted
from generate_series('2019-09-01'::timestamp,
'2019-10-21'::timestamp, '1 week'::interval) dd
left join account_info ai on ai.creation_date::DATE <= dd::DATE
left join account_info ai2 on ai2.deletion_date::DATE <=dd::DATE
where ai.gather is true
and ai2.gather is true
group by dd
order by dd asc;

This produces the following output:
+------------+---------+---------+
| dd | Created | Deleted |
+------------+---------+---------+
| 2019-09-22 | 4 | 1 |
| 2019-09-29 | 5 | 2 |
| 2019-10-06 | 5 | 2 |
| 2019-10-13 | 6 | 3 |
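| 2019-10-20 | 7 | 4 |

This output shows running totals of how many accounts have been created and how many have been deleted.
However, I would like to see something like this: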
+------------+---------+---------+-------------------+-------------------+
| dd | Created | Deleted | Total Sum Created | Total Sum Deleted |
+------------+---------+---------+-------------------+-------------------+
| 2019-09-22 | 4 | 1 | 4 | 1 |
| 2019-09-29 | 1 | 1 | 5 | 2 |
| 2019-10-06 | NULL | NULL | 5 | 2 |
| 2019-10-13 | 1 | 1 | 6 | 3 |
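| 2019-10-20 | 1 | 1 | 7 | 4 |

When I try to sum the created and deleted columns in psql, I get an error message, because aggregate functions cannot be nested.

Posted on 2019-10-21 22:29:14
A lateral join combined with aggregation is well suited to this problem. If you are happy with only the weeks that actually appear in the data: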
select date_trunc('week', dte) as week,
sum(is_create) as creates_in_week,
sum(is_delete) as deletes_in_week,
sum(sum(is_create)) over (order by min(v.dte)) as running_creates,
sum(sum(is_delete)) over (order by min(v.dte)) as running_deletes
from account_info ai cross join lateral
(values (ai.creation_date, 1, 0), (ai.deletion_date, 0, 1)
) v(dte, is_create, is_delete)
where v.dte is not null and ai.gather
group by week
order by week;

If you want it for a specified set of weeks:
select gs.wk,
sum(v.is_create) as creates_in_week,
sum(v.is_delete) as deletes_in_week,
sum(sum(v.is_create)) over (order by gs.wk) as running_creates,
sum(sum(v.is_delete)) over (order by gs.wk) as running_deletes
from generate_series('2019-09-01'::timestamp,
'2019-10-21'::timestamp, '1 week'::interval) gs(wk) left join
( account_info ai cross join lateral
(values (ai.creation_date, 1, 0), (ai.deletion_date, 0, 1)
) v(dte, is_create, is_delete)
)
-- the filters sit in the on clause so that weeks without events survive the left join
on v.dte >= gs.wk and
v.dte < gs.wk + interval '1 week' and
ai.gather
group by gs.wk
order by gs.wk;

Here is a db<>fiddle.
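To make the lateral step in the queries above easier to follow, here is a minimal sketch of the unpivot on its own. It assumes PostgreSQL and uses a hypothetical two-row inline sample (the CTE name `sample` is not part of the original schema) instead of `account_info`, so it can be run without creating any table.

```sql
-- Hypothetical sample: one account that was deleted, one that was not.
with sample(id, creation_date, deletion_date) as (
    values (3, date '2019-09-14', date '2019-10-08'),
           (9, date '2019-10-12', null::date)
)
select s.id, v.dte, v.is_create, v.is_delete
from sample s
cross join lateral (
    -- one row per possible event of the account
    values (s.creation_date, 1, 0),
           (s.deletion_date, 0, 1)
) v(dte, is_create, is_delete)
where v.dte is not null      -- drop the event that never happened
order by s.id, v.dte;
```

Each account contributes one row per event that has a date, so this sample yields three rows: a create and a delete for id 3, and a create for id 9. Summing `is_create` and `is_delete` per week then gives the weekly counts, and wrapping those sums in a window `sum(...) over (order by ...)` gives the running totals without nesting plain aggregates.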
Posted on 2019-10-21 22:12:00
You can simply turn your existing query into a subquery and use lag() to compute the difference between consecutive records:
select
dd,
created - coalesce(lag(created) over(order by dd), 0) created,
deleted - coalesce(lag(deleted) over(order by dd), 0) deleted,
created total_sum_created,
deleted total_sum_deleted
from (
select
dd,
count(distinct ai.id) as created ,
count(distinct ai2.id) as deleted
from
generate_series(
'2019-09-01'::timestamp,
'2019-10-21'::timestamp,
'1 week'::interval
) dd
left join account_info ai
on ai.creation_date::DATE <= dd::DATE and ai.gather is true
left join account_info ai2
on ai2.deletion_date::DATE <=dd::DATE and ai2.gather is true
group by dd
) x
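order by dd asc;

I moved the conditions ai[2].gather = true into the on clause of each join: putting these conditions in the where clause would effectively turn the left join into an inner join.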
| dd | created | deleted | total_sum_created | total_sum_deleted |
| ------------------------ | ------- | ------- | ----------------- | ----------------- |
| 2019-09-01T00:00:00.000Z | 0 | 0 | 0 | 0 |
| 2019-09-08T00:00:00.000Z | 0 | 0 | 0 | 0 |
| 2019-09-15T00:00:00.000Z | 4 | 0 | 4 | 0 |
| 2019-09-22T00:00:00.000Z | 0 | 1 | 4 | 1 |
| 2019-09-29T00:00:00.000Z | 1 | 1 | 5 | 2 |
| 2019-10-06T00:00:00.000Z | 0 | 0 | 5 | 2 |
| 2019-10-13T00:00:00.000Z | 1 | 1 | 6 | 3 |
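| 2019-10-20T00:00:00.000Z | 1 | 1 | 7 | 4 |

Another option is to use lag() together with generate_series() to generate a list of date ranges. You can then join the original table only once and do conditional aggregation in the outer query: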
select
dd,
count(distinct case
when ai.creation_date::date <= dd::date and ai.creation_date::date > lag_dd::date
then ai.id
end) created,
count(distinct case
when ai.deletion_date::date <= dd::date and ai.deletion_date::date > lag_dd::date
then ai.id
end) deleted,
count(distinct case
when ai.creation_date::date <= dd::date
then ai.id
end) total_sum_created,
count(distinct case
when ai.deletion_date::date <= dd::date
then ai.id
end) total_sum_deleted
from
(
select dd, lag(dd) over(order by dd) lag_dd
from generate_series(
'2019-09-01'::timestamp,
'2019-10-21'::timestamp,
'1 week'::interval
) dd
) dd
left join account_info ai on ai.gather is true
group by dd
order by dd;

Posted on 2019-10-21 22:17:46
You can produce the desired result with a series of CTEs that build up the data step by step:
with dd as
(select *
from generate_series('2019-09-01'::timestamp,
'2019-10-21'::timestamp, '1 week'::interval) d),
ddl as
(select d, coalesce(lag(d) over (order by d), '1970-01-01'::timestamp) as pd
from dd),
counts as
(select d, count(distinct ai.id) as created, count(distinct ai2.id) as deleted
from ddl
left join account_info ai on ai.creation_date::DATE > ddl.pd::DATE AND ai.creation_date::DATE <= ddl.d::DATE AND ai.gather is true
left join account_info ai2 on ai2.deletion_date::DATE > ddl.pd::DATE AND ai2.deletion_date::DATE <= ddl.d::DATE AND ai2.gather is true
group by d)
select d, created, deleted,
sum(created) over (order by d rows unbounded preceding) as "total created",
sum(deleted) over (order by d rows unbounded preceding) as "total deleted"
from counts
order by d asc;

Note that the gather conditions need to be part of the left join conditions, to avoid turning the left joins into inner joins.
Output:
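d                    created  deleted  total created  total deleted
2019-09-01 00:00:00  0        0        0              0
2019-09-08 00:00:00  0        0        0              0
2019-09-15 00:00:00  4        0        4              0
2019-09-22 00:00:00  0        1        4              1
2019-09-29 00:00:00  1        1        5              2
2019-10-06 00:00:00  0        0        5              2
2019-10-13 00:00:00  1        1        6              3
2019-10-20 00:00:00  1        1        7              4

Note that this query gives the results for the week ending on d. If you want the results for the week starting on d, lag can be changed to lead, with the date comparisons and the coalesce default adjusted to match. You can see this in my demo.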
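The point made in two of the answers above, that the gather filter belongs in the on clause rather than the where clause, can be illustrated with a small side-by-side sketch. It assumes PostgreSQL; the CTEs `weeks` and `acct` and the `variant` label are hypothetical and exist only for this illustration.

```sql
-- Hypothetical two-week, two-account sample; names are illustrative only.
with weeks(wk) as (
    values (date '2019-09-01'), (date '2019-09-15')
),
acct(id, creation_date, gather) as (
    values (1, date '2019-09-10', true),
           (5, date '2019-09-22', false)
)
-- Filter in where: the unmatched week has a.gather = NULL and is discarded.
select 'filter in where' as variant, w.wk, count(a.id) as created
from weeks w
left join acct a on a.creation_date <= w.wk
where a.gather is true
group by w.wk
union all
-- Filter in on: the unmatched week is kept and counted as 0.
select 'filter in on', w.wk, count(a.id)
from weeks w
left join acct a on a.creation_date <= w.wk and a.gather is true
group by w.wk
order by variant, wk;
```

With this sample, the "filter in on" variant returns both weeks (0 for 2019-09-01, 1 for 2019-09-15), while the "filter in where" variant returns only 2019-09-15. That missing row is the reason the asker's original output starts at 2019-09-22 instead of 2019-09-01.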
https://stackoverflow.com/questions/58494483