前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >一个分组查询引发的思考

一个分组查询引发的思考

作者头像
时间静止不是简史
发布2022-01-05 14:44:44
1K0
发布2022-01-05 14:44:44
举报
文章被收录于专栏:Java探索之路

一个分组查询引发的思考

我们在看项目代码或者SQL语句时, 往往会看到很多非常复杂的业务或者SQL 那么问题来了. 复杂SQL是如何写成的? 下面通过一个数据展示的需求来体会到复杂的SQL是如何书写的

1. 计算平均等待时间

当你拿到需求是一般都是比较简单的, 例如统计某些数据, eg: 统计每天平均等待时间

代码语言:javascript
复制
-- 计算平均等待时间
-- 逻辑
select 等待总时间/等待数 as 列名
from 表名
where 时间(后续根据要求可改为按年月日查询)
between 起始时间
and 结束时间
group by 业务名称

-- eg
select businame,
round(sum(waitingTime)/count(case when `state`=2 or 3 then 1 else null end)/60,1) as avgWaitingTime
from t_number_takers 
where DATE_FORMAT(takeTime,'%Y-%m-d%')
BETWEEN  '2021-01-01 00:00:00'
AND  '2021-12-31 23:59:59'
group by businame

2. 分组统计各项业务

下面产品经理又给你说, 那你搞个按年月日进行统计吧, 这样用户可以按照年月日进行统计各种信息

代码语言:javascript
复制
-- 通过时间分组(年月日)并根据分组显示每天时长
-- 逻辑
select 各项业务
from 表名
where 时间(后续根据要求可改为按年月日查询)
between 起始时间
and 结束时间
group by 时间

-- eg: 按日分组查询
select  DATE_FORMAT(takeTime,'%Y-%m-%d') as '时间',
round((sum(waitingTime)/count(case when `state` in(2,3) then 1 else null end))/60,1) as '客户平均等待时长(min)',
round((sum(UNIX_TIMESTAMP(endTime) - UNIX_TIMESTAMP(acceptTime))/count(case when `state`in (2,3) then 1 else null end)/60),1) as '业务平均办理时长',
sum(case when `id` is not null then 1 else null end) as '业务总数',
sum(case when businame ='开户' then 1 else null end) as '开户',
sum(case when businame ='缴费' then 1 else null end) as '缴费',
sum(case when businame ='业务' then 1 else null end) as '业务',
sum(case when businame ='销户' then 1 else null end) as '销户'
from t_number_takers
WHERE 1=1
and takeTime
BETWEEN  '2021-01-01 00:00:00'
AND  '2021-12-31 23:59:59'
group by DATE_FORMAT(takeTime,'%Y-%m-%d')

-- eg: 按月分组查询(见下图)
select  DATE_FORMAT(takeTime,'%Y-%m') as '时间',
round((sum(waitingTime)/count(case when `state` in(2,3) then 1 else null end))/60,1) as '客户平均等待时长(min)',
round((sum(UNIX_TIMESTAMP(endTime) - UNIX_TIMESTAMP(acceptTime))/count(case when `state`in (2,3) then 1 else null end)/60),1) as '业务平均办理时长',
sum(case when `id` is not null then 1 else null end) as '业务总数',
sum(case when businame ='开户' then 1 else null end) as '开户',
sum(case when businame ='缴费' then 1 else null end) as '缴费',
sum(case when businame ='业务' then 1 else null end) as '业务',
sum(case when businame ='销户' then 1 else null end) as '销户'
from t_number_takers
WHERE 1=1
and takeTime
BETWEEN  '2021-01-01 00:00:00'
AND  '2021-12-31 23:59:59'
group by DATE_FORMAT(takeTime,'%Y-%m')

-- eg: 按年分组查询
select  DATE_FORMAT(takeTime,'%Y') as '时间',
round((sum(waitingTime)/count(case when `state` in(2,3) then 1 else null end))/60,1) as '客户平均等待时长(min)',
round((sum(UNIX_TIMESTAMP(endTime) - UNIX_TIMESTAMP(acceptTime))/count(case when `state`in (2,3) then 1 else null end)/60),1) as '业务平均办理时长',
sum(case when `id` is not null then 1 else null end) as '业务总数',
sum(case when businame ='开户' then 1 else null end) as '开户',
sum(case when businame ='缴费' then 1 else null end) as '缴费',
sum(case when businame ='业务' then 1 else null end) as '业务',
sum(case when businame ='销户' then 1 else null end) as '销户'
from t_number_takers
WHERE 1=1
and takeTime
BETWEEN  '2021-01-01 00:00:00'
AND  '2021-12-31 23:59:59'
group by DATE_FORMAT(takeTime,'%Y')

ps: 按照时间段(年月日)进行显示时, 用到了 DATE_FORMAT() 函数, 这个改动会对后面的统计带来意想不到的影响

3. 将汇总和统计写在一张表

这个时候, 产品经理又又提出需求了, 需要将汇总信息写在一个接口中返回.

代码语言:javascript
复制
-- a.利用 with rollup 进行汇总, 图1. 我们可以看到汇总的那一行为空值
select  DATE_FORMAT(takeTime,'%Y-%m') as '时间',
round((sum(waitingTime)/count(case when `state` in(2,3) then 1 else null end))/60,1) as '客户平均等待时长(min)',
round((sum(UNIX_TIMESTAMP(endTime) - UNIX_TIMESTAMP(acceptTime))/count(case when `state`in (2,3) then 1 else null end)/60),1) as '业务平均办理时长',
sum(case when `id` is not null then 1 else null end) as '业务总数',
sum(case when businame ='开户' then 1 else null end) as '开户',
sum(case when businame ='缴费' then 1 else null end) as '缴费',
sum(case when businame ='业务' then 1 else null end) as '业务',
sum(case when businame ='销户' then 1 else null end) as '销户'
from t_number_takers
WHERE 1=1
and takeTime
BETWEEN  '2021-01-01 00:00:00'
AND  '2021-12-31 23:59:59'
group by DATE_FORMAT(takeTime,'%Y-%m')
with rollup


-- b.解决为空的情况, 利用 ifnull()函数
 ifnull(列名,'列名为空后的字段') as 列名 <=> 当前列下如果有数据为null,就将该列下这一行null改为: 列名为空后的字段

--  当该列不为函数时, 如图2
select ifnull(businame,'总计') as businame,count(1) as busiNum
from t_number_takers
WHERE DATE_FORMAT(takeTime,'%Y-%m-d%')
BETWEEN  '2021-01-01 00:00:00'
AND  '2021-12-31 23:59:59'
group by businame  with rollup 

-- 当列为函数时会报错, 如下代码  ifnull(DATE_FORMAT(takeTime,'%Y-%m'),"合计") as '时间'
select  ifnull(DATE_FORMAT(takeTime,'%Y-%m'),"合计") as '时间',
round((sum(waitingTime)/count(case when `state` in(2,3) then 1 else null end))/60,1) as '客户平均等待时长(min)',
round((sum(UNIX_TIMESTAMP(endTime) - UNIX_TIMESTAMP(acceptTime))/count(case when `state`in (2,3) then 1 else null end)/60),1) as '业务平均办理时长',
sum(case when `id` is not null then 1 else null end) as '业务总数',
sum(case when businame ='开户' then 1 else null end) as '开户',
sum(case when businame ='缴费' then 1 else null end) as '缴费',
sum(case when businame ='业务' then 1 else null end) as '业务',
sum(case when businame ='销户' then 1 else null end) as '销户'
from t_number_takers
WHERE 1=1
and takeTime
BETWEEN  '2021-01-01 00:00:00'
AND  '2021-12-31 23:59:59'
group by DATE_FORMAT(takeTime,'%Y-%m')
with rollup

> 1055 - Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'callmachine.t_number_takers.takeTime' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
原因是: MySQL 5.7.5及以上功能依赖检测功能。
如果启用了ONLY_FULL_GROUP_BY SQL模式(默认情况下),MySQL将拒绝选择列表,
HAVING条件或ORDER BY列表的查询引用在GROUP BY子句中既未命名的非集合列,也不在功能上依赖于它们。
(5.7.5之前,MySQL没有检测到功能依赖关系,默认情况下不启用ONLY_FULL_GROUP_BY


-- c.根据b后面出现的问题进行解决, 执行第一条指令(当前shell生效), 然后再执行b中出现问题的语句, 图3.  
-- 可以看到虽然执行没问题, 但是返回结果却没有根据ifnull将 takeTime 为null时的字段改成 我们想要定制的字段: 总计
-- 分组异常时执行下面命令
set sql_mode='STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'
-- 还原之前设置
set sql_mode='ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'
-- 查询当前sql模式
SELECT @@sql_mode



-- d 利用 union all 进行组合查询, 通过组合查询将总计信息拼在原来的列下(图4)
-- 虽然这样写不太规范(在时间里面返回了一个总计的字段, 但是满足了一张表返回所有分组信息和总计结果)
select  DATE_FORMAT(takeTime,'%Y-%m') as '时间' ,
round((sum(waitingTime)/count(case when `state` in(2,3) then 1 else null end))/60,1) as '客户平均等待时长',
round((sum(UNIX_TIMESTAMP(endTime) - UNIX_TIMESTAMP(acceptTime))/count(case when `state`in (2,3) then 1 else null end)/60),1) as '业务平均办理时长',
sum(case when `id` is not null then 1 else null end) as '业务总数',
sum(case when businame ='开户' then 1 else null end) as '开户',
sum(case when businame ='缴费' then 1 else null end) as '缴费',
sum(case when businame ='业务' then 1 else null end) as '业务',
sum(case when businame ='销户' then 1 else null end) as '销户'
from t_number_takers
where 1=1
and takeTime
between  '2021-01-01 00:00:00'
and  '2021-12-31 23:59:59'
group by DATE_FORMAT(takeTime,'%Y-%m')
union all select
'总计' as '时间' ,
round(sum(waitingTime)/60,1) as "客户平均等待时长",
round(sum(UNIX_TIMESTAMP(endTime) - UNIX_TIMESTAMP(acceptTime))/60,1) as '业务平均办理时长',
count(1) as '业务总数',
sum(case when businame ='开户' then 1 else null end) as '开户',
sum(case when businame ='缴费' then 1 else null end) as '缴费',
sum(case when businame ='业务' then 1 else null end) as '业务',
sum(case when businame ='销户' then 1 else null end) as '销户'
from t_number_takers 

图1

图2

图3

图4

union和union all区别 union all只是合并查询结果,并不会进行去重和排序操作,在没有去重的前提下,使用union all的执行效率要比union高

ps: 后续因为某些原因, 总计还是单独作为一个接口来了, 尴尬~~~

4. 按照任意字段排序

产品又又又有要求, 让页面显示的统计信息可以根据某个字段进行任意排序

代码语言:javascript
复制
 <!--利用mybaties的${}输出, 对其进行任意字段排序 order by ${param3(排序字段)} ${param4(升序降序asc,desc)} -->
   <select id="statisticalBusinessByYear" resultType="map">
       select  DATE_FORMAT(takeTime,'%Y') as '时间' ,
       round((sum(waitingTime)/count(case when `state` in(2,3) then 1 else null end))/60,1) as '客户平均等待时长',
       round((sum(UNIX_TIMESTAMP(endTime) - UNIX_TIMESTAMP(acceptTime))/count(case when `state`in (2,3) then 1 else null end)/60),1) as '业务平均办理时长',
       sum(case when `id` is not null then 1 else null end) as '业务总数',
	sum(case when businame ='开户' then 1 else null end) as '开户',
	sum(case when businame ='缴费' then 1 else null end) as '缴费',
	sum(case when businame ='业务' then 1 else null end) as '业务',
	sum(case when businame ='销户' then 1 else null end) as '销户'
       from t_number_takers
       where 1=1
       <if test="param1 != null and param1 != ''">
           <if test="param2 != null and param2 != ''">
               and takeTime
               between  #{param1,jdbcType=VARCHAR}
               and  #{param2,jdbcType=VARCHAR}
           </if>
       </if>
       group by DATE_FORMAT(takeTime,'%Y') 
       order by ${param3} ${param4}
   </select>

5. 动态查询列信息

之前按照产品原型写的业务列统计, 如 业务总数/开户/销户/业务/缴费都是根据原型图写的. 产品又不干了. 说万一后面客户增加了其他列的话, 你这种固定显示的业务列统计怎么合乎规范呢? 好吧, 我们继续改. 不慌. 动态查询列信息核心逻辑: 在原来的基础上, 首先新增一个获取所有业务列的接口, 然后在当前查询Dao接口传入这个list. mapper.xml 中 通过使用 mybaties的 < foreach >标签进行遍历

首先看下臃肿的dao层接口

代码语言:javascript
复制
List<String> queryBusinessList();

List<Map<String, Object>> statisticalBusinessByMounth(@Param("startTime") String startTime, @Param("endTime") String endTime,
                                                      @Param("statisticalBusinessType") String statisticalBusinessType,
                                                      @Param("sortType") String sortType,@Param("list") List<String> businessList);

mapper.xml下对应的查询语句

代码语言:javascript
复制
<select id="queryBusinessList" resultType="java.lang.String">
    select distinct(businame) from t_number_takers
</select>
    
<select id="statisticalBusinessByMounth" resultType="map">
        select  DATE_FORMAT(takeTime,'%Y-%m') as '时间' ,
        round((sum(waitingTime)/count(case when `state` in(2,3) then 1 else null end))/60,1) as '客户平均等待时长',
        round((sum(UNIX_TIMESTAMP(endTime) - UNIX_TIMESTAMP(acceptTime))/count(case when `state`in (2,3) then 1 else null end)/60),1) as '业务平均办理时长',
        sum(case when `id` is not null then 1 else null end) as '业务总数'
        <if test="list != null and list.size() != 0">
            ,
            <foreach collection="list" item="item" separator=",">
                sum(case when businame = #{item} then 1 else null end) as #{item,jdbcType=VARCHAR}
            </foreach>
        </if>
        from t_number_takers
        where 1=1
        <if test="param1 != null and param1 != ''">
            <if test="param2 != null and param2 != ''">
                and takeTime
                between  #{param1,jdbcType=VARCHAR}
                and  #{param2,jdbcType=VARCHAR}
            </if>
        </if>
        group by DATE_FORMAT(takeTime,'%Y-%m')
        order by ${param3} ${param4}
    </select>

总结:

虽然上面的例子可能不如你在其他项目中看到的其他项目那么复杂. 就个人经历而言, 复杂的SQL语句或者业务, 不是一下子就写成的, 而是随着业务的复杂之后, 慢慢的优化而成的. 我们不要过于的惧怕这些复杂的代码, 应该在战略上蔑视它, 在战术上重视它. 多多磨练自己的技巧, 并且及时反思, 而这些行为一定会给你带来意想不到的收获~~~

本文参与 腾讯云自媒体同步曝光计划,分享自作者个人站点/博客。
原始发表:2021/12/15 ,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 作者个人站点/博客 前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 一个分组查询引发的思考
    • 1. 计算平均等待时间
      • 2. 分组统计各项业务
        • 3. 将汇总和统计写在一张表
          • 4. 按照任意字段排序
            • 5. 动态查询列信息
              • 总结:
              相关产品与服务
              云数据库 MySQL
              腾讯云数据库 MySQL(TencentDB for MySQL)为用户提供安全可靠,性能卓越、易于维护的企业级云数据库服务。其具备6大企业级特性,包括企业级定制内核、企业级高可用、企业级高可靠、企业级安全、企业级扩展以及企业级智能运维。通过使用腾讯云数据库 MySQL,可实现分钟级别的数据库部署、弹性扩展以及全自动化的运维管理,不仅经济实惠,而且稳定可靠,易于运维。
              领券
              问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档