
Given a user login log table that records the IP address of each login, find every pair of users who have shared 3 or more IP addresses.
+----------+-----------------+----------------------+
| user_id  | ip              | time_stamp           |
+----------+-----------------+----------------------+
| 2        | 223.104.41.101  | 2023-08-24 07:00:00  |
| 4        | 223.104.41.122  | 2023-08-24 10:00:00  |
| 5        | 223.104.41.126  | 2023-08-24 11:00:00  |
| 4        | 223.104.41.126  | 2023-08-24 13:00:00  |
| 1        | 223.104.41.101  | 2023-08-24 16:00:00  |
| 3        | 223.104.41.101  | 2023-08-24 16:02:00  |
| 2        | 223.104.41.104  | 2023-08-24 16:30:00  |
| 1        | 223.104.41.121  | 2023-08-24 17:00:00  |
| 2        | 223.104.41.122  | 2023-08-24 17:05:00  |
| 3        | 223.104.41.103  | 2023-08-24 18:11:00  |
| 2        | 223.104.41.103  | 2023-08-24 19:00:00  |
| 1        | 223.104.41.104  | 2023-08-24 19:00:00  |
| 3        | 223.104.41.122  | 2023-08-24 19:07:00  |
| 1        | 223.104.41.122  | 2023-08-24 21:00:00  |
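+----------+-----------------+----------------------+

1. The data consists of login records, so the records must be joined on IP to find logins that share the same IP.
2. Because the join is on ip, first make sure each user has only one row per IP; otherwise the join multiplies rows and inflates the counts.
3. The self-join produces, for users who share an IP, both mirrored rows (1-2 and 2-1) and self-pairs (1-1, 2-2); the mirrored rows must be deduplicated and the self-pairs removed.
4. Count the IPs each pair has in common to get the result.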
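
The four steps above can be sketched outside SQL to make the logic concrete. The following is a minimal Python illustration, not the article's solution: the function name `shared_ip_pairs` and the parameter `min_shared` are made up for this example, and the records are the (user_id, ip) columns of the sample table.

```python
from itertools import combinations

# (user_id, ip) columns of the sample table; timestamps are not needed here.
logins = [
    (2, "223.104.41.101"), (4, "223.104.41.122"), (5, "223.104.41.126"),
    (4, "223.104.41.126"), (1, "223.104.41.101"), (3, "223.104.41.101"),
    (2, "223.104.41.104"), (1, "223.104.41.121"), (2, "223.104.41.122"),
    (3, "223.104.41.103"), (2, "223.104.41.103"), (1, "223.104.41.104"),
    (3, "223.104.41.122"), (1, "223.104.41.122"),
]


def shared_ip_pairs(records, min_shared=3):
    """Return user pairs (a, b), a < b, sharing at least min_shared IPs.

    Hypothetical helper written for this illustration only.
    """
    # Step 2: one set of distinct IPs per user (the deduplication).
    ips_by_user = {}
    for user_id, ip in records:
        ips_by_user.setdefault(user_id, set()).add(ip)

    # Step 3: combinations() over sorted ids yields each pair once with
    # a < b, so there are no mirrored pairs and no self-pairs.
    pairs = []
    for a, b in combinations(sorted(ips_by_user), 2):
        # Step 4: the set intersection is the group of shared IPs.
        if len(ips_by_user[a] & ips_by_user[b]) >= min_shared:
            pairs.append((a, b))
    return pairs


print(shared_ip_pairs(logins))  # [(1, 2), (2, 3)]
```

Users 1 and 2 share .101, .104 and .122, and users 2 and 3 share .101, .103 and .122, so those are the only two qualifying pairs.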
| Dimension | Rating |
|---|---|
| Difficulty | ⭐️⭐️⭐️ |
| Problem clarity | ⭐️⭐️⭐️⭐️⭐️ |
| Business relevance | ⭐️⭐️⭐️⭐️ |
1) Deduplicate all login records by user ID and login IP
select
    user_id,
    ip
from t_login_log_032
group by user_id, ip
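
To see why this deduplication matters, the pair query can be run with and without it. The sketch below makes some assumptions: it uses Python's built-in sqlite3 module as a stand-in for Hive, a simplified two-column table, and an added `order by` for stable output. The rows are copied from the INSERT statement in the setup script, where user 1 logs in from 223.104.41.122 twice.

```python
import sqlite3

# (user_id, ip) rows from the setup script; note the duplicate (1, ...122).
ROWS = [
    (1, "223.104.41.101"), (1, "223.104.41.121"), (1, "223.104.41.104"),
    (1, "223.104.41.122"), (1, "223.104.41.122"),
    (2, "223.104.41.101"), (2, "223.104.41.103"), (2, "223.104.41.104"),
    (2, "223.104.41.122"),
    (3, "223.104.41.103"), (3, "223.104.41.122"), (3, "223.104.41.101"),
    (4, "223.104.41.126"), (5, "223.104.41.126"), (4, "223.104.41.122"),
]

# Assumption: SQLite in memory instead of Hive, with a simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_login_log_032 (user_id INTEGER, ip TEXT)")
conn.executemany("INSERT INTO t_login_log_032 VALUES (?, ?)", ROWS)

# The pair query, with the join source left as a placeholder.
PAIR_QUERY = """
select t1.user_id, t2.user_id, count(t1.ip)
from {src} as t1
join {src} as t2 on t1.ip = t2.ip
where t1.user_id < t2.user_id
group by t1.user_id, t2.user_id
having count(t1.ip) >= 3
order by t1.user_id, t2.user_id
"""

# Joining the raw table: the duplicate login is counted twice.
raw = conn.execute(PAIR_QUERY.format(src="t_login_log_032")).fetchall()

# Joining the deduplicated rows: each shared IP is counted once.
dedup_src = "(select user_id, ip from t_login_log_032 group by user_id, ip)"
deduped = conn.execute(PAIR_QUERY.format(src=dedup_src)).fetchall()

print(raw)      # [(1, 2, 4), (1, 3, 3), (2, 3, 3)]
print(deduped)  # [(1, 2, 3), (2, 3, 3)]
```

Without deduplication, users 1 and 3 appear to share three IPs even though they share only two (.101 and .122), so the pair (1, 3) would be reported incorrectly.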

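2) Self-join on IP address; keeping only rows where t1.user_id < t2.user_id removes both the mirrored pairs and the self-pairs.
with tmp as
(
    select
        user_id,
        ip
    from t_login_log_032
    group by user_id, ip
)
select
    t1.user_id as user_id_1,
    t2.user_id as user_id_2,
    t1.ip
from
    tmp as t1
join
    tmp as t2
on t1.ip = t2.ip
where t1.user_id < t2.user_id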
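
The effect of the `t1.user_id < t2.user_id` filter can be checked by counting the rows the self-join produces under different conditions. This is a small sketch under assumptions: SQLite (via Python's sqlite3) stands in for Hive, the deduplicated rows are loaded into a plain table named `tmp`, and `count_join_rows` is a helper invented for this example.

```python
import sqlite3

# Distinct (user_id, ip) rows, i.e. what the tmp CTE would contain.
TMP_ROWS = [
    (1, "223.104.41.101"), (1, "223.104.41.121"), (1, "223.104.41.104"),
    (1, "223.104.41.122"),
    (2, "223.104.41.101"), (2, "223.104.41.103"), (2, "223.104.41.104"),
    (2, "223.104.41.122"),
    (3, "223.104.41.103"), (3, "223.104.41.122"), (3, "223.104.41.101"),
    (4, "223.104.41.126"), (4, "223.104.41.122"),
    (5, "223.104.41.126"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmp (user_id INTEGER, ip TEXT)")
conn.executemany("INSERT INTO tmp VALUES (?, ?)", TMP_ROWS)


def count_join_rows(condition):
    """Count self-join rows that satisfy the given WHERE condition."""
    sql = (
        "select count(*) from tmp as t1 "
        "join tmp as t2 on t1.ip = t2.ip where " + condition
    )
    return conn.execute(sql).fetchone()[0]


all_rows = count_join_rows("1 = 1")                     # no filter
no_self = count_join_rows("t1.user_id <> t2.user_id")   # self-pairs removed
ordered = count_join_rows("t1.user_id < t2.user_id")    # mirrored pairs removed too

print(all_rows, no_self, ordered)  # 38 24 12
```

Of the 38 joined rows, 14 are self-pairs; the remaining 24 come in mirrored pairs, so the `<` condition keeps exactly half of them.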

3) Count the shared IPs for each user pair
with tmp as
(
    select
        user_id,
        ip
    from t_login_log_032
    group by user_id, ip
)
select
    t1.user_id as user_id_1,
    t2.user_id as user_id_2,
    count(t1.ip) as shared_ip_cnt
from
    tmp as t1
join
    tmp as t2
on t1.ip = t2.ip
where t1.user_id < t2.user_id
group by t1.user_id,
    t2.user_id

4) Find the user pairs that have shared 3 or more IPs
with tmp as
(
    select
        user_id,
        ip
    from t_login_log_032
    group by user_id, ip
)
select
    t1.user_id as user_id_1,
    t2.user_id as user_id_2
from
    tmp as t1
join
    tmp as t2
on t1.ip = t2.ip
where t1.user_id < t2.user_id
group by t1.user_id,
    t2.user_id
having count(t1.ip) >= 3
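
As a sanity check, the final query can be run end to end against the sample data. The sketch below is an assumption-laden harness rather than the Hive setup itself: it uses Python's sqlite3 module with a simplified schema, adds an `order by` for deterministic output, and wraps the query in a hypothetical `shared_pairs` function whose threshold is a bound parameter.

```python
import sqlite3

# (user_id, ip) rows from the setup script, including the duplicate login.
ROWS = [
    (1, "223.104.41.101"), (1, "223.104.41.121"), (1, "223.104.41.104"),
    (1, "223.104.41.122"), (1, "223.104.41.122"),
    (2, "223.104.41.101"), (2, "223.104.41.103"), (2, "223.104.41.104"),
    (2, "223.104.41.122"),
    (3, "223.104.41.103"), (3, "223.104.41.122"), (3, "223.104.41.101"),
    (4, "223.104.41.126"), (5, "223.104.41.126"), (4, "223.104.41.122"),
]

# Assumption: SQLite in memory instead of Hive, with a simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_login_log_032 (user_id INTEGER, ip TEXT)")
conn.executemany("INSERT INTO t_login_log_032 VALUES (?, ?)", ROWS)

# Same shape as the final query; the threshold is passed in as "?".
FINAL_QUERY = """
with tmp as
(
    select user_id, ip
    from t_login_log_032
    group by user_id, ip
)
select t1.user_id, t2.user_id
from tmp as t1
join tmp as t2 on t1.ip = t2.ip
where t1.user_id < t2.user_id
group by t1.user_id, t2.user_id
having count(t1.ip) >= ?
order by t1.user_id, t2.user_id
"""


def shared_pairs(min_shared):
    """Return the user pairs sharing at least min_shared distinct IPs."""
    return conn.execute(FINAL_QUERY, (min_shared,)).fetchall()


print(shared_pairs(3))  # [(1, 2), (2, 3)]
```

With a threshold of 3, only the pairs (1, 2) and (2, 3) qualify; lowering it to 2 would also admit (1, 3).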

-- Create table
CREATE TABLE t_login_log_032 (
    user_id bigint COMMENT 'User ID',
    ip string COMMENT 'Login IP address',
    time_stamp string COMMENT 'Login time'
) COMMENT 'User login log table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
;
-- Insert data
insert into t_login_log_032(user_id,ip,time_stamp)
values
(1,'223.104.41.101','2023-08-24 16:00:00'),
(1,'223.104.41.121','2023-08-24 17:00:00'),
(1,'223.104.41.104','2023-08-24 19:00:00'),
(1,'223.104.41.122','2023-08-24 21:00:00'),
(1,'223.104.41.122','2023-08-24 22:00:00'),
(2,'223.104.41.101','2023-08-24 07:00:00'),
(2,'223.104.41.103','2023-08-24 19:00:00'),
(2,'223.104.41.104','2023-08-24 16:30:00'),
(2,'223.104.41.122','2023-08-24 17:05:00'),
(3,'223.104.41.103','2023-08-24 18:11:00'),
(3,'223.104.41.122','2023-08-24 19:07:00'),
(3,'223.104.41.101','2023-08-24 16:02:00'),
(4,'223.104.41.126','2023-08-24 13:00:00'),
(5,'223.104.41.126','2023-08-24 11:00:00'),
(4,'223.104.41.122','2023-08-24 10:00:00');