每日HiveSQL_求解运动员最大连胜的次数_15
2024-01-03 13:30:13
1.现需要从运动员比赛结果表中统计每个运动员最大连胜的次数
需求结果:
2.所用到的表和数据
--表创建
CREATE TABLE athlete_results
(
athlete_id INT,
match_time TIMESTAMP,
result VARCHAR(10) -- 'win', 'lose', 'draw'
);
--数据装载
INSERT INTO athlete_results
VALUES (1, '2023-11-01', 'win'),
(1, '2023-11-02', 'win'),
(1, '2023-11-03', 'lose'),
(1, '2023-11-04', 'win'),
(1, '2023-11-05', 'draw'),
(1, '2023-11-06', 'win'),
(1, '2023-11-07', 'win'),
(1, '2023-11-08', 'win'),
(2, '2023-11-01', 'win'),
(2, '2023-11-02', 'lose'),
(2, '2023-11-03', 'draw'),
(2, '2023-11-04', 'win'),
(2, '2023-11-05', 'win'),
(2, '2023-11-06', 'win');
3.答案
解析:由结果可知,此结果是对每个运动员进行分组得到的每个运动员的最大连胜数,此题关键在于如何根据特定的分组条件进行分组,得到每个可以代表运动员和比赛结果的分组字段。
思路一:
首先使用row_number()开窗根据运动员id进行分区,日期进行排序得到分组字段rn1,row_number()开窗根据运动员id和比赛结果进行分区,日期进行排序得到分组字段rn2,如下图rn1 - rn2可以唯一表述每个运动员在一次连续胜利、失败、平局的分组字段。
3.1?使用row_number()开两个窗,一个根据运动员id分区,日期升序排序,得到group1,另一个是根据运动员id和比赛结果进行分区,日期升序排序,得到group2
select athlete_id,
to_date(match_time) match_time,
result,
row_number() over (partition by athlete_id order by to_date(match_time)) group1,
row_number() over (partition by athlete_id, result order by to_date(match_time)) group2
from athlete_results;
运行结果:?
3.2筛选出比赛结果为”win“,然后再根据运动员id和win_group(group1 - group2)分组,count(*)得到运动员每个分组阶段的连胜次数
select athlete_id,
group1 - group2 win_group,
count(*) win_count
from (
select athlete_id,
to_date(match_time) match_time,
result,
row_number() over (partition by athlete_id order by to_date(match_time)) group1,
row_number() over (partition by athlete_id, result order by to_date(match_time)) group2
from athlete_results
) t
where result = 'win'
group by athlete_id, group1 - group2;
运行结果:
3.3最后根据以上运行结果根据运动员id进行分组得到最终结果
select athlete_id,
max(win_count) max_win_count
from (
select athlete_id,
group1 - group2 win_group,
count(*) win_count
from (
select athlete_id,
to_date(match_time) match_time,
result,
row_number() over (partition by athlete_id order by to_date(match_time)) group1,
row_number() over (partition by athlete_id, result order by to_date(match_time)) group2
from athlete_results
) t
where result = 'win'
group by athlete_id, group1 - group2
) t
group by athlete_id;
运行结果:?
思路二:?
思考打破连续胜利的条件,仔细思考可知,一个运动员每次失败或者平局都会开启一个新的分组,怎么描述这个分组呢?很容易可以想到sum(if())结构,如下图?
3.1使用sum(if())开窗得到分组字段win_group
select athlete_id,
to_date(match_time) match_day,
result,
sum(if(result = 'lose' or result = 'draw', 1, 0))
over (partition by athlete_id order by to_date(match_time)) win_group
from athlete_results;
运行结果:
3.2根据以上结果筛选出result = ‘win’的结果,之后进行根据运动员id和win_group分组,count(*)得到连胜的次数。
select athlete_id,
count(*) win_count
from (
select athlete_id,
result,
sum(if(result = 'lose' or result = 'draw', 1, 0))
over (partition by athlete_id order by match_time) win_group
from athlete_results) t
where result = 'win'
group by athlete_id, win_group;
运行结果:
3.3根据运动员id进行分组,得到最终结果
select athlete_id,
max(win_count) max_win_count
from (
select athlete_id,
count(*) win_count
from (
select athlete_id,
result,
sum(if(result = 'lose' or result = 'draw', 1, 0))
over (partition by athlete_id order by match_time) win_group
from athlete_results) t
where result = 'win'
group by athlete_id, win_group
) t
group by athlete_id;
运行结果:
文章来源:https://blog.csdn.net/m0_64893323/article/details/135359822
本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。 如若内容造成侵权/违法违规/事实不符,请联系我的编程经验分享网邮箱:veading@qq.com进行投诉反馈,一经查实,立即删除!
本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。 如若内容造成侵权/违法违规/事实不符,请联系我的编程经验分享网邮箱:veading@qq.com进行投诉反馈,一经查实,立即删除!