MYSQL窗口函数详解和实战(内含示例)

2023-12-14 01:13:06

????????MySQL 窗口函数提供了跨与当前行关联的分组的计算能力。 窗口函数与聚合函数不同,聚合函数对每个分组进行计算并为每个分组返回一行,而窗口函数不会将每个分组输出到一行,而是将每个分组的计算结果合并到与之关联的行中。

目录

一、CUME_DIST 取累积分布

二、DENSE_RANK?分区内的排名

三、FIRST_VALUE 评估值(第一行)

四、LAST_VALUE?评估值(最后行)

五、NTH_VALUE?评估值(指定行)

六、PERCENT_RANK 取分区相对排名

七、ROW_NUMBER 取分区序号

八、LAG

九、LEAD

十、NTILE

十一、RANK


一、CUME_DIST 取累积分布

MySQL CUME_DIST() 函数返回当前行的累积分布,即从第一行到与当前行值相同的最后一行的行数在分区内的总行数中的占比。

语法:

CUME_DIST()
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

参数说明:

  • partition_column_list:参与分区的列的列表。
  • partition_column_list:参与排序的列的列表。

返回值:

  • 一个大于 0 并且小于等于 1 的值,它是当前行的累积分布。
  • 计算公式为:(当前行之前的行数 + 与当前行值相同的行数) / 分区内的总行数。

示例:

mysql> select * from grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
|  1 | Tim  | A     | Math    |     9 |
|  2 | Tom  | A     | Math    |     7 |
|  3 | Jim  | A     | Math    |     8 |
|  4 | Tim  | A     | English |     7 |
|  5 | Tom  | A     | English |     8 |
|  6 | Jim  | A     | English |     7 |
|  7 | Lucy | B     | Math    |     8 |
|  8 | Jody | B     | Math    |     6 |
|  9 | Susy | B     | Math    |     9 |
| 10 | Lucy | B     | English |     6 |
| 11 | Jody | B     | English |     7 |
| 12 | Susy | B     | English |     8 |
+----+------+-------+---------+-------+
mysql> SELECT *,
  CUME_DIST() OVER (
    PARTITION BY subject
    ORDER BY grade
  ) "cume_dist"
FROM grade;
+----+------+-------+---------+-------+---------------------+
| id | name | class | subject | grade | cume_dist           |
+----+------+-------+---------+-------+---------------------+
| 10 | Lucy | B     | English |     6 | 0.16666666666666666 |
|  4 | Tim  | A     | English |     7 |  0.6666666666666666 |
|  6 | Jim  | A     | English |     7 |  0.6666666666666666 |
| 11 | Jody | B     | English |     7 |  0.6666666666666666 |
|  5 | Tom  | A     | English |     8 |                   1 |
| 12 | Susy | B     | English |     8 |                   1 |
|  8 | Jody | B     | Math    |     6 | 0.16666666666666666 |
|  2 | Tom  | A     | Math    |     7 |  0.3333333333333333 |
|  3 | Jim  | A     | Math    |     8 |  0.6666666666666666 |
|  7 | Lucy | B     | Math    |     8 |  0.6666666666666666 |
|  1 | Tim  | A     | Math    |     9 |                   1 |
|  9 | Susy | B     | Math    |     9 |                   1 |
+----+------+-------+---------+-------+---------------------+

在?OVER?子句中:

  • PARTITION BY subject?将所有行按照科目进行分区。
  • ORDER BY grade 将每个分区内的行按照成绩升序排列。
  • CUME_DIST() 计算每行的累积分布。

注意,在每个科目内,成绩相同的行具有相同的累积分布。

二、DENSE_RANK?分区内的排名

MySQL DENSE_RANK() 函数返回当前行所在的分区内的排名,从 1 开始,但没有间隔。

语法:

DENSE_RANK()
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

参数说明:

  • partition_column_list参与分区的列的列表。
  • order_column_list参与排序的列的列表。

返回值:

  • 当前行所在的分区内的排名,从 1 开始,但没有间隔。

示例:

mysql> select * from grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
|  1 | Tim  | A     | Math    |     9 |
|  2 | Tom  | A     | Math    |     7 |
|  3 | Jim  | A     | Math    |     8 |
|  4 | Tim  | A     | English |     7 |
|  5 | Tom  | A     | English |     8 |
|  6 | Jim  | A     | English |     7 |
|  7 | Lucy | B     | Math    |     8 |
|  8 | Jody | B     | Math    |     6 |
|  9 | Susy | B     | Math    |     9 |
| 10 | Lucy | B     | English |     6 |
| 11 | Jody | B     | English |     7 |
| 12 | Susy | B     | English |     8 |
+----+------+-------+---------+-------+
  • 按照科目查看每个学生的成绩排名?
mysql> SELECT *,
  DENSE_RANK() OVER (
    PARTITION BY subject
    ORDER BY grade DESC
  ) "dense_rank",
  rank() OVER (
    PARTITION BY subject
    ORDER BY grade DESC
  ) "rank"
FROM grade;
+----+------+-------+---------+-------+------------+------+
| id | name | class | subject | grade | dense_rank | rank |
+----+------+-------+---------+-------+------------+------+
|  5 | Tom  | A     | English |     8 |          1 |    1 |
| 12 | Susy | B     | English |     8 |          1 |    1 |
|  4 | Tim  | A     | English |     7 |          2 |    3 |
|  6 | Jim  | A     | English |     7 |          2 |    3 |
| 11 | Jody | B     | English |     7 |          2 |    3 |
| 10 | Lucy | B     | English |     6 |          3 |    6 |
|  1 | Tim  | A     | Math    |     9 |          1 |    1 |
|  9 | Susy | B     | Math    |     9 |          1 |    1 |
|  3 | Jim  | A     | Math    |     8 |          2 |    3 |
|  7 | Lucy | B     | Math    |     8 |          2 |    3 |
|  2 | Tom  | A     | Math    |     7 |          3 |    5 |
|  8 | Jody | B     | Math    |     6 |          4 |    6 |
+----+------+-------+---------+-------+------------+------+

在?OVER?子句中:

  • PARTITION BY subject 将按照学科进行分区。
  • ORDER BY grade DESC 将每个分区内的行按照成绩逆序排列。
  • DENSE_RANK() 返回每行在其关联的分区内的排名。

注意,在每个科目内,成绩相同的行具有相同的累积分布。

  • 按照班级查看每个学生的总成绩排名,要查看在每个班级中每个学生按照总成绩从高到底的排名。

mysql> SELECT 
  t.*,
  DENSE_RANK() OVER (
    PARTITION BY class
    ORDER BY t.sum_grade DESC
  ) "dense_rank"
FROM (
    SELECT class,
      name,
      sum(grade) sum_grade
    FROM grade
    GROUP BY class, name
  ) t;
+-------+------+-----------+------------+
| class | name | sum_grade | dense_rank |
+-------+------+-----------+------------+
| A     | Tim  | 16        |          1 |
| A     | Tom  | 15        |          2 |
| A     | Jim  | 15        |          2 |
| B     | Susy | 17        |          1 |
| B     | Lucy | 14        |          2 |
| B     | Jody | 13        |          3 |
+-------+------+-----------+------------+
  • 其中子查询
mysql> SELECT class,
  name,
  sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A     | Tim  | 16        |
| A     | Tom  | 15        |
| A     | Jim  | 15        |
| B     | Lucy | 14        |
| B     | Jody | 13        |
| B     | Susy | 17        |
+-------+------+-----------+

????????主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用?DENSE_RANK()?返回每行在其关联的分区内的排名。?

三、FIRST_VALUE 评估值(第一行)

MySQL FIRST_VALUE() 函数从当前行关联的窗口框架的第一行中返回评估的值。

语法:

FIRST_VALUE(expr)
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

参数说明:

  • expr:必需的。它可以是一个列名或者表达式。
  • partition_column_list:参与分区的列的列表。
  • order_column_list:参与排序的列的列表。

返回值:

  • 当前行关联的窗口框架的第一行中返回评估的值。

示例:

mysql> SELECT * FROM grade;

+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
|  1 | Tim  | A     | Math    |     9 |
|  2 | Tom  | A     | Math    |     7 |
|  3 | Jim  | A     | Math    |     8 |
|  4 | Tim  | A     | English |     7 |
|  5 | Tom  | A     | English |     8 |
|  6 | Jim  | A     | English |     7 |
|  7 | Lucy | B     | Math    |     8 |
|  8 | Jody | B     | Math    |     6 |
|  9 | Susy | B     | Math    |     9 |
| 10 | Lucy | B     | English |     6 |
| 11 | Jody | B     | English |     7 |
| 12 | Susy | B     | English |     8 |
+----+------+-------+---------+-------+
  • 按照科目查看每个学生的成绩排名,要查看在每个科目中每个学生按照成绩从高到低的排序号和每个科目中的最好成绩。
mysql> SELECT 
  *,
  FIRST_VALUE(grade) OVER (
    PARTITION BY subject
    ORDER BY grade DESC
    RANGE BETWEEN 
      UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) first_grade
FROM grade;
+----+------+-------+---------+-------+-------------+
| id | name | class | subject | grade | first_grade |
+----+------+-------+---------+-------+-------------+
|  5 | Tom  | A     | English |     8 |           8 |
| 12 | Susy | B     | English |     8 |           8 |
|  4 | Tim  | A     | English |     7 |           8 |
|  6 | Jim  | A     | English |     7 |           8 |
| 11 | Jody | B     | English |     7 |           8 |
| 10 | Lucy | B     | English |     6 |           8 |
|  1 | Tim  | A     | Math    |     9 |           9 |
|  9 | Susy | B     | Math    |     9 |           9 |
|  3 | Jim  | A     | Math    |     8 |           9 |
|  7 | Lucy | B     | Math    |     8 |           9 |
|  2 | Tom  | A     | Math    |     7 |           9 |
|  8 | Jody | B     | Math    |     6 |           9 |
+----+------+-------+---------+-------+-------------+
  • 窗口函数:
FIRST_VALUE(grade) OVER (
  PARTITION BY subject
  ORDER BY grade DESC
  RANGE BETWEEN 
    UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)

在 OVER 子句中:

  • PARTITION BY subject 将按照学科进行分区
  • ORDER BY grade DESC 将每个分区内的行按照成绩逆序排列。
  • RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING 定义了每个分区中与当前行相关的窗口框架,这里是从分区的第一行到最后一行。
  • FIRST_VALUE(grade) 返回每行在其关联的分区的第一行的成绩(grade)。因为是按照 grade 逆序排序,于是每行中添加了一列显示最好的成绩。?
  • 按照班级查看每个学生的总成绩排名,要查看在每个班级中每个学生按照总成绩从高到底的排序号和每个班级的最好成绩。
mysql> SELECT 
  t.*,
  FIRST_VALUE(t.sum_grade) OVER (
    PARTITION BY class
    ORDER BY t.sum_grade DESC
    RANGE BETWEEN 
      UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) first_grade
FROM (
    SELECT class,
      name,
      sum(grade) sum_grade
    FROM grade
    GROUP BY class, name
  ) t;
+-------+------+-----------+-------------+
| class | name | sum_grade | first_grade |
+-------+------+-----------+-------------+
| A     | Tim  | 16        | 16          |
| A     | Tom  | 15        | 16          |
| A     | Jim  | 15        | 16          |
| B     | Susy | 17        | 17          |
| B     | Lucy | 14        | 17          |
| B     | Jody | 13        | 17          |
+-------+------+-----------+-------------+
  • 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。
mysql> SELECT class,
  name,
  sum(grade) sum_grade
FROM grade
GROUP BY class, name;

+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A     | Tim  | 16        |
| A     | Tom  | 15        |
| A     | Jim  | 15        |
| B     | Lucy | 14        |
| B     | Jody | 13        |
| B     | Susy | 17        |
+-------+------+-----------+

????????主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用?FIRST_VALUE(grade)?返回每行在其关联的分区内第一行中的成绩。

四、LAST_VALUE?评估值(最后行)

MySQL LAST_VALUE() 函数从当前行关联的窗口框架的最后一行中返回评估的值。

语法:

LAST_VALUE(expr)
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

参数说明:

  • expr:必需的。它可以是一个列名或者表达式。
  • partition_column_list:参与分区的列的列表。
  • order_column_list:参与排序的列的列表。

返回值:

  • 从当前行关联的窗口框架的最后一行中返回评估的值。

示例:

mysql> SELECT * FROM grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
|  1 | Tim  | A     | Math    |     9 |
|  2 | Tom  | A     | Math    |     7 |
|  3 | Jim  | A     | Math    |     8 |
|  4 | Tim  | A     | English |     7 |
|  5 | Tom  | A     | English |     8 |
|  6 | Jim  | A     | English |     7 |
|  7 | Lucy | B     | Math    |     8 |
|  8 | Jody | B     | Math    |     6 |
|  9 | Susy | B     | Math    |     9 |
| 10 | Lucy | B     | English |     6 |
| 11 | Jody | B     | English |     7 |
| 12 | Susy | B     | English |     8 |
+----+------+-------+---------+-------+
  • 按照科目查看每个学生的成绩排名,要查看在每个科目中每个学生按照成绩从高到低的排序号和每个科目中的最好成绩。?
mysql> SELECT 
  *,
  LAST_VALUE(grade) OVER (
    PARTITION BY subject
    ORDER BY grade DESC
    RANGE BETWEEN 
      UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) last_grade
FROM grade;

+----+------+-------+---------+-------+------------+
| id | name | class | subject | grade | last_grade |
+----+------+-------+---------+-------+------------+
|  5 | Tom  | A     | English |     8 |          6 |
| 12 | Susy | B     | English |     8 |          6 |
|  4 | Tim  | A     | English |     7 |          6 |
|  6 | Jim  | A     | English |     7 |          6 |
| 11 | Jody | B     | English |     7 |          6 |
| 10 | Lucy | B     | English |     6 |          6 |
|  1 | Tim  | A     | Math    |     9 |          6 |
|  9 | Susy | B     | Math    |     9 |          6 |
|  3 | Jim  | A     | Math    |     8 |          6 |
|  7 | Lucy | B     | Math    |     8 |          6 |
|  2 | Tom  | A     | Math    |     7 |          6 |
|  8 | Jody | B     | Math    |     6 |          6 |
+----+------+-------+---------+-------+------------+
  • 窗口函数:
LAST_VALUE(grade) OVER (
  PARTITION BY subject
  ORDER BY grade DESC
  RANGE BETWEEN 
    UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)

在 OVER 子句中:

  • PARTITION BY subject 将按照学科进行分区。
  • ORDER BY grade DESC 将每个分区内的行按照成绩逆序排列。
  • RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING 定义了每个分区中与当前行相关的窗口框架,这里是从分区的第一行到最后一行。
  • LAST_VALUE(grade) 返回每行在其关联的分区的第一行的成绩(grade)。因为是按照 grade 逆序排序,于是每行中添加了一列显示最差的成绩。?
  • 按照班级查看每个学生的总成绩排名,要查看在每个班级中每个学生按照总成绩从高到底的排序号和每个班级的最好成绩。
mysql> SELECT 
  t.*,
  LAST_VALUE(t.sum_grade) OVER (
    PARTITION BY class
    ORDER BY t.sum_grade DESC
    RANGE BETWEEN 
      UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) last_grade
FROM (
    SELECT class,
      name,
      sum(grade) sum_grade
    FROM grade
    GROUP BY class, name
  ) t;
+-------+------+-----------+------------+
| class | name | sum_grade | last_grade |
+-------+------+-----------+------------+
| A     | Tim  | 16        | 15         |
| A     | Tom  | 15        | 15         |
| A     | Jim  | 15        | 15         |
| B     | Susy | 17        | 13         |
| B     | Lucy | 14        | 13         |
| B     | Jody | 13        | 13         |
+-------+------+-----------+------------+
  • 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。
mysql> SELECT class,
  name,
  sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A     | Tim  | 16        |
| A     | Tom  | 15        |
| A     | Jim  | 15        |
| B     | Lucy | 14        |
| B     | Jody | 13        |
| B     | Susy | 17        |
+-------+------+-----------+

????????主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用 LAST_VALUE(grade) 返回每行在其关联的分区内最后一行中的成绩。

五、NTH_VALUE?评估值(指定行)

MySQL NTH_VALUE() 函数从当前行关联的窗口框架的指定的一行中返回评估的值

语法:

NTH_VALUE(expr, n)
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

????????NTH_VALUE(expr, 1) 等效于 first_value(expr)。?

参数说明:

  • expr:必需的。它可以是一个列名或者表达式。
  • n:必需的。指定行的编号。
  • partition_column_list:参与分区的列的列表。
  • order_column_list:参与排序的列的列表。

返回值:

  • 从当前行关联的窗口框架的指定的一行中返回评估的值。
  • 返回?NULL情况:指定的行不存在。

示例:

mysql> SELECT * FROM grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
|  1 | Tim  | A     | Math    |     9 |
|  2 | Tom  | A     | Math    |     7 |
|  3 | Jim  | A     | Math    |     8 |
|  4 | Tim  | A     | English |     7 |
|  5 | Tom  | A     | English |     8 |
|  6 | Jim  | A     | English |     7 |
|  7 | Lucy | B     | Math    |     8 |
|  8 | Jody | B     | Math    |     6 |
|  9 | Susy | B     | Math    |     9 |
| 10 | Lucy | B     | English |     6 |
| 11 | Jody | B     | English |     7 |
| 12 | Susy | B     | English |     8 |
+----+------+-------+---------+-------+
  • 按照科目查看每个学生的成绩排名,要查看在每个科目中每个学生按照成绩从高到低的排序号和每个科目中的最好成绩。
mysql> SELECT 
  *,
  NTH_VALUE(grade, 1) OVER (
    PARTITION BY subject
    ORDER BY grade DESC
    RANGE BETWEEN 
      UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) first_grade
FROM grade;
+----+------+-------+---------+-------+-------------+
| id | name | class | subject | grade | first_grade |
+----+------+-------+---------+-------+-------------+
|  5 | Tom  | A     | English |     8 |           8 |
| 12 | Susy | B     | English |     8 |           8 |
|  4 | Tim  | A     | English |     7 |           8 |
|  6 | Jim  | A     | English |     7 |           8 |
| 11 | Jody | B     | English |     7 |           8 |
| 10 | Lucy | B     | English |     6 |           8 |
|  1 | Tim  | A     | Math    |     9 |           9 |
|  9 | Susy | B     | Math    |     9 |           9 |
|  3 | Jim  | A     | Math    |     8 |           9 |
|  7 | Lucy | B     | Math    |     8 |           9 |
|  2 | Tom  | A     | Math    |     7 |           9 |
|  8 | Jody | B     | Math    |     6 |           9 |
+----+------+-------+---------+-------+-------------+
  • 窗口函数:
NTH_VALUE(grade, 1) OVER (
  PARTITION BY subject
  ORDER BY grade DESC
  RANGE BETWEEN 
    UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)

在?OVER?子句中:

  • PARTITION BY subject?将按照学科进行分区
  • ORDER BY grade DESC?将每个分区内的行按照成绩逆序排列。
  • RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING?定义了每个分区中与当前行相关的窗口框架,这里是从分区的第一行到最后一行。
  • NTH_VALUE(grade, 1)?返回每行在其关联的分区的第一行的成绩(grade)。因为是按照?grade?逆序排序,于是每行中添加了一列显示最好的成绩。

????????如果想在每行中显示每个科目中最差的成绩,可以使用?NTH_VALUE(grade, 3)?替换上面语句中的?NTH_VALUE(grade, 1)。?

  • 按照班级查看每个学生的总成绩排名,要查看在每个班级中每个学生按照总成绩从高到底的排序号和每个班级的最好成绩。
mysql> SELECT 
  t.*,
  NTH_VALUE(t.sum_grade, 1) OVER (
    PARTITION BY class
    ORDER BY t.sum_grade DESC
    RANGE BETWEEN 
      UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) first_grade
FROM (
    SELECT class,
      name,
      sum(grade) sum_grade
    FROM grade
    GROUP BY class, name
  ) t;
+-------+------+-----------+-------------+
| class | name | sum_grade | first_grade |
+-------+------+-----------+-------------+
| A     | Tim  | 16        | 16          |
| A     | Tom  | 15        | 16          |
| A     | Jim  | 15        | 16          |
| B     | Susy | 17        | 17          |
| B     | Lucy | 14        | 17          |
| B     | Jody | 13        | 17          |
+-------+------+-----------+-------------+
  • 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。
mysql> SELECT class,
  name,
  sum(grade) sum_grade
FROM grade
GROUP BY class, name;

+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A     | Tim  | 16        |
| A     | Tom  | 15        |
| A     | Jim  | 15        |
| B     | Lucy | 14        |
| B     | Jody | 13        |
| B     | Susy | 17        |
+-------+------+-----------+

????????主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用?NTH_VALUE(t.sum_grade, 1)?返回每行在其关联的分区内第一行中的成绩。

六、PERCENT_RANK 取分区相对排名

MySQL PERCENT_RANK() 函数返回当前行所在的分区内的相对排名,也就是 (rank() - 1) / (分区总行数 - 1)

语法:

PERCENT_RANK()
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

参数说明:

  • partition_column_list:参与分区的列的列表。
  • order_column_list:参与排序的列的列表。

返回值:

  • 一个介于 0 和 1 (包含) 之间的数,表示当前行所在的分区内的相对排名,其计算公式为是?(rank() - 1) / (分区总行数 - 1)

示例:

mysql> SELECT * FROM grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
|  1 | Tim  | A     | Math    |     9 |
|  2 | Tom  | A     | Math    |     7 |
|  3 | Jim  | A     | Math    |     8 |
|  4 | Tim  | A     | English |     7 |
|  5 | Tom  | A     | English |     8 |
|  6 | Jim  | A     | English |     7 |
|  7 | Lucy | B     | Math    |     8 |
|  8 | Jody | B     | Math    |     6 |
|  9 | Susy | B     | Math    |     9 |
| 10 | Lucy | B     | English |     6 |
| 11 | Jody | B     | English |     7 |
| 12 | Susy | B     | English |     8 |
+----+------+-------+---------+-------+
  • 按照科目查看每个学生的成绩的相对排名,要查看在每个科目中每个学生按照成绩从高到低的相对排名。
mysql> SELECT *,
  rank() OVER (
    PARTITION BY subject
    ORDER BY grade DESC
  ) "rank",
  PERCENT_RANK() OVER (
    PARTITION BY subject
    ORDER BY grade DESC
  ) "percent_rank"
FROM grade;
+----+------+-------+---------+-------+------+--------------+
| id | name | class | subject | grade | rank | percent_rank |
+----+------+-------+---------+-------+------+--------------+
|  5 | Tom  | A     | English |     8 |    1 |            0 |
| 12 | Susy | B     | English |     8 |    1 |            0 |
|  4 | Tim  | A     | English |     7 |    3 |          0.4 |
|  6 | Jim  | A     | English |     7 |    3 |          0.4 |
| 11 | Jody | B     | English |     7 |    3 |          0.4 |
| 10 | Lucy | B     | English |     6 |    6 |            1 |
|  1 | Tim  | A     | Math    |     9 |    1 |            0 |
|  9 | Susy | B     | Math    |     9 |    1 |            0 |
|  3 | Jim  | A     | Math    |     8 |    3 |          0.4 |
|  7 | Lucy | B     | Math    |     8 |    3 |          0.4 |
|  2 | Tom  | A     | Math    |     7 |    5 |          0.8 |
|  8 | Jody | B     | Math    |     6 |    6 |            1 |
+----+------+-------+---------+-------+------+--------------+
  • 窗口函数:
PERCENT_RANK() OVER (
  PARTITION BY subject
  ORDER BY grade DESC
)

在?OVER?子句中,

  • PARTITION BY subject?将按照学科进行分区
  • ORDER BY grade DESC?将每个分区内的行按照成绩逆序排列。
  • PERCENT_RANK()?返回每行在其关联的分区内的相对排名。
  • 按照班级查看每个学生的总成绩的相对排名,要查看在每个班级中每个学生按照总成绩从高到底的相对排名。
mysql> SELECT t.*,
  PERCENT_RANK() OVER (
    PARTITION BY class
    ORDER BY t.sum_grade DESC
  ) "percent_rank"
FROM (
    SELECT class,
      name,
      sum(grade) sum_grade
    FROM grade
    GROUP BY class, name
  ) t;
+-------+------+-----------+--------------+
| class | name | sum_grade | percent_rank |
+-------+------+-----------+--------------+
| A     | Tim  | 16        |            0 |
| A     | Tom  | 15        |          0.5 |
| A     | Jim  | 15        |          0.5 |
| B     | Susy | 17        |            0 |
| B     | Lucy | 14        |          0.5 |
| B     | Jody | 13        |            1 |
+-------+------+-----------+--------------+
  • 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。
mysql> SELECT class,
  name,
  sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A     | Tim  | 16        |
| A     | Tom  | 15        |
| A     | Jim  | 15        |
| B     | Lucy | 14        |
| B     | Jody | 13        |
| B     | Susy | 17        |
+-------+------+-----------+

????????主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用?PERCENT_RANK()?返回每行在其关联的分区内的相对排名。

七、ROW_NUMBER 取分区序号

MySQL ROW_NUMBER() 函数返回当前行所在的分区内的序号,从 1 开始。

语法:

ROW_NUMBER()
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

参数说明:

  • partition_column_list:参与分区的列的列表。
  • order_column_list:参与排序的列的列表。

返回值:

  • 当前行所在的分区内的序号,从 1 开始。

示例:

mysql> SELECT * FROM grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
|  1 | Tim  | A     | Math    |     9 |
|  2 | Tom  | A     | Math    |     7 |
|  3 | Jim  | A     | Math    |     8 |
|  4 | Tim  | A     | English |     7 |
|  5 | Tom  | A     | English |     8 |
|  6 | Jim  | A     | English |     7 |
|  7 | Lucy | B     | Math    |     8 |
|  8 | Jody | B     | Math    |     6 |
|  9 | Susy | B     | Math    |     9 |
| 10 | Lucy | B     | English |     6 |
| 11 | Jody | B     | English |     7 |
| 12 | Susy | B     | English |     8 |
+----+------+-------+---------+-------+
  • 按照科目查看每个学生的成绩的排序号,要查看在每个科目中每个学生按照成绩从高到低的排序号。?
mysql> SELECT *,
  ROW_NUMBER() OVER (
    PARTITION BY subject
    ORDER BY grade DESC
  ) "row_number"
FROM grade;
+----+------+-------+---------+-------+------------+
| id | name | class | subject | grade | row_number |
+----+------+-------+---------+-------+------------+
|  5 | Tom  | A     | English |     8 |          1 |
| 12 | Susy | B     | English |     8 |          2 |
|  4 | Tim  | A     | English |     7 |          3 |
|  6 | Jim  | A     | English |     7 |          4 |
| 11 | Jody | B     | English |     7 |          5 |
| 10 | Lucy | B     | English |     6 |          6 |
|  1 | Tim  | A     | Math    |     9 |          1 |
|  9 | Susy | B     | Math    |     9 |          2 |
|  3 | Jim  | A     | Math    |     8 |          3 |
|  7 | Lucy | B     | Math    |     8 |          4 |
|  2 | Tom  | A     | Math    |     7 |          5 |
|  8 | Jody | B     | Math    |     6 |          6 |
+----+------+-------+---------+-------+------------+
  • 窗口函数:?
ROW_NUMBER() OVER (
  PARTITION BY subject
  ORDER BY grade DESC
)

在?OVER?子句中:

  • PARTITION BY subject?将按照学科进行分区
  • ORDER BY grade DESC?将每个分区内的行按照成绩逆序排列。
  • ROW_NUMBER()?返回每行在其关联的分区内的序号。
  • 按照班级查看每个学生的总成绩的排序号,要查看在每个班级中每个学生按照总成绩从高到底的排序号。
mysql> SELECT t.*,
  ROW_NUMBER() OVER (
    PARTITION BY class
    ORDER BY t.sum_grade DESC
  ) "row_number"
FROM (
    SELECT class,
      name,
      sum(grade) sum_grade
    FROM grade
    GROUP BY class, name
  ) t;
+-------+------+-----------+------------+
| class | name | sum_grade | row_number |
+-------+------+-----------+------------+
| A     | Tim  | 16        |          1 |
| A     | Tom  | 15        |          2 |
| A     | Jim  | 15        |          3 |
| B     | Susy | 17        |          1 |
| B     | Lucy | 14        |          2 |
| B     | Jody | 13        |          3 |
+-------+------+-----------+------------+
  • 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。
mysql> SELECT class,
  name,
  sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A     | Tim  | 16        |
| A     | Tom  | 15        |
| A     | Jim  | 15        |
| B     | Lucy | 14        |
| B     | Jody | 13        |
| B     | Susy | 17        |
+-------+------+-----------+

????????主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用?ROW_NUMBER()?返回每行在其关联的分区内的排序号。

八、LAG

MySQL LAG() 函数返回来自当前行所在的分区内当前行之前的指定行之内的值。

语法:

LAG(expr[, offset[, default]])
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

参数说明:

  • expr:必需的。它可以是一个列名或者表达式。
  • offset:可选的。相对于当前行的偏移的行数。默认值为 1。
  • default:可选的。它可以是一个列名或者表达式。
  • partition_column_list:参与分区的列的列表。
  • order_column_list:参与排序的列的列表。

返回值:

  • 来自当前行所在的分区内当前行之前的指定行之内的值。

示例:

mysql> SELECT * FROM revenue;
+----+------+---------+---------+
| id | year | quarter | revenue |
+----+------+---------+---------+
|  1 | 2020 | 1       |    3515 |
|  2 | 2020 | 2       |    3678 |
|  3 | 2020 | 3       |    4203 |
|  4 | 2020 | 4       |    3924 |
|  5 | 2021 | 1       |    3102 |
|  6 | 2021 | 2       |    3293 |
|  7 | 2021 | 3       |    3602 |
|  8 | 2021 | 4       |    2901 |
+----+------+---------+---------+
  • 使用 MySQL LAG() 函数和下一季度的收益比较,在使用 MySQL LAG() 函数在每行中添加 next_quarter_revenue 列以比较当前季度和下一季度的收益:
mysql> SELECT
  *,
  LAG(revenue, 1) OVER (
    PARTITION BY year
    ORDER BY quarter DESC
  ) next_quarter_revenue
FROM revenue;

+----+------+---------+---------+----------------------+
| id | year | quarter | revenue | next_quarter_revenue |
+----+------+---------+---------+----------------------+
|  4 | 2020 | 4       |    3924 | NULL                 |
|  3 | 2020 | 3       |    4203 |                 3924 |
|  2 | 2020 | 2       |    3678 |                 4203 |
|  1 | 2020 | 1       |    3515 |                 3678 |
|  8 | 2021 | 4       |    2901 | NULL                 |
|  7 | 2021 | 3       |    3602 |                 2901 |
|  6 | 2021 | 2       |    3293 |                 3602 |
|  5 | 2021 | 1       |    3102 |                 3293 |
+----+------+---------+---------+----------------------+
  • 窗口函数:
LAG(revenue, 1) OVER (
  PARTITION BY year
  ORDER BY quarter DESC
) next_quarter_revenue

在?OVER?子句中:

  • PARTITION BY year?将所有行按照年份进行分区
  • ORDER BY quarter DESC?将每个分区内的行按照季度逆序排列。
  • LAG(revenue, 1)?返回每行在其关联的分区内之前一行(1)中的收益(revenue)。

????????那么在?next_quarter_revenue?列中存放的就是当前行的下一个季度的收益。所以,每个分区的第一行中的?next_quarter_revenue?列的值为 null。

  • 使用默认值:

mysql> SELECT
  *,
  LAG(revenue, 1, 0) OVER (
    PARTITION BY year
    ORDER BY quarter DESC
  ) next_quarter_revenue
FROM revenue;
+----+------+---------+---------+----------------------+
| id | year | quarter | revenue | next_quarter_revenue |
+----+------+---------+---------+----------------------+
|  4 | 2020 | 4       |    3924 |                    0 |
|  3 | 2020 | 3       |    4203 |                 3924 |
|  2 | 2020 | 2       |    3678 |                 4203 |
|  1 | 2020 | 1       |    3515 |                 3678 |
|  8 | 2021 | 4       |    2901 |                    0 |
|  7 | 2021 | 3       |    3602 |                 2901 |
|  6 | 2021 | 2       |    3293 |                 3602 |
|  5 | 2021 | 1       |    3102 |                 3293 |
+----+------+---------+---------+----------------------+

九、LEAD

MySQL LEAD() 函数返回来自当前行所在的分区内当前行之后的指定行之内的行的值。

语法:

LEAD(expr[, offset[, default]])
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

参数说明:

  • expr:必需的。它可以是一个列名或者表达式。
  • offset:可选的。相对于当前行的偏移的行数。默认值为 1。
  • default:可选的。它可以是一个列名或者表达式。
  • partition_column_list:参与分区的列的列表。
  • order_column_list:参与排序的列的列表。

返回值:

  • 来自当前行所在的分区内当前行之后的指定行之内的行的值。

示例:

mysql> SELECT * FROM revenue;
+----+------+---------+---------+
| id | year | quarter | revenue |
+----+------+---------+---------+
|  1 | 2020 | 1       |    3515 |
|  2 | 2020 | 2       |    3678 |
|  3 | 2020 | 3       |    4203 |
|  4 | 2020 | 4       |    3924 |
|  5 | 2021 | 1       |    3102 |
|  6 | 2021 | 2       |    3293 |
|  7 | 2021 | 3       |    3602 |
|  8 | 2021 | 4       |    2901 |
+----+------+---------+---------+
  • 使用 MySQL LEAD() 函数和上一季度的收益比较,使用 MySQL LEAD() 函数在每行中添加 last_quarter_revenue 列以比较当前季度和上一季度的收益:
mysql> SELECT
  *,
  LEAD(revenue, 1) OVER (
    PARTITION BY year
    ORDER BY quarter DESC
  ) last_quarter_revenue
FROM revenue;

+----+------+---------+---------+----------------------+
| id | year | quarter | revenue | last_quarter_revenue |
+----+------+---------+---------+----------------------+
|  4 | 2020 | 4       |    3924 |                 4203 |
|  3 | 2020 | 3       |    4203 |                 3678 |
|  2 | 2020 | 2       |    3678 |                 3515 |
|  1 | 2020 | 1       |    3515 | NULL                 |
|  8 | 2021 | 4       |    2901 |                 3602 |
|  7 | 2021 | 3       |    3602 |                 3293 |
|  6 | 2021 | 2       |    3293 |                 3102 |
|  5 | 2021 | 1       |    3102 | NULL                 |
+----+------+---------+---------+----------------------+
  • 窗口函数:
LEAD(revenue, 1) OVER (
  PARTITION BY year
  ORDER BY quarter DESC
) last_quarter_revenue

在?OVER?子句中:

  • PARTITION BY year?将所有行按照年份进行分区
  • ORDER BY quarter DESC?将每个分区内的行按照季度逆序排列。
  • LEAD(revenue, 1)?返回每行在其关联的分区内之后一行(1)中的收益(revenue)。

????????那么在?last_quarter_revenue?列中存放的就是当前行的上一个季度的收益。所以,每个分区的最后一行中的?last_quarter_revenue?列的值为 null。

  • 使用默认值:
mysql> SELECT
  *,
  LEAD(revenue, 1, 0) OVER (
    PARTITION BY year
    ORDER BY quarter DESC
  ) last_quarter_revenue
FROM revenue;

+----+------+---------+---------+----------------------+
| id | year | quarter | revenue | last_quarter_revenue |
+----+------+---------+---------+----------------------+
|  4 | 2020 | 4       |    3924 |                 4203 |
|  3 | 2020 | 3       |    4203 |                 3678 |
|  2 | 2020 | 2       |    3678 |                 3515 |
|  1 | 2020 | 1       |    3515 |                    0 |
|  8 | 2021 | 4       |    2901 |                 3602 |
|  7 | 2021 | 3       |    3602 |                 3293 |
|  6 | 2021 | 2       |    3293 |                 3102 |
|  5 | 2021 | 1       |    3102 |                    0 |
+----+------+---------+---------+----------------------+

十、NTILE

MySQL NTILE() 函数将当前行所在的分区内的所有行尽可能平均的分成指定数量的区间,并返回当前行所在的区间编号。

每个区间, MySQL 称之为一个排名桶。?NTILE()?根据指定排序为每个桶指设定排名。

语法:

NTILE(buckets)
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

参数说明:

  • buckets:必需的。桶的数量。桶的数量最大为此分区内的行的数量。
  • partition_column_list:参与分区的列的列表。
  • order_column_list:参与排序的列的列表。

返回值:

  • 将当前行所在的分区内的所有行尽可能平均的分成指定数量的排名桶,并返回当前行所在的桶的排名。

示例:?

  • 下面语句展示使用?NTILE(3)?将他们按照升序分成 3 个桶,按照尽可能平均分配的原则,那么 1-3 的桶排名是 1, 4-6 的桶排名是 2, 7-9 的桶排名是 3。
mysql> SELECT
  x,
  NTILE(3) over (
    ORDER BY x
  ) "ntile"
FROM (
    SELECT 1 x
    UNION
    SELECT 2 x
    UNION
    SELECT 3 x
    UNION
    SELECT 4 x
    UNION
    SELECT 5 x
    UNION
    SELECT 6 x
    UNION
    SELECT 7 x
    UNION
    SELECT 8 x
    UNION
    SELECT 9 x
) t;
+---+-------+
| x | ntile |
+---+-------+
| 1 |     1 |
| 2 |     1 |
| 3 |     1 |
| 4 |     2 |
| 5 |     2 |
| 6 |     2 |
| 7 |     3 |
| 8 |     3 |
| 9 |     3 |
+---+-------+
mysql> SELECT * FROM revenue;
+----+------+---------+---------+
| id | year | quarter | revenue |
+----+------+---------+---------+
|  1 | 2020 | 1       |    3515 |
|  2 | 2020 | 2       |    3678 |
|  3 | 2020 | 3       |    4203 |
|  4 | 2020 | 4       |    3924 |
|  5 | 2021 | 1       |    3102 |
|  6 | 2021 | 2       |    3293 |
|  7 | 2021 | 3       |    3602 |
|  8 | 2021 | 4       |    2901 |
+----+------+---------+---------+
  • 使用 MySQL?NTILE()?函数将每年的收益按照升序分成 2 桶:
mysql> SELECT
  *,
  NTILE(2) OVER (
    PARTITION BY year
    ORDER BY revenue
  ) "ntile"
FROM revenue;
+----+------+---------+---------+-------+
| id | year | quarter | revenue | ntile |
+----+------+---------+---------+-------+
|  1 | 2020 | 1       |    3515 |     1 |
|  2 | 2020 | 2       |    3678 |     1 |
|  4 | 2020 | 4       |    3924 |     2 |
|  3 | 2020 | 3       |    4203 |     2 |
|  8 | 2021 | 4       |    2901 |     1 |
|  5 | 2021 | 1       |    3102 |     1 |
|  6 | 2021 | 2       |    3293 |     2 |
|  7 | 2021 | 3       |    3602 |     2 |
+----+------+---------+---------+-------+
  • 窗口函数:
NTILE(2) OVER (
  PARTITION BY year
  ORDER BY revenue
)

在?OVER?子句中:

  • PARTITION BY year?将所有行按照年份进行分区
  • ORDER BY revenue?将每个分区内的行按照收益升序排列
  • NTILE(2)?将每个分区的收益尽可能平均的分成 2 个桶。由于每年有 4 行,所以每个桶有 2 行。所以每年的前两行的桶排名为 1, 后两行的桶排名为 2。

十一、RANK

MySQL RANK() 函数返回当前行所在的分区内的排名,从 1 开始,但有间隔。即,相同的值具有相同的排名,但是下一个不同的值的排名采用 row_number() 编号。比如,如果有 2 个第一名,那么第三位的排名是 3。这与 dense_rank() 函数是不同的。

语法:

RANK()
OVER (
  [PARTITION BY partition_column_list]
  [ORDER BY order_column_list]
)

参数说明:

  • partition_column_list:参与分区的列的列表。
  • order_column_list:参与排序的列的列表。

返回值:

  • 当前行所在的分区内的排名,从 1 开始,但有间隔。

示例:

mysql> SELECT * FROM grade;
+----+------+-------+---------+-------+
| id | name | class | subject | grade |
+----+------+-------+---------+-------+
|  1 | Tim  | A     | Math    |     9 |
|  2 | Tom  | A     | Math    |     7 |
|  3 | Jim  | A     | Math    |     8 |
|  4 | Tim  | A     | English |     7 |
|  5 | Tom  | A     | English |     8 |
|  6 | Jim  | A     | English |     7 |
|  7 | Lucy | B     | Math    |     8 |
|  8 | Jody | B     | Math    |     6 |
|  9 | Susy | B     | Math    |     9 |
| 10 | Lucy | B     | English |     6 |
| 11 | Jody | B     | English |     7 |
| 12 | Susy | B     | English |     8 |
+----+------+-------+---------+-------+
  • 按照科目查看每个学生的成绩排名,要查看在每个科目中每个学生按照成绩从高到低的排名。
mysql> SELECT *,
  RANK() OVER (
    PARTITION BY subject
    ORDER BY grade DESC
  ) "rank",
  dense_rank() OVER (
    PARTITION BY subject
    ORDER BY grade DESC
  ) "dense_rank"
FROM grade;
+----+------+-------+---------+-------+------+------------+
| id | name | class | subject | grade | rank | dense_rank |
+----+------+-------+---------+-------+------+------------+
|  5 | Tom  | A     | English |     8 |    1 |          1 |
| 12 | Susy | B     | English |     8 |    1 |          1 |
|  4 | Tim  | A     | English |     7 |    3 |          2 |
|  6 | Jim  | A     | English |     7 |    3 |          2 |
| 11 | Jody | B     | English |     7 |    3 |          2 |
| 10 | Lucy | B     | English |     6 |    6 |          3 |
|  1 | Tim  | A     | Math    |     9 |    1 |          1 |
|  9 | Susy | B     | Math    |     9 |    1 |          1 |
|  3 | Jim  | A     | Math    |     8 |    3 |          2 |
|  7 | Lucy | B     | Math    |     8 |    3 |          2 |
|  2 | Tom  | A     | Math    |     7 |    5 |          3 |
|  8 | Jody | B     | Math    |     6 |    6 |          4 |
+----+------+-------+---------+-------+------+------------+
  • 窗口函数:
RANK() OVER (
  PARTITION BY subject
  ORDER BY grade DESC
)

在?OVER?子句中:

  • PARTITION BY subject?将按照学科进行分区
  • ORDER BY grade DESC?将每个分区内的行按照成绩逆序排列。
  • RANK()?返回每行在其关联的分区内的排名。

????????可以看出?RANK()?与?dense_rank()?的不同。

  • 按照班级查看每个学生的总成绩排名,要查看在每个班级中每个学生按照总成绩从高到底的排名。
mysql> SELECT t.*,
  RANK() OVER (
    PARTITION BY class
    ORDER BY t.sum_grade DESC
  ) "rank"
FROM (
    SELECT class,
      name,
      sum(grade) sum_grade
    FROM grade
    GROUP BY class, name
  ) t;
+-------+------+-----------+------+
| class | name | sum_grade | rank |
+-------+------+-----------+------+
| A     | Tim  | 16        |    1 |
| A     | Tom  | 15        |    2 |
| A     | Jim  | 15        |    2 |
| B     | Susy | 17        |    1 |
| B     | Lucy | 14        |    2 |
| B     | Jody | 13        |    3 |
+-------+------+-----------+------+
  • 子查询使用 GROUP BY 子句和 sum() 按照班级和学生汇总出每个学生的总成绩。?
mysql> SELECT class,
  name,
  sum(grade) sum_grade
FROM grade
GROUP BY class, name;
+-------+------+-----------+
| class | name | sum_grade |
+-------+------+-----------+
| A     | Tim  | 16        |
| A     | Tom  | 15        |
| A     | Jim  | 15        |
| B     | Lucy | 14        |
| B     | Jody | 13        |
| B     | Susy | 17        |
+-------+------+-----------+

????????主语句在从这个子查询中按照班级进行分区,然后在每个分区内按照总成绩逆序排序,并使用?RANK()?返回每行在其关联的分区内的排名。

文章来源:https://blog.csdn.net/imwucx/article/details/134720314
本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。