SQL-窗口函数
SQL中有一类函数叫聚合函数,比如count, sum, avg, min, max。因为这些函数能够把多行数据按照规定聚集为一行,一般聚集前的数据行要大于聚集后的数据行。而有时候我们不仅想要聚集前的数据,又想要聚集后的数据,这时候,便引入窗口函数。
窗口函数的基本语法如下:
<窗口函数> over (partition by <用于分组的列名> order by <用于排序的列名>)
<窗口函数>的位置,可以放以下两种函数:
- 专用窗口函数,包括rank, dense_rank, row_number等专用窗口函数
- 聚合函数,如sum. avg, count, max, min等
一:例子–专用窗口函数rank
比如现在有一个班级成绩表格
| 学号 | 班级 | 成绩 |
|---|---|---|
| 001 | 1 | 80 |
| 002 | 1 | 89 |
| 003 | 2 | 81 |
| 004 | 1 | 91 |
| 005 | 2 | 92 |
| 006 | 3 | 98 |
| 007 | 1 | 88 |
| 008 | 2 | 81 |
想在每个班级内按成绩排名,即1班,2班,3班按照成绩来排名。用sql创建表格
DROP TABLE IF EXISTS student;
CREATE TABLE student (id int, class int, grade int);
INSERT INTO student VALUES(1, 1, 80);
INSERT INTO student VALUES(2, 1, 89);
INSERT INTO student VALUES(3, 2, 81);
INSERT INTO student VALUES(4, 1, 91);
INSERT INTO student VALUES(5, 2, 92);
INSERT INTO student VALUES(6, 3, 98);
INSERT INTO student VALUES(7, 1, 88);
INSERT INTO student VALUES(8, 2, 81);
SELECT * FROM student;

使用rank窗口函数来排名
SELECT * ,
rank() over (partition by class order by grade DESC) as ranking
FROM student;

二:结合面试题来解释
面试题1
现有交易数据表 user_goods_table 有 user_name(用户名), goods_kind(用户订购的外卖品类),现在老板想知道每个用户购买的外卖品类偏好分布,并提取每个用户购买最多的外卖品类是哪一个。
要求输出的是 user_name(用户名), goods_kind(该用户购买的最多外卖品类)
创建交易数据表
DROP TABLE IF EXISTS user_goods;
CREATE TABLE user_goods(user_name varchar(10), goods_kind varchar(10));
INSERT INTO user_goods VALUES('张三', '云团');
INSERT INTO user_goods VALUES('李四', '云团');
INSERT INTO user_goods VALUES('张三', '云团');
INSERT INTO user_goods VALUES('张三', '云团');
INSERT INTO user_goods VALUES('李四', '面条');
INSERT INTO user_goods VALUES('李四', '面条');
INSERT INTO user_goods VALUES('李四', '面条');
INSERT INTO user_goods VALUES('张三', '面');
INSERT INTO user_goods VALUES('张三', '面条');
首先一步步来,想着先GROUP BY 用户名和外卖品类,用COUNT计算外卖品类的个数。 即如下操作
SELECT user_name, goods_kind, count(goods_kind) as num
FROM user_goods
GROUP BY user_name,goods_kind;

然后对num 进行partition by进行倒序排序,得到:
SELECT *,
row_number() over (partition by user_name order by num DESC) as rnk
FROM
(SELECT user_name, goods_kind, count(goods_kind) as num
FROM user_goods
GROUP BY user_name,goods_kind) as t1;

现在只要用where 筛选 条件 rnk = 1 即可得到想要的。
SELECT user_name, goods_kind
FROM
(SELECT *,row_number() over (partition by user_name order by num DESC) as rnk
FROM
(SELECT user_name, goods_kind, count(goods_kind) as num
FROM user_goods
GROUP BY user_name,goods_kind) as t1) as t2
WHERE rnk = 1;

结束。
版权声明:本文为Jiana_Feng原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接和本声明。