从 SQL Server 2005 起, SQL Server 开始支持窗口函数 (Window Function), 以及到 SQL Server 2012, 窗口函数功能增强, 目前为止支持以下几种窗口函数:
1. 排序函数 (Ranking Function) ;
2. 聚合函数 (Aggregate Function) ;
3. 分析函数 (Analytic Function) ;
4. NEXT VALUE FOR Function, 这是给 sequence 专用的一个函数;
一. 排序函数 (Ranking Function)
帮助文档里的代码示例很全.
排序函数中, ROW_NUMBER() 较为常用, 可用于去重, 分页, 分组中选择数据, 生成数字辅助表等等;
排序函数在语法上要求 OVER 子句里必须含 ORDER BY, 否则语法不通过, 对于不想排序的场景可以这样变通;
- drop table if exists test_ranking
- create table test_ranking
- (
- id int not null,
- name varchar(20) not null,
- value int not null
- )
- insert test_ranking
- select 1,'name1',1 union all
- select 1,'name2',2 union all
- select 2,'name3',2 union all
- select 3,'name4',2
- select id , name, ROW_NUMBER() over (PARTITION by id ORDER BY name) as num
- from test_ranking
- select id , name, ROW_NUMBER() over (PARTITION by id) as num
- from test_ranking
- /*
- Msg 4112, Level 15, State 1, Line 1
- The function 'ROW_NUMBER' must have an OVER clause with ORDER BY.
- */
--ORDERY BY 后面给一个和原表无关的派生列
- select id , name, ROW_NUMBER() over (PARTITION by id ORDER BY GETDATE()) as num
- from test_ranking
- select id , name, ROW_NUMBER() over (PARTITION by id ORDER BY (select 0)) as num
- from test_ranking
二. 聚合函数 (Aggregate Function)
SQL Server 2005 中, 窗口聚合函数仅支持 PARTITION BY, 也就是说仅能对分组的数据整体做聚合运算;
SQL Server 2012 开始, 窗口聚合函数支持 ORDER BY, 以及 ROWS/RAGNE 选项, 原本需要子查询来实现的需求, 如: 移动平均 (moving averages), 总计聚合 (cumulative aggregates), 累计求和 (running totals) 等, 变得更加方便;
代码示例 1: 总计 / 小计 / 累计求和
- drop table if exists test_aggregate;
- create table test_aggregate
- (
- event_id varchar(100),
- rk int,
- price int
- )
- insert into test_aggregate
- values
- ('a',1,10),
- ('a',2,10),
- ('a',3,50),
- ('b',1,10),
- ('b',2,20),
- ('b',3,30)
--1. 没有窗口函数时, 用子查询
- select a.event_id,
- a.rk, --build ranking column if needed
- a.price,
- (select sum(price) from test_aggregate b where b.event_id = a.event_id and b.rk <= a.rk) as totalprice
- from test_aggregate a
--2. 从 SQL Server 2012 起, 用窗口函数
--2.1
-- 没有 PARTITION BY, 没有 ORDER BY, 为全部总计;
-- 只有 PARTITION BY, 没有 ORDER BY, 为分组小计;
-- 只有 ORDER BY, 没有 PARTITION BY, 为全部累计求和 (RANGE 选项, 见 2.2)
- select *,
- sum(price) over() as TotalPrice,
- sum(price) over(partition by event_id) as SubTotalPrice,
- sum(price) over(order by rk) as RunningTotalPrice
- from test_aggregate a
--2.2 注意 ORDER BY 列的选择, 可能会带来不同结果
- select *,
- sum(price) over(partition by event_id order by rk) as totalprice
- from test_aggregate a
- /*
- event_id rk price totalprice
- a 1 10 10
- a 2 10 20
- a 3 50 70
- b 1 10 10
- b 2 20 30
- b 3 30 60
- */
- select *,
- sum(price) over(partition by event_id order by price) as totalprice
- from test_aggregate a
- /*
- event_id rk price totalprice
- a 1 10 20
- a 2 10 20
- a 3 50 70
- b 1 10 10
- b 2 20 30
- b 3 30 60
- */
-- 因为 ORDER BY 还有个子选项 ROWS/RANGE, 不指定的情况下默认为 RANGE UNBOUNDED PRECEDING AND CURRENT ROW
--RANGE 按照 ORDER BY 中的列值, 将相同的值的行均视为当前同一行
- select *,sum(price) over(partition by event_id order by price) as totalprice from test_aggregate a
- select *,sum(price) over(partition by event_id order by price range between unbounded preceding and current row) as totalprice from test_aggregate a
-- 如果 ORDER BY 中的列值有重复值, 手动改用 ROWS 选项即可实现累计求和
select *,sum(price) over(partition by event_id order by price rows between unbounded preceding and current row) as totalprice from test_aggregate a
代码示例 2: 移动平均
-- 移动平均, 举个例子, 就是求前 N 天的平均值, 和股票市场的均线类似
- drop table if exists test_moving_avg
- create table test_moving_avg
- (
- ID int,
- Value int,
- DT datetime
- )
- insert into test_moving_avg
- values
- (1,10,GETDATE()-10),
- (2,110,GETDATE()-9),
- (3,100,GETDATE()-8),
- (4,80,GETDATE()-7),
- (5,60,GETDATE()-6),
- (6,40,GETDATE()-5),
- (7,30,GETDATE()-4),
- (8,50,GETDATE()-3),
- (9,20,GETDATE()-2),
- (10,10,GETDATE()-1)
--1. 没有窗口函数时, 用子查询
- select *,
- (select AVG(Value) from test_moving_avg a where a.DT>= DATEADD(DAY, -5, b.DT) AND a.DT <b.DT) AS avg_value_5days
- from test_moving_avg b
--2. 从 SQL Server 2012 起, 用窗口函数
-- 三个内置常量, 第一行, 最后一行, 当前行: UNBOUNDED PRECEDING, UNBOUNDED FOLLOWING, CURRENT ROW
-- 在行间移动, 用 BETWEEN m preceding AND n following (m, n> 0)
- SELECT *,
- sum(value) over (ORDER BY DT ROWS BETWEEN 5 preceding AND CURRENT ROW) moving_sum,
- avg(value) over (ORDER BY DT ROWS BETWEEN 4 preceding AND CURRENT ROW) moving_avg1,
- avg(value) over (ORDER BY DT ROWS BETWEEN 5 preceding AND 1 preceding) moving_avg2,
- avg(value) over (ORDER BY DT ROWS BETWEEN 3 preceding AND 1 following) moving_avg3
- FROM test_moving_avg
- ORDER BY DT
三. 分析函数 (Analytic Function)
代码示例 1: 取当前行某列的前一个 / 下一个值
- drop table if exists test_analytic
- create table test_analytic
- (
- SalesYear varchar(10),
- Revenue int,
- Offset int
- )
- insert into test_analytic
- values
- (2013,1001,1),
- (2014,1002,1),
- (2015,1003,1),
- (2016,1004,1),
- (2017,1005,1),
- (2018,1006,1)
-- 当年及去年的销售额
- select *,lag(Revenue,1,null) over(order by SalesYear asc) as PreviousYearRevenue from test_analytic
- select *,lag(Revenue,Offset,null) over(order by SalesYear asc) as PreviousYearRevenue from test_analytic
- select *,lead(Revenue,1,null) over(order by SalesYear desc) as PreviousYearRevenue from test_analytic
-- 当年及下一年的销售额
- select *,lead(Revenue,1,null) over(order by SalesYear asc) as NextYearRevenue from test_analytic
- select *,lead(Revenue,Offset,null) over(order by SalesYear asc) as NextYearRevenue from test_analytic
- select *,lag(Revenue,1,null) over(order by SalesYear desc) as NextYearRevenue from test_analytic
-- 可以根据 offset 调整跨度
代码示例 2: 分组中某列最大 / 最小值, 对应的其他列值
假设有个门禁系统, 在员工每次进门时写入一条记录, 记录了 "身份号码","进门时间","衣服颜色", 查询每个员工最后一次进门时的 "衣服颜色".
- drop table if exists test_first_last
- create table test_first_last
- (
- EmployeeID int,
- EnterTime datetime,
- ColorOfClothes varchar(20)
- )
- insert into test_first_last
- values
- (1001, GETDATE()-9, 'GREEN'),
- (1001, GETDATE()-8, 'RED'),
- (1001, GETDATE()-7, 'YELLOW'),
- (1001, GETDATE()-6, 'BLUE'),
- (1002, GETDATE()-5, 'BLACK'),
- (1002, GETDATE()-4, 'WHITE')
--1. 用子查询
- --LastColorOfColthes
- select * from test_first_last a
- where not exists(select 1 from test_first_last b where a.EmployeeID = b.EmployeeID and a.EnterTime < b.EnterTime)
- --LastColorOfColthes
- select *
- from
- (select *, ROW_NUMBER() over(partition by EmployeeID order by EnterTime DESC) num
- from test_first_last ) t
- where t.num =1
--2. 用窗口函数
-- 用 LAST_VALUE 时, 必须加上 ROWS/RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING, 否则结果不正确
- select *,
- FIRST_VALUE(ColorOfClothes) OVER (PARTITION BY EmployeeID ORDER BY EnterTime DESC) as LastColorOfClothes,
- FIRST_VALUE(ColorOfClothes) OVER (PARTITION BY EmployeeID ORDER BY EnterTime ASC) as FirstColorOfClothes,
- LAST_VALUE(ColorOfClothes) OVER (PARTITION BY EmployeeID ORDER BY EnterTime ASC ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as LastColorOfClothes,
- LAST_VALUE(ColorOfClothes) OVER (PARTITION BY EmployeeID ORDER BY EnterTime DESC ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as FirstColorOfClothes
- from test_first_last
-- 对于显示表中所有行, 并追加 Last/First 字段时用窗口函数方便些
-- 对于挑选表中某一行 / 多行时, 用子查询更方便
四. NEXT VALUE FOR Function
- drop sequence if exists test_seq
- create sequence test_seq
- start with 1
- increment by 1;
- GO
- drop table if exists test_next_value
- create table test_next_value
- (
- ID int,
- Name varchar(10)
- )
- insert into test_next_value(Name)
- values
- ('AAA'),
- ('AAA'),
- ('BBB'),
- ('CCC')
-- 对于多行数据获取 sequence 的 next value, 是否使用窗口函数都会逐行计数
-- 窗口函数中 ORDER BY 用于控制不同列值的计数顺序
- select *, NEXT VALUE FOR test_seq from test_next_value
- select *, NEXT VALUE FOR test_seq OVER(ORDER BY Name DESC) from test_next_value
参考:
- SELECT - OVER Clause (Transact-SQL)
- https://docs.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-2017
SQL Server Windowing Functions: ROWS vs. RANGE
https://www.sqlpassion.at/archive/2015/01/22/sql-server-windowing-functions-rows-vs-range/
来源: https://www.cnblogs.com/seusoftware/p/9316139.html