使用SQL Server 自带功能简化SQL
insights Stats
在低版本SQL Server中,对于一些常用的查询功能,我们可能需要运用复杂的逻辑,如临时表,游标等等来实现,这样可能会导致代码难维护,同时产生性能问题。本文将简述用自带的功能去简单的实现这些操作。
注意:一些功能仅在SQL Server 2012以及以上版本支持。
1.查询分页 (OFFSET, FETCH)
在很多Web应用程序中,我们需要实现分页功能,在过去,我们往往通过top语句实现或者通过ROW_NUMBER;在2012中加入了FETECH以及OFFSET标准SQL中支持的功能。下面的例子是查询第11到20行数据:
select bp.Title,bp.CommentCount from Context.BlogPosts bp inner join Context.BlogPostsInCategories bpic on bpic.BlogPostId = bp.BlogPostId order by bp.CommentCount desc offset 10 rows fetch first 10 rows only
2. 遍历 (CROSS APPLY/OUTER APPLY)
设想这个需求:返回每一个博客分类中评论数量最多的两条博客文章。在习惯性的做法中,我们可能选择使用游标来遍历每个分类,最终通过临时表返回所有的数据。而通过APPLY我们课题很简单的实现这个功能:
select bc.Title as Category,post.Title as PostTitle,post.CommentCount from [Context].[BlogCategories] bc outer apply ( select bp.* from Context.BlogPosts bp inner join Context.BlogPostsInCategories bpic on bpic.BlogPostId = bp.BlogPostId where bpic.BlogCategoryId = bc.BlogCategoryId order by bp.CommentCount desc offset 0 rows fetch first 2 rows only ) post order by bc.BlogCategoryId
结果如下:
可以看到每个分类下至多两条记录。
这里我们使用的是OUTER APPLY 当在APPLY时未找到相应数据,会默认返回NULLs.
3.行转列(PIVOT/UNPIVOT)
我们可能习惯使用CASE… WHEN语句进行行转列;PIVOT则可以简化我们的操作。比如我们需要将2012年每个分类下每个月发表的博客文章数用一行展示,每一列代表一个月,那么语句就可以如下:
select bp.Title,bp.CommentCount from Context.BlogPosts bp inner join Context.BlogPostsInCategories bpic on bpic.BlogPostId = bp.BlogPostId order by bp.CommentCount desc offset 10 rows fetch first 10 rows only GO with data as ( select 1 as indicator, MONTH(bp.DateCreated) as [month], bc.Title as [cate] from Context.BlogPosts bp inner join Context.BlogPostsInCategories bpic on bpic.BlogPostId = bp.BlogPostId inner join Context.BlogCategories bc on bpic.BlogCategoryId = bc.BlogCategoryId WHERE bp.DateCreated between '20120101' and '20121231' ) select p.cate,[1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12] from data pivot ( sum(data.indicator) for data.[month] in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12]) ) as p
显示结果如下:
4. 混合详细数据以及聚合数据 Window (OVER)
在一些报表中,我们需要在一行中展示这一行的数据,同时需要展示汇总数据。比如如下的需求: 按年月日汇总发帖的数目,查询语句可以简化如下:
with data as ( select count(*) as [count], DAY(bp.DateCreated) as [day], YEAR(bp.DateCreated) as [year], MONTH(bp.DateCreated) as [month] from Context.BlogPosts bp group by YEAR(bp.DateCreated),MONTH(bp.DateCreated),DAY(bp.DateCreated) ) select [year],[month],[day], [count] as [countOfTheDay], sum([count]) over (partition by year,month) as [countOfTheMonth], sum([count]) over (partition by year) as [countOfTheYear] from data order by data.year, data.month
最终结果显示如下:
5. 显示汇总信息(ROLLUP)
加入将上一个例子需求更改为汇总至MTD(Month to Date), YTD (Year to Date)又该怎样实现呢?这就需要结合GROUP BY ROLLUP了。代码如下:
with data as ( select count(*) as [count], DAY(bp.DateCreated) as [day], YEAR(bp.DateCreated) as [year], MONTH(bp.DateCreated) as [month] from Context.BlogPosts bp group by YEAR(bp.DateCreated),MONTH(bp.DateCreated),DAY(bp.DateCreated) ), groupedData as ( select GROUPING_ID(data.year, data.month, data.day) as groupid, [year], [month], [day], sum([count]) as [count] from data group by rollup(data.year, data.month, data.day) ) select * from groupedData order by groupedData.year, month, day
如下图,所有数据均按照层级汇总显示:
6. Window Frame
在上面的例子中,虽然层级汇总信息都显示了出来,但是却没有在一行记录中显示,而是分开显示,如果需要在一行中显示,应该怎么实现呢?这样我们需要借助于Window Framing概念,代码如下:
with data as ( select count(*) as [count], DAY(bp.DateCreated) as [day], YEAR(bp.DateCreated) as [year], MONTH(bp.DateCreated) as [month] from Context.BlogPosts bp group by YEAR(bp.DateCreated),MONTH(bp.DateCreated),DAY(bp.DateCreated) ) select [year],[month],[day], [count] as [countOfTheDay], sum([count]) over ( partition by year,month order by year,month,day rows between unbounded preceding and current row) as [runningCountOfTheMonth], sum([count]) over (partition by year order by year,month,day range between unbounded preceding and current row) as [runningCountOfTheYear] from data order by data.year, data.month
运行结果如下:
7. 上一条记录以及下一条记录 Window Offset (LAD / LEAD)
在很多Web程序中,我们在展示一篇博客文章的时候,往往需要添加超链接到上下两篇文章。怎样在一个Query里边实现这个功能呢?代码如下:
select LAG(bp.Title) over (order by bp.BlogPostId) as PrevBlog, bp.Title as CurrentBlog, LEAD(bp.Title) over (order by bp.BlogPostId) as NextBlog from Context.BlogPosts bp
结果如下:
值得注意的是,Windows Offset中的LAD和LEAD都不支持Window Frame语句。而FIRST_VALUE以及LAST_VALUE均支持Window Frame语句。
8. 继续Window Offset (FIRST_VALUE / LAST_VALUE)
试想在第2个例子中间,如果我们是要在一行同时显示一个分类下最早和最晚发布的两篇文章,应该怎样实现呢? 代码如下:
select distinct bc.[Title] as cate, FIRST_VALUE(bp.Title) over (partition by bpic.[BlogCategoryId] order by bp.DateCreated,bp.[BlogPostId] rows between unbounded preceding and unbounded following) as TopicOld, FIRST_VALUE(bp.DateCreated) over (partition by bpic.[BlogCategoryId] order by bp.DateCreated,bp.[BlogPostId] rows between unbounded preceding and unbounded following) as DateCreatedOld, LAST_VALUE(bp.Title) over (partition by bpic.[BlogCategoryId] order by bp.DateCreated,bp.[BlogPostId] rows between unbounded preceding and unbounded following) as TopicNew, LAST_VALUE(bp.DateCreated) over (partition by bpic.[BlogCategoryId] order by bp.DateCreated,bp.[BlogPostId] rows between unbounded preceding and unbounded following) as DateCreatedNew from Context.BlogCategories bc inner join Context.BlogPostsInCategories bpic on bc.BlogCategoryId = bpic.BlogCategoryId inner join Context.BlogPosts bp on bpic.BlogPostId = bp.BlogPostId
结果如下:
大家可以思考下为什么这里我用到了window frame以及distinct。
总结
其实在强大的SQL Server支持下,还有很多复杂的数据分析操作,我们其实都可以通过简单的TSQL来实现,这里就不做一一介绍了,希望以上分享能够对大家带来帮助。