Raymond Raymond

使用SQL Server 自带功能简化SQL

event 2013-06-01 visibility 1,262 comment 0 insights toc
more_vert
insights Stats

在低版本SQL Server中,对于一些常用的查询功能,我们可能需要运用复杂的逻辑,如临时表,游标等等来实现,这样可能会导致代码难维护,同时产生性能问题。本文将简述用自带的功能去简单的实现这些操作。

注意:一些功能仅在SQL Server 2012以及以上版本支持。

1.查询分页 (OFFSET, FETCH)

在很多Web应用程序中,我们需要实现分页功能,在过去,我们往往通过top语句实现或者通过ROW_NUMBER;在2012中加入了FETECH以及OFFSET标准SQL中支持的功能。下面的例子是查询第11到20行数据:

select bp.Title,bp.CommentCount from  Context.BlogPosts bp
inner join Context.BlogPostsInCategories bpic
on bpic.BlogPostId = bp.BlogPostId
order by bp.CommentCount desc
offset 10 rows fetch first 10 rows only

2. 遍历 (CROSS APPLY/OUTER APPLY)

设想这个需求:返回每一个博客分类中评论数量最多的两条博客文章。在习惯性的做法中,我们可能选择使用游标来遍历每个分类,最终通过临时表返回所有的数据。而通过APPLY我们课题很简单的实现这个功能:

select bc.Title as Category,post.Title as PostTitle,post.CommentCount from [Context].[BlogCategories] bc
outer apply
(
select bp.* from  Context.BlogPosts bp
inner join Context.BlogPostsInCategories bpic
on bpic.BlogPostId = bp.BlogPostId
where bpic.BlogCategoryId = bc.BlogCategoryId
order by bp.CommentCount desc
offset 0 rows fetch first 2 rows only
) post
order by bc.BlogCategoryId

结果如下:

image

可以看到每个分类下至多两条记录。

这里我们使用的是OUTER APPLY 当在APPLY时未找到相应数据,会默认返回NULLs.

3.行转列(PIVOT/UNPIVOT)

我们可能习惯使用CASE… WHEN语句进行行转列;PIVOT则可以简化我们的操作。比如我们需要将2012年每个分类下每个月发表的博客文章数用一行展示,每一列代表一个月,那么语句就可以如下:

select bp.Title,bp.CommentCount from  Context.BlogPosts bp
inner join Context.BlogPostsInCategories bpic
on bpic.BlogPostId = bp.BlogPostId
order by bp.CommentCount desc
offset 10 rows fetch first 10 rows only
GO
with data as
(
select 1 as indicator,
MONTH(bp.DateCreated) as [month],
bc.Title as [cate]
from  Context.BlogPosts bp
inner join Context.BlogPostsInCategories bpic
on bpic.BlogPostId = bp.BlogPostId
inner join Context.BlogCategories bc
on bpic.BlogCategoryId = bc.BlogCategoryId
WHERE bp.DateCreated between '20120101' and '20121231'
)
select p.cate,[1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12] from
data
pivot 
(
    sum(data.indicator)
    for data.[month] in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])
)
as p
显示结果如下:
image

4. 混合详细数据以及聚合数据 Window (OVER)

在一些报表中,我们需要在一行中展示这一行的数据,同时需要展示汇总数据。比如如下的需求: 按年月日汇总发帖的数目,查询语句可以简化如下:

with data as
(
select count(*) as [count],
DAY(bp.DateCreated) as [day],
YEAR(bp.DateCreated) as [year],
MONTH(bp.DateCreated) as [month]
from  Context.BlogPosts bp
group by YEAR(bp.DateCreated),MONTH(bp.DateCreated),DAY(bp.DateCreated)
)
select [year],[month],[day],
[count] as [countOfTheDay],
sum([count]) over (partition by year,month) as [countOfTheMonth], 
sum([count]) over (partition by year) as [countOfTheYear]
from data
order by data.year, data.month

最终结果显示如下:

image

5. 显示汇总信息(ROLLUP)

加入将上一个例子需求更改为汇总至MTD(Month to Date), YTD (Year to Date)又该怎样实现呢?这就需要结合GROUP BY ROLLUP了。代码如下:

with data as
(
select count(*) as [count],
DAY(bp.DateCreated) as [day],
YEAR(bp.DateCreated) as [year],
MONTH(bp.DateCreated) as [month]
from  Context.BlogPosts bp
group by YEAR(bp.DateCreated),MONTH(bp.DateCreated),DAY(bp.DateCreated)
),
groupedData as
(
    select 
    GROUPING_ID(data.year, data.month, data.day) as groupid,
    [year],
    [month],
    [day],
    sum([count]) as [count]
    from data
    group by rollup(data.year, data.month, data.day)
)
select * from 
groupedData
order by groupedData.year, month, day

如下图,所有数据均按照层级汇总显示:

image

6. Window Frame

在上面的例子中,虽然层级汇总信息都显示了出来,但是却没有在一行记录中显示,而是分开显示,如果需要在一行中显示,应该怎么实现呢?这样我们需要借助于Window Framing概念,代码如下:

with data as
(
select count(*) as [count],
DAY(bp.DateCreated) as [day],
YEAR(bp.DateCreated) as [year],
MONTH(bp.DateCreated) as [month]
from  Context.BlogPosts bp
group by YEAR(bp.DateCreated),MONTH(bp.DateCreated),DAY(bp.DateCreated)
)
select [year],[month],[day],
[count] as [countOfTheDay],
sum([count]) over (
partition by year,month
order by year,month,day
rows between unbounded preceding and current row) as [runningCountOfTheMonth], 
sum([count]) over (partition by year
order by year,month,day
range between unbounded preceding and current row) as [runningCountOfTheYear]
from data
order by data.year, data.month

运行结果如下:

image

7. 上一条记录以及下一条记录 Window Offset (LAD / LEAD)

在很多Web程序中,我们在展示一篇博客文章的时候,往往需要添加超链接到上下两篇文章。怎样在一个Query里边实现这个功能呢?代码如下:

select LAG(bp.Title) over (order by bp.BlogPostId) as PrevBlog,
bp.Title as CurrentBlog,
LEAD(bp.Title) over (order by bp.BlogPostId) as NextBlog
from Context.BlogPosts bp

结果如下:

image

值得注意的是,Windows Offset中的LAD和LEAD都不支持Window Frame语句。而FIRST_VALUE以及LAST_VALUE均支持Window Frame语句。

8. 继续Window Offset (FIRST_VALUE / LAST_VALUE)

试想在第2个例子中间,如果我们是要在一行同时显示一个分类下最早和最晚发布的两篇文章,应该怎样实现呢? 代码如下:

select distinct bc.[Title] as cate,
FIRST_VALUE(bp.Title) over (partition by bpic.[BlogCategoryId] order by bp.DateCreated,bp.[BlogPostId] rows between unbounded preceding and unbounded following) as TopicOld,
FIRST_VALUE(bp.DateCreated) over (partition by bpic.[BlogCategoryId] order by bp.DateCreated,bp.[BlogPostId] rows between unbounded preceding and unbounded following) as DateCreatedOld,
LAST_VALUE(bp.Title) over (partition by bpic.[BlogCategoryId] order by bp.DateCreated,bp.[BlogPostId] rows between unbounded preceding and unbounded following) as TopicNew,
LAST_VALUE(bp.DateCreated) over (partition by bpic.[BlogCategoryId] order by bp.DateCreated,bp.[BlogPostId] rows between unbounded preceding and unbounded following) as DateCreatedNew
from Context.BlogCategories bc
inner join Context.BlogPostsInCategories bpic 
on bc.BlogCategoryId =  bpic.BlogCategoryId
inner join Context.BlogPosts bp
on bpic.BlogPostId = bp.BlogPostId

结果如下:

image 

大家可以思考下为什么这里我用到了window frame以及distinct。

总结

其实在强大的SQL Server支持下,还有很多复杂的数据分析操作,我们其实都可以通过简单的TSQL来实现,这里就不做一一介绍了,希望以上分享能够对大家带来帮助。

More from Kontext
comment Comments
No comments yet.

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts