Wednesday, March 7, 2012

Problem with WHERE on ROW_NUMBER when using nested query

Hi,

I am trying to limit a result set by ROW_NUMBER. However, I am having problems getting it working.

The following query works fine, and I get a result set with PollID, AddedDate and RowNum columns.

SELECT *, ROW_NUMBER() OVER (ORDER BY Results.AddedDate DESC) AS RowNum FROM

( SELECT DISTINCT p.PollID, p.AddedDate

FROM vw_vs_PollsWithVoteCount p

JOIN vs_PollOptions o ON p.PollID = o.PollID

) AS Results

However, as soon as I add a WHERE condition:

SELECT *, ROW_NUMBER() OVER (ORDER BY Results.AddedDate DESC) AS RowNum FROM

( SELECT DISTINCT p.PollID, p.AddedDate

FROM vw_vs_PollsWithVoteCount p

JOIN vs_PollOptions o ON p.PollID = o.PollID

) AS Results

WHERE RowNum BETWEEN 1 AND 10

The query fails with an ' Invalid column name 'RowNum' ' error.

I have tried using 'Results.RowNum' but I get the same problem.

I don't understand what the issue is. The result set has a column headed 'RowNum' so why can't I apply a WHERE clause to this column? I can apply WHERE to the PollID column, for example, with no problem.

Any help very much appreciated.

Thanks...

RowNum is just an alias for your row_number() which is not qualified as a filter column. You need to derive the select to use it.

e.g.

Code Snippet

select *

from (

SELECT *, ROW_NUMBER() OVER (ORDER BY Results.AddedDate DESC) AS RowNum

FROM ( SELECT DISTINCT p.PollID, p.AddedDate

FROM vw_vs_PollsWithVoteCount p

JOIN vs_PollOptions o ON p.PollID = o.PollID

) AS Results

) derived

WHERE RowNum BETWEEN 1 AND 10

|||

Thanks for the reply. I tried doing this but I was worried that it would incur additional overhead and that there may be a more efficient way of doing it.

But thinking about it further, I can't see any reason why there should be a lot of extra overhead in doing this. My initial gut feeling was 'another SELECT statement, another nesting level, therefore more overhead'. But looking at what the outer SELECT statement is doing (i.e. very little) perhaps this isn't really an issue.

What do you think?

All that aside, it is great to have an explanation of why the query I suggested wasn't working. Thanks...

|||

You will be surprised to know that the cost (compilation) is so negligent that it does not even show up in the estimate - take a look at the execution plans for these. The cost is exactly 50/50. Sqlserver is pretty smart to _correctly_ flatten out the queries to produce the best plans (most of the times).

Code Snippet

select top 1 *
from sysobjects

select *
from(
select *
from(
select top 1 *
from sysobjects
)t1
)t2

|||

I've tried that and I see what you mean. Thanks very much for your excellent help with this.

|||

Just an addition, you can use CTE too.

Code Snippet

;with mycte
as (
SELECT *, ROW_NUMBER() OVER (ORDER BY Results.AddedDate DESC) AS RowNum
FROM ( SELECT DISTINCT p.PollID, p.AddedDate
FROM vw_vs_PollsWithVoteCount p
JOIN vs_PollOptions o ON p.PollID = o.PollID
) Results
)
select *
from mycte
WHERE RowNum BETWEEN 1 AND 10

No comments:

Post a Comment