SQL Server v.Next (Denali) : Deriving sets using SEQUENCE
One complaint about SEQUENCE is that there is no simple construct such as NEXT (@n) VALUES FOR so that you could get a range of SEQUENCE values as a set. In a previous post about SEQUENCE, I mentioned that to get a range of rows, you should use the system stored procedure sys.sp_sequence_get_range. If you've tried this stored procedure, you'll know that it has a few issues:
- the parameter names are not easy to memorize;
- it requires multiple conversions to and from SQL_VARIANT; and,
- producing a set from the output requires further processing – all you get is the first and last values.
SEQUENCE has a lot of limitations, and most examples thus far have shown how to use NEXT VALUE FOR to return a scalar value, but one thing you can do is use this mechanism in a limited number of query scenarios. Let's start with a simple SEQUENCE:
USE tempdb; GO CREATE SEQUENCE dbo.MySequence AS BIGINT MINVALUE 1 NO MAXVALUE START WITH 1;
And as a quick query example, there is nothing stopping you from doing this:
;WITH fake_and_meaningless_set AS ( SELECT x = 'first row' UNION ALL SELECT x = 'second row' ) SELECT [Value] = NEXT VALUE FOR dbo.MySequence FROM fake_and_meaningless_set;
Value ----- 1 2
Note that in the above case you must use UNION ALL. If you try to use UNION, you will get the following error message, because of when the duplicates would have to be filtered out by the UNION:
NEXT VALUE FOR function cannot be used directly in a statement that uses a DISTINCT, UNION (except UNION ALL), EXCEPT or INTERSECT operator.
Anyway, interesting, right? You now see that NEXT VALUE FOR can actually scale across a set. So, given that, it is easy to see how you could construct a generic wrapper that allowed you to generate any number of sequence values as a set instead of just the first and last values. If you need to support a max of 50 values at a time, you could use sys.objects; for 5000, you could use sys.all_columns. For more, you could implement various cross joins or you could use any of the number generators I've described before. Here is an example that assumes you won't ever need more than 5,000 sequence values to be assigned at one time:
DECLARE @RangeSize INT = 250; ;WITH fake_set AS ( SELECT TOP (@RangeSize) column_id FROM sys.all_columns ) SELECT [Value] = NEXT VALUE FOR dbo.MySequence FROM fake_set;
Value ----- 3 4 ... 251 252
And if you need to support more, you can do this:
DECLARE @RangeSize INT = 50000; ;WITH fake_set AS ( SELECT TOP (@RangeSize) s1.column_id FROM sys.all_columns AS s1 CROSS JOIN sys.all_columns AS s2 ) SELECT [Value] = NEXT VALUE FOR dbo.MySequence FROM fake_set;
Value ----- 253 254 ... 50251 50252
This runs a lot faster than you might expect. Also note that not all of the rows will materialize; so, for example, if the cross join above yields 30 million rows, you won't use up that many SEQUENCE values – only the rows that are returned by TOP.
You can place this code in a stored procedure, but I thought it would be more useful outside of that context because, with the T-SQL above, you could join the results with other tables or views. Unfortunately, you can't place this code into an inline table-valued function, since a function cannot have side effects (the side effect would be assigning new sequence values). Here is the error message you will get if you try:
NEXT VALUE FOR function is not allowed in check constraints, default objects, computed columns, views, user-defined functions, user-defined aggregates, sub-queries, common table expressions, or derived tables.
So that allows you to generate a range of sequence values based on data you don't have yet (e.g. returning a set of values for an application to use). But what about if you want to return real data with rows that are assigned by the sequence at runtime? This is even easier. Let's say we want to use our sequence to "tag" all objects that have a name that starts with 'sp' – we can simply do this:
SELECT [ID] = NEXT VALUE FOR dbo.MySequence, name FROM [master].sys.objects WHERE name LIKE N'sp%';
That's it! Results:
ID name ----- -------------------------- 50253 spt_fallback_db 50254 spt_fallback_dev 50255 spt_fallback_usg 50256 spt_monitor 50257 spt_values 50258 sp_MSrepl_startup 50259 sp_MScleanupmergepublisher
You'll notice these aren't in any particular order; of course not, because we did not specify an ORDER BY clause. As has been discussed on this blog many times before, without ORDER BY, there is no default ordering, and SQL Server is free to return the results in any order it wishes. This also holds true to the order in which it assigns SEQUENCE values. But you will be quickly disappointed if you just add "ORDER BY name" to the above query:
NEXT VALUE FOR function cannot be used directly in a statement that contains an ORDER BY clause unless the OVER clause is specified.
This gives us a slight hint that NEXT VALUE FOR accepts an OVER clause, so that you can dictate which order the SEQUENCE values are assigned. Sure enough (and I totally missed this when I first looked at the NEXT VALUE FOR topic):
SELECT [ID] = NEXT VALUE FOR dbo.MySequence OVER (ORDER BY name), name FROM [master].sys.objects WHERE name LIKE N'sp%';
ID name ----- -------------------------- 50260 sp_MScleanupmergepublisher 50261 sp_MSrepl_startup 50262 spt_fallback_db 50263 spt_fallback_dev 50264 spt_fallback_usg 50265 spt_monitor 50266 spt_values
While it is true that this returns the results in the desired order even without a final ORDER BY clause, this is just a coincidence and this isn't guaranteed. The OVER (ORDER BY) clause certainly assigned the SEQUENCE values in the order we asked for, but SQL Server is not bound to return the rows in the exact same order every time. So the proper form of the query should be:
SELECT [ID] = NEXT VALUE FOR dbo.MySequence OVER (ORDER BY name), name FROM [master].sys.objects WHERE name LIKE N'sp%' ORDER BY name;
Now the results are the same as the above, and you know they will be presented the same way when you run the query again tomorrow or next week.
This technique isn't all that different from using ROW_NUMBER() without bothering with a sequence at all, except that the sequence ensures that your values are unique across the domain where you use them. You might also be tempted to just grab the current value from sys.sequences and then add that value to a distinct ROW_NUMBER() from the set. I strongly recommend against that, because it does not block anyone else from using conflicting values. Note that while you can use this technique to guarantee the order of the SEQUENCE values, you still cannot assume that there will be no gaps in the event that another user is generating values at the same time.
Don't forget to clean up:
USE tempdb; GO DROP SEQUENCE dbo.MySequence;