This time around, I've got five wish list items for T-SQL / DDL, and then one bonus – a parallelism bug. My thinking on the former is that, since we are either about to start (or are just finishing) locking down the code for SQL Server 2012, this is the first moment where some of these suggestions can be seriously considered without the "but we're about to ship" excuse. 🙂 The last one is just an incorrect results bug that needs to be fixed. I'll start with the T-SQL / DDL wishes:
Ordered Set Functions
For a while I've been pretty vocal about getting grouped concatenation into SQL Server – functionality that exists in MySQL (GROUP_CONCAT) and more recently added to Oracle 11gR2 (LISTAGG). (And here I thought we were the only pencil-necks using an R2 moniker.) I've often promoted this Connect item when the question comes up on forums or #sqlhelp. This can be solved by a grander implementation – ordered set functions. This comes directly from the standard and supports grouped concatenation (with predictable ordering, unlike many of the workarounds that exist today, such as FOR XML PATH), and a slew of other functionality. The item you should vote for is one that Itzik Ben-Gan just raised this week (and please see this document for more in-depth background):
Create or Replace
I'm pretty sure I've pimped this one before, but in the interest of driving up items that might make the T-SQL / DDL feature list for the *next* version of SQL Server… we should have the ability to specify in the DDL that we want to create this thing if it doesn't exist, and alter if it does. This avoids the need to re-script permissions, worry about breaking existing dependencies, or build complicated batch logic with a stub create, dynamic SQL, etc. I would gladly take this for simple modules only (procedures, triggers, functions, non-indexed views) as the logic around completely re-designing, say, a table can quickly become more complex.
Native Regular Expression Support
I would love to see RegEx as a first class citizen, and at least in V1 don't expect this to be fully optimized the way LIKE can be. I just get so sick of writing complex and wordy expressions using PATINDEX / SUBSTRING / STUFF / REVERSE while switching between 0-based and 1-based in my head. And I . Since other CLR functionality is slowly making its way into the top T-SQL tier (spatial, FORMAT(), TRY_PARSE(), etc), I think it's high time we can perform RegEx as well without each of us having to write our own complicated assemblies, deploying and maintaining them, all that assuming we can assure the boss or the IT guys that it's ok to use CLR.
Create Table As
It would be really nice to have DDL that just says "create a table, but make it a copy of that other table." Today it is easy enough to do this using the tools but to do it in an automated way of any kind is foolhardy – the complexity of the code to generate a CREATE TABLE statement when anything complex is involved can be astounding – consider dependencies, constraints, keys, permissions, triggers, computed columns, indexes, etc. Some folks recommend PowerShell for this but I just think you are trading a different syntax for the same complex code.
The PIVOT and UNPIVOT commands as they exist today expect that you know all of the potential values up front. What if you are feeding a report that is more than happy to accept as many columns as you're willing to provide? What ends up happening is we use really ugly workarounds with dynamic SQL after scanning the table and seeing all the potential pivot values (year-month combinations, let's say) that may be represented. I'd like to see this be a little more reactive to the data returned by the query, though it does step into the land of unstructured data at least a little bit…
…and one of these things is not like the others…
Another parallelism bug
While at least one of the related bugs has recently been fixed, I've previously talked about a few potential perils of parallelism (say that three times fast!). Last week at the MVP Summit I encountered an unfortunate scenario in SQL Server 2012, where incorrect and in fact unpredictable results can come out of using SUM() OVER() with the new windowing functions when combined with parallelism. There are some decent workarounds but they're not very intuitive (and it may be difficult to notice that you're affected).