Connect Digest : 2009-12-12
Sorry I have been quiet recently; I've been working very hard for the past two weeks on a migration of an old, tired SQL Server 2005 database to a brand new 2008 cluster with a faster SAN undercarriage, double the memory, and twice as many CPUs. I'll blog more about that later, because other than a few user errors, the whole process was actually quite smooth, and it is already paying HUGE dividends. But due to the preparation, the only breaks I've really taken in these two weeks were to smack a volleyball around for a couple of hours at a time, and to attend SNESSUG, where Adam Machanic was unable to present (though I must say Grant Fritchey did an excellent job of filling in on the fly). I also was not able to participate in T-SQL Tuesday, which looked like a lot of fun, and I'm sorry I missed it. Next month!
So, long story short, I haven't had much time to review the new submissions from other Connect users, and will probably fall back to an every-other-week or once-a-month schedule. That said, during my migration process, I did come across a number of minor issues that I found worthy enough to post myself, and thought I would spread the word now that I'm not feverishly monitoring the bread and butter of my division.
For the love of all that is holy, stop writing code samples that use MM/DD/YY
I know it's been almost 10 years, but has Microsoft forgotten about Y2K already? The gravity of that problem was certainly more hype than reality, but the core lesson remains the same: if you represent dates in bad ways, they will come back to bite you. I am sure there are plenty of other documentation samples like this, but I found a particularly bad one in the sp_delete_backuphistory topic, where '08/20/98' is used as a string literal representing August 20th, 1998. At least they chose a date where the format is unambiguous to a human, but that format is going to cause real problems on a system where MM/DD/YY isn't the default interpretation. Particularly if a sample uses a date which will not error out when the month and day are transposed, such as August 7th. Err, I mean, July 8th. Wait, maybe that was August 7th. This careless and US-centric format was my biggest beef with this topic, but I pointed out a couple of other potential enhancements as well, which could carry over to other topics in general:
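To see why the format matters, here is a quick sketch (not from the topic itself) showing how the very same string literal flips meaning depending on the session's DATEFORMAT setting, while the unambiguous formats do not:

```sql
-- Under 'mdy' (the US English default), '08/07/98' is August 7th:
SET DATEFORMAT mdy;
SELECT CONVERT(datetime, '08/07/98');  -- 1998-08-07

-- Under 'dmy' (e.g. British English), the same literal is July 8th:
SET DATEFORMAT dmy;
SELECT CONVERT(datetime, '08/07/98');  -- 1998-07-08

-- These formats, on the other hand, are interpreted the same way
-- regardless of DATEFORMAT or language settings:
SELECT CONVERT(datetime, '19980807');             -- yyyymmdd
SELECT CONVERT(datetime, '1998-08-07T00:00:00');  -- ISO 8601 with the T
```

If the documentation samples simply used yyyymmdd everywhere, nobody would ever have to guess.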
Estimating data compression savings, one table at a time
Obviously part of my push toward SQL Server 2008 had to do with compression. On the first day in the new system, we are already reaping tremendous benefits from backup compression. But even under normal OLTP operations, we have some big databases that are mostly I/O-bound, so there are several areas in our platform where data compression would be a big win. In some shops, data compression is all about disk space, but for us, it is about trading a little bit of CPU for less I/O. Even our high-powered EMC CLARiiON is pushed to the limit at certain points in our business cycle.
While calculating where we would get the most bang for our buck, I came across two shortcomings in sp_estimate_data_compression_savings. One is that the result set returns index_id, but not index_name; the latter would be much more useful if I am then going to build DDL commands to implement those compression suggestions. (An afterthought I'm just having now: why not also produce the ALTER INDEX…REBUILD statement, much like the Missing Index Details functionality in a query plan does, and which Tibor replicated for on-demand use here?)
The other is that you must specify whether you want to estimate page or row compression savings. Why can't I estimate both, and compare? That would tell me in one call whether my data distribution is more suitable for one of the options. If I could pass NULL into the @data_compression parameter, the procedure could estimate both and return a combined result set (or two result sets). While the workaround is obviously quite trivial, to me it's more about how easy it would be to make this procedure more convenient, rather than making every user implement these kludgy workarounds. So I submitted this suggestion, which referenced both of these enhancement ideas:
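Until something like that ships, the trivial workaround I alluded to can be sketched roughly as follows (dbo.MyBigTable is a placeholder for whatever table you are sizing up); as a bonus, the final query joins back to sys.indexes to pick up the index_name the procedure omits:

```sql
-- Hedged sketch: run the estimate once per compression type, capturing
-- each result set with a label column so the two can be compared.
CREATE TABLE #estimates
(
    data_compression    varchar(4),
    [object_name]       sysname,
    [schema_name]       sysname,
    index_id            int,
    partition_number    int,
    current_size_KB     bigint,
    requested_size_KB   bigint,
    sample_current_KB   bigint,
    sample_requested_KB bigint
);

INSERT #estimates ([object_name], [schema_name], index_id, partition_number,
    current_size_KB, requested_size_KB, sample_current_KB, sample_requested_KB)
EXEC sys.sp_estimate_data_compression_savings
    @schema_name = N'dbo', @object_name = N'MyBigTable',   -- placeholder
    @index_id = NULL, @partition_number = NULL,
    @data_compression = N'ROW';
UPDATE #estimates SET data_compression = 'ROW'
    WHERE data_compression IS NULL;

INSERT #estimates ([object_name], [schema_name], index_id, partition_number,
    current_size_KB, requested_size_KB, sample_current_KB, sample_requested_KB)
EXEC sys.sp_estimate_data_compression_savings
    @schema_name = N'dbo', @object_name = N'MyBigTable',   -- placeholder
    @index_id = NULL, @partition_number = NULL,
    @data_compression = N'PAGE';
UPDATE #estimates SET data_compression = 'PAGE'
    WHERE data_compression IS NULL;

-- Pick up index_name from sys.indexes (it is NULL for a heap, index_id 0):
SELECT e.data_compression, i.name AS index_name, e.index_id,
       e.current_size_KB, e.requested_size_KB
FROM #estimates AS e
LEFT JOIN sys.indexes AS i
    ON  i.[object_id] = OBJECT_ID(QUOTENAME(e.[schema_name])
                                  + N'.' + QUOTENAME(e.[object_name]))
    AND i.index_id    = e.index_id
ORDER BY e.index_id, e.data_compression;
```

Two procedure calls and two updates for every table you want to check, which is exactly the kind of kludge the suggestion is meant to eliminate.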
I was also going to complain that you couldn't just run the procedure for the entire database, instead of the current method which is to run it one table at a time. Then I noticed MadDog's item, which asks for this:
Of course, Paul Nielsen wrote a procedure that does this for you, but it would be great to have that support in the product. Especially since there is already an object_name column in the result set — why would we need that if we are restricted to calling the procedure for each table? Maybe it was something they initially intended to do, but then they realized that it wouldn't look good if someone with a 6TB database ran it and it took 6 weeks to complete. 🙂
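Paul's procedure is the real deal; for the impatient, the brute-force version of "run it for the whole database" is just a loop over sys.tables, something along these lines (this sketch uses PAGE and dumps one result set per table rather than collecting them):

```sql
-- Hedged sketch: call sp_estimate_data_compression_savings once per
-- user table, since the procedure will not take a whole database.
DECLARE @schema sysname, @table sysname;

DECLARE tables_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT s.name, t.name
    FROM sys.tables AS t
    INNER JOIN sys.schemas AS s ON s.[schema_id] = t.[schema_id];

OPEN tables_cur;
FETCH NEXT FROM tables_cur INTO @schema, @table;
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC sys.sp_estimate_data_compression_savings
        @schema_name = @schema, @object_name = @table,
        @index_id = NULL, @partition_number = NULL,
        @data_compression = N'PAGE';
    FETCH NEXT FROM tables_cur INTO @schema, @table;
END
CLOSE tables_cur;
DEALLOCATE tables_cur;
```

Which, to be fair to the SQL Server team, also demonstrates the 6TB/6-week problem: each call samples the table into tempdb, so pointing this at a big database during business hours is not a great idea.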
Upgrade advisor, can you be more specific?
A few days ago I realized that if you have some questionable SQL in a stored procedure (in this case it insisted I had an ORDER BY on a constant when I certainly didn't, which is a bug for a different day), the Upgrade Advisor reports the name of the procedure, but not the schema. I have been using schemas more and more to divide my objects logically (and alphabetically), and there are some cases where I do have collisions on object name alone (for example, staging tables). So, it would be much more useful if the Upgrade Advisor could report the fully qualified name.
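In the meantime, when the tool hands you a bare object name, a query like this will at least list every schema-qualified candidate so you can work out which one it meant (StagingCustomers is just a placeholder for whatever name it reported):

```sql
-- List every object, in any schema, matching an unqualified name.
SELECT QUOTENAME(s.name) + N'.' + QUOTENAME(o.name) AS qualified_name,
       o.type_desc
FROM sys.objects AS o
INNER JOIN sys.schemas AS s ON s.[schema_id] = o.[schema_id]
WHERE o.name = N'StagingCustomers'   -- placeholder for the reported name
ORDER BY qualified_name;
```

With one collision this is a minor annoyance; with a tool that spits out dozens of findings, it adds up quickly.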
But I don't want all these DTA objects
To test a workload, I ran the Database Engine Tuning Advisor against a production instance during a maintenance window. The next time I looked in msdb, I was appalled to see the vast number of objects DTA creates, and even more appalled that there isn't a straightforward and automated way to get rid of them (much like diagramming objects… you can add them with one click, but it is a real pain to remove them). So I filed this suggestion, and included a workaround demonstrating how I got rid of them on my own, in case you don't want to wait for them to fix it:
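The workaround boiled down to something like this sketch, which only generates the DROP statements rather than executing anything, so you can review them first (and note that the DTA tables reference each other with foreign keys, so the output may need more than one pass, or the constraints dropped first):

```sql
-- Generate (but do not execute) DROP statements for the tables the
-- Database Engine Tuning Advisor leaves behind in msdb.
USE msdb;

SELECT N'DROP TABLE ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) + N';'
FROM sys.tables AS t
INNER JOIN sys.schemas AS s ON s.[schema_id] = t.[schema_id]
WHERE t.name LIKE N'DTA[_]%';
```

Copy the result set into a new query window, read it, and then decide for yourself; as always with msdb, measure twice and cut once.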
Always remember to disable auto-updates on new machines
And finally, due to a bug (sorry, "feature") in Windows Crapdate – which by default reboots Windows 7 machines when they receive minor and useless updates – I discovered a minor flaw in the auto-recover option in SSMS. Don't get me wrong, this enhancement to SSMS is absolutely essential, and it works great if you have one SSMS window open when your system crashes (or reboots without asking). It is not so great if, like me, you typically have three or four instances open at a time. What happens is that when your system recovers, you are offered the chance to recover the files that were open in *one* instance of SSMS at the time of the crash. Which instance seems arbitrary, though I bet it is either the one that Windows forced closed first or last.
Jonathan Kehayias reminded me that you can retrieve these files manually from My Documents \ SSMS \ Backup Files\, and I instantly remembered that I had done this a couple of weeks prior, when I hit a different problem. When you try to open multiple such .sql files, every single file prompts you to connect; past a certain number of files, clicking Connect or Cancel that many times just becomes unbearable, and I'm not sure why there can't be a checkbox that says "remember my choice"… especially if I decline the offer to connect early on. So, I offer both of these suggestions: