Peeking under the hood : what bits have changed in SQL Server 2008 R2?
We know what new features are available in SQL Server 2008 R2: data-tier applications (still with that unfortunate and poorly-chosen acronym, DAC), control point explorer, dashboard viewpoints, report builder 3.0, Unicode compression, StreamInsight, yada yada. To me, at this early point in the release cycle, these features are, well, boring — mostly because they are not yet fully implemented yet, or in a few cases, because I just haven't had the time to dig into them yet. I do plan on doing some further investigation into the performance impact of Unicode compression, and provide you with some metrics that might help you determine if this might be a useful feature for you. In the meantime, I wanted to get my hands dirty. I'm not talking about a little grease or a smudge of chocolate : I'm talking REAL dirty.
I installed an instance of SQL Server 2008, and left it at RTM, in spite of setup's assurance that RTM has compatibility issues with Windows Server 2008 R2. I then set up a second instance of SQL Server 2008, and applied SP1 + CU3. Finally, I set up an instance of SQL Server 2008 R2. To summarize, my setup looks like this:
|\SQL2008A||SQL Server 2008 RTM||10.0.1600.22|
|\SQL2008B||SQL Server 2008 SP1+ CU3||10.0.2723.0|
|\SQL2008R2||SQL Server 2008 R2||10.50.1092|
Now that I have the environment set up, I'll explain what I'm going to do: compare versions of the mssqlsystemresource database. I've always been fascinated by the resource database, and the fact that you can't gain access to it at all, at least under normal circumstances. It has always felt like this taboo thing, like you do not talk about Fight Club, and I constantly wonder what they put in there under lock and key that they couldn't put in master. Back when SQL Server 2005 first introduced this "hidden" database, it wasn't a very well-kept secret that you could shut down SQL Server, make a copy of the MDF and LDF files, and then attach them under a different name; voila, a user database with all the resource database code and data.
So, I go to the Services applet (one way to get there is through Control Panel | Administrative Tools) and systematically bring down the SQL Server services for the three instances noted above. Once the services are stopped successfully, I go into the following folders (the paths might be different for you, depending on your data drive and if you customized your Instance ID separate from your instance name):
C:\Program Files\Microsoft SQL Server\MSSQL10.SQL2008A\MSSQL\Binn\ C:\Program Files\Microsoft SQL Server\MSSQL10.SQL2008B\MSSQL\Binn\ C:\Program Files\Microsoft SQL Server\MSSQL10_50.SQL2008R2\MSSQL\Binn\
In each of these folders in Windows Explorer, I hold Shift and right-click on the Binn folder, select "Open command window here", and copy the following files:
copy mssqlsystemresource.mdf resourcecopy.mdf copy mssqlsystemresource.ldf resourcecopy.ldf
Then I start all three instances, open a query window for each, and run the following (replacing %PATH% in each case with the relevant path listed above):
CREATE DATABASE ResourceCopy ON ( FILENAME = '%PATH%\resourcecopy.mdf' ) LOG ON ( FILENAME = '%PATH%\resourcecopy.ldf' ) FOR ATTACH;
I probably could have whipped up a script to perform all of these things for me, but it might have taken just as long (or probably longer). So here we are.
Now I have three very accessible copies of the resource database, and I just need a means to compare them. In this case I am going to use the tool I use in my everyday work, Red-Gate SQL Compare. I'm not trying to play favorites; it is just the tool I've been using since day one. Any capable product will do the job, of course; I provide a very comprehensive list of alternatives in my blog post on the cost of reinventing the wheel – mostly just to illustrate that there are so many products to choose from, that you really should try several to see which works best for you (if you're not already set in your ways, like me).
In SQL Compare, you choose File | New Project, and pick the two databases you want to compare, as follows:
After clicking Compare Now, the tool brings back up to four sets of objects: those that are in both but have changed, those that are only in the left database, those that are only in the right database, and then the set of identical objects. For each object, when you highlight it, the bottom pane will show the code on each side, and highlight the differences. (If you want a quick way to scroll through a bunch of objects and view their definition in any database, just compare it to an empty database. This can be much less tedious than writing your own script that extracts them all, or right-click each object from Object Explorer.)
Anyway, here are the results. I'm not going to go through each and every change, but I will highlight the parts that I found interesting.
Differences in 2008 and 2008 R2 : resource database
A lot of changes, but several of them I write off almost immediately as uninteresting. For example, all of the sp_MS changes around replication and log reading are just the tightening up of a filter, adding a clause that prevents pulling rows from MSdistribution_history where the comments column does not start with '<stats state'. Yawn. Some obvious bug fixes in the data compression procedure (issues with clustered index) and to sp_help (they were missing BIGINT as a type that has some tie to precision/scale), and a new column in the catalog view sys.change_tracking_databases (max_cleanup_version).
Even more interesting than the deltas were a couple of new objects. We see functions that would be extremely useful if we could call them ourselves, such as fn_getVolumeFreeSpace and fn_getProcessorUsage. These are obviously called by other system functions that can access the MpuSqlClrWrapper assembly, as when I try to call them as a
mere peon sysadmin, I get the following error:
Msg 15351, Level 16, State 1, Line 1 The CLR procedure/function/type being signed refers to an assembly that is not signed either by a strong name or an assembly.
But this seems to be paving the way for integrated CLR objects that we may someday be able to use without writing our own assemblies (who wouldn't like to see RegEx included in the box)?
Another thing I noticed is that there are some funny coding habits revealed in system objects. I remember working with a guy who always wrote code like, "WHERE 50 = CustomerID," and it drove me batty. Here in the code for sys.sp_dropmessage, I see this doozie:
... AND 0 = SERVERPROPERTY('IsSingleUser')
That just makes my head spin. (Funny enough, the guy I worked with now works for MS, but not in the SQL Server group, so I'm fairly confident the code is not his.) It would make much more sense, at least to me, as:
... AND SERVERPROPERTY('IsSingleUser') = 0
I guess I'm just a formula/column/variable/defined thing on the left, and constant/less known thing on the right kind of guy. I would rather hear the juicy part last, e.g. "my father's flight arrives at 6 PM" vs. "at 6 PM is when my father's flight arrives." Which version is more memorable, so that you'll show up at the airport on time? To each his/her own, I guess, but I'm afraid that if I was in there writing some of this code I'd be doing a lot of, er, constructive edits. 🙂
I also compared the two 2008 instances, just because I was curious what type of changes are made to the resource database in service packs and cumulative updates. As it turns out: a lot. There were 34 changed objects (but no new objects) between RTM and SP1 CU3. Some of these objects overlapped with the objects changed above, but some didn't. That seemed strange at first, but then I realized, of course, that a lot of the public 2008 fixes were made *after* the R2 code was branched off, so many of those updates simply haven't made it into that line yet. Here is the graphical output:
Differences in 2008 RTM and 2008 SP1 CU3 : resource database
Most of the changes were already discussed above; but there were a couple missing (e.g. the fixes to sp_help and the new column in sys.change_data_tracking). But there are also a couple of fixes that did not yet make it into 2008 R2; the most interesting deltas to me are the ones where they use stuff that we can't use. For example, sp_unbindefault and sp_unbindrule make calls to EXEC %%ColumnEx() and EXEC %%System().FireTrigger()… we obviously can't call these ourselves, but it's interesting to see how and when these system functions are used, and what kind of problems get uncovered over time (in this case, the code in RTM hard-coded a column_id of 1, whereas a fix in SP1 or a CU made it pass in the @col_id value that is determined just before this block).
For fun, I also compared the system databases msdb and master between 2008 RTM and 2008 R2. Nothing too exciting here… most of the changes are in certificates, though msdb has a few new utility roles.
Differences in 2008 and 2008 R2 : master database
Differences in 2008 and 2008 R2 : msdb database
Did I learn a lot from this exercise? Not really. But I hope I demonstrated that it doesn't take much to dig into the product and examine deltas (or just effortlessly look at the resource database code, or any database code really).
I'm wondering how I can use sys.fn_getVolumeFreeSpace?
will it work if I copy the resource database, as above?
you are detail oriented!
Thanks for the clarification, Mark!
> Result [of yourVariable = 0]: Assign 0 to yourVariable and return true because the assignment succeeded. Oops.
Not exactly – the result is the value of the assignment, which in this case is 0. In C, it will be implicitly cast to a boolean, where 0 is false. The more common form of this mistake was using *anything other* than 0 – since everything !0 is true.
In C#, there is no conversion from int to boolean – so you can't make that mistake. You can, however, still use the result of an assignment:
if ((yourVariable = 0) == 0)
Either there is a bug in SQL Compare that ignores UCP/DAC, or these features utilize existing schema, or — most likely — it doesn't actually build any schema until you do something with it. There are definite precedents here, such as CDC/change tracking, and even features as simple as database diagrams. So perhaps my results would be different if it wasn't a brand new instance and I had already played with UCP/DAC before performing my comparisons.
>>For fun, I also compared the system databases msdb and master >>between 2008 RTM and 2008 R2. Nothing too exciting here… most >>of the changes are in certificates, though msdb has a few new >>utility roles.
Are you sure about this. Where is all the schema for UCP and DAC?
Just a note on why certificates change at each resource db version change: the certificates used to sign stuff in the resource database drop their private key right away, actually as part of the resource db make process. This way it ensures that the private key is never lost (since is destroyed) and it cannot ever be used to sign a fraudulent module in the resource db. The signatures can always be verified with the public keys. Because of this practice each build will create a new certificate to sign stuff in the resource db.
FYI, a lot of people started that "constant on the left" style because of C and C++. In those languages = is for assignment and == is for equivalence and so if you wanted to know whether yourVariable is 0, you would do:
if (yourVariable == 0)
… but the following is also legal:
if (yourVariable = 0)
Result: Assign 0 to yourVariable and return true because the assignment succeeded. Oops.
To "solve" that problem a lot of people started putting the constant on the left, which would result in an error instead of an accidental assignment. Luckily, C# did away with allowing this syntax, and of course it's never been possible to do something like that in SQL, so the syntax is a bit outdated at this point.