September 21, 2007 | SQL Server

Elaborations on my SQL Server 2008 slides

After attending a few sessions at PASS based on much newer builds of SQL Server 2008, I have some additional information to complement what I published in my slides.  (Since I am not scheduled to present the talk again, I am not going to go back and revamp the slides every time I find out a new piece of information.)


One important change that I briefly mentioned was IntelliSense in Management Studio.  I don't have a recent enough build to play with it yet (the next CTP is slated for October), but Bill Ramos has told us that for SQL Server 2008 it will only be supported for SELECT queries.  My biggest concern with the feature is not this limited functionality, but rather the potential performance problems; using SQLPrompt I always ended up turning it off, because it tended to be very slow against my data center.  I'm not sure whether that was the network, the complexity of the schema, or a combination, but neither of those factors is likely to change much before SQL Server 2008 is released.  The good news is that you can turn IntelliSense on and off just like any other Management Studio option.
 
They are also adding collapsible region support (similar to regions in Visual Studio).  I do not know if you are going to be able to define your own regions like you can in C# (e.g. using a comment like --REGION foo), since the only element of this I saw had automatic regions defined at CREATE PROCEDURE, BEGIN, BEGIN TRY, etc.  But I do know that, like IntelliSense and other Management Studio options, you will be able to turn this feature on and off.  I did file a Connect enhancement request for this issue: SSMS : Provide regions in Query Editor / sp code (like VS)
 
During one of the presentations I gave last week, an audience member asked how they were possibly going to remove the IIS requirement for Reporting Services.  I received an answer about this today, and it is not very technically detailed, but basically they only require access to the HTTP stack (e.g. http.sys).  This will work on XP, Vista, 2003 and 2008 without IIS configured at all.  There was a brief mention that there is the potential for some kind of port conflict if you *do* enable IIS after setting up Reporting Services in non-IIS mode.  I hope they remove that annoying yellow alert symbol during setup if IIS isn't configured (a warning that occurred before you even told setup whether you were interested in installing Reporting Services in the first place).
 
To expand a bit on backup compression, it is currently planned as a simple enhancement to the BACKUP DATABASE command (WITH [NO] COMPRESSION).  You can set the default compression (on/off) at the instance level using sp_configure.  What is missing is the ability to set the default compression setting at the database level, since you may have a server with different types of databases that have different compression results (both rate and speed), and may want all backups of one database to be compressed, and all backups of another to be NOT compressed.  I filed a request on Connect for this (Allow default backup compression at database level), and the initial response has been positive, so we may see it.  I have also requested they expand this to support a different default for FULL vs. DIFF backups (Expand default backup compression to treat full/diff).  This one is not as important to me, but based on the response, it also seems to have been deemed worthwhile.  (Note that I think Bill or Kevin accidentally responded to both issues in only one of them.)
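Based on what was shown, a sketch of how this might look; the exact sp_configure option name is my guess, and everything here is CTP-era syntax that could change before RTM:

```sql
-- Compress a single backup explicitly:
BACKUP DATABASE AdventureWorks
  TO DISK = N'C:\Backups\AdventureWorks.bak'
  WITH COMPRESSION;

-- Or use the NO variant to override a compressed-by-default setting
-- (per the slide, the planned form is WITH [NO] COMPRESSION).

-- Set the instance-wide default; this option name is my assumption:
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;
```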
 
I was not happy with the amount of detail I was able to provide on Change Data Capture and its associated feature Change Tracking, so I'd like to go over a few points here.

 
First of all, to my surprise, Change Data Capture is only available in the Enterprise Edition SKU.  So, you will not be able to use this functionality in Express, Workgroup or even Standard Edition.  This was a question posed at the NESQL presentation, and at the time, I just didn't know the answer (but I did hint that I *suspected* it would be an upper level feature).
 
Next, I explained in my talk that CDC is not DDL-aware: if you add a column to a table that is being captured, the capture table is ignorant of it and is not updated.  In order to capture the new column, you need to disable and re-enable CDC, which means you will lose the existing captured data (unless you archive it off first).  The suggestion from Gopal in the session at PASS was to create a second instance of CDC (so you have two capture tables for a short period), which will pick up the new column, and then disable the first instance.  This way you don't lose any of the old data, and from that point on you start collecting the new column.
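Gopal's two-instance workaround might look something like this (the procedure names are from the current CTP and could change; the table and instance names here are made up):

```sql
-- dbo.Orders is already captured under a capture instance 'Orders_v1'.
-- After adding the new column, enable a second capture instance
-- that picks up the current column list:
EXEC sys.sp_cdc_enable_table
     @source_schema    = N'dbo',
     @source_name      = N'Orders',
     @role_name        = NULL,
     @capture_instance = N'Orders_v2';

-- Changes are now logged to both instances. Once consumers have
-- switched over to cdc.fn_cdc_get_all_changes_Orders_v2, drop the
-- original instance:
EXEC sys.sp_cdc_disable_table
     @source_schema    = N'dbo',
     @source_name      = N'Orders',
     @capture_instance = N'Orders_v1';
```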
 
They are providing LSN <=> date/time functions (e.g. sys.fn_cdc_map_time_to_lsn()) so that you can easily determine all of the changes that occurred in a specific time range, rather than having to be fluent in log sequence numbers.  They are also making it easy, via sys.fn_cdc_get_min_lsn('<Captured_Table_Name>') and sys.fn_cdc_get_max_lsn(), to pass the entire valid LSN range into cdc.fn_cdc_get_all_changes_<Captured_Table_Name>() and cdc.fn_cdc_get_net_changes_<Captured_Table_Name>().
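Putting those together, pulling a day's worth of changes might look roughly like this (the relational-operator strings and a capture instance named Orders are my assumptions based on the CTP):

```sql
DECLARE @from_lsn binary(10), @to_lsn binary(10);

-- Map wall-clock times to LSN boundaries:
SELECT @from_lsn = sys.fn_cdc_map_time_to_lsn(
       'smallest greater than or equal', '20070920 00:00');
SELECT @to_lsn = sys.fn_cdc_map_time_to_lsn(
       'largest less than or equal', '20070921 00:00');

-- All changes to the captured table in that window:
SELECT * FROM cdc.fn_cdc_get_all_changes_Orders(@from_lsn, @to_lsn, N'all');

-- Or simply the entire valid range:
SELECT * FROM cdc.fn_cdc_get_all_changes_Orders(
       sys.fn_cdc_get_min_lsn('Orders'), sys.fn_cdc_get_max_lsn(), N'all');
```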
 
I am not crazy about the names for a lot of the functions and procedures, and have requested that they augment at least the CREATE / ALTER TABLE syntax so that you can turn CDC on and off using DDL instead of system stored procedures.  I don't expect any big changes there, because there were several factors involved (including time constraints and developer resources), but if you agree with me, you can vote: CDC : provide CREATE/ALTER TABLE options to enable CDC
  
Change Tracking, which I had no visibility into before I put together my slides, is an immediate, transactionally consistent tracking mechanism allowing you to see changes as they happen.  This can be used for auditing who changed what data when, but you can also use it to provide update mechanisms for cached read-only catalogs or offline stores (think Outlook in cached mode).  It can also be used to support optimistic concurrency: for example, when a client wants to update a row, it can check the change tracking information to make sure the row hasn't been changed in the meantime, and only update the row if it hasn't been touched (however, in this case there didn't seem to be an easy way, other than checking @@ROWCOUNT, to bubble an error up to the client).  It also supports ADO.NET / sync services.
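That check-before-update pattern could be sketched like this; the table, columns, and version handling are all hypothetical, and the CHANGETABLE(CHANGES ...) form is based on the CTP:

```sql
-- The client remembers the change tracking version it last synced at:
DECLARE @last_sync_version bigint;
SET @last_sync_version = 42;  -- hypothetical value read earlier by the client

UPDATE o
   SET o.Status = 'Shipped'
  FROM dbo.Orders AS o
 WHERE o.OrderID = 1001
   AND NOT EXISTS
   (
     -- any row here means someone changed this order after our baseline
     SELECT 1
       FROM CHANGETABLE(CHANGES dbo.Orders, @last_sync_version) AS ct
      WHERE ct.OrderID = 1001
   );

-- As noted above, @@ROWCOUNT is about the only way to surface a conflict:
IF @@ROWCOUNT = 0
   RAISERROR('Row was modified by someone else; refresh and retry.', 16, 1);
```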
 
Change Tracking will be allowed via DDL and through Management Studio (it is definitely more put together than CDC):
 
ALTER DATABASE dbname SET
(
  CHANGE_TRACKING = ON 
  (CHANGE_RETENTION = 48 HOURS)
);
 
-- I'm not too crazy about the new paren syntax here
 
ALTER TABLE dbo.table
ENABLE CHANGE_TRACKING
WITH 
(
   TRACK_COLUMNS_UPDATED = ON
); 
 
There is a new table-valued function called CHANGETABLE() which returns the set of changes that have occurred from a baseline.  There are also new functions CHANGE_TRACKING_CURRENT_VERSION() and CHANGE_TRACKING_MIN_VALID_VERSION() so you can determine what your database-wide baseline should be (this is based on committed transactions).  There is also a CHANGE_TRACKING_IS_COLUMN_IN_MASK() function to make querying for changes slightly more efficient, since you can use it to limit results to the rows where the column you care about actually changed.
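A polling pass over those functions might look roughly like this (the column names in the CHANGETABLE result set are my assumptions from the session):

```sql
DECLARE @baseline bigint, @current bigint;
SET @baseline = 10;  -- hypothetical version saved from the previous sync
SET @current  = CHANGE_TRACKING_CURRENT_VERSION();

-- A saved baseline is only usable if it hasn't aged out of retention:
IF @baseline < CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID('dbo.Orders'))
    PRINT 'Baseline too old; re-initialize from a full copy of the table.';
ELSE
    SELECT ct.OrderID, ct.SYS_CHANGE_OPERATION, ct.SYS_CHANGE_VERSION
      FROM CHANGETABLE(CHANGES dbo.Orders, @baseline) AS ct;

-- @current then becomes the baseline for the next sync.
```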
 
The impact of Change Tracking is very light… roughly the same maintenance overhead as you would experience if you had added a secondary index to the table.  This varies slightly depending on the type of DML operation, but it still proves far better than any existing trigger-based solution.
 
When I was originally told that I shouldn't use Change Data Capture for auditing, I was led to believe that Change Tracking could be used for that instead.  Sadly, that isn't the case either, since the model is not extensible to support custom columns such as SPID, HostName, ApplicationName, etc.  Parts of this can be accomplished using the third area of tracking in SQL Server 2008: Auditing.
 
Auditing is focused on compliance, but allows you to audit DML, DDL, SELECT, permissions, and login/logout events.  I don't have any deeper technical details than that, since there are no bits exposed that support this functionality yet.  I can tell you that auditing captures entire statements, not individual column values, so if you want to know who updated the row *and* the before and after values of the column (or even the after value without parsing the actual statement, since you can't be sure it hasn't been changed again since the event occurred), you would have to combine auditing and CDC somehow (presumably joining on LSN or at least an LSN range).  It may very well be the case that they are trying to provide some basic auditing functionality without stepping on the toes of ApexSQL, Lumigent, et al., but, big surprise, I have filed a Connect item for this as well: Change Tracking : complete solution is not there yet
 
I filed a couple of other enhancement requests for Change Tracking, and they might make a little more sense to you once the next CTP is available.
 
So, we have three new change tracking mechanisms in SQL Server 2008, all with pros and cons and their own use case scenarios.  It is important to note that CDC and Auditing are available only in Enterprise Edition, while Change Tracking is available in all SKUs (yes, even Express).
 
That's all I have for now; I hope the updates have been informative… 

9 comments on this post

    • cinahcaM madA - September 22, 2007, 12:43 AM

      Is the "auditing" you're referring to different than what we have today in 2005 (Common Criteria audit mode)?

    • AaronBertrand - September 22, 2007, 1:03 AM

      Yes, I believe it is a brand new feature.  Of course I was bitching and moaning so much about CDC and Change Tracking that they didn't get to the Auditing slides.  🙂

    • Chris Randall - September 22, 2007, 1:11 AM

      "Next, I explained in my talk that CDC was not DDL aware, meaning if you add a column to a table that is being captured, the capture table is ignorant and is not updated, and in order to capture the new column, you need to disable and re-enable CDC, which means you will lose the existing captured data (unless you archive it off first).  I learned today that they are going to fix this in RTM so that capturing would not have to be turned off and on, though I wasn't entirely clear if it is completely transparent or if you have to do something about it."
      I don't think this is the case (I was at the same session with you, Aaron). Gopal indicated that you would need to either capture to a new instance or disable/enable  CDC. He didn't mention any fix in RTM. Did you hear this elsewhere at PASS today?

    • AaronBertrand - September 22, 2007, 1:18 AM

      Maybe I misheard?  I definitely got the impression from his statement that it would be more seamless than now… I think my interpretation was that you could transfer the existing CDC to the new instance without disabling / enabling.

    • AaronBertrand - September 22, 2007, 1:26 AM

      I think you're right Chris, and I've updated the main body above.  Let's hope they find some way to make this more seamless, although I think the combination of perpetual CDC and rapidly-changing schema is going to affect a very small number of customers.

    • Chris Randall - September 22, 2007, 4:51 AM

      Seamless would be good, as would more information on SQL Audit so we could see if indeed there are the big gaps between change-tracking tools in Katmai that it appears there are (data but not who, who and statement but not data, etc).

    • AaronBertrand - September 22, 2007, 6:52 AM

      Yes, I think that gap is going to affect a large number of customers who are going to continue to need a more comprehensive and expensive companion tool.  🙁

    • gopal - September 22, 2007, 8:37 PM

      What I talked about exists today. CDC allows you to enable two capture instances per table. So let's say you have CDC enabled for tableA with capture_instance name tableA_inst1, and you are consuming changes using the TVF corresponding to that instance.
      Now you add a new column to the table. What you can do is enable a second capture instance which includes this column. This creates a second set of change tables and a TVF based on the new capture instance definition.
      The catch here is that we log to both sets of change tables. So when you want to consume the new result set, you can switch over to using the new TVF to get the complete result set without missing any changes. There is overhead while you have both instances enabled, but once you have switched over, you can drop the first instance.
      Try that out and let me know your thoughts.

    • db042188 - February 27, 2008, 9:22 PM

      Adam, what is "Common Criteria audit mode"?  Can't find anything on this subject.  Anyone, is the 2008 auditing feature going to use something like business rules to log info to a file, app log or security log, as I'm seeing in one ppt?  Will insert, update and delete be audited with a userid?  Will Katmai somehow protect the audit file itself from fraudulent activity?  We're considering holding off on a custom security component for our app, one that calculates a hash at the business layer, in hopes that Katmai will be addressing application circumvention at the data layer in a robust way.

Comments are closed.