Playing with CDC in Katmai

Change Data Capture (CDC) is something DBAs around me have been screaming about for ages.  For auditing and other reasons, we have needed to rely on 3rd party tools from companies like Lumigent, whose motto for their auditing product previously named Entegra went something like, "Who did what to which data when?"  It's a slogan I don't think I'll ever forget, and not just because I have it plastered on a T-Shirt I grabbed at PASS or SQL Connections that year…

Much to the delight of SQL Server DBAs everywhere (and surely to the chagrine of 3rd party vendors like Lumigent), there is now CDC functionality coming in Katmai (sorry, SQL Server 2008), and you can play with it in the June CTP.  (If you haven't read in one of the hundreds of blog posts on this site and elsewhere, the June CTP is available here.)

I started playing with this feature, and it is very easy to set up.

CREATE DATABASE CDCTest;
GO
 
USE CDCTest;
GO
 
EXEC sys.sp_cdc_enable_db_change_data_capture;
GO
 
CREATE TABLE dbo.Test
(
  id INT PRIMARY KEY,
  tx NVARCHAR(MAX),
  dt DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
);
 
EXEC sys.sp_cdc_enable_table_change_data_capture
     @source_schema = N'dbo',  
     @source_name   = N'Test',  
     @role_name     = NULL;  
— there are many more options to this procedure; this is the default
— behavior, including capturing all columns in the table.

Now, in two out of three tries from scratch, I've found that I needed to restart SQL Server Agent for data changes to start being captured (I filed a suggestion on Connect).  So, to be sure, before we put any data into the table, let's go to SQL Server Agent in Object Explorer, expand jobs, right-click "cdc.CDCTest_capture" and choose "Start Job at Step…" and get the job running once.  Now:

INSERT dbo.Test(id,tx) SELECT 1, 'foo';
INSERT dbo.Test(id,tx) SELECT 2, 'bar';
WAITFOR DELAY '00:00:05';
UPDATE dbo.Test SET tx = 'splunge' WHERE id = 2;

In order to capture the data changes, there is a new table created in a schema called 'cdc', named <schema>_<table>_CT.  In this case, it contains all of the columns from the table, as well as other information including pointers to the log and the type of operation that was captured.

SELECT * FROM cdc.dbo_test_CT;

You should see something like this:

__$start_lsn __$end_lsn __$seqval __$operation __$update_mask id tx dt
0x0000001A000001360013 NULL 0x0000001A000001360012 2 0x07 1 foo 2007-06-21 11:29:43.670
0x0000001A0000013A0003 NULL 0x0000001A0000013A0002 2 0x07 2 bar 2007-06-21 11:29:43.670
0x0000001A0000013B0004 NULL 0x0000001A0000013B0002 3 0x02 2 bar 2007-06-21 11:29:48.670
0x0000001A0000013B0004 NULL 0x0000001A0000013B0002 4 0x02 2 splunge 2007-06-21 11:29:48.670

For the __$operation column, 1 is a delete, 2 is an insert, 3 is the before image of an update, and 4 is the after image of an update.  (SELECTs are not captured.)  So, we can see my initial two inserts there in the first two rows, and then the last two rows show the before and after image of the update (this is just like the deleted and inserted columns available within a DML trigger). 

[An important item to note here is that if you make schema changes to the table, you need to disable CDC at the table level and then re-enable it in order to capture changes to the new columns.  This "feature" is documented but I don't think it will be very intuitive.]

This gives you a bit of information, but for me, it wasn't quite enough.  What if I want to track (easily!) what user performed that update, changing 'bar' to 'splunge', and when it happened?  Well, I created my own logging table, with a representation of the three columns from the unique index on dbo_test_CT, and two additional columns (username and eventDate):

CREATE TABLE cdc.dbo_test_CT_MoreInfo
(
  startlsn   BINARY(10),
  seqval     BINARY(10),
  operation  INT,
  username   SYSNAME NOT NULL DEFAULT SUSER_SNAME(),
  eventDate  DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (startlsn, seqval, operation)
);

To capture my own data (essentially, lumping CDC on top of CDC) I created the following trigger on the capture table:

CREATE TRIGGER cdc.LogMoreCDCInfo
ON cdc.dbo_test_CT
FOR INSERT
AS
BEGIN
  IF @@ROWCOUNT > 0
  BEGIN
    INSERT cdc.dbo_test_CT_MoreInfo(startlsn,seqval,operation)
      SELECT __$start_lsn, __$seqval, __$operation FROM inserted;
  END
END
GO

And I can retrieve the information using a query like this:

SELECT operation = CASE m.operation 
         WHEN 1 THEN 'delete'
         WHEN 2 THEN 'insert'
         WHEN 3 THEN 'update (before)' 
         WHEN 4 THEN 'update (after)'
         ELSE 'unknown' END,
       c.id,  
       c.tx,
       c.dt,
       m.username,
       m.eventDate
FROM cdc.dbo_test_CT_MoreInfo m
INNER JOIN cdc.dbo_test_CT c
ON c.__$start_lsn = m.startlsn
AND c.__$seqval = m.seqval
AND c.__$operation = m.operation
ORDER BY m.eventDate;

[Note that change data is only kept around for 72 hours by default.  This retention time is stored in minutes (4320) in the msdb.dbo.cdc_jobs.retention column.  I've had reasonable (but not perfect) success in changing this value and having the cleanup job obey.]

Now, you can make the trigger get more information, e.g. host_name from sys.dm_exec_sessions or client_net_address from sys.dm_exec_connections.  I just wanted to open the door.

Note that I did not stress test this solution… not that a stress test on an early CTP would be very realistic, anyway.  I'm just playing around here, so far at least.

Aaron Bertrand

I am a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. I am a long-time Microsoft MVP, write at SQLPerformance and MSSQLTips, and have had the honor of speaking at more conferences than I can remember. In non-tech life, I am a father of two, a huge hockey and football fan, and my pronouns are he/him. If I've helped you out, consider thanking me with a coffee. :-)

9 Responses

  1. vamsi says:

    when i try to enable cdc on a table i get the below error. i tried making db_owner as sa, granted view server state permission but still i get the below error.
    EXECUTE sys.sp_cdc_enable_table
        @source_schema = N'dbo'
      , @source_name = N'CDC_test1'
      , @role_name = Null
      –@supports_net_changes = 1
    GO
    Msg 22832, Level 16, State 1, Procedure sp_cdc_enable_table_internal, Line 607
    Could not update the metadata that indicates table [dbo].[CDC_test1] is enabled for Change Data Capture. The failure occurred when executing the command 'sp_cdc_create_change_table'. The error returned was 297: 'The user does not have permission to perform this action.'. Use the action and error to determine the cause of the failure and resubmit the request.
    Msg 266, Level 16, State 2, Procedure sp_cdc_enable_table_internal, Line 0
    Transaction count after EXECUTE indicates a mismatching number of BEGIN and COMMIT statements. Previous count = 0, current count = 1.
    Msg 3930, Level 16, State 1, Procedure sp_cdc_enable_table, Line 61
    The current transaction cannot be committed and cannot support operations that write to the log file. Roll back the transaction.
    Msg 266, Level 16, State 2, Procedure sp_cdc_enable_table, Line 0
    Transaction count after EXECUTE indicates a mismatching number of BEGIN and COMMIT statements. Previous count = 0, current count = 1.
    Msg 3998, Level 16, State 1, Line 1
    Uncommittable transaction is detected at the end of the batch. The transaction is rolled back.

  2. davetiyeci says:

    düğün davetiyesi ve davetiye sözleri

  3. davetiyeci says:

    düğün davetiyesi ve davetiye sözleri

  4. ShanDBA says:

    Krishna,
    Use System_User instead of SUSER_SNAME().

  5. Krishna says:

    This is not working for me as SUSER_SNAME() inside the cdc trigger is sa not the user making the data change. Is there any config setting I am missing?

  6. Krishna says:

    This is not working for me as SUSER_SNAME() inside the cdc trigger is sa not the user making the data change. Is there any config setting I am missing?

  7. Krishna says:

    This is not working for me as SUSER_SNAME() inside the cdc trigger is sa not the user making the data change. Is there any config setting I am missing?

  8. Krishna says:

    This is not working for me as SUSER_SNAME() inside the cdc trigger is sa not the user making the data change. Is there any config setting I am missing?

  9. Krishna says:

    This is not working for me as SUSER_SNAME() inside the cdc trigger is sa not the user making the data change. Is there any config setting I am missing?