Change data capture is used to audit insert, update, and delete activity for table in SQL Server table. You can go through below link to understand more theoretical knowledge.
http://technet.microsoft.com/en-us/library/cc645937.aspx
To understand change data capture practical, we will follow below steps.
1) Enable Change Data Capture
Query:
EXEC sys.sp_cdc_enable_db
Verify Query:
select is_cdc_enabled from sys.databases where name = db_name()
Output:
2) Create table dbo.Config.
Query:
Create Table dbo.Config(
ConfigID int IDENTITY(1,1) NOT NULL,
ConfigKey varchar(100) NOT NULL,
ConfigValue varchar(4000) NOT NULL,
InsertedBy bigint,
InsertedOn datetime2(7) NULL DEFAULT getdate(),
UpdatedBy bigint,
UpdatedOn datetime2(7) NULL,
constraint PK_Config Primary Key Clustered(ConfigID asc)
)
GO
3) Enable Change Data Capture for table.
Query:
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'Config',
@role_name = NULL,
@supports_net_changes = 1
GO
Verify Query:
Select
is_tracked_by_cdc
From
sys.tables
Where
schema_name(schema_id) = 'dbo'
and name = 'Config'
Output:
4) Perform DML operation on table.
Query:
-- Insert
insert into dbo.Config values('Key','Value',1,default,null,null)
insert into dbo.Config values('Key 1','Value 1',2,default,null,null)
insert into dbo.Config values('Key 2','Value 2',2,default,null,null)
-- Update
update dbo.Config
set
ConfigValue = 'Value 12',
UpdatedBy = 3,
UpdatedOn = getdate()
where
ConfigKey = 'Key 1'
update dbo.Config
set
ConfigValue = 'Value 22',
UpdatedBy = 3,
UpdatedOn = getdate()
where
ConfigKey = 'Key 2'
Verify Query:
Select * from dbo.Config
Output:
5) Get all changes from last day till today.
Query:
Declare
@begin_time datetime,
@end_time datetime,
@begin_lsn binary(10),
@end_lsn binary(10);
Set @begin_time = GETDATE()-1
Set @end_time = GETDATE()
Select @begin_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than', @begin_time);
Select @end_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @end_time);
Select
ConfigId,
ConfigKey,
ConfigValue,
InsertedBy,
InsertedOn,
UpdatedBy,
UpdatedOn
From
cdc.fn_cdc_get_all_changes_dbo_config(@begin_lsn, @end_lsn, 'all')
Output:
6) I have seen most of company or user wants see auditing on column wise instead of row wise.
Here, I tried to create procedure sc.sp_cdc_enable_table which accepts same parameter as dbo.sp_cdc_enable_table with some addition parameters as below to enable change data capture (CDC) for given table and also create function sc.fn_cdc_get_column_changes_<capture_instance> to see column wise changes.
Additional Parameters:
- @inserted_by_column_name
- Name of the Inserted by column.
- Default value is “InsertedBy”
- @updated_by_column_name
- Name of the Updated by column.
- Default value is “UpdatedBy”
To understand practically, I used same table dbo.Config as above and followed below steps.
a) Enable Change Data Capture with Column Wise Tracking. It will enable CDC on table if it’s not and also create function sc.fn_cdc_get_column_changes_<capture_instance>.
Query:
EXEC sc.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'Config',
@role_name = NULL,
@supports_net_changes = 1
GO
Verify newly created function:
b) Get all column wise changes from last day till today using above function.
Query:
Declare
@from_lsn binary(10),
@to_lsn binary(10);
Select @from_lsn = sys.fn_cdc_get_min_lsn ( 'dbo_Config' )
Select @to_lsn = sys.fn_cdc_get_max_lsn ()
select
*
from
[sc].[fn_cdc_get_column_changes_dbo_Config](@from_lsn,@to_lsn,null,null)
Order by
__$start_lsn
Output:
c) Disable Change Data Capture with Column Wise Tracking.
Query:
EXEC sc.sp_cdc_disable_table
@source_schema = N'dbo',
@source_name = N'Config',
@capture_instance = 'dbo_Config'
GO
Verify Query:
Select
is_tracked_by_cdc
From
sys.tables
Where
schema_name(schema_id) = 'dbo'
and name = 'Config'
Output:
Limitation:
-
We can’t use Order by clause inside function so we need to explicitly use order by clause for __$start_lsn to get data in proper sequence. -
We don’t have feature to search last 10 records updated from any table for given time. -
To avoid above limitation, I tried to create a process which will auto audit column wise changes for CDC table to dbo.Audit and dbo.AuditDetails for all CDC enable tables configured in dbo.AuditConfig. -
We will go through all above scenario one by one and their configuration. Before that we need to configure some objects. Below are those objects. (Configure Audit Process.sql)
Schemas:
-
Sc
Tables:
-
Error_Log (Check blog http://www.sqlscientist.com/2013/07/error-logging-and-alerts.html) -
Sc. AuditConfig-
Columns-
ConfigID-
Auto increment ID for table.
-
-
SchemaName-
Schema Name of table for which we want to enable column wise CDC.
-
-
TableName-
Table Name of table for which we want to enable column wise CDC
-
-
InsertedByColumnName-
Inserted by Column Name for table.
-
-
UpdatedByColumnName-
Updated by Column Name for table.
-
-
CapturedColumnList-
Identifies the source table columns that are to be included in the change table. captured_column_list is nvarchar(max) and can be NULL. If NULL, all columns are included in the change table. -
Column names must be valid columns in the source table. Columns defined in a primary key index, or columns defined in an index referenced by index_name must be included. -
Captured_column_list is a comma-separated list of column names. Individual column names within the list can be optionally quoted by using either double quotation marks ("") or square brackets ([]). If a column name contains an embedded comma, the column name must be quoted. -
captured_column_list cannot contain the following reserved column names: __$start_lsn, __$end_lsn, __$seqval, __$operation, __$update_mask, __UpdatedBy and __UpdatedOn
-
-
LastLsn-
Last __$start_lsn value after inserting audit records to audit tables.
-
-
IsActive-
1 = Allow CDC records to insert into audit tables. -
0 = Disallow CDC records to insert into audit tables.
-
-
-
-
Sc.Audit-
Coumns-
AuditID-
Auto increment ID for table.
-
-
SchemaName-
Schema Name of table for which we have audit.
-
-
TableName-
Table Name of table for which we have audit.
-
-
__$start_lsn-
Reference lsn values for CDC table.
-
-
PrimaryKeyName-
Primary Key Name for which we have audit.
-
-
PrimaryKeyValue-
Primary Key Value for which we have audit.
-
-
UpdatedBy-
InsertedBy value from table for newly inserted records -
UpdatedBy value for table for deleted record. -
Null for deleted records from table.
-
-
-
-
Sc.AuditDetail-
Columns-
AuditDetailID-
Auto increment ID for table.
-
-
AuditID-
Reference column from Sc.Audit table.
-
-
ColumnName-
Column for which changes happened on table.
-
-
OldColumnValue-
NULL for newly inserted records -
Old value before changes for updated and deleted records.
-
-
NewColumnValue-
New value for inserted and updated records. -
NULL for deleted records.
-
-
-
Triggers:
-
Sc.TRG_AuditConfig : This trigger is used to execute below steps based on operation in dbo.AuditConfig table.-
Insert-
Enable CDC to track column wise changes using sc.sp_cdc_enable_table.
-
-
Delete-
Disable CDC to track column wise changes using sc.sp_cdc_disable_table.
-
-
Update-
It will restrict users from updating Schema or Table Name.
-
-
Procedures:
-
Sc.sp_cdc_enable_table: This procedure is used to enable CDC with column wise change. We have already seen demo for that in step 6 c -
Sc.sp_cdc_disable_table This procedure is used to disable CDC with column wise change. We have already seen demo for that in step 6 c -
Sc.cp_Audit: This procedure is used to audit records from CDC table using function sc.fn_cdc_get_column_changes_<capture_instance>.
Jobs:
-
DatabaseName.dbo.job_audit-
Name : DatabaseName.dbo.job_audit -
Description: This job is used to audit records for tables which are configured in sc.AuditConfig table. -
Steps-
Name: Step 1 -
Type: Transact-SQL script(T-SQL) -
Command: exec sc.cp_Audit
-
-
Schedules-
Name: Occurs every day every 1 minute(s). -
Schedule type: Recurring -
Occurs: Daily -
Occurs every : 1 minute(s) -
Start Date: Today’s Date
-
-
-
DatabaseName.dbo.job_error_log_alert (Check blog http://www.sqlscientist.com/2013/07/error-logging-and-alerts.html)
Example:
To understand this scenario, I have taken Employee and Department example. Below is E-R Diagram for that. Below are the details of tables. (Configure Audit Process Employee & Department tables.sql)
-
Department-
There may be many departments in database.
-
-
Employee-
There are many employees for each departments
-
Now, our requirement is to audit column wise changes for Department and Employee table. For that, we need to insert these tables in sc.AuditConfig table. On insert on these tables into sc.AuditConfig table, it will automatically enable CDC on this table and create function sc.fn_cdc_get_column_changes_<capture_instance> using trigger Sc.TRG_AuditConfig.
Query:
insert into sc.AuditConfig values('dbo','Department','InsertedBy','UpdatedBy','ID,Name,InsertedBy,UpdatedBy',null,1,getdate(),null)
insert into sc.AuditConfig values('dbo','Employee','InsertedBy','UpdatedBy','ID,DepartmentID,Name,InsertedBy,UpdatedBy',null,1,getdate(),null)
Output:
Verify:
Now, we will talk about our daily changes on tables. We can have 3 types of changes in our tables. We will talk about each one by one.
1. Insert records.
Query:
-- Insert two records for department
insert into department(id,name,insertedby,insertedon)
values(1,'department 1', 1, getdate())
insert into department(id,name,insertedby,insertedon)
values(2,'department 2', 1, getdate())
-- Insert two records for employee
insert into employee(id,departmentid,name,insertedby,insertedon)
values (1,1,'Employee 1',1,getdate())
insert into employee(id,departmentid,name,insertedby,insertedon)
values (2,1,'Employee 1',1,getdate())
Verify Audit Query:
Note: CDC is asynchronously update CDC table. On top of that, we have job “job_Audit” to audit records to our tables. You can find some delay to get audit records.
Select
A.SchemaName,
A.TableName,
A.PrimaryKeyName,
A.PrimaryKeyValue,
AD.ColumnName,
AD.OldColumnValue,
AD.NewColumnValue,
A.UpdatedBy,
A.UpdatedOn
From
sc.Audit A
inner join sc.AuditDetail AD on AD.AuditID = A.AuditID
Order By
A.__$start_lsn asc
Verify Audit Output:
2. Update records
Query:
-- Update value for department name
update department set name = name + '1', updatedby = 2, updatedon = getdate()
-- Update value for employee name
update employee set name = name + '1', updatedby = 2, updatedon = getdate()
Verify Audit Query:
You can use same query as above
Verify Audit Output:
3. Delete records
Query:
-- Delete records from Employee
Delete from Employee
-- Delete records from Department
Delete from department
Verify Audit Query:
You can use same query as above
Verify Audit Output:
Now, we had requirement to just inactive audit column wise changes to audit tables. We can do that using below query.
Query:
Update sc.AuditConfig set IsActive = 0
Verify:
It will not disable CDC on table but it will skip this table to audit records in audit tables. Once we will active these tables again. It will audit all records from LastLsn till today into Audit table.
Now, we want to disable CDC permanently for these tables. For that, we need to delete those records from sc.AuditConfig table which will disable CDC using trigger Sc.TRG_AuditConfig.
Query:
Delete from sc.AuditConfig Output:
Below are the scripts to download.
Hi Asif, Thank you so much for posting your solution. Could I ask a couple of questions please:
ReplyDelete1. Do you have to add the columns InsertBy and UpdatedBy to the source tables?
2. Do you have to have primary key columns set up on the source tables? Is there any work around if there are no primary keys?
Hi Aarion,
DeleteBelow are answers for your questions.
1. As per current decision, we should have InsertedBy and UpdatedBy columns. If suppose, you have only one column ModifiedBy for your table then you can set the value ModifiedBy for both InsertedByColumnName and UpdatedByColumnName
2. Yes, In most of database design, we will be having primary key and for that only we mostly looks for column wise changes. Assuming that, we have designed this code to use primary key from table to see column wise changes.
Please feel free if you have more questions.
First of all again thank you for doing this solution. I am learning a lot from just implementing the code
ReplyDeleteCould I ask if there is a way in which we could we could create a IDENTITY(1,1) column and the code pick this column as the unique column without having the primary key?
I will be loading a very large table and loading with the primary key may slow down the process.
Thank you
Hi Aarion,
DeleteFirst of all you need to understand that we have created wrapper on CDC which is inbuilt feature of SQL Server. But for our implementation, we are supporting column level auditing for primary key only right now.
It's required to create unique index of primary key asp per below comments in TechNet for CDC.
[ @index_name = ] 'index_name'
The name of a unique index to use to uniquely identify rows in the source table. index_name is sysname and can be NULL. If specified, index_name must be a valid unique index on the source table. If index_name is specified, the identified index columns takes precedence over any defined primary key columns as the unique row identifier for the table.
http://technet.microsoft.com/en-us/library/bb522475.aspx
If you user unique key. It will create no clustered index and primary key is using clustered index.
Performance wise both will be same only while inserting data.
There are different ways you can do it if you want to load large data for you table and don't want to create index.
If you want to load very large data initially only then you can check difference manually and load into audit table. then enable CDC on it.
Please let me know if you looking for specific scenario.
Asif, also may I ask if the table has a composite primary key, would I have to amend the code in anyway?
ReplyDeleteSorry Aarion, In first version, we have supported only for single column as primary key.
DeleteYes, we can change code to use composite primary key.
Please let me know if you need of it so I will try to create 2nd version of it which will support composite primary key.
Hi,
ReplyDeleteIs this functionality available on Enterprise 2008R2 Standard..
Thanks
Yes, CDC is already supported in SQL Server 2008 R2.
DeleteHi,
ReplyDeleteApologies my question was not as clear as it should have been.
Is CDC available on 2008R2 Standard, I have read in some blogs that itis not available.
Thanks in advance.
Hi,
DeleteI think you may be referring wrong article. Please check below link from TechNet. CDC is supported from 2008 version.
http://technet.microsoft.com/en-us/library/bb500353(v=sql.105).aspx
Hi Asif , Can you please put more light on the given bellow function.
ReplyDeleteEXEC sc.sp_cdc_enable_table
EXEC sc.fn_cdc_get_column_changes_
I mean to say can you please share the Functions what you have written.
Hi Upen,
ReplyDeleteI have created new procedure "sc.sp_cdc_enable_table" on top of the existing procedure. This procedure is calling the original procedure to enable CDC and some logic to create dynamic function to get column wise changes.
sc.fn_cdc_get_column_changes_ is used to get the column wise changes which is supported by Microsoft. It is dynamically created for each table from above procedure.
Pretty! This has been an incredibly wonderful post.
ReplyDeleteThanks for providing this info.