Thursday, 1 December 2011

T-SQL: Audit Data Changes

I was recently asked about how to implement auditing of data changes for a database. I know SQL Server contains the Change Data Capture feature but (as is often the case with things I want!), its an Enterprise only feature meaning its out of the reach for the application in question.

As it happened, it turned out that there was already an application in-house (that I hadn't developed) which implemented its own auditing of changes based around a simple Audit table and Triggers. I'm still yet to find out who wrote it, but it certainly does the trick. The database it runs on is relatively static so I do have reservations how scalable it is but for some tables you want control over, it may be suitable.

CREATE TABLE [dbo].[Audit](
  
[AuditID] [int] IDENTITY(1,1) NOT NULL,
  
[Type] [char](1) NULL,
  
[TableName] [varchar](128) NULL,
  
[PrimaryKeyField] [varchar](1000) NULL,
  
[PrimaryKeyValue] [varchar](1000) NULL,
  
[FieldName] [varchar](128) NULL,
  
[OldValue] [varchar](1000) NULL,
  
[NewValue] [varchar](1000) NULL,
  
[UpdateDate] [datetime] NULL,
  
[UserName] [varchar](128) NULL
)
ON [PRIMARY]

GO
 

ALTER TABLE [dbo].[Audit]  
ADD  CONSTRAINT [DF_Audit_UpdateDate] DEFAULT (GETDATE()) FOR [UpdateDate]
ALTER TABLE [dbo].[Audit] 
ADD  CONSTRAINT [DF_Audit_UserName] DEFAULT (SUSER_SNAME()) FOR [UserName]
GO

CREATE TRIGGER dbo.MyTableChangeTracking
ON dbo.MyTable 
FOR INSERT, UPDATE, DELETE
AS

DECLARE
@bit INT
DECLARE
@field INT
DECLARE
@maxfield INT
DECLARE
@char INT
DECLARE
@fieldname VARCHAR(128) 

DECLARE @TableName VARCHAR(128) = 'MyTable';
DECLARE @PKCols VARCHAR(1000)
DECLARE @sql VARCHAR(2000) 
DECLARE @UserName VARCHAR(128) = SUSER_SNAME() 
DECLARE @Type CHAR(1) 
DECLARE @PKFieldSelect VARCHAR(1000)
DECLARE @PKValueSelect VARCHAR(1000)

-- Action 
IF EXISTS (SELECT TOP 1 * FROM inserted)
BEGIN
   IF
EXISTS (SELECT TOP 1 * FROM deleted)
      
SET @Type = 'U' -- update
  
ELSE
       SET
@Type = 'I' -- insert 

END
ELSE
   SET
@Type = 'D' -- delete

-- get all the changed values (need both tables for update)
 

SELECT * INTO #ins FROM inserted 
SELECT * INTO #del FROM deleted 

-- Get primary key columns for full outer join 
SELECT @PKCols = COALESCE(@PKCols + ' AND', ' ON') + ' i.' + c.COLUMN_NAME + ' = d.' + c.COLUMN_NAME 
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS pk
  
INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE c
      
ON c.TABLE_NAME = pk.TABLE_NAME
          
AND c.CONSTRAINT_NAME = pk.CONSTRAINT_NAME 

WHERE pk.TABLE_NAME = @TableName
  
AND CONSTRAINT_TYPE = 'PRIMARY KEY' 


-- Get primary key fields select for insert 
SELECT @PKFieldSelect = COALESCE(@PKFieldSelect + '+', '') + '''' + COLUMN_NAME + '''' 
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS pk
  
INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE c
      
ON c.TABLE_NAME = pk.TABLE_NAME
          
AND c.CONSTRAINT_NAME = pk.CONSTRAINT_NAME 

WHERE pk.TABLE_NAME = @TableName
  
AND CONSTRAINT_TYPE = 'PRIMARY KEY' 


SELECT @PKValueSelect = COALESCE(@PKValueSelect + '+', '') + 'convert(varchar(100), coalesce(i.' + COLUMN_NAME + ',d.' + COLUMN_NAME + '))'
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS pk
  
INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE c
      
ON c.TABLE_NAME = pk.TABLE_NAME
          
AND c.CONSTRAINT_NAME = pk.CONSTRAINT_NAME 

WHERE pk.TABLE_NAME = @TableName
  
AND CONSTRAINT_TYPE = 'PRIMARY KEY' 


-- raise an error if there is no Primary Key 
IF @PKCols IS NULL 
BEGIN
   RAISERROR
('no PK on table %s', 16, -1, @TableName)
  
RETURN
END

SELECT
@field = 0,
  
@maxfield = MAX(ORDINAL_POSITION)  

FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @TableName 

-- now loop through the fields to log the values from each field
WHILE @field < @maxfield 
BEGIN
   SELECT
@field = MIN(ORDINAL_POSITION)
  
FROM INFORMATION_SCHEMA.COLUMNS
  
WHERE TABLE_NAME = @TableName AND ORDINAL_POSITION > @field
  
  
SELECT @bit = (@field - 1 )% 8 + 1
  
SELECT @bit = POWER(2,@bit - 1)
  
SELECT @char = ((@field - 1) / 8) + 1
  
  
IF SUBSTRING(COLUMNS_UPDATED(),@char, 1) &amp; @bit > 0 OR @Type IN ('I','D')
  
BEGIN
       SELECT
@fieldname = COLUMN_NAME
      
FROM INFORMATION_SCHEMA.COLUMNS
      
WHERE TABLE_NAME = @TableName
          
AND ORDINAL_POSITION = @field
      
      
-- dynamically build and execute the audit insert statement
      
SELECT @sql = 'INSERT INTO dbo.Audit (Type, TableName, PrimaryKeyField, PrimaryKeyValue, FieldName, OldValue, NewValue)'
      
SELECT @sql = @sql + ' SELECT ''' + @Type + ''''
      
SELECT @sql = @sql + ',''' + @TableName + ''''
      
SELECT @sql = @sql + ',' + @PKFieldSelect
      
SELECT @sql = @sql + ',' + @PKValueSelect
      
SELECT @sql = @sql + ',''' + @fieldname + ''''
      
SELECT @sql = @sql + ',CONVERT(VARCHAR(1000),d.' + @fieldname + ')'
      
SELECT @sql = @sql + ',CONVERT(VARCHAR(1000),i.'  + @fieldname + ')'
      
SELECT @sql = @sql + ' FROM #ins i FULL OUTER JOIN #del d'
      
SELECT @sql = @sql + @PKCols
      
SELECT @sql = @sql + ' WHERE i.' + @fieldname + ' <> d.' + @fieldname
      
SELECT @sql = @sql + ' OR (i.' + @fieldname + ' IS NULL AND d.' + @fieldname + ' IS NOT NULL)'
      
SELECT @sql = @sql + ' OR (i.' + @fieldname + ' IS NOT NULL AND d.' + @fieldname + ' IS NULL)'

      
EXEC (@sql)
  
END
END

GO


Its quite a complicated trigger and took some time to get my head round but it certainly does the trick and you just need to implement this for each table you wish to track changes on. The data is readable too:


But i wasn't comfortable with the overhead of the trigger, the dynamic sql nor the fact that I hadn't written it myself! So I considered a second approach, still using triggers but this time just dumping the changed records to an audit table in XML format. This would have the advantage of a much simpler trigger even at the cost of readability of results and potentially higher storage costs.


CREATE TABLE dbo.tblAudit(
  
AuditID INT NOT NULL IDENTITY(1,1),
  
TableName VARCHAR(255),
  
DataInserted XML,
  
DataDeleted XML,
  
DateCreated DATETIME,
  
CreatedBy VARCHAR(2000)
)
 

GO 

CREATE TRIGGER dbo.MyTable_ChangeTracking 
ON dbo.MyTable 
FOR INSERT, UPDATE, DELETE
AS

SET NOCOUNT ON
-- set the tablename

DECLARE @TableName VARCHAR(255) = 'dbo.MyTable'-- grab the changed data in XML format
DECLARE @InsertXML XML = (SELECT * FROM inserted FOR XML AUTO);
DECLARE @DeleteXML XML = (SELECT * FROM deleted FOR XML AUTO);-- log it 
INSERT INTO dbo.tblAudit (TableName, DataInserted, DataDeleted, DateCreated, CreatedBy) 
SELECT @TableName, @InsertXML, @DeleteXML, GETDATE(), SUSER_SNAME()
GO

To interogate the results, you'd need to be comfortable working with XML but the simplicity of the trigger is something i really like. Hopefully, you wouldn't need to be looking at this data too often so the ease of logging is the main factor when implementing something like this. You could easily deploy this across multiple tables/servers too with a simple Powershell script.

Now, I wouldn't recommend either of this in a highly transactional database and you would want to consider whether to implement this method if the performance of your DML statements was critical and your storage quite tight. I'm going to experiment a bit with the CDC to see just what benefits it gives over a bespoke implementation (I suspect its much more lightweight) while also hoping that the feature gets downgraded to the lower editions of SQL Server.

2 comments:

  1. This comment has been removed by the author.

    ReplyDelete
  2. Really very informative article, I found great sql server auditing tool ( https://www.netwrix.com/sql_server_auditing.html ) which provides an efficient data changes tracking and helps me to auditing production changes like creation, deletion, modification, etc and allows data filter options to reports which are based on owner name, application name, objects, databases, who, where, etc.

    ReplyDelete

/* add this crazy stuff in so i can use syntax highlighter