Thursday, 9 February 2012

T-SQL: INSERT with XML shredding is very slow

Back in 2010, I came across a performance bug with SQL2008 SP1. When shredding XML using the nodes query, the performance is fine with a SELECT but when you want to INSERT the data to a table/temptable/tablevariable the performance becomes very poor.

The query I was having problems with was something like this (Thanks for the sample data wBob!):

-- Spin up some test data 
WITH cte AS 
SELECT 65 x 
SELECT x + 1 
FROM cte 
WHERE x < 90 ) 

SELECT @xml = (
SELECT CHAR( a.x ) +
' & co.' AS "name"
AS "value"
'FIELD' AS "paramdata/field"

    FROM cte a 
      CROSS JOIN cte b 
      CROSS JOIN cte c

DECLARE @ConfigData TABLE (ID INT, Name NVARCHAR(255), Value NVARCHAR(255), ParamData XML)

INSERT INTO @ConfigData (ID, Name , Value, ParamData) 
tbl.cols.value('name[1]', 'varchar(1000)'),
tbl.cols.value('value[1]', 'varchar(1000)'),

FROM @XML.nodes('//root') AS tbl(cols) 

I raised this with MS via Connect (#562092) and went on to implement my own workaround using the following CLR shredding function:

public partial class UserDefinedFunctions
    [SqlFunction(FillRowMethodName = "tvf_clr_FillParameterData",
        TableDefinition = "Name nvarchar(255), Value nvarchar(255), ParamData nvarchar(255)")]
public static IEnumerable tvf_clr_ParameterDataShredder(SqlXml parameterData)
        XmlDocument document
= new XmlDocument();
return document.SelectNodes("//parameter");

public static void tvf_clr_FillParameterData(object row, out string outName, out string outValue, out string outParamData)
        XmlNode document
= (XmlNode)row;

outName = document.SelectSingleNode("name").InnerXml;
outValue = document.SelectSingleNode("value").InnerXml;

outParamData = null;
if (document.SelectSingleNode("paramdata") != null)
outParamData = document.SelectSingleNode("paramdata").OuterXml;

This served me well and solved the performance issue and all was fine. Until now.

The thing is, this method doesn't handle decoding of escape characters in XML, so any &s that exist in the XML string being shredded will be returned as &:amp. Not ideal. My first thought was to just implement a decoding function within the CLR routine but the main candidates use System.Web which is not available within SQL Server. I eventually stumbled upon this post from Jonathan Keyahias which provided a sql safe Encoding implementation and I was going to use this as a basis for writing my own decoding function. While waiting the 20mins for Visual Studio 2008 to open, it occured to me to revisit the Connect case and fortunately, there were a number of workarounds there, and a combination of the solutions the most effective.

INSERT INTO @ConfigData (ID, Name , Value, ParamData)
SELECT     1,
tbl.cols.value('(name/text())[1]', 'nvarchar(255)'),
tbl.cols.value('(value/text())[1]', 'nvarchar(255)'),

FROM @xml.nodes('//root') AS tbl(cols) 

The above simple modification to my SQL INSERT statement and performance was acceptable and more importantly, the decoding of the escaped characters was fine. Saved me a lot of work.

1 comment:

  1. This is where is all starts Powered by shredding, liveSite https www camelback net


/* add this crazy stuff in so i can use syntax highlighter