XML querying in SQL2000
Do you want to query on unstructured xml data inside SQL2000 text fields? Keep on reading!
I've created a table 'Person' with a text field containing flat xml data (see below):
Id [identity] | FirstName [varchar(50)] | ContentXml [text] |
1 | Dion | <content><birthdate>06/21/1970</birthdate><gender>male</gender></content> |
2 | Rosie | <content><birthdate>11/30/1969</birthdate><gender>female</gender></content> |
3 | James | <content><birthdate>05/02/1973</birthdate><gender>male</gender></content> |
4 | Bill | <content><birthdate>06/12/1961</birthdate><gender>male</gender></content> |
Now I'd like to query the xml-data in the 'ContentXml' field, for example:
“Select all rows where 'birthdate' is at least '1/1/1970', and 'gender' equals 'male'. “
Using SQLServer2005, querying on data inside xml fields is as easy as:
SELECT *
FROM Person
WHERE ContentXml.value('(//birthdate)[1]', 'DateTime') >= CAST('1/1/1970' AS DateTime)
AND ContentXml.value('(//gender)[1]', 'VarChar') = 'male'
But what if you're using SQLServer2000 and want a solution right now? SQLServer2000 doesn't “know” xml.....
Well, take a look at the query below, running on SQLServer2000 and returning the same results:
SELECT *
FROM Person
WHERE dbo.fn_XmlElementValue(ContentXML, 'birthdate') >= CAST('1/1/1970' AS DateTime)
AND dbo.fn_XmlElementValue(ContentXML, 'gender') = 'male'
You want to see the fn_XmlElementValue function? Here it is:
CREATE FUNCTION dbo.fn_XmlElementValue(@xml text, @tagname varchar(100))
RETURNS varchar(2000)
AS
BEGIN
DECLARE @startpos int, @endpos int
IF (NOT @tagname IS NULL)
BEGIN
SET @startpos = CHARINDEX('<' + @tagname + '>', @xml) + LEN('<' + @tagname + '>')
IF (@startpos > 0)
BEGIN
SET @endpos = CHARINDEX('</' + @tagname + '>', @xml, @startpos)
IF (@endpos > @startpos)
BEGIN
-- Return the requested value
RETURN (SUBSTRING(@xml, @startpos, @endpos - @startpos))
END
END
END
-- Tag empty or not found
RETURN NULL
END
Let me explain what happens here: The field 'ContentXml' is searched for the string-value between the '<birthdate>' and '</birthdate>' tags, and the result is compared with date '01/01/1970'. Cool, eh?
Is this slow? No, certainly not! I just inserted 10.000 Persons in my table, containing xml data with random birthdates between 1/1/1960 and 1/1/1980. The select query is still very quick, and seems even quicker than the corresponding SQLServer2005 query...
Note: the given solution is only suitable for flat xml data, where the elements have unique tag-names and don't have attributes but only values.
But then again, it's a pretty powerful solution if you want to add fields to a table without having to change the table's design.