XML querying in SQL2000

Do you want to query on unstructured xml data inside SQL2000 text fields? Keep on reading!

I've created a table 'Person' with a text field containing flat xml data (see below):

Id
[identity]
FirstName 
[varchar(50)]
ContentXml
[text]
1 Dion <content><birthdate>06/21/1970</birthdate><gender>male</gender></content>
2 Rosie <content><birthdate>11/30/1969</birthdate><gender>female</gender></content>
3 James <content><birthdate>05/02/1973</birthdate><gender>male</gender></content>
4 Bill <content><birthdate>06/12/1961</birthdate><gender>male</gender></content>


Now I'd like to query the xml-data in the 'ContentXml' field, for example:

“Select all rows where 'birthdate' is at least '1/1/1970', and 'gender' equals 'male'. “

Using SQLServer2005, querying on data inside xml fields is as easy as:

SELECT *
FROM   Person
WHERE  ContentXml.value('(//birthdate)[1]', 'DateTime') >= CAST('1/1/1970' AS DateTime)
       AND ContentXml.value('(//gender)[1]', 'VarChar') = 'male'

But what if you're using SQLServer2000 and want a solution right now? SQLServer2000 doesn't “know” xml.....
Well, take a look at the query below, running on SQLServer2000 and returning the same results:

SELECT *
FROM   Person
WHERE  dbo.fn_XmlElementValue(ContentXML, 'birthdate') >= CAST('1/1/1970' AS DateTime)
       AND dbo.fn_XmlElementValue(ContentXML, 'gender') = 'male'


You want to see the fn_XmlElementValue function? Here it is:

CREATE FUNCTION dbo.fn_XmlElementValue(@xml text, @tagname varchar(100)) 
RETURNS varchar(2000)
AS
BEGIN
 DECLARE @startpos int, @endpos int
 IF (NOT @tagname IS NULL)
 BEGIN
  SET @startpos = CHARINDEX('<' + @tagname + '>', @xml) + LEN('<' + @tagname + '>')
  IF (@startpos > 0)
  BEGIN
   SET @endpos = CHARINDEX('</' + @tagname + '>', @xml, @startpos)
   IF (@endpos > @startpos)
   BEGIN
    -- Return the requested value
    RETURN (SUBSTRING(@xml, @startpos, @endpos - @startpos))
   END
  END
 END
 --  Tag empty or not found
 RETURN NULL
END

Let me explain what happens here: The field 'ContentXml' is searched for the string-value between the '<birthdate>' and '</birthdate>' tags, and the result is compared with date '01/01/1970'. Cool, eh?

Is this slow? No, certainly not! I just inserted 10.000 Persons in my table, containing xml data with random birthdates between 1/1/1960 and 1/1/1980. The select query is still very quick, and seems even quicker than the corresponding SQLServer2005 query...

Note: the given solution is only suitable for flat xml data, where the elements have unique tag-names and don't have attributes but only values.
But then again, it's a pretty powerful solution if you want to add fields to a table without having to change the table's design.

3 Comments

  • This is a very good statement.. I will incorporate this as I am using serialized objects in the database :-)



    Although I think XPath enabled Queries will be the great feature in 2005





    In Dutch:

    Nu hoef ik niet perse op SQL2005 te wachten voor dit soort queries..

    top bedankt.



  • Well it's probably faster than 2005 since your &quot;xml parser&quot; is very naive. Put an attribute in one of those tags and watch what happens...

  • Of course the given solution is only for flat xml data, where the elements have unique names and don't have attributes but only values.

    But then again, it's a pretty powerful solution if you want to add fields to a table without changing its design.

Comments have been disabled for this content.