Demystify SQL Server FOR XML EXPLICIT

By Robbe D. Morris

Printer Friendly Version

Robbe Morris
Robbe & Melisa Morris
FOR XML EXPLICIT is an option in SQL Server 2000 that allows you to return your results in a structure specifically defined by you.  It is much more powerful than FOR XML AUTO and FOR XML RAW.  This includes parent/child relationships, where values are shown as node values or node attributes, encoded or unencoded, and whether a particular column needed in a GROUP BY or ORDER BY clause should be included in the XML results.  In my experience, this is a powerful feature for use with a wide variey of applictions such as e-commerce or survey based applications.  I've also found this mechanism great for manipulating the XmlDocument returned and eventually passing it back to a different stored procedure in it's entirety and processing an update to the database with it.  For information on working with XmlDocument's inside of stored procedures, click here.
As always, with powerful features comes somewhat complex methods for implementing them.  FOR XML EXPLICIT is no different.  But, once you've got the basics down it really isn't all that hard.  Let's take a quick look at the structure of a very basic query in Query Analyzer against the Northwind database.  The first step is to create the column to xml mapping.  This takes place in the first half of our UNION query along with the node value and attribute population for the first level of the document.  Notice the Tag and Parent tag.  These two tags are required because they define the top two node levels of the document.  The Tag tag becomes the first true element and the Parent tag serves as the root node of the document.


Also in this first half of the UNION query is the population and mapping of the top level node.  This is done via the use of special instruction tags.  The structure of these tags is: Node Name!Tag Index!Attribute Name!Directive.  Valid directives include: element (force attribute to be considered a child node that is encoded), xml (force attribute to be considered child node that is not encoded), xmltext (retrieves data from overflow column and appends it to the document), cdata (wraps CDATA xml tag around values of the node to support invalid xml characters), hide (omits node from xml output), and id, idref, idrefs in conjunction with XMLDATA.
In this example, I've given the top level node the node name of C, it's tag index is set to 1, mapped the CustomerID column to an attribute of the C tag, and mapped the ContactName column as a node value of the C tag (if you designate more than one column as the node value of a tag, the values get concatenated together).  Notice that I've also pre-mapped a new child node under C called O for the Orders table in the second half of our UNION query along with a mapping for a node value for O.
In the second half of the UNION query, we'll populate values from the Orders table.  Keep in mind, the column to xml node mapping is based on the order of the columns in the SELECT statement.  Let's run the query in Query Analyzer to see the results.  When retrieving the value in ADO or ADO.NET, just iterate through the recordset/data table and concatenate your results to a local string variable.  In most cases, all of the results will come back in one record but if the xml is large enough, it can span more rows.
Sample 1
  

 declare @CustomerID nchar(5)
 select @CustomerID = 'ALFKI'
 
 SELECT 1                   as Tag, 
         NULL               as Parent,
         CustomerID         as [C!1!customerid],
         ContactName        as [C!1],
         NULL               as [O!2!orderid],
         NULL               as [O!2]
FROM Customers as C
 where C.CustomerID = @CustomerID
UNION 
SELECT 
       2 as tag, 
       1 as parent, 
       C.CustomerID,
       C.ContactName,
       O.OrderID,
       O.ShipAddress
    FROM Customers C, Orders O 
    WHERE (C.CustomerID = @CustomerID)
      AND (C.CustomerID = O.CustomerID)
      FOR XML EXPLICIT
Abbreviated Results
  
   <C customerid="ALFKI">
       Maria Anders
      <O orderid="10643">
      	Obere Str. 57
      </O>
      <O orderid="10692">
        Obere Str. 57
     </O>
   </C>
Now it is time to take a look at a more complex set of results.  In this example, we want to not only get results for our Customer (C tag) and Orders (O) tags but we also want a set of child nodes for O from the Order Details table.  To do this, we start by mapping xml node and attribute tags for a tag index of 3.  In this case, orderdetailid and productid and we place them in a new node called OD for Order Details.  Then, we create a similar query with an additional JOIN on the Order Details table in a third UNION query.  Notice that the Parent and Tag required tags are now set at level 2 and level 3 which sets the C tag as the top level of this query and the O tag as the immediate parent of the newly created OD tag.  And, to properly put the correct child OD nodes under the correct O node, we'll order the columns using an ORDER BY statement.  Remember, UNION queries require that the same set of columns be used in each query.  Notice the null values passed in UNION query 2.
 
Sample 2
  
 declare @CustomerID nchar(5)
 select @CustomerID = 'ALFKI'
 
 SELECT 1                    as Tag, 
         NULL                as Parent,
         C.CustomerID        as [C!1!customerid],
         C.ContactName       as [C!1],
         C.PostalCode        as [C!1!postalcode],
         NULL                as [O!2!orderid],
         NULL                as [O!2],
         NULL                as [O!2!employeeid],
         NULL                as [OD!3!orderdetailid],
         NULL                as [OD!3!productid]
FROM Customers as C
 where C.CustomerID = @CustomerID
UNION 
SELECT 
       2 as tag, 
       1 as parent, 
       C.CustomerID as customerid,
       C.ContactName,
       C.PostalCode,
       O.OrderID,
       O.ShipAddress,
       O.EmployeeID,
       null,
       null
    FROM Customers C, Orders O
    WHERE C.CustomerID = @CustomerID
    AND C.CustomerID = O.CustomerID
UNION  
SELECT 
       3 as tag, 
       2 as parent, 
       C.CustomerID,
       C.ContactName,
       C.PostalCode,
       O.OrderID,
       O.ShipAddress,
       O.EmployeeID,
       OD.OrderID as OrderDetailID,
       OD.ProductID as ProductID
    FROM Customers C, Orders O, [Order Details] OD
    WHERE (C.CustomerID = @CustomerID)
      AND (C.CustomerID = O.CustomerID)
      AND (O.OrderID = OD.OrderID)
     order by 3,4,5,6,7,8,9,10
       FOR XML EXPLICIT
Abbreviated Results
  
 <C customerid="ALFKI" postalcode="12209">
   	Maria Anders
  	<O orderid="10643" employeeid="6">
          Obere Str. 57
  		<OD orderdetailid="10643" productid="28"/>
   	<OD orderdetailid="10643" productid="39"/>
    <OD orderdetailid="10643" productid="46"/>
  </O>
  <O orderid="10692" employeeid="4">
    	  Obere Str. 57
   	<OD orderdetailid="10692" productid="63"/>
  </O>
 </C>
You can also run a variation on this theme with only two queries instead of one.  The following shows how to create child elements of the O tag by using the Element directive after the Attribute directive for the orderdetailid and productid columns.  The Element directive forces the values to be shown as child nodes instead of attributes within the same node.
 
Sample 3
  
    
 declare @CustomerID nchar(5)
 select @CustomerID = 'ALFKI'

 SELECT 1                    as Tag, 
         NULL                as Parent,
         C.CustomerID        as [C!1!customerid],
         C.ContactName       as [C!1],
         C.PostalCode        as [C!1!postalcode],
         NULL                as [O!2!orderid],
         NULL                as [O!2],
         NULL                as [O!2!employeeid],
         NULL                as [O!2!orderdetailid!element],
         NULL                as [O!2!productid!element]
FROM Customers as C
 where C.CustomerID = @CustomerID
UNION 
SELECT 
       2 as tag, 
       1 as parent, 
       C.CustomerID,
       C.ContactName,
       C.PostalCode,
       O.OrderID,
       O.ShipAddress,
       O.EmployeeID,
       OD.OrderID as OrderDetailID,
       OD.ProductID as ProductID
    FROM Customers C, Orders O, [Order Details] OD
    WHERE (C.CustomerID = @CustomerID)
      AND (C.CustomerID = O.CustomerID)
      AND (O.OrderID = OD.OrderID)
     order by 3,4,5,6,7,8,9,10
   
       FOR XML EXPLICIT
GO 
Abbreviated Results
  
  <C customerid="ALFKI" postalcode="12209">
      	Maria Anders
   	<O orderid="10643" employeeid="6">
         Obere Str. 57
      	<orderdetailid>10643</orderdetailid>
       <productid>28</productid>
   	</O>
   <O orderid="10643" employeeid="6">
     	 Obere Str. 57
    	<orderdetailid>10643</orderdetailid>
     <productid>39</productid>
   </O>
</C>
Here's sample for utilizing a column in an ORDER BY clause but excluding it from the XML output.  In this example, we'll hide the OrderDate column.  We'll also use the CDATA directive around the ContactName column.
 
Sample 4
  
 declare @CustomerID nchar(5)
 select @CustomerID = 'ALFKI'

 SELECT  1          as Tag, 
         NULL       as Parent,
         CustomerID     as [C!1!customerid],
         ContactName       as [C!1!!cdata],
         NULL       as [O!2!orderid],
         NULL       as [O!2!orderdate!hide],
         NULL       as [O!2]
FROM Customers as C
 where C.CustomerID = @CustomerID
UNION 
SELECT 
       2 as tag, 
       1 as parent, 
       C.CustomerID,
       C.ContactName,
       O.OrderID,
       null,
       O.ShipAddress
    FROM Customers C, Orders O 
    WHERE (C.CustomerID = @CustomerID)
      AND (C.CustomerID = O.CustomerID)
    order by [O!2!orderdate!hide] DESC
       FOR XML EXPLICIT
GO 
Abbreviate Results
  
    <C customerid="ALFKI">
      <![CDATA[Maria Anders]]>
     	<O orderid="10643">
         Obere Str. 57
      </O>
      <O orderid="10692">
        	Obere Str. 57
     </O>
     <O orderid="10702">
         Obere Str. 57
     </O>
     <O orderid="10835">
        	Obere Str. 57
     </O>
    	<O orderid="10952">
        Obere Str. 57
    </O>
    	<O orderid="11011">
        Obere Str. 57
     </O>
  </C>
This should go a long way to getting you started with using FOR XML EXPLICIT.  Something to keep in mind, depending on your query needs and number of joins and unions, it may be more efficient to query the records in normal fashion and insert them into a table variable inside the stored procedure.  Then, run the various joins and unions against it with your FOR XML EXPLICIT clause.  This is particularly useful if you are joining large tables with relatively few rows actually returned.  This isn't true in all cases of course.  Thus, you should time both options and go with the best one.  If you need help with table variables, click here.
If you have any questions, feel free to post them to our Article Discussions forum.  You can also pick up a highly recommended book on SQL Server and it's XML support called The Guru's Guide to SQL Server Stored Procedures, XML, and HTML by Ken Henderson.  It is a fairly detailed look at the entire subject of XML support in SQL Server and is written quite well.

Robbe has been a Microsoft MVP in C# since 2004.  He is also the co-founder of NullSkull.com which provides .NET articles, book reviews, software reviews, and software download and purchase advice.