So if you'd ask me one of the most under-used and useful features of TSQL, I'd have to say "FOR XML" queries. I don't quite understand why enough people don't use 'em, because they truly are very useful, so I figured, hey why not write up a blog post on what the heck they are, so we could then decide why do you use, or not use them??
So here we go.
So FOR XML Queries are written as shown below -
FOR XML mode [, XMLDATA][, ELEMENTS][, BINARY BASE64]
There are 4 permissible FOR XML Modes -
a) RAW
b) AUTO
c) PATH (New in SQL2k5)
d) EXPLICIT
So FOR XML RAW is the most basic form. If you execute it, it will simply try and return you a bunch of <row> elements, there is no hierarchy per se. Here's an example runnin' on AdventureWorks (SQL2k5)
SELECT
Loginid, Title,
Humanresources.Department.Departmentid, Humanresources.Department.Name
FROM
Humanresources.Employee
INNER JOIN Humanresources.Department ON
Humanresources.Employee.Departmentid = Humanresources.Department.Departmentid
WHERE Humanresources.Department.Departmentid = 7
FOR XML RAW
The output this query will produce looks like as shown below -
<row LOGINID="adventure-works\guy1" TITLE="Production Technician - WC60" DEPARTMENTID="7" NAME="Production" />
<row LOGINID="adventure-works\jolynn0" TITLE="Production Supervisor - WC60"
DEPARTMENTID="7" NAME="Production" />
<row LOGINID="adventure-works\ruth0" TITLE="Production Technician - WC10"
DEPARTMENTID="7" NAME="Production" />
<row LOGINID="adventure-works\barry0" TITLE="Production Technician - WC10"
DEPARTMENTID="7" NAME="Production" />
...
The bad news? This XML is like .. umm .. tabular data munged into XML, which is like .. WTF .. it ain't useful. Except of course if you ran XSLT on it, and I've met very few developers who like doing XSLT. Most guys would rather make love to this lady with pink titties rather than to fix the above using XSLT (I agree, fixing the above can get very painful).
So the other option is FOR XML AUTO. FOR XML AUTO exploits the hierarchical nature of SQL Queries. So each table associated with FOR XML AUTO query is represented as an XML element. So for instance, check the query out below (again on AdventureWorks) -
SELECT
Humanresources.Department.Departmentid, Humanresources.Department.Name,
Humanresources.Employee.Loginid, Humanresources.Employee.Title
FROM
Humanresources.Department
INNER JOIN Humanresources.Employee On
Humanresources.Department.Departmentid = Humanresources.Employee.Departmentid
WHERE Humanresources.Department.Departmentid = 7
FOR XML AUTO
When you run the above query, SQL Server will take advantage of the fact that there is only one DepartmentID, and logically it will nest various employees that appear under this department id. The results are as shown below.
<HUMANRESOURCES.DEPARTMENT DEPARTMENTID="7" NAME="Production">
<HUMANRESOURCES.EMPLOYEE LOGINID="adventure-works\guy1" TITLE="Production Technician - WC60" />
<HUMANRESOURCES.EMPLOYEE LOGINID="adventure-works\jolynn0" TITLE="Production Supervisor - WC60" />
<HUMANRESOURCES.EMPLOYEE LOGINID="adventure-works\ruth0" TITLE="Production Technician - WC10" />
...
...
...
<HUMANRESOURCES.EMPLOYEE LOGINID="adventure-works\olinda0" TITLE="Production Technician - WC20" />
<HUMANRESOURCES.EMPLOYEE LOGINID="adventure-works\tom0" TITLE="Production Technician - WC10" />
</HUMANRESOURCES.DEPARTMENT>
The criterion SQL Server follows is that the columns mentioned first, are attempted to be grouped together first. For example, the above mentioned query takes advantage of the fact that there is only one DepartmentID, and hence it uses the FOR XML AUTO directive to specify that all results should be grouped under the particular DepartmentID. If instead you had written the query as below.
SELECT
Humanresources.Employee.Loginid, Humanresources.Employee.Title,
Humanresources.Department.Departmentid, Humanresources.Department.Name
FROM
Humanresources.Department
INNER JOIN Humanresources.Employee On
Humanresources.Department.Departmentid = Humanresources.Employee.Departmentid
WHERE Humanresources.Department.Departmentid = 7
FOR XML AUTO
The results you would have got would have been in the form below.
...
<HUMANRESOURCES.EMPLOYEE LOGINID="adventure-works\guy1" TITLE="Production Technician - WC60">
<HUMANRESOURCES.DEPARTMENT DEPARTMENTID="7" NAME="Production" />
</HUMANRESOURCES.EMPLOYEE>
...
Now ain't that cool ;-).
No that ain't cool, it's not ultra useful as you can clearly guess, because you ain't got much flexibility boy. So the ultra powerful version of FOR XML is FOR XML EXPLICIT. This form, gives you a WHOLE LOTTA CONTROL. I mean, you can pretty much munge any relational structure to any hierarchical structure - elements, attributes, entity encoding, CDATA sections - the whole burrito/enchilada take your pick.
So the following snippet from a FOR XML EXPLICIT query’s SELECT clause demonstrates how the DepartmentID and Name from AdventureWorks HumanResources.Department table can be specified as either an attribute or an element within the same XML document:
SELECT
1 AS TAG,
NULL AS PARENT,
DEPARTMENTID AS [DEPARTMENT!1!DEPARTMENTID],
NAME AS [DEPARTMENT!1!NAME!ELEMENT]
FROM
HUMANRESOURCES.DEPARTMENT
FOR XML EXPLICIT
Ignoring the Tag and Parent parts of this query for now, the alias following the first instance of DepartmentID contains no directive, so it’s treated as an attribute (the default). However, the element directive in the alias following the second Name column causes that column to be represented as an element. I will go into more detail later on, but a portion of the XML generated by the previous SQL is as follows:
<DEPARTMENT DEPARTMENTID="12">
<NAME>Document Control</NAME>
</DEPARTMENT>
<DEPARTMENT DEPARTMENTID="1">
<NAME>Engineering</NAME>
</DEPARTMENT>
<DEPARTMENT DEPARTMENTID="16">
<NAME>Executive</NAME>
</DEPARTMENT>
...
The problem is, FOR XML EXPLICIT Queries are UGLY. They are HUGE. And other than the creator, nobody understands their inner logic (I mean, try writing one with 3-4 tables, and you'll see what I'm sayin', the query would just extend over PAGES AND PAGES AND PAGES .. ugghh !!!).
So whadya do? Well in SQL2k you don't have much of a choice, so the answer is you're screwed. But in SQL2k5 - you got FOR XML PATH. (CLAP CLAP !!)
So, FOR XML PATH is a new addition to SQL Server 2005. In fact, most of what FOR XML EXPLICIT lets you do can be done using FOR XML PATH. There are very few scenarios such as CDATA sections where you may still need to use FOR XML EXPLICIT.
For example, if you wished to write the above query in FOR XML PATH, it could easily be written as below.
Select
DepartmentID "Department/@DepartmentID",
NAME "Department/Name"
FROM
HumanResources.Department
FOR XML PATH ('')
As you can see, this is a simple SELECT query with some XPATH looking syntax after each column. The first Department/@DepartmentID specifies that DepartmentID should appear as an attribute under the department node, and the second “Department/Name” specifies that Name should appear as an element under Department. The results of this query look exactly like the FOR XML EXPLICIT query and are shown as below.
<Department DepartmentID="12">
<Name>Document Control</Name>
</Department>
<Department DepartmentID="1">
<Name>Engineering</Name>
</Department>
<Department DepartmentID="16">
<Name>Executive</Name>
</Department>
Thus as you can see, the various modes of FOR XML allow you to easily render a relational and tabular structure into a hierarchical XML structure.
So, now I gotta ask ya !! What's stoppin' u from usin' the above? Huh?