Intro to XML, Part 2

The eXtensible Markup Language--A C++ Developer's Primer

By: Kenn Scribner for Visual C++ Developer

Background: I started looking into XML and XSL in 1998 when the initial betas of IE 5 were coming out (in fact, Microsoft released to us a beta of MSXML even before our IE 5 beta). We were developing a client-server application with a thick client that would send processed data to the server for analysis. (In fact, it was our own version of SOAP, though we didn't call it that at the time.) When Kate (Kate Gregory, the editor of VCD) contacted me and asked for an article on XML, I invited myself in for a series of 4! At that time I was completing the SOAP book and wanted to lead the VCD readers through an introduction to XML right to SOAP and Biztalk. That was the goal, anyway...give these articles a read and see if I hit the mark.

Part I, XML: A C++ Developer's Primer
Part II, The DOM and XSL
(Part III, SOAP)
(Part IV, Biztalk)

Welcome to the second in a three-part series discussing XML for C++ developers—with any luck I didn't overwhelm you in the first part of this series! (See "XML: A C++ Developer's Primer" in the February 2000 issue.) Before I dive into the XML DOM, let's look at the concept of a document object model in general. Document object models for various types of documents come in many flavors and sizes. Some are linear, allowing you to traverse the document from beginning to end. Others are trees, allowing you to view the document as a collection of abstract nodes arranged in a tree structure. But I feel that the most interesting is a true object model, where sections of the document form entities in the OO sense, with attributes and methods. The XML DOM incorporates both the tree model and the object model, in that your XML document can be thought of as a tree, with each node being an object you can call to manipulate the data encapsulated within the node.

Take, for example, this XML document:

<?xml version="1.0"?>
 <myxmldoc>
    <title>Basic XML Markup</title>
    <greeting>Hello, World!</greeting>
 </myxmldoc>

You saw this XML document in my earlier article. Here you can count four nodes: the processing instruction, the myxmldoc element, the title element, and the greeting element. If I arrange these nodes in tree form, you'd see something like Figure 1.

The processing instruction (<?xml version="1.0"?>) and the XML root element (<myxmldoc />) are sibling nodes, both children of the imaginary root node. The XML root element has two child nodes, <title /> and <greeting />.
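
To make that tree concrete, here's a minimal sketch in standard C++ that doesn't use MSXML at all. The Node struct, the Kind enumeration, and the buildSampleTree() and countNodes() helpers are names I've invented for illustration. A real DOM keeps an element's text in separate Text child nodes; this sketch folds it into a text member to stay short.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node kinds for this sketch (not the MSXML enumeration).
enum class Kind { Document, ProcessingInstruction, Element };

struct Node {
    Kind kind;
    std::string name;
    std::string text;                              // character data, if any
    std::vector<std::unique_ptr<Node>> children;

    Node(Kind k, std::string n, std::string t = "")
        : kind(k), name(std::move(n)), text(std::move(t)) {}

    // Append a child and return a raw pointer so the caller can keep building.
    Node* add(Kind k, const std::string& n, const std::string& t = "") {
        children.push_back(std::unique_ptr<Node>(new Node(k, n, t)));
        return children.back().get();
    }
};

// Build the tree for the sample document: the processing instruction and
// the root element are siblings under an imaginary document node.
std::unique_ptr<Node> buildSampleTree() {
    std::unique_ptr<Node> doc(new Node(Kind::Document, "#document"));
    doc->add(Kind::ProcessingInstruction, "xml", "version=\"1.0\"");
    Node* root = doc->add(Kind::Element, "myxmldoc");
    root->add(Kind::Element, "title", "Basic XML Markup");
    root->add(Kind::Element, "greeting", "Hello, World!");
    return doc;
}

// Count the given node plus everything beneath it.
std::size_t countNodes(const Node& n) {
    std::size_t total = 1;
    for (const auto& child : n.children) total += countNodes(*child);
    return total;
}
```

Counting from the imaginary document node gives five: the document node itself plus the four nodes from the sample.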

However, if I ignore trees and look at each node as an object, I see four node objects. To be sure, some are specialized, like the processing instruction. Hmm. . . sounds like polymorphism. In fact, it is. The processing instruction node is both a node object and a processing instruction object. Each object type has very different characteristics, but each refers to the same object. The XML document as a whole is also an object. You'd use this object to create new XML nodes, for example. So without much effort, you already know of three of the XML DOM objects! Table 1 provides you with a complete table of the XML DOM objects.

Table 1. XML DOM objects and their uses.

DOM object	Purpose
DOMImplementation	A query object to determine the level of DOM support
DocumentFragment	Represents a portion of the tree (good for cut/paste operations)
Document	Represents the top node in the tree
NodeList	Iterator object to access XML nodes
Node	Extends the core XML tagged element
NamedNodeMap	Namespace support and iteration through the collection of attribute nodes
CharacterData	Text manipulation object
Attr	Represents the element's attribute(s)
Element	Nodes that represent XML elements (good for accessing attributes)
Text	Represents the textual content of a given element or attribute object
CDATASection	Used to mask sections of XML from parsing and validation
Notation	Contains a notation based within the DTD or schema
Entity	Represents a parsed or unparsed entity
EntityReference	Represents an entity reference node
ProcessingInstruction	Represents a processing instruction

From Table 1, you can probably see how some objects map to others. For example, an Element object is also a Node object that contains a Text object, and so on. If you're familiar with Dynamic HTML, the concepts involved with the DOM should be relatively easy to comprehend. As a side note, the object names I've used are those you'll find defined in the XML specification. When you use the Microsoft MSXML processor to access your XML document via the DOM, the object names will all have "XMLDOM" in front of their true object names. That is, "Node" would become "XMLDOMNode" and so forth. This was most likely done to distinguish the object names from names Microsoft had already used. "Node," for example, refers to a Microsoft J++ WFC tree node. When you're working with objects through their COM interfaces, you then also prepend the traditional "I," as in "IXMLDOMNode."

All of these objects have both attributes and methods, and naturally they differ among the objects. All DOM object methods, however, return values consisting of the data types shown in Table 2.

Table 2. XML DOM data types.

Data type	Description of Data Type
unsigned short	Unsigned short value (such as a node's nodeType value)
unsigned long	Unsigned long value (such as the length of a NodeList or NamedNodeMap)
Node	DOM Node object
NodeList	DOM NodeList object
NamedNodeMap	DOM NamedNodeMap object
DOMString	Unicode string
Boolean	True or False
Void	No value
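
As an example of the unsigned short data type, here's a small sketch of the node type codes from the W3C DOM Level 1 Node interface. The numeric values are the ones the specification assigns. The NodeTypeCode enumeration and the describeNodeType() helper are hypothetical names of my own, and the sketch is plain C++ rather than MSXML code (MSXML exposes its own DOMNodeType enumeration, which you'll see later).

```cpp
#include <string>

// Node type codes as defined by the W3C DOM Level 1 Node interface.
enum NodeTypeCode : unsigned short {
    ELEMENT_NODE                = 1,
    ATTRIBUTE_NODE              = 2,
    TEXT_NODE                   = 3,
    CDATA_SECTION_NODE          = 4,
    ENTITY_REFERENCE_NODE       = 5,
    ENTITY_NODE                 = 6,
    PROCESSING_INSTRUCTION_NODE = 7,
    COMMENT_NODE                = 8,
    DOCUMENT_NODE               = 9,
    DOCUMENT_TYPE_NODE          = 10,
    DOCUMENT_FRAGMENT_NODE      = 11,
    NOTATION_NODE               = 12
};

// Hypothetical helper: map a nodeType value to its DOM object name.
std::string describeNodeType(unsigned short code) {
    switch (code) {
        case ELEMENT_NODE:                return "Element";
        case ATTRIBUTE_NODE:              return "Attr";
        case TEXT_NODE:                   return "Text";
        case CDATA_SECTION_NODE:          return "CDATASection";
        case ENTITY_REFERENCE_NODE:       return "EntityReference";
        case ENTITY_NODE:                 return "Entity";
        case PROCESSING_INSTRUCTION_NODE: return "ProcessingInstruction";
        case COMMENT_NODE:                return "Comment";
        case DOCUMENT_NODE:               return "Document";
        case DOCUMENT_TYPE_NODE:          return "DocumentType";
        case DOCUMENT_FRAGMENT_NODE:      return "DocumentFragment";
        case NOTATION_NODE:               return "Notation";
        default:                          return "Unknown";
    }
}
```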

Of course, this is all very interesting, but what's really interesting is seeing this stuff in action. To do that, let's write some code to fill a tree control with information obtained by processing an XML file.

XMLDOM demonstration application

To demonstrate XML processing and C++, I'll use the Microsoft XML processor, MSXML. There are other XML processors on the market, but this processor ships for free with Internet Explorer 5.0. If you and your end users are using IE5, then you can ship your XML-based application with no additional XML-related installation requirements. You can, if you wish, download the MSXML redistributable from the Microsoft Internet site at http://msdn.microsoft.com/xml/default.asp . There's also a version of the XML processor available for IE4.

Assuming you have access to the MSXML parser, you'll also need to make sure you have the latest Platform SDK include files. The original include files that shipped with VC6 (even as modified by SP3) don't address the DOM as implemented in the MSXML parser. If you don't have the latest include files, you can obtain them from Microsoft's Web site at http://msdn.microsoft.com/downloads/samples/internet/setup/entry.htm . Be sure that after you download the include files, you also add them to Visual Studio's include file and library directory list as shown in Figure 2. You can access this dialog box through the Tools|Options menu. Be sure to use the directories you specified when you installed the files, if they differ from what you see in Figure 2.

Now that you have the proper include files and libraries, you can begin coding against the MSXML parser. I've created a demonstration application, DOMDemo, that you can use to load and parse an XML file. It's an MFC dialog-based application that displays the contents of an XML file using the tree control, as shown in Figure 3. The tree data you see came from my previous article's schema demonstration file.

You select an XML file using the browse button, which ultimately is handled by the CDOMDemoDlg::OnOpen() method you'll find in DOMDemoDlg.cpp (in the accompanying Download file). After querying the user for a file, the method creates an instance of the XML parser and begins to fill the tree control.

try {
    // Create an instance of the XML processor
    CComPtr<IXMLDOMDocument> spXMLDoc;
    HRESULT hr = 
     spXMLDoc.CoCreateInstance(__uuidof(DOMDocument));
    if ( FAILED(hr) ) throw hr;
 
    // Load the requested XML file
    VARIANT_BOOL bSuccess;
    hr = 
     spXMLDoc->load(CComVariant(dlg.m_ofn.lpstrFile),
                                &bSuccess);
    if ( FAILED(hr) || !bSuccess ) throw hr;
 
    // Fill the tree control
    FillXMLTree(spXMLDoc);
 
    // Fill the filename variable
    m_strFilename = dlg.m_ofn.lpstrFile;
    UpdateData(FALSE);
 } // try

As you can see, I'm using ATL to support the COM work I'm doing in this demonstration. The DOMDocument object is specified in the MSXML.H file, which I included in DOMDemoDlg.h. The first three lines of code in the try block create an instance of the MSXML parser. If the object was created and the file loaded successfully, I pass the XML document's interface pointer to a helper function that fills the tree with XML data, FillXMLTree(spXMLDoc).

The CDOMDemoDlg::FillXMLTree() method manages both the tree control and the XML document.

void CDOMDemoDlg::FillXMLTree(
                        IXMLDOMDocument* pXMLDoc)
 {
    // Display hourglass...
    CWaitCursor wc;
 
    // Clear any previous items
    m_CXMLTree.DeleteAllItems();
 
    // Pull XML information
    CComPtr<IXMLDOMElement> spDocElement;
    CComQIPtr<IXMLDOMNode> spRootNode;
 
    HRESULT hr = 
       pXMLDoc->get_documentElement(&spDocElement);
    if ( SUCCEEDED(hr) ) {
        // With the document element in hand, QI() 
        // for the root
        // node object
        spRootNode = spDocElement;
 
        // Create the tree
        CreateXMLTree(spRootNode);
 
        // Expand tree from root
        HTREEITEM htiRoot = m_CXMLTree.GetRootItem();
        ExpandBranch(htiRoot,TVE_EXPAND);
        m_CXMLTree.SelectItem(htiRoot);
        m_CXMLTree.EnsureVisible(htiRoot);
    } // if
    else {
        // Some error...
        AfxMessageBox(IDS_E_NOBUILDTREE,
                      MB_OK|MB_ICONERROR);
    } // else
 }

After changing the cursor to an hourglass and clearing any existing tree data, I retrieve the document element (IXMLDOMElement), which is an XML node, and then query the document element for its IXMLDOMNode interface. I do this because I'm going to use the root node to extract information recursively from the root's children in the CDOMDemoDlg::CreateXMLTree() helper function, which you'll see shortly. To enumerate the child nodes, I need the node interface, not the element interface. If the tree can be created, I expand the tree from the root node for display purposes.

The meat of the recursive tree creation implementation is in CreateXMLTree(). It begins by checking for a NULL node and turning off the tree control's painting behavior (so you don't see flashing as nodes are inserted):

void CDOMDemoDlg::CreateXMLTree(IXMLDOMNode* pNode, 
                                 HTREEITEM hParent 
                                 /*=TVI_ROOT*/)
 {
    // Quick check
    ASSERT(pNode != NULL);
    if ( pNode == NULL ) return;
 
    // Disallow redraws...
    m_CXMLTree.SetRedraw(FALSE);

Once that's complete, assuming a non-NULL node, I make a local copy of the node interface:

   try {
        // Create local copy of the node
        CComPtr<IXMLDOMNode> spCurrent(pNode);

The tree control will display four key pieces of information per node: the node type, the node name, the node's value, and the node's attribute list. I first determine the node's type by calling IXMLDOMNode::get_nodeType():

       // Retrieve node type
        DOMNodeType eNodeType; // node type enumeration
        HRESULT hr = 
           spCurrent->get_nodeType(&eNodeType);
        if ( FAILED(hr) ) throw(hr);

If I was able to retrieve the node type, I try to obtain the node's name:

       // Retrieve node name
        CComBSTR bstrNodeName;
        CString strName;
        hr = spCurrent->get_nodeName(&bstrNodeName);
        if ( FAILED(hr) ) throw(hr);
        strName = bstrNodeName;

I now have enough information to insert this particular node's information into the tree control. To make things a bit easier, I deal explicitly with text nodes, CDATA nodes, and comment nodes. All other nodes are considered elements and simply added to the tree, along with any attributes. Because several of these nodes will use the same tree display format, I look for a node value and precompute a node display string from that:

       // Look for a value...
        CComVariant vNodeValue;
        hr = spCurrent->get_nodeValue(&vNodeValue);
        if ( FAILED(hr) ) throw(hr);
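        CString strValue;
        // Nodes without a value (elements, for example)
        // don't return a BSTR, so only read bstrVal
        // when the variant actually holds one
        if ( vNodeValue.vt == VT_BSTR )
            strValue = vNodeValue.bstrVal;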
 
        // Precompute a tree node string
        CString strNode;
        strNode.Format("%s {%s}",strName,strValue);

If it turns out the node is the XML root document element or another XML element, I'll throw away this display string and calculate another. For the most part, though, this value is used. Next, I determine which type of node I'm dealing with by placing each potential node type in a switch statement.

       // Stuff node types we want to handle into the 
        // tree. If the node is a standard element 
        // node, we'll check for more information. 
        // If not, we'll insert the node and exit.
        HTREEITEM hCurrent = NULL;
        switch ( eNodeType ) {
 
            case NODE_TEXT:
                // Add text node to tree
                m_CXMLTree.InsertItem(strNode,
                                      XML_TEXT,
                                      XML_TEXT,
                                      hParent);
                return;
             case NODE_CDATA_SECTION:
                // Add CDATA node to tree
                m_CXMLTree.InsertItem(strNode,
                                      XML_CDATA,
                                      XML_CDATA,
                                      hParent);
                return;
             case NODE_COMMENT:
                // Add comment node to tree
                m_CXMLTree.InsertItem(strNode,
                                      XML_COMMENT,
                                      XML_COMMENT,
                                      hParent);
                return;
 
            default:
                // Standard element nodes, so check to 
                // see if the node is the root node and
                // mark it as such.
                if ( hParent == TVI_ROOT ) {
                    // Root
                    hCurrent = 
                       m_CXMLTree.InsertItem(strName,
                                             XML_ROOT,
                                             XML_ROOT,
                                             hParent);
                } // if
                else {
                    // Interior node
                    hCurrent = 
                       m_CXMLTree.InsertItem(strName,
                                          XML_ELEMENT,
                                          XML_ELEMENT,
                                          hParent);
                } // else
                break;
        } // switch

This is pretty standard tree control code. All I'm really doing is keeping track of the parent node and inserting a new tree node with an associated icon (this makes for a nicer user interface).

If the XML node was text, CDATA, or a comment, I'm done. However, the other XML nodes will fall through the switch statement's default case and be further processed. In this case, I look for any attributes the element might have and enumerate those. I begin by obtaining the list of attributes for the node:

       // If we got here, we're dealing with a 
        // standard node (root or interior), 
        // so dive in and retrieve the list of 
        // attributes.
        CComPtr<IXMLDOMNamedNodeMap> spAttributes;
        hr = spCurrent->get_attributes(&spAttributes);
        if ( FAILED(hr) ) throw(hr);

If an attribute list indeed exists, I enumerate the list (using IXMLDOMNamedNodeMap::nextNode()), pull the attribute name and value, and stuff what I found into the tree control.

       // Get current node's attributes...
        CComPtr<IXMLDOMNode> spChild;
        if ( spAttributes.p != NULL ) {
            hr = spAttributes->nextNode(&spChild);
 
            while ( SUCCEEDED(hr) && spChild.p ) {
                // Look for attributes...
                CComBSTR bstrAttName;
                hr = 
                   spChild->get_nodeName(&bstrAttName);
                if ( FAILED(hr) ) throw(hr);
 
                // Reset node name to attribute
                strName = bstrAttName;
 
                // Look for a value...
                CComVariant vAttValue;
                hr = 
                   spChild->get_nodeValue(&vAttValue);
                if ( FAILED(hr) ) throw(hr);
                strValue = vAttValue.bstrVal;
 
                // Add attribute node to tree
                CString strNode;
                strNode.Format("%s {%s}",
                               strName,strValue);
                m_CXMLTree.InsertItem(strNode,
                                      XML_ATTRIBUTE,
                                      XML_ATTRIBUTE,
                                      hCurrent);
 
                // Next attribute...release the current
                // one first so its reference isn't 
                // leaked. The while test will catch a
                // failed HRESULT...
                spChild.Release();
                hr = 
                   spAttributes->nextNode(&spChild);
            } // while
        } // if

The while loop runs until there are no more attributes. Finally, I recurse by locating the first child of the current XML node. If a child node exists, I call CreateXMLTree() using the current node as the tree root. If not, I move back up the call stack until all of the XML document nodes have been processed:

       // Now recurse through all child nodes
        hr = spCurrent->get_firstChild(&spChild);
        while ( SUCCEEDED(hr) && spChild.p ) {
            // Recurse
            CreateXMLTree(spChild.p, hCurrent);
 
            // Get next node
            CComPtr<IXMLDOMNode> spNext;
            hr = spChild->get_nextSibling(&spNext);
            spChild.Attach(spNext.Detach());
        } // while

All that's left is a little cleanup:

   } // try
    catch(HRESULT hrError) {
        // Some error...
        AfxMessageBox(IDS_E_NOBUILDTREE,
                      MB_OK | MB_ICONERROR);
    } // catch
    catch(...) {
       // Some error...
       AfxMessageBox(IDS_E_NOBUILDTREE,
                     MB_OK | MB_ICONERROR);
    } // catch
 
    // Allow redraws...
    m_CXMLTree.SetRedraw(TRUE);
 }
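
If you'd like to see the shape of the recursion without the MFC and COM plumbing, here's a rough sketch in standard C++. It isn't DOMDemo code: XmlNode, Attribute, NodeKind, and walk() are hypothetical stand-ins, the "@" prefix takes the place of the attribute icon, and each tree item becomes an indented line of text instead of a tree control entry.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for the node types CreateXMLTree() distinguishes.
enum class NodeKind { Element, Text, CData, Comment };

struct Attribute {
    std::string name;
    std::string value;
};

struct XmlNode {
    NodeKind kind;
    std::string name;                   // e.g. "title" or "#text"
    std::string value;                  // empty for elements
    std::vector<Attribute> attributes;
    std::vector<XmlNode> children;
};

// Append one display line per node, indenting two spaces per tree level.
void walk(const XmlNode& node, int depth, std::vector<std::string>& out) {
    const std::string indent(static_cast<std::size_t>(depth) * 2, ' ');

    if (node.kind != NodeKind::Element) {
        // Text, CDATA, and comment nodes: show "name {value}" and stop,
        // just as the switch statement returns early for them.
        out.push_back(indent + node.name + " {" + node.value + "}");
        return;
    }

    // Element: show the name, then its attributes one level down.
    out.push_back(indent + node.name);
    for (const Attribute& a : node.attributes)
        out.push_back(indent + "  @" + a.name + " {" + a.value + "}");

    // Recurse through the children, using this element as their parent.
    for (const XmlNode& child : node.children)
        walk(child, depth + 1, out);
}
```

Calling walk() with the root element and a depth of 0 fills the vector in document order, which is the same order the tree control items are inserted.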

The rest of the DOMDemo code is relatively straightforward MFC code to deal with the tree control. For example, you can expand the entire tree or collapse it with a single click.

You should know that this demonstration was inspired by Aaron Skonnard's excellent C++/XML sample application, XMLEdit, which I've included with this article. In his demonstration program, you'll find even more of interest, including how to handle parser errors using IXMLDOMParseError as well as how to edit and insert XML nodes dynamically.

You've now seen the basics of working with the MSXML parser, as well as an overview of the XML DOM itself. Now let's turn to something different, taking a look at style sheets and how you can use a style sheet to convert raw XML data into an HTML format for presentation.

The eXtensible Stylesheet Language

XSL was created to convert XML to other formats, currently XML and HTML. XML as a different format? Sure. I mean, you can convert an XML document with one set of tags to an XML document with a different set of tags. While you're converting the XML information to the other format, you can also filter the data, sort it, and make programmatic decisions based on a given element's content.

XSL is based on two fundamental constructs. First, there's the transformation language itself. But it's also important to remember the second construct, which is the formatting vocabulary. It probably isn't too surprising to find that XSL defines an XML-based programming language you can use to construct a new document—that's the whole idea behind stylesheets. The formatting vocabulary allows you to redefine the specific semantics that will be used to format the XML data. Because we're using the MSXML processor here, which is almost entirely directed toward producing HTML, we'll ignore the formatting vocabulary at this time. Future MSXML implementations might allow us to deal directly with the formatting vocabulary, but for now we're pretty much limited to HTML and HTML-like output (such as another XML format). Therefore, I'll concentrate on the transformation language itself.

Here's a simple example XML document I'll use for discussion purposes. It simply contains a listing of the cars I own, including their make, model, year, and mileage. I added the XML-Data datatypes to the elements as well. Normally I'd create a schema, but for this simple example it isn't necessary. The latest MSXML processor technology preview allows inline schemas, but I don't want to force you to download the latest processor just to try XSL. In any case, my XML car database is:

<?xml version="1.0" ?>
 
 <cars xmlns:dt="urn:schemas-microsoft-com:datatypes">
    <!-- Car database -->
    <car color="Red">
       <!-- Individual car record -->
       <make dt:type="string">Saturn</make>
       <model dt:type="string">SL-2</model>
       <year dt:type="int">1992</year>
       <mileage dt:type="int">150437</mileage>
    </car>
    <car color="White">
       <make dt:type="string">Ford</make>
       <model dt:type="string">F-150</model>
       <year dt:type="int">1992</year>
       <mileage dt:type="int">84659</mileage>
    </car>
    <car color="Black">
       <make dt:type="string">Buick</make>
       <model dt:type="string">Regal GS</model>
       <year dt:type="int">1997</year>
       <mileage dt:type="int">51084</mileage>
    </car>
 </cars>

And before you ask, yes, these really are my vehicles, and the Saturn is running just fine, thank you. I hope it lasts another 150,000 miles, considering what vehicles cost today!

Given that database, it's reasonable to want to show users the data in any number of formats. Not only could you want HTML to show the information in different ways, but you might also want to process the information to show only a subset of the data, or show a sorted listing.

XSL to the rescue! The XSL language is based on the notion of a template. To XSL, a template provides a structure used to produce the output document. Structure, in this case, means the tagging arrangement you'll use for the resulting converted document. XSL templates contain patterns. The XSL patterns are used to tell the processor which XML tags are to be transformed.

For example, my cars database uses the root element <cars />, with subelements of <car />. I can create a pattern to apply a transformation to all of the <car /> nodes within <cars /> (in fact, I'll do that in a moment). The template would include the patterns I create, so I could have a template that displayed the car database sorted by make, or perhaps by model. If I had both sort styles available, I'd have two distinct templates even though the patterns would most likely be quite similar for each template (I could reuse the patterns, in other words). You can create XSL files with a single template, or you can incorporate multiple templates within the same file.

To clarify this a bit, imagine I wanted this HTML as output from my XSL style sheet:

<html>
 
 <body bgcolor="#FFFFFF">
 
 <ul>
     <li>Saturn</li>
     <li>Ford</li>
     <li>Buick</li>
 </ul>
 
 </body>
 </html>

In this case, I'm simply displaying the car makes as an unordered list. In pseudocode, the template would look like this:

Emit("<html><body bgcolor="#FFFFFF"><ul>")
 For (each <car> in <cars>)
     Emit("<li>",make,"</li>")
 End For
 Emit("</ul></body></html>")

This pseudocode is completely contrived, as there is no such function as Emit(). But the template concept the pseudocode demonstrates isn't contrived. In this case, the XSL processor will spit out the introductory HTML code verbatim:

<html><body bgcolor="#FFFFFF"><ul>

Then, the processor would loop through all of the <car/> elements. For each element it found, it would insert the make value from the XML document inside the HTML list item tag:

<li>Saturn</li>
 <li>Ford</li>
 <li>Buick</li>

This is a pattern in action! Finally, the XSL processor would spit out the closing HTML code:

</ul></body></html>

If I wanted a table instead, I would have used the HTML table tags in the template instead of the unordered list tags. Thus, I would have defined another template, even though I could have used a similar pattern (I'd need to replace the list item tags with table data tags, but I hope you have the idea). I could even tell the XSL processor to select the unordered list or table for me, depending on some piece of data stored within the XML database itself.
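
For C++ readers, the template-versus-pattern split might be easier to see as ordinary code. The sketch below is only an analogy in standard C++, not anything the XSL processor does internally. The Car struct and the sampleCars(), renderList(), and renderTable() functions are hypothetical, and the table markup is my guess at what a table template might emit.

```cpp
#include <string>
#include <vector>

struct Car {
    std::string color;
    std::string make;
    std::string model;
    int year;
    int mileage;
};

// The three records from the car database above.
std::vector<Car> sampleCars() {
    return {
        {"Red",   "Saturn", "SL-2",     1992, 150437},
        {"White", "Ford",   "F-150",    1992, 84659},
        {"Black", "Buick",  "Regal GS", 1997, 51084},
    };
}

// Template 1: the fixed markup is emitted verbatim; the loop over each
// car's make plays the role of the pattern.
std::string renderList(const std::vector<Car>& cars) {
    std::string out = "<html><body bgcolor=\"#FFFFFF\"><ul>";
    for (const Car& car : cars)
        out += "<li>" + car.make + "</li>";
    out += "</ul></body></html>";
    return out;
}

// Template 2: the same pattern (each car's make) wrapped in table tags.
std::string renderTable(const std::vector<Car>& cars) {
    std::string out = "<html><body bgcolor=\"#FFFFFF\"><table>";
    for (const Car& car : cars)
        out += "<tr><td>" + car.make + "</td></tr>";
    out += "</table></body></html>";
    return out;
}
```

The two functions differ only in the literal markup around the loop, which is the sense in which two templates can share a similar pattern.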

Now let's look at a real XSL template that would produce the unordered list HTML code. Here's one:

<?xml version="1.0" ?>
 <xsl:stylesheet 
        xmlns:xsl="http://www.w3.org/TR/WD-xsl">
    <xsl:template match="/">
      <html>
        <body>
          <ul>
            <xsl:for-each select="cars/car">
              <li><xsl:value-of select="make" /></li>
            </xsl:for-each>
          </ul>
        </body>
      </html>
    </xsl:template>
 </xsl:stylesheet>

This XSL code implements the pseudocode I described previously (without the bgcolor attribute on the body tag). Note that XSL is XML, so there's a single root element:

<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/TR/WD-xsl">

The root element declares this to be an XSL stylesheet and specifies the xsl namespace. Then I declare the template:

<xsl:template match="/">

The match attribute tells the XSL processor to apply this template to the entire XML document. Then I declare the overall structure of the HTML document I want to create:

<html>
    <body>
       <ul>
          (list item data)
       </ul>
    </body>
 </html>

This should produce a pretty basic HTML document. The actual XSL pattern code rummages through the XML document and extracts the <make /> element value from each <car /> element (the "list item data" comment from the previous code snippet):

<xsl:for-each select="cars/car">
    <li><xsl:value-of select="make" /></li>
 </xsl:for-each>

You can see the looping construct in action. Notice the mixture of XSL tags and HTML tags within the body of the loop (the HTML tags are <li> and </li>):

<li><xsl:value-of select="make" /></li>

The remaining tag tells the XSL processor to insert the current value from <cars><car><make> at this point in the output stream. In this case, the value will be sandwiched between the list item tags. The for-each loop will run through each <car /> element in the XML document.

There are many XSL elements and methods, and space limits me from listing each. I encourage you to purchase a good reference book on XSL, and XML, to provide you with a bit more information. However, Table 3 (XSL elements) and Table 4 (XSL methods) do give you an idea how powerful XSL is.

Table 3. XSL elements.

Element	Purpose
xsl:for-each	Looping element—applies template to identified nodes
xsl:choose	Conditional test (like C++ switch), used with xsl:when and xsl:otherwise
xsl:when	Conditional selection, used with xsl:choose (like C++ case)
xsl:otherwise	Conditional selection, used with xsl:choose (like C++ default)
xsl:value-of	Returns value of identified node, used with its select attribute
xsl:if	Conditional test
xsl:element	Generates a new node in output stream
xsl:template	Defines an output stream based upon specific pattern(s)
xsl:stylesheet	Defines set of templates to be applied to the source document

Table 4. XSL methods.

Method	Purpose
depth	Returns hierarchical depth within XML tree
childNumber	Returns node number relative to siblings with the same name
absoluteChildNumber	Returns node number relative to all siblings
ancestorChildNumber	Returns number of nearest ancestor with the given name
formatIndex	Formats integer using given numerical system
formatNumber	Formats number using given format
formatDate	Formats date using given format options
formatTime	Formats time using given format options
uniqueID	Returns unique identifier for given node

In addition to the unordered list example, I've provided two other XSL stylesheets you can use to see how XSL and XML work together. These style sheets produce HTML with a table of the car information sorted by either make (descending) or model (ascending). The particular line of XSL code that sorts the XML data looks like this:

<xsl:for-each order-by="- make" select="cars/car">

The sorting attribute is order-by. As you can see, you use the for-each element as you normally would, simply adding the order-by attribute. The minus sign (-) indicates a descending sort, while a plus sign (+) indicates an ascending sort.
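
If it helps to think of the sort key in C++ terms, here's a small sketch that reads a key written the way order-by writes it (an optional sign followed by a field name) and sorts a vector accordingly. It's an analogy in standard C++, not the processor's implementation. CarEntry and orderBy() are hypothetical names, and the sketch only recognizes "make" and "model".

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct CarEntry {
    std::string make;
    std::string model;
};

// Sort by a key such as "- make" or "+model". A leading '-' means
// descending; '+' or no sign means ascending. Any field name other
// than "model" is treated as "make" in this sketch.
void orderBy(std::vector<CarEntry>& cars, std::string key) {
    bool descending = false;
    if (!key.empty() && (key[0] == '+' || key[0] == '-')) {
        descending = (key[0] == '-');
        key.erase(0, 1);
    }
    // Skip any spaces between the sign and the field name.
    key.erase(0, key.find_first_not_of(' '));

    const bool byModel = (key == "model");
    std::stable_sort(cars.begin(), cars.end(),
        [byModel, descending](const CarEntry& a, const CarEntry& b) {
            const std::string& x = byModel ? a.model : a.make;
            const std::string& y = byModel ? b.model : b.make;
            return descending ? (y < x) : (x < y);
        });
}
```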

Combining XSL and the XML DOM

You can, if you'd like, combine the XML data and the style sheet by indicating in the XML document which style sheet should be applied. When the XML processor sees the stylesheet's processing instruction, it will load the XSL stylesheet and make the relevant transformation(s). For example, if I saved my XSL code in the file "UList.xsl" and then modified the car XML to add the XSL processing instruction, this is what you'd see:

<?xml version="1.0" ?>
 <?xml-stylesheet href="ulist.xsl" type="text/xsl"?>
 
 <cars xmlns:dt="urn:schemas-microsoft-com:datatypes">
    <!-- Car database -->
    (car information here...)
 </cars>

If you loaded this XML document into IE5, the MSXML processor would first apply the XSL transformation, then IE would load the resulting HTML to show you the simple bulleted output.
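
IE5 and MSXML read the stylesheet processing instruction for you, so you'd rarely need to do it by hand. Still, if you ever want to pull the href out yourself (say, to show the user which style sheet a document asks for), the idea can be sketched in standard C++. The pseudoAttribute() function below is hypothetical, and it assumes the simple name="value" form shown above with no spaces around the equals sign.

```cpp
#include <cstddef>
#include <string>

// Return the value of a pseudo-attribute (such as href) from the data
// portion of a processing instruction, or an empty string if it's absent.
std::string pseudoAttribute(const std::string& piData,
                            const std::string& name) {
    const std::string needle = name + "=";
    std::size_t pos = 0;
    while ((pos = piData.find(needle, pos)) != std::string::npos) {
        // The match must start the data or follow a space, so that
        // "href" isn't found inside some longer name.
        const bool atBoundary = (pos == 0) || (piData[pos - 1] == ' ');
        const std::size_t quote = pos + needle.size();
        if (atBoundary && quote < piData.size() &&
            (piData[quote] == '"' || piData[quote] == '\'')) {
            // Find the matching closing quote and return what's between.
            const std::size_t end = piData.find(piData[quote], quote + 1);
            if (end != std::string::npos)
                return piData.substr(quote + 1, end - quote - 1);
        }
        pos += needle.size();
    }
    return "";
}
```

Here the data is everything between the xml-stylesheet target and the closing ?>, which for the example above is href="ulist.xsl" type="text/xsl".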

However, if you're using MSXML programmatically and want separate data and stylesheets (perhaps to allow the user to select the presentation style he or she desires), you'll need to apply the XSL transformation yourself. In this case, you begin by loading the XML document as you saw in the code for OnOpen(), using IXMLDOMDocument::load(). The twist is that you then create a second instance of the MSXML processor and load the XSL style sheet into that instance. Next, you provide the XML document with the root element of the XSL style sheet using the transformNode() method, which returns to you a BSTR that will contain the transformed markup. You can then do anything you'd like with this BSTR.

I've created an MFC dialog-based application that allows you to load an XML file, load a corresponding XSL stylesheet, apply the style to the XML data, and then finally view the results. The basic user interface is shown in Figure 4. XSLDemo's file loading code is similar to the DOMDemo code you examined previously.

The transformation code is pretty straightforward:

void CXSLDemoDlg::OnXForm() 
 {
     // The XML document has been loaded, as well 
     // as the style sheet, so now apply the style...
     HRESULT hr = S_OK;
     try {
         // Disable the view button...
         CWnd* pBtn = GetDlgItem(IDC_VIEW);
         pBtn->EnableWindow(FALSE);
 
         // Get the XSL sheet's root node
         CComPtr<IXMLDOMElement> spXSLSSElement;
         CComQIPtr<IXMLDOMNode> spXSLRootNode;
 
         hr = 
            m_spXSLSS->get_documentElement(
                                   &spXSLSSElement);
         if ( FAILED(hr) ) throw hr;
 
         // With the document element in hand, QI() 
         // for the root node object
         spXSLRootNode = spXSLSSElement;
 
         // Apply the stylesheet to the root 
         // document object node...
         CComBSTR bstrHTML;
         hr = 
            m_spXMLDoc->transformNode(spXSLRootNode,
                                      &bstrHTML);
         if ( FAILED(hr) ) throw hr;
 
         // Save the HTML output to a file for 
         // later browsing
         TCHAR strPath[_MAX_PATH+1] = {0};
         ::GetTempPath(_MAX_PATH,strPath);
         m_strHTMLPath.Format("%s%s",strPath,
                              _T("XForm.htm"));
         CStdioFile siofOut;
         if ( siofOut.Open(m_strHTMLPath,
                           CFile::modeCreate |
                           CFile::modeWrite |
                           CFile::typeText) ) 
         {
             CString strHTML = bstrHTML;
             siofOut.Write(strHTML.GetBuffer(0),
                           strHTML.GetLength());
             siofOut.Close();
         } // if
 
         // Enable the view button...
         pBtn->EnableWindow(TRUE);
     } // try
     catch(HRESULT hrError) {
         // Object error...
         CString strMsg;
         strMsg.Format(IDS_E_NOXFORM,hrError);
         AfxMessageBox(strMsg,MB_OK|MB_ICONERROR);
     } // catch
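     catch(CFileException* e)
     {
         // MFC file I/O error...MFC throws its
         // exceptions by pointer, so catch the
         // pointer and delete it when finished
         CString strMsg;
         strMsg.Format(IDS_E_NOSAVEFILE,e->m_lOsError);
         AfxMessageBox(strMsg,MB_OK|MB_ICONERROR);
         e->Delete();
     }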
     catch(...) {
         // Some error
         AfxMessageBox(IDS_E_UNKXFORMERR,
                       MB_OK | MB_ICONERROR);
     } // catch
 }

In a nutshell, I store pointers to both the XML and XSL IXMLDOMDocument interfaces. The user provides the XML and XSL files, at which time the transform button is enabled. When the user clicks the transform button, CXSLDemoDlg::OnXForm() is called. I then disable the view results button (there's technically nothing to view at this time), retrieve the XSL document's root element, and query that element for its IXMLDOMNode interface.

I then can transform the XML document using the XSL information. If the transformation was successful, I save the HTML data contained within the BSTR to a file for later navigation with the browser, then enable the View button.

Wrap-up

With this installment, you should now be able to use the MSXML processor from within your C++ applications to manipulate XML data, either to display the information as you walk through the XML tree, or as HTML output you create using an XSL style sheet. The MSXML processor has a great deal more detail than I've been able to show you here in this limited space, but you have the basics and should hopefully be able to understand other samples or articles. Stylesheets could fill a book by themselves, and I've barely scratched the surface! Nonetheless, as with the MSXML processor, you should be able to look at an XSL stylesheet and have some idea as to what it will do when it's applied to the XML document.

In the final article on XML and C++, I'll discuss BizTalk and SOAP, two XML-based standards you'll certainly see in your future work if you keep using the latest in XML technology. Until then, try the sample applications and write a stylesheet or two. See you next time!

Comments? Questions? Find a bug? Please send me a note!
