The Barnes & Noble Review
XSL brings XML to life.
XSL's obvious value is in formatting XML data for display in a browser (effectively, a more sophisticated alternative to Cascading Style Sheets). But XSL's true power lies in its ability to transform XML data into virtually any new structure you can imagine. Another way of putting it: XSL bridges your data store and your browser.
Professional XSL focuses on this transformative power of XSL and XSLT, offering detailed techniques and extensive sample code you can begin using right now. You'll also find a batch of example applications -- including a book catalog application that showcases many of XSL's capabilities.
There's a full chapter on transforming XML data into VoiceXML documents -- enabling users to interact with an application by listening and speaking on the phone rather than reading screens and typing on keyboards. This VoiceXML application is emblematic of one of XSL's key advantages: You can store content in flexible XML-compatible formats on your servers, and deliver that content seamlessly to virtually any device, from browsers to wireless PDAs. It's a lot of power -- and Professional XSL places it at your command.
Bill Camarda is a consultant and writer with nearly 20 years' experience in helping technology companies deploy and market advanced software, computing, and networking products and services. His 15 books include Special Edition Using Word 2000 and Upgrading & Fixing Networks For Dummies®, Second Edition.
Read an Excerpt
Chapter 3: XSLT Basics
In this chapter, we will build on Chapters 1 and 2 to provide you with enough information to start
building useful XSLT stylesheets. I will introduce a number of the elements that make up the language,
providing examples of their use. We will also look at a few of the functions built into the language and
see how XSLT manages namespaces, whitespace and some other important issues.
To illustrate the concepts I introduce, we will work mainly with two documents, one that is textual in
content, and one that is more data oriented. The former is a Shakespeare play (Hamlet), and the latter is
a book catalog that could, for example, have been extracted from a relational database. Both documents
are given in the code download for the chapter.
By the end of the chapter, you will:
- have a clearer picture of the processing model of XSLT
- know the difference between push and pull model stylesheets, and when to use each
- understand the use of the most important XSLT elements
- understand the use of a few of the built-in functions
- understand the basic rules of how XSLT copes when there are conflicts in the stylesheet
- know more about the built-in template rules and how to override them
Before delving into the detail of XSLT elements and functions, let's start by looking in detail at how an
XSLT processor, such as XT, Saxon or MSXML3, processes a document. We will look at the model
from an abstract view – becoming an XSLT processor ourselves and working our way through a
document and stylesheet. We'll then look at the two fundamental ways in which this model can be used.
More information on these processors can be found in Appendix E. Later in this chapter I will be
mainly using XT to process XSLT stylesheets, but any of these processors can be used. XT is similar
in use to Instant Saxon, which was introduced in Chapter 2.
The XSLT Processing Model
Although we often talk of an XSLT processor as something that turns one XML document into another
(or into an HTML or text document), this is not strictly true. The specification actually talks in terms of
a source tree (or input tree) and a result tree. There is therefore an assumption that, for example, if we
are starting from a text document rather than an existing DOM tree, it has been turned into some sort of
tree structure before the XSLT processor starts its work, and that the result tree will be used for further
processing or serialized in some way to create another text document.
The model, including formatting, therefore looks like this...
...This concept is simple enough. But you will have read in Chapter 1 that XSLT is a declarative language
and uses templates. How does this work in practice? Let's have a look at a simple XML document and
stylesheet, and walk through the processing.
Processing a Document
Here is my XML document – it is the book catalog that you will be familiar with if you have read
Professional XML (Wrox Press, ISBN 1-861003-11-0), although I have cut it down to just two books,
removed some elements and renamed it shortcatalog.xml:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Title>Designing Distributed Applications</Title>
<Title>Professional ASP 3.0</Title>
We'll look at the XSLT stylesheet we use to transform this document shortly, but let's now become an XSLT
processor and see what happens. We already know that, as an XSLT processor, we cannot use the source
XML, but need a tree representation based on the structure and content of the document. So here it is...
...Each node is described by a block of three rectangles. In the top rectangle is the node type, with the
node name in the rectangle below it. The bottom rectangle contains an asterisk if the node has element
content, and the text if it has text content.
At the top of the tree is the root node or document root. Don't confuse this with the root element (or
document element) familiar from XML. The document root is the base of the document, and has the
document element (<Catalog>) as a child. It also has the XML declaration and any other top-level
nodes (which might be comments or processing instructions) as children. The document element
contains two child <Book> elements, and these hold the information about the books.
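As a rough guide to the tree just described, the source document would have a shape like the sketch below. Only the element names <Catalog>, <Book>, <Title> and <PubDate> and the two titles are taken from this chapter; the PubDate values are placeholders, and the real file may differ in detail.

```xml
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<!-- Sketch only: PubDate values are placeholders, not data from the book -->
<Catalog>
  <Book>
    <Title>Designing Distributed Applications</Title>
    <PubDate>PLACEHOLDER</PubDate>
  </Book>
  <Book>
    <Title>Professional ASP 3.0</Title>
    <PubDate>PLACEHOLDER</PubDate>
  </Book>
</Catalog>
```

The comment on the second line sits at the top level, so in the tree it would be a child of the document root alongside the <Catalog> element.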
Now that we have the tree structure, we can start to populate and process it. This is the processing model
we will use...
Before XSL processing starts, both the source document and XSLT stylesheet must be loaded into the
processor's memory. How this happens is dependent on the implementation. One option is that both are
loaded as DOM documents under the control of a program. Another option is that the stylesheet is
referenced by a processing instruction in the source XML document. IE5 can operate in this way, and
will automatically load the stylesheet when the XML document is loaded.
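The processing-instruction route can be sketched as follows. The xml-stylesheet instruction goes in the prolog, before the document element. The href value here assumes that the stylesheet file, TitleAndDate.xsl, is stored in the same directory as the XML document, and text/xsl is the type value that Internet Explorer looks for.

```xml
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<!-- Assumed location: stylesheet in the same directory as this document -->
<?xml-stylesheet type="text/xsl" href="TitleAndDate.xsl"?>
<Catalog>
  <!-- <Book> elements go here -->
</Catalog>
```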
And here is the XSLT stylesheet (TitleAndDate.xsl) we will use to process shortcatalog.xml
to get a new XML document listing just the titles of the books and their publication dates:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<xsl:value-of select="Title"/>, <xsl:value-of select="PubDate"/>
Once the documents are in memory, we can start our processing. The XSL processor starts by reading
the template for the document root from the stylesheet (step 1). Here is that template:
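A template fitting the description that follows would look something like this. It is a sketch rather than the book's own listing, and it assumes the template sits inside an xsl:stylesheet element that binds the xsl prefix to http://www.w3.org/1999/XSL/Transform.

```xml
<!-- Sketch: assumes the xsl prefix is bound by the enclosing xsl:stylesheet -->
<xsl:template match="/">
  <xsl:apply-templates/>
</xsl:template>
```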
The first line indicates that it is a template, with a match attribute to indicate the node or nodes it is
matching. The attribute value is an XPath expression, in this case just the / that indicates the document root.
Working round the diagram, at step 2 we find the source node (strictly, the node-set, but here it will
comprise a single node) in the source tree that the template matches. This will be the document root.
The second line of the template moves us on to step 3 and indicates that we will execute whatever
templates apply to the children of this node. The document root has two children – the XML
declaration and the <Catalog> element.
Looking through the stylesheet, there is no template for the XML declaration (XSLT does not give us
access to this node), but there is one for the <Catalog> element. Processing a document using XSL is a
recursive process, and we are now back to step 1 with a new template. Here is the template:
This contains some text, which looks like another element called <Books>. As our diagram indicates,
we will transform this into a result node at step 3. It also contains an <xsl:apply-templates/>
instruction, so we will again look for templates to execute matching the child nodes.
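Put together, a template for <Catalog> matching this description might read as follows. This is a sketch based on the prose, so the exact layout is an assumption, as is the enclosing xsl:stylesheet element that binds the xsl prefix.

```xml
<!-- Sketch: assumes the xsl prefix is bound by the enclosing xsl:stylesheet -->
<xsl:template match="Catalog">
  <!-- <Books> is a literal result element, copied to the result tree -->
  <Books>
    <!-- process each child of <Catalog>, here the two <Book> elements -->
    <xsl:apply-templates/>
  </Books>
</xsl:template>
```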
The only children of the <Catalog> element are the two <Book> elements, so we will read the
template for these elements and go round the circle again. Here is the template...
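Based on the two xsl:value-of instructions in the stylesheet, the <Book> template could be sketched like this. The select expressions come from the stylesheet; the <Book> element written to the result is an assumption, made so that each title and date pair is wrapped in the output.

```xml
<!-- Sketch: assumes the xsl prefix is bound by the enclosing xsl:stylesheet -->
<xsl:template match="Book">
  <!-- The <Book> wrapper in the result is an assumption, not from the text -->
  <Book>
    <xsl:value-of select="Title"/>, <xsl:value-of select="PubDate"/>
  </Book>
</xsl:template>
```

Relative paths such as Title are evaluated against the current <Book> node, so each pass through the template picks up that book's own title and date.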