7.6 Formatting
Since XSLT was originally intended for producing human-readable formatted documents, and not just as a general transformation
tool, it comes with a decent supply of formatting capabilities.
7.6.1 Setting the Output Mode
A global setting you may want to include in your stylesheet is the <output> element. It controls how the XSLT engine constructs the
result tree by forcing start and end tags, handling whitespace in a certain way, and so on. It is a top-level element that should
reside outside of any template.
Three choices are provided: XML, HTML, and text. The default output type, XML, is simple: whitespace and predefined entities are
handled exactly the same in the result tree as in the input tree, so there are no surprises when you look at the output. If your result
document will be an application of XML, place this directive in your stylesheet:
<xsl:output method="xml"/>
HTML is a special case necessary for older browsers that do not understand some of the new syntax required by XML. It is unlikely
you will need to use this mode instead of XML (for XHTML); nevertheless it is here if you need it. The exact output conforms to
HTML version 4.0. Empty elements will not contain a slash at the end and processing instructions will contain only one question
mark. So in this mode, the XSLT engine will not generate well-formed XML.
Text mode is useful for generating non-XML output. For example, you may want to dump a document to plain text with all the tags
stripped out. Or you may want to output to a format such as troff or TeX. In this mode, the XSLT engine is required to resolve all
character entities rather than keep them as references. It also handles whitespace differently, preserving all newlines and
indentation.
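As a minimal sketch of text mode, assuming nothing about the input beyond its being well-formed XML, the following stylesheet declares text output and lets the built-in rules copy every text node to the result. The encoding attribute is an optional extra that is not discussed above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Top-level declaration: serialize the result as plain text.
       Tags are dropped, and characters such as < and & are written
       literally instead of as entity references. -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Start at the root and fall through to the built-in rules,
       which copy the text of every element to the output. -->
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

Changing method to xml or html in the same declaration switches the serialization without touching any template.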
7.6.2 Outputting Node Values
XPath introduced the notion of a node's string value. All the text in an element is assembled into a string and that is what you get.
So in this element:
<sentence><subject>The quick, brown
<noun>fox</noun></subject> <action>jumped over</action> <object>the
lazy <noun>dog</noun></object>.</sentence>
The text value is "The quick, brown fox jumped over the lazy dog."
In the default rules, all text nodes that are the children of elements are output literally. If you have no explicit template for text nodes,
then any apply-templates directive that matches a text node will resort to the default rule and the text will be output normally.
However, there are cases when you can't rely on the default rules. You may want to output the value of an attribute, for example. Or
else you might want to get a string without any markup tags in it. When this is the case, you should use value-of.
This element requires an attribute select which takes an XPath expression as its value. value-of simply outputs the string value of
that expression.
Recall from Example 7-2 this template:
<xsl:template match="part[@label]">
<dt>
<xsl:value-of select="@label"/>
</dt>
<dd>
<xsl:apply-templates/>
</dd>
</xsl:template>
It extracts the value of the attribute label and outputs it literally as the content of a dt element.
Besides nodes, value-of can be used to resolve variables, as you will see in the next section.
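To make the idea of a string value concrete, here is a sketch that applies value-of to the sentence element shown earlier. The choice of p as the output element is mine, and normalize-space( ) is a standard XPath function that I bring in to tidy the line break that the source element contains.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:template match="sentence">
    <!-- String value of the whole element: all descendant text,
         with the subject, noun, action, and object tags dropped. -->
    <p><xsl:value-of select="."/></p>

    <!-- The same string, with line breaks and runs of spaces
         collapsed to single spaces. -->
    <p><xsl:value-of select="normalize-space(.)"/></p>

    <!-- This expression selects two noun elements.  In XSLT 1.0,
         value-of outputs the string value of the first one only
         (in document order), so this yields "fox". -->
    <p><xsl:value-of select=".//noun"/></p>
  </xsl:template>

</xsl:stylesheet>
```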
7.6.3 Variables
A convenience provided in XSLT is the ability to create placeholders for text called variables. Contrary to what the name suggests,
a variable cannot be modified over the course of processing. It's really a constant that is set once and read
multiple times. A variable must be defined before it can be used. For that, you can use the variable element.
Here are some examples of declaring variables:
<!-- A numeric constant -->
<xsl:variable name="year" select="2001"/>
<!-- A string consisting of two blank lines, useful for making
output XML easier to read -->
<xsl:variable name="double-blank-line">
<xsl:text>
</xsl:text>
</xsl:variable>
<!-- A concatenation of two elements' string values -->
<xsl:variable name="author-name">
<xsl:value-of select="/book/bookinfo/authorgroup/author/firstname"/>
<xsl:text> </xsl:text>
<xsl:value-of select="/book/bookinfo/authorgroup/author/surname"/>
</xsl:variable>
The first example sets the variable to a numeric value. The last two examples set the variables to result tree fragments. These are
XML node trees that can be copied into the result tree.
Like parameters, a variable reference has a required dollar sign ($) prefix, and when referenced in non-XPath attribute values, it
must be enclosed in curly braces ({ }). Variables can be used in other declarations, but be wary of creating looped definitions.
Here is a dangerous, mutually referential set of variable assignments:
<!-- ASKING FOR TROUBLE -->
<xsl:variable name="thing1" select="$thing2" />
<xsl:variable name="thing2" select="$thing1" />
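The two reference styles described above (a bare dollar sign inside XPath expressions, curly braces inside ordinary attribute values) can be sketched as follows. The variable name site, its URL, and the id attribute on chapter are all hypothetical, chosen only for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Hypothetical global variable, visible in every template.
       The inner single quotes make the value a string literal
       rather than a path to an element. -->
  <xsl:variable name="site" select="'http://www.example.com'"/>

  <xsl:template match="chapter">
    <!-- href is an ordinary attribute on a literal result element,
         so each expression must be wrapped in curly braces. -->
    <a href="{$site}/chapters/{@id}.html">
      <!-- select already holds an XPath expression: no braces. -->
      <xsl:value-of select="$site"/>
    </a>
  </xsl:template>

</xsl:stylesheet>
```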
Variables can be declared outside of templates, where they are visible to every template, or inside one, where their scope is limited to that
template. The following template creates a bracketed number to mark a footnote, and makes it a link to the footnote text at the end
of the page. The number of the footnote is calculated once, but used twice.
<xsl:template match="footnote">
<xsl:variable name="fnum"
select="count(preceding::footnote[ancestor::chapter//.])+1"/>
<a>
<xsl:attribute name="href">
<xsl:text>#FOOTNOTE-</xsl:text>
<xsl:number value="$fnum" format="1"/>
</xsl:attribute>
<xsl:text>[</xsl:text>
<xsl:number value="$fnum"/>
<xsl:text>]</xsl:text>
</a>
</xsl:template>
Instead of performing the calculation in the content of the element, I did it inside a select attribute. Using select is generally better
because it doesn't incur the cost of creating a result tree fragment, but sometimes you have to use the element content method
when more complex calculations such as those involving choices are necessary.
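Here is a sketch of the element content method for a case where select is not enough because the value depends on a choice. The item element, its price attribute, its name child, and the three labels are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:template match="item">
    <!-- The variable's value is whatever text the choose produces.
         It is stored as a result tree fragment. -->
    <xsl:variable name="label">
      <xsl:choose>
        <xsl:when test="@price &gt; 100">expensive</xsl:when>
        <xsl:when test="@price &gt; 10">moderate</xsl:when>
        <xsl:otherwise>cheap</xsl:otherwise>
      </xsl:choose>
    </xsl:variable>

    <!-- The fragment is converted to a string when it is used
         in an attribute value or in value-of. -->
    <p class="{$label}">
      <xsl:value-of select="name"/>
      <xsl:text> (</xsl:text>
      <xsl:value-of select="$label"/>
      <xsl:text>)</xsl:text>
    </p>
  </xsl:template>

</xsl:stylesheet>
```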
7.6.4 Creating Nodes
You can create elements and attributes just by typing them out in template rules, as we have seen in previous examples. Although
this method is generally preferable for its simplicity, it has its limitations. For example, you may want to create an attribute with a
value that must be determined through a complex process:
<xsl:template match="a">
<p>See the
<a>
<xsl:attribute
name="href">http://www.oreilly.com/catalog/<xsl:call-template
name="prodname"/></xsl:attribute>
catalog page
</a> for more information.
</p>
</xsl:template>
In this template, the attribute element creates a new attribute node named href. The value of this attribute is the content of the
node-creating element, in this case a URI with some variable text provided by a call-template element. As I have written it here, the
variable text cannot be written directly into the start tag of the a element, so I have broken it out into a separate attribute-creation step.
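An alternative sketch, not the approach taken above: capture the output of call-template in a variable first, then refer to the variable with curly braces in a literal href attribute. The body of the prodname template is not shown in this chapter, so the one below, which reads a product attribute, is purely a stand-in.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Hypothetical stand-in for the prodname template.  The current
       node is unchanged by call-template, so @product is read from
       the a element being processed. -->
  <xsl:template name="prodname">
    <xsl:value-of select="@product"/>
  </xsl:template>

  <xsl:template match="a">
    <!-- Run the named template once and keep its text. -->
    <xsl:variable name="prod">
      <xsl:call-template name="prodname"/>
    </xsl:variable>
    <p>
      <xsl:text>See the </xsl:text>
      <a href="http://www.oreilly.com/catalog/{$prod}">catalog page</a>
      <xsl:text> for more information.</xsl:text>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

Which form reads better is a matter of taste. The attribute element keeps everything in one place, while the variable keeps the output markup looking like ordinary HTML.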
7.6.4.1 Elements
XSLT provides an element for each node type you would want to create. element creates elements. Usually, you don't need this
because you can just type in the element tags. In some circumstances, the element name may not be known at the time you write
the stylesheet. It has to be generated dynamically. This would be an application of element.
The name attribute sets the element type. For example:
<xsl:template match="shape">
<xsl:element name="{@type}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
If this template is applied to an element shape with attribute type="circle", it would create an element of type circle.
7.6.4.2 Attributes and attribute sets
I have already shown you how to create attributes with attribute. As with element generation, you can derive the attribute name and
value on the fly. Note, however, that an attribute directive must come before any other content. It is an error to try to create an
attribute after an element or text node.
To apply a single set of attributes to many different elements, you can use attribute-set. First, define the set like this:
<xsl:attribute-set name="common-atts">
<xsl:attribute name="id">
<xsl:value-of select="generate-id()"/>
</xsl:attribute>
<xsl:attribute name="class">
<xsl:text>shape</xsl:text>
</xsl:attribute>
</xsl:attribute-set>
This creates a set of two attributes, id with a unique identifier value and class="shape". The set can be accessed from any element
through its name common-atts. Use the attribute use-attribute-sets to refer to the attribute set you defined:
<xsl:template match="quote">
<blockquote xsl:use-attribute-sets="common-atts">
<xsl:value-of select="."/>
</blockquote>
</xsl:template>
You can include as many attribute sets as you want by including them in a space-separated list.
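The space-separated list can be sketched like this. The second set, caption-atts, and the caption attribute it reads are hypothetical. Note that on a literal result element the attribute carries the xsl: prefix, while on the element instruction it is written without one.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:attribute-set name="common-atts">
    <xsl:attribute name="class">shape</xsl:attribute>
  </xsl:attribute-set>

  <!-- Hypothetical second set.  Its content is evaluated each time
       the set is used, against the node being processed then. -->
  <xsl:attribute-set name="caption-atts">
    <xsl:attribute name="title">
      <xsl:value-of select="@caption"/>
    </xsl:attribute>
  </xsl:attribute-set>

  <!-- Literal result element: prefixed attribute, two sets. -->
  <xsl:template match="quote">
    <blockquote xsl:use-attribute-sets="common-atts caption-atts">
      <xsl:value-of select="."/>
    </blockquote>
  </xsl:template>

  <!-- Element instruction: the same attribute, unprefixed. -->
  <xsl:template match="shape">
    <xsl:element name="{@type}" use-attribute-sets="common-atts">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```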
7.6.4.3 Text nodes
Creating a text node is as simple as typing in character data to the template. However, it may not always come out as you expect.
For example, whitespace is stripped from certain places in the template before processing. And if you want to output a reserved
character such as <, it will be output as the entity reference, not the literal character.
The container element text gives you more control over your character data. It preserves all whitespace literally, and, in my opinion,
it makes templates easier to read. The element has an optional attribute disable-output-escaping, which if set to yes, turns off the
tendency of the XSLT engine to escape reserved characters in the result tree. The following is an example of this.
<xsl:template match="codelisting">
<xsl:text disable-output-escaping="yes">&lt;hey&gt;</xsl:text>
</xsl:template>
This produces the result <hey>.
7.6.4.4 Processing instructions and comments
Creating processing instructions and comments is a simple task. The element processing-instruction takes an attribute name and
some textual content to create a processing instruction:
<xsl:template match="marker">
<xsl:processing-instruction name="formatter">
pagenumber=<xsl:value-of select="@page"/>
</xsl:processing-instruction>
</xsl:template>
This rule creates the following output:
<?formatter pagenumber=1?>
You can create a comment with the element comment, with no attributes:
<xsl:template match="comment">
<xsl:comment>
<xsl:value-of select="."/>
</xsl:comment>
</xsl:template>
To create the processing instruction or content of a comment, you have to specify either plain text or an element such as value-of
that becomes text. Any other kind of specification produces an error.
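One detail worth a sketch: literal text typed into a processing-instruction or comment element carries its surrounding line breaks and indentation into the output. Wrapping the text in a text element, as below, keeps only the characters you intend. The match patterns marker and remark are placeholders.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- For a marker with page="1" this yields exactly
       <?formatter pagenumber=1?> with no stray whitespace,
       because whitespace-only text between instructions is
       stripped from the stylesheet. -->
  <xsl:template match="marker">
    <xsl:processing-instruction name="formatter">
      <xsl:text>pagenumber=</xsl:text>
      <xsl:value-of select="@page"/>
    </xsl:processing-instruction>
  </xsl:template>

  <!-- The same technique for a comment.  The generated text must
       not contain a double hyphen, which is illegal in comments. -->
  <xsl:template match="remark">
    <xsl:comment>
      <xsl:text> remark: </xsl:text>
      <xsl:value-of select="."/>
      <xsl:text> </xsl:text>
    </xsl:comment>
  </xsl:template>

</xsl:stylesheet>
```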
7.6.5 Numeric Text
Although value-of can output any numeric value as a string, it does not offer any special formatting for numbers. You are stuck with
decimals and that's it. For more options, you should move up to the more flexible number instruction. With this element, you can
output numbers as Roman numerals, with zeros prepended, or as letters. It also has a built-in facility for counting nodes.
Returning to the table of contents example, here is how you could create one with number:
<xsl:template match="book">
<xsl:for-each select="chapter">
<xsl:number value="position()" format="I"/>
<xsl:text>. </xsl:text>
<xsl:value-of select="title"/>
<xsl:text>&#10;</xsl:text>
</xsl:for-each>
</xsl:template>
You'll get output like this:
I. Evil King Oystro Sends Assassins
II. Aquaman is Poisoned by Pufferfish
III. Aqualad Delivers the Antidote
IV. Atlantis Votes Aquaman into Office
The attribute value contains the numeric expression or value to be formatted, and the attribute format controls the appearance (in
this case, Roman numerals). The default value for format is the same as value-of: plain decimal.
Table 7-1 shows some ways to use the format attribute.
Table 7-1. Number formats
Format string   Numbering scheme
1               1, 2, 3, 4, ...
0               0, 1, 2, 3, ...
01              01, 02, 03, ..., 09, 10, 11, ...
I               I, II, III, IV, ...
i               i, ii, iii, iv, ...
A               A, B, C, D, ...
a               a, b, c, d, ...
One sticking point arises if you want an alphabetical list starting with the letter i. You cannot simply use format="i" because that
indicates lowercase Roman numerals. To resolve the ambiguity, use an additional attribute, letter-value, set to alphabetic, to force the
format type to be alphabetical.
Very large integers often require separator characters to group the digits. For example, in the United States a comma is used (e.g.,
1,000,000 for a million). In Germany, the comma means decimal point, so you need to be able to specify which scheme you want.
You have two attributes to help you. The first, grouping-separator, sets the character used to delimit groups. The other, grouping-size, determines how many digits to put in a group.
The following would result in the text 1*0000*0000:
<xsl:number
value="100000000"
grouping-separator="*"
grouping-size="4"/>
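For comparison, here is a sketch of the two national conventions mentioned above, using groups of three digits. The text output method and the newline characters are my additions so that each number lands on its own line.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- United States style: 1,000,000 -->
    <xsl:number value="1000000"
                grouping-separator="," grouping-size="3"/>
    <xsl:text>&#10;</xsl:text>

    <!-- German style: 1.000.000 -->
    <xsl:number value="1000000"
                grouping-separator="." grouping-size="3"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```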
An interesting feature of number is its ability to count nodes. The count attribute specifies the kind of node to count. Say you
wanted to print the title of a chapter with a preceding number like this:
<h1>Chapter 3. Bouncing Kittens</h1>
Perhaps you could use this template:
<xsl:template match="chapter/title">
<xsl:text>Chapter </xsl:text>
<xsl:value-of select="count(../preceding-sibling::chapter)+1"/>
<xsl:text>. </xsl:text>
<xsl:value-of select="."/>
</xsl:template>
That will work, but it is a little difficult to read. Instead you can write it like this:
<xsl:template match="chapter/title">
<xsl:text>Chapter </xsl:text>
<xsl:number count="chapter" format="1. "/>
<xsl:value-of select="."/>
</xsl:template>
count looks only at nodes that are siblings. If you want to count nodes that may appear at different levels, you need to add more
information. The attribute level determines where to look for matching nodes. It has three possible values: single, multiple, and any.
If single is selected (the default), the XSLT engine looks for the most recent ancestor that matches the pattern in the count attribute.
Then it counts backward among nodes at the same level. With the value multiple selected, all matching nodes among the
ancestors, and their preceding siblings, may be considered. Finally, if you select any, then all previous nodes matching the pattern
are counted. These options correspond to decreasing order of efficiency in implementation.
Consider:
<xsl:template match="footnote">
<xsl:text>[</xsl:text>
<xsl:number count="footnote" from="chapter" level="any"/>
<xsl:text>]</xsl:text>
</xsl:template>
This rule inserts a bracketed number where the footnote appears. The attribute from="chapter" causes the numbering to begin at
the last chapter start tag. level="any" ensures that all footnotes are counted, regardless of the level at which they appear.
The purpose of level="multiple" is to create multilevel numbers like 1.B.iii. In this example, we use number to generate a multilevel
section label:
<xsl:template match="section/head">
<xsl:number count="section" level="multiple" format="I.A.1."/>
<xsl:apply-templates/>
</xsl:template>
Assuming that sections can be nested three levels deep, you will see section labels like IV.C.4. and XX.A.10.
7.6.6 Sorting
Elements often must be sorted to make them useful. Spreadsheets, catalogs, and surveys are a few examples of documents that
require sorting. Imagine a telephone book sorted by three keys: last name, first name, and town. The document looks like this:
<telephone-book>
...
<entry id="44456">
<surname>Mentary</surname>
<firstname>Rudy</firstname>
<town>Simpleton</town>
<street>123 Bushwack Ln</street>
<phone>555-1234</phone>
</entry>
<entry id="44457">
<surname>Chains</surname>
<firstname>Allison</firstname>
<town>Simpleton</town>
<street>999 Leafy Rd</street>
<phone>555-4321</phone>
</entry>
...
</telephone-book>
By default, the transformation processes each node in the order it appears in the document. So the entry with id="44456" is output
before id="44457". Obviously, that would not be in alphabetical order, so we need to sort the results somehow. It just so happens
that we can do this with an element called sort. Here's how the document element's rule might look:
<xsl:template match="telephone-book">
<xsl:apply-templates>
<xsl:sort select="town"/>
<xsl:sort select="surname"/>
<xsl:sort select="firstname"/>
</xsl:apply-templates>
</xsl:template>
There are three sort keys here, applied in order of priority. First, all the entries are sorted by town. Next, entries that share a
town are sorted by surname. Finally, entries that share both town and surname are sorted by first name.
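The sort element also works inside for-each, and it accepts attributes that control direction and comparison type. The sketch below runs against the telephone book document above. The choice of keys (surname in reverse order, then the numeric id) and the p output element are only for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:template match="telephone-book">
    <xsl:for-each select="entry">
      <!-- sort elements must come first inside for-each -->
      <!-- Primary key: surname, Z to A -->
      <xsl:sort select="surname" order="descending"/>
      <!-- Secondary key: id compared as a number, so that
           9 sorts before 10 (the default comparison is text) -->
      <xsl:sort select="@id" data-type="number"/>
      <p>
        <xsl:value-of select="surname"/>
        <xsl:text>, </xsl:text>
        <xsl:value-of select="firstname"/>
        <xsl:text>: </xsl:text>
        <xsl:value-of select="phone"/>
      </p>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```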
7.6.7 Handling Whitespace
Character data from the source tree is not generally normalized. You can force the XSLT engine to strip whitespace-only text nodes
from selected elements by adding their names to a list in the stylesheet. The element strip-space contains a whitespace-separated list
of element names in its elements attribute. This is a top-level element that should be outside of any template.
There is also a list of elements to preserve space called preserve-space. The reason for having both these elements is that you
can set up a default behavior and then override it with a more specific case. For example:
<xsl:strip-space elements="*"/>
<xsl:preserve-space elements="poem codelisting asciiart"/>
Whitespace-only text nodes will be stripped from all elements except poem, codelisting, and asciiart.
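Keep in mind that strip-space removes only text nodes consisting entirely of whitespace. It does not touch line breaks or extra spaces inside text that contains other characters. For that, the XPath function normalize-space( ) is the usual tool. The sketch below combines the two, with title as a hypothetical element name and h1 and pre as arbitrary output elements.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Drop whitespace-only text nodes everywhere except in poem. -->
  <xsl:strip-space elements="*"/>
  <xsl:preserve-space elements="poem"/>

  <!-- Collapse leading, trailing, and repeated whitespace inside
       the title's own text. -->
  <xsl:template match="title">
    <h1><xsl:value-of select="normalize-space(.)"/></h1>
  </xsl:template>

  <!-- Leave the poem's line breaks and indentation untouched. -->
  <xsl:template match="poem">
    <pre><xsl:value-of select="."/></pre>
  </xsl:template>

</xsl:stylesheet>
```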
7.6.8 Example: A Checkbook
This example demonstrates the concepts discussed so far. First, Example 7-3 is a sample XML document representing a
checkbook.
Example 7-3. Checkbook document
<checkbook>
<deposit type="direct-deposit">
<payor>Bob's Bolts</payor>
<amount>987.32</amount>
<date>21-6-00</date>
<description category="income">Paycheck</description>
</deposit>
<payment type="check" number="980">
<payee>Kimora's Sports Equipment</payee>
<amount>132.77</amount>
<date>23-6-00</date>
<description category="entertainment">kendo equipment</description>
</payment>
<payment type="atm">
<amount>40.00</amount>
<date>24-6-00</date>
<description category="cash">pocket money</description>
</payment>
<payment type="debit">
<payee>Lone Star Cafe</payee>
<amount>36.86</amount>
<date>26-6-00</date>
<description category="food">lunch with Greg</description>
</payment>
<payment type="check" number="981">
<payee>Wild Oats Market</payee>
<amount>47.28</amount>
<date>29-6-00</date>
<description category="food">groceries</description>
</payment>
<payment type="debit">
<payee>Barnes and Noble</payee>
<amount>58.79</amount>
<date>30-6-00</date>
<description category="work">O'Reilly Books</description>
</payment>
<payment type="check" number="982">
<payee>Old Man Ferguson</payee>
<amount>800.00</amount>
<date>31-6-00</date>
<description category="misc">a 3-legged antique credenza that once
belonged to Alfred Hitchcock</description>
</payment>
</checkbook>
Now we will write an XSLT stylesheet to change this type of document into a nicely formatted HTML page. As a further benefit, our
stylesheet will add up the transactions and print a final balance (assuming that the initial balance is zero). The first template sets up
the HTML page's outermost structure:
<xsl:template match="checkbook">
<html>
<head/>
<body>
<!-- page content goes here -->
</body>
</html>
</xsl:template>
Let us add a section that summarizes income activity. The section header, wrapped inside an h3 element, is generated using new
text (with text) not present in the document and the dates from the first and last transactions (using value-of). After the header, all
the income transactions are listed, in the order they appear, with apply-templates. The rule now looks like this:
<xsl:template match="checkbook">
<html>
<head/>
<body>
<!-- income information -->
<h3>
<xsl:text>Income from </xsl:text>
<xsl:value-of select="child::*[1]/date"/>
<xsl:text> until </xsl:text>
<xsl:value-of select="child::*[last()]/date"/>
<xsl:text>:</xsl:text>
</h3>
<xsl:apply-templates select="deposit"/>
</body>
</html>
</xsl:template>
After that, we will add a section to describe the deductions from the checking account. It would be nice to sort this list of
transactions from highest to lowest, so let's use the sort element. The rule is now:
<xsl:template match="checkbook">
<html>
<head/>
<body>
<!-- income information -->
<h3>
<xsl:text>Income from </xsl:text>
<xsl:value-of select="child::*[1]/date"/>
<xsl:text> until </xsl:text>
<xsl:value-of select="child::*[last()]/date"/>
<xsl:text>:</xsl:text>
</h3>
<xsl:apply-templates select="deposit"/>
<!-- payment information -->
<h3>
<xsl:text>Expenditures from </xsl:text>
<xsl:value-of select="child::*[1]/date"/>
<xsl:text> until </xsl:text>
<xsl:value-of select="child::*[last()]/date"/>
<xsl:text>, ranked from highest to lowest:</xsl:text>
</h3>
<xsl:apply-templates select="payment">
<xsl:sort data-type="number" order="descending"
select="amount"/>
</xsl:apply-templates>
</body>
</html>
</xsl:template>
And finally, we'll display the account balance. We'll use value-of with the sum( ) function to total the transactions. Two sum( ) terms are
necessary: one for the payment total and one for the income total. Then we'll subtract the total payment from the total income. To
make it clear whether the user is in debt or not, we'll color-code the calculated result and print a warning if it's negative. Here is the
template:
<xsl:template match="checkbook">
<html>
<head/>
<body>
<!-- income information -->
<h3>
<xsl:text>Income from </xsl:text>
<xsl:value-of select="child::*[1]/date"/>
<xsl:text> until </xsl:text>
<xsl:value-of select="child::*[last()]/date"/>
<xsl:text>:</xsl:text>
</h3>
<xsl:apply-templates select="deposit"/>
<!-- payment information -->
<h3>
<xsl:text>Expenditures from </xsl:text>
<xsl:value-of select="child::*[1]/date"/>
<xsl:text> until </xsl:text>
<xsl:value-of select="child::*[last()]/date"/>
<xsl:text>, ranked from highest to lowest:</xsl:text>
</h3>
<xsl:apply-templates select="payment">
<xsl:sort data-type="number" order="descending"
select="amount"/>
</xsl:apply-templates>
<h3>Balance</h3>
<p>
<xsl:text>Your balance as of </xsl:text>
<xsl:value-of select="child::*[last()]/date"/>
<xsl:text> is </xsl:text>
<tt><b>
<xsl:choose>
<xsl:when test="sum( payment/amount )
> sum( deposit/amount )">
<font color="red">
<xsl:text>$</xsl:text>
<xsl:value-of select="sum( deposit/amount )
- sum( payment/amount )"/>
</font>
</xsl:when>
<xsl:otherwise>
<font color="blue">
<xsl:text>$</xsl:text>
<xsl:value-of select="sum( deposit/amount )
- sum( payment/amount )"/>
</font>
</xsl:otherwise>
</xsl:choose>
</b></tt>
</p>
<xsl:if test="sum( payment/amount )> sum( deposit/amount )">
<p>
<font color="red">
<xsl:text>DANGER! Deposit money quick!</xsl:text>
</font>
</p>
</xsl:if>
</body>
</html>
</xsl:template>
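The balance calculation above repeats the same subtraction and comparison several times. As a sketch of a more compact variant, and not a replacement for the full template, the balance can be computed once into a variable and the color chosen inside an attribute element. The format-number( ) function is standard XSLT 1.0 but is not introduced in this chapter. I use it here to pin the result to two decimal places, since floating-point subtraction can otherwise print extra digits. Only the balance paragraph is produced.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:template match="checkbook">
    <!-- Compute the balance once; select avoids creating a
         result tree fragment and keeps the value a number. -->
    <xsl:variable name="balance"
                  select="sum(deposit/amount) - sum(payment/amount)"/>
    <p>
      <xsl:text>Your balance is </xsl:text>
      <font>
        <!-- Red when overdrawn, blue otherwise. -->
        <xsl:attribute name="color">
          <xsl:choose>
            <xsl:when test="$balance &lt; 0">red</xsl:when>
            <xsl:otherwise>blue</xsl:otherwise>
          </xsl:choose>
        </xsl:attribute>
        <xsl:text>$</xsl:text>
        <!-- Always show exactly two decimal places. -->
        <xsl:value-of select="format-number($balance, '0.00')"/>
      </font>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

With the document in Example 7-3, the deposits total 987.32 and the payments total 1115.70, so the formatted balance is -128.38 and the color is red.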
Now we need some rules to handle the payment and deposit elements. The first, shown below, numbers each payment and
summarizes it nicely in a sentence:
<xsl:template match="payment">
<p>
<xsl:value-of select="position()"/>
<xsl:text>. On </xsl:text>
<xsl:value-of select="date"/>
<xsl:text>, you paid </xsl:text>
<tt><b>
<xsl:text>$</xsl:text>
<xsl:value-of select="amount"/>
</b></tt>
<xsl:text> to </xsl:text>
<i>
<xsl:value-of select="payee"/>
</i>
<xsl:text> for </xsl:text>
<xsl:value-of select="description"/>
<xsl:text>.</xsl:text>
</p>
</xsl:template>
This works well enough for most payment types, but doesn't quite work when type="atm". Notice in the document instance that the
atm payment lacks any description of the payee, since we assume that the checkbook's author is receiving the funds. Let's make a
special rule just for this case:
<xsl:template match="payment[@type='atm']">
<p>
<xsl:value-of select="position()"/>
<xsl:text>. On </xsl:text>
<xsl:value-of select="date"/>
<xsl:text>, you withdrew </xsl:text>
<tt><b>
<xsl:text>$</xsl:text>
<xsl:value-of select="amount"/>
</b></tt>
<xsl:text> from an ATM for </xsl:text>
<xsl:value-of select="description"/>
<xsl:text>.</xsl:text>
</p>
</xsl:template>
Finally, here's the rule for deposit:
<xsl:template match="deposit">
<p>
<xsl:value-of select="position()"/>
<xsl:text>. On </xsl:text>
<xsl:value-of select="date"/>
<xsl:text>, </xsl:text>
<tt><b>
<xsl:text>$</xsl:text>
<xsl:value-of select="amount"/>
</b></tt>
<xsl:text> was deposited into your account by </xsl:text>
<i>
<xsl:value-of select="payor"/>
</i>
<xsl:text>.</xsl:text>
</p>
</xsl:template>
Putting it all together in one stylesheet, we get the listing in Example 7-4.
Example 7-4. Checkbook transformation stylesheet
<?xml version="1.0"?>
<!-- A simple transformation stylesheet to get information out of
a checkbook.
-->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="checkbook">
<html>
<head/>
<body>
<h3>
<xsl:text>Income from </xsl:text>
<xsl:value-of select="child::*[1]/date"/>
<xsl:text> until </xsl:text>
<xsl:value-of select="child::*[last()]/date"/>
<xsl:text>:</xsl:text>
</h3>
<xsl:apply-templates select="deposit"/>
<h3>
<xsl:text>Expenditures from </xsl:text>
<xsl:value-of select="child::*[1]/date"/>
<xsl:text> until </xsl:text>
<xsl:value-of select="child::*[last()]/date"/>
<xsl:text>, ranked from highest to lowest:</xsl:text>
</h3>
<xsl:apply-templates select="payment">
<xsl:sort data-type="number" order="descending" select="amount"/>
</xsl:apply-templates>
<h3>Balance</h3>
<p>
<xsl:text>Your balance as of </xsl:text>
<xsl:value-of select="child::*[last()]/date"/>
<xsl:text> is </xsl:text>
<tt><b>
<xsl:choose>
<xsl:when test="sum( payment/amount )> sum( deposit/amount )">
<font color="red">
<xsl:text>$</xsl:text>
<xsl:value-of select="sum( deposit/amount )
- sum( payment/amount )"/>
</font>
</xsl:when>
<xsl:otherwise>
<font color="blue">
<xsl:text>$</xsl:text>
<xsl:value-of select="sum( deposit/amount )
- sum( payment/amount )"/>
</font>
</xsl:otherwise>
</xsl:choose>
</b></tt>
</p>
<xsl:if test="sum( payment/amount )> sum( deposit/amount )">
<p>
<font color="red">
<xsl:text>DANGER! Deposit money quick!</xsl:text>
</font>
</p>
</xsl:if>
</body>
</html>
</xsl:template>
<xsl:template match="payment[@type='atm']">
<p>
<xsl:value-of select="position()"/>
<xsl:text>. On </xsl:text>
<xsl:value-of select="date"/>
<xsl:text>, you withdrew </xsl:text>
<tt><b>
<xsl:text>$</xsl:text>
<xsl:value-of select="amount"/>
</b></tt>
<xsl:text> from an ATM for </xsl:text>
<xsl:value-of select="description"/>
<xsl:text>.</xsl:text>
</p>
</xsl:template>
<xsl:template match="payment">
<p>
<xsl:value-of select="position()"/>
<xsl:text>. On </xsl:text>
<xsl:value-of select="date"/>