A Gentle Introduction to XSL Transformations (XSLT)


Table of Contents

1 About "xmLP"

2 Overview

3 A Sample XML File

4 Generating HTML Output

5 Running "xmLP"

6 List of Macros

7 List of Files


1 About "xmLP"

This document is a literate program created using the prototype literate programming tool "xmLP". Literate programming tools allow code and documentation to be interspersed in a single file. For "xmLP", the source file is written in XML, and the documentation is created in XML or HTML using the instructions in an XSL transformation file. Code sections look like the following:

Example code definition [1] =
<code, typeset differently from descriptive text>

This macro is NEVER invoked.

Code sections can also be additive, as with the two following examples:

Additive code definition example [2] +=
<first part of the code>
This macro is defined in definitions 2 3.
This macro is NEVER invoked.

Additive code definition example [3] +=
<second part of the code>
This macro is defined in definitions 2 3.
This macro is NEVER invoked.

2 Overview

The XSL transformation language (XSLT) is a language for transforming XML files into XML, HTML, or text, based on patterns. Whenever a pattern is recognised in the source document, the matching transformation is performed. This differs from more traditional procedural languages, where the program itself lays out the order in which actions are performed, one after the other. Other languages built around pattern matching include Prolog and Mathematica.

All of this is a bit vague without a concrete example, so let us start with one.

3 A Sample XML File

For this document, the following sample XML file will be used. It is an RDF (Resource Description Framework) file, which describes the properties (name/value pairs) of some objects. Be assured that no understanding of RDF is required to understand this document. However, it will be assumed that the RDF properties are to be displayed using HTML, which is not an unlikely scenario.

Without further ado, the sample XML (RDF) is as follows:

animals.rdf [1] =
<?xml version="1.0"?>

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:s="http://dummy.org/xmLP-RDF-sample"
>
  <rdf:Description about="http://dummy.org/animal/octopus">
    <s:Name>Octopus</s:Name>
    <s:LegCount>8</s:LegCount>
    <s:Habitat>Sea &amp; Ocean</s:Habitat>
    <s:Carnivore>True</s:Carnivore>
  </rdf:Description>

  <rdf:Description about="http://dummy.org/animal/elephant">
    <s:Name>Elephant</s:Name>
    <s:LegCount>4</s:LegCount>
    <s:Habitat>Africa &amp; Asia</s:Habitat>
    <s:Carnivore>False</s:Carnivore>
  </rdf:Description>
</rdf:RDF>
This is an output file.

The XML (RDF) sample contains the following information:

  1. an octopus has 8 legs, lives in the sea and ocean, and is a carnivore;
  2. an elephant has 4 legs, lives in Africa and Asia, and is not a carnivore.

4 Generating HTML Output

To generate HTML output, the XSLT stylesheet should declare the HTML output method with an "xsl:output" tag.

HTML Output Tag [4] =
<xsl:output method="html" indent="no"/>

This macro is invoked in files 2.
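
The "xsl:output" tag accepts further attributes defined by XSLT 1.0. As a minimal sketch, assuming that indented output and an HTML 4.0 Transitional document type declaration are wanted (neither is used by rdf2html1.xsl), the tag could be written as follows:

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0"
>
  <!-- indent="yes" lets the processor add whitespace for readability;
       doctype-public asks for a DOCTYPE declaration in the output.
       Both settings are assumptions made for this illustration. -->
  <xsl:output
    method="html"
    indent="yes"
    doctype-public="-//W3C//DTD HTML 4.0 Transitional//EN"/>
</xsl:stylesheet>
```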

Assume that the properties of each object (animal) are to be put into a separate HTML table. In the RDF file, each separate object has a separate "rdf:Description" tag, so this can be used as the pattern to generate the HTML "TABLE" tag. A suitable XSLT "xsl:template" pattern is

rdf:Description template [5] =
<xsl:template match="rdf:Description">
  <P>
  <TABLE BORDER="2" WIDTH="60%">
    <xsl:apply-templates/>
  </TABLE>
  </P>
</xsl:template>

This macro is invoked in files 2.

This template (pattern) does the following: when and if an "rdf:Description" tag is encountered, it is replaced by an HTML "TABLE" tag (inside a "P" tag). The XSLT "xsl:apply-templates" tag indicates that pattern matching should be done recursively on anything between "<rdf:Description>" and its matching closing tag "</rdf:Description>". This is very important; if the "xsl:apply-templates" tag were left out of the template, all that would be generated for each object would be an empty "TABLE" tag inside its "P" tag, i.e. a rather boring table with neither rows nor columns.
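
The recursion can also be narrowed. As a minimal sketch, assuming that only the property tags in the "s" namespace should be processed inside each table, the "select" attribute of "xsl:apply-templates" restricts which children are visited. The choice of "s:*" is an illustration and is not part of rdf2html1.xsl.

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:s="http://dummy.org/xmLP-RDF-sample"
  version="1.0"
>
  <xsl:template match="rdf:Description">
    <P>
    <TABLE BORDER="2" WIDTH="60%">
      <!-- Process only child elements in the "s" namespace;
           whitespace text nodes between them are skipped. -->
      <xsl:apply-templates select="s:*"/>
    </TABLE>
    </P>
  </xsl:template>
</xsl:stylesheet>
```

Without a "select" attribute, all children of the current tag are processed, including the text between the property tags.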

As each property is encountered, a table row should be generated for it showing its name and its value. A simple (but not scalable) way to deal with this is to create a template for each property. For the "Name" property, a suitable template (pattern) is

s:Name template [6] =
<xsl:template match="s:Name">
  <TR>
    <TH WIDTH="33%">Name</TH>
    <TD WIDTH="*"><xsl:value-of select="text()"/></TD>
  </TR>
</xsl:template>

This macro is invoked in files 2.

This creates a header cell with the name of the property ("Name"), and a normal cell with the value of the property. The "xsl:value-of" tag is used to get a value from the source document, and "text()" selects the text directly inside the current tag. While the template is executing, the current tag is "s:Name", which is exactly the tag whose text we want.
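
As a small sketch of how "xsl:value-of" behaves, the variant below selects "." (the current tag itself) instead of "text()". For the sample file the two give the same result, because each property holds a single run of text. They would differ only if a property contained nested tags: "." yields the concatenated text of everything inside the tag, whereas "xsl:value-of" with "text()" yields just the first piece of text directly inside it. This variant is an illustration and is not part of rdf2html1.xsl.

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:s="http://dummy.org/xmLP-RDF-sample"
  version="1.0"
>
  <xsl:template match="s:Name">
    <TR>
      <TH WIDTH="33%">Name</TH>
      <!-- "." is the current node; its string value is all the
           text it contains, including text in any nested tags. -->
      <TD WIDTH="*"><xsl:value-of select="."/></TD>
    </TR>
  </xsl:template>
</xsl:stylesheet>
```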

The other property tags are handled similarly.

s:LegCount template [7] =
<xsl:template match="s:LegCount">
  <TR>
    <TH>Leg Count</TH>
    <TD><xsl:value-of select="text()"/></TD>
  </TR>
</xsl:template>

This macro is invoked in files 2.

s:Habitat template [8] =
<xsl:template match="s:Habitat">
  <TR>
    <TH>Habitat</TH>
    <TD><xsl:value-of select="text()"/></TD>
  </TR>
</xsl:template>

This macro is invoked in files 2.

s:Carnivore template [9] =
<xsl:template match="s:Carnivore">
  <TR>
    <TH>Carnivore?</TH>
    <TD><xsl:value-of select="text()"/></TD>
  </TR>
</xsl:template>

This macro is invoked in files 2.
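
The one-template-per-property approach grows with every new property. As a sketch of a more scalable alternative, a single template can match every tag in the "s" namespace and take the row heading from the tag name. This is an assumption about what would be acceptable output: the headings become the raw tag names (for example "LegCount" rather than "Leg Count"), and the template is not part of rdf2html1.xsl.

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:s="http://dummy.org/xmLP-RDF-sample"
  version="1.0"
>
  <!-- Matches any element in the "s" namespace. -->
  <xsl:template match="s:*">
    <TR>
      <!-- local-name() gives the tag name without its prefix,
           e.g. "LegCount" for s:LegCount. -->
      <TH><xsl:value-of select="local-name()"/></TH>
      <TD><xsl:value-of select="text()"/></TD>
    </TR>
  </xsl:template>
</xsl:stylesheet>
```

In XSLT 1.0 a pattern that names a tag, such as "s:Name", has a higher default priority than "s:*", so a specific template can still be kept alongside the generic one for any property that needs special treatment.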

The HTML page will also need the usual "HTML", "HEAD", and "BODY" tags to make it complete, and these can be supplied by a template for the root of the source document, which is represented by a single forward slash ("/").

Document root template [10] =
<xsl:template match="/">
  <HTML>
  <HEAD>
    <TITLE>RDF Properties</TITLE>
  </HEAD>
  <BODY>
    <xsl:apply-templates/>
  </BODY>
  </HTML>
</xsl:template>

This macro is invoked in files 2.

Note that an XSLT "xsl:apply-templates" tag is required to make sure that the other templates are called as their patterns are encountered.
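
The stylesheet has no template for the "rdf:RDF" tag, yet the "rdf:Description" tags inside it are still reached. This works because XSLT 1.0 defines built-in template rules that apply whenever no other template matches. They behave roughly as if the stylesheet contained the following (shown for illustration only; they never need to be written out):

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0"
>
  <!-- Elements and the root: keep recursing into the children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy their text to the output. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

The second rule is also why whitespace between tags in the source document can show up in the output.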

Finally, the XSLT stylesheet is assembled as follows:

rdf2html1.xsl [2] =
<?xml version="1.0"?>

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:lp="http://www.litprog.org/xmLP/alpha-2000-04-09/"
  xmlns="http://www.w3.org/TR/REC-html40"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:s="http://dummy.org/xmLP-RDF-sample"
  version="1.0"
>

HTML Output Tag

Document root template

rdf:Description template

s:Name template

s:LegCount template

s:Habitat template

s:Carnivore template

</xsl:stylesheet>
This is an output file.

5 Running "xmLP"

To run "xmLP", you need an XSLT engine. In particular, "xmLP" has only been tested with the Java version of Apache's "Xalan" XSLT engine (formerly "LotusXSL"). To run Xalan, your Java CLASSPATH should contain "xalan.jar" and "xerces.jar". Xalan is then invoked using

Xalan [11] M=
java org.apache.xalan.xslt.Process

This macro is invoked in macros 12 13 14 15.

and so the full command line is

Xalan command line [12] =
Xalan -IN <xml-source> -XSL <xsl-stylesheet> -OUT <output-file>

This macro is NEVER invoked.

To generate this documentation, the command is

Documentation command line [13] =
Xalan -IN xmlp-xslt.xml -XSL xmLPweave.xsl -OUT xmlp-xslt.html

This macro is NEVER invoked.

To generate the RDF file and the stylesheet which generates a matching HTML page, the command is

Product files command line [14] =
Xalan -IN xmlp-xslt.xml -XSL xmLPtangle.xsl

This macro is NEVER invoked.

To run the product XSLT stylesheet on the RDF file and produce the matching HTML page, the command is

RDF sample command line [15] =
Xalan -IN animals.rdf -XSL rdf2html1.xsl -OUT animals.html

This macro is NEVER invoked.

That, then, is a very quick and potted introduction to how XSLT stylesheets are constructed. There are many powerful features that have not been mentioned here, so keep reading about XSLT and you will discover what a useful tool it is.


6 List of Macros

  1. Example code definition [never invoked]
  2. Additive code definition example [#1/2] [never invoked]
  3. Additive code definition example [#2/2] [never invoked]
  4. HTML Output Tag
  5. rdf:Description template
  6. s:Name template
  7. s:LegCount template
  8. s:Habitat template
  9. s:Carnivore template
  10. Document root template
  11. Xalan
  12. Xalan command line [never invoked]
  13. Documentation command line [never invoked]
  14. Product files command line [never invoked]
  15. RDF sample command line [never invoked]

7 List of Files

  1. animals.rdf
  2. rdf2html1.xsl