Extensible Stylesheet Language for Transformations and XPath: Manipulation and Matching
Now that you have a basic understanding of XML's fundamental
rules, you are ready to begin working with XSL Transformations (XSLT). This is
where the power and versatility of XML really become apparent. In fact, XSLT's
syntax is the same as that of XML. In addition, if you are familiar with how
files on a computer are stored in a hierarchy of folders within other folders
on a hard drive, keep this in mind, because XPath, as used in XSLT, reaches the
nested tags of an XML document (cf. rule #1 above) in the same way.
Fundamentally, consider XSLT as you would vidhaana (vi +
-dhaa: to distribute, to put variously, to sort), both literally and
figuratively. Vidhaana is to suutra as XSLT is to an XML document; more
specifically, for the references in this article, the comparison holds insofar
as vidhaana supplies the injunctions for mantras to be used in a ritual.
Another term for the manipulation of mantra, addressed specifically below, also
applies to one possible use of XSLT: viharaNam (referring to the intertwining
of portions of a mantra). I will return to this below, and the conceptual
applicability of the analogy to mantra becomes clearer in the third section.
In what follows, however, I begin with very rudimentary
examples so that the basic interconnection of syntax and function between XML
and XSLT is clear. You will not be able to benefit from the more
research-savvy examples in the third section without careful attention
to this one.
The following section bases a sequence of examples upon the
mantras in the XML RV included with the sample files. These examples are
designed to show how XSLT code looks in its most immediately useful function.
It's very likely that there are other ways to accomplish some of what we do
below. However, these steps build basic understanding for using XSLT in more
powerful ways. It is essential to follow this discussion incrementally so that
the actual application to mantra research in the final section of this paper
will be clear.
In case it hasn't already occurred to you, just because you
have this swell tagged text in XML with all your favorite categories and tags,
there still remains the matter of doing something with it. Enter XSLT.
Among other things, XSLT is written according to XML rules. This removes some
of the ambiguity and idiom that make the syntax of other mainstream
programming languages tedious and difficult. An XSLT command has a
beginning and an end in a symmetric order, just like XML's matching opening and
closing tags. Otherwise, it uses empty tags such as those we saw
above.
XSLT works based on a simple notion of matching. In other
words, you tell it to go and find such-and-such a tag and, once it does, to do
something to it, or to do something to what it contains. A simple example
assumes you have a copy of the Rig Veda in HTML and you want to make the tag
that marks each individual line ("<li>") more specific, so that it says
"mantra" as the tag name instead (note 15):
<xsl:template match="li">
  <mantra>
    <xsl:apply-templates />
  </mantra>
</xsl:template>
In short, this finds anything that has the tag or element name
of "li" and changes it to "mantra." In the example above, the basic components
of XSLT are all present. A simple command, in this case "xsl:template"
(note 16), says to make this model or template happen according to the pattern
specified by the attribute value given for "match"; in this case, all elements
named "li" are to be templated.
Here the XML syntax can help us understand what's happening.
Everything between <xsl:template match="li"> and </xsl:template> is
what is supposed to happen, or is to be done, when the match of "li" is found.
It's easy to tell the difference between the XSLT command tags and the actual
XML tags for the RV that we're working on, because all XSLT command tags begin
with "xsl:".
Incidentally, a plain name in the "match" attribute value (or in most any
attribute value in XSLT that holds an XPath expression) is always assumed to be
an element or tag name unless otherwise indicated (see examples with mantras
below). So, when the element/tag "li" is found, the following is done: <li>
is changed to <mantra> and </li> becomes </mantra>.
Simple enough, but what about that little
<xsl:apply-templates /> esoterica? I confess that this one often
confounds me. When I said that XSLT is a simple language that's easy to read
compared to most programming languages, I had this in mind as a caveat.
Frankly, I don't get it, but in this case the <xsl:apply-templates />
basically says "take everything that was originally wrapped with <li>
(technically called the "children" of li), run it through whatever templates
apply to it, and pipe the result on through to the resulting file, wrapped in
<mantra>." Take it as a nitya puja: it is simply enjoined to be done. If I
were to offer a justification beyond this, just realize that the text and any
elements or tags contained within whatever is matched are discarded if the
apply-templates instruction is not added.
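To make the role of <xsl:apply-templates /> concrete, here is a minimal sketch
of a complete stylesheet built around the template above. The xsl:stylesheet
wrapper and the first, catch-all template (commonly called an identity
template) are assumptions added for illustration and are not taken from the
sample files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed catch-all rule: copy any node or attribute that no more
       specific template matches, then go on to process its contents. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- The rule from the text: rename li to mantra and keep processing
       whatever the li contained (its children). -->
  <xsl:template match="li">
    <mantra>
      <xsl:apply-templates/>
    </mantra>
  </xsl:template>

</xsl:stylesheet>
```

With this sketch, a made-up input such as <ol><li>some <b>tagged</b>
text</li></ol> should come out as <ol><mantra>some <b>tagged</b>
text</mantra></ol>. If the <xsl:apply-templates /> line were deleted from the
second template, each li would instead become an empty <mantra/>.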
Granted, in this example it might seem simpler to just go in
and edit by hand with, say, search-and-replace in any word processor. This is a
reasonable supposition. However, suppose you want to make these changes to
occurrences of "li" scattered throughout a large file of, say, a megabyte in
size: doing so by hand takes time. If you have to change more things, it
becomes even more difficult.
Suppose you decided that you wanted to identify the meter of
each mantra as well. Now you're not just changing a tag, but adding an
attribute.
To add the attribute for meter is fairly simple. This
particular process is very handy because you don't have to change element/tag
names in order to add additional information for your research. Also, since
most systems for visual display of XML work with tag names, you don't risk
having your research tagging change the display if you're only adding
attributes. To begin, we will only be adding the attribute name "meter" and a
generic value of "vedic" as an example of the XSLT attribute function (this is
from the add_attr.xsl file in the sample set):
<xsl:template match="mantra">
  <mantra>
    <xsl:attribute name="meter">
      <xsl:text>vedic</xsl:text>
    </xsl:attribute>
    <xsl:apply-templates />
  </mantra>
</xsl:template>
The XML syntax will again help us interpret what's going on in
this script. The <xsl:template> tags specify what to do when a match of
the "mantra" tag is found. In this case, the <xsl:attribute> command is
to be applied when the match of "mantra" is found. When this happens, an
attribute named "meter" is to be inserted into each <mantra> tag.
The value for this newly-created "meter" attribute has been
set as "vedic" to keep this introductory example uncluttered. Note how the
<xsl:attribute> tags are wrapped around the identification of "vedic" as
the text value for the meter attribute, according to basic XML rules, in order
to effect this creation (note 17).
Again you see the same <xsl:apply-templates /> in this
example as we saw above. It is included to be sure that everything contained
in each matched mantra gets passed on through. We wrap
this little <xsl:attribute> set in <mantra> tags because, of course,
that is the element with which we are working. It's the same element name
created in the first XSLT example by changing from the <li>, and we
naturally want to keep it.
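One detail is worth spelling out, as a sketch rather than as a description of
add_attr.xsl itself: because the template writes a fresh <mantra> tag, any
attributes the original <mantra> already carried (an id, for example) are not
carried over. Assuming you want to keep them, one common XSLT 1.0 idiom is to
copy the element and its existing attributes first and then add the new one.
The catch-all template is again an assumption added for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed catch-all rule so the rest of the file passes through as is -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="mantra">
    <!-- xsl:copy reproduces the mantra tag itself, without its
         attributes or contents -->
    <xsl:copy>
      <!-- carry over whatever attributes the mantra already has -->
      <xsl:copy-of select="@*"/>
      <!-- then add the new attribute, as in the example above -->
      <xsl:attribute name="meter">vedic</xsl:attribute>
      <!-- and finally pass the contents of the mantra on through -->
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Under these assumptions, a tag such as <mantra id="rv1.16.1a"> should come out
carrying both its original id attribute and the new meter attribute.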
Now, assume you know that all the mantras of a particular
hymn are gaayatrii. Rather than just insert a generic meter attribute for every
mantra with a meaningless "vedic" value, you want to identify a particular
hymn's meter as gaayatrii.
You know that your Rig Veda has the mandalas tagged
(according to common Indological convention in the sample file) as <mandala
id="rv3">, such as for the third book of the RV. You know the hymns are
<hymn id="rv3.62"> (in other words, the second-level nested division is of
the type "hymn" and this particular one is number rv3.62). You know that the
verses are tagged as <verse id="rv3.62.10"> and that each mantra is now
tagged as <mantra>.
So, we want to find the specific <hymn> that is
identified as rv3.62. In XPath, the matching language that XSLT uses,
"attribute" is abbreviated by the "@" sign.
Following the "@" comes the attribute's name. If you want to add something to
this particular hymn's attributes, such as an attribute and value for meter, you
can easily do this by specifying the "id" attribute's value (this is the
attr_val_change.xsl file, and you can select different hymns):
<xsl:template match="hymn[@id='rv3.62']">
  <hymn>
    <xsl:attribute name="meter">
      <xsl:text>gaayatrii</xsl:text>
    </xsl:attribute>
    <xsl:apply-templates select="*|@*" />
  </hymn>
</xsl:template>
This script is very much the same as what we did above, except
that we are saying "match that particular hymn which has an attribute called
'id,' where the value of that id attribute is exactly rv3.62." The reference to
"hymn" specifies that element name; the [] indicate a condition that a given
<hymn> tag must satisfy, namely whatever is inside the []'s. In this case, it
says, "for a hymn that has an attribute (@) by the name of 'id' with a value
equal to 'rv3.62,' consider this a match and do whatever follows." You've just
chosen RV 3.62 to act upon.
What comes next again follows from XML rules. As above, we
want to keep our hymn tags, so we wrap them around the particular command we're
doing, which is to use <xsl:attribute> as before to add an attribute named
meter. This time, because we've chosen specifically RV 3.62, only that hymn
will have an attribute with the value "gaayatrii." Remember, in the example
above, we had set an attribute named "meter" to the generic value of "vedic" for
every single mantra in the RV. In fact, of course, only a few parts of RV 3.62
are gaayatrii, and it is possible to change this selectively for one or several
hymns with XSLT, but such an example would take us beyond the scope of this
introductory article.
However, to do this for one specific verse, use the same file,
but change "hymn" to "verse" in three places and change the value of the id
attribute selected to whatever you want, say rv1.2.3:
<xsl:template match="verse[@id='rv1.2.3']">
  <verse>
    <xsl:attribute name="meter">
      <xsl:text>gaayatrii</xsl:text>
    </xsl:attribute>
    <xsl:apply-templates select="*|@*" />
  </verse>
</xsl:template>
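The select="*|@*" in the hymn and verse templates above asks XSLT to go on and
process the attributes as well as the child elements of whatever was matched.
Left to its built-in rules, XSLT writes an attribute out as plain text rather
than as an attribute, so these templates presumably rely on a copying rule
elsewhere in the stylesheet. The sketch below shows one way the whole file
could look; the copying (identity) template is an assumption for illustration,
not a quotation from attr_val_change.xsl.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed copying rule: reproduces every attribute and node as is -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- More specific than the rule above, so it wins for this one hymn -->
  <xsl:template match="hymn[@id='rv3.62']">
    <hymn>
      <xsl:attribute name="meter">
        <xsl:text>gaayatrii</xsl:text>
      </xsl:attribute>
      <!-- the existing attributes (here, id) are processed first,
           then the child elements -->
      <xsl:apply-templates select="*|@*"/>
    </hymn>
  </xsl:template>

</xsl:stylesheet>
```

Run against a file containing <hymn id="rv3.62">, this should give that hymn
both a meter and an id attribute, while every other hymn passes through
untouched.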
These examples have shown you how to add and change markup in
an XML text. Of course, for scholarship, it is even more important to extract
data. Remember that XSLT is good for adding markup to a structured text like
the RV which you already know well. It is even more powerful (see the final
section of this paper below) for extracting information based on that structure
or based on tags you've added according to your research emphasis. One of the
best ways to think of XSLT is as a programming language for non-programmers.
James Clark, one of the greatest wizards of markup technology and lead
developer and creator of related software (most of which is generously free),
notes that "XSLT makes XML useful for non-programmers"
(http://www.jclark.com/xml/xslt-talk.htm).
XSL Transformations and the Study of Mantra
For the following section, a set of texts and online tools is
available for you to replicate the examples and begin making your own
stylesheets by making small changes to the working scripts (note 18). In this
section the presentation strikes more of a balance between actual discussion of
the Vedic mantra sources from which the examples are drawn and the technology
applied to them. The unmistakable syntactic similarities between X-nology and
the functional role of mantra in Tantra suggested above are revisited to provide
a more resonant context in which to move from rote learning of the technology to
the understanding of it in context.
The first point about which the reader may have been wondering
concerns the name of XSLT itself: Extensible Stylesheet Language for
Transformations. The use of the term "transformation" is specifically ironic in
the context of mantra and its role in effecting ritual transformations. I have
mentioned vidhaana above as a word signifying distribution or variously
apportioning. Its formal role of designating the mantras to be employed in a
ritual (e.g., F. Smith--vidhi-1987:23-24, 27, etc.; Gonda, 1980:4, 213f.; Bhat,
1998; Staal, 1989:48f.; etc.) is an optimal analogue of XSLT's functional role
with XML. Naturally, this is a different sense of transformation than is
commonly associated with the role of mantra (cf. Wheelock, 1989:101f.; Gonda,
1980:345; Beyer, 1978; Santidev, 1999- Vol. 2:113).
Along with several discussions of vidhi (Smith, Gonda, Bhat,
etc.), there is an interesting discussion of a mantra's actual "transformation,"
though quite differently conceived, in both Staal's and Alper's offerings in
the Mantra collection. Staal posits the transformation chronologically, from
mantras to more discursive language. The simple rules of ritual which govern
the mantra originating with it become complex as language develops from it
(1989:71-72). This is, of course, an inverse of the relationship suggested here
for the purpose of understanding XSLT. However, it is worth noting that the
relative simplicity of an XSLT script (regardless of the immediate clarity or
absence of meaning) definitely becomes complex and possibly unnatural in the
requisite explanations, not unlike what Staal suggests!
The actual instance or occurrence of a mantra follows an
inner organizational principle of syntax that is independent of its meaning in
many cases (Alper, 1989: 262, 301). This points to an important qualification
in the analogy I am drawing with XSLT, vidhi, texts, and mantra. Neither XML
nor XSLT should be inferred to "know" what it marks or identifies. They are
rules of syntax which afford the construction or projection of
meaning--"effectuation" in a sense (cf. Beyer, 1978:243)--at the silicon
processor level which transmits to the analogue or human realm independent of
the meaning marked by the XML tags and manipulated by the XSLT
syntax.
Before allowing the temptation to draw these inferences
and accompanying metaphors to carry us beyond the scope of this article, a
return to the primary focus upon deploying the technology will illustrate the
relevant connection. So, contrary to the commonly used sense of "transformation"
being effected _by_ mantra, as regards the use of XSLT we will use it more
closely in the sense of vidhi, to designate the transformation performed on a
text to extract mantra (or any other selected portion of a text) for study. This use of
XSLT can facilitate the suggestion by Alper, for instance, for "using modern
methods to gather together a large number of mantras" which could be drawn from
an "inventory of rites where mantras occur" (1989: 311-312).
To be sure, such an undertaking assumes the existence of no
small collection of texts in electronic form (some of the most systematic of
which are found at TITUS; note 19). Such resources would also have to be in
XML, of course, to enable not only the collection but also the systematic
cross-referencing and multiple avenues of study enabled with XSLT tools. As a
starting model, we can begin with the RV supplied for this article. Many of the
mantras Alper has in view have been employed throughout history in various
forms for various ritual purposes, both Vedic and Tantric. Works such as Bhat's edition of
Rigvidhaana provide ample lists of specific mantras to be extracted from the RV
according to ritual applications (1998). In fact, one could arguably
"translate" the Rigvidhaana into XSLT insofar as its identifications of various
mantras go.
We've already seen above with the selection of RV 3.62 for
adding an attribute how easy it is to select a specific mantra from an XML
edition based upon your knowledge of how it is structured (e.g., id values of
"rv1.1.1," etc., per verse and so forth). Often, however, the mantras are not
only extracted, but also re-formed. In order to examine both operations, we
will move beyond the simple listing of mantras to their reworking in the
following examples.
Returning to Staal, consider his discussion of indra
juSasva and the Soma ritual (1983- Vol. 1:661f., also 1989). The two
mantras in this example are RV 1.16.1 (accents removed per Staal's purposes and
those of this example):
aa tvaa vahantu harayo vRSaNaM somapiitaye |
indra tvaa suuracakSasaH ||
and RV 1.84.10:
svaador itthaa viSuuvato madhvaH pibanti gauryaH |
yaa indreNa sayaavariir vRSNaa madanti sobhase vasviir anu svaraajyam ||
Here they are in XML form, building on the files you are
provided (use the "viharanam.xml" file for trying out this
example; note 20):
<verse meter="gaayatrii" id="rv1.16.1">
  <mantra id="rv1.16.1a">
    aa tvaa vahantu harayo
  </mantra>
  <mantra id="rv1.16.1b">
    vRSaNaM somapiitaye
  </mantra>
  <mantra id="rv1.16.1c">
    indra tvaa suuracakSasaH
  </mantra>
</verse>
and the second mantra:
<verse meter="gaayatrii" id="rv1.84.10">
  <mantra id="rv1.84.10a">
    svaador itthaa viSuuvato
  </mantra>
  <mantra id="rv1.84.10b">
    madhvaH pibanti gauryaH
  </mantra>
  <mantra id="rv1.84.10c">
    yaa indreNa sayaavariir
  </mantra>
  <mantra id="rv1.84.10d">
    vRSNaa madanti sobhase
  </mantra>
  <mantra id="rv1.84.10e">
    vasviir anu svaraajyam
  </mantra>
</verse>
Staal discusses the use of viharaNam (intertwining or
transposition) with these mantras as part of his longer argument about the
derivation of language from the pre-linguistic residual of ritual as something
represented, or preserved, in the Vedic mantras (1989:52ff.). In
fact, the intertwining of these two mantras includes additional verses from each
hymn, undergoing viharaNam in turn, as well as similar procedures with several
other hymns (1983- Vol. 1:661f.).
In this example, I'm addressing only the mechanics of the
syntax as an applicable instance of extraction and reconstruction of mantra
facilitated by XSLT. Working with the files provided, the matter of actually
extracting the mantras and then reconstructing them can be done in one XSLT
script, as the specification enjoins that the instructions within a template
are to be enacted by the software in the order of their composition.
I will divide this into two parts for clarity. The first
involves the actual extraction of the mantras to be intertwined, a very simple
operation, much like the one in which we applied a value of "gaayatrii" for the
meter attribute of a hymn tag above (this is the "select_mantra.xsl" file and can
be used with the larger RV file, or your own files once you tag them in
XML):
<xsl:template match="hymn">
  <xsl:copy-of select="*[@id='rv1.16.1']" />
  <xsl:copy-of select="*[@id='rv1.84.10']" />
</xsl:template>
As you can see, if you need to pull a whole range of mantras
from a text, this can be repeated over and over. The script is run once, and
you get a file with only the items you want to work with. Obviously, if you
were only getting one mantra it would be as easy to retype it or cut and paste
it as write all this code, but this short example is for didactic rather than
pragmatic illustration.
A quick review shows that the XML nesting syntax guides us in
understanding what happens. We've called for a template match on all hymn tags
to begin. Then, within those hymn tags, we're asking for a "copy-of" all
child elements with the "*." However, we're adding that we only want those
elements or tags within a hymn which have an attribute called id, and that id
attribute must have one of the two values for the verses we've chosen.
Remember, attributes are parts of, or subsets of, the tag or element in which
they are declared. The "id" attribute is identified with the "@" sign, and the
particular values of rv1.16.1 and rv1.84.10 are the specific selects. We want
the whole verse, and "copy-of" assures exactly that: everything inside each
selected verse, tags and attributes alike, will be provided to us in the
output. No xsl:apply-templates is needed here, because copy-of already copies
each verse together with all of its contents.
At this point, in a live tutorial, I would warn the
participants not to guess what is done next, because we're going to use XML
syntax to perform the viharaNam technique with XSLT, but at a more complex
level than you might expect (use the "viharanam.xml" and the
"viharanam_transform.xsl" files to try out this example):
<xsl:template match="sample">
  <xsl:for-each select="verse/mantra">
    <xsl:sort select="substring-after(@id,../@id)"/>
    <xsl:sort select="../@id" order="ascending"/>
    <xsl:copy-of select="."/>
  </xsl:for-each>
</xsl:template>
I included this script to show how sophisticated XSLT can be,
not to show how easy every function is. I will summarize this briefly. Some
things are clear: we're matching the main element or tag of the sample file
(called "sample" in viharanam.xml), and asking that, for each mantra within a
verse, a sorting be done. The select is as above with the other occasions of
"select," but we're using a string function, "substring-after," which asks for
whatever part of the mantra's own "id" attribute comes after the "id" attribute
of its verse (the verse is reached by requesting the mantra's parent with "..";
see Appendix 2 on XPath below). For rv1.16.1a this leaves just "a," so the
first sort groups all the "a" mantras together, then the "b" mantras, and so
on. The second "sort" orders the mantras within each of those groups by the id
of their verse, counting up (you can also supply "descending," which
makes the rv1.84.10 mantras come before the rv1.16.1 mantras; try it with
the sample files). Finally, the "copy-of" just says to output each mantra in
the resulting order.
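To make the effect of the two sorts easier to follow, here is a sketch of the
same template placed in a complete stylesheet. The xsl:stylesheet wrapper, the
xsl:output line, and the <intertwined> wrapper tag are assumptions added for
illustration; only the for-each with its two sorts comes from the script
above. It also assumes that the two verses shown earlier sit directly inside
the "sample" tag.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="sample">
    <!-- "intertwined" is a hypothetical wrapper tag for the result -->
    <intertwined>
      <xsl:for-each select="verse/mantra">
        <!-- first key: what is left of the mantra id once the verse id
             is removed, i.e. the letter a, b, c, d or e -->
        <xsl:sort select="substring-after(@id, ../@id)"/>
        <!-- second key: the id of the verse, so that within each letter
             rv1.16.1 comes before rv1.84.10 -->
        <xsl:sort select="../@id" order="ascending"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </intertwined>
  </xsl:template>

</xsl:stylesheet>
```

With the two verses given above as input, the mantras should come out in the
order rv1.16.1a, rv1.84.10a, rv1.16.1b, rv1.84.10b, rv1.16.1c, rv1.84.10c,
rv1.84.10d, rv1.84.10e. This shows only the mechanics of the interleaving; it
makes no claim to reproduce the exact pattern of the ritual viharaNam.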
From this example, consider also how, with your own notes
gathered over time and converted with easily available software to XML, you can
literally "call forth" sections and paragraphs of your work, in order, to rough
out a draft of a report or paper, or even a book, by "remote control" as
it were with XSLT. Or, suppose you wanted to derive a different redaction of a
given e-text from one you have. While some parts obviously would have additions
and omissions, frequently reordering is one of the most significant first steps
before beginning the more micro-level edits (e.g., as with the ZBM and
ZBK).
But XML and XSLT can obviously do much more; otherwise the
world's biggest corporations and consulting firms wouldn't be embracing and
supporting them wholesale. One of the most frequent needs is to find all portions
of some level of a text which contain one thing or another. It is important to
emphasize that, for the RV, this requires that you get the TITUS CD which has the
excellent electronic edition of Lubotsky's painstaking per-word padapaatha of
the RV (it can be converted to XML). Many other electronic texts are based upon
primary sources where the words are not elided as in the RV. It is up to the
scholar to identify the resources s/he needs. In addition, I encourage those
who wish to go further in this work to participate in one of the two mailing
lists recommended in the appendix which describes the acquisition and
installation of these tools.
I will only provide one further example to whet your appetite
as to what is possible. In addition, too much more might surpass the critical
mass tolerance of the reader for new techniques in the learning curve—and
length—of this article! In order to proceed, I am going to scratch the
surface of one of the really powerful operators in XSLT for selecting based on
text.
Now, let's select all verses of the RV which contain the
word "I;Dyo" (note: the ";" is the udaatta accent mark; you may want to delete
all of these from the sample RV). We're going to use the matches you've learned
above, along with another example of how intuitive the XSLT language is, the
"contains" function, and the "xsl:copy-of" operator (note 21; this script is
"search.xsl" in your sample files):
<xsl:template match="//verse//mantra">
  <xsl:if test="contains(. , 'I;Dyo')">
    <xsl:copy-of select='ancestor::verse'/>
  </xsl:if>
</xsl:template>
Yes, it's really that simple. All you need to do is change
the text value for the word you're seeking. This command says, match all tags
called "mantra" (remember, XSLT assumes you are asking for a tag or element
name in the "match" statement unless you use "@" to say attribute) that are
descended from a "verse," and then test whether each one contains the word
"I;Dyo." You will notice the (., 'I;Dyo') syntax says a little more.
This is kind of like the nitya puja again: the specification requires two
arguments for "contains." The first, ".", means "itself," that is, the text of
the mantra that has just been matched; the second is the word sought, which is
"I;Dyo."
Finally, following the XML rule of nesting, the xsl:template and xsl:if
commands are wrapped around a single "empty" (no separate closing tag to wrap
around something) command called "xsl:copy-of". The thing to be copied, of
course, is going to be the "ancestor" verse of what is found in the match,
that is, of the mantra "itself" (or ".") in which contains has found the
chosen word. Naturally, you might want the whole verse, not just
the mantra, where "I;Dyo" is found.
In the match, you have the "//" which simply says "any number of
elements may be nested between this point and the element named next" (at the
very beginning of the match phrase, "this point" is the top of the document).
So, here, you have the statement "wherever in the RV you find a
verse ("//verse"), find any mantra nested somewhere within it ("//mantra")."
This is an XPath statement. To learn more about XPath, I've added some
preliminary notes in the second appendix. It is the key to really getting the
most out of XSLT commands, as it enables very precise identification of the
various nested levels of your XML document. In effect, it turns an XML file
into a directory tree of files just like on your hard drive. Each element or
tag is treated as a directory.
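As a further illustration of how precisely XPath can point into the nesting,
here is a sketch of an alternative way to run the same word search. Instead of
matching each mantra and climbing back up to its verse, it matches the verse
directly and puts the whole condition in [] brackets, just as the id test was
written earlier. The <results> wrapper tag and the two helper templates are
assumptions added for illustration and are not part of search.xsl; the sketch
also assumes that each mantra sits directly inside its verse, as in the sample
shown above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- "results" is a hypothetical wrapper so the output has a single
       top-level tag -->
  <xsl:template match="/">
    <results>
      <xsl:apply-templates/>
    </results>
  </xsl:template>

  <!-- Match a verse if any mantra directly inside it contains the word,
       and copy that verse whole -->
  <xsl:template match="verse[mantra[contains(., 'I;Dyo')]]">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Keep the text of all non-matching parts of the file out of the
       output -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

One consequence of matching at the level of the verse is that a verse is
copied only once, even if the word occurs in more than one of its mantras.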
Conclusion
This exploration provides a beginning point for working with
tools that are not only readily available but also low in cost, both in time to
learn and in money. Most any Pentium or PowerMac will run them. On a Pentium,
you'll need Internet Explorer 5, which is freely available. The other tools are
explained briefly in the appendix.
For your research, it is important to take these examples and
practice with the RV in order to begin seeing what you will be able to do for
your work. Using the first few XSLT examples, you can add your own tags and
categories of inquiry to the RV, such as temporal periods of composition,
families from which the hymns come (cf. Van Nooten and Holland, 1994), or other
criteria. If you have notes you've written which list the verses, just revise
one of the included XSLT scripts to automatically add all the tags for you, then
begin doing word searches based on the categories you've identified (if you have
questions, send them to the lists recommended in the appendix; chances are,
others are trying to do the same things and have the same questions as you!).
As for the study of mantra, the reader will recognize that it
provided a ready analogy for the relationship of two otherwise abstract
specifications of code to a document. I am developing the inquiry based on this
analogy for a separate project and welcome feedback on it. For the purposes
here, though, this paper has provided the analogy not only for didactic purposes
with the technology, but also as a different point of inquiry into and analysis
of the subtler levels of what Beyer calls effectuation with magic and mantra.
We're using XSLT as the simulacrum with XML, however, analogous to using
mantra.
In closing, I would suggest that electronic text technology,
or e-textnology, is only now beginning to "catch up" to the range of
sophisticated tools which took form long ago with smRti and vedaaGga. Most
famous of these, of course, are the origins of propositional logic (upon which
most basic machine language has been based), frequently noted by the
Artificial Intelligence community as a whole (cf. note 6). In one noteworthy
exchange in the hallowed proceedings of the ACM, none other than Donald E.
Knuth, creator of the TeX typesetting system for the technical sciences and
author of books on software engineering which have shaped the entire industry,
prompted a reference to PaaNini when putting forward the now-standard name for
the BNF or Backus Naur Form of notation for propositional logic (Ingermann,
1967, cf. recent Indology discussion; note 22). The intricate system by which
the suutras and zastras have been preserved in memory, referenced, commented
upon, and extracted for ritual purposes throughout the history of South Asia is
only now finding widely available tools which can begin to access them with a
breadth and depth compatible with the powers woven warp and woof throughout
their semantics, syntax, phonetics, and potency of evocation.