Frequently Asked Questions about Cost

Building Cost

Q1. With which versions of Tcl/Tk/ITCL does Cost work?

A. Cost 2.0 requires either Tcl 7.4 or [incr Tcl] 2.0. ([incr Tcl] 2 includes its own copy of Tcl, so you need one or the other, but not both.)

NOTE: 2 Feb 1996. Cost 2.0a3 does not work with the latest Tcl Beta release (7.5b1).

Cost is also known to work with a number of Tcl extensions, including Tk 4.0, BLT 1.8, Tix 4.0, and TclX (version unknown).

J.M. Ivler maintains a Tcl extension compatibility matrix at
URL:http://www.crl.com/~ivler/tcltab.html
which lists several extensions and which Tcl/Tk versions they are compatible with.

Q2. On what platforms does Cost run?

A. So far (that I know of for sure):

It should work on any operating system to which Tcl 7.4 has been ported.

Some of the translation scripts require POSIX features like command pipelines, but other than that I believe Cost to be reasonably portable.

Q3. Does Cost require X11/Tk/incr Tcl/Unix?

A. No, Tcl and Cost do not require X, Tk or [incr Tcl]. So far Cost has only been built on Unix systems (that I am aware of), but a Windows port is planned eventually.

A. The long answer:

Tcl is a general-purpose scripting language, and runs on most Unix platforms. Tk is a GUI toolkit built on top of Tcl and X. Cost only uses Tcl 7.4; you can use Cost and Tk together if you like, but Tk (and X) are not mandatory.

But wait, there's more: The latest beta versions of Tcl (7.5b2) and Tk (4.1b2) have been ported to Windows and the Mac. However, the current version of Cost has not (yet) been ported to Tcl 7.5, so Cost still requires Unix.

But wait, there's still more: [incr Tcl] is an object-oriented extension to Tcl. itcl version 2 is based on Tcl 7.4 / Tk 4.0, and does not (as far as I know) run on PCs yet. Cost will work with [incr Tcl] 2.0 too. [incr Tcl] is not required -- you can use vanilla Tcl 7.4 instead -- but some of the sample translation scripts in the Cost distribution do use it.

Q4. Help! [incr Tcl] won't compile!

A. There are some conflicts between [incr Tcl] 1.5 and Tcl 7.4. You should use [incr Tcl] 2.0 instead.

A. On some systems (notably SGI IRIX) you may need to add the line

	SHELL	= /bin/sh

to the Makefile after running './configure'.

Q5. Is there an autoconfigure script for Cost?

A. No, and there probably won't be. Sorry.

Using Cost

Q6. Any examples of a simple translation script?

A. See the script that converts this FAQ list into HTML. (Included in the distribution.)

Q7. How do I specify a default rule for elements with the 'Simple' module?

A. The 'el' query clause matches any element node.

Note that in specifications, the order of rules is important: more specific rules like '{element LIST withattval FORMAT ALPHA}' should come before more general rules like '{el}'.
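The ordering advice can be illustrated with a small Python model. This is not Cost code and does not use the Cost API: it only assumes that the first rule whose query matches a node is the one that applies, which is what the ordering advice implies. The names 'first_match', 'is_alpha_list' and 'any_element' are made up for the illustration.

```python
def first_match(rules, node):
    """Return the result of the first rule whose predicate accepts the node."""
    for matches, result in rules:
        if matches(node):
            return result
    return None


# A toy element node: <LIST FORMAT="ALPHA">
node = {"gi": "LIST", "atts": {"FORMAT": "ALPHA"}}


def is_alpha_list(n):
    # Stands in for a specific query like {element LIST withattval FORMAT ALPHA}
    return n["gi"] == "LIST" and n["atts"].get("FORMAT") == "ALPHA"


def any_element(n):
    # Stands in for the catch-all {el} query
    return True


specific_first = [(is_alpha_list, "alpha-list rule"), (any_element, "default rule")]
general_first = [(any_element, "default rule"), (is_alpha_list, "alpha-list rule")]

if __name__ == "__main__":
    print(first_match(specific_first, node))
    print(first_match(general_first, node))
```

With the general rule listed first, the specific rule is never reached in this model, which is why the catch-all belongs at the end of the list.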

Q8. Where are # comments allowed?

A. #-style comments are only recognized in data that is interpreted as a Tcl script.

This means that you can't (easily) include comments inside a specification list. I'm looking into this problem...

Q9. Can I load more than one document at a time?

A. Not currently. Calling 'loadsgmls' or 'loaddoc' a second time overwrites the first document.

Instead, you can try creating a dummy ``hub'' document that references all the documents you'd like to process as SUBDOCument entities:

    <!doctype hub [
	<!element hub - - (doc+)>
	<!element doc - - (#PCDATA)>
	<!entity doc1 SYSTEM "firstdoc.sgml" SUBDOC>
	<!entity doc2 SYSTEM "seconddoc.sgml" SUBDOC>
	<!-- ... -->
	<!entity docN SYSTEM "nthdoc.sgml" SUBDOC>
    ]>
    <hub>
    <doc>&doc1;</doc>
    <doc>&doc2;</doc>
    <!-- ... -->
    <doc>&docN;</doc>
    </hub>

Then load the hub document to process all the referenced documents together.
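If there are many documents, the hub document can be generated instead of written by hand. The following Python sketch is not part of the Cost distribution; the function name 'make_hub' and the doc1..docN entity naming are assumptions chosen to mirror the example above. It does no escaping, so file names containing double quotes would need extra care.

```python
def make_hub(filenames):
    """Return the text of a hub document that references each file as a SUBDOC entity."""
    lines = ["<!doctype hub ["]
    lines.append("    <!element hub - - (doc+)>")
    lines.append("    <!element doc - - (#PCDATA)>")
    # One SUBDOC entity declaration per document: doc1, doc2, ...
    for i, name in enumerate(filenames, start=1):
        lines.append('    <!entity doc%d SYSTEM "%s" SUBDOC>' % (i, name))
    lines.append("]>")
    lines.append("<hub>")
    # One <doc> element per entity, in the same order
    for i in range(1, len(filenames) + 1):
        lines.append("<doc>&doc%d;</doc>" % i)
    lines.append("</hub>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_hub(["firstdoc.sgml", "seconddoc.sgml"]), end="")
```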

Q10. How come there are so few questions in this section?

A. Nobody's asked very many :-(

Using SGMLS

Q11. How do I set SGML_PATH?

A. Good question. This is what I use:

setenv SGML_PATH "%S:%N.%Y:%N"

This seems to work most of the time. Your mileage may vary...

If you use NSGMLS (part of SP) or SGMLS 1.1.91, PUBLIC identifiers and catalog files will greatly simplify things; see "What are catalogs and how do I use them?" (Q14).

Q12. SGMLS produces a gazillion error messages, but I'm sure the document is OK. What's up?

A. It could be a missing SGML declaration.

If you see error messages like the following:

Length of name, number, or token exceeded NAMELEN or LITLEN limit
Normalized length of literal exceeded 240; markup terminated
Content model token 33: more than GRPCNT model group tokens; terminated

then you probably just forgot to supply the SGML declaration. This can be passed to sgmls on the command line:

sgmls sgml-declaration.dcl mydoc.sgm | costsh -S myspec.spec

The Cost 'loaddoc' command will supply this parameter automatically if the $SGML_DECLARATION environment variable is set.
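For scripts that call the parser directly, the same convention is easy to imitate. The Python sketch below only models the idea of putting the declaration in front of the document when the variable is set; it is not how 'loaddoc' is implemented, and the function name 'sgmls_command' is an assumption made for the illustration.

```python
import os


def sgmls_command(document, environ=None):
    """Build an sgmls argument list, placing the SGML declaration first if one is configured."""
    if environ is None:
        environ = os.environ
    argv = ["sgmls"]
    declaration = environ.get("SGML_DECLARATION")
    if declaration:
        # The declaration has to come before the document itself
        argv.append(declaration)
    argv.append(document)
    return argv


if __name__ == "__main__":
    print(" ".join(sgmls_command("mydoc.sgm")))
```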

Q13. I don't have an SGML declaration. What do I do?

A. Ask whoever sent you the document or DTD to supply one.

A. If you're writing your own DTD, you can try using the SGML declaration that comes with Cost (see doc/dtd/costdoc.dcl in the distribution). It is based on the reference concrete syntax, with many of the quantity and capacity settings increased.

For more information about SGML declarations, see
URL:http://gopher.sil.org/sgml/topics.html#SGMLDecl

Q14. What are catalogs and how do I use them?

A. External entities (including DTDs and entity sets) are specified with a public identifier, a system identifier, or both.

The system identifier is a string that specifies how to find the entity. System identifiers are usually filenames, but they can also be URLs or formal system identifiers. As the term implies, system identifiers are system-specific. Public identifiers, on the other hand, are globally unique. For example, an SGML document might start with:

<!DOCTYPE DocBook 
    PUBLIC "-//Davenport//DTD DocBook//EN" 
    SYSTEM "/usr/local/lib/sgml/dtds/docbook.dtd"
>

Obviously the same SYSTEM identifier won't work on every machine. This is where catalogs come in: they map PUBLIC identifiers to SYSTEM identifiers, so that the exact location doesn't need to be specified in the document.

Catalog files look like this:

-- Davenport DTDs: --
PUBLIC "-//Davenport//DTD DocBook//EN" "dtds/docbook.dtd"
PUBLIC "-//Davenport//DTD DocBook V2.3//EN" "dtds/docbook.dtd"

-- Entity sets: --
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN" "entities/iso-lat1.gml"
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 2//EN" "entities/iso-lat2.gml"
PUBLIC "ISO 8879:1986//ENTITIES Greek Letters//EN" "entities/iso-grk1.gml"

SGMLS and NSGMLS look at the SGML_CATALOG_FILES environment variable for a list of catalogs to read. Here's what I use:

setenv SGML_CATALOG_FILES \
    "./catalog:$HOME/sgml/catalog:/usr/local/lib/sgml/catalog"

This tells the parser to consult a catalog file in the current directory, one in my personal directory ($HOME/sgml), and a system-wide one when resolving public identifiers.
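To make the mapping concrete, here is a rough Python sketch of a catalog lookup. It is not the parser's actual algorithm: it handles only PUBLIC entries and "-- ... --" comments, does not normalize whitespace in public identifiers, and assumes that a relative system identifier is taken relative to the directory of the catalog file (check your parser's documentation for its real behavior). The function names are made up for the illustration.

```python
import os
import re

# A token is a "-- ... --" comment, a double-quoted literal, or a bare word.
_TOKEN = re.compile(r'--.*?--|"[^"]*"|\S+', re.DOTALL)


def parse_catalog(text):
    """Return a dict mapping public identifiers to system identifiers."""
    tokens = [t for t in _TOKEN.findall(text) if not t.startswith("--")]
    mapping = {}
    i = 0
    while i < len(tokens):
        if tokens[i].upper() == "PUBLIC" and i + 2 < len(tokens):
            public_id = tokens[i + 1].strip('"')
            system_id = tokens[i + 2].strip('"')
            # Keep the first entry if an identifier is listed twice
            mapping.setdefault(public_id, system_id)
            i += 3
        else:
            i += 1
    return mapping


def resolve_public(public_id, catalog_files=None):
    """Search a colon-separated list of catalog files; return a path or None."""
    if catalog_files is None:
        catalog_files = os.environ.get("SGML_CATALOG_FILES", "")
    for path in catalog_files.split(":"):
        if not path or not os.path.isfile(path):
            continue
        with open(path) as f:
            mapping = parse_catalog(f.read())
        if public_id in mapping:
            # Assumption: relative system identifiers are relative to the catalog
            return os.path.join(os.path.dirname(path), mapping[public_id])
    return None
```

Calling 'resolve_public' with a public identifier and no second argument searches whatever SGML_CATALOG_FILES currently lists and returns None if no catalog has a matching entry.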

Q15. How do I use 8-bit characters with SGMLS?

A. First you need to decide what the 8-bit characters mean. Chances are you want to use the ISO 8859-1 character set (aka "ISO Latin-1"). To use this character set, modify the CHARSET clause at the beginning of the SGML declaration to look something like:

CHARSET
        BASESET "ISO 646:1983//CHARSET
                 International Reference Version (IRV)//ESC 2/5 4/0"
        DESCSET 0  9 UNUSED
                9  2  9
               11  2 UNUSED
               13  1 13
               14 18 UNUSED
               32 95 32
              127  1 UNUSED
        BASESET "ISO Registration Number 100//CHARSET
                 ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"
        DESCSET 128 32 UNUSED
                160 96 32

The first BASESET/DESCSET pair specifies that bit combinations 0 through 127 correspond to ISO 646 IRV (which is an ASCII-compatible character set). The second pair specifies that bit combinations 128 through 159 are unused and that 160 through 255 correspond to the right half of the Latin 1 alphabet.
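Each DESCSET row can be read as a triple: the first described character number, the number of characters covered, and the first corresponding number in the base set (or UNUSED). The Python sketch below expands the rows from the declaration above into a lookup table; it is only an aid for reading the table, and the names 'expand', 'ASCII_PART' and 'LATIN1_RIGHT_PART' are made up for the illustration.

```python
UNUSED = None

# (first described number, count, first base-set number or UNUSED)
ASCII_PART = [
    (0, 9, UNUSED),
    (9, 2, 9),
    (11, 2, UNUSED),
    (13, 1, 13),
    (14, 18, UNUSED),
    (32, 95, 32),
    (127, 1, UNUSED),
]
LATIN1_RIGHT_PART = [
    (128, 32, UNUSED),
    (160, 96, 32),
]


def expand(descset):
    """Map each described character number to its base-set number (or UNUSED)."""
    table = {}
    for start, count, base in descset:
        for offset in range(count):
            table[start + offset] = UNUSED if base is UNUSED else base + offset
    return table


if __name__ == "__main__":
    low = expand(ASCII_PART)
    high = expand(LATIN1_RIGHT_PART)
    # Together the two parts describe every 8-bit value exactly once
    print(sorted(low) == list(range(0, 128)), sorted(high) == list(range(128, 256)))
```

Note that the base-set numbers in the second part refer to positions in the Latin 1 right-part set, not to ASCII codes.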

You don't need to add anything after the BASESET/DESCSET pair in the SYNTAX clause; that can remain the same.

NOTE: I'm not sure how to declare character sets other than ISO 8859-1... any information on this would be most welcome. --Ed.

SGMLS (and Cost) cannot handle multi-byte or wide character sets such as Unicode and Shift-JIS. For that you'll have to use SP.

See
URL:http://gopher.sil.org/sgml/topics.html#SGMLDecl
for more information on the SGML declaration.

License and copyright issues

Q16. Can I use Cost in a commercial application?

A. Yes. From section 5 of the Artistic License under which Cost is provided:

[...] You may charge any fee you choose for support of this Package. You may not charge a fee for this Package itself. However, you may distribute this Package in aggregate with other [...] programs as part of a larger [...] software distribution provided that you do not advertise this Package as a product of your own.

Tcl, Tk, and SGMLS have similarly liberal licensing policies.

In particular, I consider an SGML processing application written on top of Cost to be a form of "support", so that's fine too.

Q17. Is Cost public domain?

A. No. It is, however, freely redistributable.

Miscellany

Q18. Is there a mailing list?

A. Yes! Send subscription requests to cost-l@venus.co.uk with the word subscribe in the Subject: field. Send submissions to cost-l@venus.co.uk.

Many thanks to Venus Internet (
URL:http://www.venus.co.uk
) for providing this service.

A. I'm not sure if the old mailing list (cost-list@euromath.dk) is even active anymore.

Q19. What does ``CoST'' stand for?

A. CoST was originally developed by Klaus Harbo when he was at the University of Copenhagen. In the tradition of ASP (Amsterdam SGML Parser), YASP (Yorktown SGML Parser), and YAO (Yuan-ze, Almaden, Oslo), it was named the Copenhagen SGML Tool.

Klaus has since moved on to bigger and better things, and Cost is no longer associated with Copenhagen. Cost 2 is sufficiently different from the original that I ought to give it a new name, but I just can't come up with a pronounceable acronym for ``San Francisco SGML Processor.''

So Cost really doesn't stand for anything anymore, except maybe ``Combination Of Sgmls and Tcl'' if you like.

Esoterica

Q20. Why doesn't 'query docroot child' work?

A. The root SD node may have PI node children preceding the document element node if, for example, the DTD contained processing instructions. Use 'query docroot child el' instead.

Q21. Why doesn't 'query? textnode in FOO' work?

A. Text nodes -- CDATA, SDATA, RE, and ENTREF nodes -- always appear as children of PEL (pseudoelement) nodes, which in turn are children of EL (element) nodes. 'query? textnode in FOO' is a synonym for 'query? textnode parent withGI FOO', which never matches because the parent of a text node is a PEL node, not a 'FOO' element. What you really want is 'query? textnode within FOO', which is a synonym for '... ancestor withGI FOO'.

Note that 'query? textnode parent parent withGI FOO' would also work, but you don't usually want that -- this would select all text nodes that are directly contained in 'FOO' elements, but would fail for those that appear in subelements of a 'FOO'.
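The difference between the three tests can be seen in a toy Python model of the tree. This is not Cost code: the 'Node' class and the three helper functions are made up for the illustration, and giving PEL nodes no GI is a simplification.

```python
class Node:
    def __init__(self, kind, gi=None, children=()):
        self.kind = kind          # "EL", "PEL" or "CDATA"
        self.gi = gi              # element name, for EL nodes
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def parent_has_gi(node, gi):
    """Models 'parent withGI FOO' (the 'in FOO' clause)."""
    return node.parent is not None and node.parent.gi == gi


def grandparent_has_gi(node, gi):
    """Models 'parent parent withGI FOO'."""
    p = node.parent
    return p is not None and p.parent is not None and p.parent.gi == gi


def ancestor_has_gi(node, gi):
    """Models 'ancestor withGI FOO' (the 'within FOO' clause)."""
    p = node.parent
    while p is not None:
        if p.gi == gi:
            return True
        p = p.parent
    return False


# <FOO>direct<BAR>nested</BAR></FOO>
direct = Node("CDATA")
nested = Node("CDATA")
foo = Node("EL", "FOO", [
    Node("PEL", children=[direct]),
    Node("EL", "BAR", [Node("PEL", children=[nested])]),
])

if __name__ == "__main__":
    for name, text in (("direct", direct), ("nested", nested)):
        print(name,
              parent_has_gi(text, "FOO"),
              grandparent_has_gi(text, "FOO"),
              ancestor_has_gi(text, "FOO"))
```

In this model the parent test fails for both text nodes, the grandparent test succeeds only for the text directly inside FOO, and the ancestor test succeeds for both.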


Joe English
Revision: $Revision: 1.4 $