SGML FAQ File, Exeter

==================================
FREQUENTLY ASKED QUESTIONS(c) v1.7
==================================

INTRODUCTION
============

This article contains answers to questions that are frequently posted to comp.text.sgml etc. It is intended for newcomers to such lists and SGML beginners. This FAQ is maintained on a voluntary basis, but any comments or additional information are welcome (see final section).

CONTENTS
========

01) What does "SGML" stand for? What is SGML?
02) What can SGML be used for?
03) What do I need to use SGML?
04) Books, Bibliographies, Newsletters and Journals
05) Newsgroups and discussion lists
06) Public Domain software
07) Commercial software
08) ftp archives
09) The SGML Users' Group, National Chapters, SIGs, Standards bodies
10) Conferences
11) SGML initiatives and major projects
12) SGML and other Standards
13) Introductory questions with answers - by Erik Naggum
14) Making comments/additions to this FAQ

If any of the terms used in this FAQ are unfamiliar to you, consult the section on "Introductory questions with answers", the text of the SGML standard (see 01.1), or a good book on SGML (see below).

01) What does "SGML" stand for? What is SGML?
==============================================

01.1 SGML stands for the Standard Generalized Markup Language. SGML is defined in ISO 8879:1986 "Information Processing -- Text and Office Systems -- Standard Generalized Markup Language (SGML)". A copy of this document can be obtained from the International Organization for Standardization (ISO) or your national standards body. It is not available for ftp.

01.2 SGML enables the description of structured information _independent_ of how that information is processed. It is a meta-language that provides a standard syntax for defining descriptions of classes of structured information; these descriptions are called document type definitions (DTDs).
Information can be "marked up" according to a DTD, so that its structure is made explicit and accessible. The "markup" can be checked against a DTD to ensure that it is valid, and thus that the structure of the information conforms to that of the class described by the DTD. Ensuring that information is structured in a known way greatly facilitates any subsequent use of that information. For more information, beginners should read Erik Naggum's "Introductory questions with answers" (in this FAQ), consult ISO 8879 and/or a book on SGML (also in this FAQ).

01.3 DTDs define the rules to structure information but do _not_ say how that information should be processed. Therefore, SGML and DTDs do not deal with how, say, a document should be processed for formatting on paper (via LaTeX), display on-line (via Hypercard), or mapping into a document database (via Oracle). However, having made the structure of the document explicit, they enable all these subsequent processes to use exactly the same source document. SGML is _not_ a replacement for de facto standards such as TeX or PostScript.

01.4 SGML is non-proprietary. Publication of, and amendments to, the International Standard are controlled solely by the ISO.

01.5 Use of SGML is not confined to any particular make or type of computer or software. SGML-aware products are available for most types of machine.

01.6 SGML is not user-unfriendly. What SGML is, what it looks like "in the raw", and how other software is able to make use of SGML markup, will be of little concern to most users. Sophisticated packages now exist to create, edit, and manipulate information that has been marked up with SGML, e.g. quasi-WYSIWYG editors for creating SGML documents that conform to any given valid DTD.

02) What can SGML be used for?
==============================

02.1 Its uses are many and diverse. SGML DTDs (see 01.2) define the markup and markup rules that can be used for a given class of documents (where "document" is a file of information).
A DTD is usually written with some kind of end processing in mind - but since SGML markup is application independent, it means that documents that conform to a particular DTD can be re-used in a variety of different ways. 02.2 For example, a document designer might write a DTD that enables the abstract of a scientific paper to be marked up as such. The primary purpose is that the text identified as forming part of a paper's abstract can then be formatted in a particular way when the SGML source document is translated into a file usable by a text processing system. If, at some later date, it is decided that abstracts should be formatted in a different way, it is only necessary to alter the translation program and not every instance of an abstract in every paper that has to be (re-)printed out. Moreover, knowing that in every document conforming to the particular DTD, abstracts will be identified as such, it is a trivial matter to combine papers supplied by several authors into a collection that has a uniform physical appearance, or to produce a catalogue of abstracts for publication or inclusion in a database. 02.3 If you want to know whether SGML is appropriate for a particular task, consult the current discussion lists, journals, and Special Interest Groups (SIGs), and/or post to a newsgroup such as comp.text.sgml. People are always willing and eager to hear about new ways that SGML might be used. 03) What do I need to use SGML? =============================== 03.1 In order to use SGML you will need an SGML parser (that conforms to ISO 8879), an entity manager, an editor to produce your DTDs and/or SGML documents, and probably some sort of translation program to convert your SGML documents into a form suitable for some specific processing. If you are planning to convert existing information into SGML documents, you will need some sort of "retro-tagging" or auto-conversion software. 
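
As a rough illustration of the "translation program" mentioned in 03.1, the following Python sketch maps a fully tagged document instance to LaTeX. It is a toy under stated assumptions, not an SGML parser: the element names (paper, title, abstract, para) and their LaTeX mappings are hypothetical, element names are simply folded to lower case, and every start-tag and end-tag is assumed to be written out in full, with no attributes, entity references, markup declarations, or tag minimization. A real system would run a conforming parser first and translate its output.

```python
import re

# Hypothetical element names and LaTeX mappings, for illustration only.
START = {
    "paper": "\\documentclass{article}\n\\begin{document}\n",
    "title": "\\title{",
    "abstract": "\\begin{abstract}\n",
    "para": "",
}
END = {
    "paper": "\\end{document}\n",
    "title": "}\n\\maketitle\n",
    "abstract": "\\end{abstract}\n",
    "para": "\n\n",
}

# A start-tag or end-tag with no attributes, or a run of character data.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.-]*)>|([^<]+)")


def translate(source):
    """Translate a fully tagged toy document into LaTeX source."""
    out = []
    open_elements = []
    pos = 0
    for match in TOKEN.finditer(source):
        # Anything the pattern had to skip is markup this sketch cannot handle.
        if match.start() != pos:
            raise ValueError("unsupported markup at offset %d" % pos)
        pos = match.end()
        closing, name, text = match.groups()
        if text is not None:
            # Collapse whitespace runs in character data.
            out.append(" ".join(text.split()))
            continue
        name = name.lower()
        if name not in START:
            raise ValueError("unknown element: %s" % name)
        if closing:
            # End-tags must close the most recently opened element.
            if not open_elements or open_elements[-1] != name:
                raise ValueError("unexpected end-tag: %s" % name)
            open_elements.pop()
            out.append(END[name])
        else:
            open_elements.append(name)
            out.append(START[name])
    if pos != len(source):
        raise ValueError("unsupported markup at offset %d" % pos)
    if open_elements:
        raise ValueError("unclosed element: %s" % open_elements[-1])
    return "".join(out)


if __name__ == "__main__":
    sample = (
        "<paper><title>On Markup</title>"
        "<abstract><para>A short summary.</para></abstract></paper>"
    )
    print(translate(sample))
```

Changing how abstracts are printed then means editing the two mapping tables, not the documents themselves, which is the point made in 02.2.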
03.2 ISO 8879 contains very precise definitions of terms such as "SGML system", "SGML application", "SGML parser", "entity manager" and so on. Users are advised to consult the text of ISO 8879 carefully, as mis-use of terms defined in the Standard can lead to misunderstandings. 03.3 The following definitions are taken from ISO 8879, but readers are advised to consult the full text: 4.279 SGML application: Rules that apply SGML to a text processing application. An SGML application includes a formal specification of the markup constructs used in the application, expressed in SGML. It can also include a non-SGML definition of semantics, application conventions, and/or processing. 4.287 SGML system: A system that includes an SGML parser, an entity manager, and both or either of: a) an implementation of one or more SGML applications; and/or b) facilities for a user to implement SGML applications, with access to the SGML parser and entity manager. 4.285 SGML parser: A program (or portion of a program or a combination of programs) that recognizes markup in SGML documents. 4.123 entity manager: A program (or portion of a program or a combination of programs), such as a file system or symbol table, that can maintain and provide access to multiple entities. 4.120 entity: A collection of characters that can be referenced as a unit. NB. In note (b) of the "Scope" section of ISO 8879:1986, it states that the Standard does NOT "Specify the implementation, architecture, or markup error handling of conforming systems". In the glossary to his book (see below) Eric Van Herwijnen defines "SGML implementation: A collection of SGML application procedures that.... provide the mapping from the structure defined by a given SGML application to a concrete system such as a textformatter or a database." Note: Beginners may not find any of these definitions enlightening. Be aware that some posters use the terminology of ISO 8879 very rigorously, whilst others are more lax. 
This opens the door for misunderstandings. Unless you are sure that you are using terminology in the correct way, as taken from ISO 8879, please try to be as explicit and unambiguous as possible. 03.4) Put simply, most users will obtain or write a DTD (see above). If you write your own DTD, you will need to validate it using an SGML parser that conforms to ISO 8879. To create SGML documents which conform to a DTD you will need an editor and a parser. The editor is used to input information and insert SGML markup into the document; the parser is used to check that the markup and the way it has been used conform to the rules given in the DTD. Many commercial packages offer syntax-directed editors, which interactively ensure that any editing and markup operations conform to the rules of the DTD. 03.5) Once you have a valid SGML document that conforms to a valid SGML DTD, you may want to do some subsequent processing. For example, in order to get paper output, you will need a program (or set of programs), that can read your SGML document and produce a file acceptable to your word processing package/text formatter. With a well-known publicly available DTD, it may be possible to obtain a translation package that has already been written; otherwise, you will need to write any translations required for subsequent processing yourself. 03.6) Translating existing information into valid SGML documents can be more problematic. SGML is good at handling structured information. You will need to obtain/write a DTD which is suitable for representing the structure (in full or in part) of your existing information. You will then need to obtain/write translations which can take your existing information and output it in an appropriate form that includes SGML markup which can be validated against your chosen DTD. If your existing information already contains an unambiguous structure which is clearly indicated, it should be possible to convert this information into conforming SGML. 
If your existing information is not clearly structured, or that structure is ambiguous, conversion to SGML is much harder work. Complex information structures will also involve much more effort to translate into conforming SGML.

03.7) Always look out for existing DTDs/translation packages which may meet your needs. If you write a DTD or means of translating to/from SGML, consider sharing it with the rest of the SGML user community (post it to a newsgroup or ftp site). This is a good way for all of us to have access to well-written, tried and tested DTDs etc.

04) Books, Bibliographies, Newsletters and Journals
===================================================

This list does not pretend to be complete, nor does it offer any value judgements about any of the products listed. Items in each category are given in alphabetical order by author/title. Some items are available at discounted rates to members of the SGML Users' Group or the Graphic Communications Association (GCA) - see below.

Note: Robin Cover's on-line bibliography contains many more references, and much more information on each text.

04.1 Books:

BRYAN, Martin "SGML: an author's guide to the standard generalized markup language". Wokingham/Reading/New York: Addison-Wesley. 1988. 380 pages. ISBN: 0-201-17535-5 (pbk).

GOLDFARB, Charles "The SGML Handbook". Oxford: Oxford University Press. 1990. 688 pages. ISBN: 0-19-853737-9 (hbk).

HERWIJNEN, Eric van "Practical SGML". Dordrecht/Boston/London: Kluwer Academic Publishers. 1990. 307 pages. ISBN 0-7923-0635-X (pbk).

SMITH, Joan & STUTELY, Robert "SGML: the users' guide to ISO 8879". New York/Chichester/Brisbane/Toronto: Ellis Horwood Limited/Halstead Press. 1988. 173 pages. ISBN 0-7458-0221-4 (Ellis Horwood Limited) (hbk). ISBN 0-470-21126-1 (Halstead Press) (hbk).

SMITH, Joan "SGML and Related Standards". New York/Chichester/Brisbane/Toronto: Ellis Horwood Limited/Halstead Press. 1992. 151 pages. ISBN 0-13-806506-3 (Ellis Horwood Limited) (hbk).
SOFTQUAD Inc. "The SGML Primer". Toronto: SoftQuad Inc. Private printing, available from SoftQuad Inc.

04.2 Bibliographies:

COVER, Robin & DUNCAN, Nicholas & BARNARD, David "BIBLIOGRAPHY ON SGML (Standard Generalized Markup Language) AND RELATED ISSUES" (Technical Report 91-299). Ontario: Queen's University at Kingston. 1991. 312 pages. ISSN 0836-0227. 1991 Cost $21.00 (Canadian). Contact: Doug Hamilton, Dept. of Computing & Information Science, Goodwin Hall, Queen's University, Kingston, Ontario, CANADA K7L 3N6. Phone: (1-613) 545-6056. Email (Internet): hamilton@qucis.queensu.ca

COVER, Robin "STANDARD GENERALIZED MARKUP LANGUAGE, ISO 8879:1986 (SGML) ANNOTATED BIBLIOGRAPHY AND LIST OF RESOURCES" version 2.0. Revised January 1992. (c) Robin Cover. Available on-line from many ftp sites. Updates posted to comp.text.sgml etc. Contact: Robin Cover, 6634 Sarah Drive, Dallas, TX 75236 USA. Phone: (214) 296-1783. Fax: (214) 709-3387. Email (Internet): robin@utafll.uta.edu.

04.3 Newsletters & Journals:

Note: There are several journals dedicated to CALS - see Robin Cover's on-line Bibliography or contact the CALS SIG of the International SGML Users' Group for details.

"EPSIG News" - A quarterly publication of information relating to the ANSI/NISO manuscript standard Z39.59-1988 (also known as the "AAP" standard). Available through EPSIG (address below). ISSN 1042-3737.

"SGML Users' Group Newsletter" - An occasional publication of news, events, product announcements and short articles. Available through the International SGML Users' Group (address below). ISSN 0952-8008.

"SGML Users' Group Bulletin" - Longer/more technical papers than appear in the SGML Users' Group Newsletter. Available through the International SGML Users' Group (address below). ISSN 0269-2538.

"SGML SIGhyper Newsletter" - An occasional publication of the SGML Users' Group Special Interest Group on Hypertext and Multimedia (SIGhyper). Available through SGML SIGhyper (address below).
" The SGML Newsletter" - Managing Editor: Brian Travis (Internet email: brian@sgmlinc.com). 12 issues per year. Contact: Graphic Communications Association. Phone: +1 703-519-8157. Fax: +1 703-548-2867. 04.4) Teaching & learning aids: "The SGML Tutorial" - a DynaText version of Eric van Herwijnen's book, "Practical SGML". Includes the SGMLS parser, which can be used to validate sample files, check answers to DTD-writing exercises etc. Requires Microsoft Windows 3.x. Available from the GCA, and Lou Burnard of the TEI (addresses below). 05) Newsgroups and discussion lists =================================== 05.1 Newsgroups comp.text.sgml - this usenet newsgroup. The main electronic forum for discussion of SGML and closely related matters. Begun in late 1990. All postings are archived at the ftp site maintained at Oslo University in Norway (searching via WAIS and gopher is possible). comp-text-sgml@naggum.no - electronic mailing list that echoes all postings to comp.text.sgml for those who have difficulties with usenet news. It was set-up, and is maintained by Erik Naggum. To subscribe: mail: comp-text-sgml-request@naggum.no subject: subscribe comp.text.sgml body: (blank) to post an article: mail: comp-text-sgml@naggum.no Erik also maintains an archive of all postings to comp.text.sgml that is accessible over the net. Read the newsgroup for Erik's postings re. any changes to the service. The following (edited) article (which may now be out of date), gives an idea of how to use the service. *********************************************************************** Date: 15 Dec 1992 00:15:28 +0100 From: Erik Naggum Subject: FTP, mail server, archive stuff [...some stuff deleted....] If you have direct access to the Internet, two very convenient access options apply. (1) With the WAIS source comp.text.sgml.src. This source allows fuzzy keyword searches. Warning: You may get a large number of hits. (2) With anonymous FTP to ftp.ifi.uio.no. 
Articles are archived both by Message-ID and by date, in /pub/SGML/comp.text.sgml/by.msgid and /pub/SGML/comp.text.sgml/by.date, respectively. This can be used to retrieve articles that you missed, based on message-ids in the References header. Search requests can also be mailed to for manual processing. An experimental FTP-by-mail service is available by sending FTP commands to . (See RFC 959 for a list of commands, or send a HELP command.) If you should be unable to _post_ to the newsgroup, articles mailed to will be posted to comp.text.sgml. The headers may be edited to include a missing References header in replies, but will be posted automatically if all required headers are present and valid. If you should be unable to _read_ the newsgroup, you may request to be added to the comp.text.sgml mailing list by sending a message with the subject "subscribe comp.text.sgml" to . If you have any questions about the archive, just reply to this message. --------------------------------------------------------------------------- I'd like the two paragraphs about access to the newsgroup for people with no access to USENET to be distributed as far as possible. Your help is appreciated. The FTP-by-mail service is experimental, which means that I may change its behavior without prior warning. However, such changes only apply to the messages that are returned from the service, not commands sent to it. (SGML is about letting the information survive the system, and so is the intent with the interface to this service.) As mentioned above, the commands are taken from RFC 959, which defines the FTP protocol. For our purposes, both the subject header and the message body can contain commands. Garbage in the subject field is ignored. The first non-command line in the message is treated as QUIT. A full transaction log is sent by return mail. 
(I haven't decided yet whether I will always return retrieved files in separate messages or in-line, or mix and match according to size and other variables.)

(For the mail connoisseurs: the References and In-Reply-To field will contain the Message-ID of the request message for all returned messages, and the subject field will contain the type of response.)

The reason for choosing the FTP commands is that they're proven to work, and simple programs can be written to do "batch FTP" with a mail server.

The HELP command produces this response:

214-The following commands are recognized:
-Administrative:
- USER    Login as user (first half of login operation)
- PASS    Password (second half of login operation)
- QUIT    Terminate service
- HELP    Retrieve this message
-
-File transfer and manipulation:
- TYPE    Transfer type (default: ASCII)
- RETR    Retrieve file
- STOR*   Store file
- APPE*   Append to file
- RNFR*   Rename From (first half of rename operation)
- RNTO*   Rename To (second half of rename operation)
- DELE*   Delete file
- CWD     Change working directory (initial: /pub/SGML)
- LIST    List files in local system format
- NLST    List names of files (not directories)
- MKD *   Make directory
- RMD *   Remove directory
- PWD     Print working directory
- CDUP    Change to parent directory
- STOU*   Store unique
- SIZE    Report size of file
-
-Special offers:
- PORT    Select data transfer method
- MDTM    Report modification time
- FIND    Search for regular expression in file name
- GREP    Search for regular expression in directory
- ENCD    Mail encoding (default: uuencode)
-
-Commands marked * require login. Enquiries to sgml@ifi.uio.no.
214 Send requests to sgml.ftp@ifi.uio.no, comments to sgml@ifi.uio.no

PORT is abused. It takes as argument where to send the response, and can be one of "In-line", "Message", or an e-mail address where the data files should be sent. This is not according to RFC 959!

ENCD takes an argument which specifies the encoding to use for image (binary) transfers.
Supported values are "uuencode", "xxencode", "btoa", and "base64". TYPE will revert to IMAGE if the file is suspected to be binary, and trigger the mail encoding. If set to IMAGE, it will not revert to ASCII for text files.

To aid the batch user, file names matched by regular expressions in LIST, NLST, or FIND are used as the default file name(s) in subsequent commands if the argument is the special token "<>". Commands take only one argument, but this will cause iteration over the "selected" file names.

To help avoid the annoying problem of changing to the right directory, a search for the full directory name will be made if the argument doesn't match a subdirectory of the current directory, and it doesn't contain directory separators (/).

A typical request will look like this:

---------------------------------------------------------------------------
From: user@major.net
To: sgml.ftp@ifi.uio.no

CWD comp.text.sgml/by.msgid
RETR Bz4BF1.AI9@bcstec.ca.boeing.com
QUIT
---------------------------------------------------------------------------

The return message will look somewhat like this, depending on what I do with this software:

---------------------------------------------------------------------------
From: sgml.request@ifi.uio.no
To: user@major.net

220-ftp.ifi.uio.no FTP-by-mail server (version 0.0) ready
220 Defaults: USER=ANONYMOUS PASS=user@major.net PORT=in-line TYPE=ascii
230 ANONYMOUS user logged in
>>> CWD comp.text.sgml/by.msgid
250 CWD command successful
>>> RETR Bz4BF1.AI9@bcstec.ca.boeing.com
150-Sending /pub/SGML/comp.text.sgml/by.msgid/Bz4BF1.AI9@bcstec.ca.boeing.com
150 Text file follows, ending with "." on line by itself.
Newsgroups: comp.text.sgml
From: vanzwol@bcstec.ca.boeing.com (Ted Van Zwol)
Subject: (1) PD Parser; (2) Arbortext
Message-ID:
Organization: Boeing
Date: 11 Dec 1992 23:18:36 (19921211231836)
Lines: 20

.
.
.
.
226 Transfer complete (1220 characters, 28 lines/records).
>>> QUIT
221 Thanks for using the SGML FTP-by-mail server.
---------------------------------------------------------------------------

I won't claim that all these commands will actually be implemented from day one (i.e. today), but at least this is what I think we need. Most probably, the more esoteric commands will be implemented as requests using them come in.

The login feature will be used to allow people to store and control files in the archive, if anybody wants it. Send me mail for more information. Some files may be protected and will thus also require a password to access them. More on this when (if?) it's used.

Give it a try!

Best regards,
--
Erik Naggum             ISO 8879 SGML       +47 295 0313
                        ISO 10744 HyTime
                        ISO 9899 C          Memento, terrigena
                        ISO 10646 UCS       Memento, vita brevis
***********************************************************************

05.2 Discussion lists

sgml - electronic mailing list managed by The SGML Project. Used for circulating information about the Project, and SGML activities/information relevant to the UK academic and research community. Some articles posted to comp.text.sgml are echoed to this list.
To subscribe:
    mail:    mailbase@uk.ac.mailbase
    subject: (blank)
    body:    join sgml Michael Popham
If successful, you will be emailed a "User Guide" by the Mailbase listserver. NB. from outside the UK, you may have to reverse the elements of the email address, e.g. use "mailbase@mailbase.ac.uk".

sgml-l - electronic mailing list for discussion of SGML issues. Many articles posted to comp.text.sgml are echoed to this list.
To subscribe:
    mail:    listserv@dhdurz1
    subject: (blank)
    body:    SUB SGML-L Michael Popham SIGNUP

sgml-math - electronic mailing list for discussion of issues relating to the handling of math under the AAP Standard. DTD fragments are circulated for comment.
To subscribe:
    mail:    listerv@e-math.ams.com
    subject: (blank)
    body:    subscribe sgml-math Michael Popham
             set sgml-math mail ack
             help

sgml-tables - electronic mailing list for discussion of issues relating to the handling of tables under the AAP Standard.
DTD fragments are circulated for comment.
To subscribe:
    mail:    listerv@e-math.ams.com
    subject: (blank)
    body:    subscribe sgml-tables Michael Popham
             set sgml-tables mail ack
             help

tei-l - electronic mailing list for discussion/information relating to the work of the Text Encoding Initiative (TEI).
To subscribe:
    mail:    listserv@uicvm
    subject: (blank)
    body:    SUB TEI-L Michael Popham SIGNUP

06) Public Domain software
==========================

************************************************************************
SEE ALSO STEVE PEPPER'S "WHIRLWIND GUIDE TO SGML TOOLS", IN THE
DIRECTORY sgml-tools.info IN THIS ARCHIVE
************************************************************************

Note: Public domain products are available from most of the anonymous ftp archives. (The full addresses of many of the ftp archives are given in Robin Cover's on-line Bibliography, or you could search available archives using ARCHIE). Older public domain products are also available from some ftp sites, but are not listed here.

ARC-SGML
A set of SGML Parser Materials, produced by Dr Charles Goldfarb and made available through the SGML Users' Group. Contains source code which can be used to build your own programs to handle SGML; also contains a sample application called vm2. Copies on disk are available through the GCA, SGML SIGhyper, and The SGML Project at the University of Exeter. The original source code was written in C to run on IBM compatible PCs under DOS. The original files and ports to many operating systems and platforms (e.g. UNIX, Mac) are available for ftp. (When searching ftp archives, look for directories/files with names like "arcsgml" or "ARC-SGML").

ICA (Integrated Chameleon Architecture)
A code generating software architecture for producing translators between different representations of electronic data. ICA is not SGML-specific. Runs under UNIX, using X Windows (R4, R5). The ICA Project is based at Ohio State University, and all new releases come from there.
Available for ftp from archive.cis.ohio-state.edu, under the directory pub/chameleon. (The accompanying PostScript file of documentation runs to 186 pages). Contact: Peter Ware qwertz/FORMAT An SGML to LaTeX and nroff/troff translator produced by the Qwertz Project at the German National Centre for Computer Science. The LaTeX document styles have been re-written as an SGML DTD (the qwertz DTD). SGML documents can be created, and quickly mapped into a format suitable for processing by a LaTeX, nroff/troff formatter. New releases are announced on comp.text.sgml. Available for ftp. The original code is available for ftp from ftp.gmd.de under the directory /gmd/sgml (get "README" and "sgml2latex-format.1.3.tar.Z") sgmls An SGML parser derived from the ARC-SGML Parser Materials, written by James Clark. sgmls outputs a simple, line-oriented, ASCII representation of an SGML document's Element Structure Information set which can be easily parsed by awk, perl, C or whatever. The idea is that sgmls can be used as the front end for a structure-controlled SGML application. New releases are announced on comp.text.sgml. sgmls consists of C source code intended to run under UNIX, but with instructions for porting/compiling under DOS. Available for ftp. (look for directories/files with names like "sgmls", "jclark", "sgmls-0.8.tar.Z"). Erik Naggum has set up several ways to obtain the latest version of the sgmls parser (and other SGML and related software etc.) at his site in Norway. Here is a posting on how to retrieve stuff from Erik's archive (but check out comp.text.sgml for recent announcements on changes to this service): --------------------------------------------------------------------------- From: Erik Naggum Subject: Re: SGML parsers [Richard Shroyer] : | Can someone tell me and other newcomers to SGML where an up-to-date | SGML parser is to be had? I'd be willing to pay for a UNIX (X-Windows) | and DOS parser but would naturally prefer a free one if it's good. 
Following is a "reprint" of my article of 1993-02-23.

Newsgroups: comp.text.sgml
Path: enag
From: Erik Naggum
Message-ID: <19930223.003@erik.naggum.no>
Date: 23 Feb 1993 17:17:48 +0100
References:
Subject: Re: sgmls 1.1 available
Lines: 47

[James Clark] :
| Sgmls version 1.1 is now available for anonymous ftp from
| ftp.jclark.com:/pub/sgmls/sgmls-1.1.tar.Z. The size of the
| distribution is about 420k. Remember to use binary mode when getting
| the file. An MS-DOS executable with formatted documentation is
| available in the same directory as sgmls1_1.zip. The size of this is
| approximately 100k.
|
| My internet connection is rather feeble, so you might prefer to wait a
| few days and pick it up from one of the usual archive sites.

To reduce the load on the "feeble" connection, the files are now available in ftp.ifi.uio.no:/pub/SGML/SGMLS with the same file names as listed above.

Anonymous FTP is the preferred access means to the archive (login as "anonymous" and use your e-mail address for password), but if you cannot use FTP, you may send a message to , containing the following commands (excluding the line numbers):

    (1) CWD SGMLS
    (2) TYPE IMAGE
    (3) ENCD uuencode
    (4) RETR sgmls-1.1.tar.Z
    (5) RETR sgmls1_1.zip
    (6) QUIT

(1) is mandatory. (2) is optional. (3) is optional, but can take other values "xxencode", "btoa", or "base64" if you don't want uuencoded files. Choose one of (4) and (5), unless you want both (obvious, right?). (6) is optional if no junk follows the commands.

The files will be sent by return mail, one file per message. If your site, or any site along the way, barfs and dies on large messages, your request will be requeued using split files with a 64K message size limit if an appropriate error message is returned. In this case, shar is used to package the files (requires a Unix system or small set of Unix utilities).

Please do not use WAIS or gopher to retrieve these files.
Best regards,
--
Erik Naggum             ISO 8879 SGML       +47 2295 0313
Oslo, Norway            ISO 10744 HyTime
                        ISO 9899 C          Memento, terrigena
                        ISO 10646 UCS       Memento, vita brevis
---------------------------------------------------------------------------

07) Commercial software
=======================

This list is not complete. Omission from this list is through accident or ignorance. A list of products is given, followed by a list of contact names and addresses.

Note: NO VALUE OR "FITNESS FOR PURPOSE" JUDGEMENT IS PLACED ON ANY PRODUCT OR SERVICE LISTED. ALWAYS CHECK WITH THE SUPPLIER TO ENSURE THAT ANY PRODUCT (OR COMBINATION OF PRODUCTS) WILL DO WHAT YOU WANT, AND WILL WORK WITH YOUR COMPUTER/OPERATING SYSTEM.

Note: "quasi-WYSIWYG" refers to the capability of some packages to format the screen/paper output of an SGML document, such that the two output representations are similar.

************************************************************************
N.B. THIS LIST IS GETTING OUT OF DATE -- TAKE A LOOK AT STEVE PEPPER'S
"WHIRLWIND GUIDE TO SGML TOOLS", SEE THE DIRECTORY sgml-tools.info IN
THIS ARCHIVE
************************************************************************

07.1) Product list

Agfa CAPS - for large-scale publishing. Contact: local Agfa Gevaert office.
Agfa SDMS - SGML-based document management system. Contact: local Agfa Gevaert office.
Author/Editor - create/edit/validate SGML DTDs and documents (quasi-WYSIWYG). Contact: SoftQuad Inc.
Balise - SGML application programming environment - can be used for basic parsing/validation tasks, right up to complex document composition or database-oriented applications. Contact: AIS/Berger Levrault
BASISplus - SGML-aware database system. Contact: Information Dimensions Inc.
DL Composer - SGML-aware composition engine. Contact: Datalogics Inc.
DocuBuild SGML - for large-scale (CALS) publishing on DEC VAX. Contact: Xerox Corporation.
DynaText - on-line SGML document indexing/searching/browsing.
Contact: Electronic Book Technologies
EASE - create/edit/validate SGML DTDs and documents. Contact: E2S
FastTAG - conversion/auto-tagging for scanned hardcopy and electronic text files. Contact: Avalanche Development Company
FrameBuilder - "SGML-aware" structured document editor. Contact: Frame Technology Corporation
FrameMaker - DTP with some (CALS?) SGML capabilities. Contact: Frame Technology Corporation
Grif - create/edit/validate SGML DTDs and documents (quasi-WYSIWYG), inc. graphics. Contact: Grif S.A.
Guide - Guide hypertext from SGML documents. Contact: Office Workstations Limited (OWL), or InfoAccess Inc. (in US)
HyMinder - HyTime engine (in production). Contact: TechnoTeacher, Inc.
Interleaf 5 - SGML editor built on Interleaf's DTP software. Requires "Integrator's Toolkit" (?) to add user defined DTDs, and can work with Interleaf's RDM and WorldView products. Contact: Interleaf
LECTOR - SGML text database software. Contact: Open Text Corporation
MacroTag - supports macros for using SGML with MS-Word 4.0 or WordPerfect 5.0. Contact: Allen Creek Software.
Mark-It - validates SGML DTDs and documents. Contact: Sema Group Systems Limited
MarkMinder - an OEM software product: a library of C++ objects offering an object-oriented API to SGML documents and the constructs they contain. Contact: TechnoTeacher Inc.
PAT - SGML text database software. Contact: Open Text Corporation
PATHWAYS Interactive Electronic Publishing (IEP) - hypertext software system for publishing/viewing interactive SGML source files. Contact: Westinghouse Electric Corporation
PATMOTIF - SGML text database software. Contact: Open Text Corporation
SGML/CALS Translator - conversion/auto-tagging. Contact: Shafftstall Corporation.
SGML-DB - an SGML-aware database system. Contact: A.I.S/Berger Levrault
SGML Hammer - conversion/auto-tagging for SGML documents. Contact: Avalanche Development Company
SGML Publisher - create/edit/validate SGML DTDs and documents (quasi-WYSIWYG), inc. graphics.
    Contact: ArborText Inc.
SGML/Search - an SGML-aware database system (being replaced by SGML-DB).
    Contact: A.I.S/Berger Levrault
SGML Smart Editor - supports research and editorial functions for any structured database using the SGML standard.
    Contact: Auto-Graphics Inc.
SGML TextWrite - create/edit SGML documents.
    Contact: IBM
SGML TextWrite Tools - create/edit SGML DTDs.
    Contact: IBM
SGML Translator - validates SGML documents, translates SGML documents to DCF, also BookMaster, BookManager.
    Contact: IBM
SGML Translator - conversion/auto-tagging.
    Contact: Shaffstall Corporation.
SGML Toolchest - a set of tools to aid the production of SGML documents on DEC/VAX machines.
    Contact: DEC
Silversmith - full-text retrieval system for SGML documents.
    Contact: Taunton Engineering.
TABLETAG - conversion/auto-tagging for Lotus 1-2-3 spreadsheets to CALS, AAP and Author/Editor tables.
    Contact: The Unifilt Company.
TagWorX - conversion/auto-tagging of scanned documents (within ScanWorX).
    Contact: Xerox Imaging Systems.
TagWrite - conversion/auto-tagging of Microsoft RTF, WordPerfect and ASCII files. Runs under Windows 3.x.
    Contact: ZANDAR Corporation
WordPerfect IntelliTag - converts between UNIX/DOS WordPerfect 5.1 and SGML documents.
    Contact: WordPerfect Corporation [UK support: try Graphnet Computers Ltd]
Write-It - create/edit SGML documents.
    Contact: Sema Group Systems Limited
WriterStation - create/edit SGML documents.
    Contact: Datalogics Inc.
XGML Omnimark - conversion/auto-tagging.
    Contact: Exoterica Corporation
XGML Validator - validates SGML DTDs and documents.
    Contact: Exoterica Corporation

07.2) Contact list

A.I.S/Berger Levrault - 34 Av. du Roule, 92200 Neuilly, France; Phone: +33-1-46-40-10-60; Fax: +33-1-46-40-18-44.
ArborText Inc. - 533 West William Street, Suite 300, Ann Arbor, MI 48103, USA; Phone: +1-313-996-3566; Fax: +1-313-996-3573; Email: sales@arbortext.com (Internet)
Agfa Gevaert - (Your local Agfa Gevaert office/supplier).
Allen Creek Software - Carol Kamm, 1209 West Huron, Ann Arbor, MI 48103, USA; Phone: +1-313-663-4248.
Auto-Graphics Inc. - Char Garon. Phone: +1-800-776-6939.
Avalanche Development Company - Eileen Quirk, Director of Marketing and Sales, Avalanche Development Company, 947 Walnut Street, Boulder, CO 80302, USA; Phone: +1-303-449-5032; Fax: +1-303-449-3246; Email: sales@avalanche.com (Internet)
Datalogics Inc. - 441 West Huron Street, Chicago, Illinois 60610, USA; Phone: +1-312-266-4444.
DEC - (Your local Digital Equipment Corporation (DEC) supplier)
E2S - Ronny Verkest, Sales Manager, E2S, Moutstraat 100, B-9000 Gent, Belgium; Phone: +32(91)-21-03-83; Fax: +32(91)-20-31-91; Email: e2s@e2s.be (Internet)
Electronic Book Technologies - One Richmond Square, Providence, RI 02906, USA; Phone: +1-401-421-9550; Fax: +1-401-421-9551
Exoterica Corporation - 1545 Carling Ave., Suite 404, Ottawa, Ontario, Canada K1Z 8P9; Tel: 613-722-1700 (or 1-800-565-XGML for product information); Fax: 613-722-5706; Email: info@xgml.com (enquiries) (Internet)
Frame Technology Corporation - 2911 Zanker Road, San Jose, CA 95134, USA; Phone: +1-408-433-3311
Grif S.A. - 2 Bd Vauban BP 266, 78053 St-Quentin-en-Yvelines, Cedex, France; Phone: +33-1-30-12-14-30; Fax: +33-1-30-64-06-46
Graphnet Computers Ltd. - Wessex House, 113 Fore Street, North Petherton, Somerset TA6 6SA; Phone: 0278 663680; Fax: 0278 663755
IBM - (Your local IBM office)
InfoAccess Inc. - Alister Gibson, Vice President of Marketing, 2800 156th Avenue SE, Bellevue, WA 98007, USA; Phone: (206) 747-3203; Fax: (206) 641-9367
Information Dimensions Inc. - 5080 Tuttle Crossing Boulevard, Dublin, Ohio 43017-3569, USA; Phone: 1-800-DATA-MGT; Fax: 614-761-7290
Interleaf - (Your nearest Interleaf supplier) Or phone: 617.290-0710
Office Workstations Limited (OWL) - Rosebank House, 144 Broughton Road, Edinburgh EH7 4LE, UK; Phone: +44-31-557-5720; Fax: +44-31-557-5721.
Open Text Corporation - Michael F Farrell, Executive Vice President, 180 King Street South, Suite 550, Waterloo, Ontario N2L 1P8, CANADA; Phone: (519) 571-7111; Fax: (519) 571-9092
Sema Group Systems Limited - Sema Group Systems Limited, Avonbridge House, Bath Road, Chippenham, Wiltshire SN15 2BB, UK; Phone: +44-249-656194; Fax: +44-249-655723
Shaffstall Corporation - Anthony L. Shaffstall, VP Sales, 7901 East 88th Street, Indianapolis, IN 46256-1235, USA; Phone: +1-317-842-2077
SoftQuad Inc. - 56 Aberfoyle Crescent, Suite 810, Toronto, CANADA, M8X 2W4; Phone: +1-416-239-4801; Fax: +1-416-239-7105; Email: mail@sq.com (Internet)
Taunton Engineering - John Bottoms, Taunton Engineering Inc., 26 Westvale Road, Concord, MA 01742-2935, USA.
TechnoTeacher, Inc. - Steve Newcomb, TechnoTeacher, Inc., 1810 High Road, Tallahassee, FL 32303-4408, USA; Phone: +1-904-422-3574; Fax: +1-904-386-2562.
The Unifilt Company - Michael Kless (President), PO Box 2528, Edison, NJ 08817, USA. Phone: +1-908-225-2243; Fax: +1-908-225-2248.
Westinghouse Electric Corporation - Larry Marchese, Manager Marketing & Sales, PO Box 746, Baltimore, MD 21298-9451, USA. Phone: +1-800-742-4802; Fax: +1-410-993-2214
WordPerfect - (Your local/national WordPerfect Corporation office)
Xerox Corporation - Publishing Marketing Manager, 10200 Willow Creek Road, San Diego, CA 92131, USA. Phone: +1-619-695-7789; Fax: +1-619-695-7710.
Xerox Imaging Systems - 9 Centennial Drive, Peabody, MA 01960, USA. Phone: +1-508-977-2000; Fax: +1-508-977-2148. (European office): Unit 8, Suttons Business Park, Reading, RG6 1AZ, UK. Phone: +44-734-668421; Fax: +44-734-261913.
ZANDAR Corporation - RR2 Box 962 (Hanley Lane), P.O. Box 467, Jericho, Vermont 05465-0327, USA. Phone: +1-802-899-1058

08) ftp archives
================

Many ftp archives now hold information on SGML and copies of public domain software, DTDs etc. (see Robin Cover's on-line Bibliography, or use ARCHIE to find your nearest ftp site).
The main archives are:

ftp.ifi.uio.no [129.240.88.1] - University of Oslo, Norway
mailer.cc.fsu.edu [128.186.6.103] - Florida State University, USA
sgml1.ex.ac.uk [144.173.6.61] - Exeter University, UK

The ftp archive at Oslo is very well maintained. Some national archives mirror the contents of one or more of those mentioned above.

Please try to be considerate when using ftp - it is a privilege not a right. The setting up and maintenance of most archives is done on a voluntary basis, using resources that are loaned by the site administrators.

09) The SGML Users' Group, National Chapters, SIGs
==================================================

The International SGML Users' Group was set up to promote the use of SGML and represent the interests of SGML users on various international bodies. Membership of the International SGML Users' Group (or an affiliated National Chapter or SIG) entitles you to receive the Users' Group Newsletter and Bulletin, and discounts on various books and conferences. For membership details, contact:

Mr Stephen G Downie
SoftQuad Inc.
56 Aberfoyle Crescent
Suite 810
Toronto, Ontario M8X 2W4
Canada
Phone: +1-416-239-4801
Fax: +1-416-239-7105

Activities and costs of joining a National Chapter or SIG vary greatly. Please contact the appropriate person (see below) for more details. Other CALS SIGs may exist, but I do not have any information about them; I will list them if and when details are supplied.

09.1 National Chapters are listed below by country (with contact names):

(A large debt is owed to Brian Travis, the Editor of , for much of the information listed here. Any errors are probably mine.)
Australia - Nick Carr, PO Box R806, Sydney NSW, Australia 2000; Phone: 612-262-4777; Fax: 612-262-4774
Belgium (and Luxembourg) - Bart Bauwens, Katholieke Universiteit Leuven, Departement Elektrotechniek, Afd.TEO, Kardinaal Mercierlaan 94, B-3000 Leuven; Phone: +32 16 22 09 31 (1119); Fax: +32 16 22 18 55; Email: Bart.Bauwens@esat.kuleuven.ac.be
Canada - Dr Martin Levy (Chairman), Senior Director Regulatory Affairs, Vice President Scientific Affairs, Fujisawa Pharmaceutical Company, 7181 Woodbine Avenue, Suite 110, Markham, Ontario L3R 1A3 Canada; Phone: +1-416-470-7990; Fax: +1-416-470-7799.
France - Jean-Francois Legendre, AFNOR-STIA; Phone: +331-4291-5950
Germany - Dr Manfred Kruger, MID/Information Logistics Group GmbH, Ringstrasse 15, 6900 Heidelberg, West Germany; Phone: +49-6221-166-091; Fax: +49-6221-23921.
Japan - Mr Makoto Yoshioka, Research Fellow, Personal Systems Division, Fujitsu Laboratories Ltd, 1015, Kamikodanaka Nakahara-Ku, Kawasaki 211, Japan; Phone: +81-44-754-2690; Fax: +81-44-754-2594.
Luxembourg - (See Belgium)
Netherlands - Mr Jan Maasdam, Samsom Uitgeverij, Postbus 4, 2400 MA Alphen aan de Rijn, The Netherlands; Phone: +31-1720-66-612.
New Zealand - (Just setting up.) Lisa Owen, Project Manager, Butterworths, NZ, Ltd., PO Box 472, Wellington, New Zealand; Phone: 644-802-7116; Fax: 644-385-1598; Email: attmail!bwthsnz!BONZL!wgpo!OWENL
Norway - Mr Jon Urdal, Fabritius A/S, Brobekkvn. 80, 0583 Oslo 5, Norway. Phone: +47-2-636400; Fax: +47-2-636590; Email: ju@gi.no (Internet)
South East Asia - (See Australia)
Sweden - Ulf Larson, R&D Manager, Elanders Tryckeri, PO Box 10404, S-43424 Kungsbacka, Sweden; Phone: +46-300-50-000
Switzerland - Mr Jurgen De Jonghe, AS Division, CERN, Geneva 23, Switzerland; Phone: +41-22-767-81-41; Fax: +41-22-782-47-20.
UK - Mr Nigel Bray, Database Publishing Systems Ltd., 608 Delta Business Park, Great Western Way, Swindon, Wiltshire SN5 7XF, UK; Phone: +44-793-512-515; Fax: +44-793-512-516.
USA (Arizona) - Lucius Lockwood, Motorola Inc.; Phone: +1-602-441-2805; Email: lucius_lockwood@com.mot.email (Internet)
USA (Northern California) - Dennis Arnon; Phone: (415) 812-4425; Fax: (415) 752-1827; Email: arnon.parc@com.xerox
USA (Colorado/Denver) - aka "Rocky Mountain SGML Entity". Linda Turner, Avalanche Development Company, 947 Walnut Street, Boulder, CO 80302, USA; Phone: +1-303-449-5032
USA (Midwest/Mid-Atlantic) - Ms Beth Micksch, Datalogics Inc., 441 West Huron, Chicago, IL 60611, USA; Phone: +1-312-266-3131; Fax: +1-312-266-4473; Email: bem@dlogics.com (Internet) (93/04/29: These details may no longer be valid. See entry for USA Southeastern.)
USA (New York) - Mr W Joseph Davidson, SGML Forum of New York, Bowling Green Station, P.O. Box 803, New York, NY 10274-0803, USA; Phone: +1-212-691-4463; Fax: +1-212-691-1821.
USA (North Carolina) - (Just setting up.) Contact: Eliot Kimber, Dept E14/B500, Network Programs Information Development, IBM Corporation, Research Triangle Park, NC 27709, USA; Phone: +1-919-254-5160; Email: drmacro@ralvm13.vnet.ibm.com (Internet).
USA (Pacific Northwest) - Mark Cates, Arthur Andersen & Co.; Phone: +1-206-233-8343
USA (Southern California) - Roger Watson, OutSource Technical Documentation Services, 8929 S. Sepulveda Blvd. Suite 210, Los Angeles, CA 90045-3616, USA; Phone: +1-310-337-0309; Fax: +1-310-337-0423; Email: rjw@outsource.com (Internet).
USA (Southeastern) - Beth Micksch, Intergraph Corporation, MS GD3005, Huntsville, AL 35894-0001, USA; Phone: 205-730-3683; Fax: 205-730-3301; Email: bmicksch@micksch.b30.ingr.com
USA (Washington DC) - Tommie Usdin, ATLIS Consulting Group, 6011 Executive Blvd., Rockville, MD 20852, USA; Phone: +1-301-770-3000

09.2 Special Interest Groups (SIGs):

ATA SIG - Ms Dianne Kennedy, Datalogics Inc., 441 West Huron, Chicago, IL 60611, USA; Phone: +1-312-266-4483; Fax: +1-312-266-4473; Email: dkv@dlogics.com (Internet)
CALS in Europe SIG - David Ardron, Secretary, CALS in Europe SIG, Ferranti Computer Systems Ltd., Western Road, Bracknell, Berkshire RG12 1RA, UK; Phone: +44-344-483232; Fax: +44-344-54639
Database SIG - Mr Hans Mabelis, c/o Matrices Software, Westeinde 14, 1017 ZP Amsterdam, The Netherlands; Phone: +31-20-25-50-06; Fax: +31-20-24-79-48.
European Workgroup on SGML (EWS) - Mr Holger Wendt, Springer-Verlag GmbH & Co. KG, Postfach 105280, Tiergartenstrasse 17, 6900 Heidelberg 1, Germany; Phone: +49-6221-487-324; Fax: +49-6221-43982.
SGML SIGhyper (The SGML Users' Group SIG on Hypertext and Multimedia) - Mr Steven R Newcomb, TechnoTeacher, Inc., 1810 High Road, Tallahassee, FL 32303-4408, USA; Phone: +1-904-422-3574; Fax: +1-904-386-2562.

09.3 Standards bodies:

NB. **NONE** of the International Standards relevant to SGML, HyTime, DSSSL etc. are available electronically for ftp (or similar). The texts of these Standards are copyrighted, and you will need to order a copy through your own National Standards body.

American National Standards Institute (ANSI) - 1430 Broadway, New York, NY 10018, USA. Phone: +1-212-642-4995.
British Standards Institution (BSI) - Linford Wood, Milton Keynes, MK14, UK.
International Organization for Standardization (ISO) - 1 Rue de Varembe, Case Postale 56, CH-1211 Geneva 20, Switzerland.
10) Conferences
===============

The major SGML-related conferences are organized by the Graphic Communications Association (GCA), at reduced rates for GCA members. Anyone can attend, and since SGML`92 the GCA have introduced a discount rate for representatives from academic institutions. To contact the GCA:

Graphic Communications Association
100 Daingerfield Road, 4th Fl.
Alexandria VA 22314-2888
Phone: (703)-519-8157
Fax: (703)-548-2867

"SGML Europe" - Held annually since 1982 in Europe. General/all SGML (and closely related issues). Recently some sessions have been repeats of the preceding "SGML" conference in the USA. Next: 16 - 19 May 1994, Montreux, Switzerland.

"SGML" - Held annually in USA. General/all/specialist SGML (and closely related issues). Often slightly more technical and hectic than "SGML Europe". Next: 7 - 10 November 1994, Vienna, Virginia, USA.

11) SGML Initiatives and major projects
=======================================

This section contains details on:

AAP / EPSIG
EWS
TEI
Davenport Group

11.1) AAP / EPSIG - Association of American Publishers, Electronic Publishing Special Interest Group. Responsible for producing, maintaining and updating ANSI/NISO Z39.59-1988 (also known as the "AAP Standard"). The AAP standard consists of a set of DTDs relating to the preparation and markup of electronic manuscripts. The standard is currently undergoing review, and specialist workgroups of interested volunteers are working on DTDs for handling tables and mathematics (using two electronic discussion lists - see the appropriate section, above). For information, contact:

Betsy Kiser (EPSIG Manager)
EPSIG
c/o OCLC
6565 Frantz Road
Dublin
Ohio 43017-0702
USA
Phone: (614)-764-6195
Fax: (614)-764-6096

This information may no longer apply; you may have more luck with: GCARI (GCA Research Institute), PO Box 2888, Alexandria, VA 22314-2888, USA; Phone: +1 (703) 519 8184.
Alternatively, try: McAffee & McAdam Ltd, SGML Architects and Consultants, 1220 Churchville Road, Bel Air, Maryland 21014, USA; Phone: +1 (410) 893 1340; Fax: +1 (410) 838 1913 [this company offers training courses etc. tailored to the AAP/ISO 12083 DTDs]

11.2) EWS - European Workgroup on SGML. A collection of major European publishers, typesetters, printers (and other interested parties), working toward producing a DTD (or set of DTDs) suitable for publishing scientific journals, papers etc. Initially based on, and still closely watching, work on the AAP Standard. EWS DTD(s) currently known as the "MAJOUR DTD" - a complete version of which is due out at the end of 1992 (a copy of the header part of the DTD and accompanying handbook was distributed at "International Markup `91"). For information, contact: (See entry for EWS above - in section dealing with Special Interest Groups).

11.3) TEI - The Text Encoding Initiative. An international research project to develop and disseminate guidelines for the encoding and interchange of machine-readable texts. Primarily concerned with taking existing texts, and marking them up with SGML (so as to facilitate later study). The TEI has several specialist committees and workgroups e.g. General Linguistics, Spoken Texts, Historical Studies, Machine Readable Dictionaries, Computational Lexica, Terminological Databases, Character Sets, Text Criticism, Hypertext and Hypermedia, Mathematical Formulae and Tables, Language Corpora, Verse, Performance Texts, Literary Prose. The TEI maintains two electronic discussion lists (tei-l and sgml-l), two archives of related documentation (reports, DTDs, and entity sets), and publishes widely. For more information, contact one of the co-ordinators:

C.
Michael Sperberg-McQueen
University of Illinois at Chicago
Computer Center (M/C 135)
Box 6998
Chicago IL 60680
USA
Phone: +1-312-996-2981
Fax: +1-312-996-6834
Email: U35395@uicvm.cc.uic.edu (Internet) U35395@UICVM (Bitnet)

Lou Burnard
Oxford University Computing Service
13 Banbury Road
Oxford OX2 6NN
UK
Phone: +44-865-273-238
Fax: +44-865-273-275
Email: LOU@VAX.OX.AC.UK (Internet)

11.4) Davenport Group - [the following is taken from a posting to the newsgroup comp.text.sgml, by Steven R. Newcomb on 24 Aug 1993]

The ``Davenport Group'' is a group of Unix system vendors and vendors of related hardware, software, and services whose purposes include the creation of SGML-based conventions for representing online software documentation. Some public participation is permitted. For more information, please contact Dale Dougherty, c/o O'Reilly & Associates, Inc., 103A Morris Street, Sebastopol, California 95472 USA (tel: +1 707 829 0515 or +1 800 998 9938 (home office: +1 707 829 3762 or +1 707 829 3762), fax: +1 707 829 0104, Internet: ) or Ralph Ferris, c/o Fujitsu Open Systems Solutions, Inc., 6121 Hollis Street, Emeryville, California 94608-2092 USA (tel: +1 510 652 6200 x421, fax +1 510 652 5532, Internet: ).

11.5) SGML Open - an industry consortium to promote SGML and interoperability. First announced at TechDoc 93. Still in the process of defining its goals/roles/tasks. See the March 1993 edition of for more information, or the postings to comp.text.sgml. At this stage, for more information try contacting any of the following (full addresses given in section 07):

Yuri Rubinsky, President, SoftQuad; Phone: +1-416-239-4801
Larry Bohn, Senior vice president, Interleaf; Phone: +1-617-290-0710
Haviland Wright, President, Avalanche Development Company; Phone: +1-303-449-5032

11.6) CALS - Continuous Acquisition Life-cycle Support (formerly Computer-aided Acquisition and Logistic Support).
Originating in the US Department of Defense (DoD), this strategy is being taken up by several allied nations. A set of military specifications exists (and apparently undergoes frequent revision). The following information was posted to the comp.text.sgml newsgroup by Matt L. Voisard on 23 April 1993:

To those who may need them, CALS standards are available through a number of organizing bodies here in the US. The National Technical Information Service (NTIS) is the maintainer of the CALS BBS. They can be reached at:

NATIONAL TECHNICAL INFORMATION SERVICE (NTIS)
5285 PORT ROYAL ROAD
SPRINGFIELD, VA 22161-2171 (UNITED STATES)
VOICE: (703) 487-4650
BBS 1200 baud (703) 321-8020
9600 baud (703) 321-8970

***** NOTE -- I'm not aware of an Internet, etc. connection as of this writing. I'm under the impression that one may be available in the future however.

For those people who are DoD contractors, CALS standards are available through the Defense Printing Service (DPS).

DEFENSE PRINTING SERVICE (DPS)
STANDARDIZATION DOCUMENTS ORDER DESK
BUILDING 4D, 700 ROBBINS AVENUE
PHILADELPHIA PA 19111-5094 USA
VOICE: (215) 697-3321 (USA)

You must have an authorized charge number for "free" copies of CALS and other specifications and standards. As always, the Air Force CALS Program Office can provide official guidance to those in the DoD and its contractors on any matter concerning CALS. To reach the AF CALS PMO, contact

AIR FORCE CALS PMO
c/o Steve Holloway
4027 Col. Glenn Highway, Suite 200
Dayton OH 45431-1601 USA
VOICE: (513) 427-2295 ext. 223
DSN 787-3085 ext. 223
INTERNET: shollowa@mmdis01.hq.aflc.af.mil

11.7) Rainbow

"Rainbow is an *idea* and a publicly available DTD that implements that idea.", first mooted by Dave Sklar of EBT at SGML`93 in Boston. The Rainbow DTD provides an intermediate step in the process of converting from a proprietary (word-processing) format into an SGML-tagged file which conforms to a user-specified DTD.
"Rainbow Makers" are pieces of software which map a file's content and (proprietary) formatting information into a file which conforms to the Rainbow DTD (e.g. where attributes are used to record font information etc.). A file which conforms to the Rainbow DTD can then be transformed into a user-specified target DTD. Transforming files from one DTD into another DTD is considered to be more robust than going from non-SGML directly into SGML. Thus, if someone develops an RTF-to-Rainbow Rainbow Maker, and someone else produces a Rainbow-DTD-to-DOCBOOK-DTD transformer, it should be possible to automate much of the process of converting RTF files into SGML files which conform to the DOCBOOK DTD.

The availability of the Rainbow DTD, Rainbow Makers etc. should help to stop people continually re-inventing the wheel, and may improve the uptake of SGML. The Rainbow DTD, Rainbow Makers and any related software are available from various ftp sites, and are made available and maintained on a voluntary basis. The main ftp site is run by Dave Sklar:

ftp.ebt.com [192.111.115.3]:/pub/outgoing/rainbow

12) SGML and other Standards
============================

******************************************************************
SEE ALSO - "SGML and RELATED ISO STANDARDS" Compiled by Heather
Davenport and Edited by Erik Naggum (a copy is available on this
archive, in the directory "standards").
******************************************************************

NB. **NONE** of the International Standards relevant to SGML, HyTime, DSSSL etc. are available electronically for ftp (or similar). The texts of these Standards are copyrighted, and you will need to order a copy through your own National Standards body.

Some or all of the Standards mentioned below are described in much more depth in Robin Cover's on-line Bibliography (users are advised to consult this). Copies of International Standards are available through your national standards body. Standards are listed below by acronym/title.
ASCII - (see Character Sets)

Character Sets - Being non-proprietary and device-independent, SGML does not restrict users to a particular character set. This is a complex area of SGML, and readers are directed to ISO 8879, and the ftp archive of postings to comp.text.sgml on this subject (at Oslo) for further information.

DSSSL - Document Style Semantics and Specification Language (ISO/IEC DIS 10179:1990). The introduction to the Standard says DSSSL can be used "..for the specification of document processing such as formatting and data management functions, with the initial focus on formatting to both print and on display media, and data conversion......The objective of the DSSSL Standard is to provide a formal and rigorous means of expressing the range of document production specifications, including high-quality typography, required by the graphics arts industry." DSSSL is not yet a full International Standard.

EBCDIC - (see Character Sets)

Graphics - (see Proprietary/defacto standards)

HTML - the "Hypertext markup language". A DTD developed for use with the World Wide Web. Work is ongoing to develop translators into HTML (e.g. LaTeX -> HTML). For more information see the newsgroup comp.infosystems.www (and the associated FAQ), or download copies of the relevant documents from one of the major WWW sites (e.g. download info.cern.ch/hypertext/WWW/MarkUp/HTML.html). HTML can be used to represent such things as: hypertext news/mail/online documentation/collaborative hypermedia, menus of options, database query results, simple structured documents with inline graphics, and hypertext views of existing bodies of information.

HyTime - Hypermedia/Time-based Structuring Language (ISO/IEC 10744) (Extract from Robin Cover's Bibliography:) "HyTime is a standard neutral markup language for representing hypertext, multimedia, hypermedia and time- and space-based documents in terms of their logical structure.
Its purpose is to make hyperdocuments interoperable and maintainable over the long term. HyTime can be used to represent documents containing any combination of digital notations. HyTime is parsable as Standard Generalized Markup Language..." HyTime was accepted as a full International Standard in spring 1992.

ODA - Open (or "Office") Document Architecture (ISO 8613). This is mentioned because ODA is often presented as if it and SGML were in opposition to one another. This is not the case. ODA is a complex standard, and is undergoing a thorough review; contact your national standards body for more information. Papers have been published that compare and contrast ODA and SGML (see Robin Cover's on-line Bibliography for references). Note: some readers object to postings on ODA being sent to comp.text.sgml (try comp.text instead).

Proprietary/defacto standards - There is a common misconception that SGML exists in competition to some major existing defacto and proprietary standards (such as PostScript or TeX/LaTeX). This is not the case, and as you find out more about SGML this should become self-evident (however, see SPDL). Note that SGML enables the inclusion of data marked up with something other than SGML (e.g. TeX, Encapsulated PostScript, a Lotus spreadsheet, CCITT/4, CGM, TIFF etc.)

SDIF - SGML Document Interchange Format (ISO 9069:1988) "A standard describing the interchange for documents enclosed with SGML" (Eric Van Herwijnen, "Practical SGML")

SGML-B - (SGML Binary ?) A standard for describing a compiled form of SGML (?). James David Mason (Convenor, ISO/IEC JTC1/SC18/WG8), posted the following message to comp.text.sgml (15 Aug 1992): "The official status of SGML-B is that it is an approved work item in ISO/IEC JTC1/SC18/WG8, the group responsible for SGML itself. The editors are Dr. Charles Goldfarb, the SGML project leader, and Dr. David Abrahamson, of Trinity College, Dublin.
The project is being maintained as officially active, with the provision that it will not be progressed until the current review and potential revision of SGML itself is further along. Our intention is to make SGML-B reflect whatever revisions we decide to incorporate into the base standard and then to make it a part of the revised standard rather than something independent."

SMDL - Standard Music Description Language (ISO/IEC CD 10743) (Extract from Robin Cover's Bibliography:) "..SMDL 'defines a language for the representation of music information, either alone, or in conjunction with text, graphics, or other information needed for publishing or business purposes.' Multimedia time sequence information is supported. SMDL is a HyTime application...." SMDL came before, and was the source of inspiration for, HyTime. Not to be confused with SDML ("Standard Digital Markup Language"?) which is a proprietary standard.

SPDL - Standard Page Description Language (ISO/IEC DIS 10190:1991). A Standard for mapping to (and possibly from) a description language for output devices. Thus an SGML document might go through DSSSL- and SPDL-conforming processes before being output on a printer. SPDL might be seen to be competing with defacto standards such as PostScript.

World Wide Web (WWW or W3) - an initiative to link information from various sources around the world (cf. gopher, WAIS etc.). All WWW readers must be capable of handling texts marked up with HTML, and the more sophisticated browsers provide good facilities for presenting such texts on screen. Supports multimedia (still and moving images, and sound). Also supports hypertext/hypermedia linking to and from buttons, "hot spots" etc. See the newsgroup comp.infosystems.www for more information.

13) Introductory questions with answers - by Erik Naggum
========================================================
INTRODUCTORY QUESTIONS with answers. What is SGML, briefly? SGML is an abbreviation for the "Standard Generalized Markup Language". SGML is defined in an International Standard published by the International Organization for Standardization (ISO), with reference number ISO 8879:1986, bearing the full name "Information processing -- Text and office systems -- Standard Generalized Markup Language (SGML)". To most people, _markup_ means an increase in the price of an article. Although we talk about increases in value, it's not the same thing. "Markup" is a term coming from the publishing and printing business, where it means the instructions for the typesetter that were written on a typescript or manuscript copy by an editor. Today, with your favorite editor, you can enter the markup yourself, or even have it entered for you, in terms of codes or other instructions for an electronic typesetting program, which in simple cases is also the editor. An example is troff's ".ce" for "center the following line". A _markup_language_ is a set of means (constructs) to express how text (i.e., that which is not markup) should be processed, or handled in other ways. Unlike most other artificial languages, markup languages have to deal with embedded data, and contain rules for what is markup and what is data. For instance, in TeX the backslash means that subsequent input is TeX instructions. Most markup languages offer additional, administrative, language constructs, with which to define other language constructs (such as macros). _Generalized_markup_ is markup that has the curious property that it does _not_ specify how things should look. We still call it markup, though, because of the similarity with markup as described above. For instance, "" and "" are used in this FAQ to denote Question and Answer, respectively. This doesn't say anything about how questions should look in a typeset edition of this FAQ. You could have all the questions rendered in bold-face, for instance. 
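To make the point about generalized markup concrete, here is a minimal sketch in Python. It is not an SGML parser and is not taken from any SGML tool. The tag names <q> and <a>, the sample text and both style tables are assumptions invented for this illustration. The same tagged source is rendered in two different ways, and the source itself never says how it should look.

```python
import re

# Hypothetical generalized markup: the tags say what the text is,
# not how it should look. Tag names are invented for this sketch.
SOURCE = "<q>What is SGML?</q><a>A markup meta-language.</a>"

# Two independent style tables (assumed): each maps a tag name to the
# (prefix, suffix) pair used when rendering an element of that kind.
PLAIN_STYLE = {"q": ("Q: ", "\n"), "a": ("A: ", "\n")}
LOUD_STYLE = {"q": ("*** ", " ***\n"), "a": ("    ", "\n")}

# Matches a simple, non-nested <name>text</name> element.
ELEMENT = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def render(source, style):
    """Render every <tag>text</tag> element using the given style table."""
    parts = []
    for match in ELEMENT.finditer(source):
        tag, text = match.group(1), match.group(2)
        prefix, suffix = style[tag]
        parts.append(prefix + text + suffix)
    return "".join(parts)


if __name__ == "__main__":
    print(render(SOURCE, PLAIN_STYLE), end="")
    print(render(SOURCE, LOUD_STYLE), end="")
```

Changing the appearance of every question means editing one entry in a style table; the marked-up text is left alone.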
With generalized markup, you tell the system _what_ you have, rather than how it should look, and you do so by putting a label (tag) around the text. There is a clear correlation between tags and what things look like. Tags are placed at the start and at the end of text of a certain kind, and these are precisely the places where typographic features are used, such as spacing, change of typeface, etc. An example is LaTeX, which, through macros, lets you talk about itemized lists instead of indents and item numbering, among other things.

The _Standard_Generalized_Markup_Language_ started out as GML, the Generalized Markup Language, created by Charles Goldfarb, Edward Mosher and Raymond Lorie (G, M, and L, respectively) in 1969 at IBM. GML became the basis for the Standard through work in ANSI and with aid from a project predating GML, GenCode, which attempted to standardize names of commonly used elements. Rather than take this (impossible) approach, SGML is a language which makes it possible to roll your own generalized markup, but with a standard form and in standard ways. (Historic note: The origin of SGML was confused with that of GenCode in the 1991-12-15 edition of this FAQ.)

In practice, you won't exactly roll your own, any more than you design LaTeX packages on your own. Although some people actually do that!

Central to the design of SGML is the idea that a set of generic identifiers (the names of the tags), together with their interrelationships, form a type (or class) of documents, and that every document is an instance of a class, which means it can be validated with respect to this class.

Can I read more about SGML somewhere?

Let me suggest only one book, and then a bibliography. The book is Charles F. Goldfarb: The SGML Handbook; Oxford University Press, 1990; ISBN 0-19-853737-9.
This book includes the text of the standard, so you don't have to worry about finding out how to order it from your ISO national member body or directly from ISO in Geneva, or wherever. The main feature of this book is that Charles Goldfarb, who is the project editor for the standard in ISO's SGML committee, has added a tremendous amount of annotations and has provided links between parts of the standard to guide your yearning for knowledge. Another big win is the overview, which takes you through a guided tour of concepts and facilities. If there be only one authority on SGML, this book is it. A "paper hypertext" feature makes the links in the text easy to follow. This is a book you need. The bibliography is Robin Cover's Brief Bibliography, also to be published on this newsgroup, and it covers the essentials, as well as enough pointers to other works to fill a wall of literature. Robin Cover, et alia, produced the huge, 312-page "Bibliography on SGML" (Tech Report 91-299, Queen's University, Kingston, Ontario, Canada), an incredibly useful work. Robin Cover continues to track the SGML arena, and hopefully, he will continue to provide us with the fruits of his work. SGML is often mentioned as being a "meta-language". What is that? This refers to the fact that SGML isn't only one language, but a language which describes other languages within its framework. As we talked about classes of documents and every document being an instance of such a class, we talk about a class of markup languages, and every markup language being an instance of the class. SGML also has the necessary expressive power to redefine the particular characters that are to be considered markup in a particular markup language, so that SGML is really a meta-language with an abstract syntax that each SGML document fills in to get a concrete syntax and a particular markup language for that document. This is the administrative information that makes it possible to talk about "conformance" to SGML. 
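The idea that the markup characters themselves are a parameter can be sketched as follows. This is only an illustration in Python, not how a conforming SGML parser works. It assumes just three delimiter roles (start-tag open, end-tag open and tag close, which ISO 8879 names STAGO, ETAGO and TAGC) and ignores everything else an SGML declaration can specify. The class name, function name and the bracket-based alternative syntax are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConcreteSyntax:
    """Three delimiter roles; the defaults mimic the reference concrete syntax."""
    stago: str = "<"   # start-tag open
    etago: str = "</"  # end-tag open
    tagc: str = ">"    # tag close


def tokenize(text, syntax):
    """Split text into ("start", name), ("end", name) and ("data", chars) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Test the end-tag opener first: in the default syntax "</" begins with "<".
        if text.startswith(syntax.etago, pos):
            kind, opener = "end", syntax.etago
        elif text.startswith(syntax.stago, pos):
            kind, opener = "start", syntax.stago
        else:
            # Data runs up to the next possible tag opener, or to the end of text.
            found = [i for i in (text.find(syntax.stago, pos),
                                 text.find(syntax.etago, pos)) if i != -1]
            end = min(found) if found else len(text)
            tokens.append(("data", text[pos:end]))
            pos = end
            continue
        close = text.find(syntax.tagc, pos + len(opener))
        if close == -1:
            raise ValueError("unterminated tag at offset %d" % pos)
        tokens.append((kind, text[pos + len(opener):close]))
        pos = close + len(syntax.tagc)
    return tokens


if __name__ == "__main__":
    reference = ConcreteSyntax()
    brackets = ConcreteSyntax(stago="[", etago="[/", tagc="]")
    print(tokenize("<p>Hi</p>", reference))
    print(tokenize("[p]a < b[/p]", brackets))
```

With the bracket syntax, the "<" in "a < b" is ordinary data, because that document has declared different characters to be markup. The abstract roles stay the same; only the characters filling them change.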
What does an SGML document look like?

An SGML document is divided into three different parts, each with a clearly defined function.

The first part specifies the character set of the document, which of these characters have special meaning to SGML in the rest of the document, and which advanced features are used. This is called the "SGML declaration", and is like a list of ingredients on food, so you know what to expect and what you can't eat. Using this as a checklist, you can determine whether your system can handle the document at hand. The SGML declaration looks like this:

    <!SGML "ISO 8879:1986" ... >

The second part is the prolog, which holds the document type declaration. It looks like this:

    <!DOCTYPE name [ ... ]>

(There can be several document types, and another construct called link type declarations (similar to DOCTYPE, but with LINKTYPE).)

The third part of an SGML document is the marked-up "real" document which all of the administrative information and legwork make possible. This is called the document instance. It usually begins with the name of the document type in angle brackets, like this:

    <name>

which is the syntax for a start-tag of an element. The corresponding end-tag looks like this:

    </name>

When your parser reads your document, it checks that the tags in the document belong to the document type, and that they are allowed where they're used, again according to the document type. This process is called "validation". When a document is validated, it does not need to be so again no matter what your parser is instructed to do with it, and no matter which application will use the data in the document. This is another strength of SGML: application-independent validation.

What do you mean "my parser"? Are there any freely available ones?

99% of the fun with SGML can be had only with a parser, so you do need one. (The remaining 1% comes from beholding the elegance and beauty of the language, and contemplating all the wondrous things you can do with it, once you have a parser. This feeling tends not to last, unless you're developing a parser, in which case it's almost all the fun.)
Fortunately, a competent programmer and SGML aficionado has had a lot of fun lately, and in mid-July 1991, the ARC SGML parser materials were released. The ARC SGML parser materials are legally unencumbered (i.e., you can do whatever you want with them) and are available for a nominal cost from the SGML Users' Group, as well as from several public SGML repositories.

Can I get the ARC SGML from somewhere electronically?

The University of Oslo, Department of Informatics, kindly sponsors a public FTP archive with material on SGML and has the ARC SGML parser available for anonymous FTP. Both the original MS-DOS distribution and a Unix port done by James Clark are available. This archive also holds information on some standards related to SGML, most notably an SGML application for hypermedia documents (the Hypermedia/Time-based structuring language, HyTime). Take a look around in the SGML and SIGhyper subdirectories.

(Anonymous FTP works like this: You need to be connected to the Internet, and need a program which can talk the FTP protocol, usually something with "FTP" in it. On Unix systems, you can say "ftp ftp.ifi.uio.no", and that should be it. You will be asked for a user name -- reply "anonymous". You will then be asked for a password -- reply with your Internet mail address. You're now logged in, and can use the "cd" command to switch directories ("cdup" to go one level up), and "ls" to look around. Use "get" to fetch files.)

If you need guidance, or can't use FTP, you may write to me, and I'll try to answer as fast as possible. There are also other FAQs available on how to FTP.

I've received an SGML document from a net.friend, what can I do with it?

Didn't your net.friend tell you?? Seriously, an SGML document is, as mentioned above, an instance of a document type, and a document type can be many things, and it's only part of an application of SGML.
Such an application consists of several parts: First, there's the document type definition, which says which elements you can have, and how they interrelate. Second, with the document type definition, there's a description of the semantics of the elements, so you know what they mean. The description is needed because SGML is not concerned with what things mean, only how they are represented. (You might complain that this is too small, but it's better to do a given task well than to do a greater task badly. There are other standards in the great SGML family which take care of these things, and more are coming as we witness increased adoption of SGML in the market.)

I'm writing a book, and my publisher wants me to submit an SGML document on a diskette, what do I do?

You take a look at one of the several SGML editing systems around, and see which you think you would like to write a whole book with. Recruit your publisher to help you understand what he wants, and try to play with SGML a little before you start writing. SGML is like, um, anyway, it gets better with experience, and can be frightening the first time. For a good list of starter tools, I again refer you to Robin Cover's brief bibliography for the details.
TECHNICAL QUESTIONS with answers

What, precisely, is an "element"?

An element is the smallest part of a document that SGML deals with, and it's the basic building block of document types. An element may contain data (text), subelements, both, or it may be empty. The task of a document type designer is to identify the elements a document is to consist of, and define a hierarchical structure of these elements by means of other elements.

An element definition consists of the name (generic identifier) which will be used in tags, a description of the content (using a "content model"), and an indication of whether the start-tags or the end-tags may be omitted. An element (in the document instance) is indicated by a start-tag, the contents, and an end-tag.

An element, with its notion of content models, provides a powerful abstraction over the different kinds of text that can be found in a document. For instance, ordinary text is just characters that will be formatted somehow on output. If you have special kinds of text, such as, for instance, a telephone number, it could make sense (depending on your application) to make a special element with generic identifier "phone". That way, you can look for telephone numbers and get matches only at the right places. If you're really far-sighted, you would define a telephone number notation you associate with this element, so that you could check that all your phone numbers had the right format. Then you could modify the presentation of a phone number to suit a particular need, e.g. +1 516 555 8879 in the document could come out as "(516) 555 8879" in a domestic catalog and with full, international format for an international catalog.

In a way, elements are like concepts, where a concept (say, "beef") is an abstraction over an innumerable lot of things into a particular "type" of thing, all having common characteristics, and fits into a hierarchy where concepts may be abstractions over other concepts.
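The telephone-number example above can be sketched in a few lines of Python. This is only an illustration, not part of SGML or of any particular formatter: the function name format_phone, the two style names, and the assumption that the content of a "phone" element is written as country code, area code and local number separated by spaces are all made up for the sketch.

```python
def format_phone(number, style):
    """Render the content of a hypothetical "phone" element.

    `number` is the text as stored in the document, assumed here to be
    "<country code> <area code> <local number...>", e.g. "+1 516 555 8879".
    `style` is "domestic" or "international" (names invented for this sketch).
    """
    country, area, *rest = number.split()
    if style == "international":
        # Keep the full form exactly as it was marked up.
        return " ".join([country, area] + rest)
    if style == "domestic":
        # Drop the country code and parenthesize the area code.
        return "(" + area + ") " + " ".join(rest)
    raise ValueError("unknown style: " + style)


stored = "+1 516 555 8879"
print(format_phone(stored, "domestic"))
print(format_phone(stored, "international"))
```

Either way, the document stores the number once and only the presentation differs, which is the kind of leverage that giving a piece of text its own element type provides.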
This idea of "types" and of a conceptual tool for text is one of the many great things with SGML.

A content model is like the definition of a concept, with the important difference that a content model is defined in terms of the behavior of its subelements. A subelement may be optional, required, or repeatable, and subelements may be chosen from a set, form an ordered set, or form an unordered set. Then there are exceptional subelements, which may either be forbidden or allowed anywhere in the contents of the element.

The similarity between elements and concepts goes further, as elements may have attributes. An attribute is information about an element which is not part of its content.

The element in SGML is thus a high abstraction over identifiable, separate portions of contents of a document from a conceptual and hierarchical view.

What is an "entity" in SGML?

The notion of an entity in SGML is an even higher abstraction than the element, and since this is somewhat unexpected to most readers of SGML, it's probably the reason why so many have problems with it.

The concept of an element comes from looking at the contents of a document and grasping that the contents form an element structure, a hierarchy of elements, and that the nature of each element can be abstracted so that a content model can be defined which spans the varied use of each subelement.

The concept of an entity comes from looking at the individual pieces of text that make up a whole document, and realizing that these pieces are independent of the element structure. E.g., a book may physically consist of several files on the author's disks. The element structure of the book spans all the disks and all the files, yet it's important to be able to refer to the files. The both complicating and relieving aspect of this is that we need to be able to refer to these pieces in a system- and storage-independent way. This is where the entity saves us a lot of trouble.

Entities are named pieces of text.
The abstraction that causes some confusion is over what a "piece of text" is, and, in particular, where it is found. We have looked at external entities, that is, entities which, when we refer to them, cause us to read a different file. We may also need to define shorthand notations for things in a document without needing an external file for every small piece of text.

This means that entities have types, as well. There are internal entities (entities that are useful as shorthands for language constructs, entities that are text which is not to be interpreted, etc.) and external entities (entities that are simply text; entities that are in a special notation, to be interpreted by a special program, perhaps with parameters; and entities which constitute larger parts of the administrative functions of the first and second part of the SGML document).

Moreover, entities may be used both by the administrative parts and the user, and the user shouldn't have to worry about which entities are used by the administrative functions he doesn't see. So, entities come in two flavors, parameter entities and general entities.

An "entity", then, is an abstraction over several types of text that you want to refer to by name. Once defined, you don't need to know where it is found, or of what kind it is -- all (general) entities look and feel the same to the user.
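To make the last point concrete, here is a minimal sketch in Python of how general entity references might be resolved. It is a toy, not how any real SGML parser works: the entity names (sgml, chap1), the file name chap1.sgm, the FAKE_FILES stand-in for the file system and the function expand are all assumptions made for illustration, and it handles only references of the form &name; with the terminating semicolon always present.

```python
import re

# Hypothetical entity table. In a real document the equivalent declarations
# would sit in the prolog, for example:
#   <!ENTITY sgml "Standard Generalized Markup Language">   (internal)
#   <!ENTITY chap1 SYSTEM "chap1.sgm">                      (external)
INTERNAL = {"sgml": "Standard Generalized Markup Language"}
EXTERNAL = {"chap1": "chap1.sgm"}  # entity name -> system identifier

# Stand-in for the file system, so the sketch runs without real files.
FAKE_FILES = {"chap1.sgm": "<chapter>All about &sgml;.</chapter>"}

# Simplified reference syntax: "&", a name, then ";".
REFERENCE = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")


def expand(text, depth=0):
    """Replace every general entity reference in `text` with its text."""
    if depth > 10:
        raise ValueError("entity references nested too deeply")

    def replace(match):
        name = match.group(1)
        if name in INTERNAL:
            replacement = INTERNAL[name]
        elif name in EXTERNAL:
            replacement = FAKE_FILES[EXTERNAL[name]]
        else:
            raise KeyError("undeclared entity: " + name)
        # The replacement text may itself contain references.
        return expand(replacement, depth + 1)

    return REFERENCE.sub(replace, text)


print(expand("<book>&chap1;</book>"))
```

The text that refers to &chap1; and &sgml; looks the same in both cases; only the table knows that one is a stored string and the other lives in a separate file.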
FURTHER QUESTIONS without answers

Is this all?

No, it just takes a lot of time to invent questions and write good answers. In this FAQ I have not tried to make a summary of questions asked on the net so far, but to provide answers to questions that I have seen come up in several ways without necessarily being asked in the form presented above. A summary of questions and answers in this group will be incorporated into the next version of the FAQ.

How can I contribute?

Glad you asked. You can, at any time, fetch the latest version of the FAQ at ftp.ifi.uio.no:SGML/FAQ. followed by the version number, which is 0.0 for this version. Other versions will be available as I write more, and as your contributions flood my mailbox. Please write to me, Erik Naggum.

14) Making comments/additions to this FAQ
=========================================

If you have any additional information, comments or questions, please email either of the authors of this FAQ. (Producers and suppliers of commercial software should note that we can only provide _very_ brief details of their product). If you wish to include some information, please indicate _clearly_ in your message where you think it should go in the FAQ (i.e., which section, etc.).

Email: Erik Naggum
       Michael Popham
Post:  Erik Naggum, Naggum Software, Boks 1570, Vika, 0118 OSLO, Norway;
       Michael Popham, The SGML Project, Computer Unit, University of Exeter, Exeter EX4 4QE, UK.
Fax:   (Michael Popham) +44-392-211630

=========================================================================
COPYRIGHT NOTICE - Users are free to distribute this information in any form, PROVIDED THAT no charge (other than to cover reproduction costs) is made, the authors are acknowledged, and the final section on "Making comments/additions to this FAQ" and this notice are included in all copies. The copyright for sections written by named individual authors, such as Erik Naggum, remains with those authors. You should seek their permission before reproducing their material.
=========================================================================