Cover Pages: SGML/XML Bibliography Part 6, O

O'Connell, Conleth S. Jr. Supporting the Development of Grammar Descriptions for Multiple Applications. OSU-CISRC-TR-7/90-TR20. Columbus, Ohio: The Ohio State University, Department of Computer and Information Science, July 1990. Extent: 39 pages.

"Abstract: In computer science, context-free grammars are used extensively to describe data sets such as manuscript types and programming languages. The data, or members, contained in a particular set represent instances of the grammar describing that set, for example, documents and programs.

Determining the elements comprising instances is the task of content investigation. Imposing structure on these elements is the task of grammar development. Creating, editing, and manipulating instances of a grammar is the task of grammar instantiation. Grammar instantiation has received much attention with software systems such as programming environments and compound-document environments. Content investigation and grammar development have only recently been recognized as recurring complex tasks. They have received little attention because of their newly emerging significance. This work focuses on grammar development.

Grammar development produces a grammar description in a particular notation that contains two types of information: a formal, context-free grammar and auxiliary information. Auxiliary information describes the application of the grammar description. For example, a grammar may describe the manuscript type 'article,' but the auxiliary information may describe how to format the instances for layout, how to analyze the sentence structure, or how to exchange documents of that type.

The separation of the general, context-free grammar from the application-specific, auxiliary information provides the power and flexibility to generalize problem classes associated with grammar development. The formalisms of context-free grammars motivate two such problem classes: syntactic properties and semantic properties. The analysis of the development of large grammars motivates two other problem classes: reusable grammars and multiple notations.

A review of existing software systems reveals that a new, general-purpose, support environment was required for developing grammar descriptions. A prototype environment for developing grammar descriptions, DeveGram, has been designed and implemented. DeveGram controls and manages the four problem classes by capturing any context-free grammar, providing mechanisms for determining properties about a grammar, capturing auxiliary information, and generating automatically grammar descriptions in a testbed of different notations. DeveGram produces grammar descriptions for a testbed of software systems differing in syntax and purpose. The testbed presently consists of Yacc, SGML, MDL, MANDEN, and BNF."

Note: see more on the Chameleon project by Mamrak and Walter. For a paper copy of the report, send email to: strawser@cis.ohio-state.edu OR to cso@cis.ohio-state.edu.
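
Illustration (not from the report): the abstract above describes capturing one context-free grammar and generating descriptions of it in several notations. A minimal sketch of that idea follows, assuming a hypothetical dictionary-based grammar model; the names GRAMMAR, to_bnf, and to_yacc are invented here and do not reflect DeveGram's actual design, and only two of the testbed notations are approximated.

```python
# Hypothetical sketch: one context-free grammar rendered in two notations.
# The data model and function names are illustrative, not DeveGram's.

# Each nonterminal maps to a list of alternatives; each alternative is a
# list of symbols. Symbols that are not keys of the dict are terminals.
GRAMMAR = {
    "article": [["title", "body"]],
    "body": [["para"], ["body", "para"]],
}


def to_bnf(grammar):
    """Render the grammar in a BNF-style notation."""
    lines = []
    for lhs, alternatives in grammar.items():
        rendered = []
        for alt in alternatives:
            # Nonterminals in angle brackets, terminals in double quotes.
            symbols = [f"<{s}>" if s in grammar else f'"{s}"' for s in alt]
            rendered.append(" ".join(symbols))
        lines.append(f"<{lhs}> ::= " + " | ".join(rendered))
    return "\n".join(lines)


def to_yacc(grammar):
    """Render the same grammar as Yacc-style rules (no semantic actions)."""
    lines = []
    for lhs, alternatives in grammar.items():
        # Terminals are upper-cased, following the usual Yacc token style.
        alts = [
            " ".join(s if s in grammar else s.upper() for s in alt)
            for alt in alternatives
        ]
        lines.append(f"{lhs}\n    : " + "\n    | ".join(alts) + "\n    ;")
    return "\n".join(lines)


print(to_bnf(GRAMMAR))
print(to_yacc(GRAMMAR))
```

Because the grammar is stored once and each notation is only a rendering function, application-specific auxiliary information (for example Yacc semantic actions) could be attached per notation without touching the grammar itself.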

O'Connell, Conleth S.; Mamrak, Sandra A. A Support Environment for Describing Structured-Object Specifications. OSU-CISRC-6/89-TR25. Columbus, Ohio: The Ohio State University, Department of Computer and Information Science, June, 1989. Extent: 16 pages.

"Many software systems were developed to manipulate structured objects. The system developers were performing common tasks, e.g., parsing, manipulating the structured objects in a similar fashion, and even using common data structures, e.g., trees, to represent the structured objects being manipulated. System generators were developed to eliminate this duplication of effort. Each of these generators requires a specification describing the structured object under consideration, e.g., a programming language. In general, a specification consists of two parts: a context-free grammar and auxiliary information. The context-free grammar describes the manipulations to perform on the structure and content of an object. The task of describing a specification is inherently complex for the typical specifier. In particular, defining a structured-document specification presents considerable difficulties to the specifier. In this paper, we identify the complexities of defining a specification, in particular for structured documents. We also describe ideal features of a support environment that would aid in controlling and managing these complexities. Other system generators are then evaluated according to the identified features. Finally, the design of a prototype environment driven by this discussion is presented.

[CR: 19971227]

O'Connor, Dennis J. "The SGML Puzzle. The Pieces and How They Fit Together." Pages 77-82 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Dennis J. O'Connor]: Consultant, Mulberry Technologies, Inc., 17 West Jefferson Street, Suite 207, Rockville, Maryland 20850 USA; Phone: +1 301/315-9631; FAX: +1 301/315-8285; Email: doconnor@mulberrytech.com; WWW: http://www.mulberrytech.com.

Abstract: "Various components of an SGML system are examined using a graphical framework; where applicable, software applications and the relevance of XML are reviewed within this framework. Using a broad concept of an SGML document, the following tools which work with these documents are discussed, including their interrelationships: authoring, conversion, document management, and output. The basic structure of a document is a DTD, a set of rules for applying SGML to the markup of a document ('tagging'). SGML Editors make possible the creation of information using SGML tags from the DTD. Conversion tools facilitate changing data to and from various coding schemes. Document managers permit a number of functions, including revision control, and coordination of the other tools. Having defined content with these tools, formatters use output specifications to control the output of data in a formatted fashion. The introduction of XML increases the importance of application documentation as XML removes the requirement of a DTD."

This paper was delivered as part of the "Newcomer" track in the SGML/XML '97 Conference.

Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).

Ogawa, Arthur. "Object-Oriented Programming, Descriptive Markup, and TeX." TUGboat: The Communication of the TeX Users Group [Proceedings of the 1994 Annual Meeting] 15/3 (September 1994) 325-330. 4 references. Author affiliation: TeX Consultants, P.O. Box 51, Kaweah, CA 93237-0051 USA; email: ogawa@orion.arc.nasa.gov.

Abstract: I describe a synthesis within TeX of descriptive markup and object-oriented programming. An underlying formatting system may use a number of different collections of user-level markup, such as LaTeX or SGML. I give an extension of LaTeX's markup scheme that more effectively addresses the needs of a production environment. The implementation of such a system benefits from the use of the model of object-oriented programming. LaTeX environments can be thought of as objects, and several environments may share functionality donated by a common, more general object. This article is a companion to William Baxter's "An Object-Oriented Programming System in TeX." [See the relevant bibliographic entry.]

[CR: 19961121]

Oldham, Joseph D.; Marek, Victor W. "Toward Intelligent Representation of Database Content." Pages 274-84 (with 19 references) in Foundations of Intelligent Systems. Proceedings of the Ninth International Symposium on Methodologies for Intelligent Systems, ISMIS '96. International Symposium, ISMIS '96. Zakopane, Poland, June 9-13, 1996. Edited by Zbigniew W. Ras and Maciek Michalewicz. Lecture notes in computer science, 1079. Berlin / New York: Springer-Verlag, 1996. ISBN: 3540612866. Authors' affiliation: University of Kentucky, Computer Science Department, 779 Anderson Hall, Lexington, KY 40506-0046, USA. Email: marek@cs.engr.uky.edu, oldham@cs.engr.uky.edu. WWW: Oldham Home Page, Marek Home Page.

"Abstract: We address the problem of automating the display of database records in an intelligent way. By this we mean the synthesis of complete multimedia documents from database records. We propose an architecture for mapping diverse data stored in a database to markup language (SGML) programs. These programs are ready for final presentation. The mapping is based on a computational extension of the linguistic concept of registers. The resulting presentation represents data as information in an intelligent way. General conditions for such a system are discussed. Our own treatment of registers as rule-based computational structures is offered with some early results on the behaviour of rule-based registers."

The document is available online in Postscript format: http://www.cs.engr.uky.edu/~oldham/ismisfinal.ps; [mirror copy]. Possibly also as: ftp://al.cs.engr.uky.edu/cs/manuscripts/dexter1.ps, [mirror copy].

[CR: 19970529]

Olsen, Florence. "Energy Puts SGML to Work. [SGML Markup] Language Lets Office Offer Desktop Access to Multimedia Document Architectures." Government Computer News 16/13 (May 26, 1997) 35, 37.

Summary: "The Energy Department is spending $825,000 to exploit little-known capabilities of the Standard Generalized Markup Language for managing distributed archives. Energy's Office of Scientific and Technical Information (OSTI) recognized the value of SGML several years ago when it adopted the language as its standard for document exchange. Now Energy is laying the groundwork for a distributed multimedia archive that agency scientists and academic researchers can access from any desktop computer. With the new architecture, OSTI officials want to give government scientists desktop access to the full text and multimedia output of Energy's more than 60 research sites and program offices that conduct basic materials research and other high-interest investigations."

Available online from the Government Computer News WWW server: GCN Online; [archive copy, text only].

Olsen, Florence. "FDA [Food and Drug Administration] Urging 'electronic templates'. Agency to Develop Template for Pharmaceutical Companies Seeking Approval." Government Computer News 13/12 (June 13, 1994) 41-42.

"Abstract: The Food and Drug Administration (FDA) is suggesting the use of a standard template, based on the Standard Generalized Markup Language (SGML), for the use of pharmaceutical companies that are submitting new drug proposals. SGML-tagged documents can be created even on PCs and there are 47 SGML products in the market for the Apple Macintosh. FDA officials are in the process of working with their counterparts in the Netherlands, Sweden and Canada in developing a multinational electronic template. The agency has implemented a pilot data-exchange project by the Center for Drug Evaluation and Research that contained electronic data standards for text, bit-mapped graphics, quantitative data, chemical structures and analytical instrument data."

[CR: 19970909]

Olsen, Florence. "Middleware unifies publishing. Dataware's text-based repository maps non-SGML documents to its structure." Government Computer News 16/26 (September 1, 1997) 45-46. ISSN: 0738-4300. Author's affiliation: GCN Staff.

Extract: "Although the Standard Generalized Markup Language promised the same thing many years ago, converting all documents to SGML proved too time-consuming, said Bill Thornburg, vice president of publisher markets for Dataware Technologies Inc. of Cambridge, Mass. Dataware's Electronic Publishing Management System (EPMS) can accept documents in any non-SGML data formats, including active news feeds, by later this year. The text-based repository for the finished documents is an SGML document store, which maps non-SGML documents to its SGML structure."

Available online from GCN: "Middleware unifies publishing"; [archive copy].

[CR: 19970228]

Olsen, Florence. "Security Agency Opts for SGML to Manage Data, Decrease Costs." Government Computer News 16/2 (January 27 1997) 42[-?]. Author's affiliation: GCN Senior Editor, email: folsen@gcn.com; Tel: 301-650-2000.

An article on (National Security Agency) Intelink SGML applications. See for other details the SGML '96 presentation by Fredrick Thomas Martin, Deputy Director, Information Services Group, National Security Agency, "SGML in the Intranet for the US Intelligence Community: 'INTELINK' - A Case Study." Its provisional abstract, in part: "The US National Security Agency, the Central Intelligence Agency, the Defense Intelligence Agency, the National Reconnaissance Office and other agencies of the United States Intelligence Community are improving intelligence gathering and reporting through development and implementation of technology including SGML. INTELINK, the classified world-wide 'Intranet', addresses one of the world's largest data management problems."

[CR: 19970106]

Olszewski, Leonard P. "Modular DTD Development and Maintenance at SAS Institute: Implementing an Efficient SGML System Using Software Engineering Principles." Pages 265-278 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: SAS Institute Inc., SAS Campus Drive, Cary, North Carolina, USA 27513; Tel: (919) 677-8000 x-7487; FAX: (919) 677-4444; Email: saslpo@unx.sas.com.

Abstract: "The Publications Division of SAS Institute needed a way to replace the hardcopy formatting tools it had been using, and also faced the challenge of producing online documentation for its large variety of software products. After deciding to implement an SGML solution using Adept, the Institute decided to apply good software engineering and programming principles to the effort and develop a modular, maintainable store of declarative SGML structures and custom executables. This paper describes the implementation of that system."

A related presentation describing the implementation of SGML by the Publications Division of SAS Institute was given at SGML '96 by Craig R. Sampson, "SASOUT: A Context Based Table Model."

Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19980506]

Orchard, David. "[XML: The Key to Bridging Java and the Web.] The Twain Shall Meet. Finding Convergence for Java and XML, Objects and Documents." Object Magazine 8/2 (April 1998) 60-65. ISSN: 1055-3614. Author's affiliation: [David Orchard] Senior Technical Architect, IBM Pacific Development Center.

Abstract: "The worlds of Java objects and Web documents are converging, and XML is key to providing the final gateway between them. While there are two opposing schools of thought in this arena, XML provides a radical software change with benefits that outpace HTML, SGML, RTF, and provide an interchange format for OO developers."

Note previous articles on XML in Object Magazine, including: "[The XML Revolution.] Document Objects With Style. An XML Document is a Composite Structure of Node Objects," by David Carlson. In Object Magazine (February 1998) 14-15.

[CR: 19961018]

Ossenbruggen, Jacco van; Eliëns, Anton; Schönhage, Bastiaan. "Web Applications and SGML." Pages 51-62 (with 22 references) in EP '96. Proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography. [= Journal Special Issue: Electronic Publishing - Origination, Dissemination and Design (EPODD), June & September 1995, Volume 8, Issues 2-3.] Sixth International Conference on Electronic Publishing, Document Manipulation and Typography, Palo Alto, California. September 24-26, 1996. Sponsored by Adobe Systems Incorporated; School of Information Management and Systems, University of California at Berkeley; Xerox Corporation. [Proceedings Volume] Edited by Allen Brown, Anne Brüggemann-Klein, and An Feng; [Journal] Editors David F. Brailsford and Richard K. Furuta. Chichester/ New York: John Wiley & Sons, 1996. ISSN: 0894-3982. Authors' affiliation: Vrije Universiteit, Faculty of Mathematics and Computer Sciences, Amsterdam. Email: jrvosse@cs.vu.nl; WWW (Ossenbruggen).

Abstract: "This article advocates the use of SGML technology for the creation, dissemination, and display of Web documents. It presents a software architecture that allows for defining the operational interpretation of arbitrary document types by means of style sheets, written in a scripting language. Our approach has been motivated by a desire to extend the functionality of the Web with support for multimedia and active documents. Although growing in complexity, HTML is still lacking in functionality. We prefer a more flexible and generic approach, as enabled by the employment of SGML."

"After a brief introduction to SGML, we will illustrate how our approach accommodates (extensions of) HTML as well as arbitrary SGML documents containing multimedia data such as video and audio. We will then briefly sketch the software components used in the realization of our approach and discuss some topics for further research."

A postscript draft version of the document is available on the VU WWW server: http://www.cs.vu.nl/~dejavu/papers/ep96.ps.gz; [mirror copy]. For other conference information, see the main conference entry for EP '96, or the brief history of the conference as sixth in a series since 1986. See the volume main bibliographic entry for a linked list of other EP '96 titles relevant to SGML and structured documents.

[CR: 19961226]

Øverby, Erlend. "Which Demands Do an SGML-organization (users) Have to the Developers of the SGML-system." Pages 597-600 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Chief Engineer, University of Oslo, Center for Information Technology Services, Section for Administrative Data Processing, SGML-group, Gaustadalléen 23, P.O. Box 1059, Blindern N-0316, Oslo, Norway; Tel: +4722852533; FAX: +4722852730; Email: Erlend.Overby@usit.uio.no; WWW: http://www.uio.no/~erlendo/index.html.

Abstract: "The presentation will be a summary of a project at the University of Oslo, where about 80 persons have been working with SGML. The way we work with SGML is a bit different from many others. We want to use SGML as an infrastructure, applied to a wide range of documents. In this presentation I will summarize the evaluation of the project, and the interviews that I have done with some of the writers."

Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19961226]

Owens, Evan. "SGML and The Astrophysical Journal: A Case Study in Scholarly and Scientific Publishing." Pages 155-160 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Electronic Publishing Manager, Journals Division, The University of Chicago Press, 5720 S. Woodlawn Avenue, Chicago, Illinois 60637, USA; Tel: +1 (773) 702 0199; Email: eowens@journals.uchicago.edu; WWW: http://www.journals.uchicago.edu.

Abstract: "The Astrophysical Journal, published by the University of Chicago Press for the American Astronomical Society, is a large and complex scientific journal of more than 25,000 pages per year. Over the last several years the production system for this publication has been re-engineered to be SGML-based, including on-screen SGML copy editing, exporting SGML for conventional typesetting, and producing an online HTML edition from the SGML archive. The most difficult part of the implementation was the use of SGML math and the problems encountered in translating complex mathematics between LaTeX, TeX, SGML, ASCII, HTML, and two different commercial typesetting systems. The key benefits of this implementation were (1) reduced conventional production costs, (2) the creation of additional electronic products, and (3) the establishment of a rigorous framework for future non-text content."

For more information on the use of SGML by the American Astronomical Society, see the main AAS entry in the SGML/XML Web Page.

Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19960202]

Ozsu, M. T.; Szafron, D.; El-Medani, G.; Vittal, C. "An Object-oriented Multimedia Database System for a News-on-demand Application." Multimedia Systems 3/5-6 (1995) 182-203 (with 41 references). Authors' affiliation: Department of Computer Science, Alberta University, Edmonton, Alberta, Canada.

"Abstract: Describes the design of a multimedia database management system for a distributed news-on-demand multimedia information system. News-on-demand is an application that uses broadband network services to deliver news articles to subscribers in the form of multimedia documents. Different news providers insert articles into the database, which are then accessed by users remotely, over a broadband, asynchronous transfer mode (ATM) network. The particulars of our design are an object-oriented approach and strict adherence to international standards, in particular the Standard Generalized Markup Language (SGML) and HyTime. The multimedia database system has a visual query facility, which is also described in this paper. The visual query interface provides three major facilities for end users: presentation, navigation and querying of multimedia news documents. The main focus, however, is the querying of multimedia objects stored in the database."

[CR: 19970806]

Paciello, Mike. "Designing the Web for People with Disabilities." International SGML Users' Group Newsletter 3/3 (July 1997) 20-24. ISSN: 0952-8008. Author's affiliation: Executive Director, Yuri Rubinsky Insight Foundation.

Abstract: "Information access for people with disabilities is creating numerous opportunities and challenges in the Information Highway community. In addition, as a result of the increasing paradigm shift by the publishing industry toward Internet and WWW-based document delivery systems, the importance of producing accessible information using electronic document mechanisms has increased immeasurably. . . The paper will attempt to identify major problems in information and software design that deny access; cite successful products that can be used by people with disabilities to access publications; and point to resources that assist developers in creating accessible products in the future."

Originally published as "People with Disabilities Can't Access the Web!," World Wide Web Journal Volume 2, Issue 1 (Winter 1997), pages 173-182 = Advancing HTML: Style and Substance, from O'Reilly & Associates, Inc. URL: http://www.w3.org/Journal/5/s3.paciello.html; [archive copy, text only].

[CR: 19971123]

Paciello, Michael G. "Pushing the Envelope This WAI." Page(s) 13 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Executive Director, Yuri Rubinsky Insight Foundation.

Abstract: "The World Wide Web is fast becoming the de facto repository of preference for on-line information, yet the technology of the Web has inadvertently created barriers for people with disabilities. Worldwide, more than 750 million people with disabilities (more than 100 million in Europe alone) are affected by the emergence of the Web, directly or indirectly. In order to 'push the envelope' of information access and truly realize the full potential of the Web, the World Wide Web Consortium (W3C) intends to take a leadership role in removing accessibility barriers by launching the Web Accessibility Initiative (WAI, pronounced 'WAY').

"Mr. Paciello, creator of the WAI, will discuss the initiative's goals and mission and how SGML plays a major role in the advancement of information accessibility."

Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.

[CR: 19971123]

Paciello, Michael G.; Dardailler, Daniel. "Opening 750 Million Envelopes Without an Instrument." Page(s) 57 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: [Paciello]: Executive Director, Yuri Rubinsky Insight Foundation; [Dardailler]: W3C.

[No abstract available.]

Painter, J. Derek. "Marking up the Dictionary (The Oxford English Dictionary)." Information Media and Technology 21/2 (March 1988) 72-74. ISSN: 0306-2880. CODEN: IMTEED.

Abstract: The article describes the Oxford University Press's implementation of the Standard Generalized Markup Language (SGML). SGML provides a rigorous syntax for describing unambiguously the content and structure of any document in such a way that its presentation can be controlled by conversion to typographic codes and selective retrieval can be enabled by the application of search software. Clearly, because the language is generic, its use is independent of specific devices and it can be implemented universally, regardless of the make of front-end, host, printer and operating system. SGML has been adopted by the Oxford University Press in order to convert the Oxford English Dictionary into a lexical database.

[CR: 19960727]

Palowitch, Casey; Horowitz, Lisa. "Meta-Information Structures for Networked Information Resources." Cataloging and Classification Quarterly 21/3-4 (1996) 109-130 (with 25 references). Author's affiliation: Pittsburgh University, PA, USA. University of Pittsburgh Electronic Text Project, University Library System Project Manager. Email: cjp+@pitt.edu.

Abstract: "The authors develop a model of meta-information architectures (header, local index and directory) and present three current or proposed meta-information structures for networked information resources with applicability to organization and access in libraries and networked information environments. Special emphasis is given to the Text Encoding Initiative's TEI Header and Independent Header as a model for meta-information for academic and library needs. Recommendation is made for the specification of a generalized SGML meta-information header based on the principles of the TEI Independent Header, to address the needs of cataloging, automatic processing and serving of networked information resources."

[CR: 19960728]

Palowitch, Casey; Stewart, Darin L. Automating the Structural Markup Process in the Conversion of Printed Documents to Electronic Texts. Paper submitted to Digital Libraries '95. Pittsburgh PA: Electronic Text Project, University of Pittsburgh Library System, University of Pittsburgh, March 19 1995. Extent: approximately 9 pages. Authors' affiliation: University of Pittsburgh Library System.

"ABSTRACT: This paper presents the early results of a research initiative constructing a system for automatically identifying structural features and applying SGML tagging based on the Text Encoding Initiative DTD to text generated from the scanning and OCR processing of print documents. The system interprets typographical and geometric analysis output from a specialized OCR software system, and maps combinations of these characteristic features to TEI constructs based on a user-generated document analysis specification. The system is being developed as part of a pilot project to create from the original paper document a TEI-encoded edition of the Transactions of the American Medical Association, Vol. 2, 1849, a research resource for 19th century United States medical and urban historical study. Although this project focuses on one specific text, an important goal of the project is to create a software system that can process and at least minimally tag many types of printed documents, given a proper document analysis specification, and thus allow a more rapid process of retrospective conversion of printed documents into SGML texts in libraries."

Available in HTML format: http://stirner.library.pitt.edu/DL95paper.html [mirror copy, text only; July 1996]
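
Illustration (not from the paper): the abstract above describes mapping combinations of typographic features to TEI constructs according to a user-supplied document analysis specification. A minimal sketch of such a rule-driven mapping follows. The feature names, the SPEC rule table, and the sample text are all invented for illustration; only the element names head, note, and p are taken from TEI, and nothing here reflects the authors' actual system.

```python
from xml.sax.saxutils import escape

# Hypothetical document analysis specification: each rule pairs a set of
# required typographic features with the element to apply. First match wins.
SPEC = [
    ({"bold": True, "centered": True}, "head"),
    ({"small": True}, "note"),
]
DEFAULT_TAG = "p"  # fallback when no rule matches


def tag_for(features, spec=SPEC):
    """Return the element name for a block with the given OCR features."""
    for required, tag in spec:
        if all(features.get(name) == value for name, value in required.items()):
            return tag
    return DEFAULT_TAG


def mark_up(blocks, spec=SPEC):
    """Wrap each (text, features) block in the element chosen by the spec."""
    lines = []
    for text, features in blocks:
        tag = tag_for(features, spec)
        # escape() protects &, < and > so the output stays well-formed.
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(lines)


# Invented sample input standing in for OCR output with layout features.
BLOCKS = [
    ("REPORT OF THE COMMITTEE", {"bold": True, "centered": True}),
    ("Sample body text & more.", {}),
]
print(mark_up(BLOCKS))
```

Keeping the rules in a data table rather than in code mirrors the abstract's point that a different document analysis specification should let the same software tag other kinds of printed documents.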

[CR: 19960910]

Paoli, Jean. Cooperative Work On the Network: Edit the WWW! Paper presented at the Third International World Wide Web Conference. St Quentin en Yvelines: Grif S.A. April, 1995. Extent: approximately 11 pages. Author's affiliation: Technical Director, GRIF S.A., 2 Boulevard Vauban, BP 266, 78053 St Quentin en Yvelines, Cedex, France. Email: Jean.Paoli@grif.fr.

"Abstract: There is a real need for a tool to enable effective collaborative authoring of documents on the WWW. A number of sophisticated tools allow browsing of local and remote files but do not as yet allow authors to modify them. Our approach is to promote the creation of information directly on the WWW and so enable interaction between the different contributors. This approach relies on the use of a structured editing tool which recognizes the structured content of HTML documents and is wired on the network. We discuss various cooperative strategies and user interface issues and how SGML might help in the generalization of collaborative authoring on the WWW."

Available on the Internet: http://www.grif.fr/fr/newsref/coopwork.html; [mirror copy]. Apparently also published in Computer Networks and ISDN Systems, April 1995, pages 841-847.

[CR: 19960910]

Paoli, Jean. Creating SGML Objects for End-Users. Establishing SGML in an Interactive World. Paper presented at SGML '94 Conference. St Quentin en Yvelines: Grif S.A., [1994]. Author's affiliation: Technical Director, GRIF S.A., 2 Boulevard Vauban, BP 266, 78053 St Quentin en Yvelines, Cedex, France. Email: Jean.Paoli@grif.fr.

See the conference proceedings, pages 323-333.

[CR: 19960818]

Paoli, Jean. "Extending the Web's Tag Set Using SGML: Authoring New Tags with Grif Symposia." Computer Networks and ISDN Systems 28/7 (May 1996) 1095-1104 (with 15 references). Author's affiliation: Jean Paoli, Technical Director, GRIF S.A., 2 Boulevard Vauban, BP 266, 78053 St Quentin en Yvelines, Cedex, France. Email: Jean.Paoli@grif.fr.

"Abstract: HTML suffers from its lack of extensibility, and anarchical tag proliferation is in danger of breaking the WWW. In our view, the extensibility of the text model is necessary and we should develop and make extensive use of SGML to extend the current HTML model, not only by defining other DTDs (document type definitions) which could replace HTML, but also by proposing an extensibility scheme offering Web users rules for extending the HTML DTD themselves. This approach has been developed in the Grif Symposia authoring tool. Grif Symposia, a joint INRIA/GRIF S.A. project, is an integrated authoring-browsing environment to be shipped with full extensible capabilities in order to handle mixed HTML/SGML data models. We discuss the advantages to be gained by using a mixed HTML/SGML data model for the WWW on the basis of the work that we have achieved by developing Grif Symposia. We present the different layers developed for Grif Symposia and highlight the advantages obtained in authoring information in a mixed SGML/HTML environment."

[The author's] "Conclusions:

"It is possible to build a WYSIWYG authoring and viewing environment which support dynamic and structured tag set extensibility on the WWW.
SGML could be used to extend the current HTML model not only by defining other DTDs which could replace HTML but by proposing an extensibility scheme offering to Web users rules for extending themselves the HTML DTD.
The display of new SGML tags is the most serious problem one have to consider and a complete and powerful style sheet language has to be adopted.
adopting a structured approach to information authoring and retrieval on the WWW, we can access and manipulate intelligently on both the client and the server sites the data which is semantically and structurally identified." [from the Net version]

Based upon (or being) a paper delivered at the Fifth International World Wide Web Conference, Paris, France, 6-10 May 1996. See: the presentation slides; or the full text of the article http://www5conf.inria.fr/fich_html/papers/P18/Overview.html [mirror copy, text only].

[CR: 19960910]

Paoli, Jean. Rules for Extending a WWW client: The Symposia API. Paper presented at the Fourth International World Wide Web Conference, December, 1995. St Quentin en Yvelines: Grif S.A., August 7, 1996. Extent: approximately 18 pages. Author's affiliation: Technical Director, GRIF S.A., 2 Boulevard Vauban, BP 266, 78053 St Quentin en Yvelines, Cedex, France. Email: Jean.Paoli@grif.fr.

"Abstract: There is a great need for WWW clients to be extensible. The availability of the source code of some popular browsers (Mosaic) led many people to slice the original Mosaic or CERN code and to add diverse custom code for specific applications. In our view, a WWW authoring/viewing environment must be extensible enough to allow the building of interactive document authoring environments in which the user is able to access all relevant documentary information on the Web and incorporate it directly in his/her document. Symposia (shipping since March 95) is a joint INRIA / GRIF S.A. project for building a cooperative WYSIWYG authoring tool for the WWW. Symposia will soon be shipped with an API that we have developed that presents a set of solid principles for extending the user interface, document management, network extensibility and interactive behavior of document fragments in a WWW client. We will discuss in this paper the advantages gained from basing the extensibility of a WWW client on a generic structured environment. We will present different ways proposed today to extend WWW clients: Forms/CGI and Java and will compare them with the Symposia API."

Available on the Internet: http://www.grif.fr/fr/newsref/sympapi.html [mirror copy]

[CR: 19971107]

Paoli, Jean; Schach, David; Lovett, Chris; Layman, Andrew; Cseri, Istvan. "Building XML Parsers for Microsoft's IE4." Pages 187-195 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Authors' affiliation: Microsoft.

Abstract: "Microsoft cofounded the XML working group at the W3C in July 96 and actively participated in the definition of the standard. This article describes why Microsoft implemented its first XML application and how it led to the development of two XML parsers shipping in Internet Explorer 4.0, one written in C++ and the other in Java. We describe the importance of designing an object model API and our vision of XML as a universal, open data format for the Internet."

A version of this document is available online in HTML format: http://www.w3j.com/xml/excerpt.html; [local archive copy]. See also Microsoft's XML Support Page.

[CR: 19980829]

Paradis, François; Vercoustre, Anne-Marie; Hills, Brendan. A Virtual Document Approach for Reusing SGML/XML Information Objects. Paper presented at SGML/XML Asia Pacific, Sydney, Australia, 22-24 September, 1997. Australia: CSIRO, 1997. Authors' affiliation: Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia. Email: François Paradis or Brendan Hills.

"Abstract: The importance of reusing information is well understood in electronic publishing, and is one of the motivations for the development and use of SGML. Reuse is actually quite hard to achieve with SGML, as the elements are strongly typed and there can be incompatibilities between the DTDs. HTML, an SGML derivative, relaxes those constraints, but unfortunately it does not provide a significant level of structure for identifying and extracting information, since the tags are mostly used for presentation. XML, another SGML derivative, is a promising alternative which could bring the power of SGML to the Web while keeping the simplicity of HTML. Those standards all have particularities which must be addressed in a global solution to the reuse of information. We present our solution to the reuse of SGML information objects: a system that can dynamically combine information from various sources, including databases and SGML-like documents, to produce a virtual document, which allows an author to reuse information in a document-centric, descriptive way. We maintain support for the particularities of the data sources, by having them stored in different formats and accessed in their own native query language, but also support the integration of these information objects by converting them into a common, tree-like data structure, and by providing a language to extract and transform information in those trees. In this approach, a collection of SGML documents can be stored in an object-oriented database as a tree-like hierarchy of information objects; thus taking advantage of the strict typing of SGML to provide efficient storage and retrieval. By extending the standard query language of the object-oriented database, we can query on an incomplete or partial knowledge of the document structure whilst retaining the search efficiency that the database engine provides us. Combination of the results with other databases or data sources, and inclusion into the SGML virtual document is handled by our tree language. HTML and XML documents do not always conform to a DTD, and, if they come from the Web, are volatile and fast-changing in nature. We propose in this case to access those documents through the standard file systems or http protocol, to convert them to our tree-like data structures on-line, and to use our tree language for both extraction and transformation, with possibly some specific instructions to handle links. The system is currently being implemented. Our prototypal application, a document to generate activity reports, reuses both an SGML database and a collection of HTML pages (as well as an SQL database), and shows how flexible and powerful our tool for information reuse is."

See also "Reuse of Linked Documents through Virtual Document Prescriptions." By Anne-Marie Vercoustre and François Paradis [INRIA (France) and CSIRO (Australia)]. Pages 499-512 in Electronic Publishing, Artistic Imaging, and Digital Typography. Proceedings of the 7th International Conference on Electronic Publishing (EP '98), Held Jointly with the 4th International Conference on Raster Imaging and Digital Typography (RIDT '98). Saint Malo, France, March 30 - April 3, 1998. Edited by Roger D. Hersch, Jacques André, and Heather Brown. New York/Berlin/Heidelberg: Springer-Verlag, 1998.

Contact François Paradis or Brendan Hills for a copy of the paper.

[CR: 19980907]

Paradis, François; Vercoustre, Anne-Marie; Hills, Brendan. "A Virtual Document Interpreter for Reuse of Information." Pages 487-98 (with 20 references) in Electronic Publishing, Artistic Imaging, and Digital Typography. Proceedings of the 7th International Conference on Electronic Publishing (EP '98), Held Jointly with the 4th International Conference on Raster Imaging and Digital Typography (RIDT '98). EP '98 and RIDT '98, Saint Malo, France. March 30 - April 3, 1998. Edited by Roger D. Hersch, Jacques André, and Heather Brown. Lecture Notes in Computer Science Series, Number 1375. New York/Berlin/Heidelberg: Springer-Verlag, 1998. ISBN: 3-540-64298-6. Authors' affiliation: [Paradis, Hills]: CSIRO Mathematical and Information Sciences, 723 Swanston St, Carlton, VIC 3053, Australia; [Vercoustre]: INRIA Rocquencourt, BP 105, 78153 Le Chesnay Cedex, France.

Abstract: "The importance of reuse of information is well recognised for electronic publishing. However, it is rarely achieved satisfactorily because of the complexity of the task: integrating different formats, handling updates of information, addressing the document author's need for intuitiveness and simplicity, etc. An approach which addresses these problems is to dynamically generate and update documents through a descriptive definition of virtual documents. We present a document interpreter that allows information to be gathered from multiple sources and combined dynamically to produce a virtual document. Two strengths of our approach are: the generic information objects that we use, which enable access to distributed, heterogeneous data sources; and the interpreter's evaluation strategy, which permits a minimum re-evaluation of the information objects from the data sources."

"The RIO (Reuse of Information Objects) project aims to develop techniques which can support information reuse in various contexts. The focus of the project is currently on the specification and interpretation of virtual documents to enable reuse of structured information from heterogeneous sources. The instructions for the construction of virtual documents are stored in a document prescription, which is processed by the document interpreter to generate or update a virtual document. An editor facilitates the writing of the document prescriptions; it is connected to the document interpreter in order to provide dynamic editing. The document prescription consists of: 1) Static data, or the structure and the text that does not change in the document; 2) Queries, or the commands needed to generate the dynamic part of the document; 3) Transformation instructions, to convert the reused information objects into new document objects. The document prescription is written as an SGML document or as one of its derivatives such as HTML or XML, that might not enforce compliance to a formal DTD. Static data is expressed using normal SGML constructs. Queries and transformation instructions are expressed as SGML Processing Instructions (PI). There are two kinds of queries: native queries, which send requests to the data sources in their specific language (e.g., SQL for a relational database, URL for an HTML server), and pick queries, written in an OQL-like language that we designed to combine results and provide search capabilities for semi-structured information."

See Slides: A Virtual Document Interpreter for Reuse of Information. See also the online document abstract and the full text in PDF format; [local archive copy].
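
[Illustration, not taken from the paper.] The document prescription described in the abstract above mixes static markup with queries carried in processing instructions, which an interpreter replaces with retrieved content. A minimal Python sketch of that idea follows. The PI syntax `<?query NAME?>`, the function `interpret`, and the in-memory `sources` dictionary are all assumptions made for illustration; the actual RIO syntax and data-source access are not given in the abstract.

```python
import re

# Hypothetical PI syntax: <?query NAME?>. The real RIO syntax is not shown in the abstract.
PI_PATTERN = re.compile(r"<\?query\s+(\w+)\?>")


def interpret(prescription: str, sources: dict[str, str]) -> str:
    """Replace each query PI with the text its (stand-in) data source returns."""

    def run_query(match: re.Match) -> str:
        name = match.group(1)
        # Unknown queries are left in place so the gap stays visible.
        return sources.get(name, match.group(0))

    return PI_PATTERN.sub(run_query, prescription)


# Static data plus one query PI; the dictionary stands in for a database or Web source.
prescription = (
    "<report><title>Activity Report</title>"
    "<body><?query staff_count?></body></report>"
)
sources = {"staff_count": "<p>12 staff</p>"}

virtual_document = interpret(prescription, sources)
print(virtual_document)
```

Re-running `interpret` with updated `sources` regenerates the dynamic part while the static markup is untouched, which is the separation the abstract describes.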

[CR: 19980104]

Paris, Joseph. "The EDIDOC project at the European Space Agency (ESA)." SGML BeLux Newsletter 1/2 (May 1994) [NA]. Author's affiliation: ACSE s.a., a member of the SGML Technologies Group.

Summary: "The EDIDOC project at ESA attempts to merge the functionalities of electronic data interchange with those of electronic document engineering. It uses SGML to exchange in electronic format documentation within ESA, and between ESA and its different industrial partners."

"Specific SGML and EDIFACT substandards will be defined and implemented, covering both technical and administrative applications, and will be submitted to international standardisation bodies. They will include a.o. the following document types: 1) project control documents such as Monthly Progress Report, Engineering Change Notice, and Contract Change Notice; 2) software documents that are part of the ESA Software Engineering Standards such as User Requirements, Architectural Design, Operator Manual, etc.; 3) and possibly ITT and Proposal."

The document is available online in HTML format: http://www.sgmlbelux.be/Newsletter/N12A4.HTM; [local archive copy]. See the EDIDOC main entry in the SGML/XML Web Page.

Parkinson, Kirsten L. "DynaText Publishing to Go Mac: Will Build Documents from SGML Files." MacWeek ?/? (April 5, 1993) page X.

Brief description of DynaText 2.0 for the Macintosh. Includes comments of Paul Kahn, director of the IRIS Program at Brown University, where DynaText has been used to create electronic books in the areas of mathematics, literature, and science.

[CR: 19950716]

Pearson, P.; Francomano, C.; Foster, P.; Bocchini, C.; Li, P.; McKusick, V. "The Status of Online Mendelian Inheritance in Man (OMIM) medio 1994." Nucleic Acids Research 22/17 (September 1994) 3470-3473. Authors' affiliation: Johns Hopkins University School of Medicine, Baltimore, MD.

"ABSTRACT: During the last year many changes have been introduced into the system of maintaining OMIM. There are three major components of the reorganization. First, a distributed editorial system was introduced which provides a three-tiered editorial board with senior editors, science writers and subject editors. Second, MIM entries have been restructured to provide separate gene and phenotype information and to organize them into separate catalogs. The restructuring also establishes clearly defined sections for entering new information, converts old entries to the new structure, and establishes a file maintenance and editorial system in SGML format. Third, the entry numbering and naming system has been modified. In addition, the information has been made available through a variety of output media, including books, CD-ROM and online access based on the IRx, WAIS, Gopher and WWW formats."

[CR: 19990514]

Peis, Eduardo; Fernández-Molina, Juan Carlos. "Enrichment of Bibliographic Records of Online Catalogs through OCR and SGML Technology." Information Technology and Libraries 17/3 (September 1998) 161-172 (with 33 references). ISSN: 0730-9295. Authors' affiliation: Facultad de Biblioteconomía y Documentación [School of Library Science and Documentation], Colegio Máximo de Cartuja, Universidad de Granada, Granada, Spain. Email [Peis]: epeis@platon.ugr.es.

"This article presents the results of research on the feasibility of using scanner technology to capture contents pages of collective monographs, and to extract the bibliographic information of each individual work and process this with a standardized language, such as SGML, for tagging electronic documents. By this means, data can be used as electronic information or stored in an online catalog (OPAC), thus providing additional access points. A pilot system has been designed to test the initial hypotheses, show the feasibility of achieving the suggested goals, and develop the tasks required so that they may be carried out as automatically as possible."

[Note the dissertation of Eduardo Peis...]

[CR: 19960826]

Pelletier, S.-J.; Arcand, J.-F.; Velissarios, J. "STEALTH: A Personal Digital Assistant for Information Filtering." Pages 455-474 (with 25 references) in PAAM '96. Proceedings of the First International Conference on the Practical Application of Intelligent Agents and Multi-Agent Technology. First International Conference on Practical Application of Intelligent Agents and Multi-Agent Technology, London, UK. April 22-24, 1996. Blackpool, UK: Practical Application Company, 1996. Authors' affiliation: Centre for Information Technology and Innovation, Laval, Quebec, Canada.

Abstract: "The article presents an intelligent information agent which provides just-in-time help to flight simulator maintenance technicians. The agent, called STEALTH, is embedded within an information management system, TOPSS, and works as a personal digital assistant (PDA) for information filtering. It provides users with intelligent guidance to useful information within a given context, taking user's preferences into account. STEALTH offers numerous advantages compared to several other information filtering paradigms: (1) its intelligence is based on an artificial neural network (ANN) technology which can easily add new documents to its knowledge base, as well as learn and forget user's preferences; (2) its information search strategy includes theme indexing and stemming, which goes beyond the use of full text keywords; (3) it can take advantage of an SGML document base, retrieving documents at different granularity levels (paragraph, section, chapter, document); and (4) it allows users to build their own collection of documents using available document parts. The approach taken in the development of STEALTH is outlined here: from the ergonomic task analysis to the actual implementation and testing of the agent. Although STEALTH is presently only used within a flight simulator context, it is believed that its generic design will make it applicable to a wide range of domains. In effect, one only has to provide the agent with the document base to be used in order to profile the agent for the new context."

[CR: 19971123]

Peltonen, Björn. "'Case Study': The SGML (Standard Generalized Markup Language) Implementation at Norsk Hydro." Pages 149-151 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Vice President, Citec Engineering, Silmukkatie 2, P.O. Box 109, FIN-65101 Vaasa, Finland; Email: bjorn.peltonen@citec.fi, or Email: bpe@citec.fi.

Abstract: "A significant economical objective at Norsk Hydro is to reduce the time and cost of maintaining equipment used in oil production.

"According to NORSOK (NORsk SOkkels Konkuranseposisjon, or in English the competitive standing of the Norwegian offshore sector), 50% of the development cost of an off-shore installation is related to information. NORSOK is the Norwegian initiative to reduce development and operation cost for the off-shore oil and gas industry. An important part of this effort is to develop cost efficient standards to replace individual oil company specifications.

"In this case study we will explain the implementation of an interactive system to improve the accessibility of technical supplier documentation by utilising the SGML standard."

[CR: 19971227]

Peltonen, Björn. "'Case Study', The SGML (Standard Generalized Markup Language) Implementation at Norsk Hydro. Do More with Less and Do It Better." Pages 641-643 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Björn Peltonen]: Vice President, Citec Information Technology, P.O. Box 109, FIN-65 101 Vaasa FINLAND; Phone: +358 6 3240 702; FAX: +358 6 3240 800; Email: bpe@citec.fi; WWW: http://www.citec.fi/.

Abstract: "A significant economical objective at Norsk Hydro is to reduce the time and cost of maintaining equipment used in oil production.

"According to NORSOK (NORsk SOkkels Konkuranseposisjon or in English the competitive standing of the Norwegian offshore sector), 50% of the development cost of an off-shore installation is related to information. That will explain why there are substantial savings in making the information management process and methods more efficient.

"In this case study we will explain the implementation of an integrated, interactive system to improve the accessibility of information needed for the maintenance procedures at an off-shore installation by utilising the SGML standard. This system contains all the relevant parts in an information management process like authoring, storage and information distribution."

This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.

[CR: 19961226]

Peltonen, Björn; Mäki, Erik. "Case Study: Wärtsilä Diesel Oy, Power Plants." Pages 623-632 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: [Peltonen]: VP Sales & Marketing, Citec International Ltd Oy / Information Technology, P.O. box 109 FIN-65 101, VAASA, FINLAND; Tel: +358-6-3240 700; FAX: +358-6-3240 800; Email: bpe@citec.fi; WWW: http://www.citec.fi/; [Mäki]: Pbs Development Manager, Wärtsilä Diesel Oy Power Plants, P.O. Box 252 FIN-65 101, VAASA, FINLAND; Tel: +358-6-3270; FAX: +358-6-3271 440; Email: erik.maki@wartsila.fi.

Abstract: "Wärtsilä Diesel is the largest medium speed diesel engine manufacturer in the world, with offices and factories all over the world. This is a case study where Wärtsilä Diesel Power Plant provides an editorial system for their subcontractors, so that they can easily produce content oriented information modules, based on the physical equipment breakdown structure (EBS) according to the WD Base-DTD. The study also covers the production system that is used in Wärtsilä to maintain and to produce presentation-oriented technical manuals from the content oriented information modules delivered by the subcontractors. We will also cover the background and problems of handling lots of information coming from several sources in different formats, why WD decided to implement a CALS/SGML information environment and what they have achieved so far.

"The editorial system consists of the WD Base-DTD that is mapped in SGML Author to templates in Microsoft WORD and a database that is used for mapping the information modules into the correct level in the EBS. This editorial system makes it very easy to author content oriented information, because of the familiar wordprocessor that helps the user to navigate in the DTD without having any knowledge about SGML. The key thing in the application is having an interface of a database from where the author chooses an information module and puts in information by using the next legal style, which follows the structure in the WD Base-DTD.

"The production system consists of tools for navigating, searching, browsing and publishing of the technical information from the main repository. When the subcontractor delivers the technical information, it will be analysed in Wärtsilä Diesel and if it becomes approved it is saved into a main repository for the information. The main tool in the production system is a browser that is configured to the relational database (main repository) that holds the EBS with the associated information modules. The tool is used for searching, viewing and publishing of the information modules in a very object oriented way. By choosing publish, the user can produce information products, such as IETM, Online and/or paper manuals very easily by 'dragging and dropping'."

Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19960911]

Penfold, David. "IS in Practice. Publishing and Printing." Part 2 in Ivanhoe Career Guide to Information Systems 1996. Edited by Aline Cumming. London: Cambridge Market Intelligence Limited [Published in association with the British Computer Society], 1996. ISBN: 1 897977 16 6. Author's affiliation: Electronic Publishing Specialist Group of the British Computer Society, 1 Sanford Street, Swindon SN1 1HJ, UK. Telephone 01793 417417; Fax 01793 480270. Email membenq@bcs.org.uk.

The Ivanhoe Career Guide to Information Systems 1996 is "a guide to the profession for students and others considering IT as a career." Penfold's article discusses the role of SGML in publishing.

Available online: http://www.bcs.org.uk/ivanhoe/part-2/g8.htm; [mirror copy].

[CR: 19970901]

Pepper, Steve. SGML in Practice - Publishing on Paper and WWW. Presentation delivered at SGML Sweden '96. Oslo, Norway: Falch Infotek, February 17, 1996. Author's affiliation: Chief SGML Architect with Falch Infotek a.s, an Oslo-based company specialising in SGML-based information re-engineering and electronic publishing. Email: pepper@falch.no.

Abstract: "Since 1994 all of the Norwegian Government's official reports (Norges Offentlige Utredninger, or the NOU series) have been produced using SGML. The same SGML source document is used to enable the publication of the printed version and two different on-line versions. As a result of this project, the information embodied in the NOU series is now available faster and more reliably to a greater number of people than before. Thanks to the use of SGML, the series can now be searched as free text, distributed via the Internet, accessed by the visually impaired, and re-used in other publications. It is also guaranteed to be available to future generations in machine-readable form. The success of the NOU Project has motivated the Norwegian Government Administration Services to use SGML for other important official publications, and a new project is currently underway.

"This presentation will describe the background to the NOU Project, the way in which it was implemented and some of the lessons that were learned. Particular emphasis will be given to the ways in which SGML enables small and medium-sized publishers to exploit new media such as CD-ROM and World Wide Web."

See a summary of the presentation in the SGML Sweden '96 program; [mirror copy].

[CR: 19961226]

Pepper, Steve. "Whirlwind Guide to SGML Tools and Vendors." Page 37 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: SGML Architect, Falch Infotek a.s, Postboks 130 Kalbakken N-0902, Oslo, Norway; Tel: +47-22902733; FAX: +47-22902599; Email: pepper@falch.no; WWW: http://www.falch.no.

[A presentation based upon the author's popular "Whirlwind Guide to SGML Tools and Vendors."]

Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19970901]

Pepper, Steve. The Whirlwind Guide to SGML Tools and Vendors. Online database version of information used in a presentation originally given at SGML Europe '94 in Montreux, "The Whirlwind Guide to SGML Tools." Oslo, Norway: Falch Hurtigtrykk A/S, 1996 [see later current version]. Extent: linked WWW pages, or approximately 25 pages in hardcopy. Author's affiliation: Falch Infotek; Postboks 130 Kalbakken; N-0902 Oslo; Norway; Phone: +47-22 90 25 00; Fax: +47-22 90 25 99. Email: pepper@falch.no.

"Abstract. SGML is an enabling technology: it doesn't actually do anything in and of itself. In order to make it work for you, you need software tools: tools to help you design your application, tools to help you get your information into SGML format, and tools to help you do something useful with it once you've got it there. This presentation aims to give a brief overview of the kinds of SGML products currently available and some of the questions you should be asking yourself (and vendors) when choosing them."

Note: The Guide is developed in several parts, with an introduction for each class of SGML software tool in the document overview. The Vendor and Tool directory is thus part of a larger document which explains the role played by different kinds of tools in an enterprise information management solution. The document sections include: (1) Document abstract; (2) Introduction: Classifying SGML Tools (according to hardware and software platform, level of SGML support, and function or activity); (3) Hardware and Software Platforms; (4) Level of SGML Support: Feature support, Syntax support, Validation services offered; (5) Activities and Functionality (planning the application, capturing the data, managing the information, putting the information to work); (6) Directory of SGML Tools and Vendors. It supplies a listing of all (known) SGML tools, categorised according to functionality, along with names, addresses and telephone numbers of the vendors. The database is an extremely valuable resource, and has been updated faithfully [from 1992 through September 1997, or later].

Available on the WWW at the URL http://www.infotek.no/sgmltool/guide.htm.

[CR: 19961030]

Pepping, Simon. "TeX en SGML bij Elsevier Science." MAPS (Minutes and Appendices - Nederlandstalige TeX Gebruikersgroep) 13 (November 1994) 118-122.

[Reference is from the PREMIUM Project]

[CR: 19960312]

Peterson, David C. "The 8879 Revision." <TAG> 9/2 (February 1996) 9. ISSN: 1067-9197.

The article summarizes some of the key changes recommended for ISO 8879:1986 and those most likely to be ratified in its revision. Several of the changes involve the integration of HyTime constructs into SGML itself.

[CR: 19950716]

Peterson, David C. "Always Use a DTD Module [Tutorial]." <TAG> 8/6 (June 1995) 7, 12. ISSN: 1067-9197.

Summarizes material presented in a poster session at SGML '93. Discusses the value of creating reusable "DTD modules" (via parameter entities) as opposed to placing a copy of a DTD into a document instance.

[CR: 19971106]

Peterson, David C. "Characters, Encodings, and XML." <TAG>: The SGML Newsletter 10/10 (October 1997) 6-8. ISSN: 1067-9197. Author's affiliation: SGML Works!.

The article discusses: "ISO 10646 and Unicode; Transmitting and Storing 16-bit Bit Patterns on 8-bit Byte-oriented Systems; 10646/Unicode and XML; Non-canonical Representations of Strings of UCS-2 Characters; UTF-8 and ASCII; Which Am I Getting: UCS-2, UTF-8, or Something Else?" On this topic, see also the excellent article written by François Chahuneau, "Unicode and Internationalization Issues in Document Management: A Global Solution to Local Problems," in The Gilbane Report on Open Information & Document Systems 5/4 (July/August 1997) 1-25; it also discusses Unicode and XML.
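
[Illustration, not taken from the article.] Two of the topics listed above, "UTF-8 and ASCII" and "Which Am I Getting: UCS-2, UTF-8, or Something Else?", can be seen in a few lines of Python. This is a minimal sketch under stated assumptions: the function names `encoded_sizes` and `sniff_encoding` are hypothetical, Python's `utf-16-be` codec is used as a stand-in for UCS-2 (the two agree for characters in the Basic Multilingual Plane), and the sniffing relies only on a leading byte order mark.

```python
import codecs


def encoded_sizes(text: str) -> dict[str, int]:
    """Byte length of the same text in UTF-8 and in 16-bit big-endian form."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16-be": len(text.encode("utf-16-be")),
    }


def sniff_encoding(data: bytes) -> str:
    """Guess the encoding from a byte order mark, if one is present.

    Only UTF-8 and UTF-16 marks are considered here; other encodings
    (and data with no mark at all) are reported as "unknown".
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    return "unknown"


# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8.
assert "SGML".encode("utf-8") == "SGML".encode("ascii")

print(encoded_sizes("SGML"))      # 4 bytes in UTF-8, 8 bytes in 16-bit form
print(encoded_sizes("François"))  # the cedilla takes 2 bytes in UTF-8
print(sniff_encoding("SGML".encode("utf-16")))
```

Without a byte order mark or an external label such as an encoding declaration, the byte stream alone does not say which encoding is in use, which is why the sketch falls back to "unknown".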

[CR: 19980203]

Peterson, David C. "Characters, Encodings, and XML, Continued." <TAG> 10/11 (November 1997) 9. ISSN: 1067-9197. Author's affiliation: SGML Works!.

The article is a continuation of the author's presentation in the October issue of <TAG>. It discusses in greater detail XML's "Document Character Set," and current WG discussions on the status of "composite" characters.

[CR: 19970306]

Peterson, David C. "Character Sets [Part 1]." <TAG> 10/2 (February 1997) 1, 5-7. ISSN: 1067-9197. Author's affiliation: David Peterson is an SGML consultant for SGMLWorks! Email: davep@acm.org.

The first article in an announced series of articles on SGML character set issues. This lead article defines some of the key terms that are used variably within different industry and standards arenas: glyph, glyph image, font, character, character repertoire, etc.

[CR: 19970331]

Peterson, David C. "Character Sets in SGML (Part 2)." <TAG> 10/3 (March 1997) 1, 5-8. ISSN: 1067-9197. Author's affiliation: David Peterson is an SGML consultant for SGMLWorks! Email: davep@acm.org.

The second in a series of tutorial articles on character sets, introducing some of the new terminology used in connection with the planned revision of SGML. See the first of the serialized articles on character sets in the February issue of <TAG>.

Peterson, David C. "Character Sets [Tutorial]." <TAG>: The SGML Newsletter 8/5 (May 1995) 8-9. ISSN: 1067-9197. Author's affiliation: David Peterson is an SGML consultant for SGMLWorks! Email: davep@acm.org.

The author discusses the parts of the SGML declaration having to do with character sets, particularly in light of character-set handling in HTML. Titled sections of the article include: Character repertoires; SGML and characters; Coded character sets; Encoding schemes.

[CR: 19960716]

Peterson, David C. "Data in SGML [Tutorial Article]." <TAG>: The SGML Newsletter 9/6 (July 1996) 4-6. ISSN: 1067-9197. Author's affiliation: Consultant for SGMLWorks! and technical expert for ISO/IEC JTC1/SC18/WG8. Email: davep@acm.org.

The author explains the different kinds of "data" (and "metadata") in SGML, with a focus upon special processing concerns that need to be understood when using "RCDATA" and "CDATA" in different contexts. The article explains various ways of hiding markup characters (as literal data) within PCDATA so that they are not recognized as markup.

[CR: 19950925]

Peterson, David C. "Dealing With 'Special' Characters." <TAG>: The SGML Newsletter 8/8 (August 1995) 1, 7. ISSN: 1067-9197.

The author discusses the definition of a "character" (and related concepts) from the perspective of the SGML standard. He explains how SGML entities can be used for special characters in the ISO 8879 scheme, and explains how the SGML "character set" relates to some industry-standard character sets.

[CR: 19961201]

Peterson, David C. "Deja Vu . . .[EMPTY elements, TAGC]." <TAG> 9/11 (November 1996) 9. ISSN: 1067-9197.

The author reflects on the current discussion (within the XML design arena) to differentiate markup for EMPTY elements (which cannot have an end-tag) from markup for elements that have an omissible end-tag -- in support of parsing an SGML/XML instance without the DTD. The current proposals are apparently very similar to one made by John Klensin (INFOODS) in 1985 -- at a time when it was recognized that the design of SGML's EMPTY element was problematic, but when it was said to be too late to turn back... Etc.

[CR: 19951208]

Peterson, David C. "Document Analysis: Running Text." <TAG>: The SGML Newsletter 8/11 (November 1995) 12-13. ISSN: 1067-9197. Author's affiliation: Consultant, SGML Works!.

The tutorial article is one in a series on basic document analysis. Peterson treats common features in "running text" and the relevance of some style rules for DTD design and tagging.

[CR: 19951015]

Peterson, David C. "Document Analysis: Section Structures [Tutorial]." <TAG>: The SGML Newsletter 8/10 (October 1995) 7-8. ISSN: 1067-9197. Author's affiliation: SGML Works!.

The document analysis tutorial discusses identification of "section" text objects, and explains when using recursion (nested section structure) to model the hierarchy may or may not be a good idea.

[CR: 19960325]

Peterson, David C. "Document Analysis: Tables." <TAG>: The SGML Newsletter 9/3 (March 1996) 6-8. ISSN: 1067-9197. Author's affiliation: David Peterson is an SGML consultant for SGMLWorks! (email: davep@acm.org).

The third in a series of tutorial articles on fundamentals of document analysis. The article includes a discussion of tables as logical structures versus tables as a display style.

[CR: 19981007]

Peterson, David C. "DTDs and XML Schemas." <TAG>: The SGML Newsletter 11/9 (September 1998) 4-5. ISSN: 1067-9197. Author's affiliation: SGML Works!.

Peterson provides extensive commentary on the definition of 'DTD' [document type definition] in SGML (ISO 8879) and then explains how XML schemas (may) relate to XML DTDs. He identifies three separate components in a DTD: syntax, semantic roles, and application semantics.

Peterson, David C. "[Tutorial:] DTDs and Document Type Declarations." <TAG> 7/8 (August 1994) 5. ISSN: 1067-9197.

Comments on the difference between "document type definition" and "document type declaration".

Peterson, David C. "Elements and Element Types." <TAG> 8/3 (March 1995) 6, 12. ISSN: 1067-9197.

Peterson, David C. "Formal Public Identifiers." <TAG> 7/3 (March 1994) 1-3. ISSN: 1067-9197.

[CR: 19971227]

Peterson, Dave. "Handling Tables in SGML: A Dream." Pages 235-239 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Dave Peterson]: Principal Consultant, SGMLWorks!, 3 Winston Road, Lexington, MA 02173; Phone: +1 617 861 8475; Email: davep@acm.org.

Abstract: "SGML products could make 'good SGML' easier by separating tabular display from tabular data organization more thoroughly. This presentation will describe and discuss the structural versus display approaches to tabular data in SGML, and will describe the author's dream table-oriented capabilities for display-oriented SGML tools, especially editors. In the process, there will be provided a description of various simple and more complicated tabular structures. We don't accept products that only recognize one or two DTDs in general; why should we for tables?"

"There is a difference between tabular display of data and tabular organization of data. Tabular display involves how data is placed on the screen or page, whereas tabular organization involves the semantic relationships between various pieces of data. Most programs, such as SGML-aware editors, that provide a tabular data display currently require that the SGML markup for the data directly indicate how it is to be displayed, rather than any structural relationship between the pieces of data, and generally require that all data to be displayed tabularly be marked up to the same DTD fragment (hereinafter called a 'schema'). This should not be necessary. My 'dream' is that it not be necessary."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

[CR: 19950716]

Peterson, David C. "Notations." <TAG> 8/4 (April 1995) 9. ISSN: 1067-9197.

This tutorial on SGML NOTATIONS explains how to use a notation with an external data entity and as an attribute of an SGML element.

[CR: 19950716]

Peterson, David C. "A Lesson to be Learned." <TAG> 7/7 (July 1995) 7. ISSN: 1067-9197. Author's affiliation: SGMLWorks! Email: davep@acm.org.

The author refers to the work of SGML Open in its review of the CALS table model as variably interpreted by implementors. The model was designed in 1989, but the DTD did not adequately address "the semantic role of each of the types of elements and attributes declared therein." According to the author, the "lesson to be learned" from the current (expensive) re-interpretation of the CALS table model is: "It's going to cost a lot of money downstream if you don't document the semantics of your element types and attributes carefully."

[CR: 19980203]

Peterson, Dave C. "More on XML Characters." <TAG>: The SGML Newsletter 11/1 (January 1998) 10. ISSN: 1067-9197. Author's affiliation: SGML Works!.

The author provides an update on XML characters, as of the December 1997 XML specification. Peterson comments on the XML WG's choice of UTF-16 (variable-length representation) rather than UCS-2 (the 16-bit two-byte canonical representation of the first 65536 characters of ISO 10646). See also "Characters, Encodings, and XML, Continued" and "Characters, Encodings, and XML."
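
The distinction between the fixed-width UCS-2 form and the variable-length UTF-16 form can be illustrated with a short sketch. This is not taken from the article; it assumes only Python's standard codecs, and the helper name `utf16_code_units` is hypothetical. A character within the first 65536 code points occupies one 16-bit unit, while a character beyond that range needs two (a surrogate pair), which UCS-2 cannot express.

```python
# Illustrative sketch: count the 16-bit code units UTF-16 needs for
# a character inside and outside the first 65536 code points.

def utf16_code_units(text):
    """Return the 16-bit code units of `text` in big-endian UTF-16."""
    raw = text.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]

bmp_char = "\u00f6"         # LATIN SMALL LETTER O WITH DIAERESIS
beyond_bmp = "\U00010000"   # first code point past the 16-bit range

bmp_units = utf16_code_units(bmp_char)
beyond_units = utf16_code_units(beyond_bmp)

print([hex(u) for u in bmp_units])      # one code unit
print([hex(u) for u in beyond_units])   # two code units (surrogate pair)
```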

[CR: 19970825]

Peterson, David C. "Objects, Classes, Trees, and Groves." <TAG>: The SGML Newsletter 10/8 (August 1997) 9-10. ISSN: 1067-9197. Author's affiliation: Consultant, SGML Works!; Email: davep@acm.org.

The author discusses SGML/HyTime 'groves' in light of the emerging importance of this notion in the development of SGML systems. The concept was formalized in the HyTime TC [1997]. Earlier articles by the author on architectural forms and groves (from September and November 1995) are referenced. [Dave Peterson, "Objects, Classes, Trees, and Groves." <TAG>: The SGML Newsletter 10/8 (August 1997) 9-10; "Trees, Groves, and SGML." By Dave Peterson. <TAG>: The SGML Newsletter 8/12 (December 1995) 11-12.]

[CR: 19950925]

Peterson, David C. "Object Oriented Architectural Forms." <TAG>: The SGML Newsletter 8/9 (September 1995) 14. ISSN: 1067-9197.

[CR: 19961029]

Peterson, David C. "Permissive DTDs." <TAG> 9/9 (September 1996) 8-9. ISSN: 1067-9197. Author's affiliation: SGMLWorks! Email: davep@acm.org.

The author examines the trend toward using "permissive" DTDs - a DTD that "permits different ways of organizing the same information, not because the different ways are needed (presumably because they have slightly different semantics), but simply to accommodate the preferences of different users." He warns that many common motivations for adopting a permissive DTD are uncritical, and end up working against the goals of the information management project.

[CR: 19980612]

Peterson, Dave. "The Rocky Road to Unicode." <TAG>: The SGML Newsletter 11/5 (May 1998) 7-8. ISSN: 1067-9197. Author's affiliation: SGML Works!.

Dave Peterson discusses the special problems raised by the fact that there is frequently more than one way to represent the same abstract character in Unicode (for example, "ö" (e.g., using 16-bit and 32-bit representation)). Some languages use a base character having several stacked diacritics, where differential ordering of the combinations would create different bit patterns for the "same" distinct (abstract) character, from XML's perspective. Where a Unicode-compliant piece of software ought to be able to equate equivalent representations, ISO/IEC 10646 "does not address the issue." A question remains as to whether the XML specification will address this issue, and what the consequences will be for designers and users of XML applications/implementations if the issue is not formally addressed.
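
The problem of multiple representations for one abstract character can be made concrete with a minimal sketch. It is not drawn from the article: it assumes Unicode normalization as exposed by Python's standard `unicodedata` module, and the variable names are hypothetical. The first pair compares a precomposed "ö" with a base letter plus combining diaeresis; the second compares two orderings of stacked diacritics on the same base letter.

```python
import unicodedata

# Two encodings of the same abstract character "ö".
precomposed = "\u00f6"    # single code point
decomposed = "o\u0308"    # "o" followed by COMBINING DIAERESIS

raw_equal = precomposed == decomposed
nfc_equal = (unicodedata.normalize("NFC", precomposed)
             == unicodedata.normalize("NFC", decomposed))

# Stacked diacritics entered in two different orders:
# COMBINING DOT BELOW (U+0323) and COMBINING DOT ABOVE (U+0307).
order_one = "a\u0323\u0307"
order_two = "a\u0307\u0323"

stacked_raw_equal = order_one == order_two
stacked_nfd_equal = (unicodedata.normalize("NFD", order_one)
                     == unicodedata.normalize("NFD", order_two))

print(raw_equal, nfc_equal)
print(stacked_raw_equal, stacked_nfd_equal)
```

A plain string comparison treats the variants as different, whereas comparison after normalization treats them as the same; software that skips the normalization step sees distinct bit patterns.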

[CR: 19961226]

Peterson, David C. "The SGML Character Model." Pages 681-686 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: SGML Works!, 3 Winston Road, Lexington, MA 02173, USA. Tel: +1 617 861 8475; Email: davep@acm.org .

Abstract: "There are differences of opinion as to how the current SGML standard (ISO 8879 as amended in 1988) should be interpreted with respect to the handling of the characters that make up the SGML documents it describes. But a consensus has pretty well been achieved as to how the revision now being worked on will treat 'characters' and 'character strings', and how the 'character sets' described in an SGML declaration will be interpreted and used. This paper presents the character model that is being considered by the group working on the revision of ISO 8879 (the SGML Rapporteur Group of ISO/IEC JTC1 SC 18 WG8).

Characters are recognized as 'abstract' data types, just as, for example, are integers. The new model will not assume, for example, that characters of a given character repertoire are always represented by fixed-width bit strings and that strings of characters are not always represented by direct concatenation of the representations of single characters.

The new character model clarifies the relationship between the character representations being used by an SGML system, the character representations used to store external entities, and the character sets described in the SGML declarations of SGML documents. It provides for the possibility of character representation information being in the SGML declaration's 'document character set' description or the 'formal system identifier' of an entity, or even being provided via external-to-the-document, system-dependent means."

Note: The above presentation was part of the "And More..." track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19971123]

Peterson, Dave. "The SGML Character Model." Page(s) 245-250 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Consultant, SGML Works!; Email: davep@acm.org.

Abstract: "SGML was designed in an environment where other-than-8-bit character representations were only vaguely known and not understood. The designers did not differentiate between (abstract) characters and the bit-patterns by which they are represented in machines. This resulted in a character-handling model that is no longer adequate in many respects. In addition, there have surfaced differences of opinion as to how the current SGML standard (ISO 8879 as amended in 1988) should be interpreted with respect to the handling of the characters that make up the SGML documents it describes.

"A new character and character-string model has been adopted by the SGML Rapporteur Group within WG8 [now WG4], where the ISO 8879 revision is being prepared. The new model encompasses handling of variable-width-character string representations such as Shift-JIS, and outside-the-document specification of character representations, as well as the traditional 'document character set' specification.

"A Technical 'Correction' to ISO 8879 was made official in 1996, which made it more feasible to use SGML with very-large-character-set languages such as Japanese and Chinese, for use on SGML systems not constrained to an 8-bit character set.

"This presentation will explain the distinction between (abstract) characters and the computer representations of characters, and will explain the new character handling model in terms thereof. It will further explain the relationship to the old character-handling model of 1986, and how older systems may be upgraded, and what is possible when still running under the old (1986/88) rules. This involves the relationship of the 'document character set' to the way systems may actually represent characters, and the use or non-use of the 'shunned character numbers' specification."

[CR: 19951220]

Peterson, David C. "Trees, Groves, and SGML [Tutorial Article]." <TAG> 8/12 (December 1995) 11-12. ISSN: 1067-9197. Author's affiliation: SGML Works!

Peterson, David C. "Peterson Works Tables. Three Tutorials Relating to [SGML] Tables." <TAG> 7/11 (November 1994) 7-8, 12. ISSN: 1067-9197.

[CR: 19960716]

Peterson, David C. "Versioning." <TAG>: The SGML Newsletter 9/6 (June 1996) 7-9. ISSN: 1067-9197. Author's affiliation: Consultant for SGMLWorks! and technical expert for ISO/IEC JTC1/SC18/WG8. Email: davep@acm.org.

The author explains and illustrates the pitfalls of using nested marked sections, including precedence rules and (non-)support on the fine points by some of the leading SGML software packages, depending upon where the marked sections occur. Some of the complexities and problems will be addressed in the revision of ISO 8879.

[CR: 19960520]

Peterson, David C. "Your First Document Analysis." <TAG> 9/5 (May 1996) 1, 5-6. ISSN: 1067-9197. Author's affiliation: SGML Works!.

Overview of the document analysis process from the perspective of an enterprise representative working with outside SGML consultants, with a view to conversion of legacy data.

[CR: 19950828]

Phillips, Lisa. A DTD for STEP Integrated Resources. ISO TC184/SC4 Editing Committee Version 1.0, N43. [Gaithersburg, MD]: National Institute of Standards and Technology, September 5, 1994.

Available online: stepir.dtd and readme.txt, or in mirror copy here [August 1995].

[CR: 19961022]

Phillips, Lisa; Lubell, Joshua. An SGML Environment for STEP. NISTIR [Technical Report] 5547. Gaithersburg, MD: National Institute of Standards and Technology, June [October], 1994. Authors' affiliation: National Institute of Standards and Technology, Building 220, Room A127, Gaithersburg, MD 20899-0001, USA. Email: phillips@cme.nist.gov AND lubell@cme.nist.gov.

Abstract: "The Standard for the Exchange of Product Data (STEP) is emerging under the auspices of the International Organization of Standards (ISO). As part of a major NIST effort in support of the advancement of STEP, NIST is developing an environment which will facilitate and accelerate the development of the component specifications in STEP, known as Application Protocols (APs). This environment is called the Application Protocol Development Environment (APDE), the purpose of which is to provide an integrated suite of STEP-tailored tools to assist STEP AP developers in the development of high-quality APs. A major part of the APDE will be document authoring, browsing, and publishing environment based on the Standard Generalized Markup Language (SGML), an ISO standard (ISO 8879) which is used to specify format. The SGML environment is expected to address two major challenges faced by the developers of STEP documents. These challenges are: 1) to ensure accurate interpretation and conformance to the specified structure of STEP documents in a "reasonable" amount of time and 2) to be able to intelligently query and access information from the component parts of the standard. The SGML environment for STEP will accomplish this by providing the facilities to guide and quicken the development of STEP documents through structure-based editing, by providing the ability to issue intelligent, structure-based queries against STEP documents and by helping to ensure that STEP documents are consistent and structurally correct."

Apparently an alternate version of the document is An SGML Environment for STEP, NISTIR 5515, November 1994 (also published in the Proceedings of SGML '94, Vienna VA, Graphic Communications Association). PostScript version: http://www.nist.gov/msidlibrary/doc/phill94a.ps; [mirror copy]. For an online version of NISTIR 5547: http://elib.cme.nist.gov/msid/pubs/phill94a.ps; [mirror copy]. See the main entry for SGML and STEP (ISO 10303 Standard for the Exchange of Product Data).

[CR: 19971123]

Pieper, Frank. "Document Structure Independent Data Modelling." Page(s) 133-140 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Research and Development Manager, MediaWare B.V., The Netherlands .

Abstract: "This presentation provides an overview of four techniques that combine the principles of data storage and generalized markup into database publishing systems. These four techniques are ordered by increasing document structure flexibility. Their effects are illustrated by a simple yet realistic example. The conclusion argues in favor of document structure independent data modeling.

"Consider a class of database publishing systems that exploit the principles of generalized markup. Each system in this class encompasses three main subsystems: a data storage subsystem, processes that transform the stored data into SGML, and processes that transform the SGML documents into final publication form.

"We are interested in the impact that the chosen style of storing data has on the structural flexibility of the database publishing system. In other words, we wonder how much effort, and of what kind, is involved in adding a DTD (Document Type Definition) to the publication domain, or in altering an existing DTD.

"Four styles of data storage are beforehand, namely: 1) storing text files containing the SGML; 2) a database whose schema has been designed after one or more DTDs; 3) a generally applicable SGML database; and 4) databases designed independently of document structure. Each of these approaches has its own advantages and disadvantages, but regarding structural flexibility, document structure independent data modeling is the winner by far."

[CR: 19960730]

Pino, Marta. Encoding two large Spanish corpora with the TEI scheme: design and technical aspects of textual markup. Paper presented at Digital Libraries Workshop 1996, Organized by Nancy Ide and Judith Klavans, Held in conjunction with the First ACM International Conference on Digital Libraries, Bethesda, Maryland. Poughkeepsie, New York / New York, NY: Vassar College, Department of Computer Science / Columbia University, Department of Information Services, 1996. Author's affiliation: Instituto de Lexicografia, Real Academia Espanola. Email: mpino@crea.rae.es.

"This paper has tried to show that the TEI scheme results very suitable to encode large amounts of electronic text, like in the case of the Spanish corpora. The TEI provides encoding solutions for many different types of application, but it is almost impossible to use the whole tag set in a particular text or collection of texts. The use of the TEI requires a thorough analysis of its principles and a selection of a reduced tag set, according to the purposes of the text that is going to be encoded." [from the Conclusion]

The document is available online [mirror copy]. See the main workshop entry or the program listing for other workshop details.

[CR: 19960202]

Pitti, Daniel V. "Settling the Digital Frontier: The Future of Scholarly Communication in the Humanities." In [Proceedings of the Berkeley Finding Aid Project Conference]. Berkeley Finding Aid Project Conference. Morrison Room of the Doe Library, University of California, Berkeley. April 4-6, 1995. Sponsored by the Commission on Preservation and Access. Berkeley, CA: Berkeley Finding Aid Project, 1995. Author's affiliation: Librarian for Advanced Technologies Projects, The Library, University of California, Berkeley.

"A collection-level structured document like the finding aid clearly would serve this function well. The kind of collection that I have in mind here, though, is not one determined by the archival principle of provenance, but rather a collection that represents the shared interests of intellectual communities. Fortunately the model for such a collection already exists in the print world. It is the comprehensive, critical subject bibliography. In order for the bibliography to serve as the new axis of electronic academic publishing, we need to design it in such a way that it can be collaboratively and cooperatively built and extended by publishers, scholarly societies, and libraries. It must enable all of the critical functions to take place under the control and jurisdiction of the appropriate experts and professionals. An underlying assumption is that the subject bibliography will be organized hierarchically." [extracted]

Available online in HTML format: http://www.lib.berkeley.edu/AboutLibrary/Projects/BFAP/dpitti.html [mirror copy]. For more on the conference, see information on the Sunsite WWW server.

[CR: 19961205]

Pitti, Daniel V. Standard Generalized Markup Language and the Transformation of Cataloging. Paper presented at the annual conference of the North American Serials Interest Group, Vancouver, BC. Berkeley, CA: University of California, Berkeley, Friday, June 3, 1994. Extent: approximately 9 pages. Author's affiliation: Librarian for Advanced Technologies Projects, University of California, Berkeley. Email: dpitti@library.berkeley.edu.

[Summary]: ". . .SGML is a general standard capable of embracing a wide variety of text documents, and of using that text to provide access to and control of a multitude of online information formats. Hence SGML can serve as the basis of a comprehensive, integrated, multimedia, text-based information environment. In essence, it would be possible to use SGML as the general underlying standard for the information that we use to catalog, the catalog records we create, and the electronic texts we catalog; and further yet, we could use the text in any of these domains to provide entry to and control of extra-textual digital objects. . . I believe librarians, and in particular, catalogers, have a professional obligation to actively assert themselves in the creation of this information universe." [extracted]

The published version of the paper is available in the proceedings, from Haworth Press. Online versions: NASIG Paper from Berkeley FTP server, or online version in minimal HTML format.

Polk, W. Timothy; Bassham, Lawrence E. III. "A Window and Icon Based Prototype for Expert Assistance for Manipulation of SGML Document Type Definitions." Pages 79-84 (with 6 references) in Proceedings of the ACM Conference on Document Processing Systems, Santa Fe (ACM Conference on Document Processing Systems, Santa Fe, New Mexico, 5-9 December 1988, sponsored by ACM SIGGRAPH, SIGOIS, and SIGIR). New York: Association for Computing Machinery, 1988. ISBN: 0-89791-291-8 (ACM Order Number: 429882). Authors' affiliation: National Institute of Standards and Technology.

Abstract: The DTD editing tool is a window and icon based tool for the creation, manipulation, and comprehension of SGML Document Type Definitions (DTDs). This tool allows users to manipulate SGML DTDs without any knowledge of the rather complex SGML syntax. More generally, the tool allows users to manipulate context-free grammars without any knowledge of the syntax used to describe them. The tool generates SGML DTDs, and has features specific to that application; however, the approach could also be applied to manipulation of context-free grammars represented in other grammar description languages.

[CR: 19950716]

Popham, Michael G. "UK [SGML Users' Group] Chapter Meeting, 9th March 1994. Corporate SGML - Business Case - Strengths, Weaknesses, Paybacks, and Risks." SGML Users' Group Newsletter 27 (May 1994) 3-7. ISSN: 0952-8008. Author's affiliation: The SGML Project, University of Exeter.

The article contains a detailed report of the meeting, with presentations by Pam Gennusa, (President, International SGML Users' Group), Paula Leenheer (EDMS Project Leader), Mike Allen (Glaxo Research and Development), and Martin Bryan (The SGML Centre).

Popham, Michael. "An Update on SGML." Information Management and Technology 27/6 (November 1994) 247-50, 262. Author affiliation: Exeter University, UK.

Abstract: SGML, the Standard Generalized Markup Language is an International Standard for information representation (ISO:8879 Information processing-text and office systems-Standard Generalized Markup Language (SGML), 15 October 1986-Amendment 1, 1 July 1988). SGML can be used for publishing in its broadest definition, ranging from single medium conventional publishing, e.g., on paper, to multimedia database publishing, e.g., an online journal containing hypertext links, still graphics, video and sound. Since 1988 SGML use has continued to grow steadily. Its influence has spread beyond the boundaries of traditional document publishing, and now it is of interest to anyone involved with the creation or use of information stored in electronic form.

Popham, Michael G. "Use of SGML and HyTime in UK universities." Information Services and Use [Workshop on Hypermedia and Hypertext Standards, Amsterdam, Netherlands, 22-23 April 1993. Sponsored by CEC.] 13/2 (1993) 103-109. Author affiliation: Exeter University, UK.

Abstract: A description is given of current and future uses of SGML and HyTime within UK universities. The author proposes that academics' reasons for adopting SGML and HyTime are to solve problems which are, in fact, identical to those also faced by those working outside the academic community. The research initiatives and experiences of academics can provide other potential users of SGML and/or HyTime with several valuable lessons, which can save them time, money and wasted effort. The work of academics can also provide useful, practical examples of the effective use of SGML, and of the benefits to be gained therefrom.

[CR: 19971202]

Popham, Michael; Burnard, Lou. "Putting Our Headers Together: A Report on the TEI Header Meeting of 12 September 1997." Pages 103-106 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Authors' affiliation: Oxford Text Archive, Oxford University; Email: michael.popham@oucs.ox.ac.uk, and lou.burnard@oucs.ox.ac.uk.

Summary: "An increasing number of electronic text centres, libraries, and archives from around the world are deciding to follow the principles and practices outlined in TEI P3, and hence seeking to adopt the effective use of the TEI Header as a means of describing and documenting electronic textual resources. Metadata of the kind described by the header has a vital part to play in information management and retrieval - yet variant practices with respect to the format and use of the header abound. For existing and potential users, the flexibility offered by TEI P3 is one of its most attractive features. The Guidelines allow for widely divergent approaches to the basic issues of encoding electronic texts, and providing metadata in the form of TEI Headers. This is entirely appropriate for a general purpose scheme, and for individual scholars seeking only a scheme capable of expressing their (often complex) analytic needs. For implementers working within a common framework, and with similar objectives however, this generality and expressivity imposes an additional burden. Such implementors must identify a mutually acceptable code of practice in the application of the scheme to their needs, or compromise one of the very purposes for which the TEI scheme was designed - the mutual interchangeability of texts and their associated headers. This paper will present the results of an attempt to address this problem at source, by bringing together an initial core of expert TEI header creators with the explicit goal of sharing their expertise and co-ordinating (if possible) their practice."

The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/popham.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.

Poppelier, Nico. "Pre-publication Review [of Eric van Herwijnen], Practical SGML, 2nd edition." TUGboat 15/1 (March 1994) 24-25. Author affiliation: Elsevier Science Publishers; email: n.poppelier@elsevier.nl.

See also Poppelier's review of the first edition of Practical SGML in TUGboat 13/2 (July 1992) 184-185.

[CR: 19951228]

Poppelier, Nico A. F. M. "SGML and TeX in Scientific Publishing." TUGboat: The Communications of the TEX Users Group [= Proceedings of TeX90] 12/1 (March 1991) 105-109. ISSN: 0896-3207. Author's affiliation: Elsevier Science Publishers, Physical Sciences and Engineering Division, Netherlands. Email: n.poppelier@elsevier.nl.

"Elsevier Science Publishers has for a few years investigated the possibility of accepting compuscripts, a manuscript in electronic form, created with TeX, LATEX and a few other text processing systems, and converting these to SGML form. This paper will discuss the current status of these activities, the reasons for converting compuscripts to SGML form, and the various ways in which TeX is used."

Note also the article of Kees van der Laan in this issue: "SGML (,TeX and . . .)".

[CR: 19961119]

Poppelier, N. A. F. M.; Herwijnen, Eric van; Rowley, C.A. "Standard DTDs and Scientific Publishing." EPSIG News 5/3 (September 1992) 10-19.

The article was posted to the discussion group 'sgml-math', and is also available in Postscript format [dated 7-August-1992] on the Elsevier FTP server. See further on Elsevier Science in the main Elsevier entry. A mirror copy of the document is available here. [Abstract also needed.]

[CR: 19971024]

Powell, Christina Kelleher; Kerr, Nigel. "SGML Creation and Delivery. The Humanities Text Initiative." D-Lib Magazine (July/August 1997). ISSN: 1082-9873. Authors' affiliation: [Powell:] Humanities Text Initiative, University of Michigan; [Kerr:] Digital Library Production Services, University of Michigan.

Introduction: "The Humanities Text Initiative (HTI) is an SGML creation and support unit at the University of Michigan and is part of the university's Digital Library Production Service (DLPS). Created in 1994, the HTI's origins are in the University Library's 1988 efforts to create an Internet based 'textual analysis' capability through a service then known as UMLibText. With a wider variety of collections and a broader user base than UMLibText, the HTI was designed as an umbrella organization for the creation and maintenance of online text, and as a mechanism for furthering the University's capabilities in the area of electronic text. Since its creation, the HTI has amassed perhaps the Internet's largest and certainly richest collection of materials in SGML (as of the date this article was written, there are almost 2 million pages of encoded text using 14 different DTDs online). As well as creating text collections available to the Internet community and working with scholars on the creation of new electronic editions, the HTI supports the delivery of externally-created SGML collections and has collaborated with publishers and with other academic operations to design and build local access mechanisms for their titles. In 1996, the HTI launched the SGML Server Program, which leverages Michigan's years of experience with electronic texts to assist in the development of SGML support at other academic institutions."

The article is available online in HTML format; local archive copy. Note that the July/August 1997 double issue of D-Lib Magazine (Amy Friedlander, editor) contains several articles referencing the use of SGML encoding in digital library research.

[CR: 19981021]

Prescod, Paul. Formalizing XML and SGML Instances with Forest Automata Theory. Draft Technical Paper. Waterloo, Ontario: University of Waterloo, Department of Computer Science, May 5, 1998. Extent: 16 pages. Author's affiliation: The Department of Computer Science, University of Waterloo; Email: Paul Prescod; WWW: Home Page.

Abstract: "The notions of schemata and validation are widely deemed to be crucial to the success of XML and SGML. Yet there are very few people who consider the formal definition of these terms nor formal definitions of how individual schema languages work. This paper surveys a particular formalization for schemata that is sufficiently powerful to implement recursive SGML content model validation and also many features that SGML content models do not support."

Online preliminary version of the paper; [local archive copy]. See also the database entry "SGML/XML and Forest/Tree Automata Theory," where some technical papers of Murata Makoto are referenced.

[CR: 19970331]

Prescod, Paul. "Multiple Media Publishing in SGML." Pages 3-9 (with glossary) in Conference Proceedings, SIGDOC '96. The 14th Annual International Conference on Computer Documentation. ["Marshalling New Technological Forces: Building a Corporate, Academic, and User-Oriented Triangle"]. SIGDOC '96: 14th Annual International Conference. Research Triangle Park, North Carolina, US. October 20-23, 1996. Sponsored by the Association for Computing Machinery Special Interest Group on Documentation (SIGDOC). New York, NY: Association for Computing Machinery, 1996. ISBN: 0-89791-799-5. Author's affiliation: Department of English, University of Waterloo, Ontario, Canada.

Abstract: "In recent years, authors of scholarly materials have had to choose between a bewildering number of formats for information distribution. Some formats, such as Microsoft Word's file format or PostScript, assured excellent print quality. Others, like the Web's HTML and Microsoft's Windows Help format, allowed fast electronic distribution. None allowed for optimal print and online representation. To compound the problem, new formats are being created every day. The Internet world had seen Gopher, HTML, HTML 2.0, HTML 3.0, HTML 3.2, Hyper-G's HTF, and Adobe's PDF. In the print world, PostScript, WordPerfect, MS Word 2.0, MS Word 6.0, and Rich Text Format have vied for favor. The University of Waterloo English Department successfully used the Standard Generalized Markup Language (SGML) to transcend these 'notation wars' and deliver high-quality World Wide Web and print documentation for on-campus and distance education students. We found the system robust enough that we also taught students to use SGML for multimedia publishing in a four-month course in technical writing, English 210E."

Several other articles in this proceedings volume are germane to SGML: Tom Banfalvi, et al., "Manufacturing Documentation in the Virtual Warehouse"; Betsy Brown, et al., "From Hardcopy to Online: Changes to the Editor's Role and Processes"; Paul Beam and Peter Goldsworthy, "Technical Writing on the Web-Distributed SGML-Based Learning"; Stephanie Copp, "Working with Academe"; Cindy Roposh, et al., "Developing Single-Source Documentation for Multiple Formats"; Lin-Ju Yeh, et al., "SSQL: a Semi-Structured Query Language for SGML Document Retrievals"; Dee Stribling, et al., "A Real World Conversion to SGML".

[CR: 19971227]

Prescod, Paul; McCool, Michael. "Strategies for DSSSL Code Reuse." Pages 317-323 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Authors' affiliation: [Paul Prescod]: University of Waterloo, Department of Computer Science, Computer Graphics Lab, Waterloo, Ontario Canada N2L 3G1; Phone: +1 (519) 888-4567 x4422; FAX: +1 (519) 885-1208; WWW: http://itrc.uwaterloo.ca/~papresco; Email: paprescod@cgl.uwaterloo.ca; [Michael McCool]: Assistant Professor, University of Waterloo Department of Computer Science, Computer Graphics Lab, Waterloo, Ontario Canada N2L 3G1; Phone: +1 (519) 888-4567 x4422; FAX: +1 (519) 885-1208; WWW: http://www.cgl.uwaterloo.ca/~mmccool/; Email: mmccool@cgl.uwaterloo.ca.

Abstract: "The DSSSL (Document Style Semantics and Specification Language) Style Language is the International Standard for specifying a formatting procedure for a document. DSSSL stylesheets can be simple declarative specifications of formatting to be applied to elements. But if the situation warrants they can also be complex computer programs. When they are complex it becomes important to be able to reuse code in order to save development time and increase reliability and consistency. We have investigated mechanisms for reusing code across document types and for designing style specifications that would publish documents both on the web and in print."

"We evaluated and experimented with simple templates, multi-stage transformations, HyTime architectural forms and a convention we call pseudo flow object constructors. This paper will compare these strategies, discuss the applicability of non-DSSSL tools and consider some extensions to DSSSL to make code reuse easier."

"Despite the difficulties we still feel that higher-level flow objects are a reasonable route to code and style reuse. Until the Web matures into a DSSSL-capable platform this route will be the only one that can capture the commonalities of print and web delivery. Even after the Web matures, higher level flow objects could speed the creation of stylesheets for new document types. DSSSL as it stands does not seem to be a hospitable development environment for creating higher-level objects. There are a few ways this situation could be improved. . ."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

[CR: 19961009]

Prettyman (Reenie), Maureen. "Conversion to SGML: The Pain, The Gain." <TAG> 9/9 (September 1996) 6-8. ISSN: 1067-9197. Author's affiliation: Information Technology Branch, Lister Hill Center for Biomedical Information, National Library of Medicine. Email: reenie@nlm.nih.gov.

"Our experience with the electronic dissemination of full text information at the National Library of Medicine (NLM) has taught us that, even though the process of conversion to SGML is time-comsuming, labor-intensive, and expensive, the transformed data can be re-used or re-published in an infinite variety of ways with very little additional effort or cost. . . The value of maintaining the text in SGML-encoded format has been proven repeatedly during the 6- or 7-year period we have been making full text information available." [extracted]

See also "SGML Encoding of an Online Medical Reference Work," presented with Charles M. Goldstein at SGML '89 (Atlanta, October 24-27, 1989). For further information on the use of SGML by the National Library of Medicine for database publishing, see the main entry for the NLM.

[CR: 19971227]

Prettyman, Maureen. "SGML as a Navigational Tool for Accessing Information." Pages 65-70 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Maureen Prettyman]: National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894; Phone: +1 (301) 496-1936; FAX: +1 (301) 480-6183; Email: reenie@nlm.nih.gov.

Abstract: "The challenge of navigating the morass of available electronic information is a frequent topic of news groups, newspapers, and professional association publications. Global indexing schemes are proliferating in response to this concern and many are providing at least minimally effective information organization and search aids. However, indexing alone is not sufficient. Much information is inherent within the structure (e.g., chapter headings, section headings, and their relation to each other) of full text scientific and technical documents.

"At the NLM (National Library of Medicine), NIH (National Institutes of Health), we have been working on a relatively small scale project that demonstrates the larger issue of how to provide improved search discrimination for, and navigation of related but diverse collections of biomedical full text documents by exploiting the structural information. This project supports the NLM collection of HSTAT (Health Services/Technology Assessment Text) which currently includes over 120 different on-line documents ranging from a 600-page monograph to a six-page pamphlet.

"Our approach has been to use SGML (Standard Generalized Markup Language) to define the objects of information in the collection. The use of SGML (required, by our FTRS [Full Text Search and Retrieval System]) enables us to provide the users a roadmap to the information available via a dynamically created table of contents, context-qualified search results, and context-flags while browsing the full text. The benefits of this approach include the users' ability to expand the table of contents to the lowest possible level of structure in the document, and the system's ability to consistently provide context information for all information retrieved -- whether browsing the text or submitting specific searches. Challenges have included the conversion of legacy data to SGML and development of a suitable DTD (Document Type Definition) - a DTD that is at the same time specific enough to support a particular document and general enough to provide some continuity across the collection.

"Although the labor-intensive nature of legacy data conversion to SGML has been costly, the project has been very successful. The SGML-encoded documents have allowed the development of algorithms to automatically convert to many different formats including HTML (Hypertext Markup Language), print, word processing, PDF (Portable Document Format), etc., for reuse. It has proved to be an eminently scalable system, growing from the original 60 documents to the current size -- and still growing. Finally, the use of SGML has provided a framework that allows us to constantly improve the search and retrieval capabilities and add new features without incurring the cost of continually reprocessing the entire collection. HSTAT is accessible at http://text.nlm.nih.gov."

This paper was delivered as part of the "Newcomer" track in the SGML/XML '97 Conference.

[CR: 19961226]

Price, Christopher. "The Daily Production of Multiple Media Publications in the 11 Official Languages of the European Union." Pages 161-170 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Sales Manager for European Union Business, Saarbrücker Zeitung, Verlag und Druckerei GmbH, EU Sales Departement, Untertürkheimerstrasse 15, Saarbrücken, Germany; Tel: 0049-681-502-1802; FAX: 0049-681-502-1819; Email: hkl@sz-sb.de.

Abstract: "The paper provides an overview of the [following elements]: (1) Reasons which spurred the Commission of the European Union to seek and find 'the SGML solution' for their own Official Journal of the European Communities and its Supplement, including a discussion as to how the technical issues surrounding SGML and the production constraints hampered the full implementation of 'pure' SGML production systems for a decade.

(2) The decision to implement 'SGML transition systems' and an account of the consequential experience gained through their implementation, along with an insight as to how Multilingual, Multiple Media 'pure' SGML production systems will be in place before the end of 1996, thanks to the increasing availability of ever more sophisticated software tools and the recent availability of reasonably priced computer processing power.

(3) Philosophy of the new technical concepts and the names of the products which comprise the new production systems.

(4) Meeting of the technical challenges involved in providing Multilingual, Multiple Media SGML services, including overcoming the issues related to the implementation of special character sets and a review of a number of both critical personal and business decisions which were made in order to maximise the scope and optimise the Multilingual, Multiple Media SGML services being provided to the European Union."

Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

Price, Lynne A. "An Alternate Representation of SGML Content Models." SGML Users' Group Bulletin 2/2 (1987) 125-126. ISSN: 0269-2538. Author affiliation: Hewlett-Packard, 3200 Hillview Avenue, Palo Alto, CA 94304 USA.

"Annex H of ISO 8879 SGML describes a substitution for the & connector of SGML content models that permits a simple translation of model groups to regular expressions. Finite-state automata (FSAs) corresponding to the resulting regular expressions can be used to parse document instances. However, a less space-consuming representation is preferable. One such representation is described here. Each content model is represented by a set of sets of FSAs rather than by a single FSA..."[from the introduction].

Price, Lynne A. "Graphic Representation of Content Models." <TAG> 10 (July 1989) 12-16.

The article demonstrates the use of tree structures and (more extensively) FSAs to represent SGML content models. FSAs are useful in revealing ambiguity (seemingly equivalent models). The article is derived from the author's tutorial session at the ACM Conference on Document Processing Systems, Santa Fe, New Mexico (5-9 December 1988).

[CR: 19951228]

Price, Lynne A. "A Note Comparing SGML to Text Processing Macro Languages." SGML Users' Group Bulletin 2/2 (1987) 127. ISSN: 0269-2538. Author's affiliation: [was] Hewlett-Packard.

Price, Lynne A. "A Parser Generator for SGML." Pages 118-23 (with 4 references) in PROTEXT IV. Proceedings of the Fourth International Conference on Text Processing Systems. International Conference on Text Processing Systems, Boston, MA, USA 20-22 October 1987. Sponsored by INCA - Institute for Numerical Computation and Analysis. Edited by John J. H. Miller. Dun Laoghaire, Ireland: Boole Press, Ltd., 1987. vii + 153 pages. ISBN: 0-906783-80-1 (hardback); 0-906783-79-8 (paperback). Hewlett-Packard Co., Palo Alto, CA, USA.

Abstract: The Standard Generalized Markup Language (SGML), defined in International Standard 8879, allows multiple processes to be performed on a single document. The same source file, for example, can be formatted, loaded into an online database, or analyzed for various linguistic properties. SGML-based software that is not restricted to particular applications must therefore provide tools for defining new applications. Internal software is being developed at Hewlett-Packard for using SGML in the production of user manuals. At the center of this activity is a general-purpose SGML parser and application generator, called MARKUP. This paper describes MARKUP from the perspective of the programmer responsible for development of SGML applications.

[CR: 19960202]

Price, Lynne A. "MARKUP: Hewlett-Packard's SGML Implementation." SGML Users' Group Bulletin 2/2 (1987) 116-118. ISSN: 0269-2538. Author affiliation: Hewlett-Packard, 3200 Hillview Avenue, Palo Alto, CA 94304 USA.

The author describes MARKUP, Hewlett-Packard's in-house SGML parser and application generator. The software assists in the publication of user guides and technical reference manuals that are produced in over fifty writing departments worldwide on a variety of computing platforms. SGML supplies a means of expressing standards for document design, and SGML documents are used for interchange.

A version of the paper was presented at the Corporate Electronic Publishing Systems Federal Conference (US), 1987; see the published proceedings.

Price, Lynne A. "The Problem with Ambiguous Content Models." SGML Users' Group Bulletin 3/1 (1988) 25-26.

Abstract: "IS 8879 defines an ambiguous content model - one for which an element or character string occurring in a document instance can satisfy more than one primitive content token without look-ahead - to be a nonreportable markup error. In other words, it is an error to use an ambiguous content model, but SGML software will not necessarily detect the condition. SGML could have been defined to give a unique interpretation to each possible content model. For example, an element or character string occurring in a document instance could have been interpreted as matching the leftmost possible primitive content token. Such a definition, however, would not have given consistently intuitive results. Several illustrations of this point are given below."
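
[Illustration, not taken from the article.] The article's own examples are not reproduced in this entry. As a minimal sketch of the point, assume the hypothetical content model (a?, a) and a simplified matcher that handles only sequences of required or optional tokens. A "leftmost token wins" rule rejects an instance that a look-ahead matcher accepts:

```python
def leftmost_match(model, instance):
    """Greedy rule: each element goes to the leftmost token that accepts it."""
    pos = 0
    for name, optional in model:
        if pos < len(instance) and instance[pos] == name:
            pos += 1            # the leftmost candidate token takes the element
        elif not optional:
            return False        # a required token was left unsatisfied
    return pos == len(instance)


def lookahead_match(model, instance):
    """Backtracking rule: try both using and skipping each optional token."""
    if not model:
        return not instance
    (name, optional), rest = model[0], model[1:]
    if instance and instance[0] == name and lookahead_match(rest, instance[1:]):
        return True
    return optional and lookahead_match(rest, instance)


# Hypothetical ambiguous model (a?, a): a lone <a> could satisfy either token.
MODEL = [("a", True), ("a", False)]

print(leftmost_match(MODEL, ["a"]))    # the optional token consumes the only <a>
print(lookahead_match(MODEL, ["a"]))   # skipping the optional token succeeds
print(leftmost_match(MODEL, ["a", "a"]))
```

Under the leftmost rule the single element is assigned to the optional token, so the required token is never satisfied, even though the instance is intuitively valid.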

[CR: 19971123]

Price, Lynne A. "The Pros and Cons of Industry-Standard DTDs." Page(s) 205-208 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Chair: Lynne A. Price, Text Structure Consulting, Fremont, CA, USA; Email: lprice@ix.netcom.com.

Abstract: "A panel of experienced users (including Marcy Thompson, Neil Bradley, Roger Boncoeur, Alan Burrows, François Chahuneau, Louis Nahas, Steve Newberry) discusses the pros and cons of using industry DTDs. Avoiding the over simplification of two polar positions (drastic or no modification of an existing DTD), the panel members discuss the numerous decisions facing implementers in between these poles. Questions posed to the members include: When a DTD is so large that an organization only uses part of it, what does interchange mean? Do different organizations interpret elements (or attributes) differently? Does any organization implement the entire DTD? How accepted is it to use a different DTD for editing and the industry-standard one for 'interchange'?

"The term industry-standard DTD refers to a DTD, designed by participants from several unrelated companies or other organizations, for use with documents on similar topics produced by these organizations. Despite the word 'standard,' industry-standard DTDs are not necessarily produced by recognized standards authorities. Often, they are intended for interchanging documents rather than for creating new ones. This panel brings together two groups of users of such DTDs -- end users and vendors (or SGML service providers) -- for a discussion of practical issues in the implementation of SGML projects based on such DTDs.

[CR: 19960425]

Price, Lynne A. "Book Review of README FIRST: SGML for Writers and Editors, by Turner/Douglass/Turner." TUGboat: The Communications of the TeX Users Group 17/1 (March 1996) 35-37. Author affiliation: Text Structure Consulting; Email: lprice@ix.netcom.com.

For other reviews of the book, see the bibliographic entry for README FIRST: SGML for Writers and Editors.

[CR: 19960520]

Price, Lynne A. "Registering Owners for Public Text Identifiers." <TAG> 9/5 (May 1996) 7-9. ISSN: 1067-9197. Author's affiliation: Text Structure Consulting; email: lprice@ix.netcom.com.

The author explains the structure of an FPI as defined in the standard and used in SGML documents. The ISO Council has designated ANSI to act as the registration authority (May 1996). [See also the main entry for ISO 9070 with information on the relevant WWW site].

[CR: 19951228]

Price, Lynne A. "SGML and TeX." TUGboat: The Communications of the TEX Users Group [= Proceedings of TeX90] 8/ (July 1987) 221-225. ISSN: 0896-3207.

Price, Lynne A. "Using SGML and TeX for User Documentation." Pages 203-210 in TEXniques No. 7: Proceedings, TeX User's Group 1988 Annual Meeting. (Montreal, 21-24 August 1988.) TeX User's Group, 1988.

Abstract: The Standard Generalized Markup Language (SGML), defined in International Standard (ISO) 8879, is a notation for representing documents and making their inherent structure explicit. The open-ended list of SGML applications includes document interchange, formatting or typesetting, loading databases for information retrieval, stylistic or linguistic analysis, and computer-aided translation. The combination of SGML and TeX is a natural one. This paper reviews the philosophy of SGML and then describes a particular environment where SGML and TeX are used together, giving specific examples of how processing is shared between the SGML application and TeX macros.

[CR: 19971227 MD: 19971229]

Price, Lynne A. "The Ubiquitous Architectural Form." Pages 467-472 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Lynne A. Price]: Text Structure Consulting, Inc.; Email: lprice@txstruct.com; WWW: http://www.txstruct.com/.

Abstract: "This paper examines three well-known SGML applications: CALS tables, the DocBook DTD, and Adobe FrameMaker+SGML. Although none of them are defined using architectural forms, all three applications provide constructs that are functionally equivalent to architectural forms. Identifying these equivalences promotes understanding of the concepts and similarities and may even suggest possible future generalizations."

"ISO/IEC 10744, Hypermedia/Time-based Structuring Language (HyTime), defines architectural forms to be rules for creating document components. In particular, architectural forms are rules for creating element type, attribute definition list, and notation declarations. The rules must be documented in an architecture definition document. In addition, the architectural forms are defined using SGML syntax in a meta-DTD, that is, in a DTD that serves as a template for numerous client DTDs. An individual declaration in a client DTD can correspond to a declaration in the meta-DTD. All names can be changed and the client DTD can be more restrictive than the meta-DTD. For example, a meta-DTD can define a list to consist of one or more items. A derived declaration in a client DTD can define a list to consist of a heading item followed by at least two normal items. Since both the heading item and the normal items are items, the declaration in the client DTD is consistent with that in the meta-DTD. Data that occurs as an attribute value in the architectural form can appear as content in the client document."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

Further information on architectural forms processing and SGML architectures is available in the dedicated database section of the SGML/XML Web Page, "Architectural Forms and SGML Architectures."
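
[Illustration, not taken from the paper.] The list example in the abstract above can be sketched in Python. The element names "head" and "li" and the dictionary that maps them to the architectural form "item" are hypothetical; a plain dictionary stands in for however a real client DTD would declare the correspondence. The check confirms that anything valid under the stricter client model is also valid under the meta-DTD model once names are mapped, but not the reverse.

```python
import re

# Content models written as regexes over space-terminated element names.
META_LIST = re.compile(r"(?:item )+")           # meta-DTD: one or more items
CLIENT_LIST = re.compile(r"head (?:li ){2,}")   # client: heading, then >= 2 normal items

# Hypothetical mapping from client element names to architectural form names.
FORM_OF = {"head": "item", "li": "item"}


def encode(names):
    """Join element names into the space-terminated string the regexes expect."""
    return "".join(n + " " for n in names)


def valid_in_client(children):
    return CLIENT_LIST.fullmatch(encode(children)) is not None


def valid_in_meta(children):
    # Unknown names map to "?" and therefore fail the meta model.
    mapped = [FORM_OF.get(c, "?") for c in children]
    return META_LIST.fullmatch(encode(mapped)) is not None


for children in (["head", "li", "li"], ["head", "li"], ["li"]):
    print(children, valid_in_client(children), valid_in_meta(children))
```

The second and third cases satisfy the meta-DTD but not the client DTD, which is the sense in which the client declaration is "more restrictive" while staying consistent with the architectural form.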

[CR: 19961226]

Price, Lynne A.; Peterson, David C. "Ongoing Development of ISO 8879." Pages 339-344 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: [Price]: SGML Specialist, Text Structure Consulting, 48680 Taos Rd., Fremont, CA 94539, USA. Tel: +1 510 498 1104; Email: lprice@ix.netcom.com; [Peterson]: SGMLWorks!, 3 Winston Road, Lexington, MA 02173, USA. Tel: +1 617 861 8475; Email: davep@acm.org.

Abstract: "As described by Charles Goldfarb during talks at SGML '95 and SGML Europe '95, the international standards committee responsible for the SGML standard, ISO/IEC JTC1/SC18/WG8, has been reviewing ISO 8879, which defines SGML. This effort has been more intense recently. After each of its meetings, WG8 makes a point of reporting on the status of the review and the technical issues that have been decided. While these reports are available on the Web at http://www.ornl.gov/sgml/wg8/docs or http://www.sgmlsource.com/8879rev/index.htm, this paper presents the decisions that have been reached to date in order to:

Assure conference attendees who have not already studied the material that a revised standard will not affect the validity or interpretation of today's documents;
Encourage attendees to participate in the work of WG8 by commenting on these proposals, suggesting additional possible changes, or representing their national standards bodies at WG8 meetings."

Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

Price, Lynne A.; Schneider, Joe. "Evolution of an SGML Parser Generator." Pages 51-60 in Proceedings of the ACM Conference on Document Processing Systems, Santa Fe (ACM Conference on Document Processing Systems, Santa Fe, New Mexico, 5-9 December 1988, sponsored by ACM SIGGRAPH, SIGOIS, and SIGIR). New York: Association for Computing Machinery, 1988. ISBN: 0-89791-291-8 (ACM Order Number: 429882).

Abstract: The Standard Generalized Markup Language (SGML) is a notation for describing classes of structured documents and for coding documents belonging to described classes. An advantage of SGML and other grammar-based document representations is the ability to perform multiple applications on a single document source file. This paper describes the evolution of a software development tool for implementing such applications. It explains the original design as well as enhancements made during the system's first eighteen months. Although not statistically significant, data on the use of the enhanced features are presented. The experience described is relevant to other software engineering tools for text processing.

[CR: 19990326]

Price, Roger. "Beyond SGML." Pages 172-181 (with 23 references) in Digital Libraries '98. Proceedings of the Third ACM Conference on Digital Libraries. Third ACM Conference on Digital Libraries. Pittsburgh, PA. June 23-26, 1998. Sponsored by ACM Siglink and SIGIR. Edited by Ian H. Witten, Rob Akscyn, and Frank M. Shipman, III. New York, N.Y.: Association for Computing Machinery, 1998. ISBN: 0-89791-965-3. Author's affiliation: Department of Computer Science, University of Massachusetts Lowell; Email: Roger.Price@acm.org.

Abstract: "The International Standard for the Standard Generalized Markup Language (SGML) published in 1986 is now seen as a mature language for expressing document structure and is accepted as the basis for major projects such as the Text Encoding Initiative and important hypertext languages such as HTML and the XML. The historical origin of SGML as a technique for adding marks to texts has left a legacy of complexities and difficulties which hinder its wide acceptance. A key difficulty is the dual role that SGML documents currently play: they are both a representation for interchange and a human readable presentation. We examine possible document markup techniques in a post-SGML 86 world with emphasis on the framework architecture and the inclusion of richer agent behaviour. The novel ideas include the generalized recursion of elements and attributes, and the generalization of the notion of a 'character' to much broader token which is strongly typed." See particularly the document sections "Relation between grammar and document" and "The recursion of elements and attributes" ['SGML 86 allows mutual recursion of elements, but not recursions between elements and attributes. The overall design may be simplified by removing this restriction; at the same time allowing more general structures to be created in the ASN.1 style. SGML 86 also distinguishes between the content of those elements which have content, and the other properties represented by attributes. This also seems to be an artificial complexity which we shall remove.']

[Conclusion:] "We have reviewed the Standard Generalized Markup Language (SGML) as defined by ISO 8879:1986 and identified difficulties in its implementation and use. Our post SGML 86 system integrates other media besides texts and includes rules in a natural way. We claim that the alternative readings of a document provided by views based on the syntax and semantics of other languages provide a natural extension to more complex behaviours. We have shown that all this can be done without excessive culture shock: When an SGML 86 grammar (DTD) is used, the SGML 86 human readable document can always be produced as a view of the stored document. The major advantages of our post SGML 86 approach are: a) More generalized structures may be described in a simpler recursive way. b) Multiple media can be included in a straightforward and natural way. All media, including rules and agents, are marked up in the same way, with straightforward semantics. c) Confusion between the different usages of text characters are avoided by strong typing of all characters. d) Multiple views may be created of a document, each corresponding to the needs of a particular class of reader. e) SGML 86 documents may be easily migrated to and from post SGML 86 systems, thus preserving the considerable investment in SGML 86."

See also information on Digital Libraries '98. The Proceedings volume Table of Contents is available online; the full article is available online via ACM DL subscription.

Price-Wilkin, John. "The Feasibility of Wide-Area Textual Analysis Systems in Libraries: A Practical Analysis." Pages 113-136 in Literary Texts in an Electronic Age: Scholarly Implications and Library Services. A Collection of the Papers Presented at the 1994 Clinic on Library Applications of Data Processing at the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Clinic on Library Applications of Data Processing, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, April 10-12, 1994. Edited by Brett Sutton. University of Illinois, Urbana-Champaign: The Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, 1994. ISBN: 0-87845096-3. ISSN: 0069-4789.

"Abstract: This paper discusses the textual and software resources necessary for the establishment of a generalized wide-area textual analysis system. A distinction is made between textual analytical systems and text retrieval systems. The necessity of using standards and open systems in implementing such systems is emphasized. The paper includes a review of critical characteristics of generalized analytical software. It is argued that the resources necessary for the establishment of a service are currently available. The paper concludes with a discussion of deficiencies in current resources and standards. The author also includes an appendix discussing the need to incorporate a recognition of structure in textual retrieval systems."

Another abstract for the article is available from ETEXTCTR Review #2 (Jerry Caswell).

Price-Wilkin, John. "A Gateway Between the World-Wide Web and PAT: Exploiting SGML Through the Web." The Public-Access Computer Systems Review 5/7 (1994) 5-27.

Abstract: The HyperText Markup Language (HTML) used by the World-Wide Web has limited markup and structure recognition capabilities. Only a small set of text characteristics can be represented, and few of these have any functional value beyond display capabilities. The HTML ANCHOR element supports hypertext links; however, it cannot retrieve components of a linked document, such as a single glossary entry from a collection of several thousand entries, without resorting to programs external to HTML and the Web server. In spite of these limitations, HTML and the Web are key technologies for libraries. The Standard Generalized Markup Language (SGML) is a full-featured, standard markup language. HTML is actually an SGML Document Type Definition. Ideally, it would be possible to retrieve text documents marked up with the richer SGML tag set via the World-Wide-Web. This technical paper discusses how the Web can be linked to the PAT system, Open Text's search engine that supports access to SGML-encoded documents. This Web-to-PAT Gateway utilizes the Web's Common Gateway Interface (CGI) capability and SGML-to-HTML filter programs. After briefly overviewing key technical concepts, the paper explains the operation of the Web-to-PAT Gateway, using several examples of how it is employed at the University of Virginia Libraries, including access to text files such as a Middle English collection, the Oxford English Dictionary, and the Text Encoding Initiative's Guidelines for Electronic Text Encoding and Interchange. [from the Introduction]

To obtain this document, use the following URL: gopher://info.lib.uh.edu:70/00/articles/e-journals/uhlibrary/pacsreview/v5/n7/pricewil.5n7. Or send the following e-mail message to listserv@uhupvm1.uh.edu: GET PRICEWIL PRV5N7 F=MAIL.

[CR: 19980121]

Price-Wilkin, John. "Just-in-time Conversion, Just-in-case Collections. Effectively leveraging rich document formats for the WWW." D-Lib Magazine (May 1997). ISSN: 1082-9873. Author's affiliation: Head, Digital Library Production Service, University of Michigan, Ann Arbor, Michigan.

Summary: "The University of Michigan's Digital Library Production Service (DLPS) has developed substantial experience with dynamic generation of Web-specific derivatives from non-HTML sources based on several key projects and consideration of how users work with key resources. This article is based on DLPS's experience and resultant policies and practices that guide present and future projects. . . The DLPS currently offers dozens of collections, including more than 2,000,000 pages of SGML-encoded text and more than 2,000,000 pages of material using TIFF page images. All of the material in these collections is offered through the WWW, and nearly all of it is presented in Web-accessible formats through real-time transformations of the source material."

Available online: http://www.dlib.org/dlib/may97/michigan/05pricewilkin.html; [local archive copy].

[CR: 19950804]

Price-Wilkin, John. "Using the World Wide Web to deliver complex electronic documents." Electronic Texts and the Text Encoding Initiative [Special Issue] = TEXT Technology: The Journal of Computer Text Processing 5/3 (Autumn, 1995) 190-203. ISSN: 1053-900X. Author's affiliation: Director, Humanities Text Initiative, University of Michigan.

"Price-Wilkin provides an excellent account of both the strengths and the limitations of the World Wide Web's approach to the use of SGML, spelling out in practical detail exactly how the TEI approach may be used to address those weaknesses, without losing the benefits of the success story which is the World Wide Web today. His paper should be required reading for anyone aiming to set up an electronic text centre, virtual library, or whatever name we finally settle on for the entity which will replace (or complement) the traditional scholarly library within the next decade." [from the issue Introduction, by Lou Burnard]

See the main entry for this special issue of TEXT Technology dedicated to the TEI, edited by Lou Burnard.

Price-Wilkin, John. "Using the World Wide Web to Deliver Complex Electronic Documents: Implications for Libraries." The Public-Access Computer Systems Review 5/3 (1994) 5-21.

"Introduction: The World-Wide Web (also called the Web) is a very promising tool for libraries to use to explore the delivery of rich and complex documents. [1] Nevertheless, there are many limitations in the Web's HTML markup language and the ability of Web servers to deliver structured information. This paper explores the benefits and limitations of the Web in the context of several projects taking place at the University of Virginia, both in the Library and in the University's Institute for Advanced Technology in the Humanities. A gateway between the Web and the SGML-based PAT system that helps to overcome the Web's inherent limitations is also described." [from the Introduction]

The author discusses TEI-SGML as relevant to delivery of electronic documents. This PACS Review version is derived from a presentation by the author at the Yale Hypertext Conference, May 1994. The article is available online here. Or connect via GOPHER to the UH GOPHER server: gopher://info.lib.uh.edu:70/. To obtain a plain-text copy of the article by standard mail, send the following command to the UH LISTSERVer (listserv@uhupvm1.uh.edu): GET PRICEWIL PRV5N3 F=MAIL. An abstract of the article by Jerry Caswell is given here.

[CR: 19950903]

Price-Wilkin, John. WWW-to-PAT Gateway: Exploiting an SGML-Aware System Through the Web. HTI Technical Report. Ann Arbor, Michigan: The University of Michigan, June 18, 1995. Extent: approximately 16 pages; 12 notes.

Abstract: "This technical paper presents the information necessary for implementing a gateway from a Common Gateway Interface (CGI) compliant WWW server to PAT, Open Text's text search engine. It presents information on several variants on implementation, including an Oxford English Dictionary (OED) lookup facility, a book browsing facility using the TEI Guidelines for Electronic Text Encoding and Interchange (P3), and a KWIC result generator for literary analysis in a text collection. One of the problems of the Web and of HTML in particular is the limited use of markup and limited structure recognition: only a small set of characteristics can be represented, and few of these have any functional value beyond presentational capabilities. While one would like to be able to deliver components of the text (e.g., a single glossary entry from a collection of several thousand), as rendered by the markup, only file transfer is possible without resorting to programs external to HTML and the server. Despite the anchor element, <A>, with its ability to provide hypertext links within and beyond the text, it is still necessary to return the entire file containing the link to the user, even when only a small portion is required for the link. The gateway capability discussed in this technical paper documents a method by which the Web administrator with access to PAT can use any number of richer SGML DTDs, and begin to provide the user with access to that richer set of tags and structural retrieval possibilities."

The document is available from the WWW at the University of Michigan Humanities Text Initiative (or: WWW-to-PAT Gateway, [mirror copy]). For pointers to other SGML-related research at the HTI, see the main URL.

Pullinger, D. J. "Learning from Putting Electronic Journals on SuperJANET: The SuperJournal Project." Interlending and Document Supply 23/1 (1995) 20-27. 5 references. Author's affiliation: Electronic Publisher, Macmillan Magazines, London, UK.

"Abstract: High speed networks, such as the UK's SuperJANET, present opportunities for new dissemination. The SuperJournal Project in 1993 explored the possibilities with electronic versions of journal articles from nine publishers using four different interfaces. To support user retrieval strategies (hierarchic selection, searching, browsing), network speeds need to be very fast and supported end-to-end, more tools for production support for SGML need to be developed and a move made away from text-led systems to the more visual. The author discusses the changing role of paper in the context of multimedia systems."

[CR: 19951114]

Quin, Liam. Nathan Bailey's Universal Etymological ENGLISH DICTIONARY, 1736 [or] SGML Excerpts from the 1736 Edition Universal Etymological English Dictionary of Nathan Bailey. Toronto, Ontario: Liam Quin / SoftQuad Inc., 1995. Author's affiliation: SoftQuad, Inc.; email: lee@sq.com.

The document is an SGML version (excerpted) of the 1736 2nd edition of DICTIONARIUM BRITANNICUM: Or, a more Compleat Universal Etymological ENGLISH DICTIONARY Than any EXTANT, Containing Not only the Words and their Explication; but their Etymologies from the Antient British, Teutonick, Dutch [etc], by Nathan Bailey.

"Welcome to a collection of excerpts from Nathan Bailey's English Dictionary, as prepared by Liam Quin, lee@sq.com. The dictionary itself is marked up in SGML, and you will want an SGML-aware browser, such as SoftQuad Panorama, in order to proceed much further." URLs: Bailey's Dictionary, 1736 (Excerpts), and also: Headwords from Bailey's 1736 Dictionary (Excerpts)

[CR: 19980911]

Quin, Liam. "Suggestive Markup: Explicit Relationships in Descriptive and Prescriptive DTDs." Pages 405-418 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Senior Technical Consultant, SoftQuad Inc., Toronto, Ontario, Canada M4R 1K8; Tel: +1 416 544 9000; FAX: +1 416 544 0400; Email: lee@sq.com; WWW: http://www.softquad.com/ .

Abstract: "The SGML literature divides DTDs into two types: those that describe existing information structures and those that prescribe a fixed set of structures. A purely prescriptive approach has been in vogue for several years; however, the descriptive approach has much to offer. It is suggested that many DTDs should in fact fall somewhere between the two extremes, and could be termed suggestive. In a Suggestive DTD, certain structures are fixed, others are flexible, and still others are configured through the simple use of attributes to permit previously unexpected values. Relationships are explicitly marked where they cannot be derived."

[September 11, 1998] An online version of this paper is now available from the GroveWare Web site.

Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19971227]

Quin, Liam R. E. "Towards Global Interchange of SDATA Character Entities." Pages 365-376 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Liam R. E. Quin]: Development Manager, Suite 901 Inc., 67 Yonge Street, Toronto, Ontario, Canada M5A 3C7; Phone: +1 416 955-9845; WWW: http://www.interlog.com/~liamquin/; Email: liamquin@interlog.com.

Abstract: "SGML Documents that include references to SGML SDATA character entities are today inherently not portable. It is not possible to define a new SDATA entity in such a way that any arbitrary SGML system is likely to display the corresponding character or characters correctly.

"ISO has published some standard PUBLIC texts containing definitions of the more common entities, such as &eacute;, but they are defined in such a way that any given SGML system is certain to have to change them, or to provide a mapping onto system-specific definitions. As a result, many SGML software packages do not display even the core ISO SDATA character entities in a useful way.

"This paper briefly reviews the current state of the art, introduces terminology from relevant existing ISO standards, outlines a set of requirements for a possible solution, and then proposes one such solution, the Glyph Interchange Language (GIL).

"This paper is limited in scope to those SGML SDATA entities that are intended to represent a single character or a sequence of characters; other uses of SGML SDATA entities are not affected. The paper also discusses possible uses of the Glyph Interchange Language both in XML, the eXtensible Markup language, and in the context of the Text Encoding Initiative (TEI) P3 Guidelines."

This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.

For further information, see the URL http://www.interlog.com/~liamquin/sgml/gil/. The corresponding document "GIL: Glyph Interchange Language" records notes that are excerpted from a draft of Quin's SGML/XML '97 paper ("a sort of straw-boy proposal. The paper contains motivational examples and introduces some terminology, mostly taken from PDTR 15285 Version 9 [January 1997], 'Information Technology - An Operational Model for Characters and Glyphs'"). This document "will be updated after the [SGML/XML '97] conference to reflect the status of Glyph Interchange Work."

[CR: 19980911]

Quin, Liam R. E. "Writing a Readable DTD." Pages 297-308 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Liam R. E. Quin]: Development Manager, Suite 901 Inc., 67 Yonge Street, Toronto, Ontario, Canada M5A 3C7; Phone: +1 416 955-9845; WWW: http://www.interlog.com/~liamquin/; Email: liamquin@interlog.com.

Abstract: "An SGML Document Type Definition serves many purposes, and is read both by software and by people. It must therefore be presented in a way which is clear and effective. A DTD is neither a program nor a document, but shares some characteristics of both. Techniques for presenting both textual information and structured information have been developed in other fields, and it is instructive to study these techniques and to see how they apply to SGML DTDs. In particular, graphic design and typography on the one hand and computer science and program layout on the other are very relevant.

"Existing literature on document analysis and the preparation of SGML Document Type Definitions does not generally discuss the layout of SGML DTDs from a typographic or engineering point of view. This paper describes a number of techniques and principles used in typography, graphic design, information architecture and also in engineering and computer science.

"The principles of design that underline these techniques are discussed in turn, and a clear way to organise and lay out a DTD is then presented. Further reading is given in an annotated bibliography."

This paper was delivered as part of the "How To" track in the SGML/XML '97 Conference.

[September 11, 1998] An online version of the presentation is now available from the GroveWare Web site.

[CR: 19990401]

Quin, Liam R. E.; Graham, Ian S. XML Specification Guide. New York, NY: John Wiley & Sons, 1999. Extent: xiv + 432 pages. ISBN: 0-471-32753-0. Authors' affiliation: [Graham:] Vice President of Research and Development, Groveware Inc., WWW; [Quin:] Director of Development, Groveware Inc., WWW: http://www.GroveWare.com/~lee/.

"The XML Specification Guide features: (1) A complete explanation of XML structure, syntax, and rules, built around detailed examples; (2) A comprehensive specification-guide to XML, annotated with helpful clarifications, background material, and illustrative examples; (3) In-depth coverage of namespaces and schemas; (4) A detailed discussion of the Unicode character sets and their relationship to XML; (5) A detailed glossary of important XML terms; (6) A quick-reference guide to XML features, with illustrated examples of how they differ from HTML."

"The XML Specification Guide is divided into three parts. Part One presents a bootstrap overview of XML. Part Two contains the complete XML 1.0 specification, with explanatory annotations. Last, Part Three contains appendices that (a) describe technical standards (e.g., the Unicode character set) important for understanding important parts of the XML specification, and that (b) introduce some of the evolving, but as-yet incomplete, extensions to the base XML 1.0 standard." See the description on the Groveware web site, the Book Outline and Content Description (with annotated Table of Contents), and Supplementary Material and Resources. See also the Wiley web page and 'Supporting Web Site'.

Note in 'Supplementary Resources and Tools': "A collection of additional resources, not found in the book [The XML Specification Guide], that are useful for understanding the XML specifications and using XML. These include [1999-03-26]: (1) A searchable index of all XML specifications; (2) A searchable index of EBNF productions; (3) A set of XML design patterns -- based on the approach of the same name introduced in object oriented design -- applied here to modeling document architecture; (4) Extracted EBNF for XML - defines, along with the well-formedness and validity constraints, the rules for writing well-formed or valid XML documents; (5) A list of useful online resources."

Publisher's blurb: "The masters of XML show how to unleash the language's vast potential. Less complicated than SGML and more flexible than HTML, XML (Extensible Markup Language) is fast becoming the language of choice for Web developers and programmers. Readers are looking for a clear-cut roadmap to this new technology's exciting terrain, its advantages, capabilities, and little-known shortcuts. XML Specification Guide is what the Web world is waiting for. After a concise overview of the purpose and scope of XML and its principles, the authors -- renowned XML experts -- provide an in-depth, annotated specification guide, complete with sample applications. Beyond comprehensive coverage of the XML specification, the book discusses the new 'namespaces' technology from W3C, the Tiny XML subset, databases and object-oriented models, and much more." [from the Wiley server, 1999-03-12]

See also the condensed Table of Contents.

[CR: 19950801]

Quint, Vincent; Munson, Ethan (translator). The Languages of GRIF. GRIF Technical Report [1986-1994]. GIPSI S.A., GRIF S.A., April 18 1994. Extent: v + 133 pages. Author's affiliation: BULL-IMAG; email: Vincent.Quint@imag.fr.

Available via the Internet on the IMAG (Institut d'Informatique et de Mathématiques Appliquées de Grenoble) FTP server: The languages of GRIF, or in mirror copy [August 01, 1995]. The original French version of the document is available online as well.

[CR: 19951113]

Quint, Vincent; Vatton, Irène. "Combining Hypertext and Structured Documents in Grif." Pages 23-32 (with 25 references) in ECHT '92. Proceedings of the ACM Conference on Hypertext and Hypermedia. Second European Conference on Hypertext and Hypermedia. Milano, Italy. 30 November - 4 December, 1992. Sponsored by ACM: SIGLINK, SIGIR, SIGOIS. Edited by Dario Lucarella, J. Nanard, M. Nanard, and P. Paolini. New York: ACM [Association for Computing Machinery] Press, 1992. ISBN: [?]; ACM order number: 614920. Author's affiliation: [Quint] IMAG-INRIA; [Vatton] CNRS.

"Abstract: This paper presents the experience gained in developing and using the hypertext functions of the Grif system [including work funded by ESF Eureka programme EU 43]. Grif is a structured document editor based on the generic structure concept: each document is represented in the system by its logical structure which is an instance of a generic structure. This notion of logical structure encompasses both hierarchical structures (as is usual in structured documents) and non-hierarchical links (as is usual in hypertext).

"The document model on which Grif is based is presented, focusing on the different types of links. Various applications using these links are also described. It is shown that the approaches of electronic documents and hypertext, which are often opposed to each other, can be combined for building more powerful integrated systems."

Available in Postscript format on the Internet: ftp://ftp.imag.fr/pub/OPERA/doc/ECHT92.ps.gz [mirrored copy, November 1995].

[CR: 19951113]

Quint, Vincent; Nanard, M.; André, Jacques. "Towards Document Engineering." Pages 17-29 (with 20 references) in Electronic Publishing '90: Proceedings of the International Conference on Electronic Publishing, Document Manipulation and Typography (Gaithersburg, Maryland, September 1990). Edited by Richard Furuta [University of Maryland]. The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1990. x + 298 pages; index. ISBN: 0-521-40246-8. Authors' affiliation: INRIA [Institut National de Recherche en Informatique et en Automatique], Le Chesnay, France / IMAG, Gières, France.

"Abstract: Methods and techniques used in software engineering are compared with the ones used for handling electronic documents. The authors show the common features in both domains, but also the differences and propose an approach which extends the field of document manipulation to document engineering. They show also in what respect document engineering is different from software engineering. Therefore specific techniques must be developed for building integrated environments for document engineering."

The document was also published as INRIA Technical Report Number 1244, June, 1990.

[CR: 19951113]

Quint, Vincent; Vatton, Irène; Bedor, Hassan. "Grif: An Interactive Environment for TeX." Pages 145-158 in TeX for Scientific Documentation. Proceedings of the Second European Conference. (The Second European Conference on TeX for Scientific Documentation, Strasbourg, France, June 19-21, 1986, Sponsored by: CNRS (Centre National de la Recherche Scientifique), SMF (Société Mathématique de France), Université Louis-Pasteur de Strasbourg). Edited by Jacques Désarménien. Lecture Notes in Computer Science, Number 236. Berlin/New York: Springer-Verlag, 1986. ISBN: 0387168079 (New York); ISBN: 3540168079 (Berlin).

"Abstract: Several attempts have been made for making TEX more user-friendly by providing specific tools for preview or input of documents. We propose a different approach which uses an interactive system for editing the documents intended to be formatted by TEX. This system, GRIF, is based on a structural model of documents and allows the user to define the structure and the presentation of documents edited. We present how it may be used for efficiently preparing documents to be printed by TEX."

[CR: 19951113]

Quint, Vincent; Vatton, Irène. "Active Structured Documents as User Interfaces." In [Workshop on] Human Interaction for Symbolic Computation. Edited by N. Kajler. Amsterdam: CWI, March 1994.

"Abstract: This paper introduces the concepts of structured and active documents and shows how they can be used for making the user interface for different applications in the field of computer algebra. A system for active structured documents is briefly presented as well as some examples of its use."

On the notion of "active structured documents," see also the EPODD article by Quint.

The document is available on the Internet in Postscript format: ftp://ftp.imag.fr/pub/OPERA/doc/HISC.ps.gz [mirrored copy, November 1995]. See also ICSE: ftp://ftp.imag.fr/pub/OPERA/doc/ICSE16.ps.gz [mirrored copy, November 1995] = V. Quint, I. Vatton, "Active Documents as a Paradigm for Human-Computer Interaction", Workshop on Software Engineering and Human-Computer Interaction: Joint Research Issues, Preprints, R. Taylor and J. Coutaz, ed., pp. 255-262, Sorrento, Italy, May 16-17 1994. {???}

[CR: 19951113]

Quint, Vincent; Vatton, Irène; Bedor, Hassan. "The GRIF System." T.S.I -- Technology and Science of Informatics 6/1 (April 1987) 98-103.

Also apparently published as "Le système Grif" in T.S.I -- Technology and Science of Informatics 5/4 (1986) 337-341.

[CR: 19951113]

Quint, Vincent; Vatton, Irène. "Making Structured Documents Active." Electronic Publishing - Origination, Dissemination and Design (EPODD) 7/2 (June 1994) 55-74 (with 27 references). ISSN: 0894-3982. Author's affiliation: INRIA (Institut National de Recherche en Informatique et en Automatique), Rhône-Alpes; CNRS-IMAG.

"Abstract: Active documents result from a combination of some specific features in documents and some mechanisms in a document manipulation system. In this paper, we present the possibilities offered by a structured model of documents and a structured editor for making active documents. Several applications are described (annotations, electronic indexes, cooperative editing, documents as user interfaces, etc.), which show how a document's logical structure may be exploited for developing a variety of active document applications."

The document is available in Postscript format on the Internet: ftp://ftp.imag.fr/pub/OPERA/doc/ActiveGrif.ps.gz [mirrored copy, November 1995].

[CR: 19951113]

Quint, Vincent. "Édition de documents structurés." Pages 11-47 in Le traitement électronique du document. COURS INRIA: LE TRAITEMENT ELECTRONIQUE DU DOCUMENT, AIX-EN-PROVENCE, 3-7 octobre 1994. Edited (by program officers) Jean-Claude Le Moal and Bernard Hidoine. Sciences de l'information, Serie etudes et techniques. Paris: ADBS éditions, octobre 1994. Volume extent: 285 pages. ISBN: 2-901046-76-2. Author's affiliation: INRIA (Institut National de Recherche en Informatique et en Automatique) Rhône-Alpes.

Abstract: "Dans toute application informatique il est nécessaire de définir clairement un modèle des informations à traiter pour pouvoir spécifier les traitements à leur appliquer. Les documents n'échappent pas à cette règle. Différents types de modèles ont été utilisés ou sont proposés dans le domaine du traitement des documents; les plus riches et les plus puissants considèrent un document comme une structure ou même comme plusieurs structures. Grâce à un haut degré d'abstraction, ils représentent une partie de la sémantique du document et autorisent ainsi des traitements élaborés.

"Ce texte s'intéresse principalement aux modèles de documents structurés. Il présente les concepts qui sont à la base de la structuration des documents et il montre quelques cas d'application de ces concepts. Il est divisé en trois parties.

"La première partie traite des différents types de structures qu'on peut identifier dans les documents, aussi bien les structures logiques, abstraites, que les structures graphiques, concrètes.

"La deuxième partie présente une norme qui est de plus en plus largement utilisée pour traiter des documents et les échanger entre systèmes différents. Il s'agit de la norme SGML qui est fondée sur une structuration logique des documents.

"La troisième partie est consacrée à l'utilisation des modèles de documents structurés dans la création et la manipulation interactive des documents. Elle fait le point sur les éditeurs de documents structurés."

The document is available in Postscript format on the Internet: ftp://ftp.imag.fr/pub/OPERA/doc/CoursAix.ps.gz [mirrored copy, November 1995].

[CR: 19951113]

Quint, Vincent. Opéra: outils pour les documents électroniques, recherche et applications. Rapport INRIA 1993. Programme 3 - Intelligence artificielle, systèmes cognitifs en interaction homme-machine. Number RA-1. Grenoble and Rennes, France: INRIA-IMAG/IRISA, 31 janvier 1994. Extent: ii + 18 pages.

"Abstract: Project Opéra is interested in electronic documents, their representation in computer systems and the techniques used for processing them. The main goal of the project is to design new document models which can represent not only the logical organization of documents, but also their graphical aspect and content, as well as the relationships between documents or parts of documents, thus representing usual documents and hypertexts as well. Multimedia is also considered. Structured elements contained in documents, such as tables, equations or drawings are also taken into account in these models. Another objective is to develop editing techniques for implementing the document models. The high level of abstraction of these models makes sophisticated treatments of documents possible, and a large part of the project's activity is dedicated to the development of new tools for manipulating documents. These tools may be used in various types of applications that handle documents. A major issue in editing tools is the user interface. The project aims to use active documents as a way to communicate between applications and users."

Available in Postscript format on the Internet: ftp://ftp.imag.fr/pub/OPERA/doc/RA93opera.ps.gz [mirrored copy, November 1995].

[CR: 19951113]

Quint, Vincent; Vatton, Irène. "Grif: An Interactive System for Structured Document Manipulation." Pages 200-213 (with 23 references) in Text Processing and Document Manipulation. Proceedings of the International Conference, University of Nottingham, 14-16 April 1986. Edited by J. C. [Hans] van Vliet. The British Computer Society Workshop Series. Cambridge: Cambridge University Press [on behalf of the British Computer Society], 1986. ISBN: 0-521-32592-7.

"Abstract: Grif is an interactive system for editing and formatting complex documents. It manipulates structured documents containing objects of various types: tables, mathematical formulae, programs, pictures, graphics, etc. It is a structure directed editor which guides the user in accordance with the structure of the document and of the objects being edited; the image displayed on the screen also being constructed from that structure. Flexibility is one of the most interesting characteristics of Grif. The user can define new document structures and new types of objects, as well as [sic! to] specify the way in which the system displays these documents and objects."

[CR: 19951113]

Quint, Vincent; Vatton, Irène. "Modularity in Structured Documents." Pages 170-177 in WOODMAN'89. Workshop on Object-oriented document manipulation [= Journees manipulation de documents orientes objets]. Journees manipulation de documents orientes objets, Rennes, France, 29-31 mai 1989. Sponsored by: AFCET, BIGRE, CCETT. Edited by Jacques André and Jean Bézivin. BIGRE [Bulletin d'information du Groupe de recherche sur les outils de conception et d'ecriture des systems] + Globule 63-64. Rennes, France: IRISA [Institut de Recherche en Informatique et Systèmes Aléatoires], mai 1989.

The document is available in Postscript on the Internet: [possibly]: ftp://ftp.imag.fr/pub/OPERA/doc/ECHT92.ps.gz [mirrored copy, November 1995].

[CR: 19951113]

Quint, Vincent; Vatton, Irène. "L'édition structurée et le World-Wide Web." Cahiers GUTenberg 19 (janvier 1995) 85-97. Authors' affiliation: INRIA (Institut National de Recherche en Informatique et en Automatique) / IMAG, 2 avenue de Vignate, 38610 Gières, France.

"Abstract: Creating documents for the World-Wide Web is not always an easy task. Many authors create these documents 'by hand'. They must then type HTML syntax, even if the text editor they use provides some assistance. An alternative is to use the filters of various document production systems, but these systems do not incorporate all the specific features of the Web. Neither method is therefore fully satisfactory. We present here a solution based on the structured document editor Grif. The Grif editor has been extended to take into account the particular characteristics of the Web and to make it a comfortable environment for creating Web documents."

The document is available in Postscript format on the Internet: ftp://ftp.imag.fr/pub/OPERA/doc/GutenbergW3.ps.gz [mirrored copy, November 1995].

[CR: 19951113]

Quint, Vincent; Vatton, Irène. Hypertext Aspects of the Grif Structured Editor: Design and Applications [Les aspects hypertexte de l'éditeur structuré Grif: conception et applications]. INRIA Rapport de recherche, Number 1734. Rocquencourt, France: INRIA, July 1992. ii + 23 pages, 24 references. Author's affiliation: Institut National de Recherche en Informatique et en Automatique, Le Chesnay, France.

"Abstract: This report presents the experience gained in developing and using the hypertext functionalities of the Grif system. Grif is a structured document editor based on the generic structure concept: each document is represented in the system by its logical structure which is an instance of a generic structure. This notion of logical structure encompasses both hierarchical structures (as is usual in structured documents) and non-hierarchical links (as is usual in hypertexts). The document model on which Grif is based is presented, focusing on the different types of links. Various applications using these links are also described. It is shown that the approaches of electronic documents and hypertext, which are often opposed to each other, can be combined for building more powerful integrated systems."

"Résumé: Ce rapport présente l'expérience acquise lors du développement et de l'utilisation des fonctionalités hypertexte du système Grif. Grif est un éditeur de documents structurés fondé sur le concept de structure générique: chaque document est représenté dans le système par sa structure logique qui est une réalisation d'une structure générique. La notion de structure logique englobe à la fois les structures hiérarchiques (comme c'est l'usage dans les documents structurés) et les liens non-hiérarchiques (qui sont à la base des hypertextes). On présente le modèle de document sur lequel s'appuie Grif, en mettant l'accent sur les différents types de liens. Plusieurs applications utilisants ces liens sont également décrites. On montre que les deux approches des documents électroniques et des hypertextes, qui sont souvent opposées l'une à l'autre, peuvent être combinées pour construire des systèmes intégrés d'une plus grande puissance."

The document is available in Postscript format on the Internet: ftp://ftp.imag.fr/pub/OPERA/doc/RR92HyperGrif.ps.gz [mirrored copy, November 1995].

Quint, Vincent; Roisin, Cécile; Vatton, Irène. A Structured Authoring Environment for the World-Wide Web. Paper presented at the Third International World-Wide Web Conference, April 10-14, 1995, Darmstadt, Germany. April 11, 1995. Extent: 35K (computer file), approximately 13 pages, 7 refs. Authors' affiliation: Vincent Quint (INRIA), IMAG, 2 avenue de Vignate, 38610 Gières, France; Vincent.Quint@imag.fr; Cécile Roisin (Grenoble University); Cecile.Roisin@imag.fr; Irène Vatton (CNRS); Irene.Vatton@imag.fr.

Abstract: Authoring documents for the World-Wide Web is not always an easy task. Most authors either directly type HTML syntax with a text editor or convert files that they produce with various document preparation systems, but both methods pose problems. We propose another approach, based on a structured document editor, Grif. The main characteristics of HTML documents are analyzed and the extensions that these documents have imposed to the Grif editor are presented. With these extensions, Grif becomes a comfortable environment for authoring WWW documents, and it allows better and more rigorously structured documents to be produced. It also allows a smooth evolution towards SGML.

Document keywords: authoring environments, structured documents, structure conversion, HTML, SGML. The paper may be obtained from the online proceedings, or in mirror copy here.

[CR: 19950716]

Rada, R.; Bird, G.; Min Zheng. "Hypertext Interchange Using ICA." Journal of Documentation 51/2 (June 1995) 99-117 (with 28 references). Author's affiliation: Department of Computer Science, Liverpool University, Liverpool, UK.

"Abstract: Interchange of text and hypertext between various systems is vital in order to reuse text and hypertext, but the task of generating translators between different representations is often complex and tedious. The integrated chameleon architecture (ICA) is a public domain toolset for generating translators. However, ICA can only handle context-free grammars while the grammar of hypertext is not context-free. This paper presents an extended ICA (E-ICA) which is based on ICA with extra pre- and post-processors to handle the context-sensitive and implicit information of hypertext. A system called SGML-MUCH has been developed using E-ICA. The development and use of the SGML-MUCH system is presented as a case study with converters for the hypertext systems MUCH, Guide, Hyperties, and Toolbook described in detail."

[CR: 19950716]

Rada, Roy; Carson, George S. "The New Media." Communications of the Association for Computing Machinery 37/9 (September 1994) 23-25. Authors' affiliation: [Rada:] University of Liverpool, England; [Carson:] GSC Associates Inc., California. Both authors are members of the ACM Technical Standards Committee.

"Abstract: In production and dissemination of electronic documents, numerous standards must be considered. In dealing with hypermedia even more standards may apply. Computerized, moving images and sound may be encoded in a multitude of formats. An application of the Standard Generalized Markup Language (SGML) called HyTime is being used for logical markup of hypermedia. Additional progress continues to be impeded by several factors. First, there is no universal agreement on the logical structure of common office documents. Second, given that the logical structure of a document has been captured, there is no widely accepted standard way of describing how that information is to be presented. Several document processing tools are good at converting various image formats into a form which Microsoft Word can display. An excessive need for converters may be remedied by a new generation of interchange formats which allow private objects and structure to be described on an equal footing with public ones."

[CR: 19970828]

Radosevich, Lynda. "Health Care Uses XML for Records. Other Vertical Industry Groups Also Expected to Cooperate to Customize XML." InfoWorld 19/34 (August 25 1997) 51-52. ISSN: 0199-6649. Author's affiliation: [InfoWorld Staff].

" . . .forward-thinkers in the health-care industry are devising ways to use Extensible Markup Language (XML) as an open framework for creating portable electronic medical records. XML is just starting to make its way into products, but it is considered more powerful than HTML, particularly for defining and accessing structured data. . . Although XML is just emerging as a Web document format, a group within the Health Level 7 (HL7) standards body is floating a plan called the Kona Proposal, which aims to enable the exchange of medical information in a vendor-neutral structure built on the Standard Generalized Markup Language (SGML) and XML. . . Although the Kona Proposal is specific to the health-care business, it is a harbinger of how vertical industry groups will cooperate to customize XML into an exchange medium for their industries." [Extracted]

See the electronic version online: "Health Care Uses XML for Records. Other Vertical Industry Groups also Expected to Cooperate to Customize XML." For more on XML, see the main entry for Extensible Markup Language. See the main entry for SGML Initiative in Health Care (HL7 Health Level-7 and SGML) for other information. [archive copy]

[CR: 19970303]

Radosevich, Lynda. "W3C Preps XML Despite Netscape's Snub." InfoWorld 19/9 (March 3 1997) 43. ISSN: 0199-6649.

See the XML main entry for background on the Extensible Markup Language. The text of the article includes: ". . . [besides GCA] Other supporters of XML are Digital, Hewlett-Packard, IBM, JavaSoft, Microsoft, Novell, Spyglass, and Sun. But notably missing is an endorsement from Netscape, which stated that it believes the extensions [viz., the Net user's ability to create new tags for Web documents in a standardized manner] are not needed." See InfoWorld 19/9 (March 03, 1997): 43, or the online version of the document; [mirror copy, text only].

A related article may be found in: "Netscape Replies to XML [apparently: not interested, thank you!]," Seybold Report on Internet Publishing 1/5 (January 1997): 2.

[CR: 19970905]

Radosevich, Lynda. "XML à la Microsoft. Company Eyes a New Markup Format." InfoWorld 19/35 (September 1 1997) 1, 24. ISSN: 0199-6649. Author's affiliation: InfoWorld staff.

Extract: "In an effort to boost the Extensible Markup Language's role in transforming browsers into sophisticated front-end clients, Microsoft plans soon to propose an Extensible Markup Language (XML) style-sheet language to the World Wide Web Consortium (W3C), according to sources close to the development."

See the main XML entry for additional information on the Extensible Markup Language. See also: "Microsoft to Push XML as Alternative to Java," by Lynda Radosevich. In InfoWorld Electric, August 30, 1997, 6:27 AM PT. Available online: via the InfoWorld server; [archive copy].

Raggett, David. "A Review of the HTML+ [HTML-plus] Document Format." Computer Networks and ISDN Systems 27/2 (November 1994) 135-145. 7 references. Author affiliation: Hewlett Packard Labs., Bristol, UK.

Abstract: HTML+ is a set of modular extensions to the Hypertext Markup Language (HTML), which is in widespread use in the World Wide Web. The use of SGML to specify HTML+ allows authors to create documents in a variety of ways: with text editors, SGML authoring tools and filters from common word processing formats like FrameMaker, Microsoft Word and LaTeX. The paper finishes by looking at extensions under current consideration, and encouraging wider debate on what is needed to fuel the next stage in the development of the Web.

[CR: 19960125]

Rahtz, Sebastian. "Another Look at LaTeX to SGML Conversion." TUGboat: The Communications of the TeX Users Group [issue = Proceedings of the 1995 Annual Meeting] 16/3 (September 1995) 315-324. ISSN: 0896-3207. Author affiliation: Elsevier Science Ltd., The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom; email: s.rahtz@elsevier.co.uk.

"Abstract: Publishers are starting to use SGML as their permanent form of storage for documents. How can LaTeX files be converted to an SGML instance? This paper discusses possible strategies, and describes an implementation by Elsevier Science of a system based on conversion to TeX itself, and the extraction of SGML code from the dvi file."

For other publications relating to SGML and (La)TeX, see the small bibliographic collection in this database.

[CR: 19961226]

Ramalho, Jose Carlos; Almeida, Jose Joao; Henriques, Pedro. "Document Semantics: Two Approaches." Pages 473-484 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: [Ramalho]: University of Minho, Computer Science Department, Largo do Paço, 4709 Braga CODEX, Portugal. Email: jcr@di.uminho.pt; WWW: http://www.di.uminho.pt/~jcr/; [Almeida]: Email: jj@di.uminho.pt; Web: http://www.di.uminho.pt/~jj/; [Henriques]: Email: prh@di.uminho.pt; WWW: http://www.di.uminho.pt/~prh/.

Abstract: "SGML (Standard Generalized Markup Language) introduced DTD (Document Type Definition) concept to formally describe document syntax and structure. One of its main characteristics is the fact of being purely declarative and fully independent of the future document's processing (typesetting, formatting, translation/transformation). In this context, SGML has become the international standard to be followed.

Sooner or later, a document must be processed. In order to do that we need to associate semantics to the document's structure. In compiler's context, normally we separate semantics in two, static and dynamic. Establishing a parallelism with document processing, we can think of the document's decorated tree (as recognized by a SGML analyzer) as representing the static semantics and document's tree transformation as dynamic semantics.

Pursuing this idea, we will present and discuss a study of the relationship between SGML, DAST (Decorated Abstract Syntax Tree), and Algebraic Specification, in order to better understand how to formally process documents and how to specify and build generic document processing tools."

Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19990519]

Ramalho, José Carlos; Rocha, Jorge Gustavo; Almeida, José João; Henriques, Pedro. "SGML Documents. Where Does Quality Go?" Markup Languages: Theory & Practice 1/1 (Winter 1999) 75-90 (with 9 references). ISSN: 1099-6622 [MIT Press]. Authors' affiliation: [Ramalho:] University of Minho, Portugal; Email: jcr@di.uminho.pt; WWW: www.di.uminho.pt/~jcr; [Rocha:] Email: jgr@di.uminho.pt; WWW: www.di.uminho.pt/~jgr; [Almeida:] Email: jj@di.uminho.pt; WWW: www.di.uminho.pt/~jj; [Henriques:] Email: prh@di.uminho.pt; WWW: www.di.uminho.pt/~prh.

Abstract: "Quality control in electronic publications should be one of the major concerns of everyone who is managing a big project, like a digital library. Collecting information from several different sources raises problems of quality assurance. With SGML we can solve part of the problem, structural/syntactic correctness. There are situations where pre-conditions over the information being introduced should be enforced in order to prevent the user from introducing erroneous data; we shall call this process data semantics validation. In this paper we present ways of associating a constraint language with the SGML model. We present the steps towards the implementation of that language. In the end, we present a new SGML authoring and processing model which has an extra validation task: semantic validation. We also describe some cases in which quality could be improved with this new working scheme."

[Conclusion:] "Our main concern in the work reported in this article was the improvement of quality control in SGML-based electronic processing. In this context we discussed a new SGML authoring and processing model to remedy the lack of semantic validation in the traditional SGML model. The main idea was to restrict the values that the user can enter, by associating constraints with the element definitions. This way we can minimize data incorrectness and improve document quality. Through the use of several examples we illustrated the main problems in the implementation of such semantic validation task: data normalization, type inference, and the definition of a constraint language. . ."

For other articles in this issue of MLTP, see the annotated Table of Contents.

[CR: 19971227]

Ramalho, José Carlos; Rocha, Jorge Gustavo; Almeida, José João; Rangel Henriques, Pedro. "SGML Documents: Where Does Quality Go?" Pages 171-177 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Authors' affiliation: [José Carlos Ramalho]: University of Minho, Computer Science Department (Departamento de Informática), Largo do Paço, 4709 Braga CODEX, Portugal; Email: jcr@di.uminho.pt; WWW: http://www.di.uminho.pt/~jcr/; [Jorge Gustavo Rocha]: Email: jgr@di.uminho.pt; WWW: http://www.di.uminho.pt/~jgr/; [José João Almeida]: Email: jj@di.uminho.pt; WWW: http://www.di.uminho.pt/~jj/; [Pedro Rangel Henriques]: Email: prh@di.uminho.pt; WWW: http://www.di.uminho.pt/~prh/.

Abstract: "Quality Control in Electronic Publications should be one of the major concerns of every project. Big projects try to gather information from a series of different sources: universities, libraries, museums and other scientific or cultural organizations. Collecting and treating information from several different sources raises a very interesting problem: the assurance of quality.

"Quality in Electronic Publications can be reflected in several forms, from the visual aspects of the interface and linguistic/literary ones to the correctness of data. We are concerned with the lowest boundary of this spectrum, correctness of data. With SGML (Standard Generalized Markup Language) we can solve a small part of the problem, structural correctness. SGML provides a nice way to structure documents keeping a complete separation between structure (syntax) and typesetting. Today there are lots of editors and environments that can assist the user producing well-formed SGML documents (validating their structure). But, there is clearly a lack for content validation. There are situations where pre-conditions over the information being introduced should be enforced in order to prevent the user from introducing erroneous data; we shall call this process data semantics validation. In SGML is not possible to implement this process."

"We will discuss an adaptation of the SGML syntax that will enable us to express constraints to allow some semantic validation when authoring. Furthermore, we will propose a new SGML processing model capable of dealing with this extension. This model will be built extending the existing one. So, we will not restrict any SGML capability, instead we will add new ones. Both, the SGML extension and the model extension, will be defined and implemented resorting to algebraic specification (SET theory and functional programming)."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

[CR: 19971029]

Randall, Neil. "XML: A Second Chance for Web Markup. HTML gave up a lot of SGML's power. XML brings back the power but keeps it simple." PC Magazine 16/19 (November 4, 1997) 319-322. ISSN: []. Author's affiliation: PC Magazine Staff. Also, the author of The Soul of the Internet (ITCP) and coauthor of Special Edition Using Microsoft FrontPage.

Excerpt: "One of XML's greatest strengths is that it lets entire industries, academic disciplines, and professional organizations develop sets of DTDs that will standardize the presentation of information within those disciplines. To an extent this works against the much-ballyhooed universality of the Web and HTML, but if you work in a specialized area, you're probably aware of the need for systems that let you produce documents enabling you to communicate efficiently with your colleagues. Specialists often need to display formulas, hierarchies, mathematical and scientific notations, and other elements, all within well-defined parameters. SGML's DTD system lets you do so, and XML picks up on the DTD system without all the complexity."

The article is available online; [local archive copy].

[CR: 19971123]

Rankin, George. "Croner & SGML - The First 3 Years: Opening the Envelope!" Page(s) 165-168 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Publications Director, Croner Publications Limited, The Netherlands; Email: gxr@croner.co.uk; WWW: http://www.croner.co.uk.

Abstract: "Croner Publications Ltd's experiences with SGML over the last three years have been both exciting and traumatic! In this presentation, Mr. Rankin describes how and why Croner considered SGML to be the answer to many of their problems and details the difficulties encountered along the way. The presentation emphasizes the need to have the courage to press ahead and 'open the envelope'. Finally, he plots the future strategy his company will adopt and the role SGML is set to play over the next decade."

"Croner Publications is a successful publishing company based in London and is market leader in the UK in areas of publishing information relating to tax, employment law, transport, education, health & safety and the environment. The publications interpret the law for business managers and directors and the text of the publications is mainly in the form of commentary. Today, Croner publish over 100 loose-leaf books, 75 newsletters, 40 special reports, 70 bound books and 50 electronic products. Publications are updated frequently, including weekly, fortnightly, monthly and quarterly. [...] The presentation will be in the form of a case study and will look at how Croner Publications tackled the whole issue of adopting the SGML standard. Covering the period 1993 to 1997, the presentation will highlight some of the problems encountered, how the business benefits of SGML were perceived and how the business case for SGML was sold to the publishers at Croner. The future is also addressed and an outline of Croner's future strategy will be given. The theme of the presentation is simply that it took courage to finally grasp the nettle and 'open the envelope'!"

[CR: 20000803]

Rath, Hans Holger. "Topic Maps: Templates, Topology, and Type Hierarchies." [ARTICLE] Markup Languages: Theory & Practice 2/1 (Winter 2000) 45-64 (with 24 references). ISSN: 1099-6622 [MIT Press]. Author's affiliation: STEP Electronic Publishing Solutions GmbH; Email: consulting@step.de; WWW: www.topicmaps.com.

Abstract: "The new ISO standard ISO/IEC 13250 Topic Maps defines a model and architecture for the semantic structuring of link networks. Dubbed the 'GPS of the information universe,' topic maps will become the solution for organizing and navigating large and continuously growing information pools, and provide a 'bridge' between the domains of knowledge representation and information management. This paper presents several technical issues which are of great interest when applying topic maps to real world applications. The main focus of the paper is the introduction of 'topic map templates' -- a semi-official term coined by the standards' committee for a concept that the author argues is a necessary but as yet unstandardized addition to the basic model. Furthermore: association taxonomies, class hierarchies, and consistency constraints of topic maps are presented and discussed."

[Conclusion:] "The new topic map standard ISO/IEC 13250 defines a model and architecture for the semantic structuring of link networks. It can be seen as a base technology for modeling knowledge structures. The standards working group defined topic maps in such a way that a limited but implementable set of core concepts express the necessary semantics. The STEP Group has investigated how topic maps can be applied to reference works and uncovered some concepts which are not made explicit in the standard: (1) ability to separate the declarative part from the 'real' map, (2) predefined association types and association type properties, (3) class hierarchies for types, and (4) consistency constraints as input to map validation. The paper has explained these concepts and presented meaningful solutions. First experiences have shown that the part of a topic map made up by all topics used as themes and types by other 'objects' in the map should be clustered somehow. For this purpose the term topic map template was coined by the ISO working group. Templates can be used as starting points for new maps or can be used by reference in order to provide all the themes and types the map needs. Standardizing topic map templates will offer base topic maps for specific application areas and could form the basis of semantic application profiles. We looked at related academic fields like mathematics, linguistics, and philosophy to get some substantial input about relations. The results are a list of association type properties which give important hints to the topic map software and a list of basic association types which could act as built-in superclasses. The introduction of the superclass-subclass relationship was the logical consequence. Another technical issue covered by the paper is the validation problem. Topic maps might become rather big with millions of topics, occurrences, and associations. Manual consistency checking will be impossible. 
All the previously defined concepts open the possibility for sophisticated rule-based validation of topic maps. The proposed consistency constraints are those rules which declare the semantics not expressible with DTDs and which control the validation process. A couple of examples proved that standardizing the missing concepts as predefined topic map templates will help both the topic map developer and the topic map user. The improvements were presented on a level that they can be used as input to the ISO working group for further discussions."

[Received 17 December 1999; Accepted 15 February 2000]

See other references in: "(XML) Topic Maps."

[CR: 19950823]

Rath, Hans Holger. Tabellen in SGML. ZGDV-Bericht 75/93, Dipl.-Informatik Hans Holger Rath. Darmstadt: Zentrum für Graphische Datenverarbeitung e.V. [ZGDV], 1993. Extent: v + 77 pages, 47 references. Author's Affiliation: Department Head, Document Computing, ZGDV. Zentrum für Graphische Datenverarbeitung e.V., Wilhelminenstrasse 7, D-64283 Darmstadt. Email: rath@idg.fhg.de; Tel. 06151 155 152; FAX 06151 155 199.

"Introduction: In technical documents, tables serve to present data clearly and are used in many different ways. Tables are accordingly a layout device for presenting related data records. In general, layout specifications can be scaled arbitrarily and are difficult to describe with a fixed set of properties; the requirements are too varied for that. The same holds for tables.

To process tables with SGML [refs. omitted], their "structure" must be represented in a DTD (document type definition). At first glance, this means describing layout with SGML. Although that is not the real purpose of SGML, it cannot be avoided for tables, because structure and layout are so closely connected that they are difficult to separate.

So that the further discussion of tables rests on uniform terminology and a shared picture of the complexity of the tables under consideration, Chapter 2 presents a (possible) definition of tables. This is a pragmatic approach derived from experience with various document types and publishing systems. It makes no claim to completeness.

After the terms are introduced, Chapter 3 presents different techniques for representing tables in SGML, and Chapter 4 divides tables into three classes, giving requirements and examples for each. Chapter 5 examines and evaluates existing DTDs for free-form tables in detail. Chapter 6 presents a new concept for the structure-oriented representation of free-form tables: tables as coordinate systems. Finally, Chapter 7 summarizes the results. Appendix A contains an extensive listing of various table DTDs."

Available via the Internet from ZGDV: http://zgdv.igd.fhg.de/papers/ed/SGML-Tabs.ps.Z, or in PDF format from this server, or Postscript (mirror copy).

[CR: 19950823]

Rath, Hans Holger. SGML -- Eine Einführung. ZGDV Technical Report. Darmstadt: Zentrum für Graphische Datenverarbeitung e.V. [ZGDV], 1993. Extent: 20 pages, 25 references. Author's Affiliation: Department Head, Document Computing, ZGDV. Zentrum für Graphische Datenverarbeitung e.V., Wilhelminenstrasse 7, D-64283 Darmstadt. Email: rath@idg.fhg.de; Tel. 06151 155 152; FAX 06151 155 199.

"Abstract: The content of an introduction to SGML always depends on the overarching context. Here that context, given the workshop title "SGML in der Praxis" [SGML in Practice], is clearly oriented toward practical applications; accordingly, this article primarily explains the practically relevant parts of SGML and its environment. It begins with a motivation for the standardized document format SGML. This is followed by a brief excursion into the history of SGML and an introduction to its basic components. Because SGML is a formalism, various software tools must be used to apply it; these are presented, and their interplay in the overall process is described. The overview of the syntax and semantics of SGML is strictly introductory. The article ends with a summary of the advantages and disadvantages of SGML."

Available via the Internet from ZGDV: http://zgdv.igd.fhg.de/papers/ed/SGML-Einf.ps.Z, or in PDF format from this server, or Postscript (mirror copy).

[CR: 19960408]

Rath, Hans Holger; Wiedling, Hans-Peter. "Making SGML Work: Introducing SGML Into an Enterprise and Using its Possibilities in Advanced Applications." Computer Standards & Interfaces 18/1 (January 1996) 37-53 (with 11 references). ISSN: 0920-5489. Authors' affiliation: Computer Graphics Center, Darmstadt, Germany [Zentrum für Graphische Datenverarbeitung e.V.].

Abstract: "Nowadays more and more companies are evolving into worldwide operating enterprises embedded in interdependent networks of communication, information exchange and product manufacturing. An up-to-date enterprise is spread over several sites with distributed subsidiaries for administration, product launches and manufacturing. At the same time, internationalization is combined with a strong need for efficient information exchange in an open system environment. Information management and dissemination is becoming a key issue to success in every phase of the product life cycle. Any solution must take all these developments into account."

"With SGML-based documents and tools, a flexible approach can be made towards open system independent document management with re-usable portions of information. In this paper different aspects are described: the steps required to set up an SGML-based application are introduced, ideas and the benefits of Literate Programming to SGML applications are explained, database functionality in distributed environments is presented, and document exchange and hypermedia documents are considered."

This article was published in an SGML special issue of Computer Standards & Interfaces [The International Journal on the Development and Application of Standards for Computers, Data Communications and Interfaces], under the issue title SGML Into the Nineties. It was edited by Ian A. Macleod, of Queen's University.

Raymond, Darrell R. "Flexible Text Display with Lector." IEEE Computer 25/8 (August 1992) 49-60. ISSN: 0018-9162. Author affiliation: Department of Computer Science, University of Waterloo.

Published summary: "Lector provides flexible text interaction for X11 applications. It handles descriptively marked-up text and acts as a text previewer, database browser, code prettyprinter, or menu utility."

Additional notes: The article supplies screen shots of Lector's display of the Oxford English Dictionary and the associated electronic stylesheet (style-specification file); both use SGML-style tagging. It also shows various styles for a csh man page, for prettyprinted C-code, and for text/graphic representation of pieces in a chess game. Available in PostScript format from UWaterloo, or here. See more on the Lector application in Darrell Raymond's technical report. See also the NOED main entry.

Raymond, Darrell R. Lector - An Interactive Formatter for Tagged Text. Technical Report OED-90-02. Waterloo, Ontario: University of Waterloo Centre for the New Oxford English Dictionary and Text Research, August 1990. 26 pages, 13 figures.

Abstract: Lector is an X11 application that provides highly interactive text formatting. Unlike text previewers, Lector handles descriptively marked-up text, supports multiple styles, and interacts well with other programs, including other invocations of Lector. Appropriate selection of texts and styles enables Lector to act as a text previewer, database browser, code prettyprinter, menu utility, and iconic interface. Lector's implementation revolves around a set of tradeoffs involving efficiency, simplicity and generality. The result demonstrated the utility of generalized text display tools.

The report is available in PostScript format from UWaterloo, or here. For further details on SGML-related work at the Waterloo Centre, see Gonnet.

[CR: 19960104]

Raymond, Darrell R. Partial Order Databases. PhD Thesis. Waterloo, Ontario, Canada: Computer Science Department, University of Waterloo, 1996. Extent: ix + 156 pages.

Abstract: "Order is a fundamental property of information that is not explicitly captured in modern database models. The partial order model introduces the idea of partially ordered sets as the basic construct for modelling data. This model exhibits two novel properties. First, it is capable of describing structure without reference to data types. Second, it inherently separates the structure of data from the objects being structured. These two properties mean that the model naturally facilitates the use of multiple structures for data.

"We investigate a collection of algebraic operators for manipulating ordered sets. An implementation of these operators is presented, based on the use of realizers as a data structure. An algorithm is provided for generating realizers for arbitrary finite partial orders.

"The partial order model is useful for data domains that involve containment or dependency relationships. Text databases and software repositories are two examples of such domains. We show how the partial order model can be used to structure text and software data, and how it provides new insights in both areas. In particular, partial orders can be the foundation of better systems for handling both tables and makefiles.

"Partial orders are prominent not only in the data stored in databases, but in database system internals as well. Partial orders play key roles in dependency theory, object-oriented modelling, and the management of redundant data. Thus partial orders are a concept of broad importance both in understanding and impelementing the fundamentals of almost any database system."

Note that Chapter 5 of the dissertation treats ordering relationships in text. The chapter summary discusses SGML.
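
For readers unfamiliar with the term used in the abstract: a realizer of a partial order is a set of linear extensions (total orders) whose intersection is exactly that partial order. The Python sketch below is an illustration only and is not the algorithm given in the dissertation. It builds a naive, non-minimal realizer by producing, for every incomparable pair, one linear extension for each direction. All function names and the four-element "diamond" example are assumptions made for this note.

```python
from itertools import combinations


def transitive_closure(relation):
    """Close a strict, acyclic relation (pairs (a, b) meaning a < b) under transitivity."""
    closure = set(relation)
    while True:
        new_pairs = {(a, d) for (a, b) in closure for (c, d) in closure if b == c} - closure
        if not new_pairs:
            return closure
        closure |= new_pairs


def linear_extension(elements, order):
    """Return a list of all elements in which a precedes b whenever (a, b) is in order."""
    remaining = list(elements)
    result = []
    while remaining:
        for x in remaining:
            # x can come next if none of its predecessors is still waiting
            if not any((y, x) in order for y in remaining):
                result.append(x)
                remaining.remove(x)
                break
        else:
            raise ValueError("relation contains a cycle")
    return result


def naive_realizer(elements, relation):
    """Hypothetical helper: two linear extensions per incomparable pair (not minimal)."""
    order = transitive_closure(relation)
    extensions = []
    for a, b in combinations(elements, 2):
        if (a, b) in order or (b, a) in order:
            continue  # comparable pairs are already respected by every extension
        for low, high in ((a, b), (b, a)):
            forced = transitive_closure(order | {(low, high)})
            extensions.append(linear_extension(elements, forced))
    if not extensions:  # the order is already total
        extensions.append(linear_extension(elements, order))
    return extensions


def intersection_of(extensions):
    """Pairs (a, b) such that a precedes b in every extension."""
    common = None
    for ext in extensions:
        pairs = {(ext[i], ext[j]) for i in range(len(ext)) for j in range(i + 1, len(ext))}
        common = pairs if common is None else common & pairs
    return common


# Assumed example: a < b < d and a < c < d, with b and c incomparable.
elements = ["a", "b", "c", "d"]
relation = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
realizer = naive_realizer(elements, relation)
print(realizer)
print(intersection_of(realizer) == transitive_closure(relation))
```

The construction is quadratic in the number of incomparable pairs and usually produces far more extensions than necessary; it is meant only to make the definition concrete.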

The document is available online in compressed Postscript format: ftp://daisy.uwaterloo.ca/pub/grail/partial.ps.Z [local mirror copy, December 1995]. See the online Table of Contents for the dissertation in a separate file.

[CR: 19960408]

Raymond, Darrell R.; Tompa, Frank Wm.; Wood, Derick. "From Data Representation to Data Model: Meta-Semantic Issues in the Evolution of SGML." Computer Standards & Interfaces [SGML Special Issue] 18/1 (January 1996) 25-36 (with 20 references). ISSN: 0920-5489. Authors' affiliation [Raymond, Tompa]: Department of Computer Science, University of Waterloo, Waterloo, Ontario, Canada; [Wood]: Department of Computer Science, Hong Kong University of Science and Technology, Kowloon, Hong Kong.

Abstract: "SGML provides standard representations for documents, but as documents become more fluid, we will need standard semantics for them as well. The ability to manage change is a fundamental capability of any system that supports document semantics. We look at three areas important in change management: equivalence, redundancy, and operators. We show how these areas are implicitly addressed in SGML and by SGML-based standards, and argue that more explicit consideration would be useful both for evaluating current standards, and for developing new systems for document semantics."

A draft version of the article is available (April 1995) in PostScript format from UWaterloo, or ftp://cs-archive.uwaterloo.ca/cs-archive/CS-95-17/CS-95-17.ps.Z, or mirror copy here. See the entry for this special SGML issue of Computer Standards & Interfaces, edited by Ian A. Macleod, of Queen's University.

Raymond, Darrell R.; Tompa, Frank Wm.; Wood, Derick. Markup Reconsidered. Department of Computer Science, Technical Report No. 356. The University of Western Ontario, 1993. 20 pages, 32 references. ISBN: 0-7714-1504-4.

Abstract: We describe some of the implications of markup for document management systems. Markup's properties are inherited from text, since it is embedded in text. These properties are most advantageous when document structure is reducible to substrings of characters, and when the update characteristics of the structure are similar to the update characteristics of the text. We describe situations in which these characteristics are disadvantageous. Markup is not a data model, but one of several possible techniques for representing structure. For this reason it should not be the foundation of document management systems.

Also available under the title [Technical Report] OED-93-01, UW Centre for the New Oxford English Dictionary, University of Waterloo (April 1993). A presentation under the same title was given at the First International Workshop on Principles of Document Processing, Washington, D.C. (October 21-23, 1992). An earlier version (unpublished) was written as "Markup Considered Harmful" and a related work was entitled "Reading Between the Tags: An Appraisal of Descriptive Markup" ["Markup Considered Harmful"?]. The UWO version of the paper is available via FTP from UWO: ftp://ftp.csd.uwo.ca/pub/csd-technical-reports/356/. The report is also available in PostScript format from UWaterloo, or here.

[CR: 19971123]

Reich, Thomas; Von Zadow, Günter. "From Mainframe to Intranet." Pages 31-36 in SGML Europe '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Authors' affiliation: [Reich]: FIDES Informatik, Zürich, Switzerland; [von Zadow]: DOSCO Document Systems Consulting GmbH, Heidelberg, Germany; WWW: http://www.dosco.de/beratung/brt.htm.

Abstract: "This report describes our experience with a conversion project involving IBM BookMaster and HTML.

"Credit Suisse, a large Swiss Bank, has an existing document processing infrastructure based on IBM host. They use IBM BookMaster -- a markup language based on GML (Generalized Markup Language). Their BookMaster documents are traditionally distributed on paper. The content is the documentation of bank-internal software applications -- everything from user manuals to technical descriptions for audit purposes. The document size varies from about five to several hundred pages.

"To improve the accessibility of their documents the bank wants to publish and keep these documents in a corporate intranet in the future. While this intranet is currently coming into existence the old paper publishing process must be maintained. The coexistence of the old and the new 'world' will be necessary for several years. Conversion tools are therefore needed to ease the way from BookMaster to HTML and back. This paper is a description of the current status of the ongoing project.

[CR: 19990111]

Reid, Brian K. "20 Years of Abstract Markup - Any Progress?" Pages [ ] in Markup Technologies '98 Conference Proceedings. Hyatt Regency, McCormick Place, Chicago, Illinois, USA. November 19 - 20, 1998. Sponsored by GCA and co-sponsored by MIT Press. Edited by the program chairs, B. Tommie Usdin, Debbie Lapeyre, and Michael Sperberg-McQueen. Alexandria, VA: Graphic Communications Association (GCA), 1998. Author's affiliation: Digital Equipment Corporation [Compaq].

"Brian Reid's work with markup systems began in the 1970s. He independently invented and implemented descriptive markup and developed its theory. His Scribe system may have been the cleanest separation of structure and format ever built. His dissertation on it was already complete in 1981, the year he presented in Lausanne in the same session where Charles Goldfarb publicly presented GML; SGML was proposed about a year later. In recent years Reid has turned his attention to network systems and the internet."

This keynote address ("20 Years of Abstract Markup - Any Progress?") was a reflection upon the early development of descriptive markup based upon a presentation made at the Conference on Research and Trends in Document Preparation Systems at Lausanne, Switzerland, February 27-28, 1981. Many of the slides in Reid's Chicago keynote presentation were taken from the 1981 paper, "The Scribe Document Specification Language and its Compiler."

The visuals from Brian Reid's keynote at the Markup Technologies '98 conference, given 19 November 1998, are available online in PowerPoint 97 format. Please note that the file is almost 10 megabytes - it has 25 full-page scanned images in it. [local archive copy, 1999-01-04]

Full abstracts and annotations for other presentations given at the Markup Technologies '98 Conference are provided in a separate document.

Reid, Brian K. "The Scribe Document Specification Language and its Compiler." Pages 59-62 (with 10 references) in International Conference on Research and Trends in Document Preparation Systems. Abstracts of the Presented Papers. Conference on Research and Trends in Document Preparation Systems, Lausanne, Switzerland, February 27-28, 1981. Supported and organized by the [Swiss] Conseil des Ecoles Polytechniques Fédérales. J. D. Nicoud, Program Chair. Lausanne/Zürich: Swiss Federal Institutes of Technology, 1981. v + 130 pages. Author affiliation: Computer Systems Laboratory, Stanford University, Stanford, CA USA 94305.

The paper documents the design of one of the earliest successful document production systems which made rigorous separation of content and format based upon a document specification language. The Scribe document specification language and the associated compiler were eventually developed into a commercial document production system.

See also the slides from Brian Reid's Opening Keynote Address "20 Years of Abstract Markup - Any Progress?" at the Markup Technologies '98 Conference, held in Chicago, November 1998. The slides are available in PowerPoint 97 format. This was a reflection on the early development of descriptive markup based upon a presentation made at the Conference on Research and Trends in Document Preparation Systems at Lausanne, Switzerland, February 27-28, 1981. Many of the slides in Reid's Chicago keynote presentation were taken from the 1981 paper, "The Scribe Document Specification Language and its Compiler." Please note that the file is almost 10 megabytes - it has 25 full-page scanned images in it. [local archive copy, 1999-01-04]

Reid, Brian K. "Scribe: Histoire et évaluation." Pages 28-39 (with 17 references) in Actes des Journées sur la Manipulation de Documents. [Conference] La Manipulation de Documents, Rennes, 4-6 Mai 1983. Edited by Jacques André. Rennes, France: INRIA [Institut National de Recherche en Informatique et en Automatique], 1983. ix + 286 pages. ISBN: 2-7261-0351-0. Author Affiliation: Stanford University, California, USA.

The article supplies a historical overview of the design, development, implementation, and early use of the Scribe document preparation system. In specific terms, the research effort began in 1975 within the Computer Science Department at Carnegie-Mellon University.

Reid, Brian K. "A High-Level Approach to Computer Document Formatting." Pages 24-31 (with 13 references) in Conference Record of the Seventh Annual ACM Symposium on Principles of Programming Languages. Papers Presented at the Symposium. ACM Symposium on Principles of Programming Languages, Las Vegas, Nevada, January 28-30, 1980. Sponsored by ACM, SIGACT [Special Interest Group on Automata and Computability Theory], and SIGPLAN [Special Interest Group on Programming Languages]. New York, NY: Association for Computing Machinery, 1980. ISBN: 0-89791-011-7; ACM Order Number: 549800. Author's affiliation: Carnegie-Mellon University, Computer Science Department, Pittsburgh, PA USA 15213.

Describes the design goals, implementation, and early experiences with the Scribe document formatting language and document preparation system. See also the reference for the Scribe user manual.

Reid, Brian K.; Walker, Janet H. Scribe Introductory User's Manual, Preliminary Draft. Third Edition. Pittsburgh, PA: UniLogic Ltd, May 1980. xii + 188 pages, index. Authors' affiliation: [Reid:] Carnegie-Mellon University; [Walker:] Bolt Beranek and Newman, Inc.

Preliminary draft of the Third Edition of the Scribe User's Manual. Previous editions were published at Carnegie-Mellon in August 1978 and August 1979. Scribe was one of the earliest and most influential implementations of document preparation systems based upon a formal descriptive markup language (rigorous separation of document content and format) and based upon the notion of different document types. Janet Walker was responsible for technical editing/authoring of the manual(s).

Reid, Brian K.; Walker, Janet H. Scribe Introductory User's Manual. Second Edition, third printing. Pittsburgh, PA: UniLogic Ltd, July 25, 1979. xii + 332 pages, index. Authors' affiliation: [Reid:] Carnegie-Mellon University; [Walker:] Bolt Beranek and Newman, Inc.

This volume includes the Scribe User's Manual corresponding to Scribe version 2A(400). The volume also contains, in part II, the Scribe Format Designer's Guide (first edition), corresponding to Scribe version 2A(375). See also the third edition manual for other details.

Reid, Brian K.; Shamos, Michael I.; Walker, Janet H. Scribe Database Administrator's Guide. First [Unilogic] Edition. Pittsburgh, PA: Unilogic, Ltd., 1981.

Reid, Brian Keith. Scribe: A Document Specification Language and its Compiler. Adviser: Robert F. Sproull. Ph.D. Dissertation. Pittsburgh, PA: Department of Computer Science, Carnegie-Mellon University, October, 1980. x + 148 pages.

This dissertation and its related earlier work were highly influential in the United States, particularly within academic circles, in demonstrating the power of document grammars and descriptive markup principles. The dissertation was also published as Carnegie-Mellon Technical Report CMU-CS-81-100. Address: Carnegie-Mellon University, Department of Computer Science, Schenley Park, Pittsburgh, PA 15213.

Reid, Brian K. Scribe Introductory User's Manual. First edition. Pittsburgh, PA: Carnegie-Mellon University, Computer Science Department, 3 August 1978. iii + 112 pages.

This manual was one of the earlier beginner's manuals for SCRIBE. A companion volume was the Scribe Expert's Manual, in which additional system configuration and customization features are documented. This manual corresponds to SCRIBE Version CMU OE(110). Later editions of the manual were produced in 1979, 1980 and later; see the bibliographic entry for the 1980 edition.

Reinke, U. "Towards a Standard Interchange Format for Terminographic Data." Pages 270-282 (with 27 references) in TKE '93. Terminology and Knowledge Engineering. Proceedings of the Third International Congress on Terminology and Knowledge Engineering. International Congress on Terminology and Knowledge Engineering [TKE'93], Cologne, Germany, 25-27 August 1993. Edited by: K.-D. Schmitz. Frankfurt/Main, Germany: Indeks Verlag, 1993. viii + 472 pages. Author's affiliation: Saarlandes Univ., Saarbrücken, Germany.

Abstract: We still have to overcome a number of difficulties before a terminographic interchange standard can be put into practice. Although the recent SGML-based approach seems feasible and promising, describing the content of a complex universal terminological entry still represents an essential problem. It is not sufficient to define an open catalogue of terminological data categories. The difficulties start when it comes to specifying the relationships between them. Yet, a detailed description is a precondition for applying SGML. A different kind of problem will occur when a workable standard will have to be established for the commercial market and TMS developers will have to be convinced of the necessity of offering suitable export and import routines for their programs allowing for the conversion of terminographic data between the individual TMS and the standard format. This aspect is essential because many of the interchange problems have to be solved by the conversion programs rather than by the interchange standard itself. However, by using SGML the recent approach has been based on a standard that has already been applied in several cases where large amounts of electronic data had to be exchanged. Moreover, several SGML programs, such as SGML editors or parsers for checking the conformity of an SGML document, are already available today.

[CR: 19971024]

Renear, Allen. "The Digital Library Research Agenda: What's Missing -- and How Humanities Textbase Projects Can Help." D-Lib Magazine (July/August 1997). ISSN: 1082-9873. Author's affiliation: Scholarly Technology Group, Brown University.

Excerpt: "...it is not implausible to say that most of the major achievements in the realm of digital libraries are also based on insights into document structure. These include such things as, formal grammars for documents (such as SGML and HTML), modular separation of form and content (in stylesheets), hypertext data models, multiple views, etc [...] over the last 30 years there has been evolving a community which, though still small, and not particularly well-institutionalized, is focused on research and development in the core issues of knowledge representation and document structure. This is that portion of the humanities and social science computing community that is developing SGML 'textbases'. I would suggest that this community, in conjunction with other traditional sources of document-oriented research (such as the hypertext, hypermedia, and SGML communities) can help maintain document studies at the center of the digital library research agenda."

[CR: 19970228]

Renear, Allen. "Representing Text on the Computer: Lessons for and from Philosophy." Bulletin of the John Rylands University Library 74 (1992) 221-248.

[CR: 19980625]

Renear, Allen; Mylonas, Elli; Durand, David. "Refining our Notion of What Text Really Is: The Problem of Overlapping Hierarchies." Pages xx-xx [ca. 16 pages, 23 references] in Research in Humanities Computing. Edited by Nancy Ide and Susan Hockey. Oxford: Oxford University Press, 1993. Authors' affiliation: Allen Renear, Brown University; Elli Mylonas, Harvard University; David Durand, Boston University.

Abstract: We examine the claim that 'text is an ordered hierarchy of content objects [OHCO]'; this thesis was affirmed by the authors, and others, in the late 1980s and has been associated with certain approaches to text processing and the encoding of literary texts. First we discuss the nature of this claim and its connection with the history of text processing and text encoding standardization projects such as SGML and the Text Encoding Initiative. We then describe how the experience of the text encoding community, as represented and codified in the TEI Guidelines, has raised difficulties for this thesis. Next we consider two progressively weaker versions of this thesis formulated in response to these difficulties. Ultimately we find that no version appears to be free from counterexample.

Although none of these formulations proves to be theoretically sound, they are nonetheless methodologically illuminating as each generalizes actual encoding practices, making explicit certain assumptions that, even though they have been fundamental to the working methodologies of most text encoding projects, have never been explicitly articulated, let alone explained or defended. The counterexamples to the different versions of the OHCO thesis also arise in actual encoding projects -- so although our focus is theoretical it is grounded in the methodology and problems of contemporary encoding practices. The problems discussed here have implications not only for text encoding and our understanding of the nature of textual communication, but raise very fundamental issues in the logic and methodology of the humanities.
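
The kind of counterexample the abstract refers to can be made concrete with a small check for overlap between two structural views of the same text. The Python sketch below is an illustration written for this note, not code from the article: it treats each structure (for instance metrical lines and sentences in verse) as a list of half-open character spans and reports pairs that share text without one containing the other. Such pairs cannot both be elements of a single strictly nested hierarchy. The function names and the span values are assumptions.

```python
def properly_overlap(a, b):
    """True if half-open spans a and b share text but neither contains the other."""
    (s1, e1), (s2, e2) = a, b
    return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1


def hierarchy_conflicts(first, second):
    """Pairs of spans, one from each structure, that cannot nest in a single tree."""
    return [(a, b) for a in first for b in second if properly_overlap(a, b)]


# Assumed example: two verse lines, and a sentence that runs over the line break.
lines = [(0, 20), (20, 40)]
sentences = [(0, 30), (30, 40)]
print(hierarchy_conflicts(lines, sentences))
```

In this example the second line and the first sentence overlap, so no single tree can contain both as elements without splitting one of them.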

Actual publication date? [The draft version is dated January 6, 1993.] An earlier version was presented at the annual joint meeting of the Association for Humanities Computing and the Association for Literary and Linguistic Computing, Oxford University, April 1992. An HTML version of the draft is available on the Brown University WWW (STG) server [HTMLized by Robin Cover and John Lavagnino]; it is also held in mirror copy [May 1995] here.

See the bibliography entry for "What is Text, Really?" (1990; DeRose, Durand, Mylonas, Renear) for pointers to related articles by the authors on the quintessence of "text" for the purposes of markup. An even earlier article by Coombs, Renear, and DeRose laid the groundwork for this series of markup treatises; see "Markup Systems and the Future of Scholarly Text Processing," CACM 30/11 (1987) 933-947.

See also the announcement from Stuart Lee (Humanities Computing Unit, Oxford University Computing Services - OUCS) for a lecture series on "text" presented by Renear, Summer 1998. On the use of a hierarchical database to model (non-) hierarchical structures, see "SGML/XML and (Non-) Hierarchy."

[CR: 19971202]

Resnik, Philip; Olsen, Mari Broman; Diab, Mona. "Creating a Parallel Corpus from the Book of 2000 Tongues." Pages 107-116 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Authors' affiliation: Department of Linguistics and Institute for Advanced Computer Studies, University of Maryland. [Philip Resnik], Assistant Professor, Department of Linguistics and Institute for Advanced Computer Studies, 1401 Marie Mount Hall, University of Maryland, College Park, MD 20742 USA; Email: resnik@umiacs.umd.edu; WWW: http://umiacs.umd.edu/~resnik/home.html; [Olsen]: Email: molsen@umiacs.umd.edu.

Summary: "This paper reports on a project to annotate biblical texts in order to create an aligned multilingual Bible corpus for linguistic research, particularly computational linguistics, including automatically creating and evaluating translation lexicons and semantically tagged texts. The output of this project will enable researchers to take advantage of parallel translations across a wider number of languages than previously available, providing, with relatively little effort, a corpus that contains careful translations and reliable alignment at the near-sentence level. We discuss the nature of the text, our annotation process, and intended uses for the corpus, and we point out relevant aspects and potential limitations of the current draft of the Corpus Encoding Standard with respect to this corpus.[...] At present, we have implemented a standard intermediate-level annotation, delimiting book, chapter, and verse, for a growing collection of languages. The availability of on-line versions of the Bible leads us to be optimistic about the prospect of creating a resource that covers a wide variety of languages and will be valuable to specialists in translation, linguistics, and the computational analysis of language."

The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/resnik.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.
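
As a rough illustration of why book, chapter, and verse delimiters make alignment cheap, the Python sketch below joins several language versions on a shared (book, chapter, verse) key. It is a hypothetical example written for this note: it does not reproduce the authors' tools or the Corpus Encoding Standard markup, and the function name, key format, and placeholder strings are all assumptions.

```python
from typing import Dict, List, Tuple

VerseKey = Tuple[str, int, int]  # (book, chapter, verse); an assumed key format


def align_by_verse(
    versions: Dict[str, Dict[VerseKey, str]]
) -> List[Tuple[VerseKey, Dict[str, str]]]:
    """Return one aligned row for every verse key present in all language versions."""
    if not versions:
        return []
    shared = set.intersection(*(set(verses) for verses in versions.values()))
    # Plain tuple sorting is used here; real canonical book order would need a book list.
    return [
        (key, {lang: verses[key] for lang, verses in versions.items()})
        for key in sorted(shared)
    ]


# Placeholder texts only; no actual corpus data is shown.
versions = {
    "en": {("GEN", 1, 1): "en text 1:1", ("GEN", 1, 2): "en text 1:2"},
    "fr": {("GEN", 1, 1): "fr text 1:1"},
}
for key, row in align_by_verse(versions):
    print(key, row)
```

Verses missing from any one version are dropped here; a fuller treatment would also have to handle differences in versification between translations.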

Reynolds, Larry. Standard Generalized Markup Language Overview and Trade Study. 1994. 124 pages, with bibliography. CALL NUMBER: QA76.73.S44 R49.

Reynolds, Louis R.; DeRose, Steven J. "Electronic Books: Hypertext Publishing Lets You Structure, Distribute, Retrieve and Annotate the Information You Need." Byte Magazine 17/6 (June 1992) 263-268. Authors' affiliation: Electronic Book Technologies.

The article by Reynolds and DeRose is in a Byte special section "Managing Infoglut: How to Add Value to Your Data." Several articles in this issue discuss the role of SGML in electronic publishing.

[CR: 19961130]

Rice, John D. SGML Fundamentals and Design Issues: An overview of process and technique. Presentation (Paper) at a meeting of the Washington Area SGML Users Group, February 15, 1996. Silver Spring, MD: ATLIS Consulting Group, 1996. Extent: approximately 8 pages. Author's affiliation: Atlis Consulting Group. John D. Rice, 4901 7th Street North, Arlington, Virginia 22203, (703) 351-7203. Email: JDRice@gnn.com.

Summary aphorism: "The fundamental precept for an SGML application is to define a set of rules to describe a continuum of data in a harmonious and carefully governed manner. . . A data collection may be seen as a continuum. That is, it grows, changes, evolves, yet remains united by some common thread. Typically, the document is described as the fundamental unit in that continuum. . ."

Available online from the Washington Area SGML Users Group WWW server, in HTML format; [mirror copy].

[CR: 19971227]

Rice, John; Hancock, Zanetta. "Object Oriented SGML." Pages 105-108 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Authors' affiliation: [John Rice]: Application Engineer, ISOGEN International Corp., 2200 North Lamar St. Suite 230, Dallas, TX 75202 Phone: +1 (214) 953-0004; Email: john@isogen.com; WWW: http://www.isogen.com; [Zanetta Hancock]: ISOGEN International Corp.; Email: zanetta@isogen.com.

Abstract: "In 1996, we began work on a project to develop for the State of Alabama an application modeled on the processes of the state legislative system. Among other things we needed to develop an effective set of solutions for some complex information authoring, management and delivery requirements. A true code generating OO (Object Oriented) RAD (Rapid Application Development) tool would be used to model business processes and to generate the application which would run on top of an Oracle database. However, there were questions as to how the data management component would be effectively designed and integrated. [...] Our requirements were to develop functions to manage the creation, revision, indexing, output delivery, and search and retrieval for a number of document types. According to the business processes, not only document types but individual document component types were subject to versioning. Furthermore, it had to be possible to retrieve any particular version of a document from any stage of its life. Authors worked with a number of document types, such as Draft Legislation, Draft Resolutions, and Amendments. Several of these documents could go through important semantic state changes that a simple static Document Type label was inadequate to describe.

"Here we will focus on three practical aspects of the application design: data management, versioning, and retrieval. We will also discuss the object/relational environment in which the application was designed and how that environment enabled, constrained, or otherwise affected our decisions. [...] We present what we believe to be some logical methodologies for managing SGML components in the context of a larger application. We believe this approach that can be successfully and efficiently repeated and can be successfully applied in a number of different business areas."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

[CR: 19970219]

Richter-Wills, Susanne. High-volume, High-accuracy, SGML Document Capture: A Case Study. Rank Xerox Technical Report. Gloucestershire, UK: Rank Xerox Business Services, 1995 [?]. Extent: approximately 11 pages. Author's affiliation: Business Development Manager, Document Imaging, Rank Xerox Business Services, Document Technology Centre, Beech House, Building 9, Mitcheldean, Gloucestershire, UK. Email: srw@dtc.rankxerox.co.uk.

Abstract: "This case study describes points that need to be considered when setting up a high quality (99.9%), high volume (over 1 million pages per annum), long term (5-10 years) document capture operation. Rank Xerox Business Services captures documents for the EPO - European Patent Office. Around 25,000 pages are captured each week, 52 weeks a year. From paper to SGML encoded text with embedded images, all steps and related issues will be discussed."

The document is available online in HTML format: http://www.dtc.rankxerox.co.uk/Srw_pape.html; [mirror copy]. For other information on the conversion of EPO documents into SGML format, see: Paul Brewin, "SGML and Patent Document Processing. WIPO standard ST.32."

[CR: 19980907]

Richy, Hélène. "Document Style Design by Direct Manipulation." Pages 331-42 (with 21 references) in Electronic Publishing, Artistic Imaging, and Digital Typography. Proceedings of the 7th International Conference on Electronic Publishing (EP '98), Held Jointly with the 4th International Conference on Raster Imaging and Digital Typography (RIDT '98). EP '98 and RIDT '98, Saint Malo, France. March 30 - April 3, 1998. Edited by Roger D. Hersch, Jacques André, and Heather Brown. Lecture Notes in Computer Science Series, Number 1375. New York/Berlin/Heidelberg: Springer-Verlag, 1998. ISBN: 3-540-64298-6. Author's affiliation: Irisa, Campus universitaire de Beaulieu, F-35042 Rennes Cedex, France.

Abstract: "Designing style sheets for structured documents is often a difficult task. In this paper, we discuss the way to support style design through direct manipulation and propose an interactive method of specification by example for editing style sheets. In this approach, style editing actions within a formatted document are generalized into 'generic' style specifications and the 'generic' style sheet is dynamically updated. An initial implementation of a direct-manipulation editor for structured document style sheets is presented. Based on a structured authoring environment, this prototype provides a comfortable environment for editing style properties as defined in style sheets through the visual representation of any document, without programming the style language."

[Conclusion]: "This paper has presented a tool for the design of style sheets for structured documents, which does not require users to program the style language. The proposed approach is based on a presentation graph and enables the generalization process to produce the style sheet properly. In this approach, designers or authors directly modify the presentation on a formatted document. By immediately visualizing the formatted document, users are able to discover relationships among logical components and style rules. This ability helps users to update the generic style in an efficient way. A first prototype, based on the structured Thot editor, provides an interactive environment for editing the typography, topology, and style of a document, and for updating style sheets written in the P language. Since P-edit is a prototype used for the validation of our generic approach, it does not provide all the functionality of a full system. Some features such as page layout, footnotes, views, or decorative boxes should be improved. And we have made the simplifying assumption that an initial style sheet is provided. Until now, not much attention has been paid to the problem of generic style editing. The results of this experiment reveal that this approach can be used to edit style sheets with simple contextual conditions as defined in the CSS language."

See the online slides, the online abstract, and the PDF document, full text. [local archive copy] See also J. André, H. Richy, and Ch. Hérault, "Notion de 'feuille de style'," in Cahiers GUTenberg 21 (June 1995) 127-134.

[CR: 19951113]

Richy, Hélène. "A Hypertext Electronic Index Based on the Grif Structured Document Editor." Electronic Publishing -- Origination, Dissemination and Design (EPODD) 7/1 (March 1994) 21-34 (with 30 references). Author's affiliation: Opéra Project, IRISA, Rennes, France.

"Abstract: This paper presents an electronic index service that was developed in the Grif editor by taking advantage of the hypertext facilities available in the system. Grif is a structured document editor based on the generic structure concept that supports both hierarchical structures and non-hierarchical links. The active cross-reference within the Grif index makes activation and browsing through indexing more powerful than in other systems: the index tables, helpful as a medium for supporting search by keywords in paper documents, support browsing in electronic documents. These indexes are easy to use as they are displayed in the same form as indexes in a paper document."

Available on the Internet in Postscript format: ftp.irisa.fr/opera/doc/index.ps.Z [mirrored copy, November 1995].

[CR: 19951113]

Richy, Hélène. Grif et les index électronique. Rapport de recherche num. 1756. Rennes: IRISA, 11 septembre 1992. Extent: 31 pages, 22 references. Author's affiliation: IRISA, Campus Universitaire de Beaulieu F-35042 Rennes Cedex, France. Email: richy@irisa.fr.

[Résumé, in English translation:] "This report presents the electronic indexes of Grif. These indexes are both alphabetical tables, which list the important terms of a document, and links that make it possible to traverse the document as a hypertext, without using the document's initial generic structure.

"More precise than printed indexes, which refer only to the pages of a document, these indexes make it possible to locate directly in the electronic document the passage, sentence, or section that deals with the subject sought. Moreover, these indexes can be used both when consulting a document and while editing it."

ftp://ftp.imag.fr/pub/OPERA/doc/IndexGrif.ps.gz [mirrored copy, November 1995].

[CR: 19961018]

Richy, Hélène; André, Jacques. "Typographic Sheets and Structured Documents." Pages 81-93 (with 25 references) in EP '96. Proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography. [ = Journal Special Issue: Electronic Publishing - Origination, Dissemination and Design (EPODD), June & September 1995, Volume 8, Issues 2-3.] Sixth International Conference on Electronic Publishing, Document Manipulation and Typography, Palo Alto, California. September 24-26, 1996. Sponsored by Adobe Systems Incorporated; School of Information Management and Systems, University of California at Berkeley; Xerox Corporation. [Proceedings Volume] Edited by Allen Brown, Anne Brüggemann-Klein, and An Feng; [Journal] Editors David F. Brailsford and Richard K. Furuta. Chichester/New York: John Wiley & Sons, 1996. ISSN: 0894-3982. Author's affiliation: IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Campus de Beaulieu, F-35042 Rennes Cedex, France. Email: Helene.Richy@irisa.fr; Jacques.Andre@irisa.fr.

Abstract: "Document structure provides facilities for accessing and presenting hypermedia documents, and style sheets support their layout on screen (or paper). However, little attention is given to the quality of these documents in terms of typography, in the sense, for example, of abiding by the rules of The Chicago Manual of Style. This paper shows how the specification of typographic sheets can help apply typographic control to structured documents. This approach provides a mapping between elements and typographic properties. A typographic checker based on typographic sheets is presented."

Keywords: Typography, correctness, quality, sheets, structured documents, SGML, Thot.

For other conference information, see the main conference entry for EP '96, or the brief history of the conference as sixth in a series since 1986. See the volume main bibliographic entry for a linked list of other EP '96 titles relevant to SGML and structured documents.

[CR: 19951113]

Richy, Hélène; Frison, Patrice; Picheral, Eric. "Multilingual String-to-String Correction in Grif, a Structured Editor." Pages 183-198 (with 21 references) in EP [Electronic Publishing] 92: Proceedings of Electronic Publishing, 1992 International Conference on Electronic Publishing, Document Manipulation, and Typography. Swiss Federal Institute of Technology, Lausanne, Switzerland. April 7-10, 1992. Sponsored by the Swiss Federal Institute of Technology and the Swiss National Science Foundation. Edited by Christine Vanoirbeek and Giovanni Coray [EPF, Lausanne, Switzerland]. The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1992. ISBN: 0-521-43277-4. Author affiliation: Swiss Federal Institute of Technology, Lausanne, Switzerland.

"Abstract: This paper describes the integration of a spelling corrector into the structured editor Grif. The corrector is based on the Levenshtein metric concept which is particularly efficient for string correction. This method can be implemented efficiently and can produce good results with short response time on a new RISC workstation even with large dictionaries. The integration within Grif enables checking of textual content of structured documents where large vocabularies are required. Thanks to an attribute language, the editor can automatically adapt the correction to the language and can apply a specific word recognition algorithm and dictionaries, thus allowing checking and correcting of multilingual documents."
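
[Illustration:] The Levenshtein metric named in the abstract counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is the textbook dynamic-programming formulation, not the Grif implementation; the function names, the sample dictionary, and the `max_distance` threshold are hypothetical, and the paper's language-specific word recognition and dictionary selection are not modeled.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def suggest(word, dictionary, max_distance=2):
    """Return dictionary entries within max_distance edits of word,
    closest first (hypothetical helper, linear scan for clarity)."""
    scored = [(levenshtein(word, entry), entry) for entry in dictionary]
    return [entry for dist, entry in sorted(scored) if dist <= max_distance]
```

For example, `suggest("documnt", ["document", "element", "index"])` keeps only "document", which is one insertion away. A linear scan like this would probably be too slow for the large dictionaries the abstract mentions; it is meant only to show the metric at work.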

Available on the Internet: ftp://ftp.irisa.fr/opera/doc/corrector.ps.Z [mirrored copy, November 1995]; or the French version: ftp://ftp.irisa.fr/opera/doc/correcteur.ps.Z

[CR: 19951113]

Richy, Hélène; Hérault, Chrystèle; André, Jacques. "Notion de 'feuille de style'." Cahiers GUTenberg Number 21 (juin 1995) 127-143 (with 45 references). Authors' affiliation: Opéra Project, IRISA, Campus de Beaulieu, Rennes.

[Résumé, in English translation:] "The concept of the structured document makes it possible to distinguish between a document's content and its logical structure (the DTD in SGML). To this distinction is now added that of the document's graphical structure. As we show in this article, the specification of this graphical structure is far from standardized: numerous proposals are being made around HTML in particular. Beyond certain apparent divergences, the notion of the style sheet is beginning to take hold and should facilitate document exchange."

Available in Postscript format on the Internet: ftp://ftp.irisa.fr/opera/doc/styles.ps.gz [mirrored copy, November 1995]

Rieger, Wolfgang. SGML für die Praxis: Ansatz und Einsatz von ISO 8879. Mit einer Einführung in HTML, inkl. Diskette mit Public Domain SGML-Parser sgmls. Heidelberg: Springer Verlag Heidelberg, 1995. x + 427 pages, diskette. ISBN: 3-540-57534-0; price: 98 DM. Author's address: Wolfgang Rieger; bse - Buero fuer Software-Entwicklung; Frankfurter Ring 193a; 80807 Munich, Germany; Phone: +49 89 323 19 93; Fax: +49 89 323 19 93; Email: rieger@bse.de; WWW: http://www.bse.de/.

[Published summary, in English translation:] "A basic problem in the electronic creation, processing, and archiving of documents is the inadequate representation of content and structure by the document formats and page description languages in widespread use today.

"SGML solves this problem through the precise and flexible description of the structure of documents, and thereby makes possible the versatile use of documents in conventional and electronic publications.

"This book gives a practice-oriented introduction to the advantages and application areas of SGML. The various components of SGML documents are treated by means of numerous examples and exercises.

"A separate chapter is devoted to HTML, an SGML-based hypertext standard that underlies the new Internet service, the World Wide Web. An overview of currently available SGML applications, with sources of supply, and a compilation of information sources with special attention to the Internet round out the presentation."

Overview and Table of Contents for the book may be found on the BSE server, or in mirror copy [dated April 1995] here.

[CR: 19961226]

Rink, Chris; Yencha, Bob. "Multiple DTD Composition System; A Case Study in Semantic Transformation." Pages 223-230 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: [Rink]: Sr. Application Engineer, Coris Inc., R.R. Donnelley & Sons Company, 7501 S. Quincy St., Willowbrook, Illinois 60521, USA; Tel: +1 (630) 655-7215; FAX: +1 (630) 655-7755; Email: crink@coris.com; WWW: http://www.coris.com/; [Yencha]: Sr. SGML Systems Analyst, National Semiconductor, MS 01-07, 333 Western Avenue, South Portland, Maine 04106, USA; Tel: +1 (207) 775-8736; FAX: +1 (207) 775-8745; Email: bob.yencha@nsc.com.

Abstract: "This composition system accepts documents coded according to multiple authoring DTDs (of many versions) and provides a maintainable method for updating the system to keep pace with DTD changes. The key is that lower level elements, such as paragraphs and phrases, are identical across DTDs while Division (section) level elements differ. The solution automatically creates a document-type-specific transformation program and creates a generic SGML file from one of multiple authoring document type SGML instances. The generic SGML file can then be input to a more structure-based composition converter to create the final composed (targeted) output."

The description of the composition system is based upon an SGML application developed by (for) National Semiconductor to support its technical publications needs -- delivering some 30,000 pages of company information in various delivery formats. The underlying database is called Powerbase, from Coris. "There are two essential parts to the PowerBase solution: database content management, and information production and distribution. National Semiconductor provides new or revised product information in SGML format. Coris produces additional, associated file formats -- for example, HTML for the Web, or video files for multimedia CDs -- and converts images to TIFF for print or GIF for Web. Every one of these pieces is then organized as an individual content object in the PowerBase database, ready to be pulled into multiple-purpose materials and in multiple media." More information on Powerbase may be found on the Coris WWW server: http://www.coris.com/corishome/pwrbase/pwp.html.

Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19971227]

Rivers, Bernard. "Accessing XML from Plugin-enhanced Browsers: Practical Experience at Two Multinational Corporations." Pages 571-580 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Bernard Rivers]: Managing Director, RivCom, 945 West End Avenue, New York, NY 10025 USA; Phone: +1 212 662-6800; FAX: +1 212 662-6900 (Also based in Swindon, UK); Email: bernard.rivers@rivcom.com; WWW: http://www.rivcom.com.

Abstract: "This case study shows how XML has been used by RivCom in projects for Logica and Shell International. Both projects required the delivery of highly structured information in an accessible and easily navigable form to the desktops of users world-wide. The solution adopted in each case was to store the information in the form of XML files. These files could then be accessed by users over the network via corporate intranet, or locally from CD-ROM. The users were provided with a browser plugin developed by RivCom. The plugin intercepts the stream of incoming XML, and then composes HTML on the fly which it passes to the browser for presentation to the user. The HTML reflects presentational preferences resulting from a dynamic combination of styles defined by the publisher (the corporation) and user choices or actions. [...] And XML, we found, was ideally suited to meeting the needs of our clients."

"The two projects I want to describe and illustrate in this paper are the following: 1) Logica is a major computer consultancy and systems integration company with over 5,000 staff in 22 countries around the world. Their Cortex Business Model describes all aspects of Logica's business. 2) Shell International is one of the world's largest oil companies. Their Downstream Business Activity Model is a structured description of the company's entire downstream business - from the arrival of crude oil at refineries to the delivery of refined oil products to customers at Shell service stations. A common requirement of both projects was the need to deliver highly structured information in an accessible and easily navigable form to users distributed throughout the world."

This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.

[CR: 19971018]

Roberts, Roda P; Langlois, Lucie; Megginson, David. "SGMLizing the Bilingual Canadian Dictionary: Reasons, Process, and Problems." Page [184] in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Authors' affiliation: [Roberts]: University of Ottawa, Email: roberts@uottawa.ca; [Langlois]: University of Ottawa, Email: langlois@balzac.sti.uottawa.ca; [Megginson]: Microstar & University of Ottawa, dmeggins@microstar.com.

[Extract:] "This session explores the reasons for and the challenges of setting up a text processing application using SGML for lexicographic data. More particularly, it presents the experience of a group of researchers in the Humanities who were forced to become familiar with the SGML standard, to help design a Document Type Definition (DTD), and to get used to using SGML authoring tools to write a dictionary. The dictionary in question is the Bilingual Canadian Dictionary (BCD), which is still in preparation. As its tentative title indicates, it is a bilingual dictionary which will reflect English and French as they are used in Canada. The creation of this dictionary is the major objective of a vast collaborative research project, called 'Comparative Lexicography of French and English in Canada', funded by the Social Sciences and Humanities Research Council. The project involves three universities: the University of Ottawa (which is also the administrative centre), the University of Montreal, and Laval University."

Abstract available online in HTML format: "SGMLizing the Bilingual Canadian Dictionary: Reasons, Process, and Problems", by Roda P. Roberts, Lucie Langlois, and David Megginson. Presentation at ACH/ALLC '97. [archive copy] The ACH-ALLC paper abstract is also on the UOttawa web site. See the Web site for the Comparative Lexicography of French and English in Canada, or the main database entry in the SGML/XML Web Page.

Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.

[CR: 19970314]

Robertson, Lindsay. "SGML Markup For Publishing at Leeds." SGML Users' Group Bulletin 3/1 (1988) 20-21. ISSN: 0269-2538. Author's affiliation: University of Leeds.

"The University currently has 750 terminals linked to the Amdahl mainframe computer directly... The Miles 33 front end is linked directly to Amdahl mainframe... All generically-coded files are then passed through an SGML SARA table -- SARA standing for 'Search and Replace Automatically' -- which converts the generic tags to format calls on the Miles 33 system." The article is based upon a paper originally delivered at an SGML Users' Group Meeting at the University of Leeds, October, 1987.

[CR: 19961226]

Robie, Jonathan. "Components, Databases, and Repositories." Pages 305-312 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: POET Software Corporation, 3207 Gibson Road, Durham, N.C., 27703, USA; Tel: +1 919.598.5728; FAX: +1 919.598.6728; Email: jwrobie@mindspring.com or jonathan@poet.com.

Abstract: "Recently, the SGML world has been rediscovering the database repository. Many SGML users have documents which need to be shared at different levels of granularity, distributed among workgroups, created on the fly from smaller pieces, versioned, found via queries, or managed in parts because they are too large to fit in RAM. All of these requirements suggest that documents need to be composed of smaller units, 'components', which can be exchanged among users, combined to form documents, versioned, returned as the result of queries, and validated.

"The term 'component' is abstract, and does not concretely specify the relationship of components to an SGML document or DTD. However, the manner in which components are used clearly indicates certain properties which they must have. This presentation uses three specific scenarios to determine these properties: versioning in a workgroup environment, logical access to the components of a document, and dynamic document generation.

"Based on these scenarios, we define components in terms of their basic properties and uses, discuss the design choices available for implementing components in an SGML repository, and outline some of the design choices made in an SGML repository system which was jointly designed by F.A.Davis, a medical textbook and multi-media publisher, and POET Software, an object database company."

Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19971227 MD: 19980216]

Robie, Jonathan. "XML and Modern Software Architectures. XML in the World of the Internet, JavaBeans, Software Components, and Controls." Pages 179-184 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Jonathan Robie]: Research Consultant, Texcel Ventures, Inc., 3207 Gibson Road, Durham, North Carolina USA 27703; Phone: +1 919.598.5728; Email: jonathan@texcel.no; WWW: http://www.texcel.no.

Abstract: "As XML has brought SGML into mainstream software development, the SGML community has had to change some basic assumptions about editing environments and documents. Recently, a variety of new XML-related standards have been proposed that envision using XML as a core technology for internet software development. These standards are in early phases, but if they come to be accepted and used, they paint a very interesting future for XML and SGML. SGML tools have generally been designed for single-user use or for use on a LAN. There are now proposed standards that define protocols for joint authoring across the internet, specifying how documents and document components can be created, traversed, locked, changed, checked in or out, versioned, and protected from unauthorized access. These programming interfaces may be used by distributed editing tools to allow XML documents, databases, and repositories to be edited or viewed simultaneously by many users working from different locations. By supporting these standards as they come out, SGML and XML tools can become a vital part of new Internet development."

"As an open format for structured documents, XML has made structured documents a natural way to define new standards. The SGML community has generally assumed that all documents were created by human beings, and ultimately read by other human beings. Although most XML documents probably will be written and read by humans, many of the new XML standards are for environments in which documents are created or consumed by programs. Some of these use XML as a rich data interchange format; others use XML to define protocols, software installation procedures, financial transactions, document transfer schedules, and system configuration."

"Finally, this paper explores the use of software components in SGML systems. Many new applications are based on the concept of reusable software components, and many SGML programs seem to use similar controls, such as tree browsers, list viewers, SGML browsers. Several suggestions are made for improving SGML software systems to support component based programming."

This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.

A version of the document is available online in HTML format: "XML and Modern Software Architectures"; [local archive copy]

Robinson, Brian. Electronic Document Handling Using SGML: Hypertext Interchange and SGML. London: British Library Research and Development Department, 1994. 19 pages. LCCN: gb 94-96186; National bib. number: GB94-96186.

Robinson, Brian. Electronic Document Handling Using SGML: Management Report. London: British Library Research and Development Department, 1994. 11 pages. LCCN: gb 94-94374; National bib. number: GB94-94374.

Robinson, Brian; Wu, Gilbert. "Applications of SGML." <TAG> 19 (August 1991) 4-9.

Abstract: This paper does not dwell on the technicalities of the Standard Generalised Markup Language (SGML) but focuses on applications of SGML which are currently the subject of research and development contracts within the Information Technology Group or ERDC at the Hatfield Polytechnic. Detail is provided on one particular project undertaken for the Science and Engineering Research Council (SERC). Software has been developed which allows users to complete complex electronic forms on a standard Personal Computer in SGML format. The software is independent of the form structure which is defined in ASCII files using a powerful, compact purpose-designed language. Close control over all aspects of data capture including data integrity, virtual fields and online user help is supported. The completed forms are transmitted as electronic mail across a wide area network and processed automatically in a mainframe environment at SERC.

Robinson, Brian; Wu, Gilbert. "Applications of SGML." University Computing 14/2 (1992) 53-57. ISSN: 0265-4385.

For an overview, compare the abstract of the related version by the same authors, "Applications of SGML," in <TAG> 19 (August 1991) 4-9.

Robinson, Peter M. W. "The Canterbury Tales Project." Pages 199-209 [partial abstract] in Colloque International "Consensus ex Machina?". Abstracts. International Joint Conference of the ALLC (Association for Literary and Linguistic Computing) and ACH (Association for Computers and the Humanities), Sorbonne, Paris, 19-23 avril 1994. Paris: Laboratoire "Lexicométrie et textes politiques" (INaLF, CNRS), and Ecole Normale Supérieure de Fontenay - Saint Cloud, 1994. 244 pages. Author Affiliation: Centre for Humanities Computing, Oxford University.

[CR: 19960912]

Robinson, Peter R. The Transcription of Primary Textual Sources Using SGML. Office for Humanities Communication Publications, No. 6. Oxford: Oxford University Computing Service, Office for Humanities Communication, 1994. Extent: vii+ 136 pages, references. ISBN: 1897791070.

See the review by Julia Flanders in CHUM.

[CR: 19951206]

Robinson, Peter R.; Solopova, Elizabeth. The Canterbury Tales Project. Paper presented at The Electric Scriptorium. Approaches to the Electronic Imaging, Transcription, Editing and Analysis of Medieval Manuscript Texts: A Physical & Virtual Conference. The University of Calgary, Calgary, Alberta [physical conference]. November 10-12, 1995. Sponsored by The University of Calgary, Calgary Institute for the Humanities, and SEENET. Conference coordinated by Dr. Murray McGillivray, Thomas Wharton, Blair McNaughton, and Robert McLean. Extent: approximately 15 pages. Authors' affiliation: Oxford and Sheffield Universities.

"This paper will introduce the work of the Canterbury Tales Project, and especially its first CD-ROM (on the Wife of Bath's Prologue, due for release January 1996). The Canterbury Tales Project aims at the transcription, collation, analysis and publication of all extant pre-1500 witnesses, with almost all publication (and all publication of transcripts and manuscript images) to be in electronic form. The publication of the project's first CD-ROM marks the first publication of a major edition conceived, from the first, as an electronic publication. The imminent publication of the first CD-ROM, which will be demonstrated at the conference, is sufficient proof that such editions are now technically possible. Accordingly, this paper will focus on the intellectual difficulties in our work, rather than the technical problems."

See further details on the SGML aspect in the project description. The document is available on the Internet as part of the official conference record: see http://www.ucalgary.ca/~scriptor/chaucer/rob.html [mirror copy, partial only]. For further details on the Electric Scriptorium conference, see Electric Scriptorium Home Page.

Rockley, Ann. "Ontario Hydro and SGML." Technical Communication: Journal of the Society for Technical Communication 40/3 (Third Quarter, August 1993) 383-386. ISSN: 0049-3155. Author affiliation: Information Design Solutions.

When Ontario Hydro, Canada's largest utility, decided to explore the conversion of its 20,000 pages of paper manuals to online documentation, it hired Information Design Solutions to conduct the analysis. That analysis established the scope of the project, provided a set of design criteria, and recommended the use of SGML to create flexible new documentation and the purchase of the Dynatext program to produce it. A prototype conversion showed the feasibility of the project, which is still going on.

[CR: 19951220]

Rockley, Ann. "Putting Large Documents Online." Pages 273-81 (with 4 references) in Proceedings of SIGDOC'93 [The 11th Annual International Conference on Systems Documentation]. 11th Annual Conference on Systems Documentation, Waterloo, Ontario, Canada. October 5-8, 1993. Conference sponsored by SIGDOC. New York, NY: ACM Press, 1993. Author's affiliation: Information Design Solutions Inc., Stouffville, Ontario, Canada.

"Abstract: Large documents are the most suitable for online viewing. They can be stored compactly on the system and they can be searched in ways not possible in their printed form, by using full-text search retrieval methods or by making the underlying structure of the document accessible. Large documents can be searched more accurately and more completely by users, making their tasks easier. This paper reviews some of the issues that must be considered when putting large documents online."

[CR: 19971018]

Rockwell, Geoffrey M; Johnson, Joanna; Piro, Rocco. "MILE: A Markup Language for Interactive Drill Courseware." Pages 135 - 137 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Authors' affiliation: McMaster University, Email: [Rockwell, contact] grockwel@mcmaster.ca.

[Extract:] ". . . in 1994 we started developing the MILE environment and the accompanying MILE Markup Language (MML). While this paper will demonstrate the MILE environment, the primary focus will be on the design of the markup language. . . Over the next couple of years the MILE project hopes to create a Lesson Builder for the WWW which would translate MML into a combination of JavaScript and HTML. We also hope to create a SGML DTD so that we can take advantage of the SGML tools available. This, along with input from users who desire more functionality, may lead to further revisions to MML."

Abstract available online in HTML format: "MILE: A Markup Language for Interactive Drill Courseware", by Geoffrey M. Rockwell, Joanna Johnson, Rocco Piro; [archive copy]

Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.

[CR: 19970228]

Rockwell, Richard C. "The World Wide Web as a Resource for Scholars and Students." American Council of Learned Societies Newsletter 4/4 (February 1997) 9-10, 21. Author's affiliation: Executive Director, Inter-university Consortium for Political and Social Research (ICPSR), University of Michigan.

The article is part of a special issue which "focuses on the presentations of a program session on Internet-accessible scholarly resources held at the 1996 ACLS Annual Meeting." The issue theme is entitled "Internet-Accessible Scholarly Resources for the Humanities and Social Sciences."

Describing the ICPSR archive (and the need to first scan 350,000 pages of material), Rockwell says: "What we need are structured documents, so that a study can be documented to a statistical package by the "code book" itself. Accordingly, we're developing something called a Document Type Definition (DTD) for SGML (Standard Generalized Markup Language) markup of social science code books, and we'll put that Document Type Definition in the public domain. Information on the DTD is available at the Data Documentation Initiative [see also the DDI entry in the SGML/XML Web Page]. We're striving to be compliant with the Text-Encoding Initiative (even though it wasn't really constructed for the social sciences). And we strongly hope to see federal statistical agencies use the Document Type Definition to prepare code books for their own data resources. If they do so, they will achieve a major improvement in documentation."

The article is available online in HTML format: http://www.acls.org/n44rock.htm; [mirror copy].

[CR: 19951113]

Röhrich, Johannes. "Abstract Markup in TEX." Pages 159-160 in TeX for Scientific Documentation. Proceedings of the Second European Conference. (The Second European Conference on TeX for Scientific Documentation, Strasbourg, France, June 19-21, 1986, Sponsored by: CNRS (Centre National de la Recherche Scientifique), SMF (Société Mathématique de France), Université Louis-Pasteur de Strasbourg). Edited by Jacques Désarménien. Lecture Notes in Computer Science, Number 236. Berlin/New York: Springer-Verlag, 1986. ISBN: 0387168079 (New York); ISBN: 3540168079 (Berlin).

The article discusses theoretical issues in abstraction concepts, including the semantics of markup languages. The second part of the contribution deals with the MAX (SGML-based) system developed at the University of Karlsruhe; MAX is a descriptive SGML-like markup language and document compiler.

[CR: 19951113]

Roisin, Cécile; Akpotsui, Extase K. A. "Implementing the Cut-and-Paste Operation in a Structured Editing System." Pages [???-???] in Principles of Documents Processing, PODP '94. Principles of Documents Processing. Darmstadt. April 11-12, 1994. Sponsored by: Fuji Xerox Systems and Communications Lab, GMD-IPSI, Rank Xerox Research Centre, and Xerox Webster Research Center. Edited by Makoto Murata and Herve Gallaire. [pub-location: Darmstadt?]: [publisher: GMD-IPSI?], 1994. Authors' affiliation: INRIA-IMAG, 2, rue de Vignate - F-38610 Gières - France Tel: (33) 76 63 48 33; Email: Cecile.Roisin@imag.fr, exta@berger-levrault.fr .

"Abstract: This paper addresses the problem of transforming the structure of a part of document when it is moved by the cut-and-paste operation in a structured editing system. The solution presented in this paper is based on the modelling of document types (DTD), allowing several order relations between types to be identified in order to determine when and how transformations can be done. Basically, the structural representation of a document type is given by a tree where the leaves are basic types. A canonical form of the types has been defined in order to eliminate syntactic details of SGML and to allow correct analysis of types. For efficient dynamic transformations, the tree representation is linearized in a 'Dyck' word, keeping only structural information and types of the leaves. A cut-and-paste operation is then implemented as a string comparison between the source instance word and the target type word; the result gives a way to construct a new target instance which conforms to the target type even if the source type is different."

Available in Postscript format on the Internet: ftp://ftp.imag.fr/pub/OPERA/doc/PODP94.ps.gz [mirrored copy, November 1995].
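
The linearization idea in the abstract above can be illustrated with a minimal sketch. It flattens a type tree into a Dyck-style word (balanced parentheses, with only the leaf types kept) and then compares two such words as strings. The tree encoding, the names `linearize` and `same_structure`, and the use of plain string equality are assumptions made here for illustration; the paper's canonical form and its comparison rules are not reproduced in this entry and are likely richer.

```python
from typing import List, Tuple, Union

# Hypothetical tree encoding: a leaf is a basic type name (str);
# an inner node is a (label, children) pair.
Tree = Union[str, Tuple[str, List["Tree"]]]


def linearize(tree: Tree) -> str:
    """Flatten a tree into a balanced-parenthesis word.

    Inner labels are dropped; only the nesting structure and the
    leaf (basic) types survive, as the abstract describes.
    """
    if isinstance(tree, str):
        return tree
    _label, children = tree
    return "(" + " ".join(linearize(child) for child in children) + ")"


def same_structure(source: Tree, target: Tree) -> bool:
    """Compare two trees by comparing their linearized words as strings."""
    return linearize(source) == linearize(target)


# Two differently named types with the same underlying shape.
source = ("note", ["text", ("list", ["text", "text"])])
target = ("remark", ["text", ("items", ["text", "text"])])
print(linearize(source))
print(same_structure(source, target))
```

In this sketch two instances count as compatible only when their words are identical; a real structured editor would presumably also accept words that can be transformed into one another.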

[CR: 19951113]

Roisin, Cécile; Vatton, Irène. Formatting Structured Documents. INRIA (Institut National de Recherche en Informatique et en Automatique) Rapport de recherche num. 2044. Grenoble, France: INRIA, September 1993. Extent: vi + 21 pages, 23 references. ISSN: 0249-6399.

"Abstract: Although it is well established that structured documents and generic models bring benefits to applications involving documents, integrating these document models in the formatting process of interactive editors is still an open problem. In this paper, the problem of laying out and formatting structured documents is investigated, taking into account the DSSSL standard. One key point of this document model is the potential to express the logical structure of documents independently from their graphical aspect. However, this approach introduces a more complex formatting process, as two independent structures have to be merged. This discussion is illustrated by our experience of dynamic formatting in the Grif editor."

"Résumé: S'il est clairement établi que l'utilisation de modèles de documents structurés constitue une avancé significative dans le domaine des applications de manipulation de documents, l'intégration de ces structures de documents dans les formateurs interactifs est encore mal résolue. Ce rapport traite de ce problème en s'appuyant sur la norme DSSSL. Un des atouts de la structuration est la possibilité d'exprimer la structure logique des documents indépendamment de leur aspect graphique. Cette approche conduit cependant à la mise en oeuvre d'un processus de formatage plus complexe, dans la mesure où il est nécessaire de fusionner deux structures indépendantes. Cette discussion est illustrée par notre expérience de formatage dynamique que nous avons réalisé dans l'éditeur Grif."

Available in Postscript on the Internet: ftp://ftp.imag.fr/pub/OPERA/doc/RR93Formatting.ps.gz [mirrored copy, November 1995].

[CR: 19970812]

Roisin, C.; Vatton, Irène. "Merging logical and physical structures in documents." Electronic Publishing: Origination, Dissemination, and Design (EPODD) EP '94. Fifth International Conference on Electronic Publishing, Document Manipulation, and Typography, Darmstadt, Germany, 13-15 April 1994. 6/4 (December 1993) 327-337. 20 references. Authors' affiliation: IMAG, Institut d'Informatique et de Mathématiques Appliquées de Grenoble, France.

Abstract: Although it is well established that structured documents and generic models bring benefits to applications involving documents, integrating these document models in the formatting process of interactive editors is still an open problem. In this paper, the problem of laying out and formatting structured documents is investigated, taking into account the Document Style Semantics and Specification Language (DSSSL) standard. One key point of this model is the possibility of expressing the logical structure of documents independently from their graphical aspect. However, this approach induces a more complex formatting process, as two independent structures have to be merged. This discussion is illustrated by our experience of dynamic formatting in the Grif editor.

Available via the IMAG FTP server: Merging Logical and Physical Structures in Documents, or ftp://ftp.inrialpes.fr/pub/opera/publications/EP94.ps.gz, or in a mirror copy here [August 01, 1995].

[CR: 19951113]

Roke, A. G.; San Giorgi, A. "ODA/ODIF: Onder architectuur gebouwd en van wereldformaat." Informatie 37/1 (January 1995) 33-40.

[Reference is from the PREMIUM Project]

[CR: 19970531]

Role, François. "La norme SGML: pour décrire la structure logique des documents." Documentaliste-Sciences de l'information 28/4-5 (1991) 187-192.

Summary [translated from the French]: "A text in French that goes somewhat further into the technical details (syntax, DTD) and that discusses SGML for libraries. (from Martin Sévigny)".

[CR: 19971202]

Romary, Laurent; Bonhomme, Patrice; Bruneseaux, Florence; Pierrel, Jean-Marie. "Silfide: A System for Open Access and Distributed Delivery of TEI Encoded Documents." Pages 117-122 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Authors' affiliation: CRIN-CNRS & INRIA Lorraine.

Summary: "SILFIDE (Serveur Interactif pour la Langue Française, son Identité, sa Diffusion et son Etude) is a tool for sharing, congenially and thoughtfully, a knowledge on different aspects of the French language. It consists in a network of data processing servers together with the necessary support. The aim of SILFIDE is not to integrate the totality of the contents (corpuses, glossaries, tools) of the available resources within an academic community, but to allow any researcher to be informed of the existence of such contents, to get a relatively precise idea of them and to be informed of the methods of access. In the case of resources which are widely used or which do not raise particular problems when accessed, SILFIDE will be able to propose the automatic transfer of the corresponding data."

"Considering the different targets Silfide is aiming at, it is clear that the project could not have taken place without the existence of an underlying framework for the representation of structured document in electronic format. As a matter of fact it has immediatly been obvious to us, even before starting any kind of work, that we should follow the steps of the Text Encoding Initiative rather than devise our own scheme, even if on some occasions, we have had to simplify, if not to misuse the actual guidelines provided by the TEI. What we want to show here is how a large and multifarious project as ours may consider the TEI from a great many points of view depending on a) the different kinds of data that are to be represented and) the different usages that are contemplated upon these. [...] putting the TEI into practice clearly shows that rather than aiming at being a 'standard', the TEI is the occasion to share practices - and sometime a kind of philosophyin the encoding of textual documents. Above all, the TEI will actually prove valuable when we will really be able to exchange both data and tools (such as Silfide) between us without having to revise either of them. As a matter of fact, this is something which is not yet able to reach, but can be achieved by even more collaborative work between the sites which are concerned by digital resources."

See the main database entry for Project Silfide (Serveur Interactif pour la Langue Française, son Identité, sa Diffusion, et son Étude).

The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/romary.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.

[CR: 19950804]

Romary, Laurent; Mehl, Nathalie; Woolls, David. "The Lingua Parallel Concordancing Project: Managing multilingual texts for educational purposes." Electronic Texts and the Text Encoding Initiative [Special Issue] = TEXT Technology: The Journal of Computer Text Processing 5/3 (Autumn, 1995) 204-220. ISSN: 1053-900X. Authors' affiliation: [Romary and Mehl] CNRS/INRIA Research Center at Nancy; [Woolls] University of Birmingham Computing Centre. Contact: Laurent.Romary@loria.fr.

"Romary's paper is representative of the extraordinary spurt of activity in the European research community concerned with language engineering, the application of information technology to the age old problems of language understanding and learning, felt perhaps most keenly in the European Union, with its nine official languages. The research described by Romary and his colleagues is typical of many projects now underway, in which the building of multilingual corpora and ancillary software are seen as essential new tools in language pedagogy. The promise of long term portability and re-usability offered by the use of standards such as the TEI is naturally seen as central to all such endeavours." [from the issue Introduction, by Lou Burnard]

See the main entry for this special issue of TEXT Technology dedicated to the TEI, edited by Lou Burnard.

Rooney, Paula. "Versatile Electronic Data Delivery Fuels Corporate Interest in SGML." PC Week ?/? (March 1 1993) 37, 40.

Summarizes the reasons for SGML momentum. References a marketing study by InterConsult, predicting that SGML technologies will be worth at least $500 million in 1995.

[CR: 19970331]

Roposh, Cindy; Schoenrock, Hanna. "Developing Single-Source Documentation for Multiple Formats." Pages 205-212 in Conference Proceedings, SIGDOC '96. The 14th Annual International Conference on Computer Documentation. ["Marshalling New Technological Forces: Building a Corporate, Academic, and User-Oriented Triangle"]. SIGDOC '96: 14th Annual International Conference. Research Triangle Park, North Carolina, US. October 20-23, 1996. Sponsored by the Association for Computing Machinery Special Interest Group on Documentation (SIGDOC). New York, NY: Association for Computing Machinery, 1996. ISBN: 0-89-791-799-5. Authors' affiliation: Publications Division, SAS Institute, Cary, NC.

The paper documents the experience gained at the SAS Institute in developing a strategy to use SGML as a single-source format for producing online and hardcopy technical manuals. All of the software products selected for use in the new composition system were required to be 'SGML-compliant'. The article describes the SGML tools used in this endeavor.

Several other articles in this proceedings volume are germane to SGML: Tom Banfalvi, et al., "Manufacturing Documentation in the Virtual Warehouse"; Betsy Brown, et al., "From Hardcopy to Online: Changes to the Editor's Role and Processes"; Paul Beam and Peter Goldsworthy, "Technical Writing on the Web-Distributed SGML-Based Learning"; Stephanie Copp, "Working with Academe"; Paul Prescod, "Multiple Media Publishing in SGML"; Lin-Ju Yeh, et al., "SSQL: a Semi-Structured Query Language for SGML Document Retrievals"; Dee Stribling, et al., "A Real World Conversion to SGML".

[CR: 19961226]

Rosenthal, David; Sokolowski, Rachael. "Using SGML For Voice-Enabled, Structured Medical Reporting." Pages 191-196 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: [Rosenthal]: Kurzweil Applied Intelligence, 411 Waverley Oaks Road, Waltham, Massachusetts 02154, USA; Tel: +1 893-5151 ext. 309; Email: daver@kurzweil.com; [Sokolowski]: Kurzweil Applied Intelligence.

Abstract: "Producing and storing medical documentation is time-consuming and costly. Studies have shown that physicians spend upwards of 35% of their time on documentation, and the documents produced yearly number in the billions. Very little of this generated medical data is recorded in a format that is computer-readable. The result is a combination of high administrative costs and the inability of clinical decision-makers to use most of the data generated during the patient care process.

Kurzweil Applied Intelligence has received a research grant from the National Institute of Standards and Technology (NIST) to build a prototype system which will use large-vocabulary voice-recognition technology to produce SGML-structured medical reports.

SGML addresses the need for a structured reporting framework for medical applications because:

SGML enables the preservation of context and structure in medical reporting, making the information gathered more useful and accessible. The current lack of a widely accepted standard format for medical reporting has limited the benefits of computerized patient records.
The open systems approach of SGML facilitates communication among and porting between diverse platforms.
The SGML standard supports a wide variety of information types in addition to text; images, video and audio clips can be incorporated into the medical report. SGML formats can be extended to meet the industry's changing requirements.

Many of the issues surrounding the use of SGML in this project will be familiar to the general SGML community, particularly the advantages of tagging and structuring the data. In other respects, however, the project raises some new and interesting problems, such as the dynamic creation of SGML documents from a voice-controlled application. Another important issue is the lack of any standard DTD for clinical data. We have developed DTDs for patient demographic information, prescriptions, and primary care reports, and we are actively involved in the HL7 SGML Initiative, which is an effort to standardize healthcare DTDs."

Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.

[CR: 19971018]

Rostek, Lothar. "Marking up in TATOE and exporting to SGML - Rule development for identifying NITF categories." Pages 140 - 142 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Author's affiliation: GMD - Integrated Publication and Information Systems Institute, Email: rostek@darmstadt.gmd.de.

[Extract:] "This work belongs to a project which aims at a real-world application and due to this reason the categories of the SGML-based standard News Industry Text Format (NITF) have been applied. NITF was developed by the International Press Telecommunication Council (IPTC) for the exchange of news messages. An interesting feature of the NITF standard is that besides structural mark up, it allows also semantic encoding. Our aim in this project has been twofold: first, to develop an algorithm for the automatic identification of those phrases in new incoming messages which contain semantic information, e.g., names of persons, organizations, places, weekdays etc. Second, to mark up the messages according to the respective NITF categories and export the marked up messages as an NITF conformant SGML text. The degree of correctness of the automatic marked up texts is decisive for the applicability of this method for the daily practice."

Abstract available online in HTML format: "Marking up in TATOE and exporting to SGML - Rule development for identifying NITF categories", by Lothar Rostek; [archive copy]

Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.

[CR: 19980428]

Rubinsky, Yuri; Maloney, Murray. SGML on the Web: Small Steps Beyond HTML. Charles F. Goldfarb Series On Open Information Management. Upper Saddle River, NJ: Prentice Hall PTR [Professional Technical Reference], 1997. Extent: 528 pages, CDROM (with SoftQuad Panorama Pro 2.0 and other software). ISBN: Paper (0-13-519984-0). Authors' affiliation [at the time of writing]: SoftQuad, Inc.; Maloney is currently [19970717] Technical Marketing Director for GRIF S.A.

The book was completed on behalf of the late Yuri Rubinsky by Murray Maloney, also [formerly] of SoftQuad.

SGML on the Web is described as an "Introduction to SGML for HTML Users," and it should serve the reader well for this purpose. Appendix A contains a revised and corrected version of SoftQuad's popular SGML Primer; the book also has an excellent glossary, and a list of Yuri's publications. The accompanying CDROM contains a complete copy of the SoftQuad Panorama Pro 2.0 SGML browser -- which should make the book highly attractive to buyers for this reason alone.

The following online materials will assist potential users in evaluating SGML on the Web for their needs: (1) Table of Contents; (2) Volume Preface written by Yuri; (3) the publisher's description; (4) acknowledgements; (5) summary of the CDROM contents. See also the positive review of SGML on the Web from Eric Freese, published in <TAG> 10/7 (July 1997) 10-11. The book was reviewed in XML Files: The XML Magazine; see also the review on the XMLxperts site.

Publisher's summary: "Go beyond the limitations of HTML, with the full-featured, worldwide standard markup language SGML. With this book, Web developers can learn how to gain many of the advantages of SGML with as little complexity as possible. The book requires no previous SGML knowledge: it builds on commonplace HTML knowledge. It introduces the idea of SGML on the World Wide Web and highlights the relationships between SGML and HTML. SGML is introduced with the simplest applications, and readers are walked through the creation of SGML applications. Twenty-two examples are presented, showing SGML at work in library science, document management, editorial review and change management. The book shows how to create SGML documents with the minimum number of element structures and types required, and with no formal document analysis. Part of the Charles F. Goldfarb Series. For anyone involved in electronic document generation or publishing, especially those familiar with HTML. . .Its features: (1) Learn how to use simple SGML markup to meet simple needs, and more complicated markup only when the situation calls for it; (2) The perfect introduction to SGML for HTML users and Webmasters; (3) Includes SoftQuad Panorama, a Web browser that supports SGML -- along with all the sample document descriptions and instances in the book."

Harvey Bingham writes: "SGML on the Web: Small Steps Beyond HTML is Yuri's last book, completed with understanding and care by Murray Maloney. The book conveys in forty small steps the path to becoming comfortable with the generic markup available in the HTML application of SGML. Other applications of SGML on the Web can build on the firmness and stability from the ten-year history of successful SGML application implementations. As a language for designing applications, SGML allows you to define the new information structures you need. Its support for invention and delight is limited only by your imagination. We benefit from the insights of Rubinsky and Maloney, two visionaries and doers, concerning the Web, SGML, and the joy of making insight accessible." [see the context for the quotation via the Yuri Rubinsky Insight Foundation Web site.]

Publisher's marketing blurb: "KEY BENEFIT: Shows Web developers how to gain many of the advantages of SGML with as little complexity as possible. Requires no previous knowledge of SGML -- builds on commonplace HTML knowledge. KEY TOPICS: Introduces the idea of SGML on the World Wide Web and highlights the relationships between SGML and HTML. Introduces SGML with the simplest applications, and walks readers through the creation of SGML applications. Includes 22 examples that show SGML at work in library science, document management, editorial review and change management, one step at a time. Shows how to create SGML documents with the minimum number of element structures and types required, and with no formal document analysis. Part of the Charles F. Goldfarb Series. MARKET: Anyone involved in electronic document generation or publishing, especially those familiar with HTML." [Prentice Hall source]. For other information, see the description on the "Prentice-Hall SGML Series" Web page, or contact Mark Taub at Prentice-Hall.

[CR: 19960202]

Rubinsky, Yuri. "Can Inanimate Objects Have Intentions? This Column Does, and So Can a DTD." <TAG>: The SGML Newsletter 10 (July 1989) 11. ISSN: 1067-9197. Author's affiliation: SoftQuad, Inc.

Rubinsky discusses the DTD created by Dominique Vignaud on behalf of two French associations (Syndicat national de l'édition, and Cercle de la Librairie). This SNE/Cercle DTD uses only 60 element definitions (as opposed to about 220 in the AAP DTD), so that it serves as a model rather than a DTD for blind interchange. The AAP DTD, for example, contains elements for the tagging of a Table of Contents; the French DTD assumes that information necessary to construct a TOC will be available within the document, and that the TOC would be generated automatically. Vignaud's "intention" in the DTD design is that it not limit how publishing houses use it as a model: it is thus a "high-level" and "heavily intellectual" DTD. [See the bibliographic reference for Vignaud's book.]

[CR: 19960202]

Rubinsky, Yuri. "Comments on an SGML Application for Hypermedia and Multi-Media Interchange." SGML Users' Group Newsletter 15 (January 1990) 16-17. ISSN: 1067-9197. Author's affiliation: SoftQuad, Inc.

Rubinsky reports on a one-day conference sponsored by the GCA on interchange for hypermedia documents. The review included discussion of the July 21 draft of X3V1.8M/SD-7: Journal of Development for a Standard Music Description Language (SMDL).

Rubinsky, Yuri. "Comments on an SGML Application for Hyper- and Multi-Media Interchange: Informal Report from the GCA Hypertext/Hypermedia Standards Forum." <TAG> 11 (October 1989) 5-6.

[CR: 19960127]

[by Mark A Crook]. "Yuri Rubinsky Explores Use of SGML to Generate Text for Sight-impaired." OCLC Newsletter 212 (November/December 1994) 16-17. Author's affiliation: Sr. Consulting Systems Analyst, OCLC Office of Research. Email: mark_crook@oclc.org.

"SGML and the ICADD Methodology: Since December 1991, Dr. Rubinsky has served as a member ICADD, which is developing strategies and techniques for the use of SGML to generate Braille, large-print and voice-synthesized texts. ICADD's work began with three assumptions: 1) the markup technique must be straightforward and simple; 2) only one set of markup--if a second markup is required for nonvisual encoding, it will likely not happen; 3) archival documents must always contain the richest possible markup, thereby further facilitating access to the document. Given those assumptions, the ICADD technical subcommittee's goals were: 1) to make the transform process as automatic as possible; 2) to keep the technique simple; and 3) to reduce the costs involved in making texts available for the print-disabled community. The committee believes it is possible to have creators of DTDs build in the relevant attributes to allow for Braille, large-print, and voice-synthesis from the files encoded for other purposes, as a by-product." [extracted]

This article is a report by Mark Crook based upon an address by Yuri Rubinsky as part of OCLC's "Distinguished Seminar Series", held at OCLC on October 11, 1994. This entry duplicates the entry in the C-E bibliography section.

The article is available online: from the OCLC WWW server, http://www.oclc.org/oclc/new/n212/research.htm [mirror copy, partially linked]. Note that the article has a linked page with a photograph of Yuri Rubinsky. For more information on ICADD, see the main entry in this database.

Rubinsky, Yuri. "Copy of Letter to NIST [on Conformance Testing Program] Dated January 20, 1994." SGML Users' Group Bulletin Newsletter 26 (February 1994) 4-6. ISSN: 0952-8008.

The (open) letter is written on behalf of the SGML Open Consortium by SGML Open President, Yuri Rubinsky. The letter includes three recommendations on technical issues that would be more in the interests of SGML users as represented within GCA, SGML Open, and the SGML Users' Group. This letter is a supporting document to the article of Pamela Gennusa on NIST's proposed Conformance Testing program.

Rubinsky, Yuri. Description of the ICADD Mechanism. Technical Report, published on the Internet. See Yuri Rubinsky, "Description of the ICADD Mechanism" [65K plain text document; mirror copy]. Also available in PostScript format from the UCLA GOPHER server [mirror copy]. Date: 1994 [?]. Author's address: Yuri Rubinsky, SoftQuad Inc.; 56 Aberfoyle Cres. Suite 810, Toronto M5X 2W5 CANADA; Tel: +1 416 239-4801 FAX: +1 416 239-7105 Internet: yuri@sq.com.

"Large repositories of SGML-encoded material are beginning to be built up: Some 50 publishers are using (for one or more projects) the ANSI standard markup scheme for books and journals commonly called the AAP DTD. An updated version of this markup augmented with the Accessible Document techniques described in this booklet has been put forward as ISO 12083. Some 40 projects around the world are marking up millions of pages of literary materials using SGML Document Type Definitions created in accordance with the guidelines of the Text Encoding Initiative (TEI). Aerospace, defense, automotive, workstation computing, semiconductor and telecommunications industries all have DTDs supporting their requirements and new industries are exploring the role which SGML will play in their documentation. When encoded using the SGML Document Access (or SDA) techniques described in this booklet, all of the textual materials created by the associations, industries, corporations, institutions and individuals will be readily available to visually impaired readers, for Braille, for the generation of navigable voice-synthesized texts and for the publication of large print books and journals." [from the Foreword]

See the main ICADD SGML entry for further information on this important effort.

[CR: 19960202]

Rubinsky, Yuri. "Implementation Development and Surprise." SGML Users' Group Bulletin 2/2 (1987) 113-115. ISSN: 0269-2538. Author's affiliation: SoftQuad, Inc.

"The thesis of this paper is that the adventure of creating an SGML implementation was consistently rewarding and full of surprise. Ant that surprise was in discovering how supportive, how how generous SGML was, how easily it became a foundation for tasks like word-processing tasks but much richer. A piece of software that knows the structure of a document gives a writer considerable flexibility and stength in working with that document, more than ever imagined as the specifications for the program were created."

The article is based on a presentation given at MarkUp '87, Torremolinos.

[CR: 19960312]

Rubinsky, Yuri. "Life Beyond Cross-Roads." <TAG> 9/2 (February 1996) 5. ISSN: 1067-9197.

This extract from a keynote address delivered by Yuri at SGML '95 is printed in a special issue of <TAG> dedicated to the memory of Yuri Rubinsky. The text was supplied by GCA. See also the main eulogy collection.

[CR: 19960202]

Rubinsky, Yuri. "In Praise of Shelf Life and SGML." InterConsult's Corporate Publishing Newsletter [vol. ?] (October 19 1987) [pp. ]. Author's affiliation: SoftQuad, Inc.

The article is a guest editorial by the president of SoftQuad.

[CR: 19960202]

Rubinsky, Yuri. "The Screen is Deeper than the Page." InterConsult's Corporate Publishing Newsletter [vol. ?] (April 25 1989) [pages: ?]. Author's affiliation: SoftQuad, Inc.

The article is a guest editorial by the president of SoftQuad.

Rubinsky, Yuri. "SGML to Braille, Large Print, and Audio." In Part 4: Distinguished Seminar Series Annual Review of OCLC Research, 1994. Dublin, OH: OCLC Online Computer Library Center, 1995. Approximately 6 pages, 1 reference. Author's affiliation: SoftQuad, Inc.

Summary: ". . .Yuri Rubinsky and his International Committee for Accessible Document Design (ICADD) colleagues have devised a method of electronic document markup and transformation based on the Standard Generalized Markup Language (SGML) which leverages the existing document structure and enables rapid production of alternative forms of a text for the visually impaired. This article discusses Rubinsky's explanation of the process delivered as part of the Distinguished Seminar Series at OCLC on October 11, 1994. We will briefly cover the background of Rubinsky's work, discuss SGML and the automated transformation process, and suggest what this process might imply for patrons, libraries, and librarians." [from the document introduction]

The document is available via Internet on the OCLC WWW server [or in mirror copy, June 1995, text only].

[CR: 19960202]

Rubinsky, Yuri. "SGML in the Realm of the Desktop. GCA Conference Highlights Variety of Possibilities for SGML, Including Dictionaries." <TAG>: The SGML Newsletter 1/8 (January 1989) 12-13. Author's affiliation: SoftQuad, Inc.

Rubinsky surveys the highlights of "Standards and the Desktop," the conference sponsored by the GCA and chaired by Rubinsky in November 1988. Topics of interest included: use of SGML for encoding dictionaries (Robert Amsler, on behalf of UWaterloo NOED); Perseus Project (Elli Mylonas); Text Encoding Initiative (Michael Sperberg-McQueen).

[CR: 19970206]

Rubinsky, Yuri. "SGML: The Blueprint for Future Design." Focus Online 4/5 (1995 (?)) [??]. Author's affiliation: SoftQuad, Inc.

"HTML is an instance (or subset) of a much broader ISO standard called the Standard Generalized Markup Language. SGML has been adopted as a standard by government, aerospace, automotive, magazine and book publishing, railroad, and semi-conductor manufacturing industries for the re-use of information in a variety of ways across computer platforms."

[Focus Online is "the executive management magazine for Canada's public service", published by The Government Source (Canada) and Health Canada.]

Available online: http://governmentsource.com/rubsoftquad.html; [mirror copy]

[CR: 19950911]

Rubinsky, Yuri. "SGML Year in Review [1991]." SGML Users' Group Newsletter 21 (December, 1991) 3-6.

This article contains one of the annual SGML summaries [also published in <TAG>]; for others, see the special listing.

[CR: 19960202]

Rubinsky, Yuri. "SGML Year in Review [1990]." SGML Users' Group Newsletter 18 (November, 1990) 5-6.

This article contains the first [?] of the annual SGML Year in Review summaries prepared by Yuri Rubinsky. The review was delivered at SGML '90 as part of the GCA-sponsored conference "Building Architectures of Information." For other reviews, see the special listing in the SGML/XML Web Page.

[CR: 19950911]

Rubinsky, Yuri. "SGML Year in Review [1991]." <TAG> 20 (December, 1991) 1-5.

This article contains one of the annual SGML summaries [also published in the SGML Users' Group Newsletter]; for others, see the special listing.

[CR: 19960121]

Rubinsky, Yuri. "Electronic Texts The Day After Tomorrow." In SCHOLARLY PUBLISHING ON THE ELECTRONIC NETWORKS. The New Generation: Visions and Opportunities in Not-for-Profit Publishing. Proceedings of the Second Symposium. Second Symposium on Scholarly Publishing on the Electronic Networks, The Washington Vista Hotel, Washington, DC. December 5-9, 1992. Sponsored by the Association of Research Libraries and Association of American University Presses, in collaboration with The American Mathematical Society and The National Science Foundation. Edited by Ann Okerson. Washington, DC: Association of Research Libraries, Office of Scientific & Academic Publishing, 1993. ISBN: 0-918006-61-9. Author's affiliation: President, SoftQuad Inc.

". . .That, incidentally, is what SGML is all about. Several of the speakers over the next few days will describe projects built on a foundation of SGML. The reason I haven't mentioned SGML in this whole talk is because it's there throughout. That's the common storage format -- and language to build new formats -- that allows the Braille simulcast, that will support the next generation of custom publishers, that can subsume and extend MARC. That provides the necessary hub for multimedia and content beyond text. That marks the highlights and hierarchies for skimming and fast forward displaying. And that, by co-incidence, lets you optimize the long-term value of every keystroke you own today, whether you're a publisher, a librarian, a conference organizer, a teacher, a learner or a reader.

"Without information storage structures, retrieval and inter-operable access at the levels we're talking about here are meaningless. All the technologies I've described today really become exhilarating only when you smoosh them all together. If SGML didn't exist, the foreword momentum of so many exciting possibilities would force us to invent it." [extracted]

Available on the Internet in HTML format: http://arl.cni.org/scomm/symp2/Rubinsky.html [mirror copy].

[CR: 19960202]

Rubinsky, Yuri; Lehman, Philip. "Markup Language Creates Blueprint for Style, Format." Government Computer News 7/13 (June 24 1988) 73-74.

[CR: 19960202]

Rubinsky, Yuri. "The SGML Year in Review [Being the Text of a Speech Given at the GCA's SGML '92 Conference]." SGML Users' Group Newsletter 23 (November 1992) 2-8.

Rubinsky summarizes the SGML highlights for 1992, recognizing the help of about 30 other people who supplied information for the report. See the full text of the presentation, and the special database section.

Rubinsky, Yuri; Usdin, Tommie. "The SGML Year in Review - 1993." SGML Users' Group Bulletin Newsletter 26 (February 1994) 8-15. ISSN: 0952-8008.

The report is a major contribution covering premier SGML events for 1993. It is in the series inaugurated by Yuri Rubinsky in about 1990; see the dedicated page. An online version of the article is available from Hal Software Systems, HTMLized by Mark A. Gaither [markg@hal.com]; a (partially linked) version in mirror copy is available here.

Rubinsky, Yuri. "Standards for Hypertext Interchange." SGML Users' Group Newsletter 15 (January 1990) 14-15.

Rubinsky, Yuri. "Standards for Hypertext Interchange Need Not Come out of Thin Air." <TAG> 11 (October 1989) 4-5.

Rubinsky, Yuri; Usdin, B. Tommie. "The [1994 SGML] Year in Review." <TAG> 7/12 (December 1994) 1-4. ISSN: 1067-9197. Authors' affiliation: Yuri Rubinsky is President of SoftQuad, Inc. in Toronto, Canada; Tommie Usdin is with ATLIS Consulting, Rockville, MD.

The article provides an overview of SGML highlights for 1994. It is the annual SGML review article, authored by Yuri Rubinsky since (about) 1989.

[CR: 19951113]

Ruggles, Clive L. N.; Andrews, Derek (editors). Formal Methods in Standards: A Report from the BCS [British Computer Society Formal Methods in Standards Working Group] Working Group. London/Berlin/Paris: Springer-Verlag, 1990. Extent: xi + 135 pages; bibliography (pp. 79-92), glossary (pp. 100-135). ISBN: 0387195777 (New York); 3-540-19577-7 (Berlin).

The volume covers ODA, SGML, and other formal notations for information standards.

[CR: 19950716]

Rutledge, Lloyd; Buford, John F. HyTime: A Standard for Hypermedia Document Systems. New York, NY: Springer-Verlag, [forthcoming August] 1995. Extent: approximately 350 pages, 65 illustrations. ISBN: 3-540-58260-6. Authors' affiliation: Lloyd Rutledge and John F. Buford are both affiliated with University of Massachusetts, Lowell.

Abstract: "HyTime is a hypermedia time-based structuring language designed to represent the complex document types involved in multimedia computing. It is an extension of SGML and was standardized by ISO in 1992. HyTime not only provides SGML users with a bridge to hypermedia documentation but also offers the commercial world a user friendly multimedia document model. Researchers and developers working in multimedia and electronic documentation will find a practical and detailed introduction to HyTime presented in this book. It will also be of use to those interested in evaluating HyTime for end use and adoption." [publisher's description]

Note: the Interactive Media Group at the Center for Productivity Enhancement, University of Massachusetts (Lowell), has done work on a HyTime engine called "HyOctane." Contact: J. Buford, email buford@uml.edu.

[CR: 19961126]

Rutledge, Lloyd; Buford, John F.; Rutledge, John L. "Modeling Techniques for HyTime." Pages 67-81 (with 8 references) in Multimedia Modeling. Towards Information Superhighway. Proceedings of International Conference on Multimedia Modeling. The International Conference on Multimedia Modeling, Singapore. November 14-17, 1995. Edited by Tat-Seng Chua, Hung Keng Pung, and Tosiyasu L. Kunii. Singapore / River Edge, NJ: World Scientific Publishers, 1995. ISBN: 9810225024. Authors' affiliation: Department of Computer Science, University of Massachusetts, Lowell, MA, USA, and Centrum voor Wiskunde en Informatica (CWI) in Amsterdam, The Netherlands.

"Abstract: Hypermedia/time-based structuring language (HyTime) defines constructs for representing general hypermedia document concepts. Building documents with HyTime can be difficult because it uses many constructs and has an intricate relationship with its parent language Standard Generalized Markup Language (SGML). Further, HyTime inherits from SGML the establishment of document models as well as the document instances that follow them. In this paper we introduce some techniques for modeling how HyTime and SGML constructs contribute to the structure of documents and document models. We also introduce a defined set of "meta-HyTime constructs", which correspond to the semantic concepts HyTime constructs represent. Diagramming notations are provided in conjunction with these techniques as a tool for aiding document developers in understanding and communicating their use of HyTime."

Available online: "Modeling Techniques for HyTime," by Lloyd Rutledge, John F. Buford, and John L. Rutledge; [HTML mirror copy, text only]; or in PostScript format [mirror copy].

[CR: 19971124]

Rutledge, Lloyd; van Ossenbruggen, Jacco; Hardman, Lynda; Bulterman, Dick C. A. "A Framework for Generating Adaptable Hypermedia Documents." Pages [?] in ACM Multimedia 97 - Electronic Proceedings. ACM MULTIMEDIA 97. The Fifth ACM International Multimedia Conference. Crowne Plaza Hotel, Seattle, USA. November 8-14, 1997. New York, NY: ACM, 1997. Authors' affiliation: [Rutledge]: CWI (Centrum voor Wiskunde en Informatica), P.O. Box 94079, 1090 GB Amsterdam, The Netherlands; Email: lloyd@cwi.nl; WWW: http://www.cwi.nl/~lloyd/; [van Ossenbruggen]: Vrije Universiteit, De Boelelaan 1081a, 1081 HV Amsterdam, Email: jrvosse@cs.vu.nl; WWW: http://www.cs.vu.nl/~jrvosse/; [Hardman]: Email: lynda@cwi.nl; WWW: http://www.cwi.nl/~lynda/; [Bulterman]: DCLAB, CWI.

Abstract: "Being able to author a hypermedia document once for presentation under a wide variety of potential circumstances requires that it be stored in a manner that is adaptable to these circumstances. Since the nature of these circumstances is not always known at authoring time, specifying how a document adapts to them must be a process that can be performed separately from its original authoring. These distinctions include the porting of the document to different platforms and formats and the adapting of the document's presentation to suit the needs of the user and of the current state of the presentation environment. In this paper we discuss extensions to our CMIF hypermedia authoring and presentation environment that provide adaptability through this distinction between authoring and presentation specification. This extension includes the use of HyTime for document representation and of DSSSL for presentation specification. We also discuss the Berlage architecture, our extension to HyTime that specifies the encoding of certain hypermedia concepts useful for presentation specification.

"This paper presents the HyTime encoding for CMIF. It also describes the extension to the CMIF environment for processing this encoding and the presentation instructions that accompany it. This translation provides an empirical test of HyTime's ability to represent the hypermedia structure of existing environments. It also provides an opportunity to explore the issues behind processing this generic structure for actual presentation on interactive multimedia environments. The HyTime encoding of CMIF uses our Berlage architecture, an SGML-defined extension of HyTime that encodes how a document's HyTime-defined structure is mapped to certain aspects of presentation processing."

Presented at ACM MULTIMEDIA 97, The Fifth ACM International Multimedia Conference. The document is available online in HTML format: http://www.cs.vu.nl/~dejavu/papers/ACMMM97/index.html; [local archive copy].

