[Mirror copy of document from ftp://naggum.no/pub/SGML94/; poster delivered by Erik Naggum at SGML '94 at Tyson's Corner, VA]

How to Make Your SGML Documents Survive You

One excellent way to make sure your documents last at least as long as they are valuable is to represent the information in SGML, because as a language it enables data independence and all that good stuff. Like all languages worth knowing, however, it allows a large number of forms of expression. Some of them are well suited to immediate concerns, while restraint becomes more important for longer-range goals. This is a list of such restraining conventions that you may want to consider for your long-term information investment.

Inclusion Exceptions

Don't use them. If elements are independent of their position, collect them someplace else or let them be ordinary elements. If elements are allowed "everywhere" in the document, make a parameter entity that allows them with PCDATA. If they are allowed outside of text, explicitly include them in the content models. Dynamic content models may be good for prototyping, but not for longevity.

Declared Content (CDATA and RCDATA)

These "protect" & and < from markup recognition, but do so by making parsing much more complicated: it is not apparent from looking at an element that it is special. Use entities or CDATA marked sections instead. The latter provide easily identifiable special treatment.

Potential Markup as Data Characters

& and < are data when followed by digits or white space, and that's the only time they are safe. Other characters may change their meaning when processed by a different system. Using as data when SHORTTAG is not used is a bad idea.

Non-obfuscatory Minimization

* if the start-tag is omitted, omit the end-tag, too.
* if the start-tag is "net-enabling", end with the net (/).
* don't use net-enabling start-tags on empty elements.
* don't mix empty start- or end-tags with omitted tags.
* except with short references, don't let elements cross entity boundaries.
* don't omit the refc (;) in entity references.
* don't omit the tagc (>) in sequences of tags.

keep lines meaningful

* don't use line breaks before the tagc (>).
* don't use the line-end-suppressing entity reference.
* don't use an entity reference for "new line".
* don't use &#RS; and &#RE; in content.
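
To illustrate the advice under Inclusion Exceptions, here is a minimal sketch of a DTD fragment that lists an "everywhere" element in a parameter entity next to #PCDATA instead of using an inclusion exception. The element names (chapter, title, para, emph, footnote) and the parameter entity name (text) are made up for this example; they do not come from the poster, and a real DTD would differ.

```sgml
<!-- Hypothetical element and entity names, for illustration only. -->

<!-- Discouraged form, shown only for comparison: the exception lets
     footnote turn up at any depth inside chapter.
     <!ELEMENT chapter - - (title, para+) +(footnote)>
-->

<!-- Preferred form: one parameter entity names everything that may
     occur in running text, footnote included. -->
<!ENTITY % text "#PCDATA | emph | footnote">

<!ELEMENT chapter  - - (title, para+)>
<!ELEMENT title    - - (%text;)*>
<!ELEMENT para     - - (%text;)*>
<!ELEMENT emph     - - (#PCDATA)>
<!ELEMENT footnote - - (#PCDATA | emph)*>
```

With this form, a reader of the para declaration can see that footnote is allowed there without having to check the ancestors for exceptions, and every content model stays the same no matter where the element occurs.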