DITA Proposed Feature 14 - basic

A specialized topic for publishing basic glossary entries.

Longer description

The problem: DITA users need to publish glossaries, especially as part of books.

The solution: Add a specialized topic for reusable glossary entries.

Scope

Minor - new topic type

Use Case

In DITA 1.1, the glossary topic meets only the core publishing requirements for books and online deliverables:

Publishing a glossary listing in the back of a book
The glossary definitions can be sorted and grouped by term prior to formatting.
Offering inline definitions in a help system or website
The glossary definitions can be included in HTML pages as popup windows or tooltips for mentions of unfamiliar terms.
Guiding authors during content creation
The glossary provides a controlled vocabulary to help authors create content that employs consistent terminology for a set of standard concepts.

Each sense of a term is defined in a separate topic. A formatting process can collate these definitional topics based on the term and indent the defined senses under the term.
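
As an illustration only, the collation step described above might be sketched as an XSLT 1.0 stylesheet. This is not part of the proposal: it assumes that all glossentry topics are available in one input document, groups them by the whitespace-normalized text of glossterm (Muenchian grouping), and writes a simple HTML-style definition list. The key name by-term and the dl/dt/dd output markup are assumptions chosen for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every glossentry by the normalized text of its term -->
  <xsl:key name="by-term" match="glossentry"
           use="normalize-space(glossterm)"/>

  <xsl:template match="/">
    <dl>
      <!-- Select only the first entry for each distinct term -->
      <xsl:for-each select="//glossentry[generate-id() =
          generate-id(key('by-term', normalize-space(glossterm))[1])]">
        <!-- Sort the distinct terms before formatting -->
        <xsl:sort select="normalize-space(glossterm)"/>
        <dt><xsl:value-of select="normalize-space(glossterm)"/></dt>
        <!-- List every sense defined for this term under the term -->
        <xsl:for-each select="key('by-term', normalize-space(glossterm))">
          <dd><xsl:value-of select="normalize-space(glossdef)"/></dd>
        </xsl:for-each>
      </xsl:for-each>
    </dl>
  </xsl:template>
</xsl:stylesheet>
```

Because the grouping key is the term text itself, running the same stylesheet over translated entries collates by the translated terms.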

Putting each glossary definition in a separate topic also suits translation, because a single term in one language may correspond to several different terms in another. Glossary definitions should therefore be collated based on the translated terms rather than on the assumption that a term has the same set of definitions in every language.

These glossary definition topics can be provided within a single ditabase file for the simple case or in separate files. The latter approach facilitates reuse because deliverables can assemble the glossary for the terminology used in content by selecting from a pool of available glossary definitions.
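
The two packaging options might look like the following sketch. The file names, ids, and definition wording are hypothetical, and DOCTYPE declarations are omitted for brevity. The first listing assumes the simple case: a single ditabase file that holds one topic per sense, here two senses of the term "element".

```xml
<!-- glossary.dita: hypothetical single-file glossary -->
<dita>
    <glossentry id="element-chemistry">
        <glossterm>element</glossterm>
        <glossdef>A substance that cannot be broken down into simpler substances by chemical means.</glossdef>
    </glossentry>
    <glossentry id="element-xml">
        <glossterm>element</glossterm>
        <glossdef>A delimited unit of hierarchically structured content in an XML document.</glossdef>
    </glossentry>
</dita>
```

For the reuse case, each entry is assumed to live in its own file, and a deliverable's map selects only the entries it needs from the pool:

```xml
<!-- book-glossary.ditamap: hypothetical map that assembles a glossary -->
<map>
    <topicref href="glossary/element-xml.dita"/>
    <topicref href="glossary/ddl.dita"/>
</map>
```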

Technical Requirements

The relationship between concepts and terms is complex. A term can have many meanings. For instance, "element" can have a chemical, programming, or XML sense. Conversely, a concept can have many labels. For instance, an XML delimiter for hierarchically structured content could be called an "element" or a "tag". A strict formal model of the relationship between concepts and terms would allow many-to-many relationships between the two.

A content creator, however, usually wants a one-to-one mapping between key concepts and terms. That is, within an information set, most content creators would like each key concept to have a single preferred label and each key term to have a standard, unambiguous meaning.

Thus, the glossary specialization should focus on the common, simple case where a single glossary entry specifies both the concept and term. In DITA 1.2, the specialization should scale to more complex cases where the concept and term need to be defined separately and associated by reference.

The simplest possible glossary entry might resemble the following:

<glossentry id="ddl">
    <glossterm>Data Definition Language</glossterm>
    <glossdef>A language used for defining databases....</glossdef>
</glossentry>

Here is the definition of the glossary topic specialization in XML Schema syntax:

<xs:element name="glossentry" type="glossentry.class"/>
<xs:complexType name="glossentry.class">
    <xs:sequence>
        <xs:group ref="glossterm"/>
        <xs:group ref="glossdef"/>
        <xs:group ref="related-links" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:ID" use="required"/>
    <xs:attribute name="conref" type="xs:string"/>
    <xs:attributeGroup ref="select-atts"/>
    <xs:attribute ref="ditaarch:DITAArchVersion" />
    <xs:attribute name="outputclass" type="xs:string"/>
    <xs:attribute ref="xml:lang"/>
    <xs:attributeGroup ref="global-atts"/>
    <xs:attribute ref="class" default="- topic/topic concept/concept glossentry/glossentry "/>
</xs:complexType>

<xs:element name="glossterm" type="glossterm.class"/>
<xs:complexType name="glossterm.class" mixed="true">
    <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:group ref="title.cnt"/>
    </xs:choice>
    <xs:attribute name="outputclass" type="xs:string"/>
    <xs:attributeGroup ref="id-atts"/>
    <xs:attributeGroup ref="global-atts"/>
    <xs:attribute ref="class" default="- topic/title concept/title glossentry/glossterm "/>
</xs:complexType>

<!-- placeholder - glossdef specializes the outer, section-like element for shortdesc / abstract -->
<xs:element name="glossdef" type="glossdef.class"/>
<xs:complexType name="glossdef.class" mixed="true">
    <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:group ref="title.cnt"/>
    </xs:choice>
    <xs:attribute name="outputclass" type="xs:string"/>
    <xs:attributeGroup ref="id-atts"/>
    <xs:attributeGroup ref="global-atts"/>
    <xs:attribute ref="class" default="- topic/shortdesc concept/shortdesc glossentry/glossdef "/>
</xs:complexType>

Note that glossdef will use the section-like content model for the abstract (which embeds the <shortdesc> preview), as proposed for DITA 1.1.

Here's the definition in DTD syntax:

<!ELEMENT glossentry      ((%glossterm;), (%glossdef;), (%related-links;)?)>
<!ATTLIST glossentry
             id         ID                               #REQUIRED
             conref     CDATA                            #IMPLIED
             %select-atts;
             xml:lang   NMTOKEN                          #IMPLIED
             %arch-atts;
             outputclass 
                        CDATA                            #IMPLIED
             domains    CDATA                "&included-domains;"
             %global-atts;
             class CDATA "- topic/topic concept/concept glossentry/glossentry ">

<!ELEMENT glossterm         (%title.cnt;)*> 
<!ATTLIST glossterm
             %id-atts;
             outputclass 
                        CDATA                            #IMPLIED    >
<!ATTLIST glossterm     %global-atts;  class CDATA "- topic/title concept/title glossentry/glossterm "      >

<!-- placeholder - glossdef specializes the outer, section-like element for shortdesc / abstract -->
<!ELEMENT glossdef     (%title.cnt;)*>
<!ATTLIST glossdef
             %id-atts;
             outputclass 
                        CDATA                            #IMPLIED    >
<!ATTLIST glossdef %global-atts;  class CDATA "- topic/shortdesc concept/shortdesc glossentry/glossdef "  >
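
For comparison with the content model above, here is a sketch of an entry that uses the optional related-links element and the xml:lang attribute. The link target file name is hypothetical, and the definition text is carried over from the earlier example.

```xml
<glossentry id="ddl" xml:lang="en-us">
    <glossterm>Data Definition Language</glossterm>
    <glossdef>A language used for defining databases....</glossdef>
    <related-links>
        <!-- hypothetical target file -->
        <link href="sql-overview.dita"/>
    </related-links>
</glossentry>
```

The order of the child elements matters: both declarations require glossterm first, then glossdef, then at most one related-links.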

Related Proposals

Costs

Benefits

Gives DITA adopters a standard way to publish glossaries, especially as part of books.

Future evolution

The basic glossary provides the foundation for a more complete glossary in DITA 1.2.

Many content publishers will need to distinguish a labelling abbreviation from the full term. In addition, the DITA 1.2 glossary should be able to bind a mention of a glossary term to a sense definition, either through the DITA 1.2 key referencing mechanism or through the matching mechanism for mentions and definitions. Finally, DITA 1.2 should satisfy the requirements of some content publishers for a more complete representation of the terminology and key concepts of an information set. In particular:

Guiding translation
The glossary identifies key terminology for human translators as well as the meaning that the term must retain in translation. In addition, the identification of special terms and the terminology data for those terms provides a dictionary that helps to enable automated translation of mentions of terms and content in the vicinity of such mentions.

Key terminology standards include TMF (Terminological Markup Framework) and TBX (TermBase eXchange).

Indicating subject matter for semantic processing
By defining concepts that might be unfamiliar, the glossary contributes to a formal definition of the subject matter for an information set. Formal definition of subject matter can enable semantic search as well as browsing or linking across content based on its subject matter.

Here is one example (http://tb1.siderean.com:7880/test/test2query3.jsp) of browsing based on different aspects of the subject matter.

Key standards for formal subject definition include Topic Maps and SKOS (Simple Knowledge Organization System).

DITA 1.2 might specialize the <conbody> element as <glossdetail> to meet these requirements.

Time Required