Build on the DITA 1.1 glossary specialization for more complete support of glossary, linguistic, and semantic applications and also to assist in the resolution and handling of abbreviated-form text such as acronyms, general abbreviations, and short forms in source and target text within DITA documents.
DITA 1.1 introduced a simple glossary specialization to meet basic needs for publication as part of a bookmap.
The DITA 1.1 glossary specialization, however, is too simple to support many common glossary applications. For instance, many content publishers need to distinguish an abbreviation from the full term. In addition, a more complete representation of terminology can support processing such as the following:
Key terminology standards include TBX.
Abbreviated forms, such as acronyms, are ubiquitous in technical documentation. Abbreviated forms are a special case of glossary term because they need to be expanded to the full form under some conditions (such as the first encounter within a printed document). In electronic published documents, abbreviated form expansions can also be made available in the form of a hyperlink or 'tool tip' mechanism. In addition, the expanded text of abbreviated forms should be available for automatic inclusion in glossary entries for the publication. This proposal relates to all types of abbreviations, such as acronyms, initialisms, apocope, clipping, elision, syncope, syllabic abbreviation, and portmanteau.
To enable these applications, DITA 1.2 allows additional detail in the glossary definition and additional methods for referring to abbreviated terms that can deliver either abbreviated or expanded forms of the term.
The following requirements apply to glossary terms generally:
In addition, abbreviated forms and their translations require special handling:
For example, the expansion of an abbreviated form in English might consist of the abbreviated form followed by its full form in parentheses. By contrast, the translated version might consist of the expanded form followed by the abbreviated form in parentheses. The translated version might also include the English and the translation.
For example, in a Polish book on Java Web programming, the first reference to JSP may appear as follows:
"JSP (ang. Java Server Pages)"
Another example from a publication concerning OASIS:
"OASIS (ang. Organization for the Advancement of Structured Information Standards - organizacja dla propagowania strukturalnych standardów informacyjnych)"
In the first example, the translator assumes the reader will not require a translation of the English abbreviated form. In the second example, the translator assumes the reader may not understand the English expanded form and therefore adds the translation.
Moderate: adding elements to one specialized topic, providing a map domain for defining keys, and providing an element domain for referring to keys.
The expanded glossentry topic provides the following elements:
| Base | Element | Content | Purpose |
|---|---|---|---|
| <concept> | <glossentry> | | Defines a single reusable term subject within the glossary. The <glossterm>, <glossAbbreviation>, or <glossAcronym> gives the form of the preferred term. |
| <title> | <glossterm> <glossAbbreviation> <glossAcronym> <glossFullForm> <glossShortForm> <glossSurfaceForm> <glossSynonym> | title content for <glossterm> for consistency with DITA 1.1; text, <term>, <keyword>, or <tm> content for the other <title> specializations | Identifies the role of one term with respect to other variant terms. A <glossFullForm> alternate term should be specified only if the preferred term is an abbreviation, acronym, or some other shortened form. The <glossSurfaceForm> can, however, be specified as an expansion of any preferred term. |
| <abstract> | <glossdef> | section content or <shortdesc> | Defines the term subject for users. |
| <conbody> | <glossBody> | | Represents terminology detail. The part of speech applies to all term variants and encourages consistency of the variants with the preferred term. The status indicates the overall status of the term subject. The <glossProperty> and <note> elements are extension points for more detailed terminology definitions (such as the linguistic properties from basic or full TBX). |
| <data> | <glossPartOfSpeech> | value attribute enumerated as noun, properNoun, verb, adjective, or adverb; empty content | Identifies the part of speech for the preferred and alternate terms (using the proposed controlled values mechanism if approved but extensible with validation by processing if not) with a default of noun. By definition alternate terms must have the same part of speech as the preferred term to have a common term subject. The part of speech must be specified when glossary detail is provided. |
| <data> | <glossStatus> | value attribute enumerated as restricted, prohibited, or obsolete; empty content | Identifies the allowable use of a preferred or alternate term (using the proposed controlled values mechanism if approved but extensible with validation by processing if not). If the status isn't specified, the preferred term provides a preferred term and an alternate term provides an allowed term. |
| <data> | <glossProperty> | data content | An extension point for linguistic or semantic properties such as the gender of the term. |
| <note> | <glossUsage> | note content | Any information about the correct usage of the term. |
| <note> | <glossScopeNote> | note content | An explanation of the limitations on the applicability of the term subject. |
| <image> | <glossSymbol> | image content | Identifies a standard icon associated with the term subject. |
| <section> | <glossAlt> | | Identifies a variant term for the preferred term. Any list of alternative terms is, of course, specific to the language and may get longer or shorter during translation. |
| <xref> | <glossAlternateFor> | Empty content | Indicates when a variant term has a relationship to another variant term as well as to the preferred term. |
The following example shows the use of the expanded glossentry topic to define preferred and alternate terms:
```xml
<glossentry id="usbfd">
  <glossterm>USB flash drive</glossterm>
  <glossdef>A small portable drive.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossUsage>Do not use in upper case (as in "USB Flash Drive") so as not to suggest that this is a trademark.</glossUsage>
    <glossAlt>
      <glossAcronym>UFD</glossAcronym>
      <glossUsage>Explain the acronym on first occurrence.</glossUsage>
    </glossAlt>
    <glossAlt id="memoryStick">
      <glossSynonym>memory stick</glossSynonym>
      <glossUsage>This is a colloquial term.</glossUsage>
    </glossAlt>
    <glossAlt>
      <glossAbbreviation>stick</glossAbbreviation>
      <glossStatus value="prohibited"/>
      <glossUsage>This is too colloquial.</glossUsage>
      <glossAlternateFor href="#memoryStick"/>
    </glossAlt>
    <glossAlt>
      <glossAbbreviation>flash</glossAbbreviation>
      <glossStatus value="prohibited"/>
      <glossUsage>This short form is ambiguous.</glossUsage>
    </glossAlt>
  </glossBody>
</glossentry>
```
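As a rough illustration of how a process might read such an entry, the following Python sketch lists each variant term with its effective status and resolves `<glossAlternateFor>` references. It is not part of the proposal: the function names and the dictionary layout are hypothetical, the entry is a shortened copy of the example, the "allowed" default follows the description of `<glossStatus>` in the table, and only same-topic references of the form `#id` are handled.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the USB flash drive entry (usage notes omitted).
USB_ENTRY = """
<glossentry id="usbfd">
  <glossterm>USB flash drive</glossterm>
  <glossdef>A small portable drive.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossAlt>
      <glossAcronym>UFD</glossAcronym>
    </glossAlt>
    <glossAlt id="memoryStick">
      <glossSynonym>memory stick</glossSynonym>
    </glossAlt>
    <glossAlt>
      <glossAbbreviation>stick</glossAbbreviation>
      <glossStatus value="prohibited"/>
      <glossAlternateFor href="#memoryStick"/>
    </glossAlt>
  </glossBody>
</glossentry>
"""

# Title specializations that can carry a variant term inside <glossAlt>.
TERM_TAGS = {"glossAbbreviation", "glossAcronym", "glossFullForm",
             "glossShortForm", "glossSurfaceForm", "glossSynonym"}


def variant_term(alt):
    """Return the text of the term element inside one <glossAlt>."""
    return next(child.text for child in alt if child.tag in TERM_TAGS)


def list_variants(xml_text):
    """List variant terms with effective status and resolved references."""
    entry = ET.fromstring(xml_text)
    alts = entry.findall("glossBody/glossAlt")
    # Index variants that carry an id so local references can be resolved.
    by_id = {alt.get("id"): variant_term(alt) for alt in alts if alt.get("id")}
    rows = []
    for alt in alts:
        status = alt.find("glossStatus")
        ref = alt.find("glossAlternateFor")
        target = None
        if ref is not None:
            # Assumption: the href is a same-topic reference such as "#memoryStick".
            target = by_id.get(ref.get("href", "").lstrip("#"))
        rows.append({
            "term": variant_term(alt),
            # A variant without <glossStatus> is treated as an allowed term.
            "status": status.get("value") if status is not None else "allowed",
            "alternate_for": target,
        })
    return rows
```

Calling `list_variants(USB_ENTRY)` yields one dictionary per `<glossAlt>`; the "stick" variant is reported as prohibited and as an alternate for "memory stick".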
The usage and status markup is optional so that an adopter with simpler requirements could capture a list of alternative terms without the burden of the full terminology detail. For instance, the following example shows a minimal entry defining an abbreviation, its full form, and its surface form:
```xml
<glossentry id="abs">
  <glossAcronym>ABS</glossAcronym>
  <glossdef>A brake technology that minimizes skids.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossAlt>
      <glossFullForm>Anti-lock Braking System</glossFullForm>
    </glossAlt>
    <glossAlt>
      <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    </glossAlt>
  </glossBody>
</glossentry>
```
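A process that consumes a minimal entry like this needs only three pieces of text: the preferred term, the full form, and the surface form. The following Python sketch shows one way to extract them with the standard library; the function name `read_entry` and the returned dictionary keys are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

ABS_ENTRY = """
<glossentry id="abs">
  <glossAcronym>ABS</glossAcronym>
  <glossdef>A brake technology that minimizes skids.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossAlt>
      <glossFullForm>Anti-lock Braking System</glossFullForm>
    </glossAlt>
    <glossAlt>
      <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    </glossAlt>
  </glossBody>
</glossentry>
"""

# Elements that can give the form of the preferred term.
PREFERRED_TAGS = ("glossterm", "glossAbbreviation", "glossAcronym")


def read_entry(xml_text):
    """Return the preferred term plus the optional full and surface forms."""
    entry = ET.fromstring(xml_text)
    # The preferred term is a direct child of <glossentry>.
    preferred = next(
        (child.text for child in entry if child.tag in PREFERRED_TAGS), None
    )
    return {
        "id": entry.get("id"),
        "preferred": preferred,
        # findtext returns None when the element is absent.
        "full_form": entry.findtext("glossBody/glossAlt/glossFullForm"),
        "surface_form": entry.findtext("glossBody/glossAlt/glossSurfaceForm"),
    }
```

For an entry that omits the alternate terms, `full_form` and `surface_form` come back as `None`, which lets a caller fall back to the preferred term.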
Two new domains complement the glossary topic:
The following example uses the "car.abs" key to refer to the glossary topic shown previously for the ABS abbreviation:
```xml
<map>
  ...
  <glossref keys="car.abs" href="abs.dita"/>
  ...
  <topicref type="task" href="maintcar.dita"/>
  ...
</map>
```
```xml
<task id="maintcar">
  <title>Maintaining your car</title>
  ...
  <info>The <abbreviation keyref="car.abs"/> system will prevent the car from skidding in adverse weather conditions.</info>
  ...
</task>
```
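To make the key indirection concrete, the following Python sketch builds a key table from a map so that a `keyref` such as "car.abs" can be resolved to the glossary topic. It is a simplification under stated assumptions: the function name `build_key_table` is hypothetical, the `keys` attribute is treated as a space-separated list, the first definition of a key in document order is kept, and a single map with no nested maps is assumed.

```python
import xml.etree.ElementTree as ET

# A complete miniature map based on the example (elided content left out).
MAP_XML = """
<map>
  <glossref keys="car.abs" href="abs.dita"/>
  <topicref type="task" href="maintcar.dita"/>
</map>
"""


def build_key_table(map_xml):
    """Map each key defined in the map to the href it is bound to."""
    table = {}
    # iter() walks the elements in document order.
    for elem in ET.fromstring(map_xml).iter():
        keys = elem.get("keys")
        href = elem.get("href")
        if keys and href:
            for key in keys.split():
                # Assumption: the first definition of a key takes precedence.
                table.setdefault(key, href)
    return table
```

With this table, the `<abbreviation keyref="car.abs"/>` reference in the task resolves to `abs.dita`, where the preferred term and surface form are defined.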
Writers can set the linking attribute to "target" on the <glossref> element to enable linking from the use of a term to its glossary entry. Alternatively, writers can use a <keydef> or <topicref> element with a keys attribute to pull glossary topics into a TOC context while defining keys.
Content teams may choose to use the base <term> element to refer to glossary terms when referring to other terms as well as abbreviations. The <term> element can provide a context-specific surface form. That is, processing inserts the preferred term from the glossentry topic only when the <term> element doesn't contain text.
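The rule for the `<term>` element can be sketched in a few lines of Python. This is an illustration only: the `GLOSSARY` lookup table and the function name `render_term` are hypothetical stand-ins for whatever key resolution a real processor performs.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup from key to the preferred term of the glossentry topic.
GLOSSARY = {"car.abs": "ABS"}


def render_term(term_xml, glossary):
    """Return the text to display for a <term> element with a keyref."""
    term = ET.fromstring(term_xml)
    text = (term.text or "").strip()
    if text:
        # The writer supplied a context-specific surface form; keep it.
        return text
    # An empty <term> pulls the preferred term from the glossary.
    return glossary[term.get("keyref")]
```

Under these assumptions, `<term keyref="car.abs"/>` renders as "ABS", while `<term keyref="car.abs">anti-lock brakes</term>` keeps the writer's own wording.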
For authoring convenience, a <glossgroup> topic can contain multiple <glossentry> topics:
| Base | Element | Content | Purpose |
|---|---|---|---|
| <concept> | <glossgroup> |
|
Groups a set of glossary entries for some purpose, for instance, for convenient maintenance based on the alphabetic collation of the preferred term or on the subject matter covered by the terms. |
Relationships between term subjects (such as the hypernym or kind-of relationship and the holonym or part-of relationship specified by WordNet) can be specified for glossary topics by a subject scheme map. (See Proposal 12031 for Controlled Values.)
When the writer provides a keyref to a glossentry topic with a <glossSurfaceForm> element, a process can emit the surface form in contexts where the abbreviation might be unfamiliar to the reader.
For instance, a process composing a book deliverable can emit the surface form on the first reference to the glossentry topic within the book or within copyright, warning, or legal sections. A process generating an online page can emit the surface form as a hover tooltip on every instance of the term.
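The first-reference behavior for a book deliverable can be sketched as follows. This Python example is a minimal illustration rather than a prescribed implementation: the class name `FirstUseRenderer`, the `GLOSSARY` dictionary, and its field names are hypothetical, and the scope of "first reference" is simply the lifetime of one renderer object.

```python
# Hypothetical glossary data, keyed as in the "car.abs" example.
GLOSSARY = {
    "car.abs": {
        "preferred": "ABS",
        "surface_form": "Anti-lock Braking System (ABS)",
    }
}

SENTENCE = "The {} system will prevent the car from skidding in adverse weather conditions."


class FirstUseRenderer:
    """Emit the surface form on the first reference, the preferred term afterwards."""

    def __init__(self, glossary):
        self.glossary = glossary
        self.seen = set()  # keys already referenced in this deliverable

    def render(self, key):
        entry = self.glossary[key]
        if key not in self.seen:
            self.seen.add(key)
            # Fall back to the preferred term when no surface form is defined.
            return entry.get("surface_form") or entry["preferred"]
        return entry["preferred"]


renderer = FirstUseRenderer(GLOSSARY)
first = SENTENCE.format(renderer.render("car.abs"))
later = SENTENCE.format(renderer.render("car.abs"))
```

Here `first` contains "Anti-lock Braking System (ABS)" and `later` contains only "ABS". A process that resets the expansion per chapter or per legal section could create a new renderer for each such scope.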
For instance, if the topic with the keyref to the "car.abs" key provided the first appearance of the ABS term within a book, the sentence could be rendered as follows:
"The Anti-lock Braking System (ABS) system will prevent the car from skidding in adverse weather conditions."
If the ABS term had appeared previously within the book, the same sentence could instead be rendered as follows:
"The ABS system will prevent the car from skidding in adverse weather conditions."
The following cases for abbreviated forms must be contemplated when working with documents that require internationalization:
If there is no abbreviated form for the target language, the <glossterm> element is used instead of the <glossAbbreviation> or <glossAcronym> element when indicating the preferred term for the language. Where the preferred term has an expanded form, the <glossSurfaceForm> element can be used to provide text that displays for expanded occurrences. Consider the following example in English:
```xml
<glossentry id="wmd" xml:lang="en">
  <glossAcronym>WMD</glossAcronym>
  <glossdef>A weapon technology resulting in catastrophic loss of life.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossAlt>
      <glossFullForm>Weapons of Mass Destruction</glossFullForm>
    </glossAlt>
    <glossAlt>
      <glossSurfaceForm>Weapons of Mass Destruction (WMD)</glossSurfaceForm>
    </glossAlt>
  </glossBody>
</glossentry>
```
In Spanish, this becomes:
```xml
<glossentry id="wmd" xml:lang="es">
  <glossterm>armas de destrucción masiva</glossterm>
  <glossdef>Una tecnología de la arma dando por resultado la pérdida de vida catastrófica.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossAlt>
      <glossSurfaceForm>armas de destrucción masiva</glossSurfaceForm>
    </glossAlt>
  </glossBody>
</glossentry>
```
In some languages, like Spanish, abbreviated-form expansion should be written in lower case. This can lead to a grammatical error if the first appearance of an abbreviated form occurs at the beginning of a sentence. The same problem may arise with the indefinite article in English 'a' or 'an' depending on whether the text to be inserted begins with a vowel. It is up to the composition/display software to handle this. For example, the acronym for AIDS should be translated as:
```xml
<glossentry id="aids" xml:lang="es">
  <glossAcronym>SIDA</glossAcronym>
  <glossdef>Una enfermedad que afecta el sistema autoinmune.</glossdef>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossAlt>
      <glossFullForm>síndrome de inmuno-deficiencia adquirida</glossFullForm>
    </glossAlt>
    <glossAlt>
      <glossSurfaceForm>síndrome de inmuno-deficiencia adquirida (SIDA)</glossSurfaceForm>
    </glossAlt>
  </glossBody>
</glossentry>
```
Normally the <glossSurfaceForm> text from the above example could not be used at the start of a sentence, because it begins with a lower case letter. It is up to the composition software for the given language to cope with this input.
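One way composition software might cope with a lower-case surface form at the start of a sentence is sketched below in Python. The function name and the `at_sentence_start` flag are hypothetical; how a real formatter detects sentence boundaries is outside the scope of this sketch, and language-specific casing rules are not considered.

```python
def surface_form_in_context(surface_form, at_sentence_start):
    """Capitalize only the first letter when the form opens a sentence."""
    if at_sentence_start and surface_form:
        # str.capitalize() is avoided on purpose: it would lower-case the
        # rest of the text, turning "(SIDA)" into "(sida)".
        return surface_form[0].upper() + surface_form[1:]
    return surface_form
```

For the Spanish entry above, the mid-sentence form stays "síndrome de inmuno-deficiencia adquirida (SIDA)", while the sentence-initial form begins with "Síndrome" and leaves the acronym untouched.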
Abbreviated forms can cause problems for inflected languages because the expansion of an abbreviated form needs to be presented in the nominative case, without any inflection. This can be achieved by placing the expansion in parentheses immediately following the acronym in the <glossSurfaceForm> element. For example, the Polish acronym for the European Union may be:
```xml
<glossentry id="eu" xml:lang="pl">
  <glossAcronym>UE</glossAcronym>
  <glossdef/>
  <glossBody>
    <glossPartOfSpeech value="noun"/>
    <glossAlt>
      <glossFullForm>Unia Europejska</glossFullForm>
    </glossAlt>
    <glossAlt>
      <glossSurfaceForm>UE (Unia Europejska)</glossSurfaceForm>
    </glossAlt>
  </glossBody>
</glossentry>
```
Using the above construct enables automated handling of the abbreviated form in Polish without causing any problems with grammatical inflection. For example, a statement that something occurred within the EU requires the locative case in Polish, so the expanded form would normally have to be inflected. The abbreviated form itself is not a problem, since abbreviated forms are not inflected. Consider, for example, the phrase "In the European Union (EU) there are many institutions…":
"W Unii Europejskiej (UE) jest wiele instytucji…"
However, because the <glossSurfaceForm> element lets the translator control how the text is displayed, the first occurrence of the abbreviated form can use the following acceptable construct, which keeps the expansion in the nominative case:
"W UE (Unia Europejska) jest wiele instytucji…"
The Language Reference for the glossentry topic should be revised to reflect the contents of this proposal including translation considerations and their impact on the use of abbreviations.
Implementation of the DTD and Schema changes for the glossentry topic, of the map domain for the <glossref> element, of the topic domain for the <abbreviation> element, and of the glossgroup topic.
Implementation of special processing to emit the surface form when appropriate.
In particular, abbreviated forms will be handled in a uniform and consistent manner by putting resolution of the abbreviated form under the control of the composition software so that glossary, tooltip, and first forms can be provided as required to meet the end-user requirements.