Subject: RE: [dita] Proposed revision for the keyword definition (was Keywords in DITA)

From: "Paul Prescod" <paul.prescod@blastradius.com>

To: "Michael Priestley" <mpriestl@ca.ibm.com>

Date: Mon, 14 Mar 2005 18:16:08 -0800

Why would you use a keyword in document content at all if it has neither clear semantics nor processing expectations associated with it? Could we document it as being allowed there primarily to allow specialization, and then discourage its direct use inline? Anyone using it directly is likely to want either monospace or topic-level metadata attachment, and will be unhappy to get neither.

Within <keywords>, the semantics-less-ness of the element is not a problem because the <keywords> element supplies the semantics. The user could equally use <term> or <ph> or whatever. They just need tags to act as delimiters between entries.

Okay, so here's my proposal:

"<keyword> represents a word or phrase with special significance in a particular domain. In the general case, <keyword> elements typically do not have any special semantics and processing associated with them and should be avoided. <keyword> specializations are more meaningful and are therefore preferable. <keyword> in the <keywords> element distinguishes a word or phrase that describes the content of a topic (a topic description keyword). Topic description keywords are typically used for searching, retrieval and classification purposes."

From: Michael Priestley [mailto:mpriestl@ca.ibm.com]
Sent: Monday, March 14, 2005 5:30 PM
To: Paul Prescod
Cc: Dana Spradley; dita@lists.oasis-open.org; Don Day; Erik Hennum; JoAnn Hackos; Rob Frankland
Subject: RE: [dita] Proposed revision for the keyword definition (was Keywords in DITA)

I agree that we should vote, but I will add a few more cents to the discussion. I will expand on a very important difference between <keywords> and <keyword> in DITA.

The semantic "keywords for the document" is associated with the <keywords> element in the prolog, and is clearly documented there. The <keywords> element in the prolog can contain <keyword>, <apiname>, <indexterm>, and many other things. Erik has suggested it should also be allowed to contain <term>, and I buy that for post-1.0.

<keyword> in DITA has absolutely no semantic significance. It just marks a noun or noun phrase that is of some, as yet unknown, special significance. The specializations give it that significance. In the absence of a specialization, it just marks some kind of significant word. It is the word or noun-phrase equivalent of <ph> for sentences, or <section> for divisions within a topic body, or <topic> for entire topics. See the topic "Limits of specialization", subtopic "Specialize from generic elements", in the architectural spec.

I can wish that we had chosen a different label for the element, but it's too late for that. If someone marks up <keyword>assert</keyword> they will not get monospace; they might have better luck with <codeph>, <parmname>, or <cmdname>, all of which have an associated semantic that <keyword> in DITA, despite your expectations, simply does not.
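To make the contrast concrete, here is a minimal sketch of the two inline markups. The sentence text is invented for illustration, and the rendering notes in the comments describe typical default output rather than anything the spec guarantees:

```xml
<!-- Illustrative fragment only; the sentence content is made up. -->

<!-- Generic <keyword>: marks a significant word, with no particular
     rendering or metadata behaviour to rely on. -->
<p>Use <keyword>assert</keyword> to check invariants.</p>

<!-- <codeph>: carries a "code phrase" semantic, which default
     processing typically renders in monospace. -->
<p>Use <codeph>assert</codeph> to check invariants.</p>
```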

With respect to <apiname>, I do not believe we need to offer guidance as to whether an <apiname> is a keyword for the topic as a whole. If it is repeated in the <keywords> element in the prolog, then it is a key word for the topic; if it is not, then it isn't. This is an author's decision to make, and has nothing to do with the inherent semantic of whether or not the element is an <apiname>.

It would not make sense, for example, to say that a programming topic that mentions fifteen APIs must have all or none of them as keywords for the topic as a whole. It is much more likely that only some of them are core, while others are incidental, perhaps even occurring in the context of a code snippet without being discussed at all.

Being an <apiname> is an inherent property, while being a key word for the topic is a contextual property: hence the need to place the <apiname> in a special context (the <keywords> element) when it does act as a key word for the entire topic.
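A rough sketch of that distinction, assuming a topic type that includes the programming domain. The topic id, title, and API names are hypothetical and chosen only for illustration:

```xml
<topic id="example_topic">
  <title>Opening a connection</title>
  <prolog>
    <metadata>
      <!-- Contextual property: terms listed here act as key words
           for the topic as a whole. -->
      <keywords>
        <keyword>connection</keyword>
        <apiname>openConnection</apiname>
      </keywords>
    </metadata>
  </prolog>
  <body>
    <!-- Inherent property: both names are marked as APIs, but only
         openConnection was repeated in <keywords> above. -->
    <p>Call <apiname>openConnection</apiname> before sending data.
       Failures are reported through <apiname>logMessage</apiname>.</p>
  </body>
</topic>
```

Here the author has decided that only one of the two APIs is core to the topic; the other stays an <apiname> inline without becoming a topic-level key word.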

Michael Priestley
mpriestl@ca.ibm.com