Why would you use a keyword in document content at all if there are no clear
semantics or processing expectations associated with it? Could we document it
as being allowed there primarily to allow specialization, and then discourage
its direct use inline? Anyone using it directly is likely to want EITHER
monospace OR topic-level metadata attachment, and will be unhappy to get
neither.
Within <keywords>, the lack of semantics on the element is not a problem because
the <keywords> element supplies the semantics. The user could equally use
<term> or <ph> or whatever. They just need tags to act as delimiters
between entries.
Okay, so here's my proposal:
"<keyword> represents a word or phrase with special significance in
a particular domain. In the general case, <keyword> elements typically do
not have any special semantics and processing associated with them and should be
avoided. <keyword> specializations are more meaningful and are therefore
preferable. <keyword> in the <keywords> element distinguishes a word
or phrase that describes the content of a topic (a topic description keyword).
Topic description keywords are typically used for searching, retrieval and
classification purposes."
I agree that we should vote, but I will add a few more
cents to the discussion. I will expand on a very important difference between
<keywords> and <keyword> in DITA.
The semantic "keywords for the document" is associated
with the <keywords> element in the prolog, and is clearly documented
there. The <keywords> element in the prolog can contain <keyword>,
<apiname>, <indexterm>, and many other things. Erik has suggested
it should also be allowed to contain <term>, and I buy that for
post-1.0.
<keyword> in DITA has absolutely no semantic significance. It just marks a
noun or noun phrase
that is of some, as yet unknown, special significance. The specializations
give it that significance. In the absence of a specialization, it just marks
some kind of significant word. It is the word or noun-phrase equivalent of
<ph> for sentences, or <section> for divisions within a topic
body, or <topic> for entire topics. See the topic "Limits of
specialization", subtopic "Specialize from generic elements", in the
architectural spec.
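
For what it's worth, the relationship shows up in the class attribute that the DTDs default onto each element. A rough sketch follows; the class values and the "pr-d" module name are written from memory of the programming domain, so treat them as an assumption and check them against your copy of the DTDs:

```xml
<!-- Generic element: marks a significant word, nothing more. -->
<keyword class="- topic/keyword ">assert</keyword>

<!-- Domain specialization: still a keyword to a generic processor,
     but the extra class token lets a processor treat it as an API name. -->
<apiname class="+ topic/keyword pr-d/apiname ">assert</apiname>
```

A processor that knows nothing about the programming domain can fall back to the topic/keyword token, which is exactly why the generic element has to stay free of semantics of its own.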
I can wish that we had chosen a different label for the element, but it's too
late for that. If someone marks up <keyword>assert</keyword> they will not
get monospace; they might have better luck with <codeph>,
<parmname>, or <cmdname>, all of which have an associated semantic
that <keyword> in DITA, despite your expectations, simply does not.
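
To make the difference concrete, here is a small sketch of the two markups side by side. The sentence is made up for illustration, and the comments restate what is said above about expectations; they are not a promise about what any particular stylesheet will do:

```xml
<p>
  <!-- Generic: no associated semantic, so do not count on monospace output. -->
  Call <keyword>assert</keyword> before dereferencing the pointer.
</p>
<p>
  <!-- Specialized: marked as a code phrase, which carries its own semantic. -->
  Call <codeph>assert</codeph> before dereferencing the pointer.
</p>
```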
With respect to <apiname>, I do not believe we need to offer guidance as to
whether an
<apiname> is a keyword for the topic as a whole. If it is repeated in
the <keywords> element in the prolog, then it is a key word for the
topic; if it is not, then it isn't. This is an author's decision to make, and
has nothing to do with the inherent semantic of whether or not the element is
an <apiname>.
It would not make sense, for example, to say that a programming topic that
mentions fifteen
APIs must have all or none of them as keywords for the topic as a whole. It is
much more likely that only some of them are core, while others are incidental,
perhaps even occurring in the context of a code snippet without being
discussed at all.
Being an <apiname> is an inherent property, while being a key word for the
topic is a contextual property: hence the need to place the <apiname> in a
special context (the <keywords> element) when it does act as a key word
for the entire topic.
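
A sketch of what I mean, using a hypothetical topic (the id, title, and API names are invented for illustration). Both names are marked up as <apiname> in the body, but only the one the topic is actually about is repeated in the prolog:

```xml
<topic id="opening_files">
  <title>Opening files</title>
  <prolog>
    <metadata>
      <keywords>
        <!-- Contextual property: this API is a key word for the whole topic. -->
        <apiname>fopen</apiname>
      </keywords>
    </metadata>
  </prolog>
  <body>
    <p>Use <apiname>fopen</apiname> to open a file. Check the result before
      passing it to <apiname>fprintf</apiname>.</p>
    <!-- fprintf is still an apiname (inherent property), but it is incidental
         here, so the author has not repeated it in <keywords>. -->
  </body>
</topic>
```

Which names get promoted into <keywords> stays the author's call; nothing about the inline markup changes either way.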
Michael Priestley mpriestl@ca.ibm.com