Topic Maps Published Subjects Technical Committee
Proposed by: Lars Marius Garshol
This is a private proposal and is not in any way endorsed by the TC. It is also quite unpolished and needs considerable further refinement before it can be useful as part of, or as a foundation for, any TC recommendation.
It seems that published subject documentation may contain the types of information described below. Note that this does not in any way mean that it must, should, or will. These concepts are described here in order to help the TC structure its debates. The TC (and the author of this note) may well decide that some of these forms of information are unnecessary or harmful, or at least that they should be optional.
Core published subject assertions are formal assertions made about the published subjects in a form optimized for machine consumption. That the assertions are "core" implies that they are part of the definition of the subjects, and should be accepted by anyone who is to use the subjects. The purpose of publishing them is to simplify the use of the published subjects by allowing users to import the assertions into their own systems. Examples are: this subject has this name, this subject is an instance of this class, etc.
An example can be found at http://www.oasis-open.org/committees/geolang/docs/language.xtm. (Note that this example contains only published subject assertions.)
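To make the idea of importing core assertions concrete, the following is a minimal sketch of how a user might read names and class membership out of an XTM 1.0 resource. The fragment, the topic ids (language, nor) and the function name read_core_assertions are all hypothetical and written only for illustration; the fragment is not copied from the geolang resource mentioned above.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by XTM 1.0 and XLink.
XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical fragment, invented for this sketch.
SAMPLE = f"""
<topicMap xmlns="{XTM}" xmlns:xlink="{XLINK}">
  <topic id="language">
    <baseName><baseNameString>Language</baseNameString></baseName>
  </topic>
  <topic id="nor">
    <instanceOf><topicRef xlink:href="#language"/></instanceOf>
    <baseName><baseNameString>Norwegian</baseNameString></baseName>
  </topic>
</topicMap>
"""


def read_core_assertions(xtm_text):
    """Return (topic id, assertion kind, value) triples found in the XTM text."""
    root = ET.fromstring(xtm_text.strip())
    triples = []
    for topic in root.findall(f"{{{XTM}}}topic"):
        topic_id = topic.get("id")
        # "This subject is an instance of this class."
        for ref in topic.findall(f"{{{XTM}}}instanceOf/{{{XTM}}}topicRef"):
            triples.append((topic_id, "instance-of", ref.get(f"{{{XLINK}}}href")))
        # "This subject has this name."
        for name in topic.findall(f"{{{XTM}}}baseName/{{{XTM}}}baseNameString"):
            triples.append((topic_id, "name", name.text))
    return triples


assertions = read_core_assertions(SAMPLE)
for assertion in assertions:
    print(assertion)
```

The triples are only one possible in-memory form; a real consumer would more likely merge the topics into its own topic map engine.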
These are like the core published subject assertions, except that they are not part of the definition of the published subjects, and may therefore be controversial. The purpose of publishing them separately from the core assertions is to make it easier for users to choose which assertions to use, and which to ignore.
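The benefit of publishing the two kinds of assertions separately can be sketched as follows. This is an illustration only: the assertion triples, the set names and the function import_assertions are hypothetical, and nothing here is meant as a proposed mechanism.

```python
# Hypothetical assertion sets, each a set of (subject, kind, value) triples.
CORE = {
    ("nor", "name", "Norwegian"),
    ("nor", "instance-of", "language"),
}
NON_CORE = {
    ("nor", "spoken-in", "norway"),
}


def import_assertions(core, non_core, accept):
    """Take every core assertion, plus the non-core ones the user accepts."""
    return set(core) | {a for a in non_core if accept(a)}


# One user trusts the publisher fully; another wants the definitions only.
imported_all = import_assertions(CORE, NON_CORE, lambda a: True)
imported_core_only = import_assertions(CORE, NON_CORE, lambda a: False)

print(len(imported_all), len(imported_core_only))
```

Because the core assertions are taken unconditionally, the user's choice only ever affects the assertions that are open to dispute.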
A weakness of this terminology is that the terms are very similar to one another, making it difficult to tell them apart.
The above analysis of PSD contents makes it possible to discuss what the structure of PSDs might be in practice. The sections below describe a number of possible approaches.
In this approach two resources would be published. This approach is best suited for published subject sets that don't contain too many subjects. An example of a PSD following this approach is http://psi.ontopia.net/ontopia/.
It is of course possible to work the published subject documentation metadata into the entry resource by encoding it in HTML META elements. Whether to do so is for the publisher to choose.
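As a rough sketch of what metadata in META elements would mean for a consumer, the snippet below collects name/content pairs from an HTML entry resource using the Python standard library. The page content and the property names (DC.title, DC.publisher) are assumptions made for the example; this note does not prescribe any particular metadata vocabulary.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from META elements."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.metadata[name] = content


# Hypothetical entry resource, invented for this sketch.
ENTRY = """<html><head>
<title>Example published subjects</title>
<meta name="DC.title" content="Example published subjects">
<meta name="DC.publisher" content="Example Publisher">
</head><body><p>Human-readable description.</p></body></html>"""

collector = MetaCollector()
collector.feed(ENTRY)
metadata = collector.metadata
print(metadata)
```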
In this approach the published subjects are defined using nothing more than a single human-oriented resource, a suitable format for which would be HTML or XHTML. This resource would then contain the published subject definitions, the published subject documentation description, and possibly also the published subject documentation metadata.
In this approach the published subjects are defined using nothing more than a single machine-oriented resource, a suitable format for which would be RDF/XML or XTM. This resource would then contain the published subject definitions, the published subject documentation description, the core published subject assertions, and the published subject documentation metadata.
This approach is suitable for publishing large collections of published subjects. The documentation would be separated into the following resources:
A key issue that is clarified by this proposal is: what are the published subject indicators? Should they be the published subject definitions, or the topic elements in the core published subject assertions? Both approaches are entirely feasible.
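The practical difference between the two answers is which resource the identifying URIs point into. The sketch below builds both candidate URIs for one subject. The addresses under psi.example.org, the fragment id nor and the function indicator_uri are hypothetical, and the sketch assumes that both the definitions and the topic elements are addressed by fragment identifier.

```python
from urllib.parse import urldefrag, urljoin

# Hypothetical addresses for the two resources.
DEFINITIONS_URI = "http://psi.example.org/languages/"          # human-oriented
ASSERTIONS_URI = "http://psi.example.org/languages/core.xtm"   # machine-oriented


def indicator_uri(resource_uri, fragment_id):
    """Address one definition or topic element inside a resource."""
    return urljoin(resource_uri, "#" + fragment_id)


# Option 1: the indicator is the published subject definition.
as_definition = indicator_uri(DEFINITIONS_URI, "nor")
# Option 2: the indicator is the topic element in the core assertions.
as_topic = indicator_uri(ASSERTIONS_URI, "nor")

# Either way, a consumer can split the URI to find the resource to fetch.
resource, fragment = urldefrag(as_topic)

print(as_definition)
print(as_topic)
```

Under option 1, dereferencing the URI yields something a person can read; under option 2, it yields something a program can import.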
Do we want to recommend one of these approaches, several of the approaches, or to leave the choice entirely in the hands of the publishers? If we leave the decision of PSD structure to the publishers, what do we recommend?