OASIS Topic Maps Published Subjects TC
Deliverables
1. Documentation of Published Subjects - Requirements and Recommendations
Version 0.3 - updated February 20, 2002
Latest version: http://www.oasis-open.org/committees/tm-pubsubj/docs/recommendations/psdoc.htm
Editor: Bernard Vatant
Status of this document: Working Draft
This recommendation addresses the "shall, should, may" of:
Documentation of Published Subjects
- TC Process Requirements
1 - Statement of Purpose
The OASIS Topic Maps Published Subjects Technical Committee has been set
up to support the application of the Topic Maps specification, ISO
13250, by providing recommendations for the documentation, management
and use of published subjects. The underlying rationale is that topic map
interoperability requires unambiguous definitions of subjects (represented
by topics), and that these definitions should be provided by stable
resources, made available online through a trustworthy publication process.
Those resources, organised in published subject documentation sets,
will provide both published subject indicators (human-understandable,
unambiguous definitions of subjects) and published subject identifiers (stable
URIs fit for computer processing, topic map interoperability and merging,
and many other foreseeable semantic applications).
The purpose of this document is to provide recommendations for the structure
and content of published subject documentation sets. These recommendations
are aimed at publishers of ontologies, classifications, taxonomies, thesauri,
registries, catalogues, databases and similar resources. They are meant to give
those publishers efficient ways to make their legacy available as published
subject documentation, and therefore usable by topic maps and other semantic
applications.
2 - A gentle introduction to Published Subjects
A main and original feature of topic maps is that they deal with subjects.
A subject can be a unique individual object, like "Isaac Newton",
"IBM, Inc." or "Paris (France)"; or a class of such individuals, like
"famous scientists", "software companies" or "towns"; or a more abstract
concept, like "gravitation", "economic growth" or "baroque style".
In a nutshell, a subject can be anything deserving to be
identified, named, represented and generally talked about - in short,
whatever can be a subject of conversation.
- A subject is anything whatsoever, regardless of whether it exists
or has any other specific characteristics, about which anything whatsoever
may be asserted by any means whatsoever.
How do topic maps deal with a subject? First, they represent a subject
formally as an abstract "topic". In XTM documents, a topic is
represented by a <topic> XML element. A topic should represent a
unique, well-defined, unambiguous subject. So far, so good, at least
in the mind of a single topic map author. But topic map applications
aim at interoperability. That means that topic map authors, users,
and the computer applications dealing with them must have ways to know
whether two or more topics, in the same or in different topic maps,
represent the same subject.
How can that be achieved? A topic map author can indicate what
the subject of a topic is by referring to a document, or any other kind
of resource, where the subject appears to be defined in a proper
and unambiguous way. Such a resource will therefore be considered by
the topic map author as a subject indicator. Provided with this
resource, a human being will hopefully be able to know what subject
this topic represents.
- A subject indicator is a resource that is referred to by the topic map
author to provide an unambiguous indication of the identity of a subject.
Any resource can become a subject indicator by being referred to as
such from within some topic map, whether or not it was intended by its
publisher to be a subject indicator.
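As an illustration, the sketch below shows how a topic might point to its subject indicator in XTM 1.0 syntax, and how an application could read that reference. It is a minimal sketch, not part of this recommendation: the topic id, the name and the example.org address are invented for the example, and the helper function `subject_indicator_refs` is hypothetical.

```python
import xml.etree.ElementTree as ET

XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# A single XTM 1.0 topic; the id, name and href are invented for illustration.
XTM_TOPIC = """
<topic id="newton"
       xmlns="http://www.topicmaps.org/xtm/1.0/"
       xmlns:xlink="http://www.w3.org/1999/xlink">
  <subjectIdentity>
    <subjectIndicatorRef xlink:href="http://example.org/biographies/newton.html"/>
  </subjectIdentity>
  <baseName>
    <baseNameString>Isaac Newton</baseNameString>
  </baseName>
</topic>
"""


def subject_indicator_refs(topic_xml):
    """Return the addresses of the subject indicators referenced by one topic."""
    topic = ET.fromstring(topic_xml.strip())
    path = f"{{{XTM_NS}}}subjectIdentity/{{{XTM_NS}}}subjectIndicatorRef"
    return [ref.get(f"{{{XLINK_NS}}}href") for ref in topic.findall(path)]


refs = subject_indicator_refs(XTM_TOPIC)
```

Here the referenced resource is an ordinary web page chosen by the topic map author; nothing in the page itself says that it serves as a subject indicator.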
Since topic maps live in the Web universe, the subject indicator has to be an
addressable (network-retrievable) resource. The reference to the subject
indicator will therefore use some URI, which will both address the
subject indicator and identify the subject. Computer applications
will of course be happy to handle this subject identifier, since
two topics with the same subject identifier clearly refer to the same
subject indicator, and therefore represent the same subject.
- A subject identifier is a URI used by a topic map author to identify
and refer to a subject indicator.
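The rule that two topics with the same subject identifier represent the same subject can be sketched as follows. This is a simplified illustration, not a topic map engine: the `Topic` class, the `merge_by_identifier` function and the example URIs are hypothetical, each topic is assumed to carry a single subject identifier, and identifiers are compared as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical, simplified topic: one subject identifier and a set of names."""
    subject_identifier: str
    names: set = field(default_factory=set)


def merge_by_identifier(topics):
    """Merge topics that share the same subject identifier (exact string match)."""
    merged = {}
    for topic in topics:
        key = topic.subject_identifier
        if key in merged:
            # Same identifier, hence same subject: combine the names.
            merged[key].names |= topic.names
        else:
            merged[key] = Topic(key, set(topic.names))
    return list(merged.values())


# Two topic maps from different authors; the URIs are invented for illustration.
map_a = [Topic("http://example.org/psi/people#newton", {"Isaac Newton"})]
map_b = [
    Topic("http://example.org/psi/people#newton", {"Newton, Isaac"}),
    Topic("http://example.org/psi/places#paris", {"Paris (France)"}),
]
result = merge_by_identifier(map_a + map_b)
```

The two "Newton" topics collapse into one topic carrying both names, while the "Paris" topic is left untouched. The merge is only as reliable as the identifiers themselves.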
Unfortunately, the scenario above is too simple to be sustainable. Subject
indicators and subject identifiers defined only from the topic map author's
end are likely to be untrustworthy and unstable. URIs and the resources
they address are moving targets in the Web universe. The publishers of
resources used as subject indicators might not even be aware of that use, and
are likely to leave topic map authors, without prior notice, with meaningless
identifiers and indicators, if any indicator at all.
Here the publishers enter the loop. If some publishers are aware of
the whole problem, and want to provide topic map applications with stable,
trustworthy, authoritative subject indicators and identifiers, the situation
is far better. The publishers can provide sets of subject indicators and
subject identifiers in a stable way, and declare their intention to keep
them stable and trustworthy for topic maps and other applications. At that
point, topic map authors are provided with published subjects,
defined in published subject documentation sets, coming along with
published subject indicators and published subject identifiers.
They will use them as before, but the whole scenario becomes really
sustainable.
- A published subject is a subject for which there exists at least one
published subject indicator.
- A published subject indicator is a subject indicator that is published
and maintained at an advertised address for the purpose of facilitating
topic map interchange and mergeability.
- A published subject identifier is the canonical URI of a published subject
indicator, chosen and declared by its publisher as the URI to be used within
topic maps to identify the published subject.
- A published subject documentation set is the complete set of documentation
about a set of published subject indicators and identifiers, as published
by its publisher.
The topic maps literature has used the acronym "PSI" for over a year.
Note that it can expand to both "published subject indicator"
and "published subject identifier". Those are two faces of the same
concept, one facing humans (the indicator) and one facing computers
(the identifier).
Like Janus Bifrons over Roman doors, PSIs are guarantors of good
communication between two universes ...
3 - Glossary
The following terms and concepts will be used in this document and in further
TC recommendations.
Some of them are already defined and used by ISO 13250. Nevertheless,
the TC proposes some modifications to clarify some of them and their relationships
with new ones, and will send those proposals to ISO/IEC JTC1/SC34 for relevant
revision and extension of ISO 13250 terminology. Both the current ISO 13250
definition and the PubSubj TC proposal are given when necessary.
"Publisher" is used throughout in the sense defined in Dublin
Core metadata (dc:publisher)
"Resource" is used throughout in the sense of "network-retrievable
resource" (IETF) or "addressable resource" (ISO 13250)
- subject
as defined by ISO 13250 XTM
A subject is anything whatsoever, regardless of whether it exists or
has any other specific characteristics, about which anything whatsoever
may be asserted by any means whatsoever.
- subject indicator
as defined by ISO 13250 XTM
A resource that is intended by the topic map author to provide a positive,
unambiguous indication of the identity of a subject.
definition proposal
A resource that is referred to by the topic map author to
provide an unambiguous indication of the identity of a subject. Any
resource can become a subject indicator by being referred to as such
from within some topic map, whether or not it was intended by its publisher
to be a subject indicator.
See "published subject indicator"
- subject identifier
definition proposal
A URI used by a topic map author to identify and refer to a subject
indicator. When a subject identifier is declared by a publisher, in
a published subject documentation set, to identify a published subject
indicator, it is called a published subject identifier.
- published subject
definition proposal
A subject for which there exists at least one published subject indicator.
- published subject indicator
as defined by ISO 13250 XTM
A subject indicator that is published and maintained at an advertised
address for the purpose of facilitating topic map interchange and mergeability.
- published subject identifier
definition proposal
The canonical URI of a published subject indicator, chosen and declared
by its publisher as the URI to be used within topic maps to identify
the published subject.
- published subject documentation set
definition proposal
The complete set of documentation about a set of published subject indicators
and identifiers, as published by its publisher.
4 - Recommendations for published subjects documentation
Given the considerable legacy of taxonomies, classifications,
ontologies, databases and catalogues likely to be made available as published
subject documentation sets, their publishers should not be constrained
more than necessary to use a specific structure, syntax or language. Therefore,
the present recommendation does not try to impose on publishers a
unique standard structure for published subject documentation, a specific
syntax for subject definition resources, or a specific syntax for subject
indicator reference URIs. Nevertheless, it will suggest best practices for each
existing relevant syntax.
Besides access to a set of PSIs, a published subject documentation set
should include at least the following information, so that the PSIs can be
used efficiently and reliably.
- Statement of purpose
- Publisher and documentation metadata
- Statement of documentation structure
4.1 - Statement of purpose
A published subject documentation set shall include a formal statement
from its publisher, making explicit its conformance to this recommendation
and its intention to keep the documentation trustworthy and the PSIs
stable.
4.2 - Publisher and documentation metadata
A published subject documentation set shall include the following
(Dublin Core) metadata.
- Identity of the publisher (dc:publisher)
- Identity of the documentation set (dc:identifier)
- Format (dc:format)
- Source of documentation (dc:source)
- Creator (dc:creator) and contributors (dc:contributor)
The above identities should themselves be defined as PSIs.
- Title of the documentation (dc:title)
- Language of publication (dc:language)
- Date of publication or validation (dc:date)
- Possible restrictions of use (dc:rights)
In addition to those metadata, the documentation may include recommendations
for use, and a list of registered users.
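A publisher could check a documentation set against the list above with a few lines of code. The sketch below is only one possible reading of the requirement: it assumes the metadata are held in a plain dictionary keyed by Dublin Core element name, and it treats dc:contributor and dc:rights as optional, which this draft does not state explicitly. The function name and the example values are hypothetical.

```python
# Dublin Core elements listed in section 4.2.
# Assumption: contributors and usage restrictions may legitimately be absent.
REQUIRED = (
    "dc:publisher", "dc:identifier", "dc:format", "dc:source",
    "dc:creator", "dc:title", "dc:language", "dc:date",
)
OPTIONAL = ("dc:contributor", "dc:rights")


def missing_metadata(metadata):
    """Return the required elements that are absent or empty, in list order."""
    return [key for key in REQUIRED if not str(metadata.get(key, "")).strip()]


# Invented example: a documentation set still lacking its source and date.
example_metadata = {
    "dc:publisher": "http://example.org/psi/publishers#acme",
    "dc:identifier": "http://example.org/psi/countries",
    "dc:format": "text/html",
    "dc:creator": "http://example.org/psi/people#editor",
    "dc:title": "Example country subjects",
    "dc:language": "en",
}
missing = missing_metadata(example_metadata)
```

In this example the publisher and creator are given as URIs, in line with the suggestion that such identities be defined as PSIs themselves.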
4.3 - Statement of documentation structure
4.3.1 - A published subject documentation set should provide explicit
information on the syntax used for its published subject identifiers.
This syntax should as far as possible follow a consistent scheme throughout
the documentation, e.g. a uniform namespace or query string structure.
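One way to test for a uniform namespace is sketched below, under the assumption that identifiers take the form of a common base URI followed by a fragment (base#name). That form is only one of the possible schemes, and the function name and example URIs are hypothetical.

```python
from urllib.parse import urldefrag


def shared_namespace(identifiers):
    """Return the common base URI if every identifier is base#fragment, else None."""
    bases = set()
    for uri in identifiers:
        base, fragment = urldefrag(uri)
        if not fragment:
            # An identifier without a fragment does not fit the assumed scheme.
            return None
        bases.add(base)
    return bases.pop() if len(bases) == 1 else None


# Invented identifiers for illustration.
country_ids = [
    "http://example.org/psi/countries#fr",
    "http://example.org/psi/countries#de",
]
namespace = shared_namespace(country_ids)
```

A set using a query string structure instead would need a different check, for instance comparing the path and parameter names obtained from `urllib.parse.urlsplit`.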
4.3.2 - Throughout a published subject documentation set, the published
subject indicators should follow a consistent and uniform structure (DTD,
schema or some equivalent structure definition), allowing unambiguous
understanding of their content. Such uniformity could also enable their
parsing and processing by topic map engines, search engines, intelligent
agents and any foreseeable kind of semantic web application.
4.4 - Information provided by published subject indicators
A published subject indicator shall provide, following a formal structure
as defined in 4.3.2, explicit information items establishing the published
subject identity, which should include at least the following elements.
- Identifier (dc:identifier)
The canonical URI that is to be used as the published subject identifier.
- Name (dc:subject)
A name given to the subject.
- Type (dc:type)
A class of which the subject is an instance.
- Description (dc:description)
Can be text, image or any kind of relevant resource describing the subject
in a human-understandable way.
- Equivalence
Reference to equivalent published subject indicators in other published
subject documentation sets.
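The information items above could be modelled as a simple record. The following is a sketch under stated assumptions, not a normative structure: the class and field names are hypothetical, the description is reduced to plain text, and equivalence is treated as an optional list of URIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishedSubjectIndicator:
    """Hypothetical record holding the information items listed in 4.4."""
    identifier: str          # dc:identifier, the published subject identifier
    name: str                # dc:subject
    type: str                # dc:type, a class the subject is an instance of
    description: str         # dc:description, reduced here to plain text
    equivalents: tuple = ()  # URIs of equivalent PSIs in other sets (assumed optional)

    def missing_items(self):
        """Return the names of the items that are empty."""
        items = {
            "identifier": self.identifier,
            "name": self.name,
            "type": self.type,
            "description": self.description,
        }
        return [label for label, value in items.items() if not value.strip()]


# Invented example values for illustration.
paris = PublishedSubjectIndicator(
    identifier="http://example.org/psi/places#paris",
    name="Paris (France)",
    type="http://example.org/psi/classes#town",
    description="The capital city of France.",
    equivalents=("http://example.net/gazetteer#paris-fr",),
)
```

Keeping every indicator in one such structure is also a simple way to meet the uniformity asked for in 4.3.2.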
5 - Best practices for published subject documentation structure
To be delivered - this part will provide examples
of, or references to, published subject documentation sets conforming
to the present recommendation, in various relevant formats, such as XTM,
RDF or XHTML.