OASIS
OASIS Topic Maps Published Subjects TC
Deliverable 1. Definitions, Requirements and Examples
Version 0.3 - last updated 2002, September 26
Editor's Notes:
3.
This document uses topic maps terminology, and intends to be consistent
with its use both in the ISO 13250 specification and in the Standard Application
Model for Topic Maps (SAM), the latter currently under discussion.
Terms like "subject", "topic", "subject indicator",
"subject identifier", "published subject", "published
subject indicator", and "published subject identifier" are
introduced informally in this document. SAM
provides normative formal definitions for those terms.

1 - Statement of Purpose

The first and main target of this Technical Committee's recommendations is
therefore topic maps interoperability, through efficient definition and
identification of the subjects represented
by topics in topic maps. Both identification of subjects by applications and definition of subjects for their human users can be provided by stable network-retrievable resources, made available under a trustworthy publication process, as defined by the present and following recommendations. Subjects defined and identified in such a way are called published subjects.

2 - A gentle introduction to Published Subjects

A subject can be an individual, like "Isaac Newton", "IBM, Inc.", or "Paris (France)", or a class of such individuals, like "famous scientists", "software companies", or "towns", or a more abstract concept like "gravitation", "economic growth", or "baroque style". In short, a subject can be anything deserving to be identified, named, represented, and generally talked about: in other words, a subject of conversation.

Applications deal with a subject through a formal representation or proxy, which is called a topic throughout this document, to conform to topic maps terminology. In other words, a topic is the representation, inside an application, of a unique, well-defined, and unambiguous subject.

2.2 - Subject Indicators and Subject Identifiers

Applications aggregate information around topics, and represent relationships between topics. Interoperability of applications sharing information in the same network requires that authors, users, and applications have ways to agree on whether two or more topics in the same or different applications represent the same subject. This agreement has to be effective in both human-to-application and application-to-application transactions.

2.2.1 - Indication of the subject for humans : Subject Indicators

An application can indicate to human users what the subject of a topic is by referring to a document, or any other kind of network-retrievable resource, where the subject appears to be defined, described, or at least indicated in a human-readable and unambiguous way. Such a resource is called a subject indicator. Provided with a subject indicator, human users should be able to know what subject the topic represents. Whenever applications are considered media for human transactions, subject indicators provide a common reference to human users connected through the application, and agreement on the subject indicator is used as the external expression of agreement on the identity of a subject.

2.2.2 - Identification of the subject for applications : Subject Identifiers

While they can provide humans with subject indicators, computer applications cannot "know" what the subject "is". But they can handle identifiers (strings) that allow them to decide whether two subjects are identical or not. If the reference to a subject indicator in the network uses some URI, this URI will be the best subject identifier for applications. A subject identifier is a URI that refers to a subject indicator, and provides an unambiguous identification of a subject to an application. Subject indicator and subject identifier are therefore two faces of the same identification mechanism, the former being for humans and the latter for applications. This identification mechanism is the support for agreement on subject identity throughout the network, between applications, between users, and between applications and users.

2.2.3 - Example : Subject Identifier and Subject Indicator for the subject "Apple Tree" (Malus domestica)

2.3 - Published Subjects

2.3.1 - Shortcomings of the above scenario

Unfortunately,
the above scenario as a whole is too simple to be sustainable. Any resource
can be considered a subject indicator by being referred to as such by
an application, whether or not this resource was intended by its publisher
to be a subject indicator, and whether or not the publisher is aware of
it or even cares about it. Hence,
subject indicators and subject identifiers defined in such a way are untrustworthy,
and are likely to be ambiguous, unstable, or both.

2.3.2 - Publishers in the loop

The publication space, where such published subjects will be used, is a network of applications connected together and of users allowed to access those applications. It can of course be as wide and open as the Web, but it can also be a more or less closed network like an enterprise intranet or community portal. "Published" does not necessarily mean "public".

2.3.3 - Example : Published Subject Identifier and Published Subject Indicator for the subject "Apple"

In the figure for this example, the subject identified for the computer by the (fictitious) URL "http://psi.fruit.org/#apple" is indicated to Isaac Newton by a dedicated resource in the Fruit.Org Published Subjects, providing him with an unambiguous and stable definition. The publisher (Fruit.Org) has declared this resource stable and intended to be used as a PSI. Isaac Newton can trust the URI resolution to provide him with a stable online resource as long as he has access to the network.
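
The role of the identifier in this example can be sketched in Python. This is only an illustration under stated assumptions: the `Topic` class, the `same_subject` and `merge` helpers, and the second URI (`http://psi.example.org/#apple-inc`) are hypothetical and do not come from this document or from any topic maps specification. Identifiers are compared as exact strings, with no URI normalization, which is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """An application's proxy for a single subject (hypothetical model)."""
    name: str
    subject_identifiers: set[str] = field(default_factory=set)
    occurrences: list[str] = field(default_factory=list)


def same_subject(a: Topic, b: Topic) -> bool:
    """Treat two topics as the same subject if they share an identifier.

    Assumption: identifiers are compared as exact strings.
    """
    return not a.subject_identifiers.isdisjoint(b.subject_identifiers)


def merge(a: Topic, b: Topic) -> Topic:
    """Combine what two applications know about one subject."""
    if not same_subject(a, b):
        raise ValueError("topics do not share a subject identifier")
    return Topic(
        name=a.name,
        subject_identifiers=a.subject_identifiers | b.subject_identifiers,
        occurrences=a.occurrences + b.occurrences,
    )


APPLE_PSI = "http://psi.fruit.org/#apple"  # fictitious URL from the example

# Two applications name the topic differently but use the same PSI.
grocer = Topic("apple", {APPLE_PSI}, ["price list"])
orchard = Topic("pomme", {APPLE_PSI}, ["harvest calendar"])
# A different subject with a similar name (hypothetical identifier).
company = Topic("Apple", {"http://psi.example.org/#apple-inc"})

merged = merge(grocer, orchard)
print(same_subject(grocer, orchard))  # True
print(same_subject(grocer, company))  # False
print(merged.occurrences)
```

The applications never need to "know" what an apple is: the decision rests entirely on the shared identifier, while the names ("apple", "pomme", "Apple") play no part in it.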
1. The picture in this example seems only slightly different from the previous one. A minor difference is that the subject is the fruit here, and the fruit tree there. But the major differences are the publisher's statement of purpose, the disambiguation of the subject, and stability.
2. The topic maps literature has coined the acronym "PSI", used in the XTM 1.0 specification. Note that it can be expanded both as "published subject indicator" and as "published subject identifier". Those are two faces of the concept, one looking towards humans (the indicator), and one looking towards computers (the identifier). Like Janus Bifrons over Roman doors, PSIs are guarantors of good communication between two universes.

3 - Requirements and Recommendations for PSIs

The following are the basic requirements and recommendations for PSIs.

Requirement 1 :
Requirement 2 :
It has been widely
discussed whether URNs could be used as PSIs, or only URLs. Although general
best practice will certainly use URLs, URNs are not completely ruled out
as PSIs, provided the publisher defines some resolution mechanism,
to conform to Requirement 2.

Recommendation 1 :
Recommendation 2 :
Machine-processable
metadata is recommended so that applications can use more information
on the subject than URI identification alone. Human-readable as
well as machine-processable metadata can be included in the Subject Indicator
itself (e.g. RDF metadata), or in a separate resource referenced from
the Subject Indicator (e.g. XTM metadata).

Recommendation 3 :
Consistency between human-readable and machine-processable metadata is the guarantee of consistent "interpretation" by applications and humans. This can be achieved, for example, by human-readable metadata being an expression of machine-processable metadata. This issue will be addressed by Deliverable 2.

Recommendation 4 :
This statement of purpose has to be clearly endorsed by the publisher (see below).

Recommendation 5 :
Publisher is to be
understood here in its Dublin Core definition: "An entity responsible
for making the resource available."
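
The comments above on metadata, statement of purpose, and publisher can be pictured as one small record. The following Python sketch is an assumption-laden illustration, not a format defined by this Technical Committee: the `PublishedSubjectIndicator` class, its field names, and the sample metadata values are all hypothetical. It shows one way the human-readable text could be "an expression of" the machine-processable metadata, by generating the former from the latter.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PublishedSubjectIndicator:
    """Hypothetical record for one PSI; field names are illustrative."""
    identifier: str            # the published subject identifier (a URI)
    publisher: str             # entity responsible for making the resource available
    statement_of_purpose: str  # publisher's declaration that this is a PSI
    metadata: dict[str, str] = field(default_factory=dict)  # machine-processable

    def human_readable(self) -> str:
        """Render the human-readable text from the machine-processable data."""
        lines = [f"Subject identifier: {self.identifier}"]
        lines += [f"{key}: {value}" for key, value in self.metadata.items()]
        lines.append(f"Publisher: {self.publisher}")
        lines.append(f"Purpose: {self.statement_of_purpose}")
        return "\n".join(lines)


# Sample values are invented for illustration, based on the fictitious
# Fruit.Org example used earlier in this document.
apple = PublishedSubjectIndicator(
    identifier="http://psi.fruit.org/#apple",
    publisher="Fruit.Org",
    statement_of_purpose="Stable resource intended to be used as a PSI",
    metadata={"name": "apple", "description": "the fruit of the apple tree"},
)
print(apple.human_readable())
```

Because the human-readable view is derived from the same fields an application would read, the two cannot drift apart, which is the kind of consistency the comment on Recommendation 3 describes.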