OASIS
1. Definitions, Requirements and Examples
OASIS Topic Maps Published Subjects TC
Deliverables
Version 0.2 - last updated 2002, August 20 (Previous version)
Preliminary
Editor's Note:
1 - Statement of Purpose

The first and main target of this Technical Committee's recommendations is therefore topic maps interoperability, through efficient definition and identification of the subjects represented by topics in topic maps. Both the identification of subjects by applications and the definition of subjects for their human users can be provided by stable, network-retrievable resources, made available under a trustworthy publication process, as defined by the present and following recommendations. Subjects defined and identified in such a way are called published subjects.

2 - A gentle introduction to Published Subjects terminology

A subject can be an individual, like "Isaac Newton", "IBM, Inc.", or "Paris (France)", or a class of such individuals, like "famous scientists", "software companies", or "towns", or a more abstract concept like "gravitation", "economic growth", or "baroque style". In short, a subject can be anything deserving to be identified, named, represented, and generally talked about: in other words, a subject of conversation. The Topic Maps specification XTM 1.0 proposes an extremely general definition of a subject.

Definition 1
A subject is anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever.

Applications deal with a subject by handling a formal representation or proxy, which is called a topic throughout this document, to conform to topic maps terminology. A topic is a representation, inside an application, of a unique, well-defined, and unambiguous subject.

2.2 - Subject Indicators and Subject Identifiers

Interoperability
of applications sharing information in the same network requires that authors, users, and applications have ways to agree whether two or more topics, in the same or different applications, represent the same subject. This agreement has to be effective in both human-to-application and application-to-application transactions.

2.2.1 - Indication of the subject for humans: Subject Indicators

An application can indicate to human users what the subject of a topic is by referring to a document, or any other kind of network-retrievable resource, where the subject appears to be defined, described, or at least indicated in a human-readable and unambiguous way. Such a resource is called a subject indicator. Provided with this resource, human users will be able to know what subject the topic represents. Whenever applications are considered media for human transactions, subject indicators provide a common reference to the human users connected through the application.

Definition 3
A subject indicator is a network-retrievable resource that is referred to by an application, to provide an unambiguous indication of the identity of a subject to a human being.

2.2.2 - Identification of the subject for applications: Subject Identifiers

While they can provide humans with subject indicators, computer applications cannot "know" what the subject "is". But they can handle identifiers (strings) that allow them to decide whether two subjects are identical or not. If the reference to a subject indicator in the network uses a URI, this URI will be the best subject identifier for applications.

Definition 4
A subject identifier is a URI that refers to a subject indicator, and provides an unambiguous identification of a subject to an application.

Subject indicator and subject identifier are therefore two faces of the same identification mechanism, the former being for humans and the latter for applications. This identification mechanism is the support for agreement on subject identity throughout the network, between applications, between users, and between applications and users.

2.2.3 - Example: Subject Identifier and Subject Indicator for the subject "Apple Tree" (Malus Domestica)

2.3 - Published Subjects

Unfortunately,
the scenario above is too simple to be sustainable. Any resource can be considered a subject indicator by being referred to as such by an application, whether or not this resource was intended by its publisher to be a subject indicator. The subject indicators and subject identifiers defined in such a process are likely to be untrustworthy and unstable. URIs and the resources they address are moving targets in the networked universe. The publishers of resources used as subject indicators might not even be aware of this use, or might simply not care about it, and are likely to leave applications and users with meaningless identifiers and indicators, if any indicator at all, without prior notice (as an example, check whether the above resource is still available on the Web, by clicking on the URI box on the image).

At that point, applications and users will be provided with published subjects, published subject indicators and published subject identifiers. They will use them as above, but the whole scenario will become more sustainable.

Definition 5
A published subject is a subject for which at least one published subject indicator is available.

Definition 6
A published subject indicator is a subject indicator that is published and maintained at an advertised address in order to facilitate interoperability of applications.

Definition 7
A published subject identifier is the URI of a published subject indicator, chosen and declared by its publisher as the URI to be used by applications to identify the published subject.

Note that "publication space" means a network of applications connected together, and of users allowed to access those applications. It can of course be as wide and open as the Web, but it can also be a more or less closed network (enterprise intranet, community portal ...). Published does not necessarily mean "public".

Example: Published Subject Identifier and Published Subject Indicator for the subject "Apple"

In the figure below,
the subject identified for the computer by the URI http://psi.fruit.org/#apple
is indicated to Isaac Newton by a dedicated resource in the Fruit Glossary,
providing him with a definition and image.
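
As a rough illustration of the application side of this mechanism, the following minimal sketch shows how two applications might decide that their topics represent the same subject by comparing subject identifiers as plain strings. The `Topic` class, the `same_subject` function, the topic names, and the orange URI are hypothetical and are not defined by this document or by XTM 1.0; only the apple URI comes from the example above. The sketch assumes exact string equality and performs no URI normalization.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A topic: an application's proxy for exactly one subject (hypothetical model)."""
    name: str
    # URIs that this application uses as subject identifiers for the topic.
    subject_identifiers: frozenset = field(default_factory=frozenset)


def same_subject(a: Topic, b: Topic) -> bool:
    """Treat two topics as representing the same subject if they share an identifier."""
    return bool(a.subject_identifiers & b.subject_identifiers)


APPLE_PSI = "http://psi.fruit.org/#apple"    # URI from the example above
ORANGE_PSI = "http://psi.fruit.org/#orange"  # assumed URI, for illustration only

# Topics held by two unrelated (hypothetical) applications.
glossary_topic = Topic("apple", frozenset({APPLE_PSI}))
recipe_topic = Topic("pomme", frozenset({APPLE_PSI}))
orange_topic = Topic("orange", frozenset({ORANGE_PSI}))

print(same_subject(glossary_topic, recipe_topic))  # True: same identifier, different names
print(same_subject(glossary_topic, orange_topic))  # False: no identifier in common
```

The application never needs to "know" what an apple is: it only compares strings, while the resource behind the URI tells the human user what the subject is.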
Notes:
1. The above picture seems only slightly different from the previous one (a minor difference is that the subject is the fruit here, and the fruit tree there). But the major differences are the publisher's statement of purpose and the user's trust.
2. The topic maps literature has coined the acronym "PSI", used in the XTM 1.0 specification. Note that it can be expanded both as "published subject indicator" and "published subject identifier". Those are two faces of the concept, one looking towards humans (the indicator), and one looking towards computers (the identifier). Like Janus Bifrons over Roman doors, PSIs are warrants of good communication between two universes.

3 - Requirements and Recommendations for PSIs

The following are the basic requirements and recommendations for PSIs.

Requirement 1 :
Requirement 2 :
It has been widely discussed whether URNs could be used as PSIs, or only URLs. Although general best practice will certainly use URLs, URNs are not completely ruled out as PSIs, provided the publisher defines some resolution mechanism, to conform to Requirement 2.
Recommendation 1 :
Recommendation 2 :
Machine-processable metadata is recommended so that applications can use more information about the subject than the URI identification alone. Human-readable as well as machine-processable metadata can be included in the Subject Indicator itself (e.g. RDF metadata), or in a separate resource referenced from the Subject Indicator (e.g. XTM metadata).
Recommendation 3 :
Consistency between human-readable and machine-processable metadata is the warrant of consistent "interpretation" by applications and humans. This can be achieved, for example, by human-readable metadata being an expression of machine-processable metadata. This issue will be addressed by Deliverable 2.
Recommendation 4 :
This statement of purpose has to be clearly endorsed by the publisher (see below).
Recommendation 5 :
Publisher is to be understood here in its Dublin Core definition:

4 - Examples

The purpose of PSIs can be stated, in a nutshell, as providing the mechanism that makes it possible to distinguish apples from oranges.

1.
Examples of Published Subject Identifiers
A (fictitious) publisher owning the domain fruit.org uses the subdomain psi.fruit.org, dedicated to Published Subjects about fruits. Various URLs can be used as Subject Identifiers for the fruit class "apple", for example:
2. Examples of Published Subject Indicators
2.1 XHTML PSI
To be delivered
2.2 RDF PSI
To be delivered