Topic Maps Published Subjects Technical Committee
Deliverable 1 - Definitions, Requirements and Examples
OASIS Topic Maps Published Subjects TC
Latest version : http://www.oasis-open.org/committees/tm-pubsubj/docs/recommendations/general.htm
Topic map terminology used in this document is consistent with ISO/IEC
13250 Topic Maps and the Standard Application Model for Topic Maps (currently
under development, see SAM editor's draft at: http://www.isotopicmaps.org/sam/sam-model/).
1 - Statement of Purpose
The principal goal of this TC is to specify how subjects should be defined by users and identified by applications so that they have unambiguous and stable identities, known as "published subjects."
2 - A gentle introduction to Published Subjects
A subject can be an individual, like "Isaac Newton", "IBM, Inc.", or "Paris (France)"; or a class of such individuals, like "famous scientists", "software companies", or "towns"; or a more abstract concept, like "gravitation", "economic growth", or "baroque style". In short, a subject can be anything that deserves to be identified, named, represented and generally talked about: in other words, a subject of conversation.
A subject is represented in an application by a "topic," as the term is used in topic map terminology. To be useful, each topic should make explicit a unique, well-defined and unambiguous subject.
2.2 - Subject Indicators and Subject Identifiers
Applications use topics as a gathering point for information that is relevant in some way to the subject that the topic represents. To ensure that all relevant information for a particular subject is gathered by a given topic, authors, users and applications should use the same topic to refer to the same subject. That common reference, the same topic representing the same subject, depends on a stable mapping of a particular subject to a given topic. Between people, that common agreement is arrived at (hopefully!) during iterative conversation. For automatic processing by computer-based applications, a less iterative process is required. Both human and automated processes benefit from a common point (a document or resource) that defines this mapping of subject to topic.
2.2.1 - Subject to Topic Mapping (for humans) : Subject Indicators
An application can indicate the subject of a topic to human users by referring to a document, or any other kind of network-retrievable resource, where the subject appears to be defined, described, or at least indicated in a human-readable and unambiguous way. Such a resource is called a subject indicator.
Provided with a subject indicator, human users should be able to know what subject the topic represents. Whenever applications are considered media for human transactions, subject indicators will provide a common reference to human users connected through the application, and agreement on the subject indicator will be used as the external expression of agreement on the identity of a subject.
2.2.2 - Subject to Topic Mapping (for applications) : Subject Identifiers
Computer applications using a topic cannot "know" what a subject "is". To reach (hopefully!) a conclusion similar to that of a human user inspecting a subject indicator, the application relies on what is known as a subject identifier. A subject identifier is a string that an application uses to compare two topics to see whether they "match." If the strings "match," the application considers the subjects of those two topics to be the same. If the strings do not "match," the subjects represented by those topics are considered to be different.
If the reference to a subject indicator in the network uses a URI, this URI will be the best subject identifier for applications. A subject identifier is a URI that refers to a subject indicator, and provides an unambiguous identification of a subject to an application.
Subject indicator and subject identifier are therefore two faces of the same identification mechanism, the former being for humans and the latter for applications. This identification mechanism is the support for agreement on subject identity throughout the network, between applications, between users, and between applications and users.
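The matching rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not part of the TC's recommendations: the `Topic` class, the `same_subject` function and the example URIs are hypothetical, and "matching" is assumed here to be plain string equality on the subject identifier.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    """Hypothetical, simplified topic: a label plus one subject identifier (a URI)."""
    name: str
    subject_identifier: str


def same_subject(a: Topic, b: Topic) -> bool:
    """Treat two topics as representing the same subject when their
    subject identifiers are equal as strings (assumed matching rule)."""
    return a.subject_identifier == b.subject_identifier


# The URIs below are placeholders, not real published subject identifiers.
apple_tree = Topic("Apple tree", "http://example.org/psi/apple-tree")
pommier = Topic("Pommier", "http://example.org/psi/apple-tree")
apple_company = Topic("Apple (company)", "http://example.org/psi/apple-company")

print(same_subject(apple_tree, pommier))        # True
print(same_subject(apple_tree, apple_company))  # False
```

Note that the topic names play no role in the comparison: two topics with different names but the same identifier are considered to represent the same subject, which is the point of relying on the identifier rather than on labels.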
2.2.3 - Example : Subject Identifier and Subject Indicator for the subject "Apple Tree" (Malus Domestica)
2.3 - Published Subjects
2.3.1 - Problems with Simple Subject Identifiers and Subject Indicators
The Subject Identifier and Subject Indicator mechanisms outlined above are deficient in two important respects:
Any resource can be considered as a Subject Identifier or Subject Indicator
Since any resource can be considered as a Subject Identifier and/or Subject Indicator, there is no mechanism to require additional information that would enhance the utility of those mechanisms. The more structured information the Subject Identifier and/or Subject Indicator mechanism allows, the less ambiguous and the more useful the resource becomes.
A second problem is that the source of the resource, i.e., the author of the topic map, may or may not intend for any particular resource to be used or even considered as a Subject Identifier or Subject Indicator. If its use in that manner was not intended by the topic map authors or they are unaware of such use, the resource could change in unpredictable ways or even disappear altogether.
The mechanisms thus far described may suffer from ambiguity, may lack sufficient
detail to be useful, and may be unstable in terms of both content and location.
An alternative to the current mechanism is outlined below.
2.3.2 - Publishers in the loop
The publication space in which such published subjects will be used is a network of connected applications and of users allowed to access those applications. It can, of course, be as wide and open as the Web, but it can also be a more or less closed network, such as an enterprise intranet or a community portal. "Published" does not necessarily mean "public".
[Note : Patrick's suggestion was to strike this whole section]
2.3.3 - Published Subject Identifier and Published Subject Indicator: Subject "Apple"
The Published Subject Identifier and Published Subject Indicator shown below differ from the first example in three very important ways:
1. The publisher states this is a Published Subject Identifier and Published Subject Indicator (intentional PSI, not accidental)
2. The subject is unambiguous
3. The resource is stable
The topic maps literature has coined the acronym "PSI", which is used in the XTM 1.0 specification. Note that it can be expanded as both "published subject indicator" and "published subject identifier". These are two faces of the same concept, one looking towards humans (the indicator) and one looking towards computers (the identifier).
3 - Requirements and Recommendations for PSIs
3.1 - Requirements for PSIs
Requirement 1 :
Requirement 2 :
Whether URNs can be used as PSIs has been discussed. The best practice is to use URLs. URNs may be used as PSIs provided that the publisher defines a resolution mechanism, so as to conform to Requirement 2.
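One way to picture such a resolution mechanism is a publisher-maintained table that maps each URN to the URL of its subject indicator. The sketch below is only an assumption about how a publisher might do this; the function `resolve_psi`, the table `URN_RESOLUTION_TABLE`, and the URNs and URLs it contains are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical publisher-defined mapping from URNs to subject indicator URLs.
URN_RESOLUTION_TABLE = {
    "urn:example:psi:apple-tree": "http://example.org/psi/apple-tree",
}


def resolve_psi(identifier: str) -> str:
    """Return a URL at which the subject indicator can be retrieved.

    http/https URLs are returned unchanged; URNs are looked up in the
    publisher's table. Anything else raises ValueError.
    """
    scheme = urlparse(identifier).scheme.lower()
    if scheme in ("http", "https"):
        return identifier
    if scheme == "urn":
        if identifier in URN_RESOLUTION_TABLE:
            return URN_RESOLUTION_TABLE[identifier]
        raise ValueError(f"no resolution defined for {identifier!r}")
    raise ValueError(f"unsupported identifier scheme: {scheme!r}")


print(resolve_psi("urn:example:psi:apple-tree"))
```

A URN without an entry in the table cannot be resolved, which illustrates why a URN alone, with no publisher-defined mechanism behind it, would not be sufficient.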
3.2 - Recommendations for PSIs
Recommendation 1 :
Recommendation 2 :
Machine-processable metadata is recommended so that applications can use more information
on the subject than solely URI identification.
Human-readable as well as machine-processable metadata can be included in the Subject Indicator
itself (e.g. RDF metadata), or in a separate resource referenced from
the Subject Indicator (e.g. XTM metadata).
Recommendation 3 :
Consistency between human-readable and machine-processable metadata is what ensures consistent "interpretation" by applications and humans. This can be achieved, for example, by making the human-readable metadata an expression of the machine-processable metadata. This issue will be addressed in a future deliverable.
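The approach mentioned above, in which the human-readable metadata is an expression of the machine-processable metadata, can be sketched as follows. This is an illustration under stated assumptions only: the field names, the `render_human_readable` function and the sample values are hypothetical and are not defined by this document.

```python
# Hypothetical machine-processable metadata for one subject indicator.
metadata = {
    "identifier": "http://example.org/psi/apple-tree",
    "name": "Apple tree",
    "description": "The tree species Malus domestica.",
    "publisher": "Example Publisher",
}


def render_human_readable(meta: dict) -> str:
    """Generate the human-readable text from the machine-processable record,
    so that the two views are derived from a single source."""
    return (
        f"Subject: {meta['name']}\n"
        f"Identifier: {meta['identifier']}\n"
        f"Description: {meta['description']}\n"
        f"Publisher: {meta['publisher']}"
    )


print(render_human_readable(metadata))
```

Because the text shown to humans is generated from the same record that applications read, a change to the record is reflected in both views at once.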
Recommendation 4 :
This statement of purpose has to be clearly endorsed by the publisher (see below).
Recommendation 5 :
Publisher is to be understood here in its Dublin Core definition: "An entity responsible for making the resource available."