OASIS Topic Maps Published Subjects Technical Committee

 

Use Case for Published Subject Registry
Mary Nishikawa - Preliminary Draft for Discussion -
November 5, 2001

These are some preliminary ideas that I would like to discuss for a Use Case for a Published Subject Directory. They are based on my understanding of Bernard Vatant's Background and Questions, pubsubj.dtd, and Published Subject examples, with some additional thoughts after reading back postings of the xtm-wg (now archived in topicmaps-comment). Of particular interest were the message thread on Registration Authorities and PSI, starting from message 151 in June 2001, and Bernard Vatant's comment "I'm thinking more and more that dc elements should be included in PSI standard content" in this recent tm-pubsubj message.

  • Dublin Core Qualifier for Subject: The Dublin Core Encoding Scheme qualifiers for Subject* (http://dublincore.org/documents/dcmes-qualifiers/) are proposed as the highest-level organizing principle of the registry, following the publishing authority and namespace for recommendations and registry, http://www.oasis-open.org/pubsubj (or some equivalent domain). The qualifiers recommended by the Dublin Core for Subject are LCSH (Library of Congress Subject Headings), MeSH (Medical Subject Headings), DDC (Dewey Decimal Classification), LCC (Library of Congress Classification), and UDC (Universal Decimal Classification). To create a pubsubj in the directory, the Pubsubj creator would need to select one of these qualifiers together with the encoding of the subject.

Note: The choice of Encoding scheme by the pubsubj creator would need to be verified. How this might be done needs to be decided.

X, Y, and Z would depend on the subject. How many levels should be used may be difficult to ascertain until actual subjects are added to the directory. For the Dublin Core examples (Bernard Vatant) we could have the following:

URI: http://www.oasis-open.org/pubsubj/LCC/ZA4050-4460/metadata/dublin_core/contributor.xml

This would be the same for all the other xml files: coverage.xml, creator.xml, date.xml, description.xml, format.xml, identifier.xml, language.xml, publisher.xml, relation.xml, rights.xml, source.xml, subject.xml, title.xml, type.xml. It may also be possible to have all contents in one base xml file and to reference each element by a fragment identifier, such as http://www.oasis-open.org/pubsubj/LCC/ZA4050-4460/metadata/dublin_core/index.xml#contributor.
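The two addressing options above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name element_uri, its parameters, and the constants are hypothetical and not part of pubsubj.dtd or any OASIS recommendation; the base URI, scheme names, path, and element names are taken from the text above.

```python
# Hypothetical sketch: build a published subject URI for a Dublin Core element.
BASE = "http://www.oasis-open.org/pubsubj"  # "or some equivalent domain"

# Encoding schemes recommended by the Dublin Core for Subject.
SCHEMES = {"LCSH", "MeSH", "DDC", "LCC", "UDC"}

# The fifteen Dublin Core elements listed in the text.
DC_ELEMENTS = [
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
]


def element_uri(scheme, classification, path, element, single_file=False):
    """Return the URI for one element, as its own file or as a fragment."""
    if scheme not in SCHEMES:
        raise ValueError(f"unknown encoding scheme: {scheme}")
    if element not in DC_ELEMENTS:
        raise ValueError(f"unknown Dublin Core element: {element}")
    prefix = f"{BASE}/{scheme}/{classification}/{path}"
    if single_file:
        # One base file, each element addressed by fragment identifier.
        return f"{prefix}/index.xml#{element}"
    # One xml file per element.
    return f"{prefix}/{element}.xml"


uri = element_uri("LCC", "ZA4050-4460", "metadata/dublin_core", "contributor")
```

Rejecting an unknown scheme here only checks that the qualifier is one of the five recommended ones; it does not verify that the chosen classification is correct for the subject, which remains an open question.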

This does not necessarily mean that the files would be located in a dublin_core directory, although they might be. It may be a better idea to have the files, or the information itself, reside in a database that can be queried.

Advantage: These organizations have huge subject databases that can be used by those who need to classify their subject. Much work has gone into the Dublin Core by many, many experts. This would help in realizing a standard that would be acceptable to knowledge communities.

Disadvantage: Since all of these choices would be made available, it might be difficult for the publisher of the published subject to know which one is best. For example, since I am not familiar with the LCC and wouldn't know whether DDC is a better choice, I would need an expert to verify that the classification I chose was the best for the subject. This would place a great responsibility on the shoulders of the experts. It may be possible to have some kind of self-registration and validation. How can we depend on the validity of the subject? Are human resources available to do the validation, or could some validation be done by an online database? This would need careful consideration by the users of the resources. Another problem would be multiple entries (non-unique subject identities). One person could register an LCC for "Dublin Core" as a subject while another person might want to register it as a DDC.
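The multiple-entries problem can be illustrated with a small check. This is a sketch only, assuming that registrations are available as (subject label, scheme, classification) tuples and that comparing case-folded labels is good enough to spot candidates; both assumptions, the function name find_multiple_entries, and the DDC value shown are hypothetical placeholders.

```python
from collections import defaultdict


def find_multiple_entries(registrations):
    """Group registrations by subject label; report subjects registered more than once."""
    by_subject = defaultdict(set)
    for label, scheme, classification in registrations:
        # Assumption: labels that differ only in case or padding name the same subject.
        by_subject[label.strip().lower()].add((scheme, classification))
    return {
        label: sorted(entries)
        for label, entries in by_subject.items()
        if len(entries) > 1
    }


registrations = [
    ("Dublin Core", "LCC", "ZA4050-4460"),
    ("dublin core", "DDC", "ddc-placeholder"),  # placeholder, not a real DDC number
    ("Topic Maps", "LCC", "lcc-placeholder"),   # placeholder, not a real LCC range
]
duplicates = find_multiple_entries(registrations)
```

A check like this can only flag candidates; deciding which entry should stand would still fall to whoever validates the subject.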

  • Intellectual Property Identity: It may be necessary to assure the publishers of pubsubj that the ownership of their resources is protected from abuse. You may not wish your resources to appear in some directories. What would stop anyone from adding your pubsubj to their topic map? There should be a clear identification of the source and an agreement to protect the IP of the resources in the subject. A first step toward preventing such abuse is to append an IP reference to the resource. This would not be required, but it should be made available.

  • Security Level Identity: The security level could be defined but optional. There are some governmental or corporate organizations that could benefit from this. A private registry could be set up for various levels of confidentiality for corporate or governmental use. Once the content is made public, it could then be placed into the public published subject registry.

Use Case for Published Subject

 

Details for Use Case Diagram

(The definitions of creator, modifier, validator, and user reflect my understanding of their use in pubsubj.dtd. This needs to be corrected or clarified by Bernard Vatant.) A validator of the user's uriReference may need to be added to the DTD.

TM Authors: Authors of topic maps (ISO 13250 Topic Maps, XML Topic Maps (XTM) 1.0 Specification, or XML Schema for ISO 13250 Topic Maps Draft) could use a published subject URI from the registry as the id attribute of the topic or AbstractTopic element of their topic maps (I am not so sure how this would work, but I would like it to be thought about). I came across much discussion about having one and only one unique identifier for a subject. This seems to be an important requirement of this group.

TM Engines: These are the engines used to merge maps whose topics have the same published subject URI as the id, as described above. There would then be no doubt that the subjects were in fact the same (I am being optimistic here).
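As a rough illustration of what such an engine would do, the following Python sketch merges topics from several maps whenever they carry the same published subject URI. The Topic class and merge_maps function are hypothetical and greatly simplified; they do not follow the merging rules of any particular topic map specification or product, and the example names and occurrence URLs are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    subject_uri: str                       # published subject URI from the registry
    names: set = field(default_factory=set)
    occurrences: set = field(default_factory=set)


def merge_maps(*topic_maps):
    """Merge topics across maps: same published subject URI means same subject."""
    merged = {}
    for topic_map in topic_maps:
        for topic in topic_map:
            target = merged.setdefault(topic.subject_uri, Topic(topic.subject_uri))
            target.names |= topic.names
            target.occurrences |= topic.occurrences
    return merged


uri = "http://www.oasis-open.org/pubsubj/LCC/ZA4050-4460/metadata/dublin_core/contributor.xml"
map_a = [Topic(uri, {"Contributor"}, {"http://example.org/a"})]
map_b = [Topic(uri, {"Contributeur"}, {"http://example.org/b"})]
merged = merge_maps(map_a, map_b)
```

The merge is only as trustworthy as the registry: two authors who pick different URIs for the same subject would still end up with two topics.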

Portals: The portals could be general, corporate, or governmental. They could use the merged maps for extensive indexing and search of subjects, providing a service to the public or to a vertical industry.

The Pubsubj Creator (creator in creation of pubsubj.dtd) is the primary publisher of the subject.

The Pubsubj Modifier (pubsubj creator in modification of pubsubj.dtd) is anyone who has the authority to update either the primary subject file or an appended uriReference. This is usually the creator, but not always.

The Pubsubj Validator (pubsubj creator in validation of pubsubj.dtd) might be the creator, or some validating organization or system that is authorized to validate the subject.

Pubsubj User (user in pubsubj.dtd) is someone who appends a uriReference to the published subject. The user must be the owner of that information, or have permission from the owner to add it to the subject.

The Validator of a uriReference is someone or some system that can accept and approve the appending of the uriReference to the published subject by the User. It may be the user or someone else who can do the verifying.

Note: In most cases it is a human validator, but let's allow the possibility for some checking by a system to "look up" the information and verify it.
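The User and Validator roles described above could be sketched as follows. This is a minimal illustration, assuming a validator can be modelled as a function that accepts or rejects a (user, uriReference) pair; the class, function names, and the ownership table are hypothetical and do not come from pubsubj.dtd.

```python
from dataclasses import dataclass, field


@dataclass
class PublishedSubject:
    uri: str
    creator: str
    references: list = field(default_factory=list)  # approved (user, uriReference) pairs


def append_reference(subject, user, uri_reference, validator):
    """Append a uriReference only if the validator approves it."""
    if validator(user, uri_reference):
        subject.references.append((user, uri_reference))
        return True
    return False


# A system validator that "looks up" ownership in a table (hypothetical data).
KNOWN_OWNERS = {"http://example.org/my-resource": "alice"}


def lookup_validator(user, uri_reference):
    return KNOWN_OWNERS.get(uri_reference) == user


subject = PublishedSubject("http://www.oasis-open.org/pubsubj/example", "creator-org")
accepted = append_reference(subject, "alice", "http://example.org/my-resource", lookup_validator)
rejected = append_reference(subject, "bob", "http://example.org/my-resource", lookup_validator)
```

A human validator would fit the same shape: the function would simply record a person's decision instead of consulting a table.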