         
       
        OASIS Topic Maps Published Subjects TC 
        Deliverables  
        1. Documentation of Published Subjects - Requirements and Recommendations 
       
        Version 0.4 - updated March 2, 2002 
        Latest version: http://www.oasis-open.org/committees/tm-pubsubj/docs/recommendations/psdoc.htm 
        Editor: Bernard Vatant 
         
         
          
          Status of this document: Working Draft 
      This recommendation addresses the "shall, should, may" of: 
      Documentation of Published Subjects - TC Process Requirements  
       
       
        
        1 - Statement of Purpose 
         
        The OASIS Topic Maps Published Subjects Technical Committee has been set 
        up to support the application of the Topic Maps specification ISO 
        13250, by providing recommendations for the documentation, management 
        and use of published subjects. The general rationale is that topic map 
        interoperability needs unambiguous definitions of subjects (represented 
        by topics), which should be provided by stable resources made available 
        online through a trustable publication process. 
         
        Those resources, organised in published subject documentation sets, 
        will provide 
        both published subject indicators (human-understandable non-ambiguous 
        definition of subjects) and published subject identifiers (stable 
        URIs fit for computer processing, topic maps interoperability and merging, 
        and many other foreseeable semantic applications).  
      The purpose of this document is to provide recommendations for the content 
        and structure of published subject documentation sets. Those recommendations 
        are aimed at publishers of ontologies, classifications, taxonomies, thesauri, 
        registries, catalogues, databases ... to provide those publishers with 
        efficient ways to make their legacy available as published subject documentation, 
        and therefore usable by topic maps and other semantic applications. 
         
        2 - A gentle introduction to Published Subjects  
      A main and original feature of topic maps is that they deal with subjects. 
        A subject can be a unique individual object, like "Isaac Newton", 
        "IBM, Inc.", or "Paris (France)" ... or a class of 
        such individuals, like "famous scientists", "software companies" 
        or "towns" ... or a more abstract concept like "gravitation", 
        "economic growth" or "baroque style" ... 
        In a nutshell, a subject can be anything deserving to be 
        identified, named, represented and generally talked about - in short, 
        whatever can be a subject of conversation.  
      
        -  
          A subject is anything whatsoever, regardless of whether it exists 
          or has any other specific characteristics, about which anything whatsoever 
          may be asserted by any means whatsoever.
 
       
      How do topic maps deal with a subject? First they represent a subject 
        formally as an abstract "topic". In XTM documents, a topic is 
        represented by a <topic> XML element. A topic should represent a 
        unique, well-defined, non-ambiguous subject. So far, so good, at least 
        in the mind of a single topic map author. But topic map applications 
        dream of interoperability. That means that topic map authors, users, 
        and the computer applications dealing with them must have ways to know whether 
        two or more topics in the same or different topic maps represent the 
        same subject. 
         
        How can that be achieved? A topic map author can indicate what 
        the subject of a topic is by referring to a document, or any other kind 
        of resource, where the subject appears to be defined in a proper 
        and non-ambiguous way. Such a resource will therefore be considered by 
        the topic map author as a subject indicator. Provided with this 
        resource, a human being will be able, hopefully, to know what subject 
        this topic represents.  
      
        - A 
          subject indicator is a resource that is referred to by the topic map 
          author to provide an unambiguous indication of the identity of a subject. 
          Any resource can become a subject indicator by being referred to as 
          such from within some topic map, whether or not it was intended by its 
          publisher to be a subject indicator. 
 
           
       
      Since topic maps live in the Web universe, the subject indicator has to be an 
        addressable (network-retrievable) resource. The reference to the subject 
        indicator will therefore use some URI, which will both address the 
        subject indicator and identify the subject. Computer applications 
        will of course be happy to handle this subject identifier, since 
        two topics with the same subject identifier clearly refer to the same 
        subject indicator, and therefore represent the same subject.  
      
        - A subject identifier 
          is a URI used by a topic map author to identify and refer to a subject 
          indicator. 
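
        The rule that two topics sharing a subject identifier represent the same 
        subject can be illustrated with a minimal sketch. It assumes each topic is 
        reduced to a local id plus the set of subject identifier URIs it declares; 
        the topic ids, the function name and the example URIs under 
        http://psi.organization-foo/bar/ are hypothetical and only serve the illustration. 

```python
from collections import defaultdict


def group_topics_by_identifier(topics):
    """Return the subject identifiers shared by two or more topics.

    `topics` maps a (hypothetical) topic id to the set of subject
    identifier URIs that the topic declares.
    """
    by_identifier = defaultdict(set)
    for topic_id, identifiers in topics.items():
        for uri in identifiers:
            by_identifier[uri].add(topic_id)
    # Topics listed under the same URI are candidates for merging,
    # because they are taken to represent the same subject.
    return {uri: ids for uri, ids in by_identifier.items() if len(ids) > 1}


# Illustrative data only: three topics from two topic maps.
topics = {
    "map-a#newton": {"http://psi.organization-foo/bar/isaac-newton"},
    "map-b#t42": {"http://psi.organization-foo/bar/isaac-newton"},
    "map-b#t43": {"http://psi.organization-foo/bar/gravitation"},
}
shared = group_topics_by_identifier(topics)
print(shared)
```

        Here only the first two topics are reported as representing the same 
        subject, since they are the only ones that use the same URI. 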
 
       
      Unfortunately, the above scenario as a whole is too simple to be sustainable. 
        Subject indicators and subject identifiers defined only from the topic map 
        author's end are likely to be untrustworthy and unstable. URIs and the resources 
        they address are moving targets in the Web universe. The publishers of 
        resources used as subject indicators might not even be aware of it, and 
        are likely to leave topic map authors with meaningless identifiers and 
        indicators, if any indicator at all, without prior notice. 
         
        Here the publishers enter the loop. If some publishers are aware of 
        the whole problem, and want to provide topic map applications with stable, 
        trustable, authoritative subject indicators and identifiers, the situation 
        is far better. The publishers can provide sets of subject indicators and 
        subject identifiers in a stable way, and declare their intention to keep 
        them stable and trustable for topic maps and other applications. At that 
        point, topic map authors are provided with published subjects, 
        defined in published subject documentation sets, coming along with 
        published subject indicators and published subject identifiers. 
        They will use them as before, but the whole scenario will become really 
        sustainable. 
      
        - A 
          published subject is a subject for which there exists at least one published 
          subject indicator.
 
        - A 
          published subject indicator is a subject indicator that is published 
          and maintained at an advertised address for the purpose of facilitating 
          topic map interchange and mergeability.
 
        - A published 
          subject identifier is the canonical URI of a published subject indicator, 
          chosen and declared by its publisher as the URI to be used within topic 
          maps to identify the published subject.
 
        - A 
          published subject documentation set - PS DocSet - is the complete set 
          of documentation about a set of published subject indicators and identifiers, 
          as published by its publisher. 
 
       
      The topic maps literature has been using the acronym "PSI" for over a year. 
        Note that it can expand to both "published subject indicator" 
        and "published subject identifier". Those are two faces of the same 
        concept, one looking at humans (the indicator), and one looking at computers 
        (the identifier). 
        Like Janus Bifrons over Roman doors, PSIs are guarantors of good 
        communication between two universes ... 
         
        3 
        - Glossary  
      The following terms and concepts will be used in this document and further 
        TC recommendations.  
        Some of them are already defined and used by ISO 13250. Nevertheless, 
        the TC proposes some modifications to clarify some of them and their relationships 
        with new ones, and will send those proposals to ISO JTC1/SC34 for relevant 
        revision and extension of ISO 13250 terminology. Both the current ISO 13250 
        definition and the PubSubj TC proposal are given when necessary. 
         
        "Publisher" is used throughout in the sense defined in Dublin 
        Core metadata (dc:publisher) 
        "Resource" is used throughout in the sense of "network-retrievable 
        resource" (IETF) or "addressable resource" (ISO 13250) 
      
        - subject 
          
 
           
           as defined by ISO 13250 XTM  
          A subject is anything whatsoever, regardless of whether it exists or 
          has any other specific characteristics, about which anything whatsoever 
          may be asserted by any means whatsoever. 
       
      
        - subject 
          indicator 
 
           
          as defined by ISO 13250 XTM  
          A resource that is intended by the topic map author to provide a positive, 
          unambiguous indication of the identity of a subject.  
           
          definition proposal  
          A resource that is referred to by the topic map author to 
          provide an unambiguous indication of the identity of a subject. Any 
          resource can become a subject indicator by being referred to as such 
          from within some topic map, whether or not it was intended by its publisher 
          to be a subject indicator.  
          See "published 
          subject indicator" 
       
      
        - subject identifier
 
           
          definition proposal  
          A URI used by a topic map author to identify and refer to a subject 
          indicator. When a subject identifier is declared by a publisher, in 
          a published subject documentation set, to identify a published subject 
          indicator, it is called a published subject identifier. 
       
      
        - published 
          subject
 
           
          definition proposal 
           
          A subject for which there exists at least one published subject indicator. 
           
       
      
        - published 
          subject indicator 
          - PS Indicator
 
           
          as defined by ISO 13250 XTM 
          A subject indicator that is published and maintained at an advertised 
          address for the purpose of facilitating topic map interchange and mergeability. 
       
      
        - published 
          subject identifier - 
          PS Identifier
 
             
          definition 
          proposal  
          The canonical URI of a published subject indicator, chosen and declared 
          by its publisher as the URI to be used within topic maps to identify 
          the published subject.  
       
      
        - published 
          subject documentation set - PS DocSet
 
           
           definition 
          proposal 
          The complete set of documentation about a set of published subject indicators 
          and identifiers, as published by its publisher. 
       
      4 
        - Requirements for PS DocSet content 
         
        A PS DocSet shall contain at least the following mandatory elements:  
      
        - Statement of 
          Purpose
 
        - Statement of 
          PS DocSet structure and format
 
        - PS DocSet metadata
 
        - Homogeneous 
          PSI set
 
       
      4.1 
        - Statement of Purpose 
         
        A PS DocSet shall include the following formal statement from its publisher, 
        making explicit its conformance to this recommendation and its intention 
        to keep the documentation trustable and its URIs stable. 
      This namespace "http://psi.organization-foo/bar/" is dedicated by 
        its publisher, "organization-foo",  
        to host a permanent and stable Published Subject Documentation Set,  
        in conformance with the Requirements and Recommendations of the OASIS Topic Maps 
        Published Subjects Technical Committee:  
        http://www.oasis-open.org/committees/tm-pubsubj/docs/recommendations/psdoc.htm 
         
      4.2 
        - Statement of PS DocSet structure and format 
          
          
      4.2.1 
        - A single URI shall be used both to identify the PS DocSet, and to 
        provide a namespace for its PS Identifiers. 
        All PS DocSet elements shall be identified by URIs belonging to that namespace. 
         
         
       
        Remark: The wording of the above certainly needs improvement. 
        What is intended is that if the PS DocSet is identified by http://psi.organization-foo/bar/ 
         
        ... then all PSIs in the PS DocSet shall be identified by URIs like http://psi.organization-foo/bar/unameit 
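
        One possible reading of requirement 4.2.1 is a simple prefix test: a PS 
        Identifier belongs to the namespace if it extends the PS DocSet URI with a 
        non-empty local part. The sketch below assumes that reading; the function 
        name is hypothetical, and a stricter interpretation (for example one based on 
        URI normalisation) would need more than string comparison. 

```python
def in_docset_namespace(docset_uri: str, ps_identifier: str) -> bool:
    """Check that ps_identifier extends docset_uri with a non-empty local part.

    This is a plain string-prefix test, which is an assumption about how
    "belonging to the namespace" should be understood.
    """
    return (
        ps_identifier.startswith(docset_uri)
        and len(ps_identifier) > len(docset_uri)
    )


DOCSET_URI = "http://psi.organization-foo/bar/"

print(in_docset_namespace(DOCSET_URI, "http://psi.organization-foo/bar/unameit"))
print(in_docset_namespace(DOCSET_URI, "http://psi.organization-foo/other/unameit"))
```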
       
        4.2.2 
        -  A PS DocSet shall provide an explicit declaration of its structure, 
        format and syntax. 
      
        -  
          Syntax used for all PS Identifiers shall follow a consistent and 
          declared format throughout the PS DocSet. 
 
        -  
          Syntax 
          and structure used for all PS Indicators shall be uniform and declared 
          explicitly, by reference to some XML DTD, Schema or any other equivalent 
          structure definition.
 
           
       
      4.3 
        - PS DocSet metadata 
         
        A PS DocSet shall include the following mandatory Dublin Core metadata. 
      
        - Type 
          (dc:type)
 
          The declaration of the resource as a PS DocSet, by reference to a core 
          PSI for PS DocSet 
        - Identifier 
           
          (dc:identifier)
 
          The canonical PS DocSet URI namespace  
        - Subject 
          (dc:subject)
 
          A declaration of the general PS DocSet subject, domain or scope 
        - Publisher 
          (dc:publisher)
 
          The publisher is the legal authority appearing in the Statement of Purpose 
        - Language 
          (dc:language)
 
          The default language of publication used by PS Indicators 
        - Format 
          (dc:format)
 
          The format, language or syntax in which the PS Indicators are expressed 
        - Date 
          (dc:date)
 
          Date of publication, latest validation or revision 
       
       It may also include the following optional Dublin Core metadata. 
         
      
        - Title 
          (dc:title)
 
          A usage name or title for the PS DocSet 
        - Description 
          (dc:description) 
 
          Complementary relevant information not contained in (dc:subject) element 
           
        - Creator 
          (dc:creator) 
 
        - Contributor 
          (dc:contributor)
 
        - Conditions 
          of use (dc:rights)
 
        - Source 
          (dc:source)
 
        - Coverage 
          (dc:coverage) 
 
       
       In addition to this metadata, the PS DocSet may include various recommendations 
        for use, a list of registered users, or any other relevant information item. 
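
        The mandatory and optional elements listed in 4.3 lend themselves to a 
        mechanical check. The following sketch assumes the PS DocSet metadata has 
        already been read into a dictionary keyed by the "dc:" element names; that 
        representation, the function name and all example values are hypothetical. 
        In particular, this document does not give the core PSI for PS DocSet, so 
        the dc:type value below is a placeholder. 

```python
MANDATORY = (
    "dc:type", "dc:identifier", "dc:subject", "dc:publisher",
    "dc:language", "dc:format", "dc:date",
)
OPTIONAL = (
    "dc:title", "dc:description", "dc:creator", "dc:contributor",
    "dc:rights", "dc:source", "dc:coverage",
)


def check_docset_metadata(metadata):
    """Return (missing mandatory elements, optional elements present)."""
    missing = [key for key in MANDATORY if not metadata.get(key)]
    optional_present = [key for key in OPTIONAL if metadata.get(key)]
    return missing, optional_present


# Illustrative values only; dc:date is deliberately left out.
metadata = {
    "dc:type": "http://psi.example.org/hypothetical/ps-docset",  # placeholder
    "dc:identifier": "http://psi.organization-foo/bar/",
    "dc:subject": "Example subject domain",
    "dc:publisher": "organization-foo",
    "dc:language": "en",
    "dc:format": "XTM",
    "dc:title": "Example PS DocSet",
}
missing, optional_present = check_docset_metadata(metadata)
print("missing:", missing)
print("optional present:", optional_present)
```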
      4.4 
        - Homogeneous 
        PSI Set  
          
       4.4.1 
        - Every PS Indicator in a PS DocSet shall be identified by, and retrievable 
        through, a unique canonical URI. 
        This canonical URI is the corresponding PS Identifier, 
        uniquely defined in the PS DocSet namespace. 
       4.4.2 
        - Throughout a PS DocSet, all PS Indicators shall follow the same formal 
        structure, as declared in 4.2.2 
      4.4.3 
        - A PS Indicator shall include at least the following Dublin Core elements: 
      
        - Identifier 
          (dc:identifier) 
 
          The canonical URI that shall be used as the PS Identifier.  
          This URI shall be unique, and defined in the PS DocSet namespace. 
        - Language 
          (dc:language)
 
          Language in which subject, type, and description are expressed - if 
          different from the default PS DocSet language. 
        - Subject 
          (dc:subject)
 
          A name given to the subject that is identified by the PS Identifier. 
           
          This name shall be unique in the PS DocSet namespace, in a given language 
          scope.  
        - Type 
          (dc:type)
 
          A class of which the subject is an instance. This class should itself 
          be defined by a PSI. 
           
        -  
          Description  (dc:description)
 
          Text, image or any kind of relevant resource, describing the subject 
          in a non-ambiguous, human-understandable way. 
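
        The constraints of 4.4 can be checked together over a whole PSI set. The 
        sketch below assumes each PS Indicator is available as a dictionary of "dc:" 
        elements, treats dc:language as falling back to the PS DocSet default when 
        absent, and uses a string-prefix test for namespace membership. The function 
        name and the sample indicators are hypothetical. 

```python
REQUIRED = ("dc:identifier", "dc:subject", "dc:type", "dc:description")


def check_indicators(docset_uri, default_language, indicators):
    """Return a list of human-readable problems found in a PSI set."""
    problems = []
    seen_ids = set()
    seen_names = set()  # (language, name) pairs already used
    for indicator in indicators:
        ident = indicator.get("dc:identifier", "")
        label = ident or "<no identifier>"
        for key in REQUIRED:
            if not indicator.get(key):
                problems.append(f"{label}: missing {key}")
        if ident:
            if not ident.startswith(docset_uri):
                problems.append(f"{ident}: outside the PS DocSet namespace")
            if ident in seen_ids:
                problems.append(f"{ident}: duplicate identifier")
            seen_ids.add(ident)
        name = indicator.get("dc:subject")
        if name:
            # The name must be unique within a given language scope.
            language = indicator.get("dc:language", default_language)
            if (language, name) in seen_names:
                problems.append(
                    f"{label}: duplicate name {name!r} in language {language!r}"
                )
            seen_names.add((language, name))
    return problems


# Illustrative data: the second indicator lacks a description and reuses
# the name "gravitation" in the same language scope.
indicators = [
    {
        "dc:identifier": "http://psi.organization-foo/bar/gravitation",
        "dc:subject": "gravitation",
        "dc:type": "http://psi.organization-foo/bar/concept",
        "dc:description": "The attraction between masses.",
    },
    {
        "dc:identifier": "http://psi.organization-foo/bar/gravity",
        "dc:language": "en",
        "dc:subject": "gravitation",
        "dc:type": "http://psi.organization-foo/bar/concept",
    },
]
problems = check_indicators("http://psi.organization-foo/bar/", "en", indicators)
for problem in problems:
    print(problem)
```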
           
       
      5 
        - Recommended 
        Syntaxes, and examples of PS DocSet 
         
         
        Given the considerable legacy of taxonomies, classifications, ontologies, 
        databases and catalogues likely to be made available as PS DocSets, their 
        publishers should not be constrained to use any specific structure or 
        syntax. 
         
        Therefore, the present recommendation will not impose on publishers 
        either a unique standard structure for PS DocSets, or a specific syntax 
        for PSIs. Nevertheless, it will recommend best practices for a certain 
        number of existing relevant syntaxes, listed below. This list does not 
        claim to be exhaustive, and does not preclude any other present or future 
        format and structure that would fit the requirements expressed in section 
        4.  
      5.1 
        - Recommendations for PS DocSet using XTM 
         
         
        Draft Proposals submitted to TC 
      
      5.2 
        - Recommendations for PS DocSet using 
        RDF  
         
        To be delivered 
      5.3 
        - Recommendations for PS DocSet using XHTML 
         
        To be delivered 