Subject: Proposal for alphanumeric version identifiers in XRI metadata spec

Following is a proposal for discussion on today's XRI TC telecon. First, a little background.

The current Committee Draft spec of XRI Metadata 2.0 (at http://www.oasis-open.org/committees/download.php/11854/xri-metadata-V2.0-cd-01.pdf) established four initial categories of XRI metadata: language, datetime, version, and annotation. In the new Working Draft under preparation by Marty Schleiff and myself (and Dave McAlpin when he is available), we are deprecating one of these four categories (annotation, per a message to the list last month), and adding one new category (identifier type metadata -- $t -- also discussed extensively last fall).

Of the other three, language will remain largely unchanged (except for the newer, less ambiguous ABNF). However, we are proposing some changes to datetime and version. This message will summarize the proposed changes to version metadata (which I suspect will take up today's call time). Once we finish that discussion, a subsequent message/discussion will summarize datetime.

*** PART ONE: PROPOSAL FOR MASTER ABNF ***

First, here's the proposed "master ABNF" that now governs all XRI metadata:

    xri-metadata-exp = "$" metadata-tag [ "*" metadata-subtag ] "*" target
    metadata-tag     = alpha / xref
    metadata-subtag  = 1*xri-pchar / xref
    target           = *xri-pchar / xref

As explained in an earlier message several months ago, this is based on the basic RDF subject-predicate-object pattern as follows:

    Subject   = target
    Predicate = metadata-tag
    Object    = metadata-subtag

Note that in the master ABNF, a metadata-subtag value is OPTIONAL. Since a specific metadata tag can only further restrict and not loosen this ABNF, that means some types of metadata tags may omit the subtag value (i.e., declare a default subtag value) and others may require a subtag. Of the four categories of metadata, $d datetime and $v version have defaults and $l language and $t type do NOT have defaults and thus require a subtag.

This ABNF also means all XRI metadata is now defined as describing the target identifier included in the cross-reference containing the metadata (which Marty and I are calling the "metadata expression"). The semantics of this description are defined unambiguously in the spec with regard to the target identifier; however, the interpretation of the metadata expression as a whole (meaning relative to the parent node(s) or child node(s) in the XRI) is left to the authority for the XRI. For instance, in the following examples...

    xri://(example.root)*delegate/($l*fr*mot)/resource
    xri://(example.root)*delegate/resource*($v*2.1)
    xri://(example.root)*delegate/resource*($d*2000-01-12T12:13:14Z)

...the metadata "$l*fr" describes the identifier "mot", the metadata "$v" describes the identifier "2.1", and the metadata "$d" describes the identifier "2000-01-12T12:13:14Z".

*** PART TWO: PROPOSAL FOR VERSION METADATA ABNF ***

Now, given this master, here's the proposed ABNF for version metadata:

    xri-version-exp = "$v" [ "*" ver-subtag / xref ] "*" ver-target

In CD01, a version tag could not include a subtag -- there was only one version identifier format (numeric). This new ABNF allows version metadata to include a subtag value; however, because it is OPTIONAL, there is a default value when a subtag is not present.

Marty and I originally planned to define several version subtags -- for example, one for numeric, one for alpha, and one for alphanumeric -- and specify one as the default. However, we realized that if it was done right, we could just define alphanumeric and make it the default and be done with it, as it is a superset of both alpha and numeric version identifiers. The ABNF turns out to be very simple:

    def-ver-target  = ver-segment *( [ ver-seg-delim ] ver-segment )
    version-segment = 1*digit / alpha
    ver-seg-delim   = "." / "-"

Note that the version segment delimiters (dot or dash) are OPTIONAL.
This allows all of the following as valid version metadata expressions:

    ($v*2.1)
    ($v*2.12)
    ($v*2.12.4)
    ($v*2.1a)
    ($v*2.1ab)
    ($v*2.1a-b)
    ($v*2.1a1)
    ($v*a)
    ($v*a1)
    ($v*a12)
    ($v*a1a)
    ($v*a1ab74)
    ($v*a-1-ab.7.4)

Now, the question is, if these are all valid version metadata expressions, what are the normalization and comparison rules? This was a little tricky to figure out, but they end up being quite simple:

** Normalization Rules **

1) Normalize all alpha characters to lowercase.
2) Normalize all delimiters to dots.
3) For all sequences of alpha characters, add dot delimiters so that every alpha character becomes a single-character version-segment. Example: ($v*abc) becomes ($v*a.b.c) and ($v*2.1c5d) becomes ($v*2.1.c.5.d). This means all segments are either a single alpha character or one-or-more digit characters.
4) For all digit segments, remove leading zeros.

** Comparison Rules **

After normalization, working from left-to-right, compare each segment value according to the following four rules:

a) An alpha-segment comes before a digit-segment.
b) All alpha-segments are compared by ASCII value. The higher ASCII value is the later version value.
c) All digit-segments are compared by the integer value of the entire digit sequence. The higher integer is the later version value.
d) If all version-segment values are equivalent but one version-target has more version-segments than another, the version-target with more version-segments is the later version value.

Examples in order of earliest-to-latest:

    RAW            NORMALIZED
    ($v*02.4)      ($v*2.4)
    ($v*02.4e)     ($v*2.4.e)
    ($v*02.4ef)    ($v*2.4.e.f)
    ($v*02.4g)     ($v*2.4.g)
    ($v*02.5)      ($v*2.5)
    ($v*02.51)     ($v*2.51)
    ($v*2.52)      ($v*2.52)
    ($v*20.52)     ($v*20.52)
    ($v*21)        ($v*21)
    ($v*21a)       ($v*21.a)
    ($v*21ab)      ($v*21.a.b)
    ($v*21abc)     ($v*21.a.b.c)
    ($v*21a1)      ($v*21.a.1)
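
To make the normalization and comparison rules concrete, here is a rough Python sketch of how an implementation might apply them to the default version target (the part after "$v*"). It is only an illustration, not part of the proposal: the function names (normalize, normalized, compare) are made up for this example, and it assumes that a run of digits is always read as one segment and that "alpha" means the ASCII letters A-Z and a-z.

```python
import re
from functools import cmp_to_key

# Assumed reading of the proposed default version target: segments are
# digit runs or single ASCII letters, optionally separated by "." or "-".
_VER_TARGET = re.compile(r"(?:[0-9]+|[A-Za-z])(?:[.-]?(?:[0-9]+|[A-Za-z]))*")


def normalize(target: str) -> list[str]:
    """Apply normalization rules 1-4 and return the list of segments."""
    if not _VER_TARGET.fullmatch(target):
        raise ValueError(f"not a valid default version target: {target!r}")
    # Rule 1 (lowercase), then rules 2 and 3: dropping the delimiters and
    # splitting into digit runs and single letters is equivalent to
    # dot-delimiting every segment.
    segments = re.findall(r"[0-9]+|[a-z]", target.lower())
    # Rule 4: strip leading zeros from digit segments.
    return [str(int(s)) if s.isdigit() else s for s in segments]


def normalized(target: str) -> str:
    """Return the normalized, dot-delimited form of a version target."""
    return ".".join(normalize(target))


def compare(a: str, b: str) -> int:
    """Return -1, 0, or 1 if a is earlier than, equal to, or later than b."""
    sa, sb = normalize(a), normalize(b)
    for x, y in zip(sa, sb):
        if x == y:
            continue
        x_alpha, y_alpha = x.isalpha(), y.isalpha()
        if x_alpha != y_alpha:
            # Rule a: an alpha segment comes before a digit segment.
            return -1 if x_alpha else 1
        if x_alpha:
            # Rule b: compare single letters by ASCII value.
            return -1 if x < y else 1
        # Rule c: compare digit segments as integers.
        return -1 if int(x) < int(y) else 1
    # Rule d: with a common prefix, more segments means a later version.
    return (len(sa) > len(sb)) - (len(sa) < len(sb))


print(normalized("2.1c5d"))   # 2.1.c.5.d
print(sorted(["21a1", "02.4g", "21ab", "02.4"], key=cmp_to_key(compare)))
# ['02.4', '02.4g', '21ab', '21a1']
```

Sorting the RAW column of the examples above with compare as the ordering function reproduces the earliest-to-latest order shown there, including the case where ($v*21abc) sorts before ($v*21a1) because of rule (a).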