Subject: Processing instructions (was: [ubl] Minutes of Atlantic UBL TC call 5 April 2006)
At 2006-04-10 13:33 -0700, jon.bosak@sun.com wrote:
>MINUTES OF ATLANTIC UBL TC MEETING
>15:00 - 17:00 UTC WEDNESDAY 5 APRIL 2006
>...
> JB: Other NDR issues?
>
> PB: Version is covered by ABIE instead of attribute, and that's
> OK. But we have a requirement from uk/se/dk for a place to
> indicate which application generated an instance for debugging
> purposes. We were told in Ottawa to use a PI. GKH says we
> shouldn't formalize a PI into a standard, but this is an
> instruction for processors.
>
> JB: There doesn't seem to be another good way to do this.
I was unaware from the earlier discussion this was for debugging purposes:
At 2006-04-04 14:46 +0200, Peter Borresen wrote:
>To be able to track how the instance was generated, the following
>processing instruction SHOULD be added to each document instance:
>
><?InstanceInfo
> Creator="<person or application (including full version
> attributes) that has generated the document>"
> Created="<date and time the document was sent>"
>?>
The above example is attributing persistent information about the
instance to the document, not transient processing information. This
example isn't (in my mind) a processing directive ... it is
additional information about the instance.
Consider the standardized processing instruction for stylesheet association:
http://www.w3.org/1999/06/REC-xml-stylesheet-19990629
A processing instruction is an annotation ... it isn't (shouldn't be)
a source of additional information. The information *in* the
document does not change based on the presence of this standardized
processing instruction. Nor does the information change with the
absence of one, or with the removal of one that may have been
present. Nor can anyone associate the information in the PI with the
information in the document ... the PI is solely for a processing
application to use in its interpretation of the document.
The PI is a processing directive (hence the name: processing
instruction). The information in the example above isn't directing
anything, nor instructing anything about the document.
Furthermore, the presence of and syntax used in processing
instructions cannot be constrained by XML document modeling technologies.
Encoding the author and time stamp of the document in the PI feels
too much like using an arbitrary and unvalidatable mechanism to add
information items to the document.
However ... that's just my opinion, and if it is decided to include
such information in UBL using processing instructions, I then have
comments about the above processing instructions themselves as follows.
Note that a processing instruction does not have real attributes,
regardless of the syntax used within the processing
instruction. There are only two pieces of information: the name
token at the start (called the PI target) and the rest of the string
following the white-space that follows the target. A downstream
application is obliged to parse the information found in that
unstructured string itself. Nothing validates that the correct quoting
has been used in these pseudo-attributes, and nothing in XML provides
access to their individual values, so the processing application has
to take on the burden of parsing the string.
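To illustrate the point, here is a minimal Python sketch using the standard library's xml.dom.minidom. The instance, the application name and the time stamp are made-up placeholders, not anything agreed for UBL. It shows that a parser hands back only the target and one undivided data string, and that a badly quoted pseudo-attribute is still accepted as well-formed.

```python
from xml.dom import minidom

# Hypothetical instance; the Creator/Created values are placeholders.
DOC = (
    '<?InstanceInfo Creator="ExampleApp 1.0" Created="2006-04-04T14:46:00"?>'
    '<Invoice/>'
)

# Same PI with an unterminated quote: still well-formed XML.
BAD_QUOTING = '<?InstanceInfo Creator="ExampleApp 1.0?><Invoice/>'


def processing_instructions(xml_text):
    """Return (target, data) pairs for PIs that are children of the document node."""
    dom = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in dom.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]


if __name__ == "__main__":
    # Only two pieces of information come back: the target and the raw string.
    print(processing_instructions(DOC))
    # The parser raises no error for the broken pseudo-attribute.
    print(processing_instructions(BAD_QUOTING))
```

Any splitting of the data string into Creator and Created is left entirely to the application.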
When I've designed processing instructions, I've tried to determine
what information is standalone, and what information is tied
together. Analyzing the W3C standardized stylesheet association
processing instruction, there are several pieces of information
(href and type are mandatory; title, media, charset and alternate
are optional) that are all tied together and related. Since they are
all tied together and related, they are all in a single processing
instruction. This burdens processing applications with parsing the
PI string to find the individual pieces of information, and using
name/value pairs in the style of attributes is a meaningful way to
make this association.
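The parsing burden looks roughly like the following Python sketch. The regular expression and the function name parse_pseudo_attributes are my own illustration, not part of any standard API, and the sketch deliberately ignores character and entity references, which a complete implementation of the stylesheet association recommendation would have to expand.

```python
import re

# name = "value" or name = 'value'; a simplified, assumed pattern.
PSEUDO_ATTR = re.compile(
    r"""([A-Za-z_][\w.\-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')"""
)


def parse_pseudo_attributes(pi_data):
    """Split a PI data string into a dict of pseudo-attribute names and values."""
    result = {}
    for match in PSEUDO_ATTR.finditer(pi_data):
        name = match.group(1)
        # Group 2 holds a double-quoted value, group 3 a single-quoted one.
        value = match.group(2) if match.group(2) is not None else match.group(3)
        result[name] = value
    return result


if __name__ == "__main__":
    # Data string as it might appear in an xml-stylesheet PI (file name is hypothetical).
    print(parse_pseudo_attributes('href="invoice.xsl" type="text/xsl"'))
```

Every consumer of such a PI has to carry code along these lines, which is justified when the values only make sense together.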
But the important issue is that not only is one not obliged to use
pseudo-attribute syntax, I would also suggest that for singleton
values it is inappropriate to use pseudo-attribute syntax.
Instead of:
<?InstanceInfo
Creator="<person or application (including full version
attributes) that has generated the document>"
Created="<date and time the document was sent>"
?>
I would rather suggest two separate processing instructions, either
of which still has meaning if the other one is missing, and with no
burden on processing applications to obtain the information out of
the string value (no quotes, no parsing, just the PI value is the data value):
<?UBL-creator person-or-application-as-rest-of-string?>
<?UBL-created date-time-as-rest-of-string?>
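As a rough Python sketch of what a consumer would then do (the instance and its values are placeholders I have made up), reading these is just a matter of taking each PI's data as it stands:

```python
from xml.dom import minidom

# Hypothetical instance using the two separate PI targets suggested above.
DOC = (
    '<?UBL-creator ExampleApp 1.0?>'
    '<?UBL-created 2006-04-04T14:46:00+02:00?>'
    '<Invoice/>'
)


def pi_values(xml_text):
    """Map each document-level PI target to its data string, unparsed."""
    dom = minidom.parseString(xml_text)
    return {
        node.target: node.data
        for node in dom.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    }


if __name__ == "__main__":
    values = pi_values(DOC)
    # Either entry is usable on its own if the other PI is absent.
    print(values.get("UBL-creator"))
    print(values.get("UBL-created"))
```

There is no quoting to check and no pattern matching; the PI value is the data value.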
Alternatively, I could live with the following where a single PI
target identifies all information targeted for UBL processors and it
is easy to extract and use the initial space-delimited name token in
the processing instruction string to determine what the rest of the
string represents.
<?UBL creator person-or-application-as-rest-of-string?>
<?UBL created date-time-as-rest-of-string?>
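A minimal Python sketch of this alternative, again with made-up values and a function name (ubl_directives) of my own choosing: the first space-delimited token names the item and the remainder of the string is its value.

```python
from xml.dom import minidom

# Hypothetical instance using the single "UBL" PI target.
DOC = (
    '<?UBL creator ExampleApp 1.0?>'
    '<?UBL created 2006-04-04T14:46:00+02:00?>'
    '<?xml-stylesheet href="invoice.xsl" type="text/xsl"?>'
    '<Invoice/>'
)


def ubl_directives(xml_text):
    """Collect name -> rest-of-string for document-level PIs whose target is UBL."""
    dom = minidom.parseString(xml_text)
    result = {}
    for node in dom.childNodes:
        if node.nodeType != node.PROCESSING_INSTRUCTION_NODE:
            continue
        if node.target != "UBL":
            continue  # PIs addressed to other applications are ignored
        parts = node.data.split(None, 1)  # split once on the first white-space run
        if parts:
            result[parts[0]] = parts[1] if len(parts) > 1 else ""
    return result


if __name__ == "__main__":
    print(ubl_directives(DOC))
```

The only parsing left is one split on white-space, and the value may itself contain spaces.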
Lastly, though this isn't something that can be taken advantage of in
XSLT, to be complete according to the XML recommendation, I believe
any agreement on a processing instruction target name should include
an agreement on an associated SYSTEM and possibly PUBLIC identifier
for the NOTATION associated with the target. The target is,
according to the spec, documentary (like a namespace prefix), but it
has been given weight in XML language API interfaces because I
believe the interface designers missed this association between the
PI target and the formal identifiers.
This is defined in XML section 4.7:
http://www.w3.org/TR/2004/REC-xml-20040204/#Notations
So, in DTD speak we would then need something like:
<!NOTATION UBL SYSTEM "urn:oasis:names:specification:ubl:processing">
In W3C Schema speak it would be:
<xsd:notation name="UBL"
system="urn:oasis:names:specification:ubl:processing"/>
If we had separate PI targets for each, then it would be:
<xsd:notation name="UBL-creator"
system="urn:oasis:names:specification:ubl:processing:creator"/>
<xsd:notation name="UBL-created"
system="urn:oasis:names:specification:ubl:processing:created"/>
But seeing that just reinforces to me that these two pieces of
information requested still don't feel like processing directives to
me ... they still feel like information items ... and I don't think
they belong in processing instructions.
XML says it all: A processing instruction allows a document to
contain instructions for applications. A processing instruction
target identifies the application to which the directive is
addressed. For stylesheet association, "xml-stylesheet" is an
appropriate PI target. The target "InstanceInfo", or even
"UBL-creator" and "UBL-created", are not appropriate PI targets.
I hope this helps.
. . . . . . . . . . . . . Ken
--
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/o/
Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (F:-0995)