Subject: Re: [ubl] Examples for Proposed Single-pass Extension Validation
Thank you, Ken, for your efforts with this task. But I'm reminded of the Kobayashi Maru!

I certainly agree that the approach you put forward will report valid documents as valid. But it will not report every invalid document as invalid: there will be invalid documents that this approach reports as valid. That changes the rules of the test.

At the time the extension point was designed, the committee's thoughts about validation were centred around the body of the UBL document: any structural invalidity must be reported and the document called invalid. That philosophy was carried over to the extension point, and the various machinations I've put into PRD1 address that.

Specifically, there were two issues enunciated within the committee at the time that are not covered by your approach:

(1) - an extension that has a UBL construct as the apex element, including any CBC, any CAC and any document ABIE ... we explicitly said we didn't want users putting UBL documents under the extension point, let alone allowing something like:

    <ext:UBLExtension>
      <ext:ExtensionContent>
        <cbc:UUID>123</cbc:UUID>
      </ext:ExtensionContent>
    </ext:UBLExtension>

    ... or:

    <ext:UBLExtension>
      <ext:ExtensionContent>
        <in:Invoice>.....</in:Invoice>
      </ext:ExtensionContent>
    </ext:UBLExtension>

    - in retrospect, I realize now my approach doesn't even cover this! W3C schema won't even let my approach catch the above kind of error

(2) - an extension that has an incorrect user extension element as the apex element; consider that a user defines this extension:

    <ext:UBLExtension xmlns:corpx="urn:corpx">
      <ext:ExtensionContent>
        <corpx:CorpXItem>
          <cbc:ID>1</cbc:ID>
          <cbc:TypeCode>CR</cbc:TypeCode>
          <corpx:CorpXOtherItem>abc</corpx:CorpXOtherItem>
        </corpx:CorpXItem>
      </ext:ExtensionContent>
    </ext:UBLExtension>

    ... then the proposed scheme would not signal this as invalid:

    <ext:UBLExtension xmlns:corpx="urn:corpx">
      <ext:ExtensionContent>
        <corpx:CorpXOtherItem>abc</corpx:CorpXOtherItem>
      </ext:ExtensionContent>
    </ext:UBLExtension>

    - this is being caught by my approach as an invalid document, and this was the focus of our efforts at the time: we only wanted the apex element to be allowed as the child of the extension content

Of course one could take the viewpoint that if the *wrong* element is at the apex of the extension, that simply represents an extension that you don't recognize and so you don't care that it is wrong. You already don't care about other extensions you don't recognize, and extensions in error are simply extensions you don't recognize, so why worry?

But that wasn't the attitude at the time we designed the extension point. If the sender of the document incorrectly structured their instance (as in (2) above) and it inadvertently validated because of this new policy, they would send that incorrect document to the recipient thinking it contained important information to be processed by the recipient. The recipient would then accept the document because it didn't violate any schema constraints. The sender *thinks* he has sent extension information but the recipient *knows* the sender didn't because nothing recognizable was at the apex of the extension point.

Committee members know from years of experience I'm really quite anal about this validation stuff, but if as a group we accept a revised extension point policy in our PRD1 review, then I can certainly live with it. The design I put forward supported the policy at the time it was designed. This new approach was expressly rejected at the time because it was "too loose". If that policy proves overly burdensome, then let's change the policy and I agree I would have done it the way you've done it.
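To make cases (1) and (2) above concrete, here is a rough sketch of a check that could be run separately from schema validation, since W3C Schema cannot express it. It is illustrative only and not part of any committee deliverable. The registry `ALLOWED_APEX` and the helper names `split_tag`, `check_extension_apexes` and `wrap` are hypothetical, and the sketch assumes that every UBL namespace begins with the prefix shown.

```python
import xml.etree.ElementTree as ET

EXT_NS = "urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2"
CBC_NS = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
# Assumption: every UBL namespace starts with this prefix.
UBL_NS_PREFIX = "urn:oasis:names:specification:ubl:schema:xsd:"

# Hypothetical registry: extension namespace -> permitted apex element names.
ALLOWED_APEX = {"urn:corpx": {"CorpXItem"}}


def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def check_extension_apexes(xml_text, allowed=ALLOWED_APEX):
    """Return one message per apex element that breaks the apex policy."""
    problems = []
    root = ET.fromstring(xml_text)
    for content in root.iter("{%s}ExtensionContent" % EXT_NS):
        for apex in content:
            namespace, local = split_tag(apex.tag)
            if namespace.startswith(UBL_NS_PREFIX):
                # Case (1): a UBL construct sits directly under ExtensionContent.
                problems.append("case 1: UBL construct %s used as apex" % local)
            elif namespace in allowed and local not in allowed[namespace]:
                # Case (2): a known extension namespace, but the wrong apex element.
                problems.append(
                    "case 2: %s is not an apex element of %s" % (local, namespace)
                )
            # Any other namespace is an unrecognized extension and is skipped.
    return problems


def wrap(inner):
    """Wrap a fragment in the ext:UBLExtension scaffolding used in the examples."""
    return (
        '<ext:UBLExtension xmlns:ext="%s" xmlns:cbc="%s" xmlns:corpx="urn:corpx">'
        "<ext:ExtensionContent>%s</ext:ExtensionContent></ext:UBLExtension>"
    ) % (EXT_NS, CBC_NS, inner)
```

Calling `check_extension_apexes(wrap("<cbc:UUID>123</cbc:UUID>"))` reports a case (1) problem, while an apex element from a namespace missing from the registry is skipped, matching the "extensions you don't recognize" stance described above.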
In fact, writing this response in detail has helped me to better appreciate the generic approach we rejected years ago and you've articulated here. Its simplicity outweighs, I now think, the limited benefits that the 2006 approach brings to the table. Let's change the test.

Had RELAX NG been used, none of this would need a discussion. It would catch all invalid documents as invalid. We are wrestling with a limitation of W3C Schema: no namespace exclusion other than "other", and unique particle attribution preventing the choice using two of those "namespace exclusion" operands.

So I'm prepared to adopt this new approach for PRD2 if the technical committee agrees during the PRD1 review.

Thank you again for your efforts.

. . . . . . . . . . Ken

p.s. regarding your note, I ran your example instance through Xerces, Saxon and Altova2010 standalone validation and it was passed by all three processors

At 2010-10-29 23:50 -0400, Kenneth Vaughn wrote:
>Per my task from the last teleconference, I have created an example of
>what my proposal would look like for handling extensions in a single
>pass. The easiest way to demonstrate this is to unzip the contents of
>the attached file into your UBL PRD1 directory. You can then modify
>the "include" element in UBL-CommonExtensionComponents-2.1.xsd to
>point to either UBL-ExtensionContentDataType-2.1.klv.xsd or
>UBL-ExtensionContentDataType-2.1.klv-cust.xsd. The files should then be
>good to play with (with one note below). Descriptions for each file are:
>
>UBL-ExtensionContentDataType-2.1.klv.xsd
>This is the proposed STANDARD schema file that would be distributed.
>It defines the ExtensionContentType datatype and references the
>digital signature namespace. (Obviously, the namespace reference would
>be removed, if we moved signatures into the main body).
>If you evaluate Example-Invoice-2.1.klv.xml with the
>CommonExtensionComponents-2.1.xsd including this file, it will pass
>validation, but none of the extensions will be checked as the
>namespaces will not be recognized by the validator; instead, the
>extensions are simply skipped.
>
>
>UBL-ExtensionContentDataType-2.1.klv-cust.xsd
>This is an example customization of the standard file above. An
>implementor would customize the file to
>1) define the customization namespaces that the implementation can
>recognize and
>2) import the customization schemas containing the specific elements
>that can be included as extensions
>That is the only customization needed; and technically, the namespaces
>do not need to be declared since they are not actually referenced in
>the schema; they are only provided for readability. If you evaluate
>Example-Invoice-2.1.klv.xml with the CommonExtensionComponents-2.1.xsd
>including this file, the content of the defined CorpX and RegionY
>extensions will be checked - this can be shown by playing with the
>values in the example XML document as described in the comments
>embedded within the file.
>
>
>CorpX.xsd
>An example of a schema containing a customized extension.
>
>
>RegionY.xsd
>A second example of a schema containing a customized extension; the
>second example demonstrates that a single instance file can reference
>multiple custom extensions and they are all validated in a single pass.
>
>
>Example-Invoice-2.1.klv.xml
>The example XML document to play around with. As is, it should
>validate with either Extension Content Data Type schema; but if you
>change the extension content when using the customized Extension
>Content Data Type schema, errors will be flagged. The example also
>includes an extension from a third (undefined) namespace to
>demonstrate that this extension will be skipped over while still
>validating the recognized extensions.
>
>
>=========== NOTE NOTE NOTE ===================
>
>As near as I can tell, it SHOULD work as described above. For some
>reason, XMLSpy would not check the content of the Extensions as
>these files are written... I had to change the namespace for the UBL
>Invoice schema in both the XML instance document and the UBL
>Invoice schema for it to work as described. I spent about 3 hours
>trying to figure out why it was not checking the content only to
>stumble upon this work-around. It works if I simply change the
>namespace (i.e., the full URN) to have a 3 at the end instead of a 2...
>
>I can not figure out why this would be a problem, my working theory
>is that it is some bug within XMLSpy itself (perhaps it is stuck on the
>old definition of the namespace referring to the old Extension??? But
>I rebooted my machine and still had the problem...). In short, I was
>unable to think of any reason that the namespace would pose a real
>problem and I am suspecting that it is unique to XMLSpy (older
>version) and/or my machine - but if you do not see it checking content
>when it should, please let me know and perhaps we can figure out
>why this anomaly occurred.
>
>Actually, if it does work for you, I'd be interested in knowing as well
>and what software you are using... 3 hours is too much time to spend
>chasing down a fake problem!

--
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/o/
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Male Cancer Awareness Nov'07 http://www.CraneSoftwrights.com/o/bc
Legal business disclaimers: http://www.CraneSoftwrights.com/legal