Subject: Re: [ubl] Schematron demo
At 2005-09-07 08:25 -0400, Burnsmarty@aol.com wrote:
>Hey Ken, comments on your comments to my comments inline.
> >Hope this helps, I appreciate the dialogue,

Marty, thank you. Unfortunately, I think some of the quoting over many messages has been munged because you are attributing some comments to me that aren't from me.

>In a message dated 8/30/2005 8:27:23 A.M. Eastern Daylight Time,
>gkholman@CraneSoftwrights.com writes:
>...
>At 2005-08-30 07:53 -0400, Burnsmarty@aol.com wrote:
>...
> >UBL schemas don't have to accommodate code list extensibility.
> >Yes, I believe this is a requirement:

What you've quoted isn't something I've typed, it is something found in your message as coming from you (as I interpret it):

http://lists.oasis-open.org/archives/ubl/200508/msg00177.html

I believe it is a requirement for "type 2 schemas" (not defined by UBL or ATG), but I do not believe "type 1 schemas" (those hardwired by UBL or ATG) need to be expressed by a mechanism that allows extensibility. A two-stage validation using Schematron does address extensibility for type 2 schemas, and indeed offers trading partners a way to validate the use of a subset of any type 1 code list should they wish.

> >More Problematic:
> >The schematron tests for attribute name and therefore bypasses the
> >use of the context of the schema -- therefore it can ignore the
> >schema's constraints. For example, what if the schema designer wants
> >to constrain the values in a particular usage but not others.
> >XPath addresses this. Note in Jon's demonstration how the
>"currency-value" rule is declared to be abstract and is only made
>concrete by the use of a non-abstract rule with a context=
>attribute. The demo uses a wildcard context="*[@amountCurrencyID]"
>but could easily just as well use context="cbc:TotalTaxAmount"
>instead for a more focused context.
> >Your approach will allow the namespace identifier to be searched

I'm not sure what you mean by "namespace identifier to be searched".
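For readers following along, the abstract-rule pattern quoted above (an abstract "currency-value" rule made concrete by a rule carrying a context= attribute) might be synthesized along these lines. This is only a sketch in Python, not Jon's demonstration or Tony's stylesheet: the function name build_schema, the code values and the assertion message are hypothetical, and the namespace is assumed to be the older Schematron 1.5 one (ISO Schematron uses a different namespace).

```python
import xml.etree.ElementTree as ET

# Assumption: the Schematron 1.5 namespace; ISO Schematron uses another URI.
SCH_NS = "http://www.ascc.net/xml/schematron"


def build_schema(rule_id, attribute, codes, contexts):
    """Return Schematron text: one abstract rule plus one concrete rule per context."""
    ET.register_namespace("sch", SCH_NS)
    schema = ET.Element(f"{{{SCH_NS}}}schema")
    pattern = ET.SubElement(schema, f"{{{SCH_NS}}}pattern", {"name": "Code list checks"})

    # Abstract rule: carries the value test but names no context of its own.
    abstract = ET.SubElement(
        pattern, f"{{{SCH_NS}}}rule", {"abstract": "true", "id": rule_id}
    )
    test = " or ".join(f"@{attribute}='{code}'" for code in codes)
    assertion = ET.SubElement(abstract, f"{{{SCH_NS}}}assert", {"test": test})
    assertion.text = f"{attribute} must be one of: {', '.join(codes)}"

    # Concrete rules: each supplies a context and pulls in the abstract rule.
    for context in contexts:
        rule = ET.SubElement(pattern, f"{{{SCH_NS}}}rule", {"context": context})
        ET.SubElement(rule, f"{{{SCH_NS}}}extends", {"rule": rule_id})
    return ET.tostring(schema, encoding="unicode")


# Hypothetical code values; the wildcard context is the one quoted above.
print(build_schema("currency-value", "amountCurrencyID",
                   ["CAD", "USD"], ["*[@amountCurrencyID]"]))
```

A more focused context such as cbc:TotalTaxAmount could be passed in the contexts list instead, though a real Schematron file would then also need a namespace declaration for the cbc prefix, which this sketch leaves out.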
>but does not enforce any constraints imposed in the schema itself.

Correct, it is the role of the schema to enforce constraints on the position in the structure. The schema, however, would only have base="xsd:token" for the value of the code list type 2 item which is then validated in the second pass using Schematron.

To your question "what if the schema designer wants to constrain the values in a particular usage but not others", then the Schematron expression can have two different assertions, one for one usage and one for all the others, which can be checked with different values. I don't believe that can be done with W3C Schema as the code list value will have only one declaration for all uses, not supporting contextual differences as can be done with Schematron.

>I suspect you can't be faithful to the schema with this method
>without reconstructing the schema itself with schematron rules.

There is no need to duplicate what is already validated structurally. Provided there is awareness (in XML) of all of the elements and attributes expressed by type 2 code lists then that XML can be processed to create the contexts for the abstract rules.

Tony, you demonstrate the creation of the abstract Schematron rules from the genericode instance, but I don't see where you specify the contexts in which codes are being used. How would we express *where* in UBL the currency code lists are being used for elements and attributes so that the concrete rules pointing to abstract rules will have the correct XPath match patterns? (ed: I answer this myself below, but I'll leave the question since others might be asking themselves this question)

>Otherwise there will need to be specific NDR rules on the use of
>code lists that prevent ambiguity when schematron is used as the
>validation method.

Which ambiguity?
Trading partners could agree that currency codes allowed for one part of an instance must be different than those allowed in another part of the instance, and each of these parts would be contexts for the two different abstract rules.

> >Approach is namespace insensitive.
> >Only the demonstration is namespace insensitive, one could easily
>just as well use context="cbc:*[@amountCurrencyID]".
> >Same comment as previous.

Do you mean the comment about reconstructing the schema? If so, then I don't understand.

> >How does one detect that an instance file is based on an extended code list?
> >Business rules and the choice to apply a given set of assertions.
> >So you are saying there is no way to determine by inspecting the
>instance file, what composition rules it is expected to follow?

What is a "composition rule" in this context? If the business rules state that one subset of a code list applies in one place and a different subset of a code list applies in a different place, then trading partners express two code lists and point each context to the abstract rules accordingly.

Again, I'm assuming the allowed values of type 1 code lists are fixed in the schema, though subsettable by trading partners in part 2, and allowed values of type 2 code lists are permissive tokens in the schema requiring trading partners to specify the agreed-upon code list values in a code list instance that generates the Schematron assertions (as in Tony's example). Though I need an answer from Tony regarding specifying the different contexts.

> >How can one tell in the instance file what code lists are being used?
> >One could base the context on the presence of the code list support
>attributes:
> > context="*[@amountCurrencyID][@amountCurrencyCodeListVersionID='0.3']"
> > >How does one detect that the correct version of the code list is being used?
> >Specific values can be checked with the above, or one could report an
>error that a given version isn't being used:
> > <assert test="amountCurrencyCodeListVersionID='0.3'">
> >So what I think you are saying is that the schematron becomes an
>integral part of the schema itself and has to be part of the UBL
>packaging and versioning. This schematron would refine what the
>schema says is required.

An integral part of validation, not an integral part of the schema [expression]. Yes, the Schematron expressions may choose to limit the use of values in type 1 code lists (since the schema allows all values the committee defines and partners may not want all values available) and should express the limited use of values in type 2 code lists (since the schema doesn't specify any values for type 2 code lists).

>How do we prevent the schematron from conflicting with the schema
>since both check content based on overlapping rules? I think this
>collision doesn't occur when schema is used for individual value
>validation and schematron is used for validating values based on other values.

The collision also does not occur when the schema allows a value set for type 1 code lists and a permissive wildcard token for type 2 code lists and Schematron is used for validating values based on a given context or any context.

> >Where is the code list itself?
> >It is an instance of the upcoming code list schema and is input to an
>automated process to produce the .sch set of assertions.
> >Does this mean that a third party needs to construct an XML file for
>use by UBL and schemas for all other ebusiness standards? I was
>trying to devise a mechanism by which third parties could construct
>a single reference and it could be used by all.

There would be a single reference ... perhaps it is Tony's genericode expression of code list values. In fact, you just answered one of my questions for Tony.
The genericode expression *can't* have UBL contexts because it is just an expression of code values for all document models. I will look into an analysis of the UBL schemas to see how that, combined with the genericode expressions, would produce a Schematron expression of the abstract genericode lists in the contexts described by the schemas. I have in mind how it would be done, so I'll give it a try.

> >What does an extension document look like?
> >What is an "extension document"? In Schematron one merely enumerates
>the desired conditions that must be true or the desired conditions
>that must not be false.
> >An extension document is a description of changes that a user makes
>to the standard schemas that allows for the testing of conformant
>documents without having to alter the underlying standard schemas.
>The extension mechanism would allow information represented in the
>schemas to be "extended".

Then the extension document is described along the lines of Tony's genericode example. If *all* standards committees were geared to work with genericode instances of code lists, then these become the focal points of trading partner agreements for code list validation across all applications. It would be wonderful if trading partners could write their own genericode instances of code lists and then run three or four different applications all using their agreed-upon values without having to write custom ones for each application.

> >What does a restriction document look like?
> >What is a "restriction document"?
> >Are you speaking of expressions that extend or restrict an instance
>of the upcoming code list schema? I would leave such a question to
>Tony in regard to his proposed code list instance expression.
> >Restrictions are the corrolary to extensions that might permit, for
>example, trading partners to agree that, for example,
>PricingCurrencyCode must have a value of EUR.

Okay, then same answer: genericode instances agreed upon by trading partners.
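To make the two cases concrete, here is a rough Python model of the two-pass idea, where the first pass stands in for the schema and the second pass stands in for the synthesized Schematron. It is a sketch only: the function names are hypothetical, and every code value other than EUR and CAD is made up for illustration.

```python
def first_pass(value, schema_codes):
    """Schema stage: a type 1 list is a fixed set, a type 2 list (None) accepts any token."""
    return schema_codes is None or value in schema_codes


def second_pass(value, agreed_codes):
    """Trading-partner stage: the value must be in the agreed code list."""
    return value in agreed_codes


def validate(value, schema_codes, agreed_codes):
    """A value is acceptable only when both passes accept it."""
    return first_pass(value, schema_codes) and second_pass(value, agreed_codes)


# Restriction: the schema hardwires several currencies, the partners agree on EUR only.
TYPE1_CURRENCIES = {"EUR", "CAD", "USD"}   # hypothetical schema enumeration
print(validate("EUR", TYPE1_CURRENCIES, {"EUR"}))
print(validate("CAD", TYPE1_CURRENCIES, {"EUR"}))

# Extension: the schema is permissive (type 2), the partners supply the values.
print(validate("PARTNER-A", None, {"PARTNER-A", "PARTNER-B"}))
```

In this model the second pass can only narrow a type 1 list, since the first pass still rejects anything outside the schema's enumeration.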
An extension context for genericode use would be for type 2 code lists that are permissively defined by the schema and don't have the values needed by trading partners. A restriction context for genericode use would be for type 1 code lists that are hardwired by the schemas and have more values than needed by trading partners.

> >How is versioning handled?
> >Versioning of what?
> >Versioning of the schematron documents which are required to
>validate the instance documents.

The Schematron expressions would be synthesized from the UBL schemas and the genericode instances, so trading partners would agree on the versions of each of those and the synthesis of the validation wouldn't, itself, need to be versioned. It could evaporate after use, or it could be cached until any of the inputs are changed. I don't see that the Schematron expression would need to be versioned.

I think you and others are still under the impression that the Schematron expressions are authored and persistent. They are synthesized and possibly cached, but I do not believe they are persistent with their own identity.

> >Still need to develop:
> >xslt that translates code list in XML to schematron form.
> >When we decide what the code list instance looks like, and then what
>a proforma Schematron set of assertions looks like, the XSLT should
>be easily created. It would be a waste to start writing stylesheets
>before knowing the inputs and the outputs. I don't have any concerns
>that if we know the inputs and the outputs that an XSLT can be
>created to use for illustration.
> >The question is, how complex will a schematron document be that is
>faithful to the schema and completely validates all code lists in UBL usage.

Let me see if I can derive that from the schemas and the genericode instances and illustrate it in an automated fashion. I believe the complexity is totally irrelevant since it isn't a persistent and authored artifact ...
it is synthesized and cached and need never be manipulated.

> >Examples of schematron to handle type 1 (code list as element)
> >My recollection of "type 1" is that being an element is not an aspect
>of its distinction.
> >However, because everything is XPath, one can just assert
>test=".='CAD'" instead of test="@amountCurrencyID='CAD'" ... and as I
>mentioned earlier the implementation of Schematron I'm using appears
>to be old and deficient and I'm trying to find an implementation of
>ISO Schematron where the tests can all be based on the current node
>and the contexts can be set to either an element or an attribute.
> >I will point out here that one distinction of type 1 usage is that
>the name of the element is different within the context of the
>schema. This is slightly more complex than the type 2 case which
>uses an attribute that always has the same name. There is no
>requirement in UBL, I think, that says that element names must be
>globally unique within a namespace.

But I think it is a W3C Schema constraint that sibling elements of the same name cannot be structured differently in the schema, and globally defined elements of a given name can only be declared once. So, yes, I think by use of W3C Schema that the structures for element names must be globally unique within a namespace.

If trading partners want an element X to have one set of values in context Y and an element X to have a different set of values in context Z, that can only be validated using Schematron.

Though of course none of the above is true for RELAX-NG as it can have siblings that are different and the declaration mechanisms are more powerful ... but I digress ....

> >Construction of schematron to handle all validation of ubl instance
> >documents.
> >Schematron is only tasked with value validation and not structural
>validation, though if necessary, Schematron can be used to constrain
>structural validation that may be loosely defined in other schema expressions.
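The point above about an element X taking one set of values in context Y and another in context Z can be illustrated with a small Python sketch. It only mimics what Schematron rules with contexts like Y/X and Z/X would do, it is not Schematron itself, and the function name, the sample instance and the value sets are all hypothetical.

```python
import xml.etree.ElementTree as ET


def check_by_context(instance_xml, rules):
    """Check child values against value sets keyed by (parent name, element name).

    Returns a list of (parent, element, value) tuples that break a rule.
    """
    root = ET.fromstring(instance_xml)
    failures = []
    for parent in root.iter():
        for child in parent:
            allowed = rules.get((parent.tag, child.tag))
            value = (child.text or "").strip()
            # Elements without a rule for this context are left to the schema.
            if allowed is not None and value not in allowed:
                failures.append((parent.tag, child.tag, value))
    return failures


# Hypothetical instance: the same element name X appears under Y and under Z.
instance = "<Doc><Y><X>CAD</X></Y><Z><X>CAD</X></Z></Doc>"
rules = {("Y", "X"): {"CAD"}, ("Z", "X"): {"EUR"}}
print(check_by_context(instance, rules))
```

The same value is accepted under Y and reported under Z, which is the kind of contextual difference a single schema declaration for X would not express.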
> > >How complex will this be? Is there any processing resource demands
> >imposed by this method.
> >I'm not sure what you are asking.
> >We are looking so far at simplified examples. To judge overall
>complexity we need a fully done example that validates instance
>files of the standard UBL examples such as order.

I now agree ... but I could use more fodder to help with my example building. Where have we catalogued all of the code lists as being either type 1 or type 2, and who can help create genericode instances of as many of the code lists as possible? Can this catalogue be expressed in XML so that I can use it as input to a transformation? Then I can use those inputs with my ideas for synthesizing the Schematron. I believe I can use the HISC XPath files, which themselves are synthesized from the schemas, as the input to tell me all of the contexts in which code lists are used.

To create a full example for Order I will need this help ... and any help thus provided will be useful in the final deliverable anyway, so it won't be a waste just for a demo, it will be useful as draft versions for a final deliverable.

Thanks again, Marty! And thanks to anyone who can help me with these code list artifacts (genericode instances and code list type catalogue) I need while I write the XSLT for the Schematron.

. . . . . . Ken

--
World-wide on-site corporate, govt. & user group XML/XSL training.
G. Ken Holman                 mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/o/
Box 266, Kars, Ontario CANADA K0A-2E0    +1(613)489-0999 (F:-0995)
Male Cancer Awareness Aug'05  http://www.CraneSoftwrights.com/o/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal