Subject: Re: [ubl-ndrsc] Rule: 115 and 116 Containers
Bill, I think your argument is bogus. The alternative to

<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <SuperfluousContainer>
    <Fruit>Apple</Fruit>
    <Fruit>Orange</Fruit>
    <Fruit>Banana</Fruit>
  </SuperfluousContainer>
</doc>

is not, in real life,

<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <Fruit>Apple</Fruit>
  <Fruit>Orange</Fruit>
  <Fruit>Banana</Fruit>
</doc>

but more probably

<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <someelement>foo</someelement>
  <Fruit>Apple</Fruit>
  <anotherone>bar</anotherone>
  <Fruit>Orange</Fruit>
  <alongcontainerlikeaddress>
    <a>
      <b>
        <c>foo</c>
      </b>
    </a>
  </alongcontainerlikeaddress>
  <Fruit>Banana</Fruit>
</doc>

Also, although I don't have the time or the inclination to check this
out (I am on vacation, after all), I believe your first stylesheet is far
more complicated than it needs to be for the container case. I believe it
can be cut in half -- but again, I have not checked this; it's just based
on previous experience with stylesheets.

Burcham, Bill wrote:

> I'm with Chee-Kai -- I think [R 116] is wrong. (I know it's probably too
> late -- but I'm gonna say my piece anyway :-)
> The two cases I've heard made in favor of it are:
>
> 1. container elements foster more readable stylesheets
> 2. container elements significantly improve document processing performance
>
> Argument 1 is weak.
> Forgive me for posting working code, but here is an
> instance document with superfluous containers:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <doc>
>   <SuperfluousContainer>
>     <Fruit>Apple</Fruit>
>     <Fruit>Orange</Fruit>
>     <Fruit>Banana</Fruit>
>   </SuperfluousContainer>
> </doc>
>
> And here is a stylesheet to process it:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
>   <xsl:template match="doc">
>     <xsl:element name="NewDoc">
>       <xsl:apply-templates select="current()/*"/>
>     </xsl:element>
>   </xsl:template>
>   <xsl:template match="SuperfluousContainer">
>     <BeforeFruit/>
>     <xsl:apply-templates select="current()/*"/>
>     <AfterFruit/>
>   </xsl:template>
>   <xsl:template match="Fruit">
>     <AFruit>
>       <xsl:value-of select="text()"/>
>     </AFruit>
>   </xsl:template>
> </xsl:transform>
>
> And here is the output:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <NewDoc>
>   <BeforeFruit/>
>   <AFruit>Apple</AFruit>
>   <AFruit>Orange</AFruit>
>   <AFruit>Banana</AFruit>
>   <AfterFruit/>
> </NewDoc>
>
> The example injects an element before the first fruit and after the last
> one. That's the example we've been discussing for a couple of years as
> being the bugaboo here.
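
A rough sketch of how the container stylesheet quoted above might be
tightened. The substitutions are assumptions on my part, not something
checked against the thread's sample: literal result elements stand in for
xsl:element, select="*" for select="current()/*", and select="." for
select="text()". For the three-fruit instance these should select the same
nodes, so the NewDoc output should be the same.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: a compact variant of the container stylesheet. -->
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- Wrap everything in NewDoc; select="*" visits child elements only,
       so whitespace text nodes in the source are not copied through. -->
  <xsl:template match="doc">
    <NewDoc><xsl:apply-templates select="*"/></NewDoc>
  </xsl:template>
  <!-- The container is the single hook for the before/after markers. -->
  <xsl:template match="SuperfluousContainer">
    <BeforeFruit/><xsl:apply-templates select="*"/><AfterFruit/>
  </xsl:template>
  <!-- Rename each Fruit, keeping its string value. -->
  <xsl:template match="Fruit">
    <AFruit><xsl:value-of select="."/></AFruit>
  </xsl:template>
</xsl:transform>
```

Most of the saving comes from packing markup onto fewer lines, so how much
shorter this really is remains a matter of taste.
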
>
> And here is an analogous source instance doc -- this time with no
> superfluous containers:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <doc>
>   <Fruit>Apple</Fruit>
>   <Fruit>Orange</Fruit>
>   <Fruit>Banana</Fruit>
> </doc>
>
> And here is a different stylesheet to process this one:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
>   <xsl:template match="doc">
>     <xsl:element name="NewDoc">
>       <xsl:apply-templates select="current()/*"/>
>     </xsl:element>
>   </xsl:template>
>   <xsl:template match="Fruit">
>     <xsl:if test="position() = 1">
>       <BeforeFruit/>
>     </xsl:if>
>     <AFruit>
>       <xsl:value-of select="text()"/>
>     </AFruit>
>     <xsl:if test="position() = last()">
>       <AfterFruit/>
>     </xsl:if>
>   </xsl:template>
> </xsl:transform>
>
> Comparing the two stylesheets I note that the one for superfluous containers
> is 19 lines and the one for repeating elements (with no superfluous
> containers) is 20 lines. That's only one line of code difference. And I
> don't think the second stylesheet is any less readable than the first.
>
> If I look at the two source documents, and extrapolate to larger documents
> with more nesting I can say with certainty that superfluous containers make
> for larger documents and IMHO are a bit harder for humans to read -- due to
> the increase in indentation necessitated by the deeper hierarchy.
>
> As for point 2 (processing performance), that's just Voodoo Computer
> Science. So, which XML processing tools are we using for comparison? Which
> versions of those tools? What is the use-case/scenario/algorithm? How big
> is the document?
> Worst-case, if you tell me that the document is HUGE then
> I'll tell you a) the Bolivian rug-weaver using Perl as the processing tool
> isn't gonna see the HUGE document and b) the company (Wal*Mart) that sees
> the HUGE document can darn-well write a transform on the incoming document
> (or four or five transforms) that make it more amenable to efficient
> processing.
>
> But you know what -- I still haven't seen any real _evidence_ that
> superfluous containers provide any processing performance advantage in the
> first place. It's more likely they hurt performance since they _definitely_
> make documents larger!
>
> So by my count, it's:
>
> Superfluous containers: they make documents bigger (inflicting a processing
> burden) and harder for humans to read
> Repeated elements (no superfluous containers): they make documents smaller
> and easier for humans to read, and necessitate a tiny bit more XSLT code in
> some situations.
>
> Down with [R 116]!
>
>
> Bill Burcham
> Sr. Software Architect, Integration Software Development
> Sterling Commerce, Inc.
> 469.524.2164
> bill_burcham@stercomm.com
>
> -----Original Message-----
> From: Chin Chee-Kai [mailto:cheekai@softml.net]
> Sent: Wednesday, July 16, 2003 8:38 PM
> To: UBL-NDR
> Subject: Re: [ubl-ndrsc] Rule: 115 and 116 Containers
>
>
>>>[R 115] All documents shall have a container for metadata and which
>>>proceeds the body of the document and is named "Head" _____________.
>>>(anything but header)
>
>>>[R 116] All elements with a cardinality of 1..n, (and lack a qualifying
>>>structure) must be contained by a list container named "(name of
>>>repeating element)List", which has a cardinality of 1..1.
>
>
> I remain critical of having to maintain such virtual structure for no
> apparent use. I've heard that the rules don't affect FPSC at all. By
> design, they should not affect LC. So who's benefiting from carrying all
> the empty luggage around?
>
>
> That said, I pointed out last time that the [R 115] should have "precedes"
> instead of "proceeds", unless the proponent of the rule wants Head sitting
> at the tail.
>
>
>
> Best Regards,
> Chin Chee-Kai
> SoftML
> Tel: +65-6820-2979
> Fax: +65-6743-7875
> Email: cheekai@SoftML.Net
> http://SoftML.Net/
>
>
> On Wed, 16 Jul 2003, Lisa-Aeon wrote:
>
>>>Rules for Voting: Each email will have only one rule in it, I will
>>>try to mark the rules that group with it, or rules that might
>>>duplicate it. The membership has 5 working days to bring forth
>>>objection or discussion, after the 5 working days, if there are no
>>>objections, the rule will be assumed to be "ACCEPTED" and be given to
>>>the LCSC for their implementation.
>>>
>>>Please Reply leaving first email in Reply.
>>>
>>>Voting period on this rule ends: July 23, 2003
>>>
>>>*******************************
>>>I am combining the last two rules, because we have already voted on a
>>>decision. These are the old rules:
>>>
>>>[R 115] All documents shall have a container for metadata and which
>>>proceeds the body of the document and is named "Head" _____________.
>>>(anything but header)
>>>
>>>[R 116] All elements with a cardinality of 1..n, (and lack a qualifying
>>>structure) must be contained by a list container named "(name of
>>>repeating element)List", which has a cardinality of 1..1.
>>>
>>>These are the new rules agreed upon during the teleconference call on
>>>9 July. These are voted as approved, just need polishing up. To
>>>remind everybody, here is the motion and it was approved.
>>>
>>>***Motion:(Arofan) We agree in the direction of the rules being
>>>submitted, a. Endorse the direction as indicated in this proposal.
>>>
>>>b. Authorize Arofan to make the changes that were discussed in this
>>>meeting.
>>>
>>>Changes:
>>>
>>>Substitute the word "Top" for "Head",
>>>
>>>Make sure we have explicitly covered the 1..n in the wording.
>>>
>>>c. Authorize Mark to make editorial changes.
>>>
>>>d. Submit to list for final approval. (vote by email)
>>>
>>>******
>>>Proposed full set of rules, as discussed:
>>>
>>>--------------------------------------------------------------------------------
>>>
>>>(1) All non-repeatable BIEs that are direct children of the
>>>document-level BIE in the model will be child elements of a generated
>>>"Top" element in the schema. The generated "Top" element will be named
>>>"[doctype]Top", and its content model will be a sequence. It will
>>>reference a generated type named "[doctype]TopType". Both the
>>>generated "Top" element and its type will be declared in the same
>>>namespace as the document-level element. (Note: This rule implies that
>>>all documents will have generated "Top" elements, without exception,
>>>regardless of their other 'body' contents, to cover cases where the
>>>document will be extended with the Context mechanism, and for general
>>>consistency.)
>>>
>>>(2) All repeatable BIEs in the model will have generated containers.
>>>The containers will be named "[name_of_repeatable_element]List". These
>>>containers will be required if the cardinality of their contained
>>>immediate children requires at least one; if their contained children
>>>are optional, the container itself will be optional. At least one of
>>>the repeatable children of the List will always be required, but there
>>>may be more than one required child if that agrees with the
>>>cardinality found in the business model.
>>>
>>>All "_____List" elements will reference a "_______ListType", which
>>>will be declared in the same namespace as the element that represents
>>>the repeatable BIE in the business model. The content model of this
>>>type will have a single child element, which will have a maximum
>>>occurrence that reflects the maximum occurrence in the business model,
>>>and a minimum occurrence as described in this rule, above.
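
To make rule (2) concrete, here is a minimal schema sketch of what a
generated list container might look like. Everything in it is illustrative
and assumed rather than taken from any UBL schema: the Fruit element and the
doc wrapper are borrowed from the examples earlier in this thread, the
unbounded maximum is a guess at the business model, and no target namespace
is declared, so the same-namespace requirement and the "Top" element of
rules (1) and (3) are not shown.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: hypothetical names, no target namespace. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">
  <!-- The repeatable element from the business model. -->
  <xsd:element name="Fruit" type="xsd:string"/>
  <!-- Generated container: "[name_of_repeatable_element]List". -->
  <xsd:element name="FruitList" type="FruitListType"/>
  <!-- Its type has a single child; at least one occurrence is required. -->
  <xsd:complexType name="FruitListType">
    <xsd:sequence>
      <xsd:element ref="Fruit" minOccurs="1" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- The container is required here; it would become minOccurs="0"
       if Fruit were optional in the model. -->
  <xsd:element name="doc">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="FruitList" minOccurs="1" maxOccurs="1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```
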
>>>
>>>(NOTE: This rule applies equally to 'list' containers at the document
>>>level, and also at lower levels within the document.)
>>>
>>>(3) The document element in the schema will have a content model that
>>>is a sequence of elements, the first of which will be the "Top"
>>>element, and the others will be the generated "List" elements, in the
>>>order in which their contained, repeatable children appeared in the
>>>model.
>>>
>>>(4) All elements in the generated schema that are direct children of
>>>the generated "top" elements in all documents should be gathered
>>>together into a common aggregate type, named "TopType", which will be
>>>declared in the Common Aggregate Types namespace. This type should be
>>>declared abstract, and all document headers should be extensions -
>>>even if only trivial extensions to facilitate re-naming - of this
>>>abstract type. (Note: This rule allows for polymorphic processing of
>>>the set of generic header elements across all document types.)

-- 
Eduardo Gutentag               |  e-mail: eduardo.gutentag@Sun.COM
Web Technologies and Standards |  Phone: +1 510 550 4616 x31442
Sun Microsystems Inc.          |  W3C AC Rep / OASIS TAB Chair