<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE article [
<!-- ELEMENT declarations work around MSXML bug. -->
<!ELEMENT section ANY>
<!ATTLIST section id ID #IMPLIED>
<!ELEMENT appendix ANY>
<!ATTLIST appendix id ID #IMPLIED>
<!ELEMENT bibliomixed ANY>
<!ATTLIST bibliomixed id ID #IMPLIED>
]>
<article status="Working Draft">
<articleinfo>
<releaseinfo>$Id: annotate.xml,v 1.5 2001/08/04 11:19:49 jjc Exp $</releaseinfo>
<title>RELAX NG DTD Compatibility Annotations</title>
<authorgroup>
<editor>
  <firstname>James</firstname><surname>Clark</surname>
  <affiliation>
    <address><email>jjc@jclark.com</email></address>
  </affiliation>
</editor>
<editor>
  <surname>MURATA</surname><firstname>Makoto</firstname>
  <affiliation>
    <address><email>mura034@attglobal.net</email></address>
  </affiliation>
</editor>
</authorgroup>
<pubdate>4 August 2001</pubdate>
<releaseinfo role="meta">
$Id: annotate.xml,v 1.5 2001/08/04 11:19:49 jjc Exp $
</releaseinfo>

<copyright><year>2001</year><holder>OASIS</holder></copyright>

<legalnotice>

<para>Copyright &#169; The Organization for the Advancement of
Structured Information Standards [OASIS] 2001. All Rights
Reserved.</para>

<para>This document and translations of it may be copied and furnished
to others, and derivative works that comment on or otherwise explain
it or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to OASIS, except as needed for the
purpose of developing OASIS specifications, in which case the
procedures for copyrights defined in the OASIS Intellectual Property
Rights document must be followed, or as required to translate it into
languages other than English.</para>

<para>The limited permissions granted above are perpetual and will not
be revoked by OASIS or its successors or assigns.</para>

<para>This document and the information contained herein is provided
on an <quote>AS IS</quote> basis and OASIS DISCLAIMS ALL WARRANTIES,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE
USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
PURPOSE.</para>

</legalnotice>

<legalnotice role="status"><title>Status of this Document</title>

<para>This is a working draft constructed by the editors. It is not an
official committee work product and may not reflect the consensus
opinion of the committee.  Comments on this document may be sent to
<ulink url="mailto:relax-ng-comment@lists.oasis-open.org"
>relax-ng-comment@lists.oasis-open.org</ulink>.</para>

<!--
<para>This working draft was approved for publication by the OASIS
RELAX NG technical committee. It represents the current consensus of
the committee.  However, it is a draft document and further changes
are still possible.  Comments on this document may be sent to <ulink
url="mailto:relax-ng-comment@lists.oasis-open.org"
>relax-ng-comment@lists.oasis-open.org</ulink>.</para>
-->

</legalnotice>

<abstract>
<para>This specification defines elements and attributes that can be
used as annotations in <xref linkend="spec"/> schemas.  The purpose of
these annotations is to support some of the features of XML 1.0 DTDs
that are not supported by RELAX NG.</para>
</abstract>

<revhistory>
<revision>
  <revnumber>Working Draft</revnumber>
  <date>4 August 2001</date>
</revision>
</revhistory>
</articleinfo>

<section>
<title>Introduction</title>

<para>RELAX NG <xref linkend="spec"/> provides an annotation
capability. In a RELAX NG schema, RELAX NG-defined elements can be
annotated with child elements and attributes from other
namespaces. The goal of this specification is to facilitate transition
from XML 1.0 DTDs to RELAX NG schemas by defining annotations that
support some of the features of XML 1.0 DTDs that are not supported by
RELAX NG.</para>

<para>RELAX NG itself performs only validation: it does not change the
infoset <xref linkend="infoset"/> of an XML document.  Most of the
features of XML 1.0 DTDs that are not supported by RELAX NG involve
modification to the infoset.  In XML 1.0, validation and infoset
modification are combined in a monolithic XML processor.  It is a goal
of this specification to provide a clean separation between validation
and infoset modification, so that a wide variety of implementation
scenarios are possible. In particular, it should be possible to
perform RELAX NG validation either before or after the infoset
modifications implied by annotations. It should also be possible for
an implementation of this specification not to modify the infoset at
all and instead provide the application with a description of the
modifications implied by the annotations, independently of any
particular instance.</para>

<para>This specification does not provide any support for features of
XML 1.0 DTDs, such as entity declarations, that cannot be cleanly
separated from validation.</para>

<para>In an XML 1.0 document that is valid with respect to a DTD, each
element or attribute in the instance has a unique corresponding
element or attribute declaration in the DTD.  With RELAX NG this is
not always the case: it may be ambiguous which
<literal>element</literal> or <literal>attribute</literal> pattern any
particular element or attribute in the instance matches.  In addition,
it is non-trivial to determine when a RELAX NG schema is ambiguous.  A
further complication is that, even in cases where it is not
ambiguous, it may require multiple passes or lookahead to determine
which <literal>element</literal> or <literal>attribute</literal>
pattern a particular element or attribute matches.  Detecting this
situation is also non-trivial.</para>
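
<para>As an illustrative sketch (the element and attribute names are
invented for this example and are not taken from any particular
schema), the following pattern is ambiguous in the sense described
above: an <literal>item</literal> element without a
<literal>note</literal> attribute matches both
<literal>element</literal> patterns.</para>

<programlisting><![CDATA[<element name="list"
         xmlns="http://relaxng.org/ns/structure/0.9">
  <choice>
    <!-- First candidate: item with text content only. -->
    <element name="item">
      <text/>
    </element>
    <!-- Second candidate: item with an optional note attribute. -->
    <element name="item">
      <optional>
        <attribute name="note">
          <text/>
        </attribute>
      </optional>
      <text/>
    </element>
  </choice>
</element>]]></programlisting>

<para>For the instance
<literal>&lt;list&gt;&lt;item&gt;x&lt;/item&gt;&lt;/list&gt;</literal>,
a validator can report that the instance is valid without ever
deciding which of the two patterns was matched.</para>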

<para>Some features of XML 1.0 DTDs, in particular default attribute
values and ID/IDREF/IDREFS validation, depend crucially on this
unambiguous correspondence between elements or attributes in the
instance and their corresponding declarations.  In order to support
these features by means of annotations on RELAX NG patterns, it is
therefore necessary to impose restrictions on the use of such
annotations.  The goals in framing these restrictions were as
follows:</para>

<orderedlist>

<listitem><para>It must be possible to determine whether a schema
satisfies the restrictions independently of any particular
instance.</para></listitem>

<listitem><para>Processing of the instance must not require lookahead
or multiple passes.</para></listitem>

<listitem><para>The modified infoset must be XML 1.0 compatible: it
must be an infoset that could have been produced by a validating XML
1.0 parser for some DTD.</para></listitem>

<listitem><para>Implementation of the restrictions should be
straightforward.</para></listitem>

<listitem><para>The restrictions should not be any more restrictive
than necessary.</para></listitem>

</orderedlist>

<para>The elements and attributes defined in this specification have
the following namespace URI:</para>

<programlisting>http://relaxng.org/ns/annotation/0.9</programlisting>

<para>Examples in this specification follow the convention of using
the prefix <literal>a</literal> to refer to this namespace URI.</para>

</section>

<section>
<title>Example</title>

<para>The following DTD</para>

<programlisting><![CDATA[<!DOCTYPE employees [
<!-- A list of employees. -->
<!ELEMENT employees (employee*)>
<!-- An individual employee. -->
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee
  id ID #REQUIRED
  manages IDREFS #IMPLIED
  managedBy IDREF #IMPLIED
  country (US|JP) "US"
>
]>]]></programlisting>

<para>could be translated to the following RELAX NG schema with
annotations:</para>

<programlisting><![CDATA[<element name="employees"
         xmlns="http://relaxng.org/ns/structure/0.9"
         xmlns:a="http://relaxng.org/ns/annotation/0.9">
  <a:documentation>A list of employees.</a:documentation>
  <zeroOrMore>
    <element name="employee">
      <a:documentation>An individual employee.</a:documentation>
      <attribute name="id" a:attributeType="ID">
        <data type="token"/>
      </attribute> 
      <optional>
        <attribute name="manages" a:attributeType="IDREFS">
          <list>
            <oneOrMore>
              <data type="token"/>
            </oneOrMore>
          </list>
        </attribute>
      </optional>
      <optional>
        <attribute name="managedBy" a:attributeType="IDREF">
          <data type="token"/>
        </attribute>
      </optional>
      <optional>
        <attribute name="country" a:defaultValue="US">
          <choice>
            <value>US</value>
            <value>JP</value>
          </choice>
        </attribute>
      </optional>
      <text/>
    </element>
  </zeroOrMore>
</element>]]></programlisting>
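
<para>For illustration, a hypothetical instance that both the DTD and
the annotated schema above are meant to accept might look as follows.
The identifiers and employee names are invented for this sketch. It
exercises the <literal>id</literal>, <literal>manages</literal> and
<literal>managedBy</literal> attributes, and leaves
<literal>country</literal> unspecified on two of the three
elements.</para>

<programlisting><![CDATA[<employees>
  <employee id="e1" manages="e2 e3">Alice</employee>
  <employee id="e2" managedBy="e1" country="JP">Bob</employee>
  <employee id="e3" managedBy="e1">Carol</employee>
</employees>]]></programlisting>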

</section>

<section>
<title>Conformance</title>

<para>This specification defines three features:</para>

<itemizedlist>
<listitem><para>attribute default value</para></listitem>
<listitem><para>ID/IDREF/IDREFS</para></listitem>
<listitem><para>documentation</para></listitem>
</itemizedlist>

<para>Conformance is defined separately for each feature.  A
conformant implementation can support any combination of features.
There are also two levels of conformance.</para>

<orderedlist>

<listitem><para>Level 1 requires validation only. An implementation
that supports a feature at level 1 is only required to check that the
schema uses the feature correctly and that the instance is valid with
respect to the schema's use of the feature.</para></listitem>

<listitem><para>Level 2 requires that an implementation provide
information about the infoset modifications implied by annotations.
An implementation can provide the application either with a modified
infoset or with sufficient information that would allow the
application to modify the infoset itself.</para></listitem>

</orderedlist>

<para>A conformant implementation may support different features at
different levels.</para>

<para>A conformant implementation may be an integral part of a RELAX
NG validator or may be a separate software module.</para>

</section>

<section id="default-value">
<title>Attribute default values</title>

<para>An <literal>a:defaultValue</literal> attribute on a RELAX NG
<literal>attribute</literal> element specifies the default value for
the attribute.</para>

<para>If an <literal>attribute</literal> element has an
<literal>a:defaultValue</literal> attribute, then, after schema
simplification,</para>

<itemizedlist>

<listitem><para>its first child must be a <literal>name</literal>
element</para></listitem>

<listitem><para>the first child of the containing <literal>element</literal>
element must be a <literal>name</literal> element</para></listitem>

<listitem><para>the value of the <literal>a:defaultValue</literal>
attribute must match the pattern contained in the
<literal>attribute</literal> element</para></listitem>

<listitem><para>the pattern in the <literal>attribute</literal>
element must not contain <literal>data</literal> or
<literal>value</literal> elements with context-dependent
datatypes</para></listitem>

<listitem><para>it must not have an
<literal>a:attributeType</literal> attribute (see <xref
linkend="attribute-type"/>)</para></listitem>

<listitem><para>it must not have a <literal>oneOrMore</literal>
ancestor</para></listitem>

<listitem><para>any ancestor that is a <literal>choice</literal>
element must have one child that is an <literal>empty</literal>
element</para></listitem>

<listitem><para>it must have at least one <literal>choice</literal>
ancestor</para></listitem>

<listitem><para>if the containing definition competes with another
definition, then that other definition must also contain an
<literal>attribute</literal> element with the same name and with an
<literal>a:defaultValue</literal> attribute with the same
value.  A definition</para>

<programlisting>&lt;define name="<replaceable>ln1</replaceable>"&gt;
  &lt;element&gt;
    <replaceable>nc1</replaceable>
    <replaceable>p1</replaceable>
  &lt;/element&gt;
&lt;/define&gt;</programlisting>

<para>competes with a definition</para>
 
<programlisting>&lt;define name="<replaceable>ln2</replaceable>"&gt;
  &lt;element&gt;
    <replaceable>nc2</replaceable>
    <replaceable>p2</replaceable>
  &lt;/element&gt;
&lt;/define&gt;</programlisting>

<para>if there is a name <replaceable>n</replaceable> that belongs to
both <replaceable>nc1</replaceable> and
<replaceable>nc2</replaceable>.</para></listitem>

</itemizedlist>
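
<para>The competition rule can be sketched as follows. This grammar is
only an illustration: the definition names <literal>staff</literal>
and <literal>contractor</literal>, the <literal>company</literal>
element and the <literal>agency</literal> attribute are assumptions
made for the example. Both definitions use the element name
<literal>employee</literal>, so they compete, and each therefore
carries a <literal>country</literal> attribute with the same
<literal>a:defaultValue</literal>. The grammar is intended to satisfy
the other restrictions listed above as well.</para>

<programlisting><![CDATA[<grammar xmlns="http://relaxng.org/ns/structure/0.9"
         xmlns:a="http://relaxng.org/ns/annotation/0.9">
  <start>
    <element name="company">
      <zeroOrMore>
        <choice>
          <ref name="staff"/>
          <ref name="contractor"/>
        </choice>
      </zeroOrMore>
    </element>
  </start>

  <!-- Both definitions name the element "employee", so they compete. -->
  <define name="staff">
    <element name="employee">
      <attribute name="id">
        <text/>
      </attribute>
      <optional>
        <attribute name="country" a:defaultValue="US">
          <choice>
            <value>US</value>
            <value>JP</value>
          </choice>
        </attribute>
      </optional>
      <text/>
    </element>
  </define>

  <define name="contractor">
    <element name="employee">
      <attribute name="agency">
        <text/>
      </attribute>
      <!-- Needed because of the competition rule: same attribute
           name and same a:defaultValue as in "staff". -->
      <optional>
        <attribute name="country" a:defaultValue="US">
          <choice>
            <value>US</value>
            <value>JP</value>
          </choice>
        </attribute>
      </optional>
      <text/>
    </element>
  </define>
</grammar>]]></programlisting>

<para>Dropping the <literal>country</literal> attribute from either
definition, or giving the two attributes different default values,
would violate the last restriction in the list.</para>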

<para>The <literal>a:defaultValue</literal> annotation implies a
modification of the infoset that adds attribute information items for
omitted attributes.</para>
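
<para>As a sketch of the intended effect, assuming the schema shown in
the Example section and an invented employee, the modification can be
pictured by comparing the instance as written with an equivalent
serialization of the modified infoset. The exact properties of the
added attribute information item are not yet defined by this
draft.</para>

<programlisting><![CDATA[<!-- As written in the instance: -->
<employee id="e3">Carol</employee>

<!-- As seen by the application after the implied modification: -->
<employee id="e3" country="US">Carol</employee>]]></programlisting>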

<note role="ednote"><para>Define this more precisely.</para></note>

</section>

<section id="attribute-type">
<title>ID, IDREF and IDREFS</title>

<para>An <literal>a:attributeType</literal> attribute on a RELAX NG
<literal>attribute</literal> element must have the value
<literal>ID</literal>, <literal>IDREF</literal> or
<literal>IDREFS</literal>.  It specifies the XML 1.0 attribute type of
the attribute and corresponds to the [attribute type] infoset
property.</para>

<note role="ednote"><para>Should we allow other attribute types such
as <literal>NOTATION</literal>?</para></note>

<para>Leading and trailing whitespace is ignored in the value of the
<literal>a:attributeType</literal> attribute during schema
simplification.</para>

<para>If an <literal>attribute</literal> element has an
<literal>a:attributeType</literal> attribute, then, after schema
simplification,</para>

<itemizedlist>

<listitem><para>its first child must be a <literal>name</literal>
element</para></listitem>

<listitem><para>the first child of the containing <literal>element</literal>
element must be a <literal>name</literal> element</para></listitem>

<listitem><para>if the value of the <literal>a:attributeType</literal>
attribute is <literal>ID</literal>, then there must not be a
definition that competes with the definition containing the
<literal>attribute</literal> element and that contains an
<literal>attribute</literal> element that has a different name and an
<literal>a:attributeType</literal> attribute with value
<literal>ID</literal>.  Note that a definition competes with itself.
This implies that if two attributes in the instance are both IDs and
their parent elements have the same name, then the two attributes must
have the same name.</para></listitem>

<listitem><para>any competing <literal>attribute</literal> element
must have an <literal>a:attributeType</literal> attribute with the
same value.  Two attribute elements</para>

<programlisting>&lt;attribute&gt; <replaceable>nc1</replaceable> <replaceable>p1</replaceable> &lt;/attribute&gt;</programlisting>

<para>and</para>

<programlisting>&lt;attribute&gt; <replaceable>nc2</replaceable> <replaceable>p2</replaceable> &lt;/attribute&gt;</programlisting>

<para>compete if and only if the containing definitions compete and
there is a name <replaceable>n</replaceable> that belongs to both
<replaceable>nc1</replaceable> and <replaceable>nc2</replaceable>.  Note
that a definition competes with itself.</para>

</listitem>

<listitem><para>it must not have an <literal>a:defaultValue</literal>
attribute (see <xref linkend="default-value"/>)</para></listitem>

</itemizedlist>
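
<para>The restriction on ID attributes can be sketched with two
fragments. The <literal>badge</literal> attribute is a hypothetical
name chosen for this illustration. Because a definition competes with
itself, the first fragment is not allowed: the same
<literal>employee</literal> element declares two differently named
attributes of type ID. The second fragment keeps a single ID attribute
name and is intended to satisfy the restriction.</para>

<programlisting><![CDATA[<!-- Not allowed: two ID attributes with different names. -->
<element name="employee"
         xmlns="http://relaxng.org/ns/structure/0.9"
         xmlns:a="http://relaxng.org/ns/annotation/0.9">
  <choice>
    <attribute name="id" a:attributeType="ID">
      <data type="token"/>
    </attribute>
    <attribute name="badge" a:attributeType="ID">
      <data type="token"/>
    </attribute>
  </choice>
  <text/>
</element>

<!-- Allowed: one ID attribute name; badge is an ordinary attribute. -->
<element name="employee"
         xmlns="http://relaxng.org/ns/structure/0.9"
         xmlns:a="http://relaxng.org/ns/annotation/0.9">
  <attribute name="id" a:attributeType="ID">
    <data type="token"/>
  </attribute>
  <optional>
    <attribute name="badge">
      <data type="token"/>
    </attribute>
  </optional>
  <text/>
</element>]]></programlisting>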

<para>An instance is valid with respect to the
<literal>a:attributeType</literal> attributes in the schema if the
attribute values in the instance declared by
<literal>a:attributeType</literal> attributes in the schema to be of
type ID, IDREF or IDREFS meet the validity constraints specified in
<xref linkend="xml-rec"/> for values of that type, after normalizing
the values by applying the normalizeWhiteSpace function defined in
<xref linkend="spec"/>.</para>
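
<para>As an illustration, assuming the schema shown in the Example
section, the following hypothetical instance matches the RELAX NG
patterns but is not valid with respect to the
<literal>a:attributeType</literal> attributes. The identifiers and
employee names are invented for this sketch.</para>

<programlisting><![CDATA[<employees>
  <!-- IDREFS violation: no element in the document has the ID e9. -->
  <employee id="e1" manages="e2 e9">Alice</employee>
  <employee id="e2" managedBy="e1">Bob</employee>
  <!-- ID violation: the value e2 has already been used as an ID. -->
  <employee id="e2" managedBy="e1">Carol</employee>
</employees>]]></programlisting>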

<para>The <literal>a:attributeType</literal> annotation implies a
modification of the infoset that changes the [attribute type] property
of attribute information items to <literal>ID</literal>,
<literal>IDREF</literal> or <literal>IDREFS</literal> and modifies the
[normalized value] by applying the normalizeWhiteSpace
function.</para>
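
<para>A sketch of the effect on the [normalized value], assuming the
schema shown in the Example section and assuming that
normalizeWhiteSpace removes leading and trailing whitespace and
replaces each internal run of whitespace with a single space, as
described in <xref linkend="spec"/>. The change to the [attribute
type] property is not visible in a serialization and is therefore not
shown.</para>

<programlisting><![CDATA[<!-- As written in the instance: -->
<employee id=" e1 " manages="  e2   e3 ">Alice</employee>

<!-- As seen by the application after the implied modification: -->
<employee id="e1" manages="e2 e3">Alice</employee>]]></programlisting>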

<note role="ednote"><para>Define this more precisely.</para></note>

</section>

<section>
<title>Documentation</title>

<para>The <literal>a:documentation</literal> element can be used to
specify human-readable documentation. It supports the functionality
provided by comments in XML 1.0 DTDs. An
<literal>a:documentation</literal> element must not contain any
elements. It can have any attribute whose namespace URI is not the
empty string, the RELAX NG namespace URI, or the RELAX NG
annotation namespace URI. In particular, it may have an
<literal>xml:lang</literal> attribute.</para>

<para>The documentation specified in an
<literal>a:documentation</literal> element applies to the parent of
the <literal>a:documentation</literal> element.  To apply
documentation to a <literal>value</literal> element, wrap the
<literal>value</literal> element in a <literal>group</literal>
element. To apply documentation to a <literal>name</literal> element,
wrap the <literal>name</literal> element in a
<literal>choice</literal> element.</para>
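
<para>The two wrapping techniques can be sketched as follows. The
documentation strings are invented for this illustration; the element
and attribute names are borrowed from the Example section.</para>

<programlisting><![CDATA[<element xmlns="http://relaxng.org/ns/structure/0.9"
         xmlns:a="http://relaxng.org/ns/annotation/0.9">
  <!-- The name is wrapped in choice so that it can be documented. -->
  <choice>
    <a:documentation>Element name for an employee record.</a:documentation>
    <name>employee</name>
  </choice>
  <attribute name="country">
    <choice>
      <!-- Each value is wrapped in group so that it can be documented. -->
      <group>
        <a:documentation>United States</a:documentation>
        <value>US</value>
      </group>
      <group>
        <a:documentation>Japan</a:documentation>
        <value>JP</value>
      </group>
    </choice>
  </attribute>
  <text/>
</element>]]></programlisting>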

<note role="ednote"><para>Is there a better solution? Allow an
<literal>a:documentation</literal> attribute as an
alternative?</para></note>

<para>A RELAX NG element may have multiple
<literal>a:documentation</literal> child elements, but all
<literal>a:documentation</literal> child elements must precede all
child elements from the RELAX NG namespace.</para>
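
<para>For example, the following sketch attaches two
<literal>a:documentation</literal> elements, distinguished by
<literal>xml:lang</literal>, to a single <literal>element</literal>
pattern. The German text is an invented translation added for this
illustration. Both annotations precede the RELAX NG child
elements.</para>

<programlisting><![CDATA[<element name="employee"
         xmlns="http://relaxng.org/ns/structure/0.9"
         xmlns:a="http://relaxng.org/ns/annotation/0.9">
  <a:documentation xml:lang="en">An individual employee.</a:documentation>
  <a:documentation xml:lang="de">Ein einzelner Mitarbeiter.</a:documentation>
  <attribute name="id" a:attributeType="ID">
    <data type="token"/>
  </attribute>
  <text/>
</element>]]></programlisting>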
 
</section>


<appendix>
<title>RELAX NG schema</title>

<para>To be supplied.</para>

</appendix>

<bibliography><title>References</title>

<bibliomixed id="spec"><abbrev>RELAX NG</abbrev>James Clark, Makoto
MURATA, editors.  <citetitle><ulink
url="http://www.oasis-open.org/committees/relax-ng/spec.html">RELAX NG
Specification</ulink></citetitle>.  OASIS, 2001.</bibliomixed>

<bibliomixed id="xml-rec"><abbrev>XML 1.0</abbrev>Tim Bray,
Jean Paoli, C. M. Sperberg-McQueen, and Eve Maler, editors.
<citetitle><ulink url="http://www.w3.org/TR/REC-xml">Extensible Markup
Language (XML) 1.0 Second Edition</ulink></citetitle>.
W3C (World Wide Web Consortium), 2000.</bibliomixed>

<bibliomixed id="infoset"><abbrev>XML Infoset</abbrev>John Cowan, Richard Tobin,
editors.
<citetitle><ulink url="http://www.w3.org/TR/xml-infoset/">XML
Information Set</ulink></citetitle>.
W3C (World Wide Web Consortium), 2001.</bibliomixed>

</bibliography>

</article>
