Manage constraints on vocabularies
DITA adopters have long requested the ability to restrict element content without changing the semantics or processing expectations. As the number of specialized vocabularies grows, so does the need to restrict the choices presented to writers.
Because specialization adds branches to the type hierarchy, specialization is not an optimal solution for this request. Using specialization to implement restriction pads the type hierarchy with types that don't introduce new semantics.
Allow constraints on the content of elements provided by vocabularies.
Definition: A constraint eliminates some of the possible instances of the vocabularies assembled by a document type without changing the semantics of those vocabularies. In the same way that a specialized vocabulary is interoperable with its base vocabulary, a constrained vocabulary is interoperable with the unconstrained vocabulary.
The availability of constraints has benefits for the type hierarchy. Designers can specialize with general elements and loose content models and rely on constraints to provide a more guided authoring experience. Such flexible vocabularies provide a better base for subsequent specialization.
By extending the existing DITA design pattern to constrain vocabularies, the DITA architecture realizes the following benefits:
For an earlier version of this proposal, please see:
http://www.oasis-open.org/committees/download.php/14936/Issue34.html
Constraints must allow the following kinds of restrictions:
A constraint should be able to simplify or enforce best practices for the element content. For instance, a constraint should be able to omit optional elements, restrict the range of occurrences for a position, turn a choice with more than one occurrence into a sequence, restrict the values of an attribute, remove an attribute, and so on.
The request to simplify the content model for the <section> element or block elements is common:
http://tech.groups.yahoo.com/group/dita-users/message/258
More formally, a content constraint imposes a restriction allowed under the rules of specialization without changing the semantics of the container element.
Also, because vocabulary compatibility is established by module, domain extensions cannot replace a base element but can only add alternatives to the base element.
A constraint should be able to extend a base element with only some of the specialized elements provided by domain vocabularies. For example, a constrained domain might extend the <ph> element with the <b> and <i> elements from the highlight domain but not with the <sub>, <sup>, <tt>, or <u> elements.
In addition, a constraint should be able to replace the base element with the specialized domain elements. Effectively, such replacement makes the base element an abstract element in the context.
Here is a sample request for this capability:
http://tech.groups.yahoo.com/group/dita-users/message/524
DITA 1.1 allows document type shells to replace nested <topic> elements without providing a method by which processors can detect such restrictions. To formalize this practice, constraint support for replacement domains should also handle replacement of nested topics.
The adopter performs the following actions:
The adopter could constrain different document type shells for other authoring populations and write common processing against the unconstrained document type.
Similarly, if blocks were relaxed to allow self nesting to enable specialization, constraints could preserve that restriction in the existing document type shells.
This proposal has the following impacts:
The proposed constraints would be implemented as follows:
The implementation of constraints on different particles of the content model for one element cannot be combined. That is, constraints implementations for a specific element cannot be aggregated.
The following rules apply to constraints modules:
In the same way that the designer bears the responsibility of implementing a specialized content model that's at least as restrictive as its base module, the designer bears the responsibility of implementing a constrained content model that's more restrictive than the unconstrained content model for the same element.
The content model and attributes of one element can be constrained only by one constraints module included in a document type shell. Other shells may include different constraints modules that restrict the same element in a different way.
The list of extension elements provided by a domains module can be constrained only by one constraints module included in a document type shell. Other shells may include different constraints modules that restrict the list of extension elements for the same domain in a different way.
Each constraints module may constrain elements from only one vocabulary module. This rule maintains granularity of reuse at the module level.
Constraints modules that restrict different elements within the same vocabulary module can be combined with one another or with a constraints module that selects a subset of the extension elements for the vocabulary. Such combinations of constraints on a single vocabulary module have no meaningful order or precedence.
Designers have the option to declare a constraints module or combination of constraints modules to be more restrictive than another constraints module or combination of constraints modules on the same vocabulary module or a base vocabulary module. This option is particularly useful when a designer wants to constrain base and specialized elements in a consistent way. When the consistency is declared, processors can rely on it when converting document instances.
A document type with constraints allows a subset of the possible instances of a document type for the same vocabularies without constraints. To put it another way, all instances of the constrained document type are guaranteed to be valid instances of the unconstrained document type.
As a result, a constraint doesn't change basic or inherited content processing. The constrained instances remain valid instances of the element type, and the element retains the same semantics and class attribute declaration. In other words, a constraint never creates a new case for content processing such as output formatting.
For instance, a document type constrained to require the <shortdesc> element allows a subset of the possible instances of the unconstrained document type with an optional <shortdesc> element. Thus, the content processing for topic still works when topic is constrained to require a short description.
Currently, DITA document instances declare (by means of the domains attribute and the class attribute for the topic or map elements) the vocabularies available in their document type. A processor can examine these declarations to determine whether a document instance uses a subset of the vocabularies in another DITA document type and thus is compatible with that document type.
A constrained document type allows only a subset of the possible instances of the unconstrained document type. Thus, for a processor to determine whether a document instance is compatible with another document type, the document instance must declare any constraints on the document type.
For instance, an unconstrained task is compatible with an unconstrained topic because the task can be generalized to topic. If, however, the topic is constrained to require the <shortdesc> element, a document type with an unconstrained task is not compatible with the constrained document type because some instances of the task might not have a <shortdesc> element. If, however, the task document type has also been constrained to require the <shortdesc> element, it is compatible with the constrained topic document type.
To allow processors to detect constraints, the domains attribute lists constraints modules as well as vocabulary modules (such as topic or domain modules). The rules for declaring constraints modules with parenthetical expressions in the domains attribute are as follows:
The root name for a constraint is formed by removing the extension and "Constraints" infix from the module filename. In declaration contexts such as the domains attribute, the "-c" suffix is added. Thus, the shortdescReqConstraints.xsd Schema or shortdescReqConstraints.mod DTD implementation of a constraint has the root name of "shortdescReq" and the declaration name of "shortdescReq-c".
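The naming rule above can be sketched in Python. This is only an illustration and not part of the proposal: the function name `constraint_names` is hypothetical, and the sketch assumes that the module filename always ends in "Constraints" immediately before its extension, as in the shortdescReq example.

```python
import os


def constraint_names(filename):
    """Return (root name, declaration name) for a constraints module file.

    Hypothetical helper: assumes the file stem ends in "Constraints".
    """
    # Drop any directory part and the extension (.xsd, .mod, ...).
    stem, _extension = os.path.splitext(os.path.basename(filename))
    infix = "Constraints"
    if not stem.endswith(infix) or stem == infix:
        raise ValueError("not a constraints module filename: " + filename)
    # Remove the "Constraints" part to get the root name.
    root = stem[: -len(infix)]
    # The declaration name used in the domains attribute adds "-c".
    return root, root + "-c"


print(constraint_names("shortdescReqConstraints.xsd"))
print(constraint_names("shortdescReqConstraints.mod"))
```

Both calls yield the root name "shortdescReq" and the declaration name "shortdescReq-c", because the Schema and DTD implementations differ only in the extension.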
Here are some examples of constraints module declarations as qualifications on vocabulary modules:
The domains attribute declaration:
(topic shortdescReq-c)
The domains attribute declaration:
(topic hi-d noNestedHighlight-c)
The domains attribute declaration:
(topic hi-d basicHighlight-c)
The domains attribute declaration:
(topic shortdescReq-c) (topic simpleSection-c)
The domains attribute declaration:
(topic hi-d noNestedHighlight-c) (topic hi-d basicHighlight-c)
The domains attribute declaration:
(topic noBasePhrase-c) (topic hi-d) (topic pr-d)
For another example, the concept document type customizes the content model of <concept> to allow extension of the nested <topic> element only by other concept topics. This restriction could be declared by the domains attribute as follows:
(topic concept nestedConcept-c)
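To make the shape of these declarations concrete, here is a minimal Python sketch that splits a domains attribute value into its parenthetical expressions and attaches each "-c" token to the vocabulary module named before it. The function name `parse_domains` and the returned structure are assumptions made for illustration. The sketch only groups tokens syntactically; it does not decide which of several constraints listed after a module is the one actually applied.

```python
import re


def parse_domains(value):
    """Split a domains value into groups of (module, [constraints]) pairs.

    Hypothetical helper: a token ending in "-c" is treated as a constraints
    module that qualifies the nearest vocabulary module to its left.
    """
    groups = []
    # Each parenthetical expression is handled independently.
    for body in re.findall(r"\(([^()]*)\)", value):
        pairs = []
        for token in body.split():
            if token.endswith("-c"):
                if not pairs:
                    raise ValueError("constraint without a module: " + token)
                pairs[-1][1].append(token)
            else:
                pairs.append((token, []))
        groups.append(pairs)
    return groups


for declaration in (
    "(topic shortdescReq-c) (topic simpleSection-c)",
    "(topic noBasePhrase-c) (topic hi-d) (topic pr-d)",
    "(topic concept nestedConcept-c)",
):
    print(parse_domains(declaration))
```

In the last declaration, the nestedConcept-c token is attached to concept rather than to topic, which matches the reading that the constraint qualifies the concept module.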
The designer can declare the compatibility of the simpleTaskSection constraints module with the simpleSection constraints module so processors know that instances can be safely generalized to the topic module constrained by simpleSection. The designer bears the responsibility of determining that every instance of the constrained task specializations of <section> is valid for the constrained topic <section>.
The domains attribute declaration:
(topic simpleSection-c task simpleTaskSection-c)
By definition, an instance of task constrained by simpleTaskSection can always generalize to task or topic (the unconstrained vocabulary modules).
Note that the example doesn't imply that task is consistent with topic constrained by simpleSection. A vocabulary module never has a relationship with a constrained version of another vocabulary module. That is, constraints always augment the basic relations between vocabularies.
A designer who knows about the shortdescReq constraints module has the option to declare the compatibility of the strictTopic constraints module with the shortdescReq constraints module so processors know that instances can be safely converted to the less restrictive schema.
The domains attribute declaration:
(topic shortdescReq-c strictTopic-c)
Note the difference from the earlier example of shortdescReq combined with simpleSection. Because the shortdescReq constraint isn't declared in the rightmost position, it doesn't constrain topic in this document type shell. Again, the designer would only want to declare this compatibility when needing interchange with a different document type shell that applies the shortdescReq constraint.
To determine compatibility between two document instances, a conref processor can check the domains attribute to confirm that
Some examples:
| Referencing | Referenced | Resolution |
|---|---|---|
| (topic) | (topic shortdescReq-c) | Allowed - content model of referenced topic is more constrained. |
| (topic shortdescReq-c) | (topic) | Prevented - content model of referenced topic is less constrained. |
| (topic hi-d) | (topic hi-d basicHighlight-c) | Allowed - domain extension list of referenced document type shell is more constrained. |
| (topic hi-d basicHighlight-c) | (topic hi-d) | Prevented - domain extension list of referenced document type shell is less constrained. |
| (topic hi-d) | (topic noBasePhrase-c) (topic hi-d) | Allowed - referencing document type shell doesn't replace base element with domain extensions. |
| (topic noBasePhrase-c) (topic hi-d) | (topic hi-d) | Prevented - referencing document type shell does replace base element with domain extensions. |
| (topic task) (topic hi-d basicHighlight-c) | (topic simpleSection-c task simpleTaskSection-c) | Allowed - referencing shell has a subset of the constraints of the referenced shell on the common vocabulary modules. |
| (topic shortdescReq-c task shortdescTaskReq-c) (topic hi-d basicHighlight-c) | (topic simpleSection-c task simpleTaskSection-c) | Prevented - referencing shell has constraints on common vocabulary modules that aren't in the referenced shell. |
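The resolutions in the table can be reproduced with a small Python sketch. It is an interpretation inferred from the table rows, not a normative algorithm: it assumes that a conref is allowed when, for every vocabulary module named in both domains values, the constraints declared by the referencing shell are a subset of those declared by the referenced shell. The function names are hypothetical, and every "-c" token after a module is treated as a constraint on that module.

```python
import re
from collections import defaultdict


def constraints_by_module(domains):
    """Map each vocabulary module to the set of "-c" tokens that follow it."""
    result = defaultdict(set)
    for body in re.findall(r"\(([^()]*)\)", domains):
        module = None
        for token in body.split():
            if token.endswith("-c"):
                if module is None:
                    raise ValueError("constraint without a module: " + token)
                result[module].add(token)
            else:
                module = token
                result[module]  # register the module even if unconstrained
    return dict(result)


def conref_allowed(referencing, referenced):
    """Assumed rule: subset of constraints on the common vocabulary modules."""
    ours = constraints_by_module(referencing)
    theirs = constraints_by_module(referenced)
    common = ours.keys() & theirs.keys()
    return all(ours[module] <= theirs[module] for module in common)


rows = [
    ("(topic)", "(topic shortdescReq-c)"),
    ("(topic shortdescReq-c)", "(topic)"),
    ("(topic hi-d)", "(topic noBasePhrase-c) (topic hi-d)"),
    ("(topic noBasePhrase-c) (topic hi-d)", "(topic hi-d)"),
    ("(topic task) (topic hi-d basicHighlight-c)",
     "(topic simpleSection-c task simpleTaskSection-c)"),
]
for referencing, referenced in rows:
    verdict = "Allowed" if conref_allowed(referencing, referenced) else "Prevented"
    print(referencing, "->", referenced, ":", verdict)
```

A real conref processor would also need to confirm that the vocabulary modules themselves are compatible; this sketch covers only the constraint comparison.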
Similarly, to determine compatibility between a document instance and a target document type, a generalization processor can use the domains and class attributes for the document instance and the domains attribute for the target document type to determine how to rename elements in the document instance. For each element instance, the generalization processor:
Iterates over the class attribute on the element instance from specific to general, inspecting the vocabulary modules.
Looks for the first vocabulary module that is both present in the target document type and that has a subset of the constraints in the document instance.
If a module is found in the target document type, that module becomes the minimum threshold for the generalization of contained element instances.
If a module is not found, the document instance cannot be generalized to the target document type and, instead, can only be generalized to a less constrained document type.
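The per-element lookup described in these steps might look like the following Python sketch. It rests on several assumptions: the function name `generalization_threshold` is hypothetical, the constraints of the document instance and of the target document type are supplied as dictionaries that map a vocabulary module name to a set of constraint names, and the class attribute uses the usual `module/element` token form.

```python
def generalization_threshold(class_attr, instance, target):
    """Return (module, element) to generalize to, or None if none qualifies.

    instance and target map module names to sets of constraint names
    (an assumed representation of the two domains declarations).
    """
    tokens = class_attr.split()
    # Drop the leading "-" or "+" marker of the class attribute.
    if tokens and tokens[0] in ("-", "+"):
        tokens = tokens[1:]
    # Walk from the most specific module to the most general one.
    for token in reversed(tokens):
        module, _, element = token.partition("/")
        if module not in target:
            continue
        # The target module must carry a subset of the instance's constraints.
        if target[module] <= instance.get(module, set()):
            return module, element
    return None


instance = {"topic": {"simpleSection-c"}, "task": {"simpleTaskSection-c"}}
class_attr = "- topic/section task/context "

# Target shell has topic constrained by simpleSection only.
print(generalization_threshold(class_attr, instance, {"topic": {"simpleSection-c"}}))
# Target shell constrains topic in a way the instance does not declare.
print(generalization_threshold(class_attr, instance, {"topic": {"shortdescReq-c"}}))
```

The first call finds topic as the threshold, so the element would be renamed to section. The second call finds no qualifying module, which corresponds to the case where the instance can only be generalized to a less constrained document type.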
Note that a document instance can always be converted from a constrained document type to an unconstrained document type merely by switching the binding of the document instance to the less restricted schema (which would also have a different domains attribute declaration). No renaming of elements is needed to remove constraints.
The basic strategy for implementing constraints in schemas is as follows:
...
<xs:group name="basicHighlight-c-ph">
  <xs:choice>
    <xs:element ref="b"/>
    <xs:element ref="i"/>
  </xs:choice>
</xs:group>
...
...
<xs:redefine schemaLocation="topicMod.xsd">
  <!-- constrain content and attributes of <topic> element -->
  <xs:complexType name="topic.class">
    <xs:complexContent>
      <xs:restriction base="topic.class">
        <xs:sequence>
          <xs:group ref="title"/>
          <xs:group ref="titlealts" minOccurs="0"/>
          <!-- make required -->
          <xs:choice>
            <xs:group ref="shortdesc"/>
            <xs:group ref="abstract"/>
          </xs:choice>
          <xs:group ref="prolog" minOccurs="0"/>
          <xs:group ref="body" minOccurs="0"/>
          <!-- remove <related-links> -->
          <xs:group ref="topic-info-types" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
  ...
</xs:redefine>
...
...
<xs:include schemaLocation="basicHighlightConstraint.xsd"/>
...
<xs:redefine schemaLocation="commonElementGrp.xsd">
  <xs:group name="ph">
    <!-- drop base <ph> as well as apply basic subset of highlight domain -->
    <xs:choice>
      <xs:group ref="basicHighlight-c-ph"/>
    </xs:choice>
  </xs:group>
  ...
</xs:redefine>
<xs:redefine schemaLocation="strictTopicConstraint.xsd">
  <xs:complexType name="topic.class">
    <xs:complexContent>
      <xs:extension base="topic.class">
        <!-- declare the constraint of topic and highlight vocabulary modules
             and compatibility of constrained highlight with subset of
             topic constraints -->
        <xs:attribute name="domains" type="xs:string"
          default="(topic noBasePhrase-c) (topic strictTopic-c) (topic strictTopic-c hi-d basicHighlight-c)"/>
        ...
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  ...
</xs:redefine>
...
The basic strategy for implementing constraints in DTDs is as follows:
...
<!ENTITY % topic.content "((%title;), (%titlealts;)?, (%shortdesc;|%abstract;)?,
(%prolog;)?, (%body;)?, (%related-links;)?, (%topic-info-types;)*)">
<!ENTITY % topic.attributes
"id ID #REQUIRED
conref CDATA #IMPLIED
%select-atts;
%localization-atts;
outputclass CDATA #IMPLIED">
...
<!ELEMENT topic %topic.content;>
<!ATTLIST topic %topic.attributes;>
<!ATTLIST topic
%arch-atts;
domains CDATA "&included-domains;">
...
<!ENTITY % basicHighlight-c-ph "b | i">
<!ENTITY basicHighlight-c-att "(topic hi-d basicHighlight-c)">
<!ENTITY topic-constraints "(topic strictTopic-c)">
...
<!ENTITY % topic.content "((%title;), (%titlealts;)?, (%shortdesc;|%abstract;),
(%prolog;)?, (%body;)?, (%topic-info-types;)*)">
...
...
<!ENTITY % basicHighlight-c-dec SYSTEM "basicHighlightConstraint.ent">
%basicHighlight-c-dec;
...
<!-- drop base <ph> as well as apply the basic subset of highlight domain -->
<!ENTITY % ph "%basicHighlight-c-ph;">
...
<!ENTITY % strictTopic-c-def SYSTEM "strictTopicConstraint.mod">
%strictTopic-c-def;
...
<!-- declare the constraint of topic and highlight vocabulary modules and
compatibility of constrained highlight with subset of topic constraints -->
<!ENTITY included-domains "(topic noBasePhrase-c)
(topic strictTopic-c)
(topic strictTopic-c hi-d basicHighlight-c)">
...
<!ENTITY % topic-type SYSTEM "topic.mod">
%topic-type;
...