DITA Proposed Feature 32

Domain and topic integration (an architectural enhancement).

Longer description

The problem: Currently, DITA has a notion of module inheritance. The elements defined within a module can only specialize elements defined in a single base module (or its base module, recursively). As a result, elements assigned to positions within content models can only be shared across modules by providing all of the shared elements in a base module.

Designers can encounter scenarios where a base type cannot be used to share vocabulary that has an assigned position within content models. As a result, designers must create many elements with different names for the same semantic, content, and processing, resulting in greater complexity, difficulty in reusing content fragments, and duplicate design, processing, and documentation.

The solution: Change the current notion of module inheritance into the more general notion of module dependency managed by the architecture, including dependencies on multiple modules.

That is, in this approach, elements defined within one module would continue to have inheritance relationships with elements defined in other modules. The modules themselves, however, would only have dependencies on other modules.

Note that other systems with hierarchical inheritance (notably Java) have had success with this same dependency-based approach for modular packaging.

Scope

Major – architectural.

Use Case

Here are some examples of how this architectural enhancement would allow designers to create better document types:

Sharing vocabulary across several topic types
For example, the Java API reference provides topic types for documenting a Java class library, including javaClass, javaMethod, and so on. These topic types in turn derive from general-purpose topic types such as apiClassifier, apiOperation, and so on.

Some content is common across javaClass and javaMethod. For instance, there is a need to identify the name of a Java class used as a base class within javaClass and as a method parameter, return value, or exception within javaMethod.

Because javaClass and javaMethod specialize from different topic types, however, they cannot share Java vocabulary that has assigned positions in content models. Thus, currently, two distinct elements (javaClassClass and javaOperationClass) would be needed for the same content.

With the enhancement, the javaClass topic type could depend on both the apiClassifier topic type and a javaName vocabulary domain. In turn, the javaMethod topic type could depend on the apiOperation topic type and also on the javaName vocabulary domain.

Incorporating existing domain elements into topic type content models
For example, a GUI task specialization might reasonably require the existing <menucascade> element to identify how to start the task, but currently a specialized topic type cannot have a dependency on an element from a domain module.

With the enhancement, the GUI task specialization could depend on both the task topic type and the existing UI vocabulary domain.

Extending existing topic types with optional variations on existing elements
For example, a domain could reasonably add a <parameters> specialization of the reference <properties> element that restricts parameter types to class names or simple value types. The domain could make this optional alternative to the <properties> element available for any kind of reference topic.

Taking advantage of multiple base domains when defining new vocabularies
For example, a module for describing object-oriented source files might include a <classname> specialization of the existing <apiname> element from the programming domain as well as the <sourcefile> specialization of the existing <filepath> element from the software domain.

Technical Requirements

The module dependency approach doesn't alter the fundamental rules of specialization. The cardinal rule is that a specialized element can be substituted only for its base element and not for an unrelated element or an ancestor above its base element. For instance, a <glossarylist> specialized from <dl> could contain <termhead> and <termentry> specializations only if those elements are direct specializations of the <dlhead> and <dlentry> elements. In addition, a <glossarylist> could be introduced only into contexts that allow <dl>.

What changes with this proposal is that the <glossarylist>, <termhead>, and <termentry> specializations can be provided by different modules. By virtue of containing <termhead> and <termentry>, <glossarylist> has a dependency on the modules that supply those elements. An invalid <glossarylist> (for instance, one in which <termhead> has to be generalized but not <glossarylist>) is impossible because any document type that would require generalization of <termhead> would also require generalization of <glossarylist>, and the base element for <termhead> can appear within the base element for <glossarylist>.

This enhancement changes the architecture as follows:

Define topics in domain modules
The architecture treats a topic specialization as a substitution for <topic> in positions where <topic> can appear. These positions include the root position for a topic document, the content of a <dita> element, and the position after <related-links> within the <topic> element.

These positions can be restricted to topic specializations using the standard specialization mechanisms.

For instance, in a DTD implementation, the content model for the <topic> element would list the %topic; parameter entity after the <related-links> element so domain substitution could introduce specializations of the <topic> element.

Declare module dependencies
The architecture adds the dependencies architectural attribute on the outer element for the content object (that is, on the topic or map). The dependencies attribute lists the modules that contributed to the document type and summarizes how one module depends on other modules.

A module has a direct dependency on every module that provides base elements from which it specializes. For example, if the elements provided in the javaClass module specialize from elements in the apiClassifier and javaName modules, the javaClass module depends on the apiClassifier and javaName modules.

Note: The domains attribute is deprecated but preserved so existing processes can be supported for existing document types and modules.

The following example shows the dependencies attribute for the <javaClass> specialization of <topic>. The dependencies attribute indicates that the javaClass module depends on the apiClassifier and javaName modules, that the apiClassifier module depends on the reference and pr-d modules, and so on. Only the topic module has no dependencies.

<javaClass ...
        dependencies="javaClass( apiClassifier javaName )
            apiClassifier( reference pr-d )
            javaName( pr-d )
            reference( topic )
            pr-d( topic )
            topic( )">

Note that dependencies cannot be circular. For instance, it would be an error if the reference module depended on the javaClass module.
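
The circularity rule can be checked mechanically. The following Python sketch is only an illustration, not part of the proposal: it assumes the attribute syntax shown in the example above (module name followed by a parenthesized, space-separated list), and the function names parse_dependencies and find_cycle are hypothetical.

```python
import re


def parse_dependencies(value):
    """Map each module name to the list of modules it depends on."""
    # Assumed syntax: name( dep dep ... ), repeated, separated by whitespace.
    entry = re.compile(r"([\w-]+)\(([^()]*)\)")
    return {name: deps.split() for name, deps in entry.findall(value)}


def find_cycle(graph):
    """Return one circular dependency path as a list, or None if none exists."""
    state = {}  # module -> "visiting" or "done"

    def visit(module, path):
        if state.get(module) == "done":
            return None
        if state.get(module) == "visiting":
            # The module is already on the current path: a cycle.
            return path[path.index(module):] + [module]
        state[module] = "visiting"
        for dep in graph.get(module, []):
            cycle = visit(dep, path + [module])
            if cycle:
                return cycle
        state[module] = "done"
        return None

    for module in graph:
        cycle = visit(module, [])
        if cycle:
            return cycle
    return None


EXAMPLE = """javaClass( apiClassifier javaName )
    apiClassifier( reference pr-d )
    javaName( pr-d )
    reference( topic )
    pr-d( topic )
    topic( )"""

graph = parse_dependencies(EXAMPLE)
print(find_cycle(graph))  # None: the example is acyclic

# The error case described above: reference depending on javaClass.
graph["reference"].append("javaClass")
print(find_cycle(graph))
```

A depth-first search is used here because it reports the offending path, which is more useful in an error message than a plain yes or no.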

A validation utility could check the dependencies attribute against the class attributes to make sure that the dependencies attribute reflects every base relationship between modules.
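
One way such a validation utility might work is sketched below in Python. It assumes that class attribute values follow the existing DITA convention of a leading "-" or "+" followed by module/element pairs ordered from least to most specialized, and that adjacent pairs from different modules imply a dependency of the later module on the earlier one. The function names and the javaClassName and javaBaseClass elements are hypothetical.

```python
def required_dependencies(class_values):
    """Collect (module, base module) pairs implied by class attribute values."""
    required = set()
    for value in class_values:
        # Keep only module/element tokens; this skips the leading "-" or "+".
        modules = [tok.split("/")[0] for tok in value.split() if "/" in tok]
        for base, derived in zip(modules, modules[1:]):
            if base != derived:
                required.add((derived, base))
    return required


def missing_dependencies(declared, class_values):
    """List implied (module, base module) pairs absent from the declared map."""
    return sorted(
        pair for pair in required_dependencies(class_values)
        if pair[1] not in declared.get(pair[0], [])
    )


# Declared dependencies, already parsed from the dependencies attribute.
declared = {
    "javaClass": ["apiClassifier"],  # javaName deliberately left out
    "apiClassifier": ["reference", "pr-d"],
    "javaName": ["pr-d"],
    "reference": ["topic"],
    "pr-d": ["topic"],
    "topic": [],
}

# Class attribute values gathered from the document type (illustrative).
class_values = [
    "- topic/topic reference/reference apiClassifier/apiClassifier javaClass/javaClass ",
    "+ topic/keyword pr-d/apiname javaName/javaClassName javaClass/javaBaseClass ",
]

print(missing_dependencies(declared, class_values))
# [('javaClass', 'javaName')]
```

The sketch reports only base relationships that are missing from the dependencies attribute; it does not flag declared dependencies that no class attribute uses.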

The following table lists the basic possible configurations of dependencies. (Refer to the Use Case section for more explanation of the parenthetical examples.)

Case | Depending module | Depended-on modules
1 (valid in DITA 1) | topic (task topic) | 1 topic (topic)
2 (valid in DITA 1) | domain (javaName domain) | 1 domain (apiName domain)
3 | topic (GUITask topic) | 1 topic and 1 domain (task topic and UI domain)
4 | domain (parameter domain) | 1 topic (reference topic)
5 | domain (OOSource domain) | 2 domains (programming and software domains)

Module dependencies would have the following impacts on DITA processing.

Inherited processing (aka fallback processing)
Processing based on the class attribute continues to work without modification.

Conref
In coarse validation, the process inspects the dependencies attribute (similar to previous use of the domains attribute) to determine whether the modules for the content source are a subset of the modules of the content destination and thus the content fragment elements are guaranteed to be valid. In fine-grained validation, the process inspects the class attributes of the elements in the content fragment to confirm that the lowest module in each class attribute is included in the dependencies for the content destination.
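
The two levels of conref validation could be sketched as follows in Python. This is an illustration under stated assumptions: the dependencies attributes have already been parsed into dictionaries, "lowest module" is read as the module of the most specialized (last) entry in the class attribute, and all function names are hypothetical.

```python
def modules_in(dependencies):
    """Return every module named in a parsed dependencies attribute."""
    modules = set(dependencies)
    for deps in dependencies.values():
        modules.update(deps)
    return modules


def coarse_conref_ok(source_deps, dest_deps):
    """True if the source's modules are a subset of the destination's."""
    return modules_in(source_deps) <= modules_in(dest_deps)


def fine_conref_ok(fragment_classes, dest_deps):
    """True if each element's most specialized module is in the destination."""
    allowed = modules_in(dest_deps)
    return all(
        value.split()[-1].split("/")[0] in allowed
        for value in fragment_classes
    )


source = {
    "javaClass": ["apiClassifier", "javaName"],
    "apiClassifier": ["reference", "pr-d"],
    "javaName": ["pr-d"],
    "reference": ["topic"],
    "pr-d": ["topic"],
    "topic": [],
}
destination = {"reference": ["topic"], "pr-d": ["topic"], "topic": []}

# The source document type uses modules the destination lacks ...
print(coarse_conref_ok(source, destination))  # False

# ... but a fragment that contains only <apiname> is still acceptable.
print(fine_conref_ok(["+ topic/keyword pr-d/apiname "], destination))  # True
```

The example shows why the fine-grained check is worth having: the coarse check rejects the whole source document type even when the specific fragment uses only shared modules.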

Generalization
The operation of actually renaming elements requires only the class attribute. Thus, module dependencies have no effect on the renaming operation itself.

To determine whether the element is valid for the target document type, the generalization operation scans the class attribute from the most specialized entry to the least specialized entry until it finds a module that is supported by the target document type. The generalizer uses the element name from that entry as the generalized element name.

To find out which modules are valid in the target document type, the generalizer might look at the dependencies attribute for an existing instance of the target document type, look at an XML representation (such as XML Schema or Neko DTDx) of the validation grammar, or take a list of module names as input.
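
The scan over the class attribute can be sketched in Python as follows. The sketch assumes the simplest of the options above, a list of supported module names passed as input, and the function name generalized_name is hypothetical.

```python
def generalized_name(class_value, supported_modules):
    """Return the most specialized element name the target supports, or None."""
    # Keep only module/element tokens; this skips the leading "-" or "+".
    tokens = [tok for tok in class_value.split() if "/" in tok]
    # Scan from the most specialized entry (last) to the least (first).
    for token in reversed(tokens):
        module, element = token.split("/", 1)
        if module in supported_modules:
            return element
    return None


java_class = (
    "- topic/topic reference/reference "
    "apiClassifier/apiClassifier javaClass/javaClass "
)

print(generalized_name(java_class, {"topic", "reference", "pr-d"}))  # reference
print(generalized_name(java_class, {"topic"}))                       # topic
```

Returning None when no module matches leaves the decision about unsupported content to the caller rather than guessing a fallback element.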

As noted above, any document type that forces generalization of an element also forces generalization of the container element (if necessary), so the generalized element is guaranteed to be valid in the position. That is, because this proposal doesn't change the rules of specialized substitution, generalization remains reliable.

Costs

This change is backward compatible for instances and processing:

DTD impact: Apply the design pattern for domains to topics.

Schema impact: Apply the design pattern for domains to topics.

Benefits

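By making it possible to use a single element in all contexts that have the same kind of content, this proposal addresses the problems described above: it reduces complexity, makes content fragments easier to reuse, and avoids duplicate design, processing, and documentation.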

Also, eliminating the distinction between topic and domain modules simplifies DITA overall because there are only modules.

Time Required

Two days to refactor the existing DTDs. Note that this work overlaps with that needed to enable constraints (see issue 34).