Extensible universal attributes, specifically for conditional processing
(filtering/flagging, also known as profiling), but also for arbitrary attributes
that have a similarly simple syntax.
Longer description
Allow DITA document
type developers to incorporate new conditional processing attributes that
can be used for filtering and flagging, or new attributes with no existing
equivalent that can be managed and generalized in the same way as conditional
processing attributes.
The new attributes need to be:
- Identified as conditional processing attributes (when intended for this
purpose)
- Preserved during generalization and respecialization
- While generalized, still usable by either general or specialized
behaviors (for example, conditional processing)
This proposal also covers increased flexibility
in the attribute values used for conditional processing, and in the ditaval
format that is used to select values for exclusion, inclusion, or flagging.
Use Case: conditional processing
- DITA architect for a team defines new attributes that are needed by the
team (eg proglanguage)
- DITA architect expresses each new attribute as a separate domain package
(eg proglanguage.mod, with new attribute specialized from props attribute)
- DITA architect integrates the domain packages into the authoring DTDs
or schemas:
- redefining "props" attribute entity to include proglanguage attribute,
same way we redefine element entities to integrate new domain elements;
- adding the new attribute domain to the list of domains in the domains
attribute, preceded by an "a"; for example domains="a(props proglanguage)"
or domains="a(props audience role)"
- Author can now add values to the new attributes, since they are physically
present in the document type
- Build developer defines values in ditaval format and runs a build to remove
or flag content based on the new attributes (eg flag all proglanguage="Java").
- Another build developer includes their content but needs to run all content
through a specialization-unaware trademarking tool that requires generalization
of the contributed content; after generalization, the content is processed
into output with filtering based on the new attributes (which are now collapsed
into props attribute):
- the generalize process turned proglanguage="Java" into props="proglanguage(Java)"
- the conditional processing transform recognizes the new form as equivalent
to the old, and the instruction "flag all proglanguage=Java" operates on either
props="proglanguage(Java)" or proglanguage="Java".
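The generalization round trip described in this use case can be sketched as follows. This is a minimal illustration, not the toolkit's implementation: the function names generalize and values_for are hypothetical, and attributes are modeled as a plain dictionary.

```python
import re

def generalize(attrs, specialized, ancestor="props"):
    """Fold specialized attributes into the ancestor as name(values) groups."""
    out = {k: v for k, v in attrs.items() if k not in specialized}
    groups = [f"{name}({attrs[name]})" for name in specialized if name in attrs]
    if groups:
        existing = out.get(ancestor, "")
        out[ancestor] = " ".join(filter(None, [existing] + groups))
    return out

def values_for(attrs, name, ancestor="props"):
    """Collect the values of `name` from the specialized or generalized form."""
    values = attrs.get(name, "").split()
    pattern = rf"\b{re.escape(name)}\(([^()]*)\)"
    for group in re.findall(pattern, attrs.get(ancestor, "")):
        values.extend(group.split())
    return values

specialized_form = {"proglanguage": "Java"}
generalized_form = generalize(specialized_form, ["proglanguage"])
```

Under these assumptions, a rule such as "flag all proglanguage=Java" only needs to call values_for, so it sees the same value list before and after generalization.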
Draft comment: The grouping mechanism generated during generalization
will also be directly authorable. Should it be documented as such? This would
effectively allow the expression of OR statements within a single attribute.
If we do so, would we need to distinguish more directly between values that
identify groupings (for example the "proglanguage" value in props="proglanguage(Java)")
and those that are directly processible (for example props="vendor1")? One
possibility would be to include a colon in the syntax for attribute-based
groupings (eg props="proglanguage:(Java)").
For consistency,
the rev attribute will also be made specializable, although it and its specializations
will only be usable for flagging, not filtering. For example, a specialization
of the rev attribute might identify a particular kind of revision (technical
vs grammatical) or the role of the reviser (editor vs author).
Use case: generic attributes
- DITA architect for a team needs to add a new attribute that has no equivalent
in existing DITA, for example a "phase" attribute that identifies what phase
of a process an element is associated with.
- DITA architect expresses each new attribute as a separate domain package
(eg phases.mod, with new attribute specialized from "base" attribute)
- DITA architect integrates the domain packages into the authoring DTDs
or schemas:
- redefining "base" attribute entity to include phase attribute, same way
we redefine element entities to integrate new domain elements;
- adding the new attribute domain to the list of domains in the domains
attribute, preceded by an "a"; for example domains="a(base phase)" or
domains="a(base phase phasetype)"
- The DITA architect must also supply processing behavior for the new attribute,
and ensure that it works on both the specialized form (eg phase="develop")
and the generalized form (eg base="phase(develop)"), using the conditional
processing match logic as a pattern.
Use case: negative values
The DITA 1.0 attribute
syntax supports positive values only. This makes it difficult to work with
cases where the classification really is by negation (for example: "this applies
to every possible user EXCEPT programmers"); while a special value could be
created (for example "notprogrammer"), it would need to be managed in parallel
with the positive value (for example, "include notprogrammer, exclude programmer")
and is not a particularly usable solution.
Proposed change: allow NOT
as a special keyword within an attribute value. It applies only to the next
value: if there are multiple negatives, each must be negated independently.
This is still not full Boolean logic support, and is intended to
remain as simple and readable as possible, with an eye on the fact that a
major cost of conditional processing is maintaining, debugging, and transferring
ownership of documents with complex conditions. Maps are expected to do much
more of the heavy lifting in DITA, and complex conditions are deliberately
not supported within a single attribute.
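As an illustration of the "applies only to the next value" rule, the following sketch tokenizes an attribute value and marks each value as negated or not. The function name parse_values is hypothetical, and how a processor should treat a trailing NOT with no value is not stated in this proposal; the sketch simply ignores it.

```python
def parse_values(attr_value):
    """Split an attribute value into (value, negated) pairs.

    NOT negates only the single value that follows it.
    """
    parsed = []
    negate = False
    for token in attr_value.split():
        if token == "NOT":
            negate = True
            continue
        parsed.append((token, negate))
        negate = False  # the keyword does not carry over to later values
    return parsed

pairs = parse_values("NOT programmer NOT tester admin")
```

Because each NOT is consumed by exactly one value, the result stays a flat list with no grouping or precedence to resolve.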
Use case: scoped values
The DITA 1.0 attribute syntax
supports simple values only. This makes it difficult to work with cases where
several values have a common feature, for example audience="programmerJava
programmerCPP programmerPython".
Proposed change: in order to make
semantic scopes within an attribute more explicit, support componentized values
separated by /: for example audience="programmer/database programmer/Java
programmer/Web". The separate components of a value can then be addressed
directly when filtering or flagging, for example "exclude programmer" would
match all three, whereas "exclude programmer/Java" would match only the second
value. The componentized syntax would require the most general scope to occur
to the left, becoming more specific as it moves to the right, to be consistent
with similar uses in the class attribute and href syntax in DITA.
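The matching behavior described above can be sketched as a comparison of leading components. This assumes that a rule matches only from the most general (leftmost) component; the proposal does not say whether a rule may match a component further to the right. The name scope_matches is hypothetical.

```python
def scope_matches(rule, value):
    """True if `rule` names the value itself or one of its enclosing scopes."""
    rule_parts = rule.split("/")
    return value.split("/")[:len(rule_parts)] == rule_parts

audience = "programmer/database programmer/Java programmer/Web".split()

broad = [v for v in audience if scope_matches("programmer", v)]
narrow = [v for v in audience if scope_matches("programmer/Java", v)]
```

Comparing whole components rather than string prefixes keeps a rule such as "program" from accidentally matching "programmer/Java".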
Use case: extended syntax for ditaval
Publishers
require more flexibility in how they process values. The following format
extends and formalizes the .ditaval format used in the DITA toolkit and referred
to non-normatively in the DITA 1.0 specification:
- val
- Root element, contains one or more prop or revprop elements
- prop
- Identifies an attribute, and usually values in the attribute, to take
an action on. The attribute must be a specialization of the props attribute
(such as platform, product, audience, and otherprops).
- @att
- The attribute to be acted upon. Must be one of props, audience, platform,
product, otherprops, or a specialization of them. If the att attribute is
absent, then the prop element declares a default behavior for any attribute
specialized from props.
- @val
- The value to be acted upon. The value may be only a component of a scoped
value, for example "programmer" would match "programmer/enterprise". If the
val attribute is absent, then the prop element declares a default behavior
for any value in the specified attribute.
- action
- The action to be taken. The options are:
- include
- Include the content in output. This is the default behavior unless otherwise
set.
- exclude
- Exclude the content from output (if all values in the particular attribute
are excluded).
- passthrough
- Include the content in output, and preserve the attribute value as part
of the output stream for further processing. For example, add to the class
attribute in html output, using the format for generalized values: eg class="programminglanguage(programmer/Javaprogrammer)"
- flag
- Flag the content on output (if the content has not been excluded).
- @startimg
- If flag has been set, the image to use for flagging the beginning of flagged
content.
- @endimg
- If flag has been set, the image to use for flagging the ending of flagged
content.
- @color
- If flag has been set, the color to use to flag text. Colors may be entered
by name or by code. Processor support is recommended for the following: blue
#CAE1FF, green #DAF4F0, dark pink #CCCCFF, light pink #FFF0F5, yellow #ffffcc,
and tan #EED6AF
Draft comment: the list of values given here and below
differs from current toolkit support - what do we want to doc in the spec?
need to strike right balance between tool flexibility and interoperability
- @backcolor
- If flag has been set, the color to use as background for flagged text.
Colors may be entered by name or code. Processor support is recommended for
the following: blue #CAE1FF, green #DAF4F0, dark pink #CCCCFF, light pink
#FFF0F5, yellow #ffffcc, and tan #EED6AF
- @style
- If flag has been set, the text style to use for flagged text. The following
values are enumerated:
- underline
- double-underline
- italics
- overline
- bold
- @printchar
- If flag has been set, the character to include in the margin of the flagged
text, for example "|".
- revprop
- Identifies a value in the rev attribute of content, or of a specialization
of the rev attribute, that should be flagged in some manner. Unlike the props
attribute, which can be used for both filtering and flagging, the rev attribute
and its specializations can only be used for flagging.
- @att
- The attribute to be acted upon. Must be rev or a specialization of rev.
If the att attribute is absent, then the revprop element declares a default
behavior for any attribute specialized from rev.
- @val
- The value to be acted upon. The value may be only a component of a scoped
value, for example "programmer" would match "programmer/enterprise". If the
val attribute is absent, then the revprop element declares a default behavior
for any value in the specified attribute.
- action
- The action to be taken. The options are:
- include
- Include the content in output without flags. This is the default behavior
unless otherwise set.
- passthrough
- Include the content in output, and preserve the attribute value as part
of the output stream for further processing.
- flag
- Flag the content on output (if the content has not been excluded).
- Flag the content using >> and << characters in addition to whatever
image or style options are chosen.
- @startimg, @endimg, @color, @backcolor, @style, @printchar
- Same as for prop element
- startimgalt
- An element allowed inside either prop or revprop to provide alternate
text for an image, when the startimg attribute sets an image to be used for
flagging. If the element is absent inside revprop, the default alternate text
of "start of change" will be used, in the language of the current document
or element.
- endimgalt
- An element allowed inside either prop or revprop to provide alternate
text for an image, when the endimg attribute sets an image to be used for
flagging. If the element is absent inside revprop, the default alternate text
of "end of change" will be used, in the language of the current document or
element.
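To make the defaulting rules for att and val concrete, the sketch below embeds a small ditaval document and resolves the action for an attribute/value pair. The element and attribute names come from the description above; the sample values, the exact-match comparison, and the "most specific rule wins" precedence are assumptions, since the proposal does not define how overlapping rules are ranked.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample file; the values are illustrative only.
DITAVAL = """
<val>
  <prop action="include"/>
  <prop att="proglanguage" val="Java" action="flag" color="blue"/>
  <prop att="audience" action="exclude"/>
  <prop att="audience" val="programmer" action="include"/>
  <revprop val="v2" action="flag" style="bold"/>
</val>
"""

def action_for(root, att, value):
    """Return the action of the most specific matching prop rule."""
    best_rank, best_action = -1, "include"  # include is the stated default
    for prop in root.findall("prop"):
        p_att, p_val = prop.get("att"), prop.get("val")
        # An absent att or val acts as a wildcard (a default rule).
        if p_att not in (None, att) or p_val not in (None, value):
            continue
        rank = (p_att is not None) + (p_val is not None)
        if rank > best_rank:
            best_rank, best_action = rank, prop.get("action", "include")
    return best_action

root = ET.fromstring(DITAVAL.strip())
```

With this file, audience values are excluded by default, audience="programmer" is explicitly included, and proglanguage="Java" is flagged.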
Draft comment:
Should we make ditaval a true DITA format,
by adding class attributes, id and conref attributes, and the base attribute?
(I'm assuming props and its derivatives would be inappropriately self-referential
in this context).
Technical Requirements
Change the
architectural specification to allow specialization of new universal attributes
from the props, rev, and base attributes, following the domain model: each new
universal attribute is defined in a separate domain package that provides an
attribute definition entity and a domain attribute value that starts with
"a" and then lists the attribute ancestry in parentheses, eg a(props language).
The domain can be integrated into a doctype by redefining the univ-atts entity
to include the new attribute entity, and redefining the domains attribute
to include the domains value entity.
Define syntax for generalized attribute
values that allows for continued processing and roundtripping: put the values
of the generalized attribute into parentheses preceded by its specialized
name, eg props="proglanguage(Java)" or props="audience(role(developer))".
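The nested form can be read back with a small parser, which is one way to check that the syntax round-trips. This is a sketch that assumes well-formed, balanced parentheses; the names parse_generalized and respecialize are hypothetical.

```python
import re

def parse_generalized(text):
    """Return (ancestry_path, value) pairs from a generalized attribute value."""
    tokens = re.findall(r"[^\s()]+|[()]", text)
    results, path = [], []
    for i, tok in enumerate(tokens):
        if tok == "(":
            continue
        if tok == ")":
            path.pop()                      # leave the current group
        elif i + 1 < len(tokens) and tokens[i + 1] == "(":
            path.append(tok)                # a name that opens a group
        else:
            results.append((tuple(path), tok))
    return results

def respecialize(text, ancestor="props"):
    """Rebuild attributes, naming each value after its innermost group."""
    attrs = {}
    for path, value in parse_generalized(text):
        name = path[-1] if path else ancestor
        attrs.setdefault(name, []).append(value)
    return {name: " ".join(vals) for name, vals in attrs.items()}

nested = parse_generalized("audience(role(developer))")
flat = respecialize("proglanguage(Java) vendor1")
```

Keeping the full ancestry path in the parse result lets a processor choose whether to respecialize to the innermost attribute or stop at an intermediate ancestor.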
Update
conditional processing logic to work the same on either specialized or generalized
forms of the value: OR between attributes, AND within an attribute, whether
or not an attribute actually exists.
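One reading of this rule, applied to exclusion, is sketched below: content is excluded if any one attribute (OR) has all of its values excluded (AND). The function name is_excluded is hypothetical, and the sketch assumes values have already been collected per attribute name from either the specialized or the generalized form.

```python
def is_excluded(attr_values, excluded):
    """Decide exclusion for one element.

    attr_values: dict mapping attribute name to its list of values
    excluded:    set of (attribute name, value) pairs marked exclude
    """
    return any(
        bool(values) and all((name, v) in excluded for v in values)
        for name, values in attr_values.items()
    )

excluded = {("audience", "programmer"), ("proglanguage", "Java")}

mixed = is_excluded({"audience": ["programmer", "admin"]}, excluded)
fully = is_excluded(
    {"audience": ["programmer", "admin"], "proglanguage": ["Java"]}, excluded
)
```

An attribute with no values never triggers exclusion in this sketch, which matches include being the default action.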
Remove existing section of specification
on how to "break" architecture to get attribute specialization.
Add
a "props" attribute to the architecture, from which the other metadata attributes
(platform, product, audience, otherprops) will be specialized.
Add a
"base" attribute to the architecture, which will be ignored by unspecialized
processing.
Ensure all attributes are expressed in entities to allow
domain-based expansion/specialization (DTDs); document the equivalent mechanism
for schemas.
Formalize and document new conditional processing attribute
value syntax to allow scoped values (eg Java/EJB) and negative values (eg
NOT Java).
Formalize and document the ditaval format, including logic for
filtering, flagging, ignoring, or passing through values in the props attribute
or in attributes specialized from the props attribute; for setting attribute
defaults; and for matching partial values based on value scopes or components.
Costs
Design time is expected to be minimal.
The open-source toolkit will need additional work to
enhance existing transforms to handle generalization and respecialization of
attributes specialized from "base" and "props", and to make the conditional
processing logic specialization-aware.
Benefits
This is consistently a highly rated requirement and would be widely
used. For some, it would
remove a major barrier to DITA adoption.
Time Required
- Three 1-hour meetings to review requirements
- Three 1-hour meetings to agree on solution
- 2 days to complete document solution