Here is my draft Stage 2 proposal for the inclusion of strong and em under a new domain and redefining the b and i elements in a more semantically-descriptive manner.

I know that both Kris Eberlein and Bob Thomas have volunteered to review this. Apologies if this should have been circulated to them first, but the other Stage 2 proposals I have looked at did not indicate if they were reviewed beforehand.

Also, I admit my technical weakness when it comes to devising a DTD description, so I welcome any guidance in this part of the proposal where I have inadvertently gone astray.

DITA 2.0 proposed feature #107

Date and version information

Date that this feature proposal was completed

23 March 2018

Champion of the proposal

Keith Schengili-Roberts

Links to any previous versions of the proposal

https://lists.oasis-open.org/archives/dita/201803/msg00012.html

Links to minutes where this proposal was discussed at stage 1 and moved to stage 2

https://www.oasis-open.org/apps/org/workgroup/dita/download.php/62726/minutes20180313.txt

Links to e-mail discussion that resulted in new versions of the proposal

N/A

Link to the GitHub issue

https://github.com/oasis-tcs/dita/issues/107

Original requirement

Many people new to DITA have expressed confusion when they hear about the semantic nature of DITA and then find only the b (bold) and i (italics) elements, with no strong or em. HTML has long supported (since the “HTML+” specification from 1993) the additional strong and em elements as more descriptive, semantic equivalents for b and i.

HTML5 has taken this one step further by fully defining b and i as semantic elements, distinct from strong and em.

In keeping with HTML5, a standard with which many people coming to DITA have more than a passing familiarity, this proposal suggests that strong and em be added as elements under a new domain (tentatively titled “semantic_descriptors”) that is separate from the highlighting domain and contains only these two new elements. At the same time, the existing b and i elements within the highlighting domain will be redefined within the DITA 2.0 specification in a more semantic manner. This will bring them more in line with their equivalent meanings in HTML5; they are otherwise unchanged.

Use cases

For users seeking a semantic equivalent for the b and i elements, strong and em could now be used instead.

The retention and redefining of the b and i elements would also make it clear as to the situations for which strong and em should be used, and the scenarios where b and i are more appropriate.

New terminology

The strong element would inherit from topic/ph as part of the semantic_descriptors domain, and could be defined as follows:

The strong element should be used to indicate strong importance, seriousness, or urgency of content. Typically, its content will be rendered in boldface at output. This element is part of the semantic descriptors domain. Use this element only when a more semantically appropriate element is not available. For example, for a specific warning, consider using an appropriate element from the hazard statement domain, such as hazardstatement.

The em element would also inherit from topic/ph, and could be defined as follows:

The em element should be used to indicate emphasis. A stress emphasis is designed to change the meaning of a phrase or sentence, or to stress the importance of a particular noun, verb or adjective. Typically, its content will be rendered in italics at output. This element is part of the semantic descriptors domain. Use this element only when a more semantically appropriate element is not available. For example, when indicating a different mood or voice, the i element may be more relevant.

The b element description would also change, making it more semantically descriptive, and more in line with HTML5’s current definition. This could look like the following:

The b element should be used to draw attention to a word or phrase for utilitarian purposes without implying that there is any extra importance. There is also no implication of an alternate voice or mood, or that its content should be actionable. For example, it can be used to indicate product names within a review, to highlight roles within a process, or in spans of text where the typical presentation is expected to be in boldface.

Similarly, the i element would also be redefined to make it more semantically descriptive. It could look like the following:

The i element should be used for a word or phrase indicating either an alternate voice or mood, or to otherwise offset it from the content around it to indicate a different quality of text, such as a taxonomic designation, an idiomatic phrase from another language, a technical term, or a ship name.

Proposed solution

1. Create a new domain for these two new semantically descriptive elements, tentatively called “semantic_descriptors”.

2. Create two new phrase-level elements within this domain: strong and em.

3. Add new descriptions plus example code illustrating the intended usage for these elements.

4. Change the descriptions for the b and i elements within the highlighting domain and include example code illustrating intended usage.

Benefits

Who will benefit from this feature?

Authors seeking a more semantic element for encapsulating content that should either be strong or emphasized. The redefinition of the b and i elements will also make it plain when and where these highlighting elements should be used. It will also benefit DITA trainers who will now be able to point to more semantic equivalents to the existing b and i elements.

What is the expected benefit?

Authors working within DITA will have a more clear-cut choice on when to use strong and em, and when to use b and i, in keeping with how these elements are currently defined within HTML5.

How many people probably will make use of this feature?

There are known cases where technical writing teams have constrained out the highlighting domain because of its lack of semantic elements. Similarly, there are DITA authoring groups that have either specialized ph to create their own equivalent of strong and em, or, more awkwardly, use @outputclass with ph to achieve the same ends. The redefinitions proposed for b and i may convince the former to retain the highlighting domain, while providing new, semantically-described strong and em elements ought to take care of the latter group.

While this proposal is not sufficient to draw people to use DITA 2.0, it will likely be welcomed by the user community.

How much of a positive impact is expected for the users who will make use of the feature?

Likely minimal; in many ways this is less a feature than a long-overdue tweak to the specification. However, those who will use this feature are likely to be pleased with its addition.

Technical requirements

Adding new elements or attributes

Two new elements, strong and em, will be added under a new domain.

Adding a domain

The new semantic_descriptors domain would fall under the set of general-purpose Domain elements.

Adding an element

Two new elements will be added under the semantic_descriptors domain: strong and em.

Inheritance:

+ topic/ph semantic_descriptors/strong

+ topic/ph semantic_descriptors/em

DTDs:

(Please note that the following is based on DITA 1.3 and does not include any changes to phrase-level elements that may already have been proposed for DITA 2.0.)

<!ENTITY % semanticdescriptors-d-ph
  "strong | em"
>

<!ENTITY semanticdescriptors-d-att
  "(topic semanticdescriptors-d)"
>

<!ENTITY % strong "strong" >
<!ENTITY % em "em" >

<!ENTITY % strong.content
  "(#PCDATA |
    %basic.ph; |
    %data.elements.incl; |
    %draft-comment; |
    %foreign.unknown.incl; |
    %required-cleanup;)*"
>
<!ENTITY % strong.attributes
  "%univ-atts;
   outputclass
     CDATA
       #IMPLIED"
>
<!ELEMENT strong %strong.content;>
<!ATTLIST strong %strong.attributes;>

<!ENTITY % em.content
  "(#PCDATA |
    %basic.ph; |
    %data.elements.incl; |
    %draft-comment; |
    %foreign.unknown.incl; |
    %required-cleanup;)*"
>
<!ENTITY % em.attributes
  "%univ-atts;
   outputclass
     CDATA
       #IMPLIED"
>
<!ELEMENT em %em.content;>
<!ATTLIST em %em.attributes;>

Renaming or refactoring elements and attributes

Only the descriptions of the b and i elements need to be updated in the DITA 2.0 specification. See the “New terminology” section for the proposed changes in wording.

Renaming or refactoring an attribute

N/A

Removing elements or attributes

N/A

Processing impact

Expected to be minimal.

Overall usability

Users will have a choice between using strong and em vs. the b and i elements. There may be some confusion as to when to best use strong vs. b and em vs. i, but this can be mitigated by providing numerous, relevant code examples in the specification for each element.

Backwards compatibility

Changing the meaning of an element or attribute in a way that would disallow existing usage?

No. As the b and i elements are not being removed, DITA 2.0 users can continue to use these elements if they choose, opt to use strong and em as their replacements, or use both sets of elements in parallel.

Migration plan

Might any existing documents need to be migrated?

Use of strong and em is optional as b and i are still present, so there is no need to update all instances of b to strong, and i to em, though there will undoubtedly be some technical documentation teams that choose to do so.

Might any existing processors or implementations need to change their expectations?

Not in terms of expectations. However, output processors (such as the DITA OT) will need to accommodate the formatting of the two new elements; for compatibility, it is suggested that strong copy the default output behavior of b, and that em copy that of i.
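As a rough sketch of what that accommodation might look like in an XSLT-based HTML transform: the stylesheet below is not taken from the DITA OT, HTML as the output format is an assumption, and the class values assume the inheritance strings listed under “Inheritance” above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical rule: match on the (assumed) class value for strong
       and emit an HTML strong element, which browsers render in bold
       by default, as they do for b. -->
  <xsl:template match="*[contains(@class, ' semantic_descriptors/strong ')]">
    <strong>
      <xsl:apply-templates/>
    </strong>
  </xsl:template>

  <!-- Hypothetical rule: same approach for em, which browsers render
       in italics by default, as they do for i. -->
  <xsl:template match="*[contains(@class, ' semantic_descriptors/em ')]">
    <em>
      <xsl:apply-templates/>
    </em>
  </xsl:template>

</xsl:stylesheet>
```

Matching on the class attribute rather than the element name means that any later specialization of strong or em would pick up the same default formatting.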

Might any existing specialization or constraint modules need to be migrated?

Groups that have previously constrained out the highlighting domain, or who have specialized ph for creating equivalents for strong and em, are likely to drop their modifications with this proposal. There may still be groups that choose to constrain out the highlighting domain despite the revised semantic descriptions for b and i, but if so that would be their choice.

Costs

Outline the impact (time and effort) of the feature on the following groups:

Maintainers of the grammar files

Minor cost in adding the new domain and its associated elements.

Editors of the DITA specification

· How many new topics will be required?

Three. One to describe the intent of the new semantic_descriptors domain, and one for each new element (strong and em).

· How many existing topics will need to be edited?

Two. The topics for b and i ought to be updated to be more semantically descriptive, which will align them with their counterparts in HTML5.

Will the feature require substantial changes to the information architecture of the DITA specification? If so, what?

Only the addition of the new domain. Other than that, no significant architectural change is required.

A possible advantage of having a new semantic_descriptors domain is that it opens the possibility of including other, more semantically descriptive elements. For example, the existing overline element is used within electrical engineering to describe an “active low” signal, and subscript is used within mathematics to describe the base (or “radix”) of a number. These examples may be too specific to a particular discipline for inclusion here, but the idea of the semantic_descriptors domain provides a basis for further, more semantically defined phrasal elements.

Vendors of tools

Low cost is expected. Again, this is less a significant new feature than an overdue “tweak”.

DITA community-at-large

· Will this feature add to the perception that DITA is becoming too complex?

Any additional element adds to the total number of elements available in DITA. However, the intent is to bring DITA more in line with current HTML5 practice, something that will likely be welcomed by the community.

· Will it be simple for end users to understand?

Yes. As mentioned earlier, this is less of a wholly new feature than a long-overdue tweak. The community is likely to embrace these new elements, along with their alignment with their equivalents in HTML5.

Producing migration instructions or tools

If there are teams that decide to migrate all instances of b and i to strong and em, there are already tools capable of doing this one-for-one switch. It is unlikely that new tools will be needed to do this.
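To illustrate how small such a one-for-one switch is, here is a minimal XSLT sketch of an identity transform that renames b to strong and i to em. It is an illustration only, not a recommended tool. It drops any class attribute on the renamed elements on the assumption that a parser may have supplied DTD-defaulted values, and it does not deal with preserving the DOCTYPE declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy every node and attribute unchanged
       unless a more specific template below matches. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rename b to strong, keeping content and all attributes
       except class (which may hold a defaulted b value). -->
  <xsl:template match="b">
    <strong>
      <xsl:apply-templates select="@*[name() != 'class'] | node()"/>
    </strong>
  </xsl:template>

  <!-- Rename i to em in the same way. -->
  <xsl:template match="i">
    <em>
      <xsl:apply-templates select="@*[name() != 'class'] | node()"/>
    </em>
  </xsl:template>

</xsl:stylesheet>
```

A blanket rename like this only makes sense where every existing b really signals importance and every i really signals emphasis; content that uses b for product names or i for ship names would be better left as it is under the revised definitions.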

A white paper to describe the correct usage of the new and revised elements would be overkill, especially if sufficient code examples explaining the context for usage are provided within the specification.

If there is new terminology, is it likely to conflict with any usage of those terms in the existing specification?

The new definitions for strong and em, plus b and i, will make it clear as to their scenarios for use, along with a good set of code examples to demonstrate best practices for when they should be used.

Examples

Example code for strong:

The strong element can be used to indicate content that is considered to be important, serious, or urgent (without being a specific warning). It can be used in the following manner:

Important: Before proceeding to wrangle your first ostrich, ensure you know the location of the closest first aid station.
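As markup, this might look like the following sketch, assuming that the leading label is the part that carries the urgency:

```xml
<p><strong>Important:</strong> Before proceeding to wrangle your first ostrich, ensure you know the location of the closest first aid station.</p>
```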

Example code for em:

The em element can be used to indicate emphasis, stressing the importance of a particular word to the reader.

What was previously called block-level content up to HTML 4.01 is now called flow content in HTML5.
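One possible tagging, assuming the stressed word is “flow” (the word that changes from the older terminology):

```xml
<p>What was previously called block-level content up to HTML 4.01 is now called <em>flow</em> content in HTML5.</p>
```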

Example code for b:

The b element can be used to indicate a product name within a review:

One of the best features of Mr. Flip-it is its ability to manipulate objects within a three-dimensional space so that you can see the other side.
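In markup, assuming the product name is the only part being highlighted, this could be written as:

```xml
<p>One of the best features of <b>Mr. Flip-it</b> is its ability to manipulate objects within a three-dimensional space so that you can see the other side.</p>
```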

It can also be used to highlight things so that the user can easily scan text for things like the names of roles that otherwise carry no additional level of importance:

Solid Waste Operations Manager: plans and manages the countywide transfer station and landfill operations, coordinates solid waste processing operations with the planning and engineering staff, and performs related duties as required.
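A sketch of the corresponding markup, assuming the role name is what the reader scans for:

```xml
<p><b>Solid Waste Operations Manager</b>: plans and manages the countywide transfer station and landfill operations, coordinates solid waste processing operations with the planning and engineering staff, and performs related duties as required.</p>
```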

The b element can also be used in situations where boldfaced text is expected for stylistic purposes, such as when the house style for an article lede is to be rendered in boldface:

Ostrich wrangling is not a job for the faint-hearted. But it can be a highly rewarding endeavour as they are raised commercially for their meat, hide and feathers.
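Assuming the whole opening paragraph is treated as the lede, the markup might be:

```xml
<p><b>Ostrich wrangling is not a job for the faint-hearted. But it can be a highly rewarding endeavour as they are raised commercially for their meat, hide and feathers.</b></p>
```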

Example code for i:

The i element can be used to indicate text in a different voice, such as when foreign words or phrases are used:

<note type="caution">Even highly experienced operators of heavy machinery should remain alert for dangerous situations. Having a laissez-faire attitude is a recipe for disaster.</note>
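With the foreign phrase tagged (assuming “laissez-faire” is the intended target of the i element), the note might read:

```xml
<note type="caution">Even highly experienced operators of heavy machinery should remain alert for dangerous situations. Having a <i>laissez-faire</i> attitude is a recipe for disaster.</note>
```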

Or when different character voices are being indicated:

Edgar: I know thee well—a serviceable villain, as duteous to the vices of thy mistress as badness would desire.

Gloucester: What, is he dead?

It can also be used to indicate a taxonomic designation:

When wrangling ostriches (Struthio camelus) people are advised that while they are a type of bird (Class: Aves), they are thought to be descendants of their extinct dinosaur (Suborder: Theropoda) relatives and share the same type of temperament.
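A possible tagging is sketched below. It marks only the binomial name; whether the higher ranks (Aves, Theropoda) are also tagged is assumed here to be a matter of house style.

```xml
<p>When wrangling ostriches (<i>Struthio camelus</i>) people are advised that while they are a type of bird (Class: Aves), they are thought to be descendants of their extinct dinosaur (Suborder: Theropoda) relatives and share the same type of temperament.</p>
```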

The i element can also be used to designate the name of a ship:

The MV Rena was a container ship that ran aground near Tauranga, New Zealand, resulting in an oil spill.
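As markup, assuming that only the name itself (and not the “MV” prefix) is set off, this could be:

```xml
<p>The MV <i>Rena</i> was a container ship that ran aground near Tauranga, New Zealand, resulting in an oil spill.</p>
```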

It can also be used to indicate a new or technical term the first time it is introduced:

Immediately prior to undergoing an MRI, a doctor may inject a contrast agent called the gadolinium contrast medium into the patient. This ‘dye’ highlights the part of the body being scanned and can provide more information to the radiologist who is assessing the patient’s problem.
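A sketch of the markup, assuming the newly introduced term is “gadolinium contrast medium”:

```xml
<p>Immediately prior to undergoing an MRI, a doctor may inject a contrast agent called the <i>gadolinium contrast medium</i> into the patient. This ‘dye’ highlights the part of the body being scanned and can provide more information to the radiologist who is assessing the patient’s problem.</p>
```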

Cheers!

Keith Schengili-Roberts

Market Researcher and DITA Evangelist

IXIASOFT

825 Querbes, Suite 200, Montréal, Québec, Canada, H2V 3X1

tel + 1 514 279-4942 / toll free + 1 877 279-4942

robertsk@ixiasoft.com / www.ixiasoft.com
