DITA Proposed Feature #38

Bookmap / bkinfo revision (clean up and extend the structure; consolidate and organize metadata).

Longer description

The bookmap and bkinfo specializations have become more widely used within the user community, and the basic concept has been well accepted. As a result of this wide use, companies such as IBM, Nokia, and KONE have submitted requirements for changes to the existing bookmap/bkinfo. In addition, current DITA 1.1 proposals make it possible to refine the original design elegantly. This proposal is based on these requirements and on other contingent TC proposals.

Feature Owner(s)

Nancy Harrison, Chris Kravogel, JoAnn Hackos, Don Day

Requirements gathered from:
  • Chris Kravogel (representing KONE)
  • Indi Liepa (Nokia)
  • Simcha Gralla, Nancy Harrison (IBM)

Scope

Assumptions: Several proposed features will be inherited by bookmap from ditamap and topic metadata if certain other features are approved for 1.1. They are therefore listed here as related, but are not addressed in the scope of this proposal:
  • #41 Expanded content for shortdesc (applies to Indi's requirement for transitional text--specialize the shortdesc in topicmeta in map to keep "connective text" in the structural context, not in topics)
  • #48 Support change history and annotations in prolog (might be covered already in revised "bookmeta" element)
  • #14 Specialize glossary entry and definition elements
  • #45 Indexing elements (See, See Also)
The following original requirement is not clearly a dependency for bookmap (ask Kravogel).
  • #4 Use subset of OASIS xNAL standard for addresses
In addition, one accepted 1.1 feature is the basis for the proposed redesign of metadata:
  • #9 Specialize new DATA element from keyword

Note: The rest of this section consists of design notes that did not fit elsewhere in the proposal template.

The general constraints for the proposal were to review existing practice to discover any book structures that were not already in the bookmap definition, and to clean up the book metadata scattered across several locations. We found the general book structure to be basically sound, except for the additional design work needed to finish the "booklist" structures for collected data.

Upon review of the existing bookmap specialization as used in the DITA Open Toolkit's demo/book directory, and of the known requirements, it was clear that all requirements could be met through a series of mostly backwards-compatible updates: new element definitions and content model additions. These deltas provide a broader base capability that companies can specialize in order to constrain behaviors to their requirements (versus creating "parameter-driven" designs that introduce style into the design).

The main change from the original bookmap design is the consolidation of book metadata that had been previously spread across three different parts of markup into a unified book metadata design fully contained within the bookmap structure itself, making use of the new "data" element.

The lists requirement deals with how to specify both the location and content of special chapter-like constructs that generally have content derived by query during processing, such as a Table of contents, or lists of terms or endnotes. The balance here was for the design to allow author flexibility to specify which lists are important for a deliverable while leaving the ordering and location of these parts up to the policies of the house style (enforced by the transforms). The solution is a specialized topicgroup (booklists) at the start of the bookmap that has specialized topicrefs that represent the various booklist types. Authors can insert 0 or more of these "markers" into the booklists parent, where they represent instructions to the formatter to generate the appropriate content in the house style. A general override transform can offer call-templates that can be arranged in different order or commented out to effect a house policy for layout of such affordances. This increases the portability of the base bookmap between different processors.
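The booklists arrangement described above might look roughly like the following in an authored bookmap. This is a sketch under stated assumptions, not a proposed declaration: the marker names toc, figurelist, tablelist, and glossarylist are hypothetical (the proposal does not fix them), chapter is assumed to be available as in the existing bookmap, and the file names are invented.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. The children of booklists use hypothetical names;
     chapter and all file names are assumptions for illustration. -->
<bookmap>
  <bookmeta>
    <!-- consolidated book metadata would go here -->
  </bookmeta>
  <booklists>
    <!-- No href: the formatter generates the list in the house style. -->
    <toc/>
    <figurelist/>
    <tablelist/>
    <!-- href supplied: the referenced topic is used as the list content. -->
    <glossarylist href="glossary.dita"/>
  </booklists>
  <chapter href="introduction.dita"/>
  <chapter href="installing.dita"/>
</bookmap>
```

The order of the markers inside booklists carries no layout meaning in this reading; where each list lands in the output is left to the transform that enforces the house style.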

Two requirements (transitional text between topics and limited nesting) were felt to be examples of further specialization that an organization can impose/infuse into local practice. Transitional text can be effected by specializing a topicref so that its topicmeta/shortdesc is available for "inline" discourse between other topicrefs in a map. Nesting limits can be effected locally by specializing topics down to a cutoff depth, as with DocBook's numbered sections, or by creating XSLT that counts the levels and generates a report of out-of-bound nesting.
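The second option for nesting limits, an XSLT that counts levels and reports out-of-bound nesting, could be sketched as follows. This is a minimal XSLT 1.0 illustration, not part of the proposal: the max-depth parameter and the message wording are invented, and the stylesheet assumes the map is parsed with its DTD so that the defaulted class attributes are present.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical parameter: deepest nesting level the house style allows. -->
  <xsl:param name="max-depth" select="3"/>

  <xsl:template match="/">
    <!-- Visit every element that is, or is specialized from, topicref. -->
    <xsl:for-each select="//*[contains(@class, ' map/topicref ')]">
      <!-- Depth = this element plus all of its topicref-based ancestors. -->
      <xsl:variable name="depth"
          select="count(ancestor-or-self::*[contains(@class, ' map/topicref ')])"/>
      <xsl:if test="$depth &gt; $max-depth">
        <xsl:text>Nesting too deep (level </xsl:text>
        <xsl:value-of select="$depth"/>
        <xsl:text>): </xsl:text>
        <xsl:value-of select="@href"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Because the test matches on the class attribute, grouping elements specialized from topicref (such as a topicgroup) also count as a level; a local policy that ignores them would need an extra predicate.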

Use Cases

Use Case 1: First Contact with Bookmap
On first contact, bookmap/bkinfo looked like a promising specialization. However, the meaning of several elements and structures was difficult to understand, and there was no guidance on how to use them. A language reference guide would help a lot.
Use Case 2: Unlimited Nesting
Bookmap allows unlimited nesting of topics. This degree of freedom results in poor linear documents. More control or constraints would help.
Use Case 3: Authoring book metadata
As the author opens a new bookmap, the specialized topicmeta element for the specialized map offers a set of specialized sections for the book metadata. All information that was formerly spread across three locations is now consolidated in a single location for ease of maintenance, less duplication of effort, and simpler documentation.
Use Case 4: Print on demand booklets
Users reading at a Web-based Information Center will select some articles and arrange the desired sequence, then submit the collection for printing on demand. The topics are actually aggregated by bookmap and processed into PDF that has a basic cover, ToC, all collected copyright attributions and terms of use, and "chapters" of content. The PDF is returned within seconds to the user either by URL or in email.

Technical Requirements

The current collection of bookmap-only requirements that inform the proposed design is:

  • Allow for logical grouping of metadata for ease of authoring, ease of review, and ease of reuse. This involves consolidation of metadata scattered between bkinfo, bkbasicinfo, and bkinfo/prolog:
    • Cover
    • Title and book information
    • Author information
    • Copyright information
    • Publisher information
    • Trademark information
    • Legal information
    • Book Change History
  • General updates:
    • Support multiple lists in a consistent way, for both generated and user-supplied content (e.g., for ToC, Tables, Figures, Abbreviations, Glossary, Index, Bibliography)
    • Documentation: A language reference and user guide for bookmap
  • Specific updates (some of these are addressed in the consolidation exercise):
    • Allow for division of the book title information into library and title.
    • Bookmap title should provide in-line semantics. A title element is required in preference to an attribute.
    • Change the naming of the firstname and lastname elements to match current standards: givenname, familyname (supported by reference to #4, xNAL).
    • Add a special notices section to allow for specific notices to be placed at the front of the document.
    • Add a group of navigation list elements to specify the addition of automated or compiled content lists for document navigation (TOC, figure and table lists). Add a group of collections to specify the addition of automated or compiled collections of information (glossaries, trademarks, etc.).
    • Add more address elements, e.g., pobox (may be covered by #4).
    • Add more elements to organization and name, e.g., www, fax, e-mail (may be covered by #4).
    • Add more elements to bkhistory, e.g., changerequestcode, translation history (may be covered by #48).
Implementing these updates to the existing bookmap base (DTDs, Schemas, tools) will generally require:
  1. Addition of some new elements into existing content models (backwards compatible)
  2. Introduction of a new domain element (backwards compatible)
  3. Modification of one element from PCDATA to contain structured title content.
  4. Use of the newly defined <data> element as the archetype for extending new metadata structures in the bookmap's topicmeta, to consolidate and enrich the book metadata structures.

Costs

  • Time spent defining and refining the proposal--estimating 4 telecons, exchanged notes, and time/effort spent on review readings.
  • Updating any existing bookmap implementations. A number of bookmap implementations already exist. The cost of updating them to cover the changes to the DTDs is felt to be manageable since the main impact is to metadata and revisions to its use in existing templates. Creating any new book application from scratch is a major effort; but modifying an existing base becomes a matter of management of deltas only.
  • Testing. There are many permutations of bookmap/bkinfo content. The design of the model is intended to be general enough to accommodate most requirements. We will limit testing to specific test cases provided by DITA TC member companies. Some tests might be done within company firewalls and the results reported back to the TC. Due to considerable field testing of the demo version of bookmap (including its use in several products already), some degree of validation is already noted.
  • Documentation. A Reference Guide for the language and a User Guide for the processing tools and best practices must be developed. Given the number of elements in the design, and at a rate of about 3 topics per day, this effort will be substantial, perhaps 3 person-months. It might be conducted in part in the open source or DITA community venues.

Benefits

The use of the consolidated bookmap would allow users to create a cohesive linear document out of their topics. They would be able to surround their current DITA topics with the information required by their organizations for published materials.

The bookmap proposal enables existing maps of information to be conrefed into bookmap structures, making it faster to produce a book based largely on existing ditamaps.

Time Required

  • Definition phase: 2 months
  • Testing the DTDs (check on using Robert Anderson's "hole punching" tool and process)
  • Documentation: about 3 person-months (can be co-developed by many contributors in a shorter time)
  • Implementation phase (for vendors, tool writers): 3 months from the TC's affirmation of direction (Candidate Release DTDs). For the existing reference implementation in the DITA Open Toolkit, this will be a reasonable delta because of the existing base; for others starting from scratch, the effort might be greater.

Structures


Table 1. bookmeta, booklists, notices
New structure | Naming influences | Legacy fragment | Proposed declaration | Content model
bookmeta | after "topicmeta for books" | bkinfo, prolog, bkbasicinfo | specialized topicmeta, first in bookmap | The structures that follow
booklists | after "lists for books" | not previously implemented | specialized topicgroup, first in bookmap following bookmeta, populated with specialized topicrefs | ToC, figures, tables, glossary terms, etc. Zero or more occurrences (an empty parent is okay). No @href means generate the content; a provided @href means use that content.
special notices | peer to edition notices; contains terms and other legal/policy info | not previously implemented | specialized topicref | href to any topic type



Table 2. Author information
New structure | Naming influences | Legacy fragment | Proposed declaration | Content model
Author Information | . | none - container | bookmeta/authorInformation | (person|organization)*, contact*
person | DITA, DocBook, IBMIDDoc | bkinfo/bkinfobody/bkhistory/bkauthored/person | bookmeta/authorInformation/person | honorific?, givenname*, middlename*, familyname*, lineage?, address*, phone*, resource*, summary?, affiliations?, otherinfo*
organization | . | bkinfo/bkinfobody/bkhistory/bkauthored/organization | bookmeta/authorInformation/organization | orgname?, address*, phone*, resource*, summary?, otherinfo*
contact | . | . | bookmeta/authorInformation/contact | (person|organization)*
phone | DITA, DocBook, IBMIDDoc | person/phone, organization/phone | bookmeta/person/phone, bookmeta/organization/phone | office*, fax*, cellular*, URL*


Note: This leaves out the "Contact department" request. I am not sure where to list that - should it be part of the <person> model? An addition to <contact>?

Note: Unsure how to set the model of the phone element - how to identify the different types? Different numbers, or should there be a marker (such as <cellular/>) inside the phone element?

Note: The content model for person, organization, and elements within those is the same in the original bkinfo and in this new proposal.
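As an illustration of the author information content models in Table 2, an instance might look like the sketch below. All names and numbers are invented sample data. Because the phone model is still an open question (see the note above), the example assumes one possible reading in which office, fax, and cellular are child elements that each hold a number.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: sample data is invented, and the text content of
     the phone children is an assumption, not a settled design. -->
<bookmeta>
  <authorInformation>
    <person>
      <honorific>Dr.</honorific>
      <givenname>Ada</givenname>
      <familyname>Example</familyname>
      <phone>
        <office>+1 555 0100</office>
        <cellular>+1 555 0101</cellular>
      </phone>
    </person>
    <organization>
      <orgname>Example Corporation</orgname>
      <phone>
        <fax>+1 555 0102</fax>
      </phone>
    </organization>
    <!-- contact repeats the (person | organization)* model -->
    <contact>
      <organization>
        <orgname>Example Corporation Documentation Group</orgname>
      </organization>
    </contact>
  </authorInformation>
</bookmeta>
```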


Table 3. Publisher information
New structure | Naming influences | Legacy fragment | Proposed declaration | Content model
Publisher Information | . | none - container | bookmeta/publisherInformation | (person|organization)*, bkprintloc*, bkpublished*, colophon?
bkpublished | . | bkinfo/bkinfobody/bkhistory/bkpublished | bookmeta/publisherInformation/bkpublished | same as in bkinfo
colophon | . | . | bookmeta/publisherInformation/colophon | empty - this would be a reference to the topic that describes the colophon


Note: The bkpublished information has been moved here from bkhistory, where it was located in the original bkinfo.
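A publisher information block following Table 3 might be authored roughly as shown below. This is a hedged sketch: the organization name and file name are invented, and the use of an href attribute on the empty colophon element is an assumption about how the reference to the colophon topic would be expressed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: sample data is invented; href on colophon is assumed. -->
<bookmeta>
  <publisherInformation>
    <organization>
      <orgname>Example Press</orgname>
    </organization>
    <!-- bkprintloc and bkpublished elements would follow here; their
         content is carried over from bkinfo and is not shown. -->
    <colophon href="colophon.dita"/>
  </publisherInformation>
</bookmeta>
```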


Table 4. Book History
New structure | Naming influences | Legacy fragment | Proposed declaration | Content model
book change history | DITA | bkinfo/bkinfobody/bkhistory | bookmeta/bkChangeHistory | ((%bkreviewed;)|(%bkedited;)|(%bktested;)|(%bkapproved;)|(%bkevent;))*


Title Information / Book Identification


Table 5. Title Information
New structure | Naming influences | Legacy fragment | Proposed declaration | Content model
BookTitle | DITA, IBMIDDoc, DocBook | No global container | bkbooktitle | .
Book Library | DITA, IBMIDDoc, DocBook | bkinfo/bkid/bkvolume/bklibrary | bkbooktitle/bklibrary | .
Book Title | DITA, IBMIDDoc, DocBook | bkinfo/title | bkbooktitle/bktitle | .
Short/Abbreviated Title | DocBook (titleabbrev); IDDoc (stitle) | bkinfo/bktitlealts/bktitleabbrev | bkbooktitle/bktitlealt | .
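The title structure in Table 5 could be authored along the following lines. This is only a sketch: the title text is invented, and the keyword element inside bktitle assumes that the structured title content called for in the requirements admits inline elements of that kind.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: title text is invented; inline keyword is an assumption. -->
<bkbooktitle>
  <bklibrary>Example Product Library</bklibrary>
  <bktitle>Installing the <keyword>Example Server</keyword></bktitle>
  <bktitlealt>Installation</bktitlealt>
</bkbooktitle>
```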



Table 6. Book Identification
New structure | Naming influences | Legacy fragment | Proposed declaration | Content model
Book identification | DITA, IBMIDDoc, DocBook | bkinfo/bkid | bookmeta/bkid | .
Book/document Number | DITA, IBMIDDoc, DocBook | bkinfo/bkid/bknum | bookmeta/bkid/bknum | .
ISBN | DITA, IBMIDDoc, DocBook | bkinfo/isbn | bookmeta/bkid/isbn | .
Part number | . | bkinfo/bkpartno | bookmeta/bkid/bkpartno | .
Volume number | . | bkinfo/bkid/bkvolume/bkvolid | bookmeta/bkid/bkvolume | .
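To make the consolidation in Table 6 concrete, the identifiers that were previously spread across bkinfo/bkid, bkinfo/isbn, and bkinfo/bkpartno would all sit under a single bkid. The sketch below uses invented placeholder values; the order of the children and their plain-text content are assumptions, since the table leaves the content models open.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: all values are placeholders; child order is assumed. -->
<bookmeta>
  <bkid>
    <bknum>DOC-0001</bknum>
    <isbn>0-000-00000-0</isbn>
    <bkpartno>PN-0001</bkpartno>
    <bkvolume>1</bkvolume>
  </bkid>
</bookmeta>
```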


Copyright


Table 7. Copyrights
New structure | Naming influences | Legacy fragment | Proposed declaration | Content model
bkrights | DITA, IBMIDDoc, DocBook | bkinfo/bkrights | bookmeta/bkrights | .