Document:
Semantic Representations of the UN/CEFACT CCTS-based Electronic Business Document Artifacts

Draft (A preliminary unapproved sketch, outline, or version.)

Details

Submitted By Ms. Asuman Dogac on 2008-09-24 2:13 pm UTC

Group / Folder

OASIS Semantic Support for Electronic Business Document Interoperability (SET) TC / Standards

Description

The purpose of this SET TC deliverable is to provide standard semantic representations of
electronic document artifacts based on UN/CEFACT Core Component Technical Specification
(CCTS) and hence to facilitate the development of tools to support semantic interoperability.
The basic idea is to explicate the semantic information that is already given both in the
CCTS and the CCTS based document standards in a standard way to make this
information available for automated document interoperability tool support.

UN/CEFACT CCTS specifies the semantics of document artifacts in several dimensions: through
the Core Components Data Types, the structure of the core components, the naming convention
used, the context, the Business Information Entities, and the code lists. However, these
semantics are currently available only through text-based search mechanisms.

To help with the interoperability of the document artifacts, we explicate the CCTS based
business document semantics. By "explicating", we mean defining their semantic properties as
an ontology in a formal, machine processable language; the Web Ontology Language (OWL) is
used for this purpose. Note that in defining the semantic properties of document artifacts,
we kept the "context" semantics to an absolute minimum, since UN/CEFACT UCM is working on
this subject.

The semantics are explicated at two levels. At the first level, an upper ontology describing the
CCTS document content model is specified. Upper ontologies for the prominent CCTS based
standards, namely GS1 XML, OAGIS 9.1 and UBL, are also developed at this level, and the
equivalence relationships between the classes of the CCTS upper ontology and those of the
CCTS based document standard ontologies are defined. These relationships are later used to
find the similarities among document artifacts from different document schemas.

At the next level, the semantics of the document schemas in each standard are described
based on its upper ontology. The difference between the document schema specific ontology
and the upper ontology is that the upper ontology describes the generic entities in a document
content model whereas document schema ontologies describe the actual document artifacts
as the subclasses of the classes in the upper ontology.
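The two levels can be pictured with a minimal Python sketch. It is not the SET ontology itself: the class names below (such as UBL.AggregateBIE or GS1.NameAndAddress) are illustrative assumptions, and plain dictionaries stand in for OWL equivalence and subclass axioms.

```python
# Level 1: upper ontologies. Each standard's generic entity is declared
# equivalent to a class of the CCTS upper ontology (illustrative names).
EQUIVALENT_TO_CCTS = {
    "UBL.AggregateBIE": "CCTS.AggregateBIE",
    "GS1.AggregateBIE": "CCTS.AggregateBIE",
    "UBL.BasicBIE": "CCTS.BasicBIE",
}

# Level 2: document schema ontologies. Actual document artifacts are
# subclasses of the generic classes in their own standard's upper ontology.
SUBCLASS_OF = {
    "UBL.Address": "UBL.AggregateBIE",
    "UBL.StreetName": "UBL.BasicBIE",
    "GS1.NameAndAddress": "GS1.AggregateBIE",
}


def ccts_kind(artifact):
    """Return the CCTS upper ontology class that an artifact falls under."""
    upper_class = SUBCLASS_OF[artifact]
    return EQUIVALENT_TO_CCTS.get(upper_class, upper_class)


def same_kind(a, b):
    """True when two artifacts share the same generic CCTS class."""
    return ccts_kind(a) == ccts_kind(b)
```

In this sketch, same_kind("UBL.Address", "GS1.NameAndAddress") holds. That only says both artifacts are aggregate entities in the CCTS content model; it does not make them equivalent.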

Furthermore, we explicate some semantics related to the different usages of document data
types in different document schemas, so that the reasoner can derive the desired
interpretations from semantics that would otherwise remain informal. The intention is to give
the reasoner the same information that humans use in transforming document schemas into one
another.

When these ontologies are harmonized using a DL reasoner, the computed inferred ontologies
reveal the implicit equivalence and subsumption relationships between the document artifacts.
In other words, the shared semantic properties of the CCTS based document artifacts, together
with the inferred implicit relationships, help to identify their similarities. As expected,
the harmonized ontology by itself can only discover the equivalence of document artifacts
that are similar both semantically and structurally. Yet different document standards use
core components in different structures. Semantic properties of document artifacts are not
enough to find the similarity of structurally different but semantically equivalent document
artifacts; possible differences in structure must be provided through heuristics to enhance
the practical use of the specified semantics. These heuristics describe possible ways of
organizing core components into compound artifacts and are given as predicate logic rules.
Note that a DL reasoner by itself cannot process predicate logic rules, so we resort to the
well accepted practice of using a rule engine to execute the more generic rules and carrying
the results back to the DL reasoner through purpose-built wrappers. The results take the form
of further class equivalences declared in the ontology.
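As a rough illustration of such a structural heuristic, the Python sketch below applies one hypothetical rule: two compound artifacts are candidates for equivalence when they contain the same basic components, however those components are nested. The rule, the artifact names and the dictionary layout are assumptions made for this example; the actual SET rules are defined in the specification and run in a rule engine.

```python
from itertools import combinations


def leaves(artifact):
    """Return the set of basic components reachable from a compound artifact."""
    found = set()
    for child in artifact["children"]:
        if isinstance(child, dict):
            found |= leaves(child)  # descend into a nested compound artifact
        else:
            found.add(child)        # a basic component, given by name
    return frozenset(found)


def structural_equivalences(artifacts):
    """Hypothetical rule: identical leaf sets give a candidate class equivalence."""
    return [
        (a["name"], b["name"])
        for a, b in combinations(artifacts, 2)
        if leaves(a) == leaves(b)
    ]


# The same information organized flat in one standard and nested in another.
flat = {"name": "A.Address", "children": ["Street", "City", "Country"]}
nested = {
    "name": "B.Address",
    "children": ["Street", {"name": "B.Location", "children": ["City", "Country"]}],
}
other = {"name": "B.Contact", "children": ["Telephone"]}

candidate_pairs = structural_equivalences([flat, nested, other])
```

Each pair in candidate_pairs corresponds to a class equivalence that a wrapper would assert in the ontology before the DL reasoner is run again.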

Finally, the similarities discovered among the document artifacts are used to automate the
mapping process by generating XSLT rules.
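To make this last step concrete, the following Python sketch turns a small set of discovered correspondences into an XSLT stylesheet. It is a simplified illustration, not the SET generator: the function name build_stylesheet, the element names and the one-to-one field mapping are assumptions, and XML namespaces of the source and target documents are ignored.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def build_stylesheet(source_root, target_root, field_map):
    """Build an XSLT 1.0 stylesheet that copies mapped fields into a new document."""
    ET.register_namespace("xsl", XSL_NS)
    sheet = ET.Element(f"{{{XSL_NS}}}stylesheet", {"version": "1.0"})
    template = ET.SubElement(
        sheet, f"{{{XSL_NS}}}template", {"match": f"/{source_root}"}
    )
    target = ET.SubElement(template, target_root)
    for source_name, target_name in field_map.items():
        # One output element per discovered equivalence, filled from the source.
        out = ET.SubElement(target, target_name)
        ET.SubElement(out, f"{{{XSL_NS}}}value-of", {"select": source_name})
    return ET.tostring(sheet, encoding="unicode")


# Hypothetical equivalences between the fields of two address artifacts.
stylesheet = build_stylesheet(
    "Address", "NameAndAddress", {"StreetName": "Street", "CityName": "City"}
)
```

The resulting string is a well-formed stylesheet with a single template. Structurally different artifacts would need nested templates, which this sketch leaves out.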

The SET harmonized ontology contains about 4758 Named OWL Classes and 16122 Restriction
Definitions conforming to the specification described in this document. It consists of the
following:
- All of the CCs/BIEs in UN/CEFACT CCL 07B.
- All of the BIEs in the common library of UBL 2.0.
- All of the common library of GS1 XML.
- OAGIS 9.1 Common Components and Fields.

The harmonized ontology expresses the relationships among the document artifacts of
UN/CEFACT CCL, UBL 2.0, OAGIS 9.1 and GS1 XML according to SET specifications. It is
publicly available from
http://www.srdc.metu.edu.tr/iSURF/OASIS-SET-TC/ontology/HarmonizedOntology.owl
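Figures of this kind can be checked with a short script. The Python sketch below counts named classes and restriction definitions, assuming the ontology is serialized as RDF/XML (the source does not state the serialization). The embedded sample and its example.org IRIs are made up for illustration and are not taken from the harmonized ontology.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def count_definitions(rdf_xml):
    """Return (named class count, restriction count) for an RDF/XML ontology."""
    root = ET.fromstring(rdf_xml)
    # A class is "named" when it carries rdf:about or rdf:ID; a set avoids
    # counting the same class twice if it is described in several places.
    named = {
        el.get(f"{{{RDF}}}about") or el.get(f"{{{RDF}}}ID")
        for el in root.iter(f"{{{OWL}}}Class")
    }
    named.discard(None)  # anonymous class expressions have neither attribute
    restrictions = sum(1 for _ in root.iter(f"{{{OWL}}}Restriction"))
    return len(named), restrictions


# Made-up sample: two named classes, one of them with a single restriction.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/set#Address">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="http://example.org/set#contains"/>
        <owl:someValuesFrom rdf:resource="http://example.org/set#StreetName"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/set#StreetName"/>
</rdf:RDF>"""

sample_counts = count_definitions(SAMPLE)
```

To run the same count on a downloaded copy of the ontology, read the file contents and pass them to count_definitions.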

Regarding performance, one issue that needs to be addressed is whether the gain in automation
justifies the resources needed to develop the ontological representation of the document schemas. To
reduce this cost, we provide the SET XSD-OWL Converter tool, which creates OWL definitions of
document schemas. It converts a CCTS based document schema into an OASIS SET TC OWL
definition and is publicly available from
http://www.srdc.metu.edu.tr/iSURF/OASIS-SET-TC/tools/OASISSET.zip
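The general idea of such a conversion can be sketched in a few lines of Python. This is not the SET XSD-OWL Converter and does not reproduce its output: the function name, the tuple-based axiom format and the "contains" property are hypothetical. The sketch only shows a named complex type becoming a class, with each child element becoming a restriction on that class.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def complex_types_to_axioms(xsd_text):
    """Turn named XSD complex types into simple class and restriction tuples."""
    root = ET.fromstring(xsd_text)
    axioms = []
    for ctype in root.iter(f"{{{XS}}}complexType"):
        name = ctype.get("name")
        if name is None:
            continue  # skip anonymous types in this simplified sketch
        axioms.append(("Class", name))
        for element in ctype.iter(f"{{{XS}}}element"):
            child = element.get("name") or element.get("ref")
            # Hypothetical "contains" property linking a type to its parts.
            axioms.append(("SomeValuesFrom", name, "contains", child))
    return axioms


# Made-up schema fragment used only to exercise the function.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="StreetName" type="xs:string"/>
      <xs:element name="CityName" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

sample_axioms = complex_types_to_axioms(SAMPLE_XSD)
```

A real converter would also have to classify each artifact under the appropriate upper ontology class, which requires the CCTS naming and typing information that this sketch ignores.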

Note that, because all document schema ontologies conform to a standard ontological representation
and therefore share a common pool, users of the harmonized ontology need to create a
document schema ontology only if it is not already in the harmonized ontology, and they benefit from
all the existing connections when they do so.

Another performance issue is the computational complexity of the reasoning process
involved. On a PC with 2GB RAM, the Racer Pro 1.9.2 Beta reasoner takes about 120 seconds to
compute the harmonized ontology. Since the harmonized ontology is re-computed only
when a new document schema or a new CCTS based upper document ontology is introduced to the
system, this performance is quite acceptable. SET TC Members will receive a password to use
Racer Pro for free for three months.

This work will be discussed and further enhanced in the SET TC, and technical support will be
provided to SET TC Members who develop their own use cases using the harmonized
ontology. The SET XSD-OWL Converter tool can be used to generate the OWL definitions of
their own document artifacts. The aim is to demonstrate the feasibility and practicability of the
specifications in order to encourage industry uptake.

Status:
This document is an OASIS Semantic Support for Electronic Business Document
Interoperability (SET) TC Working Draft Profile. The work by the Editors is carried out within
the scope of the ICT 213031 iSURF Project (http://www.iSURFProject.eu), sponsored by
the European Commission, DG Enterprise Networking Unit
(http://cordis.europa.eu/ist/ict-ent-net/index.html).

Committee members should send comments on this specification to the set@lists.oasis-open.org
list. Others should subscribe to and send comments to the set-comment@lists.oasis-open.org
list. To subscribe, send a blank email message to set-comment-subscribe@lists.oasis-open.org.
Once you confirm your subscription, you may post messages at any time.