When we define a new data format, there are some tasks that have to be solved again and again:
That is a lot of work to do in each case, and I haven't even mentioned the concrete problem to be solved.
XML 1.0 solves almost all of these general problems. You probably don't even have to learn the syntax, because it is a simplified form of the syntax you know from HTML. An additional benefit is that the formal definition is stored in a language-independent way in a separate document (the DTD). A programmer who wants to use the format doesn't even have to implement the constraints, since they are checked by a parser with well-defined APIs.
When programmers who work on e-commerce or music notation saw it, they immediately knew and agreed that this was what they needed. The main problem is that XML 1.0 emerged from a (text) document-oriented domain. This is the reason why there is no way to define data types, use inheritance, define keys, or reuse general modules in different formats.
These problems are addressed by XML Schema. Oracle wants to be a leader in e-commerce and proves its ability to do so by providing the first experimental parser for XML Schema (v 0.9 Alpha). It is written in Java and follows the XML Schema Working Draft of 25 February 2000.
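To illustrate the missing data types: with a DTD alone, an attribute value reaches the application as plain text, so a numeric check has to be written by hand. The following is a minimal sketch in Python, not part of MusiXML. The `number` attribute on `measure` and the function name `measure_numbers` are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

def measure_numbers(xml_text):
    """Collect the (hypothetical) 'number' attribute of every <measure>.

    A DTD can only declare such an attribute as character data, so the
    integer check has to be done here, in application code.
    """
    numbers = []
    for measure in ET.fromstring(xml_text).iter("measure"):
        raw = measure.get("number")  # always a string, or None if missing
        try:
            numbers.append(int(raw))
        except (TypeError, ValueError):
            raise ValueError(f"measure number is not an integer: {raw!r}")
    return numbers
```

A document containing `<measure number="twelve"/>` is perfectly well formed, and only the hand-written check above rejects it. A schema language with data types moves this kind of check into the parser.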
MusiXML is a music notation format by Gerd Castan that is based on XML.
See the DTD and an Example that uses it. This example should work with any XML 1.0 compliant parser or editor.
See the Schema (last update 11 April 2000) and an Example that uses it. This is an experimental implementation based on the XML Schema draft of 25 February 2000. It has been tested with Oracle's XML Schema parser (0.9 Alpha). Thanks to Oracle for providing this parser.
The MusiXML schema will change when the W3C changes XML Schema, when Oracle makes changes to its parser, when a more compliant parser becomes available, or as my experience grows.
One main goal is to store each piece of data only once instead of keeping several copies consistent. We also adopt what a minority of webmasters do on the WWW: separating content from style. Assume we represent the sheet music for a symphony orchestra. To see the problem, we need only the score and the violin 1 part. Using a hierarchical representation, the score looks like this:
work -> page -> system -> staff -> measure -> <content>
and the violin 1 sheets look like this:
work -> page -> system -> staff -> measure -> <content>
Both have the same structure. They have different instances of page, system and staff, but they share parts of <content>. The structure of the graphical hierarchy makes it necessary to store two copies of <content>. The simple idea is to store <content> in a separate place, the logical domain, and to refer to it. Now our structure looks like this:
work -> page -> system -> staff -> measure -> reference to part of <content>
work -> page -> system -> staff -> measure -> reference to part of <content>
But the work embeds everything, so we change the structure to
<work>
<body>
<content>
</body>
<filter>
<extract> (rendering information for score with reference to <content>) </extract>
<extract> (rendering information for violin 1 with reference to <content>) </extract>
</filter>
</work>
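The idea of storing <content> once and referring to it from each <extract> can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `work`, `body`, `filter` and `extract` come from the structure above, while the `part` and `use` elements, the `id`, `ref` and `name` attributes, and the note text are invented for the example and are not the real MusiXML vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document. Only <work>, <body>, <filter> and
# <extract> are taken from the text; everything else is made up.
DOC = """
<work>
  <body>
    <part id="violin1">c d e f</part>
    <part id="cello">C G C G</part>
  </body>
  <filter>
    <extract name="score"><use ref="violin1"/><use ref="cello"/></extract>
    <extract name="violin 1"><use ref="violin1"/></extract>
  </filter>
</work>
"""

def resolve_extracts(xml_text):
    """Return {extract name: [content strings]} by following each reference."""
    work = ET.fromstring(xml_text)
    # Index the logical domain once, keyed by the assumed "id" attribute.
    content = {part.get("id"): part.text for part in work.find("body")}
    return {
        extract.get("name"): [content[use.get("ref")] for use in extract.findall("use")]
        for extract in work.find("filter").findall("extract")
    }

views = resolve_extracts(DOC)
```

Both extracts resolve to the same stored violin 1 content, so a change made in <body> shows up in the score and in the part without any copying.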
Since <extract> contains declarative instructions for how to process <content>, I omitted some levels of the hierarchy there to make processing easier. The instructions in <extract> have to be declarative, not something like "do this and then do that", since processing goes in both directions: <content> has to be rendered using these instructions, and <content> has to be changed using the same instructions if the user changes something on the screen. The logical domain <body> contains almost all the knowledge a music notation program can have about music. It's about the same information that is in the linear input mode of some music notation programs.
Then we have a graphical domain that is contained in the <filter> element. It contains <extract> elements. Each <extract> element defines a different printout: the score and each part that is to be printed. (In relational databases we would use the term view instead of extract, but we need the term view in the object-oriented context.)
There are many more XML music notation formats, but none of them uses XML Schema so far.
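Why declarative instructions can be used in both directions may be easier to see in a small Python sketch. It assumes a deliberately simplified extract: a list that says how many measures go on each system. That layout list and the function names `render` and `edit` are hypothetical and not taken from MusiXML.

```python
def render(content, layout):
    """Forward direction: split the shared measures into systems.

    `layout` is an assumed declarative description, e.g. [2, 3] means
    two measures on the first system and three on the second.
    """
    systems, start = [], 0
    for count in layout:
        systems.append(content[start:start + count])
        start += count
    return systems

def edit(content, layout, system, position, new_measure):
    """Backward direction: map a screen position to the shared content.

    The same layout that drove rendering tells us which stored measure
    the user touched, so the change is written to the single copy.
    """
    index = sum(layout[:system]) + position
    content[index] = new_measure

# One copy of the content, two different printouts.
content = ["m1", "m2", "m3", "m4", "m5"]
part_layout = [2, 3]
score_layout = [5]

edit(content, part_layout, system=1, position=0, new_measure="m3-edited")
score = render(content, score_layout)
part = render(content, part_layout)
```

Because the layout only states facts and contains no sequence of steps, it can be read forwards to draw the page and backwards to locate an edit. A script of drawing commands would not allow the second use.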
The only somewhat accepted music notation exchange format is the binary NIFF format. To give you a better understanding of NIFF, I mapped NIFF to XML.
This is not to blame Oracle. The parser *IS* alpha, and Oracle has done a great job at this early stage of XML Schema.
What works is defining elements, declaring complexTypes (named and unnamed), deriving (via equivClass), ref, and built-in types. The SAX API works fine. This is all you need to make an experimental implementation.
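For readers who have not used SAX, here is a minimal sketch of the event-driven style it implies. It uses the `xml.sax` module from the Python standard library, not Oracle's Java parser, and it does no schema validation. The class `ElementCounter` and the function `count_elements` are names made up for this example.

```python
import xml.sax

class ElementCounter(xml.sax.ContentHandler):
    """Count how often each element name occurs in a document."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser once for every opening tag.
        self.counts[name] = self.counts.get(name, 0) + 1

def count_elements(xml_text):
    """Parse a string with SAX and return {element name: occurrences}."""
    handler = ElementCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.counts
```

An application built this way never holds the whole document in memory. It reacts to each element as the parser reports it, which is enough for an experimental implementation.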
I had problems in the following cases: