The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: May 28, 2002
XML Articles and Papers. July - September 2001.

XML General Articles and Papers: Surveys, Overviews, Presentations, Introductions, Announcements

References to general and technical publications on XML/XSL/XLink are also available in several other collections:

The following list of articles and papers on XML represents a mixed collection of references: articles in professional journals, slide sets from presentations, press releases, articles in trade magazines, Usenet News postings, etc. Some are from experts and some are not; some are refereed and others are not; some are semi-technical and others are popular; some contain errors and others don't. Discretion is strongly advised. The articles are listed approximately in the reverse chronological order of their appearance. Publications covering specific XML applications may be referenced in the dedicated sections rather than in the following listing.

September 2001

  • [September 28, 2001] "RDF/Topic Maps: late/lazy reification vs. early/preemptive reification." By Steven R. Newcomb. Posting 2001-09-27. "For me, at least, the shortest, most compelling and cogent demonstration of a certain critical difference between Topic Maps and RDF was Michael Sperberg-McQueen's wrap-up keynote at the Extreme Markup Languages Conference (www.extrememarkup.com) last August. Michael brought colored ribbons and other paraphernalia to the podium, in order to illustrate his words... In the past, I myself have considered RDF as the competitor of Topic Maps. Happily, I was wrong -- at least in fundamental technical terms. Indeed, I now believe that if there were no RDF, the Topic Maps camp would have to invent something like it in order to make the Maps paradigm predictably comprehensible by the programmers who are pioneering the development of the Internet. There are other interesting comparisons to be made between RDF and Topic Maps, but ever since Michael's demonstration of the difference between early vs. late (preemptive vs. lazy) reification, I have been meaning to document both the difference and the demonstration..." See: (1) "Resource Description Framework (RDF)" and (2) "(XML) Topic Maps."

  • [September 24, 2001] "XML Schema Quick Reference Cards." Prepared by Danny Vint. See: (1) XML Schema - Structures Quick Reference Card, and (2) XML Schema - Data Types Quick Reference Card. XML-DEV posting: "I've just uploaded 2 quick reference cards that I built for the XML Schema Data types and Structures specifications. These cards are available in PDF format. If you download and print them, realize that they are set up for 8.5 x 14 paper. When you print these files, just set 'Fit to page' and Landscape mode to get a properly scaled copy of these documents. I'm also in the process of moving my 'XML Family EBNF Productions Help' to this new site as well as updating the content. This isn't completed; I'm currently showing the older version that I have previously published..." For schema description and references, see "XML Schemas."

  • [September 21, 2001] "Modeling XML Vocabularies with UML: Part II." By Dave Carlson. From XML.com. September 19, 2001. "Mapping UML Models to XML Schema: This is where the rubber meets the road when using UML in the development of XML schemas. A primary goal guiding the specification of this mapping is to allow sufficient flexibility to encompass most schema design requirements, while retaining a smooth transition from the conceptual vocabulary model to its detailed design and generation. A related goal is to allow a valid XML schema to be automatically generated from any UML class diagram, even if the modeller has no familiarity with the XML schema syntax. Having this ability enables a rapid development process and supports reuse of the model vocabularies in several different deployment languages or environments because the core model is not overly specialized to XML... The default mapping rules described in this article can be used to generate a complete XML schema from any UML class diagram. This might be a pre-existing application model that now must be deployed within an XML web services architecture, or it might be a new XML vocabulary model intended as a B2B data interchange standard. In either case, the default schema provides a usable first iteration that can be immediately used in an initial application deployment, although it may require refinement to meet other architectural and design requirements. The first article in this series presented a process flow for schema design that emphasized the distinction between designing for data-oriented applications versus text-oriented applications. The default mapping rules are often sufficient for data-oriented applications. In fact, these defaults are aligned with the OMG's XML Metadata Interchange (XMI) version 2.0 specification for using XML as a model interchange format. This approach is also well aligned with the OMG's new initiative for Model Driven Architecture (MDA). 
Text-oriented schemas, and any other schema that might be authored by humans and used as content for HTML portals, often must be refined to simplify the XML document structure. For example, many schema designers eliminate the wrapper elements corresponding to an association role name (but this also prevents use of the XSD <all> model group). This refinement and many others can be specified in a vocabulary model by setting a new default parameter for one UML package, which then applies to all of its contained classes..." See: (1) Part I of Carlson's article; (2) "Conceptual Modeling and Markup Languages"; (3) "XML Schemas."
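
The idea of generating a default schema from a class model can be illustrated with a small Python sketch. This is not Carlson's mapping or the XMI 2.0 rules: the `class_to_schema` function, the `XSD_TYPES` table, and the `PurchaseOrder` class are hypothetical, and the sketch assumes only that each UML attribute becomes a child element inside an `xs:sequence`.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"  # W3C XML Schema namespace

# Hypothetical mapping from UML primitive type names to XSD built-in types.
XSD_TYPES = {
    "String": "xs:string",
    "Integer": "xs:integer",
    "Boolean": "xs:boolean",
    "Date": "xs:date",
}


def class_to_schema(class_name, attributes):
    """Return an XSD document (as text) for one class.

    `attributes` is a list of (attribute name, UML type name) pairs.
    Each attribute becomes a child element in an xs:sequence.
    """
    ET.register_namespace("xs", XSD)  # serialize with the usual xs: prefix
    schema = ET.Element(f"{{{XSD}}}schema")
    # A top-level element declaration so instances of the class can be document roots.
    ET.SubElement(schema, f"{{{XSD}}}element", name=class_name, type=class_name)
    ctype = ET.SubElement(schema, f"{{{XSD}}}complexType", name=class_name)
    seq = ET.SubElement(ctype, f"{{{XSD}}}sequence")
    for attr_name, uml_type in attributes:
        ET.SubElement(seq, f"{{{XSD}}}element",
                      name=attr_name, type=XSD_TYPES[uml_type])
    return ET.tostring(schema, encoding="unicode")


if __name__ == "__main__":
    print(class_to_schema("PurchaseOrder",
                          [("orderDate", "Date"), ("quantity", "Integer")]))
```

A real generator would also have to handle associations, multiplicities, and inheritance, which is where the refinements discussed in the article come in.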

  • [September 21, 2001] "Being Too Generous." By Leigh Dodds. From XML.com. September 19, 2001. ['Microsoft's recent release of Internet Explorer 6 has already attracted criticism for its deprecation of the Netscape plugin API. This week in his XML-Deviant column Leigh Dodds takes a look at IE6's XML support, and relates how community criticism has been met with a positive response from Microsoft.'] "This week the XML-Deviant looks at some recent community criticism over the XML support in Internet Explorer, which has been resolved with some promising feedback from Microsoft. Despite its many and varied successes XML has still not achieved its aim of being 'SGML on the Web'. At least not within the most popular viewport of the Web, the browser. HTML is still the Web's lingua franca despite the desire of many in the XML community to see it be deprecated in favor of XHTML or CSS-styled XML documents. In other environments XML has been a runaway success, yet it is still having trouble gaining a foothold in user agents. Arguably RSS is the most successful XML format being displayed to users and processing even popular formats like SVG is handed off by browsers to optional plug-ins rather than being natively supported. There are a few reasons for this. The XML community has not made the effort it could to convince the web development community of the advantages of XML, leading to an image problem. Strong disagreements over the relative merits of XSLT and CSS have also displayed a lack of common vision for the role of XML in client-side document styling. There can be little doubt that the lack of good XML/XSLT/CSS support in recent browsers is the root cause of the problem. Which is ironic since the browser was instrumental in getting pointy-bracket parsers on millions of desktops around the world. Of course the situation is not completely bleak. XML processing capabilities are appearing in both major browsers.
The added irony is that Internet Explorer appears to be leading the way, despite the fact that the most widely regarded tools in the XML toolkit are open source, and despite MS XML parser's baroque installation modes..."

  • [September 21, 2001] "Writing SAX Drivers for Non-XML Data." By Kip Hampton. From XML.com. September 19, 2001. "In a previous column, we covered the basics of the Simple API for XML (SAX) and the modules that implement that interface in Perl. Over the course of the next two months we will move beyond these basic topics to look at two slightly more advanced ones: creating drivers that generate SAX events from non-XML sources and writing custom SAX filters. If you are not familiar with the way SAX works, please read High-Performance XML Parsing With SAX before proceeding. SAX is an event-driven API in which the contents of an XML document are accessed through callback subroutines that fire based on various XML parsing events (the beginning of an element, the end of an element, character data, etc.). For the purpose of this article, a SAX driver (sometimes called a SAX generator) can be understood to mean any Perl class that can generate these SAX events. In the most common case, a SAX driver acts as a proxy between an XML parser and the one or more handler classes written by the developer. The handler methods detailed in the SAX API are called as the parser makes its way through the document, thereby providing access to the contents of that XML document. In fact, this is precisely what SAX was designed for: to provide a simple means to access information stored in XML. As we will see, however, it is often handy to be able to generate these events from data sources other than XML documents... I'm certain that there are XML purists out there for whom this technique -- using a non-XML class to produce SAX event streams -- will seem like heresy. Indeed you do need to be a bit more careful when letting your own custom module stand in for an XML parser (for the reasons stated above), but, in my opinion, the benefits far outweigh the costs.
Writing custom SAX drivers provides a predictable, memory-efficient, easy way to take advantage of Perl's advanced built-in data handling capabilities and vast collection of non-XML parsers and other data interfaces to create XML document streams..."
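
The technique described above, firing SAX events from a non-XML source, can be sketched in a few lines. The article works in Perl; the sketch below uses Python's standard `xml.sax` modules instead, so it illustrates the idea rather than the article's code. The function names `csv_to_sax` and `csv_to_xml` and the element names `rows` and `row` are hypothetical, and the sketch assumes the CSV column headers are valid XML element names.

```python
import csv
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def csv_to_sax(csv_text, handler, root="rows", row="row"):
    """Act as a SAX driver: fire SAX events on `handler` for each CSV record.

    No XML parser is involved; the events are generated directly from the
    CSV data. Column headers are assumed to be valid XML element names.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    no_attrs = AttributesImpl({})
    handler.startDocument()
    handler.startElement(root, no_attrs)
    for record in reader:
        handler.startElement(row, no_attrs)
        for field, value in record.items():
            handler.startElement(field, no_attrs)
            handler.characters(value)  # the handler takes care of escaping
            handler.endElement(field)
        handler.endElement(row)
    handler.endElement(root)
    handler.endDocument()


def csv_to_xml(csv_text):
    """Use XMLGenerator as the handler to turn the event stream into XML text."""
    out = io.StringIO()
    csv_to_sax(csv_text, XMLGenerator(out, encoding="utf-8"))
    return out.getvalue()


if __name__ == "__main__":
    print(csv_to_xml("name,lang\nKip,Perl\n"))
```

Because the driver only calls the standard handler methods, any SAX handler or filter can be substituted for `XMLGenerator` without changing the driver.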

  • [September 21, 2001] "Tools for Dynamic Web Sites: ASP vs. PHP vs. ASP.NET." By Hans Hartman. In Seybold Report: Analyzing Publishing Technology [ISSN: 1533-9211] Volume 1, Number 12 (September 17, 2001). ['Are there practical reasons for choosing a scripting language? Or is it just a matter of taste? A developer who's built commercial sites using PHP and ASP describes the pros and cons of each. He also looks ahead to next-generation tools, ASP.NET and PHP version 5.'] "Creating database-driven Web sites used to be complex and time-consuming. Fortunately, several server-scripting tools were invented to make it easier for publishers to generate Web content automatically from their databases instead of manually coding it in HTML. Today, the most popular of these tools are PHP and ASP. Soon, the new ASP.NET, which Microsoft is developing as the centerpiece of its dot-NET initiative, may also be important. In this article, we'll compare all three technologies... With their first iterations launched in the mid-nineties, PHP and ASP are both mature technologies for creating database-driven Web sites. Their feature sets are comparable, but differ in two areas. First, ASP is a commercial technology, supported by Microsoft and commercial third parties, whereas PHP is open-source technology, supported by the open-source community and Zend. ASP is somewhat easier to learn, whereas PHP enables developers to create object-oriented code and, by modifying the source files that other programmers have already written, to create highly tailored modules without undue work. Second, PHP runs on a multitude of servers and platforms. ASP is limited to the IIS server and Microsoft operating systems. ASP.NET, whose arrival in the market is imminent, promises to be a faster and more efficient environment than ASP, and possibly, PHP. In addition, ASP.NET makes it easy to create SOAP- and XML-based Web services. But it, too, is limited to Microsoft platforms. 
Will the advantages of ASP.NET be enough to convert PHP developers? We doubt it. There is strong loyalty in the open-source community to PHP and the Apache server platform, and there are equivalent -- albeit not as easy -- tools for creating reusable object code, XML and SOAP protocols. A more interesting question is whether ASP.NET will attract new users who have no previous commitments to a Web server and OS. We think it might; it should be especially attractive where connecting to partner sites through XML and SOAP is high on the wish list."

  • [September 21, 2001] "Corel heads down the cross-media path. But will Micrografx and SoftQuad acquisitions be enough of a suite? [Cross-Media Publishing.]" By [Seybold Staff - TSR]. In Seybold Report: Analyzing Publishing Technology [ISSN: 1533-9211] Volume 1, Number 11 (September 03, 2001), pages 3, 29. ['Corel, perhaps best known as a vendor of shrink-wrapped software, has purchased XML pioneer SoftQuad. Now Corel plans to use its acquired XML technology as part of a cross-media publishing system. Inside, we tell you its chances for success.'] "Reduced to a second-tier player in the desktop application markets, Corel is making a bid to rebound as a leader in the nascent market for cross-media creation tools. In the past two months, Corel has announced its intention to buy Micrografx and SoftQuad and to piece together a new suite of cross-media tools for manipulating text and graphics... Corel announced plans to acquire SoftQuad in a stock deal worth about $34 million. One of the few suppliers of a complete XML-based authoring tool, SoftQuad gives Corel a next-generation word processing tool, as well as expertise that could be helpful to the WordPerfect team. Corel said it expects to continue WordPerfect development for the legal and government markets... SoftQuad's customer base continues to expand, with 167 new XMetaL customers reported in the last quarter, bringing the total number of XMetaL customers up to about 2,000. However, overall sales have been heavily dependent on key customers, including Cisco, which signed a license agreement valued at about $1 million. Unfortunately, SoftQuad's brand recognition has always been ahead of its sales. Its finances have rarely been healthy, and the company has struggled to produce a software hit from its core structured-authoring technology...
As far as Corel, seeing it hop from bandwagon to bandwagon does not instill confidence that its new-found affinity for cross-media publishing will last any longer than its short-lived love affair with Linux. The post-Cowpland management team at Corel has its work cut out for it proving that its spotty track record of the past decade is no indication of its future potential. We do believe server-based graphics engines and XML-based authoring are corporate applications poised for growth. But they are components, not a 'cross-media solution.' A solution encompasses not only authoring but also workflow, content management, production and delivery systems for multiple media -- typically print and Web. It's a stretch to believe that Ventura will be successfully resurrected for page composition, and Corel has no content-management system that would serve as the heart of a cross-media solution. Its plan to partner such vendors, as Arbortext found out, will leave it at the mercy of their painfully long sales cycles. Corel may have bought new products and established brand names, but it still faces a formidable challenge in turning those into profitable businesses..." [See the press release, a letter to SoftQuad shareholders from Roberto Drassinower, and the relevant FAQ document, PR alt URL]

  • [September 21, 2001] "PDF Collaboration In Action. [Acrobat-Based Collaboration: How Well Does It Work? Workflow.]" By Bernd Zipper and John Parsons. In Seybold Report: Analyzing Publishing Technology [ISSN: 1533-9211] Volume 1, Number 11 (September 03, 2001), pages 7-12. ['Paper-based approval processes are generally at odds with shrinking deadlines, multi-departmental reviews and the needs of cross-media production. While numerous vendors have developed network-savvy methods for viewing, annotating and approving electronic files, many of those systems are proprietary in nature and few fully support PDF. One of Acrobat 5's significant new features is the ability to add comments and digital signatures online, forming the basis of a PDF-based collaborative workflow. We tested the new features and examined their strengths and weaknesses.'] "One of the significant new features of Acrobat 5, released in April, was the ability to add comments and digital signatures online. Although Adobe's tools are rudimentary, they form the basis of a collaborative workflow that is based, not on a proprietary raster file, but on PDF, which is widely recognized as an open, flexible and data-intensive format. The basic workflow. Acrobat 5's online workflow means that a PDF can be uploaded to a Web server, viewed in a browser by any prospective collaborator, and annotated online. Multiple users may view online PDFs, but must upload and download comments as a separate step... A separate server is required to hold the FDF files created by this commenting process. Adobe provides two methods for doing this: designating a shared folder on a network server, or specifying the URL of a WebDAV server... Importing and exporting annotations (FDF files) is handled via the Comments pane, or from the File menu... Adobe developed FDF as a 'transport format' for flexible transfer of information. For example, it is used for transferring the contents of tables or fields. 
The FDF format is based on the syntax of PDF. Its descriptions of objects and data are similar to those used by PDF itself, and they offer many options for display. Using this transport format, it is possible to forward and exchange data collected from forms, notes and annotations, and even optical markings... With Acrobat, there are two possible solutions for uploading PDF files to a Web server. One is WebDAV, which is an extension of the HTTP 1.1 protocol... WebDAV (Web Distributed Authoring and Versioning) extends HTTP to add the capability of securing data on the server; members of a team can use WebDAV to work on the same document at the same time, without being in the same place. The shared access is implemented by functions such as file locking and version control. The locking feature allows a user to temporarily block access to a file while he or she is working with it. Once the changes are completed, it is unlocked again. Locking and unlocking happen automatically, controlled by WebDAV, to avoid a 'collision.' It is not necessary to maintain a network connection during the time the lock (called a 'persistent lock') is applied to a file. Thus, a file can be opened online and edited offline. Subsequently, the changes are 'written' to the server. WebDAV also provides for the association of properties with documents. These properties are metadata encoded as XML. WebDAV distinguishes between 'dead' and 'live' properties. Live properties are generated by the server itself, including such things as creation date and date of modification. Dead properties are name-value combinations that incorporate a URL and XML coding. In the case of Acrobat, these are online annotations... The full $249 version of Acrobat 5 is required for users to view online comments, even if the user does not need to make comments. Neither the free Reader nor the recently introduced Approval product ($39) can view WebDAV-hosted online comments... 
Acrobat's online capabilities have increased the potential for collaborative workflow, using a common file format and (at least for comments) the obvious strengths and popularity of WebDAV. As tantalizing as this potential is, however, we feel the process is unfinished, and that practical solutions are still to come..." See "WEBDAV (Extensions for Distributed Authoring and Versioning on the World Wide Web)."
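
The WebDAV locking and property mechanisms described above are carried as small XML bodies in HTTP requests. The following Python sketch builds the body of a LOCK request and of a PROPPATCH request that sets one dead property, using the `DAV:` namespace and element names from the WebDAV specification. It does not show what Acrobat actually sends: the helper names, the `urn:example:review` namespace, and the `comment` property are hypothetical.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"  # namespace of the WebDAV protocol elements
ET.register_namespace("D", DAV)


def lock_body(owner_href):
    """Body of a LOCK request asking for an exclusive write lock."""
    info = ET.Element(f"{{{DAV}}}lockinfo")
    scope = ET.SubElement(info, f"{{{DAV}}}lockscope")
    ET.SubElement(scope, f"{{{DAV}}}exclusive")
    lock_type = ET.SubElement(info, f"{{{DAV}}}locktype")
    ET.SubElement(lock_type, f"{{{DAV}}}write")
    owner = ET.SubElement(info, f"{{{DAV}}}owner")
    ET.SubElement(owner, f"{{{DAV}}}href").text = owner_href
    return ET.tostring(info, encoding="unicode")


def proppatch_body(prop_ns, prop_name, value):
    """Body of a PROPPATCH request that sets one dead property.

    The property is identified by a namespace URI plus a local name,
    and its value is stored by the server as given.
    """
    update = ET.Element(f"{{{DAV}}}propertyupdate")
    set_el = ET.SubElement(update, f"{{{DAV}}}set")
    prop = ET.SubElement(set_el, f"{{{DAV}}}prop")
    ET.SubElement(prop, f"{{{prop_ns}}}{prop_name}").text = value
    return ET.tostring(update, encoding="unicode")


if __name__ == "__main__":
    print(lock_body("mailto:reviewer@example.org"))
    print(proppatch_body("urn:example:review", "comment", "Check page 3"))
```

Live properties such as the modification date are computed by the server, so a client reads them with PROPFIND rather than setting them this way.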

  • [September 21, 2001] "ContentGuard Scales Back Operations. Will Discontinue Services and Concentrate on XrML." By Mike Letts. In Seybold Report: Analyzing Publishing Technology [ISSN: 1533-9211] Volume 1, Number 11 (September 3, 2001), page 31. "Hard times have fallen on high-profile digital rights management (DRM) provider ContentGuard, which has announced that it is in the process of discontinuing all of its service offerings, as well as several product lines. As a result, the company is directing customers to other service providers or vendors. In conjunction with the scaling back of operations, ContentGuard cut its workforce from about 90 employees to 30 and is slimming down or closing several of its smaller offices. For the foreseeable future, said Michael Miron, co-chairman of the board of directors and CEO of ContentGuard, the focus will fall almost solely on promoting the company's XrML rights language as a standard for the digital content industry... In addition to pushing forward with XrML, Miron noted that ContentGuard also plans on releasing a series of new development tools for customers that will allow them to integrate XrML with their own systems. Some will be free, and some will be for-purchase, said Miron. The first of these toolkits will be available 'soon,' he said, perhaps as early as this fall. In addition, some of the toolkits will allow industry participants to create extensions to the language, although Miron said the company has no plans to openly publish XrML or relinquish control of its licensing... [Editorial comment:] ContentGuard's inability to sell its services is indicative of the growing pains of the DRM market. Huge legislative and technical issues need to be ironed out before concrete revenues will be seen, so the attrition is sure to continue. Perhaps the question that should be asked is: If a company with the backing of Microsoft and Xerox can't make it, who can?" See: "Extensible Rights Markup Language (XrML)."

  • [September 21, 2001] "Talk SOAP. Building Devices that Communicate." By Amit Asaravala. In WebTechniques Volume 6, Issue 10 (October 2001), pages 35-37. "SOAP, an industry-backed device communications protocol, may make the 'semantic Web' a closer reality than you thought possible... Like most protocols, SOAP isn't a tangible product, but rather a set of rules that anyone can implement in a software client or server. In essence, SOAP lets your applications invoke methods on servers, services, components, and objects that lie at remote locations on the Internet. While other protocols like DCOM and IIOP/CORBA let you do similar things, they're limited in that they weren't designed specifically for the Internet and for communication between diverse companies and devices. Using DCOM to communicate between applications in two separate companies is a difficult task that first requires agreeing on ports, transfer protocols, and so on. SOAP, on the other hand, sits on top of existing HTTP connections. As most companies have Web servers configured for HTTP connections on standard port 80, most of the initial coordination is complete. Of course, companies will still need to share APIs for available objects and methods; but SOAP lets people focus on these APIs and the data that needs to be transferred, rather than on the trouble of getting two disparate systems to communicate. All you need is a SOAP-compliant client application on the one side and a SOAP-compliant server on the other. The server could be as simple as a Web server that checks the headers of incoming HTTP requests. If it finds a POST statement with a text/xml-SOAP content-type or SOAPAction header, it sends the statement to a SOAP engine that parses the command found within. There are numerous SOAP implementations available, including SOAP::Lite for Perl... Because it's infrastructure agnostic, SOAP is positioned to become the de facto standard in communications.
With its reliance on common protocols and languages such as HTTP and XML, SOAP promises to reduce the amount of coordination and development traditionally necessary to facilitate communication between two or more devices. In addition to its uses for network appliances, SOAP is being touted as the enabling component for Web services. This is one of the major reasons behind Microsoft's involvement with the SOAP specification. The company's .Net frameworks will use SOAP messages to send information between companies that have agreed to share data. eBay has already agreed to use the .Net framework to open up its auction databases. When the technology is in place, developers from other sites will be able to write auction applications that rely on live data from eBay's central database. In essence, SOAP is enabling the Web that we don't see. It's the technology that will help us realize a semantic, invisible Web that runs in the background, doing our bidding without our constant attention. So long, Web browsers." See "Simple Object Access Protocol (SOAP)."
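
The request flow described above (a client posts an XML envelope over HTTP, the server inspects the headers and hands the body to a SOAP engine) can be sketched with the Python standard library. This is a minimal illustration assuming SOAP 1.1 conventions: a `text/xml` content type and a `SOAPAction` header. The function names, the `urn:example:auction` namespace, and the `GetBid` method are hypothetical and are not part of SOAP::Lite or any real service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("soap", SOAP_ENV)


def build_envelope(method, params, service_ns="urn:example:auction"):
    """Client side: wrap a method call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def looks_like_soap(http_method, headers):
    """Server side: decide from the HTTP headers whether to route to the SOAP engine."""
    if http_method.upper() != "POST":
        return False
    lowered = {key.lower(): value for key, value in headers.items()}
    content_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    return content_type == "text/xml" or "soapaction" in lowered


def parse_call(envelope_xml):
    """A toy 'SOAP engine': extract the method name and parameters from the body."""
    root = ET.fromstring(envelope_xml)
    call = root.find(f"{{{SOAP_ENV}}}Body")[0]  # first child of Body is the call
    method = call.tag.split("}")[-1]            # strip the namespace part
    return method, {child.tag: child.text for child in call}


if __name__ == "__main__":
    request = build_envelope("GetBid", {"item": 42})
    if looks_like_soap("POST", {"Content-Type": "text/xml; charset=utf-8"}):
        print(parse_call(request))
```

A production server would also validate the envelope and return a SOAP Fault on errors; the sketch omits both.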

  • [September 21, 2001] "SVG Gets an Editor. [Product Review.]" By Lynne Cooney. In WebTechniques Volume 6, Issue 10 (October 2001), page 28. "Scalable Vector Graphics (SVG) is a graphics format based on XML, and is currently nearing completion at W3C. Support for SVG isn't as broad as for Flash and Shockwave yet, but you can expect that to change with broader industry and browser support. Preparing to grab hold of the emerging market, Jasc has released a beta version of WebDraw... I like the preset objects tool, which lets you place simple arrows, hearts, and other graphics into your document from a library. To edit the points or nodes of these objects you must select the object and choose Convert to Paths. There are four primary drawing tools. You can use the Line tool to draw straight lines. With the Polyline tool you can draw irregular polylines or polygons. The FreeHand tool lets you draw a path freely, without clicking for each point. And finally, the Path tool is similar to the default pen found in vector illustration tools such as Illustrator and FreeHand... Overall, I think Jasc WebDraw is an excellent low-cost tool for creating simple SVG code. It would be a great asset to anyone creating scripts to output SVG on the fly, or to build consistent, styled headlines on a Web site. It lacks Illustrator's higher end features, but you can't beat the price -- free, while in beta!" See: "W3C Scalable Vector Graphics (SVG)."

  • [September 21, 2001] "Microsoft's Golden Road to the Internet. Visual Studio .Net Enterprise Edition. [Product Review.]" By John Pearson. In WebTechniques Volume 6, Issue 10 (October 2001), pages 48-49. "Microsoft's future is bound to Visual Studio .Net, which is probably why it has made so many changes. If you've used previous releases of Visual Studio, this will be a whole new world. If you've used enterprise suites from other companies, you have to get this one and try it out... To understand what happened to VB -- indeed, what happened to Visual Studio -- we need to look under the covers and discuss Microsoft's .Net initiative. This initiative begins with the .Net Framework -- essentially a set of libraries, classes, and interpreters that constitute the foundation upon which all .Net services and languages are built. The libraries are collectively known as the Common Language Runtime (CLR) and include fundamental programming services such as memory management, process management, and security enforcement. Part of this library is also a compiler to process language instructions. This library has a set of classes, known as .Net Framework Unified Classes, that perform systems and programming tasks such as file management, system input and output, and operating system functionality. On top of the systems classes, other classes are built that do the things that we really want to get to: data classes (ADO.Net), Windows forms, and XML and Web classes... ADO.Net is the data access component of the .Net framework. It provides access to SQL Server and OLE DB data sources via an XML interface. For Web applications, first it provides the schema in XSD format, then transmits data in XML datasets. For ASP.Net programming, this makes database and server-side programming much easier. You can easily incorporate XML data from other sources into your applications, and you can use XML to transmit data from your application to others. 
ADO.Net is fully integrated into Visual Studio .Net and is the primary means of data access for the applications developed with it."

  • [September 21, 2001] "The Semantic Web: An Introduction." By Sean B. Palmer. ['This document is designed as being a simple but comprehensive introductory publication for anybody trying to get into the Semantic Web: from beginners through to long time hackers. The document discusses many principles and technologies of the Semantic Web, including RDF, RDF Schema, DAML, ontologies, inferences, logic, SEM, queries, trust, proof, and so on. Because it touches a lot of subjects, it may cover some well-known material, but it should also have something that will be of interest to everyone.'] "... So the Semantic Web can be seen as a huge engineering solution... but it is more than that. We will find that as it becomes easier to publish data in a repurposable form, so more people will want to publish data, and there will be a knock-on or domino effect. We may find that a large number of Semantic Web applications can be used for a variety of different tasks, increasing the modularity of applications on the Web. But enough subjective reasoning... onto how this will be accomplished. The Semantic Web is generally built on syntaxes which use URIs to represent data, usually in triples based structures: i.e., many triples of URI data that can be held in databases, or interchanged on the World Wide Web using a set of particular syntaxes developed especially for the task. These syntaxes are called 'Resource Description Framework' syntaxes... Table Of Contents: 1. What Is The Semantic Web?; 2. Simple Data Modelling: Schemata; 3. Ontologies, Inferences, and DAML; 4. The Power Of Semantic Web Languages; 5. Trust and Proof; 6. Ambient Information and SEM; 7. Evolution; 8. Does It Work? What Semantic Web Applications Are There?; 9. What Now? Further Reading." See: "XML and 'The Semantic Web'."
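
The "triples of URI data" idea above can be made concrete with a toy in-memory store. This Python sketch is not an RDF implementation and does not come from Palmer's document: the `TripleStore` class and the `example.org` URIs are hypothetical, and only the two Dublin Core property URIs are real identifiers. Objects may be URIs or plain literal strings.

```python
from typing import Iterator, NamedTuple, Optional

DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"  # Dublin Core 'creator'
DC_TITLE = "http://purl.org/dc/elements/1.1/title"      # Dublin Core 'title'


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TripleStore:
    """A set of (subject, predicate, object) statements with wildcard lookup."""

    def __init__(self) -> None:
        self._triples = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add(Triple(subject, predicate, obj))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for triple in sorted(self._triples):  # sorted for a stable order
            if ((s is None or triple.subject == s)
                    and (p is None or triple.predicate == p)
                    and (o is None or triple.obj == o)):
                yield triple


if __name__ == "__main__":
    store = TripleStore()
    doc = "http://example.org/semantic-web-intro"
    store.add(doc, DC_CREATOR, "http://example.org/people/sean")
    store.add(doc, DC_TITLE, "The Semantic Web: An Introduction")
    for triple in store.match(s=doc):
        print(triple)
```

Because every statement has the same three-part shape, data from different sources can be merged simply by adding their triples to one store.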

  • [September 21, 2001] "C/C++ developers: Fill your XML toolbox. Tools advice for C and C++ programmers ramping up on XML." By Rick Parrish. From IBM developerWorks. September 2001. "Designed for C and C++ programmers who are new to XML development, this article gives an overview of tools to assemble in preparation for XML development. Tool tables outline generic XML tools like IDEs and schema designers, parsers, XSLT tools, SOAP and XML-RPC libraries, and other libraries either usable from or actually written in C and/or C++. The article includes advice for installing open-source libraries on Windows, Unix, and Linux, plus a brief glossary of key XML terms. It seems as if everywhere you look there is some new XML-related tool being released in source code form written in Java. Despite Java's apparent dominance in the XML arena, many C/C++ programmers do XML development, and there is a large assortment of XML tools for the C and C++ programmer. We'll confront XML library issues like validation, schemas, and API models. Next, we'll look at a collection of generic XML tools like IDEs and schema designers. Finally, we'll conclude with a list and discussion of libraries either usable from or actually written in C and/or C++. This isn't a comparative review that rates tools. My goal is to explain the types of tools you'll probably need and to point you to likely candidates. You'll still need to research, test, and compare tool features against your project needs to assemble your ultimate toolbox. To incorporate XML in your own software projects, you're going to want to have two sets of tools in your bag of tricks. The first set is a dialect designer (or more properly 'schema designer'). The second set of tools includes software libraries that will add parsing and XML-generation features to your application... These tools ought to give you a good start on your XML toolbox.
If you want to suggest other C/C++ tools for XML that you have tried or to make any other comment, join the discussion referenced in this article." Also in PDF format.

  • [September 21, 2001] "Enabling XML security. An introduction to XML encryption and XML signature." By Murdoch Mactaggart (IBMDev@TextBiz.com). From IBM developerWorks. September 2001. "XML is a major enabler of what the Internet, and latterly Web services, require in order to continue growing and developing. Yet a lot of work remains to be done on security-related issues before the full capabilities of XML languages can be realised. At present, encrypting a complete XML document, testing its integrity, and confirming the authenticity of its sender is a straightforward process. But it is increasingly necessary to use these functions on parts of documents, to encrypt and authenticate in arbitrary sequences, and to involve different users or originators. At present, the most important sets of developing specifications in the area of XML-related security are XML encryption, XML signature, XACL, SAML, and XKMS. This article introduces the first two. XML has become a valuable mechanism for data exchange across the Internet. SOAP, a means of sending XML messages, facilitates process intercommunication in ways not possible before, while UDDI seems to be fast becoming the standard for bringing together providers and users of Web services; the services themselves are described by XML in the form of WSDL, the Web Services Description Language. Without XML, this flexibility and power would not be possible and, as various people have remarked, it would be necessary to invent the metalanguage. The other area of rapid growth is that of security. Traditional methods of establishing trust between parties aren't appropriate on the public Internet or, indeed, on large LANs or WANs. 
Trust mechanisms based on asymmetric cryptography can be very useful in such situations, but the ease of deployment and key management, the extent of interoperability, and the security offered are, in reality, far less than the enthusiastic vendors of different Public Key Infrastructures (PKI) would have us believe. There are particular difficulties in dealing with hierarchical data structures and with subsets of data with varying requirements as to confidentiality, access authority, or integrity. In addition, the application of now standard security controls differentially to XML documents is not at all straightforward. Several bodies are actively involved in examining the issues and in developing standards. The main relevant developments here are XML encryption and the related XML signature, eXtensible Access Control Language (XACL), and the related Security Assertion Markup Language (SAML -- a blending of the formerly competing AuthML and S2ML), both of which are driven by OASIS, and the XML Key Management Specification (XKMS). This article introduces XML encryption and XML signature... SAML is an initiative driven by OASIS that attempts to blend the competing specifications AuthML and S2ML, and to facilitate the exchange of authentication and authorisation information. Closely related to SAML, but focusing more on a subject-privilege-object orientated security model in the context of a particular XML document, is the eXtensible Access Control Markup Language, also directed by OASIS and variously known (even within the same documents) as XACML or XACL. By writing rules in XACL, a policy author can define who can exercise what access privileges for a particular XML document, something relevant in the situations cited earlier. XKMS, now being considered by a W3C committee, is intended to establish a protocol for key management on top of the XML signature standard. 
With SAML, XACL, and other initiatives, XKMS is an important element in the large jigsaw that makes up security as applied to XML documents. Its immediate effect is to simplify greatly the management of authentication and signature keys; it does this by separating the function of digital certificate processing, revocation status checking, and certification path location and validation from the application involved -- for example, by delegating key management to an Internet Web service." See "XML Digital Signature (Signed XML - IETF/W3C)."

  • [September 20, 2001] "Directory Services Markup Language Version 2.0." From OASIS TC for Directory Services Markup Language (DSML). Draft. September 19, 2001. 38 pages. "The Directory Services Markup Language v1.0 (DSMLv1) provides a means for representing directory structural information as an XML document.1 DSMLv2 goes further, providing a method for expressing directory queries and updates (and the results of these operations) as XML documents. DSMLv2 documents can be used in a variety of ways. For instance, they can be written to files in order to be consumed and produced by programs, or they can be transported over HTTP to and from a server that interprets and generates them. DSMLv2 functionality is motivated by scenarios including: (1) A smart cell phone or PDA needs to access directory information but does not contain an LDAP client. (2) A program needs to access a directory through a firewall, but the firewall is not allowed to pass LDAP protocol traffic because it isn't capable of auditing such traffic. (3) A programmer is writing an application using XML programming tools and techniques, and the application needs to access a directory. In short, DSMLv2 is needed to extend the reach of directories. DSMLv2 is not required to be a strict superset of DSMLv1, which was not designed for upward-compatible extension to meet new requirements. However it is desirable for DSMLv2 to follow the design of DSMLv1 where possible. ... DSMLv2 focuses on extending the reach of LDAP directories. Therefore, as in DSMLv1, the design approach is not to abstract the capabilities of LDAP directories as they exist today, but instead to faithfully represent LDAP directories in XML. The difference is that DSMLv1 represented the state of a directory while DSMLv2 represents the operations that an LDAP directory can perform and the results of such operations...." With the draft XML schemas: Batch Envelope, [imported]. See references in "Directory Services Markup Language (DSML)."
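
To make "expressing directory queries ... as XML documents" concrete, here is a hedged Python sketch that builds a search request with the standard library. The namespace URI and the element and attribute names (`batchRequest`, `searchRequest`, `filter`, `equalityMatch`, `dn`, `scope`, `derefAliases`) follow the later published DSMLv2 schema as best recalled; they are assumptions with respect to the September 2001 draft cited above and should be checked against the schema itself. The function name and the example DN are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed DSMLv2 core namespace; verify against the draft schema.
DSML_NS = "urn:oasis:names:tc:DSML:2:0:core"
ET.register_namespace("dsml", DSML_NS)


def q(tag):
    """Qualify a tag name with the DSML namespace."""
    return "{" + DSML_NS + "}" + tag


def build_search_request(base_dn, attribute, value):
    """Build an XML document asking for entries where attribute == value.

    This mirrors an LDAP subtree search expressed as XML rather than
    sent over the LDAP wire protocol.
    """
    batch = ET.Element(q("batchRequest"))
    search = ET.SubElement(
        batch,
        q("searchRequest"),
        {
            "dn": base_dn,
            "scope": "wholeSubtree",
            "derefAliases": "neverDerefAliases",
        },
    )
    flt = ET.SubElement(search, q("filter"))
    eq = ET.SubElement(flt, q("equalityMatch"), {"name": attribute})
    ET.SubElement(eq, q("value")).text = value
    return ET.tostring(batch, encoding="unicode")
```

A client without an LDAP stack, such as the PDA in scenario (1), could in principle write the returned string to a file or send it to a gateway over HTTP; the transport binding is outside this sketch.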

  • [September 20, 2001] "Understanding WSDL in a UDDI Registry. How to Publish and Find WSDL Service Descriptions." By Peter Brittenham, Francisco Curbera, Dave Ehnebuske, and Steve Graham. From IBM developerWorks [Web services articles]. September 2001. ['The Web Services Description Language has a lot of versatility in its methods of use. In particular, WSDL can work with UDDI registries in several different ways depending upon the application needs. In this first of a three-part series, we will look at these different methods of using WSDL with UDDI registries.'] The Web Services Description Language (WSDL) is an XML language for describing Web services as a set of network endpoints that operate on messages. A WSDL service description contains an abstract definition for a set of operations and messages, a concrete protocol binding for these operations and messages, and a network endpoint specification for the binding. Universal Description Discovery and Integration (UDDI) provides a method for publishing and finding service descriptions. The UDDI data entities provide support for defining both business and service information. The service description information defined in WSDL is complementary to the information found in a UDDI registry. UDDI provides support for many different types of service descriptions. As a result, UDDI has no direct support for WSDL or any other service description mechanism. The UDDI organization, UDDI.org, has published a best practices document titled Using WSDL in a UDDI Registry 1.05. This best practices document describes some of the elements on how to publish WSDL service descriptions in a UDDI registry. The purpose of this article is to augment that information. The primary focus is on how to map a complete WSDL service description into a UDDI registry, which is required by existing WSDL tools and runtime environments. 
The information in this article adheres to the procedures outlined in that best practices document and is consistent with the specifications for WSDL 1.1, UDDI 1.0 and UDDI 2.0..." For related articles, see the IBM developerWorks Web Services Zone. References: (1) "Web Services Description Language (WSDL)", and (2) "Universal Description, Discovery, and Integration (UDDI)."

  • [September 20, 2001] "Sun, IBM Update Web Services Tools." By Tom Sullivan. In InfoWorld (September 19, 2001). "Both Sun Microsystems and IBM on Tuesday announced upgraded tools for building applications and Web services. Sun, in Palo Alto, California, said that Forte for Java 3.0, Enterprise Edition, is now generally available. The company said that the primary focus of the new version is enhanced support for EJBs (Enterprise JavaBeans) and on creating XML-based services. In addition to the Enterprise Edition, Sun also maintains the free Community Edition, which now includes seven modules previously in the Internet version. These modules include an external editor, XML support, database explorer, CORBA support, terminal emulation, file copy, and support for C, C++, and Fortran. IBM, also on Tuesday, placed the latest version of its WSTK (Web services Toolkit) on the company's alphaWorks Web site for developers. WSTK v2.4 offers developers a runtime environment as well as introductory material and examples of Web services that developers can use. New to this version are support for IBM's WebSphere application server, HTTPR (reliable HTTP), WSDL (Web Services Description Language), and WSIF (Web Services Invocation Framework), which enables developers to describe non-SOAP based services in WSDL..."

  • [September 20, 2001] "Extend the Power of Java Technology with the Modular, Extensible Forte for Java IDE." By [Staff]. Sun Forte Tools Feature Story. September 01, 2001. "Whether you're a beginning programmer or a professional Java technology developer, Sun's Forte for Java, release 3.0 integrated development environment (IDE) provides an outstanding platform in which to create and deploy Enterprise JavaBeans (EJB). The Forte for Java IDE supports the editions of the Java 2 Platform: the Micro Edition (J2ME), the Standard Edition (J2SE), and the Enterprise Edition (J2EE). Moreover, the Forte for Java IDE is modular and extensible -- allowing you to quickly incorporate new technologies, such as wireless, smart Web services, and robust application-specific user interfaces from Sun, Sun's 75+ partners, and the open source community... As a key component of the Sun Open Net Environment (Sun ONE), you can count on the Forte for Java IDE to be an outstanding product that integrates with the Sun ONE architecture. Written in the Java programming language, it generates J2EE code. Because the Forte for Java IDE also includes many wizards and productivity features, and integrates with the iPlanet Application Server and the iPlanet Web Server, developers are enabled to create and deploy EJBs in a highly productive manner. You can choose either of the following Forte for Java software editions: (1) The Community Edition is offered at no charge and includes a complete and highly integrated set of tools -- including a Web browser, Web server, a relational database and support for CORBA, RMI, XML, and source code management. This edition includes all the functionality needed for teams of developers building database-aware Web applications, including integration with Tomcat... (2) The Enterprise Edition is ideally suited for developing scalable, robust applications and services based on the J2EE architecture specification. 
This edition includes all the functionality of the Community Edition plus support for building and assembling EJBs into applications. It also supports deploying applications to an integrated application server, such as the iPlanet Application Server. The Enterprise Edition also enables you to develop and publish Web services with the Web Services module or add a partner plug-in module that extends your development environment to support standards, such as ebXML, WSDL, UDDI, and SOAP..." See also the announcement from Sun.

  • [September 20, 2001] "RosettaNet Sets Compliance Program." By Chuck Moozakis. In InternetWeek (September 18, 2001). "RosettaNet today took the wraps off RosettaNet Ready, a package of developer tools and source code aimed at accelerating the adoption of the product definition standard. Ready has two components. The first, a developer tools library, lets companies test software to ensure it complies with RosettaNet standards. The second, a set of software compliance badges, verifies that applications written by members and other software developers conform to RosettaNet... Fourteen companies have already signed on as Ready backers, including application integration companies webMethods and SeeBeyond. Electronics industry exchange E2open is another backer. The exchange earlier this month kicked off a RosettaNet Onboarding service that incorporates RosettaNet's XML product descriptions into i2's supply chain apps. The service is geared to electronics and semiconductor manufacturers that want to use the Web to collaborate with their trading partners. RosettaNet hopes to have another 90 or so companies signed up to support the Ready initiative. Currently, RosettaNet has about 400 member companies that have pledged to adopt the standard..." See "RosettaNet."

  • [September 20, 2001] "Device Independence Principles." W3C Working Draft 18-September-2001. Edited by Roger Gimson (HP); Co-edited by Shlomit Ritz Finkelstein (Nexgenix), Stéphane Maes (IBM), and Lalitha Suryanarayana (SBC Technology Resources). Latest version URL: http://www.w3.org/TR/di-princ/. Produced as part of the W3C Device Independence Activity. ['The Device Independence Working Group has released its first publication, a Working Draft of Device Independence Principles. The document describes the principles necessary to make the Web accessible by "anyone, anywhere, anytime, anyhow".'] Abstract: "This document celebrates the vision of a device independent Web. It describes device independence principles that can lead towards the achievement of greater device independence for Web content and applications." Goal: "The aim of this document is to set out some principles that can be used when evaluating current solutions or proposing new solutions, and can lead to more detailed requirements and recommendations in the future. The principles are independent of any specific markup language, authoring style or adaptation process. They do not propose specific requirements, guidelines or technologies. It is intended, however, that these principles be used as a foundation when proposing greater device independence through, for example: (1) guidelines for authoring of content and applications that use existing markup languages, (2) modifications and extensions to existing markup languages, (3) designs of adaptation tools and processes, (4) evolution of new markup languages..." See also the mailing list archives.

  • [September 19, 2001] "Indexing and Querying XML Data for Regular Path Expressions." By Quanzhong Li and Bongki Moon (Department of Computer Science, University of Arizona, Tucson, AZ 85721, USA). Paper presented at the 2001 International Conference on Very Large Databases (VLDB 2001), Rome, Italy, September, 2001. 10 pages, with 25 references. "With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data becomes more and more important. Several XML query languages have been proposed, and the common feature of the languages is the use of regular path expressions to query XML data. This poses a new challenge concerning indexing and searching XML data, because conventional approaches based on tree traversals may not meet the processing requirements under heavy access requests. In this paper, we propose a new system for indexing and storing XML data based on a numbering scheme for elements. This numbering scheme quickly determines the ancestor-descendant relationship between elements in the hierarchy of XML data. We also propose several algorithms for processing regular path expressions, namely, (1) EE-Join for searching paths from an element to another, (2) EA-Join for scanning sorted elements and attributes to find element-attribute pairs, and (3) KC-Join for finding Kleene-Closure on repeated paths or elements. The EE-Join algorithm is highly effective particularly for searching paths that are very long or whose lengths are unknown. Experimental results from our prototype system implementation show that the proposed algorithms can process XML queries with regular path expressions by up to an order of magnitude faster than conventional approaches... The XQuery language is designed to be broadly applicable across all types of XML data sources from documents to databases and object repositories. 
The common features of these languages are the use of regular path expressions and the ability to extract information about the schema from the data. Users are allowed to navigate through arbitrary long paths in the data by regular path expressions. For example, XPath uses path notations as in URLs for navigating through the hierarchical structure of an XML document. Despite the past research efforts, it is widely believed that the current state of the art of the relational database technology fails to deliver all necessary functionalities to efficiently store XML and semi-structured data. Furthermore, when it comes to processing regular path expression queries, only a few straightforward approaches based on conventional tree traversals have been reported in the literature. Such approaches can be fairly inefficient for processing regular path expression queries, because the overhead of traversing the hierarchy of XML data can be substantial if the path lengths are very long or unknown. In this paper, we propose a new system called XISS for indexing and storing XML data based on a new numbering scheme for elements and attributes. The index structures of XISS allow us to efficiently find all elements or attributes with the same name string, which is one of the most common operations to process regular path expression queries. The proposed numbering scheme quickly determines the ancestor-descendant relationship between elements and/or attributes in the hierarchy of XML data. We also propose several algorithms for processing regular path expression queries... The new query processing paradigm proposed in this paper poses an interesting issue concerning XML query optimization. A given regular path expression can be decomposed in many different ways. Since each decomposition leads to a different query processing plan, the overall performance may be affected substantially by the way a regular path expression is decomposed. 
Therefore, it will be an important optimization task to find the best way to decompose an expression. We conjecture that document type definitions and statistics on XML data may be used to estimate the costs and sizes of intermediate results. In the current prototype implementation of XISS, all the index structures are organized as paged files for efficient disk IO. We have observed a trade-off between disk access efficiency and storage utilization. It is worth investigating the way to find the optimal page size or the break-even point between the two criteria." See "XML and Query Languages." [cache]
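
The abstract above says a numbering scheme can decide ancestor-descendant relationships without traversing the tree. The Python sketch below shows one simple way such a scheme can work: each element gets a preorder rank (`order`) and a descendant count (`size`), and `x` is an ancestor of `y` exactly when `order(x) < order(y) <= order(x) + size(x)`. This is an illustrative assumption, not the paper's own scheme, which may differ in detail (for example, in how it leaves room for insertions). The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def number_elements(root):
    """Label every element with (order, size).

    order: 1-based preorder rank of the element.
    size:  number of descendants of the element.
    """
    labels = {}
    counter = 0

    def visit(elem):
        nonlocal counter
        counter += 1
        order = counter
        for child in elem:
            visit(child)
        # After visiting the subtree, counter is the rank of its last node.
        labels[elem] = (order, counter - order)

    visit(root)
    return labels


def is_ancestor(labels, x, y):
    """True if x is a proper ancestor of y, using only the two labels."""
    order_x, size_x = labels[x]
    order_y, _ = labels[y]
    return order_x < order_y <= order_x + size_x
```

Because the test is two integer comparisons, candidate element lists fetched from an index can be joined on ancestry without walking parent or child links, which is the general idea behind the join algorithms the abstract names.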

  • [September 18, 2001] "A Fast Index for Semistructured Data." By Brian F. Cooper, Neal Sample, Michael J. Franklin, Gísli R. Hjaltason, and Moshe Shadmon. Paper presented at the 27th VLDB Conference, Roma, Italy, September 13, 2001. 19 pages, with 32 references. Abstract: "Queries navigate semistructured data via path expressions, and can be accelerated using an index. Our solution encodes paths as strings, and inserts those strings into a special index that is highly optimized for long and complex keys. We describe the Index Fabric, an indexing structure that provides the efficiency and flexibility we need. We discuss how 'raw paths' are used to optimize ad hoc queries over semistructured data, and how 'refined paths' optimize specific access paths. Although we can use knowledge about the queries and structure of the data to create refined paths, no such knowledge is needed for raw paths. A performance study shows that our techniques, when implemented on top of a commercial relational database system, outperform the more traditional approach of using the commercial system's indexing mechanisms to query the XML." Detail: "... Typically, indexes are constructed for efficient access. One option for managing semistructured data is to store and query it with a relational database... An alternative option is to build a specialized data manager that contains a semistructured data repository at its core. Projects such as Lore and industrial products such as Tamino and XYZFind take this approach. It is difficult to achieve high query performance using semistructured data repositories, since queries are again answered by traversing many individual element-to-element links, requiring multiple index lookups. Moreover, semistructured data management systems do not have the benefit of the extensive experience gained with relational systems over the past few decades. 
To solve this problem, we have developed a different approach that leverages existing relational database technology but provides much better performance than previous approaches. Our method encodes paths in the data as strings, and inserts these strings into an index that is highly optimized for string searching. The index blocks and semistructured data are both stored in a conventional relational database system. Evaluating queries involves encoding the desired path traversal as a search key string, and performing a lookup in our index to find the path. There are several advantages to this approach. First, there is no need for a priori knowledge of the schema of the data, since the paths we encode are extracted from the data itself. Second, our approach has high performance even when the structure of the data is changing, variable or irregular. Third, the same index can accelerate queries along many different, complex access paths. This is because our indexing mechanism scales gracefully with the number of keys inserted, and is not affected by long or complex keys (representing long or complex paths). Our indexing mechanism, called the Index Fabric, utilizes the aggressive key compression inherent in a Patricia trie to index a large number of strings in a compact and efficient structure. Moreover, the Index Fabric is inherently balanced, so that all accesses to the index require the same small number of I/Os. As a result, we can index a large, complex, irregularly-structured, disk-resident semistructured data set while providing efficient navigation over paths in the data. Indexing XML with the Index Fabric: Because the Index Fabric can efficiently manage large numbers of complex keys, we can use it to search many complex paths through the XML. In this section, we discuss encoding XML paths as keys for insertion into the fabric, and how to use path lookups to evaluate queries... We encode data paths using designators: special characters or character strings. 
A unique designator is assigned to each tag that appears in the XML. The designator-encoded XML string is inserted into the layered Patricia trie of the Index Fabric, which treats designators the same way as normal characters, though conceptually they are from different alphabets. In order to interpret these designators (and consequently to form and interpret queries) we maintain a mapping between designators and element tags called the designator dictionary. When an XML document is parsed for indexing, each tag is matched to a designator using the dictionary. New designators are generated automatically for new tags. The tag names from queries are also translated into designators using the dictionary, to form a search key over the Index Fabric. ... Raw paths index the hierarchical structure of the XML by encoding root-to-leaf paths as strings. Simple path expressions that start at the root require a single index lookup. Other path expressions may require several lookups, or post-processing the result set. [Here] we focus on the encoding of raw paths. Raw paths build on previous work in path indexing. Tagged data elements are represented as designator-encoded strings. We can regard all data elements as leaves in the XML tree..." See "XML and Query Languages." [cache 2001-09-18]
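
The designator encoding described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the authors' implementation: a sorted list with binary search stands in for the layered Patricia trie, designators are drawn from Unicode private-use code points (a hypothetical choice), and the class and function names are invented for the example.

```python
import bisect


class DesignatorDictionary:
    """Maps element tags to one-character designators, created on demand."""

    def __init__(self):
        self._by_tag = {}

    def designator(self, tag, create=True):
        if tag not in self._by_tag:
            if not create:
                raise KeyError(tag)
            # Hypothetical choice: private-use characters, assumed not
            # to occur in the indexed data values.
            self._by_tag[tag] = chr(0xE000 + len(self._by_tag))
        return self._by_tag[tag]


def encode_path(dictionary, tags, value="", create=True):
    """Encode a root-to-leaf path (plus optional leaf text) as one string."""
    return "".join(dictionary.designator(t, create) for t in tags) + value


class SortedKeyIndex:
    """Stand-in for the string index: a sorted list with prefix search."""

    def __init__(self):
        self._keys = []

    def insert(self, key):
        bisect.insort(self._keys, key)

    def prefix_search(self, prefix):
        """Return every stored key that starts with the given prefix."""
        start = bisect.bisect_left(self._keys, prefix)
        hits = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            hits.append(key)
        return hits
```

Under this sketch, a document is indexed by inserting one encoded key per root-to-leaf path, and a path query that starts at the root is translated through the same dictionary (with `create=False`) into a single prefix lookup. That matches the "single index lookup" behaviour the excerpt attributes to raw paths, though without the compression or balancing properties of the real structure.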

  • [September 14, 2001] "Microsoft Integration Software Targets Chemical Industry." By Renee Boucher Ferguson. In eWEEK (September 12, 2001). "Microsoft Corp. this week introduced a BizTalk business-to-business integration software development kit for the chemical industry. The Microsoft BizTalk Server 2000 CIDX Software Development Kit, which was rolled out at the Instrumentation, Systems and Automation Society conference in Houston, is designed to help chemical companies rapidly integrate applications, platforms and business processes inside and outside their firewalls. The SDK uses the core XML (Extensible Markup Language) protocols developed by the Chemical Industry Data Exchange, a consortium of chemical industry leaders. The software provides XSLT (Extensible Stylesheet Language Transformation) mapping documents that allow customers to map data from CIDX transactions to SAP AG intermediate documents, which are used in application linking and embedding. To help users along in the process, the CIDX kit also includes a sample utility that demonstrates an approach for automating the configuration of BizTalk, as well as a tutorial explaining how to implement support for a CIDX OrderCreate transaction. BizTalk is part of Microsoft's .Net platform, which supports creation of services that run on Web sites. While the chemical industry has been slow to adopt Microsoft technology as an e-business software provider and CIDX as a standard, Christopher McCormick believes it is only a matter of time before CIDX becomes the starting point for all e-business transactions in the industry... McCormick [CEO of IndigoB2B.com Inc.] estimated that about 20 percent of chemicals industry businesses use CIDX... The CIDX Chem eStandards grew out of some broad standards developed for high-tech manufacturing by RosettaNet, a multi-industry consortium of which Microsoft is a founding member..." See "XML-Based 'Chem eStandard' for the Chemical Industry."

  • [September 13, 2001] "The Race To Make Numbers Useful. XML-like Standards Aim to Enable Analysis of Data Posted Online." By L. Scott Tillett. In InternetWeek #877 (September 10, 2001), page 15. "The problem with numbers on the Web these days is that they're buried in an environment designed for text. Pulling numerical data off of a Web site and running it in an analytical application requires cutting and pasting or retyping. And good luck if you need to convert euros to dollars before plugging the numbers into your app... Efforts are multipronged, with vendors working on proprietary standards for defining, sharing and translating numerical data via the Web. Others such as e-Numerate of McLean, Va., are developing standards that they say will be open. And then there are efforts that pull in multiple industry players, such as XBRL.org, a consortium seeking to develop Extensible Business Reporting Language. Putting numbers into a language modeled on XML, for example, could let a Web site visitor view a company's financial statement and instantly merge those numbers into an app to compare that company's performance with that of the visitor's own firm. The same approach could work for a multinational company that wants to use applications to analyze numerical data flowing in from lots of countries... The two open standards being developed by e-Numerate are intended for sharing numerical data in an XML framework via the Web. One standard, RDL, addresses the meaning of numbers, including source information, descriptors and magnitude -- whether the numbers represent inches, dollars, euros, millions, thousands or whatnot. Scott Santucci, vice president of sales and marketing for e-Numerate, compares RDL to HTML descriptors that tell a browser whether to present information in bold, in italics or in a certain location on the page. The other standard, RXL, functions essentially as math equations that are applied to RDL. 
Santucci described them as 'macros' that process numbers--to adjust or "normalize" them for the rate of inflation, for example, or to convert them to another numerical standard... E-Numerate, which is backed by Carlyle Venture Partners and led by William M. Diefenderfer III, former President George Bush's budget director, is building a gateway to RDL/RXL-enabled numerical data that will be released sometime next year. Meanwhile, the company expects to release a Web development kit this month to let companies develop their Web sites using RDL/RXL-enabled numbers... Mike Willis, a partner at PricewaterhouseCoopers and chair of the XBRL.org steering committee, said that the concept of putting numbers within an XML framework would take off as the use of XML in business continues to gain momentum. Meanwhile, vendors such as Hyperion, CaseWare and Innovision continue to work in parallel to e-Numerate to create applications for the new numbers-sharing approach." See: "Re-Useable Data Language (RDL)."

  • [September 13, 2001] "Requirements for XML Document Database Systems." By Airi Salminen (Dept. of Computer Science and Information Systems, University of Jyväskylä, Jyväskylä, Finland) and Frank Wm. Tompa (Department of Computer Science, University of Waterloo, Waterloo, ON, Canada). Paper to be presented at ACM Symposium on Document Engineering, November 2001. 10 pages, with 52 references. "The shift from SGML to XML has created new demands for managing structured documents. Many XML documents will be transient representations for the purpose of data exchange between different types of applications, but there will also be a need for effective means to manage persistent XML data as a database. In this paper we explore requirements for an XML database management system. The purpose of the paper is not to suggest a single type of system covering all necessary features. Instead the purpose is to initiate discussion of the requirements arising from document collections, to offer a context in which to evaluate current and future solutions, and to encourage the development of proper models and systems for XML database management. Our discussion addresses issues arising from data modelling, data definition, and data manipulation... Effective means for the management of persistent XML data as a database are needed. We define an XML document database (or more generally an XML database, since every XML database must manage documents) to be a collection of XML documents and their parts, maintained by a system having capabilities to manage and control the collection itself and the information represented by that collection. It is more than merely a repository of structured documents or of semistructured data. As is true for managing other forms of data, management of persistent XML data requires capabilities to deal with data independence, integration, access rights, versions, views, integrity, redundancy, consistency, recovery, and enforcement of standards. 
A problem in applying traditional database technologies to the management of persistent XML documents lies in the special characteristics of the data, not typically found in traditional databases. Structured documents are often complex units of information, consisting of formal and natural languages, and possibly including multimedia entities. The units as a whole may be important legal or historical records. The production and processing of structured documents in an organization may create a complicated set of documents and their components, versions and variants, covering both basic data and metadata... Data model, DDL, and DML design must be coordinated if the resulting system is to be consistent. Much effort has been devoted to data definition for the purpose of validation and to query language features. We believe that now the highest priority is to define a complete data model that covers enterprise and document data, serves as a means to define conceptual schemas, and defines the mechanism to answer whether any two items of data are equivalent. We are encouraged by the move towards convergence of the XPath and XQuery data models; if convergence with the DOM and Infoset models were undertaken, a complete and stable database model might evolve. DDLs and DMLs can then be defined to include all components of the model. We believe that priority should also be given to developing mechanisms to manage collections of DTDs and other document definitions along with managing the documents themselves. This is especially important in the context of managing diverse collections of documents, each of which encompasses many versions and variants and subject to various levels of validity. The purpose of the paper is to initiate discussion of the requirements for XML databases, to offer a context in which to evaluate current and future solutions, and to encourage the development of proper models and systems for XML database management. 
A well-defined, general-purpose XML database system cannot be implemented before database researchers and developers understand the needs of document management in addition to the needs of more traditional database applications..." See: "XML and Databases." [cache]

  • [September 13, 2001] "Pork Barrel Protocols." By Martin Gudgin and Timothy Ewald. From XML.com. September 12, 2001. "XML Endpoints is a new column about web services, one of the most controversial and confusing topics in distributed systems development today. Our goal for this column is to examine web services as they exist today and as they will be evolving in the future. Along the way, we'll talk about protocols, programming models, toolkits, interoperability and more. We'll also try to sift through all of the proposals for competing web service related specifications -- e.g., WSDL, WSFL, XLANG, HTTPR, SOAPRP, UDDI, and so on -- in order to explain which ones are likely to be useful and why. Before we get to all that, however, we need to define the term 'web service'. . . First, web services rely on standard Internet protocols like HTTP and SMTP because nearly every platform supports them and because the entire Internet infrastructure -- the proxy servers, routers, gateways and firewalls that make up the physical network -- is designed and configured to transport and accept these protocols. Second, web services use XML-based messages because XML has industry-wide support and processing tools are inexpensive and ubiquitous. The SOAP specification defines a very widely endorsed format for these messages, and at least one alternative exists, XML-RPC. Third, web services describe their message formats in terms of a language and platform neutral type system. This helps facilitate the definition of precise wire-level contracts between web services and their clients, which makes building robust interoperable distributed systems easier. XML schema (XSD) is an obvious choice for a type system, but there are some conflicts between XSD and portions of the SOAP specification that need to be resolved. Fourth, web services provide some way to access metadata describing the messages they accept in terms of the type system mandated by the previous requirement..."

  • [September 13, 2001] "Picture Perfect." By Edd Dumbill. From XML.com. September 12, 2001. "Last week, the World Wide Web Consortium issued Scalable Vector Graphics (SVG) 1.0 as a Recommendation. SVG, as its name implies, is an XML application for describing two-dimensional graphics in the form of vector-based objects. As well as allowing normal 2D drawings, SVG is scriptable, enabling user interaction, and it incorporates animation capabilities from SMIL. Along with W3C XML Schema, SVG is one of the most important technologies to emerge from the W3C this year. It's certainly been long in the making -- SVG's first public Working Draft was published over two and a half years ago. Although many have been impatient for the final recommendation, this lengthy period of maturation has produced benefits in terms of the quality of SVG's specification and the number of supporting implementations. The text of the SVG Recommendation makes for an impressive read. It starts with a useful Concepts section that explains the key points and motivations behind SVG. The specification itself is beautifully formatted, comprehensively hyperlinked, and filled with examples. In addition, it is also very well indexed and useful as a reference, both for SVG processor implementers and those wishing to create SVG diagrams in XML. Accompanying the recommendation is a test suite, allowing developers of SVG implementations to verify their code against the expected renderings of SVG documents. Although W3C XML Schema has recently gained a test suite after going to recommendation status, to have one available through the development of a specification and at publication of the recommendation is an excellent move. It has also enabled the W3C to publish implementation conformance information for the various available SVG renderers... Setting aside the excellence of the specification itself, we must ask where SVG will succeed. 
After all, the best of technologies have been known to fail due to poor adoption. On the Web, SVG's most immediate competitor is Flash, the only real established technology for vector-based illustration and animation. Microsoft's Internet Explorer has had support for its own predecessor to SVG, VML, for a while now, but this hasn't really achieved widespread deployment on web sites. It is clearly the W3C's hope that SVG will supplant Macromedia's Flash to a certain extent, bringing as it does the benefits of integration with the emerging XML infrastructure both in browsers and on the server side, and of course the open process of a W3C-fostered specification..." See (1) the news entry for the Scalable Vector Graphics (SVG) 1.0 specification as a W3C Recommendation, and (2) "W3C Scalable Vector Graphics (SVG)."
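
As a rough illustration of what "an XML application for describing two-dimensional graphics in the form of vector-based objects" means in practice, the following Python sketch builds a tiny SVG document with the standard library. The shapes, sizes, and colors are arbitrary choices made for this example and do not come from Dumbill's article.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"  # namespace defined by the SVG Recommendation


def build_svg() -> str:
    """Return a minimal SVG document containing two vector shapes."""
    ET.register_namespace("", SVG_NS)  # serialize SVG as the default namespace
    svg = ET.Element(
        f"{{{SVG_NS}}}svg",
        {"width": "200", "height": "100", "viewBox": "0 0 200 100"},
    )
    # Each shape is an element whose geometry lives in its attributes.
    ET.SubElement(
        svg,
        f"{{{SVG_NS}}}rect",
        {"x": "10", "y": "10", "width": "80", "height": "80", "fill": "steelblue"},
    )
    ET.SubElement(
        svg,
        f"{{{SVG_NS}}}circle",
        {"cx": "150", "cy": "50", "r": "40", "fill": "none", "stroke": "black"},
    )
    return ET.tostring(svg, encoding="unicode")


document = build_svg()
# Because SVG is plain XML, any XML parser can read the drawing back.
shapes = [child.tag.split("}")[1] for child in ET.fromstring(document)]
print(document)
print(shapes)
```

Saving the printed document to a file with an .svg extension should let an SVG-capable viewer render it.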

  • [September 13, 2001] "What Are XForms?" By Micah Dubinko. From XML.com. September 12, 2001. "XForms are the new XML-based replacement for web forms. Think about how many times a day you use forms, electronic or otherwise. On the Web, forms have truly become commonplace for search engines, polls, surveys, electronic commerce, and even on-line applications. Nearly all user interaction on the Web is through forms of some sort. This ubiquitous technology, however, is showing its age. It predates XML by several years, a contributing factor to some of its limitations: poor integration with XML, device dependent, running well only on desktop browsers, blending of purpose and presentation, [and] limited accessibility features. A new technology, XForms, is under development within the W3C and aims to meld XML and forms. The design goals of XForms meet the shortcomings of HTML forms point-for-point: (1) Excellent XML and Schema integration; (2) Device independent, yet still useful on desktop browsers; (3) Strong separation of purpose from presentation; (4) Universal accessibility. This document gives an introduction to XForms, based on the 28 August 2001 Working Draft. The most important concept in XForms is 'instance data', an internal representation of the data mapped to the more visible 'form controls'. Instance data is based on XML and defined in terms of XPath's internal representation and processing of XML. It might seem strange at first to associate XPath and XForms. XPath is perhaps best known as the common layer between XSLT and XPointer, not as a foundation for web forms. As XForms evolved, however, it became apparent that forms needed greater structure than was possible with simple name-value pairs, as well as syntax to reach into the instance data to connect or "bind" form controls to specific parts of the data structure. 
XForms processing combines input and output into the same tree: (1) From an input source, either inline or an XML document on a server, "instance data" is parsed into memory. (2) Processing of the instance data involves interacting with the user and recording any changes in the data. (3) Upon submit, the instance data is serialized, typically as XML, and sent to a server... The XForms specification fully adopts the XML Schema data-types mechanism (including a narrower subset for small devices such as mobile phones) to provide additional data collection parameters such as maximum length or a regular expression pattern like an email address. This, combined with form-specific properties, is called the 'XForms Model' and is the basis for creating powerful forms that aren't dependent on scripts..." See the XForms 1.0 Working Draft published 28-August-2001 and the main reference page "XML and Forms."
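
The three processing steps quoted above (parse instance data, record user changes, serialize on submit) can be sketched in Python. This is not XForms markup and not an XForms processor; it only mimics the processing model with the standard library. The instance document, the control names, and the binding table are hypothetical, and ElementTree supports only a small subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance data; a real form would supply this inline or from a server.
INSTANCE_XML = (
    "<order><customer><email/></customer><quantity>1</quantity></order>"
)

# Hypothetical bindings: form control name -> path into the instance data.
BINDINGS = {"email_field": "customer/email", "qty_field": "quantity"}


def load_instance(source: str) -> ET.Element:
    """Step 1: parse the instance data into an in-memory tree."""
    return ET.fromstring(source)


def set_control(instance: ET.Element, control: str, value: str) -> None:
    """Step 2: record a user's change in the node the control is bound to."""
    node = instance.find(BINDINGS[control])
    if node is None:
        raise KeyError(f"binding for {control!r} matches no node")
    node.text = value


def submit(instance: ET.Element) -> str:
    """Step 3: serialize the (possibly modified) instance data as XML."""
    return ET.tostring(instance, encoding="unicode")


instance = load_instance(INSTANCE_XML)
set_control(instance, "email_field", "user@example.org")
set_control(instance, "qty_field", "3")
submitted = submit(instance)
print(submitted)
```

The point of the sketch is that input and output share one tree: the same structure that seeds the form is what gets sent back, rather than a flat list of name-value pairs.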

  • [September 12, 2001] "Compatibility Encoding Scheme for UTF-16: 8-Bit (CESU-8)." Proposed Draft Unicode Technical Report #26. By Toby Phipps. Version URL: http://www.unicode.org/unicode/reports/tr26/tr26-1. "This document specifies an 8-bit Compatibility Encoding Scheme for UTF-16 (CESU) that is intended as an alternate encoding to UTF-8 for internal use within systems processing Unicode in order to provide an ASCII-compatible 8-bit encoding that preserves UTF-16 binary collation. It is not intended nor recommended as an encoding used for open information exchange. The Unicode Consortium does not encourage the use of CESU-8, but does recognize the existence of data in this encoding and supplies this technical report to clearly define the format and to distinguish it from UTF-8. This encoding does not replace or amend the definition of UTF-8."
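
To make the difference from UTF-8 concrete, here is a minimal Python sketch of a CESU-8 encoder. It assumes the usual description of the scheme: the text is first expressed as UTF-16 code units, and each code unit (including each half of a surrogate pair) is then written in the UTF-8 byte pattern. The function name `cesu8_encode` is made up for this example, and the sketch is meant for illustration rather than production use.

```python
import struct


def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8: UTF-8 byte patterns applied to UTF-16 code units."""
    utf16 = text.encode("utf-16-be")
    units = struct.unpack(f">{len(utf16) // 2}H", utf16)
    # "surrogatepass" lets Python emit the 3-byte form for a lone surrogate,
    # which is exactly what CESU-8 does for each half of a surrogate pair.
    return b"".join(chr(u).encode("utf-8", "surrogatepass") for u in units)


supplementary = "\U00010400"  # a character outside the Basic Multilingual Plane
print(supplementary.encode("utf-8").hex())  # 4-byte UTF-8 form
print(cesu8_encode(supplementary).hex())    # 6-byte CESU-8 form

# Binary collation: CESU-8 byte order follows UTF-16 code unit order,
# whereas UTF-8 byte order follows code point order.
words = ["\uff5e", "\U00010400"]
by_utf16 = sorted(words, key=lambda s: s.encode("utf-16-be"))
by_cesu8 = sorted(words, key=cesu8_encode)
by_utf8 = sorted(words, key=lambda s: s.encode("utf-8"))
```

For text confined to the Basic Multilingual Plane the two encodings produce identical bytes; they diverge only for supplementary characters, which is why the report takes care to distinguish the format from UTF-8.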

  • [September 12, 2001] "In the Financial Flow. Financial Information Gets an XML Wrap Thanks to Users' Push for Open Standards [XML Enlarges the Funnel. XML is Opening Up the Financial Biz.]" By Mark Leon. In InfoWorld Volume 23, Issue 37 (September 10, 2001), page 37. ['Mark Hunt (Reuters, Director of E-Business) no longer sees a competitive edge in owning information standards.'] "Money may not grow on trees, but XML seems to be sprouting up everywhere -- and now the financial information industry has some fresh new leaves of its own. After years of protecting proprietary data systems, financial organizations are now working together on open XML standards that, according to analysts, will ease consumers' burden of juggling multiple financial information data formats. But they aren't doing it just to be nice. 'Companies like Reuters realize that they can no longer shove their proprietary standards down customers' throats,' says Dana Stiffler, an analyst at AMR Research in Boston. She explains that in the past these firms were able to dictate a proprietary messaging format, forcing their customers to buy specialized hardware and software to access financial information. Most people have seen images of stock brokers and traders surrounded by several terminals. The need for all the screens arose because none of the companies providing the vital information services were willing to use open messaging standards -- traders needed a different system for each financial data source. But the rise of the Internet, says Stiffler, has now made these same customers less tolerant of closed systems. Mark Hunt, director of e-business capability at Reuters in London, is well-acquainted with this trend toward openness. 'NewsML is a good example [of an open standard],' says Hunt. 'We took the lead on creating this as a standard for combining text, images, and video in an XML-based news feed. 
Now, we could have tried to own NewsML and make it a Reuters standard, but that is not the way the Web works.' NewsML binds text in multiple languages, images, and video together in a Web-based format accessible to Internet search engines... For Hunt, the funnel represents a host of open messaging standards -- and a wider funnel can hold more data. And one company can't possibly own that funnel, says Hunt, or others would have to license the product or software to access the funnel's data, so fewer sources will feed it: hence the move to create industry organizations and hammer out XML standards for just about any financial information topic. The new standards include FPML (Financial Products XML) and XBRL (Extensible Business Reporting Language). FPML is an XML format designed to handle complex financial instruments. 'It took 18 months just to define the DTD (Document Type Definition) for this,' says Hunt, noting that 'J. P. Morgan had a particular interest in FPML to ease the processing of interest rate derivatives.' XBRL is intended to introduce some consistency to the way information appearing in financial reports is formatted. 'Profits after tax may have a different meaning, depending on the country you are in,' says Hunt. 'XBRL tries to make this transparent to the consumers of this information.' Bridge's Hartley agrees that XML standards are changing the nature of his business. He adds MDDL (Market Data Definition Language) to the financial industry's alphabet soup..." See references in (1) Extensible Business Reporting Language (XBRL); (2) Financial Products Markup Language (FpML); (3) Market Data Definition Language (MDDL).

  • [September 12, 2001] "Digital Evolution draws up internal UDDI registries." By Charles Babcock [Interactive Week]. From ZDNet TechInfo. September 10, 2001. "Universal Description, Discovery and Integration registries are expected to provide a path to services over the Web. But Eric Pulier, president of Digital Evolution, believes UDDI is also good for providing services within the enterprise. The UDDI Community, an industry consortium of 280 technology vendors and businesses, will eventually submit a mature specification to a standards body, such as the World Wide Web Consortium or the Internet Engineering Task Force... Digital Evolution is a company that provides a UDDI registry inside a company's firewall for employees and anointed business partners to access and use. The UDDI registry is something like the pages of the phone book. The XML-based specification provides support for contact names and Web addresses (white pages), an industry classification (yellow pages) and types of services offered (green pages). At some point, a series of UDDI servers are expected to exist around the Web like the current Domain Name System, which translates typed Web site names into TCP/IP addresses. By querying those servers, a Web application or other software could discover what services are available to it, what transactions they offer and what level of encryption is used... Digital Evolution is one of the first companies to seize the emerging UDDI standard and build a product line around it, though it is aimed inside the corporation at the IT manager rather than outside at software to software Web operations. 'We have a private UDDI registry. We seek to sell a suite of products that facilitate the use of Web services in an enterprise,' Pulier said. 
The products include: Data Consumer, a browser-based data sorter that allows narrowing a data set to what the user is interested in; Margin Call, which allows a server to be set up to store frequently requested data in main memory, leading to speedier responses; Code Mason, which automatically creates copies of stored procedures and data access classes of a database system, reducing the need for programmers to recreate them manually; and Java Trap, which creates a repository of XML files containing information about the environment in which a Java application will run..." See: "Universal Description, Discovery, and Integration (UDDI)."

  • [September 10, 2001] "High Performance Web Sites: ADO versus MSXML." By Timothy M. Chester, Ph.D. (Senior Systems Analyst, Computing & Information Services, Texas A&M University). TAMU CIS white paper. 15 pages. A related version has been published in Dr Dobb's Journal [DDJ] (October 2001) #329, pages 81-86 [Internet Programming. 'ADO and MSXML are tools that can be used to create high-performance web sites. MSXML provides flexibility, but ADO offers performance.' With listings.] "This article is about comparing the ASP/ADO and XML/XSL programming models. The emphasis is not on technique (although sample code is provided) but on performance. This article asks the question, 'MSXML is really cool, but how does it perform when compared to the ASP/ADO model I am already familiar with?' Like most web related issues, both methods have tradeoffs. I'll build two versions of a simple website, one using ASP and ADO to generate the user interface (UI), the other using MSXML. Then I will conduct benchmarks and compare the performance of both models. In some scenarios, ASP/ADO was found to perform better than MSXML. However, in other situations MSXML provided a ten-fold increase in performance... The Internet has evolved from simple static websites to web-based computing systems that support thousands of users. This evolutionary process experienced tremendous growth with the introduction of Microsoft's Active Server Pages (ASP), an easy-to-use and robust scripting platform. ASP makes it easy to produce dynamic, data driven webpages. The next big step came with ActiveX Data Objects (ADO) and the Component Object Model (COM). These tools allow developers to access multiple datasources easily and efficiently and best of all, in a way that is easy to maintain. Together, ASP and ADO provide the basic infrastructure for separating data from business and presentation logic, following the now infamous 'n-Tier architecture'. 
With the introduction of XML and XSL, websites are now taking another gigantic leap forward. In this article, I will compare the latest evolutionary leaps with an eye toward website performance by building two versions of a simple website - one using ASP and ADO to generate the user interface (UI) and the other using Microsoft's MSXML parser to transform XML/XSL documents. I will then conduct benchmarks and compare the throughput (transactions per second) of both models. Like most web related issues, both methods have tradeoffs. In some scenarios ASP/ADO performs better than MSXML. In other situations, however, MSXML provides an incredible performance advantage... [Summary:] Website performance is not a black and white subject, but is actually very, very, gray. One basic premise is often overlooked: the ways in which a website is coded has as much (or more) to do with performance than the power of the underlying web server. ADO and MSXML are tools that can be used to create high performance websites. MSXML provides increased flexibility to developers, but at a cost. When drawing data directly from a database, MSXML performs slower than ADO. However, MSXML provides an easy way to cache the presentation of data, thereby providing up to a ten fold increase in website performance. This is a viable solution for websites that need to support thousands of concurrent users..."
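
The summary's claim that caching the presentation of data can outweigh the cost of generating it is a general technique, and a short Python sketch can show the idea. This is not Chester's ASP, ADO, or MSXML code: the in-memory `STAFF` table stands in for a database, and `fetch_rows` and `render_page` are hypothetical names chosen for this example.

```python
from functools import lru_cache
from html import escape

# Hypothetical stand-in for a database table; a real site would query a database.
STAFF = {"cis": ["Ada <Admin>", "Grace"], "library": ["Melvil"]}

render_count = 0  # counts how often the expensive rendering path actually runs


def fetch_rows(department: str) -> list:
    """Pretend database query returning the names in one department."""
    return STAFF.get(department, [])


@lru_cache(maxsize=128)
def render_page(department: str) -> str:
    """Render the HTML fragment once per department, then serve it from cache."""
    global render_count
    render_count += 1
    items = "".join(f"<li>{escape(name)}</li>" for name in fetch_rows(department))
    return f"<ul>{items}</ul>"


first = render_page("cis")   # runs the query and the rendering
second = render_page("cis")  # served from the cache, no query, no rendering
print(first)
print(render_count)
```

The trade-off is staleness: when the underlying data changes, the cached fragment has to be discarded (here, `render_page.cache_clear()`), so the approach suits pages that are read far more often than they change.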

  • [September 10, 2001] "VoiceXML and the Voice/Web Environment. Visual Programming Tools for Telephone Application Development." By Lee Anne Phillips. In Dr Dobb's Journal [DDJ] (October 2001) #329, pages 91-96. Programmer's Toolchest. "While the Internet is making inroads into the public switched-telephone network, XML protocols such as VoiceXML are providing access to a set of tools that address the entire range of web applications..." The article provides an overview of GUI tools for creating VoiceXML applications, and reviews two: Visual Designer 2.0 from Voxeo, and Covigo Studio. [Covigo Studio "provides a visual programming environment that helps you to rapidly develop integrated mobile data and voice applications. Based on a user-centric process modeling approach, Studio separates user-interaction workflow from presentation design and data source integration. It allows you to build mobile applications from the ground-up or as extensions to existing applications, and to constantly optimize their applications to meet changing user, industry and business needs. The visual modeling approach provides multiple ways to integrate with existing enterprise applications at the presentation layer, business logic layer, or data layer levels. The product integrates with existing IT systems - including complex enterprise business processes encapsulated in systems used for customer relationship management (CRM), enterprise resource planning (ERP), and supply chain automation (SCM). This includes integrating with such technologies as HTML, JSPs, EJBs, JDBC, XML, and packaged application APIs..." The Visual Designer 2.0 from Voxeo is available at no cost. One can use the designer "to visually design phone applications and it will automatically generate the VoiceXML or CallXML markup for you. This allows a voice application developer to focus on important issues like usability and functionality, without having to worry about syntax. 
Voxeo Designer 2.0 is the first visual phone markup design tool to fully support round-trip development -- any CallXML or VoiceXML application may be opened in the Designer tool, updated graphically (or by editing the XML directly) and re-deployed for use. Features include: Visual application design using flowcharts; Full round-trip, bi-directional development; Element/Attribute syntax validation; FTP and HTTP support for file read and write; Full CallXML Tag Support; Full VoiceXML 1.0 Tag support; 100% Pure-Java IDE, runs on any Java Virtual Machine ..."] Additional resources with Lee Anne's article include listings and source code. See "VoiceXML Forum."

  • [September 10, 2001] "Regular Expressions in C++. Text processing for C/C++ programmers." By John Maddock. In Dr Dobb's Journal [DDJ] (October 2001) #329, pages 21-26. "Regular expressions form a central role in many programming languages, including Perl and Awk, as well as many familiar UNIX utilities such as grep and sed. The intrinsic nature of pattern matching in these languages has made them ideally suited to text processing applications, particularly for those web applications that have to process HTML. Traditionally, C/C++ users have had a hard time of it, usually being forced to use the POSIX C API functions regcomp, regexec, and the like. These primitives lack support for search and replace operations and are tied to searching narrow character C-strings. Some time ago, I began work on a modern regular expression engine that would support both narrow- and wide-character strings, as well as standard library-style iterator-based searches. I call this library 'regex++', available at http://ourworld.compuserve.com/homepages/john_maddock/regexpp.htm; it was accepted as part of the peer-reviewed boost library. In this article, I'll show how regex++ can be used to make C++ as versatile for text processing as script-based languages such as Awk and Perl... I do not intend to discuss the regular expression syntax in this article, but the syntax variations supported by regex++ are described online. The documentation for Perl, Awk, sed, and grep are other useful sources of information, as is the Open UNIX Standard... This article shows some of the power that regular expressions in C++ can give you. Regex++ does not seek to replace traditional regex tools such as lex. Rather, it provides a more convenient interface for rapid access to all kinds of pattern matching and text processing -- something that has traditionally been limited to scripting languages. 
In addition, it provides a modern iterator-based implementation that allows it to work seamlessly with the C++ Standard Library, providing the versatility that C++ users have come to expect from modern libraries."

  • [September 10, 2001] "Rampant Confusion." By Chad Dickerson [InfoWorld CTO]. In InfoWorld Volume 23, Issue 37 (September 7, 2001), page 12. "I'm going to start this week's column by making a couple of hype-challenged statements: XML is inherently useless; and Web services, although it's the next big thing to nontechnical folks, has been chugging along quietly for a few years without much fanfare. In my role as CTO, I sit in a lot of meetings where I act as translator between the business folks and the engineers. XML, which works quite well in helping machines talk to each other, creates quite a lot of confusion when people talk about it. Many of the discussions go as follows: Business person: 'We need to integrate data from Company X into our Web site.' Me: 'What format will the data be in?' Business person (smiling broadly): 'XML; it's all XML.' Me: 'OK, I'll need to have an engineer look at how they structure their data so we can process it properly and integrate it into the site.' Business person (smile weakening): 'But it's in XML. ... ' Me: 'Great, I'm glad it's in XML format. We need some time to port the data into our database, do QA, and make sure we process the data feed properly as it comes in.' Business person (frown developing): 'But it's in XML. ... ' At this point I start explaining that receiving an XML feed is the beginning of an integration process, not the end. To paraphrase from the XML FAQ: XML is a markup specification language and XML files are data: They just sit there until you run a program which displays them (like a browser), or does some work with them (like a converter which writes the data in another format, or a database which reads the data), or modifies them (like an editor). In other words, as much as we all love it, XML alone is more or less useless. Although XML can be wonderful for trading data among applications, applications do not magically appear around XML documents. 
XML does, however, function as a great point of leverage for applications, which leads us to Web services... The term Web services confuses many people, and what was supposed to make things easier is making things more difficult. But this is mainly due to lack of clarity in marketing, not shortcomings in what is essentially an extraordinarily simple and powerful concept... the XML-RPC specification provides an easily grasped window into the technical promise of Web services, while also serving as a spirited manifesto for the then-new Web services world order. When I grow confused about what Web services means, I read the XML-RPC spec and it makes sense again..." See: "XML-RPC."

  • [September 10, 2001] "XML-RPC for PHP, Version 1.0." By Edd Dumbill. Documentation. Version 1.0 is available for download. "The 1.0 release is the final release to be managed by Useful Information Company... We've developed classes which encapsulate XML-RPC values, clients, messages and responses. Using these classes it's possible to query XML-RPC servers. XML-RPC is a format devised by Userland Software for achieving remote procedure call via XML. XML-RPC has its own web site, www.XmlRpc.com. The most common implementations of XML-RPC available at the moment use HTTP as the transport. A list of implementations for other languages such as Perl and Python can be found on the www.xmlrpc.com web site. This collection of PHP classes provides a framework for writing XML-RPC clients and servers in PHP..." [Edd's XML-DEV post: "So, it took me two years to get brave enough to call it '1.0', but here it is. I finally reckon my all-PHP classes for doing XML-RPC are 'stable.' Available under the BSD license. More detail at http://xmlrpc.usefulinc.com/php.html. A good time to note too that I've moved the project to SourceForge as well (which turns out to surpass my expectations in niftiness), and have already gained two more developers on the project. It is my intent to step down as maintainer as soon as a suitable replacement emerges..."] Note also the book Programming Web Services with XML-RPC, by Simon St.Laurent, Joe Johnston, and Edd Dumbill [foreword by Dave Winer]. O'Reilly, June 2001. "XML-RPC, a simple yet powerful system built on XML and HTTP, lets developers connect programs running on different computers with a minimum of fuss. Java programs can talk to Perl scripts, which can talk to ASP applications, and so on. With XML-RPC, developers can provide access to functionality without having to worry about the system on the other end, so it's easy to create web services... 
Programming Web Services with XML-RPC introduces the simple but powerful capabilities of XML-RPC, which lets you connect programs running on different computers with a minimum of fuss, by wrapping procedure calls in XML and establishing simple pathways for calling functions. With XML-RPC, Java programs can talk to Perl scripts, which can talk to Python programs, ASP applications, and so on..." See: "XML-RPC."
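
For readers who have not seen what "wrapping procedure calls in XML" looks like, the sketch below uses Python's standard `xmlrpc.client` module (not the PHP classes described above) to build and parse the XML payloads without contacting any server. The method name `sample.add` and its arguments are hypothetical.

```python
import xmlrpc.client

# Marshal a call to a hypothetical remote method "sample.add" with two integers.
request_xml = xmlrpc.client.dumps((2, 3), methodname="sample.add")
print(request_xml)

# A server would unmarshal the same payload back into parameters and a method name.
params, method = xmlrpc.client.loads(request_xml)

# The reply is also XML: a single return value wrapped in a methodResponse.
response_xml = xmlrpc.client.dumps((sum(params),), methodresponse=True)
(result,), _ = xmlrpc.client.loads(response_xml)
print(method, params, result)
```

In a deployed service the request document would typically travel as the body of an HTTP POST, and `xmlrpc.client.ServerProxy` would handle the marshalling shown here behind an ordinary-looking method call.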

  • [September 10, 2001] "Use XML as a Java Localization Solution. The reusability that XML affords TMX-formatted data benefits Java internationalization development." By Masaki Itagaki. From LISA web site. "Java has been one of the best programming languages for global market-oriented application development since JDK 1.1 covered basic components for internationalization. Java has many internationalization approaches supporting such aspects as Unicode 2.0, multilingual environment, and Locale objects, to name a few. However, you still have to consider the daunting, fundamental work that is required for a global market, which means translating all text items such as labels, messages, menu items, and so on. Even for these kinds of localization issues, Java offers a nice solution in the ResourceBundle class. You can extract all the text items from original source codes, isolating them into ResourceBundle components such as a ListResourceBundle class or a property file. Although such a scheme makes a developer's life much easier, it's rather clumsy from the translation point of view, especially in terms of reusability of translations. In the localization industry, Translation Memory eXchange (TMX) is a standardized data format that uses XML for software and document translation assets. Most of the commercial translation tools can use the TMX file to reuse translation data. Translators who want to use the TMX solution for Java must implement their own data conversion between TMX and ResourceBundle data... Since 1997 the localization industry has put a lot of effort into standardizing a translation data format. The Localization Industry Standards Association (LISA), a nonprofit internationalization and localization organization, formed a special interest group called Open Standards for Container/Content Allowing Reuse (OSCAR) to define a translation memory data format and publish the TMX standard. 
This is simply XML-formatted data defining elements and attributes that are necessary to organize translation data efficiently... Most benefits of the TMXResourceBundle class are on the development side. Since the number of words usually determines the cost of translation, requesting translation of the same items is not cost efficient. Using TMX's DTD, you can also embed such information as a package name, a class name, and a project name. This gives you an exact match in translation data, which enables you to extract only new items. Meanwhile, if you want to achieve consistency between software translation and document translation (such as guides, manuals, and even computer-based training programs), TMX proves to be a great solution. By importing your Java TMX file into any translation tool, you can reuse Java translations through a word book or glossary functions, which are included in most translation tools. Thus, TMX benefits not just the translation industry, but Java internationalization development, as well..." Article originally published in JavaPro Magazine. See: "Translation Memory Exchange (TMX)."
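
The conversion between TMX and ResourceBundle-style data mentioned in the article can be sketched roughly as follows. This is Python rather than the article's Java, and it is not the author's TMXResourceBundle class. It assumes each translation unit's `tuid` attribute is used as the resource key, which is a choice made for this example; the sample TMX content is invented, and escaping of non-Latin-1 characters for real Java property files is omitted.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample data; a real TMX file would come from a translation tool.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence" adminlang="en"
          creationtool="sketch" creationtoolversion="0" o-tmf="none"/>
  <body>
    <tu tuid="menu.file">
      <tuv xml:lang="en"><seg>File</seg></tuv>
      <tuv xml:lang="fr"><seg>Fichier</seg></tuv>
    </tu>
    <tu tuid="menu.exit">
      <tuv xml:lang="en"><seg>Exit</seg></tuv>
      <tuv xml:lang="fr"><seg>Quitter</seg></tuv>
    </tu>
  </body>
</tmx>"""


def tmx_to_bundles(tmx_text: str) -> dict:
    """Return {language: {key: text}} built from the translation units."""
    bundles = {}
    for tu in ET.fromstring(tmx_text).iter("tu"):
        key = tu.get("tuid")
        if key is None:
            continue  # assumption: units without a tuid have no resource key
        for tuv in tu.findall("tuv"):
            # Older TMX versions used a plain "lang" attribute instead of xml:lang.
            lang = tuv.get(XML_LANG) or tuv.get("lang")
            seg = tuv.find("seg")
            if lang is None or seg is None:
                continue
            bundles.setdefault(lang, {})[key] = "".join(seg.itertext())
    return bundles


def to_properties(bundle: dict) -> str:
    """Format one language's entries as key=value lines (no escaping applied)."""
    return "\n".join(f"{key}={text}" for key, text in sorted(bundle.items()))


bundles = tmx_to_bundles(SAMPLE_TMX)
print(to_properties(bundles["fr"]))
```

Keeping one TMX file as the source and generating per-locale key-value files from it is one way to get the reuse the article describes, since the same TMX file can also be loaded into translation tools.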

  • [September 10, 2001] "Quality of Service Extension to IRML." IETF INTERNET-DRAFT 'draft-ng-opes-irmlqos-00.txt.' July 2001. By Chan-Wah Ng, Pek Yew TAN, and Hong CHENG (Panasonic Singapore Laboratories Pte Ltd). "The Intermediary Rule Markup Language (IRML) is an XML-based language that can be used to describe service-specific execution rules for network edge intermediaries under the Open Pluggable Edge Services (OPES) framework, as described in "Extensible Proxy Services Framework" and "Example Services for Network Edge Proxies". This memo illustrates examples of employing the IRML for Quality of Service (QoS) policing and control, and suggests extensions to IRML for better QoS support. This memo begins in Section 2 by illustrating a few scenarios where QoS policing and control can be incorporated into the OPES intermediary. From there, a set of preliminary requirements for QoS extension to the IRML is drafted in Section 3. Section 4 proposed a set of QoS extension to the 'property' element defined in the IRML, and Section 5 presents some examples illustrating possible use of these extensions." [cache]

  • [September 10, 2001] "Sub-System Extension to IRML." IETF INTERNET-DRAFT 'draft-ng-opes-irmlsubsys-00.txt.' July 2001. By Chan-Wah Ng, Pek Yew TAN, and Hong CHENG (Panasonic Singapore Laboratories Pte Ltd). "The Intermediary Rule Markup Language (IRML) is an XML-based language that can be used to describe service-specific execution rules for network edge intermediaries under the Open Pluggable Edge Services (OPES) framework. This memo discusses the need for OPES framework to have different sub-systems in different deployment scenario, and proposes additions to IRML for a more flexible approach to supporting different sub-systems. Section 2 presents the motivation behind having sub-systems support in IRML. Section 3 proposes a set of QoS extension to the 'property' element defined in the IRML, and Section 4 presents some examples illustrating possible use of these extensions." See the revised proposed IRML DTD. [cache]

  • [September 10, 2001] "Web Services Spells Changing Tide for Systems Integration." By Mark Jones, Ed Scannell, Tom Sullivan, Brian Fonseca, and Eugene Grygo. In InfoWorld Volume 23, Issue 37 (September 7, 2001), pages 21, 24. "Emerging Web services pose a unique challenge to the likes of HP, Compaq, and IBM Global Services, companies keenly aware that sustainable revenue growth is tied to their IT services capabilities. Driving the revenue shift is an understanding that new methods of application integration dovetail with the generally understood definition of Web services: Loosely coupled software components are delivered over the Internet via standards-based technologies such as XML and SOAP (Simple Object Access Protocol). As a result, Web services represent a new component architecture for building and distributing applications and facilitating the integration process. The challenge, in the view of some observers, is that as systems integrators look at how they can deliver Web services, they must adapt to the revenue shift by offering higher value-added services such as business process management. A study released in late August by Jupiter Media Metrix reflects more than pure enthusiasm for Web services. Jupiter states that 60 percent of business executives interviewed plan to deploy Web services for integrating internal applications during the next year. Also, a recent Gartner report states that through the second half of 2002, 75 percent of enterprises with more than $100 million in revenue will interface periodically with Web services. But despite the lofty claims that Web services promotes value-added business, industry participants agree that converting the dream to reality will not be a walk in the park -- particularly given that the concept of prebundled software components is not a new idea... 
Based on findings from its Web services report, Jupiter argues Web services in reality will not enable companies to sell computational services to parties they might not have prior relationships with. Obstacles include inertia around existing, comfortable relationships and the need for proven security and trust payment models; it will take years to open up the promise of new Web services business channels... When will systems integrators, and enterprise customers in turn, really start to feel the changes brought about by Web services? Estimates vary widely. Some executives say within 12 months, others talk in terms of the next five years. Perhaps the next important signpost will come during the second half of 2002, when analysts say Web services technology will mature to the point that enterprise application vendors will be rearchitecting all of their software around common standards..."

  • [September 10, 2001] "Let Your DOM Do The Walking. A Look at the DOM Traversal Module." By Brett McLaughlin (Enhydra strategist, Lutris Technologies). From IBM developerWorks. August 2001. "The Document Object Model (DOM) offers useful modules to extend its core functionality in advanced ways. This article examines the DOM Traversal module in depth, showing how to find out if this module is supported in your parser and how to use it to walk either sets of selected nodes or the entire DOM tree. You'll come away from this article with a thorough understanding of DOM Traversal, and a powerful new tool in your Java and XML programming kit. Eight sample code listings demonstrate the techniques. If you have done much XML processing during the last three years, you've almost certainly come across the Document Object Model, or DOM for short. This object model represents an XML document in your application, and it provides a simple way to read XML data and to write and change data within an existing document (see Resources for more background if you're new to the DOM). If you're on your way to being an XML guru, you've probably learned the DOM backward and forward, and you know how to use almost every method that it offers. However, there is a lot more to the DOM than most developers realize. Most developers actually have experience with the core of the DOM. That means the specification that outlines what represents the DOM, how it should operate, what methods it makes available, and so forth. Even experienced developers do not have much knowledge or understanding of the variety of extra DOM modules that are available. These modules allow developers to work more efficiently with trees, deal with ranges of nodes at the same time, operate upon HTML or CSS pages, and more, all with an ease not possible using just the core DOM specification. 
Over the next few months, I plan articles to detail several of the modules, including the HTML module, the Range module, and, in this article, the Traversal module. Moving through DOM trees in a filtered way makes it easy to look for elements, attributes, text, and other DOM structures. You should also be able to write more efficient, better organized code using the DOM Traversal module. Learning to use DOM Traversal, you'll see how quickly it can move throughout a DOM tree, build custom object filters to easily find the data you want, and walk a DOM tree more easily than ever. I'll also introduce you to a utility that lets you check your parser of choice for specific DOM module support, and along the way I'll manage to throw a lot of other sample code in as well..." See: "W3C Document Object Model (DOM)."
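
For readers unfamiliar with the module, the following is a minimal sketch of the two ideas the abstract mentions: checking a parser for Traversal support and walking a tree through a filter. It is not taken from the article's eight listings. The class name TraversalSketch, the method names, and the skipName parameter are illustrative assumptions; the DOM calls themselves (hasFeature, createNodeIterator, NodeFilter) are from the standard org.w3c.dom and org.w3c.dom.traversal packages, and the sketch assumes the parser bundled with the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical helper class, written for illustration only.
public class TraversalSketch {

    // Asks the parser's DOM implementation whether it reports
    // support for the DOM Level 2 Traversal module.
    static boolean supportsTraversal(Document doc) {
        return doc.getImplementation().hasFeature("Traversal", "2.0");
    }

    // Collects element names in document order, skipping any element
    // whose name equals skipName (pass null to keep every element).
    static List<String> elementNames(String xml, String skipName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        if (!supportsTraversal(doc) || !(doc instanceof DocumentTraversal)) {
            throw new UnsupportedOperationException(
                    "DOM Traversal is not supported by this parser");
        }

        // Custom filter: FILTER_SKIP drops the node itself but still
        // lets the iterator visit that node's children.
        NodeFilter filter = node ->
                node.getNodeName().equals(skipName)
                        ? NodeFilter.FILTER_SKIP
                        : NodeFilter.FILTER_ACCEPT;

        // SHOW_ELEMENT restricts the walk to element nodes, so text,
        // comments, and other node types never reach the filter.
        NodeIterator it = ((DocumentTraversal) doc).createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, filter, true);

        List<String> names = new ArrayList<>();
        for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
            names.add(n.getNodeName());
        }
        it.detach(); // release the iterator once the walk is done
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b>text</b><c><d/></c></a>";
        System.out.println(elementNames(xml, null));
        System.out.println(elementNames(xml, "c"));
    }
}
```

A NodeIterator presents the selected nodes as a flat sequence; the same module also defines TreeWalker, which keeps the tree structure visible while applying the same kind of filter.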

  • [September 07, 2001] "Markup Languages: Comparison and Examples." By Yolanda Gil and Varun Ratnakar (USC/Information Sciences Institute, TRELLIS project). 2001-09-07 or later. ['We are making available a comparison table that we created to understand the tradeoffs and differences among markup languages along common knowledge representation requirements. It compares XML Schema, RDF Schema, and DAML+OIL. For each dimension of comparison, the table includes a description of how each language handles that issue and hyperlinks to examples.'] "Below is a comparison table that we created to understand the tradeoffs and differences among markup languages. It compares XML (Extensible Markup Language), RDF (Resource Description Framework), and DAML (DARPA Agent Markup Language) by showing a description and examples of how each language addresses common knowledge representation requirements. We are preparing an article describing this comparison in detail. If you have any comments or suggestions, please email them to us. Our interest in markup languages stems from: (1) Research on TRELLIS, a framework to help users create semantically annotated traces and rationale for their decisions. A prototype of TRELLIS has just been released and you can try it here. (2) Our research on PHOSPHORUS, an ontology-based agent matchmaker that the ISI Electric Elves framework uses to support human organizations..." See "DARPA Agent Mark Up Language (DAML)."

  • [September 04, 2001] "Software Component Certification." By John Morris, Gareth Lee, Kris Parker, and Gary A. Bundell (University of Western Australia); Chiou Peng Lam (Murdoch University). In IEEE Computer Volume 34, Number 9 (September 2001), pages 30-36. "Most current process-based methods for certifying software require software publishers to 'take oaths concerning which development standards and processes they will use.' Jeffrey Voas, among others, has suggested that independent agencies -- software certification laboratories (SCLs) -- should take on a product certification role. The authors accept that this approach may work well for certain software distribution models, but they also observe that it cannot be applied to all software development. Third-party SCLs would add unnecessarily to the costs that small developers incur by speculating on the success of a given component. However, supplying complete test sets with components incurs little additional cost because component authors must generate the tests in the first place. Any extra effort adds value to a component because a tested component certainly offers a more marketable commodity. The authors believe that while SCLs have a place in large or safety-critical software projects, there will always be small commercial-software developments for which failure represents a moderate cost. In such cases, the cost of generating and inspecting tests can be justified... If developers are to supply test sets to purchasers, they will need a standard, portable way of specifying tests so that a component user can assess how much testing the component has undergone. Potential customers can then make an informed judgment about the likely risk of the component failing in their application, keeping in mind the nature of the tests and the intended application. 
To fill this role, we designed a test specification that aims to be (1) standard and portable; (2) simple and easy to learn; (3) devoid of language-specific features; (4) equally able to work with object-oriented systems, simple functions, and complex components such as distributed objects or Enterprise JavaBeans; (5) efficient at handling the repetitive nature of many test sets; (6) capable of offering widely available and easily produced test-generation tools that do not require proprietary software; (7) free of proprietary-software requirements for interpreting and running the tests; and (8) able to support regression testing. We based our test pattern document format on the W3C's Extensible Markup Language, which satisfies most of our requirements. XML is a widely adopted general-purpose markup language for representing hierarchical data items. We have defined an XML grammar, specialized for representing test specifications, published in the form of a document type definition (DTD) that can be downloaded from our Web site. XML is well suited to representing test specifications because it adheres to a standard developed by an independent organization responsible for several other widely accepted standards. It has achieved broad acceptance across the industry, leading to the development of editors and parsers for a variety of platforms and operating systems. Further, XML's developers designed the language to provide structured documents, which support our test specifications well. XML documents -- laid out with some simple rules -- can be read and interpreted easily. Several readily available editors make understanding the language easier by highlighting its structure and providing various logical views. To keep the test specification simple and easy to use, we defined a minimal number of elements for it. Rather than adding elements to support high-level requirements, we allow testers to write helper classes in the language of the system they are testing. 
This approach gives testers all the power of a programming language they presumably already know and avoids forcing them to learn an additional language solely for testing... The specification uses the terminology of object-oriented designs and targets a class's individual methods. However, it can describe test sets for functions written using non-OO languages such as C or Ada equally well. As long as a well-defined interface exists, a tester can construct MethodCall elements..." See "SCL Component Test Bed Specification."

  • [September 04, 2001] "Crouching Error, Hidden Markup." By Neville Holmes. In IEEE Computer Volume 34, Number 9 (September 2001), pages 126-128. ['Holmes compares Script, Roff, and Tex to MS Word: its lack of a versatile and visible markup language can make using Microsoft Word a nightmare -- and reflects poorly on our profession.'] "... Word lets a user load and save documents with markup codes for formats such as Hypertext Markup Language (HTML) and Rich Text Format (RTF) -- but must hide some kind of markup language beneath its own fancy façade. What Word lacks, however, is an overt means for formally marking up plain text while developing the document. I get the impression that Word's developers add formatting features impulsively, without the unifying philosophy or moderating principles that an underlying plain-text markup scheme would foster. Markup conventions have a rich history. If you take a long-term view, markup conventions have been used in the data processing industry for thousands of years. Markup is conventional annotation designed to convey guidance to the user of plain text about the text's intended treatment: This guidance originally applied to how the text should be read aloud and is otherwise know