The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: January 31, 2003
XML Articles and Papers January 2003

XML General Articles and Papers: Surveys, Overviews, Presentations, Introductions, Announcements

January 2003

  • [January 31, 2003] "XML Pipelining with Ant." By Michael Fitzgerald. From XML.com January 29, 2003. ['Mike Fitzgerald on using the Ant build tool for pipelined XML processing.'] "Ant is an extensible, open-source build tool written in Java and sponsored by Apache's Jakarta project. Ant has developed into something more than just a build tool, however. It has gone beyond its predecessor make (and make's kin) to become a framework for performing an even larger variety of operations in a single step, not just compiling code or cleaning up after a build. Ant's build files are written in XML, and Ant takes advantage of XML in a variety of ways. In my opinion, Ant is a suitable if not ideal framework for XML pipelining -- that is, a framework for performing a variety of XML processing, in the desired order and in one fell swoop. The reason why I say ideal is because Ant is open, somewhat mature, reasonably stable, readily available, widely known and used, easily extensible, and already amenable to XML processing. What else could you ask for? In this article, I'll discuss the XML structures in an Ant build file, named build.xml by default, talk about some common XML-related tasks that Ant can perform, and then finish up with an example of XML pipelining... I realize that Ant was not intended to be an XML pipeline tool, but it turns out to be a pretty good one anyway. Other tools exist and may eventually do a better job, such as Sean McGrath's XPipes or Eric van der Vlist's XML Validation Interoperability Framework (XVIF). For now, though, Ant remains an attractive option. Like XML, Ant can do things that perhaps it was not originally intended to do..."

  • [January 31, 2003] "Databases and Element Names." By John E. Simpson. From XML.com January 29, 2003. ['John focuses on answering a question concerning the use of XML and databases; that is, how to map table names containing characters with "special" meaning in XML. John also investigates the relationships between numerical types in W3C XML Schema.'] "... '[my] field names do not meet the requirements for element names, so I am forced to run them through a sanitizing function before naming the element nodes. The function replaces or removes the characters offensive to XML...Unfortunately, this sanitizing process introduces the possibility that I could end up with elements with the same name, although in the database they are named differently.' A: [you might forget] about a literal mapping of database field or column names to element names. That is, you could push the database field or column names into attribute values, assigning corresponding element names either arbitrarily or according to some more or less intelligible scheme... It would then be pretty straightforward to transform this document via XSLT into either a comma-separated values text file or even an SQL statement..." Also on 'Numeric datatypes in XML Schema' - a question about the numerical datatypes long, unsignedLong, int, unsignedInt, short, unsignedShort, byte and unsignedByte. Which of these datatypes are subsets of the datatypes float and double?... In short, although WXS is the product of logical, rational minds, don't assume -- especially when considering datatypes, primitive and otherwise -- that all its principles will necessarily follow the logic of everyday common sense..."
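The suggestion in the answer above, carrying the original column name in an attribute value instead of in the element name, can be sketched in Python. This is only an illustration, not code from the column: the element names `row` and `field`, the attribute `name`, and the sample column names are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def row_to_xml(row):
    """Serialize one database row (a dict) without using column names as element names."""
    row_el = ET.Element("row")  # hypothetical element name
    for column, value in row.items():
        # The column name goes into an attribute value, where characters such as
        # spaces, '#', or '/' are legal; the element name stays a fixed, valid XML name.
        field = ET.SubElement(row_el, "field", {"name": column})
        field.text = "" if value is None else str(value)
    return row_el

# Two hypothetical columns that a sanitizer which simply strips illegal
# characters would both turn into the element name "Order".
row = {"Order #": 1017, "Order/": "rush"}
element = row_to_xml(row)
xml_text = ET.tostring(element, encoding="unicode")
names = [f.get("name") for f in element.findall("field")]
```

Because the distinct column names survive as attribute values, the collision the questioner describes does not arise, and a later transformation can still read the original names from the `name` attribute.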

  • [January 31, 2003] "XML Forms, Web Services and Apache Cocoon." By Ivelin Ivanov. From XML.com January 29, 2003. ['Ivelin Ivanov introduces Cocoon's XMLForms features, which allow a model-view-controller paradigm for web applications, helping to separate the user interface from the business logic.'] "Server side business logic is often invariant with regard to client devices. An email client supports the same basic operations whether it's used from a cellular phone, PDA, or a PC. To address the needs of web developers who build applications for a variety of devices, the W3C has formed the XForms working group. In this article we discuss the Cocoon XMLForm framework's separation of the purpose from the presentation of a form, maximizing its reusability for a variety of client devices. We also explain how this technology allows us to extend web applications to Web Services. Apache Cocoon XMLForm is aligned to a large extent with the W3C XForms standard. While XForms requires that client devices understand the XForms markup, XMLForm can be used with any browser for any client device. The trade-off for this convenience is that XMLForm lacks some of the client side features of XForms, such as events and actions... XMLForm is a middle-tier framework based on the MVC (Model-View-Controller) design, which combines the best from Jakarta Struts, W3C XForms, and Schematron... XMLForm allows developers to build and edit an XML document, called the form model or instance -- subject to constraints from some schema: WXS, Schematron, and so on -- by interacting with a series of form pages... [The article] introduces a new perspective on form handling in web applications, a technique for connecting the business logic and the UI layer, while preserving a thin line which cleanly separates them. Programmers can now focus on the implementation of the application workflow without the burden of tedious HTML coding. 
Web page authors on the other hand can work on the presentational aspects of the application without knowing how to code in Java or even run the application server. Usability experts can sketch UML activity diagrams and write the initial XML form documents. And quality assurance professionals can write regression tests against the web pages in their XML stage, rather than manually testing poorly structured HTML..." See: "XML and Forms."

  • [January 31, 2003] "IBM WebSphere Upgrade Looks Beyond J2EE. Big Blue Exec Cites Shortcomings in Java." By James Niccolai. In InfoWorld (January 30, 2003). "IBM is developing an upgrade to its WebSphere application server that aims to make it easier for companies to orchestrate transactions among groups of business applications, and to expose applications as Web services that can be used by other companies, an official said this week. The WebSphere upgrade draws on a technology being developed by vendors including IBM and Microsoft called BPEL4WS (Business Process Execution Language for Web Services), as well as capabilities being prepared for the next iteration of Sun Microsystems' J2EE (Java 2 Enterprise Edition) specification, version 1.4, said Scott Hebner, IBM's director of marketing for WebSphere. The upgrade aims in part to make it easier for developers to build and deploy applications that can be offered as services to other businesses by integrating workflow, business rules, and transaction capabilities into WebSphere. A company that has built its own retirement plan application, for example, could expose it as a Web service and make it available for use by other companies, Hebner said... Hebner acknowledged that the standards for creating Web services have yet to be finalized. Indeed, BEA is backing two technologies for choreographing Web services, BPEL4WS and WSCI (Web Service Choreography Interface), which are competing for the attention of standards bodies. IBM's decision to include J2EE features that are not yet standards marks a shift for the company that brings it more into line with other Java vendors, said Mike Gilpin, an analyst with Giga Information Group. In the past IBM has tended to wait until specifications are complete before including them, he said... Competition among software vendors is increasingly shifting towards tools, Gilpin said, as features in application servers become standardized.
BEA, for example, has already implemented much of J2EE 1.4 in its WebLogic application server, although it hasn't disclosed plans for including WSCI or BPEL4WS. While the vendors compete to expand their middleware offerings, some customers have become disillusioned with the promise of XML and with the whole idea of Web services..."

  • [January 31, 2003] "WS-I Modifies Its Basic Profile." By Darryl Taft. In eWEEK (January 31, 2003). "The Web Services Interoperability Organization Thursday announced the availability of documents to support the organization's Basic Profile 1.0, which is expected to be released in the second quarter of this year. Meanwhile, the WS-I recently voted to amend the Basic Profile to include Simple Object Access Protocol (SOAP) attachments in Version 1.1 of the profile, which the organization will release shortly after the release of the Basic Profile 1.0, WS-I officials said. In addition, WS-I officials said nominations for WS-I board membership continue to come in and that Web services security remains a key concern for enterprises. The WS-I Thursday released its Sample Application Technical Architecture, use cases and usage scenarios for the Basic Profile. The WS-I Basic Profile is a set of implementation guidelines for how a set of Web services specifications should work together to develop interoperable Web services... The documents released Thursday feature a model of a supply chain management system, said Rob Cheng, a Web services evangelist for Oracle Corp. and chairman of a WS-I committee. Cheng said the new documents will help to establish best practices for using the profile and deploying Web services in production environments. The Sample Application Technical Architecture provides a way to check SOAP messages, schema naming conventions, SOAP message styles and issues involving the Web Services Definition Language (WSDL). The use cases and usage scenarios are key in that 'they drive the other work,' Cheng said. The usage scenarios translate the use cases into technical requirements, he said. Essentially, the new documents will be used to help prove the viability and reliability of the Basic Profile 1.0. Meanwhile, WS-I officials said nominations for board seats continue to arrive. Earlier this week, webMethods Inc. and Cape Clear Software Inc. 
announced they had nominated company executives to the WS-I board..." See: (1) "WS-I Publishes Supply Chain Management Candidate Review Drafts"; (2) "Web Services Interoperability Organization (WS-I)."

  • [January 31, 2003] "An XML Format for Mail and Other Messages." By Graham Klyne (Nine by Nine). IETF Network Working Group, Internet Draft. Reference: 'draft-klyne-message-xml-00.txt'. 20-January-2003, expires July 2003. Appendix A: Message/Email+XML content-type registration; Appendix B: DTD for Email+XML message format; Appendix C: XML schema for Email+XML message format; Appendix D: RDF representation of Email+XML message; Appendix E: RDF schema for Email+XML message format. "This document describes a coding of email and other messages in XML. This coding is intended for use by XML applications that exchange information about such messages... The XML coding is designed to address the following goals: (1) to fully capture the semantics of Internet email messages, per RFC822. However it is not intended to provide a loss-less coding of RFC822 syntax. (2) to extend the scope of address information that can be conveyed to arbitrary URIs. (3) to take account of 8-bit clean transfer environments. (4) to fully support, where applicable, international character sets and languages within the message header and content. (5) to be usable in MIME and pure XML transfer environments. (6) to be fully compliant with the XML and XML namespace specifications. (7) to allow header information to be compatible with RDF format, for use by generalized metadata processing applications. [The document represents] a reissue of a previous expired draft with a name change and minimal other changes. It is expected that a number of significant changes may be made in light of more recent considerations, and the document re-issued as soon as such changes have been crystalized..."

  • [January 31, 2003] "WS-I Publishes Draft Guidance Documentation." By Stacy Cowley. In Network World (January 30, 2003). "The Web Services Interoperability Organization (WS-I), a year-old industry group that offers application development guidance, released Thursday draft versions of a series of documents intended to describe the design and deployment of a sample, standards-compliant Web services application. The drafts include a sample technical architecture, use cases and usage scenarios documentation illustrating material covered in WS-I's Basic Profile 1.0. The Basic Profile, the first project the group undertook, is an implementation guide on using a set of core standards in developing interoperable Web services. A draft version was released in October, and a final version of the Basic Profile is scheduled for release during the second quarter. The new documents released Thursday are intended to define best practices for using the Basic Profile, and to offer real-world implementation information for customers building applications for Web services, the WS-I said. Web services technology aims at creating a common infrastructure for connecting diverse applications via the Internet, allowing heterogeneous IT systems to interact with each other. While other groups oversee the standards involved, WS-I focuses on assisting developers with the practical work of creating interoperable software..." See details in the 2003-01-31 news item "WS-I Publishes Supply Chain Management Candidate Review Drafts" and general references in "Web Services Interoperability Organization (WS-I)."

  • [January 30, 2003] "X+V 1.1 -- XHTML+Voice. A Multimodal Markup Language." By Jonny Axelsson (Opera Software), Chris Cross (IBM), Håkon W. Lie (Opera Software), Gerald McCobb (IBM), T. V. Raman (IBM), and Les Wilson (IBM). From IBM developerWorks, XML zone. January 2003. XHTML+Voice Profile 1.1 referenced is 49 pages (PDF). ['Look at the XHTML + Voice specification. X+V brings spoken interaction to standard WWW content by integrating a set of mature WWW technologies such as XHTML and XML Events with XML vocabularies developed as part of the W3C Speech Interface Framework. X+V brings together voice modules that support speech synthesis, speech dialogs, command and control, speech grammars, and the ability to attach Voice handlers for responding to specific DOM events, thereby re-using the event model familiar to web developers. Voice interaction features are integrated directly with XHTML and CSS, and can consequently be used directly within XHTML content.'] "X+V is designed for Web clients that support visual and spoken interaction. To this end, this document first re-formulates VoiceXML 2.0 as a collection of modules. These modules, along with Speech Synthesis Markup Language and Speech Recognition Grammar Specification are then integrated with XHTML using XHTML modularization. Finally, we integrate the result with module XML-Events so that voice handlers can be invoked through a standard DOM2 EventListener interface to create an X+V multimodal framework. You can download the full document of the X+V 1.1 -- XHTML+Voice [28-January-2003] specification from developerWorks..." Note: an earlier version of the document is "XHTML+Voice Profile 1.0," W3C Note 21-December-2001. [cache]

  • [January 30, 2003] "Human-Facing Web Services, Part 3. Build Portals with WSRP." By Judith M. Myerson (Systems Architect and Engineer). From IBM developerWorks, Web services. January 2003. ['In the first two articles in this series, Judith Myerson examined business users' collective viewpoints on how Web pages and remote portals should be presented, and looked at how the WSIA specifications can be used to build human-facing applications. In this third installment, you'll learn how you can use Web Services for Remote Portals (WSRP) to extend the functionalities of the WSXL component services. You'll see sample code that demonstrates how to aggregate interactive applications into a single portal using one standard adapter for different interfaces and protocols.'] "Web Services for Remote Portals (WSRP) is a standard for XML and Web services that allows the interactive, human-facing Web services to be plugged into portals with a minimum of fuss. These services can be published, found, and bound in a standard way. In the days before the advent of WSRP, vendors often wrote special adapters to accommodate different interfaces and protocols and integrate applications into a single portal, which created a confusing environment for developers. In January 2002, the Organization for the Advancement of Structured Information Standards (OASIS) formed the WSRP Technical Committee as an effort to standardize an adapter for these vendors. In that same month, OASIS also formed the Web Services Component Model Technical Committee; it aimed to create a standard component model under which developers could put together visual presentation and portal components. In May 2002, OASIS changed the name of this group to the Web Services for Interactive Applications Technical Committee (WSIA TC) to better describe the purpose of its work. The renamed committee has broadened its focus from the appearance of applications to the applications' complete interactive, human-facing experience. 
On September 30, 2002, the WSIA and WSRP technical committees jointly announced the WSIA-WSRP Core Specification, Working Draft 0.7. This document proposed a standard adapter that vendors can employ to mix, match, and reuse human-facing interactive Web service applications from different sources... WSRP lets you use an adapter code you can plug in to applications from any Remote Portlet Web Service. With this standard, you can implement a Remote Portlet Web service as a Java/J2EE-based Web service, a Web service implemented on the Microsoft .Net platform, or as a portlet published by a portal. To help applications, clients, and vendors to discover and display your Remote Portlet Web Services, you can publish them into public or corporate service directories (that is, UDDI). Remote Portlet Web Services are the key to portals, since they can be consumed by intermediary applications across platforms. With these remote Web services, during an operation a portal can: (1) Get information from data sources; (2) Aggregate information into composite pages; (3) Provide personalized information to users in an interactive fashion... WSRP provides a single adapter for different interfaces and protocols for human-facing Web services. The recent WSIA-WSRP Core Specification is indicative of the trend toward user control standards for human-facing interactive Web services. These standards, for example, could aim at the behaviors (such as color or font size) that users might want to control in a standardized way..." Also in PDF format. See: "Web Services for Remote Portals (WSRP)."

  • [January 30, 2003] "The MIME application/vnd.cip4-jdf+xml Content-Type." By Tom Hastings (Xerox Corporation) and Ira McDonald (High North Inc). IETF Internet Draft. Reference: 'draft-mcdonald-cip4-jdf-mime-00.txt'. 25-January-2003, expires 25-July-2003. "The International Cooperation for the Integration of Processes in Prepress, Press, and Postpress (CIP4) is an international worldwide standards body located in Switzerland. The purpose of CIP4 is to encourage computer based integration of all processes that have to be considered in the graphic arts industry. CIP4 has defined two document formats that are encoded in W3C Extensible Markup Language (XML): (1) CIP4 Job Definition Format (JDF) -- an open standard for integration of all computer aided business and production processes around print media; (2) CIP4 Job Messaging Format (JMF) - an open standard for job messaging using Hyper Text Transport Protocol/1.1 (HTTP/1.1, RFC2616) that defines Query, Command, Response, Acknowledge, and Signal message families. This document defines two new MIME sub-types for IANA registration: (1) application/vnd.cip4-jdf+xml for CIP4 Job Definition Format; (2) application/vnd.cip4-jmf+xml for CIP4 Job Messaging Format..." [JDF: "JDF is simply an exchange format for instructions and job parameters. You can use PDF, or its standard variant (PDF/X), to relay production files from one platform to another. You can do the same with JDF to relay job parameters and instructions. JDF can be used to describe a printing job logically, as you would in exchanging a job description with a client within an estimate. It can also be used to describe a job in terms of individual production processes and the materials or other process inputs required to complete a job."] See: (1) JDF Specification Release 1.1, Revision A; (2) general references in "Job Definition Format (JDF)." [cache]

  • [January 30, 2003] "XML Watch: Have Data, Will Travel. Using SyncML to Mobilize Your Data." By Edd Dumbill (Editor and publisher, xmlhack.com). From IBM developerWorks, XML zone. January 2003. ['In his continuing quest to make his data available wherever and whenever he wants it, XML developer Edd Dumbill sets out on a journey to investigate and deploy SyncML.'] "The arrival of XML, along with the acknowledgement of the usefulness of open standards, has begun to liberate data from the confines of single applications... There's already solid agreement on standards for certain common data items like calendars and contact lists, but an unfortunate lack of convenient ways to transmit such data. This is where SyncML comes into the equation. SyncML, an XML-based protocol for synchronizing data, is enjoying a surge in popularity in the latest batch of mobile devices. Even with current synchronization technology, it's hard to keep contacts and schedules synchronized across my Palm Pilot, desktop PC, laptop, and mobile phone. So hard, in fact, that I've given up trying -- much to my frustration. I'm desperately tired of trying to remember to carry my PDA around with me as well as my cell phone just to have my contact lists available. SyncML seems like a great opportunity to solve this problem, and as it will work over Wireless Application Protocol (WAP), I can synchronize with a remote server wherever I am. Unfortunately there don't really seem to be any consumer-facing SyncML products, and only a small amount of supporting open source code available. So I set out to see what was involved in implementing SyncML, with the intentions of integrating my cell phone and my personal information management software, then releasing the code. 
These next few installments of this column will follow my efforts, focusing particularly on where XML technologies enter into the picture. The goal of this exploration is to create a basic SyncML server component that can be deployed either on a Web server or on an OBEX server (for instance, on a Bluetooth-enabled computer). We've already learned that in addition to implementing the semantics of the SyncML language, we'll need to be able to handle WBXML as well as XML... The name SyncML is somewhat misleading. While it is indeed an XML-based markup language, it's not just a data format. It's really a protocol that provides the structure for agents to synchronize data with each other, by defining the permissible exchanges and determining how they are to be interpreted. ... Because mobile devices have limited memory and processing capacity, their manufacturers created an XML-like binary meta language called Wireless Binary XML (WBXML). The basic idea behind WBXML is that by taking advantage of foreknowledge of the DTDs, you can minimize the tags to one byte. The tradeoff, as you can see, is it loses some of its readability. I was surprised to find out that many XML developers had not come across the WBXML specification before. This is all the more peculiar as there are often questions in developer forums asking about binary encodings for XML. WBXML is most commonly deployed as the encoding for WML pages, as delivered to the WAP browsers on mobile phones. In fact, the SyncML specifications deal with both XML and WBXML encodings of the protocol. SyncML is intended for use over any device, but to support mobile phone class devices, a SyncML server must be able to send and receive WBXML in addition to XML..." See also "SyncML intensive, A beginner's look at the SyncML protocol and procedures," by Chandandeep Pabla. See references in "The SyncML Initiative."
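The idea quoted above, that foreknowledge of the tag vocabulary lets each tag shrink to one byte, can be sketched with a toy encoder in Python. This is explicitly not the WBXML wire format and not code from the column: the tag-to-byte table, the `END` and `TEXT` marker values, and the length-prefixed text layout are all hypothetical choices made for the illustration.

```python
import xml.etree.ElementTree as ET

# Toy tag-to-byte table agreed on in advance by sender and receiver.
# Hypothetical values, NOT the token assignments of the WBXML or SyncML specifications.
TAG_CODES = {"SyncML": 0x05, "SyncHdr": 0x06, "SyncBody": 0x07}
END = 0x01   # hypothetical "end of element" marker
TEXT = 0x03  # hypothetical "length-prefixed text follows" marker

def encode(element):
    """Encode an element tree using one byte per tag name (toy scheme)."""
    out = bytearray([TAG_CODES[element.tag]])
    if element.text and element.text.strip():
        data = element.text.strip().encode("utf-8")
        # One length byte limits text to 255 bytes; enough for a sketch.
        out += bytes([TEXT, len(data)]) + data
    for child in element:
        out += encode(child)
    out.append(END)
    return bytes(out)

doc = "<SyncML><SyncHdr>1.1</SyncHdr><SyncBody/></SyncML>"
binary = encode(ET.fromstring(doc))
```

The readable 50-character document becomes 11 opaque bytes here, which mirrors the tradeoff the column describes: smaller messages at the cost of readability, and a decoder that only works if it shares the same tag table.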

  • [January 30, 2003] "XML in Java Data binding, Part 2: Performance. After kicking the tires in Part 1, take data binding frameworks out for a test drive." By Dennis M. Sosnoski (President, Sosnoski Software Solutions, Inc). From IBM developerWorks, XML zone. January 2003. ['Enterprise Java expert Dennis Sosnoski checks out the speed and memory usage of several frameworks for XML data binding in Java. These include all the code generation approaches discussed in Part 1, the Castor mapped binding approach discussed in an earlier article, and a surprise new entry in the race. If you're working with XML in your Java applications you'll want to learn how these data binding approaches stack up!'] "Part 1 provides background on why you'd want to use data binding for XML, along with an overview of the available Java frameworks for data binding. If you haven't already read Part 1, you'll probably want to at least glance over it now. In this part I'm going straight to the issue of performance without further discussion of the whys and hows... This look at data binding performance shows some interesting results, but doesn't fundamentally change the recommendations from Part 1. Castor provides the best current support for data binding using code generation from W3C XML Schema definitions. Its unmarshalling performance is weak compared to other alternatives, but it does give good memory utilization and a fairly fast startup time. The Castor developers say that they plan to focus on performance issues prior to their 1.0 release, so you may also see some improvement in the unmarshalling performance by then. JAXB still looks like a good choice for the code generation approach in the future (the beta license only allows evaluation use). The current reference implementation beta is both bulky in terms of jar size and somewhat inefficient in terms of memory usage, but here again you may see better performance in the future. 
As of this writing, the current version is still a beta, and even after it's released commercial or open source projects may improve performance over the reference implementation. Since it will be a standard part of the J2EE platform, JAXB is definitely going to play an important role in working with XML in Java. The performance results also confirm the use of JBind, Quick, and Zeus as most appropriate for applications with special requirements rather than for general usage. JBind's XML Code approach can provide a great basis for an application built around processing of an XML document, but the performance of the current implementation is liable to be a problem. Quick and Zeus offer code generation from DTDs, but as I mentioned in Part 1, it's generally pretty easy to convert DTDs to Schemas. On the downside, Quick seems overly complex to use and Zeus supports only Strings for bound data values (no primitives or object references using ID-IDREF or an equivalent). For mapped approaches to data binding, Castor has the advantage of a fairly stable implementation and substantial real-world usage. Quick can be used for this type of binding as well, but again seems complex to set up. JiBX is new and not yet in full usage, but offers excellent performance along with a high degree of flexibility..." See also "XML in Java. Data Binding, Part 1: Code Generation Approaches -- JAXB and More. Generating Data Classes from DTDs or Schemas."

  • [January 30, 2003] "Tip: SAX and document order. Tracking parent-child relationships. Indices help in building applications that need to navigate through XML trees." By Howard Katz (Proprietor, Fatdog Software). From IBM developerWorks, XML zone. January 2003. ['The tips in this series explore the concept of document order and the use of so-called document order indices in SAX. This tip looks at the use of DOIs in modeling parent-child relationships in XML documents. Such DOI representations of document hierarchy are useful in building applications, such as DOMs and query engines, that need to navigate through XML trees.'] "The previous tip in this series introduced the concept of document order indices, or DOIs. DOIs are simply integers that represent the document ordering of nodes in an XML document in a convenient and compact form... While this tip doesn't go into the internals of any particular search-engine implementation, it does show how to provide the engine and other clients with sufficient information on parent-child relationships between elements and other nodes to enable it to resolve, for example, XPath location path queries. That resolution, in turn, requires the engine to be able to navigate its way around an XML tree. Parent and child pointers provide the highways and traffic signs, if you'll allow the poetic license..." See also 'SAX and document order,' which explains what document order is and why it's useful, and presents some simple SAX code that shows a practical implementation of DOIs in a search engine application.
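The notion of document order indices with parent tracking can be illustrated with a small SAX handler. This is a minimal Python sketch rather than the tip's own code (which is not reproduced here): the class name `DoiHandler`, the tuple layout, and the decision to number only elements (not text or attribute nodes) are assumptions.

```python
import xml.sax

class DoiHandler(xml.sax.ContentHandler):
    """Assign each element an integer in document order and record its parent's integer."""

    def __init__(self):
        super().__init__()
        self.next_doi = 0
        self.stack = []  # DOIs of the elements that are currently open
        self.nodes = []  # (doi, element name, parent doi or None)

    def startElement(self, name, attrs):
        doi = self.next_doi
        self.next_doi += 1
        # The innermost open element, if any, is this element's parent.
        parent = self.stack[-1] if self.stack else None
        self.nodes.append((doi, name, parent))
        self.stack.append(doi)

    def endElement(self, name):
        self.stack.pop()

handler = DoiHandler()
xml.sax.parseString(b"<a><b><c/></b><d/></a>", handler)
children_of_root = [name for doi, name, parent in handler.nodes if parent == 0]
```

Since start-element events arrive in document order, a simple counter yields the indices, and the stack of open elements supplies the parent pointer without building a full tree.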

  • [January 29, 2003] "WS-I Members Take Stand Against 'Big-Name Bias'." By Gavin Clarke [ComputerWire]. In The Register (January 29, 2003). "Small and medium sized ISVs are vying to lead an IBM and Microsoft Corp-backed web services organization, amid sentiment the group's direction is being misdirected by big-name vendors, Gavin Clarke writes... webMethods Inc and Cape Clear Software Inc told ComputerWire yesterday they will stand for election to the board of the Web Services Interoperability (WS-I) organization, in the hope of making the board more representative of common members' interests. The vendors are the first companies to be named as candidates as the WS-I has refused to release details, saying its constitution does not require disclosure. To date only Sun Microsystems Inc has been named as a potential candidate for elections, due in March. webMethods Inc and Cape Clear spoke as it emerged yesterday that WS-I has agreed to add support for Simple Object Access Protocol (SOAP) attachments to its first major piece of published work, the Basic Profile 1.0 currently in public draft. Support for SOAP attachments would ensure a standards-based approach is taken in the Basic Profile for adding binary attachments, such as JPEG files, to SOAP messages. Failure to include SOAP attachments means files must, instead, be encoded in the main SOAP message by a sender and then de-coded by the recipient in a process that reduces the potential efficiency of web service-based communications... Prasad Yendluri, co-editor of WS-I's Basic Profile and webMethods' principal architect, said, though, the WS-I yesterday approved inclusion of SOAP attachments in an incremental release of Basic Profile, version 1.1, to avoid impacting delivery of 1.0. Yendluri said version 1.1 would be published 'soon after' version 1.0. He added SOAP attachments were the subject of early debate but confirmed these were initially discarded from version 1.0.
'There has been a recent re-consideration,' he said citing member feedback and evolution of version 1.2 of the World Wide Web Consortium's (W3C's) underlying SOAP specification. The issue, though, is far from resolved for small- and medium-size companies who constitute the bulk of WS-I's membership and clearly feel that their interests are not being properly represented by the board. The WS-I's nine-member board comprises Accenture, BEA Systems Inc, Fujitsu-Siemens, Hewlett-Packard Co, Intel Corp, IBM, Microsoft, Oracle Corp and SAP AG..." General references in "Web Services Interoperability Organization (WS-I)."
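The efficiency cost of encoding a binary file inside the message body, as described in the excerpt, can be made concrete. The sketch below assumes base64 as the inline encoding (the article does not name one) and uses a hypothetical `attachment` element rather than any real SOAP structure; the function names are likewise illustrative.

```python
import base64
import xml.etree.ElementTree as ET


def inline_binary(payload: bytes) -> str:
    """Sender side: embed binary data as base64 text in an XML element."""
    element = ET.Element("attachment")  # hypothetical element name
    element.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(element, encoding="unicode")


def extract_binary(xml_text: str) -> bytes:
    """Recipient side: decode the base64 text back into the original bytes."""
    element = ET.fromstring(xml_text)
    return base64.b64decode(element.text)


def size_ratio(payload: bytes) -> float:
    """Encoded size divided by raw size; base64 emits 4 characters per 3 bytes."""
    return len(base64.b64encode(payload)) / len(payload)
```

Under this assumption the inline form is about one third larger than the raw file, and both sides spend time encoding and decoding, which is the overhead a true attachment mechanism avoids.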

  • [January 29, 2003] "Dispute Could Silence VoiceXML." By Paul Festa. In ZDNet News (January 29, 2003). "The Web's leading standards group called on developers to implement its nearly finished specification for bringing voice interaction to Web sites and applications. But the intellectual property claims of a handful of contributors, including Philips Electronics and Rutgers University, threaten to keep the specification tied up in negotiations, the standards body warned. The World Wide Web Consortium (W3C) on Tuesday issued VoiceXML 2.0 as a candidate recommendation, the penultimate stage in the consortium's approval process. The job of VoiceXML -- part of the W3C's Voice Browser Activity -- is to let people interact with Web content and applications using natural and synthetic speech, other kinds of prerecorded audio, and touch-tone keypads. In addition to adding speech as a mode of interaction for everyday Web surfing, the W3C has its eye on other applications. These include the use of speech for the visually impaired and for people accessing the Web while driving. The group called VoiceXML a central part of its work on voice-computer interaction. 'The VoiceXML language is the cornerstone of what we call the W3C speech interface framework -- a collection of interrelated languages that are used to create speech applications,' said Jim Larson, co-chair of the W3C's voice browser working group and manager of advanced human I/O (input/output) at Intel. 'Using these types of applications, the computer can ask questions and the user can respond using words and phrases or by touching the buttons on their touch-tone phone'... Other W3C specifications control individual pieces of the voice-browsing puzzle. The Speech Synthesis Markup Language (SSML), for example, describes how the computer pronounces words, with attention to voice inflection, volume and speed. 
The Speech Recognition Grammar Specification (SRGS) establishes what a user must say in response to a computer prompt. And the Semantic Interpretation for Speech Recognition (Semantic Interpretation) strips down text and translates it to a form that the computer can understand..." See "VoiceXML Forum" and the 2003-01-29 news item "W3C Advances VoiceXML Version 2.0 to Candidate Recommendation Status."

  • [January 29, 2003] "KaVaDo Secures Web Services. Security Software Bundle Protects at the Application Layer." By Logan G. Harbaugh. In InfoWorld (January 24, 2003). ['Any organization providing any sort of interactive Web services is vulnerable to application-layer exploits and hacks. KaVaDo's suite of applications can find and protect against these vulnerabilities. Given the potential costs associated with not only down time, but loss of proprietary data, customer data, site defacement, or malicious alteration of data, any organization should be investigating application-layer protection... InterDo intercepts HTTP, SOAP, WSDL, and WebDAV traffic and looks for unauthorized attempts to attack the applications using the data. ScanDo finds existing vulnerabilities in your Web site, making it simpler to set up InterDo.'] "... Most managers may think malicious hackers penetrate a system by exploiting a weakness in the operating system to gain a password. But it is equally feasible for that hacker to use a standard HTTP, SOAP, or XML request, or an intentionally altered HTML document, to retrieve private data, to add or delete files on the server, or to take other equally unwanted actions by attacking via a published Web service. Protecting servers at the application layer is the only way to address these security issues. KaVaDo has three products that operate to protect any Web application, be it a site, service, or server: InterDo, ScanDo, and AutoPolicy. InterDo functions as a firewall, with either two NICs routing traffic between a trusted and a public network or with one NIC operating as a proxy server. The application parses HTTP, WebDAV, WSDL, SOAP, and XML requests, looking for and denying requests that are malformed or that ask for data that shouldn't be accessed. InterDo comes in two flavors: Enterprise Edition, which protects any number of servers or applications in an enterprise, and Business Edition, which protects one Web server or application server... 
ScanDo finds and InterDo protects against numerous threats including: unauthorized SQL commands; invalid application parameters; invalid or altered cookies; exploits of known vulnerabilities in Web servers, database products, or operating systems; altered SOAP or Web services messages; invalid characters in messages; HTTP exploits; unauthorized file uploads; modified application or network protocols; buffer overflow attacks; and requests that use unauthorized data encoding..." See the KaVaDo website.

  • [January 29, 2003] "Adobe Ramps Up Documents." By John Taschek. In eWEEK (January 27, 2003). ['Document Server takes electronic publishing to the next level -- as long as customers use a bevy of Adobe products and can pay for integrating the product with back-end systems.'] "...Adobe Systems Inc.'s Document Server, which shipped last month, will further blur that definition while making the document a more powerful medium. Document Server is a $20,000-per-CPU server that dynamically creates documents that can be signed, filled out as a form or simply read. In short, the things that Document Server can do almost instantly might take weeks or months to do without it. On the downside, however, organizations that use Document Server have a steep (but quick) learning curve to overcome, and they'll be facing unfamiliar territory with new standards, such as XSL-FO (Extensible Style Sheets-Formatting Objects). They'll also likely be integrating Document Server by themselves, because the product is new, and very few of their peers have practical experience with it. With Document Server, Adobe is leveraging the popularity of its PDF file type. Adobe officials claim more than 300 million copies of Acrobat Reader have been downloaded or distributed, and hundreds of millions of documents are stored as PDFs. At its core, Document Server is a superset of Adobe's Graphic Server, formerly named Altercast. Graphic Server concentrates on distributing and assembling graphics, but Document Server digests most Adobe formats as well as some standard files, assembles the documents and spits out PDFs. Document Server works equally well with forms and regular documents. It also works well with SVG (Scalable Vector Graphics), a World Wide Web Consortium standard for defining electronic documents through XML. The advantage of SVG over graphics stored in JPEG and PNG formats is that the graphics are scalable and readable by any device with an appropriate reader.
Regardless, most electronic forms will be based on XSL-FO. Document Server, however, works best when used with Adobe's product line, including FrameMaker, Photoshop and Illustrator. Organizations that haven't standardized on Adobe products may not find Document Server as compelling. We tested Document Server on a single-processor Windows XP system. We created Illustrator files, borrowed Photoshop graphics and created a set of FrameMaker templates... Smaller organizations may be able to get by using Adobe Acrobat, which requires the document creator to include an interactive form with the full version of Acrobat. There are also open-source initiatives that are just beginning to emerge. Because the standards are being set for the format and structure of documents, it might be only a matter of time before there are multiple competitors to Adobe, including the OpenJade project. For now, however, Adobe is at least two years ahead of its competitors..." See "Enhanced Adobe Document Servers Support XML-Based Workflow and Digital Signature Facilities."

  • [January 28, 2003] "Setting a Standard. [eWEEK Labs.]" By Cameron Sturdevant. In eWEEK (January 27, 2003). "Standards play a huge role in enterprise applications -- as well as the decision companies make to use a product or not -- and that role will only get bigger as Web services gain momentum. However, as vendor-led consortia increasingly define the standards that shape enterprise products, it's getting tougher to separate standards from vendor politics, posturing and power. But the worst thing enterprise IT managers could do is to simply sit by and watch it all happen. Indeed, there are a couple of reasons why the time is right to band together with suppliers, customers -- even competitors -- and dive into the standards process. First, XML has reached the 'recommendation' stage, the highest level of approval from the World Wide Web Consortium. This means that XML -- arguably one of the most important standards to come along in the past five years -- has stabilized to the point that industry-specific schemas can be developed without fear that drastic changes will convulse the foundation of the W3C's work. Second, IT managers still have a chance to make a significant impact on the groups that are trying to wrestle Web services to the ground. During discussions with industry leaders such as Tim Bray, co-inventor of XML and founder and chief technology officer at Antarctica Systems Inc., eWEEK Labs was pleased to find a desire for feedback from potential customers... Inviting as some groups may be, however, the standards world is small and populated with its fair share of technocrats and sharp operators. But participation can be extremely beneficial for IT managers who judiciously allocate staff members to participate in the groups. Engineers from the biggest IT vendors routinely participate in standards groups, and face-to-face networking with these people can yield blunt and sometimes priceless answers and advice on strategic IT projects...
the barriers to participation are significant with most standards groups. These include a commitment of several hours per week just to read and respond to e-mail, not to mention attending as many as six face-to-face meetings a year. Finally, most standards groups support themselves on membership dues that range from nothing at the IETF (Internet Engineering Task Force) to $50,000 or more per year... Understanding how standards bodies work is one of the keys to determining which standards to follow. We chose for this report the IEEE, IETF, W3C and OASIS (Organization for the Advancement of Structured Information Standards) organizations because they cover the gamut -- from open membership to exclusive vendor groups, from long-standing names familiar to everyone in IT to relatively new formations..."

  • [January 28, 2003] "Berners-Lee: Keeping Faith. [eWEEK Labs.]" By Anne Chen and Tim Berners-Lee. In eWEEK (January 27, 2003). ['The World Wide Web Consortium is a driving force behind the Web's interoperability and evolution. Founded in 1994 by Web inventor Tim Berners-Lee, the group has 450 member organizations worldwide. Since its inception, the W3C has developed standards such as XML and P3P. eWEEK Labs Senior Writer Anne Chen recently spoke with Berners-Lee, in Cambridge, Mass., about the changing role of standards bodies and how enterprise IT organizations should participate in the process.'] "eWEEK: What are the biggest issues facing standards bodies today? TBL: Intellectual property rights are a much bigger issue [today]. There's a fear of patents... Companies are realizing they have to formally tackle the issues of making standards royalty-free... There's a general global shift toward the realization that royalty-free standards are the only standards that can support Internet technology. There is a lot of fussing around... in general, though, the shift over the last two years is definitely toward royalty-free standards... eWEEK: What advice can you offer enterprise IT organizations that want to get involved in standards work? TBL: I'd say beware of organizations that might look like a standards body but are controlled by a vendor. Very often, they're more or less set up as a marketing and branding exercise produced by a group of companies... Look at how a [standards] organization manages the idea of being open, of being fair, and look for speed but also coordination... Something we've had to agree on as a consortium group is that we're not just working with our own groups but also with those from other standards bodies. You want to make sure people are collaborating, that there are people working together. Are members coming to the table and wishing to share and to build a new market? 
Are they excited about what's happening, or are they trying to exclude other people? [...] eWEEK: In the W3C, what's to keep vendors from pushing their own technologies through as standards? TBL: We have a review before we start an activity so that a company can't start a standard around their own product. You can't even have three members decide there's going to be a W3C activity. Three of five self-selected people might be able to do work together, but all [450] members have to have a look at proposed work. ... There's no secret in how that work gets done. All of the members have an equal opportunity to speak up when it comes time to say the W3C may assign resources to this new work..."

  • [January 28, 2003] "Enterprises Seek Role in Standards. [eWEEK Labs.]" By Anne Chen. In eWEEK (January 27, 2003). "As standards -- and standards organizations -- proliferate, IT managers such as Robert Kozak say they must pick and choose carefully where and how much they will participate. Kozak, director of Internet development at W.W. Grainger Inc., and his colleagues decided to focus most of their company's standards attention on the World Wide Web Consortium because that group's open, inclusive procedures gave Grainger its best chance to influence future technologies. 'We looked at where we could really maximize our impact,' said Kozak, in Lake Forest, Ill. 'Other organizations are just as important, but it's difficult to proportion the amount of standards work we can do. We need to ensure we can contribute in a way that provides the greatest impact for our customers, for Grainger and for the organizations we're involved with.' IT managers from companies such as Grainger and General Motors Corp. say participating in standards bodies gives them a competitive advantage by ensuring that they have a say in how technology such as XML will work and will be implemented in future products they are likely to purchase. Participating also gives enterprise IT a chance to play a role in the struggle to develop technologies that are not only interoperable but also lower the cost of doing business. Choosing which standards bodies to participate in, however, can be as complex as the technologies being worked on. IT managers making that choice should, first, look at their long-term IT goals and decide which technologies -- and therefore which groups -- will be most important to them, experts say. IT managers should also evaluate the quality of work being produced by each group. In addition, experts say, IT managers should be leery of standards-focused groups that are vendor-dominated... 
IT managers should also look at the work being produced by an organization for quality and acceptance by other standards organizations. Interoperability, openness, implementation and testing are all issues the standards body should be focused on, said John Parkinson, chief technology officer at Cap Gemini Ernst and Young U.S. LLC, in Chicago. At Grainger, one of the largest distributors of facilities maintenance products in the United States, the decision to join the W3C was based not only on the organization's openness and its insistence on royalty-free standards but also on the success of standards such as XML and HTTP, said Carl Turza, vice president of e-Business at Grainger. Vendor-driven standards bodies were not considered..."

  • [January 28, 2003] "E-Commerce Standard Plans Made Public." By Thor Olavsrud. In Internetnews.com (January 28, 2003). "E-business interoperability consortium OASIS Tuesday said the first draft of a royalty-free data method for international electronic commerce has been released by one of its technical groups. The new OASIS schemas encompass the Universal Business Language (UBL). UBL is a standard for XML document formats that encode business messages, such as purchase orders and invoices. UBL treats business-to-business (B2B) communication across all industry sectors and domains for all types of organizations, including small- and medium-sized enterprises... Eventually, OASIS hopes to make UBL a legal standard for international trade, and therefore the technical committee grounded the UBL Library in the Core Component semantics developed for ebXML, a modular suite of specifications for standardizing XML globally in order to facilitate trade between organizations regardless of size. ebXML was jointly developed by OASIS and the United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT). While UBL is not a deliverable of the ebXML initiative, ebXML's Core Component specification is a system for creating idealized, business-context-free models for business information that can be mapped to traditional EDI syntax, XML syntax, or other syntaxes. With UBL, OASIS is trying to take a concrete next-step by mapping the Core Components to XML as an XML Schema representation, thereby allowing for the contextualization of information in an XSD environment..." See other references in the 2003-01-27 news item "UBL Technical Committee Releases First Draft of XML Schemas for Electronic Trade."

  • [January 27, 2003] "BEA Seeks to Ease XML Development on Java. Hosted Service to Feature XMLBeans Technology." By Paul Krill. In InfoWorld (January 27, 2003). "BEA Systems on Monday is introducing a technology called XMLBeans, which is designed to improve developer productivity by eliminating challenges to incorporating XML data into Java. To debut initially as a free, hosted service, XMLBeans aims to enable developers to focus time on value-added development for Web services and enterprise applications. The hosted service eventually will be part of BEA's WebLogic for Java application development and deployment. Version 8.1 of WebLogic, due in beta release at the BEA eWorld conference in March, will feature XMLBeans... XMLBeans provides a Java object-based view of XML data. Unlike other Java 'binding' solutions, XMLBeans enables programmers to maintain the fidelity of raw XML data while gaining the productivity and flexibility benefits of Java, according to BEA. It features a core set of Java classes that provide a common XML store... A BEA representative said the company wants to make XMLBeans a standard but has not yet decided which specific route it will take toward that end. BEA's XMLBeans provides direct access to XML using a conventional set of interfaces such as XQuery, [BEA's Carl Sjogreen] said. 'It makes it very easy for [developers] to access that XML information,' without losing any information in the XML Schema, he said. XMLBeans differs from other approaches to incorporating XML data into Java, such as DOM and SAX, in that it does not result in loss of data due to fundamental differences between the two languages, requiring recoding of information and development of custom linkages. DOM and SAX present low-level API approaches to using XML that are tedious to work with and make applications brittle, while other technologies, such as JAX-B or Castor, force developers to fit XML into Java classes, Sjogreen said...
BEA hopes to get feedback from developers via launching XMLBeans as a hosted service. Developers can log onto the site and upload a schema that describes documents they want to use, and in return will get back XMLBeans classes needed to process that document in their application, according to Sjogreen..." See: (1) the XMLBeans overview; (2) the text of the announcement "BEA Systems Drives Convergence of Application Integration and Application Development with BEA XMLBeans. Technology Innovation Revolutionizes Java and XML Interoperability and Increases Productivity for Java Developers."

  • [January 24, 2003] "Web Consortium Captures Captioning." By Paul Festa. In CNET News.com (January 24, 2003). "... The World Wide Web Consortium (W3C) has chartered the Timed Text Working Group (TTWG) to come up with a streaming text specification, based on XML (Extensible Markup Language), that will synchronize text with video or audio streamed over the Internet. 'Simply put, this is to have a broad standard for captioning on the Web,' said W3C representative Janet Daly. 'There's a lot of industry interest in this. The potential for entertainment is clear.' The W3C's timed text effort is not its first attempt to synchronize elements in multimedia presentations. One standard that has already reached the consortium's final recommendation status is the Synchronized Multimedia Integration Language (SMIL, pronounced 'smile'). But SMIL describes how to coordinate diverse media types in general terms. Without a specification for text, proprietary methods have cropped up, leading to text captioning that is specific to a certain browser or device. Daly said the application would prove useful both for people who want to play multimedia content silently, in a restrictive environment like an office, and for people who are hard-of-hearing. 'This is not just for the slacker in the office, but for people with disabilities to capture the information in the audio stream,' Daly said..." See details and references in the 2003-01-24 news item "W3C Charters Timed-Text Working Group (TTWG)."

  • [January 24, 2003] "Communication Interception Goes Global." By James Pearce. In ZDNet News (January 24, 2003). "OASIS has set up a committee to develop a technical framework that enables security agencies to share information easily around the world. Security agencies around the world will soon be able to share intercepted information on criminal and 'terrorist' activities far more easily after a new technical framework is introduced, a committee claims. The Organization for the Advancement of Structured Information Standards (OASIS) has created the LegalXML Lawful Intercept XML (LI-XML) technical committee to develop a global framework to support the 'rapid discovery and sharing of suspected criminal and terrorist evidence by law enforcement agencies'... The initial meetings of the committee have already been held, and the committee hopes to identify all XML schemas relating to lawful interception by 10 February. By January 2004 the committee plans to publish information on the interoperability test results of the different schemes. According to the committee, the most common activity related to lawful interception is the simple act of looking up basic information about a communications identifier (such as an email address or phone number), and one objective of the committee is to develop a Global User Identifier Lookup schema. The second most common activity is requesting the records of a specific user from a communications services operator. There are currently no means to do this electronically, and the committee plans to develop a Global Communications Record Lookup schema. The technical committee also plans to develop specifications for interoperable Lawful Intercept Global Identifier Registries and associated verification and authentication schemes, after noting that there is no global mechanism to verify the authenticity of the tens of thousands of parties worldwide involved in the lawful interception of communications..."
See details and references in the 2003-01-23 news story: "OASIS LegalXML Member Section Forms Lawful Intercept XML Technical Committee."

  • [January 23, 2003] "Mobile P2P Messaging, Part 2: Develop Mobile Extensions to Generic P2P Networks. Turn Mobile Devices Into JXTA and Jabber Clients." By Michael Juntao Yuan (Research Associate, Center for Electronic Commerce, University of Texas at Austin). From IBM developerWorks, Wireless. January 2003. ['Generic peer-to-peer computing networks such as JXTA and Jabber are often too complex for mobile devices. Thus, lightweight mobile clients or special architectures that work through relays are needed to extend those P2P communities to mobile users.'] "In this second part of our series on mobile peer-to-peer messaging, Michael Yuan discusses JXME, a J2ME JXTA client project. We'll examine the examples bundled in the JXME distribution to show you how to use the JXME APIs. In addition, we will also briefly discuss options to develop mobile Jabber applications. As we discussed in Part 1 of this series ['Access SMS Using the Wireless Messaging API and Other Packages'], SMS-based messaging is very convenient for wireless phone users. However, it is not a suitable messaging platform for non-phone mobile devices such as PDAs and WAN-connected handhelds. SMS messaging across different cell networks (such as international calls) can also be expensive or even in some cases impossible. In this second and final installment of our series on mobile P2P messaging, we will introduce you to two general-purpose peer-to-peer networks -- the JXTA P2P and Jabber instant messaging networks -- that you might use in situations that don't lend themselves to SMS... JXTA defines a set of open protocols for peer-to-peer networks. These XML-based protocols describe complex operations such as peer discovery, endpoint routing, connection binding, basic query/response message exchange, and network propagation through rendezvous peers... JXTA is a generic P2P framework that goes far beyond simple messaging. 
It addresses issues like P2P file sharing, P2P application services, and collaborative distributed computing. While the JXTA protocols are designed to be independent of any implementation technology, JXTA's reference implementation is built on the Java platform. This reference implementation uses Java APIs to wrap JXTA protocol messages and provide a programmatic way to access the JXTA network from Java applications... The power and flexibility of JXTA come at a price: complexity. A JXTA peer needs to take care of a lot of tasks and process messages at the XML-over-socket level. Such a peer would be too complex to run on most mobile devices. In addition, neither XML nor raw socket support are part of the standard J2ME/MIDP specification. To make JXTA networks available to mobile P2P users, we need a set of lightweight JXTA APIs for mobile devices. The JXME project aims to provide JXTA APIs for the CLDC and MIDP platforms. It can also be used in higher-end J2ME profiles, such as the Personal Profile... Ultimately, the success of any P2P system depends on its ability to attract users. Although JXTA is a very powerful and technically advanced framework, its technical complexity hinders its adoption. Jabber is a much simpler P2P system than JXTA; it is primarily designed for instant messaging. Jabber has a much larger peer network than JXTA. Jabber was originally designed to provide interoperability among popular Internet instant messaging systems (AOL, MSN, Yahoo!, ICQ, and so on). It is a powerful and flexible yet simple protocol that can wrap around all existing IM protocols. With powerful features and completely open XML protocols, Jabber is by far the most advanced IM system available today. Jabber can also support many advanced P2P applications, such as calendar groupware and file sharing. Jabber peers communicate with each other through Jabber servers. 
Jabber servers can also talk with each other to form large domains of peers that are not directly connected to the same server. All communication among Jabber peers and servers takes the form of open XML-formatted messages sent over raw socket connections..." See: (1) JXTA Project website; (2) "XML Encoding for SMS (Short Message Service) Messages"; (3) "Jabber XML Protocol."

  • [January 23, 2003] "OASIS Enlists XML in War Against Terror." By Darryl K. Taft. In eWEEK (January 23, 2003). "The Organization for the Advancement of Structured Information Standards (OASIS) Thursday announced the formation of a new technical committee to develop a universal global framework aimed at helping law enforcement agencies share criminal and terrorist evidence. In a move targeted at helping those agencies involved in the homeland defense effort, OASIS is putting its XML expertise to use for the cause of national security. The new technical committee is called the OASIS LegalXML Lawful Intercept XML (LI-XML) Technical Committee. OASIS officials said the consortium formed the new committee in response to mandates both in the United States and Europe. Anthony Rutkowski, an executive with VeriSign Inc., has been named chair of the LI-XML technical committee. He said governments and Internet access providers alike would benefit from the LI-XML effort to generate a standard XML schema for sharing information about terrorists..." See details and references in the 2003-01-23 news story: "OASIS LegalXML Member Section Forms Lawful Intercept XML Technical Committee."

  • [January 23, 2003] "VorteXML Turns Data Into XML." By Timothy Dyck. In eWEEK (January 20, 2003). Review of VorteXML Server 1.0. ['VorteXML makes what can be a difficult job -- turning text data into XML -- straightforward. The tool is easy to use and will do the hoped-for job in many situations. However, the 1.0 release has a significant number of functional gaps that make it difficult for administrators to detect when input text files contain formatting errors.'] "Datawatch Corp.'s new VorteXML Server 1.0, which started shipping last month, provides a flexible template-based system to extract data from the straw of undifferentiated text files and turn it into XML gold. The server's sweet spot is with organizations that have collections of plain-text or HTML files (such as invoices, reports, confirmation e-mail messages or log files) that they want to turn into the more usable XML data format. However, eWeek Labs' tests also found a number of important limitations that make this product more difficult to deploy than it should be and could point users toward products from close rivals Whitehill Technologies Inc. and ItemField Inc... Converting nonstructured formats such as text files into a structured format such as XML is inherently a hard problem to solve. VorteXML Server's strongest feature is its VorteXML desktop tool, which uses an intuitive, flexible 'painting' system to highlight data fields in input text files. (VorteXML can do the XML conversion itself but only on a single input file.) VorteXML provides a mechanism to identify data fields through a combination of nearby text field labels, delimiters and absolute line position. It also has an expression language (although not a full programming language) to perform variable manipulation." [VorteXML Server description: "VorteXML Server is a scalable, high-volume server that enables users to easily automate the often arduous and lengthy process of transforming legacy data into valid XML.
It is expressly designed to empower business users to quickly and easily transform their operational data into XML for popular e-business applications including bill and statement presentment, B2B interactions, legacy transformation and Web Services. VorteXML Server offers users the ability to: Convert high-volumes of text data to XML; Automate complex conversions and transformations; Invoke conversion remotely through a web service via Java, .NET or any other SOAP-enabled client; Run conversions on a recurring basis; Trigger XML conversions based on file creation."]
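The general technique the review describes, locating fields in plain text by their nearby labels and emitting them as XML, can be sketched as follows. This is an illustrative approximation only, not VorteXML's template format or API: the invoice labels, the `FIELD_PATTERNS` table, and the `text_to_xml` function are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical labels for a plain-text invoice; each pattern captures
# whatever follows the label on the same line.
FIELD_PATTERNS = {
    "number": re.compile(r"^Invoice No:\s*(.+)$", re.MULTILINE),
    "customer": re.compile(r"^Customer:\s*(.+)$", re.MULTILINE),
    "total": re.compile(r"^Total:\s*(.+)$", re.MULTILINE),
}


def text_to_xml(text):
    """Return (xml string, list of field names that were not found)."""
    root = ET.Element("invoice")
    missing = []
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match is None:
            # Report the gap instead of silently emitting incomplete XML.
            missing.append(field)
            continue
        ET.SubElement(root, field).text = match.group(1).strip()
    return ET.tostring(root, encoding="unicode"), missing
```

Returning the list of missing fields alongside the XML is one simple way to surface malformed input files, the kind of error detection the review found lacking in the 1.0 release.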

  • [January 23, 2003] "IBM's Long-Awaited Software Taps Data Integration Research." By Lisa Vaas. In eWEEK (January 20, 2003). "IBM is gearing up to release the first product from its years-long Xperanto data integration research project for beta testing next quarter. The as-yet-unnamed software will enable, for example, call center representatives to access and change customer information in real time from multiple sources, an IBM spokeswoman confirmed last week. The application will tap different data types, such as e-mail messages, enterprise application data, purchase orders and photos. Delivery dates of this first Xperanto product won't be determined before next quarter, the spokeswoman said, but it is expected to be generally available by the end of the year. Xperanto is a data integration technology that relies on XML and XQuery searching technologies to tie together any type of data from any source, leaving that data in whatever database or application it natively resides. It will allow users of IBM software to move from storing all data in a single, consolidated database to accessing data in a kind of virtual database that federates data from multiple sources. The Xperanto product follows in line with IBM's On Demand computing initiative, a concept of IT infrastructure that has four characteristics: integrated systems, open-standards software, virtualized software that allows more efficient use of IT resources, and autonomic or self-managing systems..." See IBM Research's Xperanto project and references in "IBM: Xperanto Rollout To Start In Early 2003. Long-Promised Information Integrator on the Horizon." General references in "XML and Databases."

  • [January 23, 2003] "Introduction to XFML." By Peter Van Dijck. From XML.com. January 22, 2003. ['Peter van Dijck introduces XFML, a lightweight language for faceted metadata.'] "XFML is a simple XML format for exchanging metadata in the form of faceted hierarchies, sometimes called taxonomies. Its basic building blocks are topics, also called categories. XFML won't solve all your metadata needs. It's focused on interchanging faceted classification and indexing data. XFML addresses the following problems with basic hierarchical classification: (1) Creating and maintaining a good topic hierarchy is a lot of work, ask any librarian. (2) Indexing (categorizing) large amounts of content consistently is even harder; see Cory Doctorow's 'Metacrap'. (3) Creating a centralized hierarchy to organize a large amount of information doesn't scale. -- if you think Yahoo's hierarchy scales, ask yourself why you keep turning to Google. [So] XFML provides a simple format to share classification and indexing data. It also provides two ways to build connections between topics, information that lets you write clever tools to automate the sharing of indexing efforts. It's based on the principles of faceted classification, addressing many of the scaling issues with simple hierarchies... Facets sound scary and librarian-like, but they are really just a common sense approach to classifying things. Instead of building one huge tree of topics, a faceted classification uses multiple smaller trees (each tree is called a facet) that can then be combined by the user to find things more easily... The XFML core spec gives an introduction, defines the concepts, and specifies the XML format. The spec is stable and frozen, which means you can safely build applications that use it... The building blocks of a faceted hierarchy in XFML are facets and topics. A facet is the top node of each tree. The nodes in the tree are called topics. 
XFML can define multiple hierarchies, and each hierarchy is a facet... Once you have some facets and topics defined, you will want to classify or index some web pages and add them to your XFML document so your indexing efforts can be shared. You can only classify things that have a URI. Each URI (we call them pages but you can use other filetypes as well) can be classified under multiple topics... XFML is a simple standard to exchange faceted, hierarchical metadata. What makes it different is the way it addresses specific problems with metadata authoring by allowing for distributed metadata through the <connect> and <psi> elements. It is designed to be easy to code for and is already supported by a number of tools. To get started with XFML, I recommend writing an XFML file by hand and uploading it to Facetmap..." See the XFML website for other details; compare "(XML) Topic Maps."
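
Illustration (an in-memory model only, not the XFML XML format): the article describes facets as small topic trees and pages as URIs classified under several topics, which the user then combines to find things. A minimal sketch of that combination step follows; the facet names, topic ids, and example.org URIs are made up for this example, and the rule that a page filed under a topic also counts for that topic's ancestors is an assumption of the sketch.

```python
# topic id -> (facet id, parent topic id or None); all names are hypothetical.
TOPICS = {
    "europe": ("place", None),
    "belgium": ("place", "europe"),
    "food": ("subject", None),
    "beer": ("subject", "food"),
}

# page URI -> set of topic ids the page is classified under
PAGES = {
    "http://example.org/trappist": {"belgium", "beer"},
    "http://example.org/pasta": {"europe", "food"},
    "http://example.org/bbq": {"food"},
}


def ancestors_and_self(topic_id):
    """Return the topic plus every ancestor up to the top of its facet tree."""
    result = set()
    current = topic_id
    while current is not None:
        result.add(current)
        current = TOPICS[current][1]
    return result


def find_pages(*wanted):
    """Pages classified under every wanted topic (or under a descendant)."""
    hits = []
    for uri, topic_ids in PAGES.items():
        expanded = set()
        for topic_id in topic_ids:
            expanded |= ancestors_and_self(topic_id)
        if all(topic in expanded for topic in wanted):
            hits.append(uri)
    return sorted(hits)


# Combine one topic from the "place" facet with one from "subject".
print(find_pages("europe", "food"))
print(find_pages("europe", "beer"))
```

Keeping each facet as its own small tree means a query is just one topic per facet, rather than a path through a single large hierarchy.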

  • [January 23, 2003] "Excel Reports with Apache Cocoon and POI." By Steve Punte. From XML.com. January 22, 2003. ['Steve Punte on generating Excel reports dynamically with Apache Cocoon.'] "We describe a simple, low-cost, modular, XML component oriented solution for generating and rendering real-time Excel reports. Future enhancements will probably include support for formulas, multiple sheets, and so on. Portions of this solution were developed for Amansi.com, which has donated this solution to open-source... Generating professional reports within a web application can be a difficult problem to solve. However, by combining two Apache projects (Cocoon and POI) you can produce Excel reports from a pure Java server application. The key to this solution is to embrace Excel on the client and deploy a Java solution on the server. Why Use Microsoft Excel? Excel is the business world's spreadsheet of choice. While most readers of this article are software technologists, many of the projects and solutions you're developing are meant for use by non-technologists in a corporate environment, in which Microsoft Office is the dominant software suite. Your end user is probably already familiar and skilled with Office. Providing reports as Excel spreadsheets allows the data to be manipulated to meet the end user's needs. As long as a sufficiently similar report and associated data are available, most end users can manipulate the report to obtain the desired results. Although Excel is a Microsoft Windows application, its binary file format is well known and may be manipulated by many low cost solutions, including the pure Java Apache POI project. Rendering reports in Excel does not require that the web application runs Windows, only that the client runs Windows..." [POI: "POI stands for Poor Obfuscation Implementation. Why would we name our project such a derogatory name? Well, Microsoft's OLE 2 Compound Document Format is a poorly conceived thing. 
It is essentially an archive structured much like the old DOS FAT filesystem. Redmond chose, instead of using tar, gzip, zip or arc, to invent their own archive format that does not provide any standard encryption or compression, is not very appendable and is prone to fragmentation."]

  • [January 23, 2003] "Parsing RSS At All Costs." By Mark Pilgrim. From XML.com. January 22, 2003. ['Mark Pilgrim explains how to handle even ill-formed RSS feeds. He provides a sample parse-at-all-costs RSS parser.'] "RSS is an XML-based format for syndicating news and news-like sites. XML was chosen, among other reasons, to make it easier to parse with off-the-shelf XML tools. Unfortunately in the past few years, as RSS has gained popularity, the quality of RSS feeds has dropped. There are now dozens of versions of hundreds of tools producing RSS feeds. Many have bugs. Few build RSS feeds using XML libraries; most treat it as text, by piecing the feed together with string concatenation, maybe (or maybe not) applying a few manually coded escaping rules, and hoping for the best. On average, at any given time, about 10% of all RSS feeds are not well-formed XML. Some errors are systemic, due to bugs in publishing software. It took Movable Type a year to properly escape ampersands and entities, and most users are still using old versions or new versions with old buggy templates. Other errors are transient, due to rough edges in authored content that the publishing tools are unable or unwilling to fix on the fly. As I write this, the Scripting News site's RSS has an illegal high-bit character, a curly apostrophe. Probably just a cut-and-paste error -- I've done the same thing myself many times -- but I don't know of any publishing tool that corrects it on the fly, and that one bad character is enough to trip up any XML parser... In next month's column I'll examine some other RSS validity issues. Valid RSS is more than just well-formed XML. Just because there's no DTD or schema doesn't mean it can't be validated in other ways. We'll discuss the inner workings of one such RSS validator..." General references in "RDF Site Summary (RSS)."
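
Illustration (not the parser from Pilgrim's article): the parse-at-all-costs idea can be sketched as trying a strict XML parse first and falling back to pattern matching only when the feed is not well-formed. The sketch assumes un-namespaced RSS 0.9x/2.0-style `item`, `title`, and `link` elements; the `parse_items` and `_field` names are invented for this example, and the regular expressions are deliberately crude.

```python
import re
import xml.etree.ElementTree as ET
from html import unescape

# Loose pattern for <item>...</item> blocks, used only when strict parsing fails.
ITEM_RE = re.compile(r"<item\b[^>]*>(.*?)</item>", re.DOTALL | re.IGNORECASE)


def _field(block, name):
    """Pull the text of the first <name>...</name> inside an item block."""
    pattern = r"<%s\b[^>]*>(.*?)</%s>" % (name, name)
    match = re.search(pattern, block, re.DOTALL | re.IGNORECASE)
    return unescape(match.group(1).strip()) if match else None


def parse_items(feed_text):
    """Return (mode, items); mode is 'strict' or 'fallback'."""
    try:
        root = ET.fromstring(feed_text)
        items = [
            {"title": item.findtext("title"), "link": item.findtext("link")}
            for item in root.iter("item")
        ]
        return "strict", items
    except ET.ParseError:
        # Not well-formed XML: salvage what we can with regular expressions.
        items = [
            {"title": _field(block, "title"), "link": _field(block, "link")}
            for block in ITEM_RE.findall(feed_text)
        ]
        return "fallback", items


# A feed with an unescaped ampersand, one of the errors the article mentions.
bad_feed = (
    "<rss><channel><item><title>A & B</title>"
    "<link>http://example.org/2</link></item></channel></rss>"
)
print(parse_items(bad_feed))
```

Returning the mode alongside the items lets a caller log which feeds needed the fallback path instead of silently accepting them.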

  • [January 23, 2003] "The Return of XML Hypertext." By Kendall Grant Clark. From XML.com. January 22, 2003. ['Kendall Clark reports on the creation of a new mailing list focused on the use of XML for hypertext.'] "... What is XML, and what is it best for, if you've spent long hours popping and pushing HyperCard stacks? Among the places one might go for an answer to these questions, consider the newly minted xml-hypertext mailing list. The first thing one might say about xml-hypertext is that its credentials suggest that it is a trustworthy source. A brief glance through its archive is like a glance through the Who's Who of the XML community. Not only is the roster of participants a good indication of the quality of conversation, but it also suggests that the list's motivating idea is not the product of a single person, but reflects broader community interests. In announcing the xml-hypertext list, Simon St. Laurent suggested a basic agenda for the conversation; 'appropriate subjects include,' St. Laurent said, 'technologies for linking and pointing, hypertext-oriented transformations, and interactions between XML and Web infrastructure'. Among the technologies which fit that bill, St. Laurent mentioned XLink, XPointer, HLink, SkunkLink, VELLUM (one of St. Laurent's own proposed linking technologies), XHTML, RDF and Topic Maps, SMIL. It makes sense that a mailing list about XML and hypertext will focus on linking technologies, since linking is essential to every robust hypertext proposal -- computing technology has long tempted very smart people as a way to overcome what is seen to be the static nature of old-fashioned, that is, printed books... While it's still early, the xml-hypertext list may turn out to be an important new chorus of voices for XML technologists to pay attention to. 
Going forward, its two primary technical subjects of interest are likely to be new ways of rejuvenating linking in XML applications, including the end-user Web, and (though this is more a prediction than a promise) the issue of linkbases, that is, collections of out-of-band links or relations between resources..." See the information page for xml-hypertext -- Discussion of hypertext using XML and the mail archives. Related item: Bob DuCharme has posted references for a (demo) prototype implementation for "1-to-many" links using HTML. General references in "XML Linking Language."

  • [January 23, 2003] "XML Technical Specification for Higher Education." Edited by Mike Rawlins. With contributions from IMS Global Learning Consortium, University of Wisconsin-Madison, Miami-Dade Community College, Brown University, and the US Department of Education. For the Postsecondary Electronic Standards Council [Washington, DC, USA]. Working Draft Version 2.1. August 2002. 54 pages. "The purpose of this document is to provide guidance in the development and maintenance of a data dictionary and XML schemas. The scope of this specification includes the data which institutions and their partners exchange in support of the existing business processes within Higher Education like administrative applications for student financial aid, admissions, and registrar functions. The internal audience of this document is the members of the XML Forum for Education as well as the technical members of the education community at large wishing to use XML in their data exchanges... This specification is an ongoing output of the Technology Work Group of the XML Forum for Education. First organized in August 2000 on the recommendation of a PESC study group, the XML Forum has as its mission the establishment of Extensible Markup Language (XML) standards for the education community through collaboration. The Technology Work Group was charged with performing research on existing XML specifications and best practices and providing technical guidance to XML developers in the education space. This document is the result of its efforts over the past eighteen months. It will be updated periodically as national and international XML standards are established..." [About: "The XML Forum for Education serves as an industry group focused on XML standards in the education space. 
In addition to monitoring global XML specification initiatives and developing standards appropriate to education, the Forum provides the community with information on XML applications and their potential."] See: (1) Postsecondary Electronic Standards Council XML Forum website; (2) XML Schemas, including the PESC XML Forum College Transcript Schema Version 0.01 [diagram]; (3) "PostSecondary Electronic Standards Council XML Forum for Education." [source .DOC, cache]

  • [January 22, 2003] "XML Developer Tool is Upgraded." By Paul Krill. In InfoWorld (January 22, 2003). "Altova on Wednesday announced availability of XMLSPY 5 Release 3, an XML development environment that adds support for C# code generation and a WSDL documenting utility in the new release. XMLSPY is specifically tuned for XML development and has 1 million registered users, said Larry Kim, Altova marketing director, in Beverly, Mass. Version 5 Release 3 offers key new features including a WSDL documentation utility, C# support, PDF publishing, and the ability to edit XML documents stored within Oracle XML DB, the XML repository within the Oracle9i Release 2 database. The WSDL Documentation generation utility is intended to enable Web services developers to document and publish a Web service interface to business partners, developers, or to the public. WSDL files can be annotated then published into a Word or HTML output file... Oracle XML DB functionality in XMLSPY enables developers to perform common operations on data managed by XML DB, such as listing XML Schemas, loading a Schema, and saving or deleting an XML Schema. XMLSPY's stylesheet designer now supports visual editing and generation of a PDF file from an XML document. Developers can preview output in either PDF or HTML formats..." See details in the 2003-01-22 announcement: "Altova Simplifies XML Development Through Enhanced Support for Microsoft .NET, Oracle XML DB, Web Services and Document Publishing in XMLSPY 5 Release. New features further demonstrate XMLSPY 5 is the most comprehensive XML development environment for any XML-enabled software project." Related news: "Altova and Oracle Announce Tighter Integration with Oracle XML DB in XMLSPY 5 Release 3" General references in "XML Schemas."

  • [January 22, 2003] "OASIS Offers XML Standards Clearinghouse." By Darryl K. Taft. In eWEEK (January 22, 2003). "The Organization for the Advancement of Structured Information Standards (OASIS) standards body Wednesday announced the first in a series of XML.org Focus Areas to enable users to deliver and access domain-specific content on XML standards. The first three areas in the series focus on insurance, human resources and printing and publishing. These Focus Areas are now available, with partners and experts available to provide assistance with content and other assistance to users, the organization said... OASIS officials said these are the first in a series of Focus Areas the organization will release for other horizontal and vertical markets. The Focus Areas include information on standards, news, movements among implementers of the specifications and other domain-specific material..." Note: The word "clearinghouse" which appears in the title of the announcement is probably not intended to describe the role of the XML.org Focus Areas as prescriptive or regulatory; it's a descriptive term referring to the XML.org "portal" as a whole, and in particular, it refers to functionality intended to be provided by XML.ORG Registry, 'an open clearinghouse for the exchange of XML schemas and vocabularies'. Focus Areas are not committees; they are not charged with creating new standards or issuing official OASIS guidelines. The principal strategy in the formation of XML.org "Focus Areas" is to draw upon the domain expertise of OASIS member companies in several key industry sectors -- to collect and maintain domain-specific information. According to the announcement: "Through XML.org, OASIS partners with experts in each Focus Area to provide content and editorial guidance [for website collection development and organization]. 
The XML.org Focus Area on Insurance has been developed in conjunction with ACORD, a nonprofit association that facilitates the development and use of standards for the insurance, reinsurance and related financial services industries. The XML.org Focus Area on Human Resources has been developed in cooperation with the HR-XML Consortium, a non-profit organization that develops and promotes XML specifications to enable e-business and the automation of human resources-related data exchanges. The XML.org Focus Area on Printing & Publishing has been developed in cooperation with IDEAlliance (founded as Graphic Communications Association), a membership organization that advances user-driven, cross-industry solutions for all publishing and content-related processes... Future XML.org Focus Areas will include Financial Services, Defense Logistics, Education, Tax/Accounting, E-Government, Security, Retail, Localization & Globalization, E-Marketplaces, and others. XML.org is sponsored by BEA Systems, Global Exchange Services (GXS), ISOGEN International, and SAP." See the complete text of the press release.

  • [January 22, 2003] "Requirements for the Ink Markup Language." Edited by Yi-Min Chee (IBM) and Sai Prasad (Intel). W3C Note 22-January-2003. First public version. Version URL: http://www.w3.org/TR/2003/NOTE-inkreqs-20030122/. Latest version URL: http://www.w3.org/TR/inkreqs/. "This document describes requirements for the Ink Markup Language that will be used in the multimodal interaction framework as proposed by the W3C Multimodal Interaction Working Group. The Ink Markup Language will serve as the data format for representing ink entered with an electronic pen or stylus in a multimodal system. The markup will allow for the input and processing of handwriting, gestures, sketches, music and other notational languages in web-based multimodal applications. In the context of the W3C Multimodal Interaction Framework, the markup provides a common format for the exchange of ink data between components such as handwriting and gesture recognizers, signature verifiers, and other ink-aware modules... W3C's Multimodal Interaction Activity is developing specifications for extending the Web to support multiple modes of interaction. One mode of interaction that is expected to play a role in many multimodal use cases is pen input. The requirements described in this document will be used to guide the development of a markup language for representing ink data captured by a pen-enabled multimodal system... The Ink Markup will consist of primitive elements that represent low-level ink data information and application-specific elements that characterize meta information about the ink. Examples of primitive elements are device and screen context characteristics, and pen traces. Application-specific elements provide a higher level description of the ink data. For example, a segment tag could represent a group of ink traces that belong to a field in a form. Consequently, the requirements for the Ink Markup Language could fall in either of the two categories. 
This document does not attempt to classify requirements based on whether they are low-level or application specific..."

  • [January 22, 2003] "OpenGIS Catalog Services Specification." Edited by Douglas Nebert. From Open GIS Consortium Inc. Version: 1.1.1. Date: 2002-12-13. OpenGIS project document reference: OGC 02-087r3. Category: OpenGIS Implementation Specification. 239 pages. Section 9.4 'Interface Definition' (pages 88-94) supplies the XML encoding rules. "OpenGIS Catalog Service Implementation Specification: The OpenGIS Catalog Service Specification version 1.1.1 documents industry consensus regarding an open, standard interface to online catalogs for geographic information and web-accessible geoprocessing services. Industry agreement on a common interface for publishing metadata and supporting discovery of geospatial data and services is an important step toward giving Web users and applications access to all types of "where" information. Version 1.1.1 is more comprehensive than earlier OpenGIS Catalog Service Specification versions and proposals. It addresses the controlled enterprise environment where a-priori knowledge exists about the client and server, and it also addresses the global Internet case where no a-priori knowledge exists between client and server. It is consistent with existing and pending geomatics and metadata standards under the ISO Technical Committee 211, and it is consistent with XML data discovery and processing and with the emerging Web Services infrastructure. The [specification] document provides guidance on the deployment of catalog services through the presentation of abstract and implementation-specific models. Catalog services support the ability to publish and search collections of descriptive information (metadata) for data, services, and related information objects. Metadata in catalogs represent resource characteristics that can be queried and presented for evaluation and further processing by both humans and software. 
Catalog services are required to support the discovery of registered information resources within a collaborating community... For HTTP transport the XML messages are defined by the XML encoding rules. The specification for the XML encoding rules can be found at http://asf.gils.net/xer . This specification derives the encoding of the Application Protocol Data Units (APDUs) from the ASN.1 specification of Z39.50 available from http://lcweb.loc.gov/z39.50/agency/document.html . For information a DTD for Z39.50 encoded using XER is given below [...]" See other XML-based OpenGIS Implementation Specifications and the text of the 2003-01-22 announcement "OGC Approves Important Spatial Catalog Specification." Related references: (1) "Geography Markup Language (GML)"; (2) "OGC Announces Critical Infrastructure Protection Initiative Phase 2 Kickoff"; (3) "Open GIS Consortium Issues RFC for Web Coverage Service Implementation Specification." [cache]

  • [January 22, 2003] "SBC Tries to Enforce Patent on Frame-Like Browsing." By Grant Gross. In InfoWorld (January 22, 2003). "SBC Intellectual Property owns two U.S. patents on a Web site navigation tool called a 'structured document browser' and it is asking MuseumTour.com and other sites to pony up licensing fees. The structured document browser's definition sounds like the technique of using frames to link to other documents on a Web site, which would be used by hundreds of thousands, if not millions, of sites. SBC Intellectual Property President Harlie Frost said the patent claims are 'related to frames' before referring more questions to SBC's public relations representatives... Several Internet activists, including members of the free software movement, have for years blamed the U.S. Patent and Trademark Office for granting patents on technologies that were already widely used. The most publicized case was in September 1999, when Amazon.com was granted a patent for its one-click shopping service. Amazon.com Chief Executive Officer Jeff Bezos later called for patent reform..." Abstract for 'Structured document browser' [6,442,574]: "A structured document browser includes a constant user interface for displaying and viewing sections of a document that is organized according to a pre-defined structure. The structured document browser displays documents that have been marked with embedded codes that specify the structure of the document. The tags are mapped to correspond to a set of icons. When the icon is selected while browsing a document, the browser will display the section of the structure corresponding to the icon selected, while preserving the constant user interface." On this theme [patents being granted for very unremarkable "new ideas" that would occur naturally to eleven out of any twelve researchers working in the same timeframe with the same common knowledge and experience], see "Patents and Open Standards."

  • [January 22, 2003] "SBC Enforcing All-Encompassing Web Patent. You've been Framed." By Kevin Murphy [ComputerWire]. In The Register (January 23, 2003). "SBC Communications Inc is enforcing a patent it owns that, it claims, covers the use of frame-like user interfaces in web sites, it emerged this week, Kevin Murphy writes. If your web site uses frames or a persistent user interface, then you could be in infringement. Using SBC's interpretation of its patent, hundreds of thousands of web sites, including those of many of SBC's own hosting customers, many of the web's biggest sites, and the United States Patent and Trademark Office itself, could be in infringement... According to an SBC letter published by MuseumTour.com, which was the first to disclose SBC was demanding license fees, simply using an interface that remains on-screen while a user navigates the site could constitute infringement of US patents 5,933,841 and 6,442,574, both entitled 'Structured Document Browser'. 'Your site includes several selectors or tabs that correspond to specific locations in your site document. These selectors are not lost when a different part of the document is displayed to the user,' the SBC letter reads. '[These features] appear to infringe several issued claims in the '841 and '574 patents'... After the news broke, SBC's actions were immediately condemned by many in the internet-using community, their criticisms echoing those made during controversies over the enforcement of patents on arguably obvious inventions by the likes of Amazon.com Inc and BT Group Plc. SBC's now defunct Prodigy brand consumer ISP unit was on the receiving end of one of the last major 'obvious' patent suits to hit the headlines. BT Group Plc sued Prodigy, claiming a decades-old patent on Videotext systems covered hypertext. The suit was ultimately unsuccessful... 
Support for the HTML Frames method, which SBC's letter to MuseumTour alludes to as a way to build the persistent user interfaces SBC says it owns, was introduced in the first beta release of the old Netscape Navigator 2.0 browser, which became available to developers in October 1995..." Note the Len Bullard comment 2003-01-23: "Another one sillier than the last one... A patent on frame-based navigation... This is becoming an issue for the US at the national electorate level. It is time for organizations to begin to talk to candidates for high office about real patent reform, what is required and what it will take to get it."

  • [January 21, 2003] "What's New with Smart Tags in Office 11." By Chris Kunicki. From Microsoft MSDN Library. January 20, 2003. ['This article looks at a handful of new smart tag enhancements introduced with Microsoft "Office 11" Beta 1 that are designed to make smart tags easier to develop and also addresses a few limitations of smart tags in Microsoft Office XP.'] "Smart tags first appeared in Microsoft Office XP with a great deal of fanfare, as they represented an innovative new way to make the data in Microsoft Office documents more meaningful and actionable. How often have you found yourself typing in the name of a customer contact, an invoice number, a tracking number, or some other form of relevant information with meaning to you or your company? In the old world without smart tags, that information just sat in the document as static text. Smart tag technology makes it possible to link that relevant information to other resources that might provide you with additional information that is useful in creating a document, or better yet, it might bring that relevant information right back into your document. In Microsoft Office 11, Microsoft is adding numerous enhancements based on customer feedback to broaden the potential for smart tag technology... In addition to new application support, the smart tag application programming interface (API) library has been extended to support a number of new interfaces that enable new functionality. This library is named Microsoft Smart Tags 2.0 Type Library... When Office 11 ships, we can expect to see some interesting goodies. For example: (1) The ability to control which Microsoft Office Smart Tag List (MOSTL) XML-based smart tags are enabled or disabled. (2) MOSTL XML adds support for regular expression and context-free grammar recognition. This should simplify recognizing simple and complex patterns of text. 
(3) Smart documents, which complement smart tags, include a new technology that simplifies securely installing smart tag DLLs, Component Object Model (COM) add-ins, and other files that extend Office. This will simplify deploying smart tags to the desktop and keeping them up to date. (4) Some smart tags should only be valid for a certain period of time, and with Office 11, smart tags can be marked to expire on a certain date. (5) Temporary smart tags are smart tags that are not saved with the document but are active when the document is open. This ensures that private information is not forwarded with a document. (6) The smart tag Recognize method now has a new parameter that passes in the application name. There are times when you want to change the behavior of recognition depending on the application in use. This new parameter makes the recognition more reliable. (7) Most developers new to smart tags have found it challenging to figure out how to efficiently parse text in documents to identify recognized terms. The Recognize2 method now sports an ISmartTagTokenList object. This object breaks down the text sent to the Recognize2 method into individual words. This object will greatly simplify text-parsing code for new developers. Even so, experienced smart tag developers will find this a welcome enhancement. (8) The Word smart tag object model is extended to improve compatibility with the Excel smart tag object model. These extensions allow for firing a smart tag action and enabling and disabling recognizers..."

  • [January 21, 2003] "IBM Brings Domino and WebSphere Closer Together." By Dennis Callaghan. In eWEEK (January 20, 2003). "IBM's Lotus Software division will unveil two initiatives at its Lotusphere conference next week that enable developers to embed Domino components as Web services in other applications. Lotus' Contextual Collaboration initiative will integrate Domino and WebSphere development environments, according to sources close to the Cambridge, Mass., company. The first initiative, code-named Project Montreal, will add Domino classes to IBM's WSAD (WebSphere Studio Application Developer) tool kit, which is based on the Eclipse 2 open-source Java integrated development environment. This will allow non-Domino developers to create collaborative applications as Web services. It will also provide integration between WebSphere and Domino applications while allowing WebSphere developers to remain in the coding environment they're familiar with. The second, and thought to be the more ambitious, initiative is code-named Project Seoul. It will allow Domino developers to work within the Domino development environment but output code as Java 2 Enterprise Edition components, which can be embedded in other non-Domino J2EE-based applications..." [IBM's Websphere Studio Application Developer is a "core development environment from IBM which helps you to optimize and simplify Java 2 Enterprise Edition (J2EE) and Web services development by offering best practices, templates, code generation, and the most comprehensive development environment in its class. 
Use WSAD to (1) create J2EE and Web services applications with integrated support for Java components, EJB, servlet, JSP, HTML, XML, and Web services all in one development environment; (2) build new applications or enable existing assets quickly with Web services using open standards such as UDDI, SOAP, and WSDL via Web services Client Wizard; (3) transform data using a comprehensive XML development environment that offers wizards and mapping tools for creating DTDs, XML Schemas, XSL style sheets and other data transformation..."]

  • [January 21, 2003] "Real Releases Digital Media Source Code." From Reuters. January 21, 2003. RealNetworks has made available "the source code for sending video and audio over the Internet to other software and hardware makers. The release of the code, called the Helix DNA Server, is part of RealNetworks' push to create a universal technology for sending and receiving digital media in order to fend off crosstown rival Microsoft. Late in 2002, Seattle-based RealNetworks had already announced two other components of its Helix technology: the software player used to receive digital streams, and encoding software used to convert raw content into digital format. With all three components, developers such as mobile phone manufacturers can create systems that can send and receive digital content in any format, said Dan Sheeran, vice president of media systems at RealNetworks. More than 10,000 developers have already joined RealNetworks' Helix development community, Sheeran said... RealNetworks, best known for its Real series of players for video and audio sent over the Internet, released last July the Helix Universal Server for sending media over the Web for Windows and Unix-based server operating systems as well as for open-source Linux..." According to the announcement: "the third source code component of the Helix platform, the Helix DNA Server, is now available to software developers through the Helix Community at www.helixcommunity.org. With Helix DNA Server, developers for the first time have source code access to a major end-to-end media delivery platform, consisting of producer, server and client components. Helix DNA Server is the core source code of RealNetworks' Helix Universal Server, the 9th generation, multi-format digital media server that thousands of webcasters already use to deliver content on the Internet. 
The Helix DNA Server offers a robust source code base for developing digital media delivery products including: 9th generation media server; Industry standard streaming media protocol and transports including -- RTSP, RTP, SAP and SDP; Live and on-demand broadcasting; Native MP3 support; RealVideo and RealAudio codec/file format support; Administration, monitoring and logging; Full client authentication support, such as for pay per view services. It is available for free under the RPSL, included in the RCSL royalty for commercial distribution... The Helix DNA Server will be licensed under both a public source license and a commercial community source license. Both licenses are free of charge for research and development use." See previously "RealNetworks Eases Rights Management."

  • [January 21, 2003] "Thinking XML: The Open Office File Format. An XML Format for Front Office Documents." By Uche Ogbuji (Principal Consultant, Fourthought, Inc). From IBM developerWorks, XML zone. January 2003. ['OpenOffice.org is a mature, open source, front office applications suite with the advantage of a saved file format based on an open XML DTD. This gives users and developers an extraordinary amount of flexibility and power in dealing with work produced in OpenOffice.org. In this article, Uche Ogbuji introduces the OpenOffice file format and explains its advantages.'] "The OpenOffice.org project, which produces a complete, open-source office suite derived from StarOffice, uses XML for its core file formats, rather than as a separate export option. OpenOffice includes a word processor, spreadsheet, a presentation tool, and a graphics/diagramming tool... The stake-holders in OpenOffice.org -- the contributors and users on the OpenOffice.org Web site -- have all committed to making its file format as open and general as possible, in the hopes of fostering greater interoperability and flexibility among office file formats. To further this goal, they have contributed the file formats to a new technical committee of the Organization for the Advancement of Structured Information Standards (OASIS)... In this article, I introduce the OpenOffice file formats. This is an interesting time for the intersection of XML and office software. There has been a lot of discussion of the recent Microsoft XDocs technology and how it may or may not compete with or complement XForms, the OpenOffice formats, and other such projects. I shall not cover any such connections here -- in part because of lack of space, and in part because details of XDocs are just emerging... I provide a sketch of the OpenOffice text file format, but the project does not just toss out a text format and leave it at that. 
OpenOffice provides a rich toolkit for integrating XML tools, and there is a growing body of third-party tools as well. These include SAX filters, XSLT plug-ins, and even low-level Java APIs. Developers from the community have already used these facilities to augment OpenOffice with the ability to load and save Docbook, HTML, TeX, plain text, and the document formats used by PalmOS and PocketPC. XMerge is a project for working with OpenOffice content on small devices such as PDAs and cell phones. Work on XMerge is proceeding at a remarkable pace, and vendors such as Nokia have seen fit to chip into the project. This underlines another huge benefit of the openness embraced by OpenOffice. It encourages contributions from a wide variety of sources, even commercial interests, who understand that this openness brings about a level playing field, as opposed to the use of a proprietary format. XMerge uses XSLT plug-ins for document conversion, which also ensures cross-platform support. In the OASIS Open Office XML Format TC we will continue to improve these file formats, with a sharp eye on enhancing interoperability even further. This is an open process with an open mailing list, and any OASIS member can join formally. I encourage all who are interested in managing front-office documents to participate..." See: (1) "OpenOffice.org XML File Format"; (2) "XML File Formats for Office Documents."

  • [January 21, 2003] "Requirements for XML Schema 1.1." Edited by Charles Campbell, Ashok Malhotra (Microsoft), and Priscilla Walmsley. W3C Working Draft 21-January-2003. Version URL: http://www.w3.org/TR/2003/WD-xmlschema-11-req-20030121/. Latest version URL: http://www.w3.org/TR/xmlschema-11-req/. "This document contains a list of requirements and desiderata for version 1.1 of XML Schema... Since the XML Schema Recommendation (Part 0: Primer, Part 1: Structures, and Part 2: Datatypes) was first published in May, 2001, it has gained acceptance as the primary technology for specifying and constraining the structure of XML documents. Users have employed XML Schema for a wide variety of purposes in many, many different situations. In doing so, they have uncovered some errors and requested some clarifications. They have also requested additional functionality. Most of the errors and clarifications are addressed in the published errata and will be integrated into XML Schema 1.0 Second Edition, to be published shortly. Additional functionality and any remaining errors and clarifications will be addressed in XML Schema 1.1 and XML Schema 2.0. This document discusses the requirements for version 1.1 of XML Schema. These issues have been collected from e-mail lists and minutes of telcons and meetings, as well as from the various issues lists that the XML Schema Working Group has created during its lifetime. Links are provided for further information. The items in this document are divided into three categories: (1) A requirement must be met in XML Schema 1.1; (2) A desideratum should be met in XML Schema 1.1; (3) An opportunistic desideratum may be met in XML Schema 1.1..." General references in "XML Schemas."

  • [January 21, 2003] "IBM Aims to Get Smart About AI." By Michael Kanellos. In CNET News.com (January 21, 2003). "In the coming months, IBM will unveil technology that it believes will vastly improve the way computers access and use data by unifying the different schools of thought surrounding artificial intelligence. The Unstructured Information Management Architecture (UIMA) is an XML-based data retrieval architecture under development at IBM. UIMA will greatly expand and enhance the retrieval techniques underlying databases, said Alfred Spector, vice president of services and software at IBM's Research division. UIMA 'is something that becomes part of a database, or, more likely, something that databases access,' he said. 'You can sense things almost all the time. You can effect change in automated or human systems much more.' Once incorporated into systems, UIMA could allow cars to obtain and display real-time data on traffic conditions and on average auto speeds on freeways, or it could let factories regulate their own fuel consumption and optimally schedule activities. Automated language translation and natural language processing also would become feasible... The theory underlying UIMA is the Combination Hypothesis, which states that statistical machine learning -- the sort of data-ranking intelligence behind search site Google -- syntactical artificial intelligence, and other techniques can be married in the relatively near future... The results of current, major UIMA experiments will be disclosed to analysts around March, with public disclosures to follow, sources at IBM said..."

  • [January 21, 2003] "Architecting Knowledge Middleware." By Alfred Z. Spector (Vice President, Services and Software, IBM Research Division). Keynote Address delivered May 9, 2002 at WWW 2002, The Eleventh International World Wide Web Conference (Sheraton Waikiki Hotel, Honolulu, Hawaii, USA, 7-11 May 2002). Abstract. 40 pages, with discussion of Unstructured Information Management Architecture (UIMA) on pages 32-38. "IBM Research's Knowledge Middleware Architecture (UIMA): Traditional approaches to building UIM [Unstructured Information Management] applications are 'algorithmic centric', resulting in tightly integrated vertical applications, whose design is dominated by concerns of computational load. A new approach for providing NLP functionality is evolving which recognizes the inherent need for flexibility and exploits today's extraordinary MIPS, storage, and networking capacity... The KM Architecture of IBM's UIMA Project provides a common framework for the integration of UIM technologies, using a Common Annotation System (Abstract Data Structure). It provides a flexible and adaptable Service Oriented Architecture which uses XML standards to support dynamic binding of services and distributed (multiagent) implementations (RDF, WSDL, WSFL, etc.). The architecture supports 'persistent binding' to avoid dynamic binding overhead for batch, single agent processes; it provides both tightly- and loosely-coupled variants. As a 'toolkit / library' (not a monolithic system) it accommodates a variety of applications, and separates programming tasks that require distinct skills. It supports a seamless integration of: (1) structured, semi-structured, and unstructured data, with (2) human agents and computer agents... High-Level Services in the architecture include search, query processing, result reordering, hyperlinking, collaboration, navigation, collaborative filtering, pub/sub, knowledge agents, personal taxonomies and relationships. 
Analyzers and Indexers provide for indexing, ranking, categorization, clustering, summarization, topic detection, semantic relationships, and incremental updates. Core Services support tokenization, parsing, stemming, part of speech, translation, access control, authentication, profile management, workflow, speech, transcoding for mobile use, crawling, caching, data access, and format normalization... Maintenance Tools