Jack J. Woehr
XML Bible, by Elliotte Rusty Harold, is a readable and meaty introduction to practical XML. Harold states his case succinctly:
This book has one primary goal: to teach you to write XML for the Web. Fortunately, XML has a decidedly flat learning curve...As you learn a little, you can do a little.
XML Bible is, in form, at least, one of those rush-to-print wonders documenting standards that aren't yet adopted for languages that aren't yet finished so that the reader can use tools that aren't written to produce documents for browsers that aren't commonly available. Harold writes:
I've outlined a lot of exciting stuff in this chapter. However honesty compels me to tell you that...much of what I've described is the promise of XML rather than the current reality.
Yet it's a surprisingly well-assembled volume, nicely integrated with its CD-ROM and authored and edited by individuals who had some inkling of the difficulties the reader would encounter exercising the content. The author states:
In this book, I mostly assume you're using Windows 95 or NT 4.0 or later. As a longtime Mac and UNIX user, I somewhat regret this. Like Java, XML is supposed to be platform independent. Also, like Java, the reality is somewhat short of the hype.
Harold visits quite a range of specialty markups, each effectively at this time in XML history more-or-less requiring a different browser to appreciate. W3C's Amaya for several platforms comes on the CD-ROM, as do Netscape 4.0.4 and IE5, both for Wintel. It becomes obvious pretty quickly that IE5, as of the cut, had an edge on generalized XML browsing. However, only Amaya knew how to render MathML. On Linux, I did best with a stable release of Mozilla, after upgrading one of my Linux 5.2 machines to Linux 6.x so it would run the latest. However, Mozilla stumbled on XSL formatting. And no browser freely available for any of the book's supported platforms handles exotica like VoxML (used for telephony).
Laying aside for now the absorbing minutiae of bleeding-edge geek tool frenzy that afforded us many hours of entertainment in the course of preparing this review, we should note that the bulk of the book focuses not on exotica, but on the nitty-gritty basics of XML and style markup. Document type definitions, cascading style sheets, XSL formatting, and VML are among the topics given extensive coverage.
To a certain extent, you can anticipate such flaws as this book possesses. Any computer book named "The [subject] Bible" is probably going to merit the subtitle "A little too much about [subject]," and this toe-breaker is no exception. The day is coming when computer book publishers are going to stop trying to snow readers with avoirdupois weight and abandon printing the long code listings in favor of pointing the reader to the CD-ROM content (all sample code is indeed included on XML Bible's disk).
The CD-ROM contains, in addition to browsers and source from the book, an XML parser written in Java, various utilities, and some standard and specification documents.
Harold is a talented technical writer with a lively style well suited to his audience. The book is reasonably well edited and the production values are high. While (despite cover blurbs) neither comprehensive nor authoritative, it is broad, energetic, helpful, and alert. For the working web author needing a boosterized ramp-up to productivity in XML, XML Bible is more than adequate; it's also quite useful and entertaining.
Electronic Review of Computer Books
Read an Excerpt
Chapter 2: An Introduction to XML Applications...Open Financial Exchange
Software cannot be changed willy-nilly. The data that software knows how to read has inertia. The more data you have in a given program's proprietary, undocumented format, the harder it is to change programs. For example, my personal finances for the last five years are stored in Quicken. How likely is it that I will change to Microsoft Money even if Money has features I need that Quicken doesn't have? Unless Money can read and convert Quicken files with zero loss of data, the answer is "NOT BLOODY LIKELY!"
The problem can even occur within a single company or a single company's products. Microsoft Word 97 for Windows can't read documents created by some earlier versions of Word. And earlier versions of Word can't read Word 97 files at all. And Microsoft Word 98 for the Mac can't quite read everything that's in a Word 97 for Windows file, even though Word 98 for the Mac came out a year later!
As noted in Chapter 1, the Open Financial Exchange Format (OFX) is an XML application used to describe financial data of the type you're likely to store in a personal finance product like Money or Quicken. Any program that understands OFX can read OFX data. And since OFX is fully documented and non-proprietary (unlike the binary formats of Money, Quicken, and other programs) it's easy for programmers to write the code to understand OFX.
OFX not only allows Money and Quicken to exchange data with each other. It allows other programs that use the same format to exchange the data as well. For instance, if a bank wants to deliver statements to customers electronically, it only has to write one program to encode the statements in the OFX format rather than several programs to encode the statement in Quicken's format, Money's format, Managing Your Money's format, and so forth.
The more programs that use a given format, the greater the savings in development cost and effort. For example, six programs reading and writing their own and each other's proprietary format require 36 different converters. Six programs reading and writing the same OFX format require only six converters. Effort is reduced to O(n) rather than O(n²). Figure 2-6 depicts six programs reading and writing their own and each other's proprietary format. Figure 2-7 depicts six programs reading and writing the same OFX format. Every arrow represents a converter that has to trade files and data between programs. In Figure 2-6, you can see the connections for six different programs reading and writing each other's proprietary binary format. In Figure 2-7, you can see the same six different programs reading and writing one open XML format. The XML-based exchange is much simpler and cleaner than the binary-format exchange.
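The converter arithmetic above can be checked with a few lines of Python. This is only a sketch of the counting argument as the excerpt states it (each program reading and writing its own and every other program's format); the function names are made up for illustration.

```python
def pairwise_converters(n: int) -> int:
    """Converters needed when each of n programs handles its own and
    every other program's proprietary format directly: n * n."""
    return n * n


def shared_format_converters(n: int) -> int:
    """Converters needed when all n programs read and write one open
    format such as OFX: one per program."""
    return n


# Compare the two approaches for a few hypothetical program counts.
for n in (2, 6, 20):
    print(n, pairwise_converters(n), shared_format_converters(n))
```

For six programs this gives 36 converters against six, matching the figures quoted in the text, and the gap widens quickly as more programs join.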
Extensible Forms Description Language
I went down to my local bookstore today and bought a copy of Armistead Maupin's novel Sure of You. I paid for that purchase with a credit card, and when I did so I signed a piece of paper agreeing to pay the credit card company $14.07 when billed. Eventually they will send me a bill for that purchase, and I'll pay it. If I refuse to pay it, then the credit card company can take me to court to collect, and they can use my signature on that piece of paper to prove to the court that on October 15, 1998 I really did agree to pay them $14.07.
The same day I also ordered Anne Rice's The Vampire Armand from the online bookstore amazon.com. Amazon charged me $16.17 plus $3.95 shipping and handling and again I paid for that purchase with a credit card. But the difference is that Amazon never got a signature on a piece of paper from me. Eventually the credit card company will send me a bill for that purchase, and I'll pay it. But if I did refuse to pay the bill, they don't have a piece of paper with my signature on it showing that I agreed to pay $20.12 on October 15, 1998. If I claim that I never made the purchase, the credit card company will bill the charges back to Amazon. Before Amazon or any other online or phone-order merchant is allowed to accept credit card purchases without a signature in ink on paper, they have to agree that they will take responsibility for all disputed transactions.
Exact numbers are hard to come by, and of course vary from merchant to merchant, but probably a little under 10% of Internet transactions get billed back to the originating merchant because of credit card fraud or disputes. This is a huge amount! Consumer businesses like Amazon simply accept this as a cost of doing business on the Net and work it into their price structure, but obviously this isn't going to work for six-figure business-to-business transactions. Nobody wants to send out $200,000 of masonry supplies only to have the purchaser claim they never made or received the order. Before business-to-business transactions can move onto the Internet, a method needs to be developed that can verify that an order was in fact made by a particular person and that this person is who he or she claims to be. Furthermore, this has to be enforceable in court. (It's a sad fact of American business that many companies won't do business with anyone they can't sue.)
Part of the solution to the problem is digital signatures--the electronic equivalent of ink on paper. To digitally sign a document, you calculate a hash code for the document using a known algorithm, encrypt the hash code with your private key, and attach the encrypted hash code to the document. Correspondents can decrypt the hash code using your public key and verify that it matches the document. However, they can't sign documents on your behalf because they don't have your private key. The exact protocol followed is a little more complex in practice, but the bottom line is that your private key is merged with the data you're signing in a verifiable fashion. No one who doesn't know your private key can sign the document.
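The hash, encrypt, attach, and verify steps described above can be sketched in Python. This is a toy illustration only, assuming textbook RSA with no padding and a deliberately small key built from two well-known Mersenne primes; the key values and function names are assumptions made for the example and it is not secure. Real systems use a vetted cryptography library and, as the text says, a somewhat more complex protocol.

```python
import hashlib

# Toy RSA key (illustrative assumption, NOT secure): two Mersenne primes.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def digest_as_int(document: bytes) -> int:
    """Hash the document with a known algorithm and reduce it below N."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """'Encrypt' the hash code with the private key."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Recover the hash code with the public key and compare it
    with a freshly computed hash of the document."""
    return pow(signature, E, N) == digest_as_int(document)


receipt = b"I agree to pay $14.07 on October 15, 1998"
signature = sign(receipt)
print(verify(receipt, signature))                                  # True
print(verify(b"I agree to pay $1.07 on October 15, 1998", signature))
```

Anyone holding only `N` and `E` can run `verify`, but producing a valid signature requires `D`, which is the point made in the paragraph above.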
The scheme isn't foolproof--it's vulnerable to your private key being stolen, for example--but it's probably as hard to forge a digital signature as it is to forge a real ink-on-paper signature. However, there are also a number of less obvious attacks on digital signature protocols. One of the most important is changing the data that's signed. Changing the data that's signed should invalidate the signature, but it doesn't if the changed data wasn't included in the first place. For example, when you submit an HTML form, the only things sent are the values that you fill into the form's fields and the names of the fields. The rest of the HTML markup is not included. You may agree to pay $1500 for a new 450 MHz Pentium II PC running Windows NT, but the only thing sent on the form is the $1500. Signing this number signifies what you're paying, but not what you're paying for. The merchant can then send you two gross of flushometers and claim that's what you bought for your $1500. Obviously, if digital signatures are to be useful, all details of the transaction must be included. Nothing can be omitted.
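The weakness can be seen by hashing what would actually be signed. The sketch below is an assumption-laden illustration: the two order forms, their markup, and the field name `price` are hypothetical, and a plain SHA-256 digest stands in for the hash code that a signature would cover.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def submitted_values(form: dict) -> str:
    """Field names and values only, as an HTML form submission sends them."""
    return "&".join(f"{name}={value}"
                    for name, value in sorted(form["fields"].items()))


def values_only_digest(form: dict) -> str:
    """Hash code covering just the submitted fields."""
    return sha256_hex(submitted_values(form))


def whole_form_digest(form: dict) -> str:
    """Hash code covering the surrounding markup as well as the fields."""
    return sha256_hex(form["markup"] + "\n" + submitted_values(form))


# Two hypothetical forms: same price field, very different goods.
pc_order = {
    "markup": "<p>One 450 MHz Pentium II PC running Windows NT</p>",
    "fields": {"price": "1500"},
}
flushometer_order = {
    "markup": "<p>Two gross of flushometers</p>",
    "fields": {"price": "1500"},
}

# Signing only the values cannot tell the two orders apart.
print(values_only_digest(pc_order) == values_only_digest(flushometer_order))
# Including the whole form makes the two hash codes differ.
print(whole_form_digest(pc_order) == whole_form_digest(flushometer_order))
```

The first comparison prints `True` and the second `False`: a signature over the values alone is equally valid for either order, while a signature over the whole form is not.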
The problem gets worse if you have to deal with the U.S. federal government. Government regulations for purchase orders and requisitions often spell out the contents of forms in minute detail, right down to the font face and type size. Failure to adhere to the exact specifications can lead to your invoice for $20,000,000 worth of depleted uranium artillery shells being rejected. Therefore, you not only need to establish exactly what was agreed to; you also need to establish that you met all legal requirements for the form. HTML's forms just aren't sophisticated enough to handle these needs.
XML, however, can. It is almost always possible to use XML to develop a markup language with the right combination of power and rigor to meet your needs, and this example is no exception. In particular UWI.COM has proposed an XML application called the Extensible Forms Description Language (XFDL) for forms with extremely tight legal requirements that are to be signed with digital signatures. XFDL further offers the option to do simple mathematics in the form, for instance to automatically fill in the sales tax and shipping and handling charges and total up the price.
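The kind of simple form arithmetic mentioned above might look like the following Python sketch. It is not XFDL syntax, which this excerpt does not show; the function name, field names, and the sample tax rate are assumptions chosen purely for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def fill_in_totals(subtotal: Decimal, shipping: Decimal,
                   tax_rate: Decimal) -> dict:
    """Compute the derived fields a form could fill in automatically."""
    # Round the tax to whole cents, as a printed invoice would.
    tax = (subtotal * tax_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    total = subtotal + tax + shipping
    return {"subtotal": subtotal, "tax": tax,
            "shipping": shipping, "total": total}


# The book order from earlier in the excerpt, with no sales tax applied.
print(fill_in_totals(Decimal("16.17"), Decimal("3.95"), Decimal("0")))
# The same order with a hypothetical 5% sales tax.
print(fill_in_totals(Decimal("16.17"), Decimal("3.95"), Decimal("0.05")))
```

Using `Decimal` rather than floating point keeps the amounts exact to the cent, which matters when the filled-in totals are part of what gets signed.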
UWI.COM has submitted XFDL to the W3C, but it's really overkill for Web browsers, and thus probably won't be adopted there. The real benefit of XFDL, if it becomes widely adopted, is in business-to-business and business-to-government transactions. XFDL can become a key part of electronic commerce, which is not to say it will become a key part of electronic commerce. It's still early, and there are other players in this space...