Professional XML / Edition 2

Paperback (Print)

This book is the definitive, practical guide to what's in XML and the DOM. The book is applicable to all development languages including Java developers applying XML to new APIs. It is a full breakdown of the parser & XML toolsets available for commercial development.

Product Details

  • ISBN-13: 9781861003119
  • Publisher: Wrox Press, Inc.
  • Publication date: 1/1/2000
  • Series: Programmer to Programmer Series
  • Edition number: 2
  • Pages: 1168
  • Product dimensions: 7.36 (w) x 9.19 (h) x 2.10 (d)

Meet the Author

Didier Martin Didier PH Martin has worked with computers for the last 21 years. Even so, he still enjoys playing with new technologies and XML tools are like new toys at Christmas for him. After developing an accounting package, building robots, and even creating video games, he encountered the SGML world, and later the world of XML and this was the beginning of his passion for markup technologies. He is currently CEO of Talva Corp and enjoys working with his colleagues, creating the next generation of XML tools. Didier lives near a mountain and, when he's not creating tools or working on new standards, don't be surprised to see him on his skis in winter and on his bike in summer. His favorite thought: "a different point of view is worth a thousand points of IQ".

Mark Birbeck Mark Birbeck has been a professional programmer for over 18 years. He's twiddled bits with Z80 and 6502 assembly language and written in C on early Unix systems. He remembers the day Windows actually became something worth using to write programs, and the day Microsoft made their C compiler speak C++. And he'll never forget the first time he saw a web server serve. But for him, none of this compares to the day he stumbled across XML - and he hasn't looked back. His company - - specialise in the development of XML tools that help with the building of portal sites.

Mark would like to apologise to all the people he neglected when writing his chapter, even though none of them have got the faintest idea what he's talking about.

Michael Kay Michael Kay works for the IT services company ICL, where he holds the post of ICL Fellow, responsible for investigating new technologies and advising clients on their exploitation, especially in the information management arena. He is known in the XML world as the author of SAXON, an open source XSL processor. His background (and Ph.D) is in database management: in the past he has designed a number of ICL software products from object-oriented databases to text search engines, and has served on standards committees including the ANSI X3H2 group responsible for SQL. His most recent XML project was the design of a messaging backbone for a cable TV company, to exchange data between a wide variety of systems operated internally and by suppliers. In a personal capacity, he has also been promoting the use of XML for exchanging family history data. Michael is based in Bracknell, England.

Brian Loesgen Brian Loesgen is a Senior Software Engineer at Stellcom Inc., a San Diego-based leader in Internet solutions. At Stellcom Brian is involved in some of the most advanced web application development projects being built today. Brian has spoken at numerous technical conferences worldwide, and has been known to seize every opportunity possible to promote new technologies. He enjoys playing with bleeding edge software and translating those new technologies into real world benefits.

Stephen Mohr Stephen Mohr is a senior systems architect with Omicron Consulting. Over the last ten years, he has specialized in the PC computing platform, designing and developing systems using C++, Java, JavaScript, COM, and various internetworking standards and protocols. His latest efforts include the use of XML for application integration. Stephen holds BS and MS degrees in computer science from Rensselaer Polytechnic Institute. His research interests include distributed object-based computing and the practical applications of artificial intelligence.

Jonathan Pinnock Jonathan Pinnock started programming in Pal III assembler on his school's PDP 8/e, with a massive 4K of memory, back in the days before Moore's Law reached the statute books. After turning his back on computers for three years in order to study Mathematics at Cambridge University, he was forced back into programming in order to make a living, something that he still does from time to time. These days, he works as an independent developer and consultant, mainly in the City of London. He is the author of Professional DCOM Application Development, but hopes that this will not be held against him.

Jonathan lives in Hertfordshire, England, with his wife, two children and 1961 Ami Continental jukebox. His moderately interesting web site is located at, and he can be contacted on

Steven Livingstone Steven is based in Glasgow, Scotland and specialises in developing distributed web applications for business, as well as the creation of e-commerce applications using Site Server and XML. He also maintains the citix.com and web sites. At deltabiz.com, he is currently working on an exciting project in developing a range of next generation electronic commerce products using BizTalk and SOAP. Watch this space.

I would like to thank everyone at Wrox for their help as well as my fellow authors who gave me some good advice when it was needed. Mostly, I would like to thank Donna for the patience she has shown for the last few months (told you we would go on holiday!).

I would be glad to hear from anyone at

Peter Stark Peter Stark works as an architect at He has been working in the WAP Forum from the day it was founded: the first year in the protocols group, and the last year in the applications group, where WML and WMLScript are being specified. He also represents in the W3C HTML working group, which specifies the XML variant of HTML, XHTML. He is originally from Sweden, but currently lives in San Francisco, California.


Read an Excerpt

Chapter 1: Introducing XML

In this chapter we will take you through from a brief discussion of the historical origins of XML to an understanding of how the key aspects of XML-related technology fit together. Along the way we will discuss the nature of XML in general, and its impact on web architectures past and future. We hope this will provide you with a good foundation for digging into the rest of the material in the book.

Markup Languages

Ever since the invention of the printing press, writers have made notes on manuscripts to instruct the printers concerning typesetting, and other production issues. These notes were called markup, and a collection of such notes that conform to a defined syntax and grammar can be called a language. For example, proofreaders use a hand-written markup language (ML) to communicate corrections to authors. Even the modern use of punctuation is a form of markup that remains with the text to advise the reader how to interpret that text. Most of these MLs use a distinct appearance so as to differentiate markup from the text to which it refers. Proofreaders' marks use a combination of cursive handwriting and special symbols to distinguish markup from the typeset text. Similarly, punctuation uses special symbols that cannot be confused with the alphabet and numbers that represent the textual content. Some punctuation symbols are so necessary to the understanding and production of printed English that they were included in the ASCII character set, the basis of the character sets used in almost all modern computers. Therefore these symbols also became part of modern programming language syntaxes, the standardization of the symbol set driving their re-appropriation for roles other than the punctuation of English.

The ASCII standard also defined a set of symbols (the "C0 control characters", with hexadecimal values 00 to 1F) that were intended to be used to mark up the structure of data transmissions. Only a few of these symbols found widespread acceptance, and their use was often inconsistent. The most common example is the character(s) used to delimit the end of a line of text in a document.

Teletype machines used the physical motion-based character pair CR-LF (carriage-return, line-feed), which was later used by both MS-DOS and MS-Windows. In contrast, Unix uses a single LF character, and the MacOS uses a single CR character. Because of these conflicting and non-standard uses of ASCII, document interchange between these systems often requires a translation step - a simple text file cannot be shared without conversion - and this is just the simplest of markup issues that doesn't even address the question of what constitutes a "line" of text. Most word-processing programs have eliminated the use of a text "line", and have instead treated end-of-line markup as "end-of-paragraph", with the ASCII period-space (". ") or period-space-space (".  ") strings being used to delimit sentences (though this method is imperfect).
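The translation step described above can be sketched in a few lines of Java. This is only an illustration added for this passage; the class and method names (LineEndings, toUnix) are hypothetical and do not come from the book.

```java
/** Minimal sketch: normalizes the three common line-end conventions to a single LF. */
final class LineEndings {
    private LineEndings() {}

    /** Converts CR-LF (MS-DOS/Windows) and lone CR (classic MacOS) to LF (Unix). */
    static String toUnix(String text) {
        // Replace the two-character pair first, so it does not become two line feeds.
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String mixed = "dos line\r\nmac line\runix line\n";
        System.out.print(toUnix(mixed));
    }
}
```

Handling the CR-LF pair before the lone CR is the one design point that matters here; reversing the order would double every DOS line break.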

Various forms of delimiters have been used to define the boundaries of containers for content, special symbol glyphs, presentation style of the text, or other special features of a document. For example, the C and C++ programming languages use braces ({ ... }) to delimit units of data or code, such as functions, data structures, and object definitions. A typesetting language, intended for manual human editing, might use more readable strings like .begin and .end. Other languages use other characters, or literal strings of characters - commonly called tags. Of course, there has often been conflict between different sets of tags and their interpretation. Without common delimiter vocabularies, much less common internal data formats, it has been very difficult to convert data from one format to another, or otherwise share data between applications and organizations.

In 1969, a person walked on the Moon for the first time. In the same year, Ed Mosher, Ray Lorie, and Charles F. Goldfarb of IBM Research invented the first modern markup language, Generalized Markup Language (GML). GML was a self-referential language for marking the structure of an arbitrary set of data, and was intended to be a meta-language - a language that could be used to describe other languages, their grammars and vocabularies. GML later became Standard Generalized Markup Language (SGML). In 1986, SGML was adopted as an international data storage and exchange standard by the International Organization for Standardization (ISO), designated ISO 8879. With the major impact of the World Wide Web (WWW) upon human commerce and communications, it could be argued that the quiet invention of GML was a more significant event in the history of technology than the high adventure of that first trip to another celestial body.

SGML is an extremely powerful (and rather complicated) markup language that has been widely used by the U.S. government and its contractors, large manufacturing companies, and publishers of technical information. Publishers often construct paper documents, such as books, reports, and reference manuals in SGML. These SGML documents are then transformed into a presentable format, and then sent to the typesetter and printer. SGML is also used to exchange technical specifications for manufacturing. However, its complexities and the high cost of its implementation have meant that most businesses and individuals cannot afford to embrace this useful technology.

More information about SGML can be found at http://www.oasis-open.org/cover

With advances in the development of the World Wide Web there was a drive for a simpler approach.

Origins and Goals of XML

In 1996, the World Wide Web Consortium (or W3C, http://www.w3.org) began the process of designing an extensible markup language that would combine the flexibility and power of SGML with the widespread acceptance of HTML. The language that became XML drew on the specification of SGML, and indeed, was specified to be a subset of this language. Using SGML as a starting point allowed the design team to concentrate on making what already worked simpler. SGML already provided an open-ended language that could be extended by anyone for their own purposes. The intention that XML should be simpler than SGML was driven by the consideration of ease-of-use: in part the reading and writing of markup by persons using simple and commonly available tools, but also the simplifying of computer processing of documents and interchange datasets. Due to its many optional features, SGML is so complex that it is difficult to write generic parsers, whereas XML parsers are much simpler. In addition, XML leverages existing Internet protocols and software for easy data processing and transmission. Being a proper subset of SGML, XML also retains backwards compatibility with existing SGML-oriented systems, so data marked up in XML could still be used in these systems, saving SGML-based industries a lot of money in conversion costs, whilst leveraging the greater accessibility provided by the Web.

XML 1.0 became a W3C Recommendation in February 1998. The formal specification, including the grammar in Extended Backus-Naur Form (EBNF) notation, is readily available on the Web from the W3C, and there is also an excellent annotated version by Tim Bray, one of the co-editors of the XML specification.

An XML 1.0 FAQ maintained by Peter Flynn et al. on behalf of the W3C's XML Special Interest Group at http://www.ucc.ie/xml/ provides extensive links to other topics related to XML.

XML is a simple, standard way to delimit text data. It has been described as "the ASCII of the Web". It is as if you could use your favorite programming language to create an arbitrary data structure, and then share it with anyone using any other language on any other computing platform. XML tags name the concept you are describing, and named attributes modify the tagged structures. So, you can formally describe the syntax you have devised and share it with others.

Without worrying too much about the particulars of the syntax, we can immediately see how powerful a mechanism the simple addition of tags describing the information they envelop is.

This data description mechanism in XML means it is a great way to share information over the Internet, because:

  • It is open; XML can be used to exchange data with other users and programs in a platform-independent way.
  • Its self-describing nature makes it an effective choice for business-to-business and extranet solutions.
  • You can share data between programs without prior coordination.

As we shall see shortly, the W3C has standardized an API (Application Programming Interface) for working with XML documents, so it is easy to create programs that read and write XML, while the developer community has devised a popular, complementary, event-based alternative API. In addition, XML was designed for ready support of non-European languages and internationalization. Like HTML 4.01, XML is based upon the Universal Character Set (UCS), defined in the ISO/IEC 10646 character set standard (which is currently congruent with the somewhat better-known Unicode standard). All the features that propelled HTML to popularity are present in XML.

However, XML is not, directly, a replacement for HTML. You can read every word of the XML Recommendation (the World Wide Web Consortium's equivalent of a standard) and not find a single word related to visual presentation. Unlike HTML, which fuses data and presentation, XML is about data alone.

Although XML itself is data, the XML community has not forgotten presentation. Unlike traditional methods of presenting data, which relied on extensive bodies of code, the presentation techniques for styling XML are data driven. These range from the simple to the extremely complex. Regardless of the technique chosen, however, XML styling is accomplished through another document dedicated to the task, called a style sheet. In it a designer specifies formatting styles and rules that determine when the styles should be applied. The same style sheet can then be used with multiple documents to create a similar appearance...


Table of Contents

Chapter 2: XML Syntax
Chapter 3: Document Type Definitions
Chapter 4: Data Modeling and XML
Chapter 5: The Document Object Model
Chapter 6: SAX 1.0: The Simple API for XML
Chapter 7: Namespaces and Schemas
Chapter 8: Linking and Querying
Chapter 9: Transforming XML
Chapter 10: XML and Databases
Chapter 11: Server to Server
Chapter 12: eBusiness and XML
Chapter 13: Styling XML
Chapter 14: Wireless Application Protocol
Chapter 15: Case Study 1 - Data Duality
Chapter 16: Case Study 2 - XML and Distributed Applications
Chapter 17: Case Study 3 - Book Catalog Information Service
Chapter 18: Case Study 4 - SOAP
Appendix A: Extensible Markup Language (XML) 1.0 Specification
Appendix B: IE5 XML Document Object Model
Appendix C: SAX 1.0: The Simple API for XML
Appendix D: XML Schemas and Data Types
Appendix E: IE5 XSL Reference
Appendix F: CSS Properties
Appendix G: Installing XT
Appendix H: Support and Errata
Index

First Chapter

Chapter Six

This manuscript is an abridged version of a chapter from the Wrox Press book Professional XML. Chapter 6 of Professional XML covers the Simple API for XML, or SAX, interface. It explains why you might use it instead of the DOM, and will get you writing simple applications with SAX, as well as explaining a little bit about where it came from, and where it's going.

Professional XML is a broad compendium that investigates and describes how the total XML concept will work for programmers. It's the next edition of the popular XML Applications (Wrox 1998).

The focus of Professional XML is on real-world applications that use XML as an enabling technology. It presents good design techniques, and shows how to interface XML-enabled applications with Web applications and database systems. It explores the frontiers of XML and previews some nascent technologies. Whether your requirements are oriented toward data exchange or visual styling, this book will cover all the relevant techniques in the XML community.

Professional XML is for anyone who wants to use XML to build applications and systems. Web site developers can learn techniques to take their sites to the next level of sophistication. Managers, designers, and software architects can learn where XML fits into their systems and how to use it to solve problems in application integration. For further details about the book, and other books in our range, visit the Wrox Press Web Site.


SAX 1.0: The Simple API for XML

In Chapter 5 we looked at how to write applications using the Document Object Model. In this chapter we'll look at an alternative way of processing an XML document: the SAX interface. We'll start by discussing why you might choose to use the SAX interface rather than the DOM. Then we'll explore the interface by writing some simple applications. We'll also discuss some design patterns that are useful when creating more complex SAX applications, and finally we'll look at where SAX is going next.

SAX is a very different style of interface from DOM. With DOM, your application asks what is in the document by following object references in memory; with SAX, the parser tells the application what is in the document by notifying the application of a stream of parsing events.

SAX stands for "Simple API for XML". Or if you really want it in full, the Simple Application Programming Interface for Extensible Markup Language.

As the name implies, SAX is an interface that allows you to write applications to read the data held in an XML document. It's primarily a Java interface, and all of our examples will be in Java. (Since we don't have the space to explain Java in this chapter we will assume knowledge of it for the purposes of this exposition. See Beginning Java 2, Wrox Press ISBN 1861002238, or the documentation at for more information.)

The SAX interface is supported by virtually every Java XML parser, and the level of compatibility is excellent. For a list of some of the implementations see or David Megginson's site at

To write a SAX application in Java, you'll need to install the SAX classes (in addition to the Java JDK, of course). In most cases you'll find that the XML Parser does this for you automatically (we'll tell you where you can get parsers shortly). Check to see that classes such as org.xml.sax.Parser are present somewhere on your classpath. If not, you can install them from

We'll say a few words later on about where SAX came from and where it's going. But for the moment, we'll just mention a most remarkable feature: SAX doesn't belong to any standards body or consortium, nor to any company or individual; it just exists in cyberspace for anyone to implement and everyone to use. In particular, unlike most of the XML family of standards it has nothing to do with the W3C.

SAX development is co-ordinated by David Megginson, and its specification can be found on his site: That specification, with trivial editorial changes, is reproduced for convenience in Appendix C of this book.

An Event-Based Interface

There are essentially three ways you can read an XML document from a program.

1.    You can just read it as a file and sort out the tags for yourself. This is the hacker's approach, and we don't recommend it. You'll quickly find that dealing with all the special cases (different character encodings, escape conventions, internal and external entities, defaulted attributes and so on) is much harder work than you thought; probably you won't deal with all these special cases correctly and sooner or later someone will feed you a perfectly good XML document that your program can't handle. Avoid the temptation: it's not as if XML parsers are expensive (most are free).

2.    You can use a parser that analyses the document and constructs a tree representation of its contents in memory: the output from the parser passes into the Document Object Model, or DOM. Your program can then start at the top of the tree and navigate around it, following references from one element to another to find the information it needs.

3.    You can use a parser that reads the document and tells your program about the symbols it finds, as it finds them. For example it will tell you when it finds a start tag, when it finds some character data, and when it finds an end tag. This is called an event-based interface because the parser notifies the application of significant events as they occur. If this is the right kind of interface for you, use SAX.
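To make the contrast concrete, here is a rough sketch of the second (tree-based) approach. It is not taken from the chapter: it assumes the JAXP classes bundled with a current JDK (DocumentBuilderFactory), and the class name TreeApproach and the books/book element names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Sketch of the tree-based approach: parse everything, then navigate. */
final class TreeApproach {
    private TreeApproach() {}

    /** Returns the text of the first book element; assumes at least one exists. */
    static String firstBookTitle(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The whole document is read into memory before this call returns.
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        // Navigation starts at the top of the tree and follows references downward.
        Node book = doc.getDocumentElement().getElementsByTagName("book").item(0);
        return book.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book>Professional XML</book></books>";
        System.out.println(firstBookTitle(xml));
    }
}
```

Here the application is in control and asks the tree for what it wants, which is exactly what changes in the event-based style.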

Let's look at event-based parsing in a little more detail.

You may have come across the term 'event-based' in user interface programming, where an application is written to respond to events such as mouse-clicks as they occur. An event-based parser is similar: in particular, you have to get used to the idea that your application is not in control. Once things have been set in motion you don't call the parser, the parser calls you. That can seem strange at first, but once you get used to it, it's not a problem. In fact, it's much easier than user-interface programming, because unlike a user going crazy with a mouse, the XML parsing events occur in a rather predictable sequence. XML elements have to be properly nested, so you know that every element that's been opened will sooner or later be closed, and so on.

Consider a simple XML file such as the following:

<?xml version="1.0"?>
<books>
    <book>Professional XML</book>
</books>

As the parser processes this, it will call a sequence of methods such as the following (we'll describe the actual method names and parameters later, this is just for illustration):


startElement( "books" )

startElement( "book" )

characters( "Professional XML" )

endElement( "book" )

endElement( "books" )


All your application has to do is to provide methods to be called when the events such as startElement and endElement occur.
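A minimal sketch of such an application follows. Note the assumptions: it uses the SAX 2 DefaultHandler class and the JAXP SAXParserFactory that ship with a current JDK, not the SAX 1.0 classes this chapter describes, and the class name EventTrace is hypothetical. It simply records each callback so the sequence can be inspected.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch: records the events the parser reports, in the order it reports them. */
final class EventTrace extends DefaultHandler {
    private final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // qName is used because the default JAXP parser is not namespace-aware.
        events.add("startElement(" + qName + ")");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Whitespace between elements is also reported; this sketch skips it.
        // A parser is allowed to deliver one run of text in several calls.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("characters(" + text + ")");
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement(" + qName + ")");
    }

    /** Parses the given XML text and returns the recorded event list. */
    static List<String> trace(String xml) throws Exception {
        EventTrace handler = new EventTrace();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>\n<books>\n    <book>Professional XML</book>\n</books>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```

Once parse() is called, the application code runs only inside the three callbacks; this is the sense in which the parser calls you.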

Why Use an Event-Based Interface?

Given that you have a choice, it's important to understand when it's best to use an event-based interface like SAX, and when it's better to use a tree-based interface like the DOM.

Both interfaces are well standardized and widely supported, so whichever you choose, you have a wide choice of good quality parsers available, most of which are free. In fact many of the parsers support both interfaces.

The Benefits of SAX

The following sections outline the most obvious benefits of the SAX interface.

It Can Parse Files of Any Size

Because there is no need to load the whole file into memory, memory consumption is typically much less than with the DOM, and it doesn't increase with the size of the file. Of course the actual amount of memory used by the DOM depends on the parser, but in many cases a 100 KB document will occupy at least 1 MB of memory.

A word of caution though: if your SAX application builds its own in-memory representation of the document, it is likely to take up just as much space as if you allowed the parser to build it.

It Is Useful When You Want to Build Your Own Data Structure

Your application might want to construct a data structure using high-level objects such as books, authors, and publishers rather than low-level elements, attributes, and processing instructions. These "business objects" might only be distantly related to the contents of the XML file; for example, they may combine data from the XML file and other sources. If you want to build up an application-oriented data structure in memory in this way, there is very little advantage in building up a low-level DOM structure first and then demolishing it. Just process each event as it occurs, to make the appropriate incremental change to your business object model.
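As an illustration of this idea, the sketch below builds a list of simple Book objects directly from the event stream, with no intermediate tree. Everything specific in it is an assumption made for the example: the document layout (book elements containing title and author), the class names CatalogBuilder and Book, and the use of the SAX 2 DefaultHandler with JAXP rather than the SAX 1.0 classes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch: turns parsing events straight into application-level objects. */
final class CatalogBuilder extends DefaultHandler {
    /** Hypothetical business object; fields stay null if the element is absent. */
    static final class Book {
        String title;
        String author;
    }

    private final List<Book> books = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Book current;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0);                    // start collecting text for this element
        if ("book".equals(qName)) {
            current = new Book();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);       // text may arrive in several pieces
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return;                           // outside any book element
        }
        if ("title".equals(qName)) {
            current.title = text.toString().trim();
        } else if ("author".equals(qName)) {
            current.author = text.toString().trim();
        } else if ("book".equals(qName)) {
            books.add(current);               // the incremental change to the model
            current = null;
        }
    }

    static List<Book> load(String xml) throws Exception {
        CatalogBuilder handler = new CatalogBuilder();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.books;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>A Title</title><author>An Author</author></book></books>";
        for (Book b : load(xml)) {
            System.out.println(b.title + " by " + b.author);
        }
    }
}
```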

It Is Useful When You Only Want A Small Subset Of The Information

If you are only interested, say, in counting how many books have arrived in the library this week, or in determining their average price, it is very inefficient and quite unnecessary to read all the data that you don't want into memory along with the small amount that you do want. One of the beauties of SAX is that it makes it very easy to ignore the data you aren't interested in.
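The counting and averaging case might look roughly like this. The sketch assumes a hypothetical layout in which every book element contains exactly one price element holding a plain decimal number; the class name PriceStats is likewise invented, and the SAX 2 DefaultHandler with JAXP stands in for the SAX 1.0 classes discussed in this chapter.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch: keeps two running totals and ignores everything else in the document. */
final class PriceStats extends DefaultHandler {
    private final StringBuilder priceText = new StringBuilder();
    private boolean inPrice;
    private int bookCount;
    private double totalPrice;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("book".equals(qName)) {
            bookCount++;
        } else if ("price".equals(qName)) {
            inPrice = true;
            priceText.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inPrice) {                        // all other text is simply dropped
            priceText.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("price".equals(qName)) {
            totalPrice += Double.parseDouble(priceText.toString().trim());
            inPrice = false;
        }
    }

    int bookCount() {
        return bookCount;
    }

    /** Average over all books; assumes one price per book, 0.0 if there are none. */
    double averagePrice() {
        return bookCount == 0 ? 0.0 : totalPrice / bookCount;
    }

    static PriceStats scan(String xml) throws Exception {
        PriceStats stats = new PriceStats();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), stats);
        return stats;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>A</title><price>10.00</price></book>"
                + "<book><title>B</title><price>30.00</price></book></books>";
        PriceStats stats = scan(xml);
        System.out.println(stats.bookCount() + " books, average price " + stats.averagePrice());
    }
}
```

The handler holds a counter, a total, and one small buffer, so its memory use does not grow with the number of books.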

It Is Simple

As the name suggests, it's really quite simple to use.

It Is Fast

If it's possible to get the information you need from a single serial pass through the document, SAX will almost certainly be the fastest way to get it.

The Drawbacks of SAX

Having looked at the benefits it is only fair to address the potential drawbacks in using SAX.

There's No Random Access to the Document

Because the document is not in memory you have to handle the data in the order it arrives. SAX can be difficult to use when the document contains a lot of internal cross-references, for example using ID and IDREF attributes.

Complex Searches Can Be Difficult to Implement

Complex searches can be quite messy to program as the responsibility is on you to maintain data structures holding any context information you need to retain, for example the attributes of the ancestors of the current element.
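One common way to hold that context is a stack that mirrors the nesting of open elements. The following is a sketch only, with a hypothetical class name (PathTracker) and the SAX 2 DefaultHandler with JAXP assumed in place of SAX 1.0. It tracks element names; an application that needs ancestor attributes would have to copy them (for instance into an AttributesImpl), because the Attributes object passed to startElement is only valid during that call.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch: maintains the chain of ancestors of the current element. */
final class PathTracker extends DefaultHandler {
    private final Deque<String> open = new ArrayDeque<>();   // outermost element first
    private final List<String> paths = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        open.addLast(qName);
        // At this point 'open' lists every ancestor plus the current element.
        paths.add("/" + String.join("/", open));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.removeLast();   // proper nesting guarantees this is the matching element
    }

    /** Returns the path of every element, in document order. */
    static List<String> pathsOf(String xml) throws Exception {
        PathTracker handler = new PathTracker();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.paths;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>A Title</title></book></books>";
        for (String path : pathsOf(xml)) {
            System.out.println(path);
        }
    }
}
```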

The DTD Is Not Available

SAX 1.0 doesn't tell you anything about the contents of the DTD. Actually the DOM doesn't tell you much about it either, though some vendors have extended the DOM interface to do so. This isn't a problem for most applications: the DTD is mainly of interest to the parser; and as we'll see towards the end of the chapter the problem is fixed in SAX 2.0.

Lexical Information Is Not Available

The design principle in SAX is that it doesn't provide you with lexical information. SAX tries to tell you what the writer of the document wanted to say, and avoids troubling you with details of the way they chose to say it. For example:

  • You can't find out whether the original document contained "&#xa;" or "&#10;" or whether it contained a real newline character: all three are reported to the application in the same way.

  • You don't get told about comments in the document: SAX assumes that comments are there for the author's benefit, not for the reader's.

  • You don't get told about the order in which attributes were written: it isn't supposed to matter.

These restrictions are only a problem if you want to reproduce the way the document was written, perhaps for the benefit of future editing. For example, if you are writing an application designed to leave the existing content of the document intact, but to add some extra information from another source, the document author might get upset if you change the order of the attributes arbitrarily, or lose all the comments. In fact, most of the restrictions apply just as much to the DOM, although it does give you a little more information in some areas: for example, it retains comments. Again, many of the restrictions are fixed in SAX 2.0; though not all, for example the order of attributes is still a closely guarded secret, as is the choice of delimiter (single or double quotes).
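The first two points are easy to check with a small experiment. The sketch below collects the character data of a document and can be fed the three spellings of a newline. It assumes the SAX 2 DefaultHandler and JAXP from a current JDK rather than the SAX 1.0 classes, and the class name LexicalDemo is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch: shows that the application sees content, not the way it was written. */
final class LexicalDemo extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    /** Returns all character data in the document, concatenated. */
    static String textOf(String xml) throws Exception {
        LexicalDemo handler = new LexicalDemo();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        String hex = textOf("<a>x&#xa;y</a>");       // hexadecimal character reference
        String dec = textOf("<a>x&#10;y</a>");       // decimal character reference
        String raw = textOf("<a>x\ny</a>");          // a real newline character
        System.out.println(hex.equals(dec) && dec.equals(raw));

        // The comment is not reported through this handler at all.
        System.out.println(textOf("<a>x<!-- a note -->y</a>"));
    }
}
```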

SAX Is Read-Only

The DOM allows you to create or modify a document in memory, as well as reading a document from an XML source file. SAX, by contrast, is designed for reading XML documents, not for writing them.

Actually it turns out that the SAX interface is quite handy for writing XML documents as well as reading them. As we'll see later, the same stream of events that the parser sends to the application when reading an XML document can equally be sent from the application to an XML generator when writing one.
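
As a rough illustration of using the event stream for output, the sketch below assumes Python's standard library, where xml.sax.saxutils.XMLGenerator is a content handler that serializes the events it receives. The write_catalog and copy_xml functions and the element names are hypothetical, chosen only for this example.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def write_catalog(titles):
    """Build a document by sending SAX events from the application to a generator."""
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("catalog", AttributesImpl({}))
    for title in titles:
        gen.startElement("book", AttributesImpl({}))
        gen.characters(title)  # markup characters are escaped by the generator
        gen.endElement("book")
    gen.endElement("catalog")
    gen.endDocument()
    return out.getvalue()


def copy_xml(xml_text):
    """Connect a parser directly to the generator: events in, XML text out."""
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), XMLGenerator(out, encoding="utf-8"))
    return out.getvalue()


written = write_catalog(["Tom & Jerry"])
copied = copy_xml("<a><b>hi</b></a>")
```

Because the generator implements the same interface the parser calls, it can sit at the end of any chain of handlers, whether the events come from a parsed file or from application code.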

SAX Is Not Supported In Current Browsers

Although there are many XML parsers that support the SAX interface, at the time of writing there isn't a parser built into a mainstream web browser that supports it. You can incorporate a SAX-compliant parser within a Java applet, of course, but the overhead of downloading it from the server may strain the patience of a user with a slow Internet connection. In practice, your choice of interfaces for client-side XML programming is rather limited...



The Extensible Markup Language (XML) has emerged in just a few short years as nothing less than a phenomenon in computing. It is a concept, elegant in its simplicity, driving dramatic changes in the way Internet applications are written.

What Does This Book Cover?

This book explains and demonstrates the essential techniques for designing, using, and displaying XML documents. First and foremost, this book covers the fundamentals of XML as they are codified by the World Wide Web Consortium (W3C). The W3C is the standards body that originated XML in a formal way and continues to develop specifications for XML. Although the wider XML community is increasingly jumping in and offering new XML-related ideas outside the control of the W3C, the W3C is still central and important to the development of XML.

The focus of this book is on learning how to use XML as an enabling technology in real-world applications. It presents good design techniques, and shows how to interface XML-enabled applications with Web applications and database systems. It explores the frontiers of XML and previews some nascent technologies. Whether your requirements are oriented toward data exchange or visual styling, this book will cover all the relevant techniques in the XML community.

Each chapter contains a practical example. As XML is a platform-neutral technology, the examples cover a variety of languages, parsers, and servers. All the techniques are relevant across all the platforms, so you can get valuable insight from the examples even if they are not implemented using your favorite platform.

Who Is This Book For?

This book is for anyone who wants to use XML to build applications and systems. Web site developers can learn techniques to take their sites to the next level of sophistication, while programmers and software architects can learn where XML fits into their systems and how to use it to solve problems in application integration.

XML applications are usually distributed in nature and are commonly Web oriented. This is not a book specifically about distributed systems or Web development, so you do not need deep familiarity with those areas. A general awareness of multi-tier architectures and internetworking via the Web will be sufficient.

The examples in this book use a variety of programming languages and technologies. An important part of XML's appeal is the fact that it is platform and language neutral. If you have done some Web development, chances are you will find some examples written in your favorite language. If you don't see any examples specific to your platform, take heart. Tools for working with XML are available for Perl, C++, Java, JavaScript, and any COM-enabled language. Microsoft Internet Explorer (mainly version 5.0 and later) has strong XML capabilities built into it, and the Mozilla browser (the community source successor to Netscape's proprietary browser) is gaining similar support. XML tools are also turning up in major relational database management systems, as well as Web and application servers. If your platform isn't covered in this book, learn the fundamentals of XML and study the techniques presented in the examples, and you will be able to apply the lessons you learn here on any common computing platform.

How Is This Book Structured?

Each chapter of this book takes up a separate topic pertaining to XML. Chapter 1 provides a conceptual introduction to the main aspects of XML. Chapters 2 and 3 are closely related as they cover the fundamentals of XML. Chapter 2 gets you started by covering the basic syntax and rules of XML. Chapter 3 takes you forward by providing tools for formally defining your own problem-specific XML vocabulary. The remaining chapters, however, are largely self-contained in terms of the techniques and technologies they present.

The main chapters are tied together with a unifying example. The example will assume a publisher wants to present their catalog of books in XML form. We will start by devising rules for describing books in a catalog, then build on those rules to show how each technology takes a turn in helping us build XML applications. You will see how book catalogs can be turned into documents, how such documents can be manipulated and accessed in code, and how their content can be styled for human readers. Since such an application would not, in practice, exist in a vacuum, you will also see how XML applications interface with databases.

There are several threads that run through this book which are outlined in the next section. This should allow you to read through the book focusing only on those issues that are important to you, skimming other sections.

Learning Threads

XML is evolving from its simple roots as a document markup language to a large, wide-ranging field of related markup technologies. It is this growth that is powering XML applications. With growth comes divergence. Different readers will come to this book with different expectations. XML is different things to different people. While we hope that you will read this book cover to cover, that is not necessary. Indeed, that may not even be the best way for everyone to approach this book.

This book has three threads springing from a common core. While you can certainly start at the first chapter and work your way sequentially through to the last, you can follow a more direct path to the knowledge you need. Everyone should read the core chapters to gain a common understanding of what XML encompasses. From there, you can approach XML as data or as content for visual presentation and styling.


Chapters 2 (Well-formed XML) and 3 (DTDs) cover the fundamentals of XML 1.0. Chapter 2 gives you the basic syntax, while Chapter 3 tells you how to formally specify an XML vocabulary in a way that every XML programmer is expected to understand. These chapters form the irreducible minimum you need to understand XML and begin working with it. Chapter 4 (on Data Modeling) gives you effective guidelines and lessons in creating good XML structures. It's hard to recover from a bad XML vocabulary, but a good one will forgive a lot of programming mistakes. Chapter 5 teaches you the Document Object Model (DOM), the W3C's API for XML documents, among other things. This takes you out of the realm of documents and into the world of applications.

These four chapters are enough for you to begin XML applications programming. When you are finished with them, you will know what XML is, how to structure it, and how to manipulate XML documents in code. Although a wealth of XML techniques lies ahead, you will have a firm foundation upon which to build.

So the 'Core' thread includes:

Chapter 2: Well-formed XML
Chapter 3: Document Type Definitions
Chapter 4: Data Modeling
Chapter 5: Document Object Model

XML as Data

As you will see in the core chapters, XML, unlike HTML, clearly separates the content of a document from its visual representation. In fact, for the purposes of many applications, visual rendering of XML documents is not important. These applications treat XML as data. The concern here is with using XML as an interface between programs and systems. This may be the most exciting area of XML today, especially where XML can enable e-commerce as a technique for Web applications that negotiate commercial transactions.

Chapter 6 starts this thread. It discusses an event driven API (called SAX) for manipulating XML documents. As such, the API is especially useful for processing very large quantities of XML, streams of XML, or for when you need the smallest possible footprint in a parser. Chapter 7 introduces Namespaces and Schemas, two areas that let us express concepts more creatively and effectively than we can with DTDs. They are the emerging core for describing data in XML.

Chapter 8 will show you how to link documents and query within a document for a particular element. The querying technology used in the examples actually stems from the styling side of XML, and this chapter does double duty by appearing in the 'Presentation' thread as well as this one. It is useful in this thread for demonstrating how queries can be used to find the elements we need quickly, and for showing how we can relate different XML documents. Chapter 9 (Manipulating XML) also covers techniques for transforming XML documents for various purposes. It is interesting from the standpoint of data because it presents some very powerful techniques for translating between vocabularies. It will prove useful for the interchange of data, particularly in e-commerce and business-to-business situations. Again, this chapter also has a bearing on the 'Presentation' thread, as it introduces the idea of transforming XML documents into other languages, which can help when it comes to presenting XML for a user to view.

Chapter 10 (XML and Databases) is all about data. Relational databases and XML are two approaches to capturing data for computing, although they play different roles. This chapter teaches you how to interface the traditional approach to data storage with the use of XML. Chapter 11 (Server to Server) will show you how to reach out to another server when you don't have the data locally. This is a novel technique that is going to become common as web applications move to the forefront of computing. Chapter 12 then draws on the information in these two preceding chapters in its discussion of the use of XML as the messaging medium for e-commerce interactions. In this case, the other server belongs to a business partner. The chapter examines the issues of exchanging data in this context, where XML fits into this picture, and the details of how it is used.

Wrapping up this thread is the discussion of WAP (the Wireless Application Protocol) and its associated use of XML in the Wireless Markup Language (WML), in Chapter 14. Much of WAP is concerned with the metamorphosis of data from the verbose form of XML to a compact binary representation for use on mobile devices, without losing the benefits of the former. Considering this problem and seeing WAP's solution will let you better appreciate the benefits of XML as a data exchange medium. In addition, if XML is going to be used to store and transfer data, you'll want to put your data on all the common data devices, which will increasingly include wireless devices like cell phones and dedicated Web devices.

So our XML-as-data thread consists of:

Chapter 6: SAX: The Simple API for XML
Chapter 7: Namespaces and Schemas
Chapter 8: Linking and Querying
Chapter 9: Manipulating XML
Chapter 10: XML and Databases
Chapter 11: Server to Server
Chapter 12: eBusiness
Chapter 14: WAP and WML

Visual Presentation of XML

The data thread is great for moving data about between machines, but if you plan to pass XML to a human, you will be interested in the styling thread. Unlike more traditional computing fields that have focused on data, such as relational databases, the XML community has given quite a bit of thought to how data can be rendered efficiently. XML's solution is, appropriately, data-driven. Whether we use CSS or XSL, we apply the data in style sheets to the data in an XML document to produce a visual representation for the human consumer of our data.

Chapter 8, Linking and Querying, starts this thread. This is because a subset of the querying technology lets a programmer specify a set of criteria that is used to select the part of a document that has to be styled. Styling can be as precise as specifying how to render particular elements depending on the context in which they are found. The same type of element can be rendered differently depending on who its parent is, or what else appears near it. With the context in place, Chapter 9 gives programmers the knowledge for transforming XML, if needed, into some other format suited to presentation. This is at the heart of data-driven styling.

Chapter 13 (Styling) builds on Chapters 8 and 9 to teach you styling for XML. Our style sheets become powerful sets of rules that are applied to the data in XML documents to create a visual presentation. From one set of data, you can quickly and efficiently produce multiple views for presentation. This is where the benefits of separating data from presentation are fully realized.

Chapter 14 (WAP) is included in the presentation thread because styling is an important consideration for small devices, and small devices are the primary users of wireless communications. It addresses how designers can compress the visual representation to fit the constraints of a very small display. This parallels the consideration our data counterparts have to give to compressing the data to fit through a low bandwidth network connection. Because your styling is driven by a style sheet and not embedded with the data, you can create an effective presentation format specifically for the wireless device.

To recap, our presentation thread consists of:

Chapter 8: Linking and Querying
Chapter 9: Manipulating XML
Chapter 13: Styling
Chapter 14: WAP


Customer Reviews

  • Anonymous

    Posted May 11, 2001

    Excellent XML Overview -- Coverage is wide, but not overly deep.

    Coverage of a wide variety of topics under the XML umbrella. Coverage is accurate and wide, but, sometimes, not very deep. This book is an excellent Intermediate Level reference for XML, SOAP, The XML Document Object Model (DOM) and SAX.
