Processing XML with Java: A Guide to SAX, DOM, JDOM, JAXP, and TrAX

Paperback (Print)

Overview

Praise for Elliotte Rusty Harold’s Processing XML with Java

“The sophistication and language are very appropriate for Java and XML application developers. You can tell by the way the author writes that he too is a developer. He delves very deeply into the topics and has really taken things apart and investigated how they work. I especially like his coverage of ‘gotchas,’ pitfalls, and limitations of the technologies.”

        —John Wegis, Web Engineer, Sun Microsystems, Inc.

“Elliotte has written an excellent book on XML that covers a lot of ground and introduces current and emerging technologies. He helps the novice programmer understand the concepts and principles of XML and related technologies, while covering the material at a level that’s deep enough for the advanced developer. With a broad coverage of XML technologies, lots of little hints, and information I haven’t seen in any other book on the topic, this work has become a valuable addition to my technical library.”

        —Robert W. Husted, Member, Technical Staff, Requisite Technology, Inc.

“The code examples are well structured and easy to follow. They provide real value for someone writing industrial-strength Java and XML applications. The time saved will repay the cost of this book a hundred times over.

“The book also contains more of the pearls of wisdom we’ve come to expect from Elliotte Rusty Harold—the kind of pointers that will save developers weeks, if not months, of time.”

        —Ron Weber, Independent Software Consultant

Written for Java programmers who want to integrate XML into their systems, this practical, comprehensive guide and reference shows how to process XML documents with the Java programming language. It leads experienced Java developers beyond the basics of XML, allowing them to design sophisticated XML applications and parse complicated documents.

Processing XML with Java™ provides a brief review of XML fundamentals, including XML syntax; DTDs, schemas, and validity; stylesheets; and the XML protocols XML-RPC, SOAP, and RSS. The core of the book comprises in-depth discussions on the key XML APIs Java programmers must use to create and manipulate XML files with Java. These include the Simple API for XML (SAX), the Document Object Model (DOM), and JDOM (a Java native API). In addition, the book covers many useful supplements to these core APIs, including XPath, XSLT, TrAX, and JAXP.

Practical in focus, Processing XML with Java™ is filled with over two hundred examples that demonstrate how to accomplish various important tasks related to file formats, data exchange, document transformation, and database integration. You will learn how to read and write XML documents with Java code, convert legacy flat files into XML documents, communicate with network servers that send and receive XML data, and much more. Readers will find detailed coverage of the following:

  • How to choose the right API for the job
  • Reading documents with SAX
  • SAX filters
  • Validation in several schema languages
  • DOM implementations for Java
  • The DOM Traversal Module
  • Output from DOM
  • Reading and writing XML documents with JDOM
  • Searching XML documents with XPath
  • Combining XSLT transforms with Java code
  • TrAX, the Transformations API for XML
  • JAXP, the Java API for XML Processing

In addition, the book includes a convenient quick reference that summarizes the major elements of all the XML APIs discussed. A related Web site, located at http://www.cafeconleche.org/books/xmljava/, contains the entire book in electronic format, as well as updates and links referenced in the book.

With thorough coverage of the key XML APIs and a practical, task-oriented approach, Processing XML with Java™ is a valuable resource for all Java programmers who need to work with XML.


Editorial Reviews

From Barnes & Noble
The Barnes & Noble Review
Experienced Java programmers have plenty of XML tools to choose from: More open source XML tools have been written in Java than in any other language. But many of these tools are far less flexible than XML itself, because, in Elliotte Rusty Harold’s words, they treat XML as “just a funny kind of database, or just like an object, or just like remote procedure calls.” To gain true mastery, says Harold, accept XML on its own terms, “in all its messiness: valid and invalid, mixed and unmixed, typed and untyped, and both all and none of these at the same time.” Harold’s Processing XML with Java will help you cope with whatever XML challenges come your way.

This book delivers all the clarity and readability we’ve come to expect from Harold. Using extensive examples, he shows how to use XML as a data format; write and read XML; and convert flat files to XML. Next, he moves on to three XML APIs every Java developer should understand: SAX, with its XMLReader interface and filters; DOM; and finally, the cleaner, more modern JDOM. Harold concludes with useful introductions to XPath, XSLT, and the new Transformations API for XML. Highly recommended. Bill Camarda

Bill Camarda is a consultant, writer, and web/multimedia content developer. His 15 books include Special Edition Using Word 2000 and Upgrading & Fixing Networks For Dummies®, Second Edition.


Product Details

  • ISBN-13: 9780201771862
  • Publisher: Addison-Wesley
  • Publication date: 11/28/2002
  • Pages: 1071
  • Product dimensions: 7.40 (w) x 9.20 (h) x 2.20 (d)

Meet the Author

Elliotte Rusty Harold is an internationally respected writer, programmer, and educator. He is an Adjunct Professor of Computer Science at Polytechnic University in Brooklyn, where he lectures on Java and object-oriented programming. His Cafe con Leche Web site has become one of the most popular sites for information on XML. In addition, he is the author and coauthor of numerous books, the most recent of which are The XML Bible (John Wiley & Sons, 2001) and XML in a Nutshell (O'Reilly, 2002).


Read an Excerpt

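One night five developers, all of whom wore very thick glasses and had recently been hired by Elephants, Inc., the world's largest purveyor of elephants and elephant supplies, were familiarizing themselves with the company's order processing system when they stumbled into a directory full of XML documents on the main server. "What's this?" the team leader asked excitedly. None of them had ever heard of XML before so they decided to split up the files between them and try to figure out just what this strange and wondrous new technology actually was.

The first developer, who specialized in optimizing Oracle databases, printed out a stack of FMPXMLRESULT documents generated by the FileMaker database where all the orders were stored, and began poring over them. "So this is XML! Why, it's nothing novel. As anyone can see who's able, an XML document is nothing but a table!"

"What do you mean, a table?" replied the second developer, well versed in object-oriented theory and occupied with a collection of XMI documents that encoded UML diagrams for the system. "Even a Visual Basic programmer could see that XML documents aren't tables. Duplicates aren't allowed in a table relation, unless this is truly some strange mutation. Classes and objects is what these documents are. Indeed, it should be obvious on the very first pass. An XML document is an object and a DTD is a class."

"Objects? A strange kind of object, indeed!" said the third developer, a web designer of some renown, who had loaded the XHTML user documentation for the order processing system into Mozilla. "I don't see any types at all. If you think this is an object, then it's your software I refuse to install. But with all those stylesheets there, it should be clear to anyone not sedated that XML is just HTML updated!"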

"HTML? You must be joking" said the fourth, a computer science professor on sabbatical from MIT, who was engrossed in an XSLT stylesheet that validated all of the other documents against a Schematron schema. "Look at the clean nesting of hierarchical structures, each tag matching its partner as it should. I've never seen HTML that looks this good. What we have here is S-expressions, which is certainly nothing new. Babbage invented this back in 1882!"

"S-expressions?" queried the technical writer, who was occupied with documentation for the project, written in DocBook. "Maybe that means something to those in your learned profession. But to me, this looks just like a FrameMaker MIF file. However, locating the GUI does seem to be taking me a while."

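And so they argued into the night, none of them willing to give an inch, all of them presenting still more examples to prove their points, but none bothering to look at the others' examples. Indeed, they're probably still arguing today. You can even hear their shouts from time to time on xml-dev. Their mistake, of course, was in trying to force XML into the patterns of technologies they were already familiar with rather than taking it on its own terms. XML can store data, but it is not a database. XML can serialize objects, but an XML document is not an object. Web pages can be written in XML, but XML is not HTML. Functional (and other) programming languages can be written in XML, but XML is not a programming language. Books are written in XML, but that doesn't make XML desktop publishing software.

There are a lot of tools, APIs, and applications in the world that pretend XML is something more familiar to programmers; that it's just a funny kind of database, or just like an object, or just like remote procedure calls. These APIs are occasionally useful in very restricted and predictable environments. However, they are not suitable for processing XML in its most general format. They work well in their limited domains, but they fail when presented with XML that steps outside the artificial boundaries they've defined. XML was designed to be extensible, but it's a sad fact that many of the tools designed for XML aren't nearly as extensible as XML itself.

This book is going to show you how to handle XML in its full generality. It pulls no punches. It does not pretend that XML is anything except XML, and it shows you how to design your programs so that they handle real XML in all its messiness: valid and invalid, mixed and unmixed, typed and untyped, and both all and none of these at the same time. To that end, this book focuses on those APIs that don't try to hide the XML. In particular, there are three major Java APIs that correctly model XML, as opposed to modeling a particular class of XML documents or some narrow subset of XML. These are

  • SAX, the Simple API for XML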
  • DOM, the Document Object Model
  • JDOM, a Java native API

These APIs are the core of this book. In addition, I cover a number of preliminaries and supplements to the basic APIs, including

  • DTDs, schemas, and validity
  • XPath
  • XSLT and the TrAX API
  • JAXP, a combination of SAX, DOM, and TrAX with a few factory classes

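And, since we're going to need a few examples of XML applications to demonstrate the APIs with, I also cover XML-RPC, SOAP, and RSS in some detail. However, the techniques this book teaches are hardly limited to just those three applications.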

Who You Are

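This book is written for experienced Java developers who want to integrate XML into their systems. Java is the ideal language for processing XML documents. Its strong Unicode support in particular made it the preferred language for many early implementers. Consequently, more XML tools have been written in Java than in any other language. More open source XML tools are written in Java than in any other language. More programmers process XML in Java than in any other language.

Processing XML with Java™ will teach you how to

  • Save XML documents from applications written in Java
  • Read XML documents produced by other programs
  • Search, query, and update XML documents
  • Convert legacy flat data into hierarchical XML
  • Communicate with network servers that send and receive XML data
  • Validate documents against DTDs, schemas, and business rules
  • Combine functional XSLT transforms with traditional imperative Java code

This book is intended for Java developers who need to do anything with XML. It teaches the fundamentals and advanced topics, leaving nothing out. It is a comprehensive course in processing XML with Java that takes developers from little knowledge of XML to designing sophisticated XML applications and parsing complicated documents. The examples cover a wide range of possible uses including file formats, data exchange, document transformation, database integration, and more.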

What You Need to Know

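This is not an introductory book with respect to either Java or XML. I assume you have substantial prior experience with Java and preferably some experience with XML. On the Java side, I will freely use advanced features of the language and its class library without explanation or apology. Among other things, I assume you are thoroughly familiar with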

  • Object-oriented programming, including inheritance and polymorphism.
  • Packages and the CLASSPATH. You should not be surprised by classes that do not have main() methods or that are not in the default package.
  • I/O including streams, readers, and writers. You should understand that System.out is a horrible example of what really goes on in Java programs.
  • The Java Collections API including hash tables, maps, sets, iterators, and lists.

In addition, in one or two places in this book I use some SQL and JDBC. These sections are relatively independent of the rest of the book, however, and chances are if you aren't already familiar with SQL, then you don't need the material in these sections anyway.

What You Need to Have

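Most of the material in this book is relatively independent of the specific Java version. Java 1.4 bundles SAX, DOM, and a few other useful classes into the core JDK. However, these are easily installed in earlier JVMs as open source libraries from the Apache XML Project and other vendors. For the most part, I used Java 1.3 and 1.4 when testing the examples; and it's possible that a few classes and methods have been used that are not available in earlier versions. In most cases, it should be fairly obvious how to backport them. All of the basic XML APIs except TrAX should work in Java 1.1 and later. TrAX requires Java 1.2 or later.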

How to Use This Book

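This book is organized as an advanced tutorial that can also serve as a solid and comprehensive reference. Chapter 1 covers the bare minimum material needed to start working with XML, though for the most part this is intended more as a review for readers who've already read other, more basic books than as a comprehensive introduction. The second chapter introduces RSS, XML-RPC, and SOAP, the XML applications we'll be using for examples in the rest of the book. This is followed by two chapters on generating XML from your own programs (a subject which is all too often presented as a lot more complicated than it actually is). The first covers generating XML directly from code. The second covers converting legacy data in other formats to XML. The remaining bulk of the book is devoted to the major APIs for processing XML: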

  • The event-based SAX API
  • The tree-based DOM API
  • The tree-based JDOM API
  • XPath APIs for searching
  • The TrAX API for XSLT processing

Finally, the book finishes with an appendix providing quick references to the main APIs.

If you have limited experience with

  • Chapters 6 to 8 cover SAX.
  • Chapters 9 to 13 cover DOM.
  • Chapters 14 and 15 cover JDOM.

Once you're comfortable with one or more of these APIs, you can read Chapters 16 and 17 on XPath and XSLT. However, those APIs and chapters do require some knowledge of at least one of the three major APIs.

The Online Edition

The entire book is available online in plain-vanilla HTML at my Cafe con Leche web site. You can find it at http://www.cafeconleche.org/books/

The online version has no protection other than copyright law and your own good will. You don't need to register to read it, or to download some special electronic key that becomes invalid when you buy a new laptop (and that probably wouldn't run on Linux or a Mac in the first place). I want people to read and use this book. I do not want to put up silly roadblocks that make it less useful than it could be. I do ask, as a courtesy, that you do not republish the online edition on your own server. Doing so makes it extremely difficult for me to keep the book up to date. If you want to save a few pages on your laptop so you can read this book on an airplane, I don't really mind. But please don't pass out your own copies to anyone else. Instead, refer your friends and colleagues to the web site or the printed book.

Some Grammatical Notes

The rules of English grammar were laid down, written in stone, and encoded in the DNA of elementary school teachers long before computers were invented. Unfortunately, this means that sometimes I have to decide between syntactically correct code and syntactically correct English. When forced to do so, English normally loses. This means that sometimes a punctuation mark appears outside a quotation mark when you'd normally expect it to appear inside, a sentence begins with a lowercase letter, or something similarly unsettling occurs. For the most part, I've tried to use various typefaces to make the offending phrase less jarring. In particular, please note the following:

  • Italicized text is used for emphasis, the first occurrence of an important term, titles of books and other cited works, words in languages other than English, words as words themselves (for example, Booboisie is a very funny word), Java system properties, host names, and resolvable URLs.
  • Monospaced text is used for
  • Italicized monospace text is used for pieces of
  • Bold monospaced text is used for literal text that the user types at a command line, as well as for emphasis in code.

It's not just English grammar that gets a little squeezed, either. The necessities of fitting code onto a printed page rather than a computer screen have occasionally caused me to deviate from the ideal Java coding conventions. The worst problem is line length. I can fit only 65 characters across the page in a line of code. To try to make maximum use of this space, I indent each block by two spaces and indent line continuations by one space, rather than the customary four spaces and two spaces respectively. Even so, I still have to break lines where I otherwise would prefer not to. For example, I originally wrote this line of code for Chapter 4:

  result.append(" <Amount>" + amount + "</Amount>\r\n");

To fit it on the page, however, I had to split it into two pieces, like this:

  result.append(" <Amount>");
  result.append(amount + "</Amount>\r\n");

This wasn't too bad, but sometimes even this wasn't enough and I had to remove indents from the front of the line that would otherwise be present. This occasionally forced the indentation not to line up as prettily as it otherwise might, as in this example from Chapter 3: wout.write(
"

);

The silver lining to this cloud is that sometimes the extra attention I give to the code when I'm trying to cut down its size results in better code. For example, in Chapter 4, I found I needed to remove a few characters from this line:

  OutputStreamWriter wout = new OutputStreamWriter(out, "UTF8");

On reflection I realized that nowhere did the program actually need to know that wout was an OutputStreamWriter as opposed to merely a Writer. Thus I could easily rewrite the offending line as follows:

  Writer wout = new OutputStreamWriter(out, "UTF8");
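
The effect of that change can be sketched in a few lines. This is only an illustration, not code from the book: the class name WriterSketch and the helper writeAmount are hypothetical, and the Amount element is borrowed from the line-splitting example above. Because the helper asks for nothing more specific than a Writer, the same call works with an OutputStreamWriter or with a StringWriter.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.StringWriter;
import java.io.Writer;

public class WriterSketch {

  // Hypothetical helper: it depends only on the Writer type,
  // so any concrete Writer subclass can be passed in.
  static void writeAmount(Writer wout, int amount) throws IOException {
    wout.write("<Amount>" + amount + "</Amount>\r\n");
    wout.flush();
  }

  public static void main(String[] args) throws IOException {
    // Declared as a Writer, as in the rewritten line.
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    Writer wout = new OutputStreamWriter(out, "UTF8");
    writeAmount(wout, 42);
    System.out.print(out.toString("UTF8"));

    // A different kind of Writer needs no change to the helper.
    Writer sout = new StringWriter();
    writeAmount(sout, 42);
    System.out.print(sout.toString());
  }
}
```

Had the helper been declared to take an OutputStreamWriter, the second call would not compile.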

This follows the general object-oriented principle of using the least-specific type that will suit. This polymorphism makes the code more flexible in the future should I find a need to swap in a different kind of Writer.

Contacting the Author

I always enjoy hearing from readers, whether with general comments, specific ways I could improve the book, or questions related to the book's subject matter. Because this book is being published in its entirety online, it is possible for me to reprint at least the online edition much faster than can be done with a traditional paper book. Thus corrections and errata are especially helpful because I have a real chance to fix them. Before sending in a correction, please do check the online edition to see if I have already fixed the problem.

Please send all comments, inquiries, bouquets, and brickbats to elharo@metalab.unc.edu. I get a lot of e-mail, so I can't promise to answer them all; but I do try. It's helpful if you use a subject line that clearly identifies yourself as a reader of this book. Otherwise, your message may accidentally get misidentified as spam I don't want or bulk mail I don't have time to read and be dropped in the bit bucket before I see it. Also, please make absolutely sure that your message uses the correct reply-to address and that the address will be valid for at least several months after you send the message. There's nothing quite as annoying as taking an hour or more to compose a detailed response to an interesting question, only to have it bounce because the reader sent the e-mail from a public terminal or changed their ISP. But please do write to me. I want to hear from you.

Elliotte Rusty Harold
Brooklyn, New York
June 7, 2002


Table of Contents

(NOTE: Each chapter concludes with a Summary.)

List of Examples.

List of Figures.

Preface.

Who You Are.

What You Need to Know.

What You Need to Have.

How to Use This Book.

The Online Edition.

Some Grammatical Notes.

Contacting the Author.

Acknowledgments.

I. XML.

1. XML for Data.

Motivating XML.

A Thought Experiment.

Robustness.

Extensibility.

Ease-of-Use.

XML Syntax.

XML Documents.

XML Applications.

Elements and Tags.

Text.

Attributes.

XML Declaration.

Comments.

Processing Instructions.

Entities.

Namespaces.

Validity.

DTDs.

Schemas.

Schematron.

The Last Mile.

Stylesheets.

CSS.

Associating Stylesheets with XML Documents.

XSL.

2. XML Protocols: XML-RPC and SOAP.

XML as a Message Format.

Envelopes.

Data Representation.

HTTP as a Transport Protocol.

How HTTP Works.

HTTP in Java.

RSS.

Customizing the Request.

Query Strings.

How HTTP POST Works.

XML-RPC.

Data Structures.

Faults.

Validating XML-RPC.

SOAP.

A SOAP Example.

Posting SOAP Documents.

Faults.

Encoding Styles.

SOAP Headers.

SOAP Limitations.

Validating SOAP.

Custom Protocols.

3. Writing XML with Java.

Fibonacci Numbers.

Writing XML.

Better Coding Practices.

Attributes.

Producing Valid XML.

Namespaces.

Output Streams, Writers, and Encodings.

A Simple XML-RPC Client.

A Simple SOAP Client.

Servlets.

4. Converting Flat Files to XML.

The Budget.

The Model.

Input.

Determining the Output Format.

Validation.

Attributes.

Building Hierarchical Structures from Flat Data.

Alternatives to Java.

Imposing Hierarchy with XSLT.

The XML Query Language.

Relational Databases.

5. Reading XML.

InputStreams and Readers.

XML Parsers.

Choosing an XML API.

Choosing an XML Parser.

Available Parsers.

SAX.

DOM.

JAXP.

JDOM.

dom4j.

ElectricXML.

XMLPULL.

II. SAX.

6. SAX.

What Is SAX?

Parsing.

Callback Interfaces.

Implementing ContentHandler.

Using the ContentHandler.

The DefaultHandler Adapter Class.

Receiving Documents.

Receiving Elements.

Handling Attributes.

Receiving Characters.

Receiving Processing Instructions.

Receiving Namespace Mappings.

“Ignorable White Space”.

Receiving Skipped Entities.

Receiving Locators.

What the ContentHandler Doesn't Tell You.

7. The XMLReader Interface.

Building Parser Objects.

Input.

InputSource.

EntityResolver.

Exceptions and Errors.

SAXExceptions.

The ErrorHandler Interface.

Features and Properties.

Getting and Setting Features.

Getting and Setting Properties.

Required Features.

Standard Features.

Standard Properties.

Xerces Custom Features.

Xerces Custom Properties.

DTDHandler.

8. SAX Filters.

The Filter Architecture.

The XMLFilter Interface.

Content Filters.

Filtering Tags.

Filtering Elements.

Filtering Attributes.

Filters That Add Content.

Filters versus Transforms.

The XMLFilterImpl Class.

Parsing Non-XML Documents.

Multihandler Adapters.

III. DOM.

9. The Document Object Model.

The Evolution of DOM.

DOM Modules.

Application-Specific DOMs.

Trees.

Document Nodes.

Element Nodes.

Attribute Nodes.

Leaf Nodes.

Nontree Nodes.

What Is and Isn't in the Tree.

DOM Parsers for Java.

Parsing Documents with a DOM Parser.

JAXP DocumentBuilder and DocumentBuilderFactory.

DOM3 Load and Save.

The Node Interface.

Node Types.

Node Properties.

Navigating the Tree.

Modifying the Tree.

Utility Methods.

The NodeList Interface.

JAXP Serialization.

DOMException.

Choosing between SAX and DOM.

10. Creating XML Documents with DOM.

DOMImplementation.

Locating a DOMImplementation.

Implementation-Specific Class.

JAXP DocumentBuilder.

DOM3 DOMImplementationRegistry.

The Document Interface as an Abstract Factory.

The Document Interface as a Node Type.

Getter Methods.

Finding Elements.

Transferring Nodes between Documents.

Normalization.

11. The DOM Core.

The Element Interface.

Extracting Elements.

Attributes.

The NamedNodeMap Interface.

The CharacterData Interface.

The Text Interface.

The CDATASection Interface.

The EntityReference Interface.

The Attr Interface.

The ProcessingInstruction Interface.

The Comment Interface.

The DocumentType Interface.

The Entity Interface.

The Notation Interface.

12. The DOM Traversal Module.

NodeIterator.

Constructing NodeIterators with DocumentTraversal.

Liveness.

Filtering by Node Type.

NodeFilter.

TreeWalker.

13. Output from DOM.

Xerces Serialization.

OutputFormat.

DOM Level 3.

Creating DOMWriters.

Serialization Features.

Filtering Output.

IV. JDOM.

14. JDOM.

What Is JDOM?

Creating XML Elements with JDOM.

Creating XML Documents with JDOM.

Writing XML Documents with JDOM.

Document Type Declarations.

Namespaces.

Reading XML Documents with JDOM.

Navigating JDOM Trees.

Talking to DOM Programs.

Talking to SAX Programs.

Configuring SAXBuilder.

SAXOutputter.

Java Integration.

Serializing JDOM Objects.

Synchronizing JDOM Objects.

Testing Equality.

Hash Codes.

String Representations.

Cloning.

What JDOM Doesn't Do.

15. The JDOM Model.

The Document Class.

The Element Class.

Constructors.

Navigation and Search.

Attributes.

The Attribute Class.

The Text Class.

The CDATA Class.

The ProcessingInstruction Class.

The Comment Class.

Namespaces.

The DocType Class.

The EntityRef Class.

V. XPath/XSLT.

16. XPath.

Queries.

The XPath Data Model.

Location Paths.

Axes.

Node Tests.

Predicates.

Compound Location Paths.

Absolute Location Paths.

Abbreviated Location Paths.

Combining Location Paths.

Expressions.

Literals.

Operators.

Functions.

XPath Engines.

XPath with Saxon.

XPath with Xalan.

DOM Level 3 XPath.

Namespace Bindings.

Snapshots.

Compiled Expressions.

Jaxen.

17. XSLT.

XSL Transformations.

Template Rules.

Stylesheets.

Taking the Value of a Node.

Applying Templates.

The Default Template Rules.

Selection.

Calling Templates by Name.

TrAX.

Thread Safety.

Locating Transformers.

The xml-stylesheet Processing Instruction.

Features.

XSLT Processor Attributes.

URI Resolution.

Error Handling.

Passing Parameters to Stylesheets.

Output Properties.

Sources and Results.

Extending XSLT with Java.

Extension Functions.

Extension Elements.

VI. APPENDIXES.

Appendix A: XML API Quick Reference.

SAX.

org.xml.sax.

org.xml.sax.ext.

org.xml.sax.helpers.

DOM.

The DOM Data Model.

org.w3c.dom.

org.w3c.dom.traversal.

JAXP.

javax.xml.parsers.

TrAX.

javax.xml.transform.

javax.xml.transform.stream.

javax.xml.transform.dom.

javax.xml.transform.sax.

JDOM.

org.jdom.

org.jdom.filter.

org.jdom.input.

org.jdom.output.

org.jdom.transform.

org.jdom.xpath.

XMLPULL.

org.xmlpull.v1.

Appendix B: SOAP 1.1 Schemas.

The SOAP 1.1 Envelope Schema.

The SOAP 1.1 Encoding Schema.

W3C Software Notice and License.

Appendix C: Recommended Reading.

Books.

Specifications.

Index.


Preface

One night five developers, all of whom wore very thick glasses and had recently been hired by Elephants, Inc., the world's largest purveyor of elephants and elephant supplies, were familiarizing themselves with the company's order processing system when they stumbled into a directory full of XML documents on the main server. "What's this?" the team leader asked excitedly. None of them had ever heard of XML before so they decided to split up the files between them and try to figure out just what this strange and wondrous new technology actually was.

The first developer, who specialized in optimizing Oracle databases, printed out a stack of FMPXMLRESULT documents generated by the FileMaker database where all the orders were stored, and began poring over them. "So this is XML! Why, it's nothing novel. As anyone can see who's able, an XML document is nothing but a table!"

"What do you mean, a table?" replied the second programmer, well versed in object-oriented theory and occupied with a collection of XMI documents that encoded UML diagrams for the system. "Even a Visual Basic programmer could see that XML documents aren't tables. Duplicates aren't allowed in a table relation, unless this is truly some strange mutation. Classes and objects is what these documents are. Indeed, it should be obvious on the very first pass. An XML document is an object and a DTD is a class."

"Objects? A strange kind of object, indeed!" said the third developer, a web designer of some renown, who had loaded the XHTML user documentation for the order processing system into Mozilla. "I don't see any types at all. If you think this is an object, then it's your software I refuse to install. But with all those stylesheets there, it should be clear to anyone not sedated, that XML is just HTML updated!"

"HTML? You must be joking" said the fourth, a computer science professor on sabbatical from MIT, who was engrossed in an XSLT stylesheet that validated all the other documents against a Schematron schema. "Look at the clean nesting of hierarchical structures, each tag matching its partner as it should. I've never seen HTML that looks this good. What we have here is S-expressions, which is certainly nothing new. Babbage invented this back in 1882!"

"S-expressions?" queried the technical writer, who was occupied with documentation for the project written in DocBook. "Maybe that means something to those in your learned profession. But to me, this looks just like a FrameMaker MIF file. However, locating the GUI does seem to be taking me a while."

And so they argued into the night, none of them willing to give an inch, all of them presenting still more examples to prove their points, none of them bothering to look at the others' examples. Indeed, they're probably still arguing today. You can even hear their shouts from time to time on xml-dev. Their mistake, of course, was in trying to force XML into the patterns of technologies they were already familiar with rather than taking it on its own terms. XML can store data, but it is not a database. XML can serialize objects, but an XML document is not an object. Web pages can be written in XML, but XML is not HTML. Functional (and other) programming languages can be written in XML, but XML is not a programming language. Books are written in XML, but that doesn't make XML desktop publishing software.

XML is something truly new that has not been seen before in the world of computing. There have been precursors to it, and there are always fanatics who insist on seeing XML through database (or object, or functional, or S-expression) colored glasses. But XML is none of these things. It is something genuinely unique and new in the world of computing; and it can only be understood when you're willing to accept it on its own terms, rather than forcing it into yesterday's pigeon holes.

There are a lot of tools, APIs, and applications in the world that try to pretend XML is something more familiar to programmers; that it's just a funny kind of database, or just like an object, or just like remote procedure calls. These APIs are occasionally useful in very restricted and predictable environments. However, they are not suitable for processing XML in its most general format. They work well in their limited domains, but they fail when presented with XML that steps outside the artificial boundaries they've defined. XML was designed to be extensible, but it's a sad fact that many of the tools designed for XML aren't nearly as extensible as XML itself.

This book is going to show you how to handle XML in its full generality. It pulls no punches. It does not pretend that XML is anything except XML, and it shows you how to design your programs so that they handle real XML in all its messiness: valid and invalid, mixed and unmixed, typed and untyped, and both all and none of these at the same time. To that end, this book focuses on those APIs that don't try to hide the XML. In particular, there are three major Java APIs that correctly model XML, as opposed to modeling a particular class of XML documents or some narrow subset of XML. These are:

  • SAX, the Simple API for XML
  • DOM, the Document Object Model
  • JDOM, a Java native API

These APIs are the core of this book. In addition I cover a number of preliminaries and supplements to the basic APIs including:

  • XML syntax
  • DTDs, schemas, and validity
  • XPath
  • XSLT and the TrAX API
  • JAXP, a combination of SAX, DOM, and TrAX with a few factory classes
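
As a rough illustration of the last item, the sketch below reaches SAX, DOM, and TrAX through the JAXP factory classes that ship with the JDK. It is not an example from the book: the class name JaxpSketch, the method names, and the tiny order document are all assumptions made for the illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class JaxpSketch {

  // SAX: the parser pushes events to a handler.
  // This handler only counts start-tags.
  static int countElementsWithSax(String xml) throws Exception {
    final int[] count = {0};
    DefaultHandler handler = new DefaultHandler() {
      @Override
      public void startElement(String uri, String localName,
          String qName, Attributes atts) {
        count[0]++;
      }
    };
    SAXParserFactory.newInstance().newSAXParser()
        .parse(new InputSource(new StringReader(xml)), handler);
    return count[0];
  }

  // DOM: the parser builds an in-memory tree to query afterwards.
  static Document parseWithDom(String xml) throws Exception {
    return DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));
  }

  // TrAX: a Transformer created without a stylesheet performs an
  // identity transform, which serializes the tree back to text.
  static String serializeWithTrax(Document doc) throws Exception {
    Transformer identity =
        TransformerFactory.newInstance().newTransformer();
    identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    StringWriter out = new StringWriter();
    identity.transform(new DOMSource(doc), new StreamResult(out));
    return out.toString();
  }

  public static void main(String[] args) throws Exception {
    // Hypothetical sample document.
    String xml = "<order><item>peanuts</item><item>hay</item></order>";
    System.out.println(
        "SAX start-tags: " + countElementsWithSax(xml));
    Document doc = parseWithDom(xml);
    System.out.println("DOM item elements: "
        + doc.getElementsByTagName("item").getLength());
    System.out.println(serializeWithTrax(doc));
  }
}
```

The three factory classes are the only JAXP-specific pieces here; everything the code does with the parsed document goes through the plain SAX, DOM, and TrAX interfaces.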

And, since we're going to need a few examples of XML applications to demonstrate the APIs with, I also cover XML-RPC, SOAP, and RSS in some detail. However, the techniques this book teaches are hardly limited to just those three applications.

Who You Are

This book is written for experienced Java programmers who want to integrate XML into their systems. Java is the ideal language for processing XML documents. Its strong Unicode support in particular made it the preferred language for many early implementers. Consequently, more XML tools have been written in Java than in any other language. More open source XML tools are written in Java than in any other language. More programmers process XML in Java than in any other language.

Processing XML with Java™ will teach you how to:
  • Save XML documents from applications written in Java
  • Read XML documents produced by other programs
  • Search, query, and update XML documents
  • Convert legacy flat data into hierarchical XML
  • Communicate with network servers that send and receive XML data
  • Validate documents against DTDs, schemas, and business rules
  • Combine functional XSLT transforms with traditional imperative Java code

This book is meant for Java programmers who need to do anything with XML. It teaches the fundamentals and advanced topics, leaving nothing out. It is a comprehensive course in processing XML with Java that takes developers from little knowledge of XML to designing sophisticated XML applications and parsing complicated documents. The examples cover a wide range of possible uses including file formats, data exchange, document transformation, database integration, and more.

What You Need to Know

This is not an introductory book with respect to either Java or XML. I assume you have substantial prior experience with Java and preferably some experience with XML. On the Java side, I will freely use advanced features of the language and its class library without explanation or apology. Among other things, I assume you are thoroughly familiar with:

  • Object-oriented programming including inheritance and polymorphism.
  • Packages and the CLASSPATH. You should not be surprised by classes that do not have main() methods or that are not in the default package.
  • I/O including streams, readers, and writers. You should understand that System.out is a horrible example of what really goes on in Java programs.
  • The Java Collections API including hash tables, maps, sets, iterators, and lists.

In addition, in one or two places in this book I'm going to use some SQL and JDBC. However, these sections are relatively independent of the rest of the book; and chances are if you aren't already familiar with SQL, then you don't need the material in these sections anyway.

What You Need to Have

XML is deliberately architecture, platform, operating system, GUI, and language agnostic (in fact, more so than Java). It works equally well on Mac OS, Windows, Linux, OS/2, various flavors of Unix, and more. It can be processed with Python, C++, Haskell, ECMAScript, C#, Perl, Visual Basic, Ruby, and of course Java. No byte order issues need concern you if you switch between PowerPC, X86, or other architectures. Almost everything in this book should work equally well on any platform that's capable of running Java.

Most of the material in this book is relatively independent of the specific Java version. Java 1.4 bundles SAX, DOM, and a few other useful classes into the core JDK. However, these are easily installed in earlier JVMs as open source libraries from the Apache XML Project and other vendors. For the most part, I used Java 1.3 and 1.4 when testing the examples; and it's possible that a few classes and methods have been used that are not available in earlier versions. In most cases, it should be fairly obvious how to backport them. All of the basic XML APIs except TrAX should work in Java 1.1 and later. TrAX requires Java 1.2 or later.

How to Use This Book

This book is organized as an advanced tutorial that can also serve as a solid and comprehensive reference. The first chapter covers the bare minimum material needed to start working with XML, though for the most part this is intended more as a review for readers who've already read other, more basic books than as a comprehensive introduction. The second chapter introduces RSS, XML-RPC, and SOAP, the XML applications we'll be using for examples in the rest of the book. This is followed by two chapters on generating XML from your own programs (a subject which is all too often presented as a lot more complicated than it actually is). The first covers generating XML directly from code. The second covers converting legacy data in other formats to XML. The remaining bulk of the book is devoted to the major APIs for processing XML:

  • The event-based SAX API
  • The tree-based DOM API
  • The tree-based JDOM API
  • XPath APIs for searching XML documents
  • The TrAX API for XSLT processing
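
As a rough illustration of the difference between the event-based and tree-based styles named in this list, here is a minimal sketch that counts elements twice, once with a SAX callback and once by querying a DOM tree. It is not taken from the book: the class name SaxVersusDom, the sample catalog document, and the helper method names are invented for this example, and it uses the JAXP classes bundled with current JDKs along with newer Java syntax than the Java 1.3 and 1.4 releases mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxVersusDom {

    // Hypothetical sample document, invented for this sketch.
    public static final String XML =
        "<catalog><book>SAX</book><book>DOM</book><book>JDOM</book></catalog>";

    // Event-based: the parser streams through the input and calls
    // startElement() once per start tag; no tree is kept in memory.
    public static int countWithSax(String xml, String name) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return count[0];
    }

    // Tree-based: the whole document is parsed into a Document object first,
    // and the finished tree is then queried.
    public static int countWithDom(String xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getElementsByTagName(name).getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("SAX count: " + countWithSax(XML, "book"));
        System.out.println("DOM count: " + countWithDom(XML, "book"));
    }
}
```

Both methods should report the same number; the practical difference is that the SAX version never holds more than the current event, while the DOM version keeps the full tree available for later navigation and updates.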

Finally, the book finishes with an appendix providing quick references to the main APIs.

If you have limited experience with XML, I suggest you read at least the first five chapters in order. From that point forward, if you have a particular API preference, you may begin with the part covering the major API you're interested in:

  • Chapters 6-8 cover SAX
  • Chapters 9-13 cover DOM
  • Chapters 14 and 15 cover JDOM

Once you're comfortable with one or more of these APIs, you can read Chapters 16 and 17 on XPath and XSLT. However, those APIs and chapters do require some knowledge of at least one of the three major APIs.
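
To see why XPath and XSLT processing lean on one of the underlying APIs, consider the following minimal sketch, in which both the XPath query and the TrAX transformation take a DOM Document as their input. It is an assumption-laden illustration rather than code from the book: the class name XPathAndTrax, the sample document, and the stylesheet are invented here, and the javax.xml.xpath package it uses was added to the JDK in Java 5, after the Java versions discussed above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathAndTrax {

    // Hypothetical sample document, invented for this sketch.
    public static final String XML =
        "<catalog><book>SAX</book><book>DOM</book><book>JDOM</book></catalog>";

    // Hypothetical stylesheet: writes each book title followed by a semicolon.
    public static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='catalog/book'>"
        + "<xsl:value-of select='.'/><xsl:text>;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Build the DOM tree that both the XPath query and the transform consume.
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // XPath: evaluate an expression against the DOM tree and return its string value.
    public static String secondTitle(Document doc) throws Exception {
        return XPathFactory.newInstance().newXPath().evaluate("/catalog/book[2]", doc);
    }

    // TrAX: wrap the DOM tree in a DOMSource and run the stylesheet over it.
    public static String titlesAsText(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println(secondTitle(doc));
        System.out.println(titlesAsText(doc));
    }
}
```

In this sketch the Document produced by the DOM parser is the common currency: the XPath evaluator navigates it and the Transformer reads it through DOMSource, which is why some familiarity with a tree or event API comes first.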




Customer Reviews

  • Anonymous

    Posted November 22, 2002

    Attractively lucid and voluminous

    It used to be that to get a job as a Java programmer, all you typically needed was knowledge of Java itself plus some general background in computer science. But today we have a severe high tech slump, and technology has also moved on. The former has caused companies that are still hiring, and those that are picking programmers to retain, to require a broader skill set. One of these skills has been produced by the latter: XML. It really is shaping up that data serialisation is increasingly in XML format, if that data exists outside a database. So for professional reasons you should learn XML, if you are indeed any type of programmer. For example, Microsoft's .NET revolves around XML, and they don't use Java. But it turns out that the coupling between Java and XML is tight. The most advanced parsers for XML exist for Java. In C++ and C#, the parsers are essentially one step/generation behind.

    Given this, where do you turn to learn XML? An excellent choice is this book, a voluminous and eloquent exposition of the uses of XML. Harold covers the latest versions of the SAX and DOM parsers, explaining the relative merits. As a Java programmer, you should find the idea behind SAX simple. It uses a callback, similar to that in GUIs. Simpler, in fact, because you can only have a single callback. SAX's biggest drawback is that it does not build a tree of the document. DOM addresses this. Harold explains the tradeoffs, and how you can decide which to use. Plus, he describes JDOM, which is DOM-like, but written expressly for Java. You should find JDOM far more intuitive than DOM.

    There is one place where I must differ with the author. He claims that this book is for the experienced Java programmer who has already had some XML. I think he is being too conservative; he doesn't want to oversell this book to someone who will not benefit from it. I claim that if you are experienced, by which I mean you have a year or more in Java, then you have the intellectual wherewithal to gain, even if you have never seen a stitch of XML.
