XML in a Nutshell: A Desktop Quick Reference

by Elliotte Rusty Harold, W. Scott Means

Overview

If you're a developer working with XML, you know there's a lot to know about XML, and the XML space is evolving almost moment by moment. But you don't need to commit every XML syntax, API, or XSLT transformation to memory; you only need to know where to find it. And if it's a detail that has to do with XML or its companion standards, you'll find it--clear, concise, useful, and well-organized--in the updated third edition of XML in a Nutshell. With XML in a Nutshell beside your keyboard, you'll be able to:

  • Quick-reference syntax rules and usage examples for the core XML technologies, including XML, DTDs, XPath, XSLT, SAX, and DOM
  • Develop an understanding of well-formed XML, DTDs, namespaces, Unicode, and W3C XML Schema
  • Gain a working knowledge of key technologies used for narrative XML documents such as web pages, books, and articles, including XSLT, XPath, XLink, XPointer, CSS, and XSL-FO
  • Build data-intensive XML applications
  • Understand the tools and APIs necessary to build data-intensive XML applications and process XML documents, including the event-based Simple API for XML (SAX2) and the tree-oriented Document Object Model (DOM)
This powerful new edition is the comprehensive XML reference. Serious users of XML will find coverage of just about everything they need, from fundamental syntax rules, to details of DTD and XML Schema creation, to XSLT transformations, to APIs used for processing XML documents. XML in a Nutshell also covers XML 1.1, as well as updates to SAX2 and DOM Level 3. If you need an explanation of how a technology works, or just need to quickly find the precise syntax for a particular piece, XML in a Nutshell puts the information at your fingertips. Simply put, XML in a Nutshell is the critical, must-have reference for any XML developer.

Product Details

ISBN-13: 9781449379049
Publisher: O'Reilly Media, Incorporated
Publication date: 09/23/2004
Series: In a Nutshell (O'Reilly)
Sold by: Barnes & Noble
Format: eBook
Pages: 714
File size: 7 MB

About the Author

Elliotte Rusty Harold is an adjunct professor of computer science at Polytechnic University in Brooklyn, New York, where he lectures on object-oriented programming and XML. His Cafe con Leche Web site has become one of the most popular sites for information on XML. In addition, he is the author and coauthor of numerous books, the most recent of which are "The XML Bible" (John Wiley & Sons, 2001) and "XML in a Nutshell" (O'Reilly, 2002).

W. Scott Means began his career as a software developer with Microsoft in 1988, joining the company at the age of 17. He is currently serving as President and CEO of Enterprise Web Machines, a South Carolina-based Internet software product and services company.

Read an Excerpt

Chapter 9: XPath

XPath is a non-XML language used to identify particular parts of XML documents. XPath lets you write expressions that refer to the document's first person element, the seventh child element of the third person element, the ID attribute of the first person element whose contents are the string "Fred Jones," all xml-stylesheet processing instructions in the document's prolog, and so forth. XPath indicates nodes by position, relative position, type, content, and several other criteria. XSLT uses XPath expressions to match and select particular elements in the input document for copying into the output document or further processing. XPointer uses XPath expressions to identify the particular point in or part of an XML document that an XLink links to.
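As a rough illustration of selecting by position and by content, the sketch below uses Python's standard `xml.etree.ElementTree` module, which implements only a small subset of XPath. The document, the `id` values, and the variable names are hypothetical and are not taken from the book; because this subset has no attribute axis, the attribute is read with `get()` after the element has been located.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (an assumption, not the book's example).
DOC = """<people>
  <person id="p1">Ada Lovelace</person>
  <person id="p2">Fred Jones</person>
  <person id="p3"><name>Grace Hopper</name><born>1906</born></person>
</people>"""

people = ET.fromstring(DOC)

# Selection by position: the first person element.
first_person = people.find("person[1]")

# Selection by content: the first person whose text is "Fred Jones"
# (the [.='text'] predicate needs Python 3.7 or later).
fred = people.find("person[.='Fred Jones']")
fred_id = fred.get("id")

# Selection by relative position: the born child of the third person.
born = people.find("person[3]/born")

print(first_person.get("id"), fred_id, born.text)
```

A full XPath 1.0 engine could express the attribute lookup in a single expression; the two-step form here is a limitation of the library used for the sketch, not of XPath.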

XPath expressions can also represent numbers, strings, or Booleans, so XSLT stylesheets carry out simple arithmetic for numbering and cross-referencing figures, tables, and equations. String manipulation in XPath lets XSLT perform tasks like making the title of a chapter uppercase in a headline, but mixed case in a reference in the body text.

The Tree Structure of an XML Document

An XML document is a tree made up of nodes. Some nodes contain other nodes. One root node ultimately contains all other nodes. XPath is a language for picking nodes and sets of nodes out of this tree. From the perspective of XPath, there are seven kinds of nodes:

  • The root node

  • Element nodes

  • Text nodes

  • Attribute nodes

  • Comment nodes

  • Processing instruction nodes

  • Namespace nodes

Note the constructs not included in this list: CDATA sections, entity references, and document type declarations. XPath operates on an XML document after these items have merged into the document. For instance, XPath cannot identify the first CDATA section in a document or tell whether a particular attribute value was included directly in the source element start tag or merely defaulted from the declaration of the attribute in the DTD.

Consider the document in Example 9-1. This document exhibits all seven types of nodes. Figure 9-1 is a diagram of this document's tree structure....

...The XPath data model has several inobvious features. First, the tree's root node is not the same as its root element. The tree's root node contains the entire document, including the root element and comments and processing instructions that occur before the root element start tag or after the root element end tag. In Example 9-1, the root node contains the xml-stylesheet processing instruction and the root element people.
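The distinction between the root node and the root element can be seen by analogy in the DOM, where the Document node plays roughly the role of XPath's root node. The following sketch uses Python's standard `xml.dom.minidom`; the sample document is a hypothetical stand-in, not the book's Example 9-1, and the DOM is a different data model from XPath's, so this is only an approximation.

```python
from xml.dom import minidom

# Hypothetical sample document (an assumption, not the book's Example 9-1).
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="people.css"?>
<!-- a comment before the root element -->
<people><person/></people>
<!-- a comment after the root element -->"""

document = minidom.parseString(DOC)

# Children of the Document node: the processing instruction, both
# comments, and the root element. The XML declaration is not among them.
top_level = [node.nodeName for node in document.childNodes]

# The root element is only one of those children.
root_element = document.documentElement.nodeName

print(top_level)
print(root_element)
```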

The XPath data model does not include everything in the document. In particular, the XML declaration and DTD are not addressable via XPath. However, if the DTD provides default values for any attributes, then XPath recognizes those attributes. The homepage element has an xlink:type attribute supplied by the DTD. Similarly, any references to parsed entities are resolved. Entity references, character references, and CDATA sections are not individually identifiable, though any data they contain is addressable. For example, XSLT does not enable you to make all text in CDATA sections bold because XPath doesn't know what text is and isn't part of a CDATA section.
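The merging described above can be observed with an ordinary parser. This is a minimal sketch using Python's standard `xml.etree.ElementTree`, with a hypothetical `note` document and a hypothetical `kind` attribute declared in an internal DTD subset; it shows what an application sees after parsing, which is the same view XPath works from.

```python
import xml.etree.ElementTree as ET

# Hypothetical document (an assumption, not from the book). The DTD
# supplies a default value for the kind attribute, and the content mixes
# an entity reference, a CDATA section, and a character reference.
DOC = """<!DOCTYPE note [
  <!ATTLIST note kind CDATA "memo">
]>
<note>plain &amp; <![CDATA[<raw> text]]> &#65;</note>"""

note = ET.fromstring(DOC)

# All three constructs have merged into a single run of text; nothing
# records which characters came from the CDATA section.
merged_text = note.text

# The defaulted attribute looks exactly like one written in the start tag.
default_kind = note.get("kind")

print(repr(merged_text))
print(default_kind)
```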

Finally, xmlns attributes are reported as namespace nodes. They are not considered attribute nodes, though a non-namespace aware parser will see them as such. Furthermore these nodes are attached to every element and attribute node for which that declaration has scope. They are not just attached to the single element where the namespace is declared.
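A namespace-aware parser shows the first half of this behavior. The sketch below uses Python's standard `xml.etree.ElementTree`, which does not expose namespace nodes at all, so it only illustrates that an `xmlns:` declaration is not reported as an attribute and that its scope reaches descendant elements. The document and URL are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document (an assumption, not from the book).
DOC = (
    '<people xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<homepage xlink:href="http://example.com/"/>'
    "</people>"
)

people = ET.fromstring(DOC)
homepage = people.find("homepage")

# The xmlns:xlink declaration is not reported as an attribute of people.
people_attributes = dict(people.attrib)

# The declaration is in scope on the child element, so its xlink:href
# attribute is resolved to the namespace URI.
homepage_attributes = dict(homepage.attrib)

print(people_attributes)
print(homepage_attributes)
```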

Location Paths

The most useful XPath expression is a location path. A location path uses at least one location step to identify a set of nodes in a document. This set may be empty, contain a single node, or contain several nodes. These nodes can be element, attribute, namespace, text, comment, processing instruction, root nodes, or any combination of them.
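The three possible outcomes (an empty set, a single node, several nodes) can be sketched with the limited path syntax in Python's standard `xml.etree.ElementTree`. The document and element names are hypothetical, and this library's paths select element nodes only, so the other node types mentioned above are outside what the sketch can show.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (an assumption, not from the book).
DOC = """<people>
  <person id="p1"><name>Alan Turing</name></person>
  <person id="p2"><name>Richard Feynman</name></person>
</people>"""

people = ET.fromstring(DOC)

# A path that matches nothing yields an empty set.
no_nodes = people.findall("animal")

# A path with a predicate that matches exactly one element.
one_node = people.findall("person[@id='p2']")

# A two-step path that matches several elements, in document order.
several_nodes = people.findall("person/name")

print(len(no_nodes), len(one_node), len(several_nodes))
print([name.text for name in several_nodes])
```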

The Root Location Path

The simplest location path is the one that selects the document's root node. This path is simply the forward slash /. (You'll notice that a lot of XPath syntax was deliberately chosen to be similar to the syntax used by the Unix shell. Here / is the root of a Unix filesystem and / is the root node of an XML document.) For example, this XSLT template uses the XPath pattern / to match the entire input document tree and wrap it in an html element...

Table of Contents

Preface

Part I: XML Concepts

Chapter 1: Introducing XML
What XML Offers
Portable Data
How XML Works
The Evolution of XML

Chapter 2: XML Fundamentals
XML Documents and XML Files
Elements, Tags, and Character Data
Attributes
XML Names
Entity References
CDATA Sections
Comments
Processing Instructions
The XML Declaration
Checking Documents for Well-Formedness

Chapter 3: Document Type Definitions
Validation
Element Declarations
Attribute Declarations
General Entity Declarations
External Parsed General Entities
External Unparsed Entities and Notations
Parameter Entities
Conditional Inclusion
Two DTD Examples
Locating Standard DTDs

Chapter 4: Namespaces
The Need for Namespaces
Namespace Syntax
How Parsers Handle Namespaces
Namespaces and DTDs

Chapter 5: Internationalization
The Encoding Declaration
Text Declarations
XML-Defined Character Sets
Unicode
ISO Character Sets
Platform-Dependent Character Sets
Converting Between Character Sets
The Default Character Set for XML Documents
Character References
xml:lang

Part II: Narrative-Centric Documents

Chapter 6: XML as a Document Format
SGML's Legacy
Narrative Document Structures
TEI
DocBook
Document Permanence
Transformation and Presentation

Chapter 7: XML on the Web
XHTML
Direct Display of XML in Browsers
Authoring Compound Documents with Modular XHTML
Prospects for Improved Web Search Methods

Chapter 8: XSL Transformations
An Example Input Document
xsl:stylesheet and xsl:transform
Stylesheet Processors
Templates
Calculating the Value of an Element with xsl:value-of
Applying Templates with xsl:apply-templates
The Built-in Template Rules
Modes
Attribute Value Templates
XSLT and Namespaces
Other XSLT Elements

Chapter 9: XPath
The Tree Structure of an XML Document
Location Paths
Compound Location Paths
Predicates
Unabbreviated Location Paths
General XPath Expressions
XPath Functions

Chapter 10: XLinks
Simple Links
Link Behavior
Link Semantics
Extended Links
Linkbases
DTDs for XLinks

Chapter 11: XPointers
XPointers on URLs
XPointers in Links
Bare Names
Child Sequences
Points
Ranges

Chapter 12: Cascading Stylesheets (CSS)
The Three Levels of CSS
CSS Syntax
Associating Stylesheets with XML Documents
Selectors
The Display Property
Pixels, Points, Picas, and Other Units of Length
Font Properties
Text Properties
Colors

Chapter 13: XSL Formatting Objects (XSL-FO)
XSL Formatting Objects
The Structure of an XSL-FO Document
Master Pages
XSL-FO Properties
Choosing Between CSS and XSL-FO

Part III: Data-Centric Documents

Chapter 14: XML as a Data Format
Programming Applications of XML
Describing Data
Support for Programmers

Chapter 15: Programming Models
Event- Versus Object-Driven Models
Programming Language Support
Non-Standard Extensions
Transformations
Processing Instructions
Links and References
Notations
What You Get Is Not What You Saw

Chapter 16: Document Object Model (DOM)
DOM Core
DOM Strengths and Weaknesses
Parsing a Document with DOM
The Node Interface
Specific Node Types
The DOMImplementation Interface
A Simple DOM Application

Chapter 17: SAX
The ContentHandler Interface
SAX Features and Properties

Part IV: Reference

Chapter 18: XML 1.0 Reference
How to Use This Reference
Annotated Sample Documents
Key to XML Syntax
Well-Formedness
Validity
Global Syntax Structures
DTD (Document Type Definition)
Document Body
XML Document Grammar

Chapter 19: XPath Reference
The XPath Data Model
Datatype
Location Paths
Predicates
XPath Functions

Chapter 20: XSLT Reference
The XSLT Namespace
XSLT Elements
XSLT Functions

Chapter 21: DOM Reference
Object Hierarchy
Object Reference

Chapter 22: SAX Reference
The org.xml.sax Package
The org.xml.sax.helpers Package
SAX Features and Properties
The org.xml.sax.ext Package

Chapter 23: Character Sets
Character Tables
HTML4 Entity Sets
Other Unicode Blocks

Index