• Covers all the most recent XML core and related specifications including XML 1.1, J2EE 1.4, Microsoft .NET's latest iteration, as well as open source XML items from the Apache project.
• Strong coverage of XML use with databases, transactions, and XML security.
• Discusses both Microsoft (.NET) and Sun (Java) programming integration with XML, an approach not taken in any other book.
• Presents extensive business examples, including several major applications developed throughout the book.
• No previous exposure to XML is assumed.
|Pt. I||Introducing XML|
|Ch. 1||XML Concepts|
|Ch. 2||XML Documents|
|Ch. 3||XML Data Format and Validation|
|Ch. 4||XML Parsing Concepts|
|Ch. 5||Parsing XML with DOM|
|Ch. 6||Parsing XML with SAX|
|Ch. 7||XSLT Concepts|
|Ch. 8||XSL Transformations|
|Ch. 9||XSL Formatting Objects|
|Pt. II||Microsoft Office and XML|
|Ch. 10||Microsoft XML Core Services|
|Ch. 11||Working with the MSXML DOM|
|Ch. 12||Generating XML from MS Access Data|
|Ch. 13||Creating an Excel Spreadsheet from an XML Data Source|
|Pt. III||XML Web Applications Using J2EE|
|Ch. 14||XML Tools for J2EE: IBM, Apache, Sun, and Others|
|Ch. 17||XML APIs from Sun|
|Pt. IV||Relational Data and XML|
|Ch. 18||Accessing and Formatting XML from SQL Server Data|
|Ch. 19||Accessing and Formatting XML from Oracle Data|
|Ch. 20||Accessing and Formatting XML from DB2|
|Ch. 21||Building XML-Based Web Applications with JDBC|
|Ch. 22||Transforming Relational XML Output into Other Formats|
|Pt. V||Introducing Web Services|
|Ch. 23||Web Service Concepts|
|Ch. 27||Microsoft Web Services|
|Ch. 28||J2EE Web Services|
|Pt. VI||Microsoft .NET and Web Services|
|Ch. 29||Creating and Deploying .NET Web Services|
|Ch. 30||Accessing .NET Web Services|
|Ch. 31||Building a .NET Web Services Client|
|Pt. VII||Web Services and J2EE|
|Ch. 32||Web Service Tools for J2EE: IBM, Apache, Sun, and Others|
|Ch. 33||Web Services with the Sun Java Web Services Developer Pack|
|Ch. 34||Apache Axis|
|Ch. 35||Accessing Web Services from Java Applications|
|Pt. VIII||Advanced Web Services|
|Ch. 36||Accessing Relational Data via Web Services|
|Ch. 37||Authentication and Security for Web Services|
This book is targeted at programmers who need to
develop solutions using XML. Being a programmer
myself, I know that theory without practical examples and
applications can be tedious, and you probably want to get
straight to real-world examples. You're in luck, because this
book is full of working examples, but not in this chapter.
Some theory is necessary so that you have a fundamental
understanding of XML. I'll keep the theory of XML and related
technologies to a minimum as I progress through the chapters,
but we do need to cover some of the basics up front.
This chapter provides readers who are new to XML with an
overview and history of XML, its purposes, and comparisons
against previous and alternative integration technologies, and
ends with an overview of the next XML version, XML 1.1. The
rest of the chapters in this part of the book will use real-world
examples to describe XML basic formats, the structure of well-formed
XML documents, and XML validation against DTDs and
Schemas. The chapters on XSL Transformations and XSL
Formatting Objects will illustrate the transformation and formatting
of XML data using XSLT via working examples. This
part of the book will be finished with examples of parsing XML
documents, as well as specific examples of XML parsing using
Simple API for XML (SAX) and Document Object Model (DOM).
What Is XML?
XML stands for Extensible Markup Language, and it is used to
describe documents and data in a standardized, text-based format
that can be easily transported via standard Internet protocols.
XML, like HTML, is based on the granddaddy of all markup
languages, Standard Generalized Markup Language (SGML).
SGML is remarkable not just because it's the inspiration and
basis for all modern markup languages, but also because of
the fact that SGML was created in 1974 as part of an IBM
document-sharing project, and officially became an
International Organization for Standardization (ISO) standard in 1986, long before
the Internet or anything like it was operational. The ISO standard documentation
for SGML (ISO 8879:1986) can be purchased online at iso.org.
The first popular adaptation of SGML was HTML, which was developed as part of a
project to provide a common language for sharing technical documents. The advent
of the Internet facilitated the document exchange method, but not the display of
the document. The markup language that was developed to standardize the display
format of the documents was called Hypertext Markup Language, or HTML, which
provides a standardized way of describing document layout and display, and is an
integral part of every Web browser and Website.
Although SGML was a good format for document sharing, and HTML was a good
language for describing the layout of the documents in a standardized way, there
was no standardized way to describe and share data that was stored in the document.
For example, an HTML page might have a body that contains a listing of
today's closing prices of a share of every company in the Fortune 500. This data can
be displayed using HTML in a myriad of ways. Prices can be bold if they have
moved up or down by 10 percent, and prices that are up from yesterday's closing
price can be displayed in green, with prices that are down displayed in red. The
information can be formatted in a table, and alternating rows of the table can be in different colors.
However, once the data is taken from its original source and rendered as HTML in a
browser, the values of the data only have value as part of the markup language on
that page. They are no longer individual pieces of data, but are now simply pieces
of "content" wedged between elements and attributes that specify how to display
that content. For example, if a Web developer wanted to extract the top ten price
movers from the daily closing prices displayed on the Web page, there was no standardized
way to locate the top ten values and isolate them from the others, and
relate the prices to the associated Fortune 500 Company.
Note that I say that there was no standardized way to do this; this did not stop
developers from trying. Many a Web developer in the mid- to late-1990s, including
myself, devised very elaborate and clever ways of scraping the data they needed
from between HTML tags, mostly by eyeballing the page and the HTML source
code, then coding routines in various languages to read, parse, and locate the
required values in the page. For example, a developer may read the HTML source
code of the stock price page and discover that the prices were located in the only
table on the HTML page. With this knowledge, code could be developed in the
developer's choice of language to locate the table in the page, extract the values
nested in the table, calculate the top price movers for the day based on values in
the third column in the table, and relate the company name in the first column of
the table with the top ten values.
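As a rough illustration of the kind of scraping routine described above, here is a minimal Python sketch. The page content, the TableScraper class, and the column positions are all hypothetical; the point is that the code depends entirely on where values happen to sit in the markup.

```python
from html.parser import HTMLParser

# Hypothetical page: company name in column 1, closing price in column 2,
# percent change in column 3. None of this comes from a real site.
PAGE = """
<html><body>
<table>
<tr><td>Acme</td><td>10.50</td><td>+12.0</td></tr>
<tr><td>Globex</td><td>99.10</td><td>-3.5</td></tr>
<tr><td>Initech</td><td>45.00</td><td>-15.2</td></tr>
</table>
</body></html>
"""


class TableScraper(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def top_movers(html, count=10):
    """Return (company, percent_change) pairs with the largest moves first."""
    scraper = TableScraper()
    scraper.feed(html)
    # Positional assumptions: name is column 0, percent change is column 2.
    movers = [(row[0], float(row[2])) for row in scraper.rows]
    movers.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return movers[:count]


print(top_movers(PAGE, count=2))
```

Nothing in the HTML says which cell is a company name and which is a price change; that knowledge lives only in the hard-coded column indexes.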
However, it's fair to say that this approach represented a maintenance nightmare
for developers. For example, if the original Web page developers suddenly decided
to add a table before the stock price table on the page, or add an additional column
to the table, or nest one table in another, it was back to the drawing board for the
developer who was scraping the data from the HTML page, starting over to find the
values in the page, extract the values into meaningful data, and so on. Most developers
who struggled with this inefficient method of data exchange on the Web were
looking for better ways to share data while still using the Web as a data delivery mechanism.
But this is only one example of many to explain the need for a tag-based markup
language that could describe data more effectively than HTML. With the explosion
of the Web, the need for a universal format that could function as a lowest common
denominator for data exchange while still using the very popular and standardized
HTTP delivery methods of the Internet was growing.
In 1998 the World Wide Web Consortium (W3C) met this need by combining the
basic features that separate data from format in SGML with extension of the HTML
tag formats that were adapted for the Web and came up with the first Extensible
Markup Language (XML) Recommendation. The three pillars of XML are
Extensibility, Structure, and Validity.
XML does a great job of describing structured data as text, and the format is open
to extension. This means that any data that can be described as text and that can
be nested in XML tags will be generally accepted as XML. Extensions to the language
need only follow the basic XML syntax and can otherwise take XML wherever
the developer would like to go. The only limits are imposed on the data by the data
itself, via syntax rules and self-imposed format directives via data validation, which
I will get into in the next chapter.
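To make the idea of extensibility concrete, the following Python sketch parses a small document whose tag names are invented for this illustration (closing_prices, company, percent_change, and the date value are assumptions, not part of any standard). It uses the xml.etree.ElementTree module from the Python standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: these tag names are made up for this example.
QUOTES = """
<closing_prices date="2003-06-30">
  <company>
    <name>Acme</name>
    <close>10.50</close>
    <percent_change>12.0</percent_change>
  </company>
  <company>
    <name>Globex</name>
    <close>99.10</close>
    <percent_change>-3.5</percent_change>
  </company>
  <company>
    <name>Initech</name>
    <close>45.00</close>
    <percent_change>-15.2</percent_change>
  </company>
</closing_prices>
"""


def top_movers(xml_text, count=10):
    """Return (company, percent_change) pairs with the largest moves first."""
    root = ET.fromstring(xml_text)
    movers = [
        (company.findtext("name"), float(company.findtext("percent_change")))
        for company in root.iter("company")
    ]
    movers.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return movers[:count]


print(top_movers(QUOTES, count=2))
```

Because each value is located by element name rather than by position, adding a new child element to each company would not disturb this code.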
The structure of XML is usually complex and hard for human eyes to follow, but it's
important to remember that it's not designed for us to read. XML parsers and other
types of tools that are designed to work with XML easily digest XML, even in its
most complex forms. Also, XML was designed to be an open data exchange format,
not a compact one: XML representations of data are usually much larger than
their original formats. In other words, XML was not designed to solve disk space or
bandwidth issues, even though text-based XML formats do compress very well
using regular data compression and transport tools.
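The compression claim is easy to check informally. This Python sketch builds a hypothetical, highly repetitive XML document and compresses it with the standard zlib module. Real data is less repetitive than this, so the ratio here should be read as an optimistic case rather than a benchmark.

```python
import zlib

# Hypothetical record repeated many times, loosely imitating a data export.
record = "<company><name>Acme</name><close>10.50</close></company>"
document = "<closing_prices>" + record * 500 + "</closing_prices>"

raw = document.encode("utf-8")
compressed = zlib.compress(raw)

print(len(raw), "bytes as XML text")
print(len(compressed), "bytes after zlib compression")

# Compression is lossless: the original text comes back unchanged.
restored = zlib.decompress(compressed).decode("utf-8")
```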
It's also important to remember that XML data syntax, while extensible, is rigidly
enforced compared to HTML formats. I will get into the specifics of formatting rules
a little later in this chapter, and will show examples in the next chapter.
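One way to see how strictly the syntax is enforced is to hand a parser markup that a browser would tolerate in HTML. The sketch below uses Python's standard xml.etree.ElementTree parser; the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = [
    "<p>one<br/>two</p>",             # every element is closed
    "<p>one<br>two</p>",              # <br> is never closed
    "<Price>10.50</price>",           # tag names are case-sensitive
    "<a><b>text</a></b>",             # elements overlap instead of nesting
    "<price currency=USD>1</price>",  # attribute value is not quoted
]

for sample in samples:
    print(is_well_formed(sample), sample)
```

Only the first sample is accepted; an XML parser stops at the first syntax error instead of guessing what the author meant.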
Aside from the mandatory syntax requirements that make up an XML document,
data represented by XML can optionally be validated for structure and content,
based on two separate data validation standards. The original XML data validation
standard is called Document Type Definition (DTD), and the more recent evolution of
XML data validation is the XML Schema standard. I will be covering data validation
using DTDs and Schemas a little later in this chapter, and showing working examples
of data validation in the next chapter.
What Is XML Not?
With all the hype that continues to surround XML and derivative technologies such
as XSL and Web Services, it's probably as important to review what XML is not as it
is to review what XML is.
While XML facilitates data integration by providing a transport with which to send
and receive data in a common format, XML is not data integration. It's simply the
glue that holds data integration solutions together with a multi-platform "lowest
common denominator" for data transportation. XML cannot make queries against a
data source or read data into a repository by itself. Similarly, data cannot be formatted
as XML without additional tools or programming languages that specifically
generate XML data from other types of data. Also, data cannot be parsed into destination
data formats without a parser or other type of application that converts data
from XML to a compatible destination format.
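The point that XML needs surrounding tools can be sketched in a few lines of Python. Here the standard xml.etree.ElementTree module plays both roles: it generates XML from ordinary program data and parses it back. The row data, element names, and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source data, standing in for rows read from a database.
rows = [("Acme", 10.50), ("Globex", 99.10)]


def rows_to_xml(rows):
    """Generate an XML string from (name, close) tuples."""
    root = ET.Element("closing_prices")
    for name, close in rows:
        company = ET.SubElement(root, "company")
        ET.SubElement(company, "name").text = name
        ET.SubElement(company, "close").text = f"{close:.2f}"
    return ET.tostring(root, encoding="unicode")


def xml_to_rows(xml_text):
    """Parse the XML string back into (name, close) tuples."""
    root = ET.fromstring(xml_text)
    return [
        (company.findtext("name"), float(company.findtext("close")))
        for company in root.findall("company")
    ]


xml_text = rows_to_xml(rows)
print(xml_text)
print(xml_to_rows(xml_text))
```

The XML string in the middle is only the transport format; reading the source and loading the destination are both done by code outside of XML itself.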
It's also important to point out that XML is not HTML. XML may look like HTML,
based on the similarities of the tags and the general format of the data, but that's
where the similarity ends. While HTML is designed to describe display characteristics
of data on a Web page to browsers, XML is designed to represent data structures.
XML data can be transformed into HTML using Extensible Stylesheet Language
Transformations (XSLT). XML can also be parsed and formatted as HTML in an
application. XML can also be part of an HTML page using XML data islands. I'll discuss
XSLT transformations, XML parsing, and data islands in much more detail later
in the book.
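As a small sketch of the second route, parsing XML and formatting it as HTML in an application, the following Python code turns a hypothetical price document into an HTML table. It does not use XSLT (the Python standard library has no XSLT processor), and the element names and the function name xml_to_html_table are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical input document; the element names are invented.
PRICES = """
<closing_prices>
  <company><name>Acme &amp; Sons</name><close>10.50</close></company>
  <company><name>Globex</name><close>99.10</close></company>
</closing_prices>
"""


def xml_to_html_table(xml_text):
    """Render each <company> element as one HTML table row."""
    root = ET.fromstring(xml_text)
    lines = ["<table>"]
    for company in root.findall("company"):
        # Escape the text so characters such as & stay valid in HTML.
        name = escape(company.findtext("name", default=""))
        close = escape(company.findtext("close", default=""))
        lines.append(f"<tr><td>{name}</td><td>{close}</td></tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(xml_to_html_table(PRICES))
```

The XML carries only the data; every display decision, such as using a table at all, is made by the code that produces the HTML.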
XML Standards and the World Wide Web
The World Wide Web Consortium (W3C) is where developers will find most of the
specifications for standards that are used in the XML world. W3C specifications are
referred to as "Recommendations" because the final stage in the W3C development
process may not necessarily produce a specification, depending on the nature of
the W3C Working Group that is producing the final product, but for all intents and
purposes, most of the final products are specifications.
W3C specifications on the Recommendation track progress through five stages:
Working Draft, Last Call Working Draft, Candidate Recommendation, Proposed
Recommendation, and Recommendation, which is the final stop for a specific version
of a specification such as XML.
W3C Working Groups produce Recommendations, and anyone can join the W3C
and a Working Group. More information on joining the W3C can be found at
w3.org/Consortium/Prospectus/Joining. Currently, W3C
Working Groups are working hard at producing the latest recommendations for
XML and related technologies such as XHTML, XLink, XML Base, XML Encryption,
XML Key Management, XML Query, XML Schema, XML Signature, XPath, XPointer,
XSL, and XSLT.
XML Elements and Attributes
Because XML is designed to describe data and documents, the W3C XML
Recommendation, which can be found buried in the links at w3.org/XML,
is very strict about a small core of format requirements that make the difference
between a text document containing a bunch of tags and an actual XML document.
XML documents that meet W3C XML document formatting recommendations
are described as being well-formed XML documents. Well-formed XML documents
can contain elements, attributes, and text.
Elements look like this and always have an opening and closing tag:
There are a few basic rules for XML document elements. Element names can contain
letters, numbers, hyphens, underscores, periods, and colons when namespaces
are used (more on namespaces later). Element names cannot contain spaces;
underscores are usually used to replace spaces. Element names can start with a letter,
underscore, or colon, but cannot start with other non-alphabetic characters or
a number, or the letters xml.
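Some of these naming rules can be checked by handing candidate elements to a parser. The sketch below uses Python's standard xml.etree.ElementTree; the element names and the helper name parses are invented for illustration. It does not test names beginning with xml, because a parser may accept those even though they are reserved.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# An element is an opening tag, optional content, and a matching closing tag.
candidates = [
    "<closing_price>10.50</closing_price>",          # underscore replaces a space
    "<closing-price.usd>10.50</closing-price.usd>",  # hyphen and period allowed
    "<_price>10.50</_price>",                        # may start with an underscore
    "<closing price>10.50</closing price>",          # space inside the name
    "<1price>10.50</1price>",                        # starts with a number
]

for candidate in candidates:
    print(parses(candidate), candidate)
```

The first three candidates are accepted and the last two are rejected, in line with the rules above.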
Aside from the basic rules, it's important to think about using hyphens or periods
in element names. They may be considered part of well-formed XML documents,
but other systems that will use the data in the element name such as relational
database systems often have trouble working with hyphens or periods in data identifiers,
often mistaking them for something other than part of the name.
Attributes contain values that are associated with an element and are always part
of an element's opening tag:
The basic rules and guidelines for elements apply to attributes as well, with a few
additions. The attribute name must follow an element name, then an equals sign (=),
then the attribute value, in single or double quotes. The attribute value can contain
quotes, and if it does, one type of quote must be used in the value, and another
around the value.
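The quoting rule for attributes can be illustrated with a short Python sketch using the standard xml.etree.ElementTree module. The element, the attribute names (symbol, note), and their values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Double quotes appear inside the note value, so single quotes surround it.
doc = """<company symbol="ACME" note='Known as "The Summit"'>Acme</company>"""
element = ET.fromstring(doc)

print(element.attrib["symbol"])
print(element.attrib["note"])

# Using the same quote character inside and around the value breaks the tag.
try:
    ET.fromstring('<company note="say "hi"">Acme</company>')
    same_quote_ok = True
except ET.ParseError:
    same_quote_ok = False

print(same_quote_ok)
```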
Text is located between the opening and closing tags of an element, and usually represents
the actual data associated with the elements and attributes that surround it.
Text is not constrained by the same syntax rules of elements and attributes, so virtually
any text can be stored between XML document elements. Note that while the
value is limited to text, the format of the text can be specified as another type of
data by the elements and attributes in the XML document.
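As one possible reading of this, an attribute can tell the receiving application how to interpret the text. The Python sketch below assumes a made-up convention in which a type attribute names the intended data type; neither the attribute nor the CONVERTERS table comes from the XML specification. Note also that the characters & and < still have to be escaped in element text, as &amp; and &lt;.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical convention: a "type" attribute names the intended data type.
CONVERTERS = {"decimal": Decimal, "integer": int, "string": str}


def typed_value(element):
    """Convert the element text using its type attribute (default: string)."""
    convert = CONVERTERS[element.get("type", "string")]
    return convert(element.text)


close = ET.fromstring('<close type="decimal">10.50</close>')
volume = ET.fromstring('<volume type="integer">120000</volume>')
name = ET.fromstring("<name>Acme &amp; Sons</name>")

print(repr(typed_value(close)))
print(repr(typed_value(volume)))
print(repr(typed_value(name)))
```

In every case the parser delivers plain text; the conversion to a number is a decision made by the application, guided by the markup.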
Last but not least, elements with no attributes or text can also be represented in an
XML document like this:
This format is usually added to XML documents to accommodate a predefined data
structure. I'll be covering ways to specify an XML data structure a little later in this chapter.
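A minimal Python sketch of an empty element, using the standard xml.etree.ElementTree module, follows. The element name dividend is hypothetical. The short form with a trailing slash and the long form with separate opening and closing tags are treated the same way by the parser.

```python
import xml.etree.ElementTree as ET

# Two spellings of the same empty element.
short_form = ET.fromstring("<dividend/>")
long_form = ET.fromstring("<dividend></dividend>")

for element in (short_form, long_form):
    # No text, no attributes, and no child elements in either case.
    print(element.tag, element.text, element.attrib, len(element))

# Both forms serialize to the same output.
same_output = ET.tostring(short_form) == ET.tostring(long_form)
print(same_output)
```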
XML Document Structure
Although elements, attributes, and text are very important for XML documents,
these design objects alone do not make up a well-formed XML document without
being arranged under certain structural and syntax rules. Let's examine the structure
of the very simple well-formed XML 1.0 document in Listing 1-1.
Most XML documents start with an <?xml ...?> element at the top of the page. This is
called an XML document declaration. An XML document declaration is an optional
element that is useful to determine the version of XML and the encoding type of the
source data. It is not a required element for an XML document to be well formed in
the W3C XML 1.0 specification. This is the most common XML document declaration:
There are two attributes contained in this XML declaration that are commonly seen
but not often explained.
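To show what the version and encoding attributes do in practice, here is a small Python sketch using the standard xml.etree.ElementTree module. The document content and the choice of ISO-8859-1 are assumptions made for this illustration. The declaration tells the parser how to decode the bytes that follow; without it, the parser falls back to its default encoding.

```python
import xml.etree.ElementTree as ET

# A declaration with the two commonly seen attributes: version and encoding.
document = '<?xml version="1.0" encoding="ISO-8859-1"?><name>Caf\u00e9 Acme</name>'

# Store the document as ISO-8859-1 bytes, matching what the declaration says.
data = document.encode("iso-8859-1")
root = ET.fromstring(data)
print(root.text)

# The same bytes without a declaration are read as UTF-8 and rejected.
undeclared = "<name>Caf\u00e9 Acme</name>".encode("iso-8859-1")
try:
    ET.fromstring(undeclared)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

print(undeclared_ok)
```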
Excerpted from XML Programming Bible
by Brian Benz and John Durant
Copyright © 2003 by Brian Benz, John Durant.
Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.