XML Schemas

by Chelsea Valentine, Lucinda Dykes, Ed Tittel
     
 

Overview

Whether it is used for web development, creating documentation, or exchanging data between business partners, XML continues to grow in importance as a highly flexible document-design and data-modeling tool. Despite the limitations of using SGML Document Type Definitions (DTDs) to define document structures, XML has made inroads wherever data must flow among disparate platforms. The Schema specification has achieved W3C recommendation status, providing an alternative to DTDs that enables you to precisely structure XML data. But using the Schema Language does more than provide a more powerful way of defining data; it's also a better way because it uses XML's structure, syntax, and namespaces, instead of those derived from the complex SGML.

XML Schemas introduces you to this elegant new technology, which brings the power of data modeling and data structuring to XML. A truly practical book has to give you more than just the details on syntax and semantics, examples of constructs and datatypes, and instruction in standard procedures. You get all that, but you'll also find lots of expert tips and techniques for document modeling, all reinforced with practical, real-world examples.

Even as you're discovering the advantages of XML Schema, you'll learn about the continuing use of DTDs. In some situations, such as when designing document-oriented XML, DTDs might still be the way to go. You'll learn about visual XML Schema tools, but you'll also see how setting out armed with just a text editor gives you insights you might not acquire otherwise. It won't be long before you're developing your own XML Schema documents, using the power of XML to structure data for seamless, cross-platform exchange.

Product Details

ISBN-13: 9780782140453
Publisher: Sybex, Incorporated
Publication date: 01/28/2002
Pages: 656
Product dimensions: 7.57(w) x 9.02(h) x 1.40(d)

Read an Excerpt

Chapter 2: Of DTDs and Schemas

  • Understanding basic schema structure
  • Examining the basics behind DTDs
  • Comparing schemas and DTDs
  • Looking toward the future

Recently, there's been a lot to say about schemas and Document Type Definitions (DTDs). As XML has moved from its document-centric roots to a more data-centric environment, the tools necessary for validation have evolved as well. DTDs, a long-standing part of the Standard Generalized Markup Language (SGML), were (and still are) quite good at defining document structure. However, because XML is being used within e-commerce applications and for business-to-business (B2B) transactions, developers quickly learned that DTDs were not equipped to handle data-centric information.

This is not to say that DTDs have outlived their usefulness; however, schemas are giving them a run for their money. As developers, we now have options for validation. Two options come from the W3C: XML Schema and XML DTDs. However, you're not limited to only those two options. In addition to XML Schema and XML DTDs, there are several other schema languages that are circulating throughout the XML community. Two worth noting are REgular LAnguage description for XML Next Generation (RELAX NG) and Schematron, which are both lightweight schema languages that offer functionality similar to that of XML Schema. Find out more about RELAX NG and Schematron in Chapter 12.

In this chapter, we focus on basic underlying concepts of XML Schema and XML DTDs. It's important to understand the strengths and weaknesses of both approaches before you choose the appropriate validation tool for your application.

Understanding DTD Structures and Functions

DTDs have provided structure to XML documents for a long time. Whereas flexibility is one of XML's primary strengths, there are instances where structure is important, even a requirement. Defining a document model provides a structure to which documents must conform. E-commerce and B2B transactions are two common scenarios that require strict document models.

There are several reasons you might want to use a validation mechanism:

  • If multiple developers will be working with the document model, the DTD would provide a framework from which they can work.

  • If your document model contains required elements (such as a price for your product), DTDs allow you to define element and attribute behavior.

  • If you're developing a document model that will continue to evolve, a DTD could help guide that process.

DTD validation was the first solution for defining document models and is over 20 years old. DTDs have several key functions:

  • Provide a structural framework for documents

  • Define a content model for elements

  • Declare a list of allowable attributes for each element

  • Allow for limited datatyping for attribute values

  • Provide default values for attributes

  • Define a mechanism for creating reusable chunks of data,with some limitations

  • Provide a mechanism for passing non-XML data to the appropriate processor

  • Allow you to use conditional sections to mark declarations for inclusion or exclusion

In this section, we define notable DTD components.

TIP As you probably know, DTD validation is no longer the only option for defining document models. XML Schema offers a flexible solution to the preceding scenarios.

Declarations

DTDs consist of declarations that provide rules for your document model. Each declaration defines an element, set of attributes, entity, or notation. These four declaration types make up the bulk of any DTD:

element declarations: Identify the names of elements and the nature of their content. DTDs do not allow for complex content model definitions. Rather, DTDs allow authors to provide information about element hierarchy. The only datatype you can define for element content is parsed character data (PCDATA).

attribute declarations: Identify which elements may have attributes, what attributes they may have, what values the attributes may hold, and what the default value is.

entity declarations: Allow you to associate a name with some other fragment of content. That construct can be a chunk of regular text, a chunk of the document type declaration, or a reference to an external file containing either text or binary data.

notation declarations: Identify specific types of external binary data. This information is passed to the processing application.
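
As a rough illustration of the four declaration types, the fragment below extends a simple book model with one declaration of each kind. The isbn, format, and cover attributes, the publisher and cover-image entities, the gif notation, and the file name cover.gif are hypothetical names chosen for this sketch, not taken from the chapter.

```xml
<!-- Element declarations: names and content models -->
<!ELEMENT book (title, author)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>

<!-- Attribute declarations: allowed attributes, their types, and defaults -->
<!ATTLIST book
  isbn   CDATA          #REQUIRED
  format (print|ebook)  "print"
  cover  ENTITY         #IMPLIED>

<!-- Entity declaration: a name bound to a chunk of replacement text -->
<!ENTITY publisher "Sybex, Incorporated">

<!-- Notation declaration: identifies a type of external binary data -->
<!NOTATION gif SYSTEM "image/gif">

<!-- Unparsed entity: an external binary file tied to the gif notation -->
<!ENTITY cover-image SYSTEM "cover.gif" NDATA gif>
```

Under these assumptions, a book element must carry an isbn attribute, gets format="print" when the attribute is omitted, and may name the cover-image entity in its cover attribute. The reference &publisher; expands to the declared text wherever it appears in document content.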

When defining DTD declarations, you have to follow a few rules governing the order of their occurrence. If multiple declarations exist for the same entity or attribute, the first one defined takes precedence (the other redundant declarations are then ignored). Element types and notations, by contrast, may be declared only once in a valid document. You also have to be careful when defining entities. Parameter entities (entities defined and used within the DTD) must be declared before they can be referenced.
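
The ordering rules can be sketched with a short fragment. It is written as it might appear in an external DTD subset, because parameter-entity references inside declarations are not permitted in the internal subset. The entity names text-model and publisher are hypothetical and used only for illustration.

```xml
<!-- A parameter entity must be declared before it is referenced -->
<!ENTITY % text-model "(#PCDATA)">
<!ELEMENT title  %text-model;>
<!ELEMENT author %text-model;>

<!-- Two declarations of the same entity: the first one is binding -->
<!ENTITY publisher "Sybex">
<!ENTITY publisher "Some other value (ignored by the parser)">
```

With these declarations, a reference to &publisher; in a document expands to Sybex. Moving the text-model declaration below the two element declarations would make the references to it an error.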

The syntax used to create declarations allows for white space between the parts of a declaration, but there are a few delimiters that have to be written accurately (for example, no white space may separate the opening <! from ELEMENT). The following declarations are all correct:

<!ELEMENT book (title,author)>
<!ELEMENT book (title, author)>
<!ELEMENT book ( title,
                 author )>

Declarations can reside inside the XML document or can be defined as a stand-alone document. If defined as a part of an XML document, the collection of declarations is referred to as the internal subset. If the declarations are defined externally in a separate file, that file is referred to as an external subset. Many times, you'll find that you need to use both internal and external subsets. The collection of all subsets is known as the DTD. Listing 2.1 provides an example of a small collection of DTD declarations defined as a part of the internal DTD subset.

Listing 2.1 An XML Document Containing an Internal DTD Subset

<?xml version="1.0"?>
<!DOCTYPE publications [
  <!ELEMENT publications (book+)>
  <!ELEMENT book (title,author)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
]>
<publications>
  <book>
    <title>Mastering XHTML</title>
    <author>Ed Tittel</author>
  </book>
  <book>
    <title>Java Developer's Guide to E-Commerce with XML and JSP</title>
    <author>William Brogden</author>
  </book>
</publications>

This example defines only element type declarations. In most cases, your document model would be more complex, also allowing for attributes, notations, and entities. For each element, there's a corresponding content model defined. For example, the book element is allowed to contain only a title element followed by an author element.
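
To make the content model concrete, here is a sketch of how a validating parser would treat two book elements under the declaration <!ELEMENT book (title,author)>. The second book is invented for illustration, and its values are placeholders.

```xml
<!-- Valid: exactly one title followed by exactly one author -->
<book>
  <title>Mastering XHTML</title>
  <author>Ed Tittel</author>
</book>

<!-- Invalid against (title,author): the children are in the wrong order -->
<book>
  <author>Placeholder Author</author>
  <title>Placeholder Title</title>
</book>
```

Both fragments are well-formed, so a non-validating parser accepts either one. Only a validating parser that reads the DTD reports the second as an error.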

Internal Subset
Internal subsets are handy if you plan to import declarations from external DTD subsets. This is because you can override externally defined declarations by defining a new declaration in the internal subset. The declaration found first (the XML parser reads the internal subset before the external) takes precedence.
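
A minimal sketch of this override behavior follows. It assumes a hypothetical external subset named publications.dtd that declares the document's elements along with an entity called lead-author; neither name comes from the chapter.

```xml
<?xml version="1.0"?>
<!-- publications.dtd is assumed to contain, among its declarations:
       <!ENTITY lead-author "Unknown">  -->
<!DOCTYPE publications SYSTEM "publications.dtd" [
  <!-- Read before the external subset, so this declaration is the binding one -->
  <!ENTITY lead-author "Ed Tittel">
]>
<publications>
  <book>
    <title>Mastering XHTML</title>
    <author>&lead-author;</author>
  </book>
</publications>
```

Because the internal subset is processed first, the author element here contains Ed Tittel, and the later declaration of lead-author in the external file is ignored.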

There are a couple of restrictions placed on internal subsets:

  • You cannot use conditional sections to mark the inclusion or exclusion of DTD declarations. Conditional sections make it easier to combine DTD subsets, therefore allowing you to modularize your DTD.

  • Your parameter entity usage is limited. According to the XML 1.0 Specification, you cannot reference a parameter entity inside another markup declaration in the internal subset; parameter-entity references may appear there only between declarations.
Listing 2.2 provides an example of an internal subset....
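
Conditional sections, the subject of the first restriction above, are available only in external subsets. As a hedged sketch of how they are typically used there, the fragment below switches between two content models with a pair of parameter entities. The entity names draft and final and the notes element are hypothetical.

```xml
<!-- External subset only: conditional sections are not allowed in the internal subset -->
<!ENTITY % draft "INCLUDE">
<!ENTITY % final "IGNORE">

<![%draft;[
  <!-- Active while %draft; is INCLUDE: books may carry editorial notes -->
  <!ELEMENT book (title, author, notes*)>
  <!ELEMENT notes (#PCDATA)>
]]>

<![%final;[
  <!-- Skipped while %final; is IGNORE -->
  <!ELEMENT book (title, author)>
]]>
```

A document can flip the switches from its internal subset by declaring <!ENTITY % draft "IGNORE"> and <!ENTITY % final "INCLUDE">. Those declarations are read first, so they take precedence over the ones in the external file.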

External Subset
An internal subset allows you to override externally defined declarations or to add new elements or attributes to a specific document. Other than that, you'll likely be working with external DTD subsets. More often than not, a common vocabulary will be defined using an external DTD subset. For example, the Extensible Hypertext Markup Language (XHTML), Scalable Vector Graphics (SVG), and the Synchronized Multimedia Integration Language (SMIL) all have externally defined DTDs. Every document that adheres to the defined model just has to include a DOCTYPE declaration that points to the external DTD subset. Separating the grammar rules from the data keeps your file sizes manageable. Listing 2.3 highlights an XML document that references an external DTD subset....
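
As a sketch of what such a reference can look like, the document below points to an external subset through a SYSTEM identifier. The file name publications.dtd is an assumption; the file is imagined to hold the publications, book, title, and author element declarations.

```xml
<?xml version="1.0"?>
<!-- The DOCTYPE names the root element and the location of the external subset -->
<!DOCTYPE publications SYSTEM "publications.dtd">
<publications>
  <book>
    <title>Mastering XHTML</title>
    <author>Ed Tittel</author>
  </book>
</publications>
```

Published vocabularies usually supply a public identifier as well. An XHTML 1.0 Strict document, for example, begins with <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">.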

Meet the Author

Chelsea Valentine is a webmaster, writer, and trainer. She develops websites and teaches others how to do the same.

Lucinda Dykes is the principal at Zero G Web Design in Santa Fe, New Mexico. She has been developing websites and writing code since 1994 and teaches web-related classes at Santa Fe Community College.

Ed Tittel is a 20-year computing industry veteran who has written on XML, XHTML, and HTML in over 30 books. He also contributes XML articles regularly to www.searchmiddleware.com and www.informIT.com.

All three are coauthors of Mastering XHTML, Premium Edition, also from Sybex.
