Definitive XML Schema by Priscilla Walmsley
“XML Schema 1.1 has gone from strong data typing to positively stalwart—so powerful it can enforce database level constraints and business rules, so your data transfer code won’t have to. This book covers the 1.1 changes—and more—in its 500 revisions to Priscilla Walmsley’s 10-year best-selling classic. It’s the guide you need to navigate XML Schema’s complexity—and master its power!”
—Charles F. Goldfarb
For Ten Years the World’s Favorite Guide to XML Schema—Now Extensively Revised for Version 1.1 and Today’s Best Practices!
To leverage XML’s full power, organizations need shared vocabularies based on XML Schema. For a full decade, Definitive XML Schema has been the most practical, accessible, and usable guide to working with XML Schema. Now, author Priscilla Walmsley has thoroughly updated her classic to fully reflect XML Schema 1.1, and to present new best practices for designing successful schemas.
Priscilla helped create XML Schema as a member of the W3C XML Schema Working Group, so she is well qualified to explain the W3C recommendation with insight and clarity. Her book teaches practical techniques for writing schemas to support any application, including many new use cases. You’ll discover how XML Schema 1.1 provides a rigorous, complete specification for modeling XML document structure, content, and datatypes; and walk through the many aspects of designing and applying schemas, including composition, instance validation, documentation, and namespaces. Then, building on the fundamentals, Priscilla introduces powerful advanced techniques ranging from type derivation to identity constraints. This edition’s extensive new coverage includes
- Many new design hints, tips, and tricks – plus a full chapter on creating an enterprise strategy for schema development and maintenance
- Design considerations in creating schemas for relational and object-oriented models, narrative content, and Web services
- An all-new chapter on assertions
- Coverage of new 1.1 features, including overrides, conditional type assignment, open content and more
- Modernized rules for naming and design
- Substantially updated coverage of extensibility, reuse, and versioning
- And much more
If you’re an XML developer, architect, or content specialist, with this Second Edition you can join the tens of thousands who rely on Definitive XML Schema for practical insights, deeper understanding, and solutions that work.
Read an Excerpt
Chapter 9: Simple types
Both element and attribute declarations can use simple types to describe the data content of the components. This chapter introduces simple types, and explains how to define your own atomic simple types for use in your schemas.
9.1 Simple type varieties
There are three varieties of simple type: atomic types, list types, and union types.
- Atomic types have values that are indivisible, such as 10 and
- List types have values that are whitespace-separated lists of
atomic values, such as <availableSizes>10 large
- Union types may have values that are either atomic values or list values. What differentiates them is that the set of valid values, or "value space," for the type is the union of the value spaces of two or more other simple types. For example, to represent a dress size, you may define a union type that allows a value to be either an integer from 2 through 18, or one of the string values small, medium, or large.
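The three varieties can be sketched in XSDL as follows. This is a minimal illustration rather than an example from the book: the type names SMLSizeType, SizeType, and AvailableSizesType are assumptions introduced here, and the schema has no target namespace so that the type references can stay unprefixed.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Atomic type: an integer from 2 through 18 -->
  <xsd:simpleType name="DressSizeType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="2"/>
      <xsd:maxInclusive value="18"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Atomic type: one of three string values (hypothetical name) -->
  <xsd:simpleType name="SMLSizeType">
    <xsd:restriction base="xsd:token">
      <xsd:enumeration value="small"/>
      <xsd:enumeration value="medium"/>
      <xsd:enumeration value="large"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Union type: value space is the union of the two types above -->
  <xsd:simpleType name="SizeType">
    <xsd:union memberTypes="DressSizeType SMLSizeType"/>
  </xsd:simpleType>

  <!-- List type: whitespace-separated list of SizeType values -->
  <xsd:simpleType name="AvailableSizesType">
    <xsd:list itemType="SizeType"/>
  </xsd:simpleType>

  <xsd:element name="availableSizes" type="AvailableSizesType"/>
</xsd:schema>
```

Under these definitions, an availableSizes element may hold any whitespace-separated mix of integers from 2 through 18 and the three string values.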
9.1.1 Design hint: How much should I break down my data values?
Data values should be broken down to the most atomic level possible.
This allows them to be processed in a variety of ways for different uses,
such as display, mathematical operations, and validation. It is much
easier to concatenate two data values back together than it is to split
them apart. In addition, more granular data is much easier to validate.
It is a fairly common practice to put a data value and its units in
the same element, for example <length>3cm</length>. However,
the preferred approach is to have a separate data value,
preferably an attribute, for the units, for example <length
Using a single concatenated value is limiting because:
- It is extremely cumbersome to validate. You have to apply a
complicated pattern that would need to change every time a
unit type is added.
- You cannot perform comparisons, conversions, or mathematical
operations on the data without splitting it apart.
- If you want to display the data item differently (for example, as "3 centimeters" or "3 cm" or just "3"), you have to split it apart. This complicates the stylesheets and applications that process the instance document.
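The validation point can be made concrete with a sketch that contrasts the two approaches. Everything named here is an assumption for illustration: the attribute name units, the type names, and the three unit values are not taken from the book.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Concatenated approach: the pattern must list every unit,
       so it changes whenever a unit type is added -->
  <xsd:simpleType name="LengthWithUnitsType">
    <xsd:restriction base="xsd:token">
      <xsd:pattern value="[0-9]+(\.[0-9]+)?(cm|mm|in)"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Separate approach: the units get their own enumerated type -->
  <xsd:simpleType name="LengthUnitsType">
    <xsd:restriction base="xsd:token">
      <xsd:enumeration value="cm"/>
      <xsd:enumeration value="mm"/>
      <xsd:enumeration value="in"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- The number is validated as a decimal; the units ride along
       in a required attribute -->
  <xsd:complexType name="LengthType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:decimal">
        <xsd:attribute name="units" type="LengthUnitsType" use="required"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>

  <xsd:element name="length" type="LengthType"/>
</xsd:schema>
```

With the second approach the numeric content is a real xsd:decimal, so it can be compared or used in arithmetic without string handling, and adding a unit means adding one enumeration value.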
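It is possible to break data down too far, though. For example, a date could be split into its parts as follows:
  <orderDate>
    <year>2001</year>
    <month>06</month>
    <day>15</day>
  </orderDate>
This is probably overkill unless you have a special need to process these items separately.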
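When the parts of a date do not need separate processing, one alternative is a single value of the built-in type xsd:date. A minimal sketch, assuming an orderDate element in a schema with no target namespace:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- One value in year-month-day form,
       for example <orderDate>2001-06-15</orderDate> -->
  <xsd:element name="orderDate" type="xsd:date"/>
</xsd:schema>
```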
9.2 Simple type definitions
9.2.1 Named simple types
Simple types can be either named or anonymous. Named simple types are always defined globally (i.e., their parent is always schema or redefine) and are required to have a name that is unique among the data types (both simple and complex) in the schema. The XSDL syntax for a named simple type definition is shown in Table 9–1.
The name of a simple type must be an XML non-colonized name, which means that it must start with a letter or underscore, and may only contain letters, digits, underscores, hyphens, and periods. You cannot include a namespace prefix when defining the type; it takes its namespace from the target namespace of the schema document. All of the examples of named types in this book have the word "Type" at the end of their names, to clearly distinguish them from element-type names and attribute names. However, this is not a requirement; you may in fact have a data type definition and an element declaration using the same name.
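The naming rules above can be illustrated with a short sketch. The namespace URI http://example.org/sizes is a placeholder assumed for this illustration, and the type is deliberately given the same name as the element to show that the two do not clash.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://example.org/sizes"
            targetNamespace="http://example.org/sizes">

  <!-- The name carries no prefix; the type takes its namespace
       from targetNamespace -->
  <xsd:simpleType name="size">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="2"/>
      <xsd:maxInclusive value="18"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- An element declaration may use the same name as a type;
       type="size" resolves through the default namespace -->
  <xsd:element name="size" type="size"/>
</xsd:schema>
```

Sharing a name is legal but can confuse readers, which is one reason for the "Type" suffix convention.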
Example 9–1 shows the definition of a named simple type DressSizeType, along with an element declaration that references it. Named types can be used in multiple element and attribute declarations.
Example 9–1. Defining and referencing a named simple type
  <xsd:simpleType name="DressSizeType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="2"/>
      <xsd:maxInclusive value="18"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="size" type="DressSizeType"/>
9.2.2 Anonymous simple types
Anonymous types, on the other hand, must not have names. They are always defined entirely within an element or attribute declaration, and may only be used once, by that declaration. Defining a type anonymously prevents it from ever being restricted, used in a list or union, or redefined. The XSDL syntax to define an anonymous simple type is shown in Table 9–2.
Example 9–2 shows the definition of an anonymous simple type within an element declaration.
Example 9–2. Defining an anonymous simple type
  <xsd:element name="size">
    <xsd:simpleType>
      <xsd:restriction base="xsd:integer">
        <xsd:minInclusive value="2"/>
        <xsd:maxInclusive value="18"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>
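An anonymous simple type can be nested in an attribute declaration in the same way. The following is a sketch under the assumption of a global attribute named size, which is not an example from the book:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- The simpleType has no name attribute and can be used
       only by this attribute declaration -->
  <xsd:attribute name="size">
    <xsd:simpleType>
      <xsd:restriction base="xsd:integer">
        <xsd:minInclusive value="2"/>
        <xsd:maxInclusive value="18"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:attribute>
</xsd:schema>
```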
9.2.3 Design hint: Should I use named or anonymous types?
The advantage of named types is that they may be defined once and
used many times. For example, you may define a type named ProductCodeType
that lists all of the valid product codes in your organization.
This type can then be used in many element and attribute declarations in many schemas. This has the advantages of:
- encouraging consistency throughout the organization,
- reducing the possibility of error,
- requiring less time to define new schemas,
- simplifying maintenance, because new product codes need only be added in one place.
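A sketch of this kind of reuse is shown below. The product code values and the names of the declarations that use the type are placeholders assumed for illustration; only the type name ProductCodeType comes from the text.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Defined once; a new product code is added only here -->
  <xsd:simpleType name="ProductCodeType">
    <xsd:restriction base="xsd:token">
      <xsd:enumeration value="A100"/>
      <xsd:enumeration value="B200"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Used by several declarations (hypothetical names) -->
  <xsd:element name="productCode" type="ProductCodeType"/>
  <xsd:element name="replacedBy" type="ProductCodeType"/>
  <xsd:attribute name="code" type="ProductCodeType"/>
</xsd:schema>
```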
An anonymous type, on the other hand, can be used only in the element or attribute declaration that contains it. It can never be redefined, have types derived from it, or be used in a list or union type. This can seriously limit its reusability, extensibility, and ability to change over time.
However, there are cases where anonymous types are preferable to named types. If the type is unlikely to ever be reused, the advantages listed above no longer apply. Also, there is such a thing as too much reuse. For example, if an element can contain the values 1 through 10, it does not make sense to try to define a data type named OneToTenType that is reused by other unrelated element declarations with the same value space. If the value space for one of the element declarations that uses the named data type changes, but the other element declarations do not change, it actually makes maintenance more difficult, because a new data type needs to be defined at that time.
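As a sketch of the alternative, two unrelated declarations that happen to share the range 1 through 10 can each carry their own anonymous type. The element names rating and quantity are assumptions introduced for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- A rating from 1 through 10 -->
  <xsd:element name="rating">
    <xsd:simpleType>
      <xsd:restriction base="xsd:integer">
        <xsd:minInclusive value="1"/>
        <xsd:maxInclusive value="10"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>

  <!-- A quantity that currently has the same range; its limit can
       change later without touching the rating declaration -->
  <xsd:element name="quantity">
    <xsd:simpleType>
      <xsd:restriction base="xsd:integer">
        <xsd:minInclusive value="1"/>
        <xsd:maxInclusive value="10"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>
</xsd:schema>
```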
In addition, anonymous types can be more readable when they are relatively simple. It is sometimes desirable to have the definition of the data type right there with the element or attribute declaration....
Meet the Author
PRISCILLA WALMSLEY serves as Managing Director of Datypic, a consultancy specializing in XML architecture and design, SOA and Web services implementation, and content management.
Most Helpful Customer Reviews
This book answered my questions about XML Schema. Prior to reading it I felt intimidated by the W3C XML Schema specification. Now, I understand further why schemas are so valuable and necessary. Several other technical books I have read seem to be simple paraphrases of W3C specifications. However, this book is very well written and full of clear examples.