Chapter 5: Creating Links and Cross-References
Generating Links with the id() Function
Generating Links with the key() Function
Generating Links in Unstructured Documents
Using the id() function
Doing more advanced linking with the key() function
Generating links in unstructured documents
Generating Links with the id() Function
Our first attempt at linking will be with the XPath id() function.
The ID, IDREF, and IDREFS Datatypes
Three of the basic datatypes supported by XML Document Type Definitions (DTDs) are ID, IDREF, and IDREFS. Here's a simple DTD that illustrates these datatypes:
<!--glossary.dtd-->

<!--The containing tag for the entire glossary-->
<!ELEMENT glossary (glentry+) >

<!--A glossary entry-->
<!ELEMENT glentry (term,defn+) >

<!--The word being defined-->
<!ELEMENT term (#PCDATA) >

<!--The id is used for cross-referencing, and the
    xreftext is the text used by cross-references.-->
<!ATTLIST term
          id       ID    #REQUIRED
          xreftext CDATA #IMPLIED >

<!--The definition of the term-->
<!ELEMENT defn (#PCDATA | xref | seealso)* >

<!--A cross-reference to another term-->
<!ELEMENT xref EMPTY >

<!--refid is the ID of the referenced term-->
<!ATTLIST xref
          refid IDREF #REQUIRED >

<!--seealso refers to one or more other definitions-->
<!ELEMENT seealso EMPTY>
<!ATTLIST seealso
          refids IDREFS #REQUIRED >
In this DTD, each <term> element is required to have an id attribute, and each <xref> element must have a refid attribute. The ID and IDREF datatypes work according to two rules:
Each value of the id attribute must be unique.
Each value of the refid attribute must match a value of an id attribute elsewhere in the document.
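To make the two rules concrete, here is a hypothetical fragment (not part of the sample glossary) that breaks both of them. Assuming it is parsed by a validating parser against glossary.dtd, each problem flagged in the comments should be reported as a validity error; the misspelled value "servlett" and the second "applet" entry are invented purely for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical, deliberately INVALID document used only to
     illustrate the ID and IDREF rules. -->
<!DOCTYPE glossary SYSTEM "glossary.dtd">
<glossary>
  <glentry>
    <term id="applet">applet</term>
    <!-- Breaks the IDREF rule: no element in this document
         has an ID attribute with the value "servlett". -->
    <defn>Contrast with <xref refid="servlett"/>.</defn>
  </glentry>
  <glentry>
    <!-- Breaks the ID rule: the value "applet" is already
         used by the first term. -->
    <term id="applet">Java applet</term>
    <defn>Another name for an applet.</defn>
  </glentry>
</glossary>
```

A non-validating parser may accept this document without complaint, because it is still well-formed.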
To round out our example, the <seealso> element contains an attribute of type IDREFS. This datatype contains one or more values, each of which must match a value of an ID elsewhere in the document. Multiple values, if present, are separated by whitespace.
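Because an IDREFS value is a whitespace-separated list, the XPath id() function can resolve all of its tokens in a single call. The stylesheet below is a minimal sketch of that idea, not the book's own stylesheet: it assumes an XSLT 1.0 processor whose parser reads the DTD (so that id is known to be of type ID), and the "See also" wording and the HTML anchor scheme are invented for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Sketch: turn a <seealso> into a comma-separated list of links. -->
  <xsl:template match="seealso">
    <xsl:text>See also: </xsl:text>
    <!-- id() splits the refids value on whitespace and returns
         every <term> element whose ID matches one of the tokens. -->
    <xsl:for-each select="id(@refids)">
      <a href="#{@id}">
        <xsl:value-of select="."/>
      </a>
      <!-- Separator between links, omitted after the last one. -->
      <xsl:if test="position() != last()">, </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Note that id() returns a node-set, so xsl:for-each visits the referenced terms in document order, which is not necessarily the order in which the IDs appear in the attribute.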
The ID datatype and its relatives have some complications, but we'll discuss them later. For now, we'll focus on how the id() function works.
An XML Document in Need of Links
To illustrate the value of linking, we'll use a small glossary written in XML. The glossary contains some <glentry> elements, each of which contains a single <term> and one or more <defn> elements. In addition, a definition is allowed to contain a cross-reference (<xref>) to another <term>. Here's a short sample document:
<?xml version="1.0" ?>
<!DOCTYPE glossary SYSTEM "glossary.dtd">
<glossary>
  <glentry>
    <term id="applet">applet</term>
    <defn>
      An application program, written in the Java programming
      language, that can be retrieved from a web server and
      executed by a web browser. A reference to an applet appears
      in the markup for a web page, in the same way that a
      reference to a graphics file appears; a browser retrieves an
      applet in the same way that it retrieves a graphics file.
      For security reasons, an applet's access rights are limited
      in two ways: the applet cannot access the file system of the
      client upon which it is executing, and the applet's
      communication across the network is limited to the server
      from which it was downloaded.
      Contrast with <xref refid="servlet"/>.
      <seealso refids="wildcard-char DMZlong pattern-matching"/>
    </defn>
  </glentry>

  <glentry>
    <term id="DMZlong"
      xreftext="demilitarized zone">demilitarized zone (DMZ)</term>
    <defn>
      In network security, a network that is isolated from, and
      serves as a neutral zone between, a trusted network (for
      example, a private intranet) and an untrusted network (for
      example, the Internet). One or more secure gateways usually
      control access to the DMZ from the trusted or the untrusted
      network.
    </defn>
  </glentry>

  <glentry>
    <term id="DMZ">DMZ</term>
    <defn>
      See <xref refid="DMZlong"/>.
    </defn>
  </glentry>

  <glentry>
    <term id="pattern-matching">pattern-matching character</term>
    <defn>
      A special character such as an asterisk (*) or a question
      mark (?) that can be used to represent zero or more
      characters. Any character or set of characters can replace
      a pattern-matching character.
    </defn>
  </glentry>

  <glentry>
    <term id="servlet">servlet</term>
    <defn>
      An application program, written in the Java programming
      language, that is executed on a web server. A reference to
      a servlet appears in the markup for a web page, in the same
      way that a reference to a graphics file appears. The web
      server executes the servlet and sends the results of the
      execution (if there are any) to the web browser. Contrast
      with <xref refid="applet" />.
    </defn>
  </glentry>

  <glentry>
    <term id="wildcard-char">wildcard character</term>
    <defn>
      See <xref refid="pattern-matching"/>.
    </defn>
  </glentry>
</glossary>
In this XML listing, each <term> element has an id attribute that identifies it uniquely. Many <xref> elements also refer to other terms in the listing. Notice that each time we refer to another term, we don't use the actual text of the referenced term. When we write our stylesheet, we'll use the XPath id() function to retrieve the text of the referenced term; if the name of a term changes (as buzzwords go in and out of fashion, some marketing genius might want to rename the "pattern-matching character," for example), we can rerun our stylesheet and be confident that all references to the renamed term contain the correct text.
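The lookup described here can be sketched in a few lines of XSLT. This is a minimal illustration under stated assumptions rather than the stylesheet the chapter goes on to build: it assumes an XSLT 1.0 processor whose parser reads glossary.dtd (without the DTD, the processor cannot know that id is an ID attribute, and id() finds nothing), and the HTML output with named anchors is an invented presentation choice.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Each term becomes a named anchor that links can target. -->
  <xsl:template match="term">
    <a name="{@id}">
      <b><xsl:value-of select="."/></b>
    </a>
  </xsl:template>

  <!-- Each cross-reference becomes a link whose text is fetched
       from the referenced term at transformation time. -->
  <xsl:template match="xref">
    <a href="#{@refid}">
      <!-- id(@refid) returns the <term> element whose ID attribute
           equals the value of refid; value-of outputs its text. -->
      <xsl:value-of select="id(@refid)"/>
    </a>
  </xsl:template>

</xsl:stylesheet>
```

With these templates, <xref refid="servlet"/> would produce a link whose text is "servlet"; if the <term> text is later edited, rerunning the transformation picks up the new wording with no change to any <xref>.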
Finally, some <term> elements have an xreftext attribute because some of the actual terms are longer than we'd like to use in a cross-reference. When we have an <xref> to the term ASCII (American Standard Code for Information Interchange), it would get pretty tedious if the entire text of the term appeared throughout our document. For this term, we'll use the xreftext attribute's value, ensuring that the cross-reference contains the less-intimidating text ASCII....
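One way to express this preference for xreftext is an xsl:choose inside the cross-reference template. The following is a sketch under the same assumptions as any use of id() (an XSLT 1.0 processor whose parser reads the DTD); the variable name target and the HTML link markup are illustrative choices, not taken from the text.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="xref">
    <!-- Look up the referenced <term> once and reuse it. -->
    <xsl:variable name="target" select="id(@refid)"/>
    <a href="#{@refid}">
      <xsl:choose>
        <!-- Use the short form when the term supplies one. -->
        <xsl:when test="$target/@xreftext">
          <xsl:value-of select="$target/@xreftext"/>
        </xsl:when>
        <!-- Otherwise fall back to the full text of the term. -->
        <xsl:otherwise>
          <xsl:value-of select="$target"/>
        </xsl:otherwise>
      </xsl:choose>
    </a>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the sample glossary, <xref refid="DMZlong"/> would be rendered as "demilitarized zone" rather than "demilitarized zone (DMZ)", while references to terms without an xreftext attribute keep the term's own text.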