Refactoring HTML: Improving the Design of Existing Web Applications (Addison-Wesley Signature Series) / Edition 1

by Elliotte Rusty Harold
     
 

ISBN-10: 0321503635

ISBN-13: 9780321503633

Pub. Date: 05/16/2008

Publisher: Addison-Wesley


Overview

Like any other software system, Web sites gradually accumulate “cruft” over time. They slow down. Links break. Security and compatibility problems mysteriously appear. New features don’t integrate seamlessly. Things just don’t work as well. In an ideal world, you’d rebuild from scratch. But you can’t: there’s no time or money for that. Fortunately, there’s a solution: You can refactor your Web code using easy, proven techniques, tools, and recipes adapted from the world of software development.

In Refactoring HTML, Elliotte Rusty Harold explains how to use refactoring to improve virtually any Web site or application. Writing for programmers and non-programmers alike, Harold shows how to refactor for better reliability, performance, usability, security, accessibility, compatibility, and even search engine placement. Step by step, he shows how to migrate obsolete code to today’s stable Web standards, including XHTML, CSS, and REST—and eliminate chronic problems like presentation-based markup, stateful applications, and “tag soup.”

The book’s extensive catalog of detailed refactorings and practical “recipes for success” is organized to help you find specific solutions fast and get maximum benefit for minimum effort. Using this book, you can quickly improve site performance now, and make your site far easier to enhance, maintain, and scale for years to come.

Topics covered include

• Recognizing the “smells” of Web code that should be refactored
• Transforming old HTML into well-formed, valid XHTML, one step at a time
• Modernizing existing layouts with CSS
• Updating old Web applications: replacing POST with GET, replacing old contact forms, and refactoring JavaScript
• Systematically refactoring content and links
• Restructuring sites without changing the URLs your users rely upon

This book will be an indispensable resource for Web designers, developers, project managers, and anyone who maintains or updates existing sites. It will be especially helpful to Web professionals who learned HTML years ago, and want to refresh their knowledge with today’s standards-compliant best practices.

Product Details

ISBN-13: 9780321503633
Publisher: Addison-Wesley
Publication date: 05/16/2008
Series: Addison-Wesley Signature Series
Edition description: New Edition
Pages: 340
Product dimensions: 7.20(w) x 9.30(h) x 0.90(d)

Table of Contents


Foreword by Martin Fowler
Foreword by Bob DuCharme
About the Author

Chapter 1 Refactoring
Why Refactor
When to Refactor
What to Refactor To
Objections to Refactoring

Chapter 2 Tools
Backups, Staging Servers, and Source Code Control
Validators
Testing
Regular Expressions
Tidy
TagSoup
XSLT

Chapter 3 Well-Formedness
What Is Well-Formedness?
Change Name to Lowercase
Quote Attribute Value
Fill In Omitted Attribute Value
Replace Empty Tag with Empty-Element Tag
Add End-tag
Remove Overlap
Convert Text to UTF-8
Escape Less-Than Sign
Escape Ampersand
Escape Quotation Marks in Attribute Values
Introduce an XHTML DOCTYPE Declaration
Terminate Each Entity Reference
Replace Imaginary Entity References
Introduce a Root Element
Introduce the XHTML Namespace

Chapter 4 Validity
Introduce a Transitional DOCTYPE Declaration
Remove All Nonexistent Tags
Add an alt Attribute
Replace embed with object
Introduce a Strict DOCTYPE Declaration
Replace center with CSS
Replace font with CSS
Replace i with em or CSS
Replace b with strong or CSS
Replace the color Attribute with CSS
Convert img Attributes to CSS
Replace applet with object
Replace Presentational Elements with CSS
Nest Inline Elements inside Block Elements

Chapter 5 Layout
Replace Table Layouts
Replace Frames with CSS Positions
Move Content to the Front
Mark Up Lists as Lists
Replace blockquote/ul Indentation with CSS
Replace Spacer GIFs
Add an ID Attribute
Add Width and Height to an Image

Chapter 6 Accessibility
Convert Images to Text
Add Labels to Form Input
Introduce Standard Field Names
Turn on Autocomplete
Add Tab Indexes to Forms
Introduce Skip Navigation
Add Internal Headings
Move Unique Content to the Front of Links and Headlines
Make the Input Field Bigger
Introduce Table Descriptions
Introduce Acronym Elements
Introduce lang Attributes

Chapter 7 Web Applications
Replace Unsafe GET with POST
Replace Safe POST with GET
Redirect POST to GET
Enable Caching
Prevent Caching
Introduce ETag
Replace Flash with HTML
Add Web Forms 2.0 Types
Replace Contact Forms with mailto Links
Block Robots
Escape User Input

Chapter 8 Content
Correct Spelling
Repair Broken Links
Move a Page
Remove the Entry Page
Hide E-mail Addresses

Appendix A Regular Expressions
Characters That Match Themselves
Metacharacters
Wildcards
Quantifiers

Index

Customer Reviews


Refactoring HTML: Improving the Design of Existing Web Applications (Addison-Wesley Signature Series) 5 out of 5 based on 0 ratings. 1 review.
Guest More than 1 year ago
The Web mostly means webpages written in HTML. HTML is overwhelmingly popular, yet it has well-known problems: it has no intrinsic separation of semantic content from presentation details, and its tag syntax is very sloppy. Harold explains in clear and strong terms why you should clean up your webpages, mostly by using CSS and by making sure [and checking] that the pages are well formed and valid XHTML. This is not a text on CSS; if you are going to follow the precepts of the book, you will need another book dedicated to CSS.
The strength of Harold's message is in its clarity. He is trying to influence you in a top-down manner, to get you to make these strategic decisions. For example, by going with CSS you simplify maintenance, because files are factored into CSS files, which layout people can work on, and semantic content files, which can be the purview of others who are more involved with intrinsic information processing. The latter files also have the advantage that they can be used with different types of display devices and programs, not just the typical web browser. Think of cellphones, or devices for the blind.
Accessibility is another good point he makes. Writing pages that are accessible to the blind is good not just for that reason. It lets you focus not on what the page looks like, but on what it means. Why is this good? Because it improves the chance that search engines will look at and positively classify your semantic files. Search engines often discount presentation instructions and CSS files; they are looking for files with high semantic content. Also, factoring out CSS files makes the resulting set of files smaller, which reduces outgoing bandwidth from your web server. For large, popular sites, this can be a cost saving. Finally, writing well-formed and [better yet] XHTML-valid pages increases the chances that different browsers can accurately show the pages.
The reason is that browsers have been written to pragmatically show HTML whose tag structure is sloppy. To do this, a browser has to make certain display assumptions about a badly written file. The problem is that different browsers make different assumptions, and so some HTML files will not display well, or at all.
There are also smaller tips scattered through the book. Suppose you have an image that shows essentially only text. Replace the image with text: less bandwidth is consumed, and search engines don't really do much with images. [Image analysis is very intensive and hard.] So giving them more meaningful text instead of images helps your page ranking. As a side note, some spammers do precisely the opposite: they use images that mostly display text, to evade a search engine or antispam software that keys off suspicious text. Relatedly, you should always have an alt attribute describing the image. It helps the blind visitor, but mostly it helps a search engine classify the image.
There is one unintentionally ironic aspect of the book's last page. It talks about hiding your email address in the webpage from screen-scraper bots run by spammers harvesting email addresses. One way is to use JavaScript to generate the address, with the script run by the visitor's browser as it displays the page. This is to evade spammers. The irony is that a spammer can use this very method when sending spam email. Many antispam programs now use a blacklist, since spam often has links to the spammer's domain. But the programs usually [always?] check against static links in an email. The spammer can write JavaScript that dynamically makes links, to evade this. Sure, browsers that have JavaScript turned off will not show these links. But in fact, most users turn JavaScript on, because many websites use it.
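[A rough illustration of the address-hiding idea described above, for readers who want to see its shape. This is only a sketch and not necessarily the recipe the book gives: the function name `obfuscated_mailto`, the `webmaster` / `example.com` values, and the `noscript` fallback are all assumptions made for the example. It is written in Python as a page-generation helper; the script it emits is what the visitor's browser would run.]

```python
import html
import json


def obfuscated_mailto(user: str, domain: str) -> str:
    """Return an HTML fragment whose script assembles the address in the browser.

    The static markup never contains the complete "user@domain" string, so a
    scraper that only reads the page source does not find it.
    Hypothetical helper: assumes plain ASCII parts with no markup in them.
    """
    # json.dumps produces quoted JavaScript string literals for simple inputs.
    js_user = json.dumps(user)
    js_domain = json.dumps(domain)
    script = (
        "<script>\n"
        # String.fromCharCode(64) is "@", so the sign itself stays out of the source.
        f"  var addr = {js_user} + String.fromCharCode(64) + {js_domain};\n"
        "  document.write('<a href=\"mailto:' + addr + '\">' + addr + '</a>');\n"
        "</script>"
    )
    # Fallback text for visitors whose browsers have JavaScript turned off.
    fallback = f"<noscript>{html.escape(user)} at {html.escape(domain)}</noscript>"
    return script + "\n" + fallback


if __name__ == "__main__":
    print(obfuscated_mailto("webmaster", "example.com"))
```

[The trade-off is the one noted above: a visitor with JavaScript turned off sees only the fallback text, not a working link.]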
And the spammer might figure that the loss of links due to no JavaScript is greatly outweighed by being able to evade the now almost axiomatic use of blacklists by antispam