Administering Data Centers: Servers, Storage, and Voice over IP / Edition 1

Hardcover (Print)
Overview

"This book covers a wide spectrum of topics relevant to implementing and managing a modern data center. The chapters are comprehensive and the flow of concepts is easy to understand."
—Cisco reviewer

Gain a practical knowledge of data center concepts

To create a well-designed data center (including storage and network architecture, VoIP implementation, and server consolidation) you must understand a variety of key concepts and technologies. This book explains those factors in a way that smooths the path to implementation and management. Whether you need an introduction to the technologies, a refresher course for IT managers and data center personnel, or an additional resource for advanced study, you'll find these guidelines and solutions provide a solid foundation for building reliable designs and secure data center policies.

  • Understand the common causes and high costs of service outages
  • Learn how to measure high availability and achieve maximum levels
  • Design a data center using optimum physical, environmental, and technological elements
  • Explore a modular design for cabling, Points of Distribution, and WAN connections from ISPs
  • See what must be considered when consolidating data center resources
  • Expand your knowledge of best practices and security
  • Create a data center environment that is user- and manager-friendly
  • Learn how high availability, clustering, and disaster recovery solutions can be deployed to protect critical information
  • Find out how to use a single network infrastructure for IP data, voice, and storage

Product Details

  • ISBN-13: 9780471771838
  • Publisher: Wiley
  • Publication date: 11/24/2005
  • Edition number: 1
  • Pages: 632
  • Product dimensions: 7.50 (w) x 9.52 (h) x 1.65 (d)

Read an Excerpt

Administering Data Centers


By Kailash Jayaswal

John Wiley & Sons

ISBN: 0-471-77183-X


Chapter One

No Time for Downtime

We gain strength, and courage, and confidence by each experience in which we really stop to look fear in the face ... we must do that which we think we cannot. - Eleanor Roosevelt

The need for high availability did not originate with the Internet or e-commerce. It has existed for thousands of years. When Greek warships or merchant ships sailed to discover new lands or business, the captains carried spare sails and oars on board. If the primary sail failed, the crew would immediately hoist a replacement and continue on their way, while they repaired damaged sails. With the advent of electronic sensors, the spare parts employed in industrial systems did not need human intervention for activation. In the early twentieth century, electric power-generating plants automatically detected problems, if any, in the primary generator and switched to a hot standby unit.

With the recent explosive growth of the Internet and our dependence on information systems, high availability has taken on a new meaning and importance. Businesses and consumers are turning to the Internet for purchasing goods and services. People conduct business anytime from their computer. They expect to buy clothes at 2 a.m. on the Web and expect the site to function properly, without problem or delay, from the first click to the last. If the Web site is slow or unavailable, they will click away to a competitor's site. Business globalization caused by the Internet adds another layer of complexity. A popular online store, with business located in Bismarck, North Dakota, may have customers in Asia who keep the seller's servers busy during quiet hours in the United States. Time zones, national borders, and peak and off-peak hours essentially disappear on the Web.

As computers get faster and cheaper, they are being used for more and more critical tasks that require 24-7 uptime. Hospitals, airlines, online banking services, and other service industries modify customer-related data in real time. The amount of online data is rapidly expanding. It is estimated that online data will grow more than 75 percent every year for the next several years. Rapidly increasing demand for online data, combined with the constantly decreasing price of storage media, has resulted in huge amounts of critical information being placed online.

Employees and partners depend on data being available at all times. Work hours have extended beyond the traditional 9-to-5, five days a week. Intranet services such as e-mail and internal applications must always be up and functional for work to continue. Every company has at least one business-critical server that supports the organization's day-to-day operation and health. The unavailability of critical applications translates to lost revenue, reduced customer service and customer loyalty, and well-paid, but idle, workers. A survey of 450 Fortune 1000 companies (conducted by the Strategic Research Division of Find/SVP) concluded that U.S. businesses incur about $4 billion of losses per year because of system or network downtime.

In fact, analysts estimate that every minute of Enterprise Resource Planning (ERP) downtime could cost a retailer between $10,000 and $15,000. Systems and data are not expected to be down, not even for maintenance. Downtime literally freezes customers, employees, and partners, who cannot even complete the most basic daily chores.

The requirements for reliability and availability put extreme demands on servers, network, software, and supporting infrastructure. Corporate and e-commerce sites must be capable of processing large numbers of concurrent transactions and are configured to operate 24-7. All components, including both the server hardware and software, must be configured to be redundant.

And what happens when no one can get to the applications? What happens when data is unreachable and the important servers do not want to boot up? Can you shut down your business and ask your employees to go home? Can you tell your customers to go somewhere else? How is it that no one planned for this scenario? Is it possible to recover from this? How long will it take and how much will it cost? What about reputation among customers? Will they ever come back? Why doesn't this happen to your competitors?

As you can see, it happens all the time and all around us. Following are some events that have occurred over the last few years. They expose our total dependence on computer systems and utter helplessness if critical systems are down.

* In April of 1998, AT&T had a 26-hour frame relay-network outage that hurt several business customers. In December of 1999, AT&T had an 8-hour outage that disrupted services to thousands of AT&T WorldNet dial-up users.

* In early 1999, customers of the popular online stock trading site ETrade could not place stock trade orders because the trading sites were down. At the same time, there were a few outages at The Charles Schwab Corporation because of operator errors or upgrades. Schwab later announced a plan to invest $70 million in information technology (IT) infrastructure.

* In June of 1999, eBay had a 22-hour outage that cost the company more than $3 million in credits to customers and about $6 billion (more than 20 percent) in market capitalization. In January of 2001, parts of the site were again down for another 10 hours.

* In August of 1999, MCI suffered about 10 days of partial outages and later provided 20 days of free service to 3,000 enterprise customers.

* Three outages at the Web retailer amazon.com during the busy holiday-shopping season of December 2000 cost Amazon more than $500,000 in sales loss.

* Denial-of-Service and several virus-induced attacks on Internet servers continue to cause Web site outages. On July 19, 2002, a hacker defaced a page on the U.S. Army Research Laboratory's Web site with a message criticizing the Army's organization for bias to certain nations.

* Terrorist attacks in Washington, D.C., New York, London, and cities around the world in recent years have destroyed several data centers and offices.

Businesses everywhere are faced with the challenge of minimizing downtime. At the same time, plans to enhance service availability have financial and resource-related constraints. Taking steps to increase data, system, and network availability is a delicate task. If the environment is not carefully designed and implemented, it will cost dearly (in terms of required time, money, and human resources) to build and manage.

To increase service availability, you must identify and eliminate potential causes of downtime, such as hardware failures, network glitches, software problems, and application bugs. Sometimes, poor server, application, or network performance is perceived as downtime. Service expectations are high. When someone wants to place a phone call, he or she picks up the phone and expects a dial tone within a few seconds. The call must connect within one second of dialing and there should be no dropped connections. When surfing the Web, users expect the first visual frame of a Web page within a few seconds of accessing the site. All systems, especially those related to consumers and critical operations, should always be ready and must operate with no lost transactions.

But potential causes of downtime abound. The entire IT infrastructure is made up of several links, such as user workstations, network devices, servers, applications, data, and so forth. If any link is down, the user is affected. It then does not matter if the other links in the chain are available or not. Downtime, in this book, is defined as an end user's inability to get his or her work done. This book examines ways to enhance service availability to the end user and describes techniques for improving network, data, server, and application uptime.
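The chain-of-links idea above can be put into numbers with a minimal sketch. It assumes the links fail independently, so the end-to-end availability is the product of the individual availabilities. The per-link figures below are hypothetical values chosen for illustration, not measurements from the book.

```python
from math import prod


def end_to_end_availability(link_availabilities):
    """Availability of a serial chain, as a fraction between 0 and 1.

    Assumes every link must be up for the user to work and that
    link failures are independent of one another.
    """
    return prod(link_availabilities)


# Hypothetical availability of each link between the user and the data.
links = {
    "workstation": 0.999,
    "network": 0.999,
    "server": 0.98,
    "application": 0.99,
    "data": 0.999,
}

if __name__ == "__main__":
    overall = end_to_end_availability(links.values())
    weakest = min(links, key=links.get)
    print(f"End-to-end availability: {overall:.4f}")
    print(f"Weakest link: {weakest} ({links[weakest]:.3f})")
```

Under these assumptions the chain is always less available than its weakest link, which is why improving only one component rarely delivers the uptime the end user experiences.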

Availability is the portion of time that an application or service is available to internal or external customers for productive work. The more resilient a system or environment is, the higher the availability is. An important decision is the required availability level. When you ask a user or project manager how much uptime he or she needs for the application, the reflex answer is "One-hundred percent. It must always be available at all times." But when you explain the high costs required to achieve 100 percent uptime, the conversation becomes more of a two-way negotiation. The key point is to balance downtime cost with availability configuration costs.

Another consideration is the period during which 100 percent uptime is necessary. Network Operations Center (NOC) and 24-7 network monitoring applications and e-commerce Web sites require 100 percent uptime. At the other extreme are software development environments, used only when developers are accessing the system. If, on occasion, you take development systems down (especially at night or on weekends), and if you warn your users well in advance, downtime is not an issue.

Table 1-1 illustrates how little time per year is afforded for planned or unplanned downtime as availability requirements move closer to 100 percent. Suppose a server, "hubble," has no special high-availability features except for RAID-1 volumes and regular backups and has 98 percent uptime. The 2 percent downtime is too high and, therefore, it is clustered with "casper," which also has 98 percent uptime. Server "casper" is used only during the 2 percent of the time when hubble is down. The combined availability is 98 percent plus 98 percent of the remaining 2 percent (0.98 × 2 = 1.96 percent), which gives a theoretical service uptime of 99.96 percent for the two-node cluster.

In reality, several other factors affect both servers, such as downtime during failover duration, power or network outages, and application bugs. These failures will decrease the theoretical combined uptime.

As you move down the table, the incremental costs associated with achieving the level of availability increase exponentially. It is far more expensive to migrate from a "four-nines" to a "five-nines" (99.99 percent to 99.999 percent uptime) configuration than to move from 99 percent to 99.9 percent uptime.
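The two-node arithmetic and the shrinking downtime budget described above can be reproduced with a short sketch. The function names are illustrative, and the model assumes instantaneous failover, independent failures, and a 365-day year, so it yields only the theoretical figures, not what a real cluster would achieve.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year allowed at a given availability level."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def two_node_availability(primary_percent: float, standby_percent: float) -> float:
    """Theoretical uptime when a standby covers the primary's downtime.

    Assumes instantaneous failover and independent failures.
    """
    primary_down = 100 - primary_percent
    return primary_percent + (standby_percent / 100) * primary_down


if __name__ == "__main__":
    combined = two_node_availability(98, 98)
    print(f"hubble + casper: {combined:.2f} percent")
    for level in (98, 99, 99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(level)
        print(f"{level} percent uptime allows about {minutes:,.1f} minutes of downtime per year")
```

Each additional nine divides the yearly downtime budget by ten, which is one way to see why the last steps toward 100 percent are the most expensive.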

Causes of Downtime

About 80 percent of the unplanned downtime is caused by process or people issues, and 20 percent is caused by product issues. Solid processes must be in place throughout the IT infrastructure to avoid process-, people-, or product-related outages. Figure 1-1 and Table 1-2 show the various causes of downtime. As you can see, planned or scheduled downtime is one of the biggest contributors (30 percent). It is also the easiest to reduce. It includes events that are preplanned by IT (system, database, and network) administrators and usually done at night. It could be just a proactive reboot. Other planned tasks that lead to host or application outage are scheduled activities such as application or operating system upgrades, adding patches, hardware changes, and so forth.

Most of these planned events can be performed without service interruption. Disks, fans, and power supplies in some servers and disk subsystems can be changed during normal run-time, without need for power-offs. Data volumes and file systems can be increased, decreased, or checked for problems while they are online. Many applications can be upgraded while they are up, although some must be shut down before an upgrade or a configuration change.

Outages for planned activities can be avoided by having standby devices or servers in place. Server clustering and redundant devices and links help reduce service outages during planned maintenance. If the application is running in a cluster, it can be switched to another server in the cluster. After the application is upgraded, the application can be moved back. The only downtime is the time duration required to switch or failover services from one server to another. The same procedure can be used for host-related changes that require the host to be taken off-line. Apart from the failover duration, there is no other service outage.
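The switch, upgrade, and switch-back sequence described above can be sketched as a small simulation. The `Cluster` class, its methods, and the 30-second failover duration are hypothetical and are not modeled on any real clustering product; the node names are borrowed from the earlier hubble and casper example.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical two-node cluster used only to illustrate the procedure."""
    active: str
    standby: str
    failover_seconds: float  # assumed service interruption per switch
    downtime_seconds: float = 0.0
    log: list = field(default_factory=list)

    def switch(self) -> None:
        """Move the service to the standby node; only this step interrupts service."""
        self.active, self.standby = self.standby, self.active
        self.downtime_seconds += self.failover_seconds
        self.log.append(f"service switched to {self.active}")

    def upgrade_standby(self) -> None:
        """Upgrade the node that is not serving users, so no outage is recorded."""
        self.log.append(f"upgraded {self.standby} while {self.active} served users")


def planned_upgrade(cluster: Cluster) -> float:
    """Upgrade the original active node and return total service downtime in seconds."""
    cluster.switch()           # move the service off the node to be upgraded
    cluster.upgrade_standby()  # upgrade the now-idle original node
    cluster.switch()           # move the service back
    return cluster.downtime_seconds


if __name__ == "__main__":
    cluster = Cluster(active="hubble", standby="casper", failover_seconds=30.0)
    total = planned_upgrade(cluster)
    for entry in cluster.log:
        print(entry)
    print(f"Total service downtime: {total:.0f} seconds")
```

In this model the upgrade itself contributes nothing to downtime; the only cost is the two failovers, which matches the point that the outage is limited to the failover duration.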

Another major cause of downtime is people-related. It is caused by poor training, a rush to get things done, fatigue, lots of nonautomated tasks, or pressure to do several things at the same time. It could also be caused by lack of expertise, poor understanding of how systems or applications work, and poorly defined processes. You can reduce the likelihood of operator-induced outages by following properly documented procedures and best practices. Organizations must have several easy-to-understand how-tos for technical support groups and project managers. The documentation must be placed where it can be easily accessed, such as internal Web sites. It is important to spend time and money on employee training because in economically good times, talented employees are hard to recruit and harder to retain. For smooth continuity of expertise, it is necessary to recruit enough staff to cover emergencies and employee attrition and to avoid overdependence on one person.

Avoiding unplanned downtime takes more discipline than reducing planned downtime. One major contributor to unplanned downtime is software glitches. The Gartner Group estimates that U.S. companies suffer losses of up to $1 billion every year because of software failure. In another survey conducted by Ernst and Young, it was found that almost all the 310 surveyed companies had some kind of business disruption. About 30 percent of the disruptions caused losses of $100,000 or more each to the company.

When production systems fail, backups and business-continuance plans are immediately deployed and are every bit worth their weight, but the damage has already been done. Bug fixes are usually reactive to the outages they wreak. As operating systems and applications get more and more complex, they will have more bugs. On the other hand, software development and debugging techniques are getting more sophisticated. It will be interesting to see if the percentage of downtime attributed to software bugs increases or decreases in the future. It is best to stay informed of the latest developments and keep current on security, operating system, application, and other critical patches. Sign up for e-mail-based advisory bulletins from vendors whose products are critical to your business.

Environmental factors that can cause downtime are rare, but they happen. Power fails. Fires blaze. Floods gush. The ground below shakes. In 1998, the East Coast of the United States endured the worst hurricane season on record. At the same time, the Midwest was plagued with floods. Natural disasters occur mercurially all the time and adversely impact business operations. And, to add to all that, there are disasters caused by human beings, such as terrorist attacks.

The best protection is to have one or more remote, mirrored disaster recovery (DR) sites. In the past, a fully redundant system at a remote DR site was an expensive and daunting proposition. Nowadays, conditions have changed to make it very affordable:

* Hardware costs and system sizes have fallen dramatically.

* The Internet has come to provide a common network backbone.

* Operating procedures, technology, and products have made an off-site installation easy to manage remotely.

To protect against power blackouts, use uninterruptible power supplies (UPS). If an Internet connection is critical, use two Internet access providers or at least separate, fully redundant links from the same provider.

Cost of Downtime

Organizations need to cost out the financial impact caused by downtime. The result helps determine the extent of resources that must be spent to protect against outages. The total cost of a service outage is difficult to assess. Customer dissatisfaction, lost transactions, data integrity problems, and lost business revenue cannot be accurately quantified. An extended period of downtime can result in ruin and, depending on the nature of the business, the hourly cost of business outage can be several tens of thousands of dollars to a few million dollars. Table 1-3 provides some examples of downtime costs.
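A first-pass estimate of direct outage cost can be sketched as follows. It uses the per-minute ERP downtime range for a retailer quoted earlier in the chapter ($10,000 to $15,000); the 90-minute outage is a hypothetical input. The sketch deliberately leaves out customer dissatisfaction, data integrity problems, and other losses that the text says cannot be accurately quantified.

```python
def outage_cost_range(minutes_down: float,
                      low_per_minute: float,
                      high_per_minute: float) -> tuple:
    """Return the (low, high) direct cost of an outage in dollars.

    Covers only per-minute losses; indirect costs such as lost
    customer loyalty are not modeled.
    """
    return minutes_down * low_per_minute, minutes_down * high_per_minute


if __name__ == "__main__":
    # Hypothetical 90-minute ERP outage at a retailer.
    low, high = outage_cost_range(90, 10_000, 15_000)
    print(f"Estimated direct cost: ${low:,.0f} to ${high:,.0f}")
```

Comparing a figure like this against the cost of clustering or a DR site gives a starting point for the negotiation between downtime cost and availability configuration cost.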

(Continues...)



Excerpted from Administering Data Centers by Kailash Jayaswal. Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Table of Contents

Ch. 1 No time for downtime
Ch. 2 The high-availability continuum
Ch. 3 Data center requirements
Ch. 4 Data center design
Ch. 5 Network infrastructure in a data center
Ch. 6 Data center maintenance
Ch. 7 Power distribution in a data center
Ch. 8 Data center HVAC
Ch. 9 Reasons for data center consolidation
Ch. 10 Data center consolidation phases
Ch. 11 Server performance metrics
Ch. 12 Server capacity planning
Ch. 13 Best practices in IT
Ch. 14 Server security
Ch. 15 Server administration
Ch. 16 Device naming
Ch. 17 Load balancing
Ch. 18 Fault tolerance
Ch. 19 RAID
Ch. 20 Data storage solutions
Ch. 21 Storage area networks
Ch. 22 Configuring a SAN
Ch. 23 Using SANs for high availability
Ch. 24 IP-based storage communications
Ch. 25 Cluster architecture
Ch. 26 Cluster requirements
Ch. 27 Designing cluster-friendly applications
Ch. 28 Network devices
Ch. 29 Network protocols
Ch. 30 IP addressing
Ch. 31 Network technologies
Ch. 32 Network topologies
Ch. 33 Network design
Ch. 34 Designing fault-tolerant networks
Ch. 35 Internet access technologies and VPNs
Ch. 36 Firewalls
Ch. 37 Network security
Ch. 38 Disaster recovery
Ch. 39 DR architectures
Ch. 40 Voice over IP and converged infrastructure
Ch. 41 What's next