ISBN-10: 0132402920
ISBN-13: 9780132402927
Pub. Date: 05/02/1997
Publisher: Prentice Hall
Mission Critical Systems Management / Edition 1

Paperback

Original price: $38.80

Overview

This book compiles state-of-the-art industry experience in managing distributed systems in mission-critical Wall Street environments, enabling IS professionals to improve reliability, availability, and support while lowering cost. It shares years of practical experience in establishing production standards for complex distributed systems and introduces methods for effectively restructuring systems support through consolidation, outsourcing, insourcing, and automation. It also presents design ideas for new systems management tools. The book includes useful architectures and code excerpts for system developers, frameworks for system integrators, advice on purchasing and support, and ideas for improving IS understanding of user requirements. It is intended for information systems management and support personnel in distributed systems environments, especially those who must support mission-critical applications.

Product Details

ISBN-13: 9780132402927
Publisher: Prentice Hall
Publication date: 05/02/1997
Edition description: New Edition
Pages: 640
Product dimensions: 6.80(w) x 9.00(h) x 1.10(d)

About the Author

Yuval Lirov is a Senior Vice President at Lehman Brothers. He specializes in improving systems reliability and reducing support costs through economies of scale. His four years of managing 24x7 systems support for 3,300 hosts, 360 database servers, and 150 mission-critical applications have resulted in efficiency improvements of more than 100%. User satisfaction, determined through polls, has been sustained at 98%. Prior to joining Lehman Brothers, Dr. Lirov held management and development positions at Salomon Brothers and Bell Laboratories. He earned his doctorate in Systems Science under the guidance of Professor E. Y. Rodin at Washington University in St. Louis. Dr. Lirov is an author of "Mission Critical Systems Management" (Prentice Hall, 1997) and over 100 technical publications and patents in distributed systems management, troubleshooting, and resource allocation.

Table of Contents

List of Figures ..... xvii
List of Tables ..... xxii
Foreword ..... xxv
Preface ..... xxvii
Introduction ..... 1
Source of Difficulties ..... 2
Support Strategy ..... 2
Accountability ..... 3
Three-Tiered Architecture ..... 3
Integration ..... 4
Key Results ..... 4
Support of Growing Demand While Reducing Support Costs ..... 4
Performance Improvements ..... 5
Reliability Improvements ..... 5
Conclusions ..... 5
Distributed Approach ..... 7
Centralized Approach ..... 7
References ..... 8

PART I Cost Control ..... 9

Chapter 1 The Spiraling Costs of Systems Management (Frank Henderson) ..... 13
Introduction ..... 14
Network Management vs. Systems Management (It's a Thin Line) ..... 14
Complications of Distributed Environments ..... 15
Networked Security ..... 16
Other Manifestations of the Support Cost Issue ..... 16
Options ..... 18
Summary ..... 19
References ..... 19

Chapter 2 IT Service Management in the Distributed Enterprise (Doug McBride) ..... 21
Introduction ..... 22
Background ..... 23
Service Management Is a Strategy ..... 24
Service Level Agreement ..... 26
Service Level Objectives ..... 26
The Right Metric for the Job ..... 28
Service Management Support Tools ..... 30
Summary ..... 32
Appendix -- Eight Steps to a Successful Service Level Agreement ..... 33

Chapter 3 Buy vs. Build vs. Sell in Distributed Systems Management (Aaron Goldberg and Yuval Lirov) ..... 41
Introduction ..... 42
Modeling the Buy vs. Build Decision ..... 43
Baseline Model ..... 43
Feasibility Constraints ..... 44
Implicit Costs ..... 44
Applying the Model to Real-World Decisions ..... 45
Trouble Ticket System ..... 45
Mail Alias Management ..... 46
System Monitoring ..... 47
Batch Scheduling ..... 48
Conclusions ..... 49
References ..... 49

PART II Automation ..... 51

Chapter 4 Distributed Systems Monitoring (Aaron Goldberg, Boris Grinfeld, Baruch Katz, Yuval Lirov, and Hadil Sabbagh) ..... 55
Introduction ..... 56
Configuration Management ..... 57
Monitors ..... 59
Alerts ..... 60
Host Availability Monitor ..... 60
Log File Monitoring ..... 61
Other Monitors ..... 63
Summary ..... 63
References ..... 65

Chapter 5 Fault Management (Martha Ben-Michael, Yuval Lirov, and John O'Donnell) ..... 67
Introduction ..... 68
Architecture ..... 68
Multidimensional Object Classification (MOC) ..... 70
The Action Configuration System (ACS) ..... 70
The Action Resolver Program ..... 70
Filtering ..... 71
Configuration Management ..... 73
Configuration of Hosts ..... 73
Configuration of Dataservers ..... 75
Complexity ..... 75
Conclusions ..... 76
References ..... 77

Chapter 6 Problem Management (Martha Ben-Michael, David Freudenstein, Aaron Goldberg, and Yuval Lirov) ..... 79
Overview: Distributed vs. Centralized Support Models ..... 80
The Support Productivity Metric ..... 80
The Centralized Model ..... 81
The Distributed Model ..... 81
An Illustration ..... 82
Automation ..... 83
Problem Management System Design ..... 83
Requirements ..... 83
Escalation ..... 86
Ticket Priorities and Corporate Culture ..... 86
Trend Identification ..... 86
Solutions Knowledge-Base: Historical vs. Active ..... 87
Notifications: Tracking the Work ..... 87
User Interface ..... 88
Email Ticket Submission ..... 89
Command-Line Ticket Submission Interface ..... 90
Reports ..... 90
Scalability ..... 91
