ISBN-10: 1118551125
ISBN-13: 9781118551127
Pub. Date: 11/11/2013
Publisher: Wiley
Designing High Availability Systems: DFSS and Classical Reliability Techniques with Practical Real Life Examples / Edition 1

Product Details

ISBN-13: 9781118551127
Publisher: Wiley
Publication date: 11/11/2013
Pages: 480
Product dimensions: 6.00(w) x 9.30(h) x 1.10(d)

About the Author

ZACHARY TAYLOR is a Systems Architect at Nokia Solutions & Networks with over thirty years' experience designing high availability and mission critical systems at GE, Lockheed Martin, and Motorola. He has a Master's in Electrical Engineering.

SUBRAMANYAM RANGANATHAN is a DFSS Master Black Belt at Nokia Solutions & Networks with over twenty years' experience in the high-tech industry, including at Motorola. He has a Master's in Electrical Engineering and an MBA from the Kellogg School of Management.

Table of Contents

Preface

List of Abbreviations

1. Introduction

2. Initial Considerations for Reliability Design
2.1 The Challenge
2.2 Initial Data Collection
2.3 Where Do We Get MTBF Information?
2.4 MTTR and Identifying Failures
2.5 Summary

3. A Game of Dice: An Introduction to Probability
3.1 Introduction
3.2 A Game of Dice
3.3 Mutually Exclusive and Independent Events
3.4 Dice Paradox Problem and Conditional Probability
3.5 Flip a Coin
3.6 Dice Paradox Revisited
3.7 Probabilities for Multiple Dice Throws
3.8 Conditional Probability Revisited
3.9 Summary

4. Discrete Random Variables
4.1 Introduction
4.2 Random Variables
4.3 Discrete Probability Distributions
4.4 Bernoulli Distribution
4.5 Geometric Distribution
4.6 Binomial Coefficients
4.7 Binomial Distribution
4.8 Poisson Distribution
4.9 Negative Binomial Random Variable
4.10 Summary

5. Continuous Random Variables
5.1 Introduction
5.2 Uniform Random Variables
5.3 Exponential Random Variables
5.4 Weibull Random Variables
5.5 Gamma Random Variables
5.6 Chi-Square Random Variables
5.7 Normal Random Variables
5.8 Relationship between Random Variables
5.9 Summary

6. Random Processes
6.1 Introduction
6.2 Markov Process
6.3 Poisson Process
6.4 Deriving the Poisson Distribution
6.5 Poisson Interarrival Times
6.6 Summary

7. Modeling and Reliability Basics
7.1 Introduction
7.2 Modeling
7.3 Failure Probability and Failure Density
7.4 Unreliability, F(t)
7.5 Reliability, R(t)
7.6 MTTF
7.7 MTBF
7.8 Repairable System
7.9 Nonrepairable System
7.10 MTTR
7.11 Failure Rate
7.12 Maintainability
7.13 Operability
7.14 Availability
7.15 Unavailability
7.16 Five 9s Availability
7.17 Downtime
7.18 Constant Failure Rate Model
7.19 Conditional Failure Rate
7.20 Bayes’s Theorem
7.21 Reliability Block Diagrams
7.22 Summary

8. Discrete-Time Markov Analysis
8.1 Introduction
8.2 Markov Process Defined
8.3 Dynamic Modeling
8.4 Discrete Time Markov Chains
8.5 Absorbing Markov Chains
8.6 Nonrepairable Reliability Models
8.7 Summary

9. Continuous-Time Markov Systems
9.1 Introduction
9.2 Continuous-Time Markov Processes
9.3 Two-State Derivation
9.4 Steps to Create a Markov Reliability Model
9.5 Asymptotic Behavior (Steady-State Behavior)
9.6 Limitations of Markov Modeling
9.7 Markov Reward Models
9.8 Summary

10. Markov Analysis: Nonrepairable Systems
10.1 Introduction
10.2 One Component, No Repair
10.3 Nonrepairable Systems: Parallel System with No Repair
10.4 Series System with No Repair: Two Identical Components
10.5 Parallel System with Partial Repair: Identical Components
10.6 Parallel System with No Repair: Nonidentical Components
10.7 Summary

11. Markov Analysis: Repairable Systems
11.1 Repairable Systems
11.2 One Component with Repair
11.3 Parallel System with Repair: Identical Component Failure and Repair Rates
11.4 Parallel System with Repair: Different Failure and Repair Rates
11.5 Summary

12. Analyzing Confidence Levels
12.1 Introduction
12.2 pdf of a Squared Normal Random Variable
12.3 pdf of the Sum of Two Random Variables
12.4 pdf of the Sum of Two Gamma Random Variables
12.5 pdf of the Sum of n Gamma Random Variables
12.6 Goodness-of-Fit Test Using Chi-Square
12.7 Confidence Levels
12.8 Summary

13. Estimating Reliability Parameters
13.1 Introduction
13.2 Bayes’ Estimation
13.3 Example of Estimating Hardware MTBF
13.4 Estimating Software MTBF
13.5 Revising Initial MTBF Estimates and Tradeoffs
13.6 Summary

14. Six Sigma Tools for Predictive Engineering
14.1 Introduction
14.2 Gathering Voice of Customer (VOC)
14.3 Processing Voice of Customer
14.4 Kano Analysis
14.5 Analysis of Technical Risks
14.6 Quality Function Deployment (QFD) or House of Quality
14.7 Program Level Transparency of Critical Parameters
14.8 Mapping DFSS Techniques to Critical Parameters
14.9 Critical Parameter Management (CPM)
14.10 First Principles Modeling
14.11 Design of Experiments (DOE)
14.12 Design Failure Modes and Effects Analysis (DFMEA)
14.13 Fault Tree Analysis
14.14 Pugh Matrix
14.15 Monte Carlo Simulation
14.16 Commercial DFSS Tools
14.17 Mathematical Prediction of System Capability instead of “Gut Feel”
14.18 Visualizing System Behavior Early in the Life Cycle
14.19 Critical Parameter Scorecard
14.20 Applying DFSS in Third-Party Intensive Programs
14.21 Summary

15. Design Failure Modes and Effects Analysis
15.1 Introduction
15.2 What Is Design Failure Modes and Effects Analysis (DFMEA)?
15.3 Definitions
15.4 Business Case for DFMEA
15.5 Why Conduct DFMEA?
15.6 When to Perform DFMEA
15.7 Applicability of DFMEA
15.8 DFMEA Template
15.9 DFMEA Life Cycle
15.10 The DFMEA Team
15.11 DFMEA Advantages and Disadvantages
15.12 Limitations of DFMEA
15.13 DFMEAs, FTAs, and Reliability Analysis
15.14 Summary

16. Fault Tree Analysis
16.1 What Is Fault Tree Analysis?
16.2 Events
16.3 Logic Gates
16.4 Creating a Fault Tree
16.5 Fault Tree Limitations
16.6 Summary

17. Monte Carlo Simulation Models
17.1 Introduction
17.2 System Behavior over Mission Time
17.3 Reliability Parameter Analysis
17.4 A Worked Example
17.5 Component and System Failure Times Using Monte Carlo Simulations
17.6 Limitations of Using Nontime-Based Monte Carlo Simulations
17.7 Summary

18. Updating Reliability Estimates: Case Study
18.1 Introduction
18.2 Overview of the Base Station Controller—Data Only (BSC-DO) System
18.3 Downtime Calculation
18.4 Calculating Availability from Field Data Only
18.5 Assumptions Behind Using the Chi-Square Methodology
18.6 Fault Tree Updates from Field Data
18.7 Summary

19. Fault Management Architectures
19.1 Introduction
19.2 Faults, Errors, and Failures
19.3 Fault Management Design
19.4 Repair versus Recovery
19.5 Design Considerations for Reliability Modeling
19.6 Architecture Techniques to Improve Availability
19.7 Redundancy Schemes
19.8 Summary

20. Application of DFMEA to Real-Life Example
20.1 Introduction
20.2 Cage Failover Architecture Description
20.3 Cage Failover DFMEA Example
20.4 DFMEA Scorecard
20.5 Lessons Learned
20.6 Summary

21. Application of FTA to Real-Life Example
21.1 Introduction
21.2 Calculating Availability Using Fault Tree Analysis
21.3 Building the Basic Events
21.4 Building the Fault Tree
21.5 Steps for Creating and Estimating the Availability Using FTA
21.6 Summary

22. Complex High Availability System Analysis
22.1 Introduction
22.2 Markov Analysis of the Hardware Components
22.3 Building a Fault Tree from the Hardware Markov Model
22.4 Markov Analysis of the Software Components
22.5 Markov Analysis of the Combined Hardware and Software Components
22.6 Techniques for Simplifying Markov Analysis
22.7 Summary

References

Index
