Safety Critical Systems Handbook
A Straightforward Guide to Functional Safety: IEC 61508 (2010 Edition) and Related Standards Including: Process IEC 61511, Machinery IEC 62061 and ISO 13849
By David J Smith and Kenneth G L Simpson
Butterworth-Heinemann
Copyright © 2011 Dr David J Smith and Kenneth G L Simpson
All rights reserved.
Chapter One
The Meaning and Context of Safety Integrity Targets
Chapter Outline
1.1 Risk and the Need for Safety Targets
1.2 Quantitative and Qualitative Safety Targets
1.3 The Life-cycle Approach
    Section 7.1 of Part 1
    Concept and scope [Part 1 – 7.2 and 7.3]
    Hazard and risk analysis [Part 1 – 7.4]
    Safety requirements and allocation [Part 1 – 7.5 and 7.6]
    Plan operations and maintenance [Part 1 – 7.7]
    Plan the validation [Part 1 – 7.8]
    Plan installation and commissioning [Part 1 – 7.9]
    The safety requirements specification [Part 1 – 7.10]
    Design and build the system [Part 1 – 7.11 and 7.12]
    Install and commission [Part 1 – 7.13]
    Validate that the safety-systems meet the requirements [Part 1 – 7.14]
    Operate, maintain, and repair [Part 1 – 7.15]
    Control modifications [Part 1 – 7.16]
    Disposal [Part 1 – 7.17]
    Verification [Part 1 – 7.18]
    Functional safety assessments [Part 1 – 8]
1.4 Steps in the Assessment Process
    Step 1. Establish Functional Safety Capability (i.e. Management)
    Step 2. Establish a Risk Target
    Step 3. Identify the Safety Related Function(s)
    Step 4. Establish SILs for the Safety-related Elements
    Step 5. Quantitative Assessment of the Safety-related System
    Step 6. Qualitative Assessment Against the Target SILs
    Step 7. Establish ALARP
1.5 Costs
    1.5.1 Costs of Applying the Standard
    1.5.2 Savings From Implementing the Standard
    1.5.3 Penalty Costs from not Implementing the Standard
1.6 The Seven Parts of IEC 61508
1.1 Risk and the Need for Safety Targets
There is no such thing as zero risk. This is because no physical item has zero failure rate, no human being makes zero errors and no piece of software design can foresee every operational possibility.
Nevertheless public perception of risk, particularly in the aftermath of a major incident, often calls for the zero risk ideal. However, in general most people understand that this is not practicable, as can be seen from the following examples of everyday risk of death from various causes:
Cause of death                              Risk (per annum)
All causes (mid-life including medical)     1 × 10^-3
All accidents (per individual)              5 × 10^-4
Accident in the home                        4 × 10^-4
Road traffic accident                       6 × 10^-5
Natural disasters (per individual)          2 × 10^-6
Therefore the concept of defining and accepting a tolerable risk for any particular activity prevails.
The actual degree of risk considered to be tolerable will vary according to a number of factors such as the degree of control one has over the circumstances, the voluntary or involuntary nature of the risk, the number of persons at risk in any one incident and so on. This partly explains why the home remains one of the highest areas of risk to the individual in everyday life since it is there that we have control over what we choose to do and are therefore prepared to tolerate the risks involved.
A safety technology has grown up around the need to set target risk levels and to evaluate whether proposed designs meet these targets, be they process plant, transport systems, medical equipment or any other application.
In the early 1970s people in the process industries became aware that, with larger plants involving higher inventories of hazardous material, the practice of learning by mistakes (if indeed we do) was no longer acceptable. Methods were developed for identifying hazards and for quantifying the consequences of failures. They evolved largely to assist in the decision-making process when developing or modifying plant. External pressures to identify and quantify risk were to come later.
By the mid 1970s there was already concern over the lack of formal controls for regulating those activities which could lead to incidents having a major impact on the health and safety of the general public. The Flixborough incident in June 1974, which resulted in 28 deaths, focused UK public and media attention on this area of technology. Many further events, such as that at Seveso (Italy) in 1976 through to the Piper Alpha offshore disaster and more recent Paddington (and other) rail incidents, have kept that interest alive and have given rise to the publication of guidance and also to legislation in the UK.
The techniques for quantifying the predicted frequency of failures are just the same as those previously applied to plant availability, where the cost of equipment failure was the prime concern. The tendency in the last few years has been towards a more rigorous application of these techniques (together with third party verification) in the field of hazard assessment. They include Fault Tree Analysis, Failure Mode & Effect Analysis, Common Cause Failure Assessment and so on. These will be explained in Chapters 5 and 6.
Hazard assessment of process plant, and of other industrial activities, was common in the 1980s but formal guidance and standards were rare and somewhat fragmented. Only Section 6 of the Health and Safety at Work Act 1974 underpinned the need to do all that is reasonably practicable to ensure safety. However, following the Flixborough disaster, a series of moves (including the Seveso directive) led to the CIMAH (Control of Industrial Major Accident Hazards) regulations, 1984, and their revised COMAH form (Control of Major Accident Hazards) in 1999. The adoption of the Machinery Directive by the EU, in 1989, brought the requirement for a documented risk analysis in support of CE marking.
Nevertheless, these laws and requirements do not specify how one should go about establishing a target tolerable risk for an activity, nor do they address the methods of assessment of proposed designs nor provide requirements for specific safety-related features within design.
The need for more formal guidance has long been acknowledged. Until the mid 1980s risk assessment techniques tended to concentrate on quantifying the frequency and magnitude of consequences arising from given risks. These were sometimes compared with loosely defined target values but, being a controversial topic, such targets (usually in the form of fatality rates) were not readily owned up to or published.
EN 1050 (Principles of risk assessment), in 1996, covered the processes involved in risk assessment but gave little advice on risk reduction. For machinery control EN 954-1 (see Chapter 10) provided some guidance on how to reduce risks associated with control systems but did not specifically include PLCs (programmable logic controllers) which were separately addressed by other IEC (International Electrotechnical Commission) and CENELEC (European Committee for Electrotechnical Standardization) documents.
The proliferation of software during the 1980s, particularly in real-time control and safety systems, focused attention on the need to address systematic failures since they could not necessarily be quantified. In other words, whilst hardware failure rates were seen as a credibly predictable measure of reliability, software failure rates were generally agreed not to be predictable. It became generally accepted that it was necessary to consider qualitative defenses against systematic failures as an additional, and separate, activity to the task of predicting the probability of so-called random hardware failures.
In 1989, the HSE (Health and Safety Executive) published guidance which encouraged this dual approach of assuring functional safety of programmable equipment. This led to IEC work, during the 1990s, which culminated in the international safety Standard IEC 61508 – the main subject of this book. The IEC Standard is concerned with electrical, electronic and programmable safety-related systems where failure will affect people or the environment. It has a voluntary, rather than legal, status in the UK but it has to be said that to ignore it might now be seen as "not doing all that is reasonably practicable" in the sense of the Health and Safety at Work Act and a failure to show "due diligence". As use of the Standard becomes more and more widespread it can be argued that it is more and more "practicable" to use it. The Standard was revised and re-issued in 2010. Figure 1.1 shows how IEC 61508 relates to some of the current legislation.
The purpose of this book is to explain, in as concise a way as possible, the requirements of IEC 61508 and the other industry-related documents (some of which are referred to as 2nd tier guidance) which translate the requirements into specific application areas.
The Standard, as with most such documents, has considerable overlap, repetition, and some degree of ambiguity, which places the onus on the user to make interpretations of the guidance and, in the end, apply his/her own judgement.
The question frequently arises as to what is to be classified as safety-related equipment. The term 'safety-related' applies to any hard-wired or programmable system where a failure, singly or in combination with other failures/errors, could lead to death, injury or environmental damage. The terms "safety-related" and "safety-critical" are often used and the distinction has become blurred. "Safety-critical" has tended to be used where failure alone, of the equipment in question, leads to a fatality or increase in risk to exposed people. "Safety-related" has a wider context in that it includes equipment in which a single failure is not necessarily critical whereas coincident failure of some other item leads to the hazardous consequences.
A piece of equipment, or software, cannot be excluded from this safety-related category merely by identifying that there are alternative means of protection. This would be to pre-judge the issue and a formal safety integrity assessment would still be required to determine whether the overall degree of protection is adequate.
1.2 Quantitative and Qualitative Safety Targets
In an earlier paragraph we introduced the idea of needing to address safety-integrity targets both quantitatively and qualitatively:
Quantitatively: where we predict the frequency of hardware failures and compare them with some tolerable risk target. If the target is not satisfied then the design is adapted (e.g. provision of more redundancy) until the target is met.
Qualitatively: where we attempt to minimize the occurrence of systematic failures (e.g. software errors) by applying a variety of defenses and design disciplines appropriate to the severity of the tolerable risk target.
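The quantitative loop described above (predict, compare with the target, adapt the design with more redundancy until the target is met) can be sketched in a few lines of Python. This is only an illustration under loudly stated assumptions: the function names, the demand rate, the per-channel probability of failure on demand and the target are all hypothetical, and the channels are treated as fully independent, which ignores the common cause failures that a real assessment would have to include.

```python
from typing import Optional


def predicted_hazard_rate(demand_rate_pa: float, channel_pfd: float, channels: int) -> float:
    """Predicted hazardous-event frequency (per annum).

    ASSUMPTION (illustrative only): the redundant channels are identical and
    fail independently, so the protection fails on demand only when every
    channel has failed, with probability channel_pfd ** channels.
    """
    return demand_rate_pa * channel_pfd ** channels


def channels_needed(
    demand_rate_pa: float,
    channel_pfd: float,
    target_pa: float,
    max_channels: int = 4,
) -> Optional[int]:
    """Add redundancy one channel at a time until the target is met.

    Returns the smallest number of channels that satisfies the target,
    or None if the target is not met within max_channels.
    """
    for n in range(1, max_channels + 1):
        if predicted_hazard_rate(demand_rate_pa, channel_pfd, n) <= target_pa:
            return n
    return None


# Hypothetical figures, chosen only to exercise the loop.
n = channels_needed(demand_rate_pa=0.1, channel_pfd=5e-3, target_pa=1e-5)
print(f"Redundant channels needed to meet the target: {n}")
```

A result of None signals that redundancy alone does not reach the target within the allowed number of channels, in which case some other design change would be needed.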
It is important to understand why this twofold approach is needed. Prior to the 1980s, system failures could usually be identified as specific component failures (e.g. relay open circuit, capacitor short circuit, motor fails to start). However, since then the growth of complexity (including software) has led to system failures of a more subtle nature whose cause may not be attributable to a catastrophic component failure. Hence we talk of:
Random hardware failures: which are attributable to specific component failures and to which we attribute failure rates. The concept of "repeatability" allows us to model proposed systems by means of associating past failure rates of like components together to predict the performance of the design in question.
and
Systematic failures: which are not attributable to specific component failures and are therefore unique to a given system and its environment. They include design tolerance/timing-related problems, failures due to inadequately assessed modifications and, of course, software. Failure rates cannot be ascribed to these incidents since they do not enable us to predict the performance of future designs.
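The "repeatability" idea behind random hardware failure rates can be illustrated with a short Python sketch: past experience of like components gives a failure rate for each, and the rates are combined to predict the proposed design. Everything here is an assumption made for illustration: the record structure, the field data, the simple point estimate (failures divided by cumulative operating hours, which presumes a constant failure rate) and the series model in which any single component failure fails the system. No equivalent calculation exists for systematic failures, which is the point of the distinction drawn above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FieldRecord:
    """Hypothetical field experience for one type of component."""
    component: str
    failures: int        # failures observed in service
    unit_hours: float    # cumulative operating hours across all units


def failure_rate(record: FieldRecord) -> float:
    """Point estimate of failure rate (per hour), assuming a constant rate."""
    if record.unit_hours <= 0:
        raise ValueError("unit_hours must be positive")
    return record.failures / record.unit_hours


def series_failure_rate(records: Iterable[FieldRecord]) -> float:
    """Predicted failure rate of a design in which any one component
    failure causes system failure: the component rates simply add."""
    return sum(failure_rate(r) for r in records)


# Invented data, for illustration only.
history = [
    FieldRecord("relay", failures=5, unit_hours=1_000_000),
    FieldRecord("capacitor", failures=1, unit_hours=2_000_000),
    FieldRecord("motor", failures=10, unit_hours=500_000),
]
print(f"Predicted system failure rate: {series_failure_rate(history):.2e} per hour")
```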
Excerpted from Safety Critical Systems Handbook by David J Smith and Kenneth G L Simpson. Copyright © 2011 by Dr David J Smith and Kenneth G L Simpson. Excerpted by permission of Butterworth-Heinemann. All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.