This article discusses the design and implementation of safety-related control systems which deal with all types of electrical, electronic and programmable-electronic systems (including computer-based systems). The overall approach is in accordance with proposed International Electrotechnical Commission (IEC) Standard 1508 (Functional Safety: Safety-Related Systems) (IEC 1993).
Background
During the 1980s, computer-based systems—generically referred to as programmable electronic systems (PESs)—were increasingly being used to carry out safety functions. The primary driving forces behind this trend were (1) improved functionality and economic benefits (particularly considering the total life cycle of the device or system) and (2) the particular benefit of certain designs, which could be realized only when computer technology was used. During the early introduction of computer-based systems a number of findings were made:
- The introduction of computer control was poorly thought out and planned.
- Inadequate safety requirements were specified.
- Inadequate procedures were developed with respect to the validation of software.
- Evidence of poor workmanship was disclosed with respect to the standard of plant installation.
- Inadequate documentation was generated and not adequately validated with respect to what was actually in the plant (as distinct from what was thought to be in the plant).
- Less than fully effective operation and maintenance procedures had been established.
- There was evidently justified concern about the competence of persons to perform the duties required of them.
In order to solve these problems, several bodies published or began developing guidelines to enable the safe exploitation of PES technology. In the United Kingdom, the Health and Safety Executive (HSE) developed guidelines for programmable electronic systems used for safety-related applications, and in Germany, a draft standard (DIN 1990) was published. Within the European Community, an important element in the work on harmonized European Standards concerned with safety-related control systems (including those employing PESs) was started in connection with the requirements of the Machinery Directive. In the United States, the Instrument Society of America (ISA) has produced a standard on PESs for use in the process industries, and the Center for Chemical Process Safety (CCPS), a directorate of the American Institute of Chemical Engineers, has produced guidelines for the chemical process sector.
A major standards initiative is currently taking place within the IEC to develop a generically based international standard for electrical, electronic and programmable electronic (E/E/PES) safety-related systems that could be used by the many application sectors, including the process, medical, transport and machinery sectors. The proposed IEC international standard comprises seven Parts under the general title IEC 1508, Functional safety of electrical/electronic/programmable electronic safety-related systems. The various Parts are as follows:
- Part 1. General requirements
- Part 2. Requirements for electrical, electronic and programmable electronic systems
- Part 3. Software requirements
- Part 4. Definitions
- Part 5. Examples of methods for the determination of safety integrity levels
- Part 6. Guidelines on the application of Parts 2 and 3
- Part 7. Overview of techniques and measures.
When finalized, this generically based International Standard will constitute an IEC basic safety publication covering functional safety for electrical, electronic and programmable electronic safety-related systems. It will have implications for all IEC standards, in all application sectors, that concern the future design and use of such systems. A major objective of the proposed standard is to facilitate the development of standards for the various sectors (see figure 1).
Figure 1. Generic and application sector standards
PES Benefits and Problems
The adoption of PESs for safety purposes had many potential advantages, but it was recognized that these would be achieved only if appropriate design and assessment methodologies were used, for three reasons: (1) many of the features of PESs do not enable the safety integrity (that is, the safety performance of the systems carrying out the required safety functions) to be predicted with the same degree of confidence that has traditionally been available for less complex hardware-based (“hardwired”) systems; (2) even where a PES implemented relatively simple safety functions, the programmable electronics were significantly more complex than the hardwired systems they replaced, so that testing, while necessary, was not sufficient on its own; and (3) this rise in complexity meant that the design and assessment methodologies had to be given much more consideration than previously, and that the level of personal competence required to achieve adequate performance of the safety-related systems was consequently greater.
The benefits of computer-based PESs include the following:
- the ability to perform on-line diagnostic proof checks on critical components at a frequency significantly higher than would otherwise be the case
- the potential to provide sophisticated safety interlocks
- the ability to provide diagnostic functions and condition monitoring which can be used to analyse and report on the performance of plant and machinery in real time
- the capability of comparing actual conditions of the plant with “ideal” model conditions
- the potential to provide better information to operators and hence to improve decision-making affecting safety
- the use of advanced control strategies to enable human operators to be located remotely from hazardous or hostile environments
- the ability to diagnose the control system from a remote location.
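The comparison of actual plant conditions with “ideal” model conditions mentioned above can be sketched as a simple tolerance check. This is a minimal illustration only: the function name `deviations`, the measurement names, the numerical values and the use of a fixed tolerance per measurement are all assumptions made for this example and do not come from the proposed standard.

```python
from typing import Dict, List


def deviations(actual: Dict[str, float],
               model: Dict[str, float],
               tolerance: Dict[str, float]) -> List[str]:
    """Return the names of measurements that differ from the model value by
    more than their tolerance, or that are missing from the plant readings."""
    flagged = []
    for name, expected in model.items():
        if name not in actual or abs(actual[name] - expected) > tolerance[name]:
            flagged.append(name)
    return flagged


# Hypothetical plant readings, model predictions and tolerances.
actual = {"temperature_c": 131.0, "pressure_bar": 4.1}
model = {"temperature_c": 120.0, "pressure_bar": 4.0}
tolerance = {"temperature_c": 5.0, "pressure_bar": 0.5}
print(deviations(actual, model, tolerance))
```

A missing reading is flagged rather than ignored, on the assumption that the loss of a measurement is itself a condition the operator should be told about.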
The use of computer-based systems in safety-related applications creates a number of problems which need to be adequately addressed, such as the following:
- The failure modes are complex and not always predictable.
- Testing the computer is necessary but is not sufficient in itself to establish that the safety functions will be performed with the degree of certainty required for the application.
- Microprocessors may have subtle variations between different batches, and therefore different batches may display different behaviour.
- Unprotected computer-based systems are particularly susceptible to electrical interference (radiated interference, electrical “spikes” in the mains supply, electrostatic discharges, etc.).
- It is difficult and often impossible to quantify the probability of failure of complex safety-related systems incorporating software. Because no method of quantification has been widely accepted, software assurance has been based on procedures and standards which describe the methods to be used in the design, implementation and maintenance of the software.
Safety Systems under Consideration
The types of safety-related systems under consideration are electrical, electronic and programmable electronic systems (E/E/PESs). The system includes all elements, particularly signals extending from sensors or from other input devices on the equipment under control, and transmitted via data highways or other communication paths to the actuators or other output devices (see figure 2).
Figure 2. Electrical, electronic and programmable electronic system (E/E/PES)
The term electrical, electronic and programmable electronic device has been used to encompass a wide variety of devices and covers the following three chief classes:
- electrical devices such as electro-mechanical relays
- electronic devices such as solid state electronic instruments and logic systems
- programmable electronic devices, which include a wide variety of computer-based systems such as the following:
  - microprocessors
  - micro-controllers
  - programmable controllers (PCs)
  - application-specific integrated circuits (ASICs)
  - programmable logic controllers (PLCs)
  - other computer-based devices (e.g., “smart” sensors, transmitters and actuators).
By definition, a safety-related system serves two purposes:
- It implements the required safety functions necessary to achieve a safe state for the equipment under control or maintains a safe state for the equipment under control. The safety-related system must perform those safety functions that are specified in the safety functions requirements specification for the system. For example, the safety functions requirements specification may state that when the temperature reaches a certain value x, valve y shall open to allow water to enter the vessel.
- It achieves, on its own or with other safety-related systems, the necessary level of safety integrity for the implementation of the required safety functions. The safety functions must be performed by the safety-related systems with the degree of confidence appropriate to the application in order to achieve the required level of safety for the equipment under control.
This concept is illustrated in figure 3.
Figure 3. Key features of safety-related systems
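The example safety function given above (when the temperature reaches a value x, valve y shall open) can be sketched in a few lines. This is an illustration of the first purpose only, not a design fit for a real plant: the names `SafetyFunction`, `trip_temperature` and `valve_command`, the value 150.0, and the choice to treat an invalid sensor reading as a demand are all assumptions made here.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SafetyFunction:
    """Hypothetical safety function: open the water valve at high temperature."""
    trip_temperature: float  # the "value x" from the requirements specification

    def valve_command(self, temperature: Optional[float]) -> str:
        """Return "OPEN" or "CLOSED" for valve y given one temperature reading."""
        # Assumption: a missing or non-finite reading is treated as a demand,
        # so that a sensor fault drives the valve to the protective position.
        if temperature is None or not math.isfinite(temperature):
            return "OPEN"
        return "OPEN" if temperature >= self.trip_temperature else "CLOSED"


high_temp_trip = SafetyFunction(trip_temperature=150.0)  # illustrative value only
for reading in (120.0, 150.0, float("nan")):
    print(reading, high_temp_trip.valve_command(reading))
```

The code states what the function does. The second purpose, safety integrity, is not visible in it at all: it concerns the confidence with which the sensor, logic and valve will actually carry out this function when demanded.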
System Failures
In order to ensure safe operation of E/E/PES safety-related systems, it is necessary to recognize the various possible causes of safety-related system failure and to ensure that adequate precautions are taken against each. Failures are classified into two categories, as illustrated in figure 4.
Figure 4. Failure categories
- Random hardware failures are those failures which result from a variety of normal degradation mechanisms in the hardware. There are many such mechanisms occurring at different rates in different components, and since manufacturing tolerances cause components to fail on account of these mechanisms after different times in operation, failures of a total item of equipment comprising many components occur at unpredictable (random) times. Measures of system reliability, such as the mean time between failures (MTBF), are valuable but are usually concerned only with random hardware failures and do not include systematic failures.
- Systematic failures arise from errors in the design, construction or use of a system which cause it to fail under some particular combination of inputs or under some particular environmental condition. If a system failure occurs when a particular set of circumstances arises, then whenever those circumstances arise in the future there will always be a system failure. Any failure of a safety-related system which does not arise from a random hardware failure is, by definition, a systematic failure. Systematic failures, in the context of E/E/PES safety-related systems, include:
  - systematic failures due to errors or omissions in the safety functions requirements specification
  - systematic failures due to errors in the design, manufacture, installation or operation of the hardware, including failures arising from environmental causes and human (e.g., operator) error
  - systematic failures due to faults in the software
  - systematic failures due to maintenance and modification errors.
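The difference between the two categories can be made concrete with a short sketch. It assumes a constant failure rate (the exponential model, which the article does not prescribe) to turn an MTBF into a probability of random hardware failure, and it models a systematic fault as a fixed set of triggering inputs. The function names and all numbers are hypothetical.

```python
import math


def failure_probability(mtbf_hours: float, period_hours: float) -> float:
    """Probability of at least one random hardware failure within the period.

    Assumes a constant failure rate of 1 / MTBF (exponential model).
    """
    if mtbf_hours <= 0 or period_hours < 0:
        raise ValueError("MTBF must be positive and the period non-negative")
    return 1.0 - math.exp(-period_hours / mtbf_hours)


def systematic_failure(inputs: tuple, triggering_inputs: set) -> bool:
    """A systematic fault is deterministic: it fails whenever the trigger recurs."""
    return inputs in triggering_inputs


# Illustrative numbers only: an MTBF of 100,000 h over one year (8,760 h).
p_random = failure_probability(100_000, 8_760)
print(f"random hardware failure within a year: {p_random:.3f}")

# Hypothetical design error: the system fails for one input combination.
trigger = {("pump_on", "valve_closed")}
print(systematic_failure(("pump_on", "valve_closed"), trigger))
```

The MTBF-based figure says nothing about the second function: however reliable the hardware, the design error produces a failure every time the triggering combination occurs.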
Protection of Safety-Related Systems
The precautionary measures that protect a safety-related system against random hardware failures are termed hardware safety integrity measures, and those that protect it against systematic failures are termed systematic safety integrity measures. Taken together, the precautions that a safety-related system can bring to bear against both kinds of failure are termed safety integrity. These concepts are illustrated in figure 5.
Figure 5. Safety performance terms
Within the proposed international standard IEC 1508 there are four levels of safety integrity, denoted Safety Integrity Levels 1, 2, 3 and 4. Safety Integrity Level 1 is the lowest safety integrity level and Safety Integrity Level 4 is the highest. The Safety Integrity Level (whether 1, 2, 3 or 4) for the safety-related system will depend upon the importance of the role the safety-related system is playing in achieving the required level of safety for the equipment under control. Several safety-related systems may be necessary, some of which may be based on pneumatic or hydraulic technology.
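The ordering of the four levels can be sketched as an ordered enumeration. The numeric bands below are an assumption added for illustration: they are the low-demand bands for average probability of failure on demand given in the later published standard IEC 61508, and are not stated in this article or attributed here to the 1993 draft. The names `SIL`, `PFD_BANDS`, `sil_for_pfd` and `meets` are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class SIL(IntEnum):
    """Safety Integrity Levels; 1 is the lowest and 4 the highest."""
    SIL1 = 1
    SIL2 = 2
    SIL3 = 3
    SIL4 = 4


# Assumption: low-demand bands for the average probability of failure on
# demand, taken from the later published IEC 61508 (not from this article).
# Each entry is (lower bound inclusive, upper bound exclusive).
PFD_BANDS = {
    SIL.SIL1: (1e-2, 1e-1),
    SIL.SIL2: (1e-3, 1e-2),
    SIL.SIL3: (1e-4, 1e-3),
    SIL.SIL4: (1e-5, 1e-4),
}


def sil_for_pfd(pfd_avg: float) -> Optional[SIL]:
    """Return the SIL whose band contains pfd_avg, or None if outside all bands."""
    for level, (low, high) in PFD_BANDS.items():
        if low <= pfd_avg < high:
            return level
    return None


def meets(achieved: SIL, required: SIL) -> bool:
    """A system meets the requirement if its level is at least the required one."""
    return achieved >= required


print(sil_for_pfd(5e-3), meets(SIL.SIL3, SIL.SIL2))
```

A band of this kind addresses only the quantifiable, random hardware part of safety integrity. As noted earlier, the systematic part is generally addressed through procedures and methods rather than through a number.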
Design of Safety-Related Systems
A recent analysis of 34 incidents involving control systems (HSE) found that 60% of all cases of failure had been “built in” before the safety-related control system had been put into use (figure 7). Consideration of all the safety life cycle phases is necessary if adequate safety-related systems are to be produced.
Figure 7. Primary cause (by phase) of control system failure
Functional safety of safety-related systems depends not only on ensuring that the technical requirements are properly specified but also on ensuring that they are effectively implemented and that the initial design integrity is maintained throughout the life of the equipment. This can be realized only if an effective safety management system is in place and the people involved in any activity are competent with respect to the duties they have to perform. An adequate safety management system is particularly essential when complex safety-related systems are involved. This leads to a strategy that ensures the following:
- An effective safety management system is in place.
- The technical requirements that are specified for the E/E/PES safety-related systems are sufficient to deal with both random hardware and systematic failure causes.
- The competence of the people involved is adequate for the duties they have to perform.
In order to address all the relevant technical requirements of functional safety in a systematic manner, the concept of the Safety Lifecycle has been developed. A simplified version of the Safety Lifecycle in the emerging international standard IEC 1508 is shown in figure 8.
Figure 8. Role of the Safety Lifecycle in achieving functional safety
The key phases of the Safety Lifecycle are:
- specification
- design and implementation
- installation and commissioning
- operation and maintenance
- changes after commissioning.
Level of Safety
The design strategy for the achievement of adequate levels of safety integrity for the safety-related systems is illustrated in figure 9 and figure 10. A safety integrity level is based on the role the safety-related system is playing in the achievement of the overall level of safety for the equipment under control. The safety integrity level specifies the precautions that need to be taken into account in the design against both random hardware and systematic failures.
Figure 9. Role of safety integrity levels in the design process
Figure 10. Role of the Safety Lifecycle in the specification and design process
The concept of safety and level of safety applies to the equipment under control. The concept of functional safety applies to the safety-related systems. Functional safety for the safety-related systems has to be achieved if an adequate level of safety is to be achieved for the equipment that is giving rise to the hazard. The specified level of safety for a specific situation is a key factor in the safety integrity requirements specification for the safety-related systems.
The required level of safety will depend upon many factors, for example the severity of injury, the number of people exposed to danger, the frequency with which people are exposed to danger and the duration of the exposure. The perception and views of those exposed to the hazardous event are also important factors. In arriving at what constitutes an appropriate level of safety for a specific application, a number of inputs are considered, which include the following:
- legal requirements relevant to the specific application
- guidelines from the appropriate safety regulatory authority
- discussions and agreements with the different parties involved in the application
- industry standards
- national and international standards
- the best independent industrial, expert and scientific advice.
Summary
When designing and using safety-related systems, it must be remembered that it is the equipment under control that creates the potential hazard. The safety-related systems are designed to reduce the frequency (or probability) of the hazardous event and/or the consequences of the hazardous event. Once the level of safety has been set for the equipment, the safety integrity level for the safety-related system can be determined, and it is the safety integrity level that allows the designer to specify the precautions that need to be built into the design to be deployed against both random hardware and systematic failures.