Thursday, March 29, 2007

CSB Report of BP Texas accident - detail

The 337 Page report is now available at the Chemical Safety Board's website.

Below is a reasonably detailed summary of the report. I have also put together a much briefer overview.

On March 23, 2005, at 1:20 p.m., the BP Texas City Refinery suffered one of the worst industrial disasters in recent U.S. history. Explosions and fires killed 15 people and injured another 180, alarmed the community, and resulted in financial losses exceeding $1.5 billion. The incident occurred during the startup of an isomerization (ISOM) unit when a raffinate splitter tower was overfilled; pressure relief devices opened, resulting in a flammable liquid geyser from a blowdown stack that was not equipped with a flare. The release of flammables led to an explosion and fire. All of the fatalities occurred in or near office trailers located close to the blowdown drum.

The Texas City disaster was caused by organizational and safety deficiencies at all levels of the BP Corporation. Warning signs of a possible disaster were present for several years, but company officials did not intervene effectively to prevent it. The extent of the serious safety culture deficiencies was further revealed when the refinery experienced two additional serious incidents just a few months after the March 2005 disaster.

There were numerous cultural, human factors, and organizational causes of the disaster. One underlying cause was that BP used inadequate methods to measure safety conditions at Texas City. For instance, a very low personal injury rate at Texas City gave BP a misleading indicator of process safety performance. In addition, while most attention was focused on the injury rate, the overall safety culture and process safety management (PSM) program had serious deficiencies.

Cost-cutting and failure to invest in the 1990s by Amoco and then BP left the Texas City refinery vulnerable to a catastrophe.


Key Technical Findings

PROCEDURES
* Many deviations from written procedures occurred. These were not isolated actions but resulted from established work practices, frequently adopted to protect unit equipment and complete the startup in a timely and efficient manner.
* Management did not ensure procedures were updated, incorporated learning from incidents, or were adapted to cover unique startup circumstances.
* There was no effective management of change of procedures.
* Management actions (or inactions) sent a strong message to operations personnel that procedures were not strict instructions but were outdated documents to be used as guidance.
* The ISOM startup procedure was not followed and no record was made of steps completed. As a result, a key valve was left shut, preventing liquid from leaving the raffinate splitter tower.
* The procedure required filling the tower to a 50% level. Previous experience had shown it needed to be filled higher because the level would typically drop significantly during startup. The level reading showed 99%, but the actual level was well off scale. A high level alarm activated at 72%, but a second high level switch was faulty, which operators did not notice.

PRE-STARTUP CHECKS
* A rigorous pre-startup procedure required all startups after turnarounds to go through a Pre-Startup Safety Review (PSSR). The process safety coordinator for the ISOM was unfamiliar with its applicability, and therefore no PSSR was conducted.
* The PSSR is a formal review carried out by a technical team led by the operations superintendent and signed off by senior management. It involves verification of all safety systems and equipment, including procedures and training, process safety information, alarms and equipment functionality, and instrument testing and calibration. It also verifies that all non-essential personnel have been removed from the unit and neighboring units and that the operations crew has reviewed the startup procedure.
* BP guidelines state that unit startup requires a thorough review of startup procedures by operators and supervisors; but this was not performed or checked off.
* The startup procedure covered the scenario of one continuous startup. In reality the startup was paused, part of the plant was shut down, and the startup was later resumed.

MAINTENANCE
* Faulty equipment, including level indicators and control valves, had been identified but not repaired.
* BP supervisors deemed there was not enough time during the turnaround to make the necessary repairs.
* BP supervisors stopped technicians from checking alarms and instruments because there was not enough time to complete the checks before the unit was due to start.
* The same supervisors then signed off the startup procedure, which required that all control valves had been tested and were operational prior to startup.

CONTROL SYSTEM
* Level indicator showed the tower level declining when it was actually overfilling.
* Redundant high level alarm did not activate.
* Tower was not equipped with any other level indications or automatic safety devices.
* The control board display did not provide adequate information on the imbalance of flows in and out of the tower to alert the operators to the dangerously high level.

CONTROL SYSTEM INTERFACE
* The reading of how much liquid raffinate was entering the unit was on a different screen from the one showing how much raffinate product was leaving the unit. This made it difficult to identify a discrepancy (the Texaco Pembroke explosion is referenced).
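The discrepancy the operators needed to spot is a simple material balance: liquid in, minus liquid out, accumulated over time. A minimal sketch of that check (the function name and all figures are illustrative, not taken from the report):

```python
# Illustrative material-balance check: flows in and out of the tower
# were shown on separate screens, so this simple sum was never visible.
# All names and numbers here are hypothetical, not from the CSB report.

def net_accumulation(flow_in_bpm, flow_out_bpm, minutes):
    """Net liquid accumulated in the tower (barrels) over a period,
    given inlet and outlet flows in barrels per minute."""
    return (flow_in_bpm - flow_out_bpm) * minutes

# Feed entering at 20 barrels/min with the outlet effectively shut:
print(net_accumulation(20.0, 0.0, 180))  # prints 3600.0, so the tower must be filling
```

Putting both flows on one display, or computing this difference automatically, would have made the rising level obvious.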

MANNING
* There was a lack of supervisory oversight and technically trained personnel during the startup. This was despite analysis carried out by Amoco, based on 15 previous incidents, showing that incidents were 10 times more likely during startup than during normal operation. Site guidelines stated that supplementary assistance should be present during startup, including additional board operators.
* One supervisor had to leave site due to a family emergency. No one was assigned to provide effective cover.

SHIFT HANDOVER
* Supervisors and operators poorly communicated critical information during the shift turnover (handover);
* The night shift operator left early. The subsequent shift handover was brief because it did not involve the person who had done the work.
* Records in the shift log were brief and ambiguous, and were misinterpreted by the incoming shift. This was exacerbated by the previous shift operators' failure to record the steps completed on the startup procedure.
* BP did not have a shift turnover communication requirement for its operations staff.


FATIGUE
* Operators had worked 12-hour shifts for 29 or more consecutive days.
* Operators had been getting about five hours of sleep per night.
* BP had no corporate or site-specific fatigue prevention policy or regulations.
* “Operators were expected to work” the 12-hour, 7-days-a-week turnaround schedule, although they were allowed time off if they had scheduled vacation, used personal/vacation time, or had extenuating circumstances that would be considered on a “case-by-case” basis.

COMMUNICATION
* Key messages were not written down, but were passed verbally over phone and radio.
* The board and outside operators interpreted a message about the routing of raffinate differently: the board operator closed a control valve, and the outside operator then manually opened it.

TRAINING
* The operator training program was inadequate, particularly regarding the hazards of unit startup.
* Training for abnormal situations was insufficient.
* Training consisted of on-the-job instruction, which covered primarily daily, routine duties.
* Startup or shutdown procedures would be reviewed only if the trainee happened to be scheduled for training at the time the unit was undergoing such an operation.
* BP’s computerized tutorials provided factual and often narrowly focused information, such as which alarm corresponded to which piece of equipment or instrumentation. This type of information did not provide operators with knowledge of the process or safe operating limits.
* BP training program did not include specific instruction on the importance of calculating material balances, and the startup procedures did not discuss how to make such calculations.
* Managers did not effectively conduct performance appraisals to determine the knowledge level and training development plans of operators.
* The central training department staff had been reduced from 28 to eight.
* Simulators were unavailable for operators to practice handling abnormal situations, including infrequent and high hazard operations such as startups and unit upsets.

PLANT
* The process unit was started despite previously reported malfunctions of the tower level indicator, level sight glass, and a pressure control valve.
* The size of the blowdown drum was insufficient to contain the liquid sent to it by the pressure relief valves.
* Neither Amoco nor BP replaced blowdown drums and atmospheric stacks, even though a series of incidents warned that this equipment was unsafe.

SAFE OPERATING LIMITS
* ISOM operating limits did not include limits for high level in the raffinate splitter tower.
* BP had developed an electronic system for monitoring operation outside the defined envelope. However, the feature that would alert operators when this occurred had not been activated.

RISK MANAGEMENT
* Occupied trailers were sited too close to a process unit handling highly hazardous materials. All fatalities occurred in or around the trailers.
* Eight previous serious releases of flammable material from the ISOM blowdown stack had not been investigated.
* BP Texas City managers did not effectively implement their pre-startup safety review policy to ensure that nonessential personnel were removed from the area during startup.

ORGANISATIONAL FAILURES

COST-CUTTING – failure to invest and production pressures

BOARD OF DIRECTORS – No director responsible for assessing and verifying the performance of BP’s major accident hazard prevention programs.

SAFETY PERFORMANCE - Reliance on the low personal injury rate rather than on indicators of process safety performance and the health of the safety culture.

MECHANICAL INTEGRITY - “run to failure” of process equipment at Texas City.

CHECK BOX MENTALITY - Personnel completed paperwork and checked off on safety policy and procedural requirements even when those requirements had not been met.

CULTURE – lack of reporting and learning culture. Personnel not encouraged to report safety problems and some feared retaliation for doing so. Lessons not captured or acted upon, including those from other sites and organisations.

FAILURE TO ACT - Numerous surveys, studies, and audits identified deep-seated safety problems at Texas City, but the response of BP managers at all levels was typically “too little, too late.”

MANAGEMENT OF CHANGE - BP Texas City did not effectively assess changes involving people, policies, or the organization that could impact process safety.

CSB Report of BP Texas accident - overview

The 337 Page report is now available at the Chemical Safety Board's website. The executive summary seems quite comprehensive and readable, but a quick scan of the main report suggests that there is more to learn if you dig deep enough.

From what I have read so far, the key issues were:

* Procedures - did not reflect how tasks were done in practice, and were not really used for the startup
* Pre-start checks - a comprehensive program of checks was specified but not carried out
* Maintenance - faulty equipment was not repaired during the turnaround because supervisors decided there was not enough time
* Control system - indicators and alarms were not working
* Interface - information to carry out a mass balance was not available on a single screen (exactly the same as the Texaco Pembroke accident)
* Manning - failure to provide extra personnel for startup
* Shift handover - insufficient discussion and poor log keeping
* Fatigue - operators working on the turnaround, 12-hour shifts for 30 days without a break
* Communication - critical messages passed verbally and misunderstood
* Training - mostly on the job with no training for abnormal situations, including startup.
* Poor plant design
* Operating limits - failure to identify all key operating limits and to monitor operations
* Poor risk management - including siting of trailers and failure to remove non-essential personnel during start-up
* Multiple organisational failures - as identified in Baker report

I've also put together a more detailed summary here

Wednesday, March 21, 2007

Shift handover and shift log software

I realised a long time ago from my studies of the Piper Alpha inquiry that shift handover is a critical activity that can contribute to major accidents. However, it has not received much attention in the past, possibly because it has fallen into the category of 'too hard.' The Buncefield inquiry has also identified shift handover as an issue, and it seems likely that it will receive a higher profile now.

I met up with the guys from Infotechnics last week to look at their shift log and handover software called Opralog. I was very impressed. It seems to be very easy to use but provides a great deal of power that takes it well beyond being simply a tool for assisting handovers. In particular, it allows companies to start logging events from the perspective of what needs to be done to deal with them, rather than simply what happened to plant and equipment.

Opralog's main features (as I see it) include:

* Predefined events mean operators and technicians have less to write, so they are more inclined to record useful information;
* Interfacing with plant data allows text descriptions to be recorded to explain observed plant events
* Events can be logged automatically, triggered by plant data (e.g. if a parameter exceeds a certain value) which prompts the operator or technician to record an explanation
* Logs can feed into each other - for example, certain parts of operator logs can populate part of their supervisor's log.
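The data-triggered logging described above can be sketched generically. This is only a sketch of the concept, not Opralog's actual interface; every name and threshold below is hypothetical:

```python
# Generic sketch of data-triggered event logging: when a plant parameter
# exceeds its configured threshold, an entry is created that prompts the
# operator for an explanation. Not Opralog's real API.

def check_triggers(readings, thresholds, log):
    """Append a prompting log entry for each reading over its threshold."""
    for tag, value in readings.items():
        limit = thresholds.get(tag)
        if limit is not None and value > limit:
            log.append({"tag": tag, "value": value,
                        "note": "operator explanation required"})
    return log

log = check_triggers({"tower_level_pct": 99.0, "feed_temp_c": 40.0},
                     {"tower_level_pct": 72.0}, [])
print(log)  # one entry, prompting an explanation for the high tower level
```

The point of the design is that the plant data supplies the "what happened" automatically, leaving the operator to supply only the "why".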

Well worth a look.

You can find out more about shift handover at my website

Monday, February 26, 2007

Too much training

Scotland were heavily defeated by Italy in the Six Nations rugby union on Saturday. Scotland had a disastrous start, with Italy intercepting three times (a chip-over kick, a pop-pass and a long pass) in the first 7 minutes and scoring a try each time.

I'm no expert on rugby, but it looked to me as though Scotland had been practising these manoeuvres, but with no opposition in training. One of the BBC commentators later made the same point.

I think parallels can be drawn with industrial settings. We know training is important, but we often fail to provide the right training. This case highlights that whilst skills are important, a training programme can backfire if people are not able to make the correct decisions about which skills to use and when. In at least two of the three cases (the chip kick and the long pass) an interception is an obvious risk. What Scotland failed to do was consider whether the risk was worthwhile. Close to Italy's try line it probably would have been, as they had a good chance of scoring points and more chance of recovering from an interception before Italy scored. As they were close to their own line, the benefit was much smaller and the risk much higher.

This is one area where companies get it wrong with simulators. They ask for 'high fidelity' versions that allow people to gain skills in operating the plant. Unfortunately, the complexity and cost of these simulators mean that more time is spent gaining skills, leaving relatively little time to practise decision making. Conversely, 'low fidelity' simulators do not provide the opportunity to gain plant operating skills, but they leave a lot more time for practising decision making, problem solving etc.

Andy Brazier

Friday, February 16, 2007

TLAs

Jeremy Clarkson's article in the InGear section of the Sunday Times on 11 February 2007 discussed how some Three Letter Acronyms take longer to say than the full versions. Given that communication can be critical to safety, these examples may be useful to illustrate scenarios.

www = 9 syllables whereas world wide web = 3

From the army

IED = improvised explosive device = bomb
ACV = armoured combat vehicle = tank
ADW = air defence warning = siren

In business China becomes PRC

And Jeremy adds one that is probably libellous: IFA = thief.

Leading indicators of safety performance part 2

Following a comment I received on my previous post on the topic, I have given some further thought to leading indicators.

When you talk about performance indicators people often say they need to be S.M.A.R.T - Simple/sensible; measurable; attainable; realistic; time-based.

But there is a counter argument (I am sure I read this as a quote somewhere once) that says 'all that is important cannot be measured and things that can be measured are not always important.'

The comment made on my earlier post says that lagging indicators need to be based on actual consequences so that they are precise, accurate, difficult to manipulate and easily understood. Therefore near misses and high potential incidents cannot be used as indicators. This is an interesting point, and now that I have had time to think about it I am fairly sure it is correct. I still maintain that a huge amount can be learnt from near misses, but agree that this is not the same as using them to provide performance indicators.

The comment also said that leading indicators are certainly more difficult than lagging ones, but if we ask the people who are close to the risk and working with it every day, we will very quickly get a good indication of which of our systems are weak. Then we can hang indicators on those systems to drive improvement.

So from this I conclude that

1. Our traditional lagging indicators are useful, and there is probably no need to look for many new ones.
2. Leading indicators can be identified, but they need to be fluid in order to reflect the issues most relevant to an organisation at any time.
3. Near misses are an excellent source of important information but do not provide data that we can use to measure performance.

Wednesday, February 14, 2007

Behavioural safety - IOSH branch presentation

Presentation by Nick Wharton of JMOC at the IOSH Manchester Branch 13 February 2007

Nick gave a very good presentation covering the basics of behavioural safety. He is a good and entertaining presenter, and clearly very experienced. Although he is obviously quite evangelical about behavioural safety, he was also very honest that it is impossible to establish a direct link between introducing a behavioural programme and improving safety performance. This is quite a contrast to some presentations I have seen where behaviour modification is claimed to be the answer to safety.

I am fairly ambivalent about behavioural safety. I consider it to be a useful tool in the safety toolbox, but have concerns that companies often put all their effort and resources into it at the expense of other approaches. In fact, some of Nick's figures showed how much continued effort is required. It is not just a case of maintaining a level of effort; it seems you need to keep increasing effort, otherwise safety performance starts to drop. I wonder how sustainable this can be.

Nick suggested that behaviour modification is applicable to process safety as well as personal safety. I am far from convinced about this. However, Nick did say that behavioural safety should not be used until good systems are in place. Perhaps process safety systems are not yet well enough developed, and so there is more to do before we can try behavioural safety. My question is whether systems will ever be good enough, and I feel effort spent on systems may always be more beneficial than effort spent on behaviours.

The role of motivation

I was rather horrified at comments made by someone at the IOSH Hazardous Industries specialist group meeting on 13 Feb 2007.

This person is very senior in an organisation and is ex-HSE. He said that the number of slip/trip/fall accidents on site had suddenly started to increase. There had been talk about changes to company ownership and pension arrangements, so morale was low. This was considered to be the cause of the increased accident rate. Departmental managers went to speak to their staff to "explain what is expected of them." He reported that the number of accidents then fell.

I have a number of problems with this. The first is that I am not aware of poor morale being a significant cause of slip/trip/fall accidents. I can see that people may be a bit distracted, so it may have some influence, but I would expect this to be minor. However, I am sure that morale has a very big influence on whether people will come to work when injured (if you are fed up at work, any excuse to stay off is welcomed) and on whether they report accidents (with an eye to a claim). Also, if morale was the cause, how could the managers' message that staff are expected to pay attention to what they are doing have any impact on that cause?

Leading indicators of safety performance

Ian Travers of HSE presented at the IOSH Hazardous Industries specialist group, 13 February 2007. He was referencing a new guidance document HSG254 on leading indicators.

A good analogy was presented. If someone arrives late for a meeting, you have a lagging indicator of failure. Arriving on time may be a leading indicator. Thinking about this, I would say arriving on time is not a true indicator. However, if we knew what speed the person had driven on the journey, we would know either that they had plenty of time to arrive safely (they kept below the speed limits) or that they had been in a rush (above the limits at times), giving us a leading indicator of how safe the arrangements really had been, even though success had been achieved.

Unfortunately, putting this into practice in an industrial setting seems very difficult. It was notable that Ian did not provide any good examples in his presentation. He did give some idea of how to go about identifying indicators, with a key steer being to have a good idea of "what success looks like." This needs to be considered carefully: for example, it is not a case of saying what a permit-to-work looks like, but what the system is intended to deliver.

Ian said there was a need for better indicators because audits are usually too infrequent and compliance-focused, whilst workplace inspection rarely addresses critical controls. However, before going down the route of developing leading indicators it is important to answer the following three questions:
1. Why do it?
2. What will you do with the data?
3. How will it influence safety performance?

It is unlikely that there will be any generic leading indicators. Even company-wide indicators are unlikely to work: covering all the requirements would need more indicators than could be used effectively.

At the end of the presentation I concluded that leading indicators may not be the solution we are looking for (at least at this time). Whilst in theory they are exactly what is required, putting them into practice is very difficult. We may do better by using lagging indicators more effectively, especially learning from near misses and process disturbances.

Monday, February 12, 2007

BP's integrity management

From the IChemE's Safety and Loss Prevention Subject Group Newsletter, Winter 2007: a report from a seminar on 28 September 2006 titled 'Asset integrity management in the process industry'.

Peter Elliott presented BP's integrity management standard. This is built on the principle that accidents occur because of simultaneous failures of the three layers of protection: plant, processes and people. BP's integrity management standard is part of the company's operating management system and includes the following 10 elements.

1. Accountabilities
2. Competence
3. Hazard evaluation and risk management
4. Facilities and process integrity
5. Protective devices
6. Practices and procedures
7. Management of change
8. Emergency response
9. Incident investigation and learning
10. Performance management

Wednesday, February 07, 2007

Who reads procedures & other safety documents?

I think there is a rule of thumb that says 20% of people will never read anything you give them. If we extend that rule, then of the 80% who read the first page, 20% are unlikely to turn the page, and so on. See how this pans out below:

Page 1 80%
Page 2 64%
Page 3 51%
Page 4 41%
Page 5 33%
Page 6 26%
Page 7 21%
Page 8 17%
Page 9 13%
Page 10 11%

So if the document you give them is more than 3 pages long, less than 50% are likely to make it to the end.
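The table is just a geometric decay, with each page losing 20% of the readers who reached it; a quick sketch that reproduces the figures:

```python
# Reproduce the page-readership table: 20% of remaining readers are
# lost at each page (an illustrative rule of thumb, not measured data).

readers = 100.0
for page in range(1, 11):
    readers *= 0.8
    print(f"Page {page:2d}: {readers:.0f}%")  # Page 1: 80% ... Page 10: 11%
```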

This is not based on any studies or science, but I think it is not a bad representation of reality. It highlights that we need to address the fact that 20% will read nothing. But if we make sure all the most important information is in the first page or two, at least the majority will read it.

Also, this shows that about 10% of people will read everything. If these are the key influencers for the rest of the population, this can be pretty significant.

Andy Brazier

Tuesday, February 06, 2007

Business continuity planning

I attended a short talk given by Lynne Hughes of Conwy council yesterday. The Civil Contingencies Act 2004 places duties on agencies likely to be involved in dealing with major incidents to assess risks, have plans, warn and inform, share information and co-operate.

North Wales councils have got together to publish a booklet. Conwy council have a website

Andy Brazier

Are good leaders born or created

Article in Sunday Times 4 February 2007 by Mary Braid

Leadership defined as "the ability to inspire others to strive and enable others to accomplish great things." In this case they nurture the talent of others.

The debate is whether it is something you are born with or can be learnt. An alternative view is that leadership has to be "earned" by gaining respect from others.

Whatever the truth, it is pretty clear that one style of leadership does not fit all circumstances. In some cases command and control is appropriate, but in many it is not. Now that a lot of responsibility has been devolved down the chain of command, more people need leadership qualities; using initiative, creativity and innovation.

One problem is that we want more people to take leadership, and this includes taking risks. Unfortunately failure, which is part of risk, still tends to be punished. Also, when people are taught leadership (if that is possible) they are given an idealised view of the world, which often does not have much bearing on real life.

In real life it takes one set of skills to work your way to the top and another to be effective when you get there. "That is why good governance is so important" to ensure talented and effective leaders make it to the top.

Andy Brazier

Monday, January 29, 2007

Error detection

Ergonomics journal April 2006

'Error detection: a study in anesthesia' A Nyssen and A Blavier

An accident reporting system was developed and used to collect information about error detection patterns. A significant relationship was found between the type of error and the error detection mode, and between the type of error and the level of training of the anesthetist who committed the error.

It is not possible to prevent every error, so reducing their consequences is important, which requires detection. This has not been studied much, and most of the existing studies have been on simple tasks in laboratory settings.

Six modes of detection were identified
1. Standard check (routine monitoring of the environment)
2. Recognising outcome signs
3. Suspicion from knowledge
4. Interpolation (by someone else, not person who committed error)
5. Alarm sounds
6. By chance

Standard checks were found to be the most prominent detection mode, followed by recognising outcome signs and alarms. This suggests checking has become part of routine activity, and because it is non-specific it enables staff to pick up many different types of error. It was found that the risk associated with the operation made little difference to the detection strategies employed.

More experienced staff become better at detecting a wider range of error types. It is suggested this is because they have more control over what they are doing and so can do more diverse things, rather like an experienced musician being able to improvise more.

The authors suggest the findings from this study could have significant implications for safety. At present, when errors occur in complex systems the tendency is to add policies and procedures, which restrict actions and narrow the range of action options. This does not reduce the complexity of the system, but may reduce the opportunities people have to detect errors. Therefore, it may be better to concentrate on building experience in order to increase the likelihood of error detection, rather than restricting activities with more procedures.


Error recovery

Ergonomics journal April 2006

'Error recovery in a hospital pharmacy' L Kanse, TW van der Schaaf, ND Vrijland and H van Mierlo

A field study involving confidential reports and follow-up interviews on near miss incidents where recovery had occurred.

Near misses have the same underlying failure factors as accidents; harm is avoided sometimes by coincidence, but more often by timely detection. Understanding recovery factors offers a way of improving safety in addition to preventing errors in the first place. This is achieved by introducing system characteristics that build in or strengthen opportunities for recovery.

Recovery involves detection followed by a combination of explanation of the causes and putting in countermeasures aimed at returning to normal or at least limiting the consequences.

Although there have only been limited studies, they have shown that person-related factors such as experience and knowledge are important in recovery, as are technical factors such as the design of the workplace and equipment interfaces, and organisational factors such as culture, work design, procedures and management priorities. In fact, the organisational factors may be the most important because they determine the context in which people work and hence are likely to affect the other factors.

Recoveries can be planned or unplanned. Planned recoveries involve the activation of built-in barriers, such as automatic safety controls or procedures that are implemented under certain conditions. Unplanned recoveries are more ad hoc and depend on the creative problem-solving capabilities of the people involved.

The factors that influence planned vs unplanned are likely to be quite different.

Looking at pharmacy errors, the study found that procedural checks had started to lapse because people perceived that so many checks were being carried out that it was too much, and they had become less aware of their importance. Other factors that contributed to a lack of recovery included management priorities, shortcomings or lack of procedures, insufficient transfer of knowledge and information, technical failures, and factors related to the design of the computer systems used for medication preparation, with shortcomings or lack of procedures being the most significant.

On the positive side, many of the recoveries were made by nursing staff, not the pharmacy staff who made the error. This was down to procedural checks and, in some cases, to staff whose knowledge led them to realise something was wrong.

To improve recovery it is important that management make sure everyone understands how important procedural checks are. A near miss reporting system is a useful way of demonstrating this.

Clearly, there are opportunities to improve procedures. The computer system could also be improved to better detect conflicts in data. Where the potential severity of errors is great, double or even triple procedural checks are important; but as not all errors are foreseeable, employees must also have the knowledge to detect and recover from problems without the support of procedures. Special training may be needed, using problem scenarios inspired by reported near misses, to maintain everyone's creative problem-solving skills.

Andy Brazier

Executive's attitudes to their staff

Article in The Observer 28 January 2007

'What sort of boss gives a monkey's about his staff?' by Simon Caulkin

A survey by management consultancy Hudson found that three-quarters of senior executives would carry out an annual cull of their workforce to boost productivity and performance (although only 4% actually do this). One in six think they could get rid of 20% of staff without any damage to performance or morale, and nearly half think firing up to 5% of staff would be a good thing.

Another study, by the Chartered Institute of Personnel and Development, found that 38% of employees feel senior managers and directors treat them with respect, and 66% don't trust them. About a quarter of employees rarely or never look forward to going to work, and almost half are leaving or trying to leave. It is like 'a marriage under stress, characterised by poor communications and low levels of trust.'

Another study, by Gallup, found that poor management means workers become more disaffected the longer they are in a job, so that 'human assets that should increase in value with training and development instead depreciate as companies fail to maximise this investment.'

Some companies (GE and Microsoft) regularly rank staff performance with an eye to getting rid of the poorest performers, but there is no evidence this works. The approach assumes that team performance is simply the sum of its parts, which is rarely the case.

Sometimes companies do need to get rid of people, particularly if they have been incompetently recruited. However, if this becomes routine it creates fear and unhealthy competition. An example is quoted from US nursing: units with the most 'talented' nurses and an attitude that heads will roll if people make mistakes tend not to learn, because errors are covered up and not reported. However, teams that work together more effectively report more mistakes, do learn, and are actually safer.

This article is particularly interesting when some of my more recent posts to this blog concerning errors when working under stress and poor morale are considered.

Andy Brazier

Friday, January 26, 2007

Capturing psychological mechanisms in error reports

Ergonomics journal April 2006

'From cognition to the system: developing a multilevel taxonomy of patient safety in general practice.' O Kostopoulou

In developing a taxonomy the author has raised a few interesting issues.

If a GP does not prescribe necessary medication for a patient, it could be seen as his or her error or failure. But it may be that the GP could not read the handwritten request from the hospital, so there is an external cause and performance shaping factors, but no psychological mechanism or immediate internal cause.

The author suggested that it would be better to replace the terms 'error' and 'failure' with the term 'action', as this would be more constructive and blame free.

Other taxonomies, such as those based on the slips, lapses, mistakes framework can only be used to classify errors.

Fear of blame, particularly in the medical industry, can severely reduce the likelihood of talking about and learning from patient safety events. This is certainly the case where terms such as 'carelessness' or 'thoughtlessness' are included in a taxonomy. It can also be the case where classifications are allocated either to systems or to humans, which can reinforce the blame culture by concentrating attention on the error-producing human as distinct from the error-inducing system.

Performance shaping factors are likely to change as policies, systems and technologies change. For example introducing an electronic prescription system will eliminate handwriting issues but introduce new opportunities for error, such as selecting the wrong item from an alphabetical list of medicines.

The study highlights the importance of identifying the psychological mechanism that led to an error. This requires an understanding of the cognitive basis of behaviour. In many cases reporting systems do not capture that sort of information and it will not be obtained unless incidents are followed up very quickly. The idea of the taxonomy is to capture that information.

Andy Brazier

Effects of stress and job control on errors

Ergonomics journal April 2006

'Work stress and patient safety: Observer-rated work stressors as predictors of characteristics of safety-related events reported by young nurses.' A Elfering, NK Semmer and S Grebner

The study used self-reporting and observation. It found the most frequent safety-related stressful events included incomplete or incorrect documentation (40%), medication near misses (21%), delays in delivery of patient care (9.7%) and violent patients (9.7%).

Familiarity of events and probability of occurrence were seemingly predicted by job stressors and low job control, which were shown to be risk factors for patient safety.

The results suggest that jobs should be redesigned to enhance job control and decrease stress. These interventions may be effective at improving patient safety.

Safety-related events were found to be related to stressors, most notably concentration demands and lack of control. In other words, people working under high demands and low control are more likely to have safety events. The explanation offered is that secondary tasks, such as second checking and documenting, may not be attended to as well as they should be. Also, stress can result in less competent social behaviour, which may affect the behaviour of patients.

The relationship between work demands and patient safety needs to be better understood so that nurses can be educated in self-management strategies for stressful situations.

Andy Brazier

Influence of stress and morale in medication errors and violations

Ergonomics journal April 2006

'Patient safety during medication administration: the influence of organisational and individual variables on unsafe work practices and medication errors.' GJ Fogarty and M McKeon.

The study used structural equation modelling to measure organisational climate and see how it affected unsafe medication administration behaviours, including the role of stress and morale.

As in other high risk industries, failure to follow procedures is a major contributor to medication errors, with over half being due to violations. People who violate procedures are 1.4 times as likely to commit other types of error (note: this study seems to have included violations as a type of error). The size of this influence, and the suggestion that the psychological pathways to violations and errors are different, has led to calls for them to be treated as different safety outcomes.

Violations are typically associated with attitude and behaviours whilst errors are associated with deficiencies in skill and information processing.

Another significant difference is that violations are intentional so that people know they are committing them whereas errors are unintentional so people do not always know they have made them. This means that people can self report violations and that may be a way of learning more about the organisation.

The study used self-reporting of perceptions of the organisation and of violations and errors.

Violations are intentional actions but not intended to do harm. This study has shown that violations are most likely when the individual is distressed and morale is low; and that these individual states are influenced by organisational climate.

It is proposed that more frequent monitoring of organisational and safety climate, and individual stress and morale will help in preventing violations.

Andy Brazier

Work courses 'wasting £75m'

Article in Sunday Times 21 January 2007 by Roger Eglin

Kaisen (business psychologists) has completed research into the benefits of development courses aimed at business leaders and concluded they are largely valueless. This is because most courses focus on educating through useful concepts and theories, whereas what is needed is skills: how to act and behave, how to be more assertive, how to motivate people.

Learning behavioural skills requires practice in the workplace. Going on a course is like trying to learn to ski in a hotel by watching videos and reading course material.

Even the practical elements of courses were of little value because people were usually paired off, which, given that all were there to learn, meant inexperienced people were coaching each other. Also, courses emphasised self-awareness, which has its uses for the individual but does not help them understand their colleagues.

One-to-one coaching is probably much better than sending leaders on courses.

Andy Brazier

Tuesday, January 23, 2007

Glasgow accident analysis group

Looks like a lot of potentially useful information available at http://www.dcs.gla.ac.uk/research/gaag/

Also, they have a conference in March 2007. Details and some of the papers at http://www.dcs.gla.ac.uk/~johnson/papers/workshop/human_error.htm

Nissan improves safety

Article at http://www.murfreesboropost.com

'Focus on safety begins to pay off for Nissan' by Erin Edgemon. This is interesting as it sounds like the programme being followed is most concerned with hazardous conditions, rather than behaviors.

Nissan North America decreased its reportable injuries at its manufacturing plants by nearly 72 percent from 2000 to 2005.

In 2000 Nissan North America had a recordable injury rate of 31.4, meaning that 31.4 out of every 100 people had an injury that required more than first aid treatment during the year. By 2005 that rate had dropped to 8.9. Over the same period the lost work time rate dropped from 6.3 to 1.6.
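As a quick arithmetic check, the quoted 72% improvement follows directly from the two rates. This is just a sketch using the figures above, not anything from the article itself:

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage fall from `before` to `after`."""
    return (before - after) / before * 100

# Rates quoted in the article (injuries per 100 employees per year)
recordable = percent_reduction(31.4, 8.9)  # recordable injury rate
lost_time = percent_reduction(6.3, 1.6)    # lost work time rate

print(f"Recordable: {recordable:.0f}%")  # 72%, matching the "nearly 72 percent" claim
print(f"Lost time:  {lost_time:.0f}%")   # 75%
```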

Nissan's safety program works because it gets everyone from top-level management to production technicians on the manufacturing floor involved. Nissan benchmarked other companies including receiving consultations from Dupont, which is known for its world-class safety program. It spent millions of dollars to improve safety in its plants from purchasing mats for technicians to stand on to installing lift assists and robots to do more physically demanding jobs.

“The intent of it was to enhance our safety program to become world class,” said Greg Daniels, senior vice president of Nissan’s U.S. manufacturing, of the program. “We wanted our employees to come to work and leave the same way. It was critical to us.”

In order to make the program work, Daniels said, employee mindset had to be changed to view safety as the first priority. Every employee is trained to spot problems and is expected to tell non-management safety committee members or management to have them corrected.

Workers inspect their work zones at least once a week. It takes a few hours to inspect for potential hazards and to talk to the employees in the area. The employees doing the inspections have the authority to fix problems or write up work orders to have problems fixed.

Each zone on the manufacturing floor is audited for safety eight times a month due to the constantly changing environment, Dove said.

Now that Nissan has the basics perfected, more of the company’s focus has been placed on ergonomics and making the assembly of vehicles easier for employees.
"Most of our issues now are design issues," Dove said.

Andy Brazier

Wednesday, January 17, 2007

BP Baker Panel Report - Human Factors

The report identifies human factors as a key part of process safety. It also raises a number of human factors issues, which are summarised below.

For most of its incident investigations, BP uses a list of causal factors to analyze root causes. BP refers to this method as the Comprehensive List of Causes (CLC).

A list of human factors is also provided for use in conjunction with the CLC. This contains a guide to analyzing human behaviors, beginning with a determination of whether the identified behavior leading to a cause was intentional or unintentional and leading to the identification of external and internal influences and other conditions under which personnel are likely to make mistakes.

In the Panel’s experience, investigations typically use a checklist as a complete list of potential causes instead of a starting point for discussion of the deeper root causes and usually will not identify factors that are not on the list.

The Panel also believes that BP’s list of systemic factors related to engineering problems (e.g., “inadequate technical design”) appears somewhat superficial.

While inadequate technical design is a valid factor, BP should use it to invite more extensive inquiry: What is the design inadequacy? Why was it present? Why was it not discovered prior to the incident under investigation?

Many of the listed systemic factors do not represent systemic issues. Fatigue, for instance, is included as a systemic cause.

BP uses the CLC for both personal safety accidents and process safety accidents. In the Panel’s opinion, the causal factors involved in occupational or personal safety incidents and process safety incidents typically are very different.

The human error analysis, which focuses investigators’ efforts on personal safety aspects of incidents rather than all aspects of an incident, may introduce additional bias in the analysis toward finding behavioral root causes.

At the time of the Carson refinery technical review in May 2006, about half of process hazard analysis, or PHA, action items at Carson from 2001-2004 remained open.

Action items from facility siting and human factors checklists used in PHAs were not consistently tracked and implemented.

Andy Brazier

Tuesday, January 16, 2007

BP Baker Panel Report - Process Safety

Not all refining hazards are caused by the same factors or involve the same degree of potential damage. Personal or occupational safety hazards give rise to incidents, such as slips, falls, and vehicle accidents, that primarily affect one individual worker per occurrence.

Process safety hazards can give rise to major accidents involving the release of potentially dangerous materials, the release of energy (such as fires and explosions), or both. Process safety incidents can have catastrophic effects and can result in multiple injuries and fatalities, as well as substantial economic, property, and environmental damage. Process safety refinery incidents can affect workers inside the refinery and members of the public who reside nearby. Process safety in a refinery involves the prevention of leaks, spills, equipment malfunctions, over-pressures, excessive temperatures, corrosion, metal fatigue, and other similar conditions. Process safety programs focus on the design and engineering of facilities, hazard assessments, management of change, inspection, testing, and maintenance of equipment, effective alarms, effective process control, procedures, training of personnel, and human factors. The Texas City tragedy in March 2005 was a process safety accident.

Andy Brazier

BP Baker Panel Report - Recommendations

RECOMMENDATION # 1 – PROCESS SAFETY LEADERSHIP
BP’s corporate management must provide effective leadership on and establish appropriate goals for process safety.

RECOMMENDATION #2 – INTEGRATED AND COMPREHENSIVE PROCESS SAFETY MANAGEMENT SYSTEM
BP should establish and implement an integrated and comprehensive process safety
management system that systematically and continuously identifies, reduces, and manages process safety risks at its U.S. refineries.

RECOMMENDATION #3 – PROCESS SAFETY KNOWLEDGE AND EXPERTISE
BP should develop and implement a system to ensure that its executive management, its
refining line management above the refinery level, and all U.S. refining personnel, including managers, supervisors, workers, and contractors, possess an appropriate level of process safety knowledge and expertise.

RECOMMENDATION #4 – PROCESS SAFETY CULTURE
BP should involve the relevant stakeholders to develop a positive, trusting, and open
process safety culture within each U.S. refinery.

RECOMMENDATION #5 – CLEARLY DEFINED EXPECTATIONS AND ACCOUNTABILITY FOR PROCESS SAFETY
BP should clearly define expectations and strengthen accountability for process safety performance at all levels in executive management and in the refining managerial and supervisory reporting line.

RECOMMENDATION #6 – SUPPORT FOR LINE MANAGEMENT
BP should provide more effective and better coordinated process safety support for the U.S. refining line organization.

RECOMMENDATION #7 – LEADING AND LAGGING PERFORMANCE INDICATORS FOR PROCESS SAFETY
BP should develop, implement, maintain, and periodically update an integrated set of
leading and lagging performance indicators for more effectively monitoring the process safety performance of the U.S. refineries.

RECOMMENDATION #8 – PROCESS SAFETY AUDITING
BP should establish and implement an effective system to audit process safety performance at its U.S. refineries.

RECOMMENDATION #9 – BOARD MONITORING
BP’s Board should monitor the implementation of the recommendations of the Panel and the ongoing process safety performance of BP’s U.S. refineries. The Board should also report publicly on the progress of such implementation and on BP’s ongoing process safety performance.

RECOMMENDATION #10 – INDUSTRY LEADER
BP should use the lessons learned from the Texas City tragedy and from the Panel’s report to transform the company into a recognized industry leader in process safety management.

BP Baker Panel Report - Key findings

Released today. Review of BP's corporate safety culture, safety management systems,
and corporate safety oversight at its U.S. refineries following the Texas City Fire. Running to 374 pages, it will be a while before I have had the chance to read it all, but the executive summary is very interesting - almost exclusively regarding process safety - see separate post here for an explanation.

The full report is available here

The Report's Recommendations are summarised here.

KEY POINTS FROM EXECUTIVE SUMMARY

The Panel believes that BP has not provided effective process safety leadership and has not adequately established process safety as a core value across all its five U.S. refineries.

BP has not provided effective leadership in making certain its management and U.S. refining workforce understand what is expected of them regarding process safety performance.

BP has emphasized personal safety in recent years and has achieved significant improvement in personal safety performance, but BP did not emphasize process safety.

BP mistakenly interpreted improving personal injury rates as an indication of acceptable process safety performance at its U.S. refineries. This created a false sense of confidence.

Process safety leadership appeared to have suffered as a result of high
turnover of refinery plant managers.

At Texas City, Toledo, and Whiting, BP has not established a positive, trusting, and open environment with effective lines of communication between management and the workforce.

BP has not always ensured that it identified and provided the resources required
for strong process safety performance at its U.S. refineries.

Despite having numerous staff at different levels of the organization that support
process safety, BP does not have a designated, high-ranking leader for process safety dedicated to its refining business.

The company did not always ensure that adequate resources were effectively allocated to support or sustain a high level of process safety performance.

BP’s corporate management mandated numerous initiatives that applied to the U.S. refineries and that, while well-intentioned, have overloaded personnel at
BP’s U.S. refineries. This “initiative overload” may have undermined process safety performance at the U.S. refineries.

In addition, operations and maintenance personnel in BP’s five U.S. refineries sometimes work high rates of overtime, and this could impact their ability to perform their jobs safely and increases process safety risk.

The Panel also found that BP did not effectively incorporate process safety into management decision-making.

BP tended to have a short-term focus, and its decentralized management system and
entrepreneurial culture have delegated substantial discretion to U.S. refinery plant managers without clearly defining process safety expectations, responsibilities, or accountabilities.

BP has not demonstrated that it has effectively held executive management and refining line managers and supervisors accountable for process safety performance.

Although the five refineries do not share a unified process safety culture, each exhibits some similar weaknesses.

The Panel found instances of a lack of operating discipline, toleration of serious deviations from safe operating practices, and apparent complacency toward serious process safety risks at each refinery.

While all of BP’s U.S. refineries have active programs to analyze process hazards, the system as a whole does not ensure adequate identification and rigorous analysis of those hazards.

The Panel observed that BP does have internal standards and programs for managing process safety, but found that BP’s corporate safety management system does not ensure timely compliance with internal process safety standards. These included standards applying to rupture disks under relief valves; equipment inspections; critical alarms and emergency shut-down devices; area electrical classification; and near miss investigations.

BP’s corporate safety management system does not ensure timely implementation of external good engineering practices that support and could improve process safety performance.

BP’s system for ensuring an appropriate level of process safety awareness, knowledge, and competence in the organization has not been effective in a number of respects.

BP has not effectively defined the level of process safety knowledge or competency required of executive management, line management above the refinery level, and refinery managers.

BP has not adequately ensured that its U.S. refinery personnel and contractors have sufficient process safety knowledge and competence.

The implementation of and over-reliance on BP’s computer-based training contributes to inadequate process safety training of refinery employees.

BP’s corporate process safety management system does not effectively translate corporate expectations into measurable criteria for management of process risk or define the appropriate role of qualitative and quantitative risk management criteria.

BP has not effectively implemented its corporate-level aspirational guidelines and
expectations relating to process risk. Therefore, the Panel found that BP has not implemented an integrated, comprehensive, and effective process safety management system for its five U.S. refineries.

Significant deficiencies existed in BP’s site and corporate systems for measuring process safety performance, investigating incidents and near misses, auditing system performance, addressing previously identified process safety-related action items, and ensuring sufficient management and board oversight.

Many of the process safety deficiencies are not new but were identifiable to BP based upon lessons from previous process safety incidents, including process incidents that occurred at BP’s facility in Grangemouth, Scotland in 2000.

BP tracked some metrics relevant to process safety at its U.S. refineries. Apparently, however, BP did not understand or accept what this data indicated about the risk of a major accident or the overall performance of its process safety management systems.

BP has not instituted effective root cause analysis procedures to identify
systemic causal factors that may contribute to future accidents. When true root or system causes are not identified, corrective actions may address immediate or superficial causes, but not likely the true root causes.

BP has an incomplete picture of process safety performance at its U.S. refineries because BP’s process safety management system likely results in underreporting of incidents and near misses.

BP has not implemented an effective process safety audit system for its U.S. refineries.

The principal focus of the audits was on compliance and verifying that required management systems were in place to satisfy legal requirements. It does not appear, however, that BP used the audits to ensure that the management systems were
delivering the desired safety performance or to assess a site’s performance against industry best practices.

BP has sometimes failed to address promptly and track to completion process safety deficiencies identified during hazard assessments, audits, inspections, and incident investigations.

The Panel’s review found repeat audit findings at BP’s U.S. refineries, suggesting that true root causes were not being identified and corrected.

BP does not effectively use the results of its operating experiences, process hazard analyses, audits, near misses, or accident investigations to improve process operations and process safety management systems.

The company’s system for assuring process safety performance uses a bottom-up reporting system that originates with each business unit, such as a refinery. As information is reported up, however, data is aggregated. By the time information is formally reported at the Refining and Marketing segment level, for example, refinery-specific performance data is no longer presented separately.

The Panel’s examination indicates that BP’s executive management either did not receive refinery-specific information that suggested process safety deficiencies at some of the U.S. refineries or did not effectively respond to the information that it did receive.

A substantial gulf appears to have existed between the actual performance of BP’s process safety management systems and the company’s perception of that performance.

BP’s Board can and should do more to improve its oversight of process safety at BP’s five U.S. refineries.


The Report's Recommendations are summarised here.

Some comments are made about human factors in the report. They are summarised here

Andy Brazier

IET Health and safety information

A useful looking website from The Institution of Engineering and Technology. Includes briefings, a guide and news. May be a bit too regulatory led for my liking, but some of it looks very useful.

Briefings include

Behaviour-based safety
Blame Free Reporting
Contractor Management
Cost of Safety
Determining the Acceptability of Risk
Do Accidents and Ill-Health Really Cost Me Money?
Hazard Analysis (HAZAN)
Hazard and Operability Studies (HAZOP)
Indicators and Targets
Organisational Change and Safety
Permit to Work Systems
Failure Modes and Effects Analysis - FMEA
Event Tree Analysis - ETA
Fault Tree Analysis - FTA
Reasonably Practicable
Risk Based Inspection
Safety Culture

Andy Brazier

Tuesday, January 09, 2007

Detecting and responding to plant disturbances

Article in October 2006 edition of Ergonomics journal
Title 'Human process of control: tracing the goals and strategies of control room team'
By J Patrick, N. James and A. Ahmed

A study from the nuclear industry. Five shift operations teams were evaluated using a process simulator. Each team was made up of two control room operators and a supervisor.

The simulator was a full-scale mimic of operations.

The teams began with normal operations and were then asked to carry out a routine task of changing over boiler feed pumps. Whilst they were doing this task a small leak was initiated on one of the pumps, and on top of this a spurious fire alarm in an office was activated that required some action from the supervisor, although not from the operators.

The simulated leak was considered to be a plant disturbance that would not create an immediate alarm. However, the leak would cause a drop of level in the deaerator which, if undetected, would cause a low level alarm and eventually a reactor trip.

The time taken to detect the level drop was recorded. The results were
Team A - 9 min 3 s
Team B - 4 min 6 s
Team C - 1 min 57 s
Team D - 6 min 3 s
Team E - 2 min 30 s
Expert judgement on the site was that this scenario should be detected within 2 minutes. Therefore, from the results, only Team C was successful.
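The pass/fail comparison can be sketched as follows. This is only an illustration; the team labels and times are transcribed from the results above:

```python
# Detection times from the simulator study, converted to seconds
detection_times = {
    "A": 9 * 60 + 3,   # 9 min 3 s
    "B": 4 * 60 + 6,   # 4 min 6 s
    "C": 1 * 60 + 57,  # 1 min 57 s
    "D": 6 * 60 + 3,   # 6 min 3 s
    "E": 2 * 60 + 30,  # 2 min 30 s
}

CRITERION = 2 * 60  # expert judgement: detect within 2 minutes

passed = [team for team, t in detection_times.items() if t <= CRITERION]
print(passed)  # ['C'] - only Team C, at 117 s, meets the criterion
```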

To explain the long time taken to detect the event, the researchers looked at how much attention the operators paid to the routine pump changeover task. Apparently, the normal procedure was that only one operator would be involved in the task, leaving the other to monitor the plant, with the supervisor having minimal involvement. The actual results for the proportion of time spent on the task were:
Team A - Op1 74% Op2 64% Super 12%
Team B - Op1 81% Op2 53% Super 0%
Team C - Op1 26% Op2 00% Super 16%
Team D - Op1 95% Op2 90% Super 44%
Team E - Op1 98% Op2 93% Super 23%
Clearly, with the exception of Team C, both operators were heavily involved in the routine task, and it is no wonder they took so long to detect the leak: they would have had little time to monitor the rest of the plant. This was further exacerbated by the fact that in some teams the supervisors got quite involved in the task (which should not have been necessary), taking them away from their main supervisory duties.

Delving deeper, the researchers also found that most teams not only spent relatively little time monitoring the plant, but were fairly poor at it when they did. In other words, the monitoring they did was not good enough to detect the leak quickly.

Overall the finding is that operators easily become fixated on procedural tasks at the expense of the wider, continuous task of monitoring. Also, supervisors tend not to intervene to re-orientate the operators (in this case making sure only one is involved in the routine task while the other concentrates on monitoring) and have a tendency to get involved where it should not be necessary.

Having detected the leak, the teams had to diagnose its cause and take appropriate action. This was also evaluated. The results for diagnosis and control were
Team A - 6 min 11 s
Team B - 3 min 23 s
Team C - 4 min 46 s
Team D - 1 min 58 s
Team E - 4 min 48 s
In trying to diagnose the problem each team developed a number of hypotheses. Interestingly, the operators generated most of these, with relatively little input from the supervisor. It had been assumed that supervisors would lead diagnosis, which clearly did not happen.

Unfortunately, none of the hypotheses covered the actual cause of the problem. None of the teams diagnosed the cause correctly from the control room; they needed information from the plant (i.e. someone looking at where the water was leaking from).

The researchers identified that the teams took significantly different approaches to this phase of the scenario in the way they split their time between problem solving, mitigating the consequences of the leak and keeping an eye on the rest of the plant. They concluded that none of the teams gave hypothesis generation and testing a high enough priority, which suggests a need for more training in diagnosis.

Andy Brazier

Monday, January 08, 2007

Room temperature and health risks

Article on the BBC website explores why the death rate in the UK during winter due to cold is significantly worse than in really cold countries. It is interesting because it points out that most UK deaths are not due to 'massive cold', where people are exposed to very cold temperatures, but to 'quite minor degrees of cold that people were getting every day.' The actual cause of death in these cases is stroke and heart attack, because blood is more liable to clot when cold.

Comfort and health issues are quoted from the West Midlands Public Health Observatory

24C - top range of comfort
21C - recommended living room temperature
Less than 20C - death risk begins
18C - recommended bedroom temperature
16C - resistance to respiratory diseases weakened
12C - more than two hours at this temperature raises blood pressure and increases heart attack and stroke risk
5C - Significant risk of hypothermia

People in cold countries keep their houses warmer and take outdoor clothing much more seriously.

Andy Brazier

Friday, January 05, 2007

Business continuity

A useful checklist for any business is available from the London Prepared website.

It covers building facilities, personnel, security, documents, equipment, IT, suppliers, customers and insurance.

Andy Brazier

Perceived risks of hydrogen

Article in November 2006 The Chemical Engineer
Titled "Hydrogen: a matter of perception" by Miriam Ricci, Paul Bellaby, Rob Flynn and Gordon Newsholme.

Hydrogen is being proposed as a fuel of the future for vehicles and other uses. Clearly it is a hazardous material, and there is the danger that this will be viewed in isolation and mean that it is not accepted by society. But this fails to compare the risks associated with hydrogen with those of currently accepted fuels or to consider the benefits of hydrogen. The authors claim that hydrogen should not be viewed according to its physical and chemical properties, but as "an energy carrier in a complex socio-technical system." This is because the risks will depend on how hydrogen is ultimately produced, transported, stored, delivered and used - much of this is currently unknown.

People perceive risks according to perceived benefits and costs (i.e. it is totally contextual). Trust has a lot to do with it, and the public can become uneasy about the motivation of the organisations involved and who is likely to benefit and who is likely to be at risk.

If the public can be persuaded that hydrogen is safe enough, or at least as safe as currently accepted fuels it may well be accepted. But this confidence will take a major hit if there is any sort of large hydrogen-related accident, particularly during transition to a hydrogen economy. This makes a case for not exaggerating the safety of hydrogen. But also, it highlights why industry needs to be very careful when introducing new technology as a loss of confidence due to failure to manage risks can deny society something that in the long term is beneficial.

Reference is made to a more in-depth report available online: Risk perception of emergency technology

Andy Brazier

The purpose of maintenance - creating change

Article in December 06/January 07 The Chemical Engineer
Titled "Make the most of your assets" by Sandy Dunn.

One of the major barriers to overcome if maintenance is to be improved is the perception that its purpose is just to repair equipment after it has broken. It is not even only about prediction and prevention; rather, it is a holistic process that ensures equipment fulfils its intended business purpose. A key element of this is identifying and eliminating the things that cause failures, and this goes far beyond the maintenance department. As a minimum, maintenance needs to be working with operations and purchasing, and all need to share a common goal.

To achieve step-change improvements, the organisation needs to be ready to change to this holistic view. Many will not be, and will need a compelling reason for change.

The necessary change will be multi-dimensional. To achieve it you need to

* make sure people have access to the right tools and information, and the authority to make decisions;
* change the way performance is measured and rewarded;
* rethink lines of reporting;
* redesign jobs and procedures, and train people.

To get people on board you need to have a good answer to the question "what is in it for me?" This can be financial, self-esteem, recognition, job satisfaction, career growth, pride and many others. When these have been identified they should be emphasised frequently, although care is required to avoid creating unreasonable expectations.

A rule of thumb says that if there are no tangible benefits within six months of a change, support will halve and barriers will double. The programme of change therefore needs to deliver benefits throughout, as people may not be prepared to wait until the end to see them.

Quoting Dunn: "Newton's First Law was never so true: an object at rest tends to stay at rest until acted upon by external forces. In change projects, inertia is to be avoided. It is too easy for stakeholders to remain exactly where they are, especially if they are anxious about the change project. Stakeholders need continuous invitations to become involved, constant reassurance that they will get their wins."


Andy Brazier

Change management when introducing new IT

Article in December 06/January 07 The Chemical Engineer.
Titled 'All aboard' and written by Christopher Abiodun of IBM Global Business Services.

The article concerns the development of an Enterprise Asset Management system for BP's Greater Plutonio oil field operations in Angola. The system was intended to manage maintenance, but was also integrated with materials management, purchasing and supply chain management.

As well as teams dealing with the system's main functions, a change management team was put in place to assist with the people issues such as stakeholder management, project awareness, communications, and training - to ensure engagement with end users. It was felt that this was invaluable in keeping stakeholders engaged, raising awareness of the project, and assessing and communicating to the end users how the system would change current ways of working.

A key function of the change management team was to analyse training needs and ensure that users got the training they required. The team found that training was needed not just in using the new system, but also in the underlying business processes. This helped users understand why the system was designed the way it was and how to get the most out of it. It also reduced inertia and resistance to change.

Abiodun writes "The importance of using change management from the early days of the project through to its conclusion to ensure engagement by stakeholders cannot be overemphasised. It was a key factor in engaging over 200 geographically-dispersed users across two continents to adopt the new system.... The effort this takes should not be underestimated; assessing the impact of change brought about by the new system, developing change strategies and executing them took hard work and dedication."

There is an element of risk in everything. By assessing this early on the risks can be managed. For this project the number 1 risk was reliance on network communications to a deepwater offshore facility. The project had enough time to identify, evaluate and assess a number of potential solutions.

Final quote from Abiodun. "A well-equipped and motivated team of seemingly ordinary people can achieve extraordinary results - although it helps to have the odd one or two extraordinary players in the team."


Andy Brazier

Tuesday, December 19, 2006

Cheap office ergonomics

Good article at Psychology Today about how to avoid back pain at minimal cost:

1. Maximize your space - make sure that the things you use frequently, such as the stapler or message pad, are within reach. Grabbing for objects can cause back contortions resulting in injury.

2. Level the field - one of the leading causes of back pain is craning your neck to look at a computer screen below your field of vision. "Prop up your monitor with a telephone book," says Kirschner. "They're free and widely available."

3. Lumbarize your chair - if your office chair doesn't offer enough lumbar support, roll up a small towel and place it in the curve of your lower back. Make sure it is not too large; the towel should just fill the gap between your back and the chair.

4. Get up and stretch periodically - just raise your hands above your head or do a slight back bend every 20 to 40 minutes.

5. Don't cradle the phone - "The single most important preventive measure: don't cradle your phone between your ear and shoulder." Invest in a hands-free headset or use the speakerphone.

Andy Brazier

OHSAS 18001 to become BS

BSI recently held a 'webinar' regarding the planned issue of British Standard BS 18001, which is intended to supersede the current occupational health 'specification' OHSAS 18001. It seems an international standard is not yet forthcoming because global requirements are not stringent enough for UK legislation.

The BS will be more closely related to ISO 9001 and 14001 and use of terminology will change a bit. Hazard identification and risk assessment will be required to take into account:
* Human factors such as behaviour and capabilities
* Infrastructure, equipment and materials
* Changes or proposed changes in the organisation or its activity
* Modifications to the OH&S MS…and their impacts on operations, processes and activities
* Any legal obligations relating to risk assessment and implementation of necessary control measures

Risk controls will need to be selected according to the fairly well-accepted hierarchy of control (elimination, substitution, engineering controls, signs/warnings/procedural controls, PPE).

A commitment must be made to prevent OH&S incidents. The active role of top management will be emphasised, including how they will demonstrate commitment. Also, all employees will have to take responsibility for aspects of OH&S over which they have control.

There will be a requirement to identify training needs, for those needs to be met, to evaluate the effectiveness of training, and to keep records of training, education and experience.

Organisations will have to periodically evaluate compliance with applicable legal and other requirements and to keep records of the results. Accidents will need to be investigated and analysed with results being documented.

Benefits of achieving OHSAS 18001 are quoted as

* 52% - large/significant improvement in regulatory compliance
* 32% - decrease in overall costs of accidents
* 17% - decrease in insurance premiums
* 4% - decrease of over 10% in insurance premiums

I guess the implication is that BS18001 will have even more benefits.

From this, a few things strike me because they are things I have felt to be very important for some time:

* Taking human factors into account in hazard identification and risk assessment;
* Training needs analysis and evaluation after training;
* Identifying accident investigation and analysis as two processes.

Andy Brazier

Autopilot

According to this article, whilst autopilots and pilots individually seldom make mistakes, errors sometimes occur because of "inefficient collaboration" between them, and this has been known to cause accidents.

To avoid this, new software is being developed that gives the autopilot more of the calculation work. The result is that the human pilot is presented with explicit statements of the current situation, the action to be taken and the objectives. This gives them a better understanding of what is going on and hence of their part in it all. It also reduces the pilot's workload, leaving them more time to monitor the situation.

It is interesting to read about how errors occur between automated systems and humans, and this could be entirely relevant in other industries such as process control, where I know optimisers can cause confusion. Whether this new software is the solution, I am not so sure. It sounds like the pilot's role is being further eroded, becoming more passive and boring, which may not help their alertness and may even lead to a degradation in skill over time.

Andy Brazier

Wednesday, November 08, 2006

Genuine errors that kill

Good post on NHS Blog Doctor. It discusses how we should deal with errors that have catastrophic consequences when someone does something quite normal. In this case the example is losing control because of sneezing, whether driving a car or performing surgery. The trouble is that if someone dies because of someone else's error, the general public expect someone to be punished. Where someone is negligent or reckless (i.e. driving or operating drunk) this is quite clear-cut. But punishing someone for sneezing does not seem right.

A news article on a similar theme appeared on the BBC website on 7 November 2006: nurse gives baby morphine overdose.

In this case the nurse gave morphine meant for another baby when she thought she was giving human albumin solution. She was an experienced nurse and there seems to be no explanation for why she made the error. However, she was sacked and has now been found guilty of misconduct.

Tuesday, November 07, 2006

European power outage

Parts of Germany, France, Belgium, Spain, Portugal, Croatia and Italy were blacked out on 5 November 2006 when German power controllers switched off a cable, leaving some areas lacking power and others overloaded.

Good article about it on BBC website

It is interesting that inter-connecting national grids is intended to secure supply. However, it adds complexity, which can contribute to failures. This seems to be a common result of new technology: the likelihood of failures is reduced, but the consequences when they do happen are often much greater.

Andy Brazier

Fire risks

There is a phenomenal amount of information available regarding fire at the following website.

I think it refers to the old regulations (i.e. before 1 October 2006), hence it is archived, but most of it will still be useful guidance.

Andy Brazier

Fatigue & alertness testing

A company in the US (Bowles Langley Technology) has developed online tools that people can use to test their alertness. The aim is to allow people to check whether they are safe to work or to drive home. You can try a demo on their website.

Andy Brazier

Monday, November 06, 2006

Chronic fatigue after long working hours

In October 2002, Mark Fiebig was killed when he fell asleep at the wheel of his car while driving home from work. His employer has recently been found guilty of breaching health and safety laws and fined £30k plus £24k costs, as it was felt they had failed to monitor working hours closely enough.

This is interesting because the accident happened outside work hours. Admittedly the hours being worked were way in excess of what most would do; it was reported that he had worked 17-hour shifts for four consecutive days. But it is a point I have raised with clients in the past, especially regarding night shifts. I have tried to encourage them to consider what they would do if someone said they felt really tired. Would they drive the employee home to make sure he got there safely?

The case is reported in a number of places including

Norwich Union

Cambridge evening news

TUC

Wednesday, October 25, 2006

The problems with behavioural safety

I have just found this article by Nancy Lessin published at hazards.org

Problems identified in this paper include:

* Focusing on worker behaviour tends to mean root causes of problems are not looked at closely enough. Production pressure is quoted as a common reason why employees do not behave as safely as they should;
* There is a tendency to place the burden of prevention on the worker, rather than developing technical solutions;
* Everyone makes mistakes, is at some time careless, complacent, overconfident, and stubborn. At times each of us becomes distracted, inattentive, bored and fatigued. BS seems to suggest this should not be the case, and that if people are more careful mistakes will not happen.
* BS tends to mean that any individual acting unsafely is subject to 'inquisitions.' This is not pleasant, so the result is incidents don't get reported.
* BS programmes can be used by management to justify actions that unions have identified in the past, thus undermining the union.
* A 'systems approach' that emphasizes the identification and elimination of the root causes of workplace injuries and illnesses - workplace health and safety hazards - would be far more effective.

The paper quotes some examples of where unions and workers have fought back against BS. They include:
* Engaging in a campaign that includes educating and involving the membership, identifying allies, identifying leverage and employing escalating tactics.
* Workers all wearing anti-behavioural safety buttons (badges);
* Placing fluorescent stickers on hazards in the workplace to bring a focus back to hazards rather than workers' "unsafe behaviours";
* Making a sign for the union bulletin board that reads "It has been x days since we asked management to correct [a particular hazard] and they have still not fixed it" (and keeping the count going each day);
* Threatening to call OSHA in to inspect the workplace.
* The United Steelworkers of America developed buttons (badges) for locals going through such campaigns that have a large BS in the center, with a line drawn through it, and the words "Eliminate Hazards - Don't Blame Workers" around the outside.

I certainly don't agree with everything in this paper or the way the message is put over. However, I do share some of the concerns, and am convinced that a systems approach to improving health and safety would be more effective, and more likely to address process safety and health as well as personal safety.

Andy Brazier

Thursday, October 19, 2006

Safety last

Article in the Guardian by David Brindle and Paul Lewis on 18 October 2006 link

It provides a summary of the recent debate about society becoming more risk averse, and includes some examples. The problem is, what is the solution?

Andy Brazier

Controlling risk associated with violence

An excellent set of responses to a question posted on an IOSH forum related to protecting doctors from violent patients. Not much for me to say. I just want to record the link here for future reference.

Link

Wednesday, October 18, 2006

Driver warning system

Article at CBC published 17 October 2006.

Ford Motor Co. is testing a number of different systems that warn drivers when they stray out of their lane. Researchers studied drivers who had not slept for 23 hours, having each participant drive for three hours in a simulator.

They found that all the systems were effective at improving reaction time, implying they would reduce the likelihood of accidents. However, I wonder how much such devices will affect driver behaviour. Will people pay less attention when driving because they know there is a device that will warn them if they stray? Will people drive for longer without a break, or be less concerned about driving when they haven't slept?

Andy Brazier

Working under fire

Report by Robert Jaques 17 October 2006 published here

Military student medics were required to perform a thoracostomy (insertion of a tube into the chest cavity to permit fluid to drain) under virtual-reality battle conditions.

Interesting findings
* The students' completion times showed that they could perform the surgery efficiently, but that the quality of their work suffered.
* Those who performed the procedure faster were more susceptible to the virtual sniper fire.
* The stress created by the simulated environment may have caused some students to engage in inappropriate and dangerous behaviour that would be likely to result in their being killed in a real combat situation.

Not sure how this would translate into a business setting, but I can imagine that during a major incident people are likely to act differently. We rarely get the chance to give our staff the opportunity to see what it will be like, and have no real idea of how they will react.

Andy Brazier

Friday, October 13, 2006

Employers not liable for unforeseeable events

The HSE has recently lost a case at the Court of Appeal concerning two employees of Hatton Traffic Management (HTM) who died while taking part in road improvements on the A66 near Scotch Corner.

According to this website "HTM were providing traffic management services for contractors (L) who were resurfacing the A66. There were contraflow works, lit at each end by HTM’s mobile telescopic towers which were 9.1 metres tall. 20,000 volt electricity cables passed overhead, dipping to 7.5 metres above the ground. HTM had two employees on site, C and D, who took their day to day instructions from L. C and D were told to move one of the towers. They did not lower the tower under the cables (contrary, said HTM, to their training and to instructions on the tower) and the inevitable happened, with fatal consequences for both employees."

HTM were charged with failing to discharge their duty under s.2(1) of the Health and Safety at Work Act 1974, namely failing to ensure, so far as was reasonably practicable, the health, safety and welfare at work of all its employees. At a preparatory hearing, the judge ruled in favour of HTM on both points. The prosecution’s appeal was unsuccessful.

The HSE took it to appeal and lost. According to this website, the implication of the ruling is that "Employers cannot be found negligent on health and safety grounds when employees are acting outside their remit."

According to another website, HTM's lawyer said after the case: "If this argument had been upheld by the COA, Groch believes, it would have effectively removed any real defence available to employers in the area of risk management. Insurance premiums would also have been affected, as insurance companies would take action to protect themselves against substantial claims. Another disturbing implication would be that some employers may question the need to invest heavily in health and safety provisions if, in reality, they have no effective defence against criminal prosecution."

But this is unlikely to be the end of the case. The HSE will probably take it to the House of Lords, and there does seem to be plenty to debate. I personally find it hard to accept that, with high-voltage cables nearby, it was not foreseeable that workers might forget to lower the lights before moving them. Also, we all know people take shortcuts, and we should consider this in our risk assessments.

A spokesman from Norwich Union made the following comments at this website. "In this case it seems that HTM argued they had taken all reasonably practicable steps to ensure the safety of the employees and had provided training and instruction, as required by law. But, they argued the sequence of events that occurred was not foreseeable.

"Some might consider this somewhat disingenuous, despite the ruling. If there is a shortcut - that will save a bit of time and perhaps enable an early tea break, a chance to have a few minutes in the cab out of the rain - then is it not the case that employees will find it?"

Andy Brazier

Wednesday, October 11, 2006

Human error caused Cyprus air crash

Reuters website 10 October 2006.

The crash occurred in August 2005. The plane, on a Larnaca-Prague flight, flew on autopilot for two hours with its pilots slumped over the controls, before running out of fuel and ramming into a Greek hillside, killing all 121 people on board.

The report blamed deficient technical checks on the ground, failure by the pilots to pick up on pressurisation warnings and a series of other mistakes for the Cypriot Helios Airways Boeing 737-300 crash.

The pressurisation system regulates the cabin air supply; as the aircraft gained altitude, the oxygen available decreased, rendering the pilots and passengers unconscious.

The BBC website added more, including:

* Pilots misread instruments regulating cabin pressure and misinterpreted a warning signal.
* Maintenance officials on the ground left pressure controls on an incorrect setting.
* Plane's manufacturers Boeing took "ineffective" measures in response to previous pressurisation incidents in the particular type of aircraft.
* Airline came in for criticism for "deficiencies" in its organisation
* The Cypriot regulatory authority was accused of "inadequate execution of its safety oversight responsibilities"

Andy Brazier

Eye strain from computer use

Article by Darryl E. Owens published 10 October 2006 in the Orlando Sentinel

Studies haven't found that long-term computer use produces permanent damage, but some people do suffer from burning, watery or dry eyes, or blurred or double vision during or after use.

There is no evidence that this is caused by radiation from the screen. However, the main causes appear to be decreased blinking during computer use and wearing improper or outdated eyeglass prescriptions.

Coloured tints and filters are not the solution. Instead, properly adjust your office chair and position your computer monitor so that it is 20 to 25 inches from your eyes and slightly below eye level (a screen that is too high or too low will make it hard for your eyes to work together). Also adjust brightness and contrast.

Andy Brazier

The war on error

Article by David Learmount published 10 October 2006 on Flight website

Talks about a course titled 'Safety Stand-down' for experienced pilots run in US. Claims that the course "takes fully trained pilots well above and beyond what an advanced conventional or recurrent flying training programme provides. It challenges preconceptions, stimulates questions, and presents a pilot with a mirror in which his/her latent professional and personal vulnerabilities become fully visible. More than that, it renews a pilot’s respect for the multiple disciplines it takes to be a really good aviator."

Quotes Bob Agostino (Bombardier Business Aircraft director of operations): "Development of the human half of the man-machine equation has not kept pace with the technology developments in either formal training programmes or in regulatory development."

Also Dr Tony Kern (senior partner in Convergent Knowledge Solutions): "The challenge of human error will never be remedied by any traditional safety programme. Personal error must be slowly untangled in a private battle within each individual.”

Finally, researcher from University of Manchester: “The study of human error has grown dramatically in the last 20 years. We know why people make errors and how to prevent 90% of them, but no-one seems to care.”

Andy Brazier