Let Risk and Your Equipment Determine Your Maintenance Strategy
At conferences and workshops, and in articles on maintenance and reliability, I increasingly hear people claim that preventive maintenance is too costly and is not the right approach. By preventive maintenance, we primarily mean time-based inspections, although it can include overhauls and rebuilds as well. Before we take statements like these as gospel and apply them across the board to all of our equipment, we need to understand the basis for that reasoning and when it applies.
From the Reliability-Centered Maintenance (RCM) side, we know that we can’t apply the single bathtub curve to all equipment failures. We recognize that six separate failure curves exist, and that more than 80 percent of all failures show no end-of-life wear-out pattern, meaning that they are effectively random with respect to operating age. For those failures, there is no doubt that condition-based monitoring, using tools like vibration or temperature monitoring, is a better and less costly solution than time-based preventive maintenance, especially overhauls. This is the basis for the claim that doing more than roughly 20 percent preventive maintenance is counterproductive.
RCM teaches us that maintenance is really about minimizing the consequences of failure, that is, our risk. Our equipment and the risk should determine our maintenance strategy for any given piece of equipment, not someone telling us preventive maintenance is bad. To do this, we can use a tool like the RCM Logic Tree. The RCM Logic Tree places the emphasis on safety and environmental concerns first, followed by production or capacity losses. If we can apply a condition-monitoring strategy, that is our first choice because so many failures are random. However, not all equipment lends itself to a condition-monitoring approach, which leads us to time-based preventive maintenance inspections and overhauls. If neither approach lets us catch the equipment in the act of failing, then, based on risk, we can choose to re-engineer the equipment or run it to failure. If we choose “run to failure”, we should have a proactive strategy that returns the equipment to its normal state as quickly as possible (Neil Bloom’s Canon Law). Additionally, we should ensure that running an item to failure doesn’t cost us more money due to collateral damage (e.g., a bearing failure that takes out a shaft and the housing).
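The order of those choices can be sketched in a few lines of Python. This is a simplified illustration of the reasoning above, not the formal RCM Logic Tree: the `FailureMode` fields, the `Consequence` categories and the rule that only safety, environmental or production consequences justify re-engineering are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Consequence(Enum):
    # Lower value = higher priority, mirroring "safety and environmental
    # concerns first, followed by production or capacity losses".
    SAFETY_ENVIRONMENTAL = 1
    PRODUCTION = 2
    MINOR = 3


@dataclass
class FailureMode:
    # All field names are hypothetical, chosen for this sketch.
    name: str
    consequence: Consequence
    condition_monitoring_feasible: bool   # can we detect it in the act of failing?
    time_based_task_effective: bool       # does a cycle/hour/day-based task work?


def select_strategy(fm: FailureMode) -> str:
    """Walk the choices in the order described in the text."""
    if fm.condition_monitoring_feasible:
        return "condition-based monitoring"
    if fm.time_based_task_effective:
        return "time-based preventive maintenance"
    # Neither approach works, so the decision is driven by risk (assumed rule).
    if fm.consequence in (Consequence.SAFETY_ENVIRONMENTAL, Consequence.PRODUCTION):
        return "re-engineer"
    return "run to failure (with a restoration plan)"


def prioritize(failure_modes: list[FailureMode]) -> list[FailureMode]:
    """Review the highest-consequence failure modes first."""
    return sorted(failure_modes, key=lambda fm: fm.consequence.value)


if __name__ == "__main__":
    modes = [
        FailureMode("office lamp", Consequence.MINOR, False, False),
        FailureMode("filler valve", Consequence.PRODUCTION, False, True),
        FailureMode("pump bearing", Consequence.PRODUCTION, True, True),
    ]
    for fm in prioritize(modes):
        print(f"{fm.name}: {select_strategy(fm)}")
```

A real analysis would also check that a run-to-failure choice does not cause collateral damage, which this sketch leaves out.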
Let me give you a few examples of where time-based (cycles, hours, days, etc.) preventive maintenance is the better solution based on the equipment, the failure modes and the consequences of failure.
Take a correctional facility and the lamps in the overhead lighting fixtures as the first example. In most organizations, we would choose to run these to failure, correct? In a correctional facility, however, the cell block has to be cleared of inmates before the maintenance worker can access the lamps. Depending on the security level of the cell block, it may be more cost-effective to re-lamp all the fixtures at, say, a 10,000-hour interval.
Next, let’s take a contact lens manufacturer. The machines that form, shape and package the contact lenses use small components like cylinders and small servo drives in tight locations that don’t lend themselves to many condition-monitoring approaches. Past best practice has been to change out some of these components based on the number of strokes or cycles.
Now, let’s take a 165-valve bottle filler of the kind used across the beverage industry. Organizations with this equipment typically run many shifts during their peak season. The filler valves typically don’t lend themselves to condition monitoring, and random failures with partial replacements during the shifts are cost-prohibitive due to downtime losses. Most choose to overhaul these fillers on time-based intervals to lessen the risk of failure during the runs.
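To make the re-lamping example concrete, here is a rough cost comparison in Python. Every figure except the 10,000-hour interval is hypothetical (lamp count, lamp life, clearance cost, lamp and labor cost), and the model is deliberately crude: it assumes that each run-to-failure replacement requires its own cell block clearance and that no lamps fail before the group re-lamp.

```python
def run_to_failure_cost_per_year(
    lamps: int,
    hours_per_year: float,
    mean_lamp_life_hours: float,
    clearance_cost: float,
    lamp_cost: float,
    labor_per_lamp: float,
) -> float:
    """Each failure is handled on its own, so each one pays for a clearance."""
    failures_per_year = lamps * hours_per_year / mean_lamp_life_hours
    return failures_per_year * (clearance_cost + lamp_cost + labor_per_lamp)


def group_relamp_cost_per_year(
    lamps: int,
    hours_per_year: float,
    relamp_interval_hours: float,
    clearance_cost: float,
    lamp_cost: float,
    labor_per_lamp: float,
) -> float:
    """One clearance per interval covers every fixture in the block."""
    relamps_per_year = hours_per_year / relamp_interval_hours
    cost_per_relamp = clearance_cost + lamps * (lamp_cost + labor_per_lamp)
    return relamps_per_year * cost_per_relamp


if __name__ == "__main__":
    # Hypothetical inputs for one cell block with continuous lighting.
    common = dict(lamps=200, hours_per_year=8760.0,
                  clearance_cost=500.0, lamp_cost=5.0, labor_per_lamp=10.0)
    rtf = run_to_failure_cost_per_year(mean_lamp_life_hours=15000.0, **common)
    grp = group_relamp_cost_per_year(relamp_interval_hours=10000.0, **common)
    print(f"run to failure: {rtf:,.0f} per year")
    print(f"group re-lamp:  {grp:,.0f} per year")
```

Under these assumptions the clearance cost dominates, which is why the answer depends on the security level of the cell block. With a clearance cost near zero, as in an ordinary office, the same arithmetic favors running the lamps to failure.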
So, in summary, while condition-based monitoring is often the more cost-effective solution, the risk and the failure modes must determine the maintenance strategy that you apply to your equipment. Use tools like the RCM Logic Tree to help you determine that strategy.