Going Beyond Condition Monitoring
By accurately diagnosing fault conditions at an early stage, the risk of failure can be reduced, and the cost of corrective action can be minimized. However, in the majority of cases, condition monitoring techniques are used to detect fault conditions that should not exist – they arise due to poor design, procurement, storage, work management, installation, maintenance, and operating practices.
Condition monitoring is a powerful technique used to detect incipient failures in rotating machinery and other plant assets. To set up an effective condition monitoring programme it is necessary to understand why rotating machinery fails, and the criticality of those assets.
Only then can we decide which monitoring technologies should be employed, where we can justify utilizing those technologies, and how often the tests should be performed. But we should also use this information to make a plan to eliminate the root causes of those failures.
We can proactively employ techniques such as Reliability Centered Maintenance (RCM) and Failure Mode and Effects Analysis (FMEA) to study the possible failure modes. We can use Root Cause Failure Analysis (RCFA) to determine why equipment failed (and to avoid repeat failures). Or we can utilize industry knowledge to eliminate the common causes of failure. We will start with this third approach.
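As a rough illustration of how an FMEA can be used to rank failure modes, the sketch below computes the commonly used Risk Priority Number (severity x occurrence x detection) for a few failure modes. The class, the function name, the 1 to 10 rating scales, and every rating shown are assumptions made for this example; they are not taken from any particular plant or standard.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One row of a simplified FMEA worksheet (illustrative fields only)."""
    description: str
    severity: int    # 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # 1 (rare) .. 10 (frequent)
    detection: int   # 1 (almost certain to be detected) .. 10 (undetectable)

    @property
    def rpn(self) -> int:
        # Risk Priority Number: the product of the three ratings
        return self.severity * self.occurrence * self.detection


def rank_failure_modes(modes):
    """Return the failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical ratings for a pump, invented purely for illustration
pump_modes = [
    FailureMode("Bearing failure from misalignment", 7, 6, 3),
    FailureMode("Seal leak from lubricant contamination", 5, 4, 4),
    FailureMode("Impeller damage from cavitation", 8, 3, 5),
]

for mode in rank_failure_modes(pump_modes):
    print(f"{mode.rpn:4d}  {mode.description}")
```

The ranking is only as good as the ratings behind it, so in practice the scores would be agreed by a team that includes maintenance and operations staff.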
Common solutions to common problems
Many industrial organizations suffer from the same problems. If these sources of poor reliability are recognized, and proactive action is taken, it is possible to drastically improve reliability before a detailed study of reliability and failure modes (via RCM/FMEA/RCFA) is performed.
The design phase rarely involves people experienced in maintenance, condition monitoring and production, and thus equipment is not designed with maintainability, operability, or reliability in mind. The procurement process has similar flaws: purchase price is prioritized over total cost of ownership, and service providers, such as motor rewind and balance shops, are selected without due consideration of quality of service.
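The difference between purchase price and total cost of ownership can be made concrete with a simple calculation. The sketch below is a deliberately simplified, undiscounted model; the function name, the cost categories, and all of the figures are hypothetical and serve only to show how a cheaper machine can become the more expensive choice over its life.

```python
def total_cost_of_ownership(purchase_price, annual_operating_cost,
                            annual_maintenance_cost,
                            expected_failures_per_year, cost_per_failure,
                            years):
    """Undiscounted lifetime cost: purchase plus recurring and failure costs."""
    annual = (annual_operating_cost + annual_maintenance_cost
              + expected_failures_per_year * cost_per_failure)
    return purchase_price + annual * years


# Hypothetical quotes for two motors, invented purely for illustration
cheap = total_cost_of_ownership(10_000, 4_000, 1_500, 0.5, 20_000, years=10)
better = total_cost_of_ownership(14_000, 3_800, 1_000, 0.1, 20_000, years=10)

print(f"Lower purchase price:  {cheap:,.0f}")
print(f"Higher purchase price: {better:,.0f}")
```

A fuller model would discount future costs and include the cost of lost production, but even this simple form shows why the failure rate matters far more than the initial price difference.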
The result is that the equipment imported into the plant will have inherent reliability problems that will frustrate maintenance and operations for the life of the equipment, costing the company dearly. A part of the solution is to utilize acceptance testing to reject equipment that does not meet reasonable standards of quality.
Once the asset is under our control we have to scrutinize the spares management, planning and scheduling, installation, and operating practices to ensure that we do not introduce additional defects. Spares can degrade while sitting in stores.
Poor planning and scheduling that result in rushed jobs performed with the wrong tools will introduce additional defects. And poor installation practices can damage bearings and gears and leave machines misaligned, loose, out of balance, or operating in resonance, all of which will lead to additional problems.
We must also maintain the equipment: keep it clean, adjusted, and running smoothly with the correct, contaminant-free lubricant and minimal misalignment and unbalance. In each of these areas we can learn from industry experience and proactively deal with these root causes of poor reliability.
Solving the plant’s unique reliability problems
If all of the above issues are addressed, the plant will achieve significantly improved financial results and there will be fewer safety and environmental incidents. However, there will also be issues that are unique to your plant. It is recommended that you take a two-pronged approach to identify and resolve these problems.
Performing a plant walk-through, and inviting mechanics and operators to point out the common problems experienced on the plant-floor will achieve two goals:
1. By listening to the mechanics and operators you will learn about problems that an RCM team may never identify – and you will do it quickly and effectively.
2. Taking action on the identified problems will generate goodwill between the reliability group and the plant-floor staff. Additional suggestions will be forthcoming. This process will accelerate the culture change process.
There will be equipment that demonstrates poor reliability where it will be more difficult to identify the root cause. This is where an RCM or FMEA study is recommended and, when a failure does occur, a root cause failure analysis (RCFA). You may need to involve consultants, the OEM, or people from a sister plant to get to the bottom of the problems. It may be necessary to replace equipment, redesign a process, or install monitoring and/or control systems.
Either way, these processes can be performed at the same time that you are working to deal with the common sources of defects described earlier.
Utilizing condition monitoring skills to improve reliability
Condition monitoring will always be required to provide an early warning of fault conditions – but the condition monitoring group can do more to improve reliability. As described earlier, it is important that acceptance testing (QA/QC) is performed on new and overhauled equipment. The condition monitoring group can help to define the standard and conduct the tests.
A form of acceptance testing can be performed when new, repaired or overhauled equipment is installed. Vibration and other checks should be performed to ensure that it is fit to provide a long, reliable life.
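A minimal sketch of such an installation check is shown below: it computes the overall RMS vibration velocity from a set of samples and compares it with an acceptance limit. The function names are hypothetical, the signal is synthetic, and the limit of 1.8 mm/s RMS is a placeholder rather than a value from any standard; a real programme would take its limits from the plant's own acceptance specification.

```python
import math


def rms(samples):
    """Root-mean-square of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def acceptance_check(velocity_samples_mm_s, limit_mm_s_rms):
    """Return (passed, overall RMS) for one measurement point."""
    overall = rms(velocity_samples_mm_s)
    return overall <= limit_mm_s_rms, overall


# Synthetic 25 Hz sine, 2.0 mm/s peak amplitude, sampled at 1 kHz for 1 s
samples = [2.0 * math.sin(2 * math.pi * 25 * n / 1000) for n in range(1000)]

# ASSUMPTION: 1.8 mm/s RMS is a placeholder limit, not taken from a standard
passed, overall = acceptance_check(samples, limit_mm_s_rms=1.8)
print(f"overall = {overall:.2f} mm/s RMS, passed = {passed}")
```

An overall level is only a first screen. Spectral checks for unbalance, misalignment, looseness and resonance would normally form part of the same acceptance test.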
Too many vibration analysis programmes focus on the detection of bearing defects and pay less attention to the prevention of bearing defects. Conditions such as unbalance, misalignment, bent shaft, run-out, looseness, resonance, soft foot, cavitation, cocked bearing and others will result in excessive load and reduced life. In many condition monitoring programmes, if these conditions are detected, they are not reported until the condition appears to be severe.
The same is true for a wide range of fault conditions detected with other technologies: lubricant contamination, under-lubricated bearings, over-lubricated bearings, electrical supply unbalance, poor performance characteristics, and others.
The fact is that all of these conditions result in reduced life. All of the rotating components, especially the bearings, will develop faults far more quickly when any of these conditions exist. Therefore, although the vibration amplitude may not indicate that the unbalance is severe, it must be understood that the life of the bearings will be reduced.
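The sensitivity of bearing life to load can be illustrated with the basic rating life relationship L10 = (C/P)^p, where C is the dynamic load rating, P is the equivalent load, and p is 3 for ball bearings and 10/3 for roller bearings. The sketch below shows how the calculated life shrinks as the load grows. The function name and the load increases are assumptions for illustration, and the step from a measured vibration amplitude to an actual increase in bearing load is not modelled here.

```python
def relative_bearing_life(load_factor, exponent=3.0):
    """Calculated life relative to baseline when load is scaled by load_factor.

    Follows from the basic rating life L10 = (C / P) ** p, with p = 3 for
    ball bearings and p = 10/3 for roller bearings.
    """
    if load_factor <= 0:
        raise ValueError("load_factor must be positive")
    return load_factor ** (-exponent)


# Hypothetical load increases, e.g. from residual unbalance or misalignment
for extra in (0.10, 0.20, 0.50):
    life = relative_bearing_life(1.0 + extra)
    print(f"+{extra:.0%} load -> {life:.0%} of baseline L10 life")
```

With the ball-bearing exponent, a 20% increase in load reduces the calculated life to roughly 58% of its baseline value, which is why a seemingly modest fault still deserves attention.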
The condition monitoring team holds important evidence in its database that can help to explain why a machine failed. The analyst may see signs of unbalance, misalignment or some other condition that was the root cause of the failure.
Improving reliability – the missing ingredients
More needs to be said regarding the implementation of a reliability improvement programme. Everything discussed in this article is common sense, and it has been tried in many plants. However, a large percentage of reliability improvement initiatives do not succeed in the long term. Some progress may have been made, but many programmes either start and then peter out, or they make more substantial progress that proves to be unsustainable.
Several important steps are often missed in these programmes:
- They do not have commitment from senior management. Leadership from the top is essential.
- The plant does not have a clear understanding of asset criticality, and it does not have a maintenance and reliability strategy. In the author’s opinion it is often not necessary to perform a full classical RCM analysis on every asset; instead, a more streamlined process should be undertaken in order to form that strategy.
- They do not take culture change issues into account. Any plant can change if small, strategic steps are taken. Larger steps will be resisted.
- They do not train everyone. Everyone within the plant contributes to the reliability problem, therefore everyone needs to receive training, from basic awareness training to detailed skill-building training. People who do not receive training will feel left out and will act as anchors on the programme.
Condition monitoring provides a great service to an organization, reducing unexpected breakdowns and thus reducing maintenance costs, downtime, safety incidents and environmental incidents. But the condition monitoring group should also work to improve reliability by assisting in the acceptance testing process, identifying conditions that will lead to reduced reliability, and assisting in the root cause failure analysis process when equipment does eventually fail.
All of this work should, however, be part of a properly planned and orchestrated reliability improvement programme that involves defect elimination and process optimization.