XPM: eXplainable Predictive Maintenance
The XPM project aims to integrate explanations into Artificial Intelligence (AI) solutions within the area of Predictive Maintenance (PM).
In the XPM project, we will develop several different types of explanations (anything from visual analytics through prototypical examples to deductive argumentative systems) and demonstrate their usefulness in four selected case studies: electric vehicles, metro trains, a steel plant and wind farms. In each of them, we will demonstrate how the right explanations of decisions made by AI systems lead to better results across several dimensions, including identifying the component or part of the process where the problem has occurred; understanding the severity and future consequences of detected deviations; choosing the optimal repair and maintenance plan from several alternatives created based on different priorities; and understanding the reasons why the problem has occurred in the first place as a way to improve system design for the future.
Real-world applications of predictive maintenance (PM) are increasingly complex, with intricate interactions among many components. AI solutions are very popular in this domain, and black-box models based on deep learning in particular are showing very promising results in terms of predictive accuracy and the capability to model complex systems.
However, the decisions made by these black-box models are often difficult for human experts to understand – and therefore to act upon. The complete repair plan and maintenance actions that must be performed based on the detected symptoms of damage and wear often require complex reasoning and planning processes, involving many actors and balancing different priorities. It is not realistic to expect this complete solution to be created automatically – there is too much context that needs to be taken into account. Therefore, operators, technicians and managers require insights to understand what is happening, why it is happening, and how to react. Today’s mostly black-box AI does not provide these insights, nor does it support experts in making maintenance decisions based on the deviations it detects. The effectiveness of the PM system depends much less on the accuracy of the alarms the AI raises than on the relevancy of the actions operators perform based on these alarms.
The XPM project focuses on creating methods that can explain the operation of AI systems within the PM domain. What makes creating maintenance plans challenging is incorporating the output of the PM AI into the human decision-making process and integrating it with human expertise. Making AI useful and trustworthy means putting fault predictions in a relevant context and making them understandable for the humans involved. It requires explanations that are adapted to the roles and needs of different actors: for example, connecting predictions to the blueprints of the technical installation for lower-level engineers, while using other means for managers who evaluate the costs of system downtime or for company lawyers who assess possible liability in the case of a safety-threatening failure.
The underlying theme of explainable AI (XAI) is building trust in AI systems, but in order to maximise the real usefulness of PM, we need to go one step further. In particular, the primary novelty of the XPM project is developing technological solutions that explicitly address four concrete reasons why explanations are needed. These four rationales are where explanations of AI decisions will lead to the most significant improvement in the repair and maintenance actions taken by human experts. The first rationale (R1) is identifying (isolating and characterizing) the fault. In complex industrial systems, AI is often able to identify deviations from normal behaviour, but due to the huge number of components with complex interactions, pinpointing the actual spot requires experts and their domain knowledge. The second rationale (R2) is understanding fault consequences. The right maintenance depends on how the issue will evolve, what the Remaining Useful Life (RUL) is, and what collateral damage or loss of productivity may follow. These questions require considering the broader context, beyond the scope of the AI system. The third rationale (R3) is supporting domain experts by helping human operators create the right repair plan, including optimising system performance (e.g. uptime or safety) in the presence of degradation. The plan also needs to strike the right trade-off between different quality criteria. Finally, the fourth rationale (R4) is understanding the reasons why the fault has occurred and how to improve the system in the future. The cause can be related to incorrect usage, suboptimal design, or the monitoring process itself. Examples include which optimisation of sensor types and placement will allow for earlier detection in the future, or which changes to the manufacturing process parameters will extend the lifetime of certain critical components.
- March 1, 2021, to February 29, 2023
- CHIST-ERA (EU) project through the Swedish Research Council
- Inesc Tec, Portugal
- Jagiellonian University, Poland
- IMT Lille-Douai, France
Project team at Halmstad University:
The project belongs to the Technology Area Aware Intelligent Systems (AIS) at the Department of Intelligent Systems and Digital Design (ISDD) at the School of Information Technology (ITE). All research projects at ITE are within the research environment Embedded and Intelligent Systems (EIS).