Human-machine collaboration improves anomaly detection
When a deviation is found in a data set, it can indicate that something is wrong. It can, for example, be a machine part that does not work, a computer intrusion or a first sign of illness. By combining human knowledge and machine learning, anomaly detection will become more efficient and accurate. Ece Calikus has recently published her PhD thesis on the subject.
Hi there, Ece Calikus! A year ago, before your licentiate defence, we wrote an article about your research and how joint human-machine learning can improve district heating. What has happened since then?
“I have continued my research in intelligent and self-monitoring systems that use artificial intelligence combined with human knowledge. In March, I presented my doctoral thesis, and my research questions and challenges have been consistent with my licentiate thesis. However, my focus is now centred around ‘user-centric anomaly detection’.”
Can you tell us a little more about your research?
“In data science, anomaly detection is the identification of rare events and observations in a data set that differ significantly from the majority of the data. User-centric anomaly detection plays a key role in making data-driven anomaly detection approaches more effective and practical in real-world applications. My research presents a comprehensive approach that enables human-machine collaboration, in which the two parts learn from each other. This can significantly improve anomaly detection performance and its practical use in a specific application domain, for example, district heating.”
“In our research, we focus on designing and evaluating algorithms for user-centric anomaly detection in which people investigate, interpret and learn from the detectors' results, and then themselves provide domain knowledge or feedback to the system.”
Where can anomalies be found?
“Anomalies can be found in different areas and situations that affect our daily lives. They can, for example, be intrusions in data systems, financial fraud, faults or breakdowns in production units, or diseases and conditions in medical diagnostics.”
Can anomaly detection be problematic?
“Not all types of abnormal observations are equally interesting to the end-user. For example, abnormally high temperatures can occasionally be recorded in a domestic hot water heat-pump system because the pipes are being disinfected to kill Legionella bacteria. In that case, the deviation is normal and not very interesting for an analyst looking for actual faults in the hot water system, such as compressor faults in the pumps.”
“The large gap between detected anomalous behaviours and ‘anomalies of interest’ can produce many false alarms and easily render anomaly detection unusable in practice. Human domain knowledge plays an essential role in bridging this gap. For example, an analyst might give clues for creating features that are more likely to indicate interesting anomalies, or provide feedback to separate them from irrelevant ones.”
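As a rough illustration of how analyst feedback could separate interesting anomalies from irrelevant ones, the sketch below keeps an alarm only if the most similar alarm the analyst has already labelled was marked as interesting. This is a simple nearest-neighbour rule chosen for illustration, not the method used in the thesis. The two features (peak temperature and duration), the label names and all numbers are hypothetical.

```python
from math import dist

def filter_alarms(alarms, feedback):
    """Keep alarms whose nearest analyst-labelled example is 'interesting'.

    alarms:   list of feature tuples (peak temperature in °C, duration in hours)
    feedback: list of (feature tuple, label) pairs provided by an analyst
    """
    kept = []
    for alarm in alarms:
        # Find the labelled example closest to this alarm (Euclidean distance).
        _, nearest_label = min(feedback, key=lambda fb: dist(fb[0], alarm))
        if nearest_label == "interesting":
            kept.append(alarm)
    return kept

# Hypothetical analyst feedback: short hot spikes were disinfection cycles,
# a long-lasting deviation turned out to be a real fault.
feedback = [
    ((70.0, 0.5), "irrelevant"),
    ((71.0, 0.6), "irrelevant"),
    ((68.0, 6.0), "interesting"),
]

# New alarms raised by a detector (made-up values).
alarms = [(69.5, 0.4), (67.0, 5.5), (70.5, 0.7)]

print(filter_alarms(alarms, feedback))
```

In this toy data, only the long-lasting alarm is kept, while the two short spikes that resemble disinfection cycles are suppressed. A real system would need properly scaled features and far more feedback than three labelled examples.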
It sounds like humans and machines must work together to create more accurate and efficient ways to detect deviations. Can you tell us more about this?
“After the anomalies in the data are identified, human experts typically investigate them for root cause analysis, troubleshooting, or action planning. As shown in the previous example, one cannot automatically schedule a repair without knowing that the anomaly is caused by a compressor failure rather than by the disinfection of the pipes. Interpretability of the detected outliers, which provides reasons for abnormal behaviours, can significantly reduce the effort of such manual inspections.”
You have also focused on contextual anomaly detection. What is that and how can it be helpful?
“With contextual anomaly detection, we want to identify abnormal objects that might be disguised as normal within specific contexts. For example, high energy consumption in a heating system during summer is abnormal, while the same consumption level can be totally normal in winter. We try to provide context-based explanations of anomalies that can explain what makes an object stand out as deviating. Such explanations can help characterize and interpret different types of anomalies and normal groups.”
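The summer and winter example can be sketched in a few lines of code. The sketch below compares a global check, which scores every reading against the whole data set, with a contextual check, which scores each reading only against readings from the same season. Using the season as the context, a z-score as the anomaly score and a threshold of 2.0 are assumptions made for illustration, as are all the consumption values. This is not the algorithm from the thesis.

```python
from collections import defaultdict
from statistics import mean, stdev

def zscores(values):
    """Standard scores of the values relative to their own mean and spread."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]

def global_anomalies(readings, threshold=2.0):
    """Indices of readings that deviate from ALL readings taken together."""
    scores = zscores([value for _, value in readings])
    return [i for i, s in enumerate(scores) if abs(s) > threshold]

def contextual_anomalies(readings, threshold=2.0):
    """Indices of readings that deviate within their own context (season)."""
    by_context = defaultdict(list)
    for i, (context, value) in enumerate(readings):
        by_context[context].append((i, value))
    flagged = []
    for items in by_context.values():
        scores = zscores([value for _, value in items])
        flagged.extend(i for (i, _), s in zip(items, scores) if abs(s) > threshold)
    return sorted(flagged)

# Made-up energy consumption readings: (season, consumption).
winter = [100, 105, 98, 102, 101, 99, 103, 97]
summer = [20, 22, 19, 21, 18, 20, 21, 100]  # the last value is winter-like
readings = [("winter", v) for v in winter] + [("summer", v) for v in summer]

print(global_anomalies(readings))      # global view
print(contextual_anomalies(readings))  # per-season view
```

With these numbers, the global check flags nothing, because a consumption of 100 is common in the data set as a whole. The contextual check flags the last summer reading, and the season it was compared against doubles as a simple explanation of why it stands out.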
“With anomaly detectors, we want to effectively separate rare and unusual observations from the majority. However, the rare data instances reported as anomalies may cause discrimination against minority groups in the data. For example, surveillance applications designed to detect criminal activities can be racially biased if the detection relies heavily on people's appearance. Additional information on what makes certain behaviour stand out enables us to discover biased decisions of the algorithms and improve algorithmic fairness.”
How does your research contribute to the development of society?
“User-centric anomaly detection helps us to distinguish anomalies better. Furthermore, improving the interpretability of detection results allows end-users to validate the algorithm's performance and facilitates trust in the anomaly detection system. This is especially important for data sets that include sensitive features such as sex, ethnicity or age.”
Text: Anna-Frida Agardson
Photos: iStock and Dan Bergmark
About the thesis
The doctoral defence took place at Halmstad University on March 22, 2022.
Chairman of the defence:
Professor Urban Persson, Halmstad University
Associate Professor Evgeny Burnaev, Skoltech, Moscow, Russia
Associate Professor Indrė Žliobaitė, University of Helsinki, Finland
Professor Jesse Davis, KU Leuven, Belgium
Professor Jesse Read, École Polytechnique, France
Professor Slawomir Nowaczyk, Halmstad University
“My major collaborations throughout my PhD involved two district heating companies, HEM and Öresundskraft. I have also worked with Dr. Henrik Gadd and Prof. Sven Werner. I am very grateful to them for sharing their knowledge and expertise in district heating and helping me see things from the perspective of the district heating domain”, says Ece Calikus.