Anomaly detection

{{short description|Approach in data analysis}}

{{broader|Outlier}}

{{Machine learning|Anomaly detection}}

In data analysis, anomaly detection (also referred to as outlier detection and sometimes as novelty detection) is generally understood to be the identification of rare items, events or observations which deviate significantly from the majority of the data and do not conform to a well-defined notion of normal behavior. Such examples may arouse suspicions of being generated by a different mechanism,{{cite book |last=Hawkins |first=Douglas M. |title=Identification of Outliers |publisher=Springer |date=1980 |isbn=978-0-412-21900-9 |oclc=6912274 }} or appear inconsistent with the remainder of that set of data.{{cite book |last1=Barnett |first1=Vic |last2=Lewis |first2=Toby |date=1978 |title=Outliers in statistical data |publisher=Wiley |isbn=978-0-471-99599-9 |oclc=1150938591 }}

Anomaly detection finds application in many domains including cybersecurity, medicine, machine vision, statistics, neuroscience, law enforcement and financial fraud, to name only a few. Anomalies were initially sought for rejection or omission from the data to aid statistical analysis, for example when computing the mean or standard deviation. They were also removed to improve the predictions of models such as linear regression, and more recently their removal has been used to improve the performance of machine learning algorithms. In many applications, however, the anomalies themselves are the observations of greatest interest in the entire data set, and must be identified and separated from noise or irrelevant outliers.

Three broad categories of anomaly detection techniques exist. Supervised anomaly detection techniques require a data set in which instances have been labeled as "normal" or "abnormal", and involve training a classifier. This approach is rarely used in anomaly detection, however, due to the general unavailability of labelled data and the inherently imbalanced nature of the classes. Semi-supervised anomaly detection techniques assume that some portion of the data is labelled. This may be any combination of normal or anomalous data, but most often the techniques construct a model representing normal behavior from a given normal training data set, and then test the likelihood that a test instance was generated by the model. Unsupervised anomaly detection techniques assume the data is unlabelled, and are by far the most commonly used due to their wider applicability.
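The semi-supervised setting described above can be sketched with a simple parametric model. The univariate Gaussian choice and the numbers below are illustrative assumptions, not part of any cited method:

```python
import numpy as np

def fit_normal_model(normal_data):
    """Fit a univariate Gaussian to data labelled as normal."""
    normal_data = np.asarray(normal_data, dtype=float)
    return normal_data.mean(), normal_data.std()

def log_likelihood(x, mean, std):
    """Log-likelihood of a test instance under the fitted normal model."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))

# Train on labelled-normal data only, then score test instances;
# instances far from normal behaviour receive a much lower likelihood.
mean, std = fit_normal_model([9.9, 10.0, 10.1, 10.2, 9.8])
```

In practice the likelihood is compared against a threshold chosen on held-out data; anything below it is declared anomalous.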

Definition

Many attempts have been made in the statistical and computer science communities to define an anomaly. The most prevalent ones include the following, and can be categorised into three groups: those that are ambiguous, those that are specific to a method with pre-defined thresholds usually chosen empirically, and those that are formally defined:

= Ill-defined =

  • An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism.
  • Anomalies are instances or collections of data that occur very rarely in the data set and whose features differ significantly from most of the data.
  • An outlier is an observation (or subset of observations) which appears to be inconsistent with the remainder of that set of data.
  • An anomaly is a point or collection of points that is relatively distant from other points in multi-dimensional space of features.
  • Anomalies are patterns in data that do not conform to a well-defined notion of normal behaviour.

= Specific =

  • Let T be observations from a univariate Gaussian distribution and O a point from T. Then the z-score for O is greater than a pre-selected threshold if and only if O is an outlier.
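The specific definition above can be sketched directly. The threshold of 2 below is an arbitrary illustrative choice (thresholds are, as the definition says, pre-selected, often empirically):

```python
import numpy as np

def zscore_outliers(data, threshold=2.0):
    """Flag observations whose z-score exceeds a pre-selected threshold."""
    data = np.asarray(data, dtype=float)
    z = np.abs(data - data.mean()) / data.std()
    return z > threshold

# Only the extreme final value exceeds the threshold here.
flags = zscore_outliers([1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 9.5])
```

Note that extreme values inflate the estimated standard deviation, so a single gross outlier can partially mask itself; robust estimates (median, MAD) mitigate this.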

History

= Intrusion detection =

The concept of intrusion detection, a critical component of anomaly detection, has evolved significantly over time. Initially, it was a manual process where system administrators would monitor for unusual activities, such as a vacationing user's account being accessed or unexpected printer activity. This approach was not scalable and was soon superseded by the analysis of audit logs and system logs for signs of malicious behavior.{{Cite journal |last1=Kemmerer |first1=R.A. |last2=Vigna |first2=G. |date=April 2002 |title=Intrusion detection: a brief history and overview |url=http://dx.doi.org/10.1109/mc.2002.1012428 |journal=Computer |volume=35 |issue=4 |pages=supl27–supl30 |doi=10.1109/mc.2002.1012428 |issn=0018-9162|url-access=subscription }}

By the late 1970s and early 1980s, the analysis of these logs was primarily used retrospectively to investigate incidents, as the volume of data made it impractical for real-time monitoring. The affordability of digital storage eventually led to audit logs being analyzed online, with specialized programs being developed to sift through the data. These programs, however, were typically run during off-peak hours due to their computational intensity.

The 1990s brought the advent of real-time intrusion detection systems capable of analyzing audit data as it was generated, allowing for immediate detection of and response to attacks. This marked a significant shift towards proactive intrusion detection.

As the field has continued to develop, the focus has shifted to creating solutions that can be efficiently implemented across large and complex network environments, adapting to the ever-growing variety of security threats and the dynamic nature of modern computing infrastructures.

Applications

Anomaly detection is applicable in a very large number and variety of domains, and is an important subarea of unsupervised machine learning. As such it has applications in cyber-security, intrusion detection, fraud detection, fault detection, system health monitoring, event detection in sensor networks, detecting ecosystem disturbances, defect detection in images using machine vision, medical diagnosis and law enforcement.{{cite book |last= Aggarwal |first= Charu |author-link= |date=2017 |title=Outlier Analysis |url= |location= |publisher=Springer Publishing Company, Incorporated |page= |isbn= 978-3319475776}}

=Intrusion detection=

Anomaly detection was proposed for intrusion detection systems (IDS) by Dorothy Denning in 1986.{{cite journal | last1 = Denning | first1 = D. E. | author-link1 = Dorothy E. Denning| doi = 10.1109/TSE.1987.232894 | title = An Intrusion-Detection Model | journal = IEEE Transactions on Software Engineering| issue = 2 | pages = 222–232 | year = 1987 | url = http://apps.dtic.mil/dtic/tr/fulltext/u2/a484998.pdf| archive-url = https://web.archive.org/web/20150622044937/http://www.dtic.mil/dtic/tr/fulltext/u2/a484998.pdf| url-status = live| archive-date = June 22, 2015| citeseerx=10.1.1.102.5127 | volume=SE-13| s2cid = 10028835 }} Anomaly detection for IDS is normally accomplished with thresholds and statistics, but can also be done with soft computing and inductive learning.{{cite book | last1 = Teng | first1 = H. S. | last2 = Chen | first2 = K. | last3 = Lu | first3 = S. C. | title = Proceedings. 1990 IEEE Computer Society Symposium on Research in Security and Privacy | chapter = Adaptive real-time anomaly detection using inductively generated sequential patterns | doi = 10.1109/RISP.1990.63857 | pages = 278–284| year = 1990 | isbn = 978-0-8186-2060-7 | s2cid = 35632142 | url = http://www.cs.unc.edu/~jeffay/courses/nidsS05/ai/Teng-AdaptiveRTAnomaly-SnP90.pdf}} Types of features proposed by 1999 included profiles of users, workstations, networks, remote hosts, groups of users, and programs based on frequencies, means, variances, covariances, and standard deviations.{{cite journal | last1 = Jones | first1 = Anita K. | last2 = Sielken | first2 = Robert S. | title = Computer System Intrusion Detection: A Survey | journal=Computer Science Technical Report |date=2000 |pages=1–25 |publisher=Department of Computer Science, University of Virginia |url=https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=349277a67468e7f6a5bfc487ab125887c6925229 }} The counterpart of anomaly detection in intrusion detection is misuse detection.

= Fintech fraud detection =

Anomaly detection is vital in fintech for fraud prevention.{{Cite journal |last1=Stojanović |first1=Branka |last2=Božić |first2=Josip |last3=Hofer-Schmitz |first3=Katharina |last4=Nahrgang |first4=Kai |last5=Weber |first5=Andreas |last6=Badii |first6=Atta |last7=Sundaram |first7=Maheshkumar |last8=Jordan |first8=Elliot |last9=Runevic |first9=Joel |date=January 2021 |title=Follow the Trail: Machine Learning for Fraud Detection in Fintech Applications |journal=Sensors |language=en |volume=21 |issue=5 |pages=1594 |doi=10.3390/s21051594 |pmid=33668773 |pmc=7956727 |bibcode=2021Senso..21.1594S |issn=1424-8220 |doi-access=free }}{{Cite journal |last1=Ahmed |first1=Mohiuddin |last2=Mahmood |first2=Abdun Naser |last3=Islam |first3=Md. Rafiqul |date=February 2016 |title=A survey of anomaly detection techniques in financial domain |url=http://dx.doi.org/10.1016/j.future.2015.01.001 |journal=Future Generation Computer Systems |volume=55 |pages=278–288 |doi=10.1016/j.future.2015.01.001 |s2cid=204982937 |issn=0167-739X|url-access=subscription }}

= Preprocessing =

Preprocessing data to remove anomalies can be an important step in data analysis, and is done for a number of reasons. Statistics such as the mean and standard deviation are more accurate after the removal of anomalies, and the visualisation of data can also be improved. In supervised learning, removing the anomalous data from the dataset often results in a statistically significant increase in accuracy.{{cite journal | doi = 10.1109/TSMC.1976.4309523 | first = Ivan | last = Tomek| title = An Experiment with the Edited Nearest-Neighbor Rule | journal = IEEE Transactions on Systems, Man, and Cybernetics| volume = 6 | issue = 6 | pages = 448–452 | year = 1976 }}{{cite book | last1 = Smith | first1 = M. R. | last2 = Martinez | first2 = T. | doi = 10.1109/IJCNN.2011.6033571 | chapter = Improving classification accuracy by identifying and removing instances that should be misclassified | title = The 2011 International Joint Conference on Neural Networks | pages = 2690 | year = 2011 | isbn = 978-1-4244-9635-8 | chapter-url = http://axon.cs.byu.edu/papers/smith.ijcnn2011.pdf| citeseerx = 10.1.1.221.1371 | s2cid = 5809822 }}
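The effect on summary statistics can be illustrated with hypothetical numbers; the median/MAD screen below is just one simple, robust way to flag the anomaly, and the factor 10 is an assumed cutoff:

```python
import numpy as np

# Hypothetical sensor readings with one gross anomaly
readings = np.array([10.2, 9.8, 10.1, 9.9, 10.0, 55.0])

mean_before, std_before = readings.mean(), readings.std()

# Screen out points far from the median, measured in units of the
# median absolute deviation (MAD), which the anomaly cannot inflate.
median = np.median(readings)
mad = np.median(np.abs(readings - median))
clean = readings[np.abs(readings - median) < 10 * mad]

mean_after, std_after = clean.mean(), clean.std()
```

Here the single anomalous reading drags the mean from about 10 up to 17.5 and inflates the standard deviation by two orders of magnitude; after removal, both statistics describe the bulk of the data.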

= Video surveillance =

Anomaly detection has become increasingly vital in video surveillance to enhance security and safety.{{Cite journal |date=2023-06-01 |title=Video anomaly detection system using deep convolutional and recurrent models |journal=Results in Engineering |language=en-US |volume=18 |pages=101026 |doi=10.1016/j.rineng.2023.101026 |issn=2590-1230 |last1=Qasim |first1=Maryam |last2=Verdu |first2=Elena |s2cid=257728239 |doi-access=free }}{{Cite book |last1=Zhang |first1=Tan |last2=Chowdhery |first2=Aakanksha |last3=Bahl |first3=Paramvir (Victor) |last4=Jamieson |first4=Kyle |last5=Banerjee |first5=Suman |title=Proceedings of the 21st Annual International Conference on Mobile Computing and Networking |chapter=The Design and Implementation of a Wireless Video Surveillance System |date=2015-09-07 |chapter-url=https://doi.org/10.1145/2789168.2790123 |series=MobiCom '15 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=426–438 |doi=10.1145/2789168.2790123 |isbn=978-1-4503-3619-2|s2cid=12310150 |url=https://discovery.ucl.ac.uk/id/eprint/1506446/ }} With the advent of deep learning technologies, methods using Convolutional Neural Networks (CNNs) and Simple Recurrent Units (SRUs) have shown significant promise in identifying unusual activities or behaviors in video data. These models can process and analyze extensive video feeds in real time, recognizing patterns that deviate from the norm, which may indicate potential security threats or safety violations.
An important aspect for video surveillance is the development of scalable real-time frameworks.{{Cite book |last1=Park |first1=Chaewon |last2=Cho |first2=MyeongAh |last3=Lee |first3=Minhyeok |last4=Lee |first4=Sangyoun |chapter=FastAno: Fast Anomaly Detection via Spatio-temporal Patch Transformation |year=2022 |title=2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) |chapter-url=https://ieeexplore.ieee.org/document/9706649 |publisher=IEEE |pages=1908–1918 |doi=10.1109/WACV51458.2022.00197 |arxiv=2106.08613 |isbn=978-1-6654-0915-5}}{{Cite book |last1=Ristea |first1=Nicolae-Cătălin |last2=Croitoru |first2=Florinel-Alin |last3=Ionescu |first3=Radu Tudor |last4=Popescu |first4=Marius |last5=Khan |first5=Fahad Shahbaz |last6=Shah |first6=Mubarak |chapter=Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors |date=2024-06-16 |title=2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |chapter-url=https://ieeexplore.ieee.org/document/10655393 |publisher=IEEE |pages=15984–15995 |doi=10.1109/CVPR52733.2024.01513 |isbn=979-8-3503-5300-6|arxiv=2306.12041 }} Such pipelines are required for processing multiple video streams with low computational resources.

= IT infrastructure =

In IT infrastructure management, anomaly detection is crucial for ensuring the smooth operation and reliability of services.{{Cite journal |title=Anomaly Detection in Complex Real World Application Systems |url=https://ieeexplore.ieee.org/document/8101009 |access-date=2023-11-08 |journal=IEEE Transactions on Network and Service Management |date=2018 |doi=10.1109/TNSM.2017.2771403 |s2cid=3883483 |language=en-US |last1=Gow |first1=Richard |last2=Rabhi |first2=Fethi A. |last3=Venugopal |first3=Srikumar |volume=15 |pages=83–96 |hdl=1959.4/unsworks_73660 |hdl-access=free }} These are complex systems, composed of many interactive elements and large data quantities, requiring methods to process and reduce this data into a human and machine interpretable format.{{Cite journal |last1 = Herrera|first1 = Manuel |last2 = Proselkov|first2 = Yaniv |last3 = Pérez-Hernández|first3 = Marco |last4 = Kumar Parlikad|first4 = Ajith |title = Mining Graph-Fourier Transform Time Series for Anomaly Detection of Internet Traffic at Core and Metro Networks |journal = IEEE Access |date = January 2021 |volume=9|pages= 8997–9011 |doi=10.1109/ACCESS.2021.3050014 |issn= 2169-3536 |doi-access = free |bibcode = 2021IEEEA...9.8997H }} Techniques like the IT Infrastructure Library (ITIL) and monitoring frameworks are employed to track and manage system performance and user experience. Detected anomalies can help identify and pre-empt potential performance degradations or system failures, thus maintaining productivity and business process effectiveness.

= IoT systems =

Anomaly detection is critical for the security and efficiency of Internet of Things (IoT) systems.{{Cite journal |last1=Chatterjee |first1=Ayan |last2=Ahmed |first2=Bestoun S. |date=August 2022 |title=IoT anomaly detection methods and applications: A survey |journal=Internet of Things |volume=19 |pages=100568 |doi=10.1016/j.iot.2022.100568 |s2cid=250644468 |issn=2542-6605|doi-access=free |arxiv=2207.09092 }} It helps in identifying system failures and security breaches in complex networks of IoT devices. The methods must manage real-time data, diverse device types, and scale effectively. Garg et al.{{Cite journal |last1=Garg |first1=Sahil |last2=Kaur |first2=Kuljeet |last3=Batra |first3=Shalini |last4=Kaddoum |first4=Georges |last5=Kumar |first5=Neeraj |last6=Boukerche |first6=Azzedine |date=2020-03-01 |title=A multi-stage anomaly detection scheme for augmenting the security in IoT-enabled applications |url=https://www.sciencedirect.com/science/article/pii/S0167739X19319703 |journal=Future Generation Computer Systems |volume=104 |pages=105–118 |doi=10.1016/j.future.2019.09.038 |s2cid=204077191 |issn=0167-739X|url-access=subscription }} have introduced a multi-stage anomaly detection framework that improves upon traditional methods by incorporating spatial clustering, density-based clustering, and locality-sensitive hashing. This tailored approach is designed to better handle the vast and varied nature of IoT data, thereby enhancing security and operational reliability in smart infrastructure and industrial IoT systems.

= Petroleum industry =

Anomaly detection is crucial in the petroleum industry for monitoring critical machinery.{{Cite journal |last1=Martí |first1=Luis |last2=Sanchez-Pi |first2=Nayat |last3=Molina |first3=José Manuel |last4=Garcia |first4=Ana Cristina Bicharra |date=February 2015 |title=Anomaly Detection Based on Sensor Data in Petroleum Industry Applications |journal=Sensors |language=en |volume=15 |issue=2 |pages=2774–2797 |doi=10.3390/s150202774 |pmid=25633599 |pmc=4367333 |bibcode=2015Senso..15.2774M |issn=1424-8220 |doi-access=free }} Martí et al. used a novel segmentation algorithm to analyze sensor data for real-time anomaly detection. This approach helps promptly identify and address any irregularities in sensor readings, ensuring the reliability and safety of petroleum operations.

= Oil and gas pipeline monitoring =

In the oil and gas sector, anomaly detection is not just crucial for maintenance and safety, but also for environmental protection.{{Cite journal |last1=Aljameel |first1=Sumayh S. |last2=Alomari |first2=Dorieh M. |last3=Alismail |first3=Shatha |last4=Khawaher |first4=Fatimah |last5=Alkhudhair |first5=Aljawharah A. |last6=Aljubran |first6=Fatimah |last7=Alzannan |first7=Razan M. |date=August 2022 |title=An Anomaly Detection Model for Oil and Gas Pipelines Using Machine Learning |journal=Computation |language=en |volume=10 |issue=8 |pages=138 |doi=10.3390/computation10080138 |issn=2079-3197 |doi-access=free }} Aljameel et al. propose an advanced machine learning-based model for detecting minor leaks in oil and gas pipelines, a task traditional methods may miss.

Methods

Many anomaly detection techniques have been proposed in the literature.{{cite journal|last1=Chandola|first1=V.|last2=Banerjee|first2=A.|last3=Kumar|first3=V.|s2cid=207172599|year=2009|title=Anomaly detection: A survey|journal=ACM Computing Surveys|volume=41|issue=3|pages=1–58|doi=10.1145/1541880.1541882}}{{cite journal|last1=Zimek|first1=Arthur|author-link1=Arthur Zimek|last2=Filzmoser|first2=Peter|title=There and back again: Outlier detection between statistical reasoning and data mining algorithms|journal=Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery|volume=8|issue=6|year=2018|pages=e1280|issn=1942-4787|doi=10.1002/widm.1280|s2cid=53305944|url=https://findresearcher.sdu.dk:8443/ws/files/153197807/There_and_Back_Again.pdf|access-date=2019-12-09|archive-date=2021-11-14|archive-url=https://web.archive.org/web/20211114121638/https://findresearcher.sdu.dk:8443/ws/files/153197807/There_and_Back_Again.pdf|url-status=dead}} The performance of different methods depends strongly on the data set: some are suited to detecting local outliers, others global ones, and when compared across many data sets no method shows a systematic advantage over the others.{{cite journal |last1=Campos |first1=Guilherme O. |last2=Zimek |first2=Arthur |author-link2=Arthur Zimek |last3=Sander |first3=Jörg |last4=Campello |first4=Ricardo J. G. B. |last5=Micenková |first5=Barbora |last6=Schubert |first6=Erich |last7=Assent |first7=Ira |last8=Houle |first8=Michael E. |year=2016 |title=On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study |journal=Data Mining and Knowledge Discovery |volume=30 |issue=4 |pages=891 |doi=10.1007/s10618-015-0444-8 |issn=1384-5810 |s2cid=1952214}}[http://www.dbs.ifi.lmu.de/research/outlier-evaluation/ Anomaly detection benchmark data repository] of the Ludwig-Maximilians-Universität München; [http://lapad-web.icmc.usp.br/repositories/outlier-evaluation/ Mirror] {{Webarchive|url=https://web.archive.org/web/20220331072353/http://lapad-web.icmc.usp.br/repositories/outlier-evaluation/|date=2022-03-31}} at University of São Paulo. Almost all algorithms also require the setting of non-intuitive parameters that are critical for performance and usually unknown before application. Some of the popular techniques are mentioned below, broken down into categories:

= Statistical =

== Parameter-free ==

Also referred to as frequency-based or counting-based, the simplest non-parametric anomaly detection method is to build a histogram from the training data or a set of known normal instances. A test point that does not fall in any of the histogram bins is marked as anomalous; otherwise, it is assigned an anomaly score based on the height of the bin it falls in. The size of the bins is key to the effectiveness of this technique and must be determined by the implementer.
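A minimal sketch of the histogram method, assuming a one-dimensional feature, a fixed bin count, and scores normalised to [0, 1] (all implementation choices, including the simplified right-edge handling, are illustrative):

```python
import numpy as np

def histogram_scores(train, test, bins=10):
    """Score test points by the normalised height of the training-data bin
    they fall in; points outside all bins get the maximal score 1.0.
    Right-edge handling is simplified relative to np.histogram."""
    counts, edges = np.histogram(train, bins=bins)
    heights = counts / counts.max()
    idx = np.searchsorted(edges, test, side="right") - 1
    scores = np.ones(len(test))
    inside = (idx >= 0) & (idx < len(counts))
    scores[inside] = 1.0 - heights[idx[inside]]   # empty/rare bin -> high score
    return scores

# Dense training region around 0, a small secondary mode at 5:
train = np.concatenate([np.zeros(90), np.full(10, 5.0)])
scores = histogram_scores(train, np.array([0.1, 4.9, 10.0]))
```

A point in the dense region scores 0, a point in the rare mode scores high but below 1, and a point outside every bin scores exactly 1.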

A more sophisticated technique uses kernel functions to approximate the distribution of the normal data. Instances in low probability areas of the distribution are then considered anomalies.{{Cite journal |last1=Chandola |first1=Varun |last2=Banerjee |first2=Arindam |last3=Kumar |first3=Vipin |date=2009-07-30 |title=Anomaly detection: A survey |url=https://dl.acm.org/doi/10.1145/1541880.1541882 |journal=ACM Comput. Surv. |volume=41 |issue=3 |pages=15:1–15:58 |doi=10.1145/1541880.1541882 |issn=0360-0300|url-access=subscription }}
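The kernel approach can be sketched in one dimension with a Gaussian kernel; the fixed bandwidth and the negative-log scoring are assumed conveniences, not part of the cited survey:

```python
import numpy as np

def kde_scores(train, test, bandwidth=0.5):
    """Anomaly score = negative log of a Gaussian kernel density estimate;
    instances in low-probability regions get high scores."""
    train = np.asarray(train, dtype=float)
    test = np.asarray(test, dtype=float)
    # Evaluate the kernel density estimate at each test point (1-D)
    diff = (test[:, None] - train[None, :]) / bandwidth
    density = np.exp(-0.5 * diff ** 2).sum(axis=1)
    density /= len(train) * bandwidth * np.sqrt(2.0 * np.pi)
    return -np.log(density + 1e-300)

# A point far from the training mass scores much higher:
scores = kde_scores([0.0, 0.1, -0.1, 0.2, -0.2], [0.0, 5.0])
```

In practice the bandwidth is the critical parameter, typically set by a rule of thumb or cross-validation.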

== Parametric ==

= Density =

  • Density-based techniques (k-nearest neighbor,{{cite journal | doi = 10.1007/s007780050006| title = Distance-based outliers: Algorithms and applications| journal = The VLDB Journal the International Journal on Very Large Data Bases| volume = 8| issue = 3–4| pages = 237–253| year = 2000| last1 = Knorr | first1 = E. M. | last2 = Ng | first2 = R. T. | last3 = Tucakov | first3 = V. | citeseerx = 10.1.1.43.1842| s2cid = 11707259}}{{cite conference | doi = 10.1145/342009.335437| title = Efficient algorithms for mining outliers from large data sets| conference = Proceedings of the 2000 ACM SIGMOD international conference on Management of data – SIGMOD '00| pages = 427| year = 2000| last1 = Ramaswamy | first1 = S. | last2 = Rastogi | first2 = R. | last3 = Shim | first3 = K. | isbn = 1-58113-217-4}}{{cite conference | doi = 10.1007/3-540-45681-3_2| title = Fast Outlier Detection in High Dimensional Spaces| conference = Principles of Data Mining and Knowledge Discovery| volume = 2431| pages = 15| series = Lecture Notes in Computer Science| year = 2002| last1 = Angiulli | first1 = F. | last2 = Pizzuti | first2 = C. | isbn = 978-3-540-44037-6| doi-access = free}} local outlier factor,{{cite conference| doi = 10.1145/335191.335388| title = LOF: Identifying Density-based Local Outliers| year = 2000| last1 = Breunig | first1 = M. M.| last2 = Kriegel | first2 = H.-P. | author-link2 = Hans-Peter Kriegel| last3 = Ng | first3 = R. T.| last4 = Sander | first4 = J.| work = Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data| series = SIGMOD| isbn = 1-58113-217-4| pages = 93–104| url = http://www.dbs.ifi.lmu.de/Publikationen/Papers/LOF.pdf}} isolation forests,{{Cite book|last1=Liu|first1=Fei Tony|last2=Ting|first2=Kai Ming|last3=Zhou|first3=Zhi-Hua|title=2008 Eighth IEEE International Conference on Data Mining |chapter=Isolation Forest |date=December 2008|url=https://www.computer.org/csdl/proceedings/icdm/2008/3502/00/3502a413-abs.html|language=en|pages=413–422|doi=10.1109/ICDM.2008.17|isbn=9780769535029|s2cid=6505449}}{{Cite journal|last1=Liu|first1=Fei Tony|last2=Ting|first2=Kai Ming|last3=Zhou|first3=Zhi-Hua|date=March 2012|title=Isolation-Based Anomaly Detection|url=https://www.researchgate.net/publication/239761771|journal=ACM Transactions on Knowledge Discovery from Data |language=en|volume=6|issue=1|pages=1–39|doi=10.1145/2133360.2133363|s2cid=207193045}} and many more variations of this concept{{cite journal | last1 = Schubert | first1 = E. | last2 = Zimek | first2 = A. | author-link2 = Arthur Zimek | last3 = Kriegel | first3 = H. -P. | s2cid = 19036098 | author-link3 = Hans-Peter Kriegel| doi = 10.1007/s10618-012-0300-z | title = Local outlier detection reconsidered: A generalized view on locality with applications to spatial, video, and network outlier detection | journal = Data Mining and Knowledge Discovery | volume = 28 | pages = 190–237 | year = 2012 }})
  • Subspace-based (SOD),{{cite conference | doi = 10.1007/978-3-642-01307-2_86| title = Outlier Detection in Axis-Parallel Subspaces of High Dimensional Data| conference = Advances in Knowledge Discovery and Data Mining| volume = 5476| pages = 831| series = Lecture Notes in Computer Science| year = 2009| last1 = Kriegel | first1 = H. P. | author-link1 = Hans-Peter Kriegel| last2 = Kröger | first2 = P. | last3 = Schubert | first3 = E. | last4 = Zimek | first4 = A. | author-link4 = Arthur Zimek | isbn = 978-3-642-01306-5}} correlation-based (COP){{cite conference | doi = 10.1109/ICDM.2012.21| title = Outlier Detection in Arbitrarily Oriented Subspaces| conference = 2012 IEEE 12th International Conference on Data Mining| pages = 379| year = 2012| last1 = Kriegel | first1 = H. P. | author-link1 = Hans-Peter Kriegel| last2 = Kroger | first2 = P. | last3 = Schubert | first3 = E. | last4 = Zimek | first4 = A. | author-link4 = Arthur Zimek | isbn = 978-1-4673-4649-8}} and tensor-based{{cite journal | last1 = Fanaee-T| first1 = H. | last2 = Gama | first2 = J.| title = Tensor-based anomaly detection: An interdisciplinary survey | doi = 10.1016/j.knosys.2016.01.027 | journal = Knowledge-Based Systems | volume = 98 | pages = 130–147| year = 2016| s2cid = 16368060 | url = http://repositorio.inesctec.pt/handle/123456789/5381 }} outlier detection for high-dimensional data{{cite journal | last1 = Zimek | first1 = A. | author-link1 = Arthur Zimek | last2 = Schubert | first2 = E.| last3 = Kriegel | first3 = H.-P. | author-link3=Hans-Peter Kriegel| title = A survey on unsupervised outlier detection in high-dimensional numerical data | doi = 10.1002/sam.11161 | journal = Statistical Analysis and Data Mining | volume = 5 | issue = 5 | pages = 363–387| year = 2012 | s2cid = 6724536 }}
  • One-class support vector machines{{cite journal|last1=Schölkopf|first1=B.|author-link=Bernhard Schölkopf|last2=Platt|first2=J. C.|last3=Shawe-Taylor|first3=J.|last4=Smola|first4=A. J.|last5=Williamson|first5=R. C.|year=2001|title=Estimating the Support of a High-Dimensional Distribution|journal=Neural Computation|volume=13|issue=7|pages=1443–71|citeseerx=10.1.1.4.4106|doi=10.1162/089976601750264965|pmid=11440593|s2cid=2110475}} (OCSVM, SVDD)
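The simplest distance-based score in the list above (the distance to the k-th nearest neighbor) can be sketched in plain numpy; this is a minimal illustration, not any of the cited implementations, and the example points and k are assumptions:

```python
import numpy as np

def knn_distance_scores(data, k=3):
    """Score each point by the distance to its k-th nearest neighbour;
    isolated points receive large scores."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]
    # Full pairwise Euclidean distance matrix
    dists = np.sqrt(((data[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1))
    # Sorting each row puts the zero self-distance first; column k is then
    # the distance to the k-th nearest other point.
    return np.sort(dists, axis=1)[:, k]

# The point far from the unit-square cluster receives the largest score:
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [10.0, 10.0]])
scores = knn_distance_scores(points, k=2)
```

The O(n²) distance matrix is fine for small data; the cited algorithms exist precisely to avoid it at scale, e.g. via pruning or indexing.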

= Neural networks =

  • Replicator neural networks,{{cite book |doi=10.1007/3-540-46145-0_17 |chapter=Outlier Detection Using Replicator Neural Networks |title=Data Warehousing and Knowledge Discovery |volume=2454 |pages=170–180 |year=2002 |last1=Hawkins |first1=Simon |last2=He |first2=Hongxing |last3=Williams |first3=Graham |last4=Baxter |first4=Rohan |isbn=978-3-540-44123-6 |series=Lecture Notes in Computer Science |citeseerx=10.1.1.12.3366 |s2cid=6436930 }} autoencoders, variational autoencoders,{{cite journal |last1=An |first1=J. |last2=Cho |first2=S. |title=Variational autoencoder based anomaly detection using reconstruction probability |journal=Special Lecture on IE |volume=2 |issue=1 |pages=1–18 |date=2015 |id=SNUDM-TR-2015-03 |url=http://dm.snu.ac.kr/static/docs/TR/SNUDM-TR-2015-03.pdf}} long short-term memory neural networks{{Cite conference|last1=Malhotra|first1=Pankaj|last2=Vig|first2=Lovekesh|last3=Shroff|first3=Gautman|last4=Agarwal|first4=Puneet|title=Long Short Term Memory Networks for Anomaly Detection in Time Series|url=https://www.researchgate.net/publication/304782562|conference=ESANN 2015: 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning |date=22–24 April 2015 |isbn=978-2-87587-015-5 |pages=89–94 }}
  • Bayesian networks
  • Hidden Markov models (HMMs)
  • Minimum Covariance Determinant{{Cite journal|last1=Hubert|first1=Mia|author-link=Mia Hubert|last2=Debruyne|first2=Michiel|last3=Rousseeuw|first3=Peter J.|author-link3=Peter J. Rousseeuw|date=2018|title=Minimum covariance determinant and extensions|journal=WIREs Computational Statistics|language=en|volume=10|issue=3|doi=10.1002/wics.1421|s2cid=67227041 |issn=1939-5108|doi-access=free|arxiv=1709.07045}}{{Cite journal|last1=Hubert|first1=Mia|author-link=Mia Hubert|last2=Debruyne|first2=Michiel|date=2010|title=Minimum covariance determinant|url=https://onlinelibrary.wiley.com/doi/abs/10.1002/wics.61|journal=WIREs Computational Statistics|language=en|volume=2|issue=1|pages=36–43|doi=10.1002/wics.61|s2cid=123086172 |issn=1939-0068|url-access=subscription}}
  • Deep Learning
  • Convolutional Neural Networks (CNNs): CNNs have shown exceptional performance in the unsupervised learning domain for anomaly detection, especially in image and video data analysis. Their ability to automatically and hierarchically learn spatial hierarchies of features from low to high-level patterns makes them particularly suited for detecting visual anomalies. For instance, CNNs can be trained on image datasets to identify atypical patterns indicative of defects or out-of-norm conditions in industrial quality control scenarios.{{Cite journal |last1=Alzubaidi |first1=Laith |last2=Zhang |first2=Jinglan |last3=Humaidi |first3=Amjad J. |last4=Al-Dujaili |first4=Ayad |last5=Duan |first5=Ye |last6=Al-Shamma |first6=Omran |last7=Santamaría |first7=J. |last8=Fadhel |first8=Mohammed A. |last9=Al-Amidie |first9=Muthana |last10=Farhan |first10=Laith |date=2021-03-31 |title=Review of deep learning: concepts, CNN architectures, challenges, applications, future directions |journal=Journal of Big Data |volume=8 |issue=1 |pages=53 |doi=10.1186/s40537-021-00444-8 |issn=2196-1115 |pmc=8010506 |pmid=33816053 |doi-access=free }}
  • Simple Recurrent Units (SRUs): In time-series data, SRUs, a type of recurrent neural network, have been effectively used for anomaly detection by capturing temporal dependencies and sequence anomalies. Unlike traditional RNNs, SRUs are designed to be faster and more parallelizable, offering a better fit for real-time anomaly detection in complex systems such as dynamic financial markets or predictive maintenance in machinery, where identifying temporal irregularities promptly is crucial.{{Cite journal |last1=Belay |first1=Mohammed Ayalew |last2=Blakseth |first2=Sindre Stenen |last3=Rasheed |first3=Adil |last4=Salvo Rossi |first4=Pierluigi |date=January 2023 |title=Unsupervised Anomaly Detection for IoT-Based Multivariate Time Series: Existing Solutions, Performance Analysis and Future Directions |journal=Sensors |language=en |volume=23 |issue=5 |pages=2844 |doi=10.3390/s23052844 |pmid=36905048 |pmc=10007300 |bibcode=2023Senso..23.2844B |issn=1424-8220 |doi-access=free }}
  • Foundation models: Since the advent of large-scale foundation models that have been used successfully on most downstream tasks, they have also been adapted for use in anomaly detection and segmentation. Methods utilizing pretrained foundation models include using the alignment of image and text embeddings (CLIP, etc.) for anomaly localization,{{Cite book |last1=Jeong |first1=Jongheon |last2=Zou |first2=Yang |last3=Kim |first3=Taewan |last4=Zhang |first4=Dongqing |last5=Ravichandran |first5=Avinash |last6=Dabeer |first6=Onkar |chapter=WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation |date=June 2023 |title=2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |chapter-url=https://doi.org/10.1109/cvpr52729.2023.01878 |publisher=IEEE |pages=19606–19616 |doi=10.1109/cvpr52729.2023.01878|arxiv=2303.14814 |isbn=979-8-3503-0129-8 }} while others may use the inpainting ability of generative image models for reconstruction-error based anomaly detection.{{Cite journal |last1=Liu |first1=Zhenzhen |last2=Zhou |first2=Jin Peng |last3=Weinberger |first3=Kilian Q. |date=2024-05-09 |title=Leveraging diffusion models for unsupervised out-of-distribution detection on image manifold |journal=Frontiers in Artificial Intelligence |volume=7 |doi=10.3389/frai.2024.1255566 |doi-access=free |pmid=38783869 |issn=2624-8212|pmc=11112019 }}

= Cluster-based =

  • Clustering: Cluster analysis-based outlier detection{{cite journal | doi = 10.1016/S0167-8655(03)00003-5| title = Discovering cluster-based local outliers| journal = Pattern Recognition Letters| volume = 24| issue = 9–10| pages = 1641–1650| year = 2003| last1 = He | first1 = Z. | last2 = Xu | first2 = X. | last3 = Deng | first3 = S. | bibcode = 2003PaReL..24.1641H| citeseerx = 10.1.1.20.4242}}{{cite journal | first1 = R. J. G. B. | last1 = Campello | first2 = D. | last2 = Moulavi | first3 = A. | last3 = Zimek | author-link3 = Arthur Zimek | first4 = J. | last4 = Sander | s2cid = 2887636 | title = Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection | journal = ACM Transactions on Knowledge Discovery from Data | volume = 10 | issue = 1 | pages = 5:1–51 | year = 2015 | doi = 10.1145/2733381}}

  • Deviations from association rules and frequent itemsets
  • Fuzzy logic-based outlier detection
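The basic cluster-based idea can be illustrated with a toy k-means: points are scored by their distance to the nearest cluster centroid, so points that fit no cluster score highest. This is a minimal sketch of the general principle, not a reimplementation of any cited method, and it uses a deliberately naive initialization (first k points as centroids) to keep the demo deterministic.

```python
# Cluster-based outlier scoring: distance to the nearest k-means centroid.
import math

def kmeans(points, k, iters=20):
    centroids = list(points[:k])  # naive deterministic init (demo only)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster (keep old if empty).
        centroids = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

def outlier_scores(points, k=2):
    centroids = kmeans(points, k)
    return [min(math.dist(p, c) for c in centroids) for p in points]

data = [(0, 0), (0.1, 0.2), (-0.1, 0.1), (5, 5), (5.2, 4.9), (4.9, 5.1), (10, -10)]
scores = outlier_scores(data)
print(scores.index(max(scores)))  # the isolated point (10, -10) scores highest
```

Published cluster-based detectors refine this idea, e.g. by weighting scores by cluster size or by local density, as in the cited CBLOF and hierarchical density estimates.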

= Ensembles =

  • Ensemble techniques, using feature bagging,{{cite book| doi = 10.1145/1081870.1081891| year = 2005| last1 = Lazarevic | first1 = A.| last2 = Kumar | first2 = V.| title = Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining| chapter = Feature bagging for outlier detection| pages = 157–166| isbn = 978-1-59593-135-1| citeseerx = 10.1.1.399.425| s2cid = 2054204}}{{cite conference | doi = 10.1007/978-3-642-12026-8_29| title = Mining Outliers with Ensemble of Heterogeneous Detectors on Random Subspaces| conference = Database Systems for Advanced Applications| volume = 5981| pages = 368| series = Lecture Notes in Computer Science| year = 2010| last1 = Nguyen | first1 = H. V. | last2 = Ang | first2 = H. H. | last3 = Gopalkrishnan | first3 = V. | isbn = 978-3-642-12025-1}} score normalization{{cite conference | doi = 10.1137/1.9781611972818.2| title = Interpreting and Unifying Outlier Scores| conference = Proceedings of the 2011 SIAM International Conference on Data Mining| pages = 13–24| year = 2011| last1 = Kriegel | first1 = H. P. | author-link1 = Hans-Peter Kriegel| last2 = Kröger | first2 = P. | last3 = Schubert | first3 = E. | last4 = Zimek | first4 = A. | author-link4 = Arthur Zimek | isbn = 978-0-89871-992-5| citeseerx = 10.1.1.232.2719}}{{cite conference | doi = 10.1137/1.9781611972825.90| title = On Evaluation of Outlier Rankings and Outlier Scores| conference = Proceedings of the 2012 SIAM International Conference on Data Mining| pages = 1047–1058| year = 2012| last1 = Schubert | first1 = E. | last2 = Wojdanowski | first2 = R. | last3 = Zimek | first3 = A. | author-link3 = Arthur Zimek | last4 = Kriegel | first4 = H. P. 
| author-link4 = Hans-Peter Kriegel| isbn = 978-1-61197-232-0}} and different sources of diversity{{cite journal | doi = 10.1145/2594473.2594476| title = Ensembles for unsupervised outlier detection| journal = ACM SIGKDD Explorations Newsletter| volume = 15| pages = 11–22| year = 2014| last1 = Zimek | first1 = A. | author-link1 = Arthur Zimek | last2 = Campello | first2 = R. J. G. B. | last3 = Sander | first3 = J. R. | s2cid = 8065347}}{{cite conference | doi = 10.1145/2618243.2618257| title = Data perturbation for outlier detection ensembles| conference = Proceedings of the 26th International Conference on Scientific and Statistical Database Management – SSDBM '14| pages = 1| year = 2014| last1 = Zimek | first1 = A. | author-link1 = Arthur Zimek | last2 = Campello | first2 = R. J. G. B. | last3 = Sander | first3 = J. R. | isbn = 978-1-4503-2722-0}}
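The feature-bagging idea can be sketched as follows: each base detector computes a k-nearest-neighbor distance score on a random feature subset, each detector's scores are normalized so they are comparable, and the normalized scores are averaged. This is a simplified sketch in the spirit of Lazarevic and Kumar; the max-based normalization below is a crude stand-in for the more careful score-unification methods cited above.

```python
# Feature bagging: average normalized kNN-distance scores over random
# feature subsets (simplified ensemble sketch).
import math
import random

def knn_score(points, dims, k=2):
    # Score each point by the distance to its k-th nearest neighbor,
    # measured only on the feature subset `dims`.
    scores = []
    for i, p in enumerate(points):
        dists = sorted(
            math.dist([p[d] for d in dims], [q[d] for d in dims])
            for j, q in enumerate(points) if j != i
        )
        scores.append(dists[k - 1])
    return scores

def feature_bagging(points, rounds=10, seed=0):
    rng = random.Random(seed)
    n_dims = len(points[0])
    total = [0.0] * len(points)
    for _ in range(rounds):
        size = rng.randint(max(1, n_dims // 2), n_dims)
        dims = rng.sample(range(n_dims), size)     # random feature subset
        raw = knn_score(points, dims)
        top = max(raw) or 1.0                      # crude per-detector normalization
        total = [t + r / top for t, r in zip(total, raw)]
    return [t / rounds for t in total]

points = [(0, 0, 0), (0.1, 0, 0.2), (0, 0.2, 0.1), (0.2, 0.1, 0), (9, 9, 9)]
scores = feature_bagging(points)
print(scores.index(max(scores)))  # the distant point is ranked highest
```

Because each base detector sees a different projection of the data, the ensemble is less sensitive to irrelevant features than any single detector, which is the motivation for the diversity sources surveyed above.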

=Others=

  • Histogram-based Outlier Score (HBOS): uses per-feature value histograms and assumes feature independence, which makes scoring very fast.{{cite web |last1=Goldstein |first1=Markus |last2=Dengel |first2=Andreas |title=Histogram-based Outlier Score (HBOS): A fast Unsupervised Anomaly Detection Algorithm |date=2012 |url=https://www.goldiges.de/publications/HBOS-KI-2012.pdf|website=Personal page of Markus Goldstein}} (Poster only at the KI 2012 conference, not in the proceedings.)
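HBOS is simple enough to sketch directly: build a fixed-width histogram per feature and score each point by summing, over features, the log of the inverse bin height. The sketch below uses relative bin frequencies as heights; the published method also supports dynamic bin widths and a different height normalization, so treat this as a minimal illustration.

```python
# Minimal HBOS sketch: per-feature histograms, independence assumption,
# score = sum over features of log(1 / relative bin height).
import math

def hbos(points, bins=5):
    n, d = len(points), len(points[0])
    hists = []
    for f in range(d):
        vals = [p[f] for p in points]
        lo, hi = min(vals), max(vals)
        width = (hi - lo) / bins or 1.0          # guard against constant features
        counts = [0] * bins
        for v in vals:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        hists.append((lo, width, counts))
    scores = []
    for p in points:
        s = 0.0
        for lo, width, counts in hists:
            b = min(int((p[list(hists).index((lo, width, counts))] - lo) / width), bins - 1) if False else 0
        scores.append(s)
    return scores
```

Since each feature is treated independently, fitting and scoring are both linear in the data size, which is what makes HBOS fast compared with neighborhood-based methods.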

Anomaly detection in dynamic networks

Dynamic networks, such as those representing financial systems, social media interactions, and transportation infrastructure, are subject to constant change, making anomaly detection within them a complex task. Unlike static graphs, dynamic networks reflect evolving relationships and states, requiring adaptive techniques for anomaly detection.

= Types of anomalies in dynamic networks =

  1. Community anomalies
  2. Compression anomalies
  3. Decomposition anomalies
  4. Distance anomalies
  5. Probabilistic model anomalies

Explainable anomaly detection

Many of the methods discussed above yield only an anomaly score, which can often be explained to users as the point lying in a region of low data density (or in a region whose density is low relative to its neighbors' densities). In explainable artificial intelligence, users demand methods with higher explainability. Some methods allow for more detailed explanations:

  • The Subspace Outlier Degree (SOD) identifies the attributes in which a sample is normal and those in which it deviates from what is expected.
  • Correlation Outlier Probabilities (COP) computes an error vector describing how a sample point deviates from an expected location, which can be interpreted as a counterfactual explanation: the sample would be normal if it were moved to that location.
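Neither SOD nor COP is reproduced here, but the flavor of attribute-wise explanation can be illustrated with per-feature z-scores: the report tells the user which attributes push the sample away from the data's expectation. This is a toy sketch under a strong independence assumption, far simpler than the subspace and correlation analyses of the cited methods.

```python
# Toy attribute-wise anomaly explanation: per-feature z-scores showing
# which attributes make a sample deviate from the data's expectation.
from statistics import mean, stdev

def explain(sample, data):
    report = {}
    for f in range(len(sample)):
        vals = [p[f] for p in data]
        mu, sigma = mean(vals), stdev(vals)
        # Large |z| marks the deviating attributes; 0 for constant features.
        report[f] = (sample[f] - mu) / sigma if sigma else 0.0
    return report

data = [(1.0, 10.0), (1.1, 10.2), (0.9, 9.9), (1.0, 10.1), (1.05, 9.8)]
report = explain((1.0, 25.0), data)
print(report)  # feature 1 dominates: the sample is normal in feature 0
```

A COP-style explanation goes further by modeling correlations between attributes, so the "expected location" is conditioned on the other feature values rather than on marginal means.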

Software

  • ELKI is an open-source Java data mining toolkit that contains several anomaly detection algorithms, as well as index acceleration for them.
  • PyOD is an open-source Python library developed specifically for anomaly detection.{{cite journal |last1= Zhao |first1= Yue |last2= Nasrullah |first2= Zain |last3= Li |first3= Zheng |date=2019 |title=Pyod: A python toolbox for scalable outlier detection |journal=Journal of Machine Learning Research |volume=20 |issue= |pages= |arxiv=1901.01588 |url=https://www.jmlr.org/papers/volume20/19-011/19-011.pdf}}
  • scikit-learn is an open-source Python library that contains some algorithms for unsupervised anomaly detection.
  • Wolfram Mathematica provides functionality for unsupervised anomaly detection across multiple data types.{{cite web |url=https://reference.wolfram.com/language/ref/FindAnomalies.html |title=FindAnomalies |work=Mathematica documentation}}

Datasets

  • [http://www.dbs.ifi.lmu.de/research/outlier-evaluation/ Anomaly detection benchmark data repository] with carefully chosen data sets, curated by the Ludwig-Maximilians-Universität München; [http://lapad-web.icmc.usp.br/repositories/outlier-evaluation/ Mirror] {{Webarchive|url=https://web.archive.org/web/20220331072353/http://lapad-web.icmc.usp.br/repositories/outlier-evaluation/ |date=2022-03-31 }} at the University of São Paulo.
  • [http://odds.cs.stonybrook.edu/ ODDS] – a large collection of publicly available outlier detection datasets with ground truth, spanning several domains.
  • [https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/OPQMVF Unsupervised Anomaly Detection Benchmark] at Harvard Dataverse: datasets for unsupervised anomaly detection with ground truth.
  • [https://researchdata.edu.au/kmash-repository-outlier-detection/1733742/ KMASH Data Repository] at Research Data Australia, containing more than 12,000 anomaly detection datasets with ground truth.

See also

References