Abstract
Anomaly detection, which identifies abnormal patterns in data, has been widely applied in various domains. Most recent work on anomaly detection has focused on accurately modeling normal data with unsupervised methods. To achieve satisfactory detection accuracy, these methods need pure normal data that contains no anomalies, and obtaining such data requires many labels. In many real-world scenarios, however, there is abundant unlabeled data and only a limited number of partially labeled anomalies. This paper proposes a novel anomaly detection method, PUMAD, which uses a Positive and Unlabeled (PU) learning approach to learn from abundant unlabeled data and a small number of partially labeled anomalies (i.e., positives). PUMAD handles this scenario by combining deep metric learning with a hashing-based filtering method. Extensive experimental results on real-world benchmark datasets demonstrate that the PU-learning-based approach is effective at detecting anomalies. PUMAD achieves up to 24% higher accuracy than state-of-the-art competitors.
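
The abstract names a hashing-based filtering step but does not describe it. The sketch below is one possible reading and not the authors' implementation: it assumes a random-hyperplane (sign) hash and treats any unlabeled sample that shares a hash bucket with a labeled anomaly as suspect, keeping the rest as likely-normal training data. All function names, the `n_bits` parameter, and the filtering rule are hypothetical.

```python
import random


def make_planes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (assumed hash family, not from the paper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(x, planes):
    """One bit per hyperplane: which side of the plane the point x falls on."""
    return tuple(sum(p_i * x_i for p_i, x_i in zip(p, x)) > 0 for p in planes)


def filter_unlabeled(unlabeled, anomalies, n_bits=8, seed=0):
    """Return one flag per unlabeled sample.

    True  -> no labeled anomaly shares its bucket (kept as likely normal).
    False -> collides with a labeled anomaly (filtered out as suspect).
    """
    planes = make_planes(n_bits, len(anomalies[0]), seed)
    anomaly_buckets = {hash_code(a, planes) for a in anomalies}
    return [hash_code(u, planes) not in anomaly_buckets for u in unlabeled]


# Toy data: the first unlabeled point duplicates the labeled anomaly,
# the second is its mirror image and lands in the opposite bucket.
anomalies = [[3.0, -2.0, 1.0]]
unlabeled = [[3.0, -2.0, 1.0], [-3.0, 2.0, -1.0]]
keep = filter_unlabeled(unlabeled, anomalies)
```

In a full PU pipeline, the samples flagged `True` would serve as the provisional normal set for a downstream model (deep metric learning, in the paper's case); that stage is not sketched here.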
Original language | English |
---|---|
Pages (from-to) | 167-183 |
Number of pages | 17 |
Journal | Information Sciences |
Volume | 523 |
Publication status | Published - 2020 Jun |
Bibliographical note
Funding Information: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2016R1E1A1A01942642).
Publisher Copyright:
© 2020 Elsevier Inc.
All Science Journal Classification (ASJC) codes
- Software
- Control and Systems Engineering
- Theoretical Computer Science
- Computer Science Applications
- Information Systems and Management
- Artificial Intelligence