As the influence of the internet continues to expand as a medium for communications and commerce, the threat from spammers, system attackers, and criminal enterprises has grown accordingly. This paper proposes a random effects logistic regression model to predict anomaly detection. Unlike the previous studies on anomaly detection, a random effects model was applied, which accommodates not only the risk factors of the exposures but also the uncertainty not explained by such factors. The specific factors of the risk category such as retained 'protocol type' and 'logged in' are included in the proposed model. The research is based on a sample of 49,427 random observations for 42 variables of the KDD-cup 1999 (Data Mining and Knowledge Discovery competition) data set that contains 'normal' and 'anomaly' connections. The proposed model has a classification accuracy of 98.94% for the training data set, while that for the validation data set is 98.68%.
Bibliographical noteFunding Information:
This work is financially supported by the Ministry of Knowledge Economy (MKE) and Korea Institute for Advancement in Technology (KIAT) through the Workforce Development Program in Strategic Technology.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Artificial Intelligence