Given the severity of phishing attacks, which many organizations emphasize, inductive learning from reported malicious URLs has been validated in the deep learning literature. However, deep learning methods that mainly fit a classification task to historical URL observations are limited in recall because of the characteristics of zero-day attacks. To model a zero-day phishing attack, in which URLs are generated and discarded almost immediately, an approach that uses expert knowledge is promising. We introduce a method that integrates deep learning with logic-programmed domain knowledge to inject real-world constraints. We design neural and logic classifiers and propose a joint learning method for the components based on traditional neuro-symbolic integration. In extensive experiments on three real-world datasets totaling 222,541 URLs, the method achieved the highest recall among recent deep learning methods, despite the adverse class-imbalanced condition. We demonstrate that optimizing the weighting between the neural and logic components improves recall by more than 3% over existing methods.
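The weighting between a neural score and a logic score mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the rules in `logic_score`, the convex combination in `combined_score`, the weight `alpha`, and the grid search in `best_alpha` are all assumptions made for illustration, and the neural probability is simply passed in as a number rather than produced by a trained model.

```python
import re
from urllib.parse import urlparse


def logic_score(url: str) -> float:
    """Fraction of hand-written rules that fire for this URL.

    The rules are hypothetical stand-ins for expert knowledge; the
    abstract does not list the actual logic program.
    """
    host = urlparse(url).hostname or ""
    rules = [
        re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None,  # raw IPv4 host
        "@" in url,             # '@' appears somewhere in the URL
        host.count(".") >= 4,   # many subdomain levels
        len(url) > 75,          # unusually long URL
    ]
    return sum(rules) / len(rules)


def combined_score(neural_prob: float, url: str, alpha: float = 0.5) -> float:
    """Convex combination of a neural probability and the logic score.

    alpha is an assumed hyperparameter: 1.0 trusts only the neural
    component, 0.0 trusts only the logic component.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * neural_prob + (1.0 - alpha) * logic_score(url)


def recall(samples, alpha: float, threshold: float = 0.5) -> float:
    """Recall on (neural_prob, url, label) triples; label 1 means phishing."""
    tp = fn = 0
    for neural_prob, url, label in samples:
        if label != 1:
            continue
        if combined_score(neural_prob, url, alpha) >= threshold:
            tp += 1
        else:
            fn += 1
    return tp / (tp + fn) if (tp + fn) else 0.0


def best_alpha(samples, threshold: float = 0.5) -> float:
    """Grid-search alpha in steps of 0.1, keeping the first recall maximum."""
    grid = [i / 10 for i in range(11)]
    return max(grid, key=lambda a: recall(samples, a, threshold))
```

Selecting `alpha` on recall alone ignores false positives, so a practical search would also track precision or a combined metric on held-out data.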
|Field|Value|
|---|---|
|Number of pages|5|
|Journal|ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings|
|Publication status|Published - 2021|
|Event|2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada; Duration: 2021 Jun 6 → 2021 Jun 11|
Bibliographical note
Funding Information:
This work was supported by an IITP grant funded by the Korean MSIT (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)) and a grant funded by Air Force Research Laboratory, USA.
© 2021 IEEE
All Science Journal Classification (ASJC) codes
- Signal Processing
- Electrical and Electronic Engineering