Abstract
Given the severity of malware attacks, data-driven approaches that use massive malware observations have been validated. Deep learning-based approaches that learn unified features by exploiting the local and sequential nature of the control flow graph have achieved the best performance. However, considering only local and sequential information from a graph-based malware representation is not enough to model semantics such as the structural and functional nature of malware. In this paper, the functional nature is incorporated into the control flow graph by adding opcodes, and the structural nature is embedded through the DeepWalk algorithm. We then propose a transformer-based malware control flow embedding to overcome the difficulty of modeling long-term control flow and to selectively learn the code embeddings. Extensive experiments show improved performance over the latest deep learning-based graph embedding methods, and a 37.50% improvement in recall was confirmed for the Simda botnet attack.
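As a rough illustration of the structural embedding step mentioned in the abstract, the sketch below generates truncated random walks over a control flow graph, which is the sampling stage of DeepWalk (the walks are normally fed to a skip-gram model afterwards). The toy graph `CFG`, the per-block `OPCODES` labels, and the function name `random_walks` are hypothetical and chosen only for illustration; the abstract does not specify the paper's actual graph construction or hyperparameters.

```python
import random

# Hypothetical toy control flow graph: basic-block id -> successor block ids.
CFG = {
    0: [1, 2],
    1: [3],
    2: [3],
    3: [0, 4],
    4: [],  # exit block, no successors
}

# Hypothetical opcode label attached to each basic block (assumption).
OPCODES = {0: "cmp", 1: "mov", 2: "xor", 3: "call", 4: "ret"}


def random_walks(graph, walk_length, walks_per_node, seed=0):
    """Generate truncated random walks, the sampling stage of DeepWalk."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(graph)
        rng.shuffle(nodes)  # visit start nodes in random order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                successors = graph[walk[-1]]
                if not successors:  # dead end: stop the walk early
                    break
                walk.append(rng.choice(successors))
            walks.append(walk)
    return walks


walks = random_walks(CFG, walk_length=6, walks_per_node=2)
# Map block ids to opcode tokens so each walk reads like a "sentence".
opcode_walks = [[OPCODES[n] for n in walk] for walk in walks]
```

Each entry of `opcode_walks` is a token sequence that a skip-gram or transformer model could consume; how the paper combines the two is not stated in the abstract, so this pairing is only an assumption.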
Original language | English |
---|---|
Title of host publication | Intelligent Data Engineering and Automated Learning - 22nd International Conference, IDEAL 2021, Proceedings |
Editors | David Camacho, Peter Tino, Richard Allmendinger, Hujun Yin, Antonio J. Tallón-Ballesteros, Ke Tang, Sung-Bae Cho, Paulo Novais, Susana Nascimento |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 426-436 |
Number of pages | 11 |
ISBN (Print) | 9783030916077 |
Publication status | Published - 2021 |
Event | 22nd International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2021 - Virtual, Online; Duration: 2021 Nov 25 → 2021 Nov 27 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 13113 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 22nd International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2021 |
---|---|
City | Virtual, Online |
Period | 2021 Nov 25 → 2021 Nov 27 |
Bibliographical note
Funding Information: This work was supported by an IITP grant funded by the Korean government (MSIT) (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)) and by the Air Force Defense Research Sciences Program funded by the Air Force Office of Scientific Research.
Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science (all)