Evolutionary Triplet Network of Learning Disentangled Malware Space for Malware Classification

Kyoung Won Park, Seok Jun Bu, Sung Bae Cho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With the advent of sophisticated deep learning models, various methods for classifying malware from the structural features of source code have been devised. Nevertheless, recent advanced detection-avoidance techniques actively imitate the structural features of benign programs and share vulnerable subroutines, making it difficult to distinguish malicious attacks. A method that can distinguish and classify similar malicious attacks is therefore urgently needed. In this paper, we propose a method based on a triplet network that learns a disentangled malware space from assembly-level features, going beyond the structural characteristics of malware. The method comprises two major components: 1) a triplet-loss-trained network that disentangles the deep representations of malware samples lying close together in the latent vector space, and 2) genetic optimization of assembly-level features to resolve collisions among thousands of assembly-level features. Experiments with the assembly and binary code dataset released by Microsoft show that the proposed method outperforms existing methods based on structural features, achieving the highest performance in 10-fold cross-validation. Moreover, we demonstrate the superiority of the disentangled representation for malware classification by visualizing the latent space and ROC curves.
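As a rough illustration of the first component, the sketch below computes the standard margin-based triplet loss for a single (anchor, positive, negative) triplet. It is a minimal sketch and not the authors' implementation: the Euclidean distance, the margin value, the function names, and the toy 2-D embeddings are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed value; the abstract does not state one
) -> float:
    """Margin-based triplet loss for one triplet.

    The loss reaches zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Toy 2-D embeddings (made-up values, not taken from the paper).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]       # same malware family as the anchor
near_negative = [1.0, 0.0]  # different family, as close as the positive
far_negative = [3.0, 4.0]   # different family, already well separated

loss_near = triplet_loss(anchor, positive, near_negative)
loss_far = triplet_loss(anchor, positive, far_negative)
```

In this toy setup the near negative still incurs a positive loss, so training would push it away from the anchor, while the far negative contributes nothing. This is the sense in which a triplet objective separates samples from different families that sit close together in the latent space.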

Original language: English
Title of host publication: Hybrid Artificial Intelligent Systems - 17th International Conference, HAIS 2022, Proceedings
Editors: Pablo García Bringas, Hilde Pérez García, Francisco Javier Martínez de Pisón, José Ramón Villar Flecha, Alicia Troncoso Lora, Enrique A. de la Cal, Alvaro Herrero, Francisco Martínez Álvarez, Giuseppe Psaila, Hector Quintián, Emilio Corchado
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 311-322
Number of pages: 12
ISBN (Print): 9783031154706
DOIs
Publication status: Published - 2022
Event: 17th International Conference on Hybrid Artificial Intelligence Systems, HAIS 2022 - Salamanca, Spain
Duration: 2022 Sept 5 to 2022 Sept 7

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13469 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Hybrid Artificial Intelligence Systems, HAIS 2022
Country/Territory: Spain
City: Salamanca
Period: 22/9/5 to 22/9/7

Bibliographical note

Funding Information:
Acknowledgements. This work was supported by an IITP grant funded by the Korean government (MSIT) (No.2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)) and an ETRI grant funded by the Korean government (22ZS1100, Core Technology Research for Self-Improving Integrated Artificial Intelligence System).

Publisher Copyright:
© 2022, Springer Nature Switzerland AG.

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)
