A small amount of unknown malware can be analyzed manually, but new malware is now generated in such rapidly growing volumes that automatic detection is needed. Malware is usually generated with features that differ from those of existing samples (e.g., code exchange, null value insertion, or reorganization of subroutines) to evade detection by antivirus systems. To detect obfuscated malware, this paper proposes a method called latent semantic controlling generative adversarial networks (LSC-GAN), which learns to generate malware data with an i-feature from a specific Gaussian distribution that represents the i-feature, and to distinguish the generated data from real data. A variational autoencoder (VAE) projects data to a latent space for feature extraction and is transferred to the generator (G) of LSC-GAN to train it stably. Because G generates data from a Gaussian distribution, it produces data that are similar but not identical to the actual data: they include features modified relative to the real data. The detector inherits, through transfer learning, an encoder that learns various malware features from real data and from modified data generated by the LSC-GAN based on an LSC-VAE. We show that LSC-GAN achieves an average detection accuracy of 96.97%, which is higher than that of other conventional models. We demonstrate the statistical significance of the proposed model's performance using a t-test. The detection results are analyzed with a confusion matrix and the F1-score.
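The evaluation step named in the abstract (confusion matrix and F1-score) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the helper names `confusion_counts` and `f1_score` and the toy label vectors are hypothetical, chosen only to show how the two measures relate for a binary malware/benign detector.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary detector."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy labels (1 = malware, 0 = benign), not data from the paper.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
f1 = f1_score(tp, fp, fn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn} accuracy={accuracy:.2f} F1={f1:.2f}")
```

Reporting F1 alongside accuracy is useful when malware and benign samples are imbalanced, since accuracy alone can hide a high rate of missed detections.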
|Title of host publication||Intelligent Data Engineering and Automated Learning – IDEAL 2018 - 19th International Conference, Proceedings|
|Editors||Hujun Yin, Paulo Novais, David Camacho, Antonio J. Tallón-Ballesteros|
|Number of pages||9|
|Publication status||Published - 2018|
|Event||19th International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2018 - Madrid, Spain|
Duration: 2018 Nov 21 → 2018 Nov 23
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
Bibliographical note
Funding Information:
Acknowledgment. This work was supported by the Air Force Defense Research Sciences Program, funded by the Air Force Office of Scientific Research.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science(all)