MolBit: De novo Drug Design via Binary Representations of SMILES for avoiding the Posterior Collapse Problem

Jonghwan Choi, Sangmin Seo, Jinuk Park, Sanghyun Park

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

Deep generative models for molecular generation have accelerated de novo drug design by making it possible to generate novel molecular structures expressed as simplified molecular-input line-entry system (SMILES) strings or as molecular graphs. Numerous drug design studies have combined a variational autoencoder (VAE) with an autoregressive generator, such as a recurrent neural network (RNN), to generate SMILES strings. However, the RNN-VAE has one notorious issue, called posterior collapse, in which different latent vectors produce indistinguishable molecular distributions. In this study, we proposed MolBit, a Gumbel-Softmax-based generative model, together with a genetic algorithm-based method for molecular property optimization. We confirmed that the proposed model avoided the posterior collapse problem and outperformed existing SMILES-based drug design models.

Original language: English
Title of host publication: Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
Editors: Yufei Huang, Lukasz Kurgan, Feng Luo, Xiaohua Tony Hu, Yidong Chen, Edward Dougherty, Andrzej Kloczkowski, Yaohang Li
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 364-367
Number of pages: 4
ISBN (Electronic): 9781665401265
Publication status: Published - 2021
Event: 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021 - Virtual, Online, United States
Duration: 2021 Dec 9 to 2021 Dec 12

Publication series

Name: Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021

Conference

Conference: 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
Country/Territory: United States
City: Virtual, Online
Period: 21/12/9 to 21/12/12

Bibliographical note

Funding Information:
This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (IITP-2017-0-00477, (SW starlab) Research and development of the high performance in-memory distributed DBMS based on flash memory storage in IoT environment).

Publisher Copyright:
© 2021 IEEE.

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Biomedical Engineering
  • Health Informatics
  • Information Systems and Management
