SoftRegex: Generating regex from natural language descriptions using softened regex equivalence

Jun U. Park, Sang Ki Ko, Marco Cognetta, Yo Sub Han

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We continue the study of generating semantically correct regular expressions from natural language descriptions (NL). The current state-of-the-art model, SemRegex, produces regular expressions from NLs using reinforcement learning, with a reward based on the semantic (rather than syntactic) equivalence between two regular expressions. Since the regular expression equivalence problem is PSPACE-complete, we introduce the EQ_Reg model for computing the similarity of two regular expressions using deep neural networks. Our EQ_Reg model essentially softens the equivalence of two regular expressions when used as a reward function. We then propose a new regex generation model, SoftRegex, using the EQ_Reg model, and empirically demonstrate that SoftRegex substantially reduces the training time (by a factor of at least 3.6) and produces state-of-the-art results on three benchmark datasets.
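
The contrast between a hard (yes/no) equivalence reward and a softened one can be illustrated with a small sketch. This is not the neural similarity model described in the abstract: it is a brute-force stand-in that assumes a tiny alphabet and a short maximum string length, and scores two regexes by the fraction of enumerated strings on which they agree. The function name `soft_equivalence` and its parameters are hypothetical, chosen only for illustration.

```python
import itertools
import re


def soft_equivalence(regex_a: str, regex_b: str,
                     alphabet: str = "ab", max_len: int = 4) -> float:
    """Fraction of strings (length <= max_len over `alphabet`) on which
    both regexes give the same accept/reject answer.

    Illustrative assumption: a sampled agreement rate stands in for a
    learned similarity score; it is not an exact equivalence test.
    """
    pat_a = re.compile(regex_a)
    pat_b = re.compile(regex_b)
    agree = 0
    total = 0
    for length in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            s = "".join(chars)
            total += 1
            accepts_a = pat_a.fullmatch(s) is not None
            accepts_b = pat_b.fullmatch(s) is not None
            if accepts_a == accepts_b:
                agree += 1
    return agree / total


# Same language, different syntax: full agreement.
print(soft_equivalence("(a|b)*", "(b|a)*"))  # 1.0
# Languages differ only on the empty string: 30 of 31 strings agree.
print(soft_equivalence("a*", "a+"))
```

A binary reward would give the second pair a score of 0, the same as two unrelated regexes, whereas the graded score reflects that the two are nearly equivalent. That graded signal is the intuition behind using a softened equivalence as a reward.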

Original language: English
Title of host publication: EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference
Publisher: Association for Computational Linguistics
Pages: 6425-6431
Number of pages: 7
ISBN (Electronic): 9781950737901
Publication status: Published - 2020
Event: 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019 - Hong Kong, China
Duration: 2019 Nov 3 → 2019 Nov 7

Publication series

Name: EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference

Conference

Conference: 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019
Country: China
City: Hong Kong
Period: 19/11/3 → 19/11/7

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems


Cite this

Park, J. U., Ko, S. K., Cognetta, M., & Han, Y. S. (2020). SoftRegex: Generating regex from natural language descriptions using softened regex equivalence. In EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference (pp. 6425-6431). (EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference). Association for Computational Linguistics.