Most regular-expression matching engines in practice are based on the Thompson construction and the Spencer matching algorithm. While these engines are fast and efficient, a serious problem, regular expression denial-of-service (ReDoS), has been reported recently. ReDoS is an algorithmic complexity attack that exploits the backtracking feature of the engine and makes the service unresponsive indefinitely. Researchers have suggested a few remedies for the ReDoS problem, yet they are often ad hoc or undesirable in practice. We instead propose a hybrid matching scheme that selects between the Thompson and the Spencer matching algorithms depending on the features a given regular expression needs. We also suggest using the position construction, whose intrinsic characteristics favor fast matching. We evaluate the proposed approach on a benchmark dataset collected from various open-source projects and compare its performance with the current approach. The experimental results show that a hybrid matcher reduces ReDoS vulnerability by 96% in full matching and by 99.98% in partial matching. Moreover, 55% of the most problematic regular expressions become invulnerable to ReDoS under the position construction.
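To make the contrast between the two matching styles concrete, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes a hand-built toy automaton with two duplicate `a` transitions, mimicking the ambiguity of a pattern such as `(a|a)*b`. The names `NFA`, `ACCEPT`, `backtrack_match`, and `simulate_match` are hypothetical and chosen for illustration only. The first function explores one path at a time and backtracks on failure (the Spencer style); the second tracks the set of all reachable states in lockstep (the Thompson style).

```python
# Toy epsilon-free NFA (an assumption for illustration): state 0 loops on
# 'a' via two distinct transitions and moves to accepting state 1 on 'b'.
NFA = {
    0: [("a", 0), ("a", 0), ("b", 1)],
    1: [],
}
ACCEPT = {1}


def backtrack_match(nfa, accept, text):
    """Spencer-style full match: depth-first search with backtracking.

    Returns (matched, steps), where steps counts recursive calls.
    """
    steps = 0

    def run(state, i):
        nonlocal steps
        steps += 1
        if i == len(text):
            return state in accept
        # Try each matching transition in order; fall back on failure.
        for label, target in nfa[state]:
            if label == text[i] and run(target, i + 1):
                return True
        return False

    return run(0, 0), steps


def simulate_match(nfa, accept, text):
    """Thompson-style full match: advance a set of states per character.

    Returns (matched, steps), where steps counts transitions examined.
    """
    steps = 0
    current = {0}
    for ch in text:
        following = set()
        for state in current:
            for label, target in nfa[state]:
                steps += 1
                if label == ch:
                    following.add(target)
        current = following
    return bool(current & accept), steps


if __name__ == "__main__":
    for n in (5, 10, 15):
        text = "a" * n  # no trailing 'b', so every path must fail
        print(n, backtrack_match(NFA, ACCEPT, text), simulate_match(NFA, ACCEPT, text))
```

On inputs of the form `"a" * n`, the backtracking matcher in this sketch must exhaust every path before rejecting, so its step count doubles with each extra character, while the set-based simulation examines a fixed number of transitions per character. Backtracking is still what makes features such as backreferences straightforward to support, which is the trade-off a hybrid scheme aims to balance.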
|Title of host publication||Implementation and Application of Automata - 26th International Conference, CIAA 2022, Proceedings|
|Editors||Pascal Caron, Ludovic Mignot|
|Publisher||Springer Science and Business Media Deutschland GmbH|
|Number of pages||16|
|Publication status||Published - 2022|
|Event||26th International Conference on Implementation and Application of Automata, CIAA 2022 - Rouen, France|
Duration: 2022 Jun 28 → 2022 Jul 1
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Conference||26th International Conference on Implementation and Application of Automata, CIAA 2022|
|Period||22/6/28 → 22/7/1|
Bibliographical note
Funding Information:
by the NRF grant (NRF-2020R1A4A3079947) funded by MIST.
© 2022, Springer Nature Switzerland AG.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science(all)