Probabilistic finite automata (PFAs) are common statistical language model in natural language and speech processing. A typical task for PFAs is to compute the probability of all strings that match a query pattern. An important special case of this problem is computing the probability of a string appearing as a prefix, suffix, or infix. These problems find use in many natural language processing tasks such word prediction and text error correction. Recently, we gave the first incremental algorithm to efficiently compute the infix probabilities of each prefix of a string (Cognetta et al., 2018). We develop an asymptotic improvement of that algorithm and solve the open problem of computing the infix probabilities of PFAs from streaming data, which is crucial when processing queries online and is the ultimate goal of the incremental approach.
|Title of host publication||ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference|
|Publisher||Association for Computational Linguistics (ACL)|
|Number of pages||6|
|Publication status||Published - 2020|
|Event||57th Annual Meeting of the Association for Computational Linguistics, ACL 2019 - Florence, Italy|
Duration: 2019 Jul 28 → 2019 Aug 2
|Name||ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference|
|Conference||57th Annual Meeting of the Association for Computational Linguistics, ACL 2019|
|Period||19/7/28 → 19/8/2|
Bibliographical noteFunding Information:
This work was supported by the Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (2018-0-00247).
All Science Journal Classification (ASJC) codes
- Language and Linguistics
- Computer Science(all)
- Linguistics and Language