This paper studies the problem of generating likely queries for multimodal documents with images. Our application scenario is efficient “first-stage retrieval” of relevant documents, enabled by attaching generated queries to documents before indexing. The expanded text can then be indexed with an inverted index to narrow the collection down to candidate matches efficiently, so that expensive reranking can follow. Our evaluation results show that our proposed multimodal representation meaningfully improves relevance ranking. More importantly, our framework can achieve the state of the art in first-stage retrieval scenarios.
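As a rough illustration of the first-stage retrieval setting described in the abstract, the following minimal sketch attaches queries to documents, builds an inverted index over the expanded text, and looks up candidates by term overlap. It is not the paper's implementation: the documents, the hard-coded `generated_queries` (standing in for the output of a query-generation model), the whitespace tokenizer, and the overlap-count scoring are all assumptions made for illustration.

```python
from collections import defaultdict


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization.
    return text.lower().split()


def build_inverted_index(texts_by_id):
    # Map each token to the set of document ids containing it.
    index = defaultdict(set)
    for doc_id, text in texts_by_id.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def first_stage_candidates(index, query):
    # Score documents by the number of distinct query tokens they contain
    # (an illustrative stand-in for a real ranking function).
    scores = defaultdict(int)
    for token in set(tokenize(query)):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical documents (text side only) and hypothetical generated queries.
docs = {
    "d1": "Hiking trail guide with photo of a summit",
    "d2": "Recipe page with image of pasta",
}
generated_queries = {
    "d1": ["best mountain hikes"],
    "d2": ["how to cook spaghetti"],
}

# Expand each document with its generated queries before indexing.
expanded = {d: " ".join([docs[d]] + generated_queries[d]) for d in docs}
index = build_inverted_index(expanded)

# Candidates from this cheap lookup would then go to a costlier reranker.
candidates = first_stage_candidates(index, "cook spaghetti")
```

In this toy setup the query terms appear only in the attached queries, so the unexpanded documents would not match at all, which is the gap that expansion before indexing is meant to close.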
|Title of host publication||EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference|
|Publisher||Association for Computational Linguistics (ACL)|
|Number of pages||10|
|Publication status||Published - 2021|
|Event||16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021 - Virtual, Online|
Duration: 2021 Apr 19 → 2021 Apr 23
|Name||EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference|
|Conference||16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021|
|Period||21/4/19 → 21/4/23|
Bibliographical note
Funding Information:
This research was supported by the MSIT under IITP-2017-0-01779 (A machine learning and statistical inference framework for explainable artificial intelligence) and the ITRC support program (IITP-2021-2020-0-01789), supervised by the IITP.
© 2021 Association for Computational Linguistics
All Science Journal Classification (ASJC) codes
- Computational Theory and Mathematics
- Linguistics and Language