Web Document Encoding for Structure-Aware Keyphrase Extraction

Jihyuk Kim, Young In Song, Seung Won Hwang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

We study keyphrase extraction (KPE) from Web documents. Our key contribution is an encoding of Web documents that leverages their structure, such as titles and anchors, by building a graph of words that represents both (a) position-based proximity and (b) structural relations. We evaluate KPE performance on data from the real-world search engine NAVER and on human-annotated KPE benchmarks; our method outperforms state-of-the-art approaches in both settings.
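
The word graph described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_word_graph`, the `window` parameter, and the rule used for structural edges (linking occurrences of the same word that appear in different fields, such as title and body) are assumptions made only for illustration.

```python
from collections import defaultdict


def build_word_graph(fields, window=2):
    """Build a toy word graph from a document split into named fields.

    fields: list of (field_name, tokens) pairs in document order.
    Returns (nodes, edges), where nodes[i] is (token, field_name) and
    edges maps an undirected pair (i, j) with i < j to a set of edge types.
    """
    # One node per word occurrence, remembering which field it came from.
    nodes = [(tok, name) for name, tokens in fields for tok in tokens]

    edges = defaultdict(set)

    # (a) Position-based proximity: connect words at most `window` positions apart.
    for i in range(len(nodes)):
        for j in range(i + 1, min(i + window + 1, len(nodes))):
            edges[(i, j)].add("proximity")

    # (b) Structural relation (assumed rule): connect occurrences of the
    # same word that appear in different fields, e.g. title and body.
    by_word = defaultdict(list)
    for idx, (tok, _) in enumerate(nodes):
        by_word[tok.lower()].append(idx)
    for idxs in by_word.values():
        for a in range(len(idxs)):
            for b in range(a + 1, len(idxs)):
                i, j = idxs[a], idxs[b]
                if nodes[i][1] != nodes[j][1]:
                    edges[(i, j)].add("structural")

    return nodes, dict(edges)


if __name__ == "__main__":
    doc = [
        ("title", ["keyphrase", "extraction"]),
        ("body", ["we", "study", "keyphrase", "extraction"]),
    ]
    nodes, edges = build_word_graph(doc, window=1)
    for (i, j), kinds in sorted(edges.items()):
        print(nodes[i], nodes[j], sorted(kinds))
```

In this sketch a pair of words can carry both edge types at once, which is one simple way to let a downstream encoder distinguish proximity from structure; the paper itself should be consulted for the actual graph construction.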

Original language: English
Title of host publication: SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 1823-1827
Number of pages: 5
ISBN (Electronic): 9781450380379
Publication status: Published - 2021 Jul 11
Event: 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2021 - Virtual, Online, Canada
Duration: 2021 Jul 11 → 2021 Jul 15

Publication series

Name: SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference: 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2021
Country/Territory: Canada
City: Virtual, Online
Period: 21/7/11 → 21/7/15

Bibliographical note

Funding Information:
This work was supported by the NAVER-SQR program from NAVER Corporation and by IITP funded by MSIT (No. 2017-0-01779, XAI).

Publisher Copyright:
© 2021 ACM.

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Graphics and Computer-Aided Design
  • Information Systems
