Evaluating research novelty detection: Counterfactual approaches

Reinald Kim Amplayo, Seung Won Hwang, Min Song

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

In this paper, we explore strategies to evaluate models for the task of research paper novelty detection: given all papers released at a given date, which of the papers discuss new ideas and influence future research? We find that novelty is not a singular concept and thus inherently lacks ground-truth annotations with cross-annotator agreement, which is a major obstacle in evaluating these models. The test-of-time award is the closest to such an annotation, but it can only be made retrospectively and is extremely scarce. We thus propose to compare and evaluate models using counterfactual simulations. First, we ask models if they can differentiate papers at time t from counterfactual papers from a future time t + d. Second, we ask models if they can predict the test-of-time award at t + d. These are proxies that human annotators can agree on and that can easily be augmented by correlated signals, with which evaluation can be done through four tasks: classification, ranking, correlation, and feature selection. We show that these proxy evaluation methods complement each other regarding error handling, coverage, interpretability, and scope, and thus together contribute to the observation of the relative strengths of existing models.
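The first proxy (telling papers at time t apart from counterfactual papers from t + d) could be scored roughly as follows. This is only a minimal sketch, not the authors' evaluation code: the function name `pairwise_auc`, the assumption that a counterfactual paper should receive the higher novelty score, and the score values are all hypothetical and chosen for illustration.

```python
def pairwise_auc(scores_at_t, scores_counterfactual):
    """Fraction of (current, counterfactual) pairs in which the
    counterfactual paper gets the higher novelty score.
    Ties count as half a win. Assumption: higher score = more novel."""
    wins = 0.0
    for s_future in scores_counterfactual:
        for s_now in scores_at_t:
            if s_future > s_now:
                wins += 1.0
            elif s_future == s_now:
                wins += 0.5
    return wins / (len(scores_at_t) * len(scores_counterfactual))


# Made-up novelty scores from some hypothetical model
# (illustrative numbers only, not results from the paper).
scores_at_t = [0.1, 0.4, 0.35]        # papers released at time t
scores_counterfactual = [0.8, 0.3]    # papers injected from time t + d

auc = pairwise_auc(scores_at_t, scores_counterfactual)
```

A value near 1.0 would suggest the model separates counterfactual papers from contemporary ones, while a value near 0.5 would suggest it cannot tell them apart.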

Original language: English
Title of host publication: EMNLP-IJCNLP 2019 - Graph-Based Methods for Natural Language Processing - Proceedings of the 13th Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 124-133
Number of pages: 10
ISBN (Electronic): 9781950737864
Publication status: Published - 2019
Event: 13th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2019, in conjunction with the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019 - Hong Kong, Hong Kong
Duration: 2019 Nov 4 → 2019 Nov 4

Publication series

Name: EMNLP-IJCNLP 2019 - Graph-Based Methods for Natural Language Processing - Proceedings of the 13th Workshop

Conference

Conference: 13th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2019, in conjunction with the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019
Country/Territory: Hong Kong
City: Hong Kong
Period: 19/11/4 → 19/11/4

Bibliographical note

Funding Information:
This work is supported by Microsoft Research Asia.

Publisher Copyright:
© 2019 EMNLP-IJCNLP 2019 - Graph-Based Methods for Natural Language Processing - Proceedings of the 13th Workshop. All rights reserved.

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
