We present a method for detecting the novelty of a research paper. Because novelty in scholarly literature depends on the larger research community as well as on the paper itself, we propose a network-based approach to feature extraction. Two graphs are introduced: a macro-level graph, whose nodes are authors and documents, and a micro-level graph, whose nodes are keywords, topics, and words. After a seed graph is constructed, papers are added incrementally, and the changes each paper causes in the graph are recorded as that paper's feature set. An autoencoder neural network is then used as the novelty detection model. The experimental results show that the commonly used TF-IDF text representation and one-class SVM are not suitable for detecting the novelty of a research paper. Among the constructed graphs, keyword-level graph features perform best when evaluated with regression analysis. We also evaluate combinations of macro-level features, micro-level features, and all features together, and find that the combination of keyword, topic, and word features performs best under regression and citation count analysis. Other factors that could affect citation counts, such as impact and audience, are also discussed.
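The incremental step described in the abstract, adding a paper to a seed graph and recording what changes, can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `KeywordGraph`, the choice of a keyword co-occurrence graph, and the three recorded quantities (`new_nodes`, `new_edges`, `reinforced_edges`) are assumptions chosen only for illustration, and the autoencoder stage is omitted.

```python
from itertools import combinations


class KeywordGraph:
    """Hypothetical micro-level graph: nodes are keywords, edges are co-occurrences."""

    def __init__(self):
        self.nodes = set()
        self.edges = {}  # (keyword_a, keyword_b) with a < b -> co-occurrence count

    def add_paper(self, keywords):
        """Add one paper and return the graph changes it caused as a feature dict."""
        kws = sorted(set(keywords))  # sorting makes each edge key canonical
        new_nodes = sum(1 for k in kws if k not in self.nodes)
        new_edges = 0
        reinforced_edges = 0
        for pair in combinations(kws, 2):
            if pair in self.edges:
                reinforced_edges += 1  # edge already known to the graph
            else:
                new_edges += 1  # edge first introduced by this paper
            self.edges[pair] = self.edges.get(pair, 0) + 1
        self.nodes.update(kws)
        return {
            "new_nodes": new_nodes,
            "new_edges": new_edges,
            "reinforced_edges": reinforced_edges,
        }


# Build a small seed graph, then record the changes caused by one new paper.
graph = KeywordGraph()
for seed_keywords in (["graph", "novelty"], ["novelty", "autoencoder"]):
    graph.add_paper(seed_keywords)

features = graph.add_paper(["novelty", "graph", "citation"])
print(features)  # {'new_nodes': 1, 'new_edges': 2, 'reinforced_edges': 1}
```

In this toy setup, a paper that introduces many new nodes and edges changes the graph more than one that only reinforces existing edges; a feature vector of this kind is what a downstream novelty model would consume.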
Bibliographical note
Funding Information:
This project is supported by Microsoft Research. This project acknowledges the role of the Microsoft Cognitive Service API and its contribution to the success of this research. This project is also partly supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2015S1A3A2046711).
All Science Journal Classification (ASJC) codes
- Control and Systems Engineering
- Theoretical Computer Science
- Computer Science Applications
- Information Systems and Management
- Artificial Intelligence