Structured abstract summarization of scientific articles: Summarization using full-text section information

Hanseok Oh, Seojin Nam, Yongjun Zhu

Research output: Contribution to journal › Article › peer-review

Abstract

The automatic summarization of scientific articles differs from other text genres because of the structured format and longer text length. Previous approaches have focused on tackling the lengthy nature of scientific articles, aiming to improve the computational efficiency of summarizing long text using a flat, unstructured abstract. However, the structured format of scientific articles and characteristics of each section have not been fully explored, despite their importance. The lack of a sufficient investigation and discussion of various characteristics for each section and their influence on summarization results has hindered the practical use of automatic summarization for scientific articles. To provide a balanced abstract proportionally emphasizing each section of a scientific article, the community introduced the structured abstract, an abstract with distinct, labeled sections. Using this information, in this study, we aim to understand tasks ranging from data preparation to model evaluation from diverse viewpoints. Specifically, we provide a preprocessed large-scale dataset and propose a summarization method applying the introduction, methods, results, and discussion (IMRaD) format reflecting the characteristics of each section. We also discuss the objective benchmarks and perspectives of state-of-the-art algorithms and present the challenges and research directions in this area.

Original language: English
Pages (from-to): 234-248
Number of pages: 15
Journal: Journal of the Association for Information Science and Technology
Volume: 74
Issue number: 2
Publication status: Published - 2023 Feb

Bibliographical note

Funding Information:
This work was supported by the Yonsei University Research Fund of 2022 (2022‐22‐0122).

Publisher Copyright:
© 2022 Association for Information Science and Technology.

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Networks and Communications
  • Information Systems and Management
  • Library and Information Sciences
