Identifying strategic information from scientific articles through sentence classification

Fidelia Ibekwe-Sanjuan, Chaomei Chen, Roberto Pinho

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

We address the need to help users rapidly access the most important or strategic information in a text corpus by identifying sentences that carry specific information. More precisely, we want to identify the contributions of authors of scientific papers by categorizing sentences using rhetorical and lexical cues. We built local grammars to annotate sentences in the corpus according to their rhetorical status: objective, new things, results, findings, hypotheses, conclusion, related work, future work. The annotation is automatically projected onto two other corpora to test the portability of the grammars across several domains. The local grammars are implemented in the Unitex system. After sentence categorization, the annotated sentences are clustered, and users can navigate the result by accessing specific information types. The results can be used for advanced information retrieval purposes.
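As a rough illustration of cue-based sentence categorization, the sketch below tags sentences with the rhetorical categories listed in the abstract. It is a minimal sketch under stated assumptions: it uses plain Python regular expressions rather than Unitex local grammars, and the cue patterns in `CUES`, as well as the function names `categorize` and `group_by_category`, are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical cue patterns, one per rhetorical category named in the abstract.
# These are illustrative guesses, NOT the authors' local grammars.
CUES = {
    "objective": r"\b(?:we aim to|the (?:aim|goal|objective) of|in this (?:paper|study),? we)\b",
    "new things": r"\b(?:novel|for the first time|new (?:method|approach|technique))\b",
    "results": r"\b(?:results? (?:show|indicate|suggest)|we (?:obtained|achieved))",
    "findings": r"\bwe (?:found|observed) that\b",
    "hypotheses": r"\b(?:we hypothesi[sz]e|hypothes[ie]s|we assume that)\b",
    "conclusion": r"\b(?:in conclusion|we conclude|to conclude)\b",
    "related work": r"\b(?:previous (?:work|studies)|has been (?:shown|reported))\b",
    "future work": r"\b(?:future work|further (?:research|studies|investigation)|remains to be)\b",
}

# Compile once; matching is case-insensitive.
COMPILED = {name: re.compile(pat, re.IGNORECASE) for name, pat in CUES.items()}


def categorize(sentence):
    """Return every category whose cue pattern occurs in the sentence."""
    return [name for name, rx in COMPILED.items() if rx.search(sentence)]


def group_by_category(sentences):
    """Map each category to the sentences tagged with it (empty ones omitted)."""
    groups = {}
    for sentence in sentences:
        for name in categorize(sentence):
            groups.setdefault(name, []).append(sentence)
    return groups


if __name__ == "__main__":
    sample = [
        "In this paper, we propose a novel approach.",
        "Our results show that the method scales.",
        "Future work will extend the grammars to other domains.",
        "The weather was pleasant.",
    ]
    for category, tagged in group_by_category(sample).items():
        print(category, "->", tagged)
```

A sentence may receive several tags or none; grouping the tagged sentences by category mirrors, in a very simplified way, the idea of letting users reach specific information types directly.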

Original language: English
Title of host publication: Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008
Publisher: European Language Resources Association (ELRA)
Pages: 1518-1522
Number of pages: 5
ISBN (Electronic): 2951740840, 9782951740846
Publication status: Published - 2008 Jan 1
Event: 6th International Conference on Language Resources and Evaluation, LREC 2008 - Marrakech, Morocco
Duration: 2008 May 28 to 2008 May 30

Other

Other: 6th International Conference on Language Resources and Evaluation, LREC 2008
Country: Morocco
City: Marrakech
Period: 08/5/28 to 08/5/30

Fingerprint

Rhetoric
Grammar
Text Corpus
Information Retrieval
Annotation

All Science Journal Classification (ASJC) codes

  • Library and Information Sciences
  • Linguistics and Language
  • Language and Linguistics
  • Education

Cite this

Ibekwe-Sanjuan, F., Chen, C., & Pinho, R. (2008). Identifying strategic information from scientific articles through sentence classification. In Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008 (pp. 1518-1522). European Language Resources Association (ELRA).
@inproceedings{284b66beae824e2f951576157b10b22b,
title = "Identifying strategic information from scientific articles through sentence classification",
abstract = "We address the need to help users rapidly access the most important or strategic information in a text corpus by identifying sentences that carry specific information. More precisely, we want to identify the contributions of authors of scientific papers by categorizing sentences using rhetorical and lexical cues. We built local grammars to annotate sentences in the corpus according to their rhetorical status: objective, new things, results, findings, hypotheses, conclusion, related work, future work. The annotation is automatically projected onto two other corpora to test the portability of the grammars across several domains. The local grammars are implemented in the Unitex system. After sentence categorization, the annotated sentences are clustered, and users can navigate the result by accessing specific information types. The results can be used for advanced information retrieval purposes.",
author = "Fidelia Ibekwe-Sanjuan and Chaomei Chen and Roberto Pinho",
year = "2008",
month = "1",
day = "1",
language = "English",
pages = "1518--1522",
booktitle = "Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008",
publisher = "European Language Resources Association (ELRA)",

}


TY - GEN
T1 - Identifying strategic information from scientific articles through sentence classification
AU - Ibekwe-Sanjuan, Fidelia
AU - Chen, Chaomei
AU - Pinho, Roberto
PY - 2008/1/1
Y1 - 2008/1/1
N2 - We address the need to help users rapidly access the most important or strategic information in a text corpus by identifying sentences that carry specific information. More precisely, we want to identify the contributions of authors of scientific papers by categorizing sentences using rhetorical and lexical cues. We built local grammars to annotate sentences in the corpus according to their rhetorical status: objective, new things, results, findings, hypotheses, conclusion, related work, future work. The annotation is automatically projected onto two other corpora to test the portability of the grammars across several domains. The local grammars are implemented in the Unitex system. After sentence categorization, the annotated sentences are clustered, and users can navigate the result by accessing specific information types. The results can be used for advanced information retrieval purposes.
UR - http://www.scopus.com/inward/record.url?scp=84889796765&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84889796765&partnerID=8YFLogxK
M3 - Conference contribution
SP - 1518
EP - 1522
BT - Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008
PB - European Language Resources Association (ELRA)
ER -
