Understanding relations using concepts and semantics

Jouyon Park, Hyunsouk Cho, Seungwon Hwang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

The Financial Entity Identification and Information Integration (FEIII) task addresses the problem of understanding relationships among financial entities and their roles using three sentences extracted from each financial contract containing the target word. The FEIII task poses two challenges: 1) data sparseness: small training sets (9% of test data) and 2) context sparseness: limited context (three sentences). Existing statistical approaches, such as Bayes and TF-IDF, cannot evaluate the importance of words unobserved in the training data, which makes them vulnerable to both challenges. We overcome each challenge by considering 1) the concepts of words from knowledge bases (Probase) in addition to the words themselves (conceptual feature) and 2) word semantics from distributed representations such as word2vec (semantic feature). We empirically evaluate the proposed classification model on a four-class classification task (highly relevant, relevant, neutral, and irrelevant) and show that it improves the F1-score by 18% over the statistical baselines.
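The contrast the abstract draws between word-level, conceptual, and semantic features can be illustrated with a minimal sketch. This is not the authors' implementation: the concept lookup and the 2-dimensional vectors below are hypothetical stand-ins for Probase and word2vec, and all function and variable names are invented for illustration.

```python
from collections import Counter

# Hypothetical word -> concept lookup, standing in for a knowledge base such as Probase.
CONCEPTS = {
    "trustee": "financial_role",
    "custodian": "financial_role",
    "bank": "institution",
}

# Hypothetical 2-d word vectors, standing in for word2vec embeddings.
VECTORS = {
    "trustee": (0.9, 0.1),
    "custodian": (0.8, 0.2),
    "bank": (0.1, 0.9),
    "acts": (0.5, 0.5),
    "as": (0.5, 0.5),
}


def word_features(tokens, vocabulary):
    """Count only words seen in training; unseen words are silently dropped."""
    return Counter(t for t in tokens if t in vocabulary)


def concept_features(tokens):
    """Count the concepts of the words, so unseen words can share a feature with seen ones."""
    return Counter(CONCEPTS[t] for t in tokens if t in CONCEPTS)


def semantic_feature(tokens):
    """Average the vectors of all tokens that have an embedding."""
    known = [VECTORS[t] for t in tokens if t in VECTORS]
    if not known:
        return (0.0, 0.0)
    dim = len(known[0])
    return tuple(sum(v[i] for v in known) / len(known) for i in range(dim))


# "custodian" never appears in the (toy) training vocabulary.
train_vocab = {"bank", "acts", "as", "trustee"}
test_tokens = ["bank", "acts", "as", "custodian"]

print(word_features(test_tokens, train_vocab))  # no entry for "custodian"
print(concept_features(test_tokens))            # "custodian" contributes "financial_role"
print(semantic_feature(test_tokens))            # "custodian" contributes its vector
```

In this toy setting the word-level features lose the unseen word entirely, while the concept and embedding features still carry a signal for it, which is the intuition behind the conceptual and semantic features described above.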

Original language: English
Title of host publication: Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450350310
DOI: https://doi.org/10.1145/3077240.3077250
Publication status: Published - 2017 May 14
Event: 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - Chicago, United States
Duration: 2017 May 14 → …

Publication series

Name: Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference

Other

Other: 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017
Country: United States
City: Chicago
Period: 17/5/14 → …

Fingerprint

Semantics

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Information Systems
  • Computer Science Applications
  • Human-Computer Interaction
  • Computer Networks and Communications

Cite this

Park, J., Cho, H., & Hwang, S. (2017). Understanding relations using concepts and semantics. In Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference [15] (Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference). Association for Computing Machinery, Inc. https://doi.org/10.1145/3077240.3077250
Park, Jouyon ; Cho, Hyunsouk ; Hwang, Seungwon. / Understanding relations using concepts and semantics. Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference. Association for Computing Machinery, Inc, 2017. (Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference).
@inproceedings{02b22a23481448e5abee7c31a074e459,
title = "Understanding relations using concepts and semantics",
abstract = "The Financial Entity Identification and Information Integration (FEIII) task addresses the problem of understanding relationships among financial entities and their roles using three sentences extracted from each financial contract containing the target word. The FEIII task poses two challenges: 1) data sparseness: small training sets (9{\%} of test data) and 2) context sparseness: limited context (three sentences). Existing statistical approaches, such as Bayes and TF-IDF, cannot evaluate the importance of words unobserved in the training data, which makes them vulnerable to both challenges. We overcome each challenge by considering 1) the concepts of words from knowledge bases (Probase) in addition to the words themselves (conceptual feature) and 2) word semantics from distributed representations such as word2vec (semantic feature). We empirically evaluate the proposed classification model on a four-class classification task (highly relevant, relevant, neutral, and irrelevant) and show that it improves the F1-score by 18{\%} over the statistical baselines.",
author = "Jouyon Park and Hyunsouk Cho and Seungwon Hwang",
year = "2017",
month = "5",
day = "14",
doi = "10.1145/3077240.3077250",
language = "English",
series = "Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference",
publisher = "Association for Computing Machinery, Inc",
booktitle = "Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference",

}

Park, J, Cho, H & Hwang, S 2017, Understanding relations using concepts and semantics. in Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference., 15, Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference, Association for Computing Machinery, Inc, 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017, Chicago, United States, 17/5/14. https://doi.org/10.1145/3077240.3077250

Understanding relations using concepts and semantics. / Park, Jouyon; Cho, Hyunsouk; Hwang, Seungwon.

Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference. Association for Computing Machinery, Inc, 2017. 15 (Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference).


TY - GEN

T1 - Understanding relations using concepts and semantics

AU - Park, Jouyon

AU - Cho, Hyunsouk

AU - Hwang, Seungwon

PY - 2017/5/14

Y1 - 2017/5/14

AB - The Financial Entity Identification and Information Integration (FEIII) task addresses the problem of understanding relationships among financial entities and their roles using three sentences extracted from each financial contract containing the target word. The FEIII task poses two challenges: 1) data sparseness: small training sets (9% of test data) and 2) context sparseness: limited context (three sentences). Existing statistical approaches, such as Bayes and TF-IDF, cannot evaluate the importance of words unobserved in the training data, which makes them vulnerable to both challenges. We overcome each challenge by considering 1) the concepts of words from knowledge bases (Probase) in addition to the words themselves (conceptual feature) and 2) word semantics from distributed representations such as word2vec (semantic feature). We empirically evaluate the proposed classification model on a four-class classification task (highly relevant, relevant, neutral, and irrelevant) and show that it improves the F1-score by 18% over the statistical baselines.

UR - http://www.scopus.com/inward/record.url?scp=85021269172&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85021269172&partnerID=8YFLogxK

U2 - 10.1145/3077240.3077250

DO - 10.1145/3077240.3077250

M3 - Conference contribution

T3 - Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference

BT - Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference

PB - Association for Computing Machinery, Inc

ER -

Park J, Cho H, Hwang S. Understanding relations using concepts and semantics. In Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference. Association for Computing Machinery, Inc. 2017. 15. (Proceedings of the 3rd International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, DSMM 2017 - In conjunction with the ACM SIGMOD/PODS Conference). https://doi.org/10.1145/3077240.3077250