A document query search using an extended centrality with the Word2vec

Wooju Kim, Heewon Jang, Hak Jin Kim, Donghe Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Everyday document search relies on keyword-based queries to search engines, but some tasks, such as examining patents or legal documents, call for a deeper search. In such cases, a document query can be more helpful than a keyword-based query because it exploits more of the information in the query document. This paper studies a document search scheme based on document queries. In particular, it represents query documents with centrality vectors instead of tf-idf vectors, combined with the Word2vec method to capture semantic similarity among the words they contain. The scheme improves document search performance and provides a way to find documents that are not only lexically but also semantically close to a query document.
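The abstract does not define the extended centrality itself, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that a document is turned into a word co-occurrence graph, that each edge is weighted by the cosine similarity of the two words' embeddings, and that a word's centrality is its normalized weighted degree. Documents are then compared with a soft cosine, so that related but non-identical words still contribute to the score. The names `TOY_VECTORS`, `centrality_vector`, `soft_cosine` and `rank` are hypothetical, and the two-dimensional vectors are made-up stand-ins for trained Word2vec embeddings.

```python
import math
from collections import defaultdict

# Hypothetical 2-d vectors standing in for trained Word2vec embeddings.
TOY_VECTORS = {
    "patent": (1.0, 0.1),
    "invention": (0.9, 0.2),
    "claim": (0.8, 0.3),
    "court": (0.1, 1.0),
    "judge": (0.2, 0.9),
}


def cosine(u, v):
    """Cosine similarity of two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centrality_vector(tokens, vectors, window=2):
    """Assumed centrality: normalized weighted degree in a co-occurrence graph.

    Words within `window` positions of each other are linked, and each link
    adds the (non-negative) embedding similarity of the two words.
    """
    scores = defaultdict(float)
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + 1 + window]:
            if w == v or w not in vectors or v not in vectors:
                continue
            sim = max(cosine(vectors[w], vectors[v]), 0.0)
            scores[w] += sim
            scores[v] += sim
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()} if total else {}


def soft_dot(a, b, vectors):
    """Inner product of two sparse word vectors under embedding similarity."""
    return sum(
        x * y * max(cosine(vectors[w], vectors[v]), 0.0)
        for w, x in a.items()
        for v, y in b.items()
    )


def soft_cosine(a, b, vectors):
    """Soft cosine: related words count as partial matches, not only equal ones."""
    denom = math.sqrt(soft_dot(a, a, vectors) * soft_dot(b, b, vectors))
    return soft_dot(a, b, vectors) / denom if denom else 0.0


def rank(query_tokens, corpus, vectors):
    """Rank corpus documents by similarity to the query document, best first."""
    query = centrality_vector(query_tokens, vectors)
    scored = [
        (name, soft_cosine(query, centrality_vector(tokens, vectors), vectors))
        for name, tokens in corpus.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    corpus = {
        "ip_filing": ["invention", "claim", "patent", "claim"],
        "trial_report": ["court", "judge", "court", "judge"],
    }
    for name, score in rank(["patent", "claim", "invention"], corpus, TOY_VECTORS):
        print(f"{name}: {score:.3f}")
```

In practice the toy dictionary would be replaced by embeddings trained on a real corpus, and the weighted-degree score could be swapped for another graph centrality; both choices here are illustrative only.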

Original language: English
Title of host publication: Proceedings of the 18th Annual International Conference on Electronic Commerce
Subtitle of host publication: e-Commerce in Smart connected World, ICEC 2016
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450342223
DOI: https://doi.org/10.1145/2971603.2971617
Publication status: Published - 2016 Aug 17
Event: 18th International Conference on Electronic Commerce, ICEC 2016 - Suwon, Korea, Republic of
Duration: 2016 Aug 17 to 2016 Aug 19

Publication series

Name: ACM International Conference Proceeding Series
Volume: 17-19-August-2016

Other

Other: 18th International Conference on Electronic Commerce, ICEC 2016
Country: Korea, Republic of
City: Suwon
Period: 16/8/17 to 16/8/19

Fingerprint

Search engines
Semantics

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

Kim, W., Jang, H., Kim, H. J., & Kim, D. (2016). A document query search using an extended centrality with the Word2vec. In Proceedings of the 18th Annual International Conference on Electronic Commerce: e-Commerce in Smart connected World, ICEC 2016 [2971617] (ACM International Conference Proceeding Series; Vol. 17-19-August-2016). Association for Computing Machinery. https://doi.org/10.1145/2971603.2971617
Kim, Wooju ; Jang, Heewon ; Kim, Hak Jin ; Kim, Donghe. / A document query search using an extended centrality with the Word2vec. Proceedings of the 18th Annual International Conference on Electronic Commerce: e-Commerce in Smart connected World, ICEC 2016. Association for Computing Machinery, 2016. (ACM International Conference Proceeding Series).
@inproceedings{b6304738ed494ed68bbbb06e74deedfc,
title = "A document query search using an extended centrality with the Word2vec",
abstract = "While everyday document search is done by keyword-based queries to search engines, we have situations that need deep search of documents such as scrutinies of patents, legal documents, and so on. In such cases, using document queries, instead of keyword-based queries, can be more helpful because it exploits more information from the query document. This paper studies a scheme of document search based on document queries. In particular, it uses centrality vectors, instead of tf-idf vectors, to represent query documents, combined with the Word2vec method to capture the semantic similarity in contained words. This scheme improves the performance of document search and provides a way to find documents not only lexically, but semantically close to a query document.",
author = "Wooju Kim and Heewon Jang and Kim, {Hak Jin} and Donghe Kim",
year = "2016",
month = "8",
day = "17",
doi = "10.1145/2971603.2971617",
language = "English",
series = "ACM International Conference Proceeding Series",
publisher = "Association for Computing Machinery",
booktitle = "Proceedings of the 18th Annual International Conference on Electronic Commerce",

}

Kim, W, Jang, H, Kim, HJ & Kim, D 2016, A document query search using an extended centrality with the Word2vec. in Proceedings of the 18th Annual International Conference on Electronic Commerce: e-Commerce in Smart connected World, ICEC 2016., 2971617, ACM International Conference Proceeding Series, vol. 17-19-August-2016, Association for Computing Machinery, 18th International Conference on Electronic Commerce, ICEC 2016, Suwon, Korea, Republic of, 16/8/17. https://doi.org/10.1145/2971603.2971617

A document query search using an extended centrality with the Word2vec. / Kim, Wooju; Jang, Heewon; Kim, Hak Jin; Kim, Donghe.

Proceedings of the 18th Annual International Conference on Electronic Commerce: e-Commerce in Smart connected World, ICEC 2016. Association for Computing Machinery, 2016. 2971617 (ACM International Conference Proceeding Series; Vol. 17-19-August-2016).

TY - GEN

T1 - A document query search using an extended centrality with the Word2vec

AU - Kim, Wooju

AU - Jang, Heewon

AU - Kim, Hak Jin

AU - Kim, Donghe

PY - 2016/8/17

Y1 - 2016/8/17

AB - While everyday document search is done by keyword-based queries to search engines, we have situations that need deep search of documents such as scrutinies of patents, legal documents, and so on. In such cases, using document queries, instead of keyword-based queries, can be more helpful because it exploits more information from the query document. This paper studies a scheme of document search based on document queries. In particular, it uses centrality vectors, instead of tf-idf vectors, to represent query documents, combined with the Word2vec method to capture the semantic similarity in contained words. This scheme improves the performance of document search and provides a way to find documents not only lexically, but semantically close to a query document.

UR - http://www.scopus.com/inward/record.url?scp=84988723919&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84988723919&partnerID=8YFLogxK

U2 - 10.1145/2971603.2971617

DO - 10.1145/2971603.2971617

M3 - Conference contribution

AN - SCOPUS:84988723919

T3 - ACM International Conference Proceeding Series

BT - Proceedings of the 18th Annual International Conference on Electronic Commerce

PB - Association for Computing Machinery

ER -

Kim W, Jang H, Kim HJ, Kim D. A document query search using an extended centrality with the Word2vec. In Proceedings of the 18th Annual International Conference on Electronic Commerce: e-Commerce in Smart connected World, ICEC 2016. Association for Computing Machinery. 2016. 2971617. (ACM International Conference Proceeding Series). https://doi.org/10.1145/2971603.2971617