KBQA: Learning question answering over QA corpora and knowledge bases

Wanyun Cui, Yanghua Xiao, Haixun Wang, Yangqiu Song, Seung Won Hwang, Wei Wang

Research output: Contribution to journal, Conference article, peer-review

74 Citations (Scopus)

Abstract

Question answering (QA) has become a popular way for humans to access billion-scale knowledge bases. Unlike web search, QA over a knowledge base returns accurate and concise results, provided that natural language questions can be understood and mapped precisely to structured queries over the knowledge base. The challenge, however, is that a human can ask one question in many different ways. Previous approaches have natural limits due to their representations: rule-based approaches only understand a small set of "canned" questions, while keyword-based or synonym-based approaches cannot fully understand the questions. In this paper, we design a new kind of question representation: templates, learned over a billion-scale knowledge base and million-scale QA corpora. For example, for questions about a city's population, we learn templates such as "What's the population of $city?" and "How many people are there in $city?". We learned 27 million templates for 2782 intents. Based on these templates, our QA system KBQA effectively supports binary factoid questions, as well as complex questions which are composed of a series of binary factoid questions. Furthermore, we expand predicates in the RDF knowledge base, which boosts the coverage of the knowledge base by 57 times. Our QA system beats all other state-of-the-art systems on both effectiveness and efficiency over QALD benchmarks.
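The template idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionaries, function names, and the exact-match lookup are hypothetical stand-ins, and the hand-written table below only plays the role of the template-to-predicate mapping that the paper learns from QA corpora. The sketch replaces an entity mention with a `$concept` placeholder and looks up the resulting template.

```python
import re

# Hypothetical toy data (assumptions for illustration only):
# entity mention -> concept, and template -> knowledge-base predicate.
ENTITY_CONCEPTS = {"honolulu": "city", "shanghai": "city"}
TEMPLATE_TO_PREDICATE = {
    "what's the population of $city?": "population",
    "how many people are there in $city?": "population",
}


def to_template(question):
    """Replace the first known entity mention with its $concept placeholder.

    Returns (template, entity), or None if no known entity is mentioned.
    """
    normalized = question.strip().lower()
    for entity, concept in ENTITY_CONCEPTS.items():
        pattern = r"\b" + re.escape(entity) + r"\b"
        if re.search(pattern, normalized):
            # A callable replacement avoids any escape handling in the "$concept" text.
            template = re.sub(pattern, lambda _m: "$" + concept, normalized, count=1)
            return template, entity
    return None


def lookup_predicate(question):
    """Map a question to an (entity, predicate) pair via its template, or None."""
    result = to_template(question)
    if result is None:
        return None
    template, entity = result
    predicate = TEMPLATE_TO_PREDICATE.get(template)
    if predicate is None:
        return None
    return entity, predicate
```

Under these assumptions, both phrasings of the population question resolve to the same (entity, predicate) pair, which could then be turned into a structured query over the knowledge base.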

Original language: English
Pages (from-to): 565-576
Number of pages: 12
Journal: Proceedings of the VLDB Endowment
Volume: 10
Issue number: 5
Publication status: Published - 2016
Event: 43rd International Conference on Very Large Data Bases, VLDB 2017 - Munich, Germany
Duration: 2017 Aug 28 to 2017 Sep 1

Bibliographical note

Funding Information:
This paper was supported by the National Key Basic Research Program of China under No. 2015CB358800, by the National NSFC (No. 61472085, U1509213), by the Shanghai Municipal Science and Technology Commission foundation key project under No. 15JC1400900, and by the Shanghai Municipal Science and Technology project under No. 16511102102. Seung-won Hwang was supported by an IITP grant funded by the Korea government (MSIP; No. B0101-16-0307) and Microsoft Research.

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • Computer Science (all)
