Exploring the research landscape of data warehousing and mining based on DaWaK Conference full-text articles

Tatsawan Timakum, Soobin Lee, Min Song

Research output: Contribution to journal › Article › peer-review


The International Conference on Data Warehousing and Knowledge Discovery (DaWaK) has become a pivotal venue for researchers and practitioners in big data analytics to exchange experience and knowledge. The conference has been essential to data warehousing and data analytics for the last 21 years (1999–2019). This study explored the knowledge structure embedded in the DaWaK Conference papers and examined the research trends over time. It also analyzed the performance of published papers, authors, and their affiliations and countries, and visualized a collaboration network in DaWaK. We applied several text mining techniques, including co-word analysis, topic modeling, co-author network analysis, and network visualization. The study's findings indicate that the core topics are data mining techniques, algorithm performance, and information systems. Topics gaining popularity are associated with database encryption, whereas topics related to online analytical processing (OLAP) technology are in decline. The research metrics show that the 722 DaWaK papers were cited 6,262 times, with an h-index of 34. The article titled “Outlier Detection Using Replicator Neural Networks” received the most citations (177), and the most productive author was Bellatreche, Ladjel (15 papers). Nanyang Technological University is the most frequently listed author affiliation, the United States is the country with the largest number of authors, and the National Science Foundation was the largest funding agency supporting DaWaK researchers. Moreover, the authorship network of Bellatreche, Ladjel is the largest collaboration network in the DaWaK scholarly community. The outcomes of this study should help readers understand the body of knowledge in data warehousing, the relevant cross-disciplinary research areas, and the collaboration networks in this field.

Original language: English
Article number: 101926
Journal: Data and Knowledge Engineering
Publication status: Published - 2021 Sep

Bibliographical note

Funding Information:
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) [No. NRF-2019R1A2C2002577].

Funding Information:
Table 5 lists the top 15 research funding foundations that supported the DaWaK Conference researchers and their publications. Over the last 21 years, 63 research foundations worldwide were identified in the dataset, accounting for 123 instances of support. The National Science Foundation provided the most support in the DaWaK Conference (15 instances), followed by the Natural Sciences and Engineering Research Council of Canada (12), and the Conselho Nacional de Desenvolvimento Científico e Tecnológico and the National Natural Science Foundation of China (5 each).

Funding Information:
The DaWaK research metrics confirm that DaWaK research outcomes have received consistent attention from the scholarly community over the last 21 years. The papers were cited 6,262 times, with an h-index of 34. The most cited paper was “Outlier Detection Using Replicator Neural Networks”, with 177 citations, co-authored by Hawkins, Simon; He, Hongxing; Williams, Graham; and Baxter, Rohan, and published in 2002. The most productive author at the DaWaK Conference was Bellatreche, Ladjel, who wrote 15 papers. Cuzzocrea, Alfredo had the most citations (176), with an h-index of 6 across his 12 papers published in DaWaK. Furthermore, Nanyang Technological University is the top author affiliation, and the top affiliated country is the United States. The National Science Foundation is the highest-ranking grant sponsor in the DaWaK Conference.
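
The h-index reported above is a different quantity from the mean number of citations per paper. As a minimal sketch of the distinction, the Python snippet below computes both. The function name `h_index` and the sample citation counts are illustrative assumptions and are not taken from the DaWaK dataset; only the totals 6,262 and 722 come from the text.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for six papers (not from the DaWaK dataset).
sample = [10, 8, 5, 4, 3, 0]
print(h_index(sample))             # 4: four papers have at least 4 citations
print(sum(sample) / len(sample))   # 5.0: mean citations per paper

# Mean citations per paper implied by the totals reported in the text.
print(round(6262 / 722, 2))        # 8.67
```

With the reported totals, the mean is roughly 8.67 citations per paper, while an h-index of 34 means that 34 papers each received at least 34 citations.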

Publisher Copyright:
© 2021 Elsevier B.V.

All Science Journal Classification (ASJC) codes

  • Information Systems and Management

