Scalable Prediction Models for Airbnb Listing in Spark Big Data Cluster using GPU-accelerated RAPIDS

Samyuktha Muralidharan, Savita Yadav, Jungwoo Huh, Sanghoon Lee, Jongwook Woo

Research output: Contribution to journal › Article › peer-review

Abstract

We aim to build predictive models for Airbnb listing prices using GPU-accelerated RAPIDS in a big data cluster. The Airbnb Listings datasets are used for the predictive analysis. Several machine-learning algorithms have been adopted to build models that predict the price of Airbnb listings. We compare the results of traditional and big data approaches to machine learning for price prediction and discuss the performance of the models. We built the big data models on a Databricks Spark cluster, a distributed parallel computing system. Furthermore, we implemented a model on multiple GPUs using RAPIDS in the Spark cluster. This GPU-based model was developed using the XGBoost algorithm, whereas the other models were developed using traditional central processing unit (CPU)-based algorithms. This study compared all models in terms of accuracy metrics and computing time. We observed that the XGBoost model with RAPIDS using GPUs performed best in terms of both accuracy and computing time.
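The comparison described in the abstract, accuracy metrics alongside computing time, can be illustrated with a minimal sketch. The abstract does not name the metrics used, so RMSE and R² are assumptions here, and the helper names (`rmse`, `r2_score`, `evaluate`) and the toy price data are hypothetical. This is not the authors' pipeline; in practice the timed callable would wrap a CPU-based or GPU-based training run.

```python
import math
import time
from typing import Callable, Sequence, Tuple


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted prices."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination (1.0 is a perfect fit)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def evaluate(
    train_and_predict: Callable[[], Sequence[float]],
    y_true: Sequence[float],
) -> Tuple[float, float, float]:
    """Time one train-and-predict run and score its predictions.

    Returns (rmse, r2, elapsed_seconds).
    """
    start = time.perf_counter()
    y_pred = train_and_predict()
    elapsed = time.perf_counter() - start
    return rmse(y_true, y_pred), r2_score(y_true, y_pred), elapsed


# Hypothetical toy data: four listing prices and a baseline "model"
# that always predicts the mean price.
prices = [100.0, 150.0, 200.0, 250.0]


def mean_baseline() -> Sequence[float]:
    mean_price = sum(prices) / len(prices)
    return [mean_price] * len(prices)


rmse_val, r2_val, seconds = evaluate(mean_baseline, prices)
print(f"RMSE={rmse_val:.2f}  R2={r2_val:.2f}  time={seconds:.6f}s")
```

Running each candidate model through the same `evaluate` call keeps the accuracy and timing figures directly comparable across models.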

Original language: English
Pages (from-to): 96-102
Number of pages: 7
Journal: Journal of Information and Communication Convergence Engineering
Volume: 20
Issue number: 2
Publication status: Published - 2022

Bibliographical note

Funding Information:
Dr. Woo received his Ph.D. from USC and went to Yonsei University. He is a Professor in the CIS Department of California State University, Los Angeles, and serves as a Technical Advisor of Teradata, Spark Technology Center, and KSEA-SC. He has consulted for companies in Hollywood. He has published more than 50 papers on scalable deep learning, big data analysis, and prediction. He has been awarded the Teradata TUN Faculty Scholarship and has received grants from Databricks, NVIDIA, Amazon, IBM, Oracle, Microsoft, Cloudera, Hortonworks, SAS, QlikView, and Tableau.

Funding Information:
The Databricks University Alliance supported this research. We appreciate the support of Rob Reed, Program Director at the Databricks University Alliance.

Publisher Copyright:
© The Korea Institute of Information and Communication Engineering

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Media Technology
  • Computer Networks and Communications
  • Electrical and Electronic Engineering
