A Scalable Partitioned Approach to Model Massive Nonstationary Non-Gaussian Spatial Datasets

Benjamin Seiyon Lee, Jaewoo Park

Research output: Contribution to journal › Article › peer-review

Abstract

Nonstationary non-Gaussian spatial data are common in many disciplines, including climate science, ecology, epidemiology, and social sciences. Examples include count data on disease incidence and binary satellite data on cloud mask (cloud/no-cloud). Modeling such datasets as stationary spatial processes can be unrealistic since they are collected over large heterogeneous domains (i.e., spatial behavior differs across subregions). Although several approaches have been developed for nonstationary spatial models, these have focused primarily on Gaussian responses. In addition, fitting nonstationary models for large non-Gaussian datasets is computationally prohibitive. To address these challenges, we propose a scalable algorithm for modeling such data by leveraging parallel computing in modern high-performance computing systems. We partition the spatial domain into disjoint subregions and fit locally nonstationary models using a carefully curated set of spatial basis functions. Then, we combine the local processes using a novel neighbor-based weighting scheme. Our approach scales well to massive datasets (e.g., 2.7 million samples) and can be implemented in nimble, a popular software environment for Bayesian hierarchical modeling. We demonstrate our method on simulated examples and two massive real-world datasets acquired through remote sensing.
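
The partition, fit-locally, then combine workflow described in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm: the abstract does not specify the basis functions or the weighting scheme, so the code below assumes a regular grid partition of the unit square, an intercept-only Poisson fit per block as a stand-in for the local basis-function models, and inverse-distance weights over a block and its adjacent blocks as a stand-in for the neighbor-based weighting. All function names (`assign_blocks`, `block_centers`, `fit_local`, `predict`) are hypothetical.

```python
import numpy as np

def assign_blocks(coords, n_side):
    """Map points in the unit square to one of n_side * n_side disjoint blocks."""
    idx = np.minimum((coords * n_side).astype(int), n_side - 1)
    return idx[:, 0] * n_side + idx[:, 1]

def block_centers(n_side):
    """Centers of the grid blocks, ordered to match the ids from assign_blocks."""
    g = (np.arange(n_side) + 0.5) / n_side
    cx, cy = np.meshgrid(g, g, indexing="ij")
    return np.column_stack([cx.ravel(), cy.ravel()])

def fit_local(counts, blocks, n_blocks):
    """Assumed stand-in for the local model: intercept-only Poisson MLE,
    i.e. the log of the mean count in each block. The blocks are independent,
    so this loop could be distributed across workers."""
    log_rate = np.full(n_blocks, np.nan)
    for b in range(n_blocks):
        y = counts[blocks == b]
        if y.size > 0:
            log_rate[b] = np.log(y.mean() + 1e-8)
    return log_rate

def predict(new_coords, centers, log_rate, n_side):
    """Assumed stand-in for the weighting step: inverse-distance weights over
    blocks whose centers lie within 1.5 block widths of the prediction point."""
    radius = 1.5 / n_side
    d = np.linalg.norm(new_coords[:, None, :] - centers[None, :, :], axis=2)
    usable = (d <= radius) & ~np.isnan(log_rate)[None, :]
    w = np.where(usable, 1.0 / (d + 1e-6), 0.0)
    w /= w.sum(axis=1, keepdims=True)  # weights sum to one per point
    return np.exp(w @ np.nan_to_num(log_rate))

# Simulated count data whose rate varies across the domain.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(5000, 2))
counts = rng.poisson(np.exp(1.0 + 2.0 * coords[:, 0]))

n_side = 4
blocks = assign_blocks(coords, n_side)
log_rate = fit_local(counts, blocks, n_side * n_side)
centers = block_centers(n_side)

new_points = rng.uniform(size=(10, 2))
pred = predict(new_points, centers, log_rate, n_side)
```

Because the weights are normalized per prediction point, each predicted rate is a geometric weighted average of nearby block rates, which smooths the discontinuities that independent block fits would otherwise leave at partition boundaries.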

Original language: English
Pages (from-to): 105-116
Number of pages: 12
Journal: Technometrics
Volume: 65
Issue number: 1
Publication status: Published - 2023

Bibliographical note

Funding Information:
Jaewoo Park was supported by the Yonsei University Research Fund 2020-22-0501 and the National Research Foundation of Korea (NRF-2020R1C1C1A0100386811). The authors are grateful to Matthew Heaton, Murali Haran, John Hughes, and Whitney Huang for providing useful sample code and advice. The authors also thank the anonymous reviewers for their careful review and valuable comments.

Publisher Copyright:
© 2022 American Statistical Association and the American Society for Quality.

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Modelling and Simulation
  • Applied Mathematics
