Bayesian model selection for high-dimensional Ising models, with applications to educational data

Jaewoo Park, Ick Hoon Jin, Michael Schweinberger

Research output: Contribution to journal, Article, peer-review

Abstract

Doubly-intractable posterior distributions arise in many applications of statistics concerned with discrete and dependent data, including physics, spatial statistics, machine learning, the social sciences, and other fields. A specific example is psychometrics, which has adapted high-dimensional Ising models from machine learning, with a view to studying the interactions among binary item responses in educational assessments. To estimate high-dimensional Ising models from educational assessment data, ℓ1-penalized nodewise logistic regressions have been used. Theoretical results in high-dimensional statistics show that ℓ1-penalized nodewise logistic regressions can recover the true interaction structure with high probability, provided that certain assumptions are satisfied. Those assumptions are hard to verify in practice and may be violated, and quantifying the uncertainty about the estimated interaction structure and parameter estimators is challenging. We propose a Bayesian approach that helps quantify the uncertainty about the interaction structure and parameters without requiring strong assumptions, and can be applied to Ising models with thousands of parameters. We demonstrate the advantages of the proposed Bayesian approach compared with ℓ1-penalized nodewise logistic regressions by simulation studies and applications to small and large educational data sets with up to 2,485 parameters. Among other things, the simulation studies suggest that the Bayesian approach is more robust against model misspecification due to omitted covariates than ℓ1-penalized nodewise logistic regressions.
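To make the terms in the abstract concrete, the following minimal sketch uses a standard Ising parametrization for binary responses coded in {0, 1}. It is not taken from the paper: the names `alpha` (item main effects), `gamma` (pairwise interactions) and all function names are assumptions made for illustration. The sketch shows two things. The normalizing constant requires a sum over all 2^p response patterns, which is the source of the intractability when p is large. The conditional distribution of one item given the others is a logistic regression in which that constant cancels, which is why nodewise logistic regressions can be used for estimation.

```python
import itertools
import math

def log_potential(x, alpha, gamma):
    """Unnormalized log-probability of a binary response vector x in {0,1}^p."""
    p = len(x)
    main = sum(alpha[i] * x[i] for i in range(p))
    pair = sum(gamma[i][j] * x[i] * x[j]
               for i in range(p) for j in range(i + 1, p))
    return main + pair

def normalizing_constant(alpha, gamma):
    """Brute-force sum over all 2^p configurations; feasible only for tiny p."""
    p = len(alpha)
    return sum(math.exp(log_potential(x, alpha, gamma))
               for x in itertools.product((0, 1), repeat=p))

def conditional_from_joint(i, x, alpha, gamma):
    """P(x_i = 1 | other items) from the joint; the normalizing constant cancels."""
    x1 = list(x)
    x1[i] = 1
    x0 = list(x)
    x0[i] = 0
    w1 = math.exp(log_potential(x1, alpha, gamma))
    w0 = math.exp(log_potential(x0, alpha, gamma))
    return w1 / (w0 + w1)

def conditional_logistic(i, x, alpha, gamma):
    """The same conditional, written as a logistic regression of item i on the others."""
    eta = alpha[i] + sum(gamma[min(i, j)][max(i, j)] * x[j]
                         for j in range(len(x)) if j != i)
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical toy parameters for 3 items; only the upper triangle of gamma is used.
alpha = [0.2, -0.5, 0.1]
gamma = [[0.0, 0.8, 0.0],
         [0.0, 0.0, -0.4],
         [0.0, 0.0, 0.0]]
x = (1, 0, 1)

Z = normalizing_constant(alpha, gamma)  # sums 2**3 = 8 terms here
for i in range(len(x)):
    joint_based = conditional_from_joint(i, x, alpha, gamma)
    logistic_based = conditional_logistic(i, x, alpha, gamma)
    print(i, round(joint_based, 6), round(logistic_based, 6))
```

The two columns printed for each item agree, illustrating that each conditional can be evaluated without the normalizing constant. The brute-force sum in `normalizing_constant` doubles in cost with every added item, so it is shown only for the toy case.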

Original language: English
Article number: 107325
Journal: Computational Statistics and Data Analysis
Volume: 165
Publication status: Published - 2022 Jan

Bibliographical note

Funding Information:
Jaewoo Park was partially supported by the Yonsei University Research Fund of 2019-22-0194 and the National Research Foundation of Korea (NRF-2020R1C1C1A0100386811). Ick Hoon Jin was partially supported by the Yonsei University Research Fund of 2019-22-0210 and the National Research Foundation of Korea (NRF-2020R1A2C1A01009881). Michael Schweinberger was partially supported by the U.S. National Science Foundation (NSF award DMS-1812119). The authors are grateful to an anonymous associate editor and two anonymous reviewers, whose constructive comments have greatly improved the paper.

Publisher Copyright:
© 2021 Elsevier B.V.

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Computational Mathematics
  • Computational Theory and Mathematics
  • Applied Mathematics
