The sizes of the three popular asymptotic tests for testing homogeneity of two binomial proportions

Seung Ho Kang, Yonghee Lee, Eun Sug Park

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

In statistical hypothesis testing it is important to ensure that the type I error rate is kept at or below the nominal level. This paper addresses the sizes and the type I error rates of three popular asymptotic tests for homogeneity of two binomial proportions: the chi-square test with and without continuity correction, and the likelihood ratio test. Although limited simulation studies have shown that the sizes of the tests are inflated in small samples, it has been thought that the sizes are well preserved under the nominal level when the sample size is sufficiently large. However, Loh [1989. Bounds on the size of the χ² test of independence in a contingency table. Ann. Statist. 17, 1709-1722] and Loh and Yu [1993. Bounds on the size of the likelihood ratio test of independence in a contingency table. J. Multivariate Anal. 45, 291-304] showed theoretically that the sizes are always greater than or equal to the nominal level when the sample size is infinite. In this paper, we confirm their results by numerically computing the large-sample lower bounds of the sizes. Using complete enumeration, which is not subject to simulation error, we confirm the results again by computing the sizes exactly for moderate sample sizes. When the sample sizes are unbalanced, the peaks of the type I error rates occur at the extremes of the nuisance parameter. However, the type I error rates of the three tests are close to the nominal level for most values of the nuisance parameter other than the extremes. We also find that, when the sample sizes are severely unbalanced and the value of the nuisance parameter is very small, the size of the chi-square test with continuity correction can greatly exceed the nominal level (for instance, the size could be at least 0.877 at the 5% nominal level in some cases).
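The complete-enumeration idea mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`pearson_stat`, `type1_error`, `size_on_grid`), the handling of degenerate tables, the standard Yates form of the continuity correction, and the use of a finite grid over the nuisance parameter are all assumptions made here. For fixed sample sizes n1 and n2 and a common success probability p, the type I error rate is the total binomial probability of every table (x1, x2) whose statistic exceeds the critical value; the size is then approximated by the largest such rate over the grid.

```python
from math import comb

# Upper 5% point of the chi-square distribution with 1 degree of freedom.
CRIT_5PCT = 3.841458820694124


def pearson_stat(x1, n1, x2, n2, correction=False):
    """Pearson chi-square statistic for a 2x2 table, optionally Yates-corrected."""
    n = n1 + n2
    s = x1 + x2  # total number of successes
    if s == 0 or s == n:
        return 0.0  # degenerate table: treated here as no evidence against H0
    diff = abs(x1 * (n2 - x2) - x2 * (n1 - x1))
    if correction:
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / (n1 * n2 * s * (n - s))


def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def type1_error(n1, n2, p, correction=False, crit=CRIT_5PCT):
    """Rejection probability under H0: p1 = p2 = p, summed over all tables."""
    total = 0.0
    for x1 in range(n1 + 1):
        f1 = binom_pmf(x1, n1, p)
        for x2 in range(n2 + 1):
            if pearson_stat(x1, n1, x2, n2, correction) > crit:
                total += f1 * binom_pmf(x2, n2, p)
    return total


def size_on_grid(n1, n2, grid_points=199, correction=False):
    """Largest type I error rate over an evenly spaced interior grid of p."""
    grid = [(i + 1) / (grid_points + 1) for i in range(grid_points)]
    return max(type1_error(n1, n2, p, correction) for p in grid)
```

Calling `type1_error(n1, n2, p)` for a range of p values traces the type I error rate as a function of the nuisance parameter. Because the maximum is taken over a finite grid, `size_on_grid` only bounds the true size from below; a finer grid near 0 and 1 would be needed to capture peaks at the extremes.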

Original language: English
Pages (from-to): 710-722
Number of pages: 13
Journal: Computational Statistics and Data Analysis
Volume: 51
Issue number: 2
DOI: 10.1016/j.csda.2006.03.006
Publication status: Published - 2006 Nov 15

Fingerprint

Asymptotic Test
Homogeneity
Proportion
Testing
Categorical or nominal
Type I Error Rate
Sample Size
Nuisance Parameter
Continuity Correction
Test of Independence
Chi-squared test
Contingency Table
Likelihood Ratio Test
Extremes
Computing
Hypothesis Testing
Small Sample
Enumeration
Exceed
Simulation Study

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Computational Mathematics
  • Computational Theory and Mathematics
  • Applied Mathematics

Cite this

@article{c02802a04f11453ea63561b1b2feff61,
title = "The sizes of the three popular asymptotic tests for testing homogeneity of two binomial proportions",
abstract = "In statistical hypothesis testing it is important to ensure that the type I error rate is kept at or below the nominal level. This paper addresses the sizes and the type I error rates of three popular asymptotic tests for homogeneity of two binomial proportions: the chi-square test with and without continuity correction, and the likelihood ratio test. Although limited simulation studies have shown that the sizes of the tests are inflated in small samples, it has been thought that the sizes are well preserved under the nominal level when the sample size is sufficiently large. However, Loh [1989. Bounds on the size of the χ² test of independence in a contingency table. Ann. Statist. 17, 1709-1722] and Loh and Yu [1993. Bounds on the size of the likelihood ratio test of independence in a contingency table. J. Multivariate Anal. 45, 291-304] showed theoretically that the sizes are always greater than or equal to the nominal level when the sample size is infinite. In this paper, we confirm their results by numerically computing the large-sample lower bounds of the sizes. Using complete enumeration, which is not subject to simulation error, we confirm the results again by computing the sizes exactly for moderate sample sizes. When the sample sizes are unbalanced, the peaks of the type I error rates occur at the extremes of the nuisance parameter. However, the type I error rates of the three tests are close to the nominal level for most values of the nuisance parameter other than the extremes. We also find that, when the sample sizes are severely unbalanced and the value of the nuisance parameter is very small, the size of the chi-square test with continuity correction can greatly exceed the nominal level (for instance, the size could be at least 0.877 at the 5{\%} nominal level in some cases).",
author = "Kang, {Seung Ho} and Lee, Yonghee and Park, {Eun Sug}",
year = "2006",
month = "11",
day = "15",
doi = "10.1016/j.csda.2006.03.006",
language = "English",
volume = "51",
pages = "710--722",
journal = "Computational Statistics and Data Analysis",
issn = "0167-9473",
publisher = "Elsevier",
number = "2",

}

The sizes of the three popular asymptotic tests for testing homogeneity of two binomial proportions. / Kang, Seung Ho; Lee, Yonghee; Park, Eun Sug.

In: Computational Statistics and Data Analysis, Vol. 51, No. 2, 15.11.2006, p. 710-722.


TY - JOUR

T1 - The sizes of the three popular asymptotic tests for testing homogeneity of two binomial proportions

AU - Kang, Seung Ho

AU - Lee, Yonghee

AU - Park, Eun Sug

PY - 2006/11/15

Y1 - 2006/11/15


AB - In statistical hypothesis testing it is important to ensure that the type I error rate is kept at or below the nominal level. This paper addresses the sizes and the type I error rates of three popular asymptotic tests for homogeneity of two binomial proportions: the chi-square test with and without continuity correction, and the likelihood ratio test. Although limited simulation studies have shown that the sizes of the tests are inflated in small samples, it has been thought that the sizes are well preserved under the nominal level when the sample size is sufficiently large. However, Loh [1989. Bounds on the size of the χ² test of independence in a contingency table. Ann. Statist. 17, 1709-1722] and Loh and Yu [1993. Bounds on the size of the likelihood ratio test of independence in a contingency table. J. Multivariate Anal. 45, 291-304] showed theoretically that the sizes are always greater than or equal to the nominal level when the sample size is infinite. In this paper, we confirm their results by numerically computing the large-sample lower bounds of the sizes. Using complete enumeration, which is not subject to simulation error, we confirm the results again by computing the sizes exactly for moderate sample sizes. When the sample sizes are unbalanced, the peaks of the type I error rates occur at the extremes of the nuisance parameter. However, the type I error rates of the three tests are close to the nominal level for most values of the nuisance parameter other than the extremes. We also find that, when the sample sizes are severely unbalanced and the value of the nuisance parameter is very small, the size of the chi-square test with continuity correction can greatly exceed the nominal level (for instance, the size could be at least 0.877 at the 5% nominal level in some cases).

UR - http://www.scopus.com/inward/record.url?scp=33750378332&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33750378332&partnerID=8YFLogxK

U2 - 10.1016/j.csda.2006.03.006

DO - 10.1016/j.csda.2006.03.006

M3 - Article

AN - SCOPUS:33750378332

VL - 51

SP - 710

EP - 722

JO - Computational Statistics and Data Analysis

JF - Computational Statistics and Data Analysis

SN - 0167-9473

IS - 2

ER -