A weighted sample size for microarray datasets that considers the variability of variance and multiplicity

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Microarray experiments are often performed to detect differentially expressed genes among different clinical phenotypes. The sample size calculation appropriate for this purpose differs from the one used for general clinical experiments, because microarrays include tens of thousands of genes. We proposed a sample size calculation method that considers the variance across the entire gene set and uses the Bonferroni correction to address the multiplicity problem. Specifically, the existing sample size equation was modified to incorporate the Bonferroni correction. Using k-means cluster analysis, the variances of all genes were divided into several groups with similar values; a sample size was then calculated for each group, and the group sample sizes were combined as a weighted average. The results show that the required sample size depends on the number of genes on a chip. The weighted sample size calculated by the proposed method preserved the Type I error rate when selecting significant genes within a microarray data set.
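The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The per-group formula below is the textbook two-sample normal approximation, n = 2σ²(z₁₋α′/₂ + z₁₋β)²/δ², with α′ = α/G for G genes, and it is an assumption, since the abstract does not state the exact equation. Using cluster proportions as weights and the cluster mean as each cluster's representative variance are likewise assumptions. All function and parameter names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean


def per_group_sample_size(variance, delta, alpha, power, n_genes):
    """Per-group n for a two-sample comparison (normal approximation).

    Assumption: textbook formula, with alpha split across n_genes
    tests by the Bonferroni correction.
    """
    alpha_adj = alpha / n_genes  # Bonferroni-adjusted significance level
    z_alpha = NormalDist().inv_cdf(1 - alpha_adj / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * variance * (z_alpha + z_beta) ** 2 / delta ** 2


def kmeans_1d(values, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on scalars; returns the non-empty clusters."""
    rng = random.Random(seed)
    # Start from k distinct observed values (requires k <= distinct values).
    centers = rng.sample(sorted(set(values)), k)
    clusters = []
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        clusters = [c for c in clusters if c]  # drop any empty cluster
        new_centers = [mean(c) for c in clusters]
        if new_centers == centers:  # assignments are stable
            break
        centers = new_centers
    return clusters


def weighted_sample_size(variances, delta, k=3, alpha=0.05, power=0.8):
    """Average the per-cluster sample sizes, weighted by cluster proportion.

    Assumption: weights are cluster sizes divided by the number of genes.
    """
    n_genes = len(variances)
    total = 0.0
    for cluster in kmeans_1d(variances, k):
        n_c = per_group_sample_size(mean(cluster), delta, alpha, power, n_genes)
        total += (len(cluster) / n_genes) * n_c
    return total


if __name__ == "__main__":
    # Made-up gene-wise variances, for illustration only.
    gene_variances = [0.10, 0.11, 0.12, 1.0, 1.1, 5.0, 5.2]
    n = weighted_sample_size(gene_variances, delta=1.0, k=3)
    print("samples per group:", math.ceil(n))
```

Because the Bonferroni-adjusted level α/G shrinks as the number of genes G grows, the required sample size in this sketch increases with G, which is consistent with the abstract's finding. Note that under these particular assumptions (a formula linear in the variance, proportional weights, cluster means as representatives) the weighted average coincides with the sample size computed from the overall mean variance; the published method may use a different equation or weighting for which the clustering step changes the result.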

Original language: English
Pages (from-to): 252-258
Number of pages: 7
Journal: Journal of Bioscience and Bioengineering
Volume: 108
Issue number: 3
DOI: 10.1016/j.jbiosc.2009.03.017
Publication status: Published - 2009 Sep 1

Fingerprint

Microarrays
Sample Size
Genes
Cluster analysis
Datasets
Experiments
Phenotype
Weights and Measures

All Science Journal Classification (ASJC) codes

  • Biotechnology
  • Applied Microbiology and Biotechnology
  • Bioengineering

Cite this

@article{1f22725195f54f50883e6db35b0ffd7a,
title = "A weighted sample size for microarray datasets that considers the variability of variance and multiplicity",
author = "Kim, {Ki Yeol} and Hyuncheol Chung and SunYoung Rha",
year = "2009",
month = "9",
day = "1",
doi = "10.1016/j.jbiosc.2009.03.017",
language = "English",
volume = "108",
pages = "252--258",
journal = "Journal of Bioscience and Bioengineering",
issn = "1389-1723",
publisher = "Elsevier",
number = "3",

}

TY - JOUR

T1 - A weighted sample size for microarray datasets that considers the variability of variance and multiplicity

AU - Kim, Ki Yeol

AU - Chung, Hyuncheol

AU - Rha, SunYoung

PY - 2009/9/1

Y1 - 2009/9/1

UR - http://www.scopus.com/inward/record.url?scp=67949103688&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=67949103688&partnerID=8YFLogxK

U2 - 10.1016/j.jbiosc.2009.03.017

DO - 10.1016/j.jbiosc.2009.03.017

M3 - Article

VL - 108

SP - 252

EP - 258

JO - Journal of Bioscience and Bioengineering

JF - Journal of Bioscience and Bioengineering

SN - 1389-1723

IS - 3

ER -