SUBic: A scalable unsupervised framework for discovering high quality biclusters

Jooil Lee, Yanhua Jin, Won Suk Lee

Research output: Contribution to journal; Article; peer-review

Abstract

A biclustering algorithm extends conventional clustering techniques to extract all of the meaningful subgroups of genes and conditions in the expression matrix of a microarray dataset. However, such algorithms are very sensitive to input parameters and scale poorly. This paper proposes a scalable unsupervised biclustering framework, SUBic, to find high-quality constant-row biclusters in an expression matrix effectively. A one-dimensional clustering algorithm is proposed to partition the attributes, that is, the columns of an expression matrix, into disjoint groups based on the similarity of their expression values. These groups form a set of short transactions, which are used to discover a set of frequent itemsets, each of which corresponds to a bicluster. However, a bicluster may include attributes whose expression values are not similar enough to the others, so a bicluster refinement step enhances the quality of a bicluster by removing those attributes based on its distribution of expression values. The performance of the proposed method is analyzed comparatively through a series of experiments on synthetic and real datasets.
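
For illustration only, the pipeline described in the abstract can be sketched in Python. This is not the authors' implementation. The gap-based grouping rule, the function names `group_columns` and `find_biclusters`, and the parameters `max_gap`, `min_rows`, and `min_cols` are all assumptions made for this sketch. Frequent itemsets are found by brute-force enumeration rather than a scalable miner, and the refinement step is omitted.

```python
from collections import defaultdict
from itertools import combinations


def group_columns(row, max_gap):
    """Partition the column indices of one row into disjoint groups.

    Assumed rule (not from the paper): sort the values and start a new
    group whenever two neighbouring sorted values differ by more than max_gap.
    """
    if not row:
        return []
    order = sorted(range(len(row)), key=lambda j: row[j])
    groups, current = [], [order[0]]
    for prev, cur in zip(order, order[1:]):
        if row[cur] - row[prev] <= max_gap:
            current.append(cur)
        else:
            groups.append(frozenset(current))
            current = [cur]
    groups.append(frozenset(current))
    return groups


def find_biclusters(matrix, max_gap, min_rows, min_cols=2):
    """Treat each column group as a transaction and keep frequent column sets.

    Returns a dict mapping a tuple of column indices to the set of row
    indices that support it. Enumerating every subset is exponential in
    the group size, so this only suits toy inputs.
    """
    support = defaultdict(set)
    for i, row in enumerate(matrix):
        for group in group_columns(row, max_gap):
            for size in range(min_cols, len(group) + 1):
                for cols in combinations(sorted(group), size):
                    support[cols].add(i)
    return {cols: rows for cols, rows in support.items() if len(rows) >= min_rows}


# Toy expression matrix: rows are genes, columns are conditions.
matrix = [
    [1.0, 1.1, 5.0, 1.05],
    [3.0, 3.1, 9.0, 3.05],
    [7.0, 2.0, 2.1, 7.1],
]
result = find_biclusters(matrix, max_gap=0.2, min_rows=2)
```

In this toy input, columns 0, 1, and 3 take similar values within each of rows 0 and 1, so `result` contains the column set `(0, 1, 3)` supported by rows `{0, 1}`. The two rows sit at different levels (about 1 and about 3), which is what makes the bicluster constant-row rather than constant-valued.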

Original language: English
Pages (from-to): 636-646
Number of pages: 11
Journal: Journal of Computer Science and Technology
Volume: 28
Issue number: 4
Publication status: Published - 2013 Jul

Bibliographical note

Funding Information:
This work was supported by the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (MEST) of Korea under Grant No. 2011-0016648. The preliminary version of the paper was published in the Proceedings of EDB2012. ©2013 Springer Science + Business Media, LLC & Science Press, China.

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Science Applications
  • Computational Theory and Mathematics
