High-resolution arrayCGH platform makes it possible to detect small gains and losses which previously could not be measured. However, current CNV detection tools fitted to early low-resolution data are not applicable to larger high-resolution data. When CNV detection tools are applied to high-resolution data, they suffer from high false-positives, which increases validation cost. Existing CNV detection tools also require optimal parameter values. In most cases, obtaining these values is a difficult task. This study developed a CNV detection algorithm that is optimized for high-resolution arrayCGH data. This tool operates up to 1500 times faster than existing tools on a high-resolution arrayCGH of whole human chromosomes which has 42 million probes whose average length is 50 bases, while preserving false positive/negative rates. The algorithm also uses a normality test, thereby removing the need for optimal parameters. To our knowledge, this is the first formulation for CNV detecting problems that results in a near-linear empirical overall complexity for real high-resolution data.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Health Informatics