A computational approach for identifying pathogenicity islands in prokaryotic genomes

Sung Ho Yoon, Cheol Goo Hur, Ho Young Kang, Yeoun Hee Kim, Tae Kwang Oh, Jihyun F. Kim

Research output: Contribution to journal (Article)

47 Citations (Scopus)

Abstract

Background: Pathogenicity islands (PAIs), distinct genomic segments of pathogens that encode virulence factors, represent a subgroup of genomic islands (GIs) acquired by horizontal gene transfer. Until now, computational approaches for identifying PAIs have focused solely on detecting genomic regions that differ from the rest of the genome in base composition and codon usage. These approaches often identify genomic islands in general rather than PAIs specifically.

Results: We present a computational method for detecting potential PAIs in complete prokaryotic genomes by combining sequence similarities and abnormalities in genomic composition. We first collected 207 GenBank accessions containing either part or all of the reported PAI loci. In sequenced genomes, strips of PAI homologs were defined based on the proximity of the homologs of genes in the same PAI accession. An algorithm reminiscent of a sequence-assembly procedure was then devised to merge overlapping or adjacent genomic strips into a larger genomic region. Among the defined genomic regions, PAI-like regions were identified by the presence of homolog(s) of virulence genes. GIs were also postulated by calculating G+C content anomalies and codon usage bias. Of 148 prokaryotic genomes examined, 23 pathogenic and 6 non-pathogenic bacteria contained 77 candidate PAIs that partly or entirely overlap GIs.

Conclusion: Supporting the validity of our method, the list of candidate PAIs included thirty-four PAIs previously identified in genome sequencing papers. Furthermore, in some instances, our method detected entire PAIs for which only partial sequences are available. In our study, the method proved efficient for demarcating potential PAIs. In addition, the function(s) and origin(s) of a candidate PAI can be inferred by investigating the PAI queries that comprise it. Identification and analysis of potential PAIs in prokaryotic genomes will broaden our knowledge of the structure and properties of PAIs and the evolution of bacterial pathogenesis.
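
The two computational steps named in the abstract, merging overlapping or adjacent strips into larger regions and flagging G+C content anomalies, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `max_gap` tolerance, the fixed-size sliding window, and the deviation `threshold` are all assumptions made for illustration, and codon usage bias is not covered.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) genome coordinates, 1-based inclusive


def merge_strips(strips: List[Interval], max_gap: int = 0) -> List[Interval]:
    """Merge overlapping or adjacent strips into larger regions.

    max_gap is a hypothetical tolerance (in bp) for how far apart two
    strips may be and still count as adjacent; the abstract gives no value.
    """
    merged: List[Interval] = []
    for start, end in sorted(strips):
        # A strip joins the previous region if it overlaps it or starts
        # within max_gap bases of the region's last position.
        if merged and start <= merged[-1][1] + max_gap + 1:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_anomalous_windows(
    genome: str, window: int, step: int, threshold: float
) -> List[Interval]:
    """Return windows whose G+C fraction deviates from the genome-wide
    fraction by more than threshold (an assumed, illustrative criterion)."""
    baseline = gc_content(genome)
    hits: List[Interval] = []
    for i in range(0, len(genome) - window + 1, step):
        if abs(gc_content(genome[i:i + window]) - baseline) > threshold:
            hits.append((i + 1, i + window))  # convert to 1-based inclusive
    return hits


if __name__ == "__main__":
    strips = [(100, 200), (150, 300), (301, 400), (1000, 1100)]
    print(merge_strips(strips))

    toy_genome = "AT" * 50 + "GC" * 10 + "AT" * 50
    print(gc_anomalous_windows(toy_genome, window=20, step=20, threshold=0.5))
```

In this sketch, the anomalous windows returned by `gc_anomalous_windows` could themselves be passed to `merge_strips` to obtain contiguous candidate regions, which could then be intersected with the regions built from PAI homologs.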

Original language: English
Article number: 184
Journal: BMC Bioinformatics
Volume: 6
DOI: 10.1186/1471-2105-6-184
Publication status: Published - 2005 Jul 21

Fingerprint

Genomic Islands
Genomics
Genome
Genes
Gene
Strip
Gene transfer
Pathogens
Computational methods
Chemical analysis
Base Composition
Bacteria
Computational Methods
Codon
Sequencing
Proximity
Anomaly
Overlapping
Locus
Overlap

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Yoon, Sung Ho ; Hur, Cheol Goo ; Kang, Ho Young ; Kim, Yeoun Hee ; Oh, Tae Kwang ; Kim, Jihyun F. / A computational approach for identifying pathogenicity islands in prokaryotic genomes. In: BMC Bioinformatics. 2005 ; Vol. 6.
@article{50d95cee206248cfbf2d277451a721d4,
title = "A computational approach for identifying pathogenicity islands in prokaryotic genomes",
author = "Yoon, {Sung Ho} and Hur, {Cheol Goo} and Kang, {Ho Young} and Kim, {Yeoun Hee} and Oh, {Tae Kwang} and Kim, {Jihyun F.}",
year = "2005",
month = "7",
day = "21",
doi = "10.1186/1471-2105-6-184",
language = "English",
volume = "6",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

A computational approach for identifying pathogenicity islands in prokaryotic genomes. / Yoon, Sung Ho; Hur, Cheol Goo; Kang, Ho Young; Kim, Yeoun Hee; Oh, Tae Kwang; Kim, Jihyun F.

In: BMC Bioinformatics, Vol. 6, 184, 21.07.2005.

TY - JOUR

T1 - A computational approach for identifying pathogenicity islands in prokaryotic genomes

AU - Yoon, Sung Ho

AU - Hur, Cheol Goo

AU - Kang, Ho Young

AU - Kim, Yeoun Hee

AU - Oh, Tae Kwang

AU - Kim, Jihyun F.

PY - 2005/7/21

Y1 - 2005/7/21

UR - http://www.scopus.com/inward/record.url?scp=25444473873&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=25444473873&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-6-184

DO - 10.1186/1471-2105-6-184

M3 - Article

VL - 6

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 184

ER -