Motivation: Since the newly developed Grid platform is considered a powerful tool for sharing resources over the Internet, it is of interest to demonstrate an efficient methodology for processing massive biological data in a Grid environment at low cost. This paper presents an efficient and economical Grid-based method for predicting the secondary structures of all proteins in a given organism, a task that normally requires a long computation time under sequential execution, by processing a large amount of protein sequence data simultaneously. From the prediction results, a genome-scale protein fold space can be pursued. Results: Using the improved Grid platform, genome-scale secondary structure prediction and the protein topology derived from the new scoring scheme are presented for four different model proteomes. The resulting protein fold space was compared with structures from the Protein Data Bank and showed a similarly aligned distribution. Therefore, the fold space approach based on this new scoring scheme could serve as a guideline for predicting a folding family in a given organism.
Bibliographical note
Funding Information:
We thank Se-Jung Kook for helpful discussion about clustering. This work was supported by the NRL program of MOST NRDP (M1-0203-00-0020) and the Protein Network Research Center at Yonsei University (W.L.); by a grant from the International Mobile Telecommunications 2000 R&D Project, Ministry of Information & Communication, Korea; and by the Ministry of Science and Technology of Korea/the Korea Science and Engineering Foundation.
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics