TY - JOUR
T1 - Chromosome-Based Proteomic Study for Identifying Novel Protein Variants from Human Hippocampal Tissue Using Customized neXtProt and GENCODE Databases
AU - Hwang, Heeyoun
AU - Park, Gun Wook
AU - Kim, Kwang Hoe
AU - Lee, Ju Yeon
AU - Lee, Hyun Kyoung
AU - Ji, Eun Sun
AU - Park, Sung Kyu Robin
AU - Xu, Tao
AU - Yates, John R.
AU - Kwon, Kyung Hoon
AU - Park, Young Mok
AU - Lee, Hyoung Joo
AU - Paik, Young Ki
AU - Kim, Jin Young
AU - Yoo, Jong Shin
N1 - Publisher Copyright: © 2015 American Chemical Society.
PY - 2015/12/4
Y1 - 2015/12/4
AB - The goal of the Chromosome-Centric Human Proteome Project (C-HPP) is to provide complete proteomic information for each human chromosome, including novel proteoforms such as novel protein-coding variants expressed from noncoding genomic regions, alternative splicing variants (ASVs), and single amino acid variants (SAAVs). In 144 LC/MS/MS raw files from human hippocampal tissues of control, epilepsy, and Alzheimer's disease, we identified novel proteoforms with a workflow that includes an integrated proteomic pipeline using three search engines: MASCOT, SEQUEST, and MS-GF+. With a <1% false discovery rate (FDR) at the protein level, 11 detected peptides mapped to four translated long noncoding RNA variants against the customized GENCODE lncRNA databases, which also mapped to coding proteins at different chromosomal sites. We also identified four novel ASVs against the customized GENCODE transcript databases. The target peptides from the variants were validated by the tandem MS fragmentation patterns of their corresponding synthetic peptides. Additionally, a total of 128 SAAVs paired with their wild-type peptides were identified with FDR <1% at the peptide level using a customized neXtProt database that includes nonsynonymous single nucleotide polymorphism (nsSNP) information. Among these results, several novel variants related to neurodegenerative disease were identified using this workflow, which could be applicable to C-HPP studies. All raw files used in this study were deposited in ProteomeXchange (PXD000395).
UR - http://www.scopus.com/inward/record.url?scp=84948953549&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84948953549&partnerID=8YFLogxK
U2 - 10.1021/acs.jproteome.5b00472
DO - 10.1021/acs.jproteome.5b00472
M3 - Article
C2 - 26549206
AN - SCOPUS:84948953549
VL - 14
SP - 5028
EP - 5037
JO - Journal of Proteome Research
JF - Journal of Proteome Research
SN - 1535-3893
IS - 12
ER -