Chromosome-Based Proteomic Study for Identifying Novel Protein Variants from Human Hippocampal Tissue Using Customized neXtProt and GENCODE Databases

Heeyoun Hwang, Gun Wook Park, Kwang Hoe Kim, Ju Yeon Lee, Hyun Kyoung Lee, Eun Sun Ji, Sung Kyu Robin Park, Tao Xu, John R. Yates, Kyung Hoon Kwon, Young Mok Park, Hyoung Joo Lee, Young Ki Paik, Jin Young Kim, Jong Shin Yoo

Research output: Contribution to journal (Article)

4 Citations (Scopus)

Abstract

The goal of the Chromosome-Centric Human Proteome Project (C-HPP) is to provide complete proteomic information for each human chromosome, including novel proteoforms such as protein-coding variants expressed from noncoding genomic regions, alternative splicing variants (ASVs), and single amino acid variants (SAAVs). In 144 LC/MS/MS raw files from human hippocampal tissues (control, epilepsy, and Alzheimer's disease), we identified novel proteoforms with a workflow that includes an integrated proteomic pipeline using three search engines: MASCOT, SEQUEST, and MS-GF+. With a false discovery rate (FDR) of <1% at the protein level, 11 detected peptides mapped to four translated long noncoding RNA variants in the customized GENCODE lncRNA databases, which also mapped to coding proteins at different chromosomal sites. We also identified four novel ASVs by searching against the customized GENCODE transcript databases. The target peptides from the variants were validated by comparing their tandem MS fragmentation patterns with those of the corresponding synthetic peptides. Additionally, a total of 128 SAAVs paired with their wild-type peptides were identified with an FDR of <1% at the peptide level using a customized neXtProt database that includes nonsynonymous single nucleotide polymorphism (nsSNP) information. Among these results, several novel variants related to neurodegenerative disease were identified using this workflow, which could be applicable to C-HPP studies. All raw files used in this study were deposited in ProteomeXchange (PXD000395).
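
The pairing of an SAAV peptide with its wild-type counterpart can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy protein sequence, and the simplified trypsin rule (cleave after K or R unless the next residue is P) are all assumptions made for illustration only.

```python
def tryptic_peptides(sequence):
    """Return (start_index, peptide) pairs for a simplified tryptic digest.

    Assumed rule: cleave after K or R unless the next residue is P.
    Missed cleavages are not modelled.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append((start, sequence[start:i + 1]))
            start = i + 1
    if start < len(sequence):
        peptides.append((start, sequence[start:]))
    return peptides


def peptide_covering(sequence, index):
    """Return the tryptic peptide that contains the 0-based residue index."""
    for start, peptide in tryptic_peptides(sequence):
        if start <= index < start + len(peptide):
            return peptide
    raise ValueError("index outside sequence")


def saav_pair(sequence, position, ref, alt):
    """Return (wild_type_peptide, variant_peptide) for one substitution.

    position is 1-based, as in typical nsSNP annotations. Both sequences
    are digested separately, so a substitution that creates or removes
    a K/R cleavage site is handled.
    """
    index = position - 1
    if sequence[index] != ref:
        raise ValueError("reference residue does not match sequence")
    variant = sequence[:index] + alt + sequence[index + 1:]
    return peptide_covering(sequence, index), peptide_covering(variant, index)


# Hypothetical toy protein and substitution (Y5H), not taken from the study.
wild_type, variant = saav_pair("MKTAYIAKQRLVEGK", 5, "Y", "H")
print(wild_type, variant)
```

In a customized search database, both peptides of such a pair would need to be present so that a spectrum can match either form; how the study built its database is described in the paper itself, not here.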

Original language: English
Pages (from-to): 5028-5037
Number of pages: 10
Journal: Journal of Proteome Research
Volume: 14
Issue number: 12
DOIs: https://doi.org/10.1021/acs.jproteome.5b00472
Publication status: Published - 2015 Dec 4

Fingerprint

Chromosomes
Proteomics
Databases
Tissue
Human Chromosomes
Peptides
Long Noncoding RNA
Workflow
Alternative Splicing
Proteome
Proteins
Neurodegenerative diseases
Amino Acids
Search Engine
Search engines
Polymorphism
Single Nucleotide Polymorphism
Epilepsy
Alzheimer Disease
Nucleotides

All Science Journal Classification (ASJC) codes

  • Biochemistry
  • Chemistry(all)

Cite this

Hwang, Heeyoun; Park, Gun Wook; Kim, Kwang Hoe; Lee, Ju Yeon; Lee, Hyun Kyoung; Ji, Eun Sun; Park, Sung Kyu Robin; Xu, Tao; Yates, John R.; Kwon, Kyung Hoon; Park, Young Mok; Lee, Hyoung Joo; Paik, Young Ki; Kim, Jin Young; Yoo, Jong Shin. Chromosome-Based Proteomic Study for Identifying Novel Protein Variants from Human Hippocampal Tissue Using Customized neXtProt and GENCODE Databases. In: Journal of Proteome Research. 2015; Vol. 14, No. 12. pp. 5028-5037.
@article{7a97ef1c418c4c778ca7115b0dc7ec2b,
title = "Chromosome-Based Proteomic Study for Identifying Novel Protein Variants from Human Hippocampal Tissue Using Customized neXtProt and GENCODE Databases",
abstract = "The goal of the Chromosome-Centric Human Proteome Project (C-HPP) is to provide complete proteomic information for each human chromosome, including novel proteoforms such as protein-coding variants expressed from noncoding genomic regions, alternative splicing variants (ASVs), and single amino acid variants (SAAVs). In 144 LC/MS/MS raw files from human hippocampal tissues (control, epilepsy, and Alzheimer's disease), we identified novel proteoforms with a workflow that includes an integrated proteomic pipeline using three search engines: MASCOT, SEQUEST, and MS-GF+. With a false discovery rate (FDR) of <1{\%} at the protein level, 11 detected peptides mapped to four translated long noncoding RNA variants in the customized GENCODE lncRNA databases, which also mapped to coding proteins at different chromosomal sites. We also identified four novel ASVs by searching against the customized GENCODE transcript databases. The target peptides from the variants were validated by comparing their tandem MS fragmentation patterns with those of the corresponding synthetic peptides. Additionally, a total of 128 SAAVs paired with their wild-type peptides were identified with an FDR of <1{\%} at the peptide level using a customized neXtProt database that includes nonsynonymous single nucleotide polymorphism (nsSNP) information. Among these results, several novel variants related to neurodegenerative disease were identified using this workflow, which could be applicable to C-HPP studies. All raw files used in this study were deposited in ProteomeXchange (PXD000395).",
author = "Heeyoun Hwang and Park, {Gun Wook} and Kim, {Kwang Hoe} and Lee, {Ju Yeon} and Lee, {Hyun Kyoung} and Ji, {Eun Sun} and Park, {Sung Kyu Robin} and Tao Xu and Yates, {John R.} and Kwon, {Kyung Hoon} and Park, {Young Mok} and Lee, {Hyoung Joo} and Paik, {Young Ki} and Kim, {Jin Young} and Yoo, {Jong Shin}",
year = "2015",
month = "12",
day = "4",
doi = "10.1021/acs.jproteome.5b00472",
language = "English",
volume = "14",
pages = "5028--5037",
journal = "Journal of Proteome Research",
issn = "1535-3893",
publisher = "American Chemical Society",
number = "12",

}

Hwang, H, Park, GW, Kim, KH, Lee, JY, Lee, HK, Ji, ES, Park, SKR, Xu, T, Yates, JR, Kwon, KH, Park, YM, Lee, HJ, Paik, YK, Kim, JY & Yoo, JS 2015, 'Chromosome-Based Proteomic Study for Identifying Novel Protein Variants from Human Hippocampal Tissue Using Customized neXtProt and GENCODE Databases', Journal of Proteome Research, vol. 14, no. 12, pp. 5028-5037. https://doi.org/10.1021/acs.jproteome.5b00472

TY - JOUR

T1 - Chromosome-Based Proteomic Study for Identifying Novel Protein Variants from Human Hippocampal Tissue Using Customized neXtProt and GENCODE Databases

AU - Hwang, Heeyoun

AU - Park, Gun Wook

AU - Kim, Kwang Hoe

AU - Lee, Ju Yeon

AU - Lee, Hyun Kyoung

AU - Ji, Eun Sun

AU - Park, Sung Kyu Robin

AU - Xu, Tao

AU - Yates, John R.

AU - Kwon, Kyung Hoon

AU - Park, Young Mok

AU - Lee, Hyoung Joo

AU - Paik, Young Ki

AU - Kim, Jin Young

AU - Yoo, Jong Shin

PY - 2015/12/4

Y1 - 2015/12/4

N2 - The goal of the Chromosome-Centric Human Proteome Project (C-HPP) is to provide complete proteomic information for each human chromosome, including novel proteoforms such as protein-coding variants expressed from noncoding genomic regions, alternative splicing variants (ASVs), and single amino acid variants (SAAVs). In 144 LC/MS/MS raw files from human hippocampal tissues (control, epilepsy, and Alzheimer's disease), we identified novel proteoforms with a workflow that includes an integrated proteomic pipeline using three search engines: MASCOT, SEQUEST, and MS-GF+. With a false discovery rate (FDR) of <1% at the protein level, 11 detected peptides mapped to four translated long noncoding RNA variants in the customized GENCODE lncRNA databases, which also mapped to coding proteins at different chromosomal sites. We also identified four novel ASVs by searching against the customized GENCODE transcript databases. The target peptides from the variants were validated by comparing their tandem MS fragmentation patterns with those of the corresponding synthetic peptides. Additionally, a total of 128 SAAVs paired with their wild-type peptides were identified with an FDR of <1% at the peptide level using a customized neXtProt database that includes nonsynonymous single nucleotide polymorphism (nsSNP) information. Among these results, several novel variants related to neurodegenerative disease were identified using this workflow, which could be applicable to C-HPP studies. All raw files used in this study were deposited in ProteomeXchange (PXD000395).

UR - http://www.scopus.com/inward/record.url?scp=84948953549&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84948953549&partnerID=8YFLogxK

U2 - 10.1021/acs.jproteome.5b00472

DO - 10.1021/acs.jproteome.5b00472

M3 - Article

C2 - 26549206

AN - SCOPUS:84948953549

VL - 14

SP - 5028

EP - 5037

JO - Journal of Proteome Research

JF - Journal of Proteome Research

SN - 1535-3893

IS - 12

ER -