ASV-ID, a Proteogenomic Workflow to Predict Candidate Protein Isoforms on the Basis of Transcript Evidence

Seul Ki Jeong, Chae Yeon Kim, Young-Ki Paik

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

One of the goals of the Chromosome-Centric Human Proteome Project (C-HPP) is to map and characterize the functions of protein isoforms produced by alternative splicing of genes. However, identifying alternative splice variants (ASVs) via mass spectrometry remains a major challenge, because ASVs usually contain highly homologous peptide sequences. A routine protein sequence analysis suggests that more than half of the investigated proteins do not generate two or more uniquely mapping peptides that would enable their isoforms to be distinguished. Here, we develop a new proteogenomics method, named "ASV-ID" (alternative splicing variants identification), which enables identification of ASVs by using a cell type-specific protein sequence database that is supported by RNA-Seq data. Using this workflow, we identify 1935 distinct proteins under highly stringent conditions. For 841 of these proteins, transcript evidence helps us distinguish them from other isoforms, even though they are not predicted to yield two or more uniquely mapping peptides. We also demonstrate that ASV-ID enables detection of 19 differently expressed isoforms present in several cell lines. Thus, a new workflow using ASV-ID has the potential to map yet-to-be-identified difficult protein isoforms in a simple and robust way.
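The "uniquely mapping peptides" criterion mentioned in the abstract can be illustrated with a small sketch. This is not the ASV-ID workflow itself; it is a minimal illustration that assumes a simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages) and a hypothetical minimum peptide length of 6. The two isoform sequences are invented toy data, and the function names are hypothetical. It requires Python 3.7 or later, because it splits on zero-width matches.

```python
import re
from collections import defaultdict


def tryptic_peptides(sequence, min_length=6):
    """Return the set of peptides from a simplified in silico tryptic digest.

    Assumption: cleave after K or R unless followed by P, no missed cleavages.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    # Drop short fragments (and the empty string left after a terminal K/R).
    return {p for p in pieces if len(p) >= min_length}


def uniquely_mapping_peptides(isoforms, min_length=6):
    """Map each isoform name to the peptides that occur in no other isoform."""
    owners = defaultdict(set)
    for name, seq in isoforms.items():
        for peptide in tryptic_peptides(seq, min_length):
            owners[peptide].add(name)

    unique = {name: set() for name in isoforms}
    for peptide, names in owners.items():
        if len(names) == 1:
            unique[next(iter(names))].add(peptide)
    return unique


# Toy data (invented): isoform_2 skips the middle segment of isoform_1.
toy_isoforms = {
    "isoform_1": "MSTAAQLLK" + "GGSDEVLFR" + "TTPLNAWEK",
    "isoform_2": "MSTAAQLLK" + "TTPLNAWEK",
}

for name, peptides in uniquely_mapping_peptides(toy_isoforms).items():
    status = "distinguishable" if len(peptides) >= 2 else "not distinguishable"
    print(name, sorted(peptides), status)
```

In this toy case neither isoform yields two uniquely mapping peptides, which is the situation the abstract describes for more than half of the investigated proteins and the reason it turns to transcript evidence.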

Original language: English
Pages (from-to): 4235-4242
Number of pages: 8
Journal: Journal of Proteome Research
Volume: 17
Issue number: 12
DOI: 10.1021/acs.jproteome.8b00548
Publication status: Published - 2018 Dec 7

Fingerprint

Workflow
Protein Isoforms
Peptide Mapping
Alternative Splicing
Proteins
Peptides
Protein Databases
Protein Sequence Analysis
Human Chromosomes
Proteome
Sequence Homology
Chromosomes
Mass Spectrometry
Proteogenomics
Proteomics
RNA
Cell Line
Genes
Cells

All Science Journal Classification (ASJC) codes

  • Biochemistry
  • Chemistry(all)

Cite this

@article{91780b8a851343099eb48faf0bf7de34,
title = "ASV-ID, a Proteogenomic Workflow to Predict Candidate Protein Isoforms on the Basis of Transcript Evidence",
abstract = "One of the goals of the Chromosome-Centric Human Proteome Project (C-HPP) is to map and characterize the functions of protein isoforms produced by alternative splicing of genes. However, identifying alternative splice variants (ASVs) via mass spectrometry remains a major challenge, because ASVs usually contain highly homologous peptide sequences. A routine protein sequence analysis suggests that more than half of the investigated proteins do not generate two or more uniquely mapping peptides that would enable their isoforms to be distinguished. Here, we develop a new proteogenomics method, named {"}ASV-ID{"} (alternative splicing variants identification), which enables identification of ASVs by using a cell type-specific protein sequence database that is supported by RNA-Seq data. Using this workflow, we identify 1935 distinct proteins under highly stringent conditions. In fact, transcript evidence on these 841 proteins helps us distinguish them from other isoforms, despite the fact that these proteins are not predicted to make 2 or more uniquely mapping peptides. We also demonstrate that ASV-ID enables detection of 19 differently expressed isoforms present in several cell lines. Thus, a new workflow using ASV-ID has the potential to map yet-to-be-identified difficult protein isoforms in a simple and robust way.",
author = "Jeong, {Seul Ki} and Kim, {Chae Yeon} and Paik, Young-Ki",
year = "2018",
month = "12",
day = "7",
doi = "10.1021/acs.jproteome.8b00548",
language = "English",
volume = "17",
pages = "4235--4242",
journal = "Journal of Proteome Research",
issn = "1535-3893",
publisher = "American Chemical Society",
number = "12",

}

ASV-ID, a Proteogenomic Workflow to Predict Candidate Protein Isoforms on the Basis of Transcript Evidence. / Jeong, Seul Ki; Kim, Chae Yeon; Paik, Young-Ki.

In: Journal of Proteome Research, Vol. 17, No. 12, 07.12.2018, p. 4235-4242.


TY - JOUR
T1 - ASV-ID, a Proteogenomic Workflow to Predict Candidate Protein Isoforms on the Basis of Transcript Evidence
AU - Jeong, Seul Ki
AU - Kim, Chae Yeon
AU - Paik, Young-Ki
PY - 2018/12/7
Y1 - 2018/12/7

AB - One of the goals of the Chromosome-Centric Human Proteome Project (C-HPP) is to map and characterize the functions of protein isoforms produced by alternative splicing of genes. However, identifying alternative splice variants (ASVs) via mass spectrometry remains a major challenge, because ASVs usually contain highly homologous peptide sequences. A routine protein sequence analysis suggests that more than half of the investigated proteins do not generate two or more uniquely mapping peptides that would enable their isoforms to be distinguished. Here, we develop a new proteogenomics method, named "ASV-ID" (alternative splicing variants identification), which enables identification of ASVs by using a cell type-specific protein sequence database that is supported by RNA-Seq data. Using this workflow, we identify 1935 distinct proteins under highly stringent conditions. In fact, transcript evidence on these 841 proteins helps us distinguish them from other isoforms, despite the fact that these proteins are not predicted to make 2 or more uniquely mapping peptides. We also demonstrate that ASV-ID enables detection of 19 differently expressed isoforms present in several cell lines. Thus, a new workflow using ASV-ID has the potential to map yet-to-be-identified difficult protein isoforms in a simple and robust way.

UR - http://www.scopus.com/inward/record.url?scp=85055175749&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85055175749&partnerID=8YFLogxK
U2 - 10.1021/acs.jproteome.8b00548
DO - 10.1021/acs.jproteome.8b00548
M3 - Article
VL - 17
SP - 4235
EP - 4242
JO - Journal of Proteome Research
JF - Journal of Proteome Research
SN - 1535-3893
IS - 12
ER -