Neopepsee: Accurate genome-level prediction of neoantigens by harnessing sequence and amino acid immunogenicity information

S. Kim, H. S. Kim, E. Kim, M. G. Lee, E. C. Shin, S. Paik, Sangwoo Kim

Research output: Contribution to journal, Article, peer-review

86 Citations (Scopus)

Abstract

Background: Tumor-specific mutations form novel immunogenic peptides called neoantigens. Neoantigens can be used as a biomarker for predicting patient response to cancer immunotherapy. Although the predicted binding affinity (IC50) between a peptide and major histocompatibility complex class I is currently used for neoantigen prediction, a large number of false positives exist.

Materials and methods: We developed Neopepsee, a machine-learning-based neoantigen prediction program for next-generation sequencing data. With raw RNA-seq data and a list of somatic mutations, Neopepsee automatically extracts mutated peptide sequences and gene expression levels. We tested 14 immunogenicity features to construct a machine-learning classifier and compared it with the conventional IC50-based methods in terms of sensitivity and specificity. We tested Neopepsee on independent datasets from melanoma, leukemia, and stomach cancer.

Results: Nine of the 14 immunogenicity features, those that were informative and mutually independent, were used to construct the machine-learning classifiers. Neopepsee provides a rich annotation of candidate peptides with 87 immunogenicity-related values, including IC50, expression levels of neopeptides and immune regulatory genes (e.g. PD1, PD-L1), matched epitope sequences, and a three-level (high, medium, and low) call for neoantigen probability. Compared with the conventional methods, sensitivity improved and specificity improved markedly, by two- to threefold. Tests with validated datasets and independently proven neoantigens confirmed the improved performance in melanoma and chronic lymphocytic leukemia. Additionally, we found protein sequence similarity to known pathogenic epitopes to be a novel feature in classification. Application of Neopepsee to 224 public stomach adenocarcinoma datasets predicted ~7 neoantigens per patient, the burden of which was correlated with patient prognosis.

Conclusions: Neopepsee can detect neoantigen candidates with fewer false positives and can be used to determine patient prognosis. We expect that retrieval of neoantigen sequences with Neopepsee will help advance research on next-generation cancer immunotherapies, predictive biomarkers, and personalized cancer vaccines.
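
The abstract compares classifiers by sensitivity and specificity against a conventional IC50-based call. The following is a minimal sketch of how such a comparison could be computed; it is not Neopepsee's code. The function names, the toy IC50 values and labels, and the 500 nM cutoff are all assumptions chosen for illustration (the abstract does not state a cutoff).

```python
from typing import Sequence, Tuple


def ic50_call(ic50_nm: float, cutoff_nm: float = 500.0) -> bool:
    """Hypothetical baseline: call a peptide a candidate when its
    predicted IC50 falls below the cutoff (assumed 500 nM here)."""
    return ic50_nm < cutoff_nm


def sensitivity_specificity(
    predicted: Sequence[bool], truth: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for paired boolean calls."""
    tp = sum(p and t for p, t in zip(predicted, truth))
    tn = sum((not p) and (not t) for p, t in zip(predicted, truth))
    fp = sum(p and (not t) for p, t in zip(predicted, truth))
    fn = sum((not p) and t for p, t in zip(predicted, truth))
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data, invented for illustration only (not from the paper).
toy_ic50_nm = [50.0, 120.0, 800.0, 300.0, 2000.0, 450.0]
toy_truth = [True, False, False, True, False, False]

baseline_calls = [ic50_call(v) for v in toy_ic50_nm]
sens, spec = sensitivity_specificity(baseline_calls, toy_truth)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

On this toy set the baseline recovers every true neoantigen but also calls half of the non-immunogenic peptides, which is the kind of false-positive burden the abstract describes. The same `sensitivity_specificity` helper could be applied to the calls of any other classifier to compare the two.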

Original language: English
Pages (from-to): 1030-1036
Number of pages: 7
Journal: Annals of Oncology
Volume: 29
Issue number: 4
Publication status: Published - 2018 Apr 1

Bibliographical note

Funding Information:
The authors thank Dong-Su Jang (Medical Illustrator, Department of Research Affairs, Yonsei University College of Medicine, Seoul, South Korea) for his help in creating the medical illustrations. This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (2015R1C1A1A01053638); HSK and MGL were supported by the National Research Foundation, the Ministry of Science, ICT & Future Planning (2013R1A3A2042197); and the Korea Health Technology R&D Projects through the Korea Health Industry Development Institute (KHIDI) funded by the Ministry of Health & Welfare (HI14C1324), Republic of Korea. SWK was additionally funded by a faculty research grant from the Yonsei University College of Medicine (6-2016-0081).

Publisher Copyright:
© The Author(s) 2018. Published by Oxford University Press on behalf of the European Society for Medical Oncology. All rights reserved.

All Science Journal Classification (ASJC) codes

  • Hematology
  • Oncology
