Digestive Organ Recognition in Video Capsule Endoscopy Based on Temporal Segmentation Network

Yejee Shin, Taejoon Eo, Hyeongseop Rha, Dong Jun Oh, Geonhui Son, Jiwoong An, You Jin Kim, Dosik Hwang, Yun Jeong Lim

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The interpretation of video capsule endoscopy (VCE) usually takes more than an hour, which can be a tedious process for clinicians. To shorten the reading time of VCE, algorithms that automatically detect lesions in the small bowel are being actively developed; however, clinicians must still manually mark anatomic transition points in VCE. Therefore, anatomical temporal segmentation must first be performed automatically at the full-length VCE level for fully automated reading. This study aims to develop an automated organ recognition method in VCE based on a temporal segmentation network. To temporally locate and classify organs, including the stomach, small bowel, and colon, in long untrimmed videos, we use the MS-TCN++ model, which contains temporal convolution layers. To improve temporal segmentation performance, a hybrid model of two state-of-the-art feature extraction models (i.e., TimeSformer and I3D) is used. Extensive experiments showed the effectiveness of the proposed method in capturing long-range dependencies and recognizing temporal segments of organs. For training and validation of the proposed model, a dataset of 200 patients (100 normal and 100 abnormal VCE) was used. On the test set of 40 patients (20 normal and 20 abnormal VCE), the proposed method achieved an accuracy of 96.15, F1-score@{50,75,90} of {96.17, 93.61, 86.80}, and a segmental edit distance of 95.83 in the three-class classification of organs (stomach, small bowel, and colon) in full-length VCE.
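The abstract reports frame-wise accuracy, segmental F1-score at overlap thresholds, and a segmental edit score, but does not define them. The following is a minimal sketch of how these metrics are conventionally computed in the temporal segmentation literature; it is an assumption that the paper follows these conventions, and the function names and the toy label sequences are hypothetical, not taken from the authors' code.

```python
from itertools import groupby


def segments(labels):
    """Collapse frame-wise labels into (label, start, end) runs; end is exclusive."""
    runs, start = [], 0
    for label, group in groupby(labels):
        length = len(list(group))
        runs.append((label, start, start + length))
        start += length
    return runs


def frame_accuracy(pred, true):
    """Percentage of frames whose predicted label equals the reference label."""
    return 100.0 * sum(p == t for p, t in zip(pred, true)) / len(true)


def edit_score(pred, true):
    """100 * (1 - normalized Levenshtein distance) over the ordered segment labels."""
    p = [s[0] for s in segments(pred)]
    t = [s[0] for s in segments(true)]
    prev = list(range(len(t) + 1))
    for i, a in enumerate(p, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (a != b)))
        prev = cur
    return 100.0 * (1 - prev[-1] / max(len(p), len(t)))


def f1_at_iou(pred, true, threshold):
    """Segmental F1: a predicted segment is a true positive if its IoU with an
    unmatched reference segment of the same label reaches the threshold."""
    p_segs, t_segs = segments(pred), segments(true)
    matched = [False] * len(t_segs)
    tp = fp = 0
    for label, ps, pe in p_segs:
        best_iou, best_idx = 0.0, -1
        for idx, (tl, ts, te) in enumerate(t_segs):
            if tl != label:
                continue
            inter = max(0, min(pe, te) - max(ps, ts))
            union = max(pe, te) - min(ps, ts)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx >= 0 and best_iou >= threshold and not matched[best_idx]:
            tp += 1
            matched[best_idx] = True
        else:
            fp += 1
    fn = matched.count(False)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy 20-frame video (hypothetical): the stomach-to-small-bowel transition
# is predicted one frame late.
true = ["stomach"] * 4 + ["small_bowel"] * 10 + ["colon"] * 6
pred = ["stomach"] * 5 + ["small_bowel"] * 9 + ["colon"] * 6
print(frame_accuracy(pred, true), edit_score(pred, true))
print([f1_at_iou(pred, true, k) for k in (0.5, 0.75, 0.9)])
```

In this toy case a single misplaced transition costs little frame accuracy and leaves the edit score untouched, yet it lowers the F1-score at the strictest threshold, which is why the three metrics are usually reported together.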

Original language: English
Title of host publication: Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 - 25th International Conference, Proceedings
Editors: Linwei Wang, Qi Dou, P. Thomas Fletcher, Stefanie Speidel, Shuo Li
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 136-146
Number of pages: 11
ISBN (Print): 9783031164484
Publication status: Published - 2022
Event: 25th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2022 - Singapore, Singapore
Duration: 2022 Sept 18 to 2022 Sept 22

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13437 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2022
Country/Territory: Singapore
City: Singapore
Period: 22/9/18 to 22/9/22

Bibliographical note

Funding Information:
Acknowledgements. This research was supported by a grant (grant number: HI19C0665) from the Korean Health Technology R&D project through the Korean Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea. It was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (2021R1C1C2008773), and by the Y-BASE R&E Institute a Brain Korea 21, Yonsei University. This research was also partially supported by the Yonsei Signature Research Cluster Program of 2022 (2022-22-0002), the KIST Institutional Program (Project No.2E31051-21-204), and the Institute of Information and Communications Technology Planning and Evaluation (IITP) Grant funded by the Korean Government (MSIT), Artificial Intelligence Graduate School Program, Yonsei University (2020-0-01361).

Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)
