Automatic identification of fictional characters is one of the primary analysis techniques for video content. A common approach to detecting characters in live-action movies is to detect human faces; however, this approach cannot be applied to non-realistic domains such as animation movies. Character detection in animation movies presents two major challenges: the same character can be depicted in a variety of unique styles, and there are no stylistic or other restrictions on the nature and design of character objects. To address these challenges, we introduce the “animation adaptive region-based convolutional neural network (AA-R-CNN)” model, which detects characters in animation movies and determines whether each detected character is human or non-human. Our model extends Faster R-CNN, a two-stage object detector, in two ways: 1) we add a hierarchical animation adaptation (HAA) module so that a single model can learn the variety of unique styles found in animation movies; 2) we incorporate a double-detector architecture that focuses on the regions that are visually important for determining the character class. We also build a new dataset for the animated character detection task. Experiments on this dataset show that our model outperforms existing representative object detectors in character detection. Furthermore, our model achieves significant performance improvements over previous state-of-the-art methods on the character dictionary generation task. Our model is robust to a variety of animation styles and finds visual representations common to all types of characters, providing an effective way to detect animated characters.
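The two-stage flow described above (region proposal, then character classification into human/non-human) can be sketched in simplified pseudocode-like Python. This is a minimal illustrative sketch, not the paper's implementation: all names (`Region`, `propose_regions`, `classify_character`) and the score thresholds are hypothetical stand-ins for the learned CNN stages of AA-R-CNN.

```python
from dataclasses import dataclass

@dataclass
class Region:
    box: tuple          # (x1, y1, x2, y2) in pixel coordinates
    objectness: float   # proposal score, analogous to an RPN output

def propose_regions(candidates, threshold=0.5):
    """Stage 1 (analogous to Faster R-CNN's region proposal network):
    keep candidate regions likely to contain a character."""
    return [r for r in candidates if r.objectness >= threshold]

def classify_character(region, human_score, threshold=0.5):
    """Stage 2 (the double-detector idea, greatly simplified):
    first confirm the region holds a character, then decide
    human vs. non-human from a visually informative score.
    `human_score` stands in for the second detector's output."""
    if region.objectness < threshold:
        return "background"
    return "human" if human_score >= threshold else "non-human"
```

A toy run illustrates the pipeline: a high-objectness region survives proposal filtering and is then assigned a human/non-human label, while a low-objectness region is discarded as background.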