Content-based video retrieval has become a very active research area over the past decade, driven by the growing amount of video content shared on social networks such as YouTube and DailyMotion. While most content-based video retrieval approaches rely on low-level visual features for a global analysis of the video, this paper proposes an object-based retrieval method as an alternative. The goal of the proposed method is to retrieve the key frames and shots of a video that contain a particular object. The key idea is to apply an existing object duplicate detection method iteratively to the video sequence in order to compensate for 3D view variations, illumination changes and partial occlusions. Our approach combines viewpoint-invariant region descriptors, which describe the appearance of an object, with a graph model that captures the spatial layout of the individual regions. Given a query object provided by the user in the form of an image and a region of interest, the system retrieves the shots containing this object by analyzing a set of key frames for each shot. The robustness of our approach is demonstrated on a video in which a 3D object is recorded from different viewpoints and with partial occlusions.
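The iterative retrieval idea can be illustrated with a minimal sketch. It is not the implementation described in this paper: it assumes one plausible reading of "iteratively", namely that every instance detected in a key frame is reused as an additional query in the next pass, so that views far from the original query can be reached step by step. The function name `retrieve_shots`, the parameter `max_iterations`, and the `detect` callable (a stand-in for the object duplicate detector) are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional


def retrieve_shots(
    query: Any,
    shots: Dict[str, List[Any]],
    detect: Callable[[Any, Any], Optional[Any]],
    max_iterations: int = 3,
) -> List[str]:
    """Return the ids of shots whose key frames contain the query object.

    `shots` maps a shot id to its list of key frames.
    `detect(query, frame)` is a hypothetical detector: it returns the
    detected object instance (usable as a new query) or None.
    """
    queries = [query]          # queries used in the current pass
    remaining = dict(shots)    # shots not yet retrieved
    retrieved: List[str] = []

    for _ in range(max_iterations):
        new_queries = []
        for shot_id in list(remaining):
            # Look for the first key frame matched by any current query.
            hit = None
            for frame in remaining[shot_id]:
                for q in queries:
                    hit = detect(q, frame)
                    if hit is not None:
                        break
                if hit is not None:
                    break
            if hit is not None:
                retrieved.append(shot_id)
                new_queries.append(hit)
                del remaining[shot_id]
        if not new_queries:
            break  # no new detections, so another pass cannot help
        # Assumption: only the newly found instances are needed next,
        # since the remaining shots already failed the older queries.
        queries = new_queries

    return retrieved
```

In this sketch a shot counts as retrieved as soon as one of its key frames yields a detection, and the loop stops early when a pass produces no new detections. Whether detections should replace or extend the query set is a design choice that the abstract does not specify.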