While the results of subjective quality assessment are represented by mean opinion scores and their corresponding confidence intervals, the output of an objective quality metric for a given stimulus is only a single estimated quality level. Accordingly, the performance of a metric is evaluated by measuring the accuracy of its outputs with respect to the corresponding subjective scores. Recently, however, the concept of an ambiguity interval for objective quality has been raised. In this paper, we propose to consider not only the accuracy but also the ambiguity of objective quality metrics in performance evaluation. In particular, we benchmark seven state-of-the-art image quality metrics on images compressed with JPEG and JPEG2000. The results demonstrate that the best metric in terms of accuracy is not necessarily the best in terms of ambiguity.