Background: Thyroid ultrasound (US) is the first-line diagnostic tool for guiding the management of thyroid disease. Despite its importance, US is highly subjective and depends heavily on the skill of the operator. Few reports have evaluated thyroid US performance, and even fewer have examined observer variability in US assessment. We therefore evaluated inter- and intraobserver variation in the US assessment and diagnosis of thyroid nodules among four radiologists and estimated the diagnostic accuracy of their assessments. Methods: A total of 204 thyroid nodules in 144 patients were reviewed. There were 115 benign and 89 malignant cases. Four radiologists, each with more than 5 years of experience, independently reviewed the US images twice, 6 weeks apart. Echogenicity, composition, margin, shape, calcification, vascularity, and final assessment were evaluated. Inter- and intraobserver variations were determined with Cohen's kappa statistics, and accuracy was calculated. Results: For interobserver variation, echogenicity showed fair agreement (κ = 0.34); composition, margin, calcification, and final assessment showed moderate agreement (κ = 0.59, 0.42, 0.58, and 0.54, respectively); shape and vascularity showed substantial agreement (κ = 0.61 and 0.64, respectively). For intraobserver variation, almost all features showed substantial agreement (κ > 0.61). Overall sensitivity, specificity, positive predictive value, negative predictive value, and accuracy for the four radiologists were 88.2%, 78.7%, 76.2%, 89.6%, and 82.8%, respectively. Conclusions: Experienced radiologists showed a moderate or higher degree of agreement for most US features of thyroid nodules, and their final assessments were highly accurate.
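For readers unfamiliar with the statistics named in the Methods, the following is a minimal Python sketch of Cohen's kappa for a single pair of raters, the conventional Landis and Koch agreement labels, and the standard diagnostic accuracy measures. The function names (`cohen_kappa`, `landis_koch`, `diagnostic_metrics`) and the toy ratings are illustrative assumptions, not the study's code or data, and the abstract does not state how pairwise kappas were combined across the four radiologists.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Map a kappa value to the conventional Landis and Koch label."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


def diagnostic_metrics(tp, fp, tn, fn):
    """Standard accuracy measures from a 2x2 table (malignant = positive)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical toy ratings ("b" = benign, "m" = malignant), not study data.
reader_1 = ["b", "b", "m", "m"]
reader_2 = ["b", "m", "m", "m"]
kappa = cohen_kappa(reader_1, reader_2)
print(kappa, landis_koch(kappa))
```

Under this labelling, the reported κ values of 0.34, 0.59, and 0.64 fall in the fair, moderate, and substantial bands, respectively.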
All Science Journal Classification (ASJC) codes
- Endocrinology, Diabetes and Metabolism