Background: Differences in how guidelines categorize thyroid nodules may affect diagnostic performance and interobserver agreement. Purpose: To compare diagnostic performance and interobserver agreement among observers with varying degrees of experience when applying different guidelines that stratify thyroid nodules using suspicious ultrasonography (US) features. Material and Methods: This retrospective study included 370 thyroid nodules (≥10 mm). Four observers, grouped as experienced and inexperienced, evaluated the US features and made final assessments according to the Kim criteria, the Thyroid Imaging Reporting and Data System (TIRADS) by Kwak et al., and the 2015 American Thyroid Association (ATA) guideline. Diagnostic performance and agreement were compared between the two groups. Results: The Kim criteria showed higher specificity and significantly lower sensitivity than TIRADS and the 2015 ATA guideline (all P < 0.001), regardless of the level of experience. The experienced group showed significantly higher specificity than the inexperienced group with the Kim criteria and the 2015 ATA guideline (P < 0.001), and the inexperienced group showed significantly higher sensitivity with the Kim criteria (P = 0.002). The experienced group showed significantly higher agreement than the inexperienced group when using TIRADS, whereas the inexperienced group showed higher agreement when using the 2015 ATA guideline. Agreement for the Kim criteria did not differ significantly according to observer experience. Conclusion: Diagnostic performance and interobserver agreement in the risk stratification of thyroid nodules using suspicious US features differ significantly according to which of the three guidelines is applied and the level of experience of the observer.
All Science Journal Classification (ASJC) codes
- Radiological and Ultrasound Technology
- Radiology, Nuclear Medicine and Imaging