AUTHORS
Kicky G. van Leeuwen, MSc • Steven Schalekamp, MD, PhD • Matthieu J. C. M. Rutten, MD, PhD •
Merel Huisman, MD, PhD • Cornelia M. Schaefer-Prokop, MD, PhD • Maarten de Rooij, MD, PhD •
Bram van Ginneken, PhD • Bas Maresch, MD • Bram H. J. Geurts, MD • Cornelius F. van Dijke, MD, PhD •
Emmeline Laupman-Koedam, MD • Enzo V. Hulleman, MD • Eric L. Verhoeff, MD • Evelyne M. J. Meys, MD, PhD •
Firdaus A. A. Mohamed Hoesein, MD, PhD • Floor M. ter Brugge, MD • Francois van Hoorn, MD •
Frank van der Wel, MD • Inge A. H. van den Berk, MD • Jacqueline M. Luyendijk, MD • James Meakin, PhD •
Jesse Habets, MD, PhD • Jonathan I. M. L. Verbeke, MD • Joost Nederend, MD, PhD • Karlijn M. E. Meys, MD •
Laura N. Deden, MSc • Lucianne C. M. Langezaal, MD • Mahtab Nasrollah, MD • Marleen Meij, MD •
Martijn F. Boomsma, MD, PhD • Matthijs Vermeulen, MD • Myrthe M. Vestering, MD • Onno Vijlbrief, MD •
Paul Algra, MD • Selma Algra, MD, PhD • Stijn M. Bollen, MD • Tijs Samson, PDEng •
Yntor H. G. von Brucken Fock, MD • for the Project AIR Working Group
From the Department of Medical Imaging, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, the Netherlands (K.G.v.L., S.S.,
M.J.C.M.R., M.H., C.M.S.P., M.d.R., B.v.G., B.H.J.G., J.M.); Department of Radiology (M.J.C.M.R.) and Department of MICT and Imaging Techniques (T.S.), Jeroen
Bosch Hospital, ’s-Hertogenbosch, the Netherlands; Department of Radiology, Meander Medical Centre, Amersfoort, the Netherlands (C.M.S.P., M.V.); Department of
Radiology, Hospital Gelderse Vallei, Ede, the Netherlands (B.M., M.M.V.); Department of Radiology, Noordwest Ziekenhuisgroep, Alkmaar, the Netherlands (C.F.v.D., P.A.);
Department of Radiology & Nuclear Medicine, Máxima Medical Center, Eindhoven, the Netherlands (E.L.K., F.v.d.W.); Department of Radiology, Ziekenhuisgroep Twente,
Almelo, the Netherlands (E.V.H., F.M.t.B., M.M., O.V., Y.H.G.v.B.F.); Center for Radiology and Nuclear Medicine, Deventer Hospital, Deventer, the Netherlands (E.L.V.,
J.M.L., M.N.); Department of Radiology, Catharina Hospital, Eindhoven, the Netherlands (E.M.J.M., J.N., K.M.E.M.); Department of Radiology, University Medical
Center Utrecht, Utrecht, the Netherlands (F.A.A.M.H.); Department of Radiology, Zaans Medisch Centrum, Zaandam, the Netherlands (F.v.H.); Department of Radiology
and Nuclear Medicine, Amsterdam UMC–Location University of Amsterdam, Amsterdam, the Netherlands (I.A.H.v.d.B.); Department of Radiology & Nuclear Medicine,
Haaglanden Medical Center, The Hague, the Netherlands (J.H.); Department of Radiology, Amsterdam University Medical Center, Amsterdam, the Netherlands (J.I.M.L.V.);
Department of Radiology and Nuclear Medicine, Rijnstate, Arnhem, the Netherlands (L.N.D.); Department of Radiology, St Antonius Hospital, Nieuwegein, the Netherlands
(L.C.M.L., S.A.); Department of Radiology, Isala Hospital, Zwolle, the Netherlands (M.F.B.); and Department of Radiology, Groene Hart Hospital, Gouda, the Netherlands
(S.M.B.). Received April 26, 2023; revision requested July 6; final revision received November 22; accepted November 27. Address correspondence to K.G.v.L. (email: kicky.vanleeuwen@radboudumc.nl).
Background:
Multiple commercial artificial intelligence (AI) products exist for assessing radiographs; however, comparable performance
data for these algorithms are limited.
Purpose:
To perform an independent, stand-alone validation of commercially available AI products for bone age prediction based on hand radiographs and lung nodule detection on chest radiographs.
Materials and Methods:
This retrospective study was carried out as part of Project AIR. Nine of 17 eligible AI products were validated on data from seven Dutch hospitals. For bone age prediction, the root mean square error (RMSE) and Pearson correlation coefficient were computed. The reference standard was set by three to five expert readers. For lung nodule detection, the area under the receiver operating characteristic curve (AUC) was computed. The reference standard was set by a chest radiologist based on CT. Randomized subsets of hand (n = 95) and chest (n = 140) radiographs were read by 14 and 17 human readers, respectively, with varying experience.
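For readers less familiar with the three metrics named above, the following is a minimal sketch of how RMSE, the Pearson correlation coefficient, and the AUC can be computed. It is not the study's analysis code: the function names and the toy values are hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation, which is an assumption about method rather than something stated in the abstract.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between predictions and reference values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, for illustration only (not from the study).
ai_bone_age = [8.2, 10.1, 12.7, 5.9]    # predicted bone age, years
ref_bone_age = [8.0, 10.5, 12.0, 6.1]   # reference standard, years
nodule_scores = [0.9, 0.8, 0.3, 0.1]    # algorithm output per radiograph
nodule_labels = [1, 0, 1, 0]            # 1 = nodule present on reference

print(rmse(ai_bone_age, ref_bone_age))
print(pearson_r(ai_bone_age, ref_bone_age))
print(auc(nodule_scores, nodule_labels))
```

In this formulation an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of radiographs with and without nodules.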
Results:
Two bone age prediction algorithms were tested on hand radiographs (from January 2017 to January 2022) in 326 patients (mean age, 10 years ± 4 [SD]; 173 female patients) and correlated strongly with the reference standard (r = 0.99; P < .001 for both). No difference in RMSE was observed between algorithms (0.63 years [95% CI: 0.58, 0.69] and 0.57 years [95% CI: 0.52, 0.61]) and readers (0.68 years [95% CI: 0.64, 0.73]). Seven lung nodule detection algorithms were validated on chest radiographs (from January 2012 to May 2022) in 386 patients (mean age, 64 years ± 11; 223 male patients). Compared with readers (mean AUC, 0.81 [95% CI: 0.77, 0.85]), four algorithms performed better (AUC range, 0.86–0.93; P value range, <.001 to .04).
Conclusion:
Compared with human readers, four AI algorithms for detecting lung nodules on chest radiographs showed improved performance, whereas the remaining algorithms tested showed no evidence of a difference in performance.
© RSNA, 2024