
Table 2 Statistical analysis of AI performance compared to human raters

From: Artificial intelligence-guided distal radius fracture detection on plain radiographs in comparison with human raters

 

| Metric | AI on AP radiograph | Orthopaedic surgeon on AP radiograph | p-value | AI on lateral view radiograph | Orthopaedic surgeon on lateral view radiograph | p-value |
|---|---|---|---|---|---|---|
| Accuracy (%) | 95.90 | 94.95 | 0.239 | 94.81 | 96.10 | 0.646 |
| Cohen’s Kappa | 0.913 | 0.894 | N/A | 0.891 | 0.918 | N/A |
| F1 score | 0.947 | 0.935 | N/A | 0.934 | 0.950 | N/A |
| Sensitivity (%) | 92.02 | 89.73 | 0.182 | 89.79 | 90.83 | 0.773 |
| Specificity (%) | 98.45 | 98.48 | 1.000 | 98.25 | 99.71 | 0.131 |
| Youden Index (%) | 90.47 | 88.22 | N/A | 88.04 | 90.55 | N/A |
| True Positives | 196 | 201 | N/A | 211 | 218 | N/A |
| True Negatives | 318 | 325 | N/A | 337 | 348 | N/A |
| False Positives | 5 | 5 | N/A | 6 | 1 | N/A |
| False Negatives | 17 | 23 | N/A | 24 | 22 | N/A |
AI: artificial intelligence; AP: anteroposterior; N/A: not applicable
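
The derived rows of the table (accuracy, Cohen's kappa, F1 score, sensitivity, specificity, Youden index) can be recomputed from the four confusion counts in each column. The following is a minimal sketch using the standard textbook formulas. The names `Counts` and `metrics` are illustrative and do not come from the article. The sketch also assumes that kappa measures agreement between each rater and the reference standard. The table does not state which statistical test produced the p-values, so they are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion counts for one rater on one view (hypothetical helper type)."""
    tp: int  # true positives
    tn: int  # true negatives
    fp: int  # false positives
    fn: int  # false negatives


def metrics(c: Counts) -> dict:
    """Recompute the derived rows of the table from raw confusion counts."""
    n = c.tp + c.tn + c.fp + c.fn
    accuracy = (c.tp + c.tn) / n
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    f1 = 2 * c.tp / (2 * c.tp + c.fp + c.fn)

    # Cohen's kappa: observed agreement (accuracy) corrected for the
    # agreement expected by chance from the marginal totals.
    # Assumption: agreement is measured between rater and reference standard.
    chance = ((c.tp + c.fp) * (c.tp + c.fn)
              + (c.tn + c.fn) * (c.tn + c.fp)) / n ** 2
    kappa = (accuracy - chance) / (1 - chance)

    # Youden index J = sensitivity + specificity - 1, shown here in percent.
    youden = sensitivity + specificity - 1

    return {
        "accuracy_pct": round(100 * accuracy, 2),
        "kappa": round(kappa, 3),
        "f1": round(f1, 3),
        "sensitivity_pct": round(100 * sensitivity, 2),
        "specificity_pct": round(100 * specificity, 2),
        "youden_pct": round(100 * youden, 2),
    }


if __name__ == "__main__":
    # Counts taken from the "AI on AP radiograph" column of the table.
    ai_ap = Counts(tp=196, tn=318, fp=5, fn=17)
    for name, value in metrics(ai_ap).items():
        print(f"{name}: {value}")
```

With the AI counts for the AP view, this yields accuracy 95.90, kappa 0.913, F1 0.947, sensitivity 92.02, specificity 98.45 and Youden index 90.47, matching the first column of the table. Substituting the counts from any other column checks that column in the same way.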