
Screening mammography acquisition parameters affect both AI's and radiologists' interpretation performance, according to a study published September 17 in Radiology: Artificial Intelligence.
The findings could help clinicians better develop "effective strategies for the clinical integration of AI," wrote a team led by William Lotter, PhD, of Dana-Farber Cancer Institute and Harvard Medical School in Boston.
"Along with influencing human interpretation, variations in acquisition parameters may affect the accuracy of AI models," he and colleagues noted. "As such, it is crucial to understand the impact of image acquisition parameters on AI performance alone and in comparison to the performance of radiologists' interpretations."
The investigators assessed associations between seven mammogram acquisition parameters and the performance of both an AI model and radiologists in interpreting 2D screening mammograms acquired between December 2010 and 2019. The parameters included the following:
- Mammography machine type,
- Kilovoltage peak (kVp),
- X-ray exposure delivered,
- Relative x-ray exposure,
- Paddle size,
- Compression force, and
- Breast tissue thickness.
The study assessed an ensemble AI model developed from the Digital Mammography DREAM Challenge. It included a dataset of 28,278 screening 2D mammograms from 22,626 women (mean age, 58 years). Of the total mammograms, 324 resulted in a breast cancer diagnosis within a year.
The group reported the following:
Radiologist and AI model performance for identifying breast cancer on 2D screening mammography

| Measure | Radiologist readers | AI model |
| --- | --- | --- |
| Sensitivity | 79.3% | 76.9% |
| Specificity | 88.7% | 76.9% |
The study also found that increased x-ray exposure reduced specificity for the AI model (-4.5% per one-standard-deviation increase; p < 0.001) but not for radiologists (p = 0.44), while increased compression force reduced specificity for radiologists (-1.3% per one-standard-deviation increase; p < 0.001) but not for the AI model (p = 0.60). Radiologist and AI performance "showed similar trends for kVp, where increased kVp had little effect on sensitivity but was associated with a slight increase in specificity for both," the authors reported.
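To make the "change in specificity per one-standard-deviation increase" framing concrete, here is a minimal simulation sketch. All numbers, variable names, and the simple line-fit estimator are illustrative assumptions, not the study's actual data or statistical method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: x-ray exposure values for 10,000 cancer-negative exams,
# and whether a model correctly called each exam negative (a true negative).
# A downward relationship is built in: higher exposure lowers specificity.
n = 10_000
exposure = rng.normal(loc=10.0, scale=2.0, size=n)  # arbitrary units
z = (exposure - exposure.mean()) / exposure.std()   # standardize: 1 unit = 1 SD
p_correct = 0.80 - 0.045 * z                        # specificity drops ~4.5% per SD
true_negative = rng.random(n) < np.clip(p_correct, 0.0, 1.0)

# Estimate the specificity change per 1-SD increase in exposure with a
# least-squares line fit of the 0/1 outcomes on the standardized values.
slope, intercept = np.polyfit(z, true_negative.astype(float), 1)
print(f"Estimated specificity change per SD: {slope:+.3f}")
print(f"Overall specificity: {true_negative.mean():.3f}")
```

With 10,000 simulated exams, the fitted slope recovers a value near the built-in -0.045 per SD, which is the same kind of quantity the study reports for the AI model's exposure effect.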
The results add to the literature underscoring that understanding "the behavior of AI models across variations present among real-world clinical populations is critical for their safe and effective deployment," according to Lotter and colleagues.
"The ability to understand an AI model’s strengths and limitations in relation to measurable, clinically available parameters can help radiologists determine when AI predictions should be more or less trusted in clinical decision-making, helping ensure that AI delivers maximal benefits to patients," they concluded.