Predicting breast cancer by applying deep learning to linked health records and mammograms

machine learning
deep learning
healthcare

Improving breast cancer prediction by combining mammography with clinical records, using deep learning and boosted trees.

Authors
Affiliations

Ayelet Akselrod-Ballin

IBM Research

Michal Chorev

IBM Research

Yoel Shoshan

IBM Research

Adam Spiro

IBM Research

Alon Hazan

IBM Research

Roie Melamed

IBM Research

Ella Barkan

IBM Research

Esma Herzel

Maccabi Healthcare Services

Shaked Naor

IBM Research

Ehud Karavani

IBM Research

Gideon Koren

Maccabi Healthcare Services

Yaara Goldschmidt

IBM Research

Varda Shalev

Maccabi Healthcare Services

Michal Rosen-Zvi

IBM Research

Michal Guindy

Assuta Medical Centers

Published

June 18, 2019

DOI
Abstract

Background: Computational models on the basis of deep neural networks are increasingly used to analyze health care data. However, the efficacy of traditional computational models in radiology is a matter of debate.

Purpose: To evaluate the accuracy and efficiency of a combined machine and deep learning approach for early breast cancer detection applied to a linked set of digital mammography images and electronic health records.

Materials and Methods: In this retrospective study, 52 936 images were collected in 13 234 women who underwent at least one mammogram between 2013 and 2017, and who had health records for at least 1 year before undergoing mammography. The algorithm was trained on 9611 mammograms and health records of women to make two breast cancer predictions: to predict biopsy malignancy and to differentiate normal from abnormal screening examinations. The study estimated the association of features with outcomes by using t test and Fisher exact test. The model comparisons were performed with a 95% confidence interval (CI) or by using the DeLong test.

Results: The resulting algorithm was validated in 1055 women and tested in 2548 women (mean age, 55 years ± 10 [standard deviation]). In the test set, the algorithm identified 34 of 71 (48%) false-negative findings on mammograms. For the malignancy prediction objective, the algorithm obtained an area under the receiver operating characteristic curve (AUC) of 0.91 (95% CI: 0.89, 0.93), with specificity of 77.3% (95% CI: 69.2%, 85.4%) at a sensitivity of 87%. When trained on clinical data alone, the model performed significantly better than the Gail model (AUC, 0.78 vs 0.54, respectively; P < .004).

Conclusion: The algorithm, which combined machine-learning and deep-learning approaches, can be applied to assess breast cancer at a level comparable to radiologists and has the potential to substantially reduce missed diagnoses of breast cancer.
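The abstract reports two kinds of metric: an AUC, and a specificity read off at a fixed sensitivity. As a rough illustration of how such figures can be computed from per-examination model scores, here is a minimal Python sketch. It is not the authors' code. The function names, the toy scores, and the toy labels are all hypothetical, and the confidence intervals and the DeLong test used in the study are not reproduced.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive case outscores
    a random negative case (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def specificity_at_sensitivity(scores, labels, target_sensitivity):
    """Find the highest threshold whose sensitivity reaches the target.

    A case is called positive when score >= threshold.
    Returns (threshold, sensitivity, specificity).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Lowering the threshold can only raise sensitivity, so the first
    # threshold that meets the target also gives the best specificity.
    for threshold in sorted(set(scores), reverse=True):
        sensitivity = sum(p >= threshold for p in pos) / len(pos)
        if sensitivity >= target_sensitivity:
            specificity = sum(n < threshold for n in neg) / len(neg)
            return threshold, sensitivity, specificity
    raise ValueError("target sensitivity is not reachable")


# Hypothetical toy data: 1 = malignant, 0 = benign or normal.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0]

toy_auc = auc(toy_scores, toy_labels)
toy_operating_point = specificity_at_sensitivity(toy_scores, toy_labels, 0.75)
```

On real data, the same operating-point search would be run with the target sensitivity set to the value of interest (0.87 in the abstract), and the uncertainty would be estimated separately, for example by bootstrapping.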

Citation

@article{akselrod2019predicting,
  title={Predicting breast cancer by applying deep learning to linked health records and mammograms},
  author={Akselrod-Ballin, Ayelet and Chorev, Michal and Shoshan, Yoel and Spiro, Adam and Hazan, Alon and Melamed, Roie and Barkan, Ella and Herzel, Esma and Naor, Shaked and Karavani, Ehud and others},
  journal={Radiology},
  volume={292},
  number={2},
  pages={331--342},
  year={2019},
  publisher={Radiological Society of North America}
}