We assessed the feasibility of a data-driven imaging biomarker based on weakly supervised learning (DIB; an imaging biomarker derived from large-scale medical image data with deep learning technology) in mammography (DIB-MG). A total of 29,107 digital mammograms from five institutions (4,339 cancer cases and 24,768 normal cases) were included. After matching patients’ age, breast density, and equipment, 1,238 and 1,238 cases were chosen as validation and test sets, respectively, and the remainder were used for training. The core algorithm of DIB-MG is a deep convolutional neural network; a deep learning algorithm specialized for images. Each sample (case) is an exam composed of 4-view images (RCC, RMLO, LCC, and LMLO). For each case in a training set, the cancer probability inferred from DIB-MG is compared with the per-case ground-truth label. Then the model parameters in DIB-MG are updated based on the error between the prediction and the ground-truth. At the operating point (threshold) of 0.5, sensitivity was 75.6% and 76.1% when specificity was 90.2% and 88.5%, and AUC was 0.903 and 0.906 for the validation and test sets, respectively. This research showed the potential of DIB-MG as a screening tool for breast cancer.
All Science Journal Classification (ASJC) codes