A research application for AI-based breast MRI analysis (BreastMARC, v1.6, Siemens Healthcare, Erlangen, Germany) is used to automatically segment the breast tissue, parenchyma, and lesions. Its data processing pipeline is depicted in Figure 2.
Breast tissue and parenchyma segmentation are carried out by deep learning models based on a 3D U-Net [3] and a 2D U-Net [4], respectively. The image intensities are normalized by mapping their 2nd and 98th percentiles to 0 and 1. For breast tissue segmentation, the images are resampled to a fixed size of 240 x 240 x 128 voxels, while for parenchyma segmentation they are cropped to the breast region and resampled to an in-plane voxel size of 0.6 x 0.6 mm². The breast segmentation model is trained on 198 annotated volumes and the parenchyma segmentation model on 55 volumes, both using a 60-20-20 split for training, validation, and test sets.
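The percentile-based intensity normalization applied in both segmentation pipelines can be sketched as follows. This is an illustrative NumPy implementation under our own naming, not the BreastMARC code:

```python
import numpy as np

def normalize_percentiles(volume: np.ndarray, low: float = 2, high: float = 98) -> np.ndarray:
    """Linearly rescale intensities so that the `low`th percentile maps to 0
    and the `high`th percentile maps to 1."""
    p_low, p_high = np.percentile(volume, [low, high])
    return (volume - p_low) / (p_high - p_low)
```

Note that intensities below the 2nd or above the 98th percentile fall outside [0, 1]; whether such values are clipped afterwards is not specified in the source.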
The lesion detection algorithm is based on a re-implementation of the nnU-Net framework [5], using a 2D model that takes whole slices as input. The image intensities are normalized by mapping their 2nd and 98th percentiles to 0 and 1. The slices are cropped to the breast region and resampled to an in-plane voxel size of 0.435 x 0.435 mm². The detection model is trained on 1757 volumes using a 60-20-20 split for training, validation, and test sets. To reduce the number of false positive detections, shape, intensity, and texture features are extracted from the detected lesion candidates using OBIA features [6]. A random forest classifier trained on these features classifies the detections into true and false positives. This allows different threshold values to be set on the false positive reduction, balancing sensitivity against the number of false positives produced by the algorithm.
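The thresholded false positive reduction step can be illustrated with scikit-learn. The feature matrix below is synthetic (a stand-in for the OBIA shape, intensity, and texture features), and the `keep_candidates` helper is a hypothetical sketch of how a probability cut-off trades sensitivity against false positives:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for OBIA features: one row per detected lesion
# candidate, label 1 = true positive, 0 = false positive.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 8))
y = (X[:, 0] + rng.normal(scale=0.5, size=300) > 0).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def keep_candidates(features: np.ndarray, threshold: float) -> np.ndarray:
    """Boolean mask of candidates whose predicted true-positive probability
    reaches the cut-off; a lower threshold favours sensitivity, a higher
    threshold suppresses more false positives."""
    return clf.predict_proba(features)[:, 1] >= threshold
```

Because the mask is monotone in the threshold, raising the cut-off can only remove candidates, which is exactly the sensitivity-versus-false-positives trade-off described above.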
Further details of the datasets used for model training and testing during development are given in Table 3.
For external evaluation, 116 patients who underwent DCE-MRI and in whom cancer was diagnosed by breast biopsy between 2021 and 2022 are randomly selected from the Radiology Information System of one institution (Site 5 in Table 3). These external test cases are evaluated directly at the clinical site and are not accessed in any way during model development. Three different cut-off points of the false positive reduction for lesion detection are analyzed. The deep learning-based results are compared to the original radiological interpretation and to the biopsy results by one expert breast radiologist, who assesses the number of true positive, false positive, and false negative detections per case.
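The per-case bookkeeping of detections can be sketched as set operations. Matching detections to reference lesions by an identifier, as below, is a simplification of the spatial matching an expert reader performs:

```python
def count_detections(predicted: set, reference: set) -> tuple:
    """Per-case counts given identifiers of predicted and reference lesions.
    TP: predictions that match a reference lesion; FP: predictions with no
    match; FN: reference lesions the algorithm missed."""
    tp = len(predicted & reference)
    fp = len(predicted - reference)
    fn = len(reference - predicted)
    return tp, fp, fn
```

For example, `count_detections({"a", "b"}, {"b", "c"})` returns `(1, 1, 1)`: one matched lesion, one spurious detection, and one missed lesion.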