A highly experienced breast radiologist performed the ultrasound examinations and systematically collected still images of retroareolar findings using two different probes: a classic linear probe (CLP) with a frequency of 14 MHz (LA2-14A) and a high-frequency probe (HFP) with a frequency of 22 MHz (LA3-22AI). The radiologist carefully selected and stored the images that best characterized the anatomical structures and pathological features of the findings, ensuring optimal representation of the lesions for evaluation.
To assess the diagnostic consistency and accuracy of the two probes, four radiologists (two senior radiologists with extensive experience in breast imaging and two radiology residents) independently evaluated the images. Each radiologist, blinded to the original reports and clinical history, assigned a BI-RADS score to each image. Scores were dichotomized into two groups: BI-RADS 2–3, indicating benign or probably benign findings, and BI-RADS 4–5, indicating suspicious findings or findings highly suggestive of malignancy that may warrant further investigation or biopsy.
Inter-reader agreement was quantified using Cohen's kappa (κ) for pairwise agreement between individual readers and Fleiss' kappa (κ) for agreement across the entire group of readers. Together, these measures indicate how reliably each ultrasound probe supports the characterization of retroareolar lesions. By comparing agreement levels between the CLP and the HFP, this study aims to determine whether high-frequency ultrasound improves diagnostic consensus and thereby enhances confidence in the assessment of this challenging anatomical region.
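For readers who wish to reproduce this type of analysis, the following is a minimal Python sketch of how the two agreement statistics can be computed from dichotomized BI-RADS reads. It is an illustration only and not necessarily the software used in this study: the function names, the 0/1 coding (0 for BI-RADS 2–3, 1 for BI-RADS 4–5), and the example reads are hypothetical placeholders, not study data.

```python
import math
from collections import Counter
from itertools import combinations


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa: chance-corrected agreement between two readers."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must rate the same non-empty image set")
    n = len(reader_a)
    # Observed proportion of images on which the two readers agree.
    p_observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance, from each reader's own category frequencies.
    count_a, count_b = Counter(reader_a), Counter(reader_b)
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if p_expected == 1.0:
        return math.nan  # undefined: both readers used one identical category
    return (p_observed - p_expected) / (1.0 - p_expected)


def fleiss_kappa(ratings_per_image):
    """Fleiss' kappa; ratings_per_image[i] holds every reader's label for image i."""
    if not ratings_per_image:
        raise ValueError("at least one image is required")
    n_images = len(ratings_per_image)
    n_readers = len(ratings_per_image[0])
    if n_readers < 2 or any(len(r) != n_readers for r in ratings_per_image):
        raise ValueError("every image needs the same number (>= 2) of reads")
    category_totals = Counter()
    agreement_sum = 0.0
    for row in ratings_per_image:
        counts = Counter(row)
        category_totals.update(counts)
        # Proportion of agreeing reader pairs for this image.
        agreeing = sum(c * c for c in counts.values()) - n_readers
        agreement_sum += agreeing / (n_readers * (n_readers - 1))
    p_observed = agreement_sum / n_images
    total_reads = n_images * n_readers
    # Chance agreement from the pooled category proportions.
    p_expected = sum((t / total_reads) ** 2 for t in category_totals.values())
    if p_expected == 1.0:
        return math.nan  # undefined: all reads fell in one category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical reads for six images acquired with one probe (NOT study data).
# 0 = BI-RADS 2-3, 1 = BI-RADS 4-5.
example_reads = {
    "senior_1": [0, 0, 1, 1, 0, 1],
    "senior_2": [0, 0, 1, 1, 0, 1],
    "resident_1": [0, 1, 1, 1, 0, 0],
    "resident_2": [0, 0, 1, 0, 0, 1],
}

# Pairwise agreement between individual readers.
for (name_a, reads_a), (name_b, reads_b) in combinations(example_reads.items(), 2):
    print(f"Cohen kappa {name_a} vs {name_b}: {cohen_kappa(reads_a, reads_b):.2f}")

# Agreement across the whole group: one row of reads per image.
rows = list(zip(*example_reads.values()))
print(f"Fleiss kappa, all readers: {fleiss_kappa(rows):.2f}")
```

Under this sketch, the same two functions would be applied separately to the CLP reads and the HFP reads, so that the resulting κ values can be compared between probes.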