We applied the Multi-Diversity Consistency Self-distillation (MDCS) [3] framework to develop a deep learning model that classifies bone mineral density (BMD) into three categories (normal, osteopenia, and osteoporosis) from chest X-ray (CXR) images. The categories were defined by dual-energy X-ray absorptiometry (DXA) T-scores: normal (T-score ≥ −1.0), osteopenia (−2.5 < T-score < −1.0), and osteoporosis (T-score ≤ −2.5). A total of 69,201 CXRs, each paired with a DXA T-score acquired on the same date, were obtained from a single hospital and served as the training dataset. All images were resized to 224×224 pixels for model training.
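As an illustration of the labeling rule above, the following minimal sketch maps a DXA T-score to one of the three categories. The function name `bmd_category` is hypothetical and chosen only for this example; the thresholds are those stated in the text.

```python
def bmd_category(t_score: float) -> str:
    """Map a DXA T-score to a BMD category using the thresholds in the text."""
    if t_score >= -1.0:
        return "normal"
    if t_score <= -2.5:
        return "osteoporosis"
    # Remaining case: -2.5 < T-score < -1.0
    return "osteopenia"


if __name__ == "__main__":
    for score in (0.3, -1.0, -1.8, -2.5, -3.1):
        print(score, bmd_category(score))
```

Both boundary values are inclusive on the outer categories, so a T-score of exactly −1.0 is labeled normal and a T-score of exactly −2.5 is labeled osteoporosis.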
The MDCS framework consists of three major strategies. First, we applied augmentation techniques to each CXR to generate two versions: a weakly augmented image with minimal modifications and a strongly augmented image with more extensive transformations. Both versions were then processed by a shared feature extractor, a neural network that converts CXR images into numerical representations useful for classification. Training on both views encouraged the model to learn robust and generalizable features. Next, multi-expert models [4] were employed to address the differences in prevalence among the three BMD categories in the dataset, as shown in Figure 1. Each expert targeted a distinct class distribution, which helped manage class imbalance by capturing a broad spectrum of bone health conditions and thereby improved the robustness of the three-class classification. A diversity loss function was applied to encourage each expert to focus on different aspects of the dataset, preventing redundant learning and promoting specialized feature extraction.
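One common way to make experts target different class distributions is to shift each expert's logits by a scaled log class prior before computing cross-entropy. The sketch below illustrates that idea only; the text does not specify the exact loss, so the logit-adjustment form, the `strength` parameter, and the class priors used here are assumptions for illustration and are not taken from the dataset or confirmed as the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adjusted_cross_entropy(logits, label, class_priors, strength):
    """Cross-entropy after adding strength * log(prior) to each logit.

    strength = 0 gives plain cross-entropy; larger values penalize errors on
    rare classes more heavily, pushing the expert toward those classes.
    """
    adjusted = [z + strength * math.log(p) for z, p in zip(logits, class_priors)]
    return -math.log(softmax(adjusted)[label])


if __name__ == "__main__":
    # Hypothetical priors for (normal, osteopenia, osteoporosis).
    priors = [0.50, 0.35, 0.15]
    logits = [0.0, 0.0, 0.0]
    # One assumed strength per expert; the sample belongs to the rarest class.
    for strength in (0.0, 1.0, 2.0):
        print(strength, adjusted_cross_entropy(logits, 2, priors, strength))
```

With these assumed settings, the same uninformative prediction on a rare-class sample costs more as `strength` grows, so experts trained with different strengths are pushed toward different parts of the class distribution.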
Lastly, self-distillation [5] was applied to further refine model predictions. In this process, weakly augmented images were first passed through the model to generate initial predictions, and the most confident outputs were selected as pseudo-labels for the corresponding strongly augmented images. By aligning predictions across the two augmentations through a consistency loss, the model achieved improved stability and reduced variance.
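The consistency step described above can be sketched as a confidence-thresholded pseudo-label loss. This is a minimal illustration under stated assumptions: the text does not give the selection rule or the loss form, so the hard pseudo-labels, the cross-entropy objective, and the `threshold` value of 0.8 are hypothetical choices.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_loss(weak_logits, strong_logits, threshold=0.8):
    """Mean cross-entropy of strong-view predictions against pseudo-labels.

    A pseudo-label is the argmax of the weak-view prediction and is used only
    when its probability reaches `threshold` (an assumed value).
    """
    losses = []
    for weak, strong in zip(weak_logits, strong_logits):
        weak_probs = softmax(weak)
        confidence = max(weak_probs)
        if confidence < threshold:
            continue  # weak prediction not confident enough; skip this sample
        pseudo_label = weak_probs.index(confidence)
        losses.append(-math.log(softmax(strong)[pseudo_label]))
    # Return 0.0 when no sample passes the confidence filter.
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    weak = [[5.0, 0.0, 0.0], [0.1, 0.0, 0.0]]    # only the first is confident
    strong = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
    print(consistency_loss(weak, strong))
```

In a full training loop the weak-view predictions would typically be treated as fixed targets (no gradient), so that only the strong-view branch is updated by this term.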