| Microwave breast imaging augmented by Artificial Intelligence (AI) offers considerable potential for improving breast cancer detection. However, the resulting data are high-dimensional, complex, and redundant, spanning wide frequency bands, multiple viewing angles, and numerous sensors. This high feature-to-sample ratio can hinder analysis and degrade AI model performance. In this study, Principal Component Analysis (PCA) was applied to clinical microwave breast imaging data acquired with MammoWave, a system developed by Umbria Bioengineering Technologies (UBT) S.r.l. within the EU- and UK Research and Innovation-funded MammoScreen project. The compact representations extracted by PCA were used to train a Long Short-Term Memory (LSTM) model, chosen for its ability to model sequential data. Despite severe class imbalance, the LSTM trained without imbalance handling achieved stable and competitive performance (accuracy ≈ 0.80, AUC ≈ 0.79, F1-score ≈ 0.59), indicating balanced classification behavior. To further investigate the impact of class imbalance, we evaluated three imbalance-handling strategies: weighted sampling, weighted loss, and focal loss. All approaches produced similar results, with focal loss yielding the highest performance (AUC ≈ 0.80, F1-score ≈ 0.60), though the improvements were marginal. Notably, these methods helped maintain a balanced precision–recall trade-off rather than substantially increasing the overall metrics. These findings suggest that, while accuracy alone is insufficient for imbalanced clinical data, imbalance-handling techniques primarily improve prediction stability and reliability rather than dramatically enhancing performance in AI-based breast cancer detection. |
*** Title, author list and abstract as submitted during Camera-Ready version delivery. Small changes that may have occurred during processing by Springer may not appear in this window.
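
For readers unfamiliar with two of the imbalance-handling strategies named in the abstract, the following is a minimal NumPy sketch of a class-weighted binary cross-entropy and a binary focal loss. It is an illustration only, not the authors' implementation: the function names, the `gamma`, `alpha`, and `pos_weight` values, and the toy predictions are all assumptions, since the abstract does not report the settings used in the study.

```python
import numpy as np


def weighted_bce(probs, targets, pos_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy with an extra weight on the positive class.

    pos_weight is a hypothetical setting; values above 1 penalise
    errors on the (minority) positive class more heavily.
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=float)
    loss = -(pos_weight * targets * np.log(probs)
             + (1.0 - targets) * np.log(1.0 - probs))
    return float(np.mean(loss))


def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    gamma and alpha are assumed defaults, not values from the study.
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=float)
    # p_t is the probability the model assigns to the true class.
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    # alpha_t balances the contribution of positives and negatives.
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma down-weights examples that are already well classified.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))


# Toy predictions (made up for illustration): two positives, four negatives.
toy_probs = np.array([0.9, 0.4, 0.1, 0.2, 0.3, 0.05])
toy_targets = np.array([1, 1, 0, 0, 0, 0])

plain = weighted_bce(toy_probs, toy_targets, pos_weight=1.0)
weighted = weighted_bce(toy_probs, toy_targets, pos_weight=2.0)
focal = binary_focal_loss(toy_probs, toy_targets, gamma=2.0, alpha=0.25)
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to half the unweighted cross-entropy, which is a convenient sanity check. Increasing `gamma` shifts the training signal toward hard, misclassified examples instead of raising every metric uniformly.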