20th AIAI 2024, 27 - 30 June 2024, Corfu, Greece

Vision transformer based tokenization for enhanced breast cancer histopathological images classification

Mouhamed Laid ABIMOULOUD, Khaled BENSID, Mohamed Elleuch, Oussama AIADI, Monji KHERALLAH

Abstract:

  Breast cancer remains a global concern, underscoring the crucial need for early diagnosis to ensure effective treatment. In recent years, convolutional neural networks (CNNs) have dominated medical vision tasks, but interest in computer-aided diagnosis (CAD) has recently shifted towards vision transformers (ViTs). ViTs, however, are data-intensive and carry a substantial number of parameters, which often leads to poorer performance than CNNs; these drawbacks are especially pronounced on medical image datasets, where data are scarce. To address this gap, our paper proposes the TokenLearner model for classifying breast tumours in histopathological images. This hybrid model fuses a ViT with convolutional layers that operate directly on the input patches, using standard convolutions in the attention-map block to dynamically highlight relevant regions. This reduces the number of patches used during training, lowering training time and complexity compared to the base ViT architecture. The model was extensively evaluated on the BreakHis dataset, comprising 2496 benign and 5429 malignant masses, and achieved remarkable performance: an accuracy of 97.04%, precision of 96.99%, sensitivity of 96.11%, and an F1 score of 96.54% for breast mass classification. Our study highlights the effective use of trained attention mechanisms in developing high-performance computer-aided systems for breast cancer diagnosis. Importantly, this approach achieves outstanding results while minimizing computational resource requirements and reducing processing time.
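As a rough illustration of the patch-reduction idea described above (not the authors' implementation), a TokenLearner-style pooling step can be sketched in NumPy: spatial attention maps select where to pool, so an H x W grid of patch features collapses into a small number of learned tokens. The random projection below is a hypothetical stand-in for the learned attention convolutions.

```python
import numpy as np

def token_learner(feature_map, num_tokens, rng):
    """Conceptual TokenLearner step: pool an (H, W, C) feature map
    into `num_tokens` tokens of dimension C via spatial attention.

    The attention logits come from a random linear projection here,
    standing in for the learned convolutional attention block.
    """
    h, w, c = feature_map.shape
    flat = feature_map.reshape(h * w, c)             # (HW, C)

    # One spatial attention map per output token.
    weights = rng.standard_normal((c, num_tokens))   # hypothetical learned weights
    logits = flat @ weights                          # (HW, K)

    # Spatial softmax: each token's attention sums to 1 over positions.
    attn = np.exp(logits - logits.max(axis=0, keepdims=True))
    attn /= attn.sum(axis=0, keepdims=True)

    # Each token is an attention-weighted spatial average of the features,
    # so the transformer then processes K tokens instead of HW patches.
    return attn.T @ flat                             # (K, C)

rng = np.random.default_rng(0)
fmap = rng.standard_normal((14, 14, 64))  # e.g. a 14x14 grid of 64-dim patch features
tokens = token_learner(fmap, num_tokens=8, rng=rng)
print(tokens.shape)  # (8, 64): 196 patches pooled into 8 tokens
```

With 8 tokens replacing 196 patches, the quadratic cost of self-attention in the subsequent transformer layers drops accordingly, which is the source of the training-time savings the abstract reports.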

*** Title, author list and abstract as seen in the Camera-Ready version of the paper that was provided to Conference Committee. Small changes that may have occurred during processing by Springer may not appear in this window.