| Diatoms are unicellular microorganisms that play an important role in global primary production and inhabit both freshwater and marine environments. Identifying them is essential for assessing water quality and monitoring biodiversity, but the small number of experts limits the spatial and temporal scope and resolution of these activities. Alongside manual identification using light microscopy, algorithmic diatom identification using machine learning has been increasingly explored in recent years. In this study, we implemented an end-to-end deep learning-based workflow for identifying benthic diatoms, combining a segmentation model that locates diatom valves in digital light microscopy slide images with four classifiers of different architectures for genus-level taxonomic identification. In contrast to other diatom deep learning studies, we evaluated performance on newly collected data from different geographical regions across six European countries, thus forcing a domain shift during inference. Our results show that the SWIN transformer performed best, and that diatoms from Scandinavian countries proved substantially more difficult for the models to identify than Central European samples. Besides comparing classifiers, we tested different metrics that could be useful for selecting cases for expert control in an envisaged future iterative computer-assisted expert identification workflow. We conclude that upscaling deep learning-supported diatom identification to a continental or global scale will require substantial additional training data, and that an interactive human-computer collaboration scenario seems a promising future avenue for filling this gap. |