20th AIAI 2024, 27 - 30 June 2024, Corfu, Greece

Evaluation of Language Models for Multilabel Classification of Biomedical Texts

Panagiotis G. Syriopoulos, Andreas D. Andriopoulos, Dimitrios Koutsomitropoulos

Abstract:

The continuous increase of data availability and the need for their utilization make it imperative to organize them into categories. Recent classification problems often involve the prediction of multiple labels simultaneously applying to a single instance. In this paper, we propose a structured approach for the implementation and evaluation of multilabel classification tasks in the context of biomedical texts. This involves selecting appropriate datasets and models, designing experiments, and defining metrics that accurately measure the models' performance across various aspects of the task. Our results yield notable scores and conclusions for the behavior of some state-of-the-art language models in specific data. It is shown that the complexity of biomedical data and the intricacy of multilabel classification require careful consideration of these models' capabilities to handle large label spaces, label correlations, and the nuances of biomedical language.
