20th AIAI 2024, 27–30 June 2024, Corfu, Greece

GreekT5: Sequence-to-Sequence Models for Greek News Summarization

Nikolaos Giarelis, Charalampos Mastrokostas, Nikos Karacapilidis

Abstract:

  Text summarization is a natural language processing subtask concerned with the automatic generation of a concise and coherent summary that covers the major concepts and topics of one or more documents. Recent advances in deep learning have led to the development of abstractive, Transformer-based summarization models that outperform classical approaches. However, research in this field focuses mainly on high-resource languages such as English, while the corresponding work on low-resource languages remains limited. Addressing Modern Greek, this paper proposes a series of new abstractive models for news article summarization. The proposed models were thoroughly evaluated on the same dataset against GreekBART, the only existing model for Greek abstractive news summarization. Our evaluation results reveal that most of the proposed models outperform GreekBART on various evaluation metrics. Our experiments indicate that multilingual Seq2Seq models, fine-tuned for a specific language and task, can achieve similar or even better performance than monolingual models pre-trained and fine-tuned for the same language and task, while requiring significantly fewer computational resources. We make our evaluation code publicly available, aiming to increase the reproducibility of this work and facilitate future research in the field.
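
For illustration, the sketch below shows how a fine-tuned GreekT5 checkpoint could be loaded for inference and scored with ROUGE using the Hugging Face transformers and evaluate libraries. This is a minimal sketch under stated assumptions: the model identifier and the Greek placeholder texts are hypothetical and not taken from the paper; substitute the checkpoint actually released by the authors.

    # Minimal inference-and-evaluation sketch, assuming the fine-tuned
    # checkpoints are published on the Hugging Face Hub.
    from transformers import pipeline
    import evaluate

    summarizer = pipeline(
        "summarization",
        model="IMISLab/GreekT5-mt5-small-greeksum",  # hypothetical checkpoint ID
    )

    article = "Το κείμενο ενός ελληνικού άρθρου ειδήσεων..."  # placeholder input
    reference = "Η περίληψη αναφοράς..."  # placeholder gold summary

    # Generate an abstractive summary; long inputs are truncated to the
    # model's maximum input length.
    summary = summarizer(article, max_length=128, truncation=True)[0]["summary_text"]

    # ROUGE is a standard summarization metric; BERTScore could be computed similarly.
    rouge = evaluate.load("rouge")
    print(rouge.compute(predictions=[summary], references=[reference]))
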
