21st AIAI 2025, 26-29 June 2025, Limassol, Cyprus

OPJUSTICE - A Multi-label Text Classification Model

Grecu Ionuț-Cătălin, Popescu Paul Stefan, Mihaescu Marian Cristian

Abstract:

  Online communication platforms such as Discord have become central to various communities, yet their rapid growth has also increased the complexity of moderation. This paper addresses the challenge by presenting a multi-label text classification system that automatically detects toxic messages in Romanian-language Discord chats. Our approach employs a cyclical workflow encompassing data collection from both Discord channels and public datasets, rigorous data curation and labeling (including human-in-the-loop validation), and iterative model retraining. We fine-tune a RoBERT-small model to classify messages into multiple toxicity categories, including aggression, hateful speech, violence, and sexual content, while retaining an “OK” label for non-toxic messages. Experimental results on a dataset of over 22,000 labeled instances reveal high accuracy and robust F1-scores across toxicity categories. Furthermore, seamless integration with Discord’s bot infrastructure allows the model to respond in near real time, either alerting human moderators or taking automated actions (e.g., warnings or bans). By offloading repetitive and time-sensitive tasks, the system empowers moderators to focus on community-building and nuanced policy decisions. Overall, our solution offers a scalable and effective framework for AI-driven moderation, paving the way for broader applications in multilingual and cross-platform contexts.
