22nd AIAI 2026, 16 - 19 July 2026, Chania, Crete, Greece

Harmful Content Moderation in e-Participation Platforms

Vidal Rui, Neto Marco, Dias Tomas, Tomas Pedro, Rosa Luis, Cordeiro Luis

Abstract:

  Moderating harmful textual content in e-participation platforms requires balancing user safety with the protection of legitimate democratic expression, procedural fairness, and deliberative quality. This paper reviews the evolution of automated moderation approaches, from rule-based and classical machine learning methods to transformers and large language models (LLMs), and highlights the challenges of context, explainability, bias, and governance that are especially salient in civic settings. Building on this review, the paper presents a multi-LLM moderation workflow with council-based consolidation and human oversight. In experiments on a test set of 150 examples, the best individual model, Llama 3.2, reached an accuracy of 0.6267, whereas an oracle accuracy of 0.9467 indicates substantial room for improvement. Aggregating judgments through a council yielded an accuracy of 0.5733, below the best individual model, suggesting that refining the consolidation rules is a key path toward robust decisions. The proposed workflow is further situated as a possible moderation component within the broader CoParticipation platform. The study positions harmful content moderation in e-participation as a socio-technical governance challenge rather than a conventional text classification task, and emphasises the need for evaluation methods aligned with deliberative and institutional goals.
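  The three accuracy figures measure different things, and a small sketch may help to keep them apart. On 150 examples, the reported values are consistent with 94, 142, and 86 correct decisions respectively. The code below is illustrative only: the toy labels, the model names, the reading of "oracle accuracy" as the share of examples on which at least one model is correct, and the plain majority vote standing in for the council are all assumptions, not the consolidation rules used in the paper.

```python
from collections import Counter

def accuracy(preds, gold):
    """Share of examples where the prediction matches the gold label."""
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)

def oracle_accuracy(model_preds, gold):
    """Share of examples where at least one model is correct
    (assumed reading of 'oracle accuracy')."""
    hits = sum(
        any(preds[i] == g for preds in model_preds.values())
        for i, g in enumerate(gold)
    )
    return hits / len(gold)

def majority_vote(model_preds):
    """Hypothetical council rule: the most frequent label per example wins."""
    columns = zip(*model_preds.values())
    return [Counter(votes).most_common(1)[0][0] for votes in columns]

# Toy data (hypothetical): H = harmful, S = safe.
gold = ["H", "S", "S", "H", "S", "H"]
model_preds = {
    "model_a": ["H", "S", "H", "S", "S", "S"],
    "model_b": ["H", "H", "S", "S", "H", "H"],
    "model_c": ["S", "S", "H", "H", "H", "S"],
}

individual = {name: accuracy(p, gold) for name, p in model_preds.items()}
oracle = oracle_accuracy(model_preds, gold)
council = accuracy(majority_vote(model_preds), gold)

# Counts implied by the reported accuracies on 150 examples.
implied = {k: round(k / 150, 4) for k in (94, 142, 86)}
```

  In this toy set every example is solved by some model, yet the majority vote is right less often than the best single model. This is the same ordering as in the reported results (council below best individual model, both far below the oracle), and it shows why the choice of consolidation rule matters.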
