22nd AIAI 2026, 16 - 19 July 2026, Chania, Crete, Greece

A Framework for LLM-Guided and ISO 5338-Aligned Data Engineering in Medical Device Adverse Event Reporting

Muaz Muhammad, Janjua Zaffar Haider, Loughran Roisin, St John Lynch Niamh

Abstract:

  Medical device adverse event reporting systems, such as the Manufacturer and User Facility Device Experience (MAUDE) database maintained by the Food and Drug Administration (FDA), play a critical role in post-market surveillance and patient safety monitoring. However, the accurate assignment of standardized Adverse Event Terminology (AET) codes, particularly the globally harmonized International Medical Device Regulators Forum (IMDRF) codes, remains a labour-intensive task prone to variability in human judgment. This study presents a Large Language Model (LLM)-guided agreement-filtering methodology for creating a high-confidence benchmark dataset from MAUDE records. By requiring agreement between human-annotated IMDRF codes and predictions from state-of-the-art LLMs at Level 2 hierarchical granularity, a low-ambiguity subset of 1,728 device problem records and 1,224 health effect records is established from an initial pool of 10,000 adverse events. The approach aligns the entire data pre-processing pipeline with ISO/IEC 5338:2023 AI data engineering principles, implementing modular Agent Skills that lay the foundation for reproducibility, traceability, and quality assurance. Moreover, the open-source GPT-OSS-120B reasoning model is evaluated on this curated dataset, revealing significant challenges in cross-model alignment: the model achieves only 59.49% consensus at Level 1 for device problems, dropping to 23.04% at Level 3, which highlights the difficulty of reconciling diverse textual inputs with rigid regulatory terminologies in long-context scenarios. These findings underscore the need for specialized NLP workflows beyond simple zero-shot prompting and establish a rigorous benchmark for future automated adverse event coding research.
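
As a rough illustration of the agreement filtering described in the abstract, the following Python sketch keeps a record only when every LLM prediction matches the human-annotated code once both are truncated to a chosen hierarchy level. It is not the authors' implementation. The `Record` type, the function names, and the sample codes are hypothetical, and the sketch assumes that a hierarchical code consists of a one-letter annex prefix followed by two digits per level, so that a Level 2 code has five characters.

```python
from dataclasses import dataclass


def truncate_code(code: str, level: int) -> str:
    """Truncate a hierarchical code to the given level.

    Assumption: one annex letter followed by two digits per level,
    so "A050101" becomes "A0501" at Level 2.
    """
    return code.strip().upper()[: 1 + 2 * level]


@dataclass
class Record:
    """Hypothetical record: one human code plus one prediction per LLM."""
    report_id: str
    human_code: str
    llm_codes: list[str]


def agrees(record: Record, level: int = 2) -> bool:
    """True when all LLM predictions match the human code at `level`."""
    target = truncate_code(record.human_code, level)
    if len(target) < 1 + 2 * level:
        # The human code is coarser than the requested level.
        return False
    return bool(record.llm_codes) and all(
        truncate_code(code, level) == target for code in record.llm_codes
    )


def filter_by_agreement(records: list[Record], level: int = 2) -> list[Record]:
    """Keep only the records on which humans and LLMs agree at `level`."""
    return [r for r in records if agrees(r, level)]


def agreement_rate(records: list[Record], level: int) -> float:
    """Fraction of records with full agreement at `level`."""
    if not records:
        return 0.0
    return sum(agrees(r, level) for r in records) / len(records)


# Made-up codes for illustration only; these are not MAUDE data.
records = [
    Record("r1", "A050101", ["A0501", "A050102"]),  # agree at Level 2
    Record("r2", "A0501", ["A0502"]),               # disagree at Level 2
    Record("r3", "A05", ["A05"]),                   # too coarse for Level 2
]

kept_ids = [r.report_id for r in filter_by_agreement(records, level=2)]
print(kept_ids)
print(agreement_rate(records, level=1))
```

Comparing truncated prefixes rather than full codes lets the same check run at Level 1, 2, or 3, which mirrors how the abstract reports consensus separately per level.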

*** Title, author list and abstract as submitted during Camera-Ready version delivery. Small changes that may have occurred during processing by Springer may not appear in this window.