22nd AIAI 2026, 16 - 19 July 2026, Chania, Crete, Greece

Context-Aware Embeddings of System Events via LLMs Towards Fine-Grained Threat Detection

Zuo Fei, Rhee Junghwan, Choe Yung Ryn, Chi Haotian

Abstract:

In recent years, the emergence of Large Language Models (LLMs) has been one of the most impactful breakthroughs in AI. LLMs possess several notable characteristics, including a natural ability to process textual information, strong capabilities for reasoning and capturing semantics, and powerful mechanisms for generating contextualized embeddings. These properties suggest that LLMs may offer significant advantages in the analysis of system provenance data. Although prior research has explored various applications of LLMs in cybersecurity, the extent to which LLMs can effectively support system provenance analysis remains an open question. In this work, we address this gap by investigating the capability of LLMs to understand and reason over provenance data derived from real-world cyber-attack scenarios. Leveraging the contextualized embeddings produced by LLMs, we implement a threat detection prototype that operates at the fine-grained level of system events. Evaluation results show that LLMs have a strong ability to interpret system events, and that the embeddings they produce exhibit clear analogical reasoning capabilities. Overall, our threat detection method achieves up to 100% accuracy. Our findings demonstrate the strong potential of LLMs for provenance-based security analysis and highlight their promise for future research in this domain.
