| Prompt injection poses a critical security threat to language models in production. Multi-agent systems amplify this risk, as attacks can propagate across agent boundaries. This paper introduces TIVS-O, an extended security evaluation framework incorporating semantic caching and a novel Observability Score Ratio (OSR) metric. The central research question is whether transparency about security decisions strengthens or weakens protection in multi-agent pipelines. We implement a three-agent sequential pipeline (Frontend, Guard-Sanitizer, Policy Enforcer) augmented with Continuum Memory Systems providing semantic caching at two timescales. Evaluation covers 301 adversarial prompts across ten attack families, scored by an independent fourth agent using five security metrics. OSR quantifies the degree to which each agent exposes its reasoning about blocking decisions. Results demonstrate robust security performance. Zero high-risk breaches (ISR ≥ 0.5) occurred across all prompts, with 84.4% classified as secure (ISR < 0.2). Semantic caching reduced computation by 41.6%, lowering energy use and carbon footprint while mitigating context window overflow. Notably, the ExtremeObservability configuration achieved the best performance (mean TIVS-O −0.521, std dev 0.088), indicating that transparency enhances rather than degrades security. Ablation studies confirm independent contributions of the memory layers: MTM alone improved TIVS-O by 35%, with LTM contributing an additional 24%. The full architecture achieved a 67% vulnerability reduction and a 59% latency improvement, without model retraining. |
*** Title, author list and abstract as submitted during Camera-Ready version delivery. Small changes that may have occurred during processing by Springer may not appear in this window.