Your procurement agent just approved a vendor contract outside its authorized spend limit. The transaction cleared. The vendor has been paid.
You go to the logs. The action is attributed to a service account that three different agents share. The agent that executed the transaction was built by an engineer who left four months ago. Its permissions were elevated six weeks ago to handle a one-off edge case, and nobody ever walked them back.
No one can tell you who authorized this end-to-end. No one can show you the reasoning chain. No one is sure whether this is a one-time event or whether the same agent has been doing this every time a certain condition is met.
This scenario is not a corner case. It is the default state of AI agent deployments in 2025. And the uncomfortable part is not that it happened — it's that most organizations don't have the infrastructure to detect it until something has already gone wrong.
The deployment gap
The statistics are unambiguous.
That gap — 38 percentage points between deployment and governance — represents organizations running autonomous systems with live access to real data, real APIs, and real money, without documented accountability structures.
Gartner projects that more than 40% of agentic AI projects will be canceled by the end of 2027, citing escalating costs, unclear business value, or inadequate risk controls. This is not a forecast about hypothetical future misuse. It reflects weaknesses that already exist in production systems today.
What the law now requires
The EU AI Act entered into force on 1 August 2024. The full compliance deadline for high-risk AI systems is 2 August 2026.
Two articles are directly relevant to anyone deploying AI agents in operational contexts: Article 12 (record-keeping) and Article 26 (obligations of deployers).
Article 12 requires that high-risk AI systems have automatic logging capabilities that record events throughout their lifecycle.
The word automatic is not incidental. According to the European Commission's AI Act Service Desk, logs must be generated without operator intervention at the moment events occur. Scheduled exports do not satisfy this requirement. Manual reviews do not satisfy this requirement. The capability must be built into or applied to the system itself — from the moment it is deployed until it is decommissioned.
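As a rough illustration of what "automatic" can mean in practice, the sketch below wraps an agent action so that a record is written every time the action runs, with no operator step involved. The names `audited`, `AUDIT_LOG`, and `approve_contract` are hypothetical, and the in-memory list stands in for whatever durable store a real system would use. It is one possible approach under those assumptions, not a statement of what the Act prescribes.

```python
import functools
import json
import time
import uuid
from typing import Any, Callable

# Hypothetical in-memory sink. A real deployment would write to durable,
# append-only storage instead.
AUDIT_LOG: list[str] = []


def audited(agent_id: str) -> Callable:
    """Wrap an agent action so a log record is emitted on every call."""
    def decorator(action: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(action)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            record = {
                "event_id": str(uuid.uuid4()),
                "agent_id": agent_id,
                "action": action.__name__,
                "inputs": {"args": repr(args), "kwargs": repr(kwargs)},
                "started_at": time.time(),
            }
            try:
                result = action(*args, **kwargs)
                record["outcome"] = repr(result)
                return result
            except Exception as exc:
                record["error"] = repr(exc)
                raise
            finally:
                # Runs on success and on failure, so no call can finish
                # without leaving a record behind.
                record["finished_at"] = time.time()
                AUDIT_LOG.append(json.dumps(record))
        return wrapper
    return decorator


@audited(agent_id="procurement-agent-01")
def approve_contract(vendor: str, amount_eur: float) -> str:
    # Placeholder for the real agent action.
    return f"approved {vendor} for {amount_eur:.2f} EUR"


approve_contract("Acme GmbH", 1200.0)
```

Because the record is attributed to a specific `agent_id` rather than a shared service account, the question "which agent did this?" has an answer at the moment the event occurs.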
Article 26 places operational obligations on deployers themselves, not on model providers or cloud vendors.
Deployers must retain the automatically generated logs under their control for at least six months. They must assign human oversight to competent individuals, monitor the system's operation in line with the provider's instructions for use, and report identified risks to providers and relevant authorities.
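A retention rule like this can be enforced in code rather than left to policy. The sketch below is a minimal guard that refuses deletion of any record younger than the minimum period. It assumes "six months" can be approximated as 183 days, which is an illustrative choice and not a legal interpretation, and the name `may_delete` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: six months approximated as 183 days for illustration only.
MIN_RETENTION = timedelta(days=183)


def may_delete(logged_at: datetime, now: datetime) -> bool:
    """Return True only once a record is older than the minimum retention period."""
    return now - logged_at > MIN_RETENTION


now = datetime(2026, 8, 2, tzinfo=timezone.utc)
recent_record = datetime(2026, 5, 1, tzinfo=timezone.utc)   # about 3 months old
old_record = datetime(2025, 12, 1, tzinfo=timezone.utc)     # about 8 months old

print(may_delete(recent_record, now))  # False
print(may_delete(old_record, now))     # True
```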
Who counts as a deployer? A deployer is any organization that uses an AI system under its own authority. Which uses count as high-risk is set out in Annex III of the Act, and the list includes: employment and recruitment tools, credit scoring and financial assessment, insurance pricing, access to essential public services, healthcare triage, education and vocational training, and law enforcement. If your AI agents make or influence decisions in any of these areas, you are a deployer with obligations under Article 26.
The penalty ceiling exceeds even the GDPR's: up to €35 million or 7% of global annual turnover, whichever is higher, for the most serious violations (prohibited practices), and up to €15 million or 3% for non-compliance with high-risk obligations.
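To make the scale concrete, here is a short worked calculation. It assumes the "whichever is higher" rule that applies to larger undertakings (the Act treats SMEs differently), and the turnover figures are invented for illustration. The function name `max_fine_eur` is hypothetical.

```python
def max_fine_eur(turnover_eur: int, fixed_cap_eur: int, percent: int) -> int:
    """Upper bound of a fine: the fixed amount or the turnover share, whichever is higher."""
    return max(fixed_cap_eur, turnover_eur * percent // 100)


# Illustrative turnover figures, not real companies.
large_turnover = 2_000_000_000   # EUR 2 billion
small_turnover = 100_000_000     # EUR 100 million

prohibited_large = max_fine_eur(large_turnover, 35_000_000, 7)  # 140_000_000
high_risk_large = max_fine_eur(large_turnover, 15_000_000, 3)   # 60_000_000
high_risk_small = max_fine_eur(small_turnover, 15_000_000, 3)   # 15_000_000
```

For the larger company the percentage dominates. For the smaller one the fixed amount does.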
Why application logs are not enough
The instinctive response from engineering teams is: we already have logs. CloudWatch, Datadog, Splunk — the infrastructure exists.
But application logs and compliance logs answer different questions.
Application logs capture what happened at the infrastructure level: API calls made, response codes received, latency measurements. They were built for debugging. They answer "did the system function?", not "what did the agent decide, on what basis, and with what level of confidence?"
A compliance log for an AI agent needs to answer:
- What was the agent asked?
- What information did it have access to at the time of the decision?
- What decision did it make, and with what confidence?
- What risk factors were present?
- Was a human able to review this before the decision was acted upon?
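The questions above map fairly directly onto a record structure. The following is a minimal sketch of such a record, not a schema mandated by the Act. Every field name here is an assumption chosen for illustration, as are the example values.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    agent_id: str
    request: str                      # What was the agent asked?
    context_refs: tuple[str, ...]     # What information did it have access to?
    decision: str                     # What decision did it make...
    confidence: float                 # ...and with what confidence?
    risk_factors: tuple[str, ...]     # What risk factors were present?
    human_review_possible: bool       # Could a human review before action?
    logged_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize to a stable JSON string suitable for append-only storage."""
        return json.dumps(asdict(self), sort_keys=True)


record = DecisionRecord(
    agent_id="procurement-agent-01",
    request="Approve vendor contract",
    context_refs=("vendor-profile", "spend-policy"),
    decision="approve",
    confidence=0.62,
    risk_factors=("amount_near_spend_limit",),
    human_review_possible=False,
)
serialized = record.to_json()
```

Making the record immutable (`frozen=True`) is a design choice: a compliance record that can be edited after the fact answers fewer questions than one that cannot.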
Article 12 specifies that logs should support identification of risks, post-market monitoring, and tracking of system operation over the system's lifetime. Application logs were not designed for this. Retrofitting them is harder than it appears — particularly in multi-agent systems where decisions emerge from chains of interactions that standard logging infrastructure was never built to trace.
"Technically means the logging capability must be built into or applied to the system itself. A manual process for exporting logs, or a human who periodically reviews AI outputs and writes notes, does not satisfy this requirement." — FireTail, analysis of Article 12, April 2026
The multi-agent problem
Single-agent governance is difficult. Multi-agent governance is significantly harder.
Modern enterprise AI deployments increasingly involve agents that trigger other agents. A customer service agent escalates to a resolution agent, which calls a payment agent, which logs to a compliance agent. Each step may be on different infrastructure, built by a different team, with a different logging format.
"Traditional AI systems logging captures API calls but misses contextual reasoning. When something goes wrong, tracing accountability through this chain becomes extraordinarily challenging." — Aisera, January 2026
The EU AI Act does not provide an exemption for architectural complexity. If the system is high-risk, the logging requirement applies regardless of how many agents are involved in producing the output.
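One common way to keep a chain of agents traceable is to pass a shared trace identifier from each agent to the next, with every step recording which step called it. The sketch below shows the idea with hypothetical names (`TraceContext`, `start_trace`, `child_of`, `chain_to_root`). It is loosely modeled on distributed-tracing conventions and is an illustration, not a prescribed mechanism.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TraceContext:
    trace_id: str                    # Shared by every step in one chain
    span_id: str                     # Unique to this step
    parent_span_id: Optional[str]    # The step that triggered this one
    agent_id: str


def start_trace(agent_id: str) -> TraceContext:
    """Create the root context for the agent that receives the original request."""
    return TraceContext(uuid.uuid4().hex, uuid.uuid4().hex, None, agent_id)


def child_of(parent: TraceContext, agent_id: str) -> TraceContext:
    """Create a context for an agent triggered by another agent."""
    return TraceContext(parent.trace_id, uuid.uuid4().hex, parent.span_id, agent_id)


def chain_to_root(leaf: TraceContext, spans: dict[str, TraceContext]) -> list[str]:
    """Walk parent links from a step back to the originating agent."""
    path: list[str] = []
    current: Optional[TraceContext] = leaf
    while current is not None:
        path.append(current.agent_id)
        parent_id = current.parent_span_id
        current = spans.get(parent_id) if parent_id else None
    return list(reversed(path))


service = start_trace("customer-service")
resolution = child_of(service, "resolution")
payment = child_of(resolution, "payment")
spans = {ctx.span_id: ctx for ctx in (service, resolution, payment)}

print(chain_to_root(payment, spans))
# ['customer-service', 'resolution', 'payment']
```

If each agent writes its `TraceContext` alongside its decision record, the chain can be reconstructed afterwards even when the agents run on different infrastructure with different logging formats.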
The compliance window is closing
The organizations that struggled most with GDPR were not the ones that ignored it. They were the ones that had policies without the technical controls to enforce them — breach notification procedures but no systems capable of detecting a breach in time to use them.
The pattern is repeating with the AI Act. Organizations have governance policies. Many do not have the technical infrastructure to operationalize those policies at the agent level — decision by decision, in real time, with a log that survives for six months and can be produced for a regulator on demand.
National Competent Authorities across EU member states move into active enforcement mode on 2 August 2026. The technical work required to build compliant logging infrastructure is not a weekend project. It involves:
- Instrumentation across every agent that touches high-risk decisions
- A storage layer that meets the six-month retention requirement
- A log structure that maps to the specific accountability questions regulators will ask
- Human oversight mechanisms that are genuinely functional — not a checkbox
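As one small example of an oversight mechanism that is functional rather than a checkbox, the sketch below holds any action above an authorized limit for human review instead of executing it. The names `gate_spend`, `Status`, and `Outcome` are hypothetical, and the limit value is invented. A real system would also need to route the pending item to a reviewer and record the reviewer's decision.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    EXECUTED = "executed"
    PENDING_REVIEW = "pending_review"


@dataclass(frozen=True)
class Outcome:
    status: Status
    reason: str


def gate_spend(amount_eur: float, authorized_limit_eur: float) -> Outcome:
    """Hold any spend above the authorized limit for human review."""
    if amount_eur > authorized_limit_eur:
        return Outcome(Status.PENDING_REVIEW, "amount exceeds authorized limit")
    return Outcome(Status.EXECUTED, "within authorized limit")


within = gate_spend(4_000.0, authorized_limit_eur=5_000.0)
over = gate_spend(12_000.0, authorized_limit_eur=5_000.0)
```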
The organizations starting that work now will be ready. The ones waiting for August will not.