Article 12 Logging: Why Your Point-in-Time Logs Won't Pass

After a high-risk AI system causes harm, the first question is always the same. What was it doing at the time? If the answer depends on logs that were sampled, summarized, rotated out after thirty days, or editable by the team being investigated, the record will not hold. Article 12 of the EU AI Act exists precisely so that this question has a reliable answer. Most logging setups in production today were never built to provide one, because they were built for engineers debugging at 2 a.m., not for a regulator reconstructing a decision months later.

What Article 12 actually requires

Article 12(1) requires that high-risk AI systems technically allow for the automatic recording of events (logs) over the lifetime of the system. Read those words carefully, because each carries weight.

Automatic. Logging is a capability designed into the system, not a manual export someone remembers to run.
Recording of events. Discrete, traceable events, not aggregate dashboards or summary statistics.
Over the lifetime of the system. Not a rolling window. The capability has to span the system's operational life.

Article 12(2) sets the purpose. The logging capabilities must ensure a level of traceability appropriate to the system's intended purpose. Concretely, they must record the events relevant for identifying situations that may cause the system to present a risk within the meaning of Article 79(1), which governs the national-level procedure for AI systems presenting a risk, or that may lead to a substantial modification. They must also facilitate post-market monitoring under Article 72 and support monitoring of the system's operation under Article 26(5).

Article 12(3) gets specific for one subset of high-risk systems: the remote biometric identification systems referenced in Annex III point 1(a). For those, the logging must record at minimum the period of each use (start and end date and time), the reference database against which input data was checked, the input data for which the search led to a match, and the identification of the natural persons involved in verifying the results, as referred to in Article 14(5).
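The four minimum fields can be pictured as a single per-use record. What follows is a minimal sketch, not a prescribed schema: the class name BiometricUseRecord, its field names, and the validation rules are all hypothetical and chosen only to mirror the wording of Article 12(3).

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Tuple


@dataclass(frozen=True)
class BiometricUseRecord:
    """Hypothetical per-use record mirroring the Article 12(3) minimum fields."""
    use_start: datetime             # start date and time of this use
    use_end: datetime               # end date and time of this use
    reference_database: str         # database the input data was checked against
    matched_input_ref: str          # reference to the input data that led to a match
    verifier_ids: Tuple[str, ...]   # natural persons who verified the result

    def __post_init__(self) -> None:
        # Assumed validation rules; the Act does not spell these out.
        if self.use_start.tzinfo is None or self.use_end.tzinfo is None:
            raise ValueError("timestamps must be timezone-aware")
        if self.use_end < self.use_start:
            raise ValueError("use_end precedes use_start")
        if not self.verifier_ids:
            raise ValueError("at least one verifier must be identified")

    def to_json(self) -> str:
        """Serialize with ISO 8601 timestamps and stable key order."""
        data = asdict(self)
        data["use_start"] = self.use_start.isoformat()
        data["use_end"] = self.use_end.isoformat()
        return json.dumps(data, sort_keys=True)


record = BiometricUseRecord(
    use_start=datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc),
    use_end=datetime(2025, 3, 1, 9, 2, tzinfo=timezone.utc),
    reference_database="watchlist-v7",       # illustrative identifier
    matched_input_ref="frame-000123",        # illustrative identifier
    verifier_ids=("officer-17", "officer-42"),
)
print(record.to_json())
```

Storing a reference to the matched input rather than the raw biometric data is a design choice made here for illustration. Whether a reference is enough depends on how the referenced data is itself retained and protected.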

Article 12 does not stand alone. Article 19 requires providers to keep the automatically generated logs, to the extent those logs are under their control, for a period appropriate to the system's intended purpose and for at least six months, unless other Union or national law specifies otherwise. Article 26(6) places a parallel obligation on deployers to keep the logs the high-risk system generates, to the extent those logs are under their control, again for at least six months. The obligation runs across both sides of the relationship, which matters for any enterprise that both builds systems and deploys vendor models.

Why point-in-time logs do not satisfy it

Most application logging was designed for debugging and operations, not for traceability under regulatory scrutiny. The mismatch produces several predictable failures.

Rotation and retention. Default log retention is measured in days or weeks. Article 19 and Article 26(6) require at least six months, and Article 12 frames the capability as lifetime-spanning. A system that rotates logs out after thirty days cannot answer questions about an incident discovered two months later.
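The retention floor is easy to state as a guard in whatever job rotates or deletes logs. The sketch below is illustrative only. The names MIN_RETENTION, may_delete, and rotation_window_ok are hypothetical, and treating "six months" as 183 days is an assumption, not a legal reading. The appropriate period for a given system may well be longer.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "at least six months" approximated as 183 days. Treat this as a
# floor, not a target; the appropriate period may be longer.
MIN_RETENTION = timedelta(days=183)


def may_delete(written_at: datetime, now: datetime,
               retention: timedelta = MIN_RETENTION) -> bool:
    """Return True only once a record has been kept for the full retention period."""
    return now - written_at >= retention


def rotation_window_ok(rotation_window: timedelta,
                       retention: timedelta = MIN_RETENTION) -> bool:
    """Return True if a configured rotation window does not undercut the floor."""
    return rotation_window >= retention


now = datetime(2025, 7, 1, tzinfo=timezone.utc)
print(may_delete(datetime(2025, 6, 1, tzinfo=timezone.utc), now))   # one month old
print(rotation_window_ok(timedelta(days=30)))                       # thirty-day rotation
```

Both calls return False: a one-month-old record is not yet deletable, and a thirty-day rotation window undercuts the floor.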

Sampling and summarization. Observability pipelines routinely sample high-volume events and store aggregates to control cost. Aggregates are useless for reconstructing a specific decision on a specific input at a specific time, which is exactly what Article 12(2) traceability and Article 12(3)'s per-use records demand.

Mutability. This is the failure that quietly sinks an audit. If the logs live in a store the operating team can edit, delete, or backfill, their evidentiary value collapses. An assessor cannot distinguish a complete record from a curated one. Nothing in Article 12 uses the word "immutable," but the traceability purpose in Article 12(2), supporting investigation of risk situations and substantial modifications, cannot be met by records that could have been altered after the fact.

Gaps around denials and edge cases. Teams instrument the happy path. The events that matter most after an incident are often the ones the system refused to perform: the inputs it rejected, the timeouts, the fallbacks. If those are not logged with the same rigor as successful calls, the most diagnostic events are missing exactly when you need them.
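One way to avoid happy-path-only instrumentation is to wrap each operation so that an event is written no matter how the call ends. This is a minimal sketch under stated assumptions: call_with_audit is a hypothetical helper, the in-memory list stands in for a real log sink, and mapping PermissionError to a denial and TimeoutError to a timeout is an illustrative convention rather than anything the Act prescribes.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


def call_with_audit(log: List[Dict[str, Any]], operation: str,
                    fn: Callable[[], Any]) -> Any:
    """Run fn and append exactly one event to log, whatever the outcome."""
    event: Dict[str, Any] = {
        "operation": operation,
        "started_at": datetime.now(timezone.utc).isoformat(),
    }
    try:
        result = fn()
        event["outcome"] = "success"
        return result
    except TimeoutError:
        event["outcome"] = "timeout"
        raise
    except PermissionError:
        event["outcome"] = "denied"      # assumed convention for policy denials
        raise
    except Exception as exc:
        event["outcome"] = "error"
        event["error_type"] = type(exc).__name__
        raise
    finally:
        # Runs on every path, so failures are recorded as reliably as successes.
        event["ended_at"] = datetime.now(timezone.utc).isoformat()
        log.append(event)


def denied_call() -> None:
    raise PermissionError("blocked by policy")


events: List[Dict[str, Any]] = []
call_with_audit(events, "score_applicant", lambda: 0.87)
try:
    call_with_audit(events, "score_applicant", denied_call)
except PermissionError:
    pass
print([e["outcome"] for e in events])
```

The exception is re-raised after logging, so the caller's error handling is unchanged. The audit event is a side effect, not a replacement for it.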

A point-in-time log, the snapshot pulled when someone asks, is the inverse of what Article 12 wants. The Act wants a continuous, automatic record produced as the system runs, not a report compiled when the system is questioned.

Why it is hard in practice

The honest difficulty is that traceable, long-lived, tamper-resistant logging is a different engineering problem than operational logging, and the two get conflated on the architecture diagram.

Operational logging optimizes for cost, query speed, and signal-to-noise, which is why it samples, summarizes, and rotates. Compliance logging optimizes for completeness, integrity, and retention. Bolting the second requirement onto a stack built for the first usually means either ballooning storage costs or quietly failing the completeness and integrity tests when they finally get checked.

There is also an attribution problem. For the systems it covers, Article 12(3), read with Article 14(5), requires logs that identify the people involved in verifying results. The log has to carry identity and context, not just a payload, and it has to do so without becoming a privacy liability of its own. Getting that balance right is real design work, and it lands squarely between your CISO and your privacy office.

What good evidence looks like

An assessor or notified body evaluating Article 12 wants logs they can trust without trusting you. In practice:

Complete, automatic event records covering each relevant operation, including denials, rejections, timeouts, and human-verification steps, not just successful outputs.
Per-use detail for the systems Article 12(3) names: use period, reference database, matched input data, and the identity of the human verifier.
Tamper-evidence. A mechanism such as append-only storage, write-once-read-many (WORM) media, or cryptographic chaining that lets a reviewer detect whether any record was altered or removed after it was written. This is what converts a log from "the operator's account" into evidence.
Retention that meets or exceeds six months, demonstrably, with no silent rotation inside the window (Articles 19 and 26(6)).
Time integrity. Trustworthy, consistent timestamps, so the sequence of events can be reconstructed and correlated across systems.
A clear provider/deployer split. Documentation of which logs each party controls and retains, so the Article 26(6) deployer obligation is visibly met alongside the provider's.

The test an assessor applies, implicitly, is whether this record could have been quietly edited and whether anyone would know. If the answer is that it could and no one would, the logs carry no evidentiary weight no matter how detailed they are.
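Cryptographic chaining is one way to make that test answerable. The sketch below is a generic illustration, not a description of any particular product's implementation. The names GENESIS, append_event, and verify_chain are hypothetical, and the in-memory list stands in for durable append-only storage.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash: str, payload: Dict[str, Any]) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: List[Dict[str, Any]], payload: Dict[str, Any]) -> None:
    """Append an entry whose hash commits to everything written before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, payload),
    })


def verify_chain(chain: List[Dict[str, Any]]) -> int:
    """Return the index of the first inconsistent entry, or -1 if the chain is intact."""
    prev_hash = GENESIS
    for index, entry in enumerate(chain):
        if (entry["prev_hash"] != prev_hash
                or entry["hash"] != _digest(prev_hash, entry["payload"])):
            return index
        prev_hash = entry["hash"]
    return -1


ledger: List[Dict[str, Any]] = []
append_event(ledger, {"op": "match", "outcome": "success"})
append_event(ledger, {"op": "match", "outcome": "denied"})
append_event(ledger, {"op": "config_change", "outcome": "success"})
print(verify_chain(ledger))                 # intact chain

ledger[1]["payload"]["outcome"] = "success"  # quietly rewrite a denial
print(verify_chain(ledger))                 # index of the altered entry
```

The first check prints -1 and the second prints 1. A chain on its own has limits. It does not reveal entries trimmed from the end, and someone who can rewrite the whole store can recompute every hash. Detecting either requires anchoring the latest hash somewhere the operating team cannot modify.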

The runtime-record approach

Tamper-evidence is the property you cannot add later. Once a record is written to a mutable store, no subsequent process can prove it was never changed. That is the gap Cytra is built to close. The design routes AI and agent tool calls through a managed gateway, currently in private beta, that records every event (successful calls, policy denials, timeouts, and configuration changes) to a per-tenant, append-only ledger. The ledger is secured with a SHA-256 hash chain, so any later alteration breaks the chain and is detectable. Where routing through a gateway is not feasible, a standalone collector runs outbound-only inside the customer environment and streams signed events to the same ledger. The intent is to map these records directly to Article 12's traceability obligations and to the retention obligations in Article 19 for providers and Article 26(6) for deployers, so compliance reflects how the system ran rather than a snapshot assembled on request. Cytra is built to keep you aligned and audit-ready, not certified; coverage means mapping evidence to control objectives, not guaranteeing an outcome. SOC 2 and a HIPAA BAA are in process.

Takeaway checklist

Logging must be automatic and span the system lifetime, not a rolling debug window (Article 12(1)).
Capture discrete events with traceability, not sampled aggregates (Article 12(2)).
For Article 12(3) systems, record use period, reference database, matched input, and human verifier identity.
Log denials, rejections, timeouts, and edge cases, not just successful outputs.
Make records tamper-evident via append-only, WORM, or cryptographic chaining, so alteration is detectable.
Retain logs for at least six months, with no silent rotation inside the window (Articles 19, 26(6)).
Keep trustworthy timestamps for sequence reconstruction.
Document the provider/deployer retention split explicitly (Article 26(6)).