Methodology

The AI confidence score, factor breakdown, and model transparency panel

Why “AI said so” is not a defensible answer, and how RegRadar makes the AI's claim testable so that a 1LoD operator can sign it responsibly and a 2LoD reviewer can challenge it substantively.

By RegRadar Editorial TeamPublished 22 April 20268 min read

Every AI output on a RegRadar impact carries four pieces of visible metadata: a confidence score between 0 and 100, a factor breakdown that explains how the score was computed, a model transparency panel that names the provider and its EU-jurisdiction attestation, and a human-editable structured object that lets the operator accept, override, or reject each extracted field before signoff. The combination is what makes an AI-assisted decision defensible under three lines of defence. This page describes how the score is computed, what operators are expected to do with it, and what is deliberately not in the score.

Why a confidence number, and not a model log-probability

Modern LLMs expose per-token log-probabilities, and it is tempting to surface those directly as “confidence”. We do not, for three reasons. First, token-level probabilities are miscalibrated on out-of-distribution text — regulatory releases are exactly that. Second, probabilities reward the model for hedging in its wording, which is the opposite of what an operator needs. Third, a single number sums poorly over structured extraction: an output can have a high token probability and still misidentify the article number.

RegRadar computes confidence from observable, auditable signals that an operator or 2LoD reviewer can themselves verify. The signal set evolved from controlled workflow tests; as of release 2.3.0 it contains six factors.

Factor	Signal	Weight
Fields extracted	How many of the six structured fields (obligation, deadline, severity, perimeter, audience, regime) the model returned a non-null value for.	30%
Source trust tier	Primary tier (EUR-Lex, ESA final text) vs secondary (official national supervisor) vs tertiary (trade press, commentary). Navigation/noise pages are rejected upstream.	20%
Length and structure	Did the source document parse into recitals / articles / annexes, or is it a one-paragraph press release? Is the extracted text above the minimum length required for that regime?	15%
Operator edits	If any field was manually overridden by the 1LoD operator before signoff, the score is reduced and the affected fields are tagged in the factor breakdown.	15%
Safety gates	High or critical severity without a verified source excerpt is capped to medium. Deadlines earlier than the document date are rejected. These gates visibly lower the score and are listed explicitly.	10%
Profile match	The profile hash at extraction time is compared to the current tenant profile hash. A stale profile reduces the score and surfaces a warning.	10%

The factor breakdown is rendered inline on the impact detail — for example, Confidence 56% · 2/6 champs extraits · opérateur a modifié: objet structuré · gate: severity capped is a legitimate “medium” confidence with a visible reason. A 92% score with “6/6 fields extracted, no operator overrides, primary source, fresh profile” is a legitimate “high” confidence.

What the AI produces, and what it does not

RegRadar's AI produces three categories of artefact:

Summariesof captured documents, in the document's source language and in the tenant's operator language when they differ.
Classifications: in-perimeter vs out-of-perimeter, severity, suggested audience, suggested obligation, suggested deadline.
Drafts: stakeholder summaries, digest entries, inspection checklists.

It does not produce decisions. A decision is made by a named operator and signed via the three-lines-of-defence chain. A 1LoD signoff attaches the operator's rationale and evidence pack to the AI's structured object. A 2LoD countersignature attaches an independent check. The chain records both, hashes both, and exposes both in audit exports.

The model transparency panel

The transparency panel sits behind a “Détails” control next to the model name on the impact detail. A production instance for a regulated banking tenant looks like this:

Provider            Azure OpenAI (EU)
Smart profile       gpt-5.4-mini (fallback: Gemini 3 Flash)
Fast profile        gpt-5.4-nano (fallback: Gemini 2.5 Flash-Lite)
Jurisdiction        EU (West Europe)
Temperature         0.3
Top-p               0.9
Retention policy    Zero-retention (provider attestation 2026-02-14)
Training policy     No-training (provider attestation 2026-02-14)
PII-in-prompts      Disabled by tenant policy
Prompt hash         a7c04f3e...9b12
Response hash       5de9b8c1...33a2
Extraction trace    Available in Audit export (JSON/CSV)

If the tenant has switched to Mistral Large (EU-hosted) or to an alternative provider, the panel reflects that. The attestation date tells the 2LoD reviewer how current the provider contract is. The prompt and response hashes are logged in the audit journal and included in the Snapshot complet pack; they enable post-hoc reconstruction of exactly what the model was shown and what it returned.

Calibration: how we know the score is not just decoration

The weights above were chosen from an internal benchmark set of roughly 2,400 reviewed impacts. The calibration metric is “1LoD override rate stratified by confidence bucket”: how often operators change a structured field before signing, broken down by the AI's reported confidence. On the 2026-Q1 cohort:

Override rate per confidence bucket, Q1 2026 cohort (n=2,412)
Confidence bucket	Sample size	Override rate	Interpretation
≥ 90%	1,098	7%	Accept as-is after a visual check
70–89%	712	28%	Review structured object carefully before signing
50–69%	402	58%	Rewrite the structured object; treat AI as draft only
< 50%	200	83%	Treat as raw material; AI summary may still help

The monotone relationship between the score and the override rate is what we rely on. If the curve inverts — say a 90% bucket starts showing 50% override — we know something has shifted upstream (a source parser broke, a model contract changed, a profile became stale) and the score stops meaning what it says. We recalibrate per quarter and publish the new table inside the Trust Center.

What a 2LoD reviewer is expected to do with the score

The 2LoD reviewer receives the impact with 1LoD's rationale, the structured object, confidence and factor breakdown, and the model transparency panel. The review is a judgment, not a rubber-stamp:

If confidence is high and the factor breakdown is clean, 2LoD's job is to verify the evidence pack links to the canonical source and to challenge the rationale's fit to scope. A 90% confidence without a supporting citation is still unsignable.
If confidence is medium and the 1LoD operator overrode fields, 2LoD should read the overrides. The chain will record that the 2LoD accepted the overrides — inspection can then see that the 2LoD was aware of the deviation from AI output.
If confidence is low, 2LoD should look for the gate that capped it (e.g. severity capped because source excerpt is missing) and decide whether to challenge the 1LoD back or to escalate.

What the score is not

A correctness probability.The AI can be 92% confident and wrong. The score says only “the signals that correlate with operator acceptance look clean here.”
A risk rating.Severity, impact on perimeter, and deadline proximity are separate fields. Do not conflate them with the AI's confidence in reading them off.
A replacement for the decision.A signoff without a human decision is refused at the API layer. There is no “auto-sign high-confidence impacts” mode in RegRadar, and we do not intend to build one.

Citing these facts to an auditor

When internal audit or a supervisor asks about AI use, the short answer is this: the model is named, hosted in the EU, governed by a zero-retention and no-training contract, runs at temperature 0.3 for stability, and produces summaries and classifications that are always subject to named human signoff. Every impact has a confidence score derived from six observable signals, calibrated quarterly against real operator override behaviour, and the score plus the factor breakdown plus the model transparency panel are visible on the impact detail and exportable as JSON via the Audit pack. That answer is testable; it is the one we rehearse in buyer and operator reviews.

Next step

Scope an 8-week paid pilot for your perimeter.

One topic, one team, one jurisdiction pack. €15k, 50% credited on annual conversion.

Book a paid pilot Review pricing