AI Governance · Financial Services

You Rejected the Loan — But Do You Know Why?

SCHUFA C-634/21: the EU court ruling that if a score determines an outcome, the score is the decision.

5 failure modes: outcome, inputs, features, model version, rules, each a gap in reconstruction.

Decision record: capture the event as it happens, retrieval rather than reconstruction.

At the RegTech FS event in London in March 2026, I was speaking to someone who leads risk at a fintech lender. We were discussing their underwriting system — decisioning models, rules, data sources, the usual stack. At one point, I asked him: if we take a rejected application from a few months ago, can we say with certainty why it was rejected?

He hesitated, not because he didn't have an answer, but because the answer depended on what one meant by "why". He could describe the system. He could point to the model, the thresholds, the key variables. With some effort, he could reconstruct a plausible explanation. But answering that question precisely, for that specific decision, at that specific point in time, was not straightforward.

This is a relatively new situation. Lending decisions have always involved judgement, but for a long time that judgement sat, at least partially, with humans. A credit officer would review an application, weigh the evidence, and arrive at a decision. That decision could be questioned, revisited, and explained — imperfectly perhaps, but within a shared frame of reference.

That picture has now changed, and continues to change. Today, in many systems, the decision is effectively made by a combination of models, rules, and data pipelines. The human role, where it exists, is often supervisory or exception-based. The system is not just assisting the decision — it is, in practice, making it.

When a human makes a decision, we ask them to explain their reasoning. When a system makes a decision, we assume that explanation can be derived from its components. That assumption does not always stand up to scrutiny. A recent ruling by the Court of Justice of the European Union makes this explicit. In the SCHUFA case C-634/21, the court held that if a score plays a determining role in an outcome, it effectively is the decision. It is no longer sufficient to say that a human made the final call. The logic of the system itself must stand up to scrutiny.

This brings us back to the original question: what does "why" actually mean here?

How a Loan Decision Is Actually Made

What appears, from the outside, to be a single decision is, in practice, the outcome of a sequence of transformations. Each step operates on a different representation of the same application.

[Figure: pipeline diagram. Stages: Data Ingestion (application, credit bureau, open banking) → Feature Engineering (income ratios, spending patterns, credit aggregates, behavioural features) → Risk Model (score, e.g. 0.42) → Rule Engine (thresholds, policy checks, overrides) → Approve/Reject decision. Annotated operational complexity: multiple APIs, async fetches, retries, partial failures; versioned pipelines, evolving definitions, derived assumptions; multiple models, periodic retraining, A/B deployments; layered policies, priority conflicts, manual interventions. Final state written, but no unified record of how it emerged.]
Figure 1 — The pipelined flow: a loan application passes through a sequence of transformations before a decision is written.
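The sequence of transformations can be sketched as a few stand-in functions. All stage logic, field names, and the 0.35 threshold below are invented for illustration, not taken from any real system:

```python
# A minimal sketch of the pipelined flow: each stage transforms the
# application and hands only its output forward.

def ingest(application):
    # Stand-in for data ingestion (bureau pulls, open-banking fetches).
    return {"income": 42_000.0, "monthly_debt": 1_500.0}

def engineer_features(data):
    # One illustrative feature: annualised debt-to-income ratio.
    return {"dti": data["monthly_debt"] * 12 / data["income"]}

def score(features):
    # Stand-in for the risk model; here the score is just the DTI, clipped.
    return max(0.0, min(1.0, features["dti"]))

def apply_rules(model_score):
    # Stand-in rule engine: a single threshold, no policy checks or overrides.
    return "reject" if model_score > 0.35 else "approve"

def decide(application):
    data = ingest(application)
    features = engineer_features(data)
    model_score = score(features)
    outcome = apply_rules(model_score)
    # Only the outcome is persisted; inputs, features and score are discarded.
    return outcome
```

Running `decide` yields a single status; the intermediate representations that produced it exist only for the duration of the call.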

The Emerging Agentic Paradigm

What is already beginning to replace this pipeline is something less linear. Instead of a fixed sequence, we see specialised components interacting:

[Figure: agentic diagram. Components: Data Agent, Behaviour Agent, Signals Agent, Risk Evaluation Agent, Policy/Constraint Agent, coordinated by an Orchestrator that produces the Decision. Annotated operational complexity: OCR errors, ambiguity, retries; classification ambiguity, noisy signals; model ensembles, probabilistic reasoning, non-determinism; dynamic policies, conflicting constraints; iterative reasoning, dependency on intermediate outputs. The decision emerges from interaction, with no single point of ownership.]
Figure 2 — The agentic model: specialised components each interpret part of the problem and contribute to the final outcome.

Each component interprets part of the problem, produces intermediate outputs, and depends on the outputs of others. The final outcome emerges from their interaction, with no single point of ownership.
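The agentic flow can be sketched in the same spirit. The agents' logic, weights, and the 0.5 policy threshold are entirely invented; only the component names mirror Figure 2:

```python
# Hypothetical agents, each contributing a partial interpretation that
# an orchestrator combines into a decision.

def data_agent(document):
    # Stand-in for extraction (OCR and parsing in a real system).
    return {"income": document.get("stated_income", 0.0)}

def behaviour_agent(transactions):
    # Stand-in behaviour classification: count large outgoing transactions.
    return {"risky_txn_count": sum(1 for t in transactions if t > 500)}

def risk_agent(facts):
    # Stand-in probabilistic risk evaluation.
    risk = 0.2 + 0.1 * facts["risky_txn_count"]
    if facts["income"] < 30_000:
        risk += 0.2
    return min(risk, 1.0)

def policy_agent(risk):
    # Stand-in dynamic policy: one constraint.
    return "reject" if risk > 0.5 else "approve"

def orchestrate(document, transactions):
    facts = {**data_agent(document), **behaviour_agent(transactions)}
    # The outcome depends on every intermediate output; no agent owns it.
    return policy_agent(risk_agent(facts))
```

Even in this toy version, the rejection below cannot be attributed to any one component: it requires the extracted income, the behaviour count, and the policy threshold together.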

Where Does the Decision Actually Happen?

At first glance, the flow suggests a clear answer. The model produces a score, a threshold is applied, rules are evaluated, and a final status is written. But when you look more closely for traceability and accountability, the boundaries begin to blur. Is the decision made when the model produces a score? When the threshold is applied? When a rule overrides the model? Or when the final status is recorded? Each of these could plausibly be called the decision point.

Consider a simple failure. An income figure is extracted incorrectly from a document. A transaction is classified as "risky" behaviour. A policy rule is triggered based on that classification. The final outcome is a rejection. Where did the decision actually go wrong? In data extraction? In interpretation? In policy? The pipeline does not provide a clean answer. And as the system evolves into a collection of interacting components, the answer becomes even less obvious. In practice, the decision is not made at a single point. It is distributed across the system.

Why Reconstruction Fails

When an applicant asks "why?" months later, we attempt to reassemble the facts.

  1. The Outcome: We find the rejection status and perhaps a score. This tells us what happened, not how.
  2. The Inputs: Bureau data or banking transactions may have changed or become inaccessible. We often retrieve a current version, not the original snapshot.
  3. The Features: These are rarely stored. We recompute them using current pipelines. But feature definitions and logic evolve; we cannot be certain the recomputed vector matches what the model actually saw.
  4. The Model: Even with a model registry, pinpointing the version is difficult in environments with champion/challenger setups or shadow models. Without a bound reference, we are guessing which model instance owned that specific 5% of traffic.
  5. The Rules: Thresholds change and manual overrides are often undocumented. Replaying policy logic on reconstructed inputs creates a "plausible" explanation, but one built on compounding uncertainty.

The system fails not due to complexity, but because the decision was never captured as a coherent, timestamped event.
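Step 3 in the list above is easy to demonstrate: when a feature definition evolves between decision time and replay time, the recomputed value is not the one the model saw. Both definitions here are invented for the example:

```python
# Invented illustration of feature drift between decision time and replay.

def dti_v1(income, monthly_debt):
    # Definition in production when the decision was made.
    return monthly_debt * 12 / income

def dti_v2(income, monthly_debt, rent):
    # Current definition: housing costs were later folded in.
    return (monthly_debt + rent) * 12 / income

original = dti_v1(42_000.0, 1_500.0)              # what the model saw
reconstructed = dti_v2(42_000.0, 1_500.0, 900.0)  # what replay computes

# An explanation built on `reconstructed` is plausible but wrong: the
# threshold crossing it implies never happened to the original value.
```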

Towards a Record of the Decision

What becomes clear from the reconstruction exercise is not that any one component has failed. The data exists, the models are functioning, the rules are defined, and the systems are operational. Yet, when asked to explain a specific decision, we are forced to rebuild it piece by piece.

We tend to treat decisions as outputs — a value written to a database, a status returned by an API. But in practice, a decision is the point at which data, transformations, models, rules, and context come together to produce an outcome. If that moment is not recorded, it cannot be reliably revisited.

One way to approach this is to ask a simple question: what would we need to store in order to answer "why" without reconstructing anything? At a minimum, such a record would need to capture:

  1. The Inputs: exactly as they were at decision time, not a reference to live data.
  2. The Features: as computed for this decision, not recomputed later.
  3. The Model: the version actually invoked and the score it produced.
  4. The Rules: the thresholds and policy versions evaluated, and any overrides applied.
  5. The Outcome: the final status, timestamped and bound to everything above.

Not as fragments spread across systems, but as a single representation of the event — what we can think of as a decision record.
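One possible shape for such a record, sketched as a frozen dataclass. Every field name and version string below is hypothetical, not a proposed standard:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class DecisionRecord:
    decision_id: str
    timestamp: datetime    # when the outcome was produced
    inputs: dict           # exact input snapshot, not a live reference
    features: dict         # the vector the model actually saw
    model_version: str     # bound reference to the model instance
    score: float           # the score as produced, not recomputed
    rules_applied: tuple   # rule and threshold versions that fired
    overrides: tuple       # manual interventions, if any
    outcome: str           # the final status written

record = DecisionRecord(
    decision_id="app-000123",
    timestamp=datetime.now(timezone.utc),
    inputs={"income": 42_000.0, "monthly_debt": 1_500.0},
    features={"dti": 0.43},
    model_version="risk-model:3.2.1",
    score=0.42,
    rules_applied=("dti_threshold:v7",),
    overrides=(),
    outcome="reject",
)
```

Making the record immutable (`frozen=True`) reflects the design intent: it is an event that happened, not a row to be updated later.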

From Reconstruction to Retrieval

In the current approach, answering "why" involves recomputing the decision from available pieces, each of which may have changed over time. The result is often plausible, but not definitive. In this alternative view, the decision is retrieved, not reconstructed. The inputs are exactly those that were used. The features are exactly those that were computed. The model and rules are exactly those that were applied. What remains is the task of interpretation, not recovery.
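With such records in place, answering "why" becomes a lookup. A minimal sketch, assuming an invented in-memory store keyed by decision id (in practice this would be an append-only log):

```python
# Invented record store; every value is the one captured at decision time.
DECISION_STORE = {
    "app-000123": {
        "outcome": "reject",
        "score": 0.42,
        "model_version": "risk-model:3.2.1",
        "features": {"dti": 0.43},
        "rules_applied": ["dti_threshold:v7"],
    },
}

def explain(decision_id):
    record = DECISION_STORE.get(decision_id)
    if record is None:
        # No record means we are back to reconstruction, with all its gaps.
        return "no record found"
    # Nothing is recomputed: retrieval, then interpretation.
    return (f"{record['outcome']} by {record['model_version']} "
            f"(score {record['score']}, rules {record['rules_applied']})")
```

The hard case is the missing record: without it, the only fallback is the reconstruction exercise described above.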

As systems evolve, what appears today as a pipeline is beginning to resemble a collection of interacting components, each responsible for part of the decision. Some extract data, some interpret behaviour, some evaluate risk, and others apply policy. Their outputs are combined, sometimes iteratively, to arrive at a final outcome. In such a system, the idea of a single decision point becomes harder to sustain. The decision is not made in one place. It emerges from a sequence of interactions. If anything, this strengthens the need to capture the decision as it happens — not just the final outcome, but the state and transitions that led to it.

The move from human judgement to system-driven decisions has brought speed and scale. It has also changed what it means to answer a simple question. Why was this loan rejected? Today, we often respond with a story assembled from fragments. As systems become more complex, that approach becomes harder to sustain. At some point, we will need something more concrete — not just the outcome of a decision, but a record of how it came to be.