Operationalizing AI Risk Governance in Financial Services: From Policy to Practice

At A Glance

At the Bankers Institute's 2026 Risk Management in Banking Conference (May 12), Mark Ritcey of LatentBridge delivers a practical walkthrough. He opens with high-profile AI failures (Amazon's gender-biased hiring tool, Air Canada's chatbot tribunal loss, Microsoft Tay's inflammatory tweets, Zillow's $300M prediction flop) to expose bias, hallucination, and drift risks. He then unpacks a three-tier risk appetite for financial services (auto-approve public-data uses, review uses of sensitive customer data, restrict agentic AI on critical information), use-case personas with controls such as human review and monitoring, AI Steering Committee roles (from business owners to legal), and a Three Lines of Defense model tailored for AI governance from policy to production.

AI in financial services is now well past the experimentation phase. Models are screening customers, flagging anomalous transactions, summarizing complex documents, and in some cases making or triggering real decisions. That shift brings value, but it also introduces very real risk.

In his session at the Bankers Institute’s 2026 Risk Management in Banking Conference, Mark Ritcey from LatentBridge focused on a practical question: how do you move from AI risk policy to day‑to‑day practice in a bank? His answer combines real-world failures, a structured risk framework, and a governance model built for financial services.


What Recent AI Failures Teach Banks

Mark opens with a set of public incidents to make one point very concrete: AI risk is not theoretical. It is already affecting companies' bottom lines and reputations.


Amazon: When historical data encodes bias

Amazon built an AI‑driven resume screening tool intended to identify top engineering candidates automatically. Trained on 10 years of past resumes, largely from men, the model began to systematically downgrade CVs that used collaborative verbs such as “supported” or that mentioned participation in a “women’s chess club”.

As Mark notes, “they tried to even remove any identifier for the separation of male candidates to female candidates, and they still found out their model was flawed.” The project was abandoned after Amazon lost confidence that it could reliably eliminate discrimination.


Air Canada: Hallucinations become legal liabilities

Air Canada deployed a customer‑facing chatbot on its site. In one case, a customer who asked about bereavement travel was told by the chatbot that he could claim a refund after travel. That policy did not exist.

The dispute went to a Canadian civil tribunal, which ruled in favor of the customer and held Air Canada responsible for what its chatbot said. Mark’s key warning: “companies can be liable or responsible for the output of their AI solution.”


Microsoft Tay: Uncontrolled learning in the wild

Microsoft released “Tay,” a chatbot on Twitter designed to learn from public interactions in real time. Within hours, Tay began producing offensive content, including explicit statements like “I love genocide,” forcing Microsoft to shut it down and issue a public apology.

Mark calls this out as a classic case of “uncontrolled learning, uncontrolled output.”


Zillow Offers: Model error at scale

Zillow used a machine‑learning model to predict home prices and then act on those predictions by automatically buying properties it believed were undervalued. When reality diverged from the model, the company incurred losses of over $300 million, shut the business, and laid off around 2,000 employees.

Mark describes it as “absolutely staggering in terms of financial loss and reputational impact.”


The four common failure modes

Across these cases, Mark highlights four recurring themes:

Data bias: training data does not represent the real‑world population or scenarios, leading to systematic unfairness.

Hallucinations: generative models produce plausible but fabricated content, as in the Air Canada case.

Data and model drift: underlying data (e.g., interest rates, market volatility, customer behavior) and model behavior change over time, degrading performance.

Uncontrolled learning and output: systems that adapt without oversight can become unsafe or misaligned, like Microsoft’s Tay.

For banks, these four themes map directly to concerns about fairness, operational risk, legal liability, and prudential safety and soundness.
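Of the four themes, drift is the one that lends itself most directly to a numeric check. The following is a minimal sketch, assuming a population stability index (PSI) comparison between the score distribution at training time and the one seen in production. PSI and the 0.1 / 0.25 rule-of-thumb thresholds are common industry conventions rather than anything prescribed in the session, and the function names are hypothetical.

```python
import math


def psi(expected, actual, eps=1e-6):
    """Population stability index between two binned distributions.

    `expected` and `actual` are lists of bin proportions (each summing to
    roughly 1), e.g. the share of scores per bin at training time versus
    in production.
    """
    if len(expected) != len(actual):
        raise ValueError("both distributions must use the same bins")
    total = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)  # guard against log(0) and division by zero
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


def drift_status(value, warn=0.1, alert=0.25):
    """Map a PSI value to a status using assumed rule-of-thumb thresholds."""
    if value >= alert:
        return "alert"
    if value >= warn:
        return "warn"
    return "stable"


baseline = [0.25, 0.25, 0.25, 0.25]    # score mix when the model was approved
production = [0.10, 0.20, 0.30, 0.40]  # score mix observed this month
print(drift_status(psi(baseline, production)))
```

In this illustrative data the production mix has shifted toward the upper bins, so the check lands in the warning band; an institution would set its own thresholds and escalation path.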


Defining an Enterprise AI Risk Appetite

Before designing controls, Mark argues that financial institutions need to answer a deceptively simple question: “What is our AI risk appetite?”

He stresses that there are “no hard and fast rules.” Risk tolerance depends on your products, customer demographics, and regulatory context. But a useful starting point is to classify AI use cases into three broad tiers based on data sensitivity and impact.

Tier 1: Lower‑risk, public or non‑sensitive data

These are AI solutions that:

Use public data (e.g., market data, public filings like 10‑K/10‑Q).

Are used primarily for internal analysis or summarization, with human review before any client communication.

Example: using a GenAI tool to summarize public earnings reports for an internal briefing.

Mark notes that such use cases “may be auto‑approved to proceed for design and development,” as long as they still follow baseline technical and operational standards.



Tier 2: Medium‑risk, sensitive customer or policy data

These use cases:

Involve sensitive internal data such as customer portfolios, names, or internal policies.

Produce outputs that feed into client‑facing decisions or advice, even if a human remains in the loop.

Example: an ML model segmenting customers for investment recommendations, or a GenAI assistant that drafts client emails based on portfolio data.

Mark’s guidance: these solutions require additional risk review before development and deployment.


Tier 3: High‑risk, restricted or critical data

At the top of the risk spectrum are use cases that:

Rely on the most sensitive and critical data (credit card numbers, social security numbers, merger and acquisition information).

Use agentic AI to make automated decisions and execute actions without direct human sign‑off.

Mark is explicit that “this type of business use case would be barred or restricted” in most institutions because of the combination of sensitive data, autonomous action, and potential external impact (customers, regulators, etc.).
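The three tiers can be expressed as a small classification rule. This is a sketch under stated assumptions, not the framework as presented: the `DataClass` and `UseCase` names are hypothetical, and the rule that the highest-risk attribute wins (for example, an autonomous agent on public data still lands in Tier 3) is a conservative choice made here because the session does not spell out mixed cases.

```python
from dataclasses import dataclass
from enum import IntEnum


class DataClass(IntEnum):
    PUBLIC = 1      # market data, public filings such as 10-K/10-Q
    SENSITIVE = 2   # customer portfolios, names, internal policies
    RESTRICTED = 3  # card numbers, social security numbers, M&A information


@dataclass(frozen=True)
class UseCase:
    name: str
    data: DataClass
    feeds_client_decisions: bool = False  # output informs client-facing advice
    acts_without_signoff: bool = False    # agentic AI executing actions itself


def risk_tier(uc: UseCase) -> int:
    """Return 1, 2, or 3; the highest-risk attribute wins (an assumption)."""
    if uc.data == DataClass.RESTRICTED or uc.acts_without_signoff:
        return 3
    if uc.data == DataClass.SENSITIVE or uc.feeds_client_decisions:
        return 2
    return 1


examples = [
    UseCase("earnings report summary", DataClass.PUBLIC),
    UseCase("client email drafts", DataClass.SENSITIVE, feeds_client_decisions=True),
    UseCase("autonomous payment agent", DataClass.RESTRICTED, acts_without_signoff=True),
]
for uc in examples:
    print(uc.name, "-> tier", risk_tier(uc))
```

Keeping the rule this small makes it easy for a risk team to review and amend as the institution's appetite changes.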


From Use Cases to Risk Personas and Controls

Once a bank defines its risk appetite, the next step is to classify AI use cases into risk personas with clear routes through the governance process. Mark proposes three broad personas:

Auto‑approved: lower‑risk use cases with predefined controls.

Review‑required: medium‑risk use cases needing deeper assessment.

Restricted/banned: high‑risk scenarios that the institution will not pursue.


Evaluating a use case: three dimensions

Mark suggests evaluating AI use cases along three core dimensions:

Functionality:

Is the use case summarizing, classifying, predicting, or generating content?

Is it using GenAI, traditional ML, or agentic AI?

Data type:

Public, internal non‑sensitive, sensitive, or highly confidential data?

Intended audience:

Internal subject‑matter experts who know the content.

Internal non‑experts.

External clients, customers, or counterparties.

Regulators or external auditors.

For example, Mark contrasts two internal summarization use cases using GenAI:

Lower risk: summarization for domain experts who “are well familiar with [the data], they work with [it] on a daily basis.”

Increased risk: summarization for internal users who are not experts in that data, because they are more likely to take model output at face value.
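One way to make this contrast concrete is an additive score across the dimensions. The sketch below is illustrative only: the numeric weights, the cut-off, and the hard rule for agentic AI are all hypothetical values chosen for the example, and for brevity it scores the technology type rather than the full functionality dimension.

```python
from dataclasses import dataclass

# Hypothetical ordinal scores; an institution would calibrate its own.
TECHNOLOGY = {"traditional_ml": 1, "genai": 2, "agentic": 3}
DATA = {"public": 0, "internal": 1, "sensitive": 2, "highly_confidential": 3}
AUDIENCE = {
    "internal_expert": 0,
    "internal_non_expert": 1,
    "external_client": 2,
    "regulator": 2,
}
AUTO_APPROVE_MAX = 3  # assumed cut-off between auto-approval and review


@dataclass(frozen=True)
class Assessment:
    technology: str
    data: str
    audience: str

    def score(self) -> int:
        return TECHNOLOGY[self.technology] + DATA[self.data] + AUDIENCE[self.audience]


def persona(a: Assessment) -> str:
    # Assumed hard rule: agentic AI on highly confidential data is restricted.
    if a.technology == "agentic" and a.data == "highly_confidential":
        return "restricted"
    return "auto-approved" if a.score() <= AUTO_APPROVE_MAX else "review-required"


# Same technology and data, different audience.
expert = Assessment("genai", "internal", "internal_expert")
non_expert = Assessment("genai", "internal", "internal_non_expert")
print(persona(expert), persona(non_expert))
```

With these example weights, changing only the audience is enough to move the same summarization use case from one persona to the next, which mirrors the point about non-experts taking output at face value.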


Controls: what “good” looks like in practice

Mark emphasizes that every approved AI use case needs defined controls and monitoring, not just a one‑time risk sign‑off. Examples include:

Human‑in‑the‑loop: mandatory review of outputs before they reach customers, especially for GenAI summaries or recommendations.

Content moderation and safety filters: prevent offensive or harmful outputs from reaching users.

Data controls: manage data lineage, cleanliness, and integrity to mitigate bias and drift.

Issue intake and production monitoring: mechanisms to capture defects, hallucinations, unexpected behaviors, and security incidents in production.

Crisis management plans: predefined response playbooks for critical failures in externally facing AI systems.

Training and certification: periodic staff training and knowledge checks on AI risks and proper use, supported by HR and shared services.

Mark also notes the role of enterprise shared services (like HR or finance) in embedding controls, for example by requiring employees to complete training and an attestation in systems like Workday every 3–6 months as models evolve.
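The first two controls in the list, human review and content filtering, can be combined into a single release gate that every output passes through before it is sent. This is a minimal sketch assuming a simple phrase blocklist; the `Draft` type, the `release` function, and the blocked phrases are hypothetical, and a production filter would be far more capable than substring matching.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical phrases a bank might not want in generated client text.
BLOCKED_PHRASES = ("guaranteed return", "risk-free")


@dataclass
class Draft:
    text: str
    customer_facing: bool
    approved_by: Optional[str] = None  # reviewer ID once a human signs off


def release(draft: Draft) -> str:
    """Return the text if it may be sent; raise if a control blocks it."""
    lowered = draft.text.lower()
    hits = [p for p in BLOCKED_PHRASES if p in lowered]
    if hits:
        raise ValueError(f"blocked by content filter: {hits}")
    if draft.customer_facing and draft.approved_by is None:
        raise PermissionError("customer-facing output requires human review")
    return draft.text


draft = Draft("Here is your quarterly portfolio summary.", customer_facing=True)
try:
    release(draft)
except PermissionError as exc:
    print("held:", exc)

draft.approved_by = "reviewer-042"
print("sent:", release(draft))
```

Raising an exception rather than returning a flag means a caller cannot silently ignore a failed control; the hold has to be handled explicitly.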


The AI Governance Model: Steering Committee at the Core

To move from policy to execution, Mark recommends a formal governance structure anchored by an AI Steering Committee (sometimes called an AI Governance Board).


Role of the AI Steering Committee

The committee’s primary responsibilities are to:

Set the AI roadmap and priorities: which use cases to pursue and in what sequence, based on business value and risk.

Oversee the risk assessment process for AI solutions.

Ensure that controls and monitoring are in place for approved use cases.


Who sits on the committee

Mark outlines a cross‑functional group of senior leaders:

Business owner

Champions the use case and defines the problem in measurable terms (cycle time, throughput, transaction counts).

Owns the risk for that use case and engages other stakeholders to identify and mitigate it.

Technology leaders

From development and delivery teams, plus infrastructure.

Responsible for designing, building, integrating, and supporting AI solutions.

Cybersecurity

Ensures that AI integrations, data flows, and prompts do not introduce new attack vectors or data leakage.

Model Risk Management (MRM)

Monitors models in production (both GenAI and ML) for performance, drift, and stability.

Assesses changes when model providers upgrade versions (e.g., “models as a service” where the underlying model can change under the hood).

Risk, Compliance, and Legal

Bring expertise from enterprise and business risk, third‑party risk, and regulatory compliance.

Legal drafts and reviews AI‑related contractual clauses and addenda for third‑party tools and platforms.
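The MRM responsibility for provider upgrades can be supported by a simple version check. The sketch below assumes the provider reports a model version string with each response and that MRM records the version it last assessed; the `ApprovedModel` type, the vendor name, and the version strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ApprovedModel:
    use_case: str
    provider: str
    version: str  # version recorded at the last MRM assessment


def unassessed_versions(approved: ApprovedModel, observed: List[str]) -> List[Tuple[int, str]]:
    """Return (position, version) for each response served by a version
    that differs from the one MRM assessed."""
    return [
        (i, v.strip())
        for i, v in enumerate(observed)
        if v.strip() != approved.version
    ]


approved = ApprovedModel("document summarizer", "example-vendor", "summarizer-v1")
# Versions reported alongside four consecutive responses (illustrative data).
observed = ["summarizer-v1", "summarizer-v1", "summarizer-v2", "summarizer-v2"]

changes = unassessed_versions(approved, observed)
if changes:
    first_index, new_version = changes[0]
    print(f"reassess {approved.use_case}: {new_version} first seen at response {first_index}")
```

A check like this only detects a change; deciding whether the new version stays in production remains a judgment for MRM and the business owner.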


The Risk Evaluation Committee

For higher‑risk use cases (Tier 2 and Tier 3), Mark describes a Risk Evaluation Committee or similar construct. This group:

Reviews the specific use case, data, and proposed solution.

Confirms the appropriate risk persona (auto‑approved, review‑required, restricted).

Coordinates additional analysis with subject‑matter experts in enterprise risk, cybersecurity, legal, and compliance.

In financial services, that can include business risk and controls, general risk and controls, and third‑party risk teams.


Applying the Three Lines of Defense to AI

Mark maps his approach onto a familiar structure for banks: the three lines of defense for risk.


First line: Business and AI delivery teams

The first line is where AI is conceived and built. It includes:

Business units and sponsors

Own and manage the risks of their AI use cases.

Are accountable for identifying risks (strategic, reputational, legal/regulatory, ethical, financial) and specifying controls.

AI Center of Excellence (CoE)

Business analysts and developers who design, code, and deliver solutions.

Follow secure coding practices, manage credentials, and implement technical controls and logging.

Enterprise shared services

Functions like HR which implement cross‑cutting controls such as mandatory education and periodic testing on AI risk.

Mark’s emphasis is clear: the first line owns AI risk; other functions advise and challenge but cannot replace that ownership.


Second line: Enterprise risk, compliance, and legal

The second line provides oversight and challenge:

Enterprise risk management

Maintains the AI risk framework and ensures it is applied consistently.

Tracks identified risks, associated controls, and residual risk; confirms residual risk remains within tolerance.

Compliance

Ensures AI use aligns with regulatory expectations, including emerging AI guidelines and existing conduct/consumer rules.

Legal

Plays a dual role:

Review of contracts, agreements, and AI addenda covering third‑party tools and AI capabilities embedded in enterprise platforms (e.g., ServiceNow, Workday, lending systems).

Ongoing assessment of AI legal ramifications based on defined and emerging regulatory standards.

Mark calls out a growing trend: enterprise applications are quietly adding AI features, and legal needs to ensure these are covered by robust clauses around safety, security, and data usage (e.g., prohibiting vendors from training their models on sensitive client data).


Third line: Internal audit

Internal audit functions as the third and final line:

Conducts independent evaluations of AI risk processes, governance, and controls.

Assesses control design and effectiveness, and whether AI risks are being monitored and escalated appropriately.

Aggregates findings into reports for senior leadership and the board, highlighting gaps, residual risk, and root causes, which often include insufficient training and awareness.

Mark underscores that internal audit is not just checking boxes; it is providing leadership with a realistic view of whether AI risk governance is working in practice.


From Policy to Practice: Three Actions to Start Now

Mark closes by returning to advice from a former manager: when asked “should we do A or B?”, the answer is often “it depends.” The same is true in AI risk. But he also insists that every bank can take three concrete steps now.


Define and document your AI risk appetite

Clarify what types of data, use cases, and audiences you are comfortable with and which ones you are not.

Explicitly categorize use cases into tiers.

Make it clear which types are auto‑approved, which require deeper review, and which are not allowed.


Build a categorization and control framework at the use‑case level

For each AI use case:

Classify it by function, data, and audience.

Assign it a risk persona (auto‑approved, review‑required, restricted).

Attach a standard control set and monitoring plan appropriate to that persona.

This ensures every use case has a clear path through governance and a consistent baseline of controls.
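A use-case register is one way to make that path explicit. The following sketch assumes each persona maps to a standard control set drawn from the controls discussed earlier; which controls belong to which persona is an illustrative assumption, and the function and field names are hypothetical.

```python
BASELINE_CONTROLS = (
    "issue intake",
    "production monitoring",
    "periodic staff training",
)
REVIEW_CONTROLS = (
    "human-in-the-loop review",
    "content moderation",
    "data lineage checks",
    "crisis management plan",
)


def control_set(persona: str) -> tuple:
    """Return the standard controls for a persona (assumed mapping)."""
    if persona == "auto-approved":
        return BASELINE_CONTROLS
    if persona == "review-required":
        return BASELINE_CONTROLS + REVIEW_CONTROLS
    if persona == "restricted":
        raise ValueError("restricted use cases are not registered for build")
    raise KeyError(f"unknown persona: {persona}")


def register_use_case(name: str, function: str, data: str,
                      audience: str, persona: str) -> dict:
    """Build a register entry carrying classification and controls."""
    return {
        "name": name,
        "function": function,
        "data": data,
        "audience": audience,
        "persona": persona,
        "controls": list(control_set(persona)),
    }


entry = register_use_case(
    name="client email drafting assistant",
    function="generate",
    data="sensitive",
    audience="external client",
    persona="review-required",
)
print(len(entry["controls"]), "controls attached to", entry["name"])
```

Because the control set is derived from the persona rather than chosen per project, two use cases with the same classification always start from the same baseline.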


Stand up an AI governance and oversight model

Put in place:

An AI Steering Committee with clear remit, membership, and decision rights.

Integration with your existing three lines of defense, including model risk management and internal audit.

Regular reporting so leadership can see AI risk alongside other enterprise risks.

As Mark’s examples show, from Amazon’s biased hiring tool to Air Canada’s chatbot and Zillow’s losses, the question is not whether AI risk is real, but whether your institution is deliberate and disciplined in how it governs AI.

With a defined risk appetite, structured use‑case categorization, and a robust governance model, banks can move confidently from AI policy to AI practice, capturing value while staying within their risk tolerance.
