Agentic AI · Data Governance · Pharma

Agentic AI for Data Governance:
A Production Pattern for First-Pass Curation

Why the next operating model for pharma data governance is not “AI replaces stewards” but “agents propose, stewards adjudicate, provenance binds them” — and what that pattern has to look like to survive regulatory scrutiny.

May 2026 · ~22 min read · Ali Shahmohammadi, Ph.D. · 32 references


10x: the leverage gain stewardship teams need if governance backlogs are going to stop compounding

3: core commitments in the production pattern (calibrated confidence, structured human review, semantic provenance)

Aug 2026: EU AI Act Article 14 oversight obligations apply to high-risk systems from 2 August 2026

32: references spanning agent architecture, pharma regulation, governance standards, and production exemplars

Table of Contents

01 The Governance Backlog Nobody Wants to Talk About
02 Workflows vs. Agents
03 Agents Propose, Stewards Adjudicate, Provenance Binds
04 The Pipeline, Concretely
05 What the Vendor Tools Actually Bring
06 The Regulatory Frame Has Already Closed
07 The Standards Stack Underneath
08 What the Pharma Exemplars Actually Show
09 Operating Model Decisions Worth Getting Right
10 Where the Next Two Years Go

01 — The Governance Backlog

The Governance Backlog Nobody Wants to Talk About

The real problem is not whether the catalog exists. It is whether the flow of human judgment can keep up with the inflow of governance work.

In most large pharma data offices, the visible tooling story looks healthy: a catalog platform is licensed, a glossary exists, stewardship roles are named, and governance roadmaps are in flight. Behind that visible layer sits the real operational condition: thousands of unmapped attributes, policy assignments waiting on review, lineage gaps, and domain definitions that accumulate faster than steward teams can process them.

The issue is structural, not cultural. Data governance is still a piecework discipline. Each new dataset, field, policy, and ownership decision requires a small unit of expert judgment. The supply of that judgment is finite. The demand is rising because IDMP, AI oversight, cloud migration, and cross-domain analytics each add more assets and more review obligations.

This is why agentic AI is attractive in governance settings. The promise is not full automation. The promise is that the first pass can be done by machine, so humans adjudicate only the contested or risky residue. That promise is directionally correct. The failure mode is treating the first pass as if it were equivalent to final authority.

The framing that survives scrutiny: not “AI replaces stewards,” but “agents make bounded proposals, humans make accountable decisions, and the system preserves evidence for every step.”

02 — Workflow Design

Workflows vs. Agents: A Definition Worth Holding

Most governance use cases should start as controlled workflows and only use full agent autonomy at the open-ended edge cases.

Anthropic’s engineering distinction is the cleanest useful definition in this space: workflows are systems where LLMs and tools run through predefined code paths; agents are systems where the model dynamically chooses its own sequence of actions.¹

“Workflows are systems where LLMs and tools are orchestrated through predefined code paths. Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage.”

That distinction matters because governance risk sits in the control surface. A workflow is testable and easier to validate: its control flow is fixed in code even when individual model outputs vary. A true agent can solve more open-ended problems, but it is harder to constrain and creates a more complex audit trail. For first-pass curation in pharma, the right default is workflow-first, agent-second.

Schema mapping, glossary suggestion, classification, PII tagging, and lineage stitching are structured enough to fit controlled workflows. Reserve autonomous looping for the residue: cross-system reconciliation, novel ontology alignment, and multi-hop evidence discovery. Even then, keep the tool surface narrow and the human checkpoint explicit.
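The contrast can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: `curation_workflow`, `bounded_agent`, the stubbed model calls, and the `glossary` dictionary are all hypothetical names introduced here. The workflow fixes the order of steps in code; the agent lets a model-like policy pick the next tool, but only from an allow-list and only up to a step cap, after which the case falls back to a steward.

```python
def curation_workflow(column, suggest_label, lookup_glossary):
    """Workflow: fixed code path. The model fills in a step; code picks the order."""
    label = suggest_label(column)      # step 1: model call (stubbed below)
    term = lookup_glossary(label)      # step 2: deterministic tool call
    return {"column": column, "label": label, "glossary_term": term}


def bounded_agent(goal, choose_action, tools, max_steps=5):
    """Agent: the model picks the next tool, inside an allow-list and a step cap."""
    observations = []
    for _ in range(max_steps):
        name, argument = choose_action(goal, observations)
        if name == "finish":
            return {"answer": argument, "steps": len(observations)}
        if name not in tools:
            raise ValueError(f"tool {name!r} is outside the allowed surface")
        observations.append((name, tools[name](argument)))
    # Step budget exhausted: return no answer so the case goes to a steward.
    return {"answer": None, "steps": len(observations)}


glossary = {"Date of Birth": "term:dob"}  # hypothetical glossary content

workflow_result = curation_workflow(
    "crm.patient.dob",
    suggest_label=lambda column: "Date of Birth",  # stand-in for a model call
    lookup_glossary=glossary.get,
)


def scripted_policy(goal, observations):
    """Stand-in for a model: look up once, then finish with what was found."""
    if not observations:
        return "glossary_lookup", "Date of Birth"
    return "finish", observations[-1][1]


agent_result = bounded_agent(
    "link glossary term", scripted_policy, {"glossary_lookup": glossary.get}
)
```

The workflow is the easier of the two to validate because its control flow never changes between runs. The agent variant earns its place only where the sequence of lookups cannot be known in advance.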

Reason + Act

ReAct

The canonical loop for interleaving reasoning and tool use. Useful when governance tasks need evidence lookup before proposing an assertion.²

Tool Use

Toolformer

Shows the value of explicit tool calling instead of treating the model as a closed box. Governance agents should call systems, not hallucinate them.³

Reflection

Reflexion

Improvement over time through verbal feedback loops. In governance, steward overrides are the strongest candidate signal for this pattern.⁴

03 — The Pattern

Agents Propose, Stewards Adjudicate, Provenance Binds Them

A production-ready governance architecture differs from a demo in three ways: it measures uncertainty, it routes humans deliberately, and it records provenance semantically.

Commitment 1

Confidence is a first-class signal

Every proposal needs a calibrated confidence and a structured citation. Selective classification and calibrated abstention matter more than raw answer rate because abstention is a valid control outcome in regulated governance.⁷⁸⁹¹⁰
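A minimal sketch of what abstention looks like as a control outcome, assuming each proposal carries a confidence in [0, 1] that has already been calibrated. The `Proposal` fields, the asset names, and the 0.9 threshold are illustrative assumptions, not recommended values. The two numbers worth tracking are coverage (the share of proposals the agent answers) and selective accuracy (accuracy on the answered share only).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Proposal:
    asset: str          # e.g. a column identifier (hypothetical naming)
    label: str          # the classification the agent proposes
    confidence: float   # assumed to be calibrated, in [0, 1]
    citation: str       # pointer to the evidence behind the proposal


def select_or_abstain(p: Proposal, threshold: float) -> Optional[str]:
    """Return the proposed label, or None to abstain below the threshold."""
    return p.label if p.confidence >= threshold else None


def coverage_and_selective_accuracy(proposals, truth, threshold):
    """Coverage = share answered; selective accuracy = accuracy on answered."""
    answered = [p for p in proposals if select_or_abstain(p, threshold) is not None]
    coverage = len(answered) / len(proposals) if proposals else 0.0
    if not answered:
        return coverage, None
    correct = sum(1 for p in answered if truth[p.asset] == p.label)
    return coverage, correct / len(answered)


proposals = [
    Proposal("crm.patient.dob", "PII", 0.97, "glossary:date-of-birth"),
    Proposal("lims.sample.lot", "Batch ID", 0.93, "glossary:lot-number"),
    Proposal("erp.item.code", "Substance ID", 0.55, "sample-values"),
    Proposal("crm.site.name", "PII", 0.91, "column-name"),
]
# Steward-confirmed labels, invented for this example.
truth = {
    "crm.patient.dob": "PII",
    "lims.sample.lot": "Batch ID",
    "erp.item.code": "Product ID",
    "crm.site.name": "Organization",
}

coverage, accuracy = coverage_and_selective_accuracy(proposals, truth, threshold=0.9)
```

Raising the threshold trades coverage for selective accuracy. In a regulated setting that trade is a policy decision, which is why the threshold belongs in configuration rather than in the agent.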

Commitment 2

Human review is a routing problem

Human-in-the-loop is not a courtesy step. It is a policy-enforced route for low-confidence cases, regulated identifiers, cross-domain conflicts, and assets whose failure cost is unacceptable.¹¹

Commitment 3

Provenance is semantic and queryable

PROV-O gives the vocabulary for who generated what, when, with which evidence, and under whose review. That turns auditability into a data model instead of an afterthought.¹²

The design principle: every assertion needs a who, why, when, and on-what-evidence. Every override becomes a reusable labeled signal. Every change to prompt, model, or tool surface becomes versioned configuration.
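As a sketch of what "who, why, when, and on-what-evidence" can look like in PROV-O terms, the snippet below builds plain subject-predicate-object triples in Python. The `prov:` terms used (Entity, Activity, SoftwareAgent, Person, wasGeneratedBy, generatedAtTime, wasAssociatedWith, used, wasDerivedFrom) come from the W3C vocabulary; the `ex:` identifiers, the function names, and the choice to model steward review as a second activity are assumptions made for illustration. A production system would more likely use an RDF library and a triple store.

```python
from datetime import datetime, timezone

PROV = "prov:"  # prefix for http://www.w3.org/ns/prov#
EX = "ex:"      # hypothetical namespace for local identifiers


def proposal_triples(assertion_id, run_id, agent_id, evidence_id, generated_at):
    """Triples stating who generated an assertion, when, and on what evidence."""
    a, run, agent, ev = (EX + x for x in (assertion_id, run_id, agent_id, evidence_id))
    return [
        (a, "rdf:type", PROV + "Entity"),
        (run, "rdf:type", PROV + "Activity"),
        (agent, "rdf:type", PROV + "SoftwareAgent"),
        (a, PROV + "wasGeneratedBy", run),
        (a, PROV + "generatedAtTime", f'"{generated_at.isoformat()}"^^xsd:dateTime'),
        (run, PROV + "wasAssociatedWith", agent),
        (run, PROV + "used", ev),
    ]


def review_triples(assertion_id, review_id, steward_id, decision_id):
    """Triples stating which steward reviewed the assertion and what resulted."""
    a, review, steward, decision = (
        EX + x for x in (assertion_id, review_id, steward_id, decision_id)
    )
    return [
        (review, "rdf:type", PROV + "Activity"),
        (steward, "rdf:type", PROV + "Person"),
        (review, PROV + "used", a),
        (review, PROV + "wasAssociatedWith", steward),
        (decision, PROV + "wasGeneratedBy", review),
        (decision, PROV + "wasDerivedFrom", a),
    ]


def to_turtle(triples):
    """Serialize triples as simple Turtle statements (prefixes declared elsewhere)."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


when = datetime(2026, 5, 4, 9, 30, tzinfo=timezone.utc)
triples = proposal_triples("assertion-17", "run-42", "classifier-v3", "glossary-entry-8", when)
triples += review_triples("assertion-17", "review-9", "steward-jdoe", "decision-9")
turtle = to_turtle(triples)
```

Because the agent run and the steward review are both activities over the same entity, a single graph query can walk from a catalog assertion back to the model run, the evidence it used, and the person who adjudicated it.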

04 — Pipeline

The Pipeline, Concretely

The critical separation is between proposals and production. Agents write to staging. Policy and humans determine promotion.

agentic-governance-pipeline.txt

CATALOG / KNOWLEDGE GRAPH
  glossary | classifications | lineage | policies | owners
  provenance graph on every assertion

AGENT PROPOSALS (staging graph)
  - classification
  - glossary link
  - lineage edge
  - policy tag
  - confidence
  - citation

STEWARD ADJUDICATION
  approve | override | escalate

POLICY ENGINE
  - confidence floors
  - regulated-asset rules
  - domain ownership rules
  - audit policies

SOURCE METADATA
  cataloged systems | lineage events | samples | glossary | ontologies

Boundary 1

Proposals do not write directly to production

Automatic promotion is allowed only when confidence clears policy thresholds and the asset is not in a protected class.
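One way to make this boundary concrete is to give the agent a write path that physically cannot reach production. The sketch below is hypothetical throughout: the `Catalog` class, the `PROTECTED_CLASSES` set, and the 0.95 floor are illustrative assumptions. The point is only that promotion is a separate operation, gated by confidence and asset class unless a steward has approved.

```python
from dataclasses import dataclass, field

PROTECTED_CLASSES = {"regulated_identifier", "gxp_record"}  # illustrative only


@dataclass
class Catalog:
    staging: dict = field(default_factory=dict)     # agent-writable
    production: dict = field(default_factory=dict)  # written only via promote()

    def propose(self, asset, label, confidence, asset_class):
        """Agents call this; it never touches production."""
        self.staging[asset] = {
            "label": label, "confidence": confidence, "asset_class": asset_class,
        }

    def promote(self, asset, confidence_floor, approved_by=None):
        """Promote on steward approval, or automatically if policy allows it."""
        p = self.staging[asset]
        auto_ok = (
            p["confidence"] >= confidence_floor
            and p["asset_class"] not in PROTECTED_CLASSES
        )
        if approved_by is None and not auto_ok:
            return False  # stays in staging until a steward adjudicates
        self.production[asset] = {**p, "promoted_by": approved_by or "policy:auto"}
        del self.staging[asset]
        return True


catalog = Catalog()
catalog.propose("lims.sample.site", "Site Name", 0.96, "general")
catalog.propose("reg.substance.id", "Substance ID", 0.99, "regulated_identifier")

auto_general = catalog.promote("lims.sample.site", confidence_floor=0.95)
auto_regulated = catalog.promote("reg.substance.id", confidence_floor=0.95)
human_regulated = catalog.promote(
    "reg.substance.id", confidence_floor=0.95, approved_by="steward:jdoe"
)
```

Note that the regulated identifier is held back despite 0.99 confidence. Protected-class membership overrides confidence, and the `promoted_by` field records whether policy or a named steward made the call.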

Boundary 2

The policy engine sits upstream

The system decides which tasks the agent may attempt before the agent runs, not after it has already acted.
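A small sketch of what "upstream" means in code, assuming a task is described by a task type and an asset class. The `ALLOWED_TASKS` table, its values, and the function names are invented for illustration. The authorization check runs before the agent callable is invoked, so a denied task produces no model call and no tool call.

```python
ALLOWED_TASKS = {
    # task type -> asset classes the agent may attempt (illustrative values)
    "glossary_link": {"general", "clinical"},
    "classification": {"general"},
    "lineage_edge": {"general", "clinical", "regulated_identifier"},
}


class TaskNotPermitted(Exception):
    """Raised before any model or tool call is made."""


def authorize(task_type, asset_class):
    """Raise unless policy explicitly allows this task on this asset class."""
    allowed = ALLOWED_TASKS.get(task_type, set())
    if asset_class not in allowed:
        raise TaskNotPermitted(f"{task_type} on {asset_class} is not permitted")


def run_task(task_type, asset_class, agent):
    """Check policy first; the agent callable is invoked only if authorized."""
    authorize(task_type, asset_class)
    return agent(task_type, asset_class)


calls = []


def stub_agent(task_type, asset_class):
    """Stand-in for a real agent; records that it was invoked."""
    calls.append((task_type, asset_class))
    return "proposal"


allowed_result = run_task("glossary_link", "clinical", stub_agent)

try:
    run_task("classification", "regulated_identifier", stub_agent)
    denied = False
except TaskNotPermitted:
    denied = True
```

Unknown task types are denied by default because the lookup falls back to an empty set. That default matters more than any individual rule.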

Boundary 3

Provenance is its own first-class store

You need one-query answers to questions like which model version touched a regulated substance during a specific period.
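To show what a one-query answer looks like, the sketch below uses an in-memory SQLite table as a stand-in for the provenance store. That substitution is an assumption made to keep the example self-contained: a PROV-O store would typically be queried with SPARQL. The table layout, model version strings, and asset names are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE provenance (
           assertion_id TEXT, asset TEXT, asset_class TEXT,
           model_version TEXT, generated_at TEXT)"""
)
conn.executemany(
    "INSERT INTO provenance VALUES (?, ?, ?, ?, ?)",
    [
        ("a1", "reg.substance.id", "regulated_substance", "model-2026-01", "2026-02-10T08:00:00Z"),
        ("a2", "reg.substance.name", "regulated_substance", "model-2026-03", "2026-03-15T08:00:00Z"),
        ("a3", "crm.site.name", "general", "model-2026-03", "2026-03-16T08:00:00Z"),
        ("a4", "reg.substance.id", "regulated_substance", "model-2026-04", "2026-04-20T08:00:00Z"),
    ],
)


def models_touching(conn, asset_class, start, end):
    """Model versions that generated assertions on an asset class in [start, end)."""
    rows = conn.execute(
        """SELECT DISTINCT model_version FROM provenance
           WHERE asset_class = ? AND generated_at >= ? AND generated_at < ?
           ORDER BY model_version""",
        (asset_class, start, end),
    )
    return [row[0] for row in rows]


# Timestamps share one ISO 8601 format, so string comparison orders them correctly.
q1_models = models_touching(
    conn, "regulated_substance", "2026-01-01T00:00:00Z", "2026-04-01T00:00:00Z"
)
```

If answering this question requires joining agent logs, catalog history, and a deployment tracker by hand, provenance is not yet a first-class store.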

05 — Vendor Reality

What the Vendor Tools Actually Bring

The useful evaluation question is not whether a catalog vendor has AI. It is whether the AI’s behavior can be governed to pharma-grade standards.

Collibra

AI Governance as control plane

Positioned around use-case registration, risk controls, and auditability across the AI lifecycle.¹³

Atlan

Metadata-first AI governance

The catalog remains the substrate; AI is treated as both consumer and contributor of active metadata.¹⁴

Alation

Embedded curation assistance

ALLIE AI focuses on catalog curation, glossary generation, and stewardship workflows inside the existing platform.¹⁵

Informatica

CLAIRE as agent-capable layer

Metadata understanding, classification, and rule recommendations framed inside the broader IDMC platform.¹⁶

The buying question: where does the agent write, what confidence and provenance accompany each assertion, and how is the policy engine enforced for regulated assets? Product marketing is secondary to those controls.

06 — Regulation

The Regulatory Frame Has Already Closed

In pharma, oversight, audit trails, validation, and AI risk management are design inputs. They are not optional add-ons.

EU AI Act

Article 14 human oversight

High-risk AI systems must be built so natural persons can understand capabilities and limits, interpret outputs, override decisions, and stop the system. The production pattern maps directly to that requirement.¹⁷

GxP Controls

21 CFR Part 11 and Annex 11

Electronic records rules demand audit trails, accountability, and validated computerized systems. Agentic curation touching GxP-adjacent data needs ALCOA+ provenance by construction.¹⁸¹⁹

Validation

GAMP 5 Second Edition

The current practical validation framework for AI-enabled computerized systems in regulated pharma, especially where iterative change and service providers are involved.²⁰

Risk Management

NIST AI RMF + GenAI Profile

Govern, Map, Measure, and Manage provide the cleanest non-binding structure for organizing explainability, accountability, resilience, and generative-AI-specific risks.²¹²²

Architectural conclusion: there is no plausible 2026 deployment of agentic curation in a pharma data office that survives audit without Article-14-grade oversight, ALCOA+ evidence trails, GAMP-grade validation, and AI-RMF-aligned governance documentation.

07 — Standards

The Standards Stack Underneath

Production governance needs both semantic standards and operating-model standards. One set defines the facts; the other defines how the organization controls them.

Semantic Provenance

W3C PROV-O

The only standard in this stack that directly makes who, what, when, and why queryable across agent runs, catalog assertions, and steward review.¹²

Entity · Activity · Agent

Governance Vocabulary

DAMA-DMBOK 2 + ISO data governance

DMBOK, ISO 8000-1, and ISO/IEC 38505-1 give shared language for data quality and governance structure across functions and control forums.²³²⁴²⁵

Lineage

OpenLineage + Marquez

Useful for pipeline-level events and run metadata. Combined with PROV-O, it gives both operational lineage and semantic accountability.²⁶²⁷

Cloud Controls

EDM Council CDMC

Provides a practical scoring rubric for whether sensitive data in cloud and hybrid-cloud environments is governed tightly enough for agentic curation to operate safely.²⁸

Capability Maturity

EDM Council DCAM

Useful for determining whether a domain is mature enough to adopt agent-augmented governance before the pilot starts, rather than after it fails.²⁹

08 — Pharma Signals

What the Pharma Exemplars Actually Show — and Don’t

The strongest public signal is that pharma is treating agentic governance as a standards-and-architecture problem. What is still missing is a mature peer-reviewed case study of full production-scale LLM agents running governance loops.

Pistoia Alliance

Agentic AI initiative

The clearest cross-industry sign that pharma sees agentic AI as a standards problem, not just a vendor feature race.³⁰

AstraZeneca

Knowledge graphs as R&D infrastructure

AZ publicly describes graph-based semantic infrastructure across genomic, disease, drug, clinical, and safety data, which is the closest public analogue to the substrate needed for governed agentic curation.³¹³²

Current gap

No mature public benchmark yet

What the public record still lacks is a peer-reviewed account of a pharma deployment where LLM agents own governance control flow in production at enterprise scale.

The practical reading: the architecture is established, the controls are specified, the vendors are converging, and the first exemplars are imminent rather than historical. Teams building now are likely to define the reference pattern others adopt later.

09 — Operating Model

The Operating Model Decisions Worth Getting Right

Most pilot failures come from weak operating decisions, not weak model capability.

Configuration Control

Treat prompts and tools as regulated configuration

Prompt changes, tool additions, and model upgrades should all trigger review, versioning, and regression checks against steward-labeled data.
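A hedged sketch of how that trigger could work: fingerprint everything that can change agent behavior, and require a regression replay whenever the fingerprint moves. The function names, the prompt text, the tool names, and the 0.9 accuracy floor are assumptions for illustration only.

```python
import hashlib
import json


def config_fingerprint(prompt, model_id, tools):
    """Stable hash over everything that can change agent behavior."""
    payload = json.dumps(
        {"prompt": prompt, "model": model_id, "tools": sorted(tools)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def regression_passes(predict, labeled_examples, min_accuracy):
    """Replay steward-labeled examples against a candidate configuration."""
    correct = sum(1 for x, y in labeled_examples if predict(x) == y)
    return correct / len(labeled_examples) >= min_accuracy


baseline = config_fingerprint("Classify the column.", "model-a", ["glossary_lookup"])
candidate = config_fingerprint(
    "Classify the column.", "model-a", ["glossary_lookup", "sample_values"]
)
needs_review = baseline != candidate  # a tool was added, so review is triggered

# Steward-labeled examples and a stand-in predictor, both invented here.
labeled = [("dob", "PII"), ("lot_no", "Batch ID"), ("site", "Organization")]
candidate_predict = {"dob": "PII", "lot_no": "Batch ID", "site": "PII"}.get
may_deploy = needs_review and regression_passes(candidate_predict, labeled, min_accuracy=0.9)
```

Storing the fingerprint alongside each provenance record ties every assertion to the exact prompt, model, and tool surface that produced it.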

Human Capacity

Fund the steward side of the loop

Agentic curation increases proposal volume. Without enough steward review capacity, the queue still grows, just faster and with more machine-generated work.

Policy Visibility

Make the policy engine human-readable

Rules like “regulated identifiers always require review” should be expressible in declarative policy, not buried in prompt text.
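A minimal sketch of rules expressed as data rather than prompt text. The rule schema (`field`, `op`, `value`), the rule names, and the 0.90 floor are assumptions chosen for readability. A real deployment might use a dedicated policy language instead, but the property to preserve is the same: a steward or auditor can read the rules without reading code or prompts.

```python
import operator

OPS = {"eq": operator.eq, "lt": operator.lt}

# Rules are data: each one names itself and can be reviewed on its own.
POLICY = [
    {"name": "regulated identifiers always require review",
     "field": "asset_class", "op": "eq", "value": "regulated_identifier"},
    {"name": "low confidence requires review",
     "field": "confidence", "op": "lt", "value": 0.90},
    {"name": "cross-domain conflicts require review",
     "field": "conflict", "op": "eq", "value": True},
]


def review_reasons(proposal, policy=POLICY):
    """Return the names of every rule that sends this proposal to a steward."""
    return [
        rule["name"]
        for rule in policy
        if OPS[rule["op"]](proposal[rule["field"]], rule["value"])
    ]


confident_regulated = {
    "asset_class": "regulated_identifier", "confidence": 0.99, "conflict": False,
}
routine = {"asset_class": "general", "confidence": 0.97, "conflict": False}

regulated_reasons = review_reasons(confident_regulated)
routine_reasons = review_reasons(routine)  # empty list: no rule fired
```

Returning rule names instead of a bare boolean means the review queue can show the steward why an item was routed to them, and the provenance record can store the same reason.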

Measurement

Calibrate before you trust

Reliability diagrams, selective accuracy, and expected calibration error on the real workload matter more than generic benchmark confidence claims.
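For reference, expected calibration error in its common equal-width-bin form can be computed in a few lines. This is a plain-Python sketch with invented sample data. The ten-bin default is a common convention rather than a requirement, and the confidences and outcomes below are not measurements from any real system.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-size-weighted mean of |accuracy - mean confidence| per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the top bin
        bins[index].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece


# Invented workload: four proposals at 0.95 (three correct), two at 0.65 (one correct).
confidences = [0.95, 0.95, 0.95, 0.95, 0.65, 0.65]
correct = [True, True, True, False, True, False]
ece = expected_calibration_error(confidences, correct)
```

In this sample the agent claims 0.95 but is right 75% of the time in that bin, and claims 0.65 but is right 50% of the time, so the weighted gap is about 0.18. The per-bin accuracy and mean confidence computed inside the loop are the same quantities a reliability diagram plots.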

Learning Loop

Treat human override as signal, not noise

Every steward override should feed the next cycle as labeled evidence. Otherwise the system repeats the same failure modes every quarter.
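A sketch of the smallest useful version of that loop, assuming the review log records the agent's label, the steward's label, and the decision. The log schema and field names are hypothetical. Overrides become labeled examples for the next regression set, and counting label pairs exposes the confusions that keep recurring.

```python
from collections import Counter


def overrides_to_labels(review_log):
    """Keep only overrides; each becomes (asset, steward_label) labeled evidence."""
    return [
        (entry["asset"], entry["steward_label"])
        for entry in review_log
        if entry["decision"] == "override"
    ]


def recurring_confusions(review_log):
    """Count (agent_label, steward_label) pairs to expose repeated failure modes."""
    return Counter(
        (entry["agent_label"], entry["steward_label"])
        for entry in review_log
        if entry["decision"] == "override"
    )


# Invented review log for illustration.
review_log = [
    {"asset": "crm.site.name", "agent_label": "PII",
     "steward_label": "Organization", "decision": "override"},
    {"asset": "crm.patient.dob", "agent_label": "PII",
     "steward_label": "PII", "decision": "approve"},
    {"asset": "crm.vendor.name", "agent_label": "PII",
     "steward_label": "Organization", "decision": "override"},
]

new_labels = overrides_to_labels(review_log)
top_confusion = recurring_confusions(review_log).most_common(1)
```

Here the agent twice labels organization names as PII. That pattern is invisible in any single review but obvious in the aggregate, which is the argument for capturing overrides as structured data.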

10 — The Next Two Years

Where the Next Two Years Go

Three forces are converging quickly enough to make 2026–2027 the inflection window for agent-augmented governance in pharma.

Regulatory deadlines

Oversight becomes enforceable

Article 14 timing, Annex 11 direction, and modernized GxP expectations make retrofit a weaker option every quarter.

Vendor maturity

Product capability catches up

The conversation is shifting from “does it have an agent?” to “can it be governed to GxP standards?” That is the right inflection point.

Industry coordination

Standards work opens up

Pistoia Alliance’s initiative creates a focal point for vocabularies and controls that vendors and pharmas will likely converge around.

The architectural conclusion: the governance layer of the next pharma data stack is an agent-augmented stewardship loop with semantic provenance, policy-gated autonomy, and human review designed in from the start. The opportunity is not replacing governance. It is giving governance the leverage it has been missing.

Closing Thought

Agents Propose. Stewards Adjudicate. Provenance Binds Them.

Data governance has always been finite humans facing effectively infinite work. Catalog tooling, stewardship communities, and glossary programs improved the surface, but they did not close the judgment gap. The gap kept widening.

What closes the gap is not an agent pretending to be a steward. It is an agent that knows when to abstain, surfaces evidence when it does act, and operates inside a policy and provenance framework the auditor can actually inspect. That system is buildable now with stable W3C standards, current governance frameworks, and vendor platforms that are finally approaching the right control surface.

Agents propose. Stewards adjudicate. Provenance binds them.


References

32 References

1. Schluntz, E., Zhang, B. Building effective agents. Anthropic Engineering, 19 December 2024. anthropic.com
2. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., Cao, Y. ReAct: Synergizing Reasoning and Acting in Language Models. ICLR 2023. arxiv.org
3. Schick, T., Dwivedi-Yu, J., Dessì, R., et al. Toolformer: Language Models Can Teach Themselves to Use Tools. NeurIPS 2023. arxiv.org
4. Shinn, N., Cassano, F., Berman, E., et al. Reflexion: Language Agents with Verbal Reinforcement Learning. NeurIPS 2023. arxiv.org
5. Wang, L., Ma, C., Feng, X., et al. A Survey on Large Language Model based Autonomous Agents. Frontiers of Computer Science, 2024. arxiv.org
6. Weng, L. LLM Powered Autonomous Agents. Lil'Log, 23 June 2023. lilianweng.github.io
7. Geifman, Y., El-Yaniv, R. Selective Classification for Deep Neural Networks. NeurIPS 2017. arxiv.org
8. Kamath, A., Jia, R., Liang, P. Selective Question Answering under Domain Shift. ACL 2020. aclanthology.org
9. Lin, S., Hilton, J., Evans, O. Teaching Models to Express Their Uncertainty in Words. TMLR, 2022. arxiv.org
10. Kadavath, S., Conerly, T., Askell, A., et al. Language Models (Mostly) Know What They Know. arxiv.org
11. Settles, B. Active Learning Literature Survey. University of Wisconsin-Madison. burrsettles.com
12. W3C. PROV-O: The PROV Ontology. W3C Recommendation, 30 April 2013. w3.org
13. Collibra. AI Governance. collibra.com
14. Atlan. AI Governance. atlan.com
15. Alation. ALLIE AI. alation.com
16. Informatica. CLAIRE AI. informatica.com
17. European Union. Regulation (EU) 2024/1689. eur-lex.europa.eu
18. U.S. FDA. 21 CFR Part 11: Electronic Records; Electronic Signatures. ecfr.gov
19. European Commission. EudraLex Volume 4, Annex 11: Computerised Systems. ec.europa.eu
20. ISPE. GAMP 5 Second Edition. ispe.org
21. NIST. AI RMF 1.0. doi.org
22. NIST. Generative Artificial Intelligence Profile. doi.org
23. DAMA International. DAMA-DMBOK 2. dama.org
24. ISO. ISO 8000-1:2022, Data quality, Part 1: Overview. iso.org
25. ISO. ISO/IEC 38505-1:2017, Governance of data. iso.org
26. OpenLineage. An Open Standard for Lineage Data Collection. openlineage.io
27. Marquez Project. Marquez. marquezproject.ai
28. EDM Council. Cloud Data Management Capabilities (CDMC). edmcouncil.org
29. EDM Council. Data Management Capability Assessment Model (DCAM). edmcouncil.org
30. Pistoia Alliance. Agentic AI initiative. pistoiaalliance.org
31. AstraZeneca. Data Science & AI. astrazeneca.com
32. Graphwise. AstraZeneca: Enabling new medicines through semantic knowledge graphs. graphwise.ai
