Agentic AI  ·  Data Governance  ·  Pharma

Agentic AI for Data Governance:
A Production Pattern for First-Pass Curation

Why the next operating model for pharma data governance is not “AI replaces stewards” but “agents propose, stewards adjudicate, provenance binds them” — and what that pattern has to look like to survive regulatory scrutiny.

May 2026 · ~22 min read · Ali Shahmohammadi, Ph.D. · 32 references
10x · The leverage gain stewardship teams need if governance backlogs are going to stop compounding
3 · Core commitments in the production pattern: calibrated confidence, structured human review, semantic provenance
Aug 2026 · EU AI Act Article 14 oversight obligations apply to high-risk systems from 2 August 2026
32 · References spanning agent architecture, pharma regulation, governance standards, and production exemplars

Table of Contents
  1. The Governance Backlog Nobody Wants to Talk About
  2. Workflows vs. Agents
  3. Agents Propose, Stewards Adjudicate, Provenance Binds
  4. The Pipeline, Concretely
  5. What the Vendor Tools Actually Bring
  6. The Regulatory Frame Has Already Closed
  7. The Standards Stack Underneath
  8. What the Pharma Exemplars Actually Show
  9. Operating Model Decisions Worth Getting Right
  10. Where the Next Two Years Go

01 — The Governance Backlog

The Governance Backlog Nobody Wants to Talk About

The real problem is not whether the catalog exists. It is whether the flow of human judgment can keep up with the inflow of governance work.

In most large pharma data offices, the visible tooling story looks healthy: a catalog platform is licensed, a glossary exists, stewardship roles are named, and governance roadmaps are in flight. Behind that visible layer sits the real operational condition: thousands of unmapped attributes, policy assignments waiting on review, lineage gaps, and domain definitions that accumulate faster than steward teams can process them.

The issue is structural, not cultural. Data governance is still a piecework discipline. Each new dataset, field, policy, and ownership decision requires a small unit of expert judgment. The supply of that judgment is finite. The demand is rising because IDMP, AI oversight, cloud migration, and cross-domain analytics each add more assets and more review obligations.

This is why agentic AI is attractive in governance settings. The promise is not full automation. The promise is that the first pass can be done by machine, so humans adjudicate only the contested or risky residue. That promise is directionally correct. The failure mode is treating the first pass as if it were equivalent to final authority.

The framing that survives scrutiny: not “AI replaces stewards,” but “agents make bounded proposals, humans make accountable decisions, and the system preserves evidence for every step.”

02 — Workflow Design

Workflows vs. Agents: A Definition Worth Holding

Most governance use cases should start as controlled workflows and reserve full agent autonomy for open-ended edge cases.

Anthropic’s engineering distinction is the cleanest useful definition in this space: workflows are systems where LLMs and tools run through predefined code paths; agents are systems where the model dynamically chooses its own sequence of actions [1].

“Workflows are systems where LLMs and tools are orchestrated through predefined code paths. Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage.”

That distinction matters because governance risk sits in the control surface. A workflow is testable, structurally deterministic, and easier to validate. A true agent can solve more open-ended problems, but it is harder to constrain and creates a more complex audit trail. For first-pass curation in pharma, the right default is workflow-first, agent-second.

Schema mapping, glossary suggestion, classification, PII tagging, and lineage stitching are structured enough to fit controlled workflows. Reserve autonomous looping for the residue: cross-system reconciliation, novel ontology alignment, and multi-hop evidence discovery. Even then, keep the tool surface narrow and the human checkpoint explicit.

Reason + Act

ReAct

The canonical loop for interleaving reasoning and tool use. Useful when governance tasks need evidence lookup before proposing an assertion [2].

Tool Use

Toolformer

Shows the value of explicit tool calling instead of treating the model as a closed box. Governance agents should call systems, not hallucinate them [3].

Reflection

Reflexion

Improvement over time through verbal feedback loops. In governance, steward overrides are the strongest candidate signal for this pattern [4].

03 — The Pattern

Agents Propose, Stewards Adjudicate, Provenance Binds Them

A production-ready governance architecture differs from a demo in three ways: it measures uncertainty, it routes humans deliberately, and it records provenance semantically.

Commitment 1

Confidence is a first-class signal

Every proposal needs a calibrated confidence and a structured citation. Selective classification and calibrated abstention matter more than raw answer rate because abstention is a valid control outcome in regulated governance [7, 8, 9, 10].
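
A minimal sketch of what abstention looks like as a measurable control, assuming each proposal carries a confidence in [0, 1] and a small steward-labeled sample exists for evaluation. The `Proposal` type, the asset names, and the 0.9 threshold are illustrative assumptions, not taken from any cited method or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    asset: str         # hypothetical column identifier
    label: str         # label proposed by the agent
    confidence: float  # agent-reported confidence in [0, 1]
    truth: str         # steward-confirmed label (evaluation only)


def selective_metrics(proposals, threshold):
    """Return (coverage, selective_accuracy) at a confidence threshold.

    Proposals below the threshold count as abstentions: they are routed
    to review and excluded from the accuracy figure.
    """
    answered = [p for p in proposals if p.confidence >= threshold]
    coverage = len(answered) / len(proposals) if proposals else 0.0
    if not answered:
        return coverage, None  # the agent abstained on everything
    correct = sum(p.label == p.truth for p in answered)
    return coverage, correct / len(answered)


sample = [
    Proposal("crm.patient_dob", "PII", 0.97, "PII"),
    Proposal("lims.batch_id", "Identifier", 0.92, "Identifier"),
    Proposal("erp.cost_center", "PII", 0.55, "Finance"),
    Proposal("ctms.site_code", "Identifier", 0.61, "Identifier"),
]

coverage, accuracy = selective_metrics(sample, threshold=0.9)
print(coverage, accuracy)
```

Raising the threshold trades coverage for accuracy on the answered subset; the useful number to report is the pair, not either value alone.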

Commitment 2

Human review is a routing problem

Human-in-the-loop is not a courtesy step. It is a policy-enforced route for low-confidence cases, regulated identifiers, cross-domain conflicts, and assets whose failure cost is unacceptable [11].
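
One way to read "routing problem" is as a function that returns explicit reasons for review rather than a bare yes/no. The sketch below assumes four boolean or numeric signals that mirror the four cases named above; the field names and the 0.9 floor are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    asset: str
    confidence: float
    regulated_identifier: bool = False
    cross_domain_conflict: bool = False
    high_failure_cost: bool = False


def review_reasons(p, confidence_floor=0.9):
    """Return every reason this proposal must go to a steward.

    An empty list means no rule forces human review.
    """
    reasons = []
    if p.confidence < confidence_floor:
        reasons.append("low_confidence")
    if p.regulated_identifier:
        reasons.append("regulated_identifier")
    if p.cross_domain_conflict:
        reasons.append("cross_domain_conflict")
    if p.high_failure_cost:
        reasons.append("high_failure_cost")
    return reasons


def route(p, confidence_floor=0.9):
    """Route to a steward if any reason applies, else mark auto-eligible."""
    if review_reasons(p, confidence_floor):
        return "steward_review"
    return "auto_eligible"


print(route(Proposal("crm.region", 0.98)))
print(review_reasons(Proposal("substance.unii", 0.99, regulated_identifier=True)))
```

Keeping the reasons as data means the review queue can be filtered and audited by cause, which a single boolean would not allow.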

Commitment 3

Provenance is semantic and queryable

PROV-O gives the vocabulary for who generated what, when, with which evidence, and under whose review. That turns auditability into a data model instead of an afterthought [12].
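
As an illustration, the who/what/when/evidence/review questions can be expressed with a handful of PROV-O terms. The sketch below uses only the standard library and emits N-Triples text; the `https://example.org/gov/` namespace and all identifiers are hypothetical, and modeling steward review as its own `prov:Activity` is one reasonable choice rather than a prescribed one.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"
EX = "https://example.org/gov/"  # hypothetical namespace for this sketch


def provenance_triples(assertion, run, agent, evidence, review, steward, generated_at):
    """Build PROV-O triples for one agent assertion and its steward review."""
    a, r, ag = EX + assertion, EX + run, EX + agent
    ev, rv, st = EX + evidence, EX + review, EX + steward
    return [
        (a, RDF_TYPE, PROV + "Entity"),
        (r, RDF_TYPE, PROV + "Activity"),
        (ag, RDF_TYPE, PROV + "SoftwareAgent"),
        (st, RDF_TYPE, PROV + "Person"),
        (a, PROV + "wasGeneratedBy", r),        # what produced the assertion
        (r, PROV + "used", ev),                 # on what evidence
        (r, PROV + "wasAssociatedWith", ag),    # which agent ran
        (a, PROV + "wasAttributedTo", ag),      # who is responsible
        (a, PROV + "generatedAtTime", f'"{generated_at}"^^<{XSD_DATETIME}>'),
        (rv, RDF_TYPE, PROV + "Activity"),
        (rv, PROV + "used", a),                 # the review examined the assertion
        (rv, PROV + "wasAssociatedWith", st),   # under whose review
    ]


def to_ntriples(triples):
    """Serialize triples; objects starting with a quote are literals."""
    lines = []
    for s, p, o in triples:
        obj = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


triples = provenance_triples(
    "assertion-1", "run-1", "agent-v1", "evidence-1",
    "review-1", "steward-1", "2026-05-01T10:00:00Z",
)
print(to_ntriples(triples))
```

In practice an RDF library and a triple store would replace the string handling; the point here is only that each audit question maps to a named property.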

The design principle: every assertion needs a who, why, when, and on-what-evidence. Every override becomes a reusable labeled signal. Every change to prompt, model, or tool surface becomes versioned configuration.

04 — Pipeline

The Pipeline, Concretely

The critical separation is between proposals and production. Agents write to staging. Policy and humans determine promotion.

agentic-governance-pipeline.txt
CATALOG / KNOWLEDGE GRAPH
  glossary | classifications | lineage | policies | owners
  provenance graph on every assertion

AGENT PROPOSALS (staging graph)
  - classification
  - glossary link
  - lineage edge
  - policy tag
  - confidence
  - citation

STEWARD ADJUDICATION
  approve | override | escalate

POLICY ENGINE
  - confidence floors
  - regulated-asset rules
  - domain ownership rules
  - audit policies

SOURCE METADATA
  cataloged systems | lineage events | samples | glossary | ontologies

Boundary 1

Proposals do not write directly to production

Automatic promotion is allowed only when confidence clears policy thresholds and the asset is not in a protected class.

Boundary 2

The policy engine sits upstream

The system decides which tasks the agent may attempt before the agent runs, not after it has already acted.
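
A minimal sketch of an upstream gate, assuming a hypothetical table that maps asset classes to the tasks an agent may attempt. The class names, task names, and `run_agent_task` wrapper are illustrative; the only property being shown is that the check runs before the agent callable is invoked.

```python
# Hypothetical policy table: asset class -> tasks the agent may attempt.
ALLOWED_TASKS = {
    "general": {"classification", "glossary_link", "lineage_edge", "policy_tag"},
    "regulated_identifier": {"glossary_link"},
    "gxp_record": set(),  # agents may not attempt anything here
}


class TaskNotPermitted(Exception):
    """Raised when policy forbids the agent from attempting a task."""


def authorize(task, asset_class):
    """Raise unless policy permits this task; unknown classes are denied."""
    permitted = ALLOWED_TASKS.get(asset_class, set())
    if task not in permitted:
        raise TaskNotPermitted(f"{task!r} is not permitted on {asset_class!r}")


def run_agent_task(task, asset_class, agent_fn):
    """Check policy first, then run the agent; a denied task never starts."""
    authorize(task, asset_class)
    return agent_fn()


result = run_agent_task("classification", "general", lambda: "proposal-123")
print(result)
```

Denying unknown asset classes by default keeps a newly onboarded domain out of scope until someone adds it to the table deliberately.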

Boundary 3

Provenance is its own first-class store

You need one-query answers to questions like which model version touched a regulated substance during a specific period.
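
To make the one-query requirement concrete, here is a sketch using an in-memory SQLite table as a relational stand-in for the provenance store. The table name, columns, and rows are invented for illustration; in a PROV-O triple store the same question would be a SPARQL query rather than SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE assertion_provenance (
           assertion_id TEXT, asset TEXT, asset_class TEXT,
           model_version TEXT, generated_at TEXT)"""
)
# Illustrative rows only; timestamps are ISO 8601 in UTC so that string
# comparison matches chronological order.
rows = [
    ("a1", "substance.inn_name", "regulated_substance", "model-v1", "2026-01-15T09:00:00Z"),
    ("a2", "substance.cas_number", "regulated_substance", "model-v2", "2026-02-03T14:30:00Z"),
    ("a3", "crm.region", "general", "model-v2", "2026-02-10T08:00:00Z"),
    ("a4", "substance.unii", "regulated_substance", "model-v2", "2026-04-01T11:00:00Z"),
]
conn.executemany("INSERT INTO assertion_provenance VALUES (?, ?, ?, ?, ?)", rows)


def models_touching(conn, asset_class, start, end):
    """Return model versions that generated assertions on an asset class
    within the half-open interval [start, end)."""
    cur = conn.execute(
        "SELECT DISTINCT model_version FROM assertion_provenance "
        "WHERE asset_class = ? AND generated_at >= ? AND generated_at < ? "
        "ORDER BY model_version",
        (asset_class, start, end),
    )
    return [row[0] for row in cur.fetchall()]


print(models_touching(conn, "regulated_substance",
                      "2026-01-01T00:00:00Z", "2026-03-01T00:00:00Z"))
```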

05 — Vendor Reality

What the Vendor Tools Actually Bring

The useful evaluation question is not whether a catalog vendor has AI. It is whether the AI’s behavior can be governed to pharma-grade standards.

Collibra

AI Governance as control plane

Positioned around use-case registration, risk controls, and auditability across the AI lifecycle [13].

Atlan

Metadata-first AI governance

The catalog remains the substrate; AI is treated as both consumer and contributor of active metadata [14].

Alation

Embedded curation assistance

ALLIE AI focuses on catalog curation, glossary generation, and stewardship workflows inside the existing platform [15].

Informatica

CLAIRE as agent-capable layer

Metadata understanding, classification, and rule recommendations framed inside the broader IDMC platform [16].

The buying question: where does the agent write, what confidence and provenance accompany each assertion, and how is the policy engine enforced for regulated assets? Product marketing is secondary to those controls.

06 — Regulation

The Regulatory Frame Has Already Closed

In pharma, oversight, audit trails, validation, and AI risk management are design inputs. They are not optional add-ons.

EU AI Act

Article 14 human oversight

High-risk AI systems must be built so natural persons can understand capabilities and limits, interpret outputs, override decisions, and stop the system. The production pattern maps directly to that requirement [17].

GxP Controls

21 CFR Part 11 and Annex 11

Electronic records rules demand audit trails, accountability, and validated computerized systems. Agentic curation touching GxP-adjacent data needs ALCOA+ provenance by construction [18, 19].

Validation

GAMP 5 Second Edition

The current practical validation framework for AI-enabled computerized systems in regulated pharma, especially where iterative change and service providers are involved [20].

Risk Management

NIST AI RMF + GenAI Profile

Govern, Map, Measure, and Manage provide the cleanest non-binding structure for organizing explainability, accountability, resilience, and generative-AI-specific risks [21, 22].

Architectural conclusion: there is no plausible 2026 deployment of agentic curation in a pharma data office that survives audit without Article-14-grade oversight, ALCOA+ evidence trails, GAMP-grade validation, and AI-RMF-aligned governance documentation.

07 — Standards

The Standards Stack Underneath

Production governance needs both semantic standards and operating-model standards. One set defines the facts; the other defines how the organization controls them.

Semantic Provenance

W3C PROV-O

The only standard in this stack that directly makes who, what, when, and why queryable across agent runs, catalog assertions, and steward review [12].

Entity · Activity · Agent

Governance Vocabulary

DAMA-DMBOK 2 + ISO data governance

DMBOK, ISO 8000-1, and ISO/IEC 38505-1 give shared language for data quality and governance structure across functions and control forums [23, 24, 25].

Lineage

OpenLineage + Marquez

Useful for pipeline-level events and run metadata. Combined with PROV-O, it gives both operational lineage and semantic accountability [26, 27].

Cloud Controls

EDM Council CDMC

Provides a practical scoring rubric for whether sensitive data in cloud and hybrid-cloud environments is governed tightly enough for agentic curation to operate safely [28].

Capability Maturity

EDM Council DCAM

Useful for determining whether a domain is mature enough to adopt agent-augmented governance before the pilot starts, rather than after it fails [29].

08 — Pharma Signals

What the Pharma Exemplars Actually Show — and Don’t

The strongest public signal is that pharma is treating agentic governance as a standards-and-architecture problem. What is still missing is a mature peer-reviewed case study of full production-scale LLM agents running governance loops.

Pistoia Alliance

Agentic AI initiative

The clearest cross-industry sign that pharma sees agentic AI as a standards problem, not just a vendor feature race [30].

AstraZeneca

Knowledge graphs as R&D infrastructure

AZ publicly describes graph-based semantic infrastructure across genomic, disease, drug, clinical, and safety data, which is the closest public analogue to the substrate needed for governed agentic curation [31, 32].

Current gap

No mature public benchmark yet

What the public record still lacks is a full-scale, peer-reviewed pharma deployment where LLM agents own governance control flow in production at enterprise scale.

The practical reading: the architecture is established, the controls are specified, the vendors are converging, and the first exemplars are imminent rather than historical. Teams building now are likely to define the reference pattern others adopt later.

09 — Operating Model

The Operating Model Decisions Worth Getting Right

Most pilot failures come from weak operating decisions, not weak model capability.

Configuration Control

Treat prompts and tools as regulated configuration

Prompt changes, tool additions, and model upgrades should all trigger review, versioning, and regression checks against steward-labeled data.
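
A sketch of what "regulated configuration" could mean mechanically: fingerprint the prompt, model identifier, and tool list together, and gate any new fingerprint behind a regression check on steward-labeled examples. The function names, the accuracy bar, and the toy predictor are assumptions made for illustration.

```python
import hashlib
import json


def config_fingerprint(prompt, model, tools):
    """Hash prompt, model id, and tool names into one version identifier.

    Tools are sorted so that reordering alone does not create a new version.
    """
    payload = json.dumps(
        {"prompt": prompt, "model": model, "tools": sorted(tools)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def regression_pass(predict, labeled_set, min_accuracy):
    """Return True if predict() meets the accuracy bar on labeled examples."""
    correct = sum(predict(x) == y for x, y in labeled_set)
    return correct / len(labeled_set) >= min_accuracy


old = config_fingerprint("Classify the column.", "model-v1", ["glossary_lookup", "sample_rows"])
new = config_fingerprint("Classify the column. Cite evidence.", "model-v1", ["glossary_lookup", "sample_rows"])
print(old != new)  # a prompt edit produces a new version to review

# Toy stand-in for the agent under test and its steward-labeled set.
labeled = [("patient_dob", "PII"), ("batch_id", "Identifier"), ("cost_center", "Finance")]
toy_predict = lambda name: "PII" if "patient" in name else "Identifier"
print(regression_pass(toy_predict, labeled, min_accuracy=0.6))
```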

Human Capacity

Fund the steward side of the loop

Agentic curation increases proposal volume. Without enough steward review capacity, the queue still grows, just faster and with more machine-generated work.

Policy Visibility

Make the policy engine human-readable

Rules like “regulated identifiers always require review” should be expressible in declarative policy, not buried in prompt text.
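
A minimal sketch of rules held as data rather than prompt text, assuming a tiny hypothetical rule format with one comparison key (`confidence_below`) and equality matching for everything else. A real deployment would more likely use an established policy language; the rule ids and fields here are invented.

```python
# Hypothetical declarative rules: each one names the condition under which
# a proposal must be reviewed by a steward.
REVIEW_RULES = [
    {"id": "regulated-identifiers-always-review",
     "when": {"asset_class": "regulated_identifier"}},
    {"id": "confidence-floor",
     "when": {"confidence_below": 0.90}},
    {"id": "cross-domain-conflict",
     "when": {"cross_domain_conflict": True}},
]


def rule_matches(when, proposal):
    """True if every condition in the rule holds for the proposal."""
    for key, expected in when.items():
        if key == "confidence_below":
            if not proposal["confidence"] < expected:
                return False
        elif proposal.get(key) != expected:
            return False
    return True


def fired_rules(proposal, rules=REVIEW_RULES):
    """Return ids of all rules requiring review; empty means none fired."""
    return [r["id"] for r in rules if rule_matches(r["when"], proposal)]


print(fired_rules({"asset_class": "regulated_identifier", "confidence": 0.99}))
```

Because the rules are plain data, a steward or auditor can read, diff, and version them without touching the agent's prompt.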

Measurement

Calibrate before you trust

Reliability diagrams, selective accuracy, and expected calibration error on the real workload matter more than generic benchmark confidence claims.
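
For reference, expected calibration error with equal-width bins can be computed in a few lines. This is a standard-library sketch of the usual binned estimator; the bin count and the sample values are arbitrary, and the per-bin accuracy and mean confidence it computes are the same quantities a reliability diagram plots.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - mean confidence| per bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Four proposals all reported at 0.95 confidence, three of them correct:
# the agent is overconfident by roughly 0.2 in that bin.
print(expected_calibration_error([0.95, 0.95, 0.95, 0.95], [True, True, True, False]))
```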

Learning Loop

Treat human override as signal, not noise

Every steward override should feed the next cycle as labeled evidence. Otherwise the system repeats the same failure modes every quarter.
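
A small sketch of turning adjudication outcomes into labeled evidence, assuming review records with a decision of approve, override, or escalate. The `Review` type and the tuple layout are hypothetical; escalations are skipped here because they carry no settled label yet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Review:
    asset: str
    proposed: str                # label the agent proposed
    decision: str                # "approve" | "override" | "escalate"
    final: Optional[str] = None  # steward's label when overriding


def labeled_examples(reviews):
    """Convert reviews into (asset, gold_label, agent_was_correct) tuples."""
    examples = []
    for r in reviews:
        if r.decision == "approve":
            examples.append((r.asset, r.proposed, True))
        elif r.decision == "override" and r.final is not None:
            examples.append((r.asset, r.final, False))
        # Escalated or incomplete reviews have no settled label; skip them.
    return examples


reviews = [
    Review("crm.patient_dob", "PII", "approve"),
    Review("erp.cost_center", "PII", "override", final="Finance"),
    Review("ctms.site_code", "Identifier", "escalate"),
]
print(labeled_examples(reviews))
```

The resulting tuples can feed the same regression and calibration checks used before a configuration change is promoted.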

10 — The Next Two Years

Where the Next Two Years Go

Three forces are converging quickly enough to make 2026–2027 the inflection window for agent-augmented governance in pharma.

Regulatory deadlines

Oversight becomes enforceable

Article 14 timing, Annex 11 direction, and modernized GxP expectations make retrofit a weaker option every quarter.

Vendor maturity

Product capability catches up

The conversation is shifting from “does it have an agent?” to “can it be governed to GxP standards?” That is the right inflection point.

Industry coordination

Standards work opens up

Pistoia Alliance’s initiative creates a focal point for vocabularies and controls that vendors and pharmas will likely converge around.

The architectural conclusion: the governance layer of the next pharma data stack is an agent-augmented stewardship loop with semantic provenance, policy-gated autonomy, and human review designed in from the start. The opportunity is not replacing governance. It is giving governance the leverage it has been missing.

Closing Thought

Agents Propose. Stewards Adjudicate. Provenance Binds Them.

Data governance has always been finite humans facing effectively infinite work. Catalog tooling, stewardship communities, and glossary programs improved the surface, but they did not close the judgment gap. The backlog curve kept climbing.

What changes the curve is not an agent pretending to be a steward. It is an agent that knows when to abstain, surfaces evidence when it does act, and operates inside a policy and provenance framework the auditor can actually inspect. That system is buildable now with stable W3C standards, current governance frameworks, and vendor platforms that are finally approaching the right control surface.

Agents propose. Stewards adjudicate. Provenance binds them.

References


  1. Schluntz, E., Zhang, B. Building effective agents. Anthropic Engineering, 19 December 2024. anthropic.com
  2. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., Cao, Y. ReAct: Synergizing Reasoning and Acting in Language Models. ICLR 2023. arxiv.org
  3. Schick, T., Dwivedi-Yu, J., Dessì, R., et al. Toolformer: Language Models Can Teach Themselves to Use Tools. NeurIPS 2023. arxiv.org
  4. Shinn, N., Cassano, F., Berman, E., et al. Reflexion: Language Agents with Verbal Reinforcement Learning. NeurIPS 2023. arxiv.org
  5. Wang, L., Ma, C., Feng, X., et al. A Survey on Large Language Model based Autonomous Agents. Frontiers of Computer Science, 2024. arxiv.org
  6. Weng, L. LLM Powered Autonomous Agents. Lil'Log, 23 June 2023. lilianweng.github.io
  7. Geifman, Y., El-Yaniv, R. Selective Classification for Deep Neural Networks. NeurIPS 2017. arxiv.org
  8. Kamath, A., Jia, R., Liang, P. Selective Question Answering under Domain Shift. ACL 2020. aclanthology.org
  9. Lin, S., Hilton, J., Evans, O. Teaching Models to Express Their Uncertainty in Words. TMLR, 2022. arxiv.org
  10. Kadavath, S., Conerly, T., Askell, A., et al. Language Models (Mostly) Know What They Know. arxiv.org
  11. Settles, B. Active Learning Literature Survey. University of Wisconsin-Madison. burrsettles.com
  12. W3C. PROV-O: The PROV Ontology. W3C Recommendation, 30 April 2013. w3.org
  13. Collibra. AI Governance. collibra.com
  14. Atlan. AI Governance. atlan.com
  15. Alation. ALLIE AI. alation.com
  16. Informatica. CLAIRE AI. informatica.com
  17. European Union. Regulation (EU) 2024/1689. eur-lex.europa.eu
  18. U.S. FDA. 21 CFR Part 11: Electronic Records; Electronic Signatures. ecfr.gov
  19. European Commission. EudraLex Volume 4, Annex 11: Computerised Systems. ec.europa.eu
  20. ISPE. GAMP 5 Second Edition. ispe.org
  21. NIST. AI RMF 1.0. doi.org
  22. NIST. Generative Artificial Intelligence Profile. doi.org
  23. DAMA International. DAMA-DMBOK 2. dama.org
  24. ISO. ISO 8000-1:2022, Data quality, Part 1: Overview. iso.org
  25. ISO. ISO/IEC 38505-1:2017, Governance of data. iso.org
  26. OpenLineage. An Open Standard for Lineage Data Collection. openlineage.io
  27. Marquez Project. Marquez. marquezproject.ai
  28. EDM Council. Cloud Data Management Capabilities (CDMC). edmcouncil.org
  29. EDM Council. Data Management Capability Assessment Model (DCAM). edmcouncil.org
  30. Pistoia Alliance. Agentic AI initiative. pistoiaalliance.org
  31. AstraZeneca. Data Science & AI. astrazeneca.com
  32. Graphwise. AstraZeneca: Enabling new medicines through semantic knowledge graphs. graphwise.ai