Ali Shahmohammadi, Ph.D. — Director, Data Governance & AI Strategy

01 — About

Data Governance Leader.
Knowledge Graph Builder.
Agentic AI Architect.

I’m a Chemical Engineer turned digital leader with a Ph.D. from Queen’s University and 10+ years bridging pharmaceutical R&D, data science, and enterprise AI.

I lead FAIR Data Strategy and Digital Connectivity at Takeda Pharmaceutical — defining master data, ontology, and data catalog strategy across R&D. I’ve deployed production Agentic AI systems for ontology curation, built knowledge-graph-aligned data models for cell therapy, and own FAIR Studio, an industry-alliance product (Pistoia Alliance) that operationalizes FAIR assessment for pharmaceutical R&D organizations.

My focus has evolved from process modeling and digital twin development into AI strategy and data governance — building the enterprise-grade data foundations that make AI trustworthy in regulated environments. That means master data management, semantic layers, governance frameworks, and the Agentic AI systems that keep them current and auditable at scale.

LinkedIn · GitHub · Message on LinkedIn · Resume

AI Strategy & Data Governance

Governed AI Pipelines

LangGraph · Multi-Agent Systems · RAG · AI Evaluation

Agentic Curator · Human-in-the-Loop

Master Data & Knowledge Graph

MDM · Ontology Engineering · SPARQL · Neo4j · AWS Neptune

Data Catalog & Governance

FAIR Studio · Data Lineage · DCAT · ISO/IEC 23894 · AI Governance

Source Systems

ELN · LIMS · MES · Clinical Data · Compound Registration

02 — Experience

Career

Three roles across pharma R&D, biotech, and academic research.

Takeda

Sep 2022 – Present

Associate Director, FAIR Data Strategy & Digital Connectivity

Takeda Pharmaceutical Inc. · Greater Boston, MA

Lead enterprise FAIR data strategy and governance roadmap across R&D data domains
Designed and deployed Agentic AI system for ontology curation — accelerating cycle times with human-in-the-loop governance
Architect next-gen R&D data catalog with automated quality checks, lineage, and metadata management
Built digital twin for continuous manufacturing of a small molecule API integrating unit-operation mechanistic models
Led cell therapy workflow automation connecting lab instruments, AWS, LabKey, and JMP
Delivered AI/ML model for product-level CO₂ emissions in support of Takeda’s sustainability strategy
Built strategic partnerships with MIT, BYU, Brown, and Purdue on PINNs and in-silico development

Moderna

Aug 2021 – Sep 2022

Senior CMC Scientist

Moderna Inc. · Greater Boston, MA

Developed ML models for mRNA drug substance/product stability and shelf-life prediction (ICH-compliant)
Led comparability and product specification projects through process scale-up phases
Optimized IVT reaction for mRNA process characterization using ML + fundamental modeling
Implemented SPC strategies for raw materials, drug products, and drug substances
Contributed to IND and BLA submissions through statistical analysis and documentation

UT Austin

Oct 2019 – Aug 2021

Post-Doctoral Research Fellow

The University of Texas at Austin · Austin, TX

Developed fundamental models for thin film gallium phosphate on silicon for plasma etch optimization
Built model-based DoE tools using R Shiny for plasma etch process optimization
Created first-principles model for viscoelastic properties of adhesive soft particles
Led process engineering team building mathematical models for gelation time prediction
Performed Molecular Dynamics Simulations with 100,000+ particles

03 — Open Source

Projects

Public repositories focused on scientific ML, drug discovery, and pharmaceutical AI.

Python Jupyter Featured

OntoCurator Agent

A 4-agent LangGraph pipeline that autonomously curates biomedical ontology terms from pharmaceutical R&D documents. Extracts named entities with GPT-5.2, maps them across 10 ontologies (BioPortal + EBI OLS4), detects conflicts, and routes governance decisions to named domain stewards.

LangGraph · GPT-5.2 · Pydantic v2 · BioPortal API

Project Page · Case Study · GitHub
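
The four stages described above (extract, map, detect conflicts, route) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual code: it does not use LangGraph, an LLM, or the BioPortal and OLS4 APIs. The lookup table, ontology names, IDs, queue names, and the conflict rule (candidate labels disagree across ontologies) are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for ontology lookups; names, IDs, and labels are made up.
TOY_HITS = {
    "t cell": [("ONT_A", "A:001", "T cell"), ("ONT_B", "B:104", "T cell")],
    "viability": [("ONT_A", "A:207", "cell viability"),
                  ("ONT_B", "B:310", "viability assay")],
}
KNOWN_TERMS = tuple(TOY_HITS)  # a real extractor would call a model instead


@dataclass
class CurationState:
    """Shared state passed from one stage to the next."""
    text: str
    entities: list = field(default_factory=list)
    mappings: dict = field(default_factory=dict)
    conflicts: list = field(default_factory=list)
    routed: dict = field(default_factory=dict)


def extract(state):
    # Stage 1: naive substring matching in place of named-entity extraction.
    lowered = state.text.lower()
    state.entities = [term for term in KNOWN_TERMS if term in lowered]
    return state


def map_terms(state):
    # Stage 2: attach candidate ontology hits to each extracted entity.
    state.mappings = {e: TOY_HITS.get(e, []) for e in state.entities}
    return state


def detect_conflicts(state):
    # Stage 3 (assumed rule): flag an entity when candidate labels disagree.
    state.conflicts = [
        entity for entity, hits in state.mappings.items()
        if len({label.lower() for _, _, label in hits}) > 1
    ]
    return state


def route(state):
    # Stage 4: every entity goes to a human queue; conflicts get their own.
    for entity in state.entities:
        queue = "conflict_review" if entity in state.conflicts else "routine_review"
        state.routed[entity] = queue
    return state


def run_pipeline(text):
    state = CurationState(text=text)
    for step in (extract, map_terms, detect_conflicts, route):
        state = step(state)
    return state


result = run_pipeline("T cell viability was measured after thaw.")
print(result.routed)
```

Passing one state object through a fixed sequence of functions mirrors the state-machine style of an agent graph while keeping each stage testable on its own.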

Python Jupyter Featured

FAIR Data Toolkit

Python package and 5-article series implementing the RDA FAIR Maturity Model (41 indicators) and Pistoia Alliance FAIR Maturity Matrix (L0–L5 × 7 dimensions) for pharmaceutical R&D. Includes manual assessment, gap analysis, and remediation roadmaps.

Python · Pydantic v2 · RDA FAIR · Pistoia Alliance

Project Page · Case Study · GitHub
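
The gap-analysis step mentioned above can be illustrated with a short sketch. It assumes maturity is scored per dimension on the L0–L5 scale and that a gap is simply target level minus current level. The dimension names below are placeholders rather than the matrix's real dimension names, and the function is not the toolkit's API.

```python
LEVELS = range(0, 6)  # L0 through L5


def gap_analysis(current, target):
    """Return (dimension, current, target, gap) rows, largest gap first."""
    if current.keys() != target.keys():
        raise ValueError("current and target must cover the same dimensions")
    rows = []
    for dim in current:
        cur, tgt = current[dim], target[dim]
        if cur not in LEVELS or tgt not in LEVELS:
            raise ValueError(f"{dim}: levels must be integers from 0 to 5")
        # A dimension already at or above target has no gap to close.
        rows.append((dim, cur, tgt, max(tgt - cur, 0)))
    # sorted() is stable, so ties keep their original dimension order.
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Placeholder dimension names and made-up scores, for illustration only.
current = {f"dimension_{i}": lvl
           for i, lvl in enumerate([1, 2, 0, 3, 2, 1, 4], start=1)}
target = {dim: 3 for dim in current}

rows = gap_analysis(current, target)
for dim, cur, tgt, gap in rows:
    print(f"{dim}: L{cur} -> L{tgt} (gap {gap})")
```

Sorting by gap gives a first-pass remediation order; a real roadmap would also weigh effort and business priority.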

Python Jupyter

ScientificML

PINNs, data-driven dynamics, hybrid physics-ML models, and uncertainty quantification for biopharmaceutical process development.

PyTorch · JAX · NumPy · SciPy

GitHub

Python Jupyter

Chemoinformatics

AI-powered drug discovery: molecular property prediction, SMILES encoding, and graph neural networks for molecular design.

RDKit · DeepChem · PyTorch Geometric

GitHub

Python

NLP to BERT

NLP fundamentals to fine-tuned BERT models on scientific text, with applications in pharmaceutical literature mining.

HuggingFace · Transformers · PyTorch

GitHub

Jupyter

PDF Querying

LangChain-powered system for summarizing and querying multiple PDFs — applicable to regulatory document analysis.

LangChain · OpenAI · FAISS

GitHub

04 — Recognition

Recognition & Industry Presence

Industry alliances, academic partnerships, and open-source contributions to the pharmaceutical data governance community.

Industry Alliance

🏛️

FAIR Studio — Pistoia Alliance

Product owner and driving contributor for FAIR Studio, an industry-wide platform developed under the Pistoia Alliance for operationalizing FAIR data assessments at scale. Member pharma R&D organizations use it to embed governance into digital workflows.

FAIR Governance Industry Alliance Data Product Owner

Academic Partnerships

🎓

Academic Research Collaborations

Forged strategic research partnerships with MIT, BYU, Brown University, and Purdue University to advance Physics-Informed Neural Networks (PINNs), mechanistic modeling, and in-silico–first development capabilities for pharmaceutical R&D.

MIT · BYU · Brown · Purdue · PINNs Research

Open Source

🔬

Open-Source Contributions

Author and maintainer of open-source tools for the pharma AI community: OntoCurator Agent (multi-agent ontology curation), FAIR Data Toolkit (FAIR maturity assessment with semi-automated scoring), and ScientificML (PINNs and mechanistic modeling).

OntoCurator · FAIR Toolkit · ScientificML

05 — Skills

Expertise

Four pillars spanning data governance, AI, engineering science, and cloud infrastructure.

Data Strategy & Governance

Master Data Management · Knowledge Graphs · Ontology Engineering · Data Catalogs · Data Lineage · Data Stewardship · FAIR Maturity Assessment · AI Governance · ISO/IEC 23894 · Semantic Interoperability · SQL · Statistical Process Control

AI & Machine Learning

Agentic AI · LangGraph · Multi-Agent Systems · LLMs & RAG · AI Evaluation · Physics-Informed Neural Networks · PyTorch · JAX · Graph Neural Networks · Scikit-learn

Engineering & Science

Digital CMC · Process Modeling · Mechanistic Models · Design of Experiments · Continuous Manufacturing · Molecular Dynamics · Cheminformatics

Infrastructure & Cloud

Python · R · AWS (S3, EC2, Redshift, Neptune) · Neo4j · SPARQL · Apache Airflow · dbt · Terraform · Docker · GitHub Actions · Git

06 — Writing

Articles & Blog

Thoughts on master data, knowledge graphs, FAIR data, Agentic AI, and pharmaceutical R&D data strategy.

MDM · Knowledge Graphs May 2026

Published 25 min read

Product Mastering Across the R&D-to-Commercial Lifecycle: Why Knowledge Graphs Beat Traditional MDM

A practitioner’s view on why the row-and-column gold record breaks down for medicinal products — and what to build instead. Covers ISO IDMP, IDMP-O, bitemporal graphs, agentic entity resolution, and a deployable reference architecture.

Topics: MDM vs KG · ISO IDMP · IDMP-O (Pistoia Alliance) · ChEBI · RxNorm · Agentic entity resolution · Bitemporal data · Reference architecture · 24 references

Read Article

MDM · Semantic Layers May 2026

Published 30 min read

From Ontology to MDM: How Semantic Layers Are Replacing Hub-and-Spoke Master Data Architectures in Pharma

Why the next generation of pharma data architecture treats meaning as a first-class artifact. Covers OWL, SHACL, SKOS, R2RML, data mesh federation, production exemplars from Bayer COLID and Roche EDIS, and a five-step migration pattern for moving your hub to a read model.

Topics: Semantic layer vs MDM · OWL / SHACL / SKOS / R2RML · Data mesh · Bayer COLID · Roche EDIS · EMA eAF 2026 · Migration pattern · 24 references

Read Article

Agentic AI · Data Governance May 2026

Published 22 min read

Agentic AI for Data Governance: A Production Pattern for First-Pass Curation with Human-in-the-Loop

How to design an audit-ready operating model where agents make bounded governance proposals, human stewards adjudicate the risky residue, and PROV-O provenance records every decision. Covers Article 14 oversight, GAMP 5 v2, NIST AI RMF, and a deployable curation pipeline.

Topics: Human-in-the-loop governance · PROV-O provenance · EU AI Act Article 14 · GAMP 5 v2 · NIST AI RMF · OpenLineage · Stewardship operating model · 32 references

Read Article
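
As an illustration of what recording a decision with PROV-O could look like, the sketch below builds a JSON-LD-style record for one adjudicated proposal using only the standard library. The `prov:` terms are real PROV-O vocabulary; the `urn:example:` identifiers, the `ex:outcome` property, and the `decision_record` function are hypothetical and not taken from the article.

```python
import json
from datetime import datetime, timezone

PROV = "http://www.w3.org/ns/prov#"  # W3C PROV-O namespace


def decision_record(proposal_id, agent_id, steward_id, outcome, decided_at):
    """Build a provenance record linking proposal, adjudication, and decision."""
    proposal = f"urn:example:proposal:{proposal_id}"
    activity = f"urn:example:adjudication:{proposal_id}"
    return {
        "@context": {"prov": PROV, "ex": "urn:example:"},
        "@graph": [
            # The agent's bounded proposal, attributed to the software agent.
            {"@id": proposal, "@type": "prov:Entity",
             "prov:wasAttributedTo": {"@id": f"urn:example:agent:{agent_id}"}},
            # The human adjudication step that consumed the proposal.
            {"@id": activity, "@type": "prov:Activity",
             "prov:used": {"@id": proposal},
             "prov:wasAssociatedWith": {"@id": f"urn:example:steward:{steward_id}"},
             "prov:endedAtTime": decided_at.isoformat()},
            # The resulting decision, generated by the adjudication.
            {"@id": f"urn:example:decision:{proposal_id}", "@type": "prov:Entity",
             "prov:wasGeneratedBy": {"@id": activity},
             "ex:outcome": outcome},
        ],
    }


record = decision_record(
    proposal_id="p-001",
    agent_id="mapper",
    steward_id="steward-01",
    outcome="approved",
    decided_at=datetime(2026, 1, 15, 9, 30, tzinfo=timezone.utc),
)
print(json.dumps(record, indent=2))
```

Keeping the proposal, the adjudication activity, and the decision as separate nodes lets an auditor trace who proposed, who decided, and when.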

AI Governance · ISO Standards May 2026

Published 28 min read

AI Governance Without the Theater: Operationalizing ISO/IEC 23894 in Pharmaceutical R&D

A practical implementation pattern for pharma AI risk that connects ISO/IEC 23894 to ICH Q9(R1), ISO/IEC 42001, NIST AI RMF crosswalks, and FAIR plus ontology curation controls. Focuses on evidence-ready operations over governance theater.

Topics: ISO/IEC 23894 · ISO/IEC 42001 · ICH Q9(R1) · FDA Jan 2025 draft · EMA NDSG · PIC/S Annex 22 · FAIR infrastructure · OBO ontology governance · 31 references

Read Article

Data Catalog · FAIR Maturity May 2026

Published 24 min read

A 90-Day Maturity Assessment for Pharmaceutical R&D Data Catalogs

A practical 90-day playbook for data leaders to baseline FAIR maturity, identify one high-leverage quick win, and ship a defensible 12-month roadmap before the first quarterly review.

Topics: 30-30-30 assessment model · FAIR maturity indicators · Pistoia FAIR matrix · Stewardship mapping · Quick-win design · DCAM alignment · Pharma regulatory overlay · 13 references

Read Article

Agentic AI May 2026

Published 14 min read

Building Agentic AI Systems for Ontology Curation in Drug R&D

How a four-agent LangGraph pipeline automates biomedical term extraction, ontology mapping, conflict detection, and governance routing — while preserving human oversight at every critical decision point.

Topics: LangGraph state machine · BioPortal & OLS4 · Conflict detection · Human-in-the-loop governance · Pydantic v2 · Information Content ranking

Read Article · GitHub Repo

Scientific ML Coming Soon

Physics-Informed Neural Networks for Continuous Manufacturing

A practical guide to embedding physical laws into neural network architectures for process development in pharmaceutical continuous manufacturing.

Article in Progress

FAIR Data · 5-Part Series May 2026

Published ~45 min · 5 Jupyter notebooks

What FAIR Data Actually Means for AI-Ready Pharmaceutical R&D

A practitioner’s deep-dive into FAIR data maturity frameworks. Combines the RDA FAIR Maturity Model (41 indicators) and Pistoia Alliance FAIR Maturity Matrix (L0–L5 × 7 dimensions) with a complete walkthrough on a CAR-T viability dataset and a semi-automated Python scorer.

Topics: RDA 41 indicators · Pistoia FMM v1.1 · Manual assessment · Gap analysis · Semi-automated scoring · Agentic scorer architecture

Read the Series · Notebooks on GitHub

FAIR Data · Data Governance May 2026

Published 18 min read

FAIR Data Principles: A Practical Guide to Making Your Data Findable, Accessible, Interoperable, and Reusable

A decade after their publication, the FAIR principles have quietly become the backbone of modern data strategy. What they actually mean, why they matter more than ever in the age of AI, and how to put them to work—with a 5-step starter plan and a full ecosystem map.

Topics: FAIR sub-principles · Machine-actionability · Persistent identifiers · Ontologies & metadata standards · CARE Principles · FAIR for AI · 22 key references

Read Article

Let’s Work
Together.