Strategic Digital Leader  ·  Pharmaceutical R&D

Ali Shahmohammadi, Ph.D.

Director-level data leader building enterprise master data, ontology, and Agentic AI governance systems for regulated R&D. From FAIR strategy and knowledge graphs to AI-ready data products across the discovery-to-commercial lifecycle.

Master Data Management · Knowledge Graphs · Agentic AI · FAIR Governance · AI-Ready Data · Pharma R&D · Data Products
Governance architecture (diagram): Agentic Curator · LangGraph Multi-Agent · Human-in-the-Loop
  • Source Systems: ELN · LIMS · MES
  • Ontology Service: BioPortal · OLS4
  • Master Data Graph: Neptune · Neo4j
  • Data Catalog: FAIR Studio · DCAT
  • Governed AI Pipelines: LangGraph · RAG
15+
Years across pharma R&D, data science & enterprise AI
12+
Production data systems deployed — governance, MDM, catalog, AI
1
Industry-alliance product owned — FAIR Studio, Pistoia Alliance
Ph.D.
Chemical Process Engineering, Queen’s University
01 — About

Data Governance Leader.
Knowledge Graph Builder.
Agentic AI Architect.

I’m a Chemical Engineer turned digital leader with a Ph.D. from Queen’s University and 10+ years bridging pharmaceutical R&D, data science, and enterprise AI.

I lead FAIR Data Strategy and Digital Connectivity at Takeda Pharmaceutical — defining master data, ontology, and data catalog strategy across R&D. I’ve deployed production Agentic AI systems for ontology curation, built knowledge-graph-aligned data models for cell therapy, and own FAIR Studio, an industry-alliance product (Pistoia Alliance) that operationalizes FAIR assessment for pharmaceutical R&D organizations.

My focus has evolved from process modeling and digital twin development into AI strategy and data governance — building the enterprise-grade data foundations that make AI trustworthy in regulated environments. That means master data management, semantic layers, governance frameworks, and the Agentic AI systems that keep them current and auditable at scale.

AI Strategy & Data Governance
  • Governed AI Pipelines: LangGraph · Multi-Agent Systems · RAG · AI Evaluation · Agentic Curator · Human-in-the-Loop
  • Master Data & Knowledge Graph: MDM · Ontology Engineering · SPARQL · Neo4j · AWS Neptune
  • Data Catalog & Governance: FAIR Studio · Data Lineage · DCAT · ISO/IEC 23894 · AI Governance
  • Source Systems: ELN · LIMS · MES · Clinical Data · Compound Registration
02 — Experience

Career

Three roles across pharma R&D, biotech, and academic research.

Takeda
Sep 2022 – Present
Associate Director, FAIR Data Strategy & Digital Connectivity
Takeda Pharmaceutical Inc. · Greater Boston, MA
  • Lead enterprise FAIR data strategy and governance roadmap across R&D data domains
  • Designed and deployed Agentic AI system for ontology curation — accelerating cycle times with human-in-the-loop governance
  • Architect next-gen R&D data catalog with automated quality checks, lineage, and metadata management
  • Built digital twin for continuous manufacturing of a small molecule API integrating unit-operation mechanistic models
  • Led cell therapy workflow automation connecting lab instruments, AWS, LabKey, and JMP
  • Delivered AI/ML model for product-level CO₂ emissions in support of Takeda’s sustainability strategy
  • Built strategic partnerships with MIT, BYU, Brown, and Purdue on PINNs and in-silico development
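
The human-in-the-loop ontology curation pattern mentioned in the bullets above can be illustrated with a minimal sketch. This is not the production system and does not use LangGraph; it is plain Python, and every name in it (`Proposal`, `CurationQueue`, the 0.9 threshold, the `EX:0001` identifier) is a hypothetical chosen for illustration. The idea it shows: agent proposals above a confidence threshold are approved automatically, everything else waits for a human steward, and every decision is written to an audit log.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    term: str          # free-text term found in a source system
    ontology_id: str   # candidate ontology identifier suggested by an agent
    confidence: float  # agent's self-reported score in [0, 1]


@dataclass
class CurationQueue:
    auto_threshold: float = 0.9  # assumed cut-off, not a recommended value
    approved: list = field(default_factory=list)
    needs_review: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def submit(self, proposal: Proposal) -> str:
        """Route a proposal: auto-approve if confident, else queue for a human."""
        if proposal.confidence >= self.auto_threshold:
            self.approved.append(proposal)
            decision = "auto-approved"
        else:
            self.needs_review.append(proposal)
            decision = "queued for review"
        self.audit_log.append((proposal.term, proposal.ontology_id, decision))
        return decision

    def review(self, term: str, accept: bool, reviewer: str) -> None:
        """Record a human decision on a queued proposal."""
        for proposal in list(self.needs_review):
            if proposal.term == term:
                self.needs_review.remove(proposal)
                if accept:
                    self.approved.append(proposal)
                verdict = "accepted" if accept else "rejected"
                self.audit_log.append(
                    (proposal.term, proposal.ontology_id, f"{verdict} by {reviewer}")
                )
                return
        raise KeyError(f"no queued proposal for term {term!r}")


queue = CurationQueue()
queue.submit(Proposal("HEK293", "EX:0001", 0.95))     # confident mapping
queue.submit(Proposal("hek cells", "EX:0001", 0.60))  # ambiguous synonym
queue.review("hek cells", accept=True, reviewer="data steward")
```

Keeping the audit log append-only and separate from the approval lists is one simple way to make each mapping decision traceable after the fact.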
Moderna
Aug 2021 – Sep 2022
Senior CMC Scientist
Moderna Inc. · Greater Boston, MA
  • Developed ML models for mRNA drug substance/product stability and shelf-life prediction (ICH-compliant)
  • Led comparability and product specification projects through process scale-up phases
  • Optimized IVT reaction for mRNA process characterization using ML + fundamental modeling
  • Implemented SPC strategies for raw materials, drug products, and drug substances
  • Contributed to IND and BLA submissions through statistical analysis and documentation
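
As a small illustration of the SPC work listed above, the sketch below computes control limits for an individuals (I-MR) chart, a common choice when each lot yields a single measurement. It is a generic textbook calculation, not the method used at Moderna; the function names and the sample data are hypothetical.

```python
from statistics import mean


def individuals_chart_limits(baseline):
    """Return (LCL, center line, UCL) for an individuals control chart.

    Limits are center +/- 2.66 * mean moving range, where 2.66 = 3 / d2
    and d2 = 1.128 is the constant for moving ranges of two points.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline points")
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def out_of_control(values, baseline):
    """Indices of new values that fall outside the baseline control limits."""
    lcl, _, ucl = individuals_chart_limits(baseline)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical raw-material assay results (arbitrary units).
baseline = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8]
flagged = out_of_control([10.1, 10.9, 9.2], baseline)
```

With this baseline the mean moving range is 0.2, so the limits sit roughly 0.53 units either side of the center line, and the second and third new values are flagged.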
UT Austin
Oct 2019 – Aug 2021
Post-Doctoral Research Fellow
The University of Texas at Austin · Austin, TX
  • Developed fundamental models for thin film gallium phosphate on silicon for plasma etch optimization
  • Built model-based DoE tools using R Shiny for plasma etch process optimization
  • Created first-principles model for viscoelastic properties of adhesive soft particles
  • Led process engineering team building mathematical models for gelation time prediction
  • Performed molecular dynamics simulations with 100,000+ particles
03 — Open Source

Projects

Public repositories focused on scientific ML, drug discovery, and pharmaceutical AI.

Python · Jupyter
ScientificML

PINNs, data-driven dynamics, hybrid physics-ML models, and uncertainty quantification for biopharmaceutical process development.

PyTorch · JAX · NumPy · SciPy
GitHub
Python · Jupyter
Chemoinformatics

AI-powered drug discovery: molecular property prediction, SMILES encoding, and graph neural networks for molecular design.

RDKit · DeepChem · PyTorch Geometric
GitHub
Python
NLP to BERT

NLP fundamentals to fine-tuned BERT models on scientific text, with applications in pharmaceutical literature mining.

HuggingFace · Transformers · PyTorch
GitHub
Jupyter
PDF Querying

LangChain-powered system for summarizing and querying multiple PDFs — applicable to regulatory document analysis.

LangChain · OpenAI · FAISS
GitHub
04 — Recognition

Recognition & Industry Presence

Industry alliances, academic partnerships, and open-source contributions to the pharmaceutical data governance community.

Academic Partnerships
🎓
Academic Research Collaborations

Forged strategic research partnerships with MIT, BYU, Brown University, and Purdue University to advance Physics-Informed Neural Networks (PINNs), mechanistic modeling, and in-silico–first development capabilities for pharmaceutical R&D.

MIT · BYU · Brown · Purdue · PINNs Research
Open Source
🔬
Open-Source Contributions

Author and maintainer of open-source tools for the pharma AI community: OntoCurator Agent (multi-agent ontology curation), FAIR Data Toolkit (automated FAIR assessment), and ScientificML (PINNs and mechanistic modeling).

OntoCurator · FAIR Toolkit · ScientificML
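
To give a flavor of what automated FAIR assessment can look like, here is a deliberately simplified sketch. It is not the logic of FAIR Studio or the FAIR Data Toolkit; the metadata keys, the mapping of keys to principles, and the scoring rule are all assumptions made for illustration. Each FAIR principle is scored as the fraction of its expected metadata fields that are present and non-empty on a dataset record.

```python
# Hypothetical mapping from each FAIR principle to the metadata fields
# this sketch expects; real assessments use much richer indicators.
FAIR_CHECKS = {
    "Findable": ["identifier", "title"],
    "Accessible": ["access_url"],
    "Interoperable": ["conforms_to"],
    "Reusable": ["license", "provenance"],
}


def fair_report(record):
    """Score a metadata record per principle as a fraction between 0 and 1."""
    report = {}
    for principle, keys in FAIR_CHECKS.items():
        present = [key for key in keys if record.get(key)]
        report[principle] = len(present) / len(keys)
    return report


# Hypothetical catalog entry with several fields missing.
record = {
    "identifier": "doi:10.0000/example",
    "title": "Example assay dataset",
    "license": "CC-BY-4.0",
}
report = fair_report(record)
```

For this record the report marks Findable as fully satisfied, Reusable as half satisfied, and Accessible and Interoperable as unmet, which points a data steward at the missing access URL, standard, and provenance fields.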
05 — Skills

Expertise

Four pillars spanning data governance, AI, engineering science, and cloud infrastructure.

Data Strategy & Governance
Master Data Management · Knowledge Graphs · Ontology Engineering · Data Catalogs · Data Lineage · Data Stewardship · FAIR Maturity Assessment · AI Governance · ISO/IEC 23894 · Semantic Interoperability · SQL · Statistical Process Control
AI & Machine Learning
Agentic AI · LangGraph · Multi-Agent Systems · LLMs & RAG · AI Evaluation · Physics-Informed Neural Networks · PyTorch · JAX · Graph Neural Networks · Scikit-learn
Engineering & Science
Digital CMC · Process Modeling · Mechanistic Models · Design of Experiments · Continuous Manufacturing · Molecular Dynamics · Cheminformatics
Infrastructure & Cloud
Python · R · AWS (S3, EC2, Redshift, Neptune) · Neo4j · SPARQL · Apache Airflow · dbt · Terraform · Docker · GitHub Actions · Git
06 — Writing

Articles & Blog

Thoughts on master data, knowledge graphs, FAIR data, Agentic AI, and pharmaceutical R&D data strategy.

Scientific ML · Coming Soon
Physics-Informed Neural Networks for Continuous Manufacturing

A practical guide to embedding physical laws into neural network architectures for process development in pharmaceutical continuous manufacturing.

Article in Progress

Let’s Work
Together.

Open to collaboration on scientific ML, Agentic AI, and pharmaceutical data science projects.