Part of ToxMCP Suite → https://github.com/ToxMCP/toxmcp
Public MCP endpoint for physiologically based pharmacokinetic (PBPK) simulation, qualification, and dossier export.
Expose PK-Sim / MoBi .pkml, MCP-ready rxode2 .R, deterministic and population runs, and OECD-oriented validation/reporting to any MCP-aware agent (Codex CLI, Gemini CLI, Claude Code, etc.).
flowchart LR
subgraph Clients["Clients and Agents"]
Codex["Codex CLI / Desktop"]
Gemini["Gemini CLI"]
Claude["Claude Code"]
Scripts["Scripts / notebooks"]
end
subgraph API["FastAPI MCP Service"]
Router["HTTP entrypoints\n/health and /mcp"]
Registry["Tool registry\npbpk-mcp.v1 contracts"]
Tools["Tool handlers\ndiscovery, validation,\nexecution, reporting"]
end
subgraph Contracts["Discovery and Qualification Layer"]
Catalog["Filesystem model catalog\nADAPTER_MODEL_PATHS"]
Manifest["Static manifest validation\nvalidate_model_manifest"]
Qualification["Qualification metadata\ncapabilities, profile,\nvalidation, qualificationState"]
Verification["Executable verification\nrun_verification_checks"]
Reports["OECD dossier export\nexport_oecd_report"]
end
subgraph Backends["Execution Backends"]
OSPSuite["OSPSuite backend\nPK-Sim / MoBi (.pkml)"]
Rxode2["rxode2 backend\nMCP-ready R (.R)"]
Convert["Conversion-only inputs\n.pksim5 and .mmd -> .pkml or .R"]
end
subgraph Runtime["Runtime and QA Path"]
Jobs["Celery + Redis jobs\nasync runs and result handles"]
Overlay["Source-overlay runtime\nbind mounts + worker image"]
Tests["Unit + live smoke tests\nrelease readiness checks"]
end
Clients --> Router
Router --> Registry
Registry --> Tools
Tools --> Catalog
Tools --> Manifest
Tools --> Qualification
Qualification --> Verification
Qualification --> Reports
Tools --> OSPSuite
Tools --> Rxode2
Convert --> OSPSuite
Convert --> Rxode2
Tools --> Jobs
Overlay --> Tools
Qualification --> Tests
Overlay --> Tests
The current implementation follows a layered model:
FastAPI + JSON-RPCexpose/mcpand/health, and keep transport concerns separate from PBPK model logic.Tool handlersare the agent-facing API. They validate inputs, call adapters, and emit structured responses with normalizedpbpk-mcp.v1payloads.Discovery and manifest validationmake model files visible before load, and keep curation checks separate from runtime execution.Execution backendsroute.pkmlfiles toospsuiteand MCP-ready.Rmodules torxode2, while treating.pksim5and.mmdas conversion-only inputs.Qualification metadatakeepscapabilities,profile,validation, andqualificationStateseparate so runnable does not get conflated with scientifically qualified.Executable verificationadds lightweight but structured runtime checks on top of qualification metadata, including parameter-unit consistency, structural flow/volume consistency, deterministic smoke, result-integrity, and repeat-run reproducibility checks without conflating them with formal qualification evidence.Packaged-default runtime + overlay smoke testskeep the documented tool surface synchronized with the live API while still leaving an explicit maintainer override path for workspace iteration.Release metadata checksnow verify that package version markers, compose/envSERVICE_VERSION, README release markers, the top changelog entry, and the matchingdocs/releases/note stay aligned.Release artifact evidencenow retains a machine-readable report withsdist/wheel hashes and the linked contract-manifest identity during tag builds.
See docs/architecture/dual_backend_pbpk_mcp.md for the fuller architecture narrative, docs/architecture/capability_matrix.md for the published support matrix, docs/architecture/mcp_payload_conventions.md for the response contract, docs/deployment/runtime_patch_flow.md for the operator path behind the current local deployment model, and docs/pbpk_model_onboarding_checklist.md for the recommended trust pipeline when onboarding another chemical model.
See docs/architecture/exposure_led_ngra_role.md for the explicit boundary statement on what PBPK MCP does and does not own inside exposure-led NGRA workflows.
See benchmarks/regulatory_goldset/regulatory_goldset_summary.md for the current gold-set documentation benchmark summary derived from the fetched public-code PBPK corpus. Internal examples such as the bundled synthetic reference model remain bounded MCP rehearsal models, not benchmark exemplars.
- Added benchmark-derived dossier guidance, onboarding guidance, and explicit trust-surface reporting so MCP can show what is still missing before a PBPK model can credibly approach regulatory-grade use.
- Tightened the public runtime posture by keeping the analyst UI retired, preserving viewer-only anonymous development mode, and keeping critical actions behind confirmation and sign-off surfaces.
- Hardened release and audit evidence by shipping a reproducible gold-set benchmark dossier, a verifiable synthetic reference-model audit pack, and stricter contract/release-bundle integrity checks on the live stack.
PBPK workflows often juggle PK-Sim or MoBi transfer files, R-authored mechanistic models, runtime-specific scripts, and incomplete qualification notes. That makes them difficult for coding agents to automate safely and difficult for reviewers to inspect consistently.
The PBPK MCP server wraps those workflows in a single, programmable interface:
- Unified MCP surface – discovery, manifest checks, load, validation, execution, results, and dossier export share one tool catalog.
- Dual-backend execution –
.pkmlmodels run onospsuite; MCP-ready.Rmodels run onrxode2. - Qualification-aware workflows – runtime capability, scientific profile, preflight validation, and derived
qualificationStatestay separate. - NGRA-ready PBPK objects – validation and dossier export expose PBPK-side typed objects such as
assessmentContext,pbpkQualificationSummary,uncertaintySummary,uncertaintyHandoff,internalExposureEstimate, a typed external uncertainty-register reference, and a typed external PoD reference handoff, now with explicit boundary/support metadata plus additivesemanticCoveragefor PBPK-side uncertainty semantics, without collapsing PBPK MCP into a full decision engine. - Exposure-led boundary clarity –
assessmentContext.workflowRole,assessmentContext.populationSupport,pbpkQualificationSummary.evidenceBasis,pbpkQualificationSummary.workflowClaimBoundaries, andpbpkQualificationSummary.reviewStatusnow make IVIVE/exposure dependencies, supported population contexts, no-direct-in-vivo status, reviewer-attention requirements, and claim boundaries explicit. - Static curation feedback for NGRA gaps –
validate_model_manifestnow exposesmanifest.ngraCoverageplus a compactcurationSummary, so missing workflow-role, population-support, evidence-basis, and claim-boundary declarations are visible before load in both machine-readable and reviewer-facing form; that summary now also carries an anti-misread block, explicit summary-transport risk for thin/forwarded views, a normalizedcautionSummarywith advisory-versus-blocking handling, a machine-readableexportBlockPolicy, and rendering guardrails for analyst-facing clients. - Regulatory benchmark bar – the workspace now also ships a gold-set benchmark dossier under
benchmarks/regulatory_goldset/plus an advisorycurationSummary.regulatoryBenchmarkReadinessblock, so MCP reviewers can compare a model dossier to public regulatory-grade PBPK packaging without promoting research models to regulatory-ready status. That advisory block now also carries per-dimension evaluation paths, benchmark examples, and concrete next-artifact guidance for closing documentation gaps. - Reviewer-friendly discovery summaries –
discover_modelsand/mcp/resources/modelsnow surfacemanifestStatus,qualificationState, and a compactcurationSummaryso curated illustrative examples can be distinguished from exploratory or partially declared models at discovery time, with explicit anti-misread guidance and no official “bare label only” rendering path. - Human-review summary – OECD report export now includes a compact
humanReviewSummaryblock that pulls those boundary fields into one reviewer-facing surface, including reviewer-status, unresolved-dissent context, normalizedcautionSummary, summary-transport risk,exportBlockPolicy, and rendering guardrails, without creating a second hidden score or decision engine. - Anti-misread report section – OECD report export now also includes a mandatory
misreadRiskSummaryblock so common over-interpretations are stated explicitly next to the primary report summary instead of being left implicit. - Audit-backed operator sign-off – trust-bearing outputs now surface an additive
operatorReviewSignoffsummary plus an explicitoperatorReviewGovernanceblock derived from the real workflow contract, and the REST surface now exposes viewer-readable/review_signoff/historyso recorded acknowledgement, bounded-use sign-off, rejection, and revocation remain auditable without being mistaken for a second qualification score, override path, or decision authority. - No bundled analyst UI in the current release – trust-bearing review remains on the MCP and REST surfaces for now;
/consoleand/console/api/*are intentionally absent until a stronger reviewer-facing client exists. - Thin-client trust-surface contract – trust-bearing MCP tool results now also expose a top-level
trustSurfaceContractso thin clients can find the nested summary surfaces, required adjacent caveats, and primary block reasons they must carry together instead of rediscovering those paths tool by tool. - Traceable predictive evidence – performance bundles can now carry traceability supplements, and the report path now warns when bundled benchmark rows do not actually line up with the declared datasets, target outputs, or acceptance criteria.
- Discovery before execution – models are discoverable from disk before they are loaded into a live session.
- Release-tested local deployment – the packaged local runtime and the explicit source-overlay maintainer profile are both exercised with unit tests, live-stack tests, and a readiness check.
- Model-specific executable qualification checks – MCP-ready models can add runtime verification hooks for checks such as flow/volume consistency, mass balance, or numerical stability without overstating regulatory qualification.
rxode2is a native PBPK execution engine in this project. It is not limited to Berkeley Madonna conversion workflows.
| Capability | Description |
|---|---|
| 🧬 Dual-backend PBPK execution | Route .pkml models to ospsuite and MCP-ready .R models to rxode2 through one MCP surface. |
| 🗂️ Model discovery and curation | Discover supported model files from ADAPTER_MODEL_PATHS, inspect unloaded models, and run static manifest checks before load. |
| 🛡️ OECD-oriented qualification | Keep capabilities, profile, validation, and qualificationState explicit; expose applicability, provenance, uncertainty, implementation verification, software-platform qualification, and qualification gaps. |
| 🧱 NGRA-ready PBPK objects | Emit typed PBPK-side objects such as assessmentContext, pbpkQualificationSummary, uncertaintySummary, uncertaintyHandoff, internalExposureEstimate, a typed uncertaintyRegisterReference, a typed pointOfDepartureReference, and a thin BER-ready reference bundle in dossier export, with explicit boundary/support flags plus additive uncertainty semanticCoverage for downstream NGRA orchestration and without embedding BER decision logic in PBPK MCP. |
| 📐 Published object schemas | Publish machine-readable JSON Schemas and example payloads for the PBPK-side NGRA handoff object family under schemas/ so downstream tooling can validate the public object layer directly. |
| 🔌 External PBPK normalization | Normalize externally generated PBPK outputs, qualification metadata, and optional PoD references through ingest_external_pbpk_bundle without pretending PBPK MCP executed the upstream engine. |
| ✅ Executable verification | Run run_verification_checks to capture preflight validation, parameter coverage, parameter-unit consistency, structural flow/volume consistency, deterministic smoke, deterministic result integrity, repeat-run reproducibility, optional population smoke, and verification-evidence summaries in one payload. |
| 📈 Deterministic and population jobs | Submit asynchronous deterministic and population simulations, then retrieve result handles, stored results, and PK summaries. |
| 🧾 Dossier export | Export a structured OECD-style report with model metadata, validation context, checklist state, parameter provenance, performance evidence, uncertainty evidence, verification evidence, and software-platform qualification evidence when declared. |
| ⚙️ Source-overlay deployment hygiene | Recreate the local stack through the documented compose entrypoints and validate the live contract with readiness and smoke gates. |
| 🤖 Agent friendly | Verified through MCP HTTP surfaces such as /mcp/list_tools, /mcp/call_tool, and /mcp/resources/models, with live-stack regression checks. |
- Architecture
- Published schemas
- Capability matrix
- Quick start
- Configuration
- Tool catalog
- Running the server
- Integrating with coding agents
- Output artifacts
- Security checklist
- Current limitations
- Development notes
- Contributing
- Security policy
- Code of conduct
- Citation
- Roadmap
- License
# 1) build the worker image
git clone https://github.com/ToxMCP/pbpk-mcp.git
cd pbpk-mcp
# 2) start the local stack
./scripts/build_rxode2_worker_image.sh
./scripts/deploy_rxode2_stack.sh
# 3) verify
curl -s http://localhost:8000/health | jq .
curl -s http://localhost:8000/mcp/list_tools | jq '.tools[].name'The PBPK-side NGRA handoff objects are now published as machine-readable JSON Schemas under schemas/, with matching examples under schemas/examples/.
Published object family:
schemas/assessmentContext.v1.jsonschemas/pbpkQualificationSummary.v1.jsonschemas/uncertaintySummary.v1.jsonschemas/uncertaintyHandoff.v1.jsonschemas/uncertaintyRegisterReference.v1.jsonschemas/internalExposureEstimate.v1.jsonschemas/pointOfDepartureReference.v1.jsonschemas/berInputBundle.v1.json
Design intent:
- require the stable core fields only
- allow additive convenience fields
- keep BER calculation and final decision policy outside PBPK MCP
- keep exposure assessment, IVIVE policy, reverse dosimetry, and direct regulatory dose derivation outside PBPK MCP
- make the PBPK-side handoff layer consumable by downstream validators and orchestrators without scraping examples out of tests
See schemas/README.md for the short schema guide and tests/test_ngra_object_schemas.py for the validation gate that keeps the published schemas aligned with live payload generation. The same schema family is also exposed through the live MCP resource surface at /mcp/resources/schemas. In the current local packaged runtime, the live schema/capability/contract-manifest resources treat the packaged mcp_bridge.contract content as authoritative rather than depending on copied JSON under /app/var/contract. scripts/check_installed_package_contract.py is the maintainer gate that proves the generated package fallback still matches the published JSON artifacts after a non-editable local install. When you explicitly opt into the workspace source-overlay profile, scripts/runtime_src_overlay.pth promotes /app/src ahead of the installed package.
PBPK MCP now also publishes a machine-readable contract manifest in:
docs/architecture/contract_manifest.json/mcp/resources/contract-manifest
That manifest inventories the published PBPK-side schema family, the capability matrix, the legacy artifacts intentionally excluded from the PBPK-side object family, and the stable resource endpoints that expose the contract.
The live schema, capability-matrix, and contract-manifest resources now also expose SHA-256 values so downstream clients can verify that the running API matches the published artifact inventory.
The shared schema/capability/contract-manifest route logic now lives in packaged src/mcp_bridge/routes/resources_base.py, and packaged src/mcp_bridge/routes/resources.py now owns the full generic /mcp/resources surface including /mcp/resources/models.
The same is now true for tools as well: packaged src/mcp_bridge/tools/registry_base.py and src/mcp_bridge/tools/registry.py now own the generic discovery/static-manifest/result/import descriptors alongside the rest of the documented PBPK workflow surface.
The next 0.4.x debt-reduction slices are now live too: generic discovery, manifest, load/session-status, preflight validation, executable verification, dossier export, deterministic-result retrieval, external-import normalization, population workflow tools, the shared model_catalog / model_manifest helpers, the top-level mcp namespace, and the generic adapter contract/runtime now live in packaged src/. The default local runtime no longer needs a patch-copy step or a source overlay for code: the worker image bakes scripts/ospsuite_bridge.R and the bundled reference .R model into their runtime locations, docker-compose.celery.yml runs the packaged baseline, and docker-compose.overlay.yml is now the explicit maintainer profile when workspace src/ and scripts/ should override the image.
PBPK MCP now publishes a dedicated capability matrix for adopters in:
docs/architecture/capability_matrix.mddocs/architecture/capability_matrix.json
The machine-readable matrix is also exposed from the running server at /mcp/resources/capability-matrix.
This matrix is the crisp public answer to:
- what is discoverable
- what is statically validatable
- what is loadable
- what is deterministic-run capable
- what is population-run capable
- what is dossier-capable
Current headline boundaries:
.pkmlis discoverable, loadable, validatable, verification-capable, deterministic-run capable, and dossier-capable throughospsuite- contract-complete MCP-ready
.Ris discoverable, loadable, validatable, verification-capable, deterministic-run capable, and dossier-capable throughrxode2 - population execution is currently
rxode2-only and remains conditional on model capability - incomplete
.Rfiles may still be discoverable before they are runnable .pksim5and.mmdare conversion-only and do not appear as runtime-supported catalog entries
The machine-readable JSON file is intentionally small and stable so downstream tooling can consume it directly without scraping prose from the README.
scripts/generate_contract_artifacts.py --check is the maintainer gate that keeps the contract manifest and generated packaged fallback aligned with the published JSON artifacts.
git clone https://github.com/ToxMCP/pbpk-mcp.git
cd pbpk-mcp
./scripts/build_rxode2_worker_image.sh
./scripts/deploy_rxode2_stack.shHeads-up:
./scripts/deploy_rxode2_stack.shnow runs the packaged local runtime by default and waits for stable/healthand/mcp/list_toolsresponses before returning. When you need workspacesrc/andscripts/to override the image for active development, use./scripts/deploy_source_overlay_stack.sh.
Once the server is running:
- HTTP MCP endpoint:
http://localhost:8000/mcp - Health check:
http://localhost:8000/health - Model discovery resource:
http://localhost:8000/mcp/resources/models - Architecture docs:
docs/architecture/dual_backend_pbpk_mcp.md - Runtime operator docs:
docs/deployment/runtime_patch_flow.md - New-model trust pipeline:
docs/pbpk_model_onboarding_checklist.md
Once the server is running:
# health
curl -s http://localhost:8000/health | jq .
# list MCP tools
curl -s http://localhost:8000/mcp/list_tools | jq .
# discover reference_compound
curl -s -X POST http://localhost:8000/mcp/call_tool \
-H "Content-Type: application/json" \
-d '{"tool":"discover_models","arguments":{"search":"reference_compound","limit":10}}' | jq .
# static manifest validation
curl -s -X POST http://localhost:8000/mcp/call_tool \
-H "Content-Type: application/json" \
-d '{"tool":"validate_model_manifest","arguments":{"filePath":"/app/var/models/rxode2/reference_compound/reference_compound_population_rxode2_model.R"}}' | jq .For the full live-stack gate, run:
python3 scripts/release_readiness_check.pyThat live readiness gate now also reruns the curated manifest publication check for the bundled synthetic reference model and pregnancy examples, and it fails if those examples regress to implicit NGRA workflow-role, population-support, evidence-basis, or claim-boundary declarations.
For the standalone static gate on that same curated publication set, run:
python3 scripts/validate_model_manifests.py --strict --require-explicit-ngra --curated-publication-setTo smoke-test the actual discovered model inventory in the current workspace, run:
python3 scripts/workspace_model_smoke.py --auth-dev-secret pbpk-local-dev-secret
python3 scripts/workspace_model_smoke.py --include-population --auth-dev-secret pbpk-local-dev-secretThe script discovers runtime-supported models through /mcp/resources/models, runs static manifest validation, loads each model, submits a deterministic simulation, retrieves stored results, and optionally runs a small population smoke for rxode2 models that declare population support. It writes a JSON report to var/workspace_model_smoke_report.json. On hardened local stacks, the load/run operations require operator or admin access, so pass --auth-dev-secret pbpk-local-dev-secret for the default development profile or provide a real bearer token.
For the GitHub-hosted verification path, the repository also carries:
- a lightweight
CIworkflow for patch/runtime contract checks on pushes and pull requests - a heavier
Model Smokeworkflow that builds the Docker-backed stack, runs the live readiness gate, executesworkspace_model_smoke.py --include-population --auth-dev-secret pbpk-local-dev-secret, and uploads the resulting JSON reports as workflow artifacts
The default local packaged stack is defined in docker-compose.celery.yml. An explicit source-overlay development profile is available in docker-compose.overlay.yml, and a stricter operator overlay is available in docker-compose.hardened.yml for deployments that should disable anonymous access and require explicit auth configuration. Key environment variables currently shaping the runtime are:
| Variable | Default | Description |
|---|---|---|
ADAPTER_MODEL_PATHS |
/app/var |
Filesystem roots scanned by discover_models and /mcp/resources/models. |
ADAPTER_BACKEND |
subprocess |
Uses the R / OSPSuite subprocess bridge instead of an in-memory mock path. |
JOB_BACKEND |
celery |
Enables asynchronous job submission through Redis-backed Celery workers. |
CELERY_BROKER_URL |
redis://redis:6379/0 |
Queue broker for asynchronous jobs. |
CELERY_RESULT_BACKEND |
redis://redis:6379/1 |
Result storage for job handles and async result chaining. |
ADAPTER_R_PATH / ADAPTER_R_HOME |
container defaults | Point the bridge to the R runtime used by rxode2 and OSPSuite tooling. |
ADAPTER_TIMEOUT_MS |
30000 |
Default adapter timeout in milliseconds. |
AUDIT_ENABLED |
true in local compose |
Enables audit-backed sign-off, verification traceability, and readable history surfaces. |
R_MAX_VSIZE |
2G |
Caps R virtual memory use inside the current local worker setup. |
DOTNET_GCHeapLimitPercent |
60 |
Constrains the .NET heap used by the OSPSuite runtime. |
PBPK_ENABLE_SRC_OVERLAY |
false |
Keeps /app/src out of import precedence by default; docker-compose.overlay.yml sets it to true for workspace-override development. |
SERVICE_VERSION |
0.4.3 |
Exposed through /health and compose-level runtime metadata. |
AUTH_ALLOW_ANONYMOUS |
true |
Development-friendly local default; do not expose beyond localhost without hardening. |
PBPK_BIND_HOST |
127.0.0.1 |
Host/interface used by the hardened overlay when publishing the API port. |
PBPK_BIND_PORT |
8000 |
Host port used by the hardened overlay when publishing the API port. |
AUTH_ISSUER_URL |
unset | Required by the hardened overlay. OIDC issuer used for token validation. |
AUTH_AUDIENCE |
unset | Required by the hardened overlay. Audience expected in bearer tokens. |
AUTH_JWKS_URL |
unset | Required by the hardened overlay. JWKS endpoint used to validate bearer tokens. |
For the default packaged local runtime, use docker-compose.celery.yml through ./scripts/deploy_rxode2_stack.sh. When you need workspace overrides, use ./scripts/deploy_source_overlay_stack.sh, which layers docker-compose.overlay.yml over the same local stack and enables /app/src precedence. For a more production-like local or operator-managed deployment, use ./scripts/deploy_hardened_stack.sh, which layers docker-compose.hardened.yml over the packaged local runtime and waits for stable readiness before returning.
Legacy aliases such as R_PATH, R_HOME, R_LIBS, MCP_MODEL_SEARCH_PATHS, ADAPTER_TIMEOUT_SECONDS, and AUDIT_TRAIL_ENABLED are still accepted for compatibility, but startup now warns that they are scheduled for removal in v0.5.0.
See docker-compose.celery.yml, docker-compose.overlay.yml, docker-compose.hardened.yml, docs/deployment/runtime_patch_flow.md, and docs/deployment/rxode2_worker_image.md for the current operator-facing deployment surface.
| Category | Highlight tools | Notes |
|---|---|---|
| Discovery and curation | discover_models, validate_model_manifest, load_simulation |
Discover models before load, inspect static manifest state, and load supported .pkml or MCP-ready .R files into the live session registry. |
| Qualification and reporting | validate_simulation_request, run_verification_checks, export_oecd_report, ingest_external_pbpk_bundle |
Run OECD-oriented preflight checks, executable verification with unit/integrity/reproducibility checks, structured dossier export with profile, validation, qualificationState, evidence sections, and a descriptive oecdCoverage map to OECD Tables 3.1/3.2, and normalize externally generated PBPK outputs into the same typed PBPK-side objects. |
| Simulation control | list_parameters, get_parameter_value, set_parameter_value, run_simulation |
Inspect and modify simulation parameters, then submit deterministic runs asynchronously. |
| Async status and results | get_job_status, get_results, calculate_pk_parameters, cancel_job |
Track async jobs, retrieve stored deterministic results, compute PK summaries, and cancel queued/running jobs. |
| Population and exploration | run_population_simulation, get_population_results, run_sensitivity_analysis |
Run population workflows and sensitivity analyses on the backends that declare those capabilities. |
The normalized pbpk-mcp.v1 contract currently applies to:
load_simulationvalidate_model_manifestvalidate_simulation_requestrun_verification_checksexport_oecd_reportget_job_statusget_results
For the current public workflow:
- Call
discover_modelsor inspect/mcp/resources/models. - Call
validate_model_manifestbefore loading a new model. - Call
load_simulation. - Call
validate_simulation_requestfor the intended context of use. - Call
run_verification_checkswhen you want a lightweight executable verification summary before broader use or release. - Submit
run_simulationorrun_population_simulation. - Poll
get_job_status, then fetchget_resultsorget_population_results. - Call
export_oecd_reportwhen you need a dossier/report package.
Supported runtime inputs:
.pkmlviaospsuite- MCP-ready
.Rviarxode2
Conversion-only inputs:
.pksim5PK-Sim project files- Berkeley Madonna
.mmd
Both conversion-only formats are rejected early with explicit guidance to export to .pkml or convert to an MCP-ready .R model.
Qualification model:
PBPK MCP keeps these concepts separate:
capabilitiesfor operational/runtime supportprofilefor declared scientific metadatavalidationfor preflight applicability and guardrail assessmentqualificationStatefor the derived summary labelngraObjectsfor PBPK-side typed NGRA-ready objects such asassessmentContext,pbpkQualificationSummary,uncertaintySummary,uncertaintyHandoff,internalExposureEstimate, anduncertaintyRegisterReferenceuncertaintySummarynow also carries additivesemanticCoverageso downstream consumers can distinguish declared-only versus quantified PBPK-side variability, sensitivity, and residual uncertainty without inferring that from raw evidence rows alone; it now includesquantifiedRowCount,declaredOnlyRowCount, and a strictervariabilityQuantificationStatus
oecdCoveragefor a descriptive mapping from exported dossier fields onto OECD reporting-template and evaluation-checklist sections; it does not modifyoecdChecklistScoreor qualification state
Current derived qualification states include:
exploratoryillustrative-exampleresearch-useregulatory-candidatequalified-within-context
This is the core policy boundary for the server:
- executable does not mean qualified
- in-bounds does not mean suitable for regulatory use
For .R models, richer OECD-oriented reporting is enabled by hooks such as:
pbpk_model_profile()pbpk_validate_request(...)pbpk_parameter_table(...)pbpk_performance_evidence(...)pbpk_uncertainty_evidence(...)pbpk_verification_evidence(...)pbpk_run_verification_checks(...)pbpk_platform_qualification_evidence(...)
For example, the reference_compound rxode2 model now exports bounded local sensitivity evidence and a compact variability-propagation summary through uncertaintyEvidence, using the currently loaded parameter state rather than a hard-coded placeholder row. Its performanceEvidence is also explicitly classified as runtime-only/internal evidence so executable smoke checks cannot be mistaken for predictive validation.
export_oecd_report now also adds oecdCoverage, an additive coverage map aligned to OECD PBK Guidance Tables 3.1 and 3.2. It is intentionally descriptive: it shows which dossier sections/questions are covered by the current report payload and which remain incomplete, but it does not alter oecdChecklistScore, qualificationState, or any downstream decision boundary.
Researchers can also attach generic companion performance bundles next to either .pkml or .R models without modifying the bridge code. See performance_evidence_bundles.md and the starter template at performance_evidence_bundle.template.json. Those bundles can now carry both row-level evidence and a traceability-only profileSupplement for predictive dataset records, acceptance criteria, and target outputs. The MCP validates row-level claims conservatively and keeps the supplement descriptive rather than silently upgrading the qualification boundary.
modelPerformance can now also declare structured datasetRecords and acceptanceCriteria inside goodnessOfFit, predictiveChecks, or evaluationData. The bridge normalizes those fields and exposes additive traceability counts so predictive support is not reduced to a single status token.
The same pattern now exists for uncertainty evidence. See uncertainty_evidence_bundles.md and uncertainty_evidence_bundle.template.json. The MCP surfaces warnings when uncertainty rows are missing method/summary or scope information, and variability-propagation rows are only treated as quantified propagation evidence when they actually include quantitative outputs such as value, lowerBound, upperBound, mean, or sd.
The same companion-bundle pattern now exists for richer parameter tables. See parameter_table_bundles.md and parameter_table_bundle.template.json. export_oecd_report and validate_model_manifest now surface row-level coverage for sources, citations, distributions, study conditions, and rationale so dossier gaps are visible without custom bridge code.
Structured peer-engagement traceability can now stay inside the existing peerReview profile section. If a model declares reviewRecords, priorRegulatoryUse, revisionStatus, or revisionHistory, the bridge now normalizes those fields and exposes dossier-ready coverage counts instead of treating peer review as a single free-text status flag.
For .pkml models, richer qualification metadata can be supplied with sidecars such as:
model.profile.jsonmodel.pbpk.json
The default local stack is:
pbpk_mcp-api-1onhttp://127.0.0.1:8000pbpk_mcp-worker-1on the samepbpk_mcp-worker-rxode2:latestimage for executionpbpk_mcp-redis-1for job brokering and result storage
The preferred local operator entrypoint is:
./scripts/deploy_rxode2_stack.shThat command recreates the packaged local runtime so the live tool catalog remains aligned with the documented contract.
When you want to iterate against the current workspace source tree instead of the baked image baseline, use:
./scripts/deploy_source_overlay_stack.shThat profile layers docker-compose.overlay.yml on top of the packaged stack, bind-mounts src/, scripts/, and var/, enables PBPK_ENABLE_SRC_OVERLAY=true, and then waits for stable /health and /mcp/list_tools responses before returning.
When you want the same runtime contract with stricter deployment defaults, use:
AUTH_ISSUER_URL="https://issuer.example" \
AUTH_AUDIENCE="pbpk-mcp" \
AUTH_JWKS_URL="https://issuer.example/.well-known/jwks.json" \
./scripts/deploy_hardened_stack.shThat overlay disables anonymous access, switches the stack to ENVIRONMENT=production, binds the published API port through PBPK_BIND_HOST / PBPK_BIND_PORT, and still waits for stable /health and /mcp/list_tools responses before returning.
Add http://localhost:8000/mcp as an MCP provider in Codex CLI/Desktop, Gemini CLI, Claude Code, or other MCP-aware hosts.
Useful companion surfaces:
/mcp/list_toolsfor the live tool catalog/mcp/call_toolfor direct HTTP tool execution/mcp/resources/modelsfor discoverable model files/mcp/resources/simulationsfor already-loaded live sessions/mcp/resources/schemasfor the published PBPK-side object schemas and example payloads/mcp/resources/capability-matrixfor the live machine-readable runtime-support matrix
Critical tools still declare MCP-side guardrails such as critical and requiresConfirmation in the tool catalog. The current live regression suite checks that the documented workflow is actually exposed through /mcp/list_tools.
The server currently produces and exposes:
- discovered model inventories with loaded/unloaded state
- static manifest validation reports
- load-time model metadata with
backend,capabilities,profile,validation, andqualificationState - validation/report payloads with typed PBPK-side NGRA-ready objects such as
assessmentContext,pbpkQualificationSummary,uncertaintySummary,uncertaintyHandoff,internalExposureEstimate,uncertaintyRegisterReference, andpointOfDepartureReference- uncertainty objects now include machine-readable PBPK-side uncertainty semantics for quantified variability, declared-only residual uncertainty, and still-missing components
- a BER-ready reference bundle in dossier export that can become
ready-for-external-ber-calculationwhen an externalpodRefand a resolved PBPK exposure target are both available - imported external PBPK run records and NGRA-ready object bundles from
ingest_external_pbpk_bundle - executable verification summaries with structured check results, smoke-run artifact handles, parameter-unit consistency, result-integrity/reproducibility checks, and verification-evidence snapshots
- deterministic result handles and stored deterministic result payloads
- population summary payloads and chunk handles
- PK metric outputs from
calculate_pk_parameters - OECD-style dossier/report exports with checklist state, missing-evidence hints, performance evidence, uncertainty evidence, verification evidence, software-platform qualification evidence, and parameter provenance when declared
- local/internal audit generators can produce verifiable evidence packs for bounded rehearsal models, but those study-specific outputs are not part of the public release surface and should not be treated as benchmark or distribution artifacts
- ✅ Dual-backend runtime policy is explicit:
.pkmland MCP-ready.Rare supported;.pksim5and.mmdare conversion-only. - ✅ Discovery, manifest validation, runtime validation, and OECD dossier export are exposed as separate tool surfaces.
- ✅
capabilities,profile,validation, andqualificationStateremain distinct, so runtime support is not mislabeled as scientific qualification. - ✅ The default local deployment path is now packaged by default, and the source-overlay maintainer path is explicit and opt-in.
- ✅ Live readiness checks verify
/mcp/list_tools,/mcp/resources/models, discovery parity, and conversion-only rejection behavior. - ✅ A documented hardened overlay now disables anonymous access by default and requires explicit auth settings before the stack will start.
- 🔲 The default compose file is still development-oriented; keep it on localhost and use the hardened overlay plus environment-specific ingress controls when moving beyond local development.
- 🔲 Qualification evidence completeness still depends on the model or sidecar; executable does not imply regulatory readiness.
- Raw Berkeley Madonna
.mmdfiles are not directly executable. - Only
.pkmland MCP-ready.Rare supported runtime formats. - PK-Sim / MoBi project files such as
.pksim5are not loaded directly and must be exported to.pkml. - An
.Rfile can be discoverable without being runnable if it does not implement the required contract. - Backend capabilities are intentionally not identical across
ospsuiteandrxode2. - Population simulation is currently implemented for
rxode2models, not as a generic OSPSuite.pkmlcapability.
validate_model_manifestis a static pre-load check; it does not by itself prove executable correctness or scientific validity.- Runtime guardrails are not the same as external scientific validation.
ingest_external_pbpk_bundleis a normalization/import path for externally generated PBPK outputs. It does not execute vendor engines such as GastroPlus or Simcyp, and it does not parse proprietary project files directly.run_verification_checksis intentionally lightweight implementation verification; it now includes parameter-unit consistency, structural flow/volume consistency, deterministic integrity, and reproducibility checks, but it still does not replace formal dimensional analysis, full unit-consistency proof across equations, solver qualification, software-platform qualification, or external qualification evidence.export_oecd_reportnow carries the latest storedrun_verification_checkssnapshot asexecutableVerificationwhen one has been run for that loaded simulation. The report does not silently rerun verification during export.- Model-specific hooks can add executable qualification checks such as flow/volume consistency, mass balance, or solver-stability comparisons, but those checks should still be interpreted as implementation evidence within the declared context, not as blanket regulatory qualification.
- Bundled
uncertaintyEvidencecan now include bounded local sensitivity screens and compact variability-propagation summaries for specific models, but those rows are still internal quantitative evidence. They are not the same thing as variance-based global sensitivity analysis, posterior uncertainty propagation, or externally reviewed qualification studies. research-useorillustrative-examplemodels should not be represented as regulatory-ready.- OECD-style metadata completeness can be high while scientific evidence remains incomplete.
- Exported
performanceEvidenceis now explicitly classified and bounded, but many models will still only carry runtime/internal evidence rather than observed-versus-predicted datasets, predictive evaluation sets, or external qualification studies. - Parameter provenance may be structured and exportable while still lacking full per-parameter citations, study conditions, or identifiability evidence.
rxode2image builds are heavy and should be prebuilt rather than compiled inside a capped runtime worker.- Runtime workers should stay conservatively capped, such as
4 GiB. - Durable
rxode2image builds can take a long time on laptop hardware because of C/C++ compilation. - The default local deployment now runs the packaged baseline, but active workspace development should still go through the documented deploy entrypoints so the intended runtime mode is explicit.
- The hardened overlay improves auth defaults and bind behavior, but it is still an operator-side compose profile rather than a full production deployment package with ingress, secret rotation, or managed identity infrastructure.
Recommended checks:
python3 scripts/check_runtime_contract_env.py
python3 -m unittest -v tests/test_load_simulation_contract.py
python3 -m unittest -v tests/test_model_manifest.py
python3 -m unittest -v tests/test_oecd_bridge.py
python3 -m unittest -v tests/test_model_discovery_live_stack.py
python3 -m unittest -v tests/test_oecd_live_stack.py
python3 scripts/release_readiness_check.py
python3 scripts/workspace_model_smoke.py --auth-dev-secret pbpk-local-dev-secretmake runtime-contract-test in the public repository now runs the same dependency preflight first, runs the named misuse-prevention gate, reruns the strict curated-manifest NGRA gate for the bundled example models, builds a temporary sdist and wheel outside the repo worktree, validates that the normative contract files survive source distribution packaging, and performs an installed-package check of mcp_bridge.contract against the built wheel so the published contract artifacts are validated across both source and distribution boundaries.
The same strict curated-manifest NGRA gate is now part of make runtime-contract-test, scripts/release_readiness_check.py, and the Release Artifacts workflow, so bundled example models cannot quietly regress to implicit NGRA workflow role, population support, evidence basis, or claim-boundary declarations on the normal release-prep path.
That gate is now selected through --curated-publication-set, so the Makefile, workflow, and manual release checklist all use one curated model inventory instead of duplicating path lists.
Human-process release artifacts now live alongside the runtime gates:
docs/hardening_migration_notes.mddocs/pbpk_model_onboarding_checklist.mddocs/pbk_reviewer_signoff_checklist.mddocs/post_release_audit_plan.md
Repository automation is split intentionally:
- normal CI should catch contract drift in the local runtime layer quickly
- the dedicated
Release Artifactsworkflow should be used on tags or release-prep runs to validate and retain the exact publishedsdistand wheel - the heavier catalog-wide model smoke should run as a dedicated workflow or release gate because it builds the full worker image and executes the live stack
The published pbpk-mcp.v1 contract manifest now distinguishes:
normativeartifacts that define the machine-readable contractsupportingartifacts that help users adopt and publish the contract correctlylegacy-excludedartifacts that remain in the repo but are intentionally outside the PBPK-side object family
If you are maintaining the local stack in the current convergence stage:
- Change the authoritative runtime files in the packaged modules under
src/, the local runtime helpers underscripts/,scripts/ospsuite_bridge.R, or the bundled.Rmodel modules.- generic MCP namespaces, tool modules, and adapter/runtime files now live under
src/ - the local operator/runtime helper surface now lives in the deploy scripts, worker Dockerfile, and overlay hook rather than a tracked
patches/implementation layer
- generic MCP namespaces, tool modules, and adapter/runtime files now live under
- If the worker image baseline should change, rebuild it with
./scripts/build_rxode2_worker_image.sh. - Recreate the packaged local stack with
./scripts/deploy_rxode2_stack.sh.- When you need workspace overrides for active development, use
./scripts/deploy_source_overlay_stack.sh. - When you need stricter auth defaults, use
./scripts/deploy_hardened_stack.shwithAUTH_ISSUER_URL,AUTH_AUDIENCE, andAUTH_JWKS_URLset.
- When you need workspace overrides for active development, use
- Verify the live API with:
curl -s http://localhost:8000/healthcurl -s http://localhost:8000/mcp/list_toolspython3 scripts/release_readiness_check.pypython3 scripts/workspace_model_smoke.py --auth-dev-secret pbpk-local-dev-secretpython3 scripts/workspace_model_smoke.py --include-population --auth-dev-secret pbpk-local-dev-secret
Important boundaries:
scripts/deploy_rxode2_stack.shis the preferred packaged local operator entrypoint.scripts/deploy_source_overlay_stack.shis the explicit maintainer entrypoint when workspacesrc/andscripts/should override the image.scripts/deploy_hardened_stack.shis the stricter operator entrypoint when you need non-anonymous auth defaults on the same packaged local stack.docker/rxode2-worker.Dockerfilenow owns the baked baseline worker assets directly, includingscripts/ospsuite_bridge.Rand the bundled reference.Rmodel.- the local stack now defaults to the packaged runtime, while the overlay
.pthremains only for the explicit source-overlay maintainer profile.
scripts/ospsuite_bridge.Rscripts/runtime_src_overlay.pthdocker/rxode2-worker.Dockerfilescripts/workspace_model_smoke.pysrc/mcp_bridge/model_catalog.pysrc/mcp_bridge/model_manifest.pysrc/mcp/__init__.pysrc/mcp/tools/discover_models.pysrc/mcp/tools/load_simulation.pysrc/mcp/tools/get_job_status.pysrc/mcp/tools/validate_simulation_request.pysrc/mcp/tools/run_verification_checks.pysrc/mcp/tools/validate_model_manifest.pysrc/mcp/tools/export_oecd_report.pysrc/mcp/tools/get_results.pysrc/mcp/tools/ingest_external_pbpk_bundle.pysrc/mcp/tools/run_population_simulation.pysrc/mcp_bridge/adapter/__init__.pysrc/mcp_bridge/adapter/interface.pysrc/mcp_bridge/adapter/ospsuite.pydocs/architecture/dual_backend_pbpk_mcp.mddocs/deployment/runtime_patch_flow.mddocs/integration_guides/rxode2_adapter.mddocs/integration_guides/ospsuite_profile_sidecars.mddocs/github_publication_checklist.mddocs/hardening_migration_notes.mddocs/pbpk_model_onboarding_checklist.mddocs/pbk_reviewer_signoff_checklist.mddocs/post_release_audit_plan.md
Use the clean GitHub clone as the release source of truth when publishing. Before proposing a release or publication update, run the readiness checks above and review the publication checklist in docs/github_publication_checklist.md, the migration notes in docs/hardening_migration_notes.md, the model-onboarding checklist in docs/pbpk_model_onboarding_checklist.md, the reviewer checklist in docs/pbk_reviewer_signoff_checklist.md, and the post-release audit plan in docs/post_release_audit_plan.md.
For the published repo, follow CONTRIBUTING.md.
The default local compose stack is development-oriented and currently allows anonymous local use. Do not expose it beyond localhost without hardening authentication, ingress, and runtime resource limits.
For the published repo, follow SECURITY.md.
Community interactions for the published repo should follow CODE_OF_CONDUCT.md.
If you use PBPK MCP in your work, please cite the BioRxiv preprint:
Djidrovski, I. ToxMCP: Guardrailed, Auditable Agentic Workflows for Computational Toxicology via the Model Context Protocol. bioRxiv (2026). https://doi.org/10.64898/2026.02.06.703989
@article{djidrovski2026toxmcp,
title = {ToxMCP: Guardrailed, Auditable Agentic Workflows for Computational Toxicology via the Model Context Protocol},
author = {Djidrovski, Ivo},
journal = {bioRxiv},
year = {2026},
doi = {10.64898/2026.02.06.703989},
url = {https://doi.org/10.64898/2026.02.06.703989}
}Citation metadata: CITATION.cff
- Keep reducing the remaining source-overlay-specific maintainer surface now that the default local runtime is packaged by default.
- Add richer quantitative uncertainty, sensitivity, and implementation-verification evidence to strengthen regulatory-facing qualification.
- Expand dossier-grade parameter provenance and model-performance evidence across curated example and production models.
- Keep improving release automation so image build, hot patching, docs, and version markers stay aligned.
Apache-2.0