
Add regulatory compliance evaluation pipeline cookbook #2653

Open
garyatwalAI wants to merge 2 commits into langfuse:main from garyatwalAI:claude/add-compliance-cookbook-sFzwk


@garyatwalAI

Summary

This PR adds a comprehensive Jupyter notebook cookbook demonstrating how to build an automated compliance evaluation pipeline using Langfuse for regulated industries (legal, financial, insurance, healthcare).

Key Changes

  • New cookbook notebook (cookbook/regulatory_compliance_evaluation_pipeline.ipynb) with a complete compliance evaluation system featuring:
    • ComplianceScore and ComplianceReport dataclasses for structured compliance results
    • PIIDetector: Rule-based evaluator detecting emails, phone numbers, SSNs, credit cards, passports, and other PII patterns
    • LegalPrivilegeDetector: Hybrid rule-based + LLM-as-a-Judge evaluator identifying attorney-client privilege, work product, settlement communications, litigation strategy, and regulatory investigation privilege
    • FactualGroundingEvaluator: LLM-as-a-Judge evaluator checking if outputs are grounded in provided context (hallucination detection)
    • BiasEvaluator: LLM-as-a-Judge evaluator detecting discriminatory or biased language across protected characteristics
    • CompliancePipeline: Orchestrator class that runs all evaluators and ingests scores back into Langfuse
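The notebook itself is not reproduced on this page, so the following is only a rough sketch of how the components listed above might fit together. The class names come from the PR description; every field name, the two regex patterns, the score convention, and the `ingest` hook are assumptions made for illustration, not the cookbook's actual code.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class ComplianceScore:
    """Result of one evaluator (field names are assumptions)."""
    name: str
    status: str  # "PASS", "WARN", or "FAIL"
    value: float  # assumed convention: 1.0 = clean, 0.0 = violation found
    flagged_items: List[str] = field(default_factory=list)


@dataclass
class ComplianceReport:
    """All evaluator results for a single trace."""
    trace_id: str
    scores: List[ComplianceScore] = field(default_factory=list)

    @property
    def overall_status(self) -> str:
        # The worst individual status wins.
        statuses = {s.status for s in self.scores}
        if "FAIL" in statuses:
            return "FAIL"
        return "WARN" if "WARN" in statuses else "PASS"


class PIIDetector:
    """Rule-based check; only two illustrative patterns are shown."""
    name = "pii_detection"
    PATTERNS = {
        "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
        "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    }

    def evaluate(self, text: str) -> ComplianceScore:
        # Record the PII type and character span, not the matched value,
        # so the report itself does not copy sensitive data.
        flagged = [
            f"{label}[{m.start()}:{m.end()}]"
            for label, pattern in self.PATTERNS.items()
            for m in pattern.finditer(text)
        ]
        status = "FAIL" if flagged else "PASS"
        return ComplianceScore(self.name, status, 0.0 if flagged else 1.0, flagged)


class CompliancePipeline:
    """Runs every evaluator and forwards each score to a sink."""

    def __init__(self, evaluators: Sequence,
                 ingest: Callable[[str, ComplianceScore], None]):
        self.evaluators = evaluators
        self.ingest = ingest  # hypothetical hook; wrap the Langfuse client here

    def run(self, trace_id: str, text: str) -> ComplianceReport:
        report = ComplianceReport(trace_id)
        for evaluator in self.evaluators:
            score = evaluator.evaluate(text)
            report.scores.append(score)
            self.ingest(trace_id, score)
        return report
```

Passing the sink in as a callable keeps the sketch runnable without credentials; in a notebook the hook would presumably wrap the Langfuse SDK's score-creation call for the given trace.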

Notable Implementation Details

  • Hybrid evaluation approach combining fast rule-based checks with nuanced LLM-based assessments
  • Two-stage legal privilege detection: keyword scan followed by LLM classification for high-confidence results
  • Structured compliance reporting with PASS/WARN/FAIL categories and detailed flagged items
  • Seamless integration with Langfuse for score ingestion and dashboard visualization
  • Extensible architecture allowing easy addition of domain-specific evaluators
  • Alignment with NIST AI Risk Management Framework (GOVERN, MAP, MEASURE, MANAGE functions)
  • Practical examples for litigation support, insurance claims, financial advisory, and healthcare use cases
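As an illustration of the two-stage idea and the PASS/WARN/FAIL categories described above, here is a minimal sketch. The keyword list, the `judge` callable (standing in for an LLM-as-a-Judge call that returns a risk estimate between 0 and 1), and the 0.3 / 0.7 thresholds are all hypothetical; the notebook may use different rules and cut-offs.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical stage-1 keywords; a real list would be domain-specific.
PRIVILEGE_KEYWORDS = [
    r"attorney[- ]client",
    r"privileged",
    r"work product",
    r"settlement (?:offer|negotiation)s?",
    r"litigation strategy",
]
_KEYWORD_RE = re.compile("|".join(PRIVILEGE_KEYWORDS), re.IGNORECASE)

# Stand-in for the LLM judge: takes text, returns a risk estimate in [0, 1].
Judge = Callable[[str], float]


def keyword_scan(text: str) -> List[str]:
    """Stage 1: cheap regex scan returning the distinct keywords found."""
    return sorted({m.group(0).lower() for m in _KEYWORD_RE.finditer(text)})


def to_status(risk: float, warn_at: float = 0.3, fail_at: float = 0.7) -> str:
    """Map a risk estimate to a category (thresholds are assumptions)."""
    if risk >= fail_at:
        return "FAIL"
    return "WARN" if risk >= warn_at else "PASS"


def detect_privilege(text: str, judge: Judge) -> Tuple[str, float, List[str]]:
    """Stage 2 runs only when stage 1 finds at least one keyword."""
    hits = keyword_scan(text)
    if not hits:
        return "PASS", 0.0, []  # judge is never called, saving an LLM request
    risk = judge(text)
    return to_status(risk), risk, hits
```

Gating the judge behind the keyword scan means the slower, costlier LLM call is made only for texts that already look suspicious. The trade-off is that privileged content phrased without any listed keyword would pass stage 1 unexamined.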

https://claude.ai/code/session_011De5LfdoirAg729crpDFaF

@vercel

vercel bot commented Mar 11, 2026

@claude is attempting to deploy a commit to the langfuse Team on Vercel.

A member of the Team first needs to authorize it.

@CLAassistant

CLAassistant commented Mar 11, 2026

CLA assistant check
All committers have signed the CLA.

dosubot bot added the size:XL label (this PR changes 500-999 lines, ignoring generated files) on Mar 11, 2026
@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

dosubot bot added the documentation label (improvements or additions to documentation) on Mar 11, 2026
Add a comprehensive cookbook demonstrating how to build automated compliance
evaluation pipelines for regulated industries using Langfuse.

The cookbook covers:
- PII detection (rule-based regex patterns)
- Legal privilege detection (hybrid rule + LLM-as-a-Judge)
- Factual grounding / hallucination detection (LLM-as-a-Judge)
- Bias and fairness evaluation (LLM-as-a-Judge)
- Complete pipeline orchestration with Langfuse score ingestion
- NIST AI RMF alignment mapping

Designed for legal services, financial services, insurance, and healthcare
organisations deploying AI in regulated contexts.

https://claude.ai/code/session_011De5LfdoirAg729crpDFaF
garyatwalAI force-pushed the claude/add-compliance-cookbook-sFzwk branch from 076be5d to 3ad0547 on March 11, 2026 18:06


2 participants