[EPIC] ADR-Compliant Database Management & PyRIT Integration Platform #110

@Cybonto

UAT Specification

---
schemaVersion: "uat-v1.0"
issueID: "ADR-COMPLIANT-PYRIT-DB-PLATFORM"
type: "epic"
status: "pending-acceptance"
priority: 1
assignee: ""

spec:
  businessGoal: "Implement comprehensive database management API endpoints that enable PyRIT memory integration while maintaining full compliance with existing Architecture Decision Records (ADR-F2.2, ADR-003, ADR-007, ADR-006, ADR-010), creating an enterprise-grade database management platform within violentutf-api's established polyglot persistence architecture"
  keyObjectives:
    - "Create PyRIT memory bridge that integrates DuckDB engine with ADR-F2.2 polyglot persistence (PostgreSQL + Document DB + Blob Storage)"
    - "Implement organization-isolated database operations following ADR-003 RBAC+ABAC security model"
    - "Build async database management APIs using ADR-007 task processing patterns (202 Accepted + status URLs)"
    - "Develop JSON-exclusive database endpoints compliant with ADR-006 serialization standards"
    - "Establish PyRIT dependency management following ADR-010 security and compliance policies"
    - "Enable PyRIT dataset operations (SeedPromptDataset, QuestionAnsweringDataset, ChatMessagesDataset) within enterprise architecture"
    - "Create evidence storage layer for PyRIT conversations in Document DB with automated archival to Blob Storage"
    - "Implement comprehensive audit trails for all database operations using existing audit infrastructure"
  successMetrics:
    - metric: "ADR Compliance Score"
      target: "100%"
      measurement: "All database operations comply with ADR-F2.2, ADR-003, ADR-007, ADR-006, ADR-010"
    - metric: "PyRIT Engine Compatibility"
      target: "100%"
      measurement: "All PyRIT memory operations function normally with memory bridge"
    - metric: "Organization Isolation"
      target: "100%"
      measurement: "Zero cross-organization data access in security testing"
    - metric: "API Response Performance"
      target: "<500ms"
      measurement: "95th percentile for database status operations"
    - metric: "Async Operation Coverage"
      target: "100%"
      measurement: "All operations >30 seconds use async task processing"
    - metric: "Data Lifecycle Compliance"
      target: "100%"
      measurement: "PyRIT evidence follows 90-day hot→cold migration policy"
  estimatedScope:
    storiesCount: 24
    estimatedWeeks: 16

quality:
  performance:
    response_time_p95_ms: 500
    concurrent_operations: 100
    memory_bridge_overhead_max: 10  # percent, matching the <10% overhead success criterion
  security:
    organization_isolation_required: true
    rbac_abac_enforcement_required: true
    audit_trail_required: true
    security_scan_required: true
  maintainability:
    adr_compliance_required: true
    api_documentation_required: true
    test_coverage_min: 85
    pyrit_compatibility_required: true

relations:
  parent: null
  dependsOn: []
  relatesTo: ["ADR-F2.2", "ADR-003", "ADR-007", "ADR-006", "ADR-010", "ViolentUTF Issue #116"]
  childStories: []
  adrs:
    - "ADR-F2.2: Polyglot Persistence Strategy"
    - "ADR-003: Hybrid RBAC+ABAC Authorization"
    - "ADR-007: Async Task Processing"
    - "ADR-006: JSON Data Serialization"
    - "ADR-010: Software Dependency Management"
---

Business Context

This epic creates a comprehensive database management platform that bridges PyRIT engine requirements with violentutf-api's established enterprise architecture, ensuring full compliance with all existing Architecture Decision Records while enabling advanced AI security evaluation workflows.

Stakeholders:

  • Enterprise Security Teams: Need PyRIT-powered security evaluation within compliant enterprise architecture
  • Platform Operations: Require database operations that align with established polyglot persistence strategy
  • Compliance Teams: Need assurance that PyRIT integration maintains ADR compliance for regulatory requirements
  • Development Teams: Need clear PyRIT integration patterns that follow established architectural principles
  • Security Researchers: Need PyRIT capabilities without compromising multi-tenant security model

Business Problem:

  • PyRIT engine requires DuckDB memory, but ADR-F2.2 mandates polyglot persistence (PostgreSQL + Document DB + Blob Storage)
  • Missing database management APIs that enterprise workflows require for PyRIT operations
  • No organization-isolated PyRIT memory management compliant with ADR-003 RBAC+ABAC model
  • Lack of async database operations following ADR-007 task processing patterns
  • PyRIT dependency management not aligned with ADR-010 security policies
  • Missing integration between PyRIT evidence storage and established data lifecycle policies

Expected Value:

  • ADR-Compliant Architecture: PyRIT integration that strengthens rather than compromises established architectural decisions
  • Enterprise PyRIT Operations: Full PyRIT functionality within multi-tenant, organization-isolated environment
  • Seamless Developer Experience: PyRIT operations appear native while leveraging enterprise infrastructure
  • Data Lifecycle Compliance: PyRIT evidence properly managed through hot→warm→cold storage lifecycle
  • Security Assurance: PyRIT operations maintain RBAC+ABAC security model with comprehensive audit trails
  • Operational Excellence: Database management APIs that integrate with existing monitoring and alerting systems

Architectural Innovation:

  • PyRIT Memory Bridge: Presents DuckDB interface to PyRIT engine while storing data in polyglot persistence layers
  • Evidence Storage Integration: PyRIT conversations stored as evidence documents in Document DB layer
  • Temporary Processing Layer: DuckDB instances for PyRIT compatibility with automated sync and cleanup
  • Multi-Tenant Security: Organization-scoped PyRIT resources with complete data isolation
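
The bridge idea above can be sketched roughly as follows. This is a minimal illustration, not the planned implementation: the class name `MemoryBridgeSketch`, its methods, and the in-memory dicts standing in for DuckDB, PostgreSQL, and the Document DB are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryBridgeSketch:
    """Hypothetical bridge: buffers entries locally, then syncs them out."""

    organization_id: str
    # Stand-in for the temporary DuckDB instance PyRIT would write to.
    scratch: list = field(default_factory=list)
    # Stand-in for the PostgreSQL metadata layer.
    metadata_store: dict = field(default_factory=dict)
    # Stand-in for the Document DB evidence layer.
    document_store: dict = field(default_factory=dict)

    def add_entry(self, conversation_id: str, role: str, text: str) -> None:
        """Record one conversation message in the temporary layer."""
        self.scratch.append(
            {"conversation_id": conversation_id, "role": role, "text": text}
        )

    def sync_and_cleanup(self) -> int:
        """Copy buffered entries to the persistent stand-ins, then clear the buffer."""
        synced = 0
        for entry in self.scratch:
            conv_id = entry["conversation_id"]
            doc = self.document_store.setdefault(
                conv_id,
                {"organization_id": self.organization_id, "messages": []},
            )
            doc["messages"].append({"role": entry["role"], "text": entry["text"]})
            meta = self.metadata_store.setdefault(
                conv_id,
                {"organization_id": self.organization_id, "message_count": 0},
            )
            meta["message_count"] += 1
            synced += 1
        self.scratch.clear()
        return synced
```

Every persisted record carries the bridge's `organization_id`, which is the property the multi-tenant bullet depends on; a real bridge would also need transactional sync and failure recovery.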

High-Level Technical Approach

ADR-F2.2 Compliance Strategy:

  • PostgreSQL Layer: PyRIT metadata, session tracking, organization-scoped resource management
  • Document DB Layer: PyRIT conversation evidence, prompt/response pairs, scoring results
  • Blob Storage Layer: PyRIT memory archives, dataset exports, long-term retention
  • Temporary Processing: DuckDB instances for PyRIT engine compatibility with automatic sync

ADR-003 Security Integration:

  • RBAC Extension: PyRIT-specific roles (pyrit_user, pyrit_admin, pyrit_analyst)
  • ABAC Enforcement: Mandatory organization_id on all PyRIT resources with query-level isolation
  • Multi-Tenant Design: Complete data separation between organizations for all PyRIT operations
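
One way to picture the combined check is shown below. The three role names come from this issue; the `ROLE_ACTIONS` mapping, the `Principal` and `PyritResource` types, and both helper functions are hypothetical, since the issue does not define what each role may do.

```python
from dataclasses import dataclass

# Hypothetical role-to-action mapping; the issue names the roles only.
ROLE_ACTIONS = {
    "pyrit_user": {"read", "run"},
    "pyrit_analyst": {"read"},
    "pyrit_admin": {"read", "run", "delete"},
}


@dataclass(frozen=True)
class Principal:
    user_id: str
    organization_id: str
    roles: frozenset


@dataclass(frozen=True)
class PyritResource:
    resource_id: str
    organization_id: str


def is_allowed(principal: Principal, resource: PyritResource, action: str) -> bool:
    """ABAC check first (same organization), then RBAC check (role grants action)."""
    if principal.organization_id != resource.organization_id:
        return False
    return any(action in ROLE_ACTIONS.get(role, set()) for role in principal.roles)


def scope_to_org(resources: list, principal: Principal) -> list:
    """Query-level isolation: only return resources from the caller's organization."""
    return [r for r in resources if r.organization_id == principal.organization_id]
```

Checking the organization attribute before the role means that even an admin role cannot reach another organization's resources.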

ADR-007 Async Processing:

  • Task Queue Integration: Celery tasks for PyRIT memory initialization, dataset operations, archival processes
  • HTTP Polling Pattern: All long operations return 202 Accepted with status URLs
  • Progress Tracking: Real-time status updates for async PyRIT operations
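
The polling pattern can be illustrated without any framework. In this sketch the `TASKS` dict stands in for the Celery result backend, and the `/api/v1/tasks/{task_id}` path, the field names, and both function names are assumptions rather than the project's actual API.

```python
import uuid

# In-memory stand-in for the task result backend.
TASKS: dict = {}


def submit_task(operation: str, organization_id: str) -> tuple:
    """Accept a long-running operation and return (202, body with a status URL)."""
    task_id = str(uuid.uuid4())
    TASKS[task_id] = {
        "operation": operation,
        "organization_id": organization_id,
        "status": "PENDING",
        "progress": 0,
    }
    # Hypothetical status URL layout.
    body = {"task_id": task_id, "status_url": f"/api/v1/tasks/{task_id}"}
    return 202, body


def get_status(task_id: str, organization_id: str) -> tuple:
    """Return (200, status) for the owning organization, otherwise (404, error)."""
    task = TASKS.get(task_id)
    if task is None or task["organization_id"] != organization_id:
        # Same response for "missing" and "not yours" to avoid leaking task IDs.
        return 404, {"detail": "task not found"}
    return 200, {
        "task_id": task_id,
        "status": task["status"],
        "progress": task["progress"],
    }
```

A client would call `submit_task`, read `status_url` from the response body, and poll it until the status leaves `PENDING`.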

ADR-006 Serialization Compliance:

  • JSON-Exclusive APIs: All database management endpoints use application/json
  • Schema Design: Pydantic models for type-safe PyRIT operation requests/responses
  • No Binary Formats: Dataset operations handled through JSON APIs with base64 encoding if needed
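
As a rough illustration of the base64 option, binary dataset content can travel inside an ordinary JSON body. The envelope fields (`dataset_name`, `encoding`, `content`) and the function names are hypothetical, and plain functions are used here in place of the Pydantic models the issue calls for.

```python
import base64
import json


def encode_dataset_export(dataset_name: str, raw: bytes) -> str:
    """Wrap binary dataset content in a JSON envelope using base64."""
    payload = {
        "dataset_name": dataset_name,
        "encoding": "base64",
        "content": base64.b64encode(raw).decode("ascii"),
    }
    return json.dumps(payload)


def decode_dataset_export(body: str) -> bytes:
    """Recover the original bytes from the JSON envelope."""
    payload = json.loads(body)
    if payload.get("encoding") != "base64":
        raise ValueError("unsupported encoding")
    return base64.b64decode(payload["content"])
```

Base64 grows the payload by about one third, so this suits small exports; larger archives would more naturally go to the Blob Storage layer.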

ADR-010 Dependency Management:

  • PyRIT Security Scanning: Automated pip-audit scans for PyRIT and transitive dependencies
  • Dependabot Monitoring: Continuous PyRIT version tracking and security updates
  • License Compliance: PyRIT license compatibility verification (MIT/Apache approved)

Technology Integration:

  • Existing FastAPI Framework: Extend current API patterns for database management
  • Current SQLAlchemy Models: Add PyRIT metadata models with existing patterns
  • Established Middleware Stack: Leverage existing auth, audit, and security middleware
  • Current Monitoring: Extend Prometheus metrics and logging for PyRIT operations

Decomposition Plan

Phase 1: ADR-Compliant Foundation (Weeks 1-4)

ADR Compliance Infrastructure:

  • PyRIT Dependency Security Integration (ADR-010)
  • Polyglot Persistence Extension for PyRIT (ADR-F2.2)
  • Organization-Scoped PyRIT Security Framework (ADR-003)
  • Database Management Audit System Extension

Phase 2: PyRIT Memory Bridge & Async APIs (Weeks 5-8)

Core Integration Services:

  • PyRIT Memory Bridge Implementation (DuckDB ↔ Polyglot Storage)
  • Async Database Management APIs (ADR-007 Compliant)
  • JSON-Exclusive Database Endpoints (ADR-006 Compliant)
  • Organization-Isolated PyRIT Memory Sessions

Phase 3: Evidence Storage & Lifecycle (Weeks 9-12)

Data Management Services:

  • PyRIT Evidence Storage in Document DB Layer
  • Automated Data Lifecycle Management (90-day hot→cold policy)
  • PyRIT Dataset Operations (SeedPrompt, QA, ChatMessages)
  • Blob Storage Integration for PyRIT Archives
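
The 90-day hot→cold rule reduces to an age comparison. The sketch below assumes evidence records carry a timezone-aware `created_at` timestamp and that a record becomes cold at exactly 90 days; the record shape and function names are illustrative, not taken from the implementation plan.

```python
from datetime import datetime, timedelta, timezone

# Retention window from the 90-day hot-to-cold policy named in this epic.
HOT_RETENTION = timedelta(days=90)


def storage_tier(created_at: datetime, now: datetime) -> str:
    """Classify a record as 'hot' (Document DB) or 'cold' (Blob Storage)."""
    # Assumption: the boundary is inclusive, so day 90 is already cold.
    return "cold" if now - created_at >= HOT_RETENTION else "hot"


def select_for_archival(records: list, now: datetime) -> list:
    """Return the IDs of records that should be migrated to cold storage."""
    return [r["id"] for r in records if storage_tier(r["created_at"], now) == "cold"]
```

A scheduled task could call `select_for_archival` with `datetime.now(timezone.utc)` and hand the resulting IDs to the archival job.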

Phase 4: Enterprise Integration & Validation (Weeks 13-16)

Production Readiness:

  • Comprehensive ADR Compliance Testing
  • PyRIT Engine Compatibility Validation
  • Organization Isolation Security Testing
  • Performance Optimization & Monitoring Integration

Success Criteria

ADR Compliance Criteria:

  1. ADR-F2.2 Compliance: PyRIT data flows through PostgreSQL → Document DB → Blob Storage layers
  2. ADR-003 Compliance: All PyRIT resources include organization_id with RBAC+ABAC enforcement
  3. ADR-007 Compliance: All long PyRIT operations (>30s) use async task processing with status URLs
  4. ADR-006 Compliance: All database APIs use application/json exclusively
  5. ADR-010 Compliance: PyRIT dependency passes security scans and license compliance

PyRIT Engine Compatibility:

  1. Memory Operations: PyRIT orchestrators function normally with memory bridge
  2. Dataset Support: Full compatibility with SeedPromptDataset, QuestionAnsweringDataset, ChatMessagesDataset
  3. Performance: PyRIT operations maintain acceptable performance (<10% overhead)
  4. API Transparency: PyRIT engine unaware of underlying polyglot persistence

Security & Compliance:

  1. Organization Isolation: Zero cross-tenant data access in comprehensive security testing
  2. Audit Trail Completeness: All PyRIT operations generate proper audit events
  3. Data Lifecycle Compliance: PyRIT evidence follows the 90-day hot→cold archival policy
  4. Security Integration: PyRIT operations secured by existing middleware stack

Technical Excellence:

  1. API Performance: Database status operations <500ms (95th percentile)
  2. Concurrent Support: 100+ concurrent PyRIT operations
  3. Test Coverage: >85% code coverage including ADR compliance tests
  4. Documentation: Complete OpenAPI specs with PyRIT integration examples

Risk Mitigation Strategy

ADR Compliance Risks:

  1. Polyglot Complexity: PyRIT memory bridge adds architectural complexity
    • Mitigation: Comprehensive testing, clear documentation, phased rollout
  2. Performance Impact: Dual storage may affect PyRIT performance
    • Mitigation: Async background sync, intelligent caching, performance monitoring
  3. Security Model Integration: Complex RBAC+ABAC with PyRIT operations
    • Mitigation: Extensive security testing, clear access control patterns

PyRIT Compatibility Risks:

  1. Engine Integration: Memory bridge may break PyRIT functionality
    • Mitigation: Comprehensive PyRIT engine testing, compatibility validation
  2. Version Compatibility: PyRIT updates may break integration
    • Mitigation: Automated testing, version pinning, upgrade testing procedures

Technical Implementation Risks:

  1. Data Synchronization: Complex sync between DuckDB and polyglot storage
    • Mitigation: Transactional patterns, data integrity checks, recovery procedures
  2. Resource Management: Memory and storage resource consumption
    • Mitigation: Resource quotas, monitoring, automated cleanup policies
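
A data integrity check for the sync risk could compare content digests between the temporary layer and the persistent layer. This is only a sketch: both stores are modeled as plain dicts keyed by record ID, and the function names are hypothetical.

```python
import hashlib
import json


def content_digest(record: dict) -> str:
    """SHA-256 over a canonical JSON form, so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_sync_mismatches(source: dict, target: dict) -> list:
    """Return sorted IDs that are missing from target or differ in content."""
    mismatched = []
    for key, record in source.items():
        stored = target.get(key)
        if stored is None or content_digest(stored) != content_digest(record):
            mismatched.append(key)
    return sorted(mismatched)
```

Records reported by `find_sync_mismatches` would be candidates for a re-sync or a recovery procedure before the temporary instance is cleaned up.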

Verification Requirements

ADR Compliance Verification:

  • ADR-F2.2 Testing: Verify PyRIT data properly flows through all storage layers
  • ADR-003 Testing: Confirm organization isolation with security penetration testing
  • ADR-007 Testing: Validate all long operations use async task patterns
  • ADR-006 Testing: Ensure all APIs use JSON exclusively
  • ADR-010 Testing: Verify PyRIT dependency security and license compliance

PyRIT Engine Verification:

  • Memory Bridge Testing: PyRIT orchestrators function normally
  • Dataset Compatibility: All PyRIT dataset types work correctly
  • Performance Testing: PyRIT operations stay within the <10% memory bridge overhead target
  • Integration Testing: End-to-end PyRIT workflows function properly

Security Verification:

  • Multi-Tenant Testing: Organizations cannot access each other's PyRIT data
  • RBAC/ABAC Testing: Role-based access properly enforced for PyRIT operations
  • Audit Trail Testing: All PyRIT operations generate complete audit events
  • Data Lifecycle Testing: PyRIT evidence follows archival policies

This epic represents a sophisticated integration challenge that strengthens violentutf-api's architectural integrity while enabling advanced PyRIT capabilities. The solution demonstrates how enterprise architecture principles can be maintained while integrating complex external systems.

Implementation Reference

Detailed implementation guidance is available in:

  • /docs/planning/database_management_api_implementation_plan_v2_adr_compliant.md

Key architectural innovations include the PyRIT Memory Bridge pattern and evidence storage integration that maintains both PyRIT engine compatibility and full ADR compliance.
