[EPIC] ADR-Compliant Database Management & PyRIT Integration Platform
UAT Specification
---
schemaVersion: "uat-v1.0"
issueID: "ADR-COMPLIANT-PYRIT-DB-PLATFORM"
type: "epic"
status: "pending-acceptance"
priority: 1
assignee: ""
spec:
  businessGoal: "Implement comprehensive database management API endpoints that enable PyRIT memory integration while maintaining full compliance with existing Architecture Decision Records (ADR-F2.2, ADR-003, ADR-007, ADR-006, ADR-010), creating an enterprise-grade database management platform within violentutf-api's established polyglot persistence architecture"
  keyObjectives:
    - "Create PyRIT memory bridge that integrates DuckDB engine with ADR-F2.2 polyglot persistence (PostgreSQL + Document DB + Blob Storage)"
    - "Implement organization-isolated database operations following ADR-003 RBAC+ABAC security model"
    - "Build async database management APIs using ADR-007 task processing patterns (202 Accepted + status URLs)"
    - "Develop JSON-exclusive database endpoints compliant with ADR-006 serialization standards"
    - "Establish PyRIT dependency management following ADR-010 security and compliance policies"
    - "Enable PyRIT dataset operations (SeedPromptDataset, QuestionAnsweringDataset, ChatMessagesDataset) within enterprise architecture"
    - "Create evidence storage layer for PyRIT conversations in Document DB with automated archival to Blob Storage"
    - "Implement comprehensive audit trails for all database operations using existing audit infrastructure"
  successMetrics:
    - metric: "ADR Compliance Score"
      target: "100%"
      measurement: "All database operations comply with ADR-F2.2, ADR-003, ADR-007, ADR-006, ADR-010"
    - metric: "PyRIT Engine Compatibility"
      target: "100%"
      measurement: "All PyRIT memory operations function normally with memory bridge"
    - metric: "Organization Isolation"
      target: "100%"
      measurement: "Zero cross-organization data access in security testing"
    - metric: "API Response Performance"
      target: "<500ms"
      measurement: "95th percentile for database status operations"
    - metric: "Async Operation Coverage"
      target: "100%"
      measurement: "All operations >30 seconds use async task processing"
    - metric: "Data Lifecycle Compliance"
      target: "100%"
      measurement: "PyRIT evidence follows 90-day hot→cold migration policy"
  estimatedScope:
    storiesCount: 24
    estimatedWeeks: 16
quality:
  performance:
    response_time_p95_ms: 500
    concurrent_operations: 100
    memory_bridge_overhead_max: 10
  security:
    organization_isolation_required: true
    rbac_abac_enforcement_required: true
    audit_trail_required: true
    security_scan_required: true
  maintainability:
    adr_compliance_required: true
    api_documentation_required: true
    test_coverage_min: 85
    pyrit_compatibility_required: true
relations:
  parent: null
  dependsOn: []
  relatesTo: ["ADR-F2.2", "ADR-003", "ADR-007", "ADR-006", "ADR-010", "ViolentUTF Issue #116"]
  childStories: []
adrs:
  - "ADR-F2.2: Polyglot Persistence Strategy"
  - "ADR-003: Hybrid RBAC+ABAC Authorization"
  - "ADR-007: Async Task Processing"
  - "ADR-006: JSON Data Serialization"
  - "ADR-010: Software Dependency Management"
---
Business Context
This epic creates a comprehensive database management platform that bridges PyRIT engine requirements with violentutf-api's established enterprise architecture, ensuring full compliance with all existing Architecture Decision Records while enabling advanced AI security evaluation workflows.
Stakeholders:
- Enterprise Security Teams: Need PyRIT-powered security evaluation within compliant enterprise architecture
- Platform Operations: Require database operations that align with established polyglot persistence strategy
- Compliance Teams: Need assurance that PyRIT integration maintains ADR compliance for regulatory requirements
- Development Teams: Need clear PyRIT integration patterns that follow established architectural principles
- Security Researchers: Need PyRIT capabilities without compromising multi-tenant security model
Business Problem:
- PyRIT engine requires DuckDB memory, but ADR-F2.2 mandates polyglot persistence (PostgreSQL + Document DB + Blob Storage)
- Missing database management APIs that enterprise workflows require for PyRIT operations
- No organization-isolated PyRIT memory management compliant with ADR-003 RBAC+ABAC model
- Lack of async database operations following ADR-007 task processing patterns
- PyRIT dependency management not aligned with ADR-010 security policies
- Missing integration between PyRIT evidence storage and established data lifecycle policies
Expected Value:
- ADR-Compliant Architecture: PyRIT integration that strengthens rather than compromises established architectural decisions
- Enterprise PyRIT Operations: Full PyRIT functionality within multi-tenant, organization-isolated environment
- Seamless Developer Experience: PyRIT operations appear native while leveraging enterprise infrastructure
- Data Lifecycle Compliance: PyRIT evidence properly managed through hot→warm→cold storage lifecycle
- Security Assurance: PyRIT operations maintain RBAC+ABAC security model with comprehensive audit trails
- Operational Excellence: Database management APIs that integrate with existing monitoring and alerting systems
Architectural Innovation:
- PyRIT Memory Bridge: Presents DuckDB interface to PyRIT engine while storing data in polyglot persistence layers
- Evidence Storage Integration: PyRIT conversations stored as evidence documents in Document DB layer
- Temporary Processing Layer: DuckDB instances for PyRIT compatibility with automated sync and cleanup
- Multi-Tenant Security: Organization-scoped PyRIT resources with complete data isolation
High-Level Technical Approach
ADR-F2.2 Compliance Strategy:
- PostgreSQL Layer: PyRIT metadata, session tracking, organization-scoped resource management
- Document DB Layer: PyRIT conversation evidence, prompt/response pairs, scoring results
- Blob Storage Layer: PyRIT memory archives, dataset exports, long-term retention
- Temporary Processing: DuckDB instances for PyRIT engine compatibility with automatic sync
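The layer split above can be pictured as a small router that decides which store receives each kind of PyRIT record when the temporary processing store is flushed. The following is a minimal sketch, not the project's implementation: the record kind names, the functions route_record and sync_temporary_rows, and the in-memory lists standing in for PostgreSQL, Document DB, and Blob Storage are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical mapping from record kind to storage layer, following the
# layer descriptions in this issue. The kind names are illustrative only.
LAYER_FOR_KIND = {
    "session_metadata": "postgresql",
    "conversation_evidence": "document_db",
    "score_result": "document_db",
    "memory_archive": "blob_storage",
    "dataset_export": "blob_storage",
}


@dataclass
class PolyglotStores:
    """In-memory stand-ins for the three persistence layers."""
    postgresql: list = field(default_factory=list)
    document_db: list = field(default_factory=list)
    blob_storage: list = field(default_factory=list)


def route_record(stores: PolyglotStores, kind: str,
                 organization_id: str, payload: Any) -> str:
    """Store one record in the layer assigned to its kind and return the layer name."""
    if kind not in LAYER_FOR_KIND:
        raise ValueError(f"unknown record kind: {kind}")
    layer = LAYER_FOR_KIND[kind]
    # Every record carries organization_id so later queries can be scoped.
    record = {"kind": kind, "organization_id": organization_id, "payload": payload}
    getattr(stores, layer).append(record)
    return layer


def sync_temporary_rows(stores: PolyglotStores, organization_id: str,
                        rows: list) -> dict:
    """Flush (kind, payload) rows from a temporary store; return counts per layer."""
    counts: dict = {}
    for kind, payload in rows:
        layer = route_record(stores, kind, organization_id, payload)
        counts[layer] = counts.get(layer, 0) + 1
    return counts
```

In a real bridge the rows would come from the temporary DuckDB instance and the stores would be database clients; the sketch only shows the routing decision.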
ADR-003 Security Integration:
- RBAC Extension: PyRIT-specific roles (pyrit_user, pyrit_admin, pyrit_analyst)
- ABAC Enforcement: Mandatory organization_id on all PyRIT resources with query-level isolation
- Multi-Tenant Design: Complete data separation between organizations for all PyRIT operations
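Query-level isolation means the organization filter is part of every query rather than something applied afterwards. A minimal sketch of the idea follows, using the standard-library sqlite3 module as a stand-in for PostgreSQL. The table name pyrit_sessions, the list_sessions function, and the permissions assigned to each role are assumptions for illustration; the issue names the roles but does not define what each may do.

```python
import sqlite3

# Hypothetical role-to-permission mapping. The role names come from this
# issue; the permissions granted to each role are assumed.
ROLE_PERMISSIONS = {
    "pyrit_admin": {"read", "write", "delete"},
    "pyrit_user": {"read", "write"},
    "pyrit_analyst": {"read"},
}


def create_schema(conn: sqlite3.Connection) -> None:
    """Create an illustrative session table with a mandatory organization_id."""
    conn.execute(
        "CREATE TABLE pyrit_sessions ("
        "id INTEGER PRIMARY KEY, "
        "organization_id TEXT NOT NULL, "
        "name TEXT NOT NULL)"
    )


def list_sessions(conn: sqlite3.Connection, organization_id: str, role: str) -> list:
    """Return sessions for one organization, after an RBAC check on the role."""
    if "read" not in ROLE_PERMISSIONS.get(role, set()):
        raise PermissionError(f"role {role!r} may not read PyRIT sessions")
    # The organization filter is bound as a parameter inside the query itself,
    # so rows from other tenants are never fetched.
    rows = conn.execute(
        "SELECT id, name FROM pyrit_sessions "
        "WHERE organization_id = ? ORDER BY id",
        (organization_id,),
    ).fetchall()
    return [{"id": row[0], "name": row[1]} for row in rows]
```

Taking organization_id from the authenticated request context rather than from client input is what makes this an ABAC control; the sketch leaves that wiring out.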
ADR-007 Async Processing:
- Task Queue Integration: Celery tasks for PyRIT memory initialization, dataset operations, archival processes
- HTTP Polling Pattern: All long operations return 202 Accepted with status URLs
- Progress Tracking: Real-time status updates for async PyRIT operations
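The polling pattern above can be sketched without a web framework or task queue. The example below is an assumption-laden illustration, not the ADR-007 implementation: the TaskRegistry class, the state names, and the /api/v1/tasks/{task_id} path are hypothetical, and an in-memory dictionary stands in for the Celery result backend.

```python
import uuid
from dataclasses import dataclass


@dataclass
class TaskRecord:
    task_id: str
    operation: str
    state: str = "PENDING"
    progress: int = 0


class TaskRegistry:
    """In-memory stand-in for a task queue plus its status store."""

    def __init__(self) -> None:
        self._tasks: dict = {}

    def submit(self, operation: str) -> tuple:
        """Register a long-running operation and return (202, response body)."""
        task_id = str(uuid.uuid4())
        self._tasks[task_id] = TaskRecord(task_id, operation)
        body = {
            "task_id": task_id,
            "status": "PENDING",
            # Hypothetical status URL that the client polls.
            "status_url": f"/api/v1/tasks/{task_id}",
        }
        return 202, body

    def update(self, task_id: str, state: str, progress: int) -> None:
        """Called by the worker to report progress."""
        task = self._tasks[task_id]
        task.state = state
        task.progress = progress

    def status(self, task_id: str) -> tuple:
        """Return (HTTP status code, body) for a status poll."""
        task = self._tasks.get(task_id)
        if task is None:
            return 404, {"detail": "task not found"}
        return 200, {
            "task_id": task.task_id,
            "operation": task.operation,
            "status": task.state,
            "progress": task.progress,
        }
```

The client receives the status URL immediately and polls it until the state is terminal, so no HTTP request stays open for the duration of the operation.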
ADR-006 Serialization Compliance:
- JSON-Exclusive APIs: All database management endpoints use application/json
- Schema Design: Pydantic models for type-safe PyRIT operation requests/responses
- No Binary Formats: Dataset operations handled through JSON APIs with base64 encoding if needed
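Carrying binary dataset content inside a JSON body could look like the following sketch. The envelope field names (name, encoding, size_bytes, content) and both function names are hypothetical; only the use of base64 inside application/json comes from this issue.

```python
import base64
import json


def encode_dataset_export(name: str, raw: bytes) -> str:
    """Wrap raw bytes in a JSON envelope using base64 for the content."""
    envelope = {
        "name": name,
        "encoding": "base64",
        "size_bytes": len(raw),
        "content": base64.b64encode(raw).decode("ascii"),
    }
    return json.dumps(envelope)


def decode_dataset_export(body: str) -> bytes:
    """Parse the JSON envelope and return the original bytes."""
    envelope = json.loads(body)
    if envelope.get("encoding") != "base64":
        raise ValueError("unsupported encoding")
    # validate=True rejects characters outside the base64 alphabet.
    raw = base64.b64decode(envelope["content"], validate=True)
    if len(raw) != envelope["size_bytes"]:
        raise ValueError("size mismatch after decoding")
    return raw
```

Base64 inflates the payload by roughly a third, which is one reason large exports may be better served by a reference to Blob Storage than by inline content.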
ADR-010 Dependency Management:
- PyRIT Security Scanning: Automated pip-audit scans for PyRIT and transitive dependencies
- Dependabot Monitoring: Continuous PyRIT version tracking and security updates
- License Compliance: PyRIT license compatibility verification (MIT/Apache approved)
Technology Integration:
- Existing FastAPI Framework: Extend current API patterns for database management
- Current SQLAlchemy Models: Add PyRIT metadata models with existing patterns
- Established Middleware Stack: Leverage existing auth, audit, and security middleware
- Current Monitoring: Extend Prometheus metrics and logging for PyRIT operations
Decomposition Plan
Phase 1: ADR-Compliant Foundation (Weeks 1-4)
ADR Compliance Infrastructure:
- PyRIT Dependency Security Integration (ADR-010)
- Polyglot Persistence Extension for PyRIT (ADR-F2.2)
- Organization-Scoped PyRIT Security Framework (ADR-003)
- Database Management Audit System Extension
Phase 2: PyRIT Memory Bridge & Async APIs (Weeks 5-8)
Core Integration Services:
- PyRIT Memory Bridge Implementation (DuckDB ↔ Polyglot Storage)
- Async Database Management APIs (ADR-007 Compliant)
- JSON-Exclusive Database Endpoints (ADR-006 Compliant)
- Organization-Isolated PyRIT Memory Sessions
Phase 3: Evidence Storage & Lifecycle (Weeks 9-12)
Data Management Services:
- PyRIT Evidence Storage in Document DB Layer
- Automated Data Lifecycle Management (90-day hot→cold policy)
- PyRIT Dataset Operations (SeedPrompt, QA, ChatMessages)
- Blob Storage Integration for PyRIT Archives
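The 90-day hot→cold policy in this phase reduces to a cutoff comparison on each evidence document's creation time. A minimal sketch, assuming each document is a dictionary with a timezone-aware created_at field and that only documents strictly older than 90 days are moved; the function name select_for_archival and the boundary choice are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Retention window for the hot tier, taken from the 90-day policy.
HOT_RETENTION = timedelta(days=90)


def select_for_archival(documents: list, now: datetime) -> list:
    """Return the documents whose created_at is strictly older than the window."""
    cutoff = now - HOT_RETENTION
    return [doc for doc in documents if doc["created_at"] < cutoff]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    docs = [
        {"id": "old", "created_at": now - timedelta(days=120)},
        {"id": "recent", "created_at": now - timedelta(days=5)},
    ]
    for doc in select_for_archival(docs, now):
        print("archive:", doc["id"])
```

Passing now in explicitly keeps the selection deterministic, which makes the policy easy to cover in the lifecycle tests listed under Verification Requirements.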
Phase 4: Enterprise Integration & Validation (Weeks 13-16)
Production Readiness:
- Comprehensive ADR Compliance Testing
- PyRIT Engine Compatibility Validation
- Organization Isolation Security Testing
- Performance Optimization & Monitoring Integration
Success Criteria
ADR Compliance Criteria:
- ADR-F2.2 Compliance: PyRIT data flows through PostgreSQL → Document DB → Blob Storage layers
- ADR-003 Compliance: All PyRIT resources include organization_id with RBAC+ABAC enforcement
- ADR-007 Compliance: All long PyRIT operations (>30s) use async task processing with status URLs
- ADR-006 Compliance: All database APIs use application/json exclusively
- ADR-010 Compliance: PyRIT dependency passes security scans and license compliance
PyRIT Engine Compatibility:
- Memory Operations: PyRIT orchestrators function normally with memory bridge
- Dataset Support: Full compatibility with SeedPromptDataset, QuestionAnsweringDataset, ChatMessagesDataset
- Performance: PyRIT operations maintain acceptable performance (<10% overhead)
- API Transparency: PyRIT engine unaware of underlying polyglot persistence
Security & Compliance:
- Organization Isolation: Zero cross-tenant data access in comprehensive security testing
- Audit Trail Completeness: All PyRIT operations generate proper audit events
- Data Lifecycle Compliance: PyRIT evidence follows established archival policies
- Security Integration: PyRIT operations secured by existing middleware stack
Technical Excellence:
- API Performance: Database status operations <500ms (95th percentile)
- Concurrent Support: 100+ concurrent PyRIT operations
- Test Coverage: >85% code coverage including ADR compliance tests
- Documentation: Complete OpenAPI specs with PyRIT integration examples
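The API Performance criterion depends on how the 95th percentile is computed. One common definition is the nearest-rank method, sketched below; the issue does not say which definition applies, so the method and the function names percentile_nearest_rank and meets_p95_target are assumptions.

```python
def percentile_nearest_rank(samples_ms: list, percentile: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least
    `percentile` percent of all samples at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    if not 1 <= percentile <= 100:
        raise ValueError("percentile must be between 1 and 100")
    ordered = sorted(samples_ms)
    # Integer ceiling of percentile * n / 100, avoiding float rounding.
    rank = (percentile * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_p95_target(samples_ms: list, target_ms: float = 500) -> bool:
    """True when the 95th percentile latency is strictly below the target."""
    return percentile_nearest_rank(samples_ms, 95) < target_ms
```

With this definition a few slow outliers do not fail the check as long as they make up no more than 5% of the samples.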
Risk Mitigation Strategy
ADR Compliance Risks:
- Polyglot Complexity: PyRIT memory bridge adds architectural complexity
- Mitigation: Comprehensive testing, clear documentation, phased rollout
- Performance Impact: Dual storage may affect PyRIT performance
- Mitigation: Async background sync, intelligent caching, performance monitoring
- Security Model Integration: Complex RBAC+ABAC with PyRIT operations
- Mitigation: Extensive security testing, clear access control patterns
PyRIT Compatibility Risks:
- Engine Integration: Memory bridge may break PyRIT functionality
- Mitigation: Comprehensive PyRIT engine testing, compatibility validation
- Version Compatibility: PyRIT updates may break integration
- Mitigation: Automated testing, version pinning, upgrade testing procedures
Technical Implementation Risks:
- Data Synchronization: Complex sync between DuckDB and polyglot storage
- Mitigation: Transactional patterns, data integrity checks, recovery procedures
- Resource Management: Memory and storage resource consumption
- Mitigation: Resource quotas, monitoring, automated cleanup policies
Verification Requirements
ADR Compliance Verification:
- ADR-F2.2 Testing: Verify PyRIT data properly flows through all storage layers
- ADR-003 Testing: Confirm organization isolation with security penetration testing
- ADR-007 Testing: Validate all long operations use async task patterns
- ADR-006 Testing: Ensure all APIs use JSON exclusively
- ADR-010 Testing: Verify PyRIT dependency security and license compliance
PyRIT Engine Verification:
- Memory Bridge Testing: PyRIT orchestrators function normally
- Dataset Compatibility: All PyRIT dataset types work correctly
- Performance Testing: PyRIT operations maintain acceptable performance
- Integration Testing: End-to-end PyRIT workflows function properly
Security Verification:
- Multi-Tenant Testing: Organizations cannot access each other's PyRIT data
- RBAC/ABAC Testing: Role-based access properly enforced for PyRIT operations
- Audit Trail Testing: All PyRIT operations generate complete audit events
- Data Lifecycle Testing: PyRIT evidence follows archival policies
This epic is an integration effort that must preserve violentutf-api's architectural integrity while enabling PyRIT capabilities. It is intended to show that enterprise architecture principles can be maintained when integrating a complex external system.
Implementation Reference
Detailed implementation guidance is available in:
/docs/planning/database_management_api_implementation_plan_v2_adr_compliant.md
Key architectural innovations include the PyRIT Memory Bridge pattern and evidence storage integration that maintains both PyRIT engine compatibility and full ADR compliance.