
feat(cloud-security): cloud tests v2 — services, remediation, multi-provider adapters#2493

Open
tofikwest wants to merge 50 commits into main from tofik/q1-cloud-tests-v2

Conversation

@tofikwest
Contributor

Summary

Major upgrade to cloud security scanning and remediation across AWS, Azure, and GCP:

  • 45 AWS service adapters — S3, IAM, RDS, Lambda, ECS/EKS, CloudFront, DynamoDB, ElastiCache, and 37 more, each with dedicated security checks
  • 13 Azure service adapters — AKS, Key Vault, Entra ID, Cosmos DB, Storage Account, SQL Database, etc.
  • GCP remediation — AI-driven fix generation via SCC integration
  • Remediation engine — preview/execute/rollback flow with acknowledgment, batch support via Trigger.dev task
  • Services grid UI — findings grouped by cloud service with per-service cards
  • Integration provider detail pages — connection management, activity tracking, account settings
  • Integration platform extensions: services metadata on manifests, services controller, scheduled scan popover

Key files

| Area | Key files |
| --- | --- |
| AWS adapters | `apps/api/src/cloud-security/providers/aws/*.adapter.ts` (45 files) |
| Azure adapters | `apps/api/src/cloud-security/providers/azure/*.adapter.ts` (13 files) |
| Remediation | `remediation.controller.ts`, `remediation.service.ts`, `*-remediation.service.ts` |
| Batch remediation | `apps/app/src/trigger/tasks/cloud-security/remediate-batch.ts` |
| Frontend | `cloud-tests/components/`, `integrations/[slug]/components/` |
| Schema | `remediation-action.prisma`, `remediation-batch.prisma` |

Conflict resolution notes (rebased onto main)

  • Merged main's syncDefinition support with our services support across all layers (Prisma, types, repo, controller, manifest loader)
  • Respected main's Ramp integration removal — cleaned up orphaned references
  • Kept main's additional GCP SCC error handling (Legacy, API not enabled)

Test plan

  • Verify AWS cloud test scan runs successfully with new adapters
  • Verify Azure scan with new service adapters
  • Verify GCP scan + SCC error handling
  • Test remediation preview → execute → rollback flow
  • Test batch remediation via Trigger.dev
  • Verify services grid groups findings correctly
  • Verify integration detail page loads and manages connections
  • Run npx jest src/cloud-security --passWithNoTests for API tests
  • Run typecheck: npx turbo run typecheck

🤖 Generated with Claude Code

…ti-provider adapters

Major upgrade to cloud security scanning and remediation across AWS, Azure, and GCP:

- Add 45 AWS service-specific adapters (S3, IAM, RDS, Lambda, ECS, etc.)
- Add 13 Azure service adapters (AKS, Key Vault, Entra ID, Cosmos DB, etc.)
- Add GCP remediation service with AI-driven fix generation
- Add remediation controller with preview/execute/rollback flow
- Add batch remediation support with Trigger.dev task
- Add services grid UI for grouping findings by cloud service
- Add integration provider detail pages with connection management
- Add scheduled scan popover and activity tracking
- Extend integration platform with services metadata support

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cursor

cursor bot commented Apr 9, 2026

PR Summary

High Risk
Introduces automated remediation execution/rollback flows that can modify cloud infrastructure across AWS/GCP/Azure, plus expands auth behavior for service tokens to impersonate users via x-user-id. These changes increase security and operational risk due to new write-capable automation and expanded trust surface.

Overview
Adds an AI-driven remediation system for cloud-security findings across AWS, GCP, and Azure, including plan generation/refinement from real provider state, execution with retries/self-healing, and rollback support with persisted remediationAction audit trails.

Expands cloud-security API capabilities with new endpoints for activity, AWS/GCP service detection and setup flows (GCP org/project auto-detection and setup; Azure subscription/permission validation), and updates scan behavior to support baseline services plus detected/disabled service filtering and AWS task auto-satisfaction when service findings fully pass.

Updates auth for service tokens: UserId now returns system when no user context is present, and HybridAuthGuard can accept x-user-id for validated “act on behalf of” behavior. Also adds safety/validation layers for remediation executors (AWS blocked commands + param normalization; GCP/Azure URL validation/SSRF protections) and significantly expands AWS SDK client dependencies to support many services.

Reviewed by Cursor Bugbot for commit aef367b.

- Fix Azure remediation always recording 'success' regardless of
  verification outcome (was `verified ? 'success' : 'success'`)
- Remove InvalidInputException from idempotent success list in AWS
  executor — it indicates real validation errors, not duplicates
- Re-check blocked commands after fuzzy match resolution to prevent
  bypass via AI-generated command name variants
- Remove dead code ensureProvidersRegistered (superseded by
  azure-command-executor's auto-registration)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
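
The blocked-command re-check from this commit can be sketched roughly as follows. This is a minimal illustration rather than the PR's code: `BLOCKED_COMMANDS`, `KNOWN_COMMANDS`, `canonical`, `resolveCommandName`, and `assertCommandAllowed` are hypothetical names, and the matcher is a simple normalizer standing in for whatever fuzzy matching the AWS executor actually uses.

```typescript
// Hypothetical block list and command registry (assumptions for illustration).
const BLOCKED_COMMANDS = new Set(["DeleteBucketCommand", "DeleteUserCommand"]);
const KNOWN_COMMANDS = [
  "DeleteBucketCommand",
  "PutBucketEncryptionCommand",
  "DeleteUserCommand",
];

// Stand-in fuzzy matcher: ignores case, separators, and a missing "Command" suffix.
function canonical(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "").replace(/command$/, "");
}

function resolveCommandName(requested: string): string | null {
  const key = canonical(requested);
  const match = KNOWN_COMMANDS.find((known) => canonical(known) === key);
  return match === undefined ? null : match;
}

// Check the block list on the raw name and again on the resolved name, so a
// variant such as "delete_bucket" cannot slip past the first check.
function assertCommandAllowed(requested: string): string {
  if (BLOCKED_COMMANDS.has(requested)) {
    throw new Error(`Blocked command: ${requested}`);
  }
  const resolved = resolveCommandName(requested);
  if (resolved === null) {
    throw new Error(`Unknown command: ${requested}`);
  }
  if (BLOCKED_COMMANDS.has(resolved)) {
    throw new Error(`Blocked command after resolution: ${resolved}`);
  }
  return resolved;
}
```

Without the second check, `delete_bucket` would pass the raw-name test and then resolve to a blocked command.
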
- Azure validator: parse URLs with new URL() and check hostname against
  allowlist instead of using .includes() substring match (CodeQL fix)
- GCP validator: same — parse hostname properly, check against
  *.googleapis.com instead of substring match (CodeQL fix)
- Azure detectNeededRole: return null when no specific role matches
  instead of always falling back to Contributor (privilege escalation fix)
- Pre-flight ensureWriteAccess now passes a matching keyword to
  explicitly trigger the Contributor role condition

The GCP security service errorText.includes('securitycenter.googleapis.com')
is a CodeQL false positive — it checks error message content, not URLs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
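
To illustrate why the validators moved from `.includes()` to a parsed-hostname check, here is a small sketch for the GCP case. The function names are hypothetical, and the HTTPS requirement is an added assumption that the commit message does not state.

```typescript
// Suffix taken from the commit message (*.googleapis.com).
const ALLOWED_GCP_SUFFIX = ".googleapis.com";

// Unsafe: a substring match accepts any URL that merely mentions the domain.
function isAllowedBySubstring(rawUrl: string): boolean {
  return rawUrl.includes("googleapis.com");
}

// Safer: parse the URL and compare the parsed hostname against the suffix.
function isAllowedByHostname(rawUrl: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(rawUrl);
  } catch (_err) {
    return false; // not a valid absolute URL
  }
  // Assumption: only HTTPS endpoints are acceptable.
  if (parsed.protocol !== "https:") return false;
  return parsed.hostname.toLowerCase().endsWith(ALLOWED_GCP_SUFFIX);
}
```

A URL such as `https://evil.example/?x=googleapis.com` passes the substring check but fails the hostname check, which is the class of bypass CodeQL flags.
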
- Remove redundant securitycenter.googleapis.com substring check in
  GCP error detection — already covered by 'Security Command Center API'
  and 'has not been used' conditions (CodeQL fix)
- Allow rollback of 'unverified' remediation actions across all three
  providers (AWS, Azure, GCP) — the previous fix introduced 'unverified'
  status but rollback only accepted 'success', making unverified actions
  that changed infrastructure impossible to undo
- Clone step.params before passing to executeAwsCommand to prevent
  normaliseInputParams from mutating the original plan objects, which
  corrupts the plan cache and stored appliedState

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
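
The params-cloning fix can be illustrated with a small sketch. `normaliseInPlace` and `executeStep` are hypothetical stand-ins for `normaliseInputParams` and `executeAwsCommand`, and the key-renaming behavior is invented purely to show an in-place mutation.

```typescript
type StepParams = Record<string, unknown>;

interface PlanStep {
  command: string;
  params: StepParams;
}

// Stand-in normalizer: renames keys to PascalCase, mutating its argument in place.
function normaliseInPlace(params: StepParams): StepParams {
  for (const key of Object.keys(params)) {
    const pascal = key.charAt(0).toUpperCase() + key.slice(1);
    if (pascal !== key) {
      params[pascal] = params[key];
      delete params[key];
    }
  }
  return params;
}

// Clone before normalizing so the cached plan and stored state stay untouched.
function executeStep(step: PlanStep): StepParams {
  const params = structuredClone(step.params);
  return normaliseInPlace(params);
}
```

Passing `step.params` directly would rewrite the plan object itself, which is how a cached plan or stored `appliedState` ends up corrupted.
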
Service tokens now verify that the x-user-id header references a real
member of the target organization before accepting it. Previously, any
arbitrary user ID was trusted without validation, which could create
misleading audit trails.

Also removes accidentally committed generated Prisma schema files
(apps/*/prisma/schema.prisma) that are build artifacts from
generate-prisma-client-js.js.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
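
The `x-user-id` validation described in this commit might look roughly like the sketch below. The in-memory `MEMBERS` table and the `resolveActingUser` function are hypothetical; the real guard presumably performs an asynchronous database lookup. The `system` fallback follows the PR summary's note that the user ID resolves to `system` when no user context is present.

```typescript
// In-memory membership table standing in for a database query (an assumption).
const MEMBERS: Record<string, Set<string>> = {
  org_1: new Set(["usr_alice"]),
};

// Resolves the acting user for a service-token request.
function resolveActingUser(
  organizationId: string,
  headerUserId: string | undefined,
): string {
  // No header: the service token acts as itself.
  if (headerUserId === undefined || headerUserId === "") return "system";

  // Header present: accept it only if it names a member of the target organization.
  const members = MEMBERS[organizationId];
  if (members === undefined || !members.has(headerUserId)) {
    throw new Error("x-user-id does not reference a member of this organization");
  }
  return headerUserId;
}
```
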
The two patterns got merged into one nonsensical line during rebase.
Restore the original .superpowers/* coverage and add .claude/worktrees/
as a separate entry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tofikwest and others added 2 commits April 10, 2026 14:50
…gner

Root package.json pinned @aws-sdk/client-s3 to 3.1013.0 which hoisted
over the app's ^3.859.0, making it incompatible with
@aws-sdk/s3-request-presigner (getSignedUrl type error). AWS SDK deps
belong in apps/api/package.json only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The app had @aws-sdk/client-s3@^3.859.0 while the api added 30+ AWS
SDK clients at ^3.948.0. Bun hoisted the newer @smithy/types which
was incompatible with the app's older s3-request-presigner, causing
a type error on getSignedUrl. Bumped all app AWS SDK deps to ^3.948.0
to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tofikwest and others added 3 commits April 10, 2026 15:22
BASELINE_SERVICES had 'iam' but the IAM adapter's serviceId is
'iam-analyzer'. When service filtering was active, IAM security
checks were silently skipped from baseline scans.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
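
A short sketch of how an ID mismatch like this silently drops an adapter. Only the `iam` versus `iam-analyzer` mismatch comes from the commit; the adapter list, the other IDs, and both helper functions are illustrative assumptions.

```typescript
interface ServiceAdapter {
  serviceId: string;
}

// Illustrative adapter list; only "iam-analyzer" is taken from the commit message.
const adapters: ServiceAdapter[] = [
  { serviceId: "s3" },
  { serviceId: "iam-analyzer" },
  { serviceId: "rds" },
];

// Selects adapters whose serviceId appears in the baseline list.
function selectBaseline(all: ServiceAdapter[], baseline: string[]): string[] {
  const wanted = new Set(baseline);
  return all.filter((a) => wanted.has(a.serviceId)).map((a) => a.serviceId);
}

// Reports baseline IDs that match no adapter, which would surface the mismatch early.
function unmatchedBaseline(all: ServiceAdapter[], baseline: string[]): string[] {
  const known = new Set(all.map((a) => a.serviceId));
  return baseline.filter((id) => !known.has(id));
}
```

With a baseline of `["s3", "iam"]`, `selectBaseline` returns only `s3` and raises no error. A check along the lines of `unmatchedBaseline` at startup is one way to catch this kind of drift.
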
Both apps now use the exact same version, eliminating the duplicate
nested @smithy/types that caused the getSignedUrl type error on Vercel.
The ^3.948.0 range wasn't enough — bun resolved presigner to 3.957.0
which bundled @smithy/types@4.11.0 while client-s3@3.1013.0 used
@smithy/types@4.13.1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tofikwest and others added 3 commits April 10, 2026 16:15
The auto-rollback path in executePlanSteps passed rbStep.params
directly to executeAwsCommand, unlike the normal path which uses
structuredClone. normaliseInputParams mutates in place, corrupting
the stored rollback steps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bun on Vercel resolves duplicate @smithy/types copies for client-s3
and s3-request-presigner even when pinned to the same version. This
causes a private property 'handlers' type conflict on S3Client.

Added root-level pins to force hoisting and a typed wrapper for
getSignedUrl that uses the client-s3 types directly, bypassing the
class identity check while keeping full type safety.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tofikwest and others added 3 commits April 10, 2026 17:21
TypeScript requires the intermediate 'unknown' cast when the source
and target types have incompatible private properties (separate
declarations of 'handlers' from duplicate @smithy packages).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
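
The double cast can be reproduced in isolation. In this sketch two structurally identical classes stand in for the duplicate package copies; the class and function names are hypothetical.

```typescript
// Two structurally identical classes, each declaring its own private `handlers`.
class ClientFromCopyA {
  private handlers: string[] = [];
  constructor(public readonly region: string) {}
  handlerCount(): number {
    return this.handlers.length;
  }
}

class ClientFromCopyB {
  private handlers: string[] = [];
  constructor(public readonly region: string) {}
  handlerCount(): number {
    return this.handlers.length;
  }
}

function describeClient(client: ClientFromCopyB): string {
  return `client in ${client.region}`;
}

const clientA = new ClientFromCopyA("us-east-1");

// describeClient(clientA);                     // rejected: separate declarations of a private property
// describeClient(clientA as ClientFromCopyB);  // rejected: the types do not sufficiently overlap
const clientLabel = describeClient(clientA as unknown as ClientFromCopyB);
```

The cast through `unknown` silences the check without changing runtime behavior, so it is best confined to one wrapper, as the later shared presigner module does.
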
serverApi.post returns typed {} by default. Added generic type param
for the batch create response in both cloud-tests and integrations
batch-fix actions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The STATUS_CONFIG map included needs_permissions but the FindingStatus
union type didn't, causing a Vercel build type error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
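
One way to keep a status map and its union in sync is to type the map as `Record<FindingStatus, ...>`. In the sketch below only `needs_permissions` comes from the commit; the other union members and the `StatusDisplay` shape are illustrative.

```typescript
// Illustrative union; only needs_permissions is named in the commit message.
type FindingStatus = "passed" | "failed" | "needs_permissions";

interface StatusDisplay {
  label: string;
}

// With this annotation the compiler rejects keys that are not in the union
// and also reports union members that are missing from the map.
const STATUS_CONFIG: Record<FindingStatus, StatusDisplay> = {
  passed: { label: "Passed" },
  failed: { label: "Failed" },
  needs_permissions: { label: "Needs permissions" },
};

function statusLabel(status: FindingStatus): string {
  return STATUS_CONFIG[status].label;
}
```
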
tofikwest and others added 2 commits April 10, 2026 17:33
…hProgress phase

The component compared progress.phase against 'retrying' and
'waiting_for_permissions' but the type only included 'running',
'scanning', 'done', 'cancelled'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… types

Added permChecksLeft to BatchProgress, and key + severity to
FindingProgress to match all runtime property accesses in the
component.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tofikwest and others added 5 commits April 10, 2026 17:38
findingsResponse?.mutate?.() was never declared or imported. Replaced
with the existing onComplete prop callback which triggers the parent
to refresh findings data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PROVIDER_FIELDS was typed as Record<'aws', ...> but accessed with
'gcp' | 'azure' | 'aws'. Changed to Partial<Record<...>> with guards
for undefined (GCP/Azure use OAuth, not credential fields).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
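
A minimal sketch of the `Partial<Record<...>>` pattern with an undefined guard. The `CredentialField` shape and the field names are hypothetical.

```typescript
type Provider = "aws" | "gcp" | "azure";

interface CredentialField {
  name: string;
  label: string;
}

// Only AWS has credential fields; the field names here are assumptions.
const PROVIDER_FIELDS: Partial<Record<Provider, CredentialField[]>> = {
  aws: [{ name: "roleArn", label: "Role ARN" }],
};

function fieldsFor(provider: Provider): CredentialField[] {
  // Guard for providers with no entry (GCP and Azure use OAuth).
  const fields = PROVIDER_FIELDS[provider];
  return fields === undefined ? [] : fields;
}
```
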
Created shared @/lib/s3-presigner.ts wrapper that handles the
@smithy/types duplicate class identity issue. Updated all 6 files
that import getSignedUrl to use the shared wrapper instead of
importing directly from @aws-sdk/s3-request-presigner.

Also fixed EmptyState PROVIDER_FIELDS type for multi-provider support.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The app uses @db/server for Prisma client, not @db directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed by Cursor Bugbot for commit 827ca8d.

logger.warn(`Rollback step ${r} failed: ${err instanceof Error ? err.message : String(err)}`);
}
}
}

Auto-rollback skips first completed step on Azure failure

Medium Severity

When a fix step fails, auto-rollback only triggers when i > 0, which means if step index 1 (the second step) fails, only step 0 is rolled back. However, the failed step's result is pushed to results before the success check, so results includes the failed entry. This is inconsistent with the AWS executor pattern where failed steps are not pushed. More critically, the convention states rollbackSteps[i] undoes fixSteps[i], but the rollback loop starts at Math.min(i, autoRollbackSteps.length) - 1, which skips the rollback for step i itself if step i partially executed before failing.
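
The index arithmetic in this comment is easier to follow with a small sketch. `rollbackOrder` is a hypothetical helper, not code from the PR. It assumes the stated convention that `rollbackSteps[i]` undoes `fixSteps[i]`, and takes a flag for whether the failed step's own rollback should be attempted.

```typescript
// Returns the rollback step indices to run, newest first, after fix step
// `failedIndex` fails. `includeFailed` controls whether the failed step's own
// rollback is attempted, which matters if that step partially executed.
function rollbackOrder(
  failedIndex: number,
  rollbackCount: number,
  includeFailed: boolean,
): number[] {
  const last = includeFailed ? failedIndex : failedIndex - 1;
  const start = Math.min(last, rollbackCount - 1);
  const order: number[] = [];
  for (let r = start; r >= 0; r--) {
    order.push(r);
  }
  return order;
}
```

With `includeFailed` set to `false` the start index equals `Math.min(i, autoRollbackSteps.length) - 1`, the expression quoted above, so a failure at index 1 rolls back only step 0. With `true`, the same failure yields `[1, 0]`.
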


…y else branch

TypeScript narrowing already excludes 'needs_permissions' from retry.status
in the final else block (handled by prior else-if). The guard was always true,
causing a build error on Vercel.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
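
The narrowing behavior behind this fix can be shown in a few lines. The `RetryStatus` union is illustrative (only `needs_permissions` is named in the commit), and `describeRetry` is a hypothetical function.

```typescript
// Illustrative union; only needs_permissions comes from the commit message.
type RetryStatus = "success" | "needs_permissions" | "failed";

interface RetryResult {
  status: RetryStatus;
}

function describeRetry(retry: RetryResult): string {
  if (retry.status === "success") {
    return "fixed";
  } else if (retry.status === "needs_permissions") {
    return "waiting for permissions";
  } else {
    // retry.status is narrowed to "failed" here. An extra guard such as
    // `retry.status !== "needs_permissions"` would be reported by the compiler
    // as a comparison between types with no overlap.
    return `gave up: ${retry.status}`;
  }
}
```
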