Implement Cloudbeat as a Collector-Receiver #3603

@orestisfl

Summary

As a follow-up to the "Cloudbeat Size Reduction" RFC and subsequent meeting, this issue tracks the effort to implement Cloudbeat as an OpenTelemetry collector-receiver. This is the agreed-upon strategic direction, superseding the previous proposal of merging Cloudbeat directly into agentbeat. The primary goal is to align Cloudbeat with the future architecture of Elastic Agent, improve developer experience, and resolve complex dependency issues.

Context

The current approach of shipping Cloudbeat as a separate binary within Elastic Agent contributes significantly to the overall package size. Merging it into agentbeat showed a potential size reduction of ~100MB, but it introduced a problematic circular dependency between repositories: agentbeat (in elastic/beats) -> cloudbeat (in elastic/cloudbeat) -> libbeat (back in elastic/beats).

The consensus from the meeting is to bypass the agentbeat merge and move directly to a collector-based architecture. This involves creating a Cloudbeat receiver that runs within the Elastic OpenTelemetry Collector.

Implementation Plan

  1. Create the Cloudbeat Receiver:

    • A new "Cloudbeat receiver" will be created inside the elastic/beats repository.
    • This receiver should be scaffolded using the existing Filebeat and Metricbeat receivers as a reference.
  2. Refactor Cloudbeat into a Library:

    • The core logic of Cloudbeat (CSPM/CNVM scanning, cloud provider interactions, etc.) must be refactored into a Go library, completely decoupled from libbeat.
    • This library will remain in the elastic/cloudbeat repository.
    • The new Cloudbeat receiver (in elastic/beats) will import this library to perform its functions. This model (beats repo -> cloudbeat repo) avoids the circular dependency.
  3. Integration with Elastic Agent:

    • The Cloudbeat receiver will be compiled as part of the Elastic OpenTelemetry Collector.
    • The Elastic Agent will then import and run this collector, making the Cloudbeat functionality available.
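The split described in the plan above can be sketched roughly as follows. This is a minimal illustration using only the Go standard library, not the actual receiver or collector API: `Finding`, `Scanner`, `Sink`, `Receiver`, and `staticScanner` are all hypothetical names invented for this example. The point is only that the core library exposes a libbeat-free interface, and the receiver is a thin adapter that forwards results.

```go
package main

import (
	"context"
	"fmt"
)

// Finding is a hypothetical, libbeat-free result type owned by the core library.
type Finding struct {
	RuleID   string
	Resource string
	Passed   bool
}

// Scanner is a hypothetical interface the core library (elastic/cloudbeat)
// could expose. It depends only on the standard library.
type Scanner interface {
	Scan(ctx context.Context) ([]Finding, error)
}

// Sink stands in for whatever the receiver hands data to. In a real receiver
// this role would be played by the collector pipeline (assumption).
type Sink func(ctx context.Context, findings []Finding) error

// Receiver is the thin adapter that would live in elastic/beats.
type Receiver struct {
	scanner Scanner
	sink    Sink
}

// NewReceiver wires a core-library scanner to a downstream sink.
func NewReceiver(s Scanner, sink Sink) *Receiver {
	return &Receiver{scanner: s, sink: sink}
}

// RunOnce performs one scan cycle and forwards the findings.
// It returns the number of findings forwarded.
func (r *Receiver) RunOnce(ctx context.Context) (int, error) {
	findings, err := r.scanner.Scan(ctx)
	if err != nil {
		return 0, fmt.Errorf("scan: %w", err)
	}
	if err := r.sink(ctx, findings); err != nil {
		return 0, fmt.Errorf("forward: %w", err)
	}
	return len(findings), nil
}

// staticScanner is a stub scanner used only for this illustration.
type staticScanner struct{ findings []Finding }

func (s staticScanner) Scan(context.Context) ([]Finding, error) {
	return s.findings, nil
}

// demo runs one cycle with stub data and a sink that discards its input.
func demo() (int, error) {
	scanner := staticScanner{findings: []Finding{
		{RuleID: "example-rule-1", Resource: "bucket-a", Passed: true},
		{RuleID: "example-rule-2", Resource: "bucket-b", Passed: false},
	}}
	sink := func(context.Context, []Finding) error { return nil }
	return NewReceiver(scanner, sink).RunOnce(context.Background())
}

func main() {
	n, err := demo()
	fmt.Println("forwarded findings:", n, "error:", err)
}
```

Keeping the interface in the library and the adapter in the receiver means the import direction only ever goes from the beats repo to the cloudbeat repo, which is the property the plan relies on.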

Key Considerations & Risks

  • Binary Size: Adding Cloudbeat's functionality will significantly increase the size of the collector binary, which is a major concern. To mitigate the increase, this task should proceed in parallel with efforts to enable Dead Code Elimination (DCE) for Cloudbeat's dependencies (such as OPA and Trivy), and the impact on the final binary size must be closely monitored.
  • Dependency Conflicts: The proposed architecture is designed to solve the dependency cycle. However, care must be taken during the refactoring to ensure a clean separation of concerns and that no libbeat dependencies remain in the core Cloudbeat library.
  • CI/CD and Testing: The Cloudbeat team's testing infrastructure will need significant updates to support this new build and execution model. The new workflow will involve building the receiver in the beats repo and testing it against the Cloudbeat library.
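One way to guard the "no libbeat dependencies in the core library" requirement is an automated import check in CI. The sketch below is an assumption about how such a check might look, not an existing tool: it uses the standard `go/parser` package to read the import block of a Go source file and reports any import under a forbidden module prefix. The `forbiddenPrefix` value and the `sampleSrc` input are illustrative.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// forbiddenPrefix is an assumption: the module path prefix that the core
// library should never import.
const forbiddenPrefix = "github.com/elastic/beats"

// sampleSrc is illustrative input, not real Cloudbeat code.
const sampleSrc = `package core

import (
	"context"

	"github.com/elastic/beats/v7/libbeat/beat"
)
`

// forbiddenImports parses only the import declarations of src and returns
// every import path that equals forbiddenPrefix or lives underneath it.
func forbiddenImports(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var bad []string
	for _, imp := range f.Imports {
		// Import paths are stored as quoted string literals.
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		if path == forbiddenPrefix || strings.HasPrefix(path, forbiddenPrefix+"/") {
			bad = append(bad, path)
		}
	}
	return bad, nil
}

func main() {
	bad, err := forbiddenImports("sample.go", sampleSrc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("forbidden imports:", bad)
}
```

This only inspects direct imports of a single file. A real check would need to walk every package in the library and also cover transitive dependencies, for example by inspecting the output of `go list -deps`.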

Action Items

  • Prototype the Cloudbeat receiver in the elastic/beats repository.
  • Begin refactoring the elastic/cloudbeat repository to extract a libbeat-free core library.
  • Establish a CI plan for building and testing the new receiver.
  • Continuously measure the impact on the collector's binary size and coordinate with the team working on DCE.
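The size measurement in the last item could be automated with a small check that compares the built binary against a recorded baseline. The following is a hedged sketch only: `baselineBytes` and `maxGrowthBytes` are placeholder values invented for illustration, not measured numbers, and the helper name `withinBudget` is hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// Placeholder values for illustration only; real numbers would come from a
// recorded measurement of the collector binary before Cloudbeat is added.
const (
	baselineBytes  int64 = 300 << 20 // hypothetical baseline size
	maxGrowthBytes int64 = 50 << 20  // hypothetical allowed growth
)

// withinBudget returns how much the binary grew relative to the baseline and
// whether that growth stays within the allowed maximum.
func withinBudget(baseline, current, maxGrowth int64) (growth int64, ok bool) {
	growth = current - baseline
	return growth, growth <= maxGrowth
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: sizecheck <path-to-binary>")
		return
	}
	info, err := os.Stat(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	growth, ok := withinBudget(baselineBytes, info.Size(), maxGrowthBytes)
	fmt.Printf("size=%d growth=%d withinBudget=%t\n", info.Size(), growth, ok)
	if !ok {
		// A non-zero exit code lets a CI job fail when the budget is exceeded.
		os.Exit(1)
	}
}
```

Running such a check on every build would give the DCE effort a concrete number to track over time.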
