Detectors #5003
Draft: yanivagman wants to merge 73 commits into aquasecurity:main from yanivagman:detectors
Codecov Report
Coverage diff:
| | main | #5003 | +/- |
|---|---|---|---|
| Coverage | 29.72% | 34.05% | +4.32% |
| Files | 234 | 260 | +26 |
| Lines | 26186 | 30593 | +4407 |
| Hits | 7785 | 10418 | +2633 |
| Misses | 17864 | 19423 | +1559 |
| Partials | 537 | 752 | +215 |
8cc653f to 817333b
10f7269 to a659dbe
Add ./detectors/ to TRACEE_SRC_DIRS and add go.mod/go.sum dependencies (both root and detectors module) for the tracee binary target. This ensures that changes to detector source files or module dependencies trigger a rebuild when running 'make tracee'.
Add GetAncestry() method to ProcessStore interface to retrieve process ancestry chains up to a specified depth. Implementation walks the parent chain and returns a slice where [0]=process itself, [1]=parent, [2]=grandparent, etc.
The method stops if:
- maxDepth is reached
- Parent not found in the tree
- Circular reference detected
- maxDepth <= 0 (returns empty slice)
This enables detectors to analyze process lineage for threat detection patterns without complex manual queries through the process tree.
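For reviewers, here is a minimal sketch of the walk described above. It is not the code from this PR: `ProcessInfo`, `processTree`, and the field names are hypothetical stand-ins, and the real method lives on the ProcessStore interface.

```go
package main

import "fmt"

// ProcessInfo is a hypothetical, trimmed-down stand-in for the store's process record.
type ProcessInfo struct {
	EntityID       uint32
	ParentEntityID uint32
	Name           string
}

// processTree is a toy in-memory tree keyed by entity ID (an assumption for this sketch).
type processTree map[uint32]ProcessInfo

// GetAncestry returns [0]=process itself, [1]=parent, [2]=grandparent, and so on.
func (pt processTree) GetAncestry(entityID uint32, maxDepth int) []ProcessInfo {
	chain := []ProcessInfo{}
	if maxDepth <= 0 {
		return chain
	}
	visited := make(map[uint32]bool)
	current := entityID
	for len(chain) < maxDepth {
		if visited[current] {
			break // circular reference detected
		}
		proc, ok := pt[current]
		if !ok {
			break // process or parent not found in the tree
		}
		visited[current] = true
		chain = append(chain, proc)
		current = proc.ParentEntityID
	}
	return chain
}

func main() {
	pt := processTree{
		1: {EntityID: 1, ParentEntityID: 0, Name: "init"},
		2: {EntityID: 2, ParentEntityID: 1, Name: "bash"},
		3: {EntityID: 3, ParentEntityID: 2, Name: "python"},
	}
	for depth, p := range pt.GetAncestry(3, 5) {
		fmt.Printf("[%d] %s\n", depth, p.Name)
	}
}
```

The visited set is one simple way to detect a circular reference; the PR may track this differently.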
Implement ProcessAncestry auto-population in the detector dispatcher. When a detector declares AutoPopulate.ProcessAncestry = true, the engine automatically queries the process store and populates up to 5 ancestor levels in the output event's Workload.Process.Ancestors field.
The implementation:
- Queries ancestry using detector's DataStores reference
- Skips the first entry (process itself, already in output)
- Converts datastores.ProcessInfo to v1beta1.Process format
- Only populates if Ancestors field is empty (respects manual overrides)
Add comprehensive integration test that creates a 3-level process tree (init -> bash -> python) and verifies ancestry is correctly populated with proper entity IDs and PIDs for each ancestor.
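The population rules above could look roughly like this. The struct shapes are assumptions standing in for datastores.ProcessInfo and v1beta1.Process, and `populateAncestors` is a hypothetical helper name, not the dispatcher's actual function.

```go
package main

import "fmt"

// ProcessInfo stands in for datastores.ProcessInfo (fields are assumptions).
type ProcessInfo struct {
	EntityID uint32
	PID      uint32
}

// Process stands in for the v1beta1.Process output message (fields are assumptions).
type Process struct {
	EntityID  uint32
	PID       uint32
	Ancestors []*Process
}

// populateAncestors fills out.Ancestors from a chain where chain[0] is the
// process itself. Manually set ancestors are left untouched.
func populateAncestors(out *Process, chain []ProcessInfo) {
	if out == nil || len(out.Ancestors) > 0 || len(chain) <= 1 {
		return
	}
	for _, info := range chain[1:] { // skip the process itself
		out.Ancestors = append(out.Ancestors, &Process{EntityID: info.EntityID, PID: info.PID})
	}
}

func main() {
	chain := []ProcessInfo{
		{EntityID: 300, PID: 3}, // python (the process itself)
		{EntityID: 200, PID: 2}, // bash
		{EntityID: 100, PID: 1}, // init
	}
	out := &Process{EntityID: 300, PID: 3}
	populateAncestors(out, chain)
	for _, a := range out.Ancestors {
		fmt.Println(a.EntityID, a.PID)
	}
}
```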
- Extract buildEventFromOutput() to properly construct output events
- Deep clone Workload from input event to prevent race conditions
- Clone Policies from input event to preserve policy matching context
- Clear process ancestors array (will be populated by auto-population if requested)
This fixes potential data races where multiple detectors could modify shared event structures, and ensures each detector output has its own isolated copy of the workload and policy context.
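To illustrate why the clone matters, here is a small sketch with plain structs. The real events are protobuf messages and the PR's function is buildEventFromOutput(); `buildOutput` and these types are hypothetical simplifications.

```go
package main

import "fmt"

// Simplified stand-ins for the protobuf event types (assumptions).
type Process struct {
	PID       uint32
	Ancestors []*Process
}

type Workload struct {
	Process *Process
}

type Event struct {
	Workload *Workload
	Policies []string
}

// buildOutput returns an event that shares no mutable state with the input:
// the workload is deep-copied, policies are copied, and ancestors are cleared.
func buildOutput(in *Event) *Event {
	out := &Event{}
	if in == nil {
		return out
	}
	if in.Workload != nil {
		out.Workload = &Workload{}
		if in.Workload.Process != nil {
			// Ancestors are intentionally not copied; auto-population may fill them later.
			out.Workload.Process = &Process{PID: in.Workload.Process.PID}
		}
	}
	out.Policies = append([]string(nil), in.Policies...)
	return out
}

func main() {
	in := &Event{
		Workload: &Workload{Process: &Process{PID: 42, Ancestors: []*Process{{PID: 1}}}},
		Policies: []string{"default"},
	}
	out := buildOutput(in)
	out.Workload.Process.PID = 7
	out.Policies[0] = "changed"
	fmt.Println(in.Workload.Process.PID, in.Policies[0])
}
```

Because each output owns its copy, two detectors handling the same input event cannot race on the workload or policy slices.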
… Threat
- Rename autoPopulateFields() to autoPopulateFieldsFromOutput()
- Support output-level AutoPopulate overrides (output.AutoPopulate > definition)
- Support output-level Threat overrides (output.Threat > definition.ThreatMetadata)
- Allow detectors to customize threat severity/description per detection
- Allow detectors to selectively disable auto-population for specific outputs
This enables detectors to dynamically adjust threat levels based on detection confidence or context, and to control field population on a per-detection basis rather than only at the definition level.
Add ExampleDetector demonstrating the detector API:
- Subscribes to execve events
- Produces example_detection event with 3 fields
- Shows field definitions, threat metadata, and auto-population
- Implements full detector lifecycle (Init/OnEvent/Close)
This detector serves as:
- Reference implementation for new detectors
- Test case for detector infrastructure
- Live documentation of API patterns
…ation
Detector data filters were skipped when DataFilter enabled kernel filtering mode for pathname fields. Detectors lack kernel-side filtering, so userspace filtering must always run.
Add skipKernelFilter flag to DataFilter and NewDetectorDataFilter() constructor. Detector registry now uses NewDetectorDataFilter() to ensure filters are always applied in userspace.
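A rough sketch of the behavior being fixed, under heavy assumptions: the struct fields other than skipKernelFilter, the `NewDataFilter` constructor, the constructor arguments, and the `Match` method are all hypothetical and only illustrate the control flow.

```go
package main

import "fmt"

// DataFilter is a toy model: in "kernel mode" the userspace check is skipped
// because the kernel is assumed to have filtered already.
type DataFilter struct {
	allowed          map[string]bool
	kernelMode       bool // hypothetical: kernel-side filtering is enabled for this field
	skipKernelFilter bool // flag named in this commit: never rely on the kernel
}

// NewDataFilter models the regular constructor (hypothetical signature).
func NewDataFilter(paths ...string) *DataFilter {
	f := &DataFilter{allowed: make(map[string]bool), kernelMode: true}
	for _, p := range paths {
		f.allowed[p] = true
	}
	return f
}

// NewDetectorDataFilter models the detector constructor (hypothetical signature).
func NewDetectorDataFilter(paths ...string) *DataFilter {
	f := NewDataFilter(paths...)
	f.skipKernelFilter = true
	return f
}

// Match reports whether a pathname passes the filter.
func (f *DataFilter) Match(path string) bool {
	if f.kernelMode && !f.skipKernelFilter {
		return true // trust kernel-side filtering
	}
	return f.allowed[path] // userspace filtering
}

func main() {
	regular := NewDataFilter("/bin/nc")
	detector := NewDetectorDataFilter("/bin/nc")
	fmt.Println(regular.Match("/bin/ls"), detector.Match("/bin/ls"), detector.Match("/bin/nc"))
}
```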
- Change all datastore query methods to return (value, error) pattern
- Return datastores.ErrNotFound instead of (nil, false)
- Document that Registry accessors never return nil stores
- Update all datastore implementations and tests
Update example detectors and tests to use error returns instead of bool. Check errors.Is(err, datastores.ErrNotFound) for not-found cases.
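A short sketch of the calling pattern after this change. `ErrNotFound` here is a local sentinel standing in for datastores.ErrNotFound, and `processStore`, `GetProcess`, and `describe` are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a local stand-in for datastores.ErrNotFound.
var ErrNotFound = errors.New("not found")

// processStore is a toy store mapping entity IDs to process names (an assumption).
type processStore map[uint32]string

// GetProcess follows the (value, error) pattern instead of (value, bool).
func (s processStore) GetProcess(id uint32) (string, error) {
	name, ok := s[id]
	if !ok {
		return "", fmt.Errorf("process %d: %w", id, ErrNotFound)
	}
	return name, nil
}

// describe shows how a detector might branch on the not-found case.
func describe(s processStore, id uint32) string {
	name, err := s.GetProcess(id)
	switch {
	case errors.Is(err, ErrNotFound):
		return "unknown"
	case err != nil:
		return "error"
	default:
		return name
	}
}

func main() {
	store := processStore{1: "init"}
	fmt.Println(describe(store, 1), describe(store, 2))
}
```

Using errors.Is rather than a direct comparison keeps the check working when a store wraps the sentinel with extra context, as the sketch does with %w.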
…ontrol
Add optional AncestryDepth field to DetectorOutput enabling fine-grained control over process ancestry fetching depth on a per-detection basis.
Changes:
- Add AncestryDepth *uint32 to DetectorOutput struct
- Implement priority: output.AncestryDepth > ProcessAncestry bool (default 5)
- Update dispatch logic to use explicit depth when provided
- Document new field and usage patterns in developer guide
The ProcessAncestry boolean remains as simple fallback (default 5 levels). Detectors can now override depth per detection: nil (use boolean), 0 (disable), or N (fetch N levels).
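The priority rule can be captured in a few lines. This is a sketch of the described semantics, not the dispatcher code; `resolveAncestryDepth` and `defaultAncestryDepth` are hypothetical names.

```go
package main

import "fmt"

// defaultAncestryDepth mirrors the "default 5 levels" mentioned above.
const defaultAncestryDepth uint32 = 5

// resolveAncestryDepth applies: output.AncestryDepth > ProcessAncestry bool.
// nil means "use the boolean", 0 disables ancestry, N fetches N levels.
func resolveAncestryDepth(outputDepth *uint32, processAncestry bool) uint32 {
	if outputDepth != nil {
		return *outputDepth
	}
	if processAncestry {
		return defaultAncestryDepth
	}
	return 0
}

func main() {
	three := uint32(3)
	zero := uint32(0)
	fmt.Println(resolveAncestryDepth(nil, true))     // boolean fallback
	fmt.Println(resolveAncestryDepth(&three, false)) // explicit depth wins
	fmt.Println(resolveAncestryDepth(&zero, true))   // explicit 0 disables
}
```

A pointer is what makes the three states distinguishable: a plain uint32 could not tell "unset" apart from an explicit 0.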
Add comprehensive documentation for EventDetector API and DataStore system:
- docs/docs/detectors/developer-guide.md: Complete detector development guide with quick start, API reference, testing, migration from signatures, and examples
- docs/docs/detectors/datastore-api.md: DataStore API reference for all stores (Process, Container, System, Syscall, Symbol, DNS) with usage patterns
- docs/docs/detectors/index.md: Landing page with overview and navigation
- docs/docs/events/custom/overview.md: Updated to highlight new detector system and mark signatures as legacy
Covers auto-registration, filters, version validation, enrichment requirements, metrics, process ancestry, and more.
Add WritableStore interface for detectors/extensions to store custom data.
- Add WritableStore interface (WriteValue, WriteBatchValues, Delete, ClearSource, ListSources)
- Add RegisterWritableStore() to Registry
- Define protobuf schemas for DataEntry and gRPC DataSourceService
- Source isolation for data provenance tracking
- Update Makefile for subdirectory protobuf generation
- Ownership model: registrant controls store behavior
- Fix test mocks to return valid health/metrics objects
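As a reading aid, here is one way an in-memory store with source isolation could be shaped. Only the five method names come from this commit; every signature, the `memStore` type, and the `get` helper are assumptions, and the PR's interface may differ.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// WritableStore uses the method names from the commit with assumed signatures.
type WritableStore interface {
	WriteValue(source, key string, value any) error
	WriteBatchValues(source string, values map[string]any) error
	Delete(source, key string) error
	ClearSource(source string) error
	ListSources() []string
}

// memStore partitions entries by source so each feed's data can be tracked and cleared alone.
type memStore struct {
	mu   sync.RWMutex
	data map[string]map[string]any
}

func newMemStore() *memStore { return &memStore{data: make(map[string]map[string]any)} }

func (s *memStore) WriteValue(source, key string, value any) error {
	return s.WriteBatchValues(source, map[string]any{key: value})
}

func (s *memStore) WriteBatchValues(source string, values map[string]any) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.data[source] == nil {
		s.data[source] = make(map[string]any)
	}
	for k, v := range values {
		s.data[source][k] = v
	}
	return nil
}

func (s *memStore) Delete(source, key string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.data[source], key)
	return nil
}

func (s *memStore) ClearSource(source string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.data, source)
	return nil
}

func (s *memStore) ListSources() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	sources := make([]string, 0, len(s.data))
	for src := range s.data {
		sources = append(sources, src)
	}
	sort.Strings(sources)
	return sources
}

// get is a hypothetical read helper for this sketch.
func (s *memStore) get(source, key string) (any, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[source][key]
	return v, ok
}

var _ WritableStore = (*memStore)(nil)

func main() {
	store := newMemStore()
	_ = store.WriteValue("feed-a", "10.0.0.1", "malicious")
	_ = store.WriteBatchValues("feed-b", map[string]any{"10.0.0.2": "suspicious"})
	fmt.Println(store.ListSources())
}
```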
Add reference implementation of WritableStore for IP reputation tracking.
- Implement IPReputationStore with type-safe and generic methods
- Support three conflict resolution policies (LastWriteWins, MaxSeverity, PriorityBased)
- Source partitioning for multi-feed aggregation
- Thread-safe operations with proper locking
- Comprehensive test suite (15 tests, 78% coverage)
- Type-safe methods: GetReputation, IsBlacklisted, WriteReputation, WriteBatch
- Generic methods: WriteValue, WriteBatchValues, Delete, ClearSource, ListSources
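The three policy names suggest logic along these lines. This is a guess at the semantics, not the IPReputationStore code: the `Reputation` fields, the `resolve` helper, and the tie-breaking rule (ties go to the incoming write) are all assumptions.

```go
package main

import "fmt"

// ConflictPolicy enumerates the three policies named in the commit.
type ConflictPolicy int

const (
	LastWriteWins ConflictPolicy = iota
	MaxSeverity
	PriorityBased
)

// Reputation is a hypothetical entry shape for this sketch.
type Reputation struct {
	Source   string
	Severity int
	Priority int
}

// resolve picks which entry survives when two sources report the same IP.
// Ties go to the incoming entry (an assumption).
func resolve(policy ConflictPolicy, existing, incoming Reputation) Reputation {
	switch policy {
	case MaxSeverity:
		if existing.Severity > incoming.Severity {
			return existing
		}
	case PriorityBased:
		if existing.Priority > incoming.Priority {
			return existing
		}
	}
	return incoming
}

func main() {
	old := Reputation{Source: "feed-a", Severity: 9, Priority: 1}
	next := Reputation{Source: "feed-b", Severity: 3, Priority: 5}
	fmt.Println(resolve(LastWriteWins, old, next).Source)
	fmt.Println(resolve(MaxSeverity, old, next).Source)
	fmt.Println(resolve(PriorityBased, old, next).Source)
}
```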
Add comprehensive documentation for writable datastores.
- Document WritableStore interface and usage patterns
- Registration, writing, and reading examples
- Source isolation and batch operations
- Complete detector example using IP reputation store
- Focus on generic API contract, reference implementation-specific details
- Fix event data access examples (use GetData instead of non-existent GetSourceIP)
Ensure Registry accessor methods never return nil for safety and consistency.
- Add null object implementations for all store types
- Initialize Registry with null objects by default
- Null stores return ErrStoreUnhealthy for all operations
- Update tests to verify nil-safety guarantee
- Document nil-safety guarantee in API docs
Benefits:
- Eliminates need for nil checks in detector code
- Enables safe method chaining
- Clear error signals (ErrStoreUnhealthy vs ErrNotFound)
- Graceful degradation when stores unavailable
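A compact sketch of the null object pattern described here. `ErrStoreUnhealthy` is a local stand-in for the PR's error, and the `ProcessStore` method set, `nullProcessStore`, and `Registry` layout are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStoreUnhealthy is a local stand-in for the error named in the commit.
var ErrStoreUnhealthy = errors.New("store unhealthy")

// ProcessStore is a one-method stand-in for the real interface (an assumption).
type ProcessStore interface {
	GetProcess(id uint32) (string, error)
}

// nullProcessStore is the null object: always present, always unhealthy.
type nullProcessStore struct{}

func (nullProcessStore) GetProcess(uint32) (string, error) {
	return "", ErrStoreUnhealthy
}

// Registry starts with null objects so accessors never return nil.
type Registry struct {
	process ProcessStore
}

func NewRegistry() *Registry {
	return &Registry{process: nullProcessStore{}}
}

// ProcessStore returns the registered store, or the null object if none was set.
func (r *Registry) ProcessStore() ProcessStore { return r.process }

func main() {
	// Safe to chain without a nil check, even though no real store is registered.
	_, err := NewRegistry().ProcessStore().GetProcess(1)
	fmt.Println(errors.Is(err, ErrStoreUnhealthy))
}
```

Returning a distinct error lets detector code tell "the store is unavailable" apart from "the entity does not exist".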
…datastores
Add comprehensive test validating core use case where detectors share data through writable datastores. Tests proper event-based communication where one detector writes threat intelligence and another reads it during event processing.
Test scenarios:
- Threat intelligence sharing between detectors
- Ownership model (single writer, multiple readers)
- Graceful degradation when store unavailable
- Batch operations (1000+ entries)
All communication happens through OnEvent() methods, demonstrating realistic detector interaction patterns.
Enable declarative threat detection via YAML files with full Go detector parity. Supports event filtering, runtime data extraction, and auto-population.
- YAML schema with validation
- Field extraction using v1beta1 helpers
- CLI: --detectors yaml-dir=/path/to/dir
- Config: detectors.yaml-dir
- Default paths: ./detectors, /etc/tracee/detectors
- Test coverage: 48.1%
Add example YAML detector and update documentation:
- suspicious_binary_execution.yaml: detects execution of networking tools
- Updated detectors/README.md with YAML detector documentation
- Added detectors/yaml/README.md with usage guide
Add complete user guide for YAML detectors:
- Quick start with working examples
- Full schema reference with all fields
- Event filtering and data extraction guide
- Complete list of supported extraction paths
- Deployment options and best practices
- Troubleshooting guide
- Updated detector index with YAML guide link
Add 5 integration tests covering:
- Basic YAML detector loading and registration
- Event generation with field extraction (data.*, workload.*)
- 2-level detector chaining/composition
- Data filter application (matching/non-matching)
- Error handling (invalid YAML, missing fields)
Tests follow existing integration test patterns with proper cleanup, goleak checking, and clear logging for debugging.
Add comprehensive testing infrastructure for unit testing detectors:
- TestHarness for engine setup and detector registration
- Event builders for constructing test inputs
- Assertion helpers for validating outputs
- SimpleDetectorTest for lightweight unit testing
- Support for both Go and YAML detectors
- Examples demonstrating stateful and stateless detector testing
Add parent field to DetectedFrom for complete detection provenance through multi-level chains. Engine automatically preserves chain when dispatching between detectors.
- Add DetectedFrom.parent field (protobuf)
- Auto-chain parent in dispatch logic with cloning
- Add helpers: GetDetectionChain, GetRootDetection, GetChainDepth
- Update docs and add chain tests (2-level, 3-level)
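The chain helpers can be pictured as a walk over a singly linked list. The sketch below uses a plain struct instead of the protobuf message, and the lower-case helper names, the `Name` field, and the ordering (nearest detection first) are assumptions rather than the PR's API.

```go
package main

import "fmt"

// DetectedFrom is a plain-struct stand-in for the protobuf message (fields are assumptions).
type DetectedFrom struct {
	Name   string
	Parent *DetectedFrom
}

// cloneChain deep-copies a chain so a new link never aliases an upstream event.
func cloneChain(d *DetectedFrom) *DetectedFrom {
	if d == nil {
		return nil
	}
	return &DetectedFrom{Name: d.Name, Parent: cloneChain(d.Parent)}
}

// chainTo builds the provenance for an output derived from an event with the given chain.
func chainTo(name string, parent *DetectedFrom) *DetectedFrom {
	return &DetectedFrom{Name: name, Parent: cloneChain(parent)}
}

// detectionChain lists the chain from the nearest detection to the root.
func detectionChain(d *DetectedFrom) []*DetectedFrom {
	var chain []*DetectedFrom
	for cur := d; cur != nil; cur = cur.Parent {
		chain = append(chain, cur)
	}
	return chain
}

// rootDetection returns the original event at the end of the chain.
func rootDetection(d *DetectedFrom) *DetectedFrom {
	if d == nil {
		return nil
	}
	for d.Parent != nil {
		d = d.Parent
	}
	return d
}

// chainDepth counts the links in the chain.
func chainDepth(d *DetectedFrom) int { return len(detectionChain(d)) }

func main() {
	level1 := chainTo("execve", nil)
	level2 := chainTo("suspicious_exec", level1)
	level3 := chainTo("attack_chain", level2)
	fmt.Println(chainDepth(level3), rootDetection(level3).Name)
}
```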
- Add Common Expression Language (CEL) for dynamic conditions and field extraction
- Support short form (field names) and long form (CEL expressions) for output fields
- Add helper functions: getData(), getDataInt(), getDataUInt(), hasData()
- Rename extract_fields -> fields, source -> expression for clarity
- Update all existing YAML detectors to use CEL syntax
- Add comprehensive documentation for CEL features
- All unit tests pass (38.3% coverage)
- YAML detectors load successfully with CEL compilation
Breaking changes:
- YAML schema: extract_fields -> fields, source -> expression
- Field extraction now uses CEL expressions instead of JSONPath-like paths
Implements CEL global list variables to share common lists (e.g., shell
binaries, sensitive paths) across YAML detectors. Lists are defined in
{detector-dir}/lists/ subdirectory and exposed as CEL variables.
- Add list schema and loader with validation (uppercase snake_case names)
- Register lists as CEL variables (list<string>) in environment
- Pass lists to CEL evaluation context at runtime
- Add comprehensive unit and integration tests (16 new tests)
- Add example shell_binaries list and detector
- Update documentation with usage examples
Lists are compiled into CEL at load time for zero runtime overhead and
compile-time type safety.
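The uppercase snake_case rule for list names could be checked with a regular expression like the one below. The exact pattern the loader enforces is not shown in this PR description, so `listNamePattern` and `validListName` are a guess at a reasonable reading (letters and digits, single underscores, leading letter).

```go
package main

import (
	"fmt"
	"regexp"
)

// listNamePattern is an assumed interpretation of "uppercase snake_case".
var listNamePattern = regexp.MustCompile(`^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$`)

// validListName reports whether a list name could be exposed as a CEL variable.
func validListName(name string) bool {
	return listNamePattern.MatchString(name)
}

func main() {
	for _, name := range []string{"SHELL_BINARIES", "shell_binaries", "SENSITIVE_PATHS_2", "BAD__NAME"} {
		fmt.Printf("%s: %v\n", name, validListName(name))
	}
}
```

Rejecting invalid names at load time fits the commit's goal of catching mistakes before any event is evaluated.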
Expose all core datastores (Process, Container, System, Kernel, DNS, Syscall) as namespaced CEL functions for use in YAML detector conditions and outputs.
Key changes:
- Add datastores.Registry field to YAMLDetector, rebuild CEL env in Init()
- Implement 12 datastore functions: process.get/getAncestry/getChildren, container.get/getByName, system.info, kernel.resolveSymbol/getSymbolAddress, dns.getResponse, syscall.getName/getId
- Add comprehensive test coverage with mock datastores
- Update documentation with examples and usage patterns
Functions return null for not-found entities, handle time.Time conversion, and integrate seamlessly with existing CEL expressions.
Add 8 string utility functions for YAML detectors:
- split(str, delimiter) - Split string into list
- join(list, delimiter) - Join list into string
- trim(str) - Remove leading/trailing whitespace
- replace(str, old, new) - Replace all occurrences
- upper(str) - Convert to uppercase
- lower(str) - Convert to lowercase
- basename(path) - Get filename from path
- dirname(path) - Get directory from path
Functions are available in both conditions and output expressions.
All functions handle CEL's various list representations ([]string, []interface{}, []ref.Val).
Includes comprehensive unit tests and documentation updates.
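For intuition, the helpers above map naturally onto Go's standard library. The sketch assumes they mirror `strings` and `path` semantics, which the PR description does not confirm, and it only normalizes `[]string` and `[]interface{}`; cel-go's `[]ref.Val` is left out to keep the example stdlib-only. All function names here are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// toStrings normalizes two of the list shapes a helper may receive.
func toStrings(v interface{}) ([]string, bool) {
	switch list := v.(type) {
	case []string:
		return list, true
	case []interface{}:
		out := make([]string, 0, len(list))
		for _, item := range list {
			s, ok := item.(string)
			if !ok {
				return nil, false // non-string element
			}
			out = append(out, s)
		}
		return out, true
	default:
		return nil, false
	}
}

// join concatenates list elements regardless of which list shape was passed.
func join(v interface{}, delimiter string) (string, bool) {
	items, ok := toStrings(v)
	if !ok {
		return "", false
	}
	return strings.Join(items, delimiter), true
}

// basename and dirname assume slash-separated paths, as path.Base and path.Dir do.
func basename(p string) string { return path.Base(p) }
func dirname(p string) string  { return path.Dir(p) }

func main() {
	fmt.Println(strings.Split("a,b,c", ","))
	joined, _ := join([]interface{}{"usr", "bin"}, "/")
	fmt.Println(joined)
	fmt.Println(strings.TrimSpace("  nc  "), strings.ReplaceAll("a-b-c", "-", "_"))
	fmt.Println(strings.ToUpper("nc"), strings.ToLower("NC"))
	fmt.Println(basename("/usr/bin/nc"), dirname("/usr/bin/nc"))
}
```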
Support proto-native PipelineEvents to avoid redundant conversions. Detector outputs now flow as protobuf without trace.Event conversion.
- Make PipelineEvent.ProtoEvent public
- Update ToProto() to handle proto-native events (Event == nil)
- Add matchPoliciesProto() for proto-native filtering
- Unify sink stage to handle both event types
- Pool all PipelineEvents uniformly
Eliminates proto→trace→proto conversion overhead for detector outputs.
4ca54eb to 1444cab